Kim, Sung-Hou - University of California, Berkeley

Biography

Born 1937; B.S. (1960), M.S. (1962) Seoul National University, Korea; Ph.D. University of Pittsburgh (1966); Research Associate, M.I.T. (1966-70); Senior Research Scientist, M.I.T. (1970-72); Assistant and Associate Professor, Biochemistry, Duke University School of Medicine (1972-78); Professor and Professor of Graduate Studies, Department of Chemistry, University of California, Berkeley (1978-present); Faculty Scientist, Physical Biosciences Division, Lawrence Berkeley National Laboratory, Berkeley (1979-present); Fulbright Fellow (1962); N.I.H. Research Career Development Award (1976-79); Miller Research Professorship (UC Berkeley, 1983); Guggenheim Fellow (1985); E.O. Lawrence Award (US Department of Energy, 1987); Princess Takamatsu Award (Tokyo, Japan, 1989); The HoAm Prize (Samsung Foundation, Korea, 1994); Fellow, American Academy of Arts & Sciences (1994); Member, U.S. National Academy of Sciences (1994); Korean Academy of Science and Technology Prize in Science (Korea, 2000); Legacy Laureate Award, University of Pittsburgh (2005); The Pride of Seoul National Univ. Alumni Award (Seoul, Korea, 2006); Department of Chemistry Alumni Award, Univ. of Pittsburgh (2008); Alexander Rich Medal, M.I.T. (2014). Member, American Society of Biological Chemists, American Crystallographic Association, American Chemical Society, and Biophysical Society.

Research Interests


A. Construction of a whole-genome phylogeny of all organisms, the “Tree of Life”. The first task is to develop one or more methods for comparing the whole genome sequences of two organisms, not just a few highly conserved gene or protein sequences, as is current practice in the Multiple Sequence Alignment (MSA) method. Our starting point was to treat each whole genome sequence as a book consisting of a single string of letters for each chromosome, without spaces between words. My group developed the “Feature Frequency Profile (FFP)” method, a variation of the “Word Frequency Profile” method commonly used when comparing two books with Natural Language Analysis algorithms. Using the FFP method, we were able to construct high-resolution phylogenetic trees of two of the largest and most diverse groups of life: Prokaryotes (Archaea and Bacteria combined) and Fungi (the largest kingdom of Eukarya). Compared with trees based on MSA methods, our results showed high similarity in grouping (clading) at the species level, but substantial differences in the evolutionary branching order of the clades at deeper evolutionary levels. Our next project is to construct the “Tree of Life” for all living organisms for which whole genome sequences are available.

B. Whole-genome variation of the human species. Most regions of the genomes of normal human cells have the same sequence among individuals, but a small fraction, spread throughout the genome, varies within a population. Of these variations, single nucleotide polymorphisms (SNPs) are the most numerous and have been identified at over 3 million genomic “tag” positions out of the 3 billion positions in a whole haploid genome. It has been widely accepted that the analysis of SNPs may allow one to predict the genomic component of an individual's susceptibility to complex diseases such as cancers, neurological diseases, and autoimmune diseases, and to other traits.
So far, the current analysis methods (e.g. the Genome-wide Association Studies method) and the interpretation of their results have yielded information of limited predictive value and practical utility for making health-related decisions at the individual or population level without family-history information. Recognizing the complexity and heterogeneity of cancer mechanisms, we have developed an empirical, SNP-based approach that uses supervised machine learning, a branch of Artificial Intelligence, to predict the relative genomic susceptibility of an individual to 9 traits: 8 major cancer classes plus a healthy class. The multiclass accuracy of the combined prediction ranges from 33 to 56%, depending on the cancer classes of the testing sets, compared with 11% for a random prediction among 9 traits. Despite the limited SNP data currently available and the absence of rare SNPs in public databases, the results suggest that this framework, or an improvement of it, can predict cancer susceptibility with probability estimates that are useful for making health decisions for individuals or for a population. Similar approaches are being applied to predict genomic susceptibility to neurological diseases and autoimmune diseases.
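
As an illustration of the "genome as a book" idea behind the FFP method in section A, the following is a minimal sketch, not the group's actual implementation. It counts overlapping k-letter features in a sequence, normalizes the counts into a frequency profile, and compares two profiles. The function names, the toy sequences, the feature length k = 3, and the use of Jensen-Shannon divergence as the distance measure are all assumptions made for this example.

```python
from collections import Counter
from math import log2


def feature_frequency_profile(sequence, k):
    """Return the relative frequency of every overlapping k-letter feature.

    Hypothetical helper for illustration; the sequence is treated as one
    unbroken string of letters, with no notion of word boundaries.
    """
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts.values())
    return {feature: n / total for feature, n in counts.items()}


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two profiles.

    Assumed distance measure for this sketch: 0 for identical profiles,
    1 for profiles that share no features.
    """
    divergence = 0.0
    for feature in set(p) | set(q):
        pi = p.get(feature, 0.0)
        qi = q.get(feature, 0.0)
        mi = 0.5 * (pi + qi)  # midpoint distribution
        if pi > 0:
            divergence += 0.5 * pi * log2(pi / mi)
        if qi > 0:
            divergence += 0.5 * qi * log2(qi / mi)
    return divergence


# Toy "genomes" (made up for this example): b is a near copy of a,
# while c has a very different composition.
genome_a = "ATGCGATACGCTTGAGGCTAAC"
genome_b = "ATGCGATTCGCTTGAGGCTTAC"
genome_c = "GGGGCCCCGGGGCCCCGGGGCC"

k = 3  # feature length, chosen arbitrarily here
profiles = {name: feature_frequency_profile(seq, k)
            for name, seq in [("a", genome_a), ("b", genome_b), ("c", genome_c)]}

d_ab = jensen_shannon_divergence(profiles["a"], profiles["b"])
d_ac = jensen_shannon_divergence(profiles["a"], profiles["c"])
print(f"distance a-b: {d_ab:.3f}")
print(f"distance a-c: {d_ac:.3f}")
```

In a sketch like this, the pairwise distances between all genomes would form a distance matrix from which a tree could be built with a standard clustering method. Choosing a suitable feature length for real genomes is a separate question that this example does not address.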