Workshop 2000



Computational Methods for Genome Sequence Analysis - Steven Salzberg - 7/26/2000
Presentation: HTML or PPT
View a Video Archive of the Presentation (RealPlayer Required)
  • Abstract:

    With the increasing number of completed genome sequencing projects, computational techniques for analysis and comparison of these genomes are becoming critically important research tools. In the past year alone, 15 new bacterial genomes have appeared, as well as several complete or nearly complete eukaryotic genomes, and the pace continues to accelerate. Each new genome contains thousands of new genes, all of which are deposited into public databases immediately upon publication. These genes and their sequences then become the basis for much further research into the biology of these organisms.

    Genome sequence analysis includes many sophisticated computational steps, including assembly of the genome, initial identification of the locations of all the genes, and assignment of the gene functions. Gene identification is perhaps the most difficult task, relying heavily on sequence alignment algorithms and de novo gene finders. For bacterial genomes, the initial gene finding problem is largely solved by the Glimmer system, which finds approximately 97-99% of all genes automatically. Assignment of functions to these genes remains difficult, especially when homology is either weak or nonexistent. For eukaryotic chromosomes, both gene finding and role assignment are difficult tasks. Genome analysis also includes comparison of the complete DNA sequence to the sequences of other genomes. Such comparison can identify major rearrangements of the genomes and other evolutionary changes; it can also identify regions of the sequence that have been transferred into the genome from other organisms.
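
    As a rough illustration of what "initial identification of the locations of all the genes" involves, the sketch below scans both strands of a DNA string for open reading frames (a start codon followed in-frame by a stop codon). This is a deliberately naive example and is not how Glimmer works; the function names, the `min_codons` threshold, and the coordinate convention (0-based, end-exclusive, reported on the forward strand) are all assumptions made for this example.

```python
# Illustrative only: a naive open-reading-frame (ORF) scan.
# This is NOT Glimmer's algorithm; all names here are hypothetical.
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def _scan_strand(seq, min_codons):
    """Find (start, end) ORFs in the three reading frames of one strand."""
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START:
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOPS:
                # Count codons before the stop; keep the ORF if long enough.
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs


def find_orfs(seq, min_codons=3):
    """Return sorted (start, end, strand) tuples in forward-strand coordinates."""
    seq = seq.upper()
    n = len(seq)
    found = [(s, e, "+") for s, e in _scan_strand(seq, min_codons)]
    rc = reverse_complement(seq)
    # Map reverse-strand hits back onto forward-strand coordinates.
    found += [(n - e, n - s, "-") for s, e in _scan_strand(rc, min_codons)]
    return sorted(found)


if __name__ == "__main__":
    print(find_orfs("CCATGAAACCCGGGTAGCC"))
```

    A scan like this over-predicts heavily on real genomes, because most long open reading frames found by chance are not genes; distinguishing real genes from such candidates is the part that requires a statistical model.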

    This talk will describe computational methods for finding genes and for comparing large genome sequences, and will highlight some of the biological discoveries that these methods have yielded.
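
    To give a feel for how sequence comparison can expose a rearrangement, the following minimal sketch lists exact shared k-mers between two sequences as coordinate pairs. It is a toy under stated assumptions, not the method presented in the talk: the function name `shared_kmers` and the default `k=8` are hypothetical, and real genome-scale comparison needs far more memory-efficient data structures.

```python
# Illustrative only: exact k-mer matches between two sequences.
# The function name and the choice of k are assumptions for this sketch.
from collections import defaultdict


def shared_kmers(a, b, k=8):
    """Return (pos_in_a, pos_in_b) for every k-mer occurring in both strings."""
    index = defaultdict(list)
    for i in range(len(a) - k + 1):
        index[a[i:i + k]].append(i)  # record every position of each k-mer in a
    matches = []
    for j in range(len(b) - k + 1):
        for i in index.get(b[j:j + k], ()):
            matches.append((i, j))
    return matches


if __name__ == "__main__":
    # Two toy "genomes" whose halves have been swapped.
    genome_a = "ACGTTGCA" + "GGATCCTA"
    genome_b = "GGATCCTA" + "ACGTTGCA"
    print(shared_kmers(genome_a, genome_b))
```

    Plotting such pairs gives a dot plot: conserved regions fall along a diagonal, while matches displaced from it, like the swapped halves here, suggest a rearrangement.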

    Web reference: http://www.tigr.org/softlab/glimmer/glimmer.html

     

  • Biography:

    Dr. Steven Salzberg is the Director of Bioinformatics at The Institute for Genomic Research, where he has been since July 1997, and a Research Professor in the Department of Computer Science at Johns Hopkins University, which he joined in 1989. He received his B.A., M.S., and M.Phil. degrees in computer science from Yale University. For three years he worked as a research scientist with Applied Expert Systems, a software company in Cambridge, Massachusetts. He received his Ph.D. in computer science from Harvard University in 1989. While at Harvard he held a position as Associate in Research in the Graduate School of Business Administration.

    Dr. Salzberg's research interests include gene finding, sequence alignment and assembly, genomics, Markov modeling, and machine learning. He has co-authored two books and over 60 research papers in refereed publications including Science, Nature, Nucleic Acids Research, Nature Genetics, Gene, J. Computational Biology, and Machine Learning. He and his colleagues have developed four computational gene-finding systems using Markov models and decision trees as the underlying technologies. He was a co-developer of two widely distributed machine learning systems. Some of his publications and computer systems are available from his webpage at http://www.tigr.org/~salzberg. He is the co-Chair of the Third and Fourth Annual Conferences on Computational Genomics, and has served on the program committees for the Intelligent Systems for Molecular Biology Conference, the International Conference on Machine Learning, and the National Conference on Artificial Intelligence, among others. He is currently a member of the editorial boards of the journals Gene, Artificial Intelligence Review, and Pattern Analysis and Applications.