Title: Comparing Genomes in terms of Protein Structure

Abstract: My talk will be focussed on the emerging field of structural genomics, describing how genomes can be compared in terms of protein structure. I'll try to cover some of the points below:

* An essential requirement for a structure survey is a library of folds, which groups the known structures into "fold families." I will describe various aspects of a fold library, including methods of structural alignment and the importance of objective statistical measures for assessing similarities within the library.
* One can use a fold library to count the number of folds in genomes, expressing the results in the form of Venn diagrams, "top-10" lists, and fold trees for shared and common folds. One particular analysis shows that the common folds shared between very different microorganisms (i.e. in different kingdoms) have a remarkably similar supersecondary structure, being composed of repeated strand-helix-strand units.
* A major difficulty with this sort of "fold-counting" is that only a small subset of the structures in a complete genome is currently known, and this subset is prone to sampling bias. I'll talk about where we currently are with respect to assigning known folds to genomes, focussing in particular on M. genitalium and the impact of PSI-BLAST.
* One way of overcoming biases is through structure prediction, which can be applied uniformly and comprehensively to a whole genome. I'll describe a number of findings derived from such predictions.
* Another interesting application of analyzing structures in genomes is assessing the relationship between protein fold and function in a comprehensive fashion. I'll talk a little about this, focussing on the yeast genome.

Continuously updated tables and further information pertinent to this talk are available over the web at http://bioinfo.mbb.yale.edu/genome.
The talk is available from http://bioinfo.mbb.yale.edu/lectures, sublink http://bioinfo.mbb.yale.edu/lectures/celera.

- Mark Gerstein