Research
Overview: Yeast as a Genomic Model
In its brief history, functional genomics has benefited greatly from analysis of the baker's yeast Saccharomyces cerevisiae. Long recognized as an informative model organism in traditional genetic studies, Saccharomyces also presents an ideal model genome for large-scale functional analysis. Relative to other eukaryotes, S. cerevisiae possesses a compact genome: approximately 70% of its total (non-rDNA) genetic complement is dedicated to protein-coding sequence. Encompassing 16 chromosomes, the 12-Megabase yeast genome is predicted to encode approximately 6,000 genes, or about one gene per 2 kb of genomic sequence. Genes in higher eukaryotes typically contain introns; by contrast, only 271 yeast genes are known to contain introns, which simplifies computer-based gene identification.
That said, only one-third of all predicted yeast genes had been functionally characterized when the complete sequence of the yeast genome first became available. At present, approximately 2,518 yeast genes still encode proteins of unknown function. Furthermore, much debate still exists as to the exact gene complement in S. cerevisiae. Several hundred currently annotated genes may be spurious; on the other hand, at least three hundred previously overlooked genes may reside in the genome. There exists, therefore, a strong need to better characterize gene function in S. cerevisiae. Genomic techniques provide an exciting route towards this goal.
Current Research
The Kumar lab is interested in integrating the fields of genomics, proteomics, and bioinformatics as a means of investigating fundamental processes of cell biology in the baker's yeast Saccharomyces cerevisiae. Towards this goal, the lab will be undertaking several studies. Initial work will focus on a process wherein fungi convert from growth as single, oval-shaped cells to growth as invasive filaments. In S. cerevisiae, these filaments are called pseudohyphae (PH). PH growth is a strong model of similar processes occurring in many pathogenic fungi known to infect crops and humans. Since this change in growth form is required for virulence in these pathogenic fungi, an understanding of PH growth in yeast will ultimately help researchers devise better antifungal therapies and treatments, while also shedding light upon many fundamental processes of cell biology (such as the establishment of cell polarity and cell cycle progression).
The Kumar lab will attempt to identify proteins and pathways involved in PH growth through the following studies.
- High-throughput technologies will be used to construct a genome-wide collection of plasmids facilitating the in vivo identification of proteins differentially abundant or differentially localized during PH growth.
- In complement to this proteomic analysis, microarray-based expression profiling will be used to identify genes differentially transcribed during PH growth.
- PH growth genes will be systematically disrupted for subsequent analysis of mutant phenotypes related to filamentous differentiation.
- Co-localization analysis and fluorescence resonance energy transfer (FRET) will be used to establish a high-confidence set of protein-protein interactions encompassing known and putative PH growth genes.
In addition to defining potential drug targets, this work will help define functions for many previously uncharacterized yeast genes. At present, approximately 4,000 genes in yeast have been functionally characterized, at least to some degree. The remaining 2,000 genes, however, are very poorly understood. Additional research in the lab will employ genomic and proteomic approaches to uncover functions for these genes, as well as to define more fully the functions of genes that are only partially characterized.
The Kumar lab is also very interested in applying computational biology and bioinformatics as tools for functional genomics. In particular, the lab will be undertaking studies integrating large and varied data sets as a means of drawing biologically relevant conclusions.
Genes function in complex networks and therefore rarely act towards a single purpose. Large-scale approaches such as these may be needed to appreciate the full breadth of gene function, even in a supposedly "simple" organism like yeast.
Previous Research
Transposon Mutagenesis and Phenotypic Screening (PDF)
As a postdoctoral fellow in Michael Snyder's lab at Yale University, Dr. Kumar managed a high-throughput project using large-scale transposon mutagenesis to characterize gene function on a genome-wide scale in yeast. This approach employs a series of multipurpose transposons derived from either the bacterial transposon Tn3 or Tn7, the latter constructed as part of a collaboration with Nancy Craig's lab at Johns Hopkins University. Each transposon was engineered to generate a diverse array of informative alleles including reporter gene fusions, gene disruptions, conditional alleles, and epitope-tagged alleles. These multipurpose transposons were used to mutagenize a plasmid-based library of yeast genomic DNA, producing a series of insertional libraries for subsequent screening. Employing high-throughput methods, Dr. Kumar and colleagues screened over a quarter of a million transposon insertion alleles for reporter gene activity in yeast, preparing plasmid DNA from each bacterial strain and subsequently introducing the insertion allele into an appropriate yeast strain. This work generated a collection of approximately 28,000 yeast mutants, each carrying a defined transposon insertion within gene-coding sequence (collectively representing nearly 4,000 annotated yeast genes). Approximately 8,000 strains derived from this collection have been screened in an array format for disruption phenotypes under 20 different growth conditions. The resulting data were analyzed by k-means clustering, yielding insight into gene function for nearly 200 previously uncharacterized open reading frames.
To view data from this study, please access the TRIPLES web site.
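As a rough illustration of the clustering step mentioned above, the following is a minimal sketch of k-means (Lloyd's algorithm) applied to phenotype profiles. It is not the code used in the study. The data layout (one row per strain, one numeric score per growth condition), the function name `kmeans`, the random initialization, the squared Euclidean distance, and the choice of `k` are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def _sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two phenotype profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _mean(rows: List[Vector]) -> List[float]:
    """Column-wise mean of a non-empty list of profiles."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def kmeans(
    profiles: List[Vector], k: int, n_iter: int = 100, seed: int = 0
) -> Tuple[List[int], List[List[float]]]:
    """Cluster profiles into k groups; returns (labels, centroids).

    Hypothetical helper: each profile is one strain's scores across
    growth conditions (an assumed layout, not the study's format).
    """
    rng = random.Random(seed)
    # Initialize centroids from k distinct, randomly chosen profiles.
    centroids = [list(p) for p in rng.sample(profiles, k)]
    labels = [-1] * len(profiles)
    for _ in range(n_iter):
        # Assignment step: each profile joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: _sq_dist(p, centroids[c]))
            for p in profiles
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = _mean(members)
    return labels, centroids
```

Strains that share a cluster label have similar profiles across conditions, which is the kind of grouping from which a shared function might be inferred. In practice the result depends on `k` and on the initialization, so several runs are usually compared.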
Gene-Finding in Yeast (PDF)
Gene identification is a critical step preliminary to comparative or functional study of any genome; yet, genes are difficult to identify, even in a small genome like that of Saccharomyces cerevisiae. As first annotated in 1997, the nuclear genome of S. cerevisiae strain S288c was predicted to encode 6,275 genes. For purposes of this annotation, a gene was defined as any open reading frame (ORF) of at least 100 codons that does not completely overlap a longer ORF on either strand. This definition, however, was not comprehensive. Over the next four years, a total of 65 previously non-annotated genes were identified in yeast, largely through non-systematic methods.
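The annotation rule described above can be sketched in a few lines of code. The following is a simplified illustration, not the pipeline used for the 1997 annotation. It assumes an ORF runs from the first ATG after a stop codon to the next in-frame stop, counts codons including the ATG but excluding the stop, treats the sequence as linear, and ignores introns. The names `Orf`, `find_orfs`, and `min_codons` are hypothetical.

```python
from typing import List, NamedTuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


class Orf(NamedTuple):
    start: int   # 0-based start on the forward strand (inclusive)
    end: int     # end on the forward strand (exclusive, includes the stop)
    strand: str  # "+" or "-"
    codons: int  # codon count, including ATG but excluding the stop


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def _scan_strand(seq: str, strand: str, min_codons: int) -> List[Orf]:
    """Find ATG-to-stop ORFs in the three frames of one strand."""
    n = len(seq)
    found = []
    for frame in range(3):
        start = None  # first ATG seen since the last stop in this frame
        for i in range(frame, n - 2, 3):
            codon = seq[i:i + 3]
            if start is None:
                if codon == "ATG":
                    start = i
            elif codon in STOP_CODONS:
                codons = (i - start) // 3
                if codons >= min_codons:
                    lo, hi = start, i + 3
                    if strand == "-":
                        # Map back to forward-strand coordinates.
                        lo, hi = n - hi, n - lo
                    found.append(Orf(lo, hi, strand, codons))
                start = None
    return found


def find_orfs(seq: str, min_codons: int = 100) -> List[Orf]:
    """ORFs of at least min_codons that do not lie entirely within
    a longer ORF on either strand."""
    seq = seq.upper()
    candidates = (
        _scan_strand(seq, "+", min_codons)
        + _scan_strand(reverse_complement(seq), "-", min_codons)
    )
    kept = []
    for orf in candidates:
        contained = any(
            other is not orf
            and other.codons > orf.codons
            and other.start <= orf.start
            and orf.end <= other.end
            for other in candidates
        )
        if not contained:
            kept.append(orf)
    return sorted(kept)
```

With the default `min_codons=100`, any ORF shorter than 100 codons is discarded before the overlap check, which shows why short genes escape a rule of this kind.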
To more exhaustively identify overlooked genes in yeast, Dr. Kumar and colleagues in the Snyder and Gerstein labs at Yale University developed an integrated approach utilizing large-scale transposon-tagging, RNA microarray analysis, and genome-wide homology-searching. First, expressed sequences were trapped using a modified transposon that produces protein fusions to β-galactosidase; non-annotated open reading frames translated as β-gal chimeras were selected as a candidate pool of potential genes. To verify expression of these sequences, labeled RNA was hybridized against a microarray of oligonucleotides designed to detect gene transcripts in a strand-specific manner. In complement to this experimental method, novel genes were also identified in silico by homology to previously annotated proteins. In total, this method identified 137 putative new genes, nearly 2% of the total gene complement in yeast. As these methods are capable of identifying both short ORFs and antisense ORFs, this approach provides an effective supplement to current gene-finding schemes and should be applicable to higher eukaryotes as well.
Subcellular Localization of the Yeast Proteome (PDF)
Protein localization data sets are valuable resources in elucidating gene function, yet, until recently, protein localization had not been analyzed on a large scale for any single eukaryotic proteome. Towards this end, in 2002, Dr. Kumar and coworkers in Mike Snyder's lab at Yale University published the first proteome-scale analysis of protein localization, epitope-tagging 60% of the yeast proteome and determining the subcellular localization of 2,744 yeast proteins. Extrapolating these data through a computational algorithm employing Bayesian formalism, Dr. Kumar and colleagues in Mark Gerstein's lab estimated the subcellular distribution of all 6,100 yeast proteins. In collaboration with Shirleen Roeder's lab at Yale, a subset of nuclear proteins was further analyzed by immunolocalization using surface-spread preparations of meiotic chromosomes; 38% of these proteins were found associated with chromosomal DNA. In total, this study presented the first "localizome" for any eukaryote and defined the subcellular localization of 955 proteins of previously unknown function, nearly half of all functionally uncharacterized proteins in yeast.
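The text does not describe the Bayesian algorithm in detail, so the following is only a generic sketch of the underlying idea: a prior distribution over compartments is updated with Bayes' rule as each piece of evidence about a protein is considered. The function names, the compartment labels, the features, and every number shown are hypothetical, and treating the features as independent given the compartment is a simplifying assumption.

```python
from typing import Dict, Iterable


def bayes_update(
    prior: Dict[str, float], likelihood: Dict[str, float]
) -> Dict[str, float]:
    """One application of Bayes' rule.

    prior[c]      = P(compartment c) before seeing the feature
    likelihood[c] = P(feature | compartment c)
    Returns P(compartment c | feature) for every compartment.
    """
    unnormalized = {c: prior[c] * likelihood[c] for c in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("feature has zero probability under every compartment")
    return {c: v / total for c, v in unnormalized.items()}


def predict_localization(
    prior: Dict[str, float], feature_likelihoods: Iterable[Dict[str, float]]
) -> Dict[str, float]:
    """Apply bayes_update once per feature, assuming the features are
    conditionally independent given the compartment."""
    posterior = dict(prior)
    for likelihood in feature_likelihoods:
        posterior = bayes_update(posterior, likelihood)
    return posterior


# Hypothetical example: two compartments and two made-up features.
prior = {"nucleus": 0.5, "cytoplasm": 0.5}
features = [
    {"nucleus": 0.8, "cytoplasm": 0.2},  # e.g. a sequence motif (made up)
    {"nucleus": 0.6, "cytoplasm": 0.4},  # e.g. an expression pattern (made up)
]
posterior = predict_localization(prior, features)
```

In a scheme of this kind, the priors and likelihoods would be estimated from proteins whose localization was determined experimentally, and summing the posteriors over all proteins would give an estimated distribution across compartments.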

