Mon. Mar. 12, 2001 Lecture 3 on Yeast Genetics David Nelson
Last modified 3/12/01 12:20 PM
Genome Scale Yeast Genetics
Yeast is done. The genome is sequenced. Do we understand yeast, a one-celled eukaryote
that has no tissues and no organs? No, we don't! From the yeast protein database
I made the following table:
                                             1997   1999   2001
total genes                                  6027   6083   6238
biochemically or genetically characterized   2522   3113   3808
homologs from other species                  1175    944    540
unknown                                      2330   2026   1890
In 2001 the 1890 unidentified genes amount to 30% of the SIMPLE yeast genome
that we do not understand.
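The percentages behind the table can be checked in a few lines of Python. This is only a sketch: the counts are copied from the table, and the names `counts` and `unknown_fraction` are made up for illustration.

```python
# Gene counts copied from the table (yeast protein database figures).
counts = {
    1997: {"total": 6027, "characterized": 2522, "homologs": 1175, "unknown": 2330},
    1999: {"total": 6083, "characterized": 3113, "homologs": 944, "unknown": 2026},
    2001: {"total": 6238, "characterized": 3808, "homologs": 540, "unknown": 1890},
}


def unknown_fraction(year):
    """Fraction of genes in `year` that are neither characterized nor homologs."""
    row = counts[year]
    return row["unknown"] / row["total"]


for year, row in counts.items():
    # The three categories should add up to the total for each year.
    assert row["characterized"] + row["homologs"] + row["unknown"] == row["total"]
    print(year, f"{100 * unknown_fraction(year):.1f}% unknown")
```

The unknown fraction falls every year, but roughly three genes in ten still have no assigned function in 2001.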
The word most often mentioned now that the sequence is done is function. We have to
learn the function of these proteins. As we have discussed already, the way to go about
this is to disrupt or delete the gene to see if there is a phenotype, but currently 60% or more
of yeast mutants give no phenotype. Therefore this approach won't help for about 1100 of the 1890 unknown
genes. Even those that do give a phenotype won't give much of a clue to the protein's
function. Failure to grow at high temperature or in high salt medium does not say much
about function. So what should be done?
(from the SGD Saccharomyces Genome Database)
Systematic gene disruption results now available
The complete set of results from the yeast deletion project is now available at SGD.
The results of the work by the international deletion project consortium were
submitted to SGD. Each gene that has been deleted has its systematic deletion
phenotype (viable/inviable) displayed on its locus page. (Posted January 9, 2001)
The Research Genetics site that is selling the deletion strains says that the consortium has
deleted 93% of the 6200 ORFs and that 20,000 strains are available: MATa, MAT alpha, diploid
homozygous (if not lethal) and diploid heterozygous.
The way forward is through big biology. At the very least, each gene needs to be disrupted
or better yet deleted exactly from start codon to stop codon, because this is the minimum
needed to begin analysis of a yeast gene. Second, each gene should be placed on high and
low copy vectors with suitable tags for studies of its expression. These tags could and will
probably include green fluorescent protein and/or epitope tags and his-tags. Chromosome
VIII has already had all its genes deleted and in their place GFP has been inserted so the
expression of these genes can be monitored by the action of their own promoters.
The EUROFAN (European functional analysis network) project is deleting every gene in
yeast. They are making deletion cassettes for each gene that can be used to delete the gene
from other yeast strains.
EUROSCARF (the European Saccharomyces cerevisiae Archive for Functional Analysis)
also sells deletion strains. They had the following notice on their current web site:
EUROSCARF
31-Aug-2000:
22472 yeast deletion strains covering 5867
different genes plus 1309 plasmids
It is already possible to purchase pairs of primers that are specific for each yeast gene, or
you can buy the gene already amplified with multiple restriction sites on each end. In a few
months, every yeast gene deletion will be available by catalog, but this opens the door to
one-gene-at-a-time type of biology. There is still the need to do all-genes-at-once biology.
The complement to a gene knockout is gene overexpression. If one does not show a
phenotype, then the other might. Having too much of a protein can often be bad for the
cell, because it titrates out interacting proteins, or it may make too much of a product, or it
may disrupt membrane structure, if it is a membrane protein. If it is a kinase it could shift
the regulation of a pathway to the extreme right, where all substrate proteins are
phosphorylated. These phenotypes might be even more informative than the knockout
phenotype.
In Jan. 1996 EUROFAN (European Functional Analysis Network) was founded with
144 labs to try to understand the function of 1000 novel yeast genes (about half of the
uncharacterized genes). These labs will systematically delete the genes, replacing them with
a kanamycin resistance gene, kanMX (now done). Then each deletion will be tested for viability,
growth rate, respiration, mating and sporulation. The deletions will be sent to a repository
and mailed out to other labs with more specific tests.
The expert opinion of Stephen Oliver is that novel biochemical pathways are probably not
going to account for many of these unknown gene functions. Probably, these genes are
involved in specific tasks of regulation that may only be required at specific times, making
them hard to detect. Systematic approaches will be needed.
If there are 1100 genes (or thereabouts) that have no phenotype, the simplest idea to
identify redundant genes would be to cross all these strains with each other and see if
double knockouts had a phenotype. The problem with this is the number of crosses (1100
x 1100 ~ 1.2 million). Also, each mating would have to be sporulated to get haploid strains
again and each strain would have to be checked somehow for the combination of the two
gene deletions. This is really too much labor.
This approach would be much more feasible and successful if small sets of similar protein
sequences were chosen as the starting sample. For example, the mitochondrial carriers in
yeast comprise 35 genes. Only a few of these give obvious phenotypes when they are
disrupted or deleted. If there were say 25 mitochondrial carrier genes that had no
phenotype, it would be possible to cross 25 x 25 strains and do the work to see if the
double mutants were now respiration defective. That is only 625 crosses, and it would
probably take a few months to do the analysis. Similar crosses could be done with
kinases, since there are only 113 in yeast. By breaking up the task, it may be possible to
identify many redundant genes and reduce the size of the pool of completely
uncharacterized genes from 1100 to just a few hundred. At this point some automated way
to perform crosses and sporulations might allow 10,000 to 20,000 crosses to be
undertaken.
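The cross counts quoted above are easy to reproduce. The sketch below counts crosses the way the lecture does (every strain against every strain) and also counts distinct double mutants, since A x B and B x A give the same pair. The function names are illustrative only. The kinase row uses all 113 kinases, because the text does not say how many of them lack a phenotype.

```python
from math import comb


def all_by_all(n):
    """Crosses counted as in the lecture: n strains against n strains."""
    return n * n


def unique_pairs(n):
    """Distinct double-mutant combinations (A x B equals B x A, no self-crosses)."""
    return comb(n, 2)


families = [
    ("all no-phenotype genes", 1100),
    ("mitochondrial carriers without phenotype", 25),
    ("kinases (all 113)", 113),
]

for label, n in families:
    print(f"{label}: {all_by_all(n)} all-by-all, {unique_pairs(n)} unique pairs")
```

Counting only unique pairs roughly halves the work, and a 113 x 113 kinase grid (12,769 crosses) falls inside the 10,000 to 20,000 range mentioned for automated crossing.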
Another approach that we talked about earlier was to search for high copy suppressors. A
massive scale search for high copy suppressors could be performed in the following way.
All deletion strains that are known to be lethal in haploid yeast could be prepared in yeast
with a copy of the wild type gene on a URA3 or CAN1 plasmid. (This is estimated as
11.6% of yeast genes or about 700 genes. The HHMI report for 1996 says that 950 genes
have been identified in yeast as haploid lethal.) These yeast could then be transformed with a
library of all the uncharacterized genes on high copy plasmids (about 1200 genes). The
cells should be plated on media containing FOA or canavanine to chase out the wild type
copy of the essential gene. Colonies that form on these plates will have high copy
suppressors of the essential gene. This would take 700 transformations with a library, but
you have the potential to identify several hundred redundant genes.
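A quick count shows why the library approach scales better than pairwise crosses. This is a back-of-the-envelope sketch. It assumes that every library plasmid is represented in every transformation, which a real library only approximates, and the variable names are illustrative.

```python
essential_strains = 700   # haploid-lethal deletions, each covered by a wild type plasmid
library_genes = 1200      # uncharacterized genes carried on high copy plasmids

# One library transformation per essential-gene strain.
transformations = essential_strains

# Each transformation tests every library gene against that one deletion.
combinations_tested = essential_strains * library_genes

print(f"{transformations} transformations test {combinations_tested} gene pairs")
```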
Shuttle mutagenesis and the large scale analysis of yeast genes.
To make progress in identifying yeast gene function it will be necessary to manipulate all
6000 genes at once. This requires methods that are random and can be applied to the whole
yeast genome. One method devised in 1986 is shuttle mutagenesis (Shuttle mutagenesis: A
method of transposon mutagenesis for Saccharomyces cerevisiae. Seifert H.S. et al.
PNAS 83, 735-739 1986). In this method yeast DNA is transformed into E. coli on a
special plasmid designed for transposon mutagenesis by a Tn3 derivative. Since the Tn3
transposon can insert anywhere in this plasmid, all unnecessary sequence has been
removed so the yeast DNA is the largest part of the plasmid. The plasmid contains no Tn3
terminal repeats since these make the plasmid immune to Tn3 insertion.
Once the yeast library has been transformed into E. coli, the Tn3 derived transposon
inserts into this vector. It cannot insert into any other DNA in the cell because E. coli
chromosome DNA is immune and the other plasmids in the cell have Tn3 terminal repeats
that make them immune. We won't go into the details of how the transposon is inserted or
how the cointegrate is resolved, but the outcome is yeast DNA with a randomly inserted
Tn3 derivative. This derivative has been engineered to contain a selectable marker like
LEU2 and the lacZ gene. The lacZ gene is placed at the end of the transposon so it can
form a lacZ fusion with the yeast DNA in one of the six possible reading frames (three
frames in each orientation). The lacZ fragment has no start codon or promoter, so it can
only be expressed if the correct in-frame fusion is made. The beta galactosidase product will then be made from
the promoter of the yeast gene, and it will be expressed only when the yeast gene would
normally be expressed.
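The "one of six reading frames" idea can be made concrete with a toy model. The sketch below is not taken from the Seifert paper. The coordinate convention, the function name `fusion_in_frame` and the `linker_len` parameter (transposon bases read before lacZ) are all assumptions made for illustration.

```python
def fusion_in_frame(gene_start, gene_end, gene_strand,
                    insert_pos, insert_strand, linker_len=0):
    """Toy test: is a promoterless, ATG-less lacZ fused in frame to the gene?

    Coordinates are 0-based on the '+' strand; the coding sequence occupies
    [gene_start, gene_end). insert_pos is the index of the first base after
    the insertion point. linker_len is a hypothetical number of transposon
    bases that are translated before lacZ begins.
    """
    if not (gene_start < insert_pos < gene_end):
        return False                      # insertion lies outside the coding sequence
    if insert_strand != gene_strand:
        return False                      # lacZ points the wrong way
    if gene_strand == "+":
        coding_bases_upstream = insert_pos - gene_start
    else:
        coding_bases_upstream = gene_end - insert_pos
    return (coding_bases_upstream + linker_len) % 3 == 0


# Enumerate every interior insertion site of a 300 bp gene, in both orientations.
gene = dict(gene_start=0, gene_end=300, gene_strand="+")
trials = [(pos, strand) for pos in range(1, 300) for strand in "+-"]
in_frame = sum(fusion_in_frame(insert_pos=p, insert_strand=s, **gene)
               for p, s in trials)
print(f"{in_frame} of {len(trials)} insertions are in frame")
```

In this model about one insertion in six within a coding region gives an expressible fusion: half are lost to orientation and two thirds of the rest to frame.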
The yeast DNA is cut out from the plasmid with NotI, a rare-cutting restriction enzyme with
sites flanking the yeast DNA. The linear DNA is used to transform diploid yeast, where,
by homologous recombination, the linear fragment replaces the wild type sequence.
Plating on media lacking leucine selects for the LEU2-marked insertion, thus creating a library of
randomly generated gene knockouts.
Remember that each inserted element has a lacZ gene, so screening for blue colonies will
identify yeast expressing the fusion in frame. A more recent approach offers the alternative
of green fluorescent protein GFP, or a haemagglutinin epitope tag with similar results.
The paper in 1986 demonstrated the feasibility of this method, but it only reported the
mutagenesis of one yeast gene, ADE1. Yeast that were transformed with the knockout of
ADE1 (ade1::Tn3 derivative) became pink as the adenine biosynthetic pathway was
blocked. In 1994 the method was applied to the whole yeast genome in a landmark paper
by Nancy Burns and others in Michael Snyder's lab at Yale (Burns et al. Genes and Devel.
8, 1087-1105 1994). They examined 20,250 yeast transformants and found that 2800
turned blue on X-gal plates under normal vegetative growth conditions. That is 13.8%.
Since only one frame in six is correct for expression of beta galactosidase the maximum
they could expect was 16.7%. They could only expect that if all the DNA was coding for
genes and all the genes were expressed under these conditions. The ratio 13.8/16.7 gives a
rough estimate of the percent of the yeast genome that is expressed as protein under
vegetative growth conditions (82.6%) and this number is very similar to the actual percent
of the genome that codes for genes. The conclusion is that nearly every gene in yeast is
expressed at some level under normal conditions. These levels may increase under special
conditions, but there are very few genes that are lying dormant.
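The arithmetic behind the expression estimate can be written out as follows. The numbers come from the Burns et al. figures quoted above; the variable names are illustrative.

```python
transformants = 20250   # yeast transformants examined
blue = 2800             # colonies that turned blue on X-gal
frames = 6              # 3 reading frames x 2 orientations

observed = blue / transformants          # fraction of insertions giving blue colonies
ceiling = 1 / frames                     # best case: all DNA coding, every gene expressed
expressed_estimate = observed / ceiling  # share of that ceiling actually reached

print(f"observed {100 * observed:.1f}%, ceiling {100 * ceiling:.1f}%, "
      f"ratio {100 * expressed_estimate:.1f}%")
```

With unrounded inputs the ratio comes out near 83%, slightly above the 82.6% obtained from the rounded values 13.8 and 16.7. The conclusion is the same either way.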
So what can be done with this collection of about one third to one half of the yeast genes?
(Note: some of the 2800 fusions are bound to be duplicates, so the total of unique
knockouts is less than 2800/6000 = 47%.) Since the fusions could be detected by
fluorescent antibodies, 2373 fusions were examined by immunofluorescence microscopy.
(This is where GFP fusions would be nice, no antibodies required.) If enough of the yeast
protein is present, you could expect the beta gal part to be carried along to its normal
cellular location. In 10% of the cases the protein localized to discrete cellular sites. 3%
went to the nucleus. The others are summarized in Table I of the paper. This does give
some clue to the protein's role in the cell.
Some of the interesting knockouts were recovered from the yeast by integrating a URA3
gene into the transposon at the ampicillin resistance marker. This insertion was selected on
URA minus media and the yeast DNA was then cut with an enzyme that recognized one site
in the transposon and the other had to be in the yeast DNA. The linear fragments were
recircularized and recovered in E. coli using ampicillin resistance. The yeast DNA was
sequenced from the lacZ fusion joint.
They sequenced 90 genes. 31 were known, 5 were homologs of known genes and 54
were novel. So for those genes we now know where their products are located in the cell.
So far, we have just been talking about diploid cells that are heterozygous for the knockout.
Obviously, there is a need to make the haploid for each knockout and look for a phenotype.
This means the cells must be sporulated and tetrads must be dissected (a tedious job). This
was carried out for 59 of the localized knockouts. 9 had a haploid lethal phenotype,
meaning these genes are essential to survival. In six of these knockouts the cells
germinated and started to divide, but they arrested at specific points in the cell cycle,
suggesting they were cell division cycle (CDC) genes. 9 more had growth defects. 41
looked normal on rich media. 41/59 is 69% that had no visible defect on rich media. This
result emphasizes the redundancy built into even a simple cell like yeast. 48 knockouts
were examined more closely on different media and at different temperatures to see if a
phenotype could be found. 35/59 = 59% had no detected phenotype.
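A short tally confirms that the rich-media classes account for all 59 haploids. The dictionary keys are shorthand labels rather than terms from the paper.

```python
# Outcomes for the 59 localized knockouts examined as haploids on rich media.
rich_media = {"haploid lethal": 9, "growth defect": 9, "no visible defect": 41}

tested = sum(rich_media.values())
for outcome, n in rich_media.items():
    print(f"{outcome}: {n}/{tested} = {100 * n / tested:.0f}%")
```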
The authors tested for genes that were expressed under conditions of meiosis and
sporulation (starvation conditions). They looked at 19000 knockout strains and found 55
that were induced under these conditions. They estimated about 93-135 genes in yeast
would be induced this way. They sequenced 43 of these and found 40 different genes. 9
were previously known to be induced in meiosis. 5 more were known, but were not
identified as meiotically induced. 26 genes were not in the database.
These 43 genes were examined by immunofluorescence. Only 10 could be detected above
background, meaning they are expressed at low levels. 7 were cytoplasmic. One appeared
at a specific time in a unique place near the spindle poles. It appeared to have a limited,
specific task to do, and then it was gone.
Haploids of these knockouts were made and checked for sporulation efficiency. 64% did
not show a phenotype.
Update on Snyder's lab and what they are doing now
The Snyder lab is currently screening 1200 transformants per week to find new beta gal
fusions. They find about 120 of these each week (blue colonies on X-gal plates). In a
couple of years at that rate they should have nearly every gene in yeast.
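How close "a couple of years" gets to every gene can be estimated with a simple occupancy calculation. This is a rough sketch under strong assumptions that are not in the lecture: about 6000 target genes, each in-frame fusion landing in a gene chosen uniformly at random, and two years taken as 104 weeks. Real insertions are biased by gene length, expression level and library coverage.

```python
def expected_distinct_genes(n_fusions, n_genes):
    """Expected number of different genes hit by n_fusions uniform random hits."""
    return n_genes * (1 - (1 - 1 / n_genes) ** n_fusions)


fusions_per_week = 120   # blue colonies found per week (from the text)
weeks = 104              # "a couple of years", an assumption
n_genes = 6000           # approximate number of yeast genes, an assumption

total_fusions = fusions_per_week * weeks
hit = expected_distinct_genes(total_fusions, n_genes)
print(f"{total_fusions} fusions tag about {hit:.0f} of {n_genes} genes")
```

Under these assumptions two years of screening would tag roughly 5250 of 6000 genes. Duplicates make the last genes slow to collect.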
The transposon used in this screen has been improved to be removable by Cre protein. This
leaves a 93 amino acid insertion in an otherwise full length and normal protein. The 93
amino acids contain an HA epitope tag for immunofluorescence. All fusions that are in
frame are being saved and will be publicly available. The presence of the correct
C-terminus should allow correct localization of proteins that normally target via their
C-terminal sequences (KDEL proteins, etc.).
Kits are available to add these lacZ fusions or HA epitope tags to your own gene.
References to The Snyder Lab knockouts.
Kumar A, Cheung KH, Ross-Macdonald P, Coelho PS, Miller P, Snyder M.
TRIPLES: a database of gene function in Saccharomyces cerevisiae.
Nucleic Acids Res. 2000 Jan 1;28(1):81-4.
Ross-Macdonald P, Coelho PS, Roemer T, Agarwal S, Kumar A, Jansen R, Cheung KH, Sheehan A,
Symoniatis D, Umansky L, Heidtman M, Nelson FK, Iwasaki H, Hager K, Gerstein M, Miller P,
Roeder GS, Snyder M.
Large-scale analysis of the yeast genome by transposon tagging and gene disruption.
Nature. 1999 Nov 25;402(6760):413-8.
More transposon mutagenesis in yeast.
Dr. Snyder's lab is not the only lab doing this type of work. Kristin Chun and Mark Goebl
published a paper in 1996 describing a similar screen that focused on identifying haploid
lethal mutants that had defects in budding (The identification of transposon-tagged
mutations in essential genes that affect cell morphology in Saccharomyces cerevisiae.
Genetics 142, 39-50 1996). Dr. Chun was here in December 1996 to give a seminar on
this work.
The motivation behind this screen was to identify genes required in yeast budding. Many
of these genes had already been identified by conditional (temperature sensitive) mutant
screens. Chun and Goebl point out that some genes will be missed in a screen of
conditional mutants because these mutants might not be possible in some essential genes.
Therefore, they set out to make a collection of haploid lethal mutants on a massive scale and
search this set for budding defects.
The yeast DNA library was made of BamHI fragments and represented 6200 independent
fragments. These do not represent the whole yeast genome, because some are duplicates.
These fragments were transformed into E. coli and a transposon was inserted as we saw
earlier. The NotI fragments of the recovered transposon-inserted yeast DNA were
transformed into yeast (diploid) and gene disruptions were created by homologous
recombination. From these transformations, 34500 colonies were screened for haploid
lethal mutations.
This was done in a clever way using canavanine and cycloheximide (remember these are
used for counterselection to chase out plasmids with wild type CAN1 or CYH2 genes). In
this case a diploid strain was constructed that was heterozygous at each locus, so one copy
was mutant (resistant) and one copy was wild type (sensitive). Since the wild type alleles
are dominant, all diploids will die on media with cycloheximide and canavanine, but spores
that segregate the two wild type genes out will survive. This strain was transformed with
the knockout fragments that had URA3 as the selectable marker gene in the transposon. The
diploids were replica printed to a sporulation medium, causing them to form spores. The
spores were transferred to a medium with cycloheximide, canavanine and no uracil. This
medium killed all diploid cells and all spores that did not segregate the CYH2 and CAN1
wild type genes out while keeping the URA3 transposon. The chance of getting the required
allele at each of the three loci is 1/2, so this combination would happen in
1/2 x 1/2 x 1/2 = 1/8 of the spores. It would be less for genes that were closely linked to
CYH2 or CAN1 on the yeast chromosomes, so these genes are probably underrepresented
in this library. Since only 1/8th or fewer of the spores can grow on this medium, papillations
appear. The colonies that failed to form any papillations at all contain haploid lethal
knockouts. These are then recovered as viable diploids from the master plate.
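The 1/8 figure can be checked by listing every allele combination a spore can inherit. The sketch assumes the three loci are unlinked and assort independently, which is exactly the assumption that fails for insertions near CYH2 or CAN1. The allele labels are shorthand.

```python
from fractions import Fraction
from itertools import product

# Each heterozygous locus passes one of its two alleles to a spore.
loci = {
    "CAN1": ("CAN1", "can1"),                  # wild type (sensitive) or resistant
    "CYH2": ("CYH2", "cyh2"),                  # wild type (sensitive) or resistant
    "insertion": ("URA3-tagged", "untagged"),  # knockout copy or intact copy
}


def survives(spore):
    """A spore grows only if it is resistant to both drugs and carries URA3."""
    can, cyh, ins = spore
    return can == "can1" and cyh == "cyh2" and ins == "URA3-tagged"


spores = list(product(*loci.values()))   # all 2 x 2 x 2 equally likely genotypes
surviving = Fraction(sum(survives(s) for s in spores), len(spores))
print(f"{surviving} of spores can grow on the selective medium")
```

If the insertion is in an essential gene, that one surviving genotype is dead as well, so no papillae appear.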
Of the 34500 transformants screened, 4025 were haploid lethal (11.6%).
The original goal of this screen was to look for haploid lethal mutants that were defective in
bud formation, so all 4025 of these yeast were sporulated and examined under the
microscope to find budding defects as the cells died. 495 candidates were found. After
closer inspection 209 of these still met the criteria.
Initially, 39 of these insertions were shown by sequencing or PCR screens to be in or
near just four genes. Three were previously uncharacterized genes.
Later, 178 of the 209 knockouts were shown to be in 29 different genes, 10 of which were
new, meaning they had not been identified in previous screens for budding defects. One of
these appears to be a CDC kinase involved in cell cycle regulation. The last 31 knockouts
remain to be analyzed.
This is just the beginning of large scale genomic analysis. At our recitation, we
will cover genetic footprinting and molecular bar coding of yeast genes in highly parallel
methods. One uses PCR and the other uses DNA chip technology to examine large
numbers of genes without purification or sorting of the yeast strains.
Papers for discussion:
Genetic footprinting: a genomic strategy for determining a gene's function given its
sequence. Smith, V., Botstein, D. and Brown, P.O. PNAS 92, 6479-6483 (1995)
Quantitative phenotypic analysis of yeast deletion mutants using a highly parallel molecular
bar-coding strategy. Shoemaker, D.D., Lashkari, D.A., Morris, D., Mittmann, M. and
Davis, R.