CpG islands and the regulation of transcription
The Wellcome Trust Centre for Cell Biology, University of Edinburgh, Edinburgh EH9 3JR, United Kingdom
Vertebrate CpG islands (CGIs) are short interspersed DNA sequences that deviate significantly from the average genomic pattern
by being GC-rich, CpG-rich, and predominantly nonmethylated. Most, perhaps all, CGIs are sites of transcription initiation,
including thousands that are remote from currently annotated promoters. Shared DNA sequence features adapt CGIs for promoter
function by destabilizing nucleosomes and attracting proteins that create a transcriptionally permissive chromatin state.
Silencing of CGI promoters is achieved through dense CpG methylation or polycomb recruitment, again using their distinctive
DNA sequence composition. CGIs are therefore generically equipped to influence local chromatin structure and simplify regulation
of gene activity.
Vertebrate genomes are methylated predominantly at the dinucleotide CpG, and consequently are CpG-deficient owing to the mutagenic
properties of methylcytosine (; ). The globally methylated, CpG-poor genomic landscape is punctuated, however, by CpG islands (CGIs), which are, on average,
1000 base pairs (bp) long and show an elevated G+C base composition, little CpG depletion, and frequent absence of DNA methylation.
These shared properties have allowed CGIs to be isolated as a relatively homogeneous fraction of the genome, despite the heterogeneity
of their individual nucleotide sequences (; ; ). Approximately 70% of annotated gene promoters are associated with a CGI, making this the most common promoter type in the
vertebrate genome (). Included are virtually all housekeeping genes, as well as a proportion of tissue-specific genes and developmental regulator
genes (; ). Recent work has uncovered a large class of CGIs that are remote from annotated transcription start sites (TSSs), but nevertheless
show evidence for promoter function (; ). These findings emphasize the strong correlation between CGIs and transcription initiation.
In spite of their link with transcription, the functional significance of CGIs is only just beginning to emerge. CGI promoters
turn out to have distinctive patterns of transcription initiation and chromatin configuration. Their regulation involves proteins
(some of which specifically bind nonmethylated CpG) that influence the modification status of CGI chromatin. In addition,
the CpG moieties themselves are sometimes subject to cytosine methylation, which correlates with stable shutdown of the associated
promoter. Here we examine the properties shared by vertebrate CGIs and how transcription is regulated at these sites. Recent
related reviews include , , and .
Evolutionary conservation of CGIs
CGIs are distinct in vertebrates due to their lack of DNA methylation and absence of CpG deficiency, which sets them apart
from bulk genomic DNA. Organisms such as the invertebrates Drosophila melanogaster and Caenorhabditis elegans and the fungus Saccharomyces cerevisiae have little or no DNA methylation and, as a result, CpG occurs at the expected frequency throughout the genome. CGIs are
not detectable in these genomes because, in a sense, the whole genome is CGI-like. On the other hand, many plant genomes are
very highly methylated, and early research detected a nonmethylated CGI-like genomic fraction (). Lack of DNA methylation in plants is evidently linked to genes, as isolation of nonmethylated DNA by “methylation filtration”
greatly enriches for transcribed sequences (), despite the presence of gene body methylation in these organisms (; ). Extensive DNA methylome analysis has documented nonmethylated regions at both extremities of plant transcription units
(; ). Whether these can be considered equivalent to or distinct from vertebrate CGIs is not presently known.
Unlike vertebrates, most invertebrate animals exhibit mosaically methylated genomes comprising alternating methylated and
nonmethylated domains (; ). The persistence over evolutionary time of discrete genomic regions with different DNA methylation states has partitioned
DNA sequences into two fractions: (1) methylated and, consequently, CpG-deficient, and (2) nonmethylated, with the expected
frequency of CpG. The origin of vertebrates appears to have coincided with a transition from mosaic to global DNA methylation,
accompanied by CpG depletion throughout most of the genome (). Well-studied among the mosaic genomes is the invertebrate Ciona intestinalis (sea squirt), which is evolutionarily close to the invertebrate–vertebrate boundary. C. intestinalis genes within methylated domains are sometimes associated with short nonmethylated CGI-like regions that colocalize with TSSs
(). CGIs may therefore predate the evolution of vertebrates.
Until recently, it was not clear that CGIs were conserved in either number or genomic location between different vertebrates.
Initially, far fewer CGIs were bioinformatically predicted in the mouse genome than in the human genome (), and this apparent lack of conservation called into question their regulatory importance. CGI prediction algorithms by necessity
employ thresholds for detection, alteration of which dramatically changes the number predicted (; ). The algorithms are also unable to take into account the methylation status of CGIs. A biochemical approach has shed fresh
light on the issue, using affinity purification with the CXXC protein domain to isolate clusters of unmethylated CpGs from
genomic DNA (). High-throughput DNA sequencing of this fraction identified a comprehensive CGI complement from both humans and mice () and revealed similar numbers of CGIs per haploid genome: 25,495 and 23,021, respectively. The reason for the initial discrepancy
is that mouse CGIs show, on average, a slightly lower CpG content compared with human CGIs. Biochemical purification of clusters
of nonmethylated CpG overcomes this weakness. Also conserved between humans and mice was the proportion of CGIs associating
with annotated TSSs (~50%). Moreover, the remaining half of CGIs were distributed equally between locations within gene bodies
(intragenic) or between genes (intergenic) in both species (). The genomic position of many of these additional CGIs appears to have been maintained since the divergence of humans and
mice ~75 million years ago, implying functional importance.
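The threshold sensitivity of sequence-based CGI prediction described above can be illustrated with a minimal sketch. It scores sliding windows by G+C fraction and by the observed/expected CpG ratio, and reports windows that pass both cutoffs. The function names and the default cutoffs (200-bp window, G+C of at least 0.5, observed/expected CpG of at least 0.6) are assumptions chosen to resemble commonly used predictor settings; they are not taken from this review, and the sketch ignores methylation status just as the algorithms discussed here do.

```python
def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G)."""
    seq = seq.upper()
    n_c, n_g = seq.count("C"), seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (n_c * n_g)


def cgi_like_windows(seq, window=200, step=1, min_gc=0.5, min_oe=0.6):
    """Start positions of windows passing both cutoffs.

    The default cutoffs are illustrative assumptions, not values from the text.
    """
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        w = seq[start:start + window]
        if gc_fraction(w) >= min_gc and cpg_obs_exp(w) >= min_oe:
            hits.append(start)
    return hits


# The same sequence is called CGI-like or not depending only on the cutoff.
toy = "CG" * 100
print(cgi_like_windows(toy))              # passes the default cutoffs
print(cgi_like_windows(toy, min_oe=2.5))  # fails a stricter CpG cutoff
```

Raising or lowering `min_gc` and `min_oe` changes which windows are reported, which is one way to see why a slightly lower average CpG content in mouse CGIs could lead to far fewer predictions under fixed thresholds.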
The genomic distribution of CGIs. (A) CGIs can be located at annotated TSSs, within gene bodies (Intragenic), or between annotated genes (Intergenic). Intragenic
and intergenic CGIs of unknown function are classed as “orphan” CGIs. (Empty circles) Unmethylated CpG residues. (Filled circles)
Methylated CpG residues. (B) The genomic distribution of CGIs in the human and mouse genome as determined by Illingworth and colleagues (2010). The total
number of CGIs is given at the top of each graph.
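The three location classes in this legend amount to an interval test against gene annotation, which the following sketch makes explicit. It assumes closed coordinates, one TSS per gene taken from the strand-appropriate end of the gene, and a CGI counted as TSS-associated only if it contains that position. These conventions and the function name `classify_cgi` are illustrative assumptions; the criteria used in the cited study may differ.

```python
def classify_cgi(cgi, genes):
    """Classify a CGI as 'TSS', 'intragenic', or 'intergenic'.

    cgi   : (start, end), closed interval with start <= end
    genes : list of (start, end, strand) with start < end and strand '+' or '-'
    """
    c_start, c_end = cgi
    overlaps_gene = False
    for g_start, g_end, strand in genes:
        # Assumed convention: the TSS is the 5' end of the annotated gene.
        tss = g_start if strand == "+" else g_end
        if c_start <= tss <= c_end:
            return "TSS"
        if c_start <= g_end and g_start <= c_end:
            overlaps_gene = True
    return "intragenic" if overlaps_gene else "intergenic"


genes = [(1000, 5000, "+"), (8000, 9000, "-")]
for cgi in [(900, 1100), (3000, 3500), (6000, 6500), (8900, 9100)]:
    print(cgi, classify_cgi(cgi, genes))
```

Under this scheme, the intragenic and intergenic classes together correspond to the "orphan" CGIs of the legend.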
CGIs are sites of transcriptional initiation
About half of all CGIs self-evidently contain TSSs, as they coincide with promoters of annotated genes. The other half are
either within or between characterized transcription units and have been termed “orphan” CGIs to reflect uncertainty over
their significance (). Do orphan CGIs weaken the correlation between CGIs and transcriptional activity, or do they also mark hitherto unsuspected
promoters? The available evidence shows that many orphan CGIs are also sites of transcriptional initiation. Specific examples
of intragenic CGI promoters have been known for many years. For example, CGIs at the 3′ end of the Pomc gene and exon 2 of the MHC class II I-Aβ gene both initiate transcripts of unknown function whose coding potential is minimal (; ). More is known functionally about the role played by the Air transcript in imprinting of the Igf2r gene. Air is a noncoding RNA (ncRNA) that initiates at a CGI within intron 2 of Igf2r and is essential for silencing of the paternal allele (). Similarly, analysis of a CGI in intron 10 of the imprinted Kcnq1 gene identified it as the origin of a noncoding transcript (Kcnq1ot1) that is required for imprinting of several genes within this domain (, ). In these and other cases, the presence of a CGI in an unexpected location stimulated a successful search for an associated
transcript.
Genome-wide analyses have confirmed that many orphan CGIs represent novel promoters (; ). Criteria for transcript annotation include colocalization of a CGI with bound RNA Polymerase II (RNAPII) as detected by
chromatin immunoprecipitation (ChIP) or, more tellingly, RNA sequence data showing that transcripts originate within orphan
CGIs (; ; ). Examples of the latter data include cap analysis of gene expression (CAGE), which uses the 5′ cap to isolate and sequence
full-length transcripts (), and global run-on sequencing (GRO-seq) (), which detects in vitro elongation of engaged RNAPII. Altogether, evidence for transcriptional initiation has been found
at ~40% of orphan CGIs (~5000) (). Less directly, CGIs are frequently marked by trimethylation of histone H3 at lysine 4 (H3K4me3), which is a signature of active promoters
(see below). A subset of intergenic H3K4me3 peaks (~1600), many of which are likely to correspond to orphan CGIs, were found
to represent TSSs for long ncRNAs ().
Orphan CGIs are sites of transcriptional initiation. High-throughput sequencing data showing colocalization of orphan CGIs
with sites of transcriptional initiation taken from . CXXC affinity purification identifies the locations of CGIs that overlap with H3K4me3, RNAPII, GRO-seq (), and CAGE tags (). Genes (RefSeq) are annotated below the sequencing profiles, with those mapped to the positive and negative strand displayed above and below the chromosome line, respectively. Orphan CGIs are denoted by asterisks.
Many orphan CGI promoters are active in a tissue-specific manner (), suggesting that they are tightly regulated. Because only a few tissues have been investigated so far, it is likely that
most, if not all, orphan CGIs will be associated with a novel transcript. What is the functional significance of this transcription?
Some orphan CGIs probably represent alternative promoters of nearby annotated genes (). Others may initiate ncRNAs that regulate gene expression. For example, the study of imprinted genes has already uncovered
key examples of the regulatory relevance of CGI-derived ncRNAs (see above). In addition to Air and Kcnq1ot1, the Xist and Tsix ncRNAs function in X-chromosome inactivation (; ), and the ncRNA HOTAIR has been reported to regulate Hox gene expression (). Given the hitherto unsuspected abundance of CGI promoters in the genome, it seems likely that involvement of ncRNAs in
gene regulation may be widespread. Recent studies have shown that enhancers are often associated with transcription of ncRNAs,
although these have not been linked so far to CGIs (). In a different vein, CGI promoters within MHC class II genes coincide precisely with the hypervariable exon 2, whose polymorphism
is caused by gene conversion between members of the class II gene family. It has been suggested that the “open” CGI chromatin
structure in germ cells enhances gene conversion in this exon, thereby benefiting the immune system (). Potential roles as boundaries or insulators of transcriptional units remain as yet unexplored. In summary, the biological
roles played by CGI transcripts may be diverse, but the stage is now set for rigorous testing of hypotheses concerning their function.
It is clear from the above discussion that CGIs act as promoters in mammalian genomes. This leaves two possibilities regarding
their significance: (1) CGIs are evolutionary footprints of molecular events that occur at many eukaryotic promoters, but
are only visible in organisms that have extensive genomic DNA methylation, or (2) CGIs are important regulatory structures
that have evolved under selection in genomes where DNA methylation plays a regulatory role. The weight of evidence now appears
to support the second of these possibilities. In what follows, we summarize this emerging data and discuss ways in which CGIs
are adapted for promoter function.
Characteristics of CGI promoters
DNA sequence motifs
CGIs colocalize with the majority of promoters in both the human and mouse genomes. Early studies suggested that CGI promoters
may often lack TATA boxes and display heterogeneous TSSs (). CAGE analysis has shown, on a genome-wide scale, the broad correlation between this distributive pattern of transcription
initiation, typically over a region of 50–100 bp, and the presence of CGIs (). These observations are compatible with the idea that CGI promoters adopt a transcriptionally permissive state within which
initiation can occur at a number of locations. In general, TATA boxes, along with other core promoter elements (such as the
BRE, DPE, and DCE), tend to be associated with focused transcriptional initiation, whereas CGIs tend to lack these elements
and display dispersed initiation patterns (for review, see ). There are, however, exceptions to this generalization. The human genes for α-globin, MyoD1, and erythropoietin, for example,
have CGI promoters, yet possess TATA boxes.
The idea that many CGI promoters are transcriptionally permissive is supported by genome-wide ChIP and transcriptome analysis.
RNAPII is bound at the CGI promoters of many inactive genes in embryonic stem (ES) cells () and at silent lipopolysaccharide (LPS)-inducible genes in unstimulated macrophages (). Also, global nuclear run-on analysis examining the products of transcriptionally engaged RNAPII molecules detected bidirectional
transcription of short nonproductive RNAs as well as full-length transcripts at many CGI promoters (; ). Much emphasis had been placed previously on the recruitment of RNAPII as the rate-limiting step in transcription, but these
results suggest that regulation of many CGI promoters takes place downstream from polymerase binding. One way of achieving
this is by regulating transcriptional elongation and mRNA processing, as in the case of CGI-associated inducible genes in
macrophages (). In the latter study, RNAPII recruitment was dependent on the transcription factor Sp1, but the initiating form of RNAPII
(phosphorylated at Ser 5 of the C-terminal domain [CTD]) was promoter-bound even when the genes were inactive and low levels
of full-length transcripts were produced (). A switch from nonproductive to productive transcription was triggered by inducible transcription factor-dependent recruitment
of P-TEFb and subsequent Ser 2 phosphorylation of the RNAPII CTD. This resulted in the production of mature, processed transcripts.
Similarly, gene regulation at the level of transcriptional elongation via RNAPII pausing (controlled by the pause factors
DSIF and NELF) and release of RNAPII mediated by P-TEFb (for review, see ) is reported to be widespread in ES cells (). CGI promoters, it seems, attract RNAPII and, unless actively restrained, will engage in transcription of some kind.
Transcription factor binding at CGIs
CGIs share little long-range sequence conservation, apart from an elevated CpG density and G+C content, and often lack core
promoter elements such as the TATA box (; ). What features adapt them to promoter function? A simplistic possibility is that GC richness increases the probability that
ubiquitous transcription factors will bind. In general, mammalian transcription factor-binding sites are more GC-rich than
the bulk genome (see ) and many contain CpG in their recognition sequence. These include the general transcription factor Sp1, which has been shown
to recruit TATA-binding protein (TBP) to promoters lacking a TATA box (). Transient reporter gene assays examining the activity of 4575 human promoters found that ubiquitously active CGI promoters
tended to be enriched for Sp1, Nrf-1, E2F, and ETS transcription factor-binding motifs, each of which contains a CpG (). Consistent with this, potential ETS and E2F family binding sites, as characterized in vitro, are overrepresented in mouse
CGIs (). Similarly, CpG-containing ETS, NRF-1, BoxA, SP1, CRE, and E-box motifs are enriched in the CGI promoters of housekeeping
genes (), and Sp1 binding on human chromosomes 21 and 22 is focused predominantly at CGIs (). In the case of the α-globin gene, DNA footprinting studies detected little difference in transcription factor-binding patterns
whether the gene was active or inactive (), suggesting that, even in the case of a highly tissue-specific gene, transcription factor binding can be constitutive.
Transcription factor-binding sites are, on average, GC-rich. The G+C content of binding sites for 46 mouse transcription factors
was calculated using fasta sequences of actual binding sites obtained from the JASPAR database (). G+C frequency is given on the X-axis, while the Y-axis shows the number of each of these 46 binding sites with a given G+C frequency.
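The calculation described in this legend can be sketched as follows: read binding-site sequences in FASTA format, compute the G+C fraction of each, and count sites per G+C bin. The bin count, the function names, and the three toy sequences (a GC box, a TATA box, and an E-box) are assumptions for illustration only; they are not JASPAR records, and the binning used for the published figure is not stated in the text.

```python
from collections import Counter


def parse_fasta(text):
    """Parse FASTA-formatted text into a {header: sequence} dict."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None:
            records[name] += line.upper()
    return records


def gc_histogram(records, n_bins=10):
    """Count sequences per G+C bin; bin i covers [i/n_bins, (i+1)/n_bins)."""
    counts = Counter()
    for seq in records.values():
        if not seq:
            continue
        gc = sum(base in "GC" for base in seq)
        # Integer arithmetic avoids floating-point edge effects at bin borders;
        # a G+C fraction of exactly 1.0 is placed in the top bin.
        idx = min(gc * n_bins // len(seq), n_bins - 1)
        counts[idx] += 1
    return dict(counts)


toy_sites = """>GC_box
GGGCGG
>TATA_box
TATAAA
>E_box
CACGTG
"""
print(gc_histogram(parse_fasta(toy_sites)))
```

Plotting the bin index on the X-axis against the count on the Y-axis would give a histogram of the kind the legend describes.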
The chromatin signature of CGIs
Unstable nucleosomes
There is evidence that nonmethylated CGIs are organized in a characteristic chromatin structure that predisposes them toward
promoter activity. A study of chromatin at LPS-inducible genes in macrophages found that CGIs are relatively nucleosome-deficient
(). Inducible “primary response genes” fall into two classes: those that require SWI/SNF chromatin remodeling complexes for
their activation and those that do not. It was noticed that these groups corresponded with non-CGI and CGI promoters, respectively,
suggesting that DNA in CGI chromatin is intrinsically accessible without the need for ATP-dependent nucleosome displacement
(). In macrophages, the CGIs showed a reduced density of histone H3 even in the uninduced state. Accordingly, in vitro nucleosome
assembly indicated that a set of these CGIs is significantly more reluctant to assemble into nucleosomes than other genomic
DNA (). An attractive interpretation of the in vitro instability of CGI chromatin is that weakening of this barrier allows greater
accessibility of the underlying DNA to transcriptional regulators in vivo. Other evidence has shown that nucleosome deficiency
is a feature of CGI promoters in general. Early analysis of CGI chromatin detected abundant nonnucleosomal DNA that was absent
in preparations of bulk chromatin (). The same conclusion emerged from a re-examination of genome-wide nucleosome mapping data (; ). In addition to chromatin instability, nucleosome deficiency in vivo may also arise because CGI promoters, in common with
all eukaryotic promoters, typically possess a nucleosome-free region surrounding the TSS (). In other words, active promoters by definition may be nucleosome-deficient whether or not they are CGIs. It is not yet
certain whether nucleosome deficiency at CGIs is due primarily to intrinsic chromatin instability or nucleosome exclusion
due to the presence of the transcription initiation complex. It could, of course, be a mixture of both, and may even vary
between individual CGIs.
Characteristic histone modifications
Early biochemical studies of isolated CGI chromatin showed high levels of histone H3 and H4 acetylation, which are characteristic
of transcriptionally active chromatin. Histone H1, on the other hand, which is regarded as antagonistic to transcription,
was depleted in this fraction (). Genome-wide studies have confirmed this association at high resolution (; ) and have revealed that H3K4me3 is a signature histone mark of CGI promoters, often persisting even when the associated gene
is inactive (; ). Recent work has established a biochemical connection between the abundance of CpG in CGIs and H3K4me3, mediated by a CXXC
domain protein that binds specifically to nonmethylated CpG (). Cfp1 (CXXC finger protein 1; also known as CGBP) is an integral component of the Setd1 H3K4 methyltransferase complex () and localizes to the vast majority of CGIs in the mouse genome, suggesting dependence of this histone modification on the
DNA sequence (). In keeping with this model, depletion of Cfp1 reduces H3K4me3 at many CGIs. Importantly, insertion of an artificial CGI-like
DNA sequence into the genome results in recruitment of Cfp1 and creates a novel peak of H3K4me3 in the absence of RNAPII (). Further support for a mechanistic link between unmethylated CpG residues, Cfp1, and H3K4me3 comes from the finding that
CpG density in CGIs correlates positively with H3K4me3 levels (). The ability of CpG density alone to directly influence the chromatin modification state () is likely to be a key function of CGIs.
The presence of H3K4me3 appears to facilitate transcription in a number of ways. The H3K4me3 tail has been shown to interact
with the NuRF chromatin remodeling complex (; ; ) as well as ING4-containing histone acetyltransferase complexes (). Also, H3K4me3 interacts with the transcriptional machinery directly, as the core transcription factor TFIID has an affinity
for the H3K4me3 mark (; ). Core transcriptional machinery can recruit H3K4 methyltransferases to chromatin (for review, see ), so it is likely that transcription also contributes to H3K4me3 at CGIs. The relative contributions of CpG-mediated and
transcription-mediated H3K4me3 to CGI chromatin modification have yet to be determined.
Another distinctive feature of CGI chromatin is depletion of histone H3K36 dimethylation (H3K36me2) compared with non-CGI
promoters and gene bodies (). The H3K36me2 histone demethylase Kdm2a () is a CXXC domain protein that, like Cfp1, binds specifically to nonmethylated CpG. Accordingly, Kdm2a is bound in vivo at
~90% of CGIs in mouse ES cells and mediates demethylation of H3K36me2 at these regions (). Why H3K36me2 should be depleted at CGIs is uncertain, but this modification has been reported to inhibit transcriptional
initiation through histone deacetylase (HDAC) recruitment in yeast (; ; ). H3K36me2 depletion may therefore contribute to a transcriptionally permissive state at CGIs (). Both Cfp1 and Kdm2a have the characteristics of CGI-specific proteins that use CpG density to influence chromatin modification.
It is likely that more factors of this kind are yet to be identified, for example, by a comprehensive characterization of
the CGI proteome.
The chromatin state at CGIs. (A) CGIs usually exist in an unmethylated transcriptionally permissive state. They are marked by histone acetylation (H3/H4Ac)
and H3K4me3, which is directed by Cfp1, and show Kdm2a-dependent H3K36me2 depletion. Nucleosome deficiency and constitutive
binding of RNAPII may also contribute to this transcriptionally permissive state. (B) DNA methylation is associated with stable long-term silencing of CGI promoters. This can be mediated by MBD proteins, which
recruit corepressor complexes associated with HDAC activity, or may be due to directed inhibition of transcription factor
binding by DNA methylation. (C) CGIs can also be silenced by PcG proteins and may be key elements involved in polycomb recruitment. An unknown CGI-binding
factor could be responsible for recruiting PRC2 to CGIs that then trimethylates H3K27. This H3K27me3 is recognized by PRC1
complexes that act to impede transcriptional elongation, thereby silencing genes. Note that the transcriptionally permissive
and polycomb-repressed states can coexist at bivalent CGIs, predominantly in totipotent embryonic cells.
CGI promoter silencing by DNA methylation
CGIs are typically in a nonmethylated state in an otherwise heavily methylated genome, even when the corresponding gene is
transcriptionally inactive. There are, however, well-known examples of CGIs that become methylated during normal development,
leading to stable silencing of the associated promoter (; ; ). Silencing is thought to be either due to direct inhibition of transcription factor binding by DNA methylation or mediated
by methyl-binding domain (MBD) proteins that recruit chromatin-modifying activities to methylated DNA ( for reviews, ). It seems that CGI methylation is not the initiating event in gene silencing, but acts to lock in the silent state. For
example, during X-chromosome inactivation in female eutherian mammals, X-linked CGIs do not become methylated until after
gene silencing and the acquisition of several silencing chromatin modifications, such as H3K27me3 (for reviews, ). CGI methylation is, however, essential for maintenance of leak-proof X-chromosome inactivation, as inhibition of DNA methylation
leads to gene reactivation in a fraction of cells (; ). As discussed above, CGI methylation also has well-characterized roles in genomic imprinting where parent-of-origin monoallelic
expression is controlled by CGI methylation marks (for review, see ). In several instances, the CGIs concerned act as promoters for ncRNAs whose expression is silenced by DNA methylation. Expression
of these ncRNAs—including Air and Kcnq1ot1—is responsible for the silencing of neighboring genes (; , ).
Until recently, methylation of CGIs during imprinting and X-chromosome inactivation was suspected to be a special case, with most
CGIs remaining nonmethylated regardless of gene expression. However, genome-wide studies focusing on CGIs at annotated TSSs
have uncovered numerous instances of CGI methylation in normal somatic cells. CGIs in the germline are almost invariably nonmethylated,
but a small proportion acquire methylation in somatic tissues (; ; ). Similarly, a small number of CGI promoters acquire methylation during differentiation of ES cells into neurons, with most
of the changes taking place during the early stages of differentiation (). The majority of CGIs that gain methylation during differentiation are already silent in ES cells (), providing further evidence that silencing precedes DNA methylation. The genes affected are often expressed only in the
germline, such as the MAGE family of testis-specific antigens (). Differences in gene-associated CGI methylation between one somatic tissue and another have also been reported, although
these are relatively rare compared with differences between germline and somatic CGIs ().
In contrast to the rarity of methylated CGIs at the promoters of annotated genes, orphan CGIs are methylated much more frequently.
About 17% of orphan CGIs have been found so far in a methylated state, compared with ~3% of CGIs at annotated gene promoters
(, ; ; ). By further separating orphan CGIs into intragenic and intergenic categories, it becomes apparent that intragenic CGIs are
especially prone to methylation (~20%–34%) (; ). Accordingly, CGIs located within gene bodies show the greatest number of DNA methylation differences between different
somatic cells and tissues (; AM Deaton, S Webb, ARW Kerr, RS Illingworth, J Guy, R Andrews, and A Bird, in prep.). Functionally, it can be speculated that
some of the transcripts initiating from gene body CGIs are regulatory ncRNAs whose presence or absence affects expression
of the associated protein-coding gene or a nearby gene (; ; ). Another possibility is that these sites of unusual chromatin and transcription affect alternative splicing of the gene
in which they are located in a manner that differs with methylation status (). It is also possible that a methylated CGI within a gene body down-regulates transcriptional elongation, as reported in
a transgenic cell model (). Further studies are required to elucidate the consequences of methylation at these sites.
In addition to its occurrence during normal development, CGI methylation has been well documented in cancer. The CGIs of several
tumor suppressor genes acquire cancer-specific methylation, and many genes involved in familial forms of cancer undergo DNA
methylation-associated silencing in sporadic cancers (for review, see , ). These changes are thought to contribute to uncontrolled proliferation and thus tumor development. However, whether DNA
methylation is the initiating event in gene silencing or is acquired at already silenced genes remains unclear. Genome-wide
studies have demonstrated that many of the CGIs that acquire aberrant methylation in cancer are not associated with tumor
suppressor genes (; ; ; ). A key issue regarding cancer-specific CGI methylation is whether it mirrors normal developmental methylation. Profiling
DNA methylation in the flanks of CGIs suggested that cancer-specific methylation patterns resemble those occurring in normal
tissues (). A recent study, however, suggested that cancer-specific CGI methylation can be distinguished from that in normal tissues.
Unlike normal tissues, where orphan CGIs are preferentially methylated, in colorectal cancers, the proportion of TSSs, intragenic
CGIs, and intergenic CGIs acquiring tumor-specific methylation was approximately equal ().
CGIs and polycomb-mediated silencing
In addition to DNA methylation, CGI promoters can be silenced by polycomb group proteins (PcG). There are two PcG complexes
in mammals: polycomb-repressive complex 1 (PRC1) and PRC2. PRC2 mediates H3K27me3, and this mark is recognized by PRC1, which
is thought to inhibit transcriptional elongation by a mechanism involving H2A ubiquitylation (; ) and/or chromatin compaction (). Differentiation of ES cells into neurons is accompanied by losses and gains of H3K27me3 at many promoters at various stages
of differentiation. In contrast, DNA methylation is gained at a relatively small number of promoters only during the early
stages of differentiation, remaining relatively stable thereafter (). This indicates that polycomb is a more dynamic repression system than DNA methylation, at least during later developmental
stages. In ES cells, CGIs silenced by polycomb possess the “active” mark H3K4me3 as well as H3K27me3 (; ; ). These “bivalent” CGI promoters are poised between two alternative states: either active transcription or stable repression.
Upon differentiation, they can lose H3K27me3 and become active or be subject to more stable transcriptional repression. Bivalent
CGI promoters account for approximately one-fifth of CGI promoters in ES cells () but are also found in other cell types, although to a lesser extent (; ).
Acquisition of CGI methylation in cancer has been found to occur preferentially at genes marked by H3K27me3 in ES cells (; ; ). By mirroring the pattern of polycomb-dependent gene silencing established in ES cells, DNA methylation may hypothetically
facilitate maintenance of a “pseudo-pluripotent state,” thereby favoring unrestrained proliferation of cancer cells (; ; ). It has also been found that promoters that are positive for H3K27me3 in ES cells are more than four times more likely to
acquire DNA methylation, supporting the idea that there is a relationship of some sort between the two systems (). One report claimed that the PcG protein Ezh2, which catalyzes H3K27me3, can recruit DNA methyltransferases directly (), but this mechanistic link remains unproven. In many respects, DNA methylation and polycomb behave as alternative silencing
mechanisms in ES cells (; ).
Interestingly, polycomb tends to be targeted to CGI-containing regions of the genome, whereas non-CGI promoters do not often associate with H3K27me3. The observation that some CGIs are specifically targeted by polycomb led to the hypothesis that these DNA sequences are prone to polycomb recruitment. Although H3K27me3 often marks large genomic domains, binding of Ezh2 (the catalytic component of PRC2) in human ES cells strongly correlates with CGIs and, in particular, those lacking motifs for activating transcription factors (such as Ets1 and 2, NFY, YY1, and c-Myc). Intriguingly, a polycomb recruitment element identified in human cells is a nucleosome-deficient sequence that includes a CGI. A recent study provided functional evidence of a role for CGIs in polycomb recruitment. The CGI of the bivalent Zfpm2 locus was found to be sufficient to recruit H3K27me3, H3K4me3, and PRC2 to a human gene desert region inserted into mouse ES cells. Interestingly, GC-rich DNA from a bacterial source also had the ability to recruit PRC2, suggesting that base composition plays an important role.
As the majority of CGIs do not normally recruit polycomb, base composition per se is very unlikely to be the only factor involved in attracting or excluding polycomb. To explain this selectivity, it was proposed that the presence of transcription factor-binding motifs within a CGI might be sufficient to protect it from PRC2 recruitment. Consistent with this hypothesis, deletion of activating motifs at a CGI promoter that normally lacks H3K27me3 in ES cells resulted in PRC2 recruitment and modification of H3K27. This suggests that transcription factor binding or productive transcription may be sufficient to protect against silencing by polycomb. A corollary of this hypothesis is that polycomb-mediated silencing, like DNA methylation, is secondary to gene silencing by other mechanisms. This fits with early work on Drosophila showing that polycomb group proteins do not affect the establishment of developmental gene expression patterns, but are essential to maintain those patterns through time. Indeed, recent experiments in Drosophila indicate that polycomb is preferentially targeted to stalled promoters of coding and noncoding transcripts or to forms of RNA polymerase that are not competent to couple RNA synthesis with cotranscriptional modification. Transcriptionally silent CGIs may not always be sufficient to invite polycomb, however, as PRC2 recruitment is proposed to depend on transcription of ncRNAs from CGIs. Short transcripts produced from bivalent CGIs in ES and T cells are reported to form stem–loop structures that may be involved in recruiting polycomb to these CGIs. In addition, the JmjC domain-containing protein Jarid2 has been identified as a novel component of PRC2 and has been implicated in polycomb targeting in mammals. We have clues about the mechanisms behind polycomb recruitment to specific sites in the mammalian genome and the role played by CGIs in this process, but much remains to be learned.
What protects CGIs from DNA methylation?
CGIs are generally unmethylated CpG-enriched domains that occur against a backdrop of genome-wide DNA methylation and consequent CpG depletion. How are they protected from the waves of de novo DNA methylation that occur during development? A simple explanation would be that CGIs are intrinsically refractory to the action of DNA methyltransferases, but this is very unlikely. Some CGIs do become methylated during development; for example, hundreds of CGIs are heavily methylated on one female X chromosome, but are nonmethylated on the other. An alternative possibility is that methylation at CGIs is actively removed by a DNA demethylase, but uncertainty concerning the identity of the DNA demethylating activities in animals has made this hypothesis difficult to test. The discovery that 5-methylcytosine can be converted to 5-hydroxymethylcytosine (hmC) raises the tantalizing possibility that hmC is an intermediate in the demethylation process. The 5-methylcytosine hydroxylase Tet1 possesses a domain related to the CpG-binding CXXC domain found in Cfp1, and could in theory be targeted to CGIs. Compatible with this hypothesis, recent studies locate Tet1 preferentially at CGIs in mouse ES cells, and depletion of Tet1 results in increased CpG methylation at CGIs. An attractive scenario is that CGIs are subject to sporadic de novo methylation, but are continually swept clean by a mechanism involving oxidation of 5-methylcytosine. Defects in such a system may predispose to de novo CGI methylation, as seen in many cancers. Indeed, mutations in the TET2 gene are frequent in leukemias and compromise the hydroxylation reaction.
Regardless of the detailed molecular mechanism, there is evidence that the methylation-free state of CGIs is causally related to their function as promoters. Deletion or mutation of Sp1 transcription factor-binding sites in the mouse Aprt promoter results in a failure to maintain the unmethylated state of the Aprt CGI in transgenic experiments. This suggests that the binding of transcription factors, or the act of transcription itself, during early development is required for establishment of the DNA methylation-free state. A prediction of this hypothesis is that all CGI promoters should be active during the waves of de novo methylation that occur at the blastocyst stage and in developing germ cells of the embryo. Indeed, studies of a small number of highly tissue-specific genes with CGI promoters showed that several are expressed in early embryonic cells. Large-scale analyses have strengthened this relationship, showing that 90% of genes with CGI promoters are expressed in the early embryo or testis. The idea that transcription is antagonistic to CGI methylation also fits well with the observation that the presence of RNAPII at CGIs, irrespective of gene activity, is associated with resistance to DNA methylation in cancer. DNA sequence-specific transcription factors have also been implicated in preventing DNA methylation, as cooperative binding of the transcription factors Sp1, Nrf-1, and YY1 in normal monocytes correlates with protection from CGI methylation in leukemia cells. In this case, CGI promoters bound by these factors also had the highest expression levels. A survey of human CGIs reinforced the correlation with transcription factor binding. Subsets of CGIs that become methylated or remain nonmethylated during development were screened for DNA sequence motifs that might influence this decision. An algorithm based on these motifs, several of which were transcription factor-binding sites, was able to predict which CGIs would be immune to DNA methylation, again suggesting that the switch is dependent on cis-acting factors.
There is evidently a close relationship between transcription in the early embryo and lack of CGI methylation, but mechanisms that relate the two are unknown. It has been proposed that origins of DNA replication (ORIs) are the missing link. Based on evidence that CGIs often colocalize with ORIs, it was speculated that intermediates in the process of replication initiation lead to local exclusion of DNA methylation and, over time, an altered base composition. A large fraction of CGIs (83%) colocalizes with ORIs in ES cells, which may reflect the situation in the early embryo. A causal relationship between ORI function and CGIs has yet to be established, however, and therefore other kinds of DNA-based metabolism might be responsible for excluding DNA methylation from these regions. For example, CGI promoters are typically loaded with polymerases that create short abortive transcripts even when the associated gene is inactive. This “futile” transcription cycle may somehow protect CGIs from the action of DNA methyltransferases, allowing these “silent” promoters to exclude DNA methylation.
Another possible explanation for the immunity of most CGIs to DNA methylation is that their signature chromatin mark, H3K4me3, interferes with DNA methyltransferase activity. Chromatin binding of Dnmt3L, a partner protein of the de novo DNA methyltransferases Dnmt3a and Dnmt3b, is inhibited by H3K4me3. Importantly, the ADD domains of both Dnmt3a and Dnmt3b also fail to interact with H3K4me3, and these enzymes are catalytically less active in vitro on chromatin containing this modification compared with unmodified or H3K9me3-modified chromatin. The multiple potential mechanisms for preventing CGI methylation, including those discussed above, are not mutually exclusive and may act in concert. It is noteworthy that a promoterless CpG-rich sequence marked by Cfp1 binding and H3K4me3 was only partially immune to DNA methylation, suggesting that this chromatin modification by itself is insufficient. It may be that a combination of factors, perhaps including initiation of transcription, is required to exclude DNA methylation from CGIs.
Concluding remarks regarding CGI function
CGIs represent a dispersed but coherent DNA sequence class whose members function as genomic platforms for regulating transcription at their associated promoters. These properties depend on the shared features of their DNA sequence, notably CpG richness and a higher than average G+C content. Paradoxically, CpG richness by itself attracts protein complexes that promote H3K4me3, but G+C-rich DNA also recruits H3K27me3. An equivocal activity state due to the coexistence of these contradictory tendencies prevails at bivalent CGIs, particularly in ES cells, where transcription decisions at many promoters are pending. Of course, CpG is also the substrate for DNA methyltransferases, providing another opportunity for regulation. Why should CGI DNA be poised to set up both active and inactive chromatin structures? A possible reason is to create tension between these opposing states and thus facilitate decisive switching. A nonbiological analogy is the spring in an electrical switch, which ensures that a light can be flicked easily between on and off states that are subsequently stable. A key part of such a “spring” model is that features of the silent or active condition should be self-reinforcing once a decision has been made. Biochemically, this can be achieved by positive feedback on pathways that emphasize the chosen state, or by inhibition of opposing pathways. For example, H3K4me3 can attract the ING4 zinc finger protein, which indirectly augments acetylation of histone H3 tails, thereby emphasizing the active chromatin configuration. At the same time, this mark inhibits the action of Dnmts, which might otherwise impose transcriptional silence. It is likely that a cascade of related positive and negative feedback loops ensures spring-like stability. Indeed, the more we learn about the marking of chromatin, the more loops of this kind are uncovered, suggesting that consolidation of transcriptional states could be a major role for these epigenetic systems. How is the decision between activity and silence made? The evidence suggests that DNA sequence-specific transcriptional regulators are often ultimately responsible, and that their influence, and that of the proteins they recruit, is dominant over the tendencies imposed by CGI DNA. It is notable that the ability of GC-rich DNA to recruit H3K4me3 and H3K27me3 was detected using promoterless exogenous DNA domains. Thus, factors that repress or activate transcription appear to exaggerate the chromatin changes to which CGIs are already prone. The construction of abrupt switches based on biochemical processes that are often continuously variable is a challenge in many biological processes. CGIs may provide one solution at the level of transcriptional regulation.
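The two sequence features on which these properties are said to depend, G+C content and CpG richness, can be quantified for any DNA string. The following is a minimal sketch rather than a method taken from this review: the function names are hypothetical, and the observed/expected CpG ratio uses the conventional formula (CpG count × sequence length, divided by C count × G count). No threshold for calling a CGI is assumed.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Return observed CpG count divided by the count expected from base composition.

    Expected CpG = (number of C * number of G) / sequence length.
    Returns 0.0 when the sequence has no C or no G, so no CpG is possible.
    """
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    cpg_count = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    return cpg_count * len(seq) / (c_count * g_count)
```

A CpG-depleted bulk genomic sequence would be expected to give a ratio well below 1, whereas a CGI, which shows little CpG depletion, would give a value closer to 1.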
Acknowledgments
We thank Shaun Webb for analyzing the base composition of transcription factor-binding sites. We are also grateful to Thomas
Clouaire, Jim Selfridge, and Elisabeth Wachter for helpful comments on the manuscript. The Bird laboratory is funded by The
Wellcome Trust, the Medical Research Council, and Cancer Research UK.
1 Corresponding author.
E-MAIL a.bird{at}ed.ac.uk; FAX 44-131-6505379.
