Classical 1D lattice models for DNA-ligand binding have a long history (for review see [15], [40]).
Together with a related class of lattice models for the DNA helix-coil transition, these are the descendants of the famous Ising model of ferromagnetism [41]. A common theme of these models is that the DNA is considered as a lattice of units (base pairs, bp), each of which can be in different states. An important conclusion from these studies is that non-specific binding of a large ligand to DNA in the presence of boundaries (close to the ends of the DNA segment or close to some other obstacles) results in a non-random periodic oscillation of the ligand binding probability [42].
Kornberg and Stryer proposed that a similar effect accounts for regular oscillations of preferred nucleosome positions in the genome [43]. This model had only one input parameter, the nucleosome density. Later, it was generalized in the spirit of classical ligand-DNA binding models [44] to include basic thermodynamic parameters such as the histone octamer binding constant, the length of the nucleoprotein particle in units of DNA base pairs (bp), and the contact cooperativity parameter for the interaction between nucleosomes [45].
Subsequent bioinformatic studies based on high-throughput sequencing provided compelling arguments to introduce a discrete distribution of sizes of the linker DNA between nucleosomes [17] and long-range nucleosome-nucleosome interactions [18], [46].
Additionally, arguments have been put forward that the model should be modified to take into account that the nucleosome core particle is not a static structure: some plasticity and partial DNA unwrapping from the histone octamer core can allow and facilitate the binding of TFs to the nucleosome-associated DNA [47]-[50].
Correspondingly, generic lattice models were adapted to include nucleosome unwrapping [16], and DNA unwrapping was shown to be an essential feature required for an adequate analysis of the available experimental data on nucleosome positioning [7], [10], [16].
We will start from the model that includes all the features mentioned above. On top of it, we will take into account nucleosome interactions with each other and with chromatin architectural proteins (H1, HMG1, etc.). The scheme of our generalized 1D lattice model for nucleosome arrangement is presented in Figure 1.
In the framework of this model, genomic DNA is represented as a 1D lattice of units numbered by index n, each of which can be either bound by any of f protein species or remain unoccupied. Each protein type g is characterized by its size m(g) in terms of the number of DNA base pairs covered upon its binding, the concentration of free protein, c0(g), and the DNA sequence-specific binding constant, K(n, g).
In principle, the histone octamer is treated as just one of the many possible types of DNA-binding protein complexes. The use of the binding constant for the histone octamer does not mean that the complex freely binds and unbinds at equilibrium. Indeed, it is known that the free thermal sliding of the histone octamer along the DNA is limited at physiological conditions [51] , and most nucleosome repositioning events happen in vivo actively with the help of ATP-dependent remodelers and histone chaperones, which decrease energy barriers for nucleosome translocations and effectively help to equilibrate the system.
Remodelers can have their context- and cell type-dependent rules [20] , [52] , but since these rules are not well known in practice, we can assume that these rules are already taken into account in the experimentally measured cell type-specific nucleosome distributions. Still, one can treat it as a collective equilibrium in an ensemble of many identical cells or for many instances of the cell at different time points [53].
Figure 1. Scheme of the generalized 1D lattice model for nucleosome arrangement. (A) All-or-none models require that a DNA region is either within a nucleosome or bound by a transcription factor. (B) Advanced view on co-binding of a TF and histone octamer to the same DNA region (top), and the corresponding lattice model (bottom), which takes into account the possibility of partial nucleosome unwrapping. (C) Taking into account linker histones requires the introduction of long-range interactions between DNA-bound proteins. (D) First, the oscillations of the nucleosome density are plotted around the boundary of interest (for example, an end of the DNA segment would be appropriate as a boundary). (E) Then the coordinates of the peaks from (D) are collected and fitted with a linear function. The slope of the line in (E) determines the NRL.
The linker histone H1 (or any other architectural protein) is allowed to bind both the free DNA and the nucleosome. The model assumes, for mathematical simplicity, that the binding takes place at the DNA lattice units free from other proteins. The physical connectivity between the linker histone and the nucleosome (and, in general, between other protein binders) is accounted for by the interaction potential w(L, g1, g2), where L is the distance along the DNA between proteins g1 and g2.
In particular, since H1 interacts mostly with the nucleosome [56], this has to be reflected by a high value of the contact cooperativity parameter w(0, H1, NCP). The lattice model illustrated in Figure 1 can be solved mathematically using either dynamic programming or the transfer matrix formalism [14], [16], [40], [53].
Here we have performed the calculations of nucleosome binding maps using our software suite TFnuc [53] , which is based on the dynamic programming algorithm developed in our previous publication [40]. See Supplementary Materials for the details of the computational implementation.
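To make the dynamic programming idea concrete, here is a minimal sketch for the simplest special case only: identical hard rods binding non-cooperatively and without sequence preferences to a finite lattice with free ends. It is not the TFnuc algorithm; the function name `rod_occupancy`, the single statistical weight `weight` (standing in for the product of binding constant and free concentration), and the example values (a 147 bp footprint and a weight of 50) are assumptions made for illustration.

```python
def rod_occupancy(n_sites, rod_len, weight):
    """Per-site occupancy for hard rods on a 1D lattice with free ends.

    Assumed toy model: every rod covers rod_len sites and carries the same
    statistical weight; rods cannot overlap and do not otherwise interact.
    """
    # forward[i]: partition function of the first i lattice sites
    forward = [1.0] * (n_sites + 1)
    for i in range(1, n_sites + 1):
        forward[i] = forward[i - 1]              # site i-1 left empty
        if i >= rod_len:
            forward[i] += weight * forward[i - rod_len]  # a rod ends at site i-1

    # backward[i]: partition function of sites i .. n_sites-1
    backward = [1.0] * (n_sites + 1)
    for i in range(n_sites - 1, -1, -1):
        backward[i] = backward[i + 1]            # site i left empty
        if i + rod_len <= n_sites:
            backward[i] += weight * backward[i + rod_len]  # a rod starts at site i

    z = forward[n_sites]                         # total partition function

    # p_start[s]: probability that a rod starts at site s
    p_start = [0.0] * n_sites
    for s in range(n_sites - rod_len + 1):
        p_start[s] = forward[s] * weight * backward[s + rod_len] / z

    # A site is covered if a rod starts at it or at one of the
    # rod_len - 1 sites before it (sliding-window sum over p_start).
    occupancy = []
    running = 0.0
    for n in range(n_sites):
        running += p_start[n]
        if n >= rod_len:
            running -= p_start[n - rod_len]
        occupancy.append(running)
    return occupancy


if __name__ == "__main__":
    # Hypothetical parameters: 147 bp footprint, weight 50, 1000 bp lattice.
    occ = rod_occupancy(1000, 147, 50.0)
    print(["%.3f" % p for p in occ[:400:50]])
```

Even in this stripped-down form, the occupancy oscillates near the lattice ends, which is the boundary effect discussed above. Sequence-specific binding constants, several protein types, unwrapping, and long-range interactions would each require extending the recursion.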
TFnuc takes as input concentrations of DNA and DNA-binding proteins and position weight matrices (PWMs) for all the TFs studied, as well as the thermodynamic parameters listed above, which define the properties of the interaction model.
As the output, TFnuc calculates binding probabilities c(g, n) for each protein type g at a genomic position n, taking into account the presence of all other proteins and nucleosomes. The NRL for a given genomic region can then be determined, following our previous work [28], from a linear fit of the nucleosome occupancy peak positions versus the corresponding peak numbers (Figure 1D and E).
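The peak-fitting step can be sketched in a few lines. The helper names `find_peaks` and `fit_nrl` are hypothetical, the peak finder is a naive local-maximum search that would need smoothing on real data, and the example peak coordinates are invented for illustration.

```python
def find_peaks(profile):
    """Indices of simple local maxima in an occupancy profile."""
    return [i for i in range(1, len(profile) - 1)
            if profile[i - 1] < profile[i] >= profile[i + 1]]


def fit_nrl(peak_positions):
    """Least-squares line through (peak number, peak position in bp).

    Returns (slope, intercept); the slope is the NRL estimate.
    """
    n = len(peak_positions)
    if n < 2:
        raise ValueError("need at least two peaks for a linear fit")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(peak_positions) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, peak_positions))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    # Invented peak coordinates (bp from the boundary), spaced 190 bp apart.
    slope, intercept = fit_nrl([80, 270, 460, 650])
    print(slope, intercept)
```

Fitting all peaks at once, rather than averaging neighbor-to-neighbor distances, makes the estimate less sensitive to a single misplaced peak.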
Figure 2 shows the results of calculations of the NRL as a function of different thermodynamic parameters, assuming that there are no sequence-specific preferences of histone octamer binding to the DNA. We draw a number of general conclusions based on these calculations. Figure 2A demonstrates that as the nucleosome density increases, the NRL sharply decreases down to the minimum size defined by neighboring nucleosome-nucleosome interactions (see below).
It is this decrease of the NRL with the nucleosome density, predicted by the theory but not observed in the yeast chromatin reconstitution experiments [29], that has led to the questioning of the validity of the Kornberg-Stryer statistical model for this system [58]. Therefore, it is important to note that the assumption of non-sequence-specific binding is not valid for the case when DNA sequence-specificity dominates.
Figure 2B shows that when nucleosome unwrapping is allowed, the NRL decreases nonlinearly as a function of the maximum allowed unwrapping length h_max. The non-linearity arises because not all nucleosomes adopt the state with the largest possible unwrapping: owing to the thermal distribution of NCP unwrapping lengths, some nucleosomes in the ensemble stay completely intact, while others have just a few base pairs unwrapped.
More unwrapping requires more energy to break attractive histone-DNA contacts, which is only partially compensated by the favorable entropy increase [16]. Previously, nucleosome unwrapping was shown to be essential to describe in vitro AFM data [16], the in vivo genome-wide nucleosome distribution in yeast [7], and the effect of nucleosomes on the activating enhancer function in Drosophila [14].
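The saturation behind this non-linearity can be illustrated with a Boltzmann-weighted average over unwrapping states. The sketch below assumes, purely for illustration, that unwrapping h base pairs costs a free energy proportional to h (`cost_per_bp`, in units of kT); this linear cost and the function name `mean_unwrapped` are assumptions, not the energy function used in the model above.

```python
import math


def mean_unwrapped(h_max, cost_per_bp):
    """Thermal average of the unwrapped length for h = 0 .. h_max.

    Assumption: unwrapping h bp costs cost_per_bp * h (in kT), so each
    state carries the Boltzmann weight exp(-cost_per_bp * h).
    """
    weights = [math.exp(-cost_per_bp * h) for h in range(h_max + 1)]
    z = sum(weights)
    return sum(h * w for h, w in enumerate(weights)) / z


if __name__ == "__main__":
    for h_max in (0, 10, 20, 30, 40):
        print(h_max, round(mean_unwrapped(h_max, 0.2), 3))
```

Under this assumption, raising h_max from 10 to 20 bp changes the average unwrapped length far more than raising it from 30 to 40 bp, because the newly allowed states are rarely populated. The effective footprint, and with it the NRL, therefore responds nonlinearly to h_max.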
Thus, nucleosome unwrapping is an essential feature of our current model and its effect on the NRL found in Figure 2B needs to be taken into account. On the other hand, introducing long-range anticooperative interactions leads to a linear increase of the NRL as a function of the length of prohibitive interactions V (Figure 2D).
The long-range interaction potential w(L) can be introduced in any form in the framework of this model. Intermediate situations with a length-dependent interaction potential would lead to more complicated behavior, which can also be studied with the help of this model. Now let us consider linker histones H1 (or their variants such as H5, or other chromatin proteins such as HMGN1), as schematically depicted in Figure 1C. In the 1D lattice model, the linker histone is assumed not only to bind the nucleosome, but also to cover several free DNA lattice units between nucleosomes, depending on the size of this protein.
Effectively, linker proteins introduce additional nucleosome-nucleosome interactions. The results of our calculations shown in Figure 2E suggest that the effect of linker proteins is quite different from the effect of direct nucleosome-nucleosome interactions considered in panels 2C and 2D. The major difference is that the effect of linker proteins is concentration-dependent.
Figure 2E shows that when the linker protein concentration is large, it has a significant effect on the NRL. In particular, a two-fold change of the linker-to-core histone ratio of molar concentrations leads to an experimentally detectable 6 bp NRL change. Figure 2F shows that the size of the linker histone (or of other non-histone players nonspecifically binding the nucleosome and the DNA linker between nucleosomes) is quite important for its ability to alter the NRL.
Counterintuitively, smaller proteins appear to be more effective in increasing the NRL due to the larger configurational entropy of rearrangements of bound proteins along the DNA. The latter effect was obtained assuming that the H1-DNA binding affinity does not depend on the H1 size, which is not necessarily the case if the electrostatics of DNA-histone binding prevails [35]. The finding that smaller proteins can exert a larger steric effect on the NRL is in line with in vitro DNA condensation experiments, which have established that although cationic ligands with higher charge are better DNA condensing agents, smaller cations have a stronger DNA condensing propensity when ligands of the same charge are considered.
In a similar way, Blank and Becker reported that the effect of multivalent binders including metal ions, polyamines and H1 on the NRL increases with their charge [36]. One should expect that proteins or polyamines with the same charge have different properties depending on their size according to Figure 2F. Since this effect is directly testable, it would be interesting to confirm it experimentally. We will start our analysis of sequence-specific NRL effects with the description of the nucleosome arrangement around Saccharomyces cerevisiae TSSs [29], a system which has already become a standard benchmark for this type of model [7], [10], [32], [58], [60], [61].
Figure 3A shows the average experimental nucleosome distribution and the corresponding prediction from the Kornberg-Stryer model, where nucleosomes are modeled by stiff rod-like particles non-cooperatively binding the DNA and protecting a fixed length of DNA from other nucleosomes and proteins (Figure 1A).
While revealing the expected oscillatory occupancy pattern, this model fails to describe the experimental data quantitatively. In particular, the first dip of the nucleosome density is too sharp in comparison with the experimental curve. This discrepancy has been noted by Riposo and Mozziconacci [60], who tried to resolve it mathematically by blurring the precise nucleosome dyad positions with a finite-width Gaussian distribution.
A more physically motivated tuning of the model was used to correct for this effect in a recent study by Mobius et al. In the latter work it was shown that allowing DNA unwrapping from the nucleosome makes the distribution smoother and closer to the experimental one. Yet, the calculation in Figure 3B shows that extending the model to allow nucleosome unwrapping brings another problem. In the framework of our model we have the option to fine-tune the parameter landscape by changing the histone octamer affinity to the DNA, the concentrations of core and linker histones, and the contact nucleosome-nucleosome cooperativity parameter.
However, none of these new elements of the model allowed fitting the experimental curve adequately. In particular, Figure 3C shows that the introduction of the linker histone H1, while allowing the NRL to be adjusted properly, does not lead to the proper shapes of the peaks of the nucleosome density.
This is not surprising, since yeast does not have H1 [62], and the related protein Hho1 is probably not involved in determining the NRL [63], [64].
The only model change that helped us arrive at the experimentally observed yeast TSS nucleosome distribution was the introduction of long-range anticooperative interactions between nucleosomes, as schematically shown in Figure 1C. Several previous publications have also encountered the problem that the theoretically predicted NRL is too short for this experimental system, and attempted to solve it by empirically assuming that nucleosomes cannot form closer than a certain distance.
The 3D structure of the nucleosome is essentially the same in these species, so if one postulates a nucleosome consisting of bp in S. Therefore, instead of setting the fixed-geometry contact interaction between nucleosome core particles, our model only assumes that nucleosomes need to overcome a certain energy barrier to position themselves closer than a certain distance at a given genomic region.
Thus the lower NRL limit determined by long-range interactions is different for different cell types depending on the thermodynamic conditions. What can be the source of such long-range interactions? One of the differences between S. Different remodeler composition could explain NRL differences in different species; in addition, the co-evolution of the remodeling system and the nucleosomal DNA code might determine that the DNA sequence is also adapted to this or that NRL in different regions in different species.
Another possible justification for long-range nucleosome-nucleosome interactions could be a specific structural nucleosome arrangement in the chromatin fiber [67]. For instance, the geometries and nucleosome axial densities of the in vitro reconstituted chromatin fibers in the presence of linker H5 histones were shown to be dramatically different for the NRL of and bp [39]. Longer DNA linkers enable easier compaction of the chromatin [68] , while for short linkers, on the contrary, the elastic penalty of the linker DNA bending can become too large [69].
Let us compare predictions of the model with available experimental dependences of NRL on the concentration of linker histone H1, [H1]. We will use two experimental examples. In the first example, Oberg and coauthors have systematically studied the effect of the concentration of different histone variants on the NRL [33].
They found that, for all the histone variants, the NRL increased with increasing H1 concentration and saturated at a certain value (Figure 4A). In addition, a simple competitive model where H1 binds the nucleosomal DNA and excludes nucleosomes cannot explain the saturating behavior of the NRL as a function of [H1]. To recapitulate this feature, we have introduced in the model a limiting case of cooperative binding: we prohibit H1 binding if there is no nucleosome in its vicinity.
This model keeps nucleosome-H1 and nucleosome-H1-nucleosome distances flexible, but allows not more than one H1 per nucleosome to be bound (Figure 1C).
Furthermore, this modified model predicts the correct slope of the curve and the correct saturation level in comparison with the experiment of Oberg et al. This leads us to the refined lattice model for H1-nucleosome interaction, where nucleosomes can bind a small number of H1 molecules, or accommodate different types of nucleosome-H1-nucleosome connectivity, but no more than a critical number of H1 molecules per nucleosome is allowed (one H1 per nucleosome in Figure 4C).
Figure 4. (C) A scheme illustrating the refined model for nucleosome-H1 arrangement: different configurations of bound H1 around the nucleosome are allowed, but not more than a critical number of H1 per nucleosome. Solid line: theoretical prediction.
A similar behavior was also found by Blank and Becker for various smaller charged molecules such as metal cations and polyamines [36].
Electrostatics is also believed to play a role in the regulation of chromatin states through histone modifications [71]. Interestingly, the linear regime observed by Woodcock and colleagues represents only a fraction of the interval of H1 concentrations, where the overall saturating dependence of NRL on the H1 concentration was found by Oberg et al. Thus, our model is consistent both with the Oberg et al. This model is not limited to the in vitro system consisting of just the DNA and histones.
Importantly, it is also readily applicable to more complicated systems where histones are complemented by non-histone chromatin proteins such as HMGN1. Let us now return to the sequence-specific NRL effects at genomic regions. Korber and colleagues have noted that the statistical model of Kornberg and Stryer [43] predicts a pronounced NRL change with the increase of the nucleosome density (as in Figure 2A), which they did not observe experimentally upon reconstituting nucleosomes at the DNA sequences enclosing yeast TSSs [29], [58].
Furthermore, subsequent experimental work of Celona et al. In line with this, the authors of a recent Monte Carlo simulation hypothesize that DNA sequence-specific effects have the primary role in nucleosome positioning near TSSs [72]. To check this hypothesis in a more general context, we have utilized the nucleosome positioning data measured by MNase-seq for mouse embryonic stem cells (mESCs) [28]. Figure 5A shows that the average nucleosome landscape in mESCs calculated for two different experimental nucleosome datasets from Refs.
This is not surprising since most stalled Pol2 reside near the TSS, which is known to be nucleosome-depleted [28]. Panel 5B shows the heat map of the nucleosome density for each of the individual genomic regions used in the calculation of the average profile in panel 5A. This heat map also reveals oscillations of the nucleosome density.
To check whether these oscillations are only determined by the TSS-induced boundary or are also encoded in the DNA sequence, we have calculated nucleosome distributions using the DNA sequence preferences of the histone octamer predicted by Segal and coauthors [13] (black and blue lines in panel 5A). Surprisingly, this nucleosome pattern also exhibits pronounced oscillations.
Moreover, it is estimated that the human body contains about 50 trillion cells—which works out to trillion meters of DNA per human. Now, consider the fact that the Sun is billion meters from Earth. This means that each of us has enough DNA to go from here to the Sun and back more than times, or around Earth's equator 2.
How is this possible? DNA is negatively charged, due to the phosphate groups in its phosphate-sugar backbone, so histones bind with DNA very tightly.
Figure 1: Chromosomes are composed of DNA tightly-wound around histones. Chromosomal DNA is packaged inside microscopic nuclei with the help of histones. These are positively-charged proteins that strongly adhere to negatively-charged DNA and form complexes called nucleosomes.
Each nucleosome is composed of DNA wound 1. Nucleosomes fold up to form a 30-nanometer chromatin fiber, which forms loops averaging nanometers in length. The nm fibers are compressed and folded to produce a nm-wide fiber, which is tightly coiled into the chromatid of a chromosome.
Genetics: A Conceptual Approach, 2nd ed.
Figure 2: Electron micrograph of chromatin: the beads on a string. In this micrograph, nucleosomes are indicated by arrows.
Chromatin history: our view from the bridge. Nature Reviews Molecular Cell Biology 4, The basic repeating structural and functional unit of chromatin is the nucleosome, which contains eight histone proteins and about base pairs of DNA Van Holde, ; Wolffe, The observation by electron microscopists that chromatin appeared similar to beads on a string provided an early clue that nucleosomes exist Olins and Olins, ; Woodcock et al.
Another clue came from chemically cross-linking i. This experiment demonstrated that H2A, H2B, H3, and H4 form a discrete protein octamer, which is fully consistent with the presence of a repeating histone-containing unit in the chromatin fiber. Today, researchers know that nucleosomes are structured as follows: Two each of the histones H2A, H2B, H3, and H4 come together to form a histone octamer, which binds and wraps approximately 1.
The addition of one H1 protein wraps another 20 base pairs, resulting in two full turns around the octamer, and forming a structure called a chromatosome Box 4 in Figure 1. The resulting base pairs is not very long, considering that each chromosome contains over million base pairs of DNA on average.
Therefore, every chromosome contains hundreds of thousands of nucleosomes, and these nucleosomes are joined by the DNA that runs between them (an average of about 20 base pairs).
One such enzyme, micrococcal nuclease (MNase), has the important property of preferentially cutting the linker DNA between nucleosomes well before it cuts the DNA that is wrapped around octamers. By regulating the amount of cutting that occurs after application of MNase, it is possible to stop the reaction before every linker DNA has been cleaved.
At this point, the treated chromatin will consist of mononucleosomes, dinucleosomes (connected by linker DNA), trinucleosomes, and so forth (Hewish and Burgoyne). If DNA from MNase-treated chromatin is then separated on a gel, a number of bands will appear, each having a length that is a multiple of mononucleosomal DNA (Noll). The simplest explanation for this observation is that chromatin possesses a fundamental repeating structure.
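Why partial digestion produces a ladder can be seen from a toy calculation. The sketch below assumes that every linker in a long nucleosome array is cut independently with the same probability `cut_prob`, so the number of nucleosomes per fragment follows a geometric distribution. The function name `mnase_ladder`, the independence assumption, and the 200 bp repeat used in the example are illustrative assumptions, and fragment lengths are approximated as whole multiples of the repeat.

```python
def mnase_ladder(cut_prob, repeat_bp, max_units):
    """Expected fragment classes after partial linker digestion.

    Returns a list of (nucleosomes per fragment, approximate length in bp,
    fraction of fragments), assuming independent cuts at each linker.
    """
    if not 0.0 < cut_prob <= 1.0:
        raise ValueError("cut_prob must be in (0, 1]")
    ladder = []
    for k in range(1, max_units + 1):
        # k - 1 consecutive linkers stay intact, then the next one is cut
        fraction = cut_prob * (1.0 - cut_prob) ** (k - 1)
        ladder.append((k, k * repeat_bp, fraction))
    return ladder


if __name__ == "__main__":
    # Hypothetical 200 bp repeat, half of the linkers cut.
    for units, length_bp, fraction in mnase_ladder(0.5, 200, 5):
        print(units, length_bp, round(fraction, 4))
```

With light digestion (small `cut_prob`) the longer oligonucleosome bands dominate. As digestion proceeds, the distribution shifts toward mononucleosomes, while band positions stay at multiples of the repeat.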
When this was considered together with data from electron microscopy and chemical cross-linking of histones, the "subunit theory" of chromatin (Kornberg; Van Holde et al.) was adopted.
The subunits were later named nucleosomes Oudet et al. The model of the nucleosome that crystallographers constructed from their data is shown in Figure 3. Note that only eukaryotes i. Prokaryotes, such as bacteria , do not. Figure 4: Electron micrograph of chromatin A 30nm fiber of chromatin. The packaging of DNA into nucleosomes shortens the fiber length about sevenfold.
In other words, a piece of DNA that is 1 meter long will become a "string-of-beads" chromatin fiber just 14 centimeters (about 6 inches) long. Despite this shortening, a half-foot of chromatin is still much too long to fit into the nucleus, which is typically only 10 to 20 microns in diameter.
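The arithmetic behind these figures is easy to check. The short sketch below uses only the numbers quoted above (a sevenfold shortening, 1 meter of DNA, and a nucleus of up to 20 microns); the variable and function names are made up for illustration.

```python
CM_PER_INCH = 2.54


def packaged_length_cm(dna_length_m, shortening_factor=7.0):
    """Length of the beads-on-a-string fiber in centimeters."""
    return dna_length_m * 100.0 / shortening_factor


length_cm = packaged_length_cm(1.0)          # about 14.3 cm
length_in = length_cm / CM_PER_INCH          # about 5.6 inches
nucleus_diameter_um = 20.0                   # upper end of the quoted range
# How many nuclear diameters long the fiber still is (1 cm = 10,000 microns)
excess = length_cm * 1e4 / nucleus_diameter_um

print(round(length_cm, 1), round(length_in, 1), round(excess))
```

The fiber is still several thousand times longer than the nucleus is wide, which is why further levels of folding are needed.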
Therefore, chromatin is further coiled into an even shorter, thicker fiber, termed the "30-nanometer fiber," because it is approximately 30 nanometers in diameter (Figure 4). Over the years, there has been a great deal of speculation concerning the manner in which nucleosomes are folded into 30-nanometer fibers (Woodcock). Part of the problem lies in the fact that electron microscopy is perhaps the best way to visualize packaging, but individual nucleosomes are hard to discern after the fiber has formed.
In addition, it also makes a difference whether observations are made using isolated chromatin fibers or chromatin within whole nuclei.
Thus, the 30-nanometer fiber may be highly irregular and not quite the uniform structure depicted in instructive drawings such as Figure 1 (Bednar et al.).
Interestingly, histone H1 is very important in stabilizing chromatin higher-order structures, and 30-nanometer fibers form most readily when H1 is present. Processes such as transcription and replication require the two strands of DNA to come apart temporarily, thus allowing polymerases access to the DNA template. However, the presence of nucleosomes and the folding of chromatin into 30-nanometer fibers pose barriers to the enzymes that unwind and copy DNA. Generally speaking, there are two major mechanisms by which chromatin is made more accessible.
When eukaryotic cells divide, genomic DNA must be equally partitioned into both daughter cells. To accomplish this, the DNA becomes highly compacted into the classic metaphase chromosomes that can be seen with a light microscope.
Once a cell has divided, its chromosomes uncoil again. Comparing the length of metaphase chromosomes to that of naked DNA, the packing ratio of DNA in metaphase chromosomes is approximately 10,000:1, depending on the chromosome. This can be thought of as akin to taking a rope as long as a football field and compacting it down to less than half an inch.
This level of compaction is achieved by repeatedly folding chromatin fibers into a hierarchy of multiple loops and coils (Figure 1). Exactly how this is accomplished is unclear, but the phosphorylation of histone H1 may play a role. Indeed, this is just one area of DNA packaging that researchers will continue to explore in the years to come.
Bednar, J. Nucleosomes, linker DNA, and linker histones form a unique structural motif that directs the higher-order folding and compaction of chromatin. Proceedings of the National Academy of Sciences 95.
Fischle, W. Histone and chromatin cross-talk. Current Opinion in Cellular Biology 15.
Hewish, D. Chromatin sub-structure. The digestion of chromatin DNA at regularly spaced sites by a nuclear deoxyribonuclease. Biochem Biophys Res Commun 52.