Large-Scale Topological Changes Restrain Malignant Progression in Colorectal Cancer
[Graphical abstract. Labels: normal cells; normal nuclei; tumor nuclei; normal nucleus; human colon; excessive replication; tumor/aging; A compartment; B compartment; intermediate compartment; DNA hypomethylation; impact of compartment shift; tumor suppression; oncogene repression; ERV/CGA activation.]
Correspondence
[email protected] (M.J.A.), [email protected] (B.E.B.)
In Brief
Integrated analyses of genome topological, epigenetic, and transcriptional features of colorectal tumors highlight substantial genome compartmental reorganization associated with tumor-suppressive rather than oncogenic transcriptional outcomes.
Highlights
• Hierarchical layers of nuclear architecture are altered in colorectal tumors
Article
Large-Scale Topological Changes Restrain
Malignant Progression in Colorectal Cancer
Sarah E. Johnstone,1,2,3,10 Alejandro Reyes,2,4,5,10 Yifeng Qi,2,6 Carmen Adriaens,1,2,3 Esmat Hegazi,1,2,3 Karin Pelka,2,3
Jonathan H. Chen,1,2,3 Luli S. Zou,2,4,5 Yotam Drier,7 Vivian Hecht,2 Noam Shoresh,2 Martin K. Selig,1 Caleb A. Lareau,1,2,8
Sowmya Iyer,1 Son C. Nguyen,9 Eric F. Joyce,9 Nir Hacohen,2,3 Rafael A. Irizarry,2,4,5 Bin Zhang,2,6 Martin J. Aryee,1,2,3,5,*
and Bradley E. Bernstein1,2,3,11,*
1Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
2Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
3Center for Cancer Research, Massachusetts General Hospital, Boston, MA 02129, USA
4Department of Data Sciences, Dana Farber Cancer Institute, Boston, MA 02215, USA
5Department of Biostatistics, Harvard School of Public Health, Boston, MA 02215, USA
6Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
7The Lautenberg Center for Immunology and Cancer Research, The Hebrew University, Jerusalem, Israel
8Program in Biological and Biomedical Sciences, Harvard Medical School, Boston, MA 02215, USA
9Department of Genetics, Penn Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
10These authors contributed equally
11Lead Contact
SUMMARY
Widespread changes to DNA methylation and chromatin are well documented in cancer, but the fate of
higher-order chromosomal structure remains obscure. Here we integrated topological maps for colon tumors
and normal colons with epigenetic, transcriptional, and imaging data to characterize alterations to chromatin
loops, topologically associated domains, and large-scale compartments. We found that spatial partitioning
of the open and closed genome compartments is profoundly compromised in tumors. This reorganization is
accompanied by compartment-specific hypomethylation and chromatin changes. Additionally, we identify a
compartment at the interface between the canonical A and B compartments that is reorganized in tumors.
Remarkably, similar shifts were evident in non-malignant cells that have accumulated excess divisions.
Our analyses suggest that these topological changes repress stemness and invasion programs while
inducing anti-tumor immunity genes and may therefore restrain malignant progression. Our findings call
into question the conventional view that tumor-associated epigenomic alterations are primarily oncogenic.
Cell 182, 1474–1489, September 17, 2020 © 2020 Elsevier Inc.
2011; Hansen et al., 2011). The functional implications of hypomethylation remain obscure, but it may impact genome stability and/or transcriptional activity (Baylin and Jones, 2016). Conversely, CpG island hypermethylation occurs in a subset of colon tumors, termed the CpG island methylator phenotype (CIMP), and is associated with promoter silencing (Hinoue et al., 2012; Toyota et al., 1999). However, our understanding of these epigenomic changes and their relationship to genome topology has been hindered by a lack of data for primary tumors.
Here we mapped genome topology across a cohort of colon tumors, normal colons, and colon cancer cell lines and compared successive layers of topology. We focus on large-scale reorganization of the conventional genome compartments A and B and characterize an intermediate compartment at their interface. Remarkably, compartmental reorganization is associated with repression of stem cell, invasion, and metastasis genes and induction of genes associated with anti-tumor immunity. Our results suggest that the most profound topological alterations in tumors are actually a consequence of accumulated cell divisions and that they may have a tumor-suppressive role.
RESULTS
Maps of DNA Methylation, Chromatin State, and Topology in Human Tumors
To understand how nuclear architecture is altered in cancer, we profiled genome topology along with DNA methylation, chromatin modifications, and CTCF in primary colon tumors, normal colons, and colon cancer cell lines (Figure 1A). Our clinical cohort included 26 tumors and 7 normal colon tissue samples (Table S1). Our in vitro models included colon cancer cell lines (HCT116, SW480, RKO, and LS-174T), a line derived from normal colon (FHC), and primary fibroblasts (WI-38). Our full dataset comprises 175 libraries and 28 billion sequencing reads for Hi-C, HiChIP, bisulfite sequencing, chromatin immunoprecipitation sequencing (ChIP-seq), and RNA sequencing (RNA-seq).
We performed hybrid-capture bisulfite sequencing on 26 tumors, 3 normal colons, and 5 cell lines. Although only two tumors had CpG island hypermethylation (CIMP) (Figure S1A), all tumors exhibited degrees of hypomethylation across expansive genomic regions, termed “hypomethylated blocks” (Figure S1B; Berman et al., 2011; Hansen et al., 2011). We also inferred copy number variants (CNVs) for each tumor (Table S2) and controlled for CNV-related variability in further analyses by incorporating terms for copy number estimates into our linear models and verifying results in CNV-stable regions and tumors (STAR Methods).
We integrated high-resolution topological maps and epigenomics data to investigate successive layers of genome organization, from chromatin loops to TADs to large-scale genome compartments, in tissues, tumors, and cell lines.
E-P Loops Are Associated with Oncogenic Transcriptional Programs
We began by identifying loops that could influence transcriptional states in tumors. HiChIP assays targeting the cohesin subunit SMC1 (Mumbach et al., 2016) reveal CTCF-CTCF loops, which contribute to TAD boundaries, as well as E-P loops, which are hypothesized to mediate enhancer gene activation (Stadhouders et al., 2019). We identified 25,125 loops in normal colon or tumors and annotated the subset that connects a histone H3 lysine 27 acetylation (H3K27ac) peak (enhancer-like) to an annotated promoter as E-P loops (n = 14,121). Differential analysis revealed 571 E-P loops that are stronger in tumors and 248 that are weaker in tumors (Figure 1B). To relate these differential loops to transcription, we evaluated the expression of the corresponding genes in our cohort and across 521 samples from The Cancer Genome Atlas (TCGA) (Cancer Genome Atlas Network, 2012). Genes connected to E-P loops that were stronger in tumors were upregulated in tumors, whereas genes connected to E-P loops that were weaker in tumors were downregulated (Figures 1C and S2A; Table S3). This association was evident even when excluding loci subject to CNVs (Figure S2B).
Several topological alterations involved known oncogenes or tumor suppressors. For example, although the locus encoding the receptor tyrosine kinase oncogene EPHA2 (Dunne et al., 2016) contains multiple enhancers in normal colon and tumors, the gene is connected to a strong enhancer by a tumor-associated E-P loop (Figures 1D and S2C). Accordingly, EPHA2 is upregulated in colon tumors (Figures S2D and S2E). Conversely, the PDCD4 tumor suppressor (Wang et al., 2017) loses an E-P loop to a distal enhancer and is downregulated in tumors (Figures 1E and S2F–S2H). The annotated E-P loops can also facilitate interpretation of single-nucleotide polymorphisms (SNPs) associated with colon cancer risk (Figures S2I–S2K; Table S4; STAR Methods).
Topological Boundaries Are Largely Retained in Tumors
A next layer of topological organization involves TADs and their boundaries. TADs are evident in Hi-C maps as sub-megabase-scale regions with increased intra-domain interactions (Figure 1F; Dixon et al., 2015). CTCF occupies and contributes to the stability of many TAD boundaries (Rao et al., 2014; Nora et al., 2017). We used Hi-C data to assess the locations and integrity of TAD boundaries genome wide (STAR Methods). Boundary location and strength were largely concordant between tumors, normal controls, and cell lines (Figures 1G and S3A–S3D), consistent with previous studies of cell lines and non-malignant tissues (Dixon et al., 2012; Krefting et al., 2018; Nora et al., 2012; Schmitt et al., 2016). Tumors, on average, shared 92% of TAD boundaries with normal colon and 89% with the cell lines. Visual inspection revealed that discordant boundary calls were most often caused by subtle differences in strength rather than complete boundary loss.
Prior studies have shown that CTCF boundaries may be disrupted by genetic deletion or hypermethylation (Flavahan et al., 2016, 2019; Hnisz et al., 2016; Modrek et al., 2017). Consistently, we identified more than 100 TAD boundaries that gain DNA methylation and lose CTCF binding in our hypermethylated tumors (Figures S3E and S3F). The integrity of these boundaries was compromised, as evidenced by weaker “peaks” on the Hi-C contact maps (Figure S3G) and more frequent cross-boundary E-P interactions in HiChIP data (Figure S3H). However, the transcriptional consequences of these boundary losses appeared to be relatively limited (Figure S3I).
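The boundary-sharing comparison described above (the fraction of TAD boundaries in one sample that are also called in another) can be sketched as follows. This is only an illustration under stated assumptions, not the procedure from the STAR Methods: the function name `shared_boundary_fraction`, the 40-kb matching tolerance, and the toy coordinates are hypothetical, and boundaries are treated as positions on a single chromosome.

```python
from bisect import bisect_left

def shared_boundary_fraction(query, reference, tolerance):
    """Fraction of `query` boundaries with a `reference` boundary within `tolerance` bp."""
    if not query:
        return 0.0
    ref = sorted(reference)
    shared = 0
    for pos in query:
        # First reference boundary at or beyond the left edge of the window.
        i = bisect_left(ref, pos - tolerance)
        if i < len(ref) and ref[i] <= pos + tolerance:
            shared += 1
    return shared / len(query)

# Toy boundary positions (bp) on one chromosome; values are made up.
toy_tumor = [100_000, 500_000, 900_000, 1_400_000]
toy_normal = [120_000, 480_000, 1_000_000, 1_400_000]
toy_fraction = shared_boundary_fraction(toy_tumor, toy_normal, tolerance=40_000)
print(toy_fraction)
```

A tolerance window is used because boundary calls from binned Hi-C data rarely land on identical coordinates; the appropriate window depends on the map resolution.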
Figure 1. Integrated Topological Maps Reveal Tumor-Specific Chromatin Loops and Stable TAD Structure
(A) Schematic of hierarchical genome organization with indication of genomic scale (left) and summary of genome-wide assays (center) and models (right).
(B) Volcano plot presenting a differential analysis of loops between tumors and normal samples. Loops, represented as dots, with significantly stronger or weaker
interactions in tumors compared with normal colon are highlighted in red and green, respectively.
(C) Boxplots depicting expression fold change (log2) between tumors and normal samples (y axis) for genes engaged in enhancer-promoter (E-P) loops. Genes
are stratified by change in E-P loop strength between tumors and normal colon (x axis).
(D) Genomic view of the EPHA2 locus (130 kb), showing SMC1 HiChIP loops (arcs) and H3K27ac enrichment for normal colon (green) and colon tumor (purple).
The width of the arcs corresponds to the average loop strength summarized for the set of 2 normal and 7 colon samples. An asterisk indicates the differential loop
(STAR Methods).
(E) Genomic view of the PDCD4 locus (100 kb) as in (D).
(F) Hi-C contact map showing pairwise contact frequencies (red heat) between genomic positions across chromosome 7 (rows, columns) in normal colon. Top: Hi-C
eigenvector (PC1) based on long-range interactions demarcates compartments A (positive values, blue) and B (negative values, yellow). Right: inset with a magnified
view of a representative region reveals TAD structures (highlighted by black triangles). Rotation of this inset by 45° yields a horizontal display of TAD structures (see G).
(G) Horizontal heatmaps showing local Hi-C contact patterns (red heat) across chromosome 14 for normal colon (green), colon tumors (purple), and cell lines
(black). Exemplar TAD boundaries are indicated by black arrows.
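The eigenvector (PC1) compartment call referred to in (F) can be illustrated with a small sketch. It assumes a Hi-C matrix that has already been distance-normalized (observed/expected), correlates the contact profiles of all bin pairs, and takes the leading eigenvector by power iteration. Because the sign of an eigenvector is arbitrary, the sketch orients it with a hypothetical activity track (for example gene density) so that positive values correspond to compartment A. The function names, the activity track, and the toy matrix are assumptions for illustration; the exact procedure used in the study is described in the STAR Methods and is not reproduced here.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def first_eigenvector(matrix, iterations=100):
    """Dominant eigenvector of a symmetric matrix by power iteration."""
    n = len(matrix)
    vec = [1.0 + 0.01 * i for i in range(n)]  # deterministic, non-uniform start
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0:
            break
        vec = [v / norm for v in nxt]
    return vec

def call_compartments(oe_matrix, activity):
    """Return (PC1 values, A/B labels) for an observed/expected contact matrix."""
    # Correlate the contact profiles (rows) of every pair of bins.
    corr = [[pearson(r1, r2) for r2 in oe_matrix] for r1 in oe_matrix]
    pc1 = first_eigenvector(corr)
    # Eigenvector sign is arbitrary: orient so positive values track activity (A).
    if pearson(pc1, activity) < 0:
        pc1 = [-v for v in pc1]
    return pc1, ["A" if v > 0 else "B" for v in pc1]

# Toy checkerboard: bins in the same compartment interact more (made-up values).
toy_labels = ["A", "A", "B", "B", "A", "B"]
toy_oe = [[2.0 if a == b else 0.5 for b in toy_labels] for a in toy_labels]
toy_activity = [5, 6, 1, 0, 7, 1]  # hypothetical gene density per bin
toy_pc1, toy_calls = call_compartments(toy_oe, toy_activity)
print(toy_calls)
```

Note that PC1 is a continuous value, so a thresholded A/B label discards the magnitude information that the later comparison of tumor and normal eigenvectors relies on.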
In summary, topological boundaries were largely conserved across colon tumors, normal colons, and cell lines, with the exception of a relatively small set of boundaries compromised in hypermethylated tumors.
Megabase-Scale Compartment Structure Is Reorganized in Tumors
The genome is partitioned into open A and closed B spatial compartments. TADs in the same compartment have a greater tendency to self-interact, whereas inter-compartmental interactions are disfavored (Rao et al., 2014), resulting in the characteristic checkerboard pattern of Hi-C maps (Figure 1F). We used a standard eigenvector-based method to assign compartments for each of our Hi-C datasets. In contrast to the striking conservation of TAD boundaries, compartment assignments varied between samples. For example, most cell lines were distinct from normal tissues and tumors (Figures 2A, 2B, S4A, and S4B).
Comparisons also revealed widespread differences between colon tumors and normal colon (Figure 2B). To understand these differences, we directly compared Hi-C interaction matrices. Although compartment assignments were mostly concordant between normal colon and tumors (Figure 2A), long-range interactions between compartments A and B were more frequent in tumors (Figure 2C). These differential interaction patterns were evident regardless of whether the underlying compartment assignments were derived from normal colon or tumor Hi-C data (Figure S4C) and translated to a genome-wide increase in inter-compartment interaction in tumors (p < 0.005) (Figure S4D).
Prior studies have related compartments with nuclear positioning (Falk et al., 2019; Wang et al., 2016). Compartment B correlates with lamina-associated domains (LADs), which are located at the nuclear periphery and may be damaged in aging-related disease, senescence, and cancer (Sakthivel and Sehgal, 2016; Schreiber and Kennedy, 2013; van Steensel and Belmont, 2017). Conversely, active loci tend to localize to the nuclear interior. In principle, relative nuclear positioning of genomic loci can be inferred from Hi-C data. We therefore used a maximum entropy approach to derive topological models for normal colon and tumor nuclei (STAR Methods). Our method models the genome at 1-Mb resolution as a 3-dimensional polymer, taking into account linear constraints inherent to the DNA. We then compute an in silico Hi-C map for each polymer model and repeat the process iteratively until the computed map converges on the experimental Hi-C data (Figure 2D; STAR Methods).
Polymer models optimized to the normal colon Hi-C data separated compartments A and B and positioned compartment B peripherally (Figures 2E, S4E, and S4F), consistent with expectation. However, applying the same modeling approach to the colon tumor Hi-C data yielded a strikingly different result (Figures 2F and S4G). In these models, both compartments were distributed heterogeneously throughout the nucleus. We verified that these changes were not solely driven by genetic alterations by restricting analysis to stable chromosomes and a genomically stable tumor (Figures S4H and S4I). These results suggest that the asymmetric radial positioning of compartments A and B is profoundly altered in tumors.
To further investigate, we directly visualized chromatin and specific genomic loci. First, we imaged specimens by transmission electron microscopy, which revealed characteristic epithelial structures and organization. In epithelial nuclei from normal colon, electron-dense heterochromatin was juxtaposed to the nuclear membrane, consistent with peripheral lamina association, whereas characteristic light-staining euchromatin was visible throughout the nuclear interior (Figures 2G and S4K). In marked contrast, epithelial tumor nuclei had large dark-staining heterochromatin foci dispersed throughout their interiors (Figures 2H, 2I, and S4K; STAR Methods).
We next evaluated the positioning of specific genomic regions using DNA fluorescence in situ hybridization (FISH). We designed 26,000 Oligopaint probes targeting loci on chromosome 12 that were assigned to compartment A or B according to Hi-C (Figure S4J; STAR Methods; Beliveau et al., 2012). We then labeled the respective compartments with secondary fluorescent probes and visualized them by Airyscan confocal imaging. Chromosome 12 territories were generally positioned peripherally, consistent with prior studies (Bolzer et al., 2005). To quantify radial distributions of the A and B compartments, we scored probe signals according to their intensity in 20 radial bins starting from the nuclear center in 47 nuclei from 2 normal colon samples and 82 nuclei from 2 tumors. Although the localization of fluorescence signals varied, compartment B signals were strongly skewed toward the periphery of normal colon nuclei, whereas compartment A signals were more evenly distributed (Figures 2J and 2L). In contrast, in tumor nuclei, compartment B signals lost their peripheral skew and assumed a distribution similar to compartment A (Figures 2K and 2L).
Hence, concordant analyses based on Hi-C polymer models, electron microscopy, and multi-color FISH imaging indicate profound compartmental reorganization in tumor nuclei. Spatial partitioning between compartments is compromised. Compartment B relocates from its physiologic peripheral position toward the nuclear interior.
A Genome Compartment with Intermediate Properties
Despite widespread differences in compartmental interactions, A/B assignments were relatively consistent between tumor and normal colon. This prompted us to more closely examine the Hi-C eigenvector, which is a continuous rather than a dichotomous measure. We observed hundreds of large genomic intervals, hundreds of kilobases in size, with eigenvector values that were lower in tumors than in normal colon (Figures 3A and 3B). The majority of these intervals were assigned to compartment A in tumor and normal colon because they had positive eigenvectors. However, quantitative analysis indicated that these regions shifted their interactions toward compartment B in tumors.
When we examined the transitioning regions, we observed that they exhibited a striking loss of DNA methylation in tumors and, in fact, largely coincided with hypomethylated blocks (Figures 3A, 3B, and S5A). This was unexpected because hypomethylated blocks have been primarily associated only with compartment B (Berman et al., 2011; Fortin and Hansen, 2015; Hansen et al., 2011). We found that hypomethylated blocks, which tend to span single or consecutive TADs, covered a full
19% of compartment A (Figures 3C, 3D, S5B, and S5C; mean methylation difference, >10%; n = 1,032; mean size, 217 kb). Examination of Hi-C contact maps revealed that these noncanonical regions do not fit the typical checkerboard pattern that arises from long-range compartmental interactions (Figure 3E). Rather, they have a distinct contact pattern characterized by intermediate interactions with both conventional compartments (Figure 3E) and preferential self-interactions (Figure S5D). This distinct contact pattern was evident in normal (Figures S5E and S5F) and tumor samples (Figures S5G and S5H).
We considered that these hypomethylated A blocks might reflect an intermediate compartment “I” that interacts with both canonical compartments at baseline and shifts toward B in tumors (Figure S5I). In support, we found that compartment I regions could be distinguished on multiple chromosomes by examining additional eigenvectors of the Hi-C matrix (Figures 3F and S5J–S5L; STAR Methods). Although the order of the declarative eigenvector varied between chromosomes, this suggested that compartment I can be distinguished from structural data alone. Furthermore, our polymer models for normal colon placed compartment I in an intermediate nuclear position between compartments A and B (Figure 3G).
To investigate further, we visualized compartment I regions by multi-color FISH. We designed a third set of 14,500 oligonucleotide probes complementary to compartment I regions on chromosome 12 and a corresponding secondary probe with a distinct fluorophore (Figure S4J). We then used three-color FISH imaging to simultaneously localize compartment A, B, and I regions in HCT116 colon cancer cells. We selected HCT116 cells because chromosome 12 is copy number stable, and the loci targeted by our probes had similar compartment assignments as our primary tissues (Table S2; Figure S5M). We visualized and quantified the radial positioning of each compartment in 305 HCT116 nuclei. We found that compartment I is spatially intermediate between the more peripheral compartment B and the more internal compartment A (Figures 3H and 3I). We confirmed this observation in primary tissues by quantifying fluorescence FISH signals in normal colon epithelial cell nuclei (Figures 3J and S5N).
Thus, a convergence of Hi-C, methylation, polymer modeling, and imaging data support the existence of a third genomic compartment I that interacts with both conventional compartments and adopts an intermediate spatial position in the nucleus. In tumors, compartment I becomes broadly hypomethylated and shifts its interactions toward the B compartment.
Distinct Chromatin and Transcriptional States Support the Three-Compartment Model
To investigate whether compartment I is associated with distinct histone modifications, we mapped markers of active regulatory elements (H3K27ac), elongating transcripts (histone H3 lysine 36 trimethylation/H3K36me3), constitutive heterochromatin (histone H3 lysine 9 trimethylation/H3K9me3), and facultative (histone H3 lysine 27 trimethylation/H3K27me3) heterochromatin. As expected, H3K27ac and H3K36me3 were enriched in compartment A (Figures 4A, 4C, and 4D), whereas H3K9me3 was enriched in compartment B and relatively increased in tumors (Figures 4B–4E).
Compartment I was clearly distinguished from both conventional compartments by broad H3K27me3 enrichment (Figures 4A, 4C–4E, and S6A). H3K27me3 signal intensity was particularly pronounced in tumors and correlated with the degree of DNA hypomethylation. H3K36me3, which antagonizes H3K27me3 and has been proposed to protect against DNA hypomethylation, is depleted in compartment I (Figure 4D; Yuan et al., 2011; Zhou et al., 2018). Compartment I was also notable for relatively low transcriptional levels in normal colon and moderate gene density, both of which were in between compartments A and B (Figures 4D and S6B).
Thus, in addition to its topological features, compartment I is distinguished by its facultative heterochromatin state, modest transcriptional output, and methylation changes in tumors.
Compartmental Changes Linked to DNA Hypomethylation and Accumulated Cell Divisions
Compartments B and I largely correspond to hypomethylated blocks in tumors. Moreover, we observed a striking correlation between the extent of hypomethylation of a given region and its eigenvector: genomic loci with more extreme hypomethylation became relatively more B-like or compact (Figure 5A).
To further assess this relationship between compartmental changes and DNA hypomethylation, we treated HCT116 cells with the demethylating agent 5′-azacytidine (5-aza) for 24 h and measured methylation and topology changes by Hi-C. The treatment reduced methylation of large genomic intervals, with 56% of 100-kb windows losing more than 20% methylation (Figure S6C). Notably, we found that genomic regions with the most significant methylation loss shifted their interactions toward compartment B in a topological reorganization reminiscent of tumors (Figure 5B). This suggested that block hypomethylation may underlie the altered compartment structure in colon tumors.
Block hypomethylation was originally described in cancer cells (Berman et al., 2011; Hansen et al., 2011; Nordor et al., 2017) but has since been recognized to be a feature of cells that have accumulated many divisions, including aging and senescing cells (Cruickshanks et al., 2013; Zhou et al., 2018). Genome topology and nuclear structure are also altered in fibroblasts passaged to replicative senescence (Criscione et al., 2016; Sati et al., 2020). We hypothesized that the compartmental shifts in tumor cells might relate to those in passaged fibroblasts, in both cases reflecting excessive replications. To test this, we passaged WI-38 fibroblasts over a 14-week course and generated DNA methylation profiles and Hi-C data for cells harvested at passages 16, 30, and 40 (STAR Methods). Late-passage cells
(G) Whole-nucleus maximum entropy model (100-kb resolution) for normal colon, showing compartments A, I, and B.
(H) Representative DNA FISH image (left) and high-magnification image (right) for HCT116 cell nuclei. Signal intensities are shown for compartment A (blue), I (light
blue), and B (yellow) regions on chromosome 12.
(I) Barplot indicating the percentage of cells for which the maximum DNA FISH signal intensity for compartment A, B, or I is located at the indicated radial position
for 305 HCT116 cell nuclei.
(J) Representative image of nuclei from normal colon epithelial cells. The image shows DNA FISH signal intensities of probes for compartments A, B, and I of
chromosome 12. Two chromosome territories are magnified in the insets.
continued to replicate and had not yet progressed to replicative senescence. Comparison of early and late passage data confirmed progressive hypomethylation of compartments B and I with passage (Figures 5C and 5D). Moreover, late-passage hypomethylation was accompanied by topological changes analogous to tumors: the most hypomethylated regions had reduced eigenvector values, suggestive of compaction (Figure 5E). Importantly, intermediate-passage fibroblasts exhibited intermediate degrees of hypomethylation and structural reorganization (Figures 5C and 5F).
The indication that compartmental hypomethylation and topological shifts arise gradually as cells accumulate divisions prompted us to examine methylation in colonic adenomas. These pre-malignant lesions are entirely submitted for diagnostic evaluation, precluding assessment of topology, but their DNA methylation can be profiled from paraffin sections. Assessment of a published cohort of adenomas confirmed that compartmental hypomethylation was evident in these pre-malignant lesions and more severe in higher-grade cases (Figure S6D; Fan et al., 2020). Taken together, these results suggest that compartmental shifts in colorectal tumors closely relate to DNA hypomethylation and may arise gradually over the course of proliferation. Thus, the most profound topological changes in tumors likely reflect their accumulated cell divisions rather than specific oncogenic programs.
Transcriptional Consequences of Compartmental Reorganization
We next considered the transcriptional consequences of compartment B/I reorganization. We were struck that the overall transcriptional activity in these compartments was actually reduced in tumors, despite loss of DNA methylation and relocation of compartment B from the nuclear periphery. Considering genes in compartments B and I with detectable expression (transcripts per million/TPM > 0.1), we found that 3-fold more were downregulated than upregulated (Figures S6E and S6F). In contrast, compartment A genes did not exhibit such an imbalance.
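The tally described above (downregulated versus upregulated genes among detectably expressed genes in compartments B and I) can be sketched as below. Only the TPM > 0.1 detection filter comes from the text; the record layout, the function name `count_direction`, and the log2 fold-change cutoff used to call a gene up or down are assumptions for illustration, and the study may well have used a significance-based definition instead. The gene records are made up.

```python
def count_direction(genes, compartments=("B", "I"), min_tpm=0.1, min_abs_log2fc=1.0):
    """Count (down, up) genes in the given compartments with detectable expression."""
    down = up = 0
    for gene in genes:
        # Keep only genes in the requested compartments with TPM above the filter.
        if gene["compartment"] not in compartments or gene["tpm"] <= min_tpm:
            continue
        if gene["log2fc"] <= -min_abs_log2fc:
            down += 1
        elif gene["log2fc"] >= min_abs_log2fc:
            up += 1
    return down, up

# Hypothetical records: compartment, expression (TPM), tumor-vs-normal log2 fold change.
toy_genes = [
    {"compartment": "B", "tpm": 2.0, "log2fc": -1.5},
    {"compartment": "I", "tpm": 0.8, "log2fc": -2.0},
    {"compartment": "B", "tpm": 5.0, "log2fc": 1.2},
    {"compartment": "I", "tpm": 0.05, "log2fc": -3.0},  # below detection filter
    {"compartment": "A", "tpm": 10.0, "log2fc": -2.0},  # outside B/I
    {"compartment": "B", "tpm": 1.0, "log2fc": -1.1},
    {"compartment": "I", "tpm": 3.0, "log2fc": 0.2},    # unchanged
]
toy_down, toy_up = count_direction(toy_genes)
print(toy_down, toy_up)
```

Running the same function with `compartments=("A",)` gives the comparison set for compartment A.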
We reasoned that gene silencing in the reorganized compartments might be sustained (or enhanced) by their repressive chromatin states. Indeed, compartment I gained widespread H3K27me3 in tumors (Figure 4E), and analysis of published data showed that many compartment I genes were upregulated in colon cancer-initiating cells treated with the Ezh2 inhibitor (Figure S6G; Lima-Fernandes et al., 2019). Compartment B is broadly covered by the repressive H3K9me3 mark, and a subset of its promoters is silenced by focal hypermethylation within the hypomethylated blocks (Figures 4D and S6H). Thus, alternate epigenetic mechanisms may actually further repress compartments B and I upon reorganization.
We therefore sought to identify specific genes that were deregulated by this process. Because reorganization correlated quantitatively with hypomethylation (Figure 5A), we reasoned that methylation could be a surrogate for compartmental changes. We collated methylation and RNA-seq data for 239 colorectal tumors (TCGA) and used a correlation metric to identify genes in compartments B and I whose expression was consistently altered in association with block hypomethylation (Figures 6A, 6B, and S7A; STAR Methods). Notably, genes in compartment A showed no consistent transcriptional change in association with hypomethylation (Figure 6C).
Correlation analysis highlighted two gene sets (Figures 6A and 6B). A small set of genes was upregulated with block hypomethylation and included cancer germline antigens (CGAs) and endogenous retroviruses (ERVs). De-repression of CGAs and ERVs has been described in colon tumors and linked to viral mimicry and immunogenicity (Gibbs and Whitehurst, 2018; Rooney et al., 2015; Roulois et al., 2015).
A much larger set of genes in compartments B and I were downregulated with block hypomethylation (Figures 6A and 6B). Downregulated genes in compartment B were marked by H3K9me3 and/or promoter methylation in tumors, whereas those in compartment I were enriched for H3K27me3 (Figure S7B). We curated a list of robustly downregulated genes in these compartments (Table S5). To focus on malignant cell-intrinsic expression, we controlled for stromal content and excluded genes with high expression in immune cells or other non-epithelial cell types (STAR Methods). Remarkably, the resulting list of 146 genes was highly enriched for functions related to mesenchymal development, stem cell proliferation, and Wnt signaling (Figure 6D). Further analysis highlighted specific genes with established roles in colorectal cancer progression, Wnt signaling, epithelial-mesenchymal transition (EMT), invasion, and metastasis (e.g., CCBE1, EPHA4, FGFR1, FGFR2, FZD2, GPR137B, MEIS2, NFIB, PRRX1, PYGO1, SPP1, and TIAM1; Table S6; Koveitypour et al., 2019; Nguyen et al., 2020). In contrast, only a few of the downregulated genes were nominally associated with tumor-suppressive functions.
Our analyses suggest that compartmental reorganization drives induction of CGAs and ERVs, which are associated with anti-tumor immunity, and repression of genes with functions in Wnt signaling, EMT, invasion, and metastasis. Hence, the most profound topological alterations evident in tumors are actually associated with tumor-suppressive transcriptional programs.
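The idea of using methylation as a surrogate and correlating it with expression across tumors can be sketched as follows. The text specifies only "a correlation metric", so the choice of Pearson correlation, the 0.5 cutoff, the per-gene block methylation input, and all names here are assumptions for illustration rather than the method in the STAR Methods. A positive correlation means expression falls as block methylation falls (downregulated with hypomethylation); a negative correlation means the opposite. The gene identifiers and values are made up.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def classify_genes(block_methylation, expression, min_abs_r=0.5):
    """Label each gene by how its expression tracks block methylation across tumors."""
    labels = {}
    for gene, expr in expression.items():
        r = pearson(block_methylation[gene], expr)
        if r >= min_abs_r:
            labels[gene] = "down_with_hypomethylation"
        elif r <= -min_abs_r:
            labels[gene] = "up_with_hypomethylation"
        else:
            labels[gene] = "no_consistent_change"
    return labels

# Five hypothetical tumors, ordered from least to most hypomethylated.
toy_meth = [0.8, 0.7, 0.6, 0.5, 0.4]
toy_block_methylation = {"gene_1": toy_meth, "gene_2": toy_meth, "gene_3": toy_meth}
toy_expression = {
    "gene_1": [9, 8, 6, 5, 3],  # falls with methylation
    "gene_2": [1, 2, 2, 4, 6],  # rises as methylation falls
    "gene_3": [5, 6, 4, 6, 5],  # no trend
}
toy_labels = classify_genes(toy_block_methylation, toy_expression)
print(toy_labels)
```

A rank-based metric such as Spearman correlation would be a reasonable alternative when the relationship is monotonic but not linear.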
Compartment-Specific Epigenetic Changes Restrain Tumor Progression
Our collective findings suggested that accumulation of excess cell divisions leads to compartmental shifts that enact tumor-suppressive transcriptional programs. They led us to hypothesize that compartmental reorganization hinders malignant progression. To test this, we examined whether the compartmental shifts were predictive of disease risk and outcome.
First, we considered a recent survey of DNA methylation in 206 normal colon biopsies, stratified into low- and high-risk groups according to whether the donor had a concurrent colorectal tumor elsewhere in the colon (Wang et al., 2020). Examination of these data confirmed that compartments B and I became progressively hypomethylated with increasing donor age (Figure S7C). To test whether hypomethylation was protective, we compared low- and high-risk groups. We found that compartments B and I were significantly less hypomethylated in normal colon biopsies from the high-risk group, consistent with our hypothesis that the compartment shift is tumor suppressive (Figure 6E).
Second, we examined two clinical cohorts of colorectal tumors (Marisa et al., 2013; The Cancer Genome Atlas Network, 2012). Although these tumors have presumably overcome impediments posed by compartment shifts, we reasoned that the associated transcriptional changes should nonetheless hinder their progression and correlate with favorable patient outcomes. Here we focused on the 146 genes in compartments B and I that were robustly downregulated with block hypomethylation (Table S5). As expected, these genes were expressed at substantially lower levels in tumors (Figure 6F). However, there was considerable tumor-to-tumor variability. Remarkably, we found that this set of genes was highly enriched for poor prognosis markers in a cohort of 566 colon tumors (Figure S7D; p = 2.4 × 10⁻¹⁰; Marisa et al., 2013). A risk score constructed from their average expression was a strong predictor of shorter recurrence-free survival (RFS) (Figure 6G; p = 0.0007). This prognostic association was also validated in a second cohort of 443 tumors from TCGA (p = 0.03).
Survival difference was evident even after controlling for microsatellite instability (MSI), BRAF mutations, and clinical stage (p = 0.025) (Figure S7E). It was evident even in node-negative stage II tumors (p = 0.039), which is significant, given the clinical challenge associated with the uncertain course of these intermediate-stage tumors (Figure S7F) (Fotheringham et al., 2019). The gene set was also associated with metastases (p = 0.018), consistent with its functional annotations and supportive of its clinical significance.
Finally, we considered whether compartmental reorganization could be a general tumor-suppressive mechanism. We examined methylation and expression data for cohorts spanning 10 epithelial tumor types (ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium, 2020). We confirmed that block hypomethylation correlated with reduced gene expression in all 10 cancers (Figure S7G). We next collated genes in blocks that were significantly downregulated in association with hypomethylation in each cohort. The resulting sets were highly overlapping and include a shared set of 367 genes that were commonly downregulated with hypomethylation in at least 7 of the 10 tumor types (Table S7). Remarkably, these shared genes were enriched for annotated oncogenes (p = 0.002), suggesting that compartmental reorganization may also hinder the development and progression of other epithelial tumor types.
In conclusion, we document profound compartmental shifts in tumors and other cells that have accumulated many divisions (Figures 7A and 7B). This reorganization is associated with widespread transcriptional changes, including repression of EMT, invasion, metastasis, and stemness programs. Further analysis of methylation and expression data for normal colon biopsies and tumor cohorts supports the hypothesis that the compartmental shifts and associated transcriptional programs restrain tumor progression.
DISCUSSION
We presented a systematic integration of genome topology, methylation, and chromatin state in colorectal cancer. Our data and analyses parse multiple organizational layers, from E-P loops to TADs to compartment structures. In particular, they revealed three principles of large-scale compartmental organization. First, topology data for primary tissues uncovered a structurally distinct intermediate compartment I. Second, comparison of tumors and normal colon revealed widespread changes in spatial partitioning, nuclear positioning, and epigenetic states of compartments B and I that appear to be shared by tumor, aging, and other excessively replicated cells. Third, these compartmental changes correlate with and may promote tumor-suppressive expression programs associated with reduced cancer risk and better prognosis. Although tumor-associated epigenetic changes are typically construed to be oncogenic, our findings suggest that these most profound topological alterations actually restrain malignant progression.
Evidence of compartment I emerged from our analysis of primary epithelial tissue. Compartment I resides at the interface between the A and B compartments and engages in promiscuous long-range interactions with both conventional compartments. Polymer models and FISH imaging data indicate that compartment I regions occupy intermediate radial positions in nuclei. Compartment I is also distinguished epigenetically by broad H3K27me3 and robust block hypomethylation in tumors. Compartment I is distinct from previously described sub-compartments, showing the highest overlap (47%) with B1 (Rao et al., 2014). It may relate to nuclear foci documented in high-resolution imaging studies (Boettiger et al., 2016; Rowley and Corces, 2018; Xu et al., 2018). For example, Boettiger et al. (2016)
(F) Plots showing average expression of the high-confidence downregulated B/I genes (from the insets in A and B) in clinical specimens. Each point corresponds
to a different sample from a cohort of colorectal tumors and normal colons (Cancer Genome Atlas Network, 2012). The y axis represents log2-normalized counts.
(G) Kaplan-Meier curve depicting survival outcomes of patients stratified by their average tumor expression of the high-confidence downregulated compartment
B/I genes.
visualized a compartment enriched for Polycomb-associated marks and developmental genes in Drosophila.
The coherent organization of compartments A, B, and I in normal colon was profoundly distorted in tumors. We observed a breakdown of partitioning between compartments A and B, whereas compartment I shifted its interactions toward the closed B compartment. The aberrations appear to be closely related to nuclear architecture. Concordant polymer models, electron microscopy, and FISH imaging data indicate that compartment B loses its tight association with the periphery and shifts toward the nuclear interior. Falk et al. (2019) found previously that the radial asymmetry of compartments A and B is inverted in rod photoreceptors. However, compartmental partitioning was largely maintained in photoreceptors, in contrast to the overall disorganization of compartmental structure in tumors.
The breakdown of compartment structure was closely tied to pervasive changes to methylation and chromatin state. Compartments B and I acquired near-uniform hypomethylation in tumors and became further enriched for their characteristic chromatin states, H3K9me3 and H3K27me3, a pattern that has also been described for hypomethylated loci in breast cancer cell lines (Hon et al., 2012). The reorganized compartments might relate to phase-separated condensates and nuclear foci thought to play wide-ranging roles in gene and genome regulation (Larson et al., 2017; Strom et al., 2017). Notably, high-resolution analyses have identified senescence-associated heterochromatin foci with a central density of H3K9me3-marked heterochromatin surrounded by a ring of H3K27me3 (Chandra and Narita, 2013; Sati et al., 2020), features consistent with our three-compartment model in tumors.
Compartmental reorganization appears to be tightly linked to DNA hypomethylation and proliferative history. Hypomethylated blocks in tumors correspond to compartments B and I, with the most severely hypomethylated loci undergoing more negative eigenvector shifts, indicative of compaction. Methylation loss may be causal because demethylating agents directly induce topological changes. Although block hypomethylation was initially described in cancer, it is increasingly recognized to be a common feature of cells that have accumulated excess divisions, including aging and senescing cells (Berman et al., 2011; Cruickshanks et al., 2013; Nordor et al., 2017; Timp et al., 2014). Indeed, compartments B and I become progressively hypomethylated and structurally reorganized in passaged fibroblasts. Moreover, colon adenomas exhibit intermediate compartmental hypomethylation, consistent with a replicative history between normal colon and tumors. Although future studies are needed to examine the topology of these and other pre-malignant lesions, our findings suggest that compartmental shifts are not a consequence of malignancy but, rather, arise progressively as cells accumulate divisions.
We therefore propose that the compartmental reorganization reflects a fundamental epigenetic process primed by excess cell divisions, a perspective that enabled us to interpret attendant transcriptional changes. Compartmental hypomethylation was associated with repression of compartment B and I genes, likely as a consequence of repressive epigenetic states that arise in the hypomethylated compartments. Repressed genes were enriched for oncogenic functions related to EMT, invasion, and Wnt signaling, leading us to speculate that the topological shifts present a barrier to tumorigenesis. Indeed, prior studies have shown that cultured cells undergoing EMT remodel large heterochromatin domains (McDonald et al., 2011). Although most compartment B and I genes were downregulated, CGAs and ERVs with pro-immunity functions were induced and could complement restriction of stemness and invasion programs to restrain malignant progression in aging colonic epithelium or pre-malignant lesions.
Final support for a proposed tumor-suppressive role emerged from our analysis of clinical cohorts. First, methylation profiles for normal colon biopsies revealed that age-associated compartment B and I hypomethylation was associated with reduced colorectal cancer risk (Wang et al., 2020), consistent with our model and with a recent report relating morphological changes in uninvolved colonic nuclei to tumor risk (Gladstein et al., 2018). Second, examination of colorectal tumor cohorts revealed that a transcriptional signature of compartmental shift was predictive of patient outcome and likelihood of metastasis. Finally, a pan-cancer analysis suggested that the compartmental shifts and tumor-suppressive effects may be generalizable to other epithelial cancers. Future studies of these pervasive architectural changes and their functional significance in cancer and aging could inform new strategies for early detection, patient stratification, and therapeutic intervention.
STAR+METHODS
Detailed methods are provided in the online version of this paper and include the following:
● KEY RESOURCES TABLE
● RESOURCE AVAILABILITY
  ○ Lead Contact
  ○ Materials Availability
  ○ Data and Code Availability
● EXPERIMENTAL MODEL AND SUBJECT DETAILS
  ○ Human tumor specimens
  ○ Cell lines
● METHOD DETAILS
  ○ Tissue dissociation and crosslinking
  ○ Hybrid selection bisulfite sequencing
  ○ Whole genome bisulfite library preparation
  ○ ChIP-seq
  ○ Hi-C
  ○ HiChIP
  ○ RNA-seq
  ○ Electron microscopy
  ○ DNA-FISH
  ○ Treatment with 5-azacytidine
● QUANTIFICATION AND STATISTICAL ANALYSIS
  ○ DNA methylation data preprocessing and quantification
  ○ Cancer-associated DNA methylation alterations in normal colon tissue
  ○ Hi-C analysis
  ○ Association of compartmental organization with DNA hypomethylation
  ○ HiChIP analysis
  ○ Copy number variant analysis
  ○ SNP Analysis
  ○ Gene expression analysis
  ○ Pan-cancer analysis of gene expression associated to hypomethylation
  ○ Exclusion of genes of likely non-tumor cell-origin
  ○ Survival analysis
  ○ ChIP-seq analysis
  ○ DNA polymer modeling
  ○ Energy function
  ○ Parameter optimization
  ○ Molecular dynamics simulation details
  ○ Radial density profile
  ○ Electron Microscopy Analysis
  ○ DNA-FISH analysis
SUPPLEMENTAL INFORMATION
Supplemental Information can be found online at https://doi.org/10.1016/j.cell.2020.07.030.
ACKNOWLEDGMENTS
We thank Ryanne Boursiquot for assistance with sequencing; Mohammed Miri for assistance with clinical samples; Elizabeth Gaskell, Volker Hovestadt, Christine Eyler and Ryan Corcoran, Angela Shih, and Omer Yilmaz for thoughtful discussions; and Leslie Gaffney for graphic support. S.E.J. is supported by NIH T32CA009216. A.R. and R.A.I. are supported by NIH R01GM083084 and R01HG005220. J.H.C. is supported by NIH 1T32CA207021-01. M.J.A. is supported by a Broad Institute Merkin Fellowship. B.E.B. is the Bernard and Mildred Kayden Endowed MGH Research Institute Chair and an American Cancer Society Research Professor. This research was supported by the National Cancer Institute (DP1CA216873) and the Starr Cancer Consortium. This paper is dedicated to the memory of Yaw Adu Kuffour.
AUTHOR CONTRIBUTIONS
Conception and Experimental Design, S.E.J., A.R., M.J.A., and B.E.B.; Methodology and Data Acquisition, S.E.J., A.R., C.A., E.H., K.P., J.H.C., and B.E.B.; Analysis and Interpretation of Data, S.E.J., A.R., Y.Q., C.A., L.S.Z., Y.D., V.H., N.S., M.K.S., C.L., S.I., S.C.N., E.F.J., N.H., R.A.I., B.Z., M.J.A., and B.E.B.; Manuscript Writing and Revision, S.E.J., A.R., M.J.A., and B.E.B.
DECLARATION OF INTERESTS
N.H. is an equity holder of BioNTech and a consultant for Related Sciences. M.J.A. declares outside interest in Excelsior Genomics. B.E.B. declares outside interests in Fulcrum Therapeutics, 1CellBio, HiFiBio, Arsenal Biosciences, Cell Signaling Technologies, BioMillenia, and Nohla Therapeutics.
Received: July 8, 2019
Revised: May 4, 2020
Accepted: July 20, 2020
Published: August 24, 2020
REFERENCES
Baylin, S.B., and Jones, P.A. (2016). Epigenetic Determinants of Cancer. Cold Spring Harb. Perspect. Biol. 8, a019505.
Beliveau, B.J., Joyce, E.F., Apostolopoulos, N., Yilmaz, F., Fonseka, C.Y., McCole, R.B., Chang, Y., Li, J.B., Senaratne, T.N., Williams, B.R., et al. (2012). Versatile design and synthesis platform for visualizing genomes with Oligopaint FISH probes. Proc. Natl. Acad. Sci. USA 109, 21301–21306.
Berman, B.P., Weisenberger, D.J., Aman, J.F., Hinoue, T., Ramjan, Z., Liu, Y., Noushmehr, H., Lange, C.P.E., van Dijk, C.M., Tollenaar, R.A.E.M., et al. (2011). Regions of focal DNA hypermethylation and long-range hypomethylation in colorectal cancer coincide with nuclear lamina-associated domains. Nat. Genet. 44, 40–46.
Bickmore, W.A., and van Steensel, B. (2013). Genome architecture: domain organization of interphase chromosomes. Cell 152, 1270–1284.
Boettiger, A.N., Bintu, B., Moffitt, J.R., Wang, S., Beliveau, B.J., Fudenberg, G., Imakaev, M., Mirny, L.A., Wu, C.-T., and Zhuang, X. (2016). Super-
Durand, N.C., Robinson, J.T., Shamim, M.S., Machol, I., Mesirov, J.P., Lander, E.S., and Aiden, E.L. (2016). Juicebox Provides a Visualization System for Hi-C Contact Maps with Unlimited Zoom. Cell Syst. 3, 99–101.
Falk, M., Feodorova, Y., Naumova, N., Imakaev, M., Lajoie, B.R., Leonhardt, H., Joffe, B., Dekker, J., Fudenberg, G., Solovei, I., et al. (2019). Heterochromatin drives compartmentalization of inverted and conventional nuclei. Nature 570, 395–399.
Fan, J., Li, J., Guo, S., Tao, C., Zhang, H., Wang, W., Zhang, Y., Zhang, D., Ding, S., and Zeng, C. (2020). Genome-wide DNA methylation profiles of low- and high-grade adenoma reveals potential biomarkers for early detection of colorectal carcinoma. Clin. Epigenetics 12, 56.
Imakaev, M., Fudenberg, G., McCord, R.P., Naumova, N., Goloborodko, A., Lajoie, B.R., Dekker, J., and Mirny, L.A. (2012). Iterative correction of Hi-C data reveals hallmarks of chromosome organization. Nat. Methods 9, 999–1003.
Kloetgen, A., Thandapani, P., Ntziachristos, P., Ghebrechristos, Y., Nomikou, S., Lazaris, C., Chen, X., Hu, H., Bakogianni, S., Wang, J., et al. (2020). Three-dimensional chromatin landscapes in T cell acute lymphoblastic leukemia. Nat. Genet. 52, 388–400.
Koveitypour, Z., Panahi, F., Vakilian, M., Peymani, M., Seyed Forootan, F., Nasr Esfahani, M.H., and Ghaedi, K. (2019). Signaling pathways involved in colorectal cancer progression. Cell Biosci. 9, 97.
STAR+METHODS
Continued
REAGENT or RESOURCE | SOURCE | IDENTIFIER
chr12_A_all (Secondary 1 Binding Site) F | IDT | CACCGACGTCGCATAGAACGGAAGAGCGTGTGGACAGCCGGTTCGGTCGTTC
chr12_A_all (Secondary 1 Binding Site) R | IDT | TAATACGACTCACTATAGGGCGGTCCCGTCCGAGGTATAC
chr12_B_all (Secondary 5 Binding Site) F | IDT | TAGCGCAGGAGGTCCACGACGTGCAAGGGTGTTCGTTCACCGCGCGTTGAAG
chr12_B_all (Secondary 5 Binding Site) R | IDT | TAATACGACTCACTATAGGGCGGTCCCGTCCGAGGTATAC
chr12_I_all (Secondary 6 Binding Site) F | IDT | CACACGCTCTCCGTCTTGGCCGTGGTCGATCAGCGATCTGCGCATGGTAATC
chr12_I_all (Secondary 6 Binding Site) R | IDT | TAATACGACTCACTATAGGGCGGTCCCGTCCGAGGTATAC
Secondary 1- Alexa488 | IDT | ACACACGCTCTTCCGTTCTATGCGACGTCGGTGA
Secondary 1- Alexa647 | IDT | ACACACGCTCTTCCGTTCTATGCGACGTCGGTGA
Secondary 5- Atto565 | IDT | ACACCCTTGCACGTCGTGGACCTCCTGCGCTA
Secondary 6- Alexa647 | IDT | TGATCGACCACGGCCAAGACGGAGAGCGTGTG
Software and Algorithms
Juicebox | Durand et al., 2016 | http://aidenlab.org/juicebox/
Cell Profiler | McQuin et al., 2018 | version 3.1.9
FIJI | Schindelin et al., 2012 | version 2.0.0-rc-69/1.52p
HiC-Pro | Servant et al., 2015 | version 2.10.0
Bioconductor | Huber et al., 2015 | release 3.11
Code supporting this study | This paper | https://github.com/aryeelab/colon-dna-topology
OligoMiner scripts | Beliveau et al., 2012 | https://github.com/beliveau-lab/OligoMiner
RESOURCE AVAILABILITY
Lead Contact
Further information and requests for resources and reagents should be directed to the Lead Contact, Bradley Bernstein (Bernstein.
[email protected]).
Materials Availability
No unique reagents were generated for this study.
Cell lines
Our in vitro models included colon cancer cell lines (HCT116, SW480, RKO, LS174T), a line derived from normal fetal colonic epithelium
(FHC) and a primary fibroblast line (WI-38). Colon cell lines were purchased from ATCC: HCT116 (CCL-247), SW480 (CCL-228),
LS174T (CL-188), RKO (CRL-2577) and FHC (CRL-1831). The primary fibroblast line WI-38 was obtained from Coriell (AG06814-N).
HCT116 and SW480 were grown in McCoy’s 5A medium (GIBCO 16600082), 10% FBS and 0.5% pen-strep (GIBCO 10378016).
LS174T and RKO were grown in EMEM (ATCC 30-2003), 10% FBS and pen-strep (GIBCO 10378016). FHC was cultured as per
ATCC in DMEM/F12 media (ATCC 30-2006), 25 mM HEPES, 10 ng/ml cholera toxin, 0.005 mg/ml insulin, 0.005 mg/ml transferrin,
100 ng/ml hydrocortisone, 20 ng/ml recombinant EGF (Thermo Fisher PHG0311) and 10% FBS. WI-38 was cultivated in EMEM
with 15% FBS and passaged serially (approximately twice weekly) for 14 weeks.
METHOD DETAILS
ChIP-seq
We generated chromatin state maps (H3K27ac, H3K36me3, H3K9me3 and H3K27me3) and binding profiles for the CTCF insulator
protein by ChIP-seq. ChIP-seq was performed as described previously (Liau et al., 2017). In brief, crosslinked cells were lysed and
DNA was sheared to fragments of between 400 and 2,000 base pairs. Antibodies were as follows: CTCF (Cell Signaling #3418), H3K27ac
(Active Motif #39133), H3K9me3 (Abcam #8898), H3K27me3 (Cell Signaling #97335) and H3K36me3 (Abcam #9050). ChIP DNA was
used to generate sequencing libraries by end repair (End-It DNA repair kit, Epicentre), 3' A base overhang addition via Klenow fragment
(NEB), and ligation of barcoded sequencing adapters. Barcoded fragments were amplified via PCR. Libraries were sequenced
as 38-base paired-end reads on an Illumina NextSeq500 instrument.
Hi-C
Hi-C maps of chromosome topology were initially generated for a cohort of 7 primary tumors, 4 normal colon tissue samples and 5
cell lines (cohort 1; Table S1). We then confirmed our results by acquiring Hi-C data for a validation cohort of 5 tumors and 3 normal
colon samples (cohort 2; Table S1).
In situ Hi-C was performed as described previously (Rao et al., 2014). In brief, crosslinked cells or tumor samples were thawed on ice in Hi-C
lysis buffer. Tissue samples were mechanically disrupted with the Biomasher tissue grinder (Kimble Chase). Tissue and cell line
samples were permeabilized in 0.5% SDS at 37°C and quenched with Triton X-100, and chromatin was digested with 100-200 U
MboI at 37°C overnight. Nuclei were then pelleted, ends were marked with biotin-14-dATP (ThermoFisher 19524016)
and chromatin was ligated for 5 hours with T4 DNA ligase (NEB M0202). Samples were treated with proteinase K at 55°C for 30 minutes
and cross-links were reversed at 68°C overnight. DNA was ethanol precipitated and sheared on a Covaris LE220. DNA was
cleaned up with AMPure XP beads (Beckman Coulter, A63881) and quantified by Qubit dsDNA High Sensitivity Assay (Life Technologies,
Q32854). Samples were bound to Dynabeads MyOne Streptavidin T1 beads (Life Technologies, 65602) and washed. End
repair, dATP attachment and adaptor ligation were then performed. Final amplification was performed by PCR using barcoded sequencing
primers. Libraries were purified using AMPure XP beads and sequenced on either a NextSeq500 (150 cycle kit), HiSeq2500
(high output; 200 cycle kit) or NovaSeq S4 (200 cycles).
HiChIP
We acquired SMC1 HiChIP data for cohort 1 (Table S1). HiChIP was performed as described previously (Mumbach et al., 2016).
Briefly, crosslinked samples were lysed in Hi-C lysis buffer and chromatin was permeabilized in 0.5% SDS at 63°C for 10 minutes.
Chromatin was digested with MboI for 2 hours at 37°C. Overhangs were filled in and marked with Biotin-dATP (ThermoFisher
19524016), and ends were ligated with T4 DNA Ligase (NEB M0202) for 4 hours at room temperature. Nuclei were pelleted and lysed
and chromatin was sheared on the Covaris E220 with the following conditions: Fill level 5, Duty Cycle 5, PIP 140, Cycles/burst 200,
Time 4 minutes. Samples were clarified, diluted in ChIP dilution buffer and precleared with Protein G beads (Invitrogen 11205D) for 1
hour at 4°C. Samples were cleared on a magnet and the supernatant was added to antibody. Chromatin was immunoprecipitated overnight at 4°C with
rotation. Protein G beads were added and incubated for 2 hours at 4°C with rotation. Beads were washed with low salt, high salt and
LiCl buffer. Sample was eluted in ChIP elution buffer, treated with Proteinase K and crosslinks were reversed. DNA was purified using
the Zymo clean & concentrate kit (DCC-100). Streptavidin M280 beads (Invitrogen 11205D) were washed and resuspended in 2x
biotin binding buffer and DNA was bound for 15 minutes at room temperature. Beads were washed and Tn5 fragmentation was carried
out as per Mumbach et al. (2016), with dilutions of Tn5 to process low input samples. Libraries were amplified using the Nextera DNA
Library Prep kit (Illumina). Material was cleaned up using AMPure XP beads. Libraries were sequenced on a NextSeq500 (150 cycle kit)
or a HiSeq2500 (high output; 200 cycle kit).
RNA-seq
Whole RNA was extracted using the QIAGEN RNeasy kit according to the manufacturer’s protocol. For RNA-seq library preparation,
Poly(A)+ RNA was enriched using magnetic oligo(dT)-beads (Life Technologies) and then ligated to RNA adaptors for sequencing.
RNA-seq was performed with two biological replicates per colon cancer line and in singlicate for tumor samples. Libraries were
sequenced as 38-base paired-end reads on an Illumina NextSeq500 instrument.
Electron microscopy
Fresh tissue biopsies were placed directly into EM fixative (2.5% glutaraldehyde, 2.0% paraformaldehyde, 0.025 calcium chloride in
a 0.1M sodium cacodylate buffer, pH 7.4) and allowed to fix for 3 hours at room temperature or overnight at 4°C. Further processing was
done in an EMS (Electron Microscopy Sciences) Lynx II automatic tissue processor. Briefly, tissues were post-fixed with osmium tetroxide,
dehydrated in a series of ethanol solutions, en bloc stained in the 70% ethanol step with uranyl acetate, and further dehydrated
in 100% ethanol and propylene oxide. Tissues were infiltrated in a series of propylene oxide/Epon mixtures and embedded in pure
Epon. The Epon blocks were polymerized overnight in a 60°C oven. One-micron sections were cut using glass knives and stained with
toluidine blue. Representative areas were chosen by light microscopy. Thin sections were cut using an LKB ultramicrotome and diamond
knife. The sections were stained with Sato’s lead stain and examined with a FEI Morgagni transmission electron microscope.
Images were captured with an AMT (Advanced Microscopy Techniques) 2K digital CCD camera.
DNA-FISH
Generation of DNA-FISH probes
Based on evidence of CNV stability in tumors with available biological material, chromosome 12 was chosen for validation experiments.
Only the p-arm of chromosome 12 was considered to avoid measuring arm-related differences in nuclear positioning. For
compartments A and B, consecutive regions of at least 300 kb with absolute PC1 values larger than 0.5 in normal colon were identified
(see ‘‘Hi-C analysis’’ for the eigenvector decomposition of Hi-C matrices). For these consecutive segments, the 100 kb regions at
the middle of each segment were selected. For compartment I, a set of 100 kb regions was selected so that their linear distances to
the B candidate regions were equivalent to the distance between the A candidate regions and the B candidate regions. Furthermore,
the candidate regions were screened to have the same compartment-specific characteristics in HCT116 cells. Then, Oligopaint libraries
were designed using the OligoMiner pipeline (Beliveau et al., 2012). Specifically, candidate regions were mined for probes
using a length requirement of 80 nucleotides of homology, a melting temperature range of 47-80°C, and default settings for the remaining
OligoMiner parameters. Resulting oligos yielded an average probe density of 4.6 probes/kb. An oligo pool (Twist Bioscience)
was synthesized such that all probes targeting A, B, or I regions could be created in aggregate. Single-stranded probes were produced
using PCR, T7 RNA synthesis, and reverse transcription as described previously (Rosin et al., 2018; Shav-Tal, 2013).
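The selection of A and B candidate regions described above can be illustrated with a minimal Python sketch. It is not the authors' code: it assumes PC1 is available as one value per 100 kb bin, that a qualifying segment is a run of same-sign bins with |PC1| > 0.5, and that for even-length runs the lower of the two middle bins is taken. The function name `candidate_regions` and its parameters are hypothetical.

```python
import numpy as np

def candidate_regions(pc1, bin_kb=100, min_run_kb=300, threshold=0.5):
    """Return (bin_index, label) for the middle bin of each qualifying run.

    Assumptions (not from the paper): pc1 holds one value per 100 kb bin,
    and a run must keep the same sign to count as a single A or B segment.
    """
    pc1 = np.asarray(pc1, dtype=float)
    min_bins = int(np.ceil(min_run_kb / bin_kb))
    # +1 for strongly A bins, -1 for strongly B bins, 0 otherwise
    state = np.where(pc1 > threshold, 1, np.where(pc1 < -threshold, -1, 0))
    picks = []
    start = 0
    for i in range(1, len(state) + 1):
        # A run ends at the end of the array or when the state changes
        if i == len(state) or state[i] != state[start]:
            run_len = i - start
            if state[start] != 0 and run_len >= min_bins:
                mid = start + (run_len - 1) // 2  # lower middle bin
                picks.append((mid, "A" if state[start] > 0 else "B"))
            start = i
    return picks

# Toy PC1 track: a 3-bin A run, a 4-bin B run, and a 2-bin A run (too short)
toy_pc1 = [0.8, 0.9, 0.7, 0.1, -0.6, -0.7, -0.9, -0.8, 0.6, 0.6]
toy_picks = candidate_regions(toy_pc1)
```

The compartment I candidates, which the text selects by matching linear distance to the B candidates, are not covered by this sketch.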
DNA-FISH on tissue
For DNA-FISH, tissue samples were fixed in 4% paraformaldehyde (FisherScientific, 15710) for 2 hours, washed in 1X PBS and
soaked for 2x 10 minutes in 0.5M NH4Cl (26.6g/L in PBS, Sigma-Aldrich, 213330). The samples were cryoprotected by overnight
incubation in 30% w/v sucrose (Sigma-Aldrich, S0389) in PBS at 4°C, nutating. The next morning, the sucrose was replaced with
30% sucrose, 50% OCT (Tissue-Tek O.C.T. Compound, Sakura, 25608-930) for two hours at 4°C, nutating. They were embedded in
plastic peel-away molds in OCT, frozen on a cold block in liquid nitrogen, and stored in the freezer until sectioning. 5 µm sections were
cut using a cryostat, collected on Superfrost Plus microscope slides (FisherSci 22-037-246) and stored at -20°C.
Slides were thawed at room temperature for 30' and rehydrated in 10 mM sodium citrate for 5'. Next, the slides were incubated for
10' in 10 mM sodium citrate at 80°C and allowed to cool down at room temperature for 30'. They were washed twice in 2X SSC, 5'
each, and transferred to 50% formamide (Sigma Aldrich, F9037) in 2X SSCT (SSC + 0.1% Tween-20) for at least 1h. Following denaturation,
the tissue was dehydrated in an ethanol series (70%, 95%, 100%, three minutes each) and air-dried for at least 90 minutes.
They were then acetylated with acetic anhydride as follows: dried slides were equilibrated in 0.1M fresh ethanolamine (Sigma Aldrich,
90279) in dH2O, pH 8.0 for 10 minutes, then transferred to 0.25% v/v acetic anhydride (Sigma Aldrich, 320102) in 0.1M ethanolamine
for 5' followed by washing for 10' in 2X SSCT. During the previous steps, probes against compartments A, B and I were prepared by
adding 1 µL of 100 µM dNTPs (Life Technologies, R1121) and 50 pmol probes each per slide, speed vacuuming on high for 20' (or until
the liquid had evaporated), and resuspending the probes in 5.25 µL per sample. This probe-dNTP mix was then added to the hybridization
mixture containing a final concentration of 50% formamide, 10 µg RNase A (Thermo Scientific EN0531) and 1x Dextran Sulfate
Mix (10% Dextran Sulfate D8906, Sigma Aldrich, 2xSSC, 0.1% Tween-20). Next, the slides were mounted with probe mix, covered
with a glass coverslip and rubber cement (Staples, EPI231) and incubated on a hot block at 42°C for 1-3 hours to allow infiltration with
the probe. Then, the slides were heat-shocked to denature the section and probe at 85°C for 7 minutes and incubated overnight
(16h+) at 37°C in a temperature-controlled oven. The next day, the coverslips were removed and the sections were washed at
60°C for 15 minutes in pre-warmed 2X SSCT, at room temperature for 10' in 2X SSCT, at room temperature for 10' in 0.2X SSCT,
and transferred to 2X SSC while the secondary mix was prepared, consisting of 10% formamide, 1x Dextran Sulfate Mix and 10 pmol
of secondary probes labeled with Cy3 (B compartment), Cy5 (A or I compartment) and A488 (A compartment) in dH2O. The sections
were incubated with secondary mix in a humid chamber at room temperature in the dark for at least two hours, followed by the same
washes as before. In the last step with 2X SSC, the nuclei were counterstained with DAPI (Sigma-Aldrich, D1305, 5 µg/mL final concentration),
briefly washed in 2X SSC and mounted using SlowFade Gold Antifade Mountant (Invitrogen, S36936).
DNA-FISH on HCT116 cells
Cells were grown on glass coverslips, fixed in 4% PFA for 15 minutes, and processed as described above with the following adjustments:
sodium citrate and acetic anhydride treatments were omitted; instead, the cells were permeabilized after fixation using 0.5%
Triton X-100 for 15' followed by a 5' wash in PBS and alcohol denaturation. Prior to probe mix addition, the samples were re-equilibrated
using 2X SSCT + formamide (50% v/v 4x SSCT + 50% v/v formamide) for 1h at 37°C. The probes were then immediately
denatured at 80°C for 5' and incubated overnight at 37°C. The DNA-FISH stainings were imaged on a Zeiss LSM800 confocal microscope
with Airyscan settings and a 63x oil objective.
Hi-C analysis
We confirmed data quality by assessing the fraction of cis-long range contacts for each library. Each map contained an average of
320 million contacts, for an average resolution of 10 Kb. We used these data to derive TAD and compartment structures.
Data were controlled for quality, mapped to the reference genome (hg19) and converted into interaction matrices using HiC-Pro
v2.10.0 (Servant et al., 2015) with pipeline code available at https://github.com/aryeelab/topology_tools. Within-sample normalization
was performed using the Iterative Correction and Eigenvector decomposition (ICE) method (Imakaev et al., 2012). For each chromosome
in each sample, compartments were called using the standard PCA method (Lieberman-Aiden et al., 2009). Briefly, the interaction
matrix $X = (x_{ij})$ was transformed into an observed over expected (O/E) matrix by dividing each element of the matrix by the
expected interaction frequency for a given distance from the diagonal $k = |i - j|$, defined as the mean of the values $x_{ij}$ with the same
value of $k$. A correlation matrix was generated by estimating the pairwise correlation coefficients of all the rows of the O/E matrix.
Then, an eigendecomposition was performed on the correlation matrix, and the sign of the first eigenvector was used to assign
compartment labels. We used expression data and GC content to flip the sign of the eigenvector such that values larger than 0 correspond
to open (A) regions and values smaller than 0 correspond to closed (B) regions.
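As an illustration of the PCA procedure just described, the following NumPy sketch computes the O/E matrix, the row correlation matrix, and its leading eigenvector. It is a simplified stand-in rather than the pipeline used in the study: the function names are hypothetical, the input is assumed to be an already ICE-normalized dense matrix with no empty bins, and the `orient` vector stands in for the expression and GC-content tracks used to fix the sign.

```python
import numpy as np

def observed_over_expected(x):
    """Divide each entry by the mean of all entries at the same |i - j|."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    oe = np.zeros_like(x)
    for k in range(n):
        expected = np.diagonal(x, offset=k).mean()
        if expected > 0:
            idx = np.arange(n - k)
            oe[idx, idx + k] = x[idx, idx + k] / expected
            oe[idx + k, idx] = x[idx + k, idx] / expected
    return oe

def first_eigenvector(x, orient=None):
    """Leading eigenvector of the row-correlation matrix of the O/E matrix.

    orient is an optional per-bin track (an assumption here, e.g. GC content)
    used only to flip the sign so that positive values mark A-like bins.
    """
    oe = observed_over_expected(x)
    corr = np.corrcoef(oe)                 # pairwise correlation of rows
    vals, vecs = np.linalg.eigh(corr)      # corr is symmetric
    pc1 = vecs[:, np.argmax(vals)]
    if orient is not None and np.corrcoef(pc1, orient)[0, 1] < 0:
        pc1 = -pc1
    return pc1

# Toy map: two contiguous blocks with distance decay and 2x same-block contacts
n = 8
blocks = np.array([1, 1, 1, 1, -1, -1, -1, -1])
dist = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
toy = (1.0 / (1.0 + dist)) * np.where(np.outer(blocks, blocks) > 0, 2.0, 1.0)
toy_pc1 = first_eigenvector(toy, orient=blocks)
labels = np.where(toy_pc1 > 0, "A", "B")
```

Real maps contain empty or low-coverage bins whose rows have zero variance, which would produce NaN correlations here; such bins would need to be masked before the decomposition.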
In addition to using the eigenvector (PC1) metric, we also directly quantified the tendency of each region to interact with other re-
gions in either the A or B compartments. We calculated the ‘‘A/B interaction ratio,’’ defined for each 100kb genomic window as the
ratio of interaction frequency with the A versus B compartments using the O/E matrix. Specifically, we calculate log2(mean O/E inter-
action frequency with A regions) – log2(mean O/E interaction frequency with B regions). For each comparison, A and B were defined in
the ‘baseline’ condition: normal colon tissue, HCT116 cells treated with DMSO, or early passage fibroblasts (passage 16). The same
A and B definitions were used for all samples within a comparison. We confirmed that assessing the relationship between compart-
mental change and hypomethylation gave consistent results when using A/B interaction ratio instead of eigenvector as the outcome
(See ‘‘Association of compartmental organization with DNA hypomethylation’’).
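As an illustration of the A/B interaction ratio, the sketch below computes the difference of log2 mean O/E values for each bin of a toy matrix. The function name `ab_interaction_ratio` is hypothetical, and excluding a bin's contact with itself is an assumption of this sketch that the text does not specify.

```python
import math


def ab_interaction_ratio(oe, labels):
    """Per bin: log2(mean O/E with A bins) - log2(mean O/E with B bins)."""
    a_bins = [j for j, lab in enumerate(labels) if lab == "A"]
    b_bins = [j for j, lab in enumerate(labels) if lab == "B"]
    ratios = []
    for i in range(len(oe)):
        # Self-contacts are skipped (assumption, not stated in the text).
        with_a = [oe[i][j] for j in a_bins if j != i]
        with_b = [oe[i][j] for j in b_bins if j != i]
        mean_a = sum(with_a) / len(with_a)
        mean_b = sum(with_b) / len(with_b)
        ratios.append(math.log2(mean_a) - math.log2(mean_b))
    return ratios


# Baseline compartment labels and a toy O/E matrix (made-up values).
labels = ["A", "A", "B", "B"]
oe = [
    [1.0, 2.0, 0.5, 0.5],
    [2.0, 1.0, 0.5, 0.5],
    [0.5, 0.5, 1.0, 2.0],
    [0.5, 0.5, 2.0, 1.0],
]
ratios = ab_interaction_ratio(oe, labels)
```

Positive values indicate a bin that contacts A regions preferentially, and negative values indicate a preference for B regions. As in the text, the labels come from the baseline condition and are reused for every sample in a comparison.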
In primary tissues, compartment I was defined as those genomic regions with a positive value of the first eigenvector that were
within a block of DNA hypomethylation (defined by comparing tumors versus normal). We verified that compartment I could be iden-
tified based only on Hi-C data. To do this, we plotted the first 5 eigenvectors of the matrix decomposition used to define compart-
ments A and B. The first eigenvector separates compartment A and B. Compartment I could be identified using the second eigen-
vector on several chromosomes. In other chromosomes, compartment I was often separated by lower eigenvectors, and became
more evident when we applied the matrix decomposition method to individual chromosome arms. For IMR90 cells, compartment
I was assigned as in primary tissues using DNA methylation differences between proliferating and senescent cells. For HCT116,
100Kb genomic bins were labeled as I if two consecutive genomic bins had a positive value on the eigenvector decomposition
described (Lieberman-Aiden et al., 2009), and if their open sea CpG DNA methylation values were equal to or less than 80%. We
confirmed that this approach enriched for regions that were consistent with our definition of compartment I, i.e., that had intermediate
A/B contact Hi-C patterns and that were enriched for the H3K27me3 chromatin mark compared to compartments A and B.
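One possible reading of the HCT116 labeling rule is sketched below: a bin is labeled I when it and at least one adjacent bin both have a positive eigenvector value and open sea CpG methylation of at most 80%. This interpretation of "two consecutive genomic bins", the function name `label_compartment_i`, and the toy values are all assumptions.

```python
def label_compartment_i(pc1, methylation, max_methylation=0.80):
    """Flag bins that form a run of >= 2 adjacent bins passing both criteria."""
    n = len(pc1)
    passes = [pc1[i] > 0 and methylation[i] <= max_methylation for i in range(n)]
    labels = []
    for i in range(n):
        left_ok = i > 0 and passes[i - 1]
        right_ok = i < n - 1 and passes[i + 1]
        labels.append(passes[i] and (left_ok or right_ok))
    return labels


# Toy eigenvector values and methylation fractions per 100Kb bin (made up).
pc1 = [0.5, 0.4, -0.2, 0.3, 0.6, 0.7]
methylation = [0.60, 0.70, 0.50, 0.75, 0.90, 0.70]
is_compartment_i = label_compartment_i(pc1, methylation)
```

Here only the first two bins are labeled I. Bins 3 and 5 pass both criteria individually but have no qualifying neighbor.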
Sub-compartment structures defined by Rao et al., 2014 were called using SNIPER (Xiong and Ma, 2019). SNIPER uses Hi-C data
and sub-compartment calls to train a model that is able to learn the interaction patterns of each sub-compartment. These trained
models can be used to define sub-compartments from Hi-C data that are distinct from the training Hi-C data. To call sub-compart-
ments in colon tissue, we used the pre-trained models provided by SNIPER that were trained using 5% of GM12878 Hi-C data (Rao
et al., 2014), as these data were of comparable coverage to our individual Hi-C samples.
Insulation scores were calculated by defining two windows of length l, one upstream and one downstream of a given genomic po-
sition. For each chromosome, the log (base 2) ratio of the sum of interaction counts within each window and the interaction counts
between the two windows was calculated at each position. These log ratios were further transformed into z-scores by subtracting the
median and dividing by the median absolute deviation. In order to capture boundaries of TADs of varying sizes, including nested TAD
structures, the z-scores were calculated genome-wide for different values of l, specifically l = 200Kb, l = 400Kb and l = 800Kb.
Boundaries were called for genomic positions where the z-scores were larger than the 90% quantile of a standard normal distribution
in at least one resolution in one sample. Three strategies were used to assess the stability of TADs across samples: (1) we calculated
the correlation coefficients of the insulation scores, (2) for each sample, we defined the position of TAD boundaries in that sample and
calculated the percentage of TAD boundaries where the insulation scores of other samples were also at a local minimum, and (3) we
plotted metaplots of the insulation scores to visually inspect conservation of TAD boundaries.
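The insulation score and boundary call can be sketched as follows for a single window size. This is a minimal illustration under stated assumptions: windows are measured in bins rather than base pairs, the within-window sum uses the full square of each window, the median absolute deviation is left unscaled, and the function names are hypothetical.

```python
import math
from statistics import NormalDist, median


def insulation_scores(matrix, w):
    """log2(within-window contacts / between-window contacts) at each position."""
    n = len(matrix)
    scores = {}
    for p in range(w, n - w + 1):
        up = range(p - w, p)      # window upstream of position p
        down = range(p, p + w)    # window downstream of position p
        within = sum(matrix[i][j] for i in up for j in up)
        within += sum(matrix[i][j] for i in down for j in down)
        between = sum(matrix[i][j] for i in up for j in down)
        scores[p] = math.log2(within / between)
    return scores


def robust_z(values):
    """Subtract the median and divide by the (unscaled) median absolute deviation."""
    center = median(values)
    mad = median(abs(v - center) for v in values)
    return [(v - center) / mad for v in values]


# Toy matrix with two 4-bin domains: 10 contacts within, 1 contact across.
toy = [[10.0 if (i < 4) == (j < 4) else 1.0 for j in range(8)] for i in range(8)]
scores = insulation_scores(toy, w=2)
threshold = NormalDist().inv_cdf(0.90)  # 90% quantile of a standard normal
z_scores = robust_z(list(scores.values()))
boundaries = [p for p, z in zip(scores, z_scores) if z > threshold]
```

In this toy example the only position exceeding the threshold is bin 4, where the two domains meet. In the analysis described above the same calculation is repeated for each window length and sample.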
Juicebox (Durand et al., 2016) was used for exploratory visualization of the Hi-C data.
HiChIP analysis
Data were controlled for quality, mapped to the reference genome (hg19) and converted into interaction matrices using the HiC-Pro
pipeline (Servant et al., 2015). Chromatin loops were called using hichipper 0.7.3 with the parameter to use user-defined peaks
(Lareau and Aryee, 2018b). hichipper was run for each HiChIP sample using the union of CTCF and H3K27ac peaks as the predefined
peak set. The hichipper pipeline defines potential loop anchors by extending peaks by a fixed window (i.e., 500bp) to account for
uncertainty in the peak calling and merging peaks whose genomic distance is below 500bp. These extended peaks are overlapped
with restriction fragments and are further extended to the edges of the restriction fragments they overlap. For each pair of potential
loop anchors, hichipper counts the number of valid contact pairs (defined by HiC-Pro) that support their 3D interaction. Running this
step for each sample results in a matrix $Z_{ij} = z_{ij}$ where each column is a sample $j$ and each row $i$ represents a pair of loop anchors (i.e.,
a loop), and $z_{ij}$ is the number of valid read pairs that support loop $i$ in sample $j$. To distinguish random background contacts
from contacts due to DNA looping, hichipper runs the mango background correction model on the sum of loop counts across
samples (i.e., $\sum_j z_{ij}$). The mango correction consists of modeling the counts using a binomial distribution to estimate the probability of
observing the counts between two genomic loci given their genomic distance. The resulting p values are corrected for multiple testing.
Loops with a q-value smaller than 0.1, with at least 4 valid contacts in two or more samples and with at least 20 counts across all
samples were considered high-confidence loops and were considered for further analysis. Significant loops were annotated as
enhancer-promoter loops if one of the anchors overlapped an H3K27ac peak (enhancer-like) and the other anchor overlapped the
promoter of a gene.
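The three filtering criteria for high-confidence loops can be expressed compactly. The sketch below applies them to made-up counts and q-values; the function name `is_high_confidence` and the loop identifiers are hypothetical.

```python
def is_high_confidence(counts, q_value, q_max=0.1, min_contacts=4,
                       min_samples=2, min_total=20):
    """Apply the q-value, per-sample support and total-count criteria."""
    supported = sum(1 for c in counts if c >= min_contacts)
    return (q_value < q_max
            and supported >= min_samples
            and sum(counts) >= min_total)


# Per-sample valid contact counts and q-values for four toy loops.
loops = {
    "loop_a": ([10, 8, 3, 0], 0.01),     # passes all three criteria
    "loop_b": ([12, 2, 3, 3], 0.01),     # only one sample with >= 4 contacts
    "loop_c": ([5, 5, 4, 4], 0.01),      # total count of 18 is below 20
    "loop_d": ([10, 10, 10, 10], 0.20),  # q-value too large
}
kept = [name for name, (counts, q) in loops.items()
        if is_high_confidence(counts, q)]
```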
To assess the robustness of our results to the loop calling algorithm, we repeated our analyses using a different loop calling algorithm,
cLoops (Cao et al., 2019). We ran cLoops version 0.92 using the parameters ‘‘-hic -eps 2500,5000,7500,10000 -minPts 3,5,10 -j -s -w’’
for each sample and used the union of the per-sample significant loops as our final set of loops. The global trends described in the main
text were robust to the loop calling algorithm.
The software tool diffloop was used to test for differential looping (Lareau and Aryee, 2018a). diffloop uses the statistical engine of
the edgeR package (Robinson et al., 2010), where the matrix of loop counts Zij is modeled using generalized linear models (GLM) of
the negative binomial distribution:
$$Z_{ij} \sim \mathrm{NB}(\mu_{ij}, \alpha_i)$$
where $\mu_{ij}$ is the fitted mean and $\alpha_i$ is the dispersion estimate, which is estimated using the common dispersion method from edgeR. $n_{ij}$
are normalization factors that account both for library size and for copy number differences between samples. Specifically, we overlapped
the genomic coordinates of the loop anchors with the copy number estimates of each sample (see Section ‘‘Copy number
variant analysis’’ for details) to generate a matrix of copy number estimates (loops $i$ times samples $j$). This matrix of copy numbers
was row-centered. The resulting matrix was multiplied by the library size factors estimated by edgeR and the resulting values defined
$n_{ij}$, which were introduced as offsets when fitting the GLMs. To test for differential looping between conditions, a GLM was fitted for
each loop, a likelihood ratio test was used to calculate p values for each loop, and the Benjamini-Hochberg procedure was used to correct for
multiple testing.
error variability, rklnorm is smoothed throughout the genome (Huang et al., 2007). Third, rklnorm values are transformed so that the most
common genomic regions are centered to ratio one. Then, CNAnorm normalizes for tumor purity by shrinking rklnorm so that the modes
of the distribution fit values resulting from copy number alteration processes (deletion = 0, deletion of one chromosome copy = 0.5, no
CNV = 1, amplification of one copy = 1.5, etc). Finally, a circular binary segmentation algorithm is used to define regions with copy
number alterations (Olshen et al., 2004).
Since not all our tumor samples contained a matching normal sample, we used the normal sample with the highest coverage as the
reference sample for all our tumors. We used copy number calls to verify that the epigenetic differences between tumors and normal
were not driven solely by copy number alterations. To do so, we introduced the estimated CNV as offsets in the statistical models when
making inferences, and we confirmed that the epigenetic differences were still present after masking genomic regions with CNVs.
SNP Analysis
We identified 28 colon cancer risk SNPs (MacArthur et al., 2017) that coincide with E-P loop enhancer anchors and assigned target
genes based on the corresponding promoter contact (Table S4). These looping data confirmed predicted targets, including risk SNPs
previously associated with COLCA1/2 or TERC expression (Figures S2I–S2K; Peltekova et al., 2014). Five of the 28 risk SNPs were associated
with distal genes rather than the nearest promoter (Table S4).
Survival analysis
Preprocessed Affymetrix U133Plus2 microarray gene expression data (NCBI GEO GSE39582) were downloaded using the curatedCRCData
R/Bioconductor package (Marisa et al., 2013). Eight of 566 samples with low average pairwise correlation (< 0.9) with
other samples were excluded in a QC filtering step. Gene expression values were Z-score transformed.
We computed a gene expression risk score as the average normalized expression level of the 146 genes that are downregulated
with hypomethylation in the I and B compartments for samples from Marisa et al. (2013) and the TCGA COAD cohort (Cancer Genome Atlas
Network, 2012). For Marisa et al. (2013), we restricted to the 145 genes where the corresponding gene symbol was present. We defined a
‘‘high’’ score as those values that were 2 SD (robustly estimated with the R ‘mad’ function) above the median for the Marisa et al.
(2013) study, and 1 SD above the median for the TCGA cohort. We constructed Kaplan-Meier survival curves using available
data for recurrence-free survival (Marisa et al., 2013) and overall survival (Cancer Genome Atlas Network, 2012). To assess the as-
sociation of expression risk score with survival outcomes while adjusting for known risk factors we fit a Cox Proportional Hazards
model to data from the Marisa et al. study, using gene expression risk score, clinical stage, BRAF mutation status and MSI status
as predictors.
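The risk score and the ‘‘high’’ threshold can be sketched as below. The sketch assumes the default scaling constant of the R ‘mad’ function (1.4826) and uses made-up scores; the function names are hypothetical, and the analysis itself was carried out in R.

```python
from statistics import mean, median

R_MAD_CONSTANT = 1.4826  # default scaling constant of R's mad()


def risk_score(gene_z_scores):
    """Average Z-score-transformed expression across the signature genes."""
    return mean(gene_z_scores)


def robust_sd(values):
    """Scaled median absolute deviation, as returned by R's mad() by default."""
    center = median(values)
    return R_MAD_CONSTANT * median(abs(v - center) for v in values)


def is_high(scores, n_sd):
    """Flag scores more than n_sd robust SDs above the cohort median."""
    cutoff = median(scores) + n_sd * robust_sd(scores)
    return [s > cutoff for s in scores]


# Toy per-sample risk scores (made up). n_sd=2 mirrors the Marisa et al. threshold.
cohort_scores = [-0.4, -0.2, 0.0, 0.1, 0.3, 2.5]
flags = is_high(cohort_scores, n_sd=2)
```

Only the last sample is flagged as ‘‘high’’ in this toy cohort. Using the median and the median absolute deviation keeps the cutoff from being pulled upward by the outlying sample itself.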
ChIP-seq analysis
Reads were mapped to the reference genome (hg19) using bwa version 0.7.12 (Li and Durbin, 2009). CTCF peaks were called using
GCAPC with default parameters (Teng and Irizarry, 2017) and H3K27ac peaks were called using MACS (Zhang et al., 2008). The data
revealed the expected punctate peaks of the enhancer-associated mark, H3K27ac, as well as broader regions of the repressive modifications,
H3K9me3 and H3K27me3. CTCF binding sites were highly enriched for the CTCF binding motif (OR = 19.04; $p < 10^{-15}$). For
differential CTCF peak analysis, the union of peaks that were detected in at least two samples were considered and reads were
counted for each peak in each sample. Differential CTCF binding was inferred using DESeq2, introducing offsets to the generalized
linear model to normalize for library size (Love et al., 2014), copy number differences, and non-linear trends (Lun and Smyth, 2016).
Copy number estimates for each genomic region were obtained using CNAnorm (Gusnanto et al., 2012). To account for technical
experimental variation, the loadings of the first principal component were introduced as a covariate on the generalized linear model.
CTCF sites were considered lost if they had a q-value smaller than 0.1 and a methylation difference larger than 20%. To evaluate the
effects of methylation of CTCF sites in chromatin looping, we performed aggregate peak analyses based on a comprehensive list of
loops annotated on the human genome (Rao et al., 2014). Lost CTCF peaks were assigned to loops if they overlapped with the loop’s
bidirectional CTCF motifs. To assign genes upregulated upon TAD boundary disruption, only peaks within 50Kb of a TAD boundary
were considered. For each gene near a disrupted TAD boundary, a linear model was fitted on TCGA data using gene expression as the
response variable and the average methylation at the lost CTCF site as the predictor. Genes were selected if they had a significant
positive (at a false discovery rate of 15%) association between their expression and the methylation levels at the corresponding
TAD boundary CTCF.
Energy function
The potential energy function for the genome adopts the following form:
$$U_{ME}(r) = \sum_{I} \left[ U(r_I) + U_{ideal}(r_I) \right] + U_{compt}(r), \tag{1}$$
where $r$ represents the 3D conformation of the entire genome. $I$ indexes over different chromosomes and $r_I$ corresponds to the
conformation of chromosome $I$. By definition, $r = \{r_1, r_2, \ldots, r_{23}\}$. $U(r_I)$ and $U_{ideal}(r_I)$ are generic potentials shared by all chromosomes,
and $U_{compt}(r)$ describes compartment-type-specific interactions within the same chromosome and between different
chromosomes.
Specifically, $U(r_I)$ is the energy function for a confined homopolymer and consists of four terms, $U_{bond}$, $U_{angle}$, $U_{sc}$ and $U_c$. $U_{bond}$ is
the bonding potential between neighboring beads. $U_{angle}$ is the angular potential applied among every three neighboring beads to
define the persistence length of the polymer. $U_{sc}$ is a soft-core potential applied to all the non-bonded pairs to enforce the excluded
volume effect among genomic loci. $U_c$ models a spherical boundary and is introduced to mimic the confinement effect applied by the
nuclear envelope onto the chromosomes. The radius of the spherical confinement is chosen to ensure a volume fraction of 0.1. Explicit
expressions for $U(r_I)$ can be found in Zhang and Wolynes, 2015 and Qi and Zhang, 2019.
$U_{ideal}(r_I)$ is introduced to reproduce the power law decay of the contact probability as a function of genomic separation for each
chromosome (Di Pierro et al., 2016; Qi and Zhang, 2019). It describes the tendency for chromosomes to collapse and form territories
in addition to what has been enforced by the confinement potential $U_c$. $U_{ideal}(r_I)$ is defined as
$$U_{ideal}(r_I) = \sum_{i,j} \alpha_{ideal}(|i - j|)\, f(r_{ij}), \tag{2}$$
where $f(r_{ij})$ determines the contact probability of a genomic pair with a spatial distance of $r_{ij}$, and $i, j$ index over all pairs of non-bonded
chromatin beads from chromosome $I$. Following Qi and Zhang (2019), we define $f(r)$ as
$$f(r) = \begin{cases} \dfrac{1}{2}\left[1 + \tanh\left(\sigma (r_c - r)\right)\right], & \text{if } r \le r_c \\[4pt] \dfrac{1}{2}\left(\dfrac{r_c}{r}\right)^{4}, & \text{if } r > r_c \end{cases} \tag{3}$$
where $r_c = 2.0$ and $\sigma = 2.0$. $\alpha_{ideal}(|i - j|)$ measures the strength of the contact at a given genomic separation $|i - j|$ and its value can be
determined from Hi-C data as detailed below. It contributes a total of $N - 1$ parameters, where $N$ is the number of beads for the
longest chromosome (chromosome 1, 249Mb).
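The piecewise contact function can be evaluated directly, which makes its behavior easier to see. The sketch below is an illustration only: it assumes the hyperbolic tangent form for distances up to the cutoff and a fourth-power decay beyond it, with both constants set to 2.0 as stated above. The function name `contact_function` is hypothetical.

```python
import math

R_C = 2.0    # cutoff distance, as stated in the text
SIGMA = 2.0  # steepness of the switching function, as stated in the text


def contact_function(r, r_c=R_C, sigma=SIGMA):
    """Contact probability of a pair of beads separated by spatial distance r."""
    if r <= r_c:
        return 0.5 * (1.0 + math.tanh(sigma * (r_c - r)))
    # Beyond the cutoff the probability decays as the fourth power of r_c / r.
    return 0.5 * (r_c / r) ** 4


at_cutoff = contact_function(2.0)
far_apart = contact_function(4.0)
touching = contact_function(0.0)
```

Both branches give 0.5 at the cutoff, so the function is continuous there. It approaches 1 for beads in close proximity and falls to 0.03125 at twice the cutoff distance.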
For the 1Mb-resolution model, Ucompt ðrÞ is defined as
$$U_{compt}(r) = \sum_{I} \sum_{i,j} \alpha_{intra}(C_i^I, C_j^I)\, f(r_{ij}) + \sum_{I,J} \sum_{i \in I,\, j \in J} \alpha_{inter}(C_i^I, C_j^J)\, f(r_{ij}), \tag{4}$$
where $I$ and $J$ index over different chromosomes and $i$ and $j$ index over non-bonded pairs of chromatin beads. $C_i^I$ denotes the
compartment type for bead $i$ from chromosome $I$ and can be either A or B. We used different parameters $\alpha_{intra}$ and $\alpha_{inter}$ for intra-
and inter-chromosome interactions to account for the presence of different molecular players that organize the genome at various
length scales (Qi and Zhang, 2019). This potential contributes a total of 6 parameters to the model.
Therefore, for the 1Mb-resolution model, the total number of parameters is $N - 1 + 6 = 254$.
For the 100kb-resolution model, we further separated the intra-chromosome potential into intra- and inter-TAD interactions de-
pending on whether the pair of beads are within the same topologically associating domain (TAD) or not. Specifically,
$$U_{compt}(r) = \sum_{I} \sum_{i,j} \left[ \alpha_{intra}^{TAD}(C_i^I, C_j^I)\, \delta_{T_i^I, T_j^I} + \alpha_{inter}^{TAD}(C_i^I, C_j^I) \left(1 - \delta_{T_i^I, T_j^I}\right) \right] f(r_{ij}) + \sum_{I,J} \sum_{i \in I,\, j \in J} \alpha_{inter}(C_i^I, C_j^J)\, f(r_{ij}), \tag{5}$$
where $T_i^I$ denotes the TAD index for bead $i$ from chromosome $I$. $\delta_{T_i^I, T_j^I}$ is the Kronecker delta function and equals 1 if $T_i^I = T_j^I$ and
0 otherwise. The positions of TAD boundaries were determined from experimental Hi-C data using the software TADbit (Serra
et al., 2017). Here $C_i^I$ can adopt three values: A, B and I. Therefore, $U_{compt}(r)$ contributes 18 parameters to the model. The total number
of parameters for the 100kb-resolution model is thus 2492 + 18 = 2510.
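The parameter counts quoted for the two models can be checked with a short calculation. The sketch assumes that the compartment interaction parameters are symmetric in the two compartment types, which is consistent with the stated counts of 6 and 18. The bead number of 2493 for the 100kb-resolution model is inferred from the quoted value of 2492 and is not given in the text. Function names are hypothetical.

```python
def n_pair_types(n_compartment_types):
    """Number of unordered compartment pairs, e.g. (A,A), (A,B), (B,B) for 2 types."""
    k = n_compartment_types
    return k * (k + 1) // 2


def total_parameters(n_beads, n_compartment_types, n_interaction_classes):
    """Ideal-chain parameters (N - 1) plus compartment interaction parameters."""
    return (n_beads - 1) + n_pair_types(n_compartment_types) * n_interaction_classes


# 1Mb model: 249 beads on chromosome 1, types A/B, intra- and inter-chromosome.
params_1mb = total_parameters(n_beads=249, n_compartment_types=2,
                              n_interaction_classes=2)

# 100kb model: inferred 2493 beads, types A/B/I, and three interaction
# classes (intra-TAD, inter-TAD, inter-chromosome).
params_100kb = total_parameters(n_beads=2493, n_compartment_types=3,
                                n_interaction_classes=3)
```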
Parameter optimization
Parameters in the above energy function can be derived using the iterative algorithm introduced in our previous work (Qi and Zhang,
2019). In particular, parameters $\alpha_{ideal}(|j - i|)$, $\alpha_{intra}(C_i^I, C_j^I)$ and $\alpha_{inter}(C_i^I, C_j^J)$ are tuned to ensure that the following ensemble averages
determined with simulated genome conformations match the corresponding experimental constraints calculated using Hi-C data.
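$$\begin{aligned}
\left\langle \sum_{I} \sum_{i,j} f(r_{ij})\, \delta_{|j-i|,s} \right\rangle &= \sum_{I} \sum_{i,j} f_{ij}^{exp}\, \delta_{|j-i|,s}, \quad \text{for } s = 1, \ldots, N-1 \\
\left\langle \sum_{I} \sum_{i,j} f(r_{ij})\, \delta_{C_i^I,c_1}\, \delta_{C_j^I,c_2} \right\rangle &= \sum_{I} \sum_{i,j} f_{ij}^{exp}\, \delta_{C_i^I,c_1}\, \delta_{C_j^I,c_2}, \quad \text{for } (c_1, c_2) \in \{(A,A), (A,B), (B,B)\} \\
\left\langle \sum_{I,J} \sum_{i \in I,\, j \in J} f(r_{ij})\, \delta_{C_i^I,c_1}\, \delta_{C_j^J,c_2} \right\rangle &= \sum_{I,J} \sum_{i \in I,\, j \in J} f_{ij}^{exp}\, \delta_{C_i^I,c_1}\, \delta_{C_j^J,c_2}, \quad \text{for } (c_1, c_2) \in \{(A,A), (A,B), (B,B)\}
\end{aligned} \tag{6}$$
In the above equations, the Kronecker delta function $\delta_{C_i^I,c_1}$ equals 1 if $C_i^I = c_1$ and 0 otherwise. $\delta_{C_j^J,c_2}$ is similarly defined. $f_{ij}^{exp}$ is the
contact probability between the pair of genomic segments $i$ and $j$ determined from Hi-C. $U_{ME}(r)$ can be shown to be the least biased
potential to reproduce these experimental constraints following the maximum entropy principle.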
The constraints used to parameterize the 100kb-resolution model can be similarly defined.
To generate an initial configuration for these simulations, we first placed all the chromosomes consecutively on a cubic lattice with
an edge length of $0.9R/\sqrt{3}$, where $R$ is the radius of the spherical confinement introduced to ensure a volume fraction of 0.1. This
configuration was subsequently equilibrated along a 100,000-step-long simulation under the potential $\sum_I U(r_I)$ to relax both the topology
and energy of the polymer structures. The last configuration from this equilibration trajectory was then used to initialize our
whole genome simulations. We note that the long sampling time used in our simulations ensures their convergence. Therefore, all the
results presented in the manuscript are independent of this initial configuration.
Parameters of the whole genome models were determined iteratively. We initialized the first iteration of these simulations using the
equilibrated configuration mentioned above. All subsequent simulations were initialized using the end configurations from the pre-
vious iteration. During each iteration, we carried out six independent ten-million-time-step-long simulations for the 1Mb-resolution
model and ten independent two-million-time-step-long simulations for the 100kb-resolution model. Genome conformations were
saved at every 2000 timesteps to calculate the ensemble averages. A total of 10 iterations were performed for the 1Mb-resolution
model to reach an error of less than 5%. We define the error as $\varepsilon = \sum_i |f_i^{sim} - f_i^{exp}| / \sum_i f_i^{exp}$, where $f_i^{exp}$ are the experimental constraints
defined in Equation 6 and $f_i^{sim}$ are the corresponding ensemble averages determined from computer simulation. We used 35 iterations
for the 100kb-resolution model to reach an error of less than 15%.
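The convergence check can be illustrated with a small calculation. The sketch assumes that the error is the sum of absolute differences between simulated and experimental constraints divided by the sum of the experimental constraints; the absolute value is an assumption, since a signed sum would allow deviations to cancel. The function name and the toy values are hypothetical.

```python
def relative_error(simulated, experimental):
    """Sum of absolute deviations divided by the sum of experimental constraints."""
    deviation = sum(abs(s - e) for s, e in zip(simulated, experimental))
    return deviation / sum(experimental)


# Toy constraint values (made up, not taken from the study).
experimental = [1.00, 0.50, 0.25]
simulated = [0.90, 0.55, 0.20]
error = relative_error(simulated, experimental)

# Thresholds quoted in the text for the two models.
converged_1mb = error < 0.05
converged_100kb = error < 0.15
```

With these toy values the error is about 11%, which would satisfy the 15% threshold used for the 100kb-resolution model but not the 5% threshold used for the 1Mb-resolution model.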
With the converged parameters, we performed six additional independent twenty-million-time-step-long simulations for the 1Mb-resolution
model and ten independent four-million-time-step-long simulations for the 100kb-resolution model. A total of 60,000 and 20,000
structures were collected for the 1Mb- and 100kb-resolution models, respectively, to perform all the analyses presented in the main text.
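The structure counts follow from the simulation lengths if conformations are saved every 2000 timesteps. That interval is stated for the parameterization runs, and applying it to the production runs is an assumption of this sketch, although it reproduces the quoted totals. The function name is hypothetical.

```python
def n_saved_structures(n_runs, steps_per_run, save_interval=2000):
    """Number of conformations saved across independent simulations."""
    return n_runs * (steps_per_run // save_interval)


structures_1mb = n_saved_structures(n_runs=6, steps_per_run=20_000_000)
structures_100kb = n_saved_structures(n_runs=10, steps_per_run=4_000_000)
```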
where $r$ is the spatial distance from the nuclear center. $n(r)$ is the number of genomic loci of a given compartment type found in the
spherical shell from $r$ to $r + \Delta r$, and the angular brackets indicate an ensemble average over all the simulated genome structures. $N$ is
the total number of genomic loci of that given compartment type.
DNA-FISH analysis
To calculate redistribution of the A and B compartments in primary tissues, the nuclei in 2D images were manually curated to delineate
intact single tumor and colon epithelial nuclei in FIJI version 2.0.0-rc-69/1.52p (N = 2 tumors and 2 normal samples). This step
was necessary to avoid generating data from poorly oriented nuclei and from nuclei of other cell types, such as immune cells and
fibroblasts. As the latter cells have strikingly different nuclear morphologies, we could easily exclude them by visual inspection. Next,
the pictures were loaded into Cellprofiler version 3.1.9, the nuclei and compartment spots segmented, and the original channel im-
ages masked on the identified DNA-FISH spots. The radial intensity distribution of the masked images was calculated in the nuclei
using 20 scaled bins per cell. To determine redistribution of the B compartment in copy number stable tumors toward the nuclear
interior, the Fraction at Distance of each masked image bin was plotted in Prism Version 8.4.2. Because the chromosome territory
of chromosome 12 is in general peripherally located, bins 1-10 were summed together for visualization.
Radial distribution of DNA-FISH in cells was quantified using Cellprofiler version 3.1.9 (McQuin et al., 2018), segmenting the
nuclei, followed by fill holes and exclusion of cells touching the edge of the image. Next, the radial distribution of each channel (A
compartment (A488), B compartment (Cy3) and I compartment (Cy5)) was measured using the module MeasureObjectIntensityDistribution
using each nucleus as the center of the points. To obtain distributions for each cell’s radial bin in which the maximum of the
signal was located, we multiplied the Fraction at Distance for each bin and channel with each Mean Fraction value, and gave a value
of 1 to the bin containing the highest value. The counts for each cell and channel were then plotted using Prism Version 8.4.2. Repre-
sentative images and insets in all panels were generated using FIJI version 2.0.0-rc-69/1.52p.
Supplemental Figures
(F) Volcano plot shows differential analysis of CTCF binding sites (points) between CIMP and non-CIMP tumors (points to the upper left represent CTCF binding
sites lost in CIMP tumors). Sites that are hypermethylated in CIMP tumors relative to non-CIMP samples are highlighted (red; methylation difference > 15%).
(G) Left: Cartoon schematic of Hi-C heatmap shows a strong loop peak corresponding to an interaction between two CTCF bound loop anchors flanking a TAD
(top panel). This theoretical CTCF-CTCF loop interaction is weakened in a sample with reduced CTCF binding at one or both anchors (bottom). Right: Heatmaps
show actual Hi-C signals aggregated over CTCF-CTCF loops, revealing interaction peaks (i.e., averaged signal for the pixels corresponding to the tops of the TAD
triangles illustrated at left). Top: Heatmaps aggregate signals for loops whose CTCF anchors are stable in normal colon, non-CIMP tumors and CIMP tumors (top).
Bottom: Heatmaps aggregate signals for loops whose CTCF anchors are lost in CIMP tumors. These loop anchor interactions are weakened in CIMP tumors.
(H) Boxplots depict fold-change (log2) in E-P loop strength between tumors and normal. Loops crossing TAD boundaries are shown. Loops are stratified ac-
cording to whether the TAD boundary that they span loses CTCF binding and gains methylation in CIMP tumors (lost) or whether it retains CTCF (stable).
(I) Boxplots depict expression fold-change (log2) between CIMP and non-CIMP tumors stratified by whether the genes are located in a disrupted TAD or not.
(C) Aggregated contact map shows Hi-C signal averaged over all hypomethylated blocks across normal and tumor samples. The x axis shows genomic positions
relative to hypomethylated blocks. The edges of hypomethylated blocks correspond to TAD boundaries.
(D) Plots show average frequency of Hi-C contacts for pairwise interactions that occur within the same genomic compartment (left), and between different
compartments (right). Data are shown for four normal colon samples (dots). Compartment I regions have inter-compartment interactions with both A and B
regions.
(E-F) Hi-C contact map of observed versus expected interactions in normal colon for two representative regions across chromosomes 6 (E) and 14 (F).
Compartment designations are shown for both rows and columns.
(G-H) Hi-C contact map of observed versus expected interactions in colon tumors for two representative regions across chromosomes 6 (G) and 14 (H).
Compartment designations are shown for both rows and columns.
(I) Plot shows average ratio of interactions with the A versus B compartments (y axis), summarized for compartment I. Each point represents the average of 100 kb
windows for normal colons (green) and tumors (purple). Shown for original (left) and validation (right) cohorts.
(J-L) Scatterplots of first and second (J and L) or first and third (K) eigenvectors for chromosomes 12 (J), 13 (K) and 20 (L) resulting from the eigenvector
decomposition method to define compartments. Data are shown for the aggregated normal colon Hi-C matrices. Each point represents one 100 kb bin and is
colored by compartment (A: dark blue; I: light blue; B: yellow).
(M) Boxplot shows the distribution of PC1 values resulting from the eigenvector decomposition of the HCT116 Hi-C matrix. Data are shown for the 100Kb-bins
that overlap with our DNA-FISH probes. Probes for the respective compartments have the expected distributions of PC1 values.
(N) Barplot indicates the percent of cells for which the maximum DNA-FISH signal intensity for compartments A, B or I is located at the indicated radial position for
102 normal colon nuclei.