Single Chromosome Simulation

This tutorial should take between 20 to 30 minutes of reading and performing simulations.

Chromatin Dynamics Simulations on Chromosome 10 of GM12878 Cell Line

The first step is to import the OpenMiChroM module

note:

[1]:
import sys
sys.path.append('../../')
import OpenMiChroM
[2]:
from OpenMiChroM.ChromDynamics import MiChroM
from OpenMiChroM.CndbTools import cndbTools

MiChroM class sets the initial parameters of the simulation:

  • time_step=0.01: set the simulation time step to perfom the integration

  • temperature=1.0: set the temperature of your simulation

[3]:
sim = MiChroM(temperature=1.0, timeStep=0.01)
    ***************************************************************************************
    **** **** *** *** *** *** *** *** OpenMiChroM-1.1.1rc *** *** *** *** *** *** **** ****

         OpenMiChroM is a Python library for performing chromatin dynamics simulations.
                            OpenMiChroM uses the OpenMM Python API,
                employing the MiChroM (Minimal Chromatin Model) energy function.
      The chromatin dynamics simulations generate an ensemble of 3D chromosomal structures
      that are consistent with experimental Hi-C maps, also allows simulations of a single
                 or multiple chromosome chain using High-Performance Computing
                            in different platforms (GPUs and CPUs).

         OpenMiChroM documentation is available at https://open-michrom.readthedocs.io

         OpenMiChroM is described in: Oliveira Junior, A. B & Contessoto, V, G et. al.
      A Scalable Computational Approach for Simulating Complexes of Multiple Chromosomes.
                  Journal of Molecular Biology. doi:10.1016/j.jmb.2020.10.034.
               We also thank the polychrom <https://github.com/open2c/polychrom>
                 where part of this code was inspired - 10.5281/zenodo.3579472.

                    Copyright (c) 2024, The OpenMiChroM development team at
                                        Rice University
    ***************************************************************************************

There are four hardware platform options to run the simulations:

platform="cuda"
platform="opencl"
platform="hip"
platform="cpu"

Choose accordingly.

[4]:
#sim.setup(platform="opencl")
sim.setup(platform="cuda")
Using platform: CUDA

Set the directory name in which the output of the simulation is saved:

[5]:
sim.saveFolder('output_chr10')
saveFileName = "traj_chr10"

The next step is to load the chromatin compartment sequence for chromosome 10 and generate an initial 3D structure to start the simulation. We can use the createSpringSpiral function to set the initial configuration of the polymer based in the sequence file.

The first column of the sequence file should contain the locus index. The second should have the locus type annotation. A template file of the chromatin sequence of types can be found here.

The loci positions are stored in the variable chr10 as a NumPy array \([N:3]\), where \(N\) is the number of beads.

[6]:
chr10 = sim.createSpringSpiral(ChromSeq='inputs/chr10_beads.txt', isRing=False)
# or write as:
# chr10 = sim.initStructure(mode='auto', CoordFiles=None, ChromSeq='inputs/chr10_beads.txt', isRing=False, chromosome=None)

We can check the position of the first five beads:

[7]:
print(chr10[:5])
[[410.61654383 221.72955137 221.17249751]
 [410.62605506 222.10093022 222.10093022]
 [410.61654383 222.47230906 223.02936292]
 [410.5880139  222.84357838 223.95743532]
 [410.54047655 223.21462872 224.88478724]]

The initial structure should then be loaded into the sim object.

The option center=True moves your system to the origin.

[8]:
sim.loadStructure(chr10, center=True)

The initial 3D chromosome structure can be saved in .ndb file format. The file is stored in the path given in saveFolder.

[9]:
# sim.saveStructure(mode='ndb') # need sim.createSimulation first, but before that forces should be set

The next step is to add the force field in the simulation object sim.

In this tutorial, the forces can be divided into two sets:

MiChroM Homopolymer (Bonded) Potentials

[10]:
sim.addFENEBonds(kFb=30.0)
sim.addAngles(kA=2.0)
sim.addRepulsiveSoftCore(eCut=4.0)

MiChroM Non-Bonded Potentials

[11]:
sim.addTypetoType(mu=3.22, rc=1.78)
sim.addIdealChromosome(mu=3.22, rc=1.78, dinit=3, dend=500)

The last potential adds a spherical constrain to collapse the initial structure.

[12]:
sim.addFlatBottomHarmonic(kR=5*10**-3, nRad=15.0)

Run a short simulation to generate a collapsed structure.

[13]:
block = 3*10**2
n_blocks = 2*10**3

Two variables control the chromatin dynamics simulation steps:

block: The number of time-steps performed in each cycle (or block) n_blocks: The number of cycles (or blocks) simulated.

The initial collapse simulation will run for \(3\times10^2 \times 2\times10^3 = 6\times10^5\) time-steps.

[14]:
sim.createSimulation()
sim.saveStructure(mode='ndb')
FENEBond was added
AngleForce was added
RepulsiveSoftCore was added
TypetoType was added
IdealChromosome was added
FlatBottomHarmonic was added
Setting positions... loaded!
Setting velocities... loaded!
Context created!

Simulation name: OpenMiChroM
Number of beads: 2712, Number of chains: 1
Potential energy: 64.29285, Kinetic Energy: 1.46175 at temperature: 1.0

Potential energy per forceGroup:
                                  Values
FENEBond                   55318.664162
AngleForce                     0.989383
RepulsiveSoftCore              0.000000
TypetoType                  -222.474259
IdealChromosome               -1.086155
FlatBottomHarmonic        119266.123097
Potential Energy (total)  174362.216227
[15]:
# since there's no need to record xyz position, just run that is ok
sim.run(nsteps = n_blocks*block, checkSystem = True, report = True,blockSize = block)

'''for _ in range(n_blocks):
    sim.run(nsteps = block, checkSystem=True, blockSize=block) '''
#"Progress (%)" "Step"  "Speed (ns/day)"        "Time Remaining"
1.7%    10000   --      --
3.3%    20000   4.94e+03        1:41
5.0%    30000   4.7e+03 1:44
6.7%    40000   4.51e+03        1:47
8.3%    50000   4.37e+03        1:48
10.0%   60000   4.28e+03        1:49
11.7%   70000   4.18e+03        1:49
13.3%   80000   4.1e+03 1:49
15.0%   90000   4.04e+03        1:48
16.7%   100000  4e+03   1:47
18.3%   110000  3.96e+03        1:46
20.0%   120000  3.94e+03        1:45
21.7%   130000  3.9e+03 1:44
23.3%   140000  3.86e+03        1:42
25.0%   150000  3.84e+03        1:41
26.7%   160000  3.83e+03        1:39
28.3%   170000  3.82e+03        1:37
30.0%   180000  3.81e+03        1:35
31.7%   190000  3.8e+03 1:33
33.3%   200000  3.79e+03        1:31
35.0%   210000  3.79e+03        1:29
36.7%   220000  3.78e+03        1:26
38.3%   230000  3.77e+03        1:24
40.0%   240000  3.76e+03        1:22
41.7%   250000  3.76e+03        1:20
43.3%   260000  3.75e+03        1:18
45.0%   270000  3.75e+03        1:16
46.7%   280000  3.75e+03        1:13
48.3%   290000  3.74e+03        1:11
50.0%   300000  3.74e+03        1:09
51.7%   310000  3.73e+03        1:07
53.3%   320000  3.73e+03        1:04
55.0%   330000  3.73e+03        1:02
56.7%   340000  3.73e+03        1:00
58.3%   350000  3.72e+03        0:58
60.0%   360000  3.72e+03        0:55
61.7%   370000  3.72e+03        0:53
63.3%   380000  3.72e+03        0:51
65.0%   390000  3.72e+03        0:48
66.7%   400000  3.72e+03        0:46
68.3%   410000  3.72e+03        0:44
70.0%   420000  3.72e+03        0:41
71.7%   430000  3.72e+03        0:39
73.3%   440000  3.72e+03        0:37
75.0%   450000  3.72e+03        0:34
76.7%   460000  3.71e+03        0:32
78.3%   470000  3.71e+03        0:30
80.0%   480000  3.71e+03        0:27
81.7%   490000  3.71e+03        0:25
83.3%   500000  3.71e+03        0:23
85.0%   510000  3.71e+03        0:20
86.7%   520000  3.71e+03        0:18
88.3%   530000  3.71e+03        0:16
90.0%   540000  3.71e+03        0:13
91.7%   550000  3.71e+03        0:11
93.3%   560000  3.7e+03 0:09
95.0%   570000  3.7e+03 0:07
96.7%   580000  3.7e+03 0:04
98.3%   590000  3.7e+03 0:02
100.0%  600000  3.7e+03 0:00
[15]:
'for _ in range(n_blocks):\n    sim.run(nsteps = block, checkSystem=True, blockSize=block) '

Details about the output of each simulation block:

  • bl=0: index number of the simulated block. The parameter increment=False is used to ignore the steps counting.

  • pos[1]=[X,Y,Z]: spatial position for the locus 1.

  • dr=1.26: average of the loci displacements in each block (in units of sigma).

  • t=0: current simulation time.

  • kin=1.5: kinetic energy of the system (reduced units).

  • pot=19.90: total potential energy of the system (reduced units).

  • RG=7.654: radius of gyration at the end of the simulated block.

  • SPS=12312: steps per second of each block.

The radius of gyration is a good parameter to check the performance of the collapse.(But the function to calculate that is deleted) If the chromosome polymer is not collapsed, it is necessary to rerun the initial collapse steps. We can also save the structure for inspection.

[16]:
# print(sim.chromRG()) # chromRG function is already removed
sim.saveStructure(mode='ndb', fileName = saveFileName)

The structure can also be saved using stardard file formats used for macromolecules, as the pdb and gro formats.

[17]:
sim.saveStructure(mode='gro', fileName=saveFileName)
sim.saveStructure(mode='pdb', fileName=saveFileName)

The next step is to remove the spherical constrain force to run the production simulation.

[18]:
sim.removeFlatBottomHarmonic()
Removed FlatBottomHarmonic from the system!

If necessary, one could remove any of the forces applied in the system. To see the forces in the system:

[19]:
sim.forceDict
[19]:
{'FENEBond': <openmm.openmm.CustomBondForce; proxy of <Swig Object of type 'OpenMM::CustomBondForce *' at 0x000002492838F6F0> >,
 'AngleForce': <openmm.openmm.CustomAngleForce; proxy of <Swig Object of type 'OpenMM::CustomAngleForce *' at 0x000002491CC8CF30> >,
 'RepulsiveSoftCore': <openmm.openmm.CustomNonbondedForce; proxy of <Swig Object of type 'OpenMM::CustomNonbondedForce *' at 0x000002492838F840> >,
 'TypetoType': <openmm.openmm.CustomNonbondedForce; proxy of <Swig Object of type 'OpenMM::CustomNonbondedForce *' at 0x000002492838F990> >,
 'IdealChromosome': <openmm.openmm.CustomNonbondedForce; proxy of <Swig Object of type 'OpenMM::CustomNonbondedForce *' at 0x000002491CC8C900> >}
[20]:
# sim.removeForce(forceName="TypetoType")

To run the production simulation, it is necessary to initialize the .cndb file to save the chromatin dynamics trajectory.

[ ]:
sim.createReporters(statistics=True, traj=True, outputName = saveFileName, trajFormat="cndb", energyComponents=True, interval=5*10**2)

Set the parameters of the production simulation:

\(block = 5\times10^2\) \(n\_blocks = 2\times10^3\)

[22]:
block = 5*10**2
n_blocks = 2*10**3
[23]:
for _ in range(n_blocks):
    sim.run(block)
    # sim.saveStructure() # comment that this the structure will be automatically saved in the reporter
101.7%  610000  3.35e+03        23:59:58
103.3%  620000  3.36e+03        23:59:55
105.0%  630000  3.36e+03        23:59:53
106.7%  640000  3.36e+03        23:59:50
108.3%  650000  3.37e+03        23:59:48
110.0%  660000  3.37e+03        23:59:45
111.7%  670000  3.37e+03        23:59:43
113.3%  680000  3.37e+03        23:59:40
115.0%  690000  3.38e+03        23:59:37
116.7%  700000  3.38e+03        23:59:35
118.3%  710000  3.38e+03        23:59:32
120.0%  720000  3.38e+03        23:59:30
121.7%  730000  3.39e+03        23:59:27
123.3%  740000  3.39e+03        23:59:25
125.0%  750000  3.39e+03        23:59:22
126.7%  760000  3.39e+03        23:59:20
128.3%  770000  3.4e+03 23:59:17
130.0%  780000  3.4e+03 23:59:15
131.7%  790000  3.4e+03 23:59:12
133.3%  800000  3.4e+03 23:59:10
135.0%  810000  3.41e+03        23:59:07
136.7%  820000  3.41e+03        23:59:05
138.3%  830000  3.41e+03        23:59:02
140.0%  840000  3.41e+03        23:59:00
141.7%  850000  3.41e+03        23:58:57
143.3%  860000  3.42e+03        23:58:55
145.0%  870000  3.42e+03        23:58:52
146.7%  880000  3.42e+03        23:58:50
148.3%  890000  3.42e+03        23:58:47
150.0%  900000  3.43e+03        23:58:45
151.7%  910000  3.43e+03        23:58:42
153.3%  920000  3.43e+03        23:58:40
155.0%  930000  3.43e+03        23:58:37
156.7%  940000  3.43e+03        23:58:35
158.3%  950000  3.43e+03        23:58:32
160.0%  960000  3.43e+03        23:58:30
161.7%  970000  3.44e+03        23:58:27
163.3%  980000  3.44e+03        23:58:25
165.0%  990000  3.44e+03        23:58:23
166.7%  1000000 3.44e+03        23:58:20
168.3%  1010000 3.44e+03        23:58:18
170.0%  1020000 3.44e+03        23:58:15
171.7%  1030000 3.44e+03        23:58:13
173.3%  1040000 3.45e+03        23:58:10
175.0%  1050000 3.45e+03        23:58:08
176.7%  1060000 3.45e+03        23:58:05
178.3%  1070000 3.45e+03        23:58:03
180.0%  1080000 3.45e+03        23:58:00
181.7%  1090000 3.45e+03        23:57:58
183.3%  1100000 3.45e+03        23:57:55
185.0%  1110000 3.45e+03        23:57:53
186.7%  1120000 3.45e+03        23:57:50
188.3%  1130000 3.45e+03        23:57:48
190.0%  1140000 3.46e+03        23:57:45
191.7%  1150000 3.46e+03        23:57:43
193.3%  1160000 3.46e+03        23:57:41
195.0%  1170000 3.46e+03        23:57:38
196.7%  1180000 3.46e+03        23:57:36
198.3%  1190000 3.46e+03        23:57:33
200.0%  1200000 3.46e+03        23:57:31
201.7%  1210000 3.46e+03        23:57:28
203.3%  1220000 3.46e+03        23:57:26
205.0%  1230000 3.47e+03        23:57:23
206.7%  1240000 3.47e+03        23:57:21
208.3%  1250000 3.47e+03        23:57:19
210.0%  1260000 3.47e+03        23:57:16
211.7%  1270000 3.47e+03        23:57:14
213.3%  1280000 3.47e+03        23:57:11
215.0%  1290000 3.47e+03        23:57:09
216.7%  1300000 3.47e+03        23:57:06
218.3%  1310000 3.48e+03        23:57:04
220.0%  1320000 3.48e+03        23:57:02
221.7%  1330000 3.48e+03        23:56:59
223.3%  1340000 3.48e+03        23:56:57
225.0%  1350000 3.48e+03        23:56:54
226.7%  1360000 3.48e+03        23:56:52
228.3%  1370000 3.48e+03        23:56:49
230.0%  1380000 3.48e+03        23:56:47
231.7%  1390000 3.48e+03        23:56:45
233.3%  1400000 3.48e+03        23:56:42
235.0%  1410000 3.48e+03        23:56:40
236.7%  1420000 3.49e+03        23:56:37
238.3%  1430000 3.49e+03        23:56:35
240.0%  1440000 3.49e+03        23:56:32
241.7%  1450000 3.49e+03        23:56:30
243.3%  1460000 3.49e+03        23:56:28
245.0%  1470000 3.49e+03        23:56:25
246.7%  1480000 3.49e+03        23:56:23
248.3%  1490000 3.49e+03        23:56:20
250.0%  1500000 3.49e+03        23:56:18
251.7%  1510000 3.49e+03        23:56:15
253.3%  1520000 3.5e+03 23:56:13
255.0%  1530000 3.5e+03 23:56:11
256.7%  1540000 3.5e+03 23:56:08
258.3%  1550000 3.5e+03 23:56:06
260.0%  1560000 3.5e+03 23:56:03
261.7%  1570000 3.5e+03 23:56:01
263.3%  1580000 3.5e+03 23:55:59
265.0%  1590000 3.5e+03 23:55:56
266.7%  1600000 3.5e+03 23:55:54

Once the simulation is completed, it is necessary to close the .cndb file to avoid losing the trajectory data.

[23]:
# sim.storage[0].close()

The simulation should generate the traj_chr10_0.cndb trajectory file in the output_chr10 folder. This file contains 2000 frames (one snapshot per block). In the new version, there’s no code to set the name of the cndb file!

Trajectory analysis using cndbTools

cndbTools is a class that allows analyses in the chromatin dynamics trajectories using the binary format .cndb (compact ndb).

[26]:
cndbTools = cndbTools()

Load the cndb file in the variable chr10_traj.

[27]:
chr10_traj = cndbTools.load('output_chr10/traj_chr10_0.cndb')
[28]:
print(chr10_traj) # Print the information of the cndb trajectory.
<OpenMiChroM.CndbTools.cndbTools object at 0x24938dc2880>
Cndb file has 2000 frames, with 2712 beads and {b'NA', b'A1', b'A2', b'B3', b'B2', b'B1'} types

Extract the loci XYZ position over the simulated 2000 frames and save in the variable chr10_xyz.

[29]:
chr10_xyz = cndbTools.xyz(frames=range(0,2000,1), XYZ=[0,1,2])
[30]:
max([int(key) for key in chr10_traj.cndb.keys() if key != 'types'])
[30]:
1999

The variable chr10_xyz allows the cndbTools to perform several analyses. In this example, the radius of gyration can be obtained as a function of the simulated frames.

[33]:
import matplotlib.pyplot as plt
import matplotlib as mpl


chr10_RG = cndbTools.compute_RG(chr10_xyz)
plt.plot(chr10_RG)
plt.ylabel(r'Radius of Gyration ($\sigma$)',fontsize=11)
plt.xlabel(r'Simulation Frames',fontsize=11)
[33]:
Text(0.5, 0, 'Simulation Frames')
../_images/Tutorials_Tutorial_Single_Chromosome_60_1.png

cndbTools allows the selection of beads to compute the analyses. An example is the Radial Distribution Probability (RDP) for each chromatin subcompartments A1 and B1.

[34]:
chr10_A1 = cndbTools.xyz(frames=range(0,2000,1), beadSelection=chr10_traj.dictChromSeq[b'A1'], XYZ=[0,1,2]) # revision:change frame list to numpy array
chr10_B1 = cndbTools.xyz(frames=range(0,2000,1), beadSelection=chr10_traj.dictChromSeq[b'B1'], XYZ=[0,1,2])
[35]:
print("Computing RDP...")
r_A1, RDP_chr10_A1 = cndbTools.compute_RDP(chr10_A1, radius=15.0, bins=200)
r_B1, RDP_chr10_B1 = cndbTools.compute_RDP(chr10_B1, radius=15.0, bins=200)
Computing RDP...
[36]:
plt.plot(r_A1, RDP_chr10_A1, color='red', label='A')
plt.plot(r_B1, RDP_chr10_B1, color='blue', label='B')
plt.xlabel(r'r ($\sigma$)', fontsize=11,fontweight='normal', color='k')
plt.ylabel(r'$\rho(r)/N_{type}$', fontsize=11,fontweight='normal', color='k')
plt.legend()
plt.gca().set_xlim([1/200,15.0])
[36]:
(0.005, 15.0)
../_images/Tutorials_Tutorial_Single_Chromosome_64_1.png

We can also use cndbTools to generate the in silico Hi-C map (contact probability matrix).

In this tutorial, the trajectory contains 2,000 snapshots of chromosome 10 of the GM12878 cell line. For this set of structures, we expect the in silico Hi-C to not be fully converged due to inadequate sampling. To produce a converged map, it is recommended to simulate around 20 replicas with 10,000 frames on each, which generates an ensemble of 200,000 chromosome structures.

[38]:
print("Generating the contact probability matrix...")
chr10_sim_HiC = cndbTools.traj2HiC(chr10_xyz)
Generating the contact probability matrix...
Reading frame 0 of 2000
Reading frame 500 of 2000
Reading frame 1000 of 2000
Reading frame 1500 of 2000
[39]:
plt.matshow(chr10_sim_HiC, norm=mpl.colors.LogNorm(vmin=0.001, vmax=chr10_sim_HiC.max()),cmap="Reds")
plt.colorbar()
[39]:
<matplotlib.colorbar.Colorbar at 0x13566bf73a0>
../_images/Tutorials_Tutorial_Single_Chromosome_67_1.png

To visualize the chromosome’s 3D structures in the standard visualization softwares for macromolecules, there are available scripts for converting the ndb/cndb file format to .pdb and .gro. For details, please check the Nucleome Data Bank.

The ndb plugin for visualizing the chromatin dynamics trajectories in VMD/Chimera/Pymol is under development.