Solvating a System, Adding Ions and Generating Input Files
Overview
Teaching: 30 min
Exercises: 5 minQuestions
Why the simulation system should be neutralized?
How to add water and ions to a simulation system?
How to choose size and shape of a solvent box?
Objectives
Understand why it is necessary to neutralize the simulation system.
Neutralize a system.
Solvate a macromolecule.
Add ions to a simulation system to mimic a salt solution.
Generate molecular topology for simulation with GROMACS and NAMD.
Why ions are added to simulations?
- To calculate correctly electrostatic energy with periodic boundary conditions
- Ions concentration and composition affect conformations, dynamics, and functions of biological macromolecules.
Neutralizing a system
Fist we will add enough counter-ions to neutralize the system. Ions can be added using two approaches:
- Solvate the system and then replace random solvent molecules with ions.
- Place ions according to the electrostatic potential of the macromolecule before solvation.
Adding more ions to a neutralized system will be necessary to represent physiological salt concentrations.
Caveats and limitations of the random ion placement
- Random placement of ions will require longer equilibration and may affect structural stability of a macromolecule.
- A better approach is to place ions according to the electrostatic potential of the macromolecule.
Let’s neutralize 1RGG protein using the leap module. We will add ions prior to solvation so that the potential from un-equilibrated water will not interfere with ion placement:
cd ~/scratch/workshop_amber/example_04
module load StdEnv/2023 amber
tleap
source leaprc.water.opc
source leaprc.protein.ff19SB
s = loadpdb ../example_03/1RGG_chain_A_prot.pdb
charge s
addions s Na+ 0
Adding Ions to Mimic the Macroscopic Salt Concentration
To mimic the macroscopic salt concentration in the environment being studied we will need to add more cation/anion pairs to the simulation system. The number of ion pairs can be estimated using the formula:
$N_{Ions}=0.0187\cdot[Molarity]\cdot{N_{WaterMol}}$
- This calculation does not take into account the charge of a macromolecule.
- For more accurate salt concentration you can calculate the number of ions corrected for screening effects using the SLTCAP server.
- To calculate the number of ions we need to know the number of water molecules in the simulation system.
The command solvateBox creates a periodic solvent box around the macromolecule.
solvatebox s SPCBOX 15 iso
Solute vdw bounding box: 40.514 32.235 37.352
Total bounding box for atom centers: 70.514 70.514 70.514
(box expansion for 'iso' is 18.6%)
Solvent unit box: 18.774 18.774 18.774
Volume: 399256.044 A^3
Total mass 194369.824 amu, Density 0.808 g/cc
Added 10202 residues.
Now that we know the number of water molecules in the simulation system, we can add salt to the desired concentration.
Preparing an Aqueous Salt Solution
How many Na+ and Cl- ions do we need to add to the simulation box with 1RGG protein and 10202 water molecules to prepare 0.15 M salt solution? Calculate the number of ions using two methods: the formula above and the SLTCAP server. For this calculation you need to know molecular weight of the protein. You can calculate it here. FASTA sequence of the protein is available here.
Solution
- N_ions = 0.0187 x 0.15 x 10202 = 29. We need to add 35 Na+ and 29 Cl- ions
- SLTCAP calculation with the following input: (MW 11 KDa, 10202 water molecules, charge 6, 150 mM salt) yields 30 Na+ and 24 Cl- ions.
We already have the neutralized and solvated simulation system, and in the exercise above we determined that we need to add 24 ion pairs to prepare 150 mM salt solution. Let’s replace 48 randomly selected water molecules with 24 Na+ and 24 Cl- ions:
addionsrand s Na+ 24 Cl- 24
Generating Molecular Topology for Simulation with AMBER or NAMD
Setup of our simulation is almost complete. Our protein has cross-linked cysteine residues, so the last thing to do is to make disulfide bond between Cys7 and Cys96:
bond s.7.SG s.96.SG
We can now save the molecular topology (parm7) file and AMBER coordinates (rst7). To build GROMACS topology later we will also save the solvated system in PDB format:
saveamberparm s 1RGG_chain_A.parm7 1RGG_chain_A.rst7
savepdb s 1RGG_chain_A_solvated.pdb
quit
A complete script for preparing coordinates and the topology in LEAP
Save the following commands in a file, e.g. solvate_1RRG.leap
source leaprc.water.opc
source leaprc.protein.ff19SB
s = loadpdb ../exanmple_03/1RGG_chain_A_prot.pdb
addions s Na+ 0
solvatebox s SPCBOX 15 iso
addionsrand s Na+ 24 Cl- 24
bond s.7.SG s.96.SG
saveamberparm s 1RGG_chain_A.parm7 1RGG_chain_A.rst7
savepdb s 1RGG_chain_A_solvated.pdb
quit
Execute the script:
tleap -f solvate_1RRG.leap
Automation and Reproducibility of a Simulation Setup
- The process of molecular dynamics system setup can be automated by saving the whole sequence of commands into a text file.
You can download an example shell script that performs all preparation steps here. The script downloads the molecular structure file from PDB and generates input files for simulation with AMBER, NAMD, and GROMACS.
Key Points
Simulation system must be neutralized by adding counter-ions to obtain the correct electrostatic energy.
Ions are added to a simulations system to mimic composition of a local macromolecule environment.
Solvent box should be large enough to allow for unrestricted conformational dynamics of a macromolecule.