@Billy-The-Crescent
2019-10-08T08:01:21.000000Z
字数 13716
阅读 1011
Moleculle Dynamics GROMACS SimulationSoftware: GROMACS: http://www.gromacs.org
Manual: http://www.gromacs.org/Documentation/Manual
Platform: Linux
Initial Protein Structure: cns1_model1.pdb
GROMACS location:
/usr/local/gromacs/bin/
Set environment variable:
export PATH=$PATH:/usr/local/gromacs/bin/#test if environment worksgmx -version
Result:

So, it works.
1 . Remove the crystal waters in the pdb file
Use grep to delete these lines very easily:
grep -v HOH cns1_model1.pdb > cns1_clean.pdb
- The topology for the molecule.
- A position restraint file.
- A post-processed structure file.
gmx pdb2gmx -ignh -ff amber99sb-ildn -f cns1_clean.pdb -o cns1_clean.gro -p cns1_clean.top -water tip3p
-ignh: Ignore all hydrogen atoms, because if H atoms are present, they must be in the named exactly how the force fields in GROMACS expect them to be. So dealing with H atoms can occasionally be a headache!-ff: We use Amber99SB-ILDN force field. -f: Specify the input file name-o: Specify the newly output GROMCAS file-p: Specify the newly ouput topology file, including the parameters of all atom-atom interactions.-water: Specify the mater model. We use TIP3P water model.We use editconf operation to create periodic simulation box.
gmx editconf -f cns1_clean.gro -o cns1_clean-PB.gro -bt cubic -d 1.2#-bt dodecahedron (dodecahedron is probable better in other situations)
-f: Input protein structure.-o: Output structure file with the simulation box information.-bt: Simulation box tyoe, default: rectangle box.-d: The minimum distance of the protein feom the X, Y, Z direction of the simulation box.We use -bt option to create the rhombic dodecahedron, because its volume is ~71% of the cubic box of the same periodic distance, thus saving on the number of water molecules that need to be added to solvate the protein. -d option specifies the minimum distance of the molecule to the box margin, using nm as the unit. It specifies the size of the box. In theory, of the majority of the systems, -d can not be less than 0.9nm, so we use 1.2nm.
We can check the structure of the protein with the added box. We need to transfer the gro format to pdb format using
editconfoperation
For example, we can convert the file.gro to file.pdb
gmx editconf -f file.gro -o file.pdb
We should design the specific .mdp file to specify parameters.
em-vac-pme.mdp
; Definition delivered to preprocessordefine = -DFLEXIBLE ; Use flexible water model rather than rigid water model, thus steepest descent method will further minimize the energy; Model type, Ending control, input control parameterintegrator = steep ; Algorithm (steep = steepest descent minimization)emtol = 500.0 ; Stop minimization when the maximum force < 1000.0 kJ/mol/nmemstep = 0.01 ; Minimization step sizensteps = 1000 ; Maximum number of (minimization) steps to performnstenergy = 1 ; Energy writing frequencyenergygrps = System ; Writing energy group; Nearst list, interaction calculation parameternstlist = 1 ; Frequency to update the neighbor list and long range forcesns_type = grid ; Method to determine neighbor list (simple, grid)coulombtype = PME ; Treatment of long range electrostatic interactionsrlist = 1.0 ;rcoulomb = 1.0 ; Short-range electrostatic cut-offvdwtype = cut-off ; Method for calculating van der waals interactionrvdw = 1.0 ; Short-range Van der Waals cut-offconstraints = none ; Set the restriction of the modelpbc = xyz ; Periodic Boundary Conditions in all 3 dimensions
Gromacs pre-processor (grompp) can integrate all information including simulated parameter, molecular structure, molecular structure, acquired force field parameter into a single binary file (tpr file), so it requires only tpr file while running mdrun.
gmx grompp -f em-vac-pme.mdp -c cns1_clean-PB.gro -p cns1_clean.top -o cns1_em-vac.tpr
-f option specifies the input parameter file-c option specifies the input structure file -p option soecifies the input topology file-o option specifies the output file which is used for mdrunAfter that, we get file cns1_em-vac.tpr and parameter file mdout.mdp.
We use mdrun intruction to run the energy minimization
gmx mdrun -v -deffnm cns1_em-vac
-deffnm specifies default file name-v shows the infomation of the simulation processAfter 1001 steps, result is :

As a result, we get the record file cns1_em-vac.log, Binary full-precision trajectory cns1_em-vac.trr, Energy-minimized structure cns1_em-vac.gro.
If we only need the molecule dynamics simulation in vaccum, the above steps have got what we want, the .trr file, which can be depicted to movies using VMD software. However, if we need simulation in the solvent, water, for example, we have to go further.
We use gmx solvate to infuse the water into the box, simulating the solvation of the protein molecule.
gmx solvate -cp cns1_em-vac.gro -cs spc216.gro -p cns1_clean.top -o cns1-b4ion.gro
-cp specifies the model of the simulating protein box-cs specifies the water model to be SPC water model used for infillingspc216 is the general 3-site water structure in GROMACS-p adjusts the topology file, adding the physical parameters of the corresponding water molecule-o specifies the output file after filling the water moleculeAs a result, we get cns1-b4ion.gro.
Then we need to minimize the energy of the system, using another em-sol-pme.mdp file.
em-sol-pme.mdp
define = -DFLEXIBLEintegrator = steepemtol = 250.0nsteps = 5000nstenergy = 1energygrps = Systemnstlist = 1ns_type = gridcoulombtype = PMErlist = 1.0rcoulomb = 1.0rvdw = 1.0constraints = nonepbc = xyz
Likewise, we need to use grompp to process the .mdp file, after which we start to conduct mdrun.
gmx grompp -f em-sol-pme.mdp -c cns1-b4ion.gro -p cns1_clean.top -o ion.tpr
The generated ion.tpr is the result, and the output message is as follows:

We can see from the message that the total charge is -3.999995. Since the total charge should be described in integers, there are around -4e charge.
Then, we need to add the ion into the system, using the ion.tpr file. The ion we add should neutralise the charge original in the system. Since we have -4e charge in the system, we need to add more negion than cation.
We use geion operation to replace some water molecules by ions.
gmx genion -s ion.tpr -o cns1-b4em.gro -neutral -conc 0.15 -p cns1_clean.top
-neutral option garantee the whole charge in the system equals zero, and the system is neutralized.-conc option sets the required ion concentration (here we use 0.15M) -g option specifies the name of the output record file -norandom option set the ion according to the electric potential, not random (default). However, here we use random set, so we do not use this optionAftering executing this operation, it appears a consecutive solvent molecule group. Here we choose 13(SOL). Typing Enter, the procudure will tell your your solvent molecules have been replaced by Cl ion.
Inorder to run energy minimization by mdrun, we have to use the grompp operation again.
gmx grompp -f em-sol-pme.mdp -c cns1-b4em.gro -p cns1_clean.top -o cns1_em-sol.tpr
Then, we run energy minimization:
gmx mdrun -v -deffnm cns1_em-sol
We can see that after 5001 steps, it reaches convergence.

Note: genion uses NaCl in default. If users want any other negion and cation, they can use -pname (cation) and -nname (negion) to specify the name of the ions. Users can refer tothe ion.itp file to see how to call different ions
We now need to construct position-restricted pre-equalization simulation, that is to say, we fix the position of the protein atoms and keep the solvent molecules moving, just like soaking the protein molecules into the water, enabling the further equilibration of the system.
We need to perform two stages of position-restricted pre-equalizations: 100ps NVT and 100ps NPT, with the temperature of 310 K, which is similar to the experiment room temperature. It is appropriate to simulate in 310 K, since it is similar to the body temperature.
Likewise, we need another .mdp file called nvt-pr-md.mdp file.
define = -DPOSRES ; Tell GROMACS run position restricted siulationintegrator = mddt = 0.002 ; step length, the unit of which is ps, here we use 2fsnsteps = 50000 ; number of steps, the simulation time is nsteps*dt; output control parameternstxout = 500 ; Writing frequency, nstxout=500 and dt=0.002, so it writes out every 1 psnstvout = 500 ; Velocity saving frequencynstenergy = 500 ; Energy saving frequencynstlog = 500 ; log file output frequencyenergygrps = Protein Non-Protein;nstlist = 5ns_type = gridpbc = xyzrlist = 1.0;coulombtype = PMEpme_order = 4fourierspacing = 0.16rcoulomb = 1.0 ; unit is nmvdw-type = Cut-offrvdw = 1.0;tcoupl = v-rescaletc-grps = Protein Non-Proteintau_t = 0.1 0.1ref_t = 300 300;DispCorr = EnerPres;pcoupl = no;gen_vel = yesgen_temp = 300gen_seed = -1;constraints = all-bondscontinuation = noconstraint_algorithm = lincslincs_iter = 1lincs_order = 4
Also, we need to use grommp and then mdrun again.
gmx grompp -f nvt-pr-md.mdp -c cns1_em-sol.gro -p cns1_clean.top -o cns1_nvt-pr.tpr#gmx mdrun -deffnm cns1_nvt-pr
If mdrun requires so much time, users can use nohup to run the program background.
nohup gmx mdrun -deffnm cns1_nvt-pr &#If you want to check the state of the programtail -n 25 cns1_nvt-pr.log
NPT simulation uses npt-pr-md.mdp ass the parameter file
define = -DPOSRESintegrator = mddt = 0.002nsteps = 50000nstxout = 500nstvout = 500nstfout = 500nstenergy = 500nstlog = 500energygrps = Protein Non-Proteinnstlist = 5ns-type = Gridpbc = xyzrlist = 1.0coulombtype = PMEpme_order = 4fourierspacing = 0.16rcoulomb = 1.0vdw-type = Cut-offrvdw = 1.0Tcoupl = v-rescaletc-grps = Protein Non-Proteintau_t = 0.1 0.1ref_t = 300 300DispCorr = EnerPres;Pcoupl = Parrinello-RahmanPcoupltype = Isotropictau_p = 2.0compressibility = 4.5e-5ref_p = 1.0refcoord_scaling = comgen_vel = noconstraints = all-bondscontinuation = yesconstraint_algorithm = lincslincs_iter = 1lincs_order = 4
We use the result .gro file as the input of the nvt simulation.
gmx grompp -f npt-pr-md.mdp -c cns1_nvt-pr.gro -p cns1_clean.top -o cns1_npt-pr.tprnohup gmx mdrun -deffnm cns1_npt-pr &#tail -n 25 cns1_npt-pr.log
After equalization, we are able to conduct the final simulation
Still, we need another .mdp file
npt-nopr-md.mdp
integrator = mddt = 0.002nsteps = 500000 ; 1 nsnstxout = 1000nstvout = 1000nstfout = 1000nstenergy = 1000nstlog = 1000energygrps = Protein Non-Proteinnstlist = 5ns-type = Gridpbc = xyzrlist = 1.0coulombtype = PMEpme_order = 4fourierspacing = 0.16rcoulomb = 1.0vdw-type = Cut-offrvdw = 1.0Tcoupl = v-rescaletc-grps = Protein Non-Proteintau_t = 0.1 0.1ref_t = 300 300DispCorr = EnerPresPcoupl = Parrinello-RahmanPcoupltype = Isotropictau_p = 2.0compressibility = 4.5e-5ref_p = 1.0gen_vel = noconstraints = all-bondscontinuation = yesconstraint_algorithm = lincslincs_iter = 1lincs_order = 4
Coding:
gmx grompp -f npt-nopr-md.mdp -c cns1_npt-pr.gro -p cns1_clean.top -o cns1_npt-nopr.tprnohup gmx mdrun -deffnm cns1_npt-nopr &#tail -n 25 cns1_npt-nopr.log
In order to save disk space, users can use trjconv operation to compress .trr trace file to xtc file, which is calculated faster in analysis. Also we should use -pbc nojump option to ensure all the atoms in the box.
gmx trjconv -f cns1_npt-nopr.trr -s cns1_npt-nopr.tpr -o cns1_npt-nopr.xtc -pbc nojump -ur compact -center#Choose 0 in the two choice options. If the xtc file size is too large, choose 1 in thsese 2 options.
Note: If the resulting .trr file is to big, which is more than 10GB and even 20GB, you'd better alter the npt-nopr-md.mdp, changing the writing frequency (nstxout, nstvout, nstfout, nstenergy, nstlog) from 100 to 5000 or even 10000.
Once we get the .xtc trace file, we'd like to analyse it.
Instead of using VMD software, we choose Pymol alternatively, since latest VMD only support 32-bit program, which means that the max number of memory addresses is 2GB. When we used VMD, we could not even be able to operate 300M xtc trace file together with .gro structural file. Additionally, the movie made from Pymol is better than from VMD, and the UI is more friendly.
Firstly, we need to convert the xtc file and gro file together to pdb file, which can be processed by Pymol.
trjconv -s cns1_npt-nopr.gro -f cns1_npt-nopr.xtc -o movie.pdb
Then open movie.pdb in Pymol.
1. Pre-review the movie of the framesPyMOL> mplayType "mstop" to stop the animation2. Saving frames in PNG formatPyMOL> mpng frameIt will produce .npg files in Pymol directory
Then, use gif convertion software of linux operation to convert the npg files to gif format.
In linux, the operation is:
convert -delay 10 -loop 0 frame*.png movie.gif
-delay 10: the time interval between frames
frame*.png: the frames which compose the animation
In VideoMach software, load the npg format files and export the media in gif format.
Use gmx g_energy operatio to extract the data
gmx energy -f cns1_npt-nopr.edr -o cns1_enrg-npt.xvg
It will appears some options of the data we would like to extract:

Here, we'd like to extract the energy and temperature data, so we type 10 11 12 13 and key in Enter twice to execute the operation.

The result will show in cns1_enrg-npt.xvg, and we can draw the picture using Microsoft Office Excel.
Firstly, we need to convert the cns1_npt-nopr.pro to pdb file.
gmx editconf -f cns1_npt-nopr.gro -o cns1_npt-nopr.pdb
Use align operation in Pymol to align the two structures
Aftering loading the structures into pymol software, we should remove the waters and ions of the MD result structure.
In pymol, on the right bar of cns1_npt-nopr, click action|remove waters to remove waters, and alert the selecting state of Residue on the bottom right to Molecule.
Click on the protein molecule and it appears an object called (sele), and click action|extract object and it created an object called obj01. Rename it to cns1_MD.
Load the initial structure file cns1_model1.pdb and we can align the two structures.
align cns1_MD, cns1_model1
After alignment, we can get the RMSD value in the console.
If we want to make a movie of the 360-degree-view about the alignment, we need to use pymol console.
PyMOL> mset 1 x360This command creates a movie with 60 framesPyMOL> util.mroll 1,360This command rotates the protein molecule 360 deg in 360 frames#the number of the frames can control the speed of the rotation, more frames means lower speed.PyMOL> mplayType "mstop" to stop the animation2. Saving frames in PNG formatPyMOL> mpng frame
It saves 360 frames png file in the python36 directory, use VideoMach to generate gif file.
Use gmx rms to calculate the RMSD between the backbone after molecular dynamics simulation and the initial backbone.
gmx rms -s cns1_npt-nopr.tpr -f cns1_npt-nopr.xtc -o cns1-bkbone-rmsd.xvg
Choose option 4 (backbone) when it alerts.
Use gmx rmsdist to calculate the RMSD of the distance fluctuation betwwen the atoms.
gmx rmsdist -s cns1_npt-nopr.tpr -f cns1_npt-nopr.xtc -o cns1_distrmsd.xvg
Likewise, use Excel to draw the picture.