The mdpow-* scripts

A number of python scripts are installed together with the mdpow package. They simplify some common tasks (especially at the analysis stage) but they make some assumptions about directory layout and filenames. If one uses defaults for all directory and filename options then it should “just work”.

In particular, a directory hierarchy such as the following is assumed:

moleculename/
     water.simulation
     octanol.simulation
     Equilibrium/
           water/
           octanol/
     FEP/
           water/
                 Ghyd.fep
                 Coulomb/
                 VDW/
           octanol/
                 Goct.fep
                 Coulomb/
                 VDW/

moleculename is, for instance, “benzene” or “amantadine”; in the run input file (see Equilibrium simulations) is the value of the variable name in the [setup] section.

Equilibrium simulations

The mdpow-equilibrium script

  • sets up equilibrium MD simulations for the solvents (e.g., water or octanol)
  • runs energy minimization, MD_relaxed, and MD_NPT protocols; the user can choose if she wants to launch mdrun herself (e.g. on a cluster) or let the script do it locally on the workstation

The script runs essentially the same steps as described in the tutorial Tutorial: Manual session: 1-octanol as a solute but it gathers all required parameters from a run input file and it allows one to stop and continue and the protocol transparently.

It requires as at least Gromacs 4.6.5 ready to run (check that all commands can be found). The required input is

  1. a run configuration file (runinput.yml);
  2. a structure file (PDB or GRO) for the compound
  3. a Gromacs ITP file for the compound (OPLS/AA force field)

A template RUNFILE can be generated with mdpow-get-runinput runinput.yml, which will copy the default run input file bundled with MDPOW and put it in the current directory under the name runinput.yml.

For an example of a RUNFILE see benzene.yml. It is recommended to use absolute paths to file names. The run input file uses YAML syntax (and is parsed by yaml).

Note

It is recommended to enclose all strings in the input file in quotes, especially if they can be interpreted as numbers. For example, a name “005” would be interpreted as the number 5 unless explicitly quoted.

The script keeps track of the stages of the simulation protocol and allows the user to restart from the last completed stage. For instance, one can use the script to set up a simulation, then run the simulation on a cluster, transfer back the generated files, and start mdpow-equilibrium again with the exact same input to finish the protocol. Since Gromacs 4.5 it is also possible to interrupt a running mdrun process (e.g. with Control-c) and then resume the simulation at the last saved trajectory checkpoint by running mdpow-equilibrium again.

If in doubt, just try running mdpow-equilibrium running again and let it figure out the best course of action. Look at the log file to see what has been done. Lines reading “Fast forwarding: stage” indicate that results from stage are available.

Usage of the command:

mdpow-equilibrium [options] RUNFILE

Set up (and possibly run) the equilibration equilibrium simulation for one compound and one solvent. All parameters except the solvent are specified in the RUNFILE.

Options:

-h, --help

show this help message and exit

-S <NAME>, --solvent=<NAME>

solvent <NAME> for compound, can be ‘water’, ‘octanol’, ‘cyclohexane’ [water]

-d <DIRECTORY>, --dirname=<DIRECTORY>

generate files and directories in <DIRECTORY>, which is created if it does not already exist. The default is to use the molecule name from the run input file.

FEP simulations

The mdpow-fep script sets up (and can also run) the free energy perturbation calculations for one compound and one solvent. It uses the results from mdpow-equilibrium together with the run input file.

You will require:

  1. at least Gromacs 4.0.5 ready to run (check that all commands can be found)
  2. the results from a previous complete run of mdpow-equilibrium

Usage of the command:

mdpow-fep [options] RUNFILE

Options:

-h, --help

show this help message and exit

--get-template

generate a template RUNFILE and exit

-S <NAME>, --solvent=<NAME>

solvent <NAME> for compound, can be ‘water’ or ‘octanol’ [water]

-d <DIRECTORY>, --dirname=<DIRECTORY>

generate files and directories in <DIRECTORY>, which is created if it does not already exist. The default is to use the molecule name from the run input file.

Running analysis

Solvation free energy

The mdpow-solvationenergy calculatest the solvation free energy. It

  • collects data from FEP simulations (converting EDR files to XVG.bz2 when necessary)
  • calculates desolvation free energies for solvent –> vacuum via thermodynamic integration (TI)
  • plots dV/dlambda graphs
  • appends results to energies.txt (when the default names are chosen), see Output data file formats.

Usage of the command:

Options:

-h, --help

show this help message and exit

-S <NAME>, --solvent=<NAME>

solvent NAME for compound, ‘water’, ‘octanol’, or ‘cyclohexane’ [water]

-p <FILE>, --plotfile=<FILE>

plot dV/dlambda to FILE; use png or pdf suffix to determine the file type. ‘auto’ generates a pdf file DIRECTORY/dVdl.pdf and ‘None’ disables it [auto]

-e <FILE>, --energies=<FILE>

append solvation free energies to <FILE> [energies.txt]

-s <N>, --stride=<N>

Use every N-th datapoint from the original dV/dlambda data. Using a number larger than 1 can significantly reduce memory usage with little impact on the precision of the final output. [10]

--force

force rereading all data [False]

--ignore-corrupted

skip lines in the md.xvg files that cannot be parsed, perhaps because to an intermittent file system problem. In order to salvage the existing data, this option is provided for diagnostic purposes. [False]

Warning

Only use this option with care; skipped lines can indicate that other parts of the file have been corrupted but still pass the line corruption test. USE AT YOUR OWN RISK.

It is recommended to simply rerun the affected simulation(s).

Partition coefficients

The mdpow-pow script

  • collects data from FEP simulations
  • calculates desolvation free energies for octanol –> vacuum and water –> vacuum via thermodynamic integration (TI)
  • calculates transfer free energy water –> octanol
  • calculates log POW
  • plots dV/dlambda graphs
  • appends results to pow.txt and energies.txt (when the default names are chosen), see Output data file formats.

Usage of the command:

mdpow-pow [options] DIRECTORY [DIRECTORY …]

Run the free energy analysis for water and octanol in DIRECTORY/FEP and return the octanol-water partition coefficient log POW.

DIRECTORY should contain all the files resulting from running mdpow.fep.Goct.setup() and mdpow.fep.Goct.setup() and the results of the MD FEP simulations. It relies on the canonical naming scheme (basically: just use the defaults as in the tutorial).

The dV/dlambda plots can be produced automatically (mdpow-pow -p auto). If multiple DIRECTORY arguments are provided then one has to choose the auto option (or None).

Options:

-h, --help

show this help message and exit

-p <FILE>, --plotfile=<FILE>

plot dV/dlambda to FILE; use png or pdf suffix to determine the file type. ‘auto’ generates a pdf file DIRECTORY/dVdl.pdf and ‘None’ disables it [auto]

-o <FILE>, --outfile=<FILE>

append one-line results summary to <FILE> [pow.txt]

-e <FILE>, --energies=<FILE>

append solvation free energies to <FILE> [energies.txt]

--force

force rereading all data [False]

--ignore-corrupted

skip lines in the md.xvg files that cannot be parsed, perhaps because to an intermittent file system problem. In order to salvage the existing data, this option is provided for diagnostic purposes. [False]

Warning

Only use this option with care; skipped lines can indicate that other parts of the file have been corrupted but still pass the line corruption test. USE AT YOUR OWN RISK.

It is recommended to simply rerun the affected simulation(s).

Output data file formats

Results are appended to data files.

Note

Energies are always output in kJ/mol.

POW summary file

The pow.txt output file summarises the results from the water and octanol solvation calculations. Its name set with the option mdpow-pow -o. It contains fixed column data in the following order and all energies are in kJ/mol. See the Table of computed water-octanol transfer energies and logPow as an example.

itp_name

molecule identifier from the itp file

DeltaG0

transfer free energy from water to octanol (difference between DeltaG0 for octanol and water), in kJ/mol; >0: partitions into water, <0: partitions into octanol,

errDG0

error on DeltaG0; errors are calculated through propagation of errors from the errors on the means <dV/dlambda>

logPOW

log POW, base-10 logarithm of the octanol-water partition coefficient; >0: partitions into octanol, <0: partitions preferrably into water

errlogP

error on logPow

directory

directory under which all data files are stored; by default this is the name of the molecule and hence it can be used to identify the compound.
Computed log POW and water-to-octanol transfer energies (in kJ/mol).
itp_name DeltaG0 errDG0 logPow errlogP directory
BNZ -12.87 0.43 +2.24 0.07 benzene
OC9 -16.24 1.12 +2.83 0.20 octanol
URE +4.66 1.13 -0.81 0.20 urea
902 +9.35 1.06 -1.63 0.18 water_TIP4P

Energy file

The energy.txt output file collects all computed energy terms together with the results also found in the summary file pow.txt. Its name is set with the option mdpow-pow -e. It contains fixed column data in the following order and all energies are in kJ/mol. As an example see Table of Solvation Energies for the same compounds as above.

molecule

molecule identifier from the itp file

solvent

solvent name (water, octanol)

DeltaG0

solvation free energy difference in kJ/mol (Ben-Naim standard state, i.e. 1M gas/1M solution)

errDG0

error on DeltaG0, calculated through propagation of errors from the errors on the mean <dV/dlambda>

coulomb

Coulomb (discharging) contribution to the solvation free energy DeltaG0

errCoul

error on coulomb

VDW

Van der Waals/Lennard-Jones (decoupling) contribution to DeltaG0

errVDW

error on VDW

Vdp

correction if the FEP are run at constant volume but the desired solvation free energy is the Gibbs energy (currently neglected and set to 0)

errVdp

error on Vdp

w2oct

transfer free energy from water to octanol (difference between DeltaG0 for octanol and water)

errw2oct

error on w2oct

logPOW

log POW

errlogP

error on logPow

directory

directory under which all data files are stored; by default this is the name of the molecule and hence it can be used to identify the compound.
Solvation free energies for compounds in water and octanol (in kJ/mol).
molecule solvent DeltaG0 errDG0 coulomb errCoul VDW errVDW Vdp errVdp w2oct errw2oct logPOW errlogP directory
BNZ water -2.97 0.21 +7.65 0.07 -4.68 0.20 +0.00 0.00 -12.87 0.43 +2.24 0.07 benzene
BNZ octanol -15.84 0.37 +1.40 0.19 +14.44 0.32 +0.00 0.00 -12.87 0.43 +2.24 0.07 benzene
OC9 water -16.03 0.32 +27.35 0.09 -11.32 0.31 +0.00 0.00 -16.24 1.12 +2.83 0.20 octanol
OC9 octanol -32.28 1.08 +11.32 0.92 +20.96 0.56 +0.00 0.00 -16.24 1.12 +2.83 0.20 octanol
URE water -53.52 0.17 +56.94 0.11 -3.41 0.14 +0.00 0.00 +4.66 1.13 -0.81 0.20 urea
URE octanol -48.86 1.12 +35.75 1.09 +13.11 0.25 +0.00 0.00 +4.66 1.13 -0.81 0.20 urea
902 water -25.46 0.11 +34.93 0.10 -9.48 0.06 +0.00 0.00 +9.35 1.06 -1.63 0.18 water_TIP4P
902 octanol -16.11 1.05 +21.16 1.05 -5.05 0.09 +0.00 0.00 +9.35 1.06 -1.63 0.18 water_TIP4P

House-keeping scripts

A number of scripts are provided to complete simple tasks; they can be used to check that all required files are present or they can help in fixing small problems without one having to write Python code to do this oneself (e.g. by manipulating the checlpoint files). They generally make the same assumptions about file system layout as the other mdpow scripts.

Checking if the simulation is complete

Run mdpow-check in order to check if all necessary files are available.

Usage of the command:

mdpow-check [options] DIRECTORY [DIRECTORY …]

Check status of the progress of the project in DIRECTORY.

Output is only written to the status file (status.txt). A quick way to find problems is to do a

cat status.txt | gawk -F '|' '$2 !~ /OK/ {print $0}'

Options:

-h, --help

show this help message and exit

-o <FILE>, --outfile=<FILE>

write status results to FILE [status.txt]

Changing paths in water.simulation and octanol.simulation

It can become necessary to recreate the solvent.simulation status/checkpoint files in order to change paths, e.g. when one moved the directories or transferred all files to a different file system.

Typically, one would execute the mdpow-rebuild-simulation command in the parent directory of moleculename.

Usage of the command:

mdpow-rebuild-simulation [options] DIRECTORY [DIRECTORY …]

Re-create the water.simulation or octanol.simulation file with adjusted paths (now rooted at DIRECTORY).

Options:

-h, --help

show this help message and exit

--solvent=<NAME>

rebuild file for ‘water’, ‘octanol’, or ‘all’ [all]

Re-building Ghyd.fep and Goct.fep

It can become necessary to recreate the name.fep status/checkpoint files (e.g. if the files became corrupted due to a software error or in order to change paths).

Typically, one would execute the mdpow-rebuild-fep command in the parent directory of moleculename.

Usage of the command:

mdpow-rebuild-fep [options] DIRECTORY [DIRECTORY …]

Re-create the Goct.fep or Ghyd.fep file using the appropriate equilibrium simulation file under DIRECTORY/FEP.

This should only be necessary when the fep file was destroyed due to a software error or when the files are transferred to a different file system and some of the paths stored in the name.fep files have to be changed.

Options:

-h, --help

show this help message and exit

--solvent=<NAME>

rebuild fep for ‘water’, ‘octanol’, or ‘all’ [all]

--setup=<LIST>

Re-generate queuing system scripts with appropriate paths: runs fep.Gsolv.setup() with argument qscript=[LIST] after fixing Gsolv.

LIST should contain a comma-separated list of queing system templates. For example: 'icsn_8pd.sge,icsn_2pd.sge,local.sh'. It is an error if --setup is set without a LIST.