5.1. Workflows Base

Added in version 0.9.0.

5.1.1. mdpow.workflows.base — Automated workflow base functions

To analyze multiple MDPOW projects, pass the top-level directory containing all MDPOW projects’ simulation data to project_paths(), which returns a pandas.DataFrame with the project information and paths. automated_project_analysis() then takes that pandas.DataFrame as input and runs the specified EnsembleAnalysis for every MDPOW project under the top-level directory provided to project_paths().

See also

registry

mdpow.workflows.base.project_paths(parent_directory=None, csv=None, csv_save_dir=None)

Takes a top-level directory containing MDPOW projects and determines the molname, resname, and path of each MDPOW project within.

Optionally takes a .csv file containing molname, resname, and paths, in that order.

Keywords:

parent_directory

the path of the top directory under which the subdirectories of MDPOW simulation data exist; additionally creates a ‘project_paths.csv’ file for user manipulation of metadata and for future reference

csv

.csv file containing the molecule names, resnames, and paths, in that order, for the MDPOW simulation data to be iterated over; must contain a header of the form: molecule,resname,path

csv_save_dir

optional directory in which to save the .csv file; otherwise, the file is saved in the current working directory

Returns:

project_paths

pandas.DataFrame containing MDPOW project metadata

Example

Typical Workflow:

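from mdpow.workflows import base

project_paths = base.project_paths(parent_directory='/foo/bar/MDPOW_projects')
base.automated_project_analysis(project_paths, ensemble_analysis='DihedralAnalysis')

or:

project_paths = base.project_paths(csv='/foo/bar/MDPOW.csv')
base.automated_project_analysis(project_paths, ensemble_analysis='DihedralAnalysis')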
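
The csv keyword expects a file with the header molecule,resname,path. As a minimal sketch, such a file could be written with the Python standard library as shown below. The molecule names, resnames, and directory names are hypothetical placeholders, and the file is written to a temporary directory only to keep the example self-contained.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical project entries in the required order: molecule, resname, path.
projects = [
    ("benzene", "BNZ", "/foo/bar/MDPOW_projects/benzene"),
    ("octanol", "OCT", "/foo/bar/MDPOW_projects/octanol"),
]

# Write to a temporary directory so the sketch does not touch real data.
csv_path = Path(tempfile.mkdtemp()) / "MDPOW.csv"
with csv_path.open("w", newline="") as handle:
    writer = csv.writer(handle)
    writer.writerow(["molecule", "resname", "path"])  # required header
    writer.writerows(projects)

# Read the file back to confirm the layout.
lines = csv_path.read_text().splitlines()
header_line = lines[0]
```

A file laid out this way can then be passed as project_paths(csv=...) in place of parent_directory.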

mdpow.workflows.base.automated_project_analysis(project_paths, ensemble_analysis, **kwargs)

Takes a pandas.DataFrame created by project_paths() and runs the specified EnsembleAnalysis for each project by executing the associated automated workflow in each project directory listed in the DataFrame.

Compatibility with more automated analyses is in development.

Keywords:

project_paths

pandas.DataFrame that provides paths to MDPOW projects

ensemble_analysis

name of the EnsembleAnalysis that corresponds to the desired automated workflow module

kwargs

keyword arguments for the supported automated workflows, see the registry for all available workflows and their call signatures

Example

A typical workflow is the automated dihedral analysis from mdpow.workflows.dihedrals, which applies the ensemble analysis DihedralAnalysis to each project. The registry contains this automated workflow under the key “DihedralAnalysis”, so the analysis is run for all projects in project_paths (obtained via project_paths()) by passing that key to automated_project_analysis():

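from mdpow.workflows import base

# kwargs: dict of keyword arguments for the selected workflow (see the registry)
project_paths = base.project_paths(parent_directory='/foo/bar/MDPOW_projects')
base.automated_project_analysis(project_paths, ensemble_analysis='DihedralAnalysis', **kwargs)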
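
The key-based dispatch described above can be illustrated with a small, self-contained sketch. This is not the MDPOW implementation: TOY_REGISTRY, toy_dihedral_workflow, run_for_all_projects, and the Project tuple are hypothetical names, and a list of named tuples stands in for the rows of the pandas.DataFrame returned by project_paths().

```python
from collections import namedtuple

# Stand-in for one row of the project table (columns: molecule, resname, path).
Project = namedtuple("Project", ["molecule", "resname", "path"])


def toy_dihedral_workflow(dirname, resname, molname, **kwargs):
    """Hypothetical workflow; a real one would run the analysis in dirname."""
    return f"{molname} ({resname}) analyzed in {dirname}"


# Hypothetical registry: analysis name -> workflow callable.
TOY_REGISTRY = {"DihedralAnalysis": toy_dihedral_workflow}


def run_for_all_projects(projects, ensemble_analysis, **kwargs):
    """Look up the workflow by name and call it once per project."""
    if ensemble_analysis not in TOY_REGISTRY:
        raise KeyError(f"{ensemble_analysis!r} is not a registered workflow")
    workflow = TOY_REGISTRY[ensemble_analysis]
    return [
        workflow(dirname=p.path, resname=p.resname, molname=p.molecule, **kwargs)
        for p in projects
    ]


projects = [
    Project("benzene", "BNZ", "/foo/bar/MDPOW_projects/benzene"),
    Project("octanol", "OCT", "/foo/bar/MDPOW_projects/octanol"),
]
results = run_for_all_projects(projects, "DihedralAnalysis")
```

Keeping the workflows in a name-to-callable mapping means a new analysis only needs a new registry entry, while the per-project loop stays unchanged.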

mdpow.workflows.base.guess_elements(atoms, rtol=0.001)

Guess elements for atoms from masses.

Given masses, we perform a reverse lookup on MDAnalysis.topology.tables.masses to find the corresponding element. Only atoms where the standard MDAnalysis guesser finds elements with masses contradicting the topology masses are corrected.

Note

This function requires correct masses to be present. No sanity checks are performed because MDPOW always uses TPR files that contain correct masses.

Arguments:

atoms

MDAnalysis AtomGroup with masses defined

Keywords:

rtol

relative tolerance for a match (as used in numpy.isclose()); atol is fixed at 1e-6, which means that “zero” is only recognized for values <= 1e-6

Note

In order to reliably match GROMACS masses, rtol should be at least 1e-3.

Returns:

elements

array of guessed element symbols, in same order as atoms

Example

As an example, we guess the elements from the masses and then set the elements for all atoms:

elements = guess_elements(atoms)
atoms.add_TopologyAttr("elements", elements)
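
To illustrate the reverse lookup and the role of rtol, the following is a minimal sketch in pure Python. It is not the MDPOW implementation: TOY_MASSES is a hypothetical four-entry table rather than MDAnalysis.topology.tables.masses, the input masses are example values chosen for illustration, and is_close() reimplements the numpy.isclose() criterion |a - b| <= atol + rtol * |b| by hand so that the example needs no third-party packages.

```python
# Hypothetical reference table (element symbol -> mass); not the MDAnalysis table.
TOY_MASSES = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def is_close(a, b, rtol=1e-3, atol=1e-6):
    """Same criterion as numpy.isclose: |a - b| <= atol + rtol * |b|."""
    return abs(a - b) <= atol + rtol * abs(b)


def lookup_element(mass, rtol=1e-3):
    """Return the first symbol whose reference mass matches, else None."""
    for symbol, reference_mass in TOY_MASSES.items():
        if is_close(mass, reference_mass, rtol=rtol):
            return symbol
    return None


# Example masses that differ slightly from the toy table for O and N.
masses = [12.011, 1.008, 15.9994, 14.0067]

guessed = [lookup_element(m) for m in masses]            # rtol=1e-3
strict = [lookup_element(m, rtol=1e-6) for m in masses]  # much tighter tolerance
```

With rtol=1e-3 all four masses are matched, whereas with rtol=1e-6 the oxygen and nitrogen masses no longer match any entry in the toy table, which is consistent with the note on choosing rtol of at least 1e-3.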