7.1. Workflows Base

New in version 0.9.0.

7.1.1. mdpow.workflows.base — Automated workflow base functions

To analyze multiple MDPOW projects, pass project_paths() the top-level directory that contains the simulation data of all MDPOW projects; it returns a pandas.DataFrame with the project information and paths. automated_project_analysis() then takes this pandas.DataFrame as input and runs the specified EnsembleAnalysis for every MDPOW project under that top-level directory.

See also

registry

mdpow.workflows.base.project_paths(parent_directory=None, csv=None, csv_save_dir=None)

Takes a top-level directory containing MDPOW projects and determines the molname, resname, and path of each MDPOW project within it.

Optionally takes a .csv file containing molname, resname, and paths, in that order.

Keywords:
parent_directory
path of the top-level directory under which the subdirectories of MDPOW simulation data are located; when this keyword is used, a ‘project_paths.csv’ file is also created for user manipulation of metadata and for future reference
csv
.csv file containing the molecule names, resnames, and paths, in that order, for the MDPOW simulation data to be iterated over; it must contain a header of the form: molecule,resname,path
csv_save_dir
optional directory in which to save the .csv file; otherwise, the file is saved in the current working directory
Returns:
project_paths
pandas.DataFrame containing MDPOW project metadata
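The .csv layout described under the csv keyword can be sketched with the Python standard library. The molecule names, resnames, and paths below are hypothetical placeholders, and the snippet only illustrates the required header and column order; it does not call MDPOW.

```python
import csv
import io

# Hypothetical project metadata; real names and paths depend on your data.
rows = [
    {"molecule": "benzene", "resname": "BNZ", "path": "/foo/bar/MDPOW_projects/benzene"},
    {"molecule": "phenol", "resname": "PHL", "path": "/foo/bar/MDPOW_projects/phenol"},
]

# Write the header and rows in the required column order.
buffer = io.StringIO()
writer = csv.DictWriter(
    buffer, fieldnames=["molecule", "resname", "path"], lineterminator="\n"
)
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()

# Parse the text back to confirm that the header and contents round-trip.
header = csv_text.splitlines()[0]
parsed = list(csv.DictReader(io.StringIO(csv_text)))
```

Writing csv_text to a file should give something that can be passed via the csv keyword, assuming the paths point at real MDPOW project directories.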

Example

Typical Workflow:

from mdpow.workflows import base

project_paths = base.project_paths(parent_directory='/foo/bar/MDPOW_projects')
base.automated_project_analysis(project_paths, ensemble_analysis='DihedralAnalysis')

or:

from mdpow.workflows import base

project_paths = base.project_paths(csv='/foo/bar/MDPOW.csv')
base.automated_project_analysis(project_paths, ensemble_analysis='DihedralAnalysis')

mdpow.workflows.base.automated_project_analysis(project_paths, ensemble_analysis, **kwargs)

Takes a pandas.DataFrame created by project_paths() and iteratively runs the specified EnsembleAnalysis for each of the projects by running the associated automated workflow in each project directory returned by project_paths().

Compatibility with additional automated analyses is in development.

Keywords:
project_paths
pandas.DataFrame that provides paths to MDPOW projects
ensemble_analysis
name of the EnsembleAnalysis that corresponds to the desired automated workflow module
kwargs
keyword arguments for the supported automated workflows, see the registry for all available workflows and their call signatures
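The lookup-and-dispatch pattern described by these keywords can be illustrated with a minimal, self-contained sketch. Everything in it is hypothetical: the registry dictionary, the stand-in workflow function and its parameter names, and the use of a plain list of dictionaries in place of the pandas.DataFrame. It shows the general idea only and is not MDPOW's implementation.

```python
# Stand-in workflow; a real workflow would run an EnsembleAnalysis on the project.
def fake_dihedral_workflow(dirname, resname, molname, **kwargs):
    return f"DihedralAnalysis on {molname} ({resname}) in {dirname}"


# Hypothetical registry mapping an analysis name to its workflow function.
registry = {"DihedralAnalysis": fake_dihedral_workflow}


def run_all(projects, ensemble_analysis, **kwargs):
    """Run the registered workflow once per project record."""
    try:
        workflow = registry[ensemble_analysis]
    except KeyError:
        raise KeyError(f"{ensemble_analysis!r} is not in the registry") from None
    return [
        workflow(dirname=p["path"], resname=p["resname"], molname=p["molecule"], **kwargs)
        for p in projects
    ]


# One hypothetical project record with the molecule, resname, and path fields.
projects = [
    {"molecule": "benzene", "resname": "BNZ", "path": "/foo/bar/MDPOW_projects/benzene"},
]
results = run_all(projects, ensemble_analysis="DihedralAnalysis")
```

Keeping the workflows in a dictionary keyed by name means that a new analysis only needs a new entry; the loop over projects stays unchanged.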

Example

A typical workflow is the automated dihedral analysis from mdpow.workflows.dihedrals, which applies the ensemble analysis DihedralAnalysis to each project. The registry contains this automated workflow under the key “DihedralAnalysis”, so the workflow is executed for all projects in project_paths (obtained via project_paths()) by passing this key to automated_project_analysis():

from mdpow.workflows import base

project_paths = base.project_paths(parent_directory='/foo/bar/MDPOW_projects')
base.automated_project_analysis(project_paths, ensemble_analysis='DihedralAnalysis', **kwargs)

mdpow.workflows.base.guess_elements(atoms, rtol=0.001)

Guess elements for atoms from masses.

Given masses, we perform a reverse lookup on MDAnalysis.topology.tables.masses to find the corresponding element. Only atoms for which the standard MDAnalysis guesser finds an element whose mass contradicts the topology mass are corrected.
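The reverse lookup can be sketched as follows. This is a simplified illustration under stated assumptions: the MASSES dictionary is a small hand-written table with approximate values (not the MDAnalysis table), lookup_elements is a hypothetical helper, and the step that first consults the standard MDAnalysis guesser is omitted.

```python
import numpy as np

# Small illustrative mass table (approximate values in u); NOT the MDAnalysis table.
MASSES = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def lookup_elements(masses, rtol=1e-3, atol=1e-6):
    """Return the element symbol whose tabulated mass matches each input mass."""
    symbols = np.array(list(MASSES.keys()))
    table = np.array(list(MASSES.values()))
    elements = []
    for mass in masses:
        matches = symbols[np.isclose(mass, table, rtol=rtol, atol=atol)]
        # Use an empty string when no tabulated mass is within tolerance.
        elements.append(str(matches[0]) if len(matches) else "")
    return np.array(elements)


# The last mass has no entry in the illustrative table.
guessed = lookup_elements([12.01, 1.008, 15.9994, 35.45])
```

Masses that differ slightly from the tabulated values still resolve to an element, while a mass without a table entry yields an empty string in this sketch.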

Note

This function requires correct masses to be present. No sanity checks are performed because MDPOW always uses TPR files that contain correct masses.

Arguments:
atoms
MDAnalysis AtomGroup with masses defined
Keywords:
rtol

relative tolerance for a match (as used in numpy.isclose()); atol is fixed at 1e-6, which means that “zero” is only recognized for values <= 1e-6

Note

In order to reliably match GROMACS masses, rtol should be at least 1e-3.
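The effect of the two tolerances can be checked directly with numpy.isclose(), which treats a and b as equal when |a - b| <= atol + rtol * |b|. The masses below are illustrative numbers chosen to show the behavior; they are not taken from GROMACS or from the MDAnalysis tables.

```python
import numpy as np

ATOL = 1e-6  # fixed absolute tolerance, as described for rtol above

# Illustrative masses only: a topology value and a slightly different table value.
topology_mass, table_mass = 12.011, 12.0107

# The difference of 3e-4 is within rtol=1e-3 but outside rtol=1e-5.
loose = bool(np.isclose(topology_mass, table_mass, rtol=1e-3, atol=ATOL))
tight = bool(np.isclose(topology_mass, table_mass, rtol=1e-5, atol=ATOL))

# When comparing against zero the relative term vanishes, so only atol decides.
tiny_is_zero = bool(np.isclose(5e-7, 0.0, rtol=1e-3, atol=ATOL))
small_is_zero = bool(np.isclose(1e-5, 0.0, rtol=1e-3, atol=ATOL))
```

With these numbers, the looser rtol accepts the pair of masses and the tighter one rejects it, and only the value below atol is treated as zero.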

Returns:
elements
array of guessed element symbols, in same order as atoms

Example

As an example, we guess the elements from the masses and then set the elements attribute for all atoms:

elements = guess_elements(atoms)
atoms.add_TopologyAttr("elements", elements)