7.1. Workflows Base
New in version 0.9.0.
7.1.1. mdpow.workflows.base: Automated workflow base functions
To analyze multiple MDPOW projects, provide project_paths() with the top-level directory containing all MDPOW projects' simulation data to obtain a pandas.DataFrame containing the project information and paths. Then, automated_project_analysis() takes as input the aforementioned pandas.DataFrame and runs the specified EnsembleAnalysis for all MDPOW projects under the top-level directory provided to project_paths().
See also
mdpow.workflows.base.project_paths(parent_directory=None, csv=None, csv_save_dir=None)
Takes a top directory containing MDPOW projects and determines the molname, resname, and path of each MDPOW project within.
Optionally takes a .csv file containing molname, resname, and paths, in that order.
Keywords:
- parent_directory: path of the top directory under which the subdirectories of MDPOW simulation data exist; additionally creates a 'project_paths.csv' file for user manipulation of metadata and for future reference
- csv: .csv file containing the molecule names, resnames, and paths, in that order, for the MDPOW simulation data to be iterated over; must contain a header of the form: molecule,resname,path
- csv_save_dir: optionally provided directory in which to save the .csv file; otherwise, the data will be saved in the current working directory
Returns:
- project_paths: pandas.DataFrame containing MDPOW project metadata
Example
Typical workflow:
project_paths = project_paths(parent_directory='/foo/bar/MDPOW_projects')
automated_project_analysis(project_paths, ensemble_analysis='DihedralAnalysis')
or:
project_paths = project_paths(csv='/foo/bar/MDPOW.csv')
automated_project_analysis(project_paths, ensemble_analysis='DihedralAnalysis')
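The csv keyword expects a file whose header is molecule,resname,path. As a minimal sketch, such a file could be written with Python's standard csv module. The helper write_project_csv is hypothetical and not part of MDPOW, and the molecule names, residue names, and directories are invented placeholders.

```python
import csv
import os
import tempfile

# Header expected by the csv keyword of project_paths(), in this order.
FIELDNAMES = ["molecule", "resname", "path"]


def write_project_csv(rows, filename):
    """Write project metadata rows to a .csv file with the expected header.

    Hypothetical helper for illustration only; not part of MDPOW.
    """
    with open(filename, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(rows)
    return filename


# Placeholder metadata: names and paths are invented for this example.
rows = [
    {"molecule": "benzene", "resname": "BNZ",
     "path": "/foo/bar/MDPOW_projects/benzene"},
    {"molecule": "octanol", "resname": "OcOH",
     "path": "/foo/bar/MDPOW_projects/octanol"},
]

csv_file = write_project_csv(rows, os.path.join(tempfile.mkdtemp(), "MDPOW.csv"))

# Read the first line back to confirm the header layout.
with open(csv_file, newline="") as handle:
    header = handle.readline().strip()
```

The resulting file could then be passed on as project_paths(csv=csv_file).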
mdpow.workflows.base.automated_project_analysis(project_paths, ensemble_analysis, **kwargs)
Takes a pandas.DataFrame created by project_paths() and iteratively runs the specified EnsembleAnalysis for each of the projects by running the associated automated workflow in each project directory returned by project_paths().
Compatibility with more automated analyses is in development.
Keywords:
- project_paths: pandas.DataFrame that provides paths to MDPOW projects
- ensemble_analysis: name of the EnsembleAnalysis that corresponds to the desired automated workflow module
- kwargs: keyword arguments for the supported automated workflows; see the registry for all available workflows and their call signatures
Example
A typical workflow is the automated dihedral analysis from mdpow.workflows.dihedrals, which applies the ensemble analysis DihedralAnalysis to each project. The registry contains this automated workflow under the key “DihedralAnalysis”, so the automated execution for all project_paths (obtained via project_paths()) is performed by passing that key to automated_project_analysis():
project_paths = project_paths(parent_directory='/foo/bar/MDPOW_projects')
automated_project_analysis(project_paths, ensemble_analysis='DihedralAnalysis', **kwargs)
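The key-based dispatch described above can be pictured with a toy registry. This is only a sketch of the general pattern, not MDPOW's implementation: toy_registry, toy_dihedral_workflow, run_for_all_projects, and the workflow's argument names are all assumptions made for illustration, and a plain list of dictionaries stands in for the pandas.DataFrame.

```python
def toy_dihedral_workflow(dirname, resname, molname, **kwargs):
    """Toy stand-in for an automated workflow; only reports what it was given."""
    return f"analyzed {molname} ({resname}) in {dirname}"


# Hypothetical registry: maps an analysis name to a workflow callable.
toy_registry = {"DihedralAnalysis": toy_dihedral_workflow}


def run_for_all_projects(projects, ensemble_analysis, **kwargs):
    """Look up the workflow by its key and run it once per project."""
    if ensemble_analysis not in toy_registry:
        raise KeyError(f"no workflow registered for {ensemble_analysis!r}")
    workflow = toy_registry[ensemble_analysis]
    results = []
    for project in projects:
        results.append(
            workflow(
                dirname=project["path"],
                resname=project["resname"],
                molname=project["molecule"],
                **kwargs,
            )
        )
    return results


# Placeholder project metadata (invented names and paths).
projects = [
    {"molecule": "benzene", "resname": "BNZ",
     "path": "/foo/bar/MDPOW_projects/benzene"},
    {"molecule": "octanol", "resname": "OcOH",
     "path": "/foo/bar/MDPOW_projects/octanol"},
]

results = run_for_all_projects(projects, ensemble_analysis="DihedralAnalysis")
```

With a lookup table like this, supporting a further analysis would only require registering another key, which fits the documented idea of selecting the workflow by name.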
mdpow.workflows.base.guess_elements(atoms, rtol=0.001)
Guess elements for atoms from masses.
Given masses, we perform a reverse lookup on MDAnalysis.topology.tables.masses to find the corresponding element. Only atoms where the standard MDAnalysis guesser finds elements with masses contradicting the topology masses are corrected.
Note
This function requires correct masses to be present. No sanity checks are performed because MDPOW always uses TPR files that contain correct masses.
Arguments:
- atoms: MDAnalysis AtomGroup with masses defined
Keywords:
- rtol: relative tolerance for a match (as used in numpy.isclose()); atol=1e-6 is fixed, which means that “zero” is only recognized for values <= 1e-6
Note
In order to reliably match GROMACS masses, rtol should be at least 1e-3.
Returns:
- elements: array of guessed element symbols, in the same order as atoms
Example
As an example, we guess the elements from the masses and then set the elements for all atoms:
elements = guess_elements(atoms)
atoms.add_TopologyAttr("elements", elements)
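The reverse lookup with a relative tolerance can be sketched in plain Python. This is an illustration of the idea only, not the code behind guess_elements(): TOY_MASSES is a small stand-in table of approximate atomic masses rather than the MDAnalysis table, is_close and element_from_mass are hypothetical helpers, and the step that corrects only atoms with contradictory guesses is left out. The comparison follows the criterion documented for numpy.isclose(), |a - b| <= atol + rtol * |b|.

```python
# Illustrative mass table (approximate atomic masses in u);
# a small stand-in, NOT the MDAnalysis table.
TOY_MASSES = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def is_close(a, b, rtol=1e-3, atol=1e-6):
    """Criterion documented for numpy.isclose: |a - b| <= atol + rtol * |b|."""
    return abs(a - b) <= atol + rtol * abs(b)


def element_from_mass(mass, rtol=1e-3):
    """Return the first element whose tabulated mass matches, or None."""
    for element, reference in TOY_MASSES.items():
        if is_close(mass, reference, rtol=rtol):
            return element
    return None


# Placeholder masses, some deviating slightly from the table values.
masses = [1.008, 12.011, 15.9994, 14.0067]
elements = [element_from_mass(m) for m in masses]

# A much tighter tolerance no longer matches the slightly different mass.
strict_match = element_from_mass(15.9994, rtol=1e-6)
```

In this toy table, 15.9994 matches oxygen at rtol=1e-3 but yields None at rtol=1e-6, which illustrates why a tolerance that is too small can miss masses that differ only in their last digits.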