5.3. Automated Dihedral Analysis
Added in version 0.9.0.
5.3.1. mdpow.workflows.dihedrals — Automation for DihedralAnalysis
The dihedrals module provides functions for automated
workflows that encompass DihedralAnalysis.
See each function for requirements and examples.
Most functions can be used standalone or in combination,
depending on the desired results. Details of the completely automated workflow
are discussed under automated_dihedral_analysis().
Atom indices obtained by get_atom_indices() are 0-based,
atom index labels on the molecule in the plots are 0-based,
but atom names in plots and file names are 1-based.
- mdpow.workflows.dihedrals.SOLVENTS_DEFAULT = ('water', 'octanol')
Default solvents are water and octanol:
  - must match solvents used in project directory
  - one or two solvents can be specified
  - current solvents supported: see mdpow.forcefields
See also
mdpow.forcefields
- mdpow.workflows.dihedrals.INTERACTIONS_DEFAULT = ('Coulomb', 'VDW')
Default interactions set to Coulomb and VDW:
  - default values should not be changed
  - order should not be changed
- mdpow.workflows.dihedrals.SMARTS_DEFAULT = '[!#1]~[!$(*#*)&!D1]-!@[!$(*#*)&!D1]~[!#1]'
Default SMARTS string to identify relevant dihedral atom groups:
  - [!#1]: any atom, not Hydrogen
  - ~: any bond
  - [!$(*#*)&!D1]: any atom that is not part of linear triple bond and not atom with 1 explicit bond
  - -!@: single bond that is not ring bond
  - [!$(*#*)&!D1]-!@[!$(*#*)&!D1]: the central portion selects two atoms that are not involved in a triple bond and are not terminal, that are connected by a single, non-ring bond
  - [!#1]~ or ~[!#1]: the first and last portion specify any bond, to any atom that is not hydrogen
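As a rough illustration of how such a pattern can be applied, the sketch below matches the default SMARTS string against a molecule with RDKit. The helper name `find_dihedral_atom_groups` and the SMILES input are assumptions made for this example only; MDPOW works from a Universe converted by rdkit_conversion(), not from SMILES.

```python
SMARTS_DEFAULT = "[!#1]~[!$(*#*)&!D1]-!@[!$(*#*)&!D1]~[!#1]"


def find_dihedral_atom_groups(smiles, smarts=SMARTS_DEFAULT):
    """Return 0-based atom index tuples that match the dihedral SMARTS.

    Hypothetical helper for illustration; calling it requires RDKit.
    """
    # Imported inside the function so the sketch can be defined
    # even where RDKit is not installed.
    from rdkit import Chem

    # Add explicit hydrogens, as they would be present in a simulation topology.
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    pattern = Chem.MolFromSmarts(smarts)
    # Each match is a tuple of four atom indices, i.e. one dihedral atom group.
    return mol.GetSubstructMatches(pattern)
```

Calling `find_dihedral_atom_groups("CCCC")`, for instance, would be expected to return tuples describing rotation about the central C-C bond.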
- mdpow.workflows.dihedrals.PLOT_WIDTH_DEFAULT = 190
Plot width (plot_pdf_width) should be provided in millimeters (mm), and is converted to pixels (px) for use with cairosvg.
  - conversion factor: 1 mm = 3.7795275591 px
  - default value: 190 mm = 718.110236229 px
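The conversion quoted above is simple arithmetic and can be checked with a few lines. The names `MM_TO_PX` and `mm_to_px` are made up for this sketch and are not part of the module.

```python
MM_TO_PX = 3.7795275591  # conversion factor quoted above: pixels per millimeter
PLOT_WIDTH_DEFAULT = 190  # default plot width in mm


def mm_to_px(width_mm):
    """Convert a plot width in millimeters to pixels."""
    return width_mm * MM_TO_PX


default_px = mm_to_px(PLOT_WIDTH_DEFAULT)
print(f"{PLOT_WIDTH_DEFAULT} mm = {default_px:.9f} px")
```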
- mdpow.workflows.dihedrals.automated_dihedral_analysis(dirname, resname, figdir=None, df_save_dir=None, molname=None, SMARTS='[!#1]~[!$(*#*)&!D1]-!@[!$(*#*)&!D1]~[!#1]', plot_pdf_width=190, dataframe=None, padding=45, width=0.9, solvents=('water', 'octanol'), interactions=('Coulomb', 'VDW'), start=None, stop=None, step=None)[source]
Runs DihedralAnalysis for a single MDPOW project and creates violin plots of dihedral angle frequencies for each relevant dihedral atom group.
For one MDPOW project, automatically determines all relevant dihedral atom groups in the molecule, runs DihedralAnalysis for each group, pads the dihedral angles to maintain periodicity, creates violin plots of dihedral angle frequencies (KDEs), and saves publication-quality PDF figures for each group separately.
Optionally saves all pre-padded DihedralAnalysis results as a single pandas.DataFrame in the df_save_dir provided.
- Keywords:
- dirname
Molecule Simulation directory. Loads simulation files present in lambda directories into the new instance. With this method for generating an Ensemble, the lambda directories are explored and _load_universe_from_dir() searches for .gro, .gro.bz2, .gro.gz, and .tpr files for topology, and .xtc files for trajectory. It will default to using the tpr file available.
- figdir
path to the location to save figures (REQUIRED but marked as a kwarg for technical reasons; will be changed in #244)
- resname
resname for the molecule as defined in the topology and trajectory
- df_save_dir
optional, path to the location to save results pandas.DataFrame
- molname
molecule name to be used for labelling plots, if different from resname
- SMARTS
The default SMARTS string is described in detail under SMARTS_DEFAULT.
- plot_pdf_width
The default value for width of plot output is described in detail under PLOT_WIDTH_DEFAULT.
- dataframe
optional, if DihedralAnalysis was done prior, then the results pandas.DataFrame can be input to utilize angle padding and violin plotting functionality
- padding
value in degrees; default: 45
- width
width of the violin element (>1 overlaps); default: 0.9
- solvents
The default solvents are documented under SOLVENTS_DEFAULT. Normally takes a two-tuple, but analysis is compatible with single solvent selections. Single solvent analyses will result in a figure with fully filled violins for the single solvent.
- interactions
The default interactions are documented under INTERACTIONS_DEFAULT.
- start, stop, step
arguments passed to run(), as parameters for iterating through the trajectories of the current ensemble
Example
Typical Workflow:
    from mdpow.workflows import dihedrals
    dihedrals.automated_dihedral_analysis(
        dirname='/foo/bar/MDPOW_project_data',
        figdir='/foo/bar/MDPOW_figure_directory',
        resname='UNK',
        molname='benzene',
        padding=45,
        width=0.9,
        solvents=('water', 'octanol'),
        interactions=('Coulomb', 'VDW'),
        start=0, stop=100, step=10,
    )
- mdpow.workflows.dihedrals.build_universe(dirname, solvents=('water', 'octanol'))[source]
Builds Universe from the ./Coulomb/0000 topology and trajectory of the project for the first solvent specified.
Output used by rdkit_conversion() and get_atom_indices() to obtain atom indices for each dihedral atom group.
- Keywords:
- dirname
Molecule Simulation directory. Loads simulation files present in lambda directories into the new instance. With this method for generating an Ensemble, the lambda directories are explored and _load_universe_from_dir() searches for .gro, .gro.bz2, .gro.gz, and .tpr files for topology, and .xtc files for trajectory. It will default to using the tpr file available.
- solvents
The default solvents are documented under SOLVENTS_DEFAULT. Normally takes a two-tuple, but analysis is compatible with single solvent selections. Single solvent analyses will result in a figure with fully filled violins for the single solvent.
- Returns:
- u
Universe object
- mdpow.workflows.dihedrals.rdkit_conversion(u, resname)[source]
Converts the solute, resname, of the Universe to an rdkit.Chem.rdchem.Mol object for use with a SMARTS selection string to identify dihedral atom groups.
Accepts a Universe object made with build_universe() and a resname as input. Uses resname to select the solute for conversion by RDKitConverter to rdkit.Chem.rdchem.Mol, and will add element attributes for Hydrogen if not listed in the topology, using MDAnalysis.topology.guessers.guess_atom_element().
- Keywords:
- u
Universe object
- resname
resname for the molecule as defined in the topology and trajectory
- Returns:
- tuple(mol, solute)
function call returns tuple, see below
- mol
rdkit.Chem.rdchem.Mol object converted from solute
- solute
the MDAnalysis AtomGroup for the solute
- mdpow.workflows.dihedrals.get_atom_indices(mol, SMARTS='[!#1]~[!$(*#*)&!D1]-!@[!$(*#*)&!D1]~[!#1]')[source]
Uses a SMARTS selection string to identify atom indices for relevant dihedral atom groups.
Requires an rdkit.Chem.rdchem.Mol object as input; the SMARTS string (default: SMARTS_DEFAULT) is matched against it to identify relevant dihedral atom groups.
- Keywords:
- mol
rdkit.Chem.rdchem.Mol object converted from solute
- SMARTS
The default SMARTS string is described in detail under SMARTS_DEFAULT.
- Returns:
- atom_indices
tuple of tuples of indices for each dihedral atom group
- mdpow.workflows.dihedrals.get_bond_indices(mol, atom_indices)[source]
From the rdkit.Chem.rdchem.Mol object, uses atom_indices to determine the indices of the bonds between those atoms for each dihedral atom group.
- Keywords:
- mol
rdkit.Chem.rdchem.Mol object converted from solute
- atom_indices
tuple of tuples of indices for each dihedral atom group
- Returns:
- bond_indices
tuple of tuples of indices for the bonds in each dihedral atom group
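The mapping from atom indices to bond indices can be pictured with a small stand-alone sketch. The `bond_table` dictionary below is a hypothetical stand-in for the bond list of a molecule (with RDKit, `Mol.GetBondBetweenAtoms(i, j).GetIdx()` provides this kind of lookup); it is not taken from the MDPOW source.

```python
def bond_indices_for_groups(bond_table, atom_indices):
    """Map each dihedral atom group to the indices of its three bonds.

    bond_table: dict mapping frozenset({i, j}) -> bond index
    (hypothetical stand-in for a molecule's bond list).
    """
    all_bonds = []
    for group in atom_indices:
        # Consecutive atom pairs (a-b, b-c, c-d) define the three bonds.
        pairs = zip(group[:-1], group[1:])
        all_bonds.append(tuple(bond_table[frozenset(pair)] for pair in pairs))
    return tuple(all_bonds)


# Toy linear chain of five atoms: bond k joins atoms k and k+1.
bond_table = {frozenset({k, k + 1}): k for k in range(4)}
atom_indices = ((0, 1, 2, 3), (1, 2, 3, 4))
bond_indices = bond_indices_for_groups(bond_table, atom_indices)
```

Using `frozenset` keys makes the lookup independent of the order in which the two atoms of a bond are given.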
- mdpow.workflows.dihedrals.get_dihedral_groups(solute, atom_indices)[source]
Uses the 0-based atom_indices of the relevant dihedral atom groups determined by get_atom_indices() and returns the 1-based index names for each atom in each group.
Requires the atom_indices from get_atom_indices() to index the solute specified by select_atoms() and return an array of the names of each atom within its respective dihedral atom group as identified by the SMARTS selection string.
- Keywords:
- solute
the MDAnalysis AtomGroup for the solute
- atom_indices
tuple of tuples of indices for each dihedral atom group
- Returns:
- dihedral_groups
list of numpy.array() for atom names in each dihedral atom group
- mdpow.workflows.dihedrals.get_paired_indices(atom_indices, bond_indices, dihedral_groups)[source]
Combines atom_indices and bond_indices in tuples to be paired with their respective dihedral atom groups.
A dictionary is created with key-value pairs as follows: atom_indices and bond_indices are joined in a tuple as the value, with the key being the respective member of dihedral_groups to facilitate highlighting the relevant dihedral atom group when generating violin plots. As an example, ‘C1-N2-O3-S4’: ((0, 1, 2, 3), (0, 1, 2)), would be one key-value pair in the dictionary.
- Keywords:
- atom_indices
tuple of tuples of indices for each dihedral atom group
- bond_indices
tuple of tuples of indices for the bonds in each dihedral atom group
- dihedral_groups
list of numpy.array() for atom names in each dihedral atom group
- Returns:
- name_index_pairs
dictionary with key-value pair for dihedral atom group, atom indices, and bond indices
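The dictionary layout described above can be reproduced with plain Python containers. This is a minimal sketch under two assumptions: the function name `pair_indices` is invented here, and the key is built by joining the atom names with hyphens, as the 'C1-N2-O3-S4' example suggests. Plain lists stand in for the numpy arrays.

```python
def pair_indices(atom_indices, bond_indices, dihedral_groups):
    """Build {'A-B-C-D': (atom index tuple, bond index tuple)} pairs."""
    name_index_pairs = {}
    for atoms, bonds, names in zip(atom_indices, bond_indices, dihedral_groups):
        # Join the four atom names to form the key, e.g. 'C1-N2-O3-S4'.
        key = "-".join(names)
        name_index_pairs[key] = (tuple(atoms), tuple(bonds))
    return name_index_pairs


atom_indices = ((0, 1, 2, 3),)
bond_indices = ((0, 1, 2),)
dihedral_groups = [["C1", "N2", "O3", "S4"]]  # plain lists instead of numpy arrays
name_index_pairs = pair_indices(atom_indices, bond_indices, dihedral_groups)
```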
- mdpow.workflows.dihedrals.dihedral_groups_ensemble(dirname, atom_indices, solvents=('water', 'octanol'), interactions=('Coulomb', 'VDW'), start=None, stop=None, step=None)[source]
Creates one Ensemble for the MDPOW project and runs DihedralAnalysis for each dihedral atom group identified by the SMARTS selection string.
- Keywords:
- dirname
Molecule Simulation directory. Loads simulation files present in lambda directories into the new instance. With this method for generating an Ensemble, the lambda directories are explored and _load_universe_from_dir() searches for .gro, .gro.bz2, .gro.gz, and .tpr files for topology, and .xtc files for trajectory. It will default to using the tpr file available.
- atom_indices
tuples of atom indices for dihedral atom groups
- solvents
The default solvents are documented under SOLVENTS_DEFAULT. Normally takes a two-tuple, but analysis is compatible with single solvent selections. Single solvent analyses will result in a figure with fully filled violins for the single solvent.
- interactions
The default interactions are documented under INTERACTIONS_DEFAULT.
- start, stop, step
arguments passed to run(), as parameters for iterating through the trajectories of the current ensemble
- Returns:
- df
pandas.DataFrame of DihedralAnalysis results, including all dihedral atom groups for molecule of current project
- mdpow.workflows.dihedrals.save_df(df, df_save_dir, resname, molname=None)[source]
Takes a pandas.DataFrame of results from DihedralAnalysis as input and optionally saves the raw data before the angles are padded.
Optionally saves results before padding the angles for periodicity and plotting dihedral angle frequencies as KDE violins with dihedral_violins(). Given a parent directory, creates a subdirectory for the molecule and saves the fully sampled, unpadded results pandas.DataFrame as a compressed csv file, default: .csv.bz2.
- Keywords:
- df
pandas.DataFrame of DihedralAnalysis results, including all dihedral atom groups for molecule of current project
- df_save_dir
optional, path to the location to save results pandas.DataFrame
- resname
resname for the molecule as defined in the topology and trajectory
- molname
molecule name to be used for labelling plots, if different from resname
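The "subdirectory per molecule, bz2-compressed csv" idea can be sketched with the standard library alone. The directory layout, the file name, and the column names below are assumptions for illustration and may differ from what save_df() actually writes; with pandas, `DataFrame.to_csv(path, compression="bz2")` would serve the same purpose.

```python
import bz2
import csv
import os
import tempfile


def save_rows_compressed(rows, header, save_dir, resname, molname=None):
    """Write rows to <save_dir>/<name>/<name>_full_df.csv.bz2; return the path.

    Layout and file name are assumptions made for this sketch.
    """
    name = molname if molname is not None else resname
    subdir = os.path.join(save_dir, name)
    os.makedirs(subdir, exist_ok=True)
    path = os.path.join(subdir, f"{name}_full_df.csv.bz2")
    # bz2.open in text mode compresses transparently while csv writes.
    with bz2.open(path, "wt", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)
    return path


with tempfile.TemporaryDirectory() as tmp:
    saved = save_rows_compressed(
        rows=[("C1-N2-O3-S4", "water", "Coulomb", 0.0, 0.0, -179.2)],
        header=("selection", "solvent", "interaction", "lambda", "time", "dihedral"),
        save_dir=tmp,
        resname="UNK",
        molname="benzene",
    )
    # Read the file back to confirm that the round trip works.
    with bz2.open(saved, "rt", newline="") as handle:
        round_trip = list(csv.reader(handle))
```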
- mdpow.workflows.dihedrals.periodic_angle_padding(df, padding=45)[source]
Pads the angles from the results DataFrame to maintain periodicity in the violin plots.
Takes a pandas.DataFrame of results from DihedralAnalysis or dihedral_groups_ensemble() as input and pads the angles to maintain periodicity for properly plotting dihedral angle frequencies as KDE violins with dihedral_violins() and plot_dihedral_violins(). Creates two new pandas.DataFrame based on the padding value specified, pads the angle values, concatenates all three pandas.DataFrame, maintaining original data and adding padded values, and returns the new augmented pandas.DataFrame.
- Keywords:
- df
pandas.DataFrame of DihedralAnalysis results, including all dihedral atom groups for molecule of current project
- padding
value in degrees to specify angle augmentation threshold; default: 45
- Returns:
- df_aug
augmented results pandas.DataFrame containing padded dihedral angles as specified by padding
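One plausible reading of the padding step, shown on a plain list rather than a DataFrame: angles within `padding` degrees of +180 are copied to just below -180, and angles within `padding` degrees of -180 are copied to just above +180, so that a KDE sees density continuing across the periodic boundary. The function name and the assumption that angles lie in [-180, 180] degrees are made for this sketch.

```python
def pad_periodic_angles(angles, padding=45):
    """Return angles plus wrapped copies within `padding` degrees of +/-180."""
    # Values near +180 are duplicated just below -180.
    low_copies = [a - 360 for a in angles if a > 180 - padding]
    # Values near -180 are duplicated just above +180.
    high_copies = [a + 360 for a in angles if a < -180 + padding]
    # Keep the original data in the middle, mirroring the "all three" concatenation.
    return low_copies + list(angles) + high_copies


padded = pad_periodic_angles([-170.0, 0.0, 150.0], padding=45)
print(padded)  # [-210.0, -170.0, 0.0, 150.0, 190.0]
```

The original values are kept unchanged; only copies are shifted by 360 degrees.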
- mdpow.workflows.dihedrals.dihedral_violins(df, width=0.9, solvents=('water', 'octanol'), plot_title=None)[source]
Plots kernel density estimates (KDE) of dihedral angle frequencies for one dihedral atom group as violin plots, using as input the augmented pandas.DataFrame from periodic_angle_padding().
Output is converted to SVG by build_svg() and final output is saved as PDF by plot_dihedral_violins().
- Keywords:
- df
augmented results pandas.DataFrame from periodic_angle_padding()
- width
width of the violin element (>1 overlaps); default: 0.9
- solvents
The default solvents are documented under SOLVENTS_DEFAULT. Normally takes a two-tuple, but analysis is compatible with single solvent selections. Single solvent analyses will result in a figure with fully filled violins for the single solvent.
- plot_title
generated by build_svg() using molname, dihedral_groups, atom_indices, and interactions in this order and format: f'{molname}, {name[0]} {a} | ''{col_name}'
- Returns:
- g
returns a seaborn.FacetGrid object containing a violin plot of the kernel density estimates (KDE) of the dihedral angle frequencies for each dihedral atom group identified by SMARTS_DEFAULT
- mdpow.workflows.dihedrals.build_svg(mol, molname, name_index_pairs, atom_group_selection, solvents=('water', 'octanol'), width=0.9)[source]
Converts and combines figure components into an SVG object to be converted and saved as a publication quality PDF.
- Keywords:
- mol
rdkit.Chem.rdchem.Mol object converted from solute
- molname
molecule name to be used for labelling plots, if different from resname (in this case, carried over from an upstream decision between the two)
- name_index_pairs
dictionary with key-value pair for dihedral atom group, atom indices, and bond indices
- atom_group_selection
name of each section in the groupby series of atom group selections
- solvents
The default solvents are documented under SOLVENTS_DEFAULT. Normally takes a two-tuple, but analysis is compatible with single solvent selections. Single solvent analyses will result in a figure with fully filled violins for the single solvent.
- width
width of the violin element (>1 overlaps); default: 0.9
- Returns:
- fig
svgutils SVG figure object
- mdpow.workflows.dihedrals.plot_dihedral_violins(df, resname, mol, name_index_pairs, figdir=None, molname=None, width=0.9, plot_pdf_width=190, solvents=('water', 'octanol'))[source]
Coordinates plotting and saving figures for all dihedral atom groups.
Makes a subdirectory for the current project within the specified figdir using resname or molname as title and saves production quality PDFs for each dihedral atom group separately.
- Keywords:
- df
augmented results pandas.DataFrame from periodic_angle_padding()
- resname
resname for the molecule as defined in the topology and trajectory
- mol
rdkit.Chem.rdchem.Mol object converted from solute
- name_index_pairs
dictionary with key-value pair for dihedral atom group, atom indices, and bond indices
- figdir
path to the location to save figures (REQUIRED but marked as a kwarg for technical reasons; will be changed in #244)
- molname
molecule name to be used for labelling plots, if different from resname
- width
width of the violin element (>1 overlaps); default: 0.9
- plot_pdf_width
The default value for width of plot output is described in detail under PLOT_WIDTH_DEFAULT.
- solvents
The default solvents are documented under SOLVENTS_DEFAULT. Normally takes a two-tuple, but analysis is compatible with single solvent selections. Single solvent analyses will result in a figure with fully filled violins for the single solvent.
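Because most functions can be combined by hand, the sketch below chains them in the order their inputs and outputs suggest, using only the signatures documented in this section. The wrapper name `manual_dihedral_workflow` is hypothetical, and the exact sequence used internally by automated_dihedral_analysis() is an assumption; treat this as an outline, not the reference implementation.

```python
def manual_dihedral_workflow(dirname, resname, figdir, molname=None,
                             df_save_dir=None, padding=45, width=0.9,
                             solvents=("water", "octanol"),
                             interactions=("Coulomb", "VDW"),
                             start=None, stop=None, step=None):
    """Chain the documented functions by hand (hypothetical helper)."""
    # Imported here so that defining the sketch does not require MDPOW.
    from mdpow.workflows import dihedrals

    # 1. Topology -> RDKit molecule -> dihedral atom groups.
    u = dihedrals.build_universe(dirname, solvents=solvents)
    mol, solute = dihedrals.rdkit_conversion(u, resname)
    atom_indices = dihedrals.get_atom_indices(mol)
    bond_indices = dihedrals.get_bond_indices(mol, atom_indices)
    dihedral_groups = dihedrals.get_dihedral_groups(solute, atom_indices)
    name_index_pairs = dihedrals.get_paired_indices(
        atom_indices, bond_indices, dihedral_groups)

    # 2. Run DihedralAnalysis for every group; optionally save raw results.
    df = dihedrals.dihedral_groups_ensemble(
        dirname, atom_indices, solvents=solvents, interactions=interactions,
        start=start, stop=stop, step=step)
    if df_save_dir is not None:
        dihedrals.save_df(df, df_save_dir, resname, molname=molname)

    # 3. Pad for periodicity, then plot and save one PDF per group.
    df_aug = dihedrals.periodic_angle_padding(df, padding=padding)
    dihedrals.plot_dihedral_violins(
        df_aug, resname, mol, name_index_pairs, figdir=figdir,
        molname=molname, width=width, solvents=solvents)
    return df_aug
```

Splitting the workflow this way makes it possible to reuse an existing results DataFrame, or to stop after the analysis step without producing figures.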