openfe_analysis.rmsd

Functions

gather_rms_data(pdb_topology, dataset[, ...])

Compute structural RMSD-based metrics for a multistate BFE simulation.

Classes

LigandCOMDrift(*args, **kwargs)

Ligand center-of-mass displacement from initial position.

Protein2DRMSD(*args, **kwargs)

Flattened 2D RMSD matrix.

RMSDAnalysis(*args, **kwargs)

1D RMSD time series for an AtomGroup.

SymmetryCorrectedLigandRMSD(*args, **kwargs)

Symmetry-corrected 1D RMSD time series for a ligand AtomGroup.

class openfe_analysis.rmsd.Protein2DRMSD(*args: Any, **kwargs: Any)

Flattened 2D RMSD matrix.

For all unique frame pairs (i, j) with i < j, this class computes the RMSD between atomic coordinates after optimal alignment. Alignment centers each frame on its center of geometry and then applies rotational and translational superposition using the QCP method.

Parameters:
  • atomgroup (mda.AtomGroup) – Protein atoms (e.g. CA selection)

  • weights (np.ndarray, optional) – Per-atom weights to use in the RMSD calculation. If None, all atoms are weighted equally.

Notes

All atom positions are accumulated in memory during the trajectory iteration. For long trajectories or large systems this may result in significant memory usage. Consider using the step argument to run() to reduce the number of frames analyzed.
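To make the cost described in the Notes concrete, the following minimal sketch estimates how the number of frame pairs and the size of the accumulated coordinates shrink with a larger step. The helper names, the example system size, and the 4-bytes-per-value (float32) storage are illustrative assumptions, not part of openfe_analysis.

```python
from itertools import combinations


def pair_count(n_frames):
    """Number of unique frame pairs (i, j) with i < j."""
    return n_frames * (n_frames - 1) // 2


def analyzed_frames(n_total, step=1):
    """Frames visited when iterating over range(0, n_total, step)."""
    return len(range(0, n_total, step))


def position_bytes(n_frames, n_atoms, bytes_per_value=4):
    """Rough size of the stored coordinates (x, y, z per atom).

    bytes_per_value=4 assumes float32 storage (an assumption of this sketch).
    """
    return n_frames * n_atoms * 3 * bytes_per_value


# Hypothetical system: 5000 frames, 300 CA atoms.
for step in (1, 10):
    n = analyzed_frames(5000, step)
    print(step, n, pair_count(n), position_bytes(n, 300))

# The pairs themselves, for a tiny 4-frame example.
pairs = list(combinations(range(4), 2))
print(pairs)
```

In this example, going from step=1 to step=10 reduces the stored coordinates tenfold and the number of pairwise RMSD values roughly a hundredfold, because the pair count grows quadratically with the number of frames.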

class openfe_analysis.rmsd.RMSDAnalysis(*args: Any, **kwargs: Any)

1D RMSD time series for an AtomGroup.

Parameters:
  • atomgroup (MDAnalysis.AtomGroup) – Atoms to compute RMSD for.

  • reference (Optional[MDAnalysis.AtomGroup]) – Reference AtomGroup. If None, the reference positions are taken from the first analyzed frame, so run(start=10) measures RMSD relative to frame 10, not frame 0.

  • mass_weighted (bool, optional) – If True, compute mass-weighted RMSD.

  • center (bool, optional) – If True, subtract the center of geometry before computing RMSD. Defaults to False as the trajectory is assumed to be pre-centered.

  • superposition (bool, optional) – If True, perform rotational superposition before computing RMSD. Defaults to False as the trajectory is assumed to be pre-superposed.
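The effect of the reference choice can be illustrated with a small pure-Python sketch. It mimics only the documented defaults (no centering, no superposition) and takes the reference from the first analyzed frame when none is given. The functions rmsd and rmsd_series and the toy trajectory are hypothetical and do not reproduce the class implementation.

```python
import math


def rmsd(a, b, weights=None):
    """Plain (optionally weighted) RMSD without centering or superposition.

    a and b are lists of (x, y, z) tuples of equal length.
    """
    if weights is None:
        weights = [1.0] * len(a)
    total = sum(
        w * sum((p - q) ** 2 for p, q in zip(pa, pb))
        for w, pa, pb in zip(weights, a, b)
    )
    return math.sqrt(total / sum(weights))


def rmsd_series(frames, start=0, step=1, reference=None):
    """RMSD of each analyzed frame against the reference.

    If reference is None, the first analyzed frame is used.
    """
    analyzed = frames[start::step]
    ref = analyzed[0] if reference is None else reference
    return [rmsd(frame, ref) for frame in analyzed]


# Toy trajectory: a single atom moving 1 unit along x per frame.
frames = [[(float(i), 0.0, 0.0)] for i in range(20)]

from_frame_0 = rmsd_series(frames)[:3]
from_frame_10 = rmsd_series(frames, start=10)[:3]
explicit_ref = rmsd_series(frames, start=10, reference=frames[0])[:3]
print(from_frame_0, from_frame_10, explicit_ref)
```

Starting at frame 10 without a reference gives values that begin at zero again, whereas passing frame 0 explicitly keeps the series anchored to the original structure.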

class openfe_analysis.rmsd.SymmetryCorrectedLigandRMSD(*args: Any, **kwargs: Any)

Symmetry-corrected 1D RMSD time series for a ligand AtomGroup.

Parameters:
  • atomgroup (mda.AtomGroup) – Ligand atoms to compute RMSD for. If rdmol is not provided, bonds must be guessed on the atomgroup before instantiating this class; use guess_ligand_bonds() for this purpose.

  • rdmol (Chem.Mol, optional) – RDKit molecule corresponding to atomgroup. If provided, it is used directly and guess_ligand_bonds does not need to be called. If None, the RDKit molecule is derived from atomgroup via convert_to("RDKIT").

Raises:
  • ValueError – If rdmol is None and no bonds are found on the atomgroup.

  • ValueError – If the number of atoms in atomgroup and rdmol do not match.
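One common way to define a symmetry-corrected RMSD is to take the minimum RMSD over all atom relabelings that leave the molecular graph unchanged. The sketch below shows that idea in pure Python with hand-written mappings. How this class actually enumerates equivalent atoms is not described here, so the functions and the mappings argument are assumptions made for illustration.

```python
import math


def rmsd_with_mapping(coords, ref, mapping):
    """RMSD where reference atom i is compared with coords[mapping[i]]."""
    total = sum(
        sum((p - q) ** 2 for p, q in zip(coords[mapping[i]], ref[i]))
        for i in range(len(ref))
    )
    return math.sqrt(total / len(ref))


def symmetry_corrected_rmsd(coords, ref, mappings):
    """Minimum RMSD over the supplied symmetry-equivalent atom mappings."""
    if len(coords) != len(ref):
        raise ValueError("coords and ref must have the same number of atoms")
    return min(rmsd_with_mapping(coords, ref, m) for m in mappings)


# Toy carboxylate-like fragment: one carbon and two equivalent oxygens.
ref = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, -1.0, 0.0)]
# Same geometry, but the two oxygens have swapped places (a 180 degree flip).
flipped = [(0.0, 0.0, 0.0), (1.0, -1.0, 0.0), (1.0, 1.0, 0.0)]

identity = (0, 1, 2)
swap_oxygens = (0, 2, 1)  # hypothetical hand-written symmetry mapping

naive = rmsd_with_mapping(flipped, ref, identity)
corrected = symmetry_corrected_rmsd(flipped, ref, [identity, swap_oxygens])
print(naive, corrected)
```

The naive value is non-zero even though the flipped structure is chemically identical to the reference. The corrected value is zero because the swapped labeling is also considered.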

class openfe_analysis.rmsd.LigandCOMDrift(*args: Any, **kwargs: Any)

Ligand center-of-mass displacement from initial position.

Parameters:
  • atomgroup (mda.AtomGroup) – Ligand atoms for which the center-of-mass drift is calculated.

Notes

The reference COM is taken from whatever frame the trajectory is on when .run() is called, not necessarily the first analyzed frame. For consistent results, ensure the trajectory is at frame 0 (or your desired reference frame) before calling .run().

Periodic boundary conditions (PBC) are not applied, because the trajectory is assumed to have been pre-processed so that the ligand does not jump between periodic images. Passing a box to apply the minimum image convention would give incorrect results for ligands that have drifted more than half a box length from their starting position.
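The point about the minimum image convention can be checked with a short pure-Python sketch. It computes the center-of-mass displacement of a toy ligand with and without minimum-image wrapping in a cubic box. The function names, the cubic box, and the use of the first frame as the reference are assumptions of this sketch, not the class implementation.

```python
import math


def center_of_mass(coords, masses):
    """Mass-weighted mean position of a list of (x, y, z) tuples."""
    total = sum(masses)
    return tuple(
        sum(m * c[k] for m, c in zip(masses, coords)) / total for k in range(3)
    )


def com_drift(frames, masses):
    """Plain Euclidean COM displacement from the first frame, no PBC."""
    ref = center_of_mass(frames[0], masses)
    return [math.dist(center_of_mass(f, masses), ref) for f in frames]


def minimum_image_distance(a, b, box_length):
    """Distance under the minimum image convention in a cubic box."""
    squared = 0.0
    for p, q in zip(a, b):
        delta = p - q
        delta -= box_length * round(delta / box_length)
        squared += delta ** 2
    return math.sqrt(squared)


# Toy ligand: one atom of mass 1 drifting along x in an unwrapped trajectory.
masses = [1.0]
frames = [[(x, 0.0, 0.0)] for x in (0.0, 2.0, 4.0, 6.0, 8.0)]
box_length = 10.0

drift = com_drift(frames, masses)
wrapped = [
    minimum_image_distance(f[0], frames[0][0], box_length) for f in frames
]
print(drift)    # true displacement keeps growing
print(wrapped)  # minimum image folds it back past half the box
```

Once the displacement exceeds half the box length (5.0 here), the minimum-image distance starts to decrease while the true drift keeps increasing.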

openfe_analysis.rmsd.gather_rms_data(pdb_topology: Path, dataset: Path, skip: int | None = None, protein_selection: str = 'protein and name CA', ligand_selection: str = 'resname UNK') → dict[str, list[ndarray]]

Compute structural RMSD-based metrics for a multistate BFE simulation.

Parameters:
  • pdb_topology (pathlib.Path) – Path to the PDB file defining system topology.

  • dataset (pathlib.Path) – Path to the NetCDF trajectory file produced by a multistate simulation.

  • skip (int, optional) – Frame stride for analysis. If None, a stride is chosen such that approximately 500 frames are analyzed per state.

  • protein_selection (str) – MDAnalysis selection string for the protein atoms used in RMSD calculations. Default is "protein and name CA".

  • ligand_selection (str) – MDAnalysis selection string for the ligand atoms. Default is "resname UNK".
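The exact rule used to pick the default stride is not spelled out here. One plausible reading of "approximately 500 frames" is an integer division of the frame count by the target, clamped to at least 1, as in the sketch below. The function choose_stride and its target parameter are hypothetical.

```python
def choose_stride(n_frames, target=500):
    """Pick a frame stride so that roughly `target` frames are analyzed.

    This is an assumed rule for illustration, not the library's formula.
    """
    return max(1, n_frames // target)


for n_frames in (300, 5000, 1234):
    stride = choose_stride(n_frames)
    analyzed = len(range(0, n_frames, stride))
    print(n_frames, stride, analyzed)
```

With this rule, short trajectories are analyzed in full, and a trajectory whose length is not a multiple of the target yields somewhat more than 500 frames, which is consistent with "approximately".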

Returns:

Dictionary containing per-state analysis results with keys: protein_RMSD, ligand_RMSD, ligand_wander, protein_2D_RMSD, and time(ps).

Return type:

dict[str, list[ndarray]]

Notes

For each thermodynamic state (lambda), this function:

  • Loads the trajectory using FEReader

  • Applies standard PBC-handling and alignment transformations

  • Computes protein and ligand structural metrics over time

The following analyses are produced per state:

  • 1D protein CA RMSD time series

  • 1D ligand RMSD time series

  • Ligand center-of-mass displacement from its initial position (ligand_wander)

  • Flattened 2D protein RMSD matrix (pairwise RMSD between frames)
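For plotting, a flattened pairwise vector is usually expanded back into a symmetric square matrix. The sketch below does this in pure Python, assuming the values are ordered row by row over pairs (i, j) with i < j: (0, 1), (0, 2), ..., (1, 2), and so on. That ordering is an assumption, since it is not stated here, and the helper names are hypothetical.

```python
import math


def n_frames_from_flat(n_values):
    """Invert n * (n - 1) / 2 = n_values to recover the number of frames."""
    n = (1 + math.isqrt(1 + 8 * n_values)) // 2
    if n * (n - 1) // 2 != n_values:
        raise ValueError("length is not a valid number of unique pairs")
    return n


def to_square(flat):
    """Expand a flattened upper triangle into a symmetric matrix.

    Assumes row-major ordering over pairs (i, j) with i < j.
    """
    n = n_frames_from_flat(len(flat))
    matrix = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = flat[k]
            k += 1
    return matrix


# Three frames give three unique pairs: (0, 1), (0, 2), (1, 2).
square = to_square([1.0, 2.0, 3.0])
print(square)
```

The diagonal is zero because each frame has zero RMSD to itself. Applied to the output of gather_rms_data, this would be done once per state on each entry of protein_2D_RMSD, provided the ordering assumption holds.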