luna.mol.groups module

class AtomGroup(atoms, features=None, interactions=None, recursive=True, manager=None)[source]

Bases: object

Represent single atoms, chemical functional groups, or simply an arrangement of atoms as in hydrophobes.

Parameters
  • atoms (iterable of ExtendedAtom) – A sequence of atoms.

  • features (iterable of ChemicalFeature, optional) – A sequence of chemical features.

  • interactions (iterable of InteractionType, optional) – A sequence of interactions established by an atom group.

  • recursive (bool) – If True, add the new atom group to the list of atom groups of each atom in atoms.

  • manager (AtomGroupsManager, optional) – The AtomGroupsManager object that contains this AtomGroup object.

add_features(features)[source]

Add ChemicalFeature objects to features.

add_interactions(interactions)[source]

Add InteractionType objects to interactions.

as_json()[source]

Represent the atom group as a dict containing the atoms, compounds, features, and compound classes (water, hetero group, residue, or nucleotide).

The dict is defined as follows:

  • atoms (iterable of ExtendedAtom): the list of atoms comprising the atom group;

  • compounds (iterable of Residue): the list of unique compounds that contain the atoms;

  • classes (iterable of str): the list of compound classes;

  • features (iterable of ChemicalFeature): the atom group’s list of chemical features.

property atoms

The sequence of atoms that belong to an atom group.

Type

iterable of ExtendedAtom, read-only

property centroid

The centroid (x, y, z) of the atom group.

If atoms contains only one atom, then centroid returns the same as coords.

Type

array-like of floats, read-only
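
As a rough illustration of how centroid relates to coords, the sketch below averages plain (x, y, z) tuples. The helper centroid_of is hypothetical and not part of LUNA; it only assumes that the centroid is the arithmetic mean of the atomic coordinates.

```python
def centroid_of(coords):
    """Return the arithmetic mean of a sequence of (x, y, z) points.

    Hypothetical helper for illustration only; not LUNA's implementation.
    """
    coords = list(coords)
    if not coords:
        raise ValueError("at least one coordinate is required")
    n = len(coords)
    # zip(*coords) groups all x values, all y values, and all z values.
    return tuple(sum(axis) / n for axis in zip(*coords))


single = [(1.0, 2.0, 3.0)]
ring = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 2.0, 0.0), (0.0, 2.0, 0.0)]

# With one atom, the centroid coincides with that atom's coordinates.
print(centroid_of(single))
print(centroid_of(ring))
```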

clear_refs()[source]

References to this AtomGroup instance will be removed from the list of atom groups of each atom in atoms.

property compounds

The set of unique compounds that contain the atoms in atoms.

As an atom group can be formed by the union of two or more compounds (e.g., amide of peptide bonds), it may return more than one compound.

Type

set of Residue, read-only

contain_group(atm_grp)[source]

Check if the atom group atm_grp is a subset of this atom group.

For example, consider the benzene molecule. Its aromatic ring itself forms an AtomGroup object composed of all of its six atoms. Consider now any subset of carbons in the benzene molecule. This subset forms an AtomGroup object that is part of the group formed by the aromatic ring. Therefore, in this example, contain_group() will return True because the aromatic ring contains the subset of hydrophobic atoms.

Parameters

atm_grp (AtomGroup)

Returns

True if this atom group contains atm_grp, False otherwise.

Return type

bool
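
The benzene example can be pictured as a subset test. The following is a minimal sketch that uses atom names in place of ExtendedAtom objects; the function contains_group and the atom labels are assumptions made for illustration and do not reproduce LUNA's code.

```python
# Atom names stand in for ExtendedAtom objects (assumption for this sketch).
aromatic_ring = frozenset({"C1", "C2", "C3", "C4", "C5", "C6"})
hydrophobe = frozenset({"C2", "C3", "C4"})


def contains_group(group, other):
    """Return True if every atom of `other` also belongs to `group`."""
    return other <= group


print(contains_group(aromatic_ring, hydrophobe))
print(contains_group(hydrophobe, aromatic_ring))
```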

property coords

Atomic coordinates (x, y, z) of each atom in atoms.

Type

array-like of floats

property feature_names

The name of each chemical feature in features.

Type

iterable of str

property features

A sequence of chemical features.

To add or remove a feature use add_features() or remove_features(), respectively.

Type

iterable of ChemicalFeature

get_chains()[source]

Get all unique chains in an atom group.

get_interactions_with(atm_grp)[source]

Get all interactions that an atom group establishes with another atom group atm_grp.

Returns

All interactions established with the atom group atm_grp.

Return type

iterable of InteractionType

get_serial_numbers()[source]

Get the serial number of each atom in an atom group.

get_shortest_path_length(trgt_grp, cutoff=None)[source]

Compute the shortest path length between this atom group and another atom group trgt_grp.

The shortest path between two atom groups is defined as the shortest path between any of their atoms, which is calculated using Dijkstra’s algorithm.

If manager is not provided, None is returned.

If there is no path between this atom group and trgt_grp, infinity is returned.

Parameters
  • trgt_grp (AtomGroup) – The target atom group to which the shortest path is calculated.

  • cutoff (int, optional) – Only paths of length <= cutoff are returned. If None, all path lengths are considered.

Returns

The shortest path length.

Return type

int, float(‘inf’), or None
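
The definition above (the minimum over all atom pairs of a Dijkstra shortest path) can be sketched with the standard library alone. Everything here is an assumption for illustration: the graph is a plain adjacency dict, atoms are strings, and a path longer than cutoff is treated as if it did not exist. LUNA itself works on the manager's graph, so this is not its implementation.

```python
import heapq
from math import inf


def dijkstra_lengths(graph, source, cutoff=None):
    """Map each atom reachable from `source` to its shortest path length."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, inf):
            continue  # Stale heap entry; a shorter path was already found.
        for neighbor, weight in graph.get(node, {}).items():
            nd = d + weight
            if cutoff is not None and nd > cutoff:
                continue  # Ignore paths longer than the cutoff.
            if nd < dist.get(neighbor, inf):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist


def shortest_path_length(graph, src_atoms, trgt_atoms, cutoff=None):
    """Return the minimum path length between any pair of atoms, or inf."""
    best = inf
    for src in src_atoms:
        lengths = dijkstra_lengths(graph, src, cutoff)
        for trgt in trgt_atoms:
            best = min(best, lengths.get(trgt, inf))
    return best


# Linear chain A-B-C-D plus an isolated atom X; every bond has weight 1.
bonds = {
    "A": {"B": 1},
    "B": {"A": 1, "C": 1},
    "C": {"B": 1, "D": 1},
    "D": {"C": 1},
    "X": {},
}

print(shortest_path_length(bonds, ["A", "B"], ["D"]))
print(shortest_path_length(bonds, ["A"], ["X"]))
print(shortest_path_length(bonds, ["A"], ["D"], cutoff=1))
```

In this sketch the group {A, B} is two bonds away from {D} because the path starting at B is the shortest one among all atom pairs.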

has_atom(atom)[source]

Check if an atom group contains a given atom atom.

Parameters

atom (ExtendedAtom)

Returns

True if the atom group contains atom, False otherwise.

Return type

bool

has_hetatm()[source]

Return True if at least one atom in the atom group belongs to a hetero group, i.e., non-standard residues of proteins, DNAs, or RNAs, as well as atoms in other kinds of groups, such as carbohydrates, substrates, ligands, solvent, and metal ions.

has_nucleotide()[source]

Return True if at least one atom in the atom group belongs to a nucleotide.

has_residue()[source]

Return True if at least one atom in the atom group belongs to a standard residue of proteins.

has_target()[source]

Return True if at least one compound is the target of LUNA’s analysis.

has_water()[source]

Return True if at least one atom in the atom group belongs to a water molecule.

property interactions

The sequence of interactions established by an atom group.

To add or remove an interaction use add_interactions() or remove_interactions(), respectively.

Type

iterable of InteractionType

is_hetatm()[source]

Return True if all atoms in the atom group belong to hetero groups, i.e., non-standard residues of proteins, DNAs, or RNAs, as well as atoms in other kinds of groups, such as carbohydrates, substrates, ligands, solvent, and metal ions.

Hetero groups are designated by the flag HETATM in the PDB format.

is_mixed()[source]

Return True if the atoms in the atom group belong to different compound classes (water, hetero group, residue, or nucleotide).

is_nucleotide()[source]

Return True if all atoms in the atom group belong to nucleotides.

is_residue()[source]

Return True if all atoms in the atom group belong to standard residues of proteins.

is_water()[source]

Return True if all atoms in the atom group belong to water molecules.
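
The has_* methods test whether at least one atom belongs to a compound class, whereas the is_* methods require all atoms to belong to it. A minimal sketch of that distinction follows; it assumes each atom is reduced to a class label (a plain string), and the three helper functions are hypothetical names that are not part of LUNA.

```python
def has_class(atom_classes, cls):
    """True if at least one atom belongs to the compound class `cls`."""
    return any(c == cls for c in atom_classes)


def is_class(atom_classes, cls):
    """True if the group is non-empty and every atom belongs to `cls`."""
    return bool(atom_classes) and all(c == cls for c in atom_classes)


def is_mixed(atom_classes):
    """True if the atoms belong to more than one compound class."""
    return len(set(atom_classes)) > 1


# One label per atom; the labels mirror the classes named in this module.
peptide_group = ["residue", "residue"]
bridged_group = ["residue", "water"]

print(is_class(peptide_group, "residue"), is_mixed(peptide_group))
print(has_class(bridged_group, "water"), is_class(bridged_group, "water"))
print(is_mixed(bridged_group))
```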

property manager

The AtomGroupsManager object that contains an AtomGroup object.

Type

AtomGroupsManager

property normal

The normal vector (x, y, z) of the points given by coords.

Type

array-like of floats, read-only

remove_features(features)[source]

Remove ChemicalFeature objects from features.

remove_interactions(interactions)[source]

Remove InteractionType objects from interactions.

property size

The number of atoms comprising an atom group.

Type

int

class AtomGroupNeighborhood(atm_grps, bucket_size=10)[source]

Bases: object

Class for fast neighbor atom groups searching.

AtomGroupNeighborhood makes use of a KD Tree implemented in C, so it’s fast.

Parameters
  • atm_grps (iterable of AtomGroup, optional) – A sequence of AtomGroup objects, which is used in the queries. It can contain atom groups from different molecules.

  • bucket_size (int) – Bucket size of the KD tree. Tuning this value may improve search speed. The default value is 10.

search(center, radius)[source]

Return all atom groups in atm_grps that are within a distance of radius (measured in Å) from center.

For atom groups with more than one atom, their centroid is used as a reference.
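
To make the semantics of search() concrete, the sketch below performs the same kind of radius query by brute force over precomputed centroids. It is a stand-in, not the KD tree used by AtomGroupNeighborhood: the function search, the group names, and the coordinates are all assumptions made for illustration.

```python
from math import dist  # Euclidean distance, available since Python 3.8.


def search(centroids, center, radius):
    """Return the names of groups whose centroid lies within `radius` of `center`.

    Brute-force stand-in for a KD-tree query; it checks every group.
    """
    return [name for name, point in centroids.items()
            if dist(point, center) <= radius]


# Hypothetical groups, each reduced to its centroid in Å.
centroids = {
    "ring": (0.0, 0.0, 0.0),
    "amide": (3.0, 0.0, 0.0),
    "water": (10.0, 0.0, 0.0),
}

print(search(centroids, center=(1.0, 0.0, 0.0), radius=2.5))
```

A KD tree returns the same neighbors but avoids visiting every group, which matters when many queries are issued over a large structure.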

class AtomGroupPerceiver(feature_extractor, add_h=False, ph=None, amend_mol=True, mol_obj_type='rdkit', charge_model=<luna.mol.charge_model.OpenEyeModel object>, tmp_path=None, expand_selection=True, radius=2.2, critical=True)[source]

Bases: object

Perceive and create atom groups for molecules.

Parameters
  • feature_extractor (FeatureExtractor) – Perceive pharmacophoric properties from molecules.

  • add_h (bool) – If True, add hydrogen to the molecules.

  • ph (float, optional) – If not None, add hydrogens appropriate for pH ph.

  • amend_mol (bool) – If True, apply validation and standardization of molecules read from a PDB file.

  • mol_obj_type ({“rdkit”, “openbabel”}) – If “rdkit”, parse the converted molecule with RDKit and return an instance of rdkit.Chem.rdchem.Mol. If “openbabel”, parse the converted molecule with Open Babel and return an instance of openbabel.pybel.Molecule.

  • charge_model (ChargeModel) – A charge model object. By default, the implementation of OpenEye charge model is used.

  • tmp_path (str, optional) – A directory where temporary files will be saved. If not provided, the system’s default temporary directory will be used instead.

  • expand_selection (bool) – If True (the default), perceive features for a given molecule considering all nearby molecules. The goal is to identify any covalently bonded molecules that may alter the pharmacophoric properties or chemical functional groups.

    For instance, consider an amide of a peptide bond. If expand_selection is False, the residues forming the peptide bond will be analyzed separately, which will cause the oxygen and the nitrogen of the amide to be perceived as carbonyl oxygen and amine, respectively. On the other hand, if expand_selection is True, the covalent bond between the residues will be identified and the amide will be correctly perceived.

  • radius (float) – If expand_selection is True, select all molecules up to a maximum of radius away (measured in Å). The default value is 2.2, which comprises covalent bond distances.

  • critical (bool) – If False, ignore any errors during the processing of a molecule and continue to the next one. The default value is True, which implies that any errors will raise an exception.

Raises

IllegalArgumentError – If mol_obj_type is neither ‘rdkit’ nor ‘openbabel’.

perceive_atom_groups(compounds, mol_objs_dict=None)[source]

Perceive and create atom groups for each molecule in compounds.

Parameters
  • compounds (iterable of Residue) – A sequence of molecules.

  • mol_objs_dict (dict) – Map a compound, represented by its id, to a molecular object (MolWrapper, rdkit.Chem.rdchem.Mol, or openbabel.pybel.Molecule).

    This parameter can be used in cases where the ligand is read from a molecular file and no standardization or validation is required.

Returns

An AtomGroupsManager object containing all atom groups perceived for the molecules in compounds.

Return type

AtomGroupsManager

class AtomGroupsManager(atm_grps=None, entry=None)[source]

Bases: object

Store and manage AtomGroup objects.

Parameters
  • atm_grps (iterable of AtomGroup, optional) – An initial sequence of AtomGroup objects.

  • entry (Entry, optional) – The chain or molecule from which the atom groups were perceived.

Variables
  • ~AtomGroupsManager.entry (Entry) – The chain or molecule from which the atom groups were perceived.

  • ~AtomGroupsManager.graph (networkx.Graph) – Represent entry and its vicinity as a graph.

  • ~AtomGroupsManager.version (str) – The LUNA version when the object was created.

add_atm_grps(atm_grps)[source]

Add one or more AtomGroup objects to atm_grps and automatically update child_dict.

apply_filter(func)[source]

Apply a filtering function over the atom groups in atm_grps.

Parameters

func (callable) – A filtering function that returns True if an AtomGroup object is valid and False otherwise.

Yields

AtomGroup – A valid AtomGroup object.
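
Because apply_filter() yields its results, it behaves like a generator: groups are tested lazily, one at a time. The sketch below mimics that contract with plain dicts standing in for AtomGroup objects; the free function apply_filter and the dict keys are assumptions for illustration only.

```python
def apply_filter(atm_grps, func):
    """Lazily yield each group for which `func` returns True."""
    for grp in atm_grps:
        if func(grp):
            yield grp


# Plain dicts stand in for AtomGroup objects (assumption for this sketch).
groups = [
    {"name": "g1", "size": 1},
    {"name": "g2", "size": 6},
    {"name": "g3", "size": 3},
]

# Keep only groups with more than one atom.
multi_atom = apply_filter(groups, lambda grp: grp["size"] > 1)
print([grp["name"] for grp in multi_atom])
```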

property atm_grps

The sequence of AtomGroup objects. Additional objects should be added using the method add_atm_grps().

Type

iterable of AtomGroup, read-only

property child_dict

Mapping between atoms (ExtendedAtom) and atom groups (AtomGroup).

The mapping is a dict of {tuple of ExtendedAtom instances : AtomGroup} and is automatically updated when add_atm_grps() is called.

Type

dict, read-only

filter_by_types(types, must_contain_all=True)[source]

Filter AtomGroup objects by their physicochemical features.

Parameters
  • types (iterable of str) – A sequence of physicochemical features.

  • must_contain_all (bool) – If True, an AtomGroup object should contain all physicochemical features in types to be accepted. Otherwise, it will be filtered out.

Yields

AtomGroup – A valid AtomGroup object.
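
A minimal sketch of the must_contain_all switch follows. It assumes that must_contain_all=False accepts a group holding at least one of the requested features, which the description above does not state explicitly. The feature names other than Hydrophobe, the dict layout, and the free function filter_by_types are illustrative assumptions.

```python
def filter_by_types(atm_grps, types, must_contain_all=True):
    """Yield groups whose feature names match `types`.

    Assumption: with must_contain_all=False, one matching feature suffices.
    """
    wanted = set(types)
    for grp in atm_grps:
        names = grp["features"]
        if must_contain_all:
            accepted = wanted <= names          # Every requested feature present.
        else:
            accepted = bool(wanted & names)     # At least one requested feature.
        if accepted:
            yield grp


# Plain dicts stand in for AtomGroup objects; feature names are illustrative.
groups = [
    {"name": "g1", "features": {"Donor", "Acceptor"}},
    {"name": "g2", "features": {"Acceptor"}},
    {"name": "g3", "features": {"Hydrophobe"}},
]

strict = filter_by_types(groups, ["Donor", "Acceptor"])
loose = filter_by_types(groups, ["Donor", "Acceptor"], must_contain_all=False)
print([grp["name"] for grp in strict])
print([grp["name"] for grp in loose])
```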

find_atm_grp(atoms)[source]

Find the atom group that contains the sequence of atoms atoms.

Returns

An atom group object or None if atoms is not in the child_dict mapping.

Return type

AtomGroup or None

get_all_interactions()[source]

Return all interactions established by the atom groups in atm_grps.

Returns

All interactions.

Return type

set of InteractionType

get_shortest_path_length(src_grp, trgt_grp, cutoff=None)[source]

Compute the shortest path length between two atom groups src_grp and trgt_grp.

The shortest path between two atom groups is defined as the shortest path between any of their atoms, which is calculated using Dijkstra’s algorithm and the graph graph.

If there is no path between src_grp and trgt_grp, infinity is returned.

Parameters
  • src_grp, trgt_grp (AtomGroup) – The two atom groups between which the shortest path is calculated.

  • cutoff (int) – Only paths of length <= cutoff are returned. If None, all path lengths are considered.

Returns

The shortest path length.

Return type

int or float(‘inf’)

static load(input_file)[source]

Load the pickled representation of an AtomGroupsManager object saved at the file input_file.

Returns

The reconstituted AtomGroupsManager object, including its set of atom groups and interactions.

Return type

AtomGroupsManager

Raises

PKLNotReadError – If the file could not be loaded.

merge_hydrophobic_atoms(interactions_mngr)[source]

Create hydrophobic islands by merging covalently bonded hydrophobic atoms in atm_grps. Hydrophobic islands are atom groups having the feature Hydrophobe.

Atom-atom hydrophobic interactions in interactions_mngr are also converted to island-island interactions.

Parameters

interactions_mngr (InteractionsManager) – An InteractionsManager object from which hydrophobic interactions are selected and converted from atom-atom to island-island interactions.
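
The idea of a hydrophobic island can be sketched as a connected-components search restricted to hydrophobic atoms: two hydrophobic atoms end up in the same island when a chain of covalent bonds through hydrophobic atoms links them. The code below is an illustration under that assumption, with strings as atoms and an adjacency dict as the bond graph; hydrophobic_islands is a hypothetical helper, and the conversion of interactions is not covered.

```python
def hydrophobic_islands(bonds, hydrophobic):
    """Group covalently connected hydrophobic atoms into islands.

    bonds: dict mapping each atom to the atoms it is bonded to.
    hydrophobic: iterable of atoms carrying the Hydrophobe feature.
    """
    hydrophobic = set(hydrophobic)
    seen = set()
    islands = []
    for start in sorted(hydrophobic):
        if start in seen:
            continue
        island = set()
        stack = [start]
        while stack:  # Depth-first walk over hydrophobic neighbors only.
            atom = stack.pop()
            if atom in seen:
                continue
            seen.add(atom)
            island.add(atom)
            stack.extend(n for n in bonds.get(atom, ())
                         if n in hydrophobic and n not in seen)
        islands.append(frozenset(island))
    return islands


# C1-C2-O1-C3-C4: the oxygen splits the carbons into two islands.
bonds = {
    "C1": ["C2"],
    "C2": ["C1", "O1"],
    "O1": ["C2", "C3"],
    "C3": ["O1", "C4"],
    "C4": ["C3"],
}

print(hydrophobic_islands(bonds, ["C1", "C2", "C3", "C4"]))
```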

new_atm_grp(atoms, features=None, interactions=None)[source]

Create a new AtomGroup object for atoms if one does not exist yet. Otherwise, return the existing AtomGroup object.

Parameters
  • atoms (iterable of ExtendedAtom) – A sequence of atoms.

  • features (iterable of ChemicalFeature, optional) – If provided, add features to a new or an already existing AtomGroup object.

  • interactions (iterable of InteractionType, optional) – If provided, add interactions to a new or an already existing AtomGroup object.

Returns

A new or an already existing AtomGroup object.

Return type

AtomGroup
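
The behavior described for new_atm_grp() is a get-or-create lookup against a mapping keyed by tuples of atoms, similar to child_dict. The sketch below illustrates the pattern with a hypothetical GroupRegistry class; sorting the atoms to build the key and representing a group as a dict are assumptions of this example, not documented LUNA behavior.

```python
class GroupRegistry:
    """Hypothetical stand-in that mimics a get-or-create lookup."""

    def __init__(self):
        # Maps a tuple of atoms to the group created for them.
        self.child_dict = {}

    def new_atm_grp(self, atoms, features=None):
        # Assumption: atom order is irrelevant, so the key is sorted.
        key = tuple(sorted(atoms))
        grp = self.child_dict.get(key)
        if grp is None:
            grp = {"atoms": key, "features": set()}
            self.child_dict[key] = grp
        # Features are added to both new and existing groups.
        grp["features"].update(features or ())
        return grp


registry = GroupRegistry()
first = registry.new_atm_grp(["N", "C", "O"], features=["Amide"])
second = registry.new_atm_grp(["O", "C", "N"], features=["Acceptor"])

print(first is second)
print(sorted(first["features"]))
```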

remove_atm_grps(atm_grps)[source]

Remove one or more AtomGroup objects from atm_grps and automatically update child_dict.

Any recursive references to the removed objects will also be cleared.

save(output_file, compressed=True)[source]

Write the pickled representation of the AtomGroupsManager object to the file output_file.

Parameters
  • output_file (str) – The output file.

  • compressed (bool, optional) – If True (the default), compress the pickled representation as a gzip file (.gz).

Raises

FileNotCreated – If the file could not be created.
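
The save() and load() pair describes a pickle round trip with optional gzip compression. The following sketch reproduces that pattern with the standard library only; the free functions save and load, the magic-number check, and the dict used as a payload are assumptions for illustration and do not mirror LUNA's internals or its exception types.

```python
import gzip
import os
import pickle
import tempfile


def save(obj, output_file, compressed=True):
    """Pickle `obj` to `output_file`, optionally through gzip."""
    opener = gzip.open if compressed else open
    with opener(output_file, "wb") as handle:
        pickle.dump(obj, handle, protocol=pickle.HIGHEST_PROTOCOL)


def load(input_file):
    """Unpickle an object, detecting gzip by its two-byte magic number."""
    with open(input_file, "rb") as handle:
        is_gzip = handle.read(2) == b"\x1f\x8b"
    opener = gzip.open if is_gzip else open
    with opener(input_file, "rb") as handle:
        return pickle.load(handle)


# A plain dict stands in for an AtomGroupsManager object.
state = {"entry": "3QQK:A", "atm_grps": [("N", "C", "O"), ("C1", "C2")]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "manager.pkl.gz")
    save(state, path)
    restored = load(path)

print(restored == state)
```

As with any pickle file, only load data from sources you trust, because unpickling can execute arbitrary code.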

property size

The number of atom groups in atm_grps.

Type

int, read-only

property summary

The number of physicochemical features in atm_grps.

Type

dict, read-only

class PseudoAtomGroup(parent_grp, atoms, features=None, interactions=None)[source]

Bases: luna.mol.groups.AtomGroup

Represent only the atoms from an AtomGroup object that are involved in an interaction.

Currently, this class is only used during the generation of LUNA’s fingerprints.

Parameters
  • parent_grp (AtomGroup) – The atom group that contains the subset of atoms atoms.

  • atoms (iterable of ExtendedAtom) – A sequence of atoms.

  • features (iterable of ChemicalFeature, optional) – A sequence of chemical features.

  • interactions (iterable of InteractionType, optional) – A sequence of interactions established by an atom group.