luna.projects module

class EntryResults(entry, atm_grps_mngr, interactions_mngr, ifp=None, mfp=None)[source]

Bases: object

Store entry results.

Parameters
Variables
static load(input_file)[source]

Read the pickled representation of an EntryResults object from the file input_file and return the reconstituted object hierarchy specified therein. input_file can be a gzip-compressed file.

Raises

PKLNotReadError – If the file could not be loaded.

save(output_file, compressed=True)[source]

Write the pickled representation of this object to the file output_file.

Parameters
  • output_file (str) – The output file where the pickled representation will be saved.

  • compressed (bool, optional) – If True (the default), compress the pickled representation as a gzip file (.gz).

Raises

FileNotCreated – If the file could not be created.
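The save/load round trip described above can be sketched with the standard library alone. This is a minimal illustration and not LUNA's implementation: the helper names save_pickle and load_pickle are hypothetical, and a plain dict stands in for an EntryResults object. It assumes the loader can tell gzip files apart by their two-byte magic number.

```python
import gzip
import pickle
import tempfile
from pathlib import Path


def save_pickle(obj, output_file, compressed=True):
    """Pickle obj to output_file, gzip-compressing it when compressed is True."""
    opener = gzip.open if compressed else open
    with opener(output_file, "wb") as handle:
        pickle.dump(obj, handle, protocol=pickle.HIGHEST_PROTOCOL)


def load_pickle(input_file):
    """Unpickle input_file, accepting both plain and gzip-compressed files."""
    with open(input_file, "rb") as handle:
        is_gzip = handle.read(2) == b"\x1f\x8b"  # gzip magic number
    opener = gzip.open if is_gzip else open
    with opener(input_file, "rb") as handle:
        return pickle.load(handle)


# Hypothetical stand-in for an EntryResults object; any picklable value works.
results = {"entry": "D4:ligand_1", "n_interactions": 12}

with tempfile.TemporaryDirectory() as tmp_dir:
    gz_path = Path(tmp_dir) / "results.pkl.gz"
    raw_path = Path(tmp_dir) / "results.pkl"
    save_pickle(results, gz_path)                     # compressed (the default)
    save_pickle(results, raw_path, compressed=False)  # uncompressed
    restored_gz = load_pickle(gz_path)
    restored_raw = load_pickle(raw_path)
```

Only unpickle files you trust, since loading a pickle can execute arbitrary code.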

class LocalProject(entries, working_path, **kwargs)[source]

Bases: luna.projects.Project

Define a local LUNA project, i.e., results are saved locally and not to a database.

This class inherits from Project and implements run().

Examples

In this minimal example, we will calculate protein-ligand interactions for dopamine D4 complexes.

First, we should define the ligand entries and initialize a new InteractionCalculator object.

>>> from luna.util.default_values import LUNA_PATH
>>> from luna.mol.entry import MolFileEntry
>>> from luna.interaction.filter import InteractionFilter
>>> from luna.interaction.calc import InteractionCalculator
>>> entries = list(MolFileEntry.from_file(input_file=f"{LUNA_PATH}/tutorial/inputs/MolEntries.txt",
...                                       pdb_id="D4", mol_file=f"{LUNA_PATH}/tutorial/inputs/ligands.mol2"))
>>> ic = InteractionCalculator(inter_filter=InteractionFilter.new_pli_filter())

Finally, create the new LUNA project with the desired parameters and call run(). Here, we define the parameters first as a dict and then pass it to LocalProject.

>>> from luna import LocalProject
>>> main_path = "."  # placeholder base directory for the results; replace with your own
>>> opts = {}
>>> opts["working_path"] = "%s/Results/Test3" % main_path
>>> opts["pdb_path"] = f"{LUNA_PATH}/tutorial/inputs/"
>>> opts["entries"] = entries
>>> opts["inter_calc"] = ic
>>> proj_obj = LocalProject(**opts)
>>> proj_obj.run()
generate_ifps()[source]

Generate LUNA interaction fingerprints (IFPs).

This function can be used to generate new IFPs after a project has been run. You can reload your project, vary the IFP parameters (ifp_num_levels, ifp_radius_step, ifp_length, ifp_count, ifp_diff_comp_classes, ifp_type, ifp_output), and call generate_ifps to create new IFPs without having to run the project from scratch.

Examples

In the example below, we assume that a LUNA project object named proj_obj already exists.

>>> from luna.interaction.fp.type import IFPType
>>> proj_obj.ifp_num_levels = 5
>>> proj_obj.ifp_radius_step = 1
>>> proj_obj.ifp_length = 4096
>>> proj_obj.ifp_type = IFPType.EIFP
>>> proj_obj.ifp_output = "EIFP-4096__length-5__radius-1.csv"
>>> proj_obj.generate_ifps()
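
To make the effect of ifp_num_levels and ifp_radius_step concrete, the sketch below lists the shell radius at each iteration using the rule documented for ifp_radius_step (radius = iteration * radius_step). The helper shell_radii is hypothetical, and it assumes that iterations run from 0 to ifp_num_levels - 1; it does not call LUNA.

```python
def shell_radii(num_levels, radius_step):
    """Return the shell radius at each iteration, assuming iterations 0 .. num_levels - 1."""
    return [level * radius_step for level in range(num_levels)]


# Values from the generate_ifps() example: 5 levels, radius step of 1.
example_radii = shell_radii(num_levels=5, radius_step=1)

# Project defaults: 2 levels, radius step of 5.73171.
default_radii = shell_radii(num_levels=2, radius_step=5.73171)
```

Under that assumption, the example settings produce radii 0, 1, 2, 3, and 4, while the defaults produce only 0 and 5.73171.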
class Project(entries, working_path, pdb_path='/home/docs/checkouts/readthedocs.org/user_builds/luna-toolkit/checkouts/latest/output/public/pdb', overwrite_path=False, add_h=True, ph=7.4, amend_mol=True, mol_obj_type='rdkit', atom_prop_file='/home/docs/checkouts/readthedocs.org/user_builds/luna-toolkit/checkouts/latest/luna/data/LUNA.fdef', inter_calc=None, binding_mode_filter=None, calc_mfp=False, mfp_output=None, calc_ifp=True, ifp_num_levels=2, ifp_radius_step=5.73171, ifp_length=4096, ifp_count=True, ifp_diff_comp_classes=True, ifp_type=IFPType.EIFP, ifp_output=None, ifp_sim_matrix_output=None, out_pse=False, append_mode=False, verbosity=3, logging_enabled=True, nproc=1)[source]

Bases: object

Define a LUNA project.

Note

This class is not intended to be used directly because run() is not implemented by default. Instead, you should use a class that inherits from Project and implements run(). An example is the class LocalProject that implements a custom run() that saves results as local files.

Parameters
  • entries (iterable of Entry) – Entries determine the target molecules for which interactions and other properties will be calculated. They can be ligands, chains, etc., and can be defined in a number of ways. Each entry has an associated PDB file that may contain macromolecules (protein, RNA, DNA), other small molecules, water, and ions. Refer to Entry for more information.

  • working_path (str) – Where project results will be saved.

  • pdb_path (str) – Path containing local PDB files, or the path to which PDB files will be downloaded. PDB filenames must match those defined for the entries. If not provided, the default PDB path will be used.

  • overwrite_path (bool) – If True, allow LUNA to overwrite any existing directory, which may remove files from a previous project. The default value is False.

  • add_h (bool) – Whether to add hydrogens. The default value is True.

    Note

    As a precaution, LUNA does not add hydrogens to NMR-solved structures or to ligands initialized from molecular files (MolFileEntry objects), as they usually already contain hydrogens.

  • ph (float) – The pH value, which controls how hydrogens are added. The default value is 7.4.

    Note

    As a precaution, LUNA does not modify the protonation of molecules read from molecular files (MolFileEntry objects).

  • amend_mol (bool) – If True (the default), try to fix atomic charges, valence, and bond types for small molecules and residues from PDB files. Only molecules from PDB files are validated because they do not contain charge, valence, or bond type information, which may cause molecules to be incorrectly perceived.

    Note

    Molecules from external files (MolFileEntry objects) will not be modified.

  • mol_obj_type ({‘rdkit’, ‘openbabel’}) – Define which library (RDKit or Open Babel) to use to parse molecules. The default value is ‘rdkit’.

  • atom_prop_file (str) – A feature definition file (FDef) containing all the information needed to define a set of chemical or pharmacophoric features. The default value is ‘LUNA.fdef’, which contains LUNA's default feature definitions.

  • inter_calc (InteractionCalculator) – Define which and how interactions are calculated.

  • binding_mode_filter (BindingModeFilter) – Define how to filter interactions based on binding modes.

  • calc_mfp (bool) – If True, generate ECFP4 fingerprints for each entry in entries. The default value is False.

  • mfp_output (str) – If calc_mfp is True, save ECFP4 fingerprints to file mfp_output. If not provided, fingerprints are saved at <working_path>/results/fingerprints/mfp.csv.

  • calc_ifp (bool) – If True (the default), generate LUNA interaction fingerprints (IFPs) for each entry in entries.

  • ifp_num_levels (int) – The maximum number of iterations for fingerprint generation. The default value is 2.

  • ifp_radius_step (float) – The multiplier used to increase shell size at each iteration. At iteration 0, shell radius is 0 * radius_step, at iteration 1, radius is 1 * radius_step, etc. The default value is 5.73171.

  • ifp_length (int) – The fingerprint length (total number of bits). The default value is 4096.

  • ifp_count (bool) – If True (the default), create a count fingerprint (CountFingerprint). Otherwise, return a bit fingerprint (Fingerprint).

  • ifp_diff_comp_classes (bool) – If True (the default), differentiate between compound classes. That is, structural information originating from AtomGroup objects belonging to residues, nucleotides, ligands, or water molecules is considered different even if it is otherwise the same. This is useful, for example, to differentiate protein-ligand interactions from residue-residue ones.

  • ifp_type (IFPType) – The fingerprint type (EIFP, FIFP, or HIFP). The default value is EIFP.

  • ifp_output (str) – If calc_ifp is True, save LUNA interaction fingerprints (IFPs) to file ifp_output. If not provided, fingerprints are saved at <working_path>/results/fingerprints/ifp.csv.

  • ifp_sim_matrix_output (str, optional) – If provided, compute Tanimoto similarity between interaction fingerprints (IFPs) and save the similarity matrix to ifp_sim_matrix_output.

  • out_pse (bool) – If True, depict interactions and save them as PyMOL sessions (PSE files). The default value is False. PSE files are saved at <working_path>/results/pse.

  • append_mode (bool) – If True, skip entries for which a result already exists in working_path. This can save processing time when additional entries are added to an existing project.

  • verbosity (int) – Verbosity level. The higher the verbosity level the more information is displayed. Valid values are:

    • 4: DEBUG messages;

    • 3: INFO messages (the default);

    • 2: WARNING messages;

    • 1: ERROR messages;

    • 0: CRITICAL messages.

  • logging_enabled (bool) – If True (the default), enable the logging system.

  • nproc (int) – The number of CPUs to use. The default value is the maximum number of CPUs minus 1. If nproc is smaller than 1 or greater than the number of CPUs available on your machine, it is set to its default value. If set to None, LUNA runs serially.
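
The ifp_sim_matrix_output parameter above refers to a Tanimoto similarity matrix. As a rough illustration of what such a matrix contains, the sketch below computes Tanimoto similarity for bit fingerprints represented as sets of on-bit indices. The function name, the toy fingerprints, and the convention for two empty fingerprints are assumptions of this sketch; LUNA's own computation (in particular for count fingerprints) may differ.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two sets of on-bit indices: |A & B| / |A | B|."""
    union = len(bits_a | bits_b)
    if union == 0:
        return 1.0  # convention chosen for this sketch: two empty fingerprints are identical
    return len(bits_a & bits_b) / union


# Hypothetical fingerprints keyed by entry name.
fingerprints = {
    "lig_1": {1, 5, 9, 12},
    "lig_2": {1, 5, 20},
    "lig_3": {7},
}

names = sorted(fingerprints)
# Symmetric matrix with ones on the diagonal; rows and columns follow `names`.
matrix = [[tanimoto(fingerprints[a], fingerprints[b]) for b in names] for a in names]
```

Here lig_1 and lig_2 share 2 of 5 distinct bits, so their similarity is 0.4, while lig_3 shares no bits with either and scores 0.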

Variables
  • entries (iterable of Entry)

  • working_path (str)

  • pdb_path (str)

  • overwrite_path (bool)

  • add_h (bool)

  • ph (float)

  • amend_mol (bool)

  • mol_obj_type ({'rdkit', 'openbabel'})

  • atom_prop_file (str)

  • inter_calc (InteractionCalculator)

  • binding_mode_filter (BindingModeFilter)

  • calc_mfp (bool)

  • mfp_output (str)

  • calc_ifp (bool)

  • ifp_num_levels (int)

  • ifp_radius_step (float)

  • ifp_length (int)

  • ifp_count (bool)

  • ifp_diff_comp_classes (bool)

  • ifp_type (IFPType)

  • ifp_output (str)

  • out_pse (bool)

  • out_ifp_sim_matrix (bool)

  • append_mode (bool)

  • logging_file (str) – The file where logging messages are saved.

  • version (str) – The LUNA version with which the results were generated.

  • errors (list of tuple) – Any errors found during the processing of an entry. Each tuple contains the input and the exception raised during the execution of a task with that input.

property atm_grps_mngrs

An AtomGroupsManager object for each entry.

Type

iterable of AtomGroupsManager

get_entry_results(entry)[source]

Get results for a given entry.

Parameters

entry (Entry) – An entry from entries.

Return type

EntryResults

property ifps

An interaction fingerprint (IFP) for each entry.

Type

iterable of Fingerprint

property interactions_mngrs

An InteractionsManager object for each entry.

Type

iterable of InteractionsManager

static load(pathname, verbosity=3, logging_enabled=True)[source]

Read the pickled representation of a Project object from a file or project path and return the reconstituted object hierarchy specified therein. The pathname can be a gzip-compressed file.

Parameters
  • pathname (str) – A file containing the pickled representation of a Project object or the project path (working_path) from where the pickled representation will be recovered.

  • verbosity (int) – Verbosity level. The higher the verbosity level the more information is displayed. Valid values are:

    • 4: DEBUG messages;

    • 3: INFO messages (the default);

    • 2: WARNING messages;

    • 1: ERROR messages;

    • 0: CRITICAL messages.

  • logging_enabled (bool) – If True (the default), enable the logging system.
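
The verbosity values listed above line up with the standard levels of Python's logging module. The sketch below shows one way to express that mapping; the names VERBOSITY_TO_LEVEL and level_for and the logger name are hypothetical, and raising ValueError for unlisted values is an assumption of this sketch rather than documented LUNA behaviour.

```python
import logging

# Verbosity value -> standard logging level, as listed in the documentation above.
VERBOSITY_TO_LEVEL = {
    4: logging.DEBUG,
    3: logging.INFO,
    2: logging.WARNING,
    1: logging.ERROR,
    0: logging.CRITICAL,
}


def level_for(verbosity):
    """Translate a verbosity value (0-4) into a logging level."""
    try:
        return VERBOSITY_TO_LEVEL[verbosity]
    except KeyError:
        # Assumption of this sketch: reject values outside the documented range.
        raise ValueError(f"Unsupported verbosity: {verbosity!r}") from None


logger = logging.getLogger("luna_sketch")  # hypothetical logger name
logger.setLevel(level_for(3))              # 3 is the documented default (INFO)
```

With verbosity 3, INFO and more severe messages pass the logger's threshold while DEBUG messages are filtered out.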

Raises
  • CompatibilityError – If the project version is not compatible with the current LUNA version.

  • PKLNotReadError – If the file could not be loaded.

  • IllegalArgumentError – If the provided pathname does not exist or is an invalid file/directory.

property logging_enabled

Whether the logging system is enabled.

Type

bool

property mfps

A molecular fingerprint for each entry.

Type

iterable of RDKit ExplicitBitVect or SparseBitVect

property nproc

The number of CPUs to use.

Type

int
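
The rules documented for the nproc parameter (default of maximum CPUs minus 1, fallback to the default for out-of-range values, and None for serial execution) can be summarised in a small function. This is a sketch of those rules only: resolve_nproc is a hypothetical name, and clamping the default to at least 1 on a single-CPU machine is an assumption added here.

```python
import os


def resolve_nproc(nproc, max_cpus=None):
    """Normalise a requested CPU count; None means 'run serially'."""
    if max_cpus is None:
        max_cpus = os.cpu_count() or 1
    if nproc is None:
        return None  # serial execution
    # Documented default is max CPUs - 1; the lower bound of 1 is this sketch's assumption.
    default = max(1, max_cpus - 1)
    if nproc < 1 or nproc > max_cpus:
        return default
    return nproc


# Examples on a hypothetical 8-CPU machine.
requested_ok = resolve_nproc(4, max_cpus=8)        # within range, kept as is
requested_too_many = resolve_nproc(16, max_cpus=8)  # out of range, falls back to 7
requested_serial = resolve_nproc(None, max_cpus=8)  # serial
```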

property project_file

Where the pickled representation of the LUNA project is saved.

Type

str

remove_duplicate_entries()[source]

Search for and remove duplicate entries from entries.
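
As an illustration of duplicate removal that keeps the first occurrence and preserves order, the sketch below uses plain strings as stand-ins for entries. How LUNA decides that two entries are duplicates is not stated here, so comparing entries by equality of hashable values is an assumption, as is the helper name.

```python
def remove_duplicates(entries):
    """Return entries without repeats, keeping the first occurrence of each."""
    # dict preserves insertion order, so this keeps the original ordering.
    return list(dict.fromkeys(entries))


# Hypothetical entry identifiers.
entries = ["D4:lig_1", "D4:lig_2", "D4:lig_1", "D4:lig_3", "D4:lig_2"]
unique_entries = remove_duplicates(entries)
```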

property results

LUNA results for each entry.

Type

iterable of EntryResults

run()[source]

Run LUNA. This method is not implemented in Project itself. Instead, use a class that inherits from Project and implements run(), such as LocalProject, whose run() saves results as local files.
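
The relationship between a base class that leaves run() unimplemented and a subclass that provides it is a common Python pattern. The sketch below shows the idea with hypothetical classes (BaseProject, InMemoryProject) that are unrelated to LUNA's code; raising NotImplementedError in the base class is the usual idiom and is assumed here, not taken from LUNA's source.

```python
class BaseProject:
    """Hypothetical base class: holds shared state but does not implement run()."""

    def __init__(self, entries, working_path):
        self.entries = list(entries)
        self.working_path = working_path
        self.results = {}

    def run(self):
        raise NotImplementedError("Use a subclass that implements run().")


class InMemoryProject(BaseProject):
    """Hypothetical subclass: implements run() and keeps results in memory."""

    def run(self):
        for entry in self.entries:
            # Placeholder for the real per-entry calculation.
            self.results[entry] = f"processed {entry}"
        return self.results


project = InMemoryProject(entries=["lig_1", "lig_2"], working_path="./results")
outcome = project.run()
```

Calling run() on the base class directly raises NotImplementedError, which is why a concrete subclass is needed.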

save(output_file, compressed=True)[source]

Write the pickled representation of this project to the file output_file.

Parameters
  • output_file (str) – The output file where the pickled representation will be saved.

  • compressed (bool, optional) – If True (the default), compress the pickled representation as a gzip file (.gz).

Raises

FileNotCreated – If the file could not be created.

property verbosity

Verbosity level.

Type

int

verify_pdb_files_existence()[source]

Verify that a local PDB file exists for each entry in entries. If a given PDB file is not found, LUNA will try to download it from RCSB.
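
The existence check can be pictured as a scan of the PDB directory for one file per entry. The sketch below covers only that local check and performs no download. The helper name, the <pdb_id>.pdb naming scheme, and the identifiers used in the demonstration are assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def find_missing_pdb_files(pdb_ids, pdb_path):
    """Return the PDB ids that have no '<pdb_id>.pdb' file inside pdb_path."""
    pdb_dir = Path(pdb_path)
    return [pdb_id for pdb_id in pdb_ids if not (pdb_dir / f"{pdb_id}.pdb").is_file()]


with tempfile.TemporaryDirectory() as tmp_dir:
    # Create one placeholder file so that only the second id is reported as missing.
    (Path(tmp_dir) / "D4.pdb").write_text("HEADER placeholder\n")
    missing = find_missing_pdb_files(["D4", "not_downloaded"], tmp_dir)
```

The ids returned in missing are the ones a download step would then need to fetch.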