luna.projects module
- class EntryResults(entry, atm_grps_mngr, interactions_mngr, ifp=None, mfp=None)
Bases: object
Store entry results.
- Parameters
entry (Entry) – An Entry object that represents a molecule or an entire chain.
atm_grps_mngr (AtomGroupsManager) – An AtomGroupsManager object that stores the perceived atoms and atom groups in the vicinity given by entry.
interactions_mngr (InteractionsManager) – An InteractionsManager object that stores the interactions in the vicinity given by entry.
ifp (Fingerprint, optional) – An interaction fingerprint (IFP) generated for entry.
mfp (RDKit ExplicitBitVect or SparseBitVect, optional) – A molecular fingerprint generated for entry.
- Variables
~EntryResults.entry (Entry) –
~EntryResults.atm_grps_mngr (AtomGroupsManager) –
~EntryResults.interactions_mngr (InteractionsManager) –
~EntryResults.ifp (Fingerprint) –
~EntryResults.mfp (RDKit ExplicitBitVect or SparseBitVect) –
~EntryResults.version (str) – The LUNA version with which the results were generated.
- static load(input_file)
Read the pickled representation of an EntryResults object from the file input_file and return the reconstituted object hierarchy specified therein. input_file can be a gzip-compressed file.
- Raises
PKLNotReadError – If the file could not be loaded.
- save(output_file, compressed=True)
Write the pickled representation of this object to the file output_file.
- Parameters
output_file (str) – The output file where the pickled representation will be saved.
compressed (bool, optional) – If True (the default), compress the pickled representation as a gzip file (.gz).
- Raises
FileNotCreated – If the file could not be created.
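The save/load pair described above follows the common pattern of pickling an object and optionally gzip-compressing the result. The following is a minimal sketch of that general pattern using only the Python standard library. It is not LUNA's implementation, and the helper names save_pickle and load_pickle are hypothetical.

```python
import gzip
import pickle

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def save_pickle(obj, output_file, compressed=True):
    """Pickle obj to output_file, gzip-compressing it when requested."""
    opener = gzip.open if compressed else open
    with opener(output_file, "wb") as fh:
        pickle.dump(obj, fh, pickle.HIGHEST_PROTOCOL)


def load_pickle(input_file):
    """Load a pickled object from a plain or gzip-compressed file."""
    # Peek at the header so the caller does not need to say
    # whether the file is compressed.
    with open(input_file, "rb") as fh:
        is_gzip = fh.read(2) == GZIP_MAGIC
    opener = gzip.open if is_gzip else open
    with opener(input_file, "rb") as fh:
        return pickle.load(fh)
```

As with any pickle-based format, only load files that come from a trusted source, because unpickling can execute arbitrary code.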
- class LocalProject(entries, working_path, **kwargs)
Bases: luna.projects.Project
Define a local LUNA project, i.e., results are saved locally and not to a database.
Examples
In this minimal example, we calculate protein-ligand interactions for dopamine D4 complexes.
First, define the ligand entries and initialize a new InteractionCalculator object.
>>> from luna.util.default_values import LUNA_PATH
>>> from luna.mol.entry import MolFileEntry
>>> from luna.interaction.filter import InteractionFilter
>>> from luna.interaction.calc import InteractionCalculator
>>> entries = list(MolFileEntry.from_file(input_file=f"{LUNA_PATH}/tutorial/inputs/MolEntries.txt",
...                                       pdb_id="D4", mol_file=f"{LUNA_PATH}/tutorial/inputs/ligands.mol2"))
>>> ic = InteractionCalculator(inter_filter=InteractionFilter.new_pli_filter())
Then, create the new LUNA project with the desired parameters and call run(). Here, we define the parameters first as a dict and then pass them as keyword arguments to LocalProject.
>>> from luna import LocalProject
>>> # main_path is assumed to be a user-defined base directory for the results.
>>> opts = {}
>>> opts["working_path"] = "%s/Results/Test3" % main_path
>>> opts["pdb_path"] = f"{LUNA_PATH}/tutorial/inputs/"
>>> opts["entries"] = entries
>>> opts["inter_calc"] = ic
>>> proj_obj = LocalProject(**opts)
>>> proj_obj.run()
- generate_ifps()
Generate LUNA interaction fingerprints (IFPs).
This function can be used to generate new IFPs after a project is run. Thus, you can reload your project, vary the IFP parameters (ifp_num_levels, ifp_radius_step, ifp_length, ifp_count, ifp_diff_comp_classes, ifp_type, ifp_output), and call generate_ifps to create new IFPs without having to run the project from scratch.
Examples
In the example below, we assume a LUNA project object named proj_obj already exists.
>>> from luna.interaction.fp.type import IFPType
>>> proj_obj.ifp_num_levels = 5
>>> proj_obj.ifp_radius_step = 1
>>> proj_obj.ifp_length = 4096
>>> proj_obj.ifp_type = IFPType.EIFP
>>> proj_obj.ifp_output = "EIFP-4096__length-5__radius-1.csv"
>>> proj_obj.generate_ifps()
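To see how ifp_num_levels and ifp_radius_step interact, recall that the Project parameter documentation gives the shell radius at iteration i as i * radius_step. The sketch below lists the radii under that rule. The function shell_radii is hypothetical and not part of LUNA, and it assumes that exactly num_levels iterations are performed, starting at iteration 0.

```python
def shell_radii(num_levels, radius_step):
    """Return the shell radius at each iteration, i.e. i * radius_step.

    Assumption: iterations run from 0 to num_levels - 1.
    """
    return [i * radius_step for i in range(num_levels)]


# Settings from the example above: 5 levels, radius step of 1.
radii = shell_radii(num_levels=5, radius_step=1)
```

Under this assumption, a larger radius step makes each shell grow faster, while more levels extend the outermost shell further from the central atom group.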
- class Project(entries, working_path, pdb_path='/home/docs/checkouts/readthedocs.org/user_builds/luna-toolkit/checkouts/latest/output/public/pdb', overwrite_path=False, add_h=True, ph=7.4, amend_mol=True, mol_obj_type='rdkit', atom_prop_file='/home/docs/checkouts/readthedocs.org/user_builds/luna-toolkit/checkouts/latest/luna/data/LUNA.fdef', inter_calc=None, binding_mode_filter=None, calc_mfp=False, mfp_output=None, calc_ifp=True, ifp_num_levels=2, ifp_radius_step=5.73171, ifp_length=4096, ifp_count=True, ifp_diff_comp_classes=True, ifp_type=IFPType.EIFP, ifp_output=None, ifp_sim_matrix_output=None, out_pse=False, append_mode=False, verbosity=3, logging_enabled=True, nproc=1)
Bases: object
Define a LUNA project.
Note
This class is not intended to be used directly because run() is not implemented by default. Instead, you should use a class that inherits from Project and implements run(). An example is the class LocalProject, which implements a custom run() that saves results as local files.
- Parameters
entries (iterable of Entry) – Entries determine the target molecules for which interactions and other properties will be calculated. They can be ligands, chains, etc., and can be defined in a number of ways. Each entry has an associated PDB file that may contain macromolecules (protein, RNA, DNA) and other small molecules, water, and ions. Refer to Entry for more information.
working_path (str) – Where project results will be saved.
pdb_path (str) – Path containing local PDB files or to which the PDB files will be downloaded. PDB filenames must match those defined for the entries. If not provided, the default PDB path will be used.
overwrite_path (bool) – If True, allow LUNA to overwrite any existing directory, which may remove files from a previous project. The default value is False.
add_h (bool) – Define if you need to add hydrogens or not. The default value is True.
Note
As a precaution, LUNA does not add hydrogens to NMR-solved structures or to ligands initialized from molecular files (MolFileEntry objects), as they usually already contain hydrogens.
ph (float) – Control the pH and how the hydrogens are going to be added. The default value is 7.4.
Note
As a precaution, LUNA does not modify the protonation of molecular files defined by a MolFileEntry object.
amend_mol (bool) – If True (the default), try to fix atomic charges, valence, and bond types for small molecules and residues in PDB files. Only molecules in PDB files are validated because PDB files do not contain charge, valence, and bond type information, which may cause molecules to be incorrectly perceived.
Note
Molecules from external files (MolFileEntry objects) will not be modified.
mol_obj_type ({'rdkit', 'openbabel'}) – Define which library (RDKit or Open Babel) to use to parse molecules. The default value is 'rdkit'.
atom_prop_file (str) – A feature definition file (FDef) containing all information needed to define a set of chemical or pharmacophoric features. The default value is 'LUNA.fdef', which contains the default LUNA feature definitions.
inter_calc (InteractionCalculator) – Define which and how interactions are calculated.
binding_mode_filter (BindingModeFilter) – Define how to filter interactions based on binding modes.
calc_mfp (bool) – If True, generate ECFP4 fingerprints for each entry in entries. The default value is False.
mfp_output (str) – If calc_mfp is True, save ECFP4 fingerprints to file mfp_output. If not provided, fingerprints are saved at <working_path>/results/fingerprints/mfp.csv.
calc_ifp (bool) – If True (the default), generate LUNA interaction fingerprints (IFPs) for each entry in entries.
ifp_num_levels (int) – The maximum number of iterations for fingerprint generation. The default value is 2.
ifp_radius_step (float) – The multiplier used to increase shell size at each iteration. At iteration 0, shell radius is 0 * radius_step, at iteration 1, radius is 1 * radius_step, etc. The default value is 5.73171.
ifp_length (int) – The fingerprint length (total number of bits). The default value is 4096.
ifp_count (bool) – If True (the default), create a count fingerprint (CountFingerprint). Otherwise, return a bit fingerprint (Fingerprint).
ifp_diff_comp_classes (bool) – If True (the default), differentiate between compound classes. That means structural information originating from AtomGroup objects belonging to residues, nucleotides, ligands, or water molecules will be considered different even if their structural information is the same. This is useful, for example, to differentiate protein-ligand interactions from residue-residue ones.
ifp_type (IFPType) – The fingerprint type (EIFP, FIFP, or HIFP). The default value is EIFP.
ifp_output (str) – If calc_ifp is True, save LUNA interaction fingerprints (IFPs) to file ifp_output. If not provided, fingerprints are saved at <working_path>/results/fingerprints/ifp.csv.
ifp_sim_matrix_output (str, optional) – If provided, compute Tanimoto similarity between interaction fingerprints (IFPs) and save the similarity matrix to ifp_sim_matrix_output.
out_pse (bool) – If True, depict interactions and save them as PyMOL sessions (PSE files). The default value is False. PSE files are saved at <working_path>/results/pse.
append_mode (bool) – If True, skip entries for which a result already exists in working_path. This can save processing time when additional entries are added to an existing project.
verbosity (int) – Verbosity level. The higher the verbosity level, the more information is displayed. Valid values are:
4: DEBUG messages;
3: INFO messages (the default);
2: WARNING messages;
1: ERROR messages;
0: CRITICAL messages.
logging_enabled (bool) – If True (the default), enable the logging system.
nproc (int) – The number of CPUs to use. The default value is the maximum number of CPUs - 1. If nproc is smaller than 1 or greater than the maximum number of CPUs available on your machine, then nproc is set to its default value. If you set it to None, LUNA will run serially.
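The rule described for nproc can be read as a small validation step. The sketch below is one possible reading of that rule, not LUNA's code. The function resolve_nproc is hypothetical, and flooring the default at 1 on a single-CPU machine is an assumption made here.

```python
import os


def resolve_nproc(nproc, max_cpus=None):
    """Return the number of worker processes, or None for a serial run."""
    if max_cpus is None:
        max_cpus = os.cpu_count() or 1
    if nproc is None:
        return None  # run serially
    # Documented default: maximum number of CPUs - 1.
    # Assumption: never go below 1 worker.
    default = max(max_cpus - 1, 1)
    if nproc < 1 or nproc > max_cpus:
        return default  # out-of-range values fall back to the default
    return nproc
```

For instance, on an 8-CPU machine this sketch maps nproc=0 and nproc=16 to 7, leaves nproc=4 unchanged, and keeps None as the marker for serial execution.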
- Variables
~Project.entries (iterable of Entry) –
~Project.working_path (str) –
~Project.pdb_path (str) –
~Project.overwrite_path (bool) –
~Project.add_h (bool) –
~Project.ph (float) –
~Project.amend_mol (bool) –
~Project.mol_obj_type ({'rdkit', 'openbabel'}) –
~Project.atom_prop_file (str) –
~Project.inter_calc (InteractionCalculator) –
~Project.binding_mode_filter (BindingModeFilter) –
~Project.calc_mfp (bool) –
~Project.mfp_output (str) –
~Project.calc_ifp (bool) –
~Project.ifp_num_levels (int) –
~Project.ifp_radius_step (float) –
~Project.ifp_length (int) –
~Project.ifp_count (bool) –
~Project.ifp_diff_comp_classes (bool) –
~Project.ifp_type (IFPType) –
~Project.ifp_output (str) –
~Project.out_pse (bool) –
~Project.out_ifp_sim_matrix (bool) –
~Project.append_mode (bool) –
~Project.logging_file (str) – The file where logging messages are saved.
~Project.version (str) – The LUNA version with which the results were generated.
~Project.errors (list of tuple) – Any errors found during the processing of an entry. Each tuple contains the input and the exception raised during the execution of a task with that input.
- property atm_grps_mngrs
An AtomGroupsManager object for each entry.
- Type
iterable of AtomGroupsManager
- get_entry_results(entry)
Get results for a given entry.
- Parameters
entry (Entry) – An entry from entries.
- Return type
EntryResults
- property ifps
An interaction fingerprint (IFP) for each entry.
- Type
iterable of Fingerprint
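The ifp_sim_matrix_output parameter mentions a Tanimoto similarity matrix computed between IFPs. As a rough illustration of that measure for bit fingerprints, the sketch below represents each fingerprint as a set of on-bit indices. The functions tanimoto and similarity_matrix are hypothetical, and LUNA's Fingerprint and CountFingerprint classes may compute similarity differently.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two collections of on-bit indices."""
    a, b = set(a), set(b)
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


def similarity_matrix(fps):
    """All-against-all Tanimoto similarity for a list of fingerprints."""
    return [[tanimoto(x, y) for y in fps] for x in fps]


# Three toy fingerprints, given as the indices of their on bits.
fps = [{1, 2, 3}, {2, 3, 4}, {7}]
matrix = similarity_matrix(fps)
```

The matrix is symmetric with ones on the diagonal for non-empty fingerprints, so only the upper triangle needs to be computed when the number of entries is large.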
- property interactions_mngrs
An InteractionsManager object for each entry.
- Type
iterable of InteractionsManager
- static load(pathname, verbosity=3, logging_enabled=True)
Read the pickled representation of a Project object from a file or project path and return the reconstituted object hierarchy specified therein. The pathname can be a gzip-compressed file.
- Parameters
pathname (str) – A file containing the pickled representation of a Project object or the project path (working_path) from which the pickled representation will be recovered.
verbosity (int) – Verbosity level. The higher the verbosity level, the more information is displayed. Valid values are:
4: DEBUG messages;
3: INFO messages (the default);
2: WARNING messages;
1: ERROR messages;
0: CRITICAL messages.
logging_enabled (bool) – If True (the default), enable the logging system.
- Raises
CompatibilityError – If the project version is not compatible with the current LUNA version.
PKLNotReadError – If the file could not be loaded.
IllegalArgumentError – If the provided pathname does not exist or is an invalid file/directory.
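The verbosity values listed above line up with the standard levels of Python's logging module. The sketch below shows one way such a mapping could be applied. It is an illustration rather than LUNA's logging setup, and the names VERBOSITY_TO_LEVEL and configure_logger are hypothetical.

```python
import logging

# Verbosity values as documented above, mapped to logging levels.
VERBOSITY_TO_LEVEL = {
    4: logging.DEBUG,
    3: logging.INFO,
    2: logging.WARNING,
    1: logging.ERROR,
    0: logging.CRITICAL,
}


def configure_logger(name, verbosity=3, logging_enabled=True):
    """Return a logger set to the level implied by verbosity."""
    logger = logging.getLogger(name)
    logger.disabled = not logging_enabled
    # An unknown verbosity value raises KeyError in this sketch.
    logger.setLevel(VERBOSITY_TO_LEVEL[verbosity])
    return logger
```

With this mapping, a higher verbosity selects a lower logging threshold, so more messages are emitted.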
- property mfps
A molecular fingerprint for each entry.
- Type
iterable of RDKit ExplicitBitVect or SparseBitVect
- property results
LUNA results for each entry.
- Type
iterable of EntryResults
- run()
Run LUNA. However, this method is not implemented by default. Instead, you should use a class that inherits from Project and implements run(). An example is the class LocalProject, which implements a custom run() that saves results as local files.
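The relationship between a base class with an unimplemented run() and a subclass that provides it can be sketched with toy classes. BaseProject and InMemoryProject below are hypothetical stand-ins, not LUNA classes. Raising NotImplementedError in the base class is an assumption about how "not implemented by default" might be expressed.

```python
class BaseProject:
    """Toy base class: holds the inputs but does not know how to run."""

    def __init__(self, entries, working_path):
        self.entries = list(entries)
        self.working_path = working_path
        self.results = []

    def run(self):
        # Assumption: the base class signals the missing implementation.
        raise NotImplementedError("Subclasses must implement run().")


class InMemoryProject(BaseProject):
    """Toy subclass: implements run() and keeps the results in memory."""

    def run(self):
        # Placeholder work: pair each entry with the length of its name.
        self.results = [(entry, len(entry)) for entry in self.entries]
        return self.results


proj = InMemoryProject(["ligand_1", "ligand_2"], "/tmp/example")
results = proj.run()
```

A subclass decides where the results go (memory here, local files for LocalProject), while the base class keeps the shared configuration.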
- save(output_file, compressed=True)
Write the pickled representation of this project to the file output_file.
- Parameters
output_file (str) – The output file where the pickled representation will be saved.
compressed (bool, optional) – If True (the default), compress the pickled representation as a gzip file (.gz).
- Raises
FileNotCreated – If the file could not be created.