molli map script
Used to apply a function defined in a standalone Python file to every item in a library, with multiprocessing parallelization. This is a newer approach to running embarrassingly parallel loops that requires far less code and produces more legible workflows.
[3]:
!molli map --help
usage: molli combine [-h] -t <lib> [-n 1] [-b 1] [-o <combined.mlib>]
[--overwrite]
script
Read a molli library and perform some basic inspections
positional arguments:
script This is a python file that defines a molli_main
function
options:
-h, --help show this help message and exit
-t <lib>, --target <lib>
Target library that the function is going to be
applied to
-n 1, --nprocs 1 Number of processes to be used in parallel
-b 1, --batchsize 1 Number of molecules to be processed at a time on a
single core
-o <combined.mlib>, --output <combined.mlib>
Output library
--overwrite Overwrite the target files if they exist (default is
false)
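To make the roles of -n/--nprocs and -b/--batchsize more concrete, here is a minimal sketch of the general pattern: split the items into batches and hand each batch to a worker process. It is not molli's actual implementation. The names `batched`, `_run_batch` and `apply_to_library` are hypothetical, and a plain dict stands in for a library.

```python
from itertools import islice
from multiprocessing import Pool
from typing import Callable, Iterable, Iterator


def batched(items: Iterable, batchsize: int) -> Iterator[list]:
    """Yield consecutive lists of at most `batchsize` items."""
    it = iter(items)
    while batch := list(islice(it, batchsize)):
        yield batch


def _run_batch(args):
    # Worker entry point: apply `func` to every (name, item) pair of one batch.
    func, batch = args
    return [(name, func(item)) for name, item in batch]


def apply_to_library(
    func: Callable, library: dict, nprocs: int = 1, batchsize: int = 1
) -> dict:
    """Apply `func` to every value of `library` (hypothetical stand-in: a dict)."""
    jobs = ((func, batch) for batch in batched(library.items(), batchsize))

    if nprocs == 1:
        # Serial fallback: same code path, no worker processes.
        return {name: out for chunk in map(_run_batch, jobs) for name, out in chunk}

    # `func` must be picklable (a module-level function) to reach the workers.
    with Pool(nprocs) as pool:
        # Results arrive in completion order, not in input order.
        results = pool.imap_unordered(_run_batch, jobs)
        return {name: out for chunk in results for name, out in chunk}
```

In this sketch a larger batch size means fewer, larger tasks per worker, which reduces inter-process overhead when each item is cheap to process. When calling it with `nprocs` greater than 1 from a script, place the call under an `if __name__ == "__main__":` guard.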
The key input is the positional argument script. This file can be stored anywhere on the file system; it does not need to be in the molli script folder, which makes it easier to reuse.
An example of this file would be the following script designed for conformer generation using RDKit:
File: ../examples-scripts/004_rdkit_confs.py
import molli as ml
from molli.external.openbabel import obabel_optimize
from rdkit import Chem
from rdkit.Chem import rdDepictor, rdDistGeom
from rdkit import RDLogger

# Library types used for the target (input) and output files
IN_CTYPE = ml.MoleculeLibrary
OUT_CTYPE = ml.ConformerLibrary

# Number of conformers requested per molecule (RMS pruning may leave fewer)
N_CONFS = 20


def main(mol: ml.Molecule) -> ml.ConformerEnsemble:
    # Convert the molli molecule to an RDKit molecule via a mol2 block
    rdmol = Chem.MolFromMol2Block(
        ml.dumps(mol, "mol2"),
        removeHs=False,
    )
    if rdmol is None:
        # Raise (rather than return) the error so the failure is not silent
        raise RuntimeError(f"Cannot create an rdkit mol from {mol}")

    rdDistGeom.EmbedMolecule(rdmol)
    rdDistGeom.EmbedMultipleConfs(rdmol, N_CONFS, pruneRmsThresh=0.3)

    # Copy the RDKit conformer coordinates into a molli ensemble
    ens = ml.ConformerEnsemble(mol, n_conformers=rdmol.GetNumConformers())
    for i, conf in enumerate(rdmol.GetConformers()):
        ens.coords[i] = conf.GetPositions()

    return ens
A few parts of this script are important:
- IN_CTYPE and OUT_CTYPE indicate which library types are appropriate for the input and output files, respectively.
- The main function should take one argument of a type compatible with IN_CTYPE and return one object compatible with OUT_CTYPE.
- The remaining dependencies must be installed in the same pip/conda environment as the current molli version.
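The contract above can be checked before any work is dispatched. The sketch below shows one way a tool could import a script from an arbitrary path and verify that it defines the expected names. It uses only the standard library and is an illustration under stated assumptions, not the way molli itself loads scripts; `load_script` and `REQUIRED_NAMES` are hypothetical names.

```python
import importlib.util
from pathlib import Path
from types import ModuleType

# Names the script is expected to define, as described in the list above.
REQUIRED_NAMES = ("IN_CTYPE", "OUT_CTYPE", "main")


def load_script(path: str) -> ModuleType:
    """Import a Python file from an arbitrary path and check the expected names."""
    file = Path(path)
    spec = importlib.util.spec_from_file_location(file.stem, file)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot import {file}")

    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the script's top-level code

    missing = [name for name in REQUIRED_NAMES if not hasattr(module, name)]
    if missing:
        raise AttributeError(f"{file} does not define: {', '.join(missing)}")
    if not callable(module.main):
        raise TypeError(f"`main` in {file} is not callable")
    return module
```

Because the file is located by path rather than by package name, it can live anywhere on the file system. Its own imports, however, are still resolved in the current environment, which is why the script's dependencies must be installed alongside molli.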
[5]:
!molli map ../examples-scripts/004_rdkit_confs.py -t ../molli/files/cinchonidine.mlib -n 4 -o ../misc/output.clib
0%| | 0/88 [00:00<?, ?it/s]
100%|██████████| 88/88 [01:16<00:00, 1.15it/s]
Now we can inspect the output file:
[6]:
import molli as ml
ml.visual.configure()
[7]:
!molli ls ../misc/output.clib
3_13_c_cf0
11_1_c_cf0
6_1_c_cf0
3_1_c_cf0
1_4_c_cf0
4_3_c_cf0
6_6_c_cf0
9_3_c_cf0
5_5_c_cf0
2_3_c_cf0
5_3_c_cf0
7_4_c_cf0
9_7_c_cf0
7_7_c_cf0
1_13_c_cf0
7_6_c_cf0
8_7_c_cf0
1_6_c_cf0
3_7_c_cf0
10_4_c_cf0
1_7_c_cf0
2_5_c_cf0
11_4_c_cf0
10_1_c_cf0
4_7_c_cf0
8_3_c_cf0
8_12_c_cf0
6_12_c_cf0
2_4_c_cf0
6_13_c_cf0
2_1_c_cf0
3_5_c_cf0
3_12_c_cf0
7_12_c_cf0
5_13_c_cf0
10_3_c_cf0
10_5_c_cf0
11_5_c_cf0
2_13_c_cf0
5_7_c_cf0
3_3_c_cf0
10_12_c_cf0
9_4_c_cf0
5_6_c_cf0
9_12_c_cf0
9_6_c_cf0
7_3_c_cf0
9_5_c_cf0
10_6_c_cf0
10_13_c_cf0
5_4_c_cf0
3_4_c_cf0
6_4_c_cf0
2_6_c_cf0
7_13_c_cf0
6_3_c_cf0
8_4_c_cf0
11_13_c_cf0
2_12_c_cf0
4_12_c_cf0
9_13_c_cf0
4_4_c_cf0
11_7_c_cf0
4_1_c_cf0
7_1_c_cf0
8_5_c_cf0
4_5_c_cf0
4_6_c_cf0
1_1_c_cf0
10_7_c_cf0
7_5_c_cf0
9_1_c_cf0
2_7_c_cf0
3_6_c_cf0
11_6_c_cf0
11_12_c_cf0
8_13_c_cf0
4_13_c_cf0
1_12_c_cf0
11_3_c_cf0
1_5_c_cf0
6_5_c_cf0
1_3_c_cf0
8_1_c_cf0
5_1_c_cf0
8_6_c_cf0
6_7_c_cf0
5_12_c_cf0
[8]:
%clib_view ../misc/output.clib 3_13_c_cf0