{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Libraries and Serialization\n", "\n", "These files are highly optimized binary storage formats developed for efficient constant time access of elements from large files. There are two key library formats: `MoleculeLibrary` and `ConformerLibrary`. \n", "\n", "Both library formats take advantage of a \"Lazy Loading\" approach, where the initialization of an object is avoided until needed. This prevents excessively long loading times when operating with large files. It's important to note that as of `molli 1.2`, we do not currently support an ordered library objects for speed purposes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Molecule Library\n", "\n", "This is a collection made up of `Molecule` objects serialized in a binary format. To access these, we operate with stream-like operations when doing lazy loading. Note, the library will not show the number of items it contains until a `with mlib.reading()` statement is used.\n", "\n", "### Reading Example of `MoleculeLibrary`:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "MoleculeLibrary(backend=UkvCollectionBackend('/home/blakeo2/new_molli/molli_dev/molli/molli/files/cinchonidine.mlib'), n_items=88)\n", "Here is an example name: 3_3_c_cf0\n", "Here is it's associated molecule object: Molecule(name='3_3_c_cf0', formula='C33 H37 N2 O1')\n" ] } ], "source": [ "# Imports molli\n", "import molli as ml\n", "\n", "#This is the path for an existing MoleculeLibrary\n", "mlib_path = ml.files.cinchonidine_no_conf\n", "\n", "#This instantiates the MoleculeLibrary to be read\n", "mlib = ml.MoleculeLibrary(mlib_path, overwrite=False, readonly=True)\n", "\n", "#This reads the MoleculeLibrary\n", "with mlib.reading():\n", "\n", " print(mlib)\n", "\n", " #This iterates through the names in the library\n", " for name in mlib:\n", " #This retrieves the associated Molecule Object\n", " mol = mlib[name]\n", "\n", "print(f'Here is an example name: {name}')\n", "print(f\"Here is it's associated molecule object: {mol}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Writing Example of `MoleculeLibrary`\n", "\n", "This is a slightly different, if one has an existing molecule object they would like to serialize, the syntax is as follows:\n", "\n", "```python\n", "#This loads a molecule of interest\n", "mol = ml.load(ml.files.dendrobine_mol2)\n", "\n", "#This instantiates a MoleculeLibrary to be written to\n", "new_mlib = ml.MoleculeLibrary('New Path', overwrite=False, readonly=False)\n", "\n", "#This prepares the molecule library\n", "with new_mlib.writing():\n", " \n", " #This serializes the molecule to the MoleculeLibrary using the existing name as a retrieval\n", " new_mlib[mol.name] = new_mlib\n", "\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Reading and Writing Multiple Libraries\n", "\n", "In the event that you have a library you would like to do an operation with then serialize, this can be written at the same time\n", "\n", "```python\n", "#This is the path for an existing MoleculeLibrary\n", "mlib_path = ml.files.cinchonidine_no_conf\n", "\n", "#This instantiates the MoleculeLibrary to be read\n", "mlib = ml.MoleculeLibrary(mlib_path, overwrite=False, readonly=True)\n", "\n", "#This instantiates a MoleculeLibrary to be written to\n", "new_mlib = ml.MoleculeLibrary('New Path', overwrite=False, readonly=False)\n", "\n", "#This prepares the molecule libraries\n", "\n", "with mlib.reading(), new_mlib.writing():\n", " \n", " #This iterates through the serialized mlib\n", " for name in mlib:\n", " #This instantiates the Molecule Object\n", " mol = mlib[name]\n", "\n", " #This translates the molecule 50 units in the x direction\n", " mol.translate([50,0,0])\n", "\n", " #This serializes this into a new molecule library\n", " new_mlib[name] = mol\n", "```\n", "\n", "This can be done with as many molecule libraries as desired, allowing unique serialization method implementations." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## `ConformerLibrary`\n", "\n", "These have the exact same functionality and syntax as the `MoleculeLibraries`, with the only difference being these are made up of `ConformerEnsemble` objects serialized in a binary format. Since the syntax is the same, this notebook will only give a reading example, but the same writing, and multi-reading and writing functionality exists\n", "\n", "### Reading Example of `ConformerLibrary`" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "ConformerLibrary(backend=UkvCollectionBackend('/home/blakeo2/new_molli/molli_dev/molli/molli/files/cinchonidine_rdconfs.clib'), n_items=88)\n", "Here is an example name: 3_6_c\n", "Here is it's associated ConformerEnsemble object: ConformerEnsemble(name='3_6_c', formula='C33 H34 F3 N2 O1', n_conformers=200)\n" ] } ], "source": [ "#This is the path for an existing ConformerLibrary\n", "clib_path = ml.files.cinchonidine_rd_conf\n", "\n", "#This instantiates the MoleculeLibrary to be read\n", "clib = ml.ConformerLibrary(clib_path, overwrite=False, readonly=True)\n", "\n", "#This reads the MoleculeLibrary\n", "with clib.reading():\n", "\n", " print(clib)\n", "\n", " #This iterates through the names in the library\n", " for name in clib:\n", " #This retrieves the associated ConformerEnsemble Object\n", " ens = clib[name]\n", "\n", "print(f'Here is an example name: {name}')\n", "print(f\"Here is it's associated ConformerEnsemble object: {ens}\")" ] } ], "metadata": { "kernelspec": { "display_name": "dev-blake", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.6" }, "orig_nbformat": 4 }, "nbformat": 4, "nbformat_minor": 2 }