{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Example 6.1: Calculate ASO and AEIF descriptors\n", "\n", "## Prerequisites\n", "To calculate the descriptors, we need an aligned conformer library (*e.g.*, `../00-libraries/box-aligned.clib`).\n", "\n", "## Hardware Specification for Rerun\n", "\n", "Desktop workstation with 2x AMD EPYC 7702 64-Core processors (128 physical and 256 logical cores in total), 1024 GB DDR4 RAM, running Ubuntu 22.04 LTS.\n", "\n", "## Step 1. Calculate the grid file\n", "\n", "This step does three things:\n", "1. The grid is calculated from the basic bounding box.\n", "2. The grid is then (optionally) *pruned* using a k-d tree algorithm: grid points that lie beyond a cutoff distance from the molecule are excluded when computing the interaction field.\n", "3. Finally, the atoms nearest to each grid point are located, forming an \"atom index indicator field\"." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "usage: molli grid [-h] [-o ] [-n NPROCS] [-p 0.0] [-s 1.0]\n", " [-b BATCHSIZE] [--prune [:]]\n", " [--nearest [NEAREST]] [--overwrite] [--dtype DTYPE]\n", " library\n", "\n", "Read a molli library and calculate a grid\n", "\n", "positional arguments:\n", " library Conformer library file to perform the calculations on\n", "\n", "options:\n", " -h, --help show this help message and exit\n", " -o , --output \n", " Destination for calculation results\n", " -n NPROCS, --nprocs NPROCS\n", " Specifies the number of jobs for constructing a grid\n", " -p 0.0, --padding 0.0\n", " The bounding box will be padded by this many angstroms\n", " prior to grid construction\n", " -s 1.0, --spacing 1.0\n", " Intervals at which the grid points will be placed\n", " -b BATCHSIZE, --batchsize BATCHSIZE\n", " Number of molecules to be treated simulateneously\n", " --prune [:]\n", " 
Obtain the pruning indices for each conformer ensemble\n", " --nearest [NEAREST] Obtain nearest atom indices for conformer ensembles.\n", " This is necessary for indicator field descriptors.\n", " Accepts up to 1 parameter which corresponds to the\n", " cutoff distance.\n", " --overwrite Overwrite the existing grid file\n", " --dtype DTYPE Specify the data format to be used for grid parameter\n", " storage.\n" ] } ], "source": [ "# This will display some help for the grid calculator\n", "!molli grid --help" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Using output file: box_aligned_grid.hdf5\n", "Successfully imported grid and bbox data from the previous calculation\n", "Vectors: [ -9.027 -14.043 -10.131], [16.825 14.178 15.219]. Number of grid points: 19604. Volume: 18495.13 A**3.\n", "Requested to calculate grid pruning with max_dist=2.000 eps=0.500\n", "Skipping the pruning: all keys have been found already!\n", "Nearest atoms:: 100%|█████████████████████████| 567/567 [01:53<00:00, 5.00it/s]\n" ] } ], "source": [ "# Calculate the grid with a 1.0 A step size, 32 parallel processes, 128 molecules per batch,\n", "# with pruning and nearest-atom location enabled\n", "\n", "!molli grid -s 1.0 -n 32 -b 128 box_aligned.mlib --prune --nearest" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note: The `--prune` option is necessary for the accelerated GBCA descriptor calculation, but unnecessary for the standard grid calculation." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 2. Calculation of the descriptors\n", "\n", "With the grid computed and pruned to the most useful points, we can now compute the ASO and/or AEIF descriptors."
] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "usage: molli gbca [-h] [-w] [-n 128] [-b 128] [-g ]\n", " [-o ] [--dtype DTYPE] [--overwrite]\n", " {aso,aeif} CLIB_FILE\n", "\n", "This module can be used for standalone computation of descriptors\n", "\n", "positional arguments:\n", " {aso,aeif} This selects the specific descriptor to compute.\n", " CLIB_FILE Conformer library to perform the calculation on\n", "\n", "options:\n", " -h, --help show this help message and exit\n", " -w, --weighted Apply the weights specified in the conformer files\n", " -n 128, --nprocs 128 Selects number of processors for python\n", " multiprocessing application. If the program is\n", " launched via MPI backend, this parameter is ignored.\n", " -b 128, --batchsize 128\n", " Number of conformer ensembles to be processed in one\n", " batch.\n", " -g , --grid \n", " File that contains the information about the\n", " gridpoints.\n", " -o , --output \n", " File that contains the information about the\n", " gridpoints.\n", " --dtype DTYPE Specify the data format to be used for grid parameter\n", " storage.\n", " --overwrite Overwrite the existing descriptor file\n" ] } ], "source": [ "!molli gbca --help" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "To be computed: 72542 ensembles. Skipping 0\n", "Computing descriptor ASO: 100%|███████████████| 567/567 [02:56<00:00, 3.21it/s]\n" ] } ], "source": [ "!molli gbca aso box_aligned.mlib -n 64 -b 128" ] } ], "metadata": { "kernelspec": { "display_name": "dev-blake", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.6" } }, "nbformat": 4, "nbformat_minor": 2 }