Molecule and Atom Introduction

This tutorial illustrates some of the properties and functionality of the Molecule class. Individual pieces of a molecule can be accessed and operated on, or the molecule can be operated on as a whole. Let’s start with how to work with some of a molecule’s properties.

[1]:
# Imports necessary packages and checks the version of molli.
import molli as ml
ml.aux.assert_molli_version_min("1.0a")

Molecule

A Molecule can be loaded quickly from a file path or from a string.

[2]:
#Loads in a molecule
mol1 = ml.load(ml.files.dendrobine_mol2)
mol1
[2]:
Molecule(name='dendrobine', formula='C16 H25 N1 O2')

An existing Molecule object can be copied as is, and its coordinates can be manipulated and combined with those of other molecules. The example below illustrates this.

[3]:
#This makes a copy of the original Molecule object and all attributes
mol2 = ml.Molecule(mol1)

#This translates the copy by 50 units along the x axis
mol2.translate([50,0,0])

#This creates a new molecule that combines the original molecule and the new molecule.
print((mol1 | mol2).dumps_xyz())
88
unknown
N         1.296000    -0.231900     1.267000
C         0.057300    -0.022600     2.122700
C        -1.097400    -0.473800     1.205900
C        -0.428400    -0.411300    -0.168700
C         0.868300     0.359800    -0.000400
C         2.456200     0.441700     1.836100
C        -2.432800     0.265900     0.983200
C        -2.655300     0.247200    -0.580100
C        -1.242800     0.562600    -1.016300
C         1.535300     0.579000    -1.417900
O         1.341000     2.036300    -1.650700
C         0.078200     2.229100    -2.216600
C        -0.597700     0.853900    -2.402000
O        -0.413500     3.318000    -2.455200
C         0.760200     0.167800    -2.700100
C         0.781200    -1.214100    -3.377300
H         1.242600     0.768400    -3.498200
H        -1.179400     1.558000    -0.537700
H         0.651600     1.395200     0.317400
H        -1.372200    -1.491000     1.513400
C        -0.282800    -1.883500    -0.595700
C         2.177500    -1.841200    -3.321500
C         0.348500    -1.100600    -4.851400
H         0.059000    -1.900900    -2.947800
H        -0.064900     1.026400     2.419800
H         0.116200    -0.640800     3.024400
H         2.715000    -0.004900     2.801900
H         2.292700     1.514700     1.987400
H         3.325500     0.316200     1.181800
H        -2.371600     1.296200     1.353400
H        -3.263900    -0.226100     1.498200
H        -3.381000     1.006900    -0.883600
H        -2.993400    -0.735800    -0.922200
H         2.608800     0.379800    -1.439400
H        -1.292900     0.877200    -3.242300
H        -0.307500    -2.585400     0.247500
H         0.692000    -2.089000    -1.031900
H        -1.094300    -2.190400    -1.261400
H         2.179500    -2.824100    -3.804900
H         2.517600    -1.982200    -2.291400
H         2.914300    -1.213500    -3.834300
H         0.352200    -2.085900    -5.330700
H         1.023200    -0.452000    -5.421000
H        -0.665200    -0.696700    -4.934600
N        51.296000    -0.231900     1.267000
C        50.057300    -0.022600     2.122700
C        48.902600    -0.473800     1.205900
C        49.571600    -0.411300    -0.168700
C        50.868300     0.359800    -0.000400
C        52.456200     0.441700     1.836100
C        47.567200     0.265900     0.983200
C        47.344700     0.247200    -0.580100
C        48.757200     0.562600    -1.016300
C        51.535300     0.579000    -1.417900
O        51.341000     2.036300    -1.650700
C        50.078200     2.229100    -2.216600
C        49.402300     0.853900    -2.402000
O        49.586500     3.318000    -2.455200
C        50.760200     0.167800    -2.700100
C        50.781200    -1.214100    -3.377300
H        51.242600     0.768400    -3.498200
H        48.820600     1.558000    -0.537700
H        50.651600     1.395200     0.317400
H        48.627800    -1.491000     1.513400
C        49.717200    -1.883500    -0.595700
C        52.177500    -1.841200    -3.321500
C        50.348500    -1.100600    -4.851400
H        50.059000    -1.900900    -2.947800
H        49.935100     1.026400     2.419800
H        50.116200    -0.640800     3.024400
H        52.715000    -0.004900     2.801900
H        52.292700     1.514700     1.987400
H        53.325500     0.316200     1.181800
H        47.628400     1.296200     1.353400
H        46.736100    -0.226100     1.498200
H        46.619000     1.006900    -0.883600
H        47.006600    -0.735800    -0.922200
H        52.608800     0.379800    -1.439400
H        48.707100     0.877200    -3.242300
H        49.692500    -2.585400     0.247500
H        50.692000    -2.089000    -1.031900
H        48.905700    -2.190400    -1.261400
H        52.179500    -2.824100    -3.804900
H        52.517600    -1.982200    -2.291400
H        52.914300    -1.213500    -3.834300
H        50.352200    -2.085900    -5.330700
H        51.023200    -0.452000    -5.421000
H        49.334800    -0.696700    -4.934600
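To help read the output above: an XYZ dump starts with the atom count (88 here, twice the 44 atoms of dendrobine), then a comment line, then one line per atom. The following is a minimal sketch in plain Python of what translating a copy and merging two structures amounts to. It is not molli's implementation: `translate`, `combine` and `dumps_xyz` below are hypothetical stand-alone helpers, the column widths are arbitrary, and unlike `mol2.translate(...)` in the cell above, this `translate` returns a new list instead of modifying its input.

```python
# Illustrative only: plain-Python stand-ins, not molli's own code.
# An "atom" here is just (symbol, (x, y, z)).


def translate(atoms, shift):
    """Return a shifted copy of the atoms; the input list is left unchanged."""
    dx, dy, dz = shift
    return [(sym, (x + dx, y + dy, z + dz)) for sym, (x, y, z) in atoms]


def combine(first, second):
    """Merge two structures by concatenating their atom lists."""
    return first + second


def dumps_xyz(atoms, comment="unknown"):
    """Serialize to XYZ text: atom count, comment line, one line per atom."""
    lines = [str(len(atoms)), comment]
    for sym, (x, y, z) in atoms:
        lines.append(f"{sym:<2}{x:>16.6f}{y:>12.6f}{z:>12.6f}")
    return "\n".join(lines)


# First two atoms of the dendrobine output above, used as a tiny fragment.
fragment = [
    ("N", (1.2960, -0.2319, 1.2670)),
    ("C", (0.0573, -0.0226, 2.1227)),
]

merged = combine(fragment, translate(fragment, (50.0, 0.0, 0.0)))
print(dumps_xyz(merged))
```

The merged structure lists the original atoms first and the shifted copy second, which is the same ordering visible in the 88-atom output above.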

Each Molecule object is made up of a few key properties:

  • A list of Atom objects

  • A list of Bond objects

  • A numpy array of XYZ (3D) coordinates of atoms

  • A dict of attributes

[4]:
print(f'Here are the list of atoms\n{mol1.atoms}\n')
print(f'Here are the list of bonds\n{mol1.bonds}')
print(f'Here are the coordinates of the atoms\n{mol1.coords}')
print(f'Here are the empty dictionary of attributes\n{mol1.attrib}')
Here are the list of atoms
[Atom(element=N, isotope=None, label='N', formal_charge=0, formal_spin=0), Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0), Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0), Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0), Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0), Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0), Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0), Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0), Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0), Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0), Atom(element=O, isotope=None, label='O', formal_charge=0, formal_spin=0), Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0), Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0), Atom(element=O, isotope=None, label='O', formal_charge=0, formal_spin=0), Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0), Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0), Atom(element=H, isotope=None, label='H', formal_charge=0, formal_spin=0), Atom(element=H, isotope=None, label='H', formal_charge=0, formal_spin=0), Atom(element=H, isotope=None, label='H', formal_charge=0, formal_spin=0), Atom(element=H, isotope=None, label='H', formal_charge=0, formal_spin=0), Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0), Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0), Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0), Atom(element=H, isotope=None, label='H', formal_charge=0, formal_spin=0), Atom(element=H, isotope=None, label='H', formal_charge=0, formal_spin=0), Atom(element=H, isotope=None, label='H', formal_charge=0, formal_spin=0), Atom(element=H, isotope=None, label='H', formal_charge=0, formal_spin=0), 
Atom(element=H, isotope=None, label='H', formal_charge=0, formal_spin=0), Atom(element=H, isotope=None, label='H', formal_charge=0, formal_spin=0), Atom(element=H, isotope=None, label='H', formal_charge=0, formal_spin=0), Atom(element=H, isotope=None, label='H', formal_charge=0, formal_spin=0), Atom(element=H, isotope=None, label='H', formal_charge=0, formal_spin=0), Atom(element=H, isotope=None, label='H', formal_charge=0, formal_spin=0), Atom(element=H, isotope=None, label='H', formal_charge=0, formal_spin=0), Atom(element=H, isotope=None, label='H', formal_charge=0, formal_spin=0), Atom(element=H, isotope=None, label='H', formal_charge=0, formal_spin=0), Atom(element=H, isotope=None, label='H', formal_charge=0, formal_spin=0), Atom(element=H, isotope=None, label='H', formal_charge=0, formal_spin=0), Atom(element=H, isotope=None, label='H', formal_charge=0, formal_spin=0), Atom(element=H, isotope=None, label='H', formal_charge=0, formal_spin=0), Atom(element=H, isotope=None, label='H', formal_charge=0, formal_spin=0), Atom(element=H, isotope=None, label='H', formal_charge=0, formal_spin=0), Atom(element=H, isotope=None, label='H', formal_charge=0, formal_spin=0), Atom(element=H, isotope=None, label='H', formal_charge=0, formal_spin=0)]

Here are the list of bonds
[Bond(a1=42, a2=22, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=41, a2=22, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=43, a2=22, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=22, a2=15, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=40, a2=21, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=38, a2=21, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=16, a2=14, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=15, a2=21, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=15, a2=23, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=15, a2=14, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=21, a2=39, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=34, a2=12, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=14, a2=12, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=14, a2=9, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=13, a2=11, label=None, btype=Double, stereo=Unknown, f_order=1.0), Bond(a1=12, a2=11, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=12, a2=8, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=11, a2=10, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=10, a2=9, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=33, a2=9, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=9, a2=4, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=37, a2=20, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=36, a2=20, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=8, a2=7, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=8, a2=17, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=8, a2=3, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=32, a2=7, label=None, btype=Single, stereo=Unknown, 
f_order=1.0), Bond(a1=31, a2=7, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=20, a2=3, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=20, a2=35, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=7, a2=6, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=3, a2=4, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=3, a2=2, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=4, a2=18, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=4, a2=Atom(element=N, isotope=None, label='N', formal_charge=0, formal_spin=0), label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=6, a2=2, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=6, a2=29, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=6, a2=30, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=28, a2=5, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=2, a2=19, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=2, a2=1, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=Atom(element=N, isotope=None, label='N', formal_charge=0, formal_spin=0), a2=5, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=Atom(element=N, isotope=None, label='N', formal_charge=0, formal_spin=0), a2=1, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=5, a2=27, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=5, a2=26, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=1, a2=24, label=None, btype=Single, stereo=Unknown, f_order=1.0), Bond(a1=1, a2=25, label=None, btype=Single, stereo=Unknown, f_order=1.0)]
Here are the coordinates of the atoms
[[ 1.2960e+00 -2.3190e-01  1.2670e+00]
 [ 5.7300e-02 -2.2600e-02  2.1227e+00]
 [-1.0974e+00 -4.7380e-01  1.2059e+00]
 [-4.2840e-01 -4.1130e-01 -1.6870e-01]
 [ 8.6830e-01  3.5980e-01 -4.0000e-04]
 [ 2.4562e+00  4.4170e-01  1.8361e+00]
 [-2.4328e+00  2.6590e-01  9.8320e-01]
 [-2.6553e+00  2.4720e-01 -5.8010e-01]
 [-1.2428e+00  5.6260e-01 -1.0163e+00]
 [ 1.5353e+00  5.7900e-01 -1.4179e+00]
 [ 1.3410e+00  2.0363e+00 -1.6507e+00]
 [ 7.8200e-02  2.2291e+00 -2.2166e+00]
 [-5.9770e-01  8.5390e-01 -2.4020e+00]
 [-4.1350e-01  3.3180e+00 -2.4552e+00]
 [ 7.6020e-01  1.6780e-01 -2.7001e+00]
 [ 7.8120e-01 -1.2141e+00 -3.3773e+00]
 [ 1.2426e+00  7.6840e-01 -3.4982e+00]
 [-1.1794e+00  1.5580e+00 -5.3770e-01]
 [ 6.5160e-01  1.3952e+00  3.1740e-01]
 [-1.3722e+00 -1.4910e+00  1.5134e+00]
 [-2.8280e-01 -1.8835e+00 -5.9570e-01]
 [ 2.1775e+00 -1.8412e+00 -3.3215e+00]
 [ 3.4850e-01 -1.1006e+00 -4.8514e+00]
 [ 5.9000e-02 -1.9009e+00 -2.9478e+00]
 [-6.4900e-02  1.0264e+00  2.4198e+00]
 [ 1.1620e-01 -6.4080e-01  3.0244e+00]
 [ 2.7150e+00 -4.9000e-03  2.8019e+00]
 [ 2.2927e+00  1.5147e+00  1.9874e+00]
 [ 3.3255e+00  3.1620e-01  1.1818e+00]
 [-2.3716e+00  1.2962e+00  1.3534e+00]
 [-3.2639e+00 -2.2610e-01  1.4982e+00]
 [-3.3810e+00  1.0069e+00 -8.8360e-01]
 [-2.9934e+00 -7.3580e-01 -9.2220e-01]
 [ 2.6088e+00  3.7980e-01 -1.4394e+00]
 [-1.2929e+00  8.7720e-01 -3.2423e+00]
 [-3.0750e-01 -2.5854e+00  2.4750e-01]
 [ 6.9200e-01 -2.0890e+00 -1.0319e+00]
 [-1.0943e+00 -2.1904e+00 -1.2614e+00]
 [ 2.1795e+00 -2.8241e+00 -3.8049e+00]
 [ 2.5176e+00 -1.9822e+00 -2.2914e+00]
 [ 2.9143e+00 -1.2135e+00 -3.8343e+00]
 [ 3.5220e-01 -2.0859e+00 -5.3307e+00]
 [ 1.0232e+00 -4.5200e-01 -5.4210e+00]
 [-6.6520e-01 -6.9670e-01 -4.9346e+00]]
Here are the empty dictionary of attributes
{}
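The four properties printed above can be pictured as parallel containers: judging from the output, row `i` of the coordinate array belongs to atom `i` in the atom list, and bonds refer to pairs of atoms. The sketch below shows that layout with standard-library dataclasses. The `SketchAtom`, `SketchBond` and `SketchMolecule` names are hypothetical and exist only for this illustration; molli's real classes carry more fields and store coordinates in a NumPy array.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)  # identity-based equality: two carbons are distinct atoms
class SketchAtom:
    element: str
    label: Optional[str] = None


@dataclass(eq=False)
class SketchBond:
    a1: SketchAtom
    a2: SketchAtom
    order: float = 1.0


@dataclass
class SketchMolecule:
    atoms: list = field(default_factory=list)   # list of SketchAtom
    bonds: list = field(default_factory=list)   # list of SketchBond
    coords: list = field(default_factory=list)  # one (x, y, z) tuple per atom
    attrib: dict = field(default_factory=dict)  # free-form metadata

    def add_atom(self, atom, xyz):
        """Append an atom and its coordinates together so indices stay aligned."""
        self.atoms.append(atom)
        self.coords.append(xyz)


mol = SketchMolecule()
n = SketchAtom("N")
c = SketchAtom("C")
mol.add_atom(n, (1.2960, -0.2319, 1.2670))
mol.add_atom(c, (0.0573, -0.0226, 2.1227))
mol.bonds.append(SketchBond(n, c))

# The position of an atom in the list is also the row of its coordinates.
print(mol.coords[mol.atoms.index(c)])
```

Keeping atoms and coordinates in separate, index-aligned containers is what lets a whole-molecule operation such as a translation act on the coordinates alone, without touching the atom objects.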

Atom class

Individual atoms can be retrieved in a few different ways from the molecule:

Atoms can be retrieved via an index

[5]:
ex_atom1 = mol1.get_atom(2)
ex_atom1
[5]:
Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0)

Atoms can be retrieved via an element

[6]:
ex_atom2 = mol1.get_atom(ml.Element.N)
ex_atom2
[6]:
Atom(element=N, isotope=None, label='N', formal_charge=0, formal_spin=0)

Atoms can also be retrieved via an assigned label

[7]:
ex_atom2.label = "Important"
ex_atom3 = mol1.get_atom("Important")
print(f'Example atom 3: {ex_atom3}')
print(f'Is example atom 2 = example atom 3? --> {ex_atom2 == ex_atom3}')
Example atom 3: Atom(element=N, isotope=None, label='Important', formal_charge=0, formal_spin=0)
Is example atom 2 = example atom 3? --> True
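The three lookups above all go through one method that accepts an index, an element, or a label. One way such a dispatch could be written is sketched below in plain Python. Everything here (`Element`, `SketchAtom`, `get_atom`) is a hypothetical stand-in, not molli's code. In particular, this sketch returns the first match; how molli handles a key that matches several atoms is not shown in this tutorial, so consult the library documentation for that case.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Element(Enum):
    C = 6
    N = 7
    O = 8


@dataclass(eq=False)  # atoms compare by identity
class SketchAtom:
    element: Element
    label: Optional[str] = None


def get_atom(atoms, key):
    """Look up one atom by list index, by element, or by label."""
    if isinstance(key, int):
        return atoms[key]
    if isinstance(key, Element):
        matches = (a for a in atoms if a.element is key)
    elif isinstance(key, str):
        matches = (a for a in atoms if a.label == key)
    else:
        raise TypeError(f"unsupported key type: {type(key).__name__}")
    try:
        return next(matches)  # assumption of this sketch: first match wins
    except StopIteration:
        raise KeyError(key) from None


atoms = [SketchAtom(Element.N), SketchAtom(Element.C), SketchAtom(Element.C)]
by_element = get_atom(atoms, Element.N)
by_element.label = "Important"
by_label = get_atom(atoms, "Important")
print(by_element is by_label)
```

Because the lookup hands back the stored object rather than a copy, assigning a label through one reference is visible through every other reference, which is why the comparison in the cell above evaluates to True.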

Atoms also have a few other important properties, which as_dict() lists:

[8]:
ex_atom2.as_dict()
[8]:
{'element': N,
 'isotope': None,
 'label': 'Important',
 'atype': <AtomType.sp3: 31>,
 'stereo': <AtomStereo.Unknown: 0>,
 'geom': <AtomGeom.Unknown: 0>,
 'formal_charge': 0,
 'formal_spin': 0,
 'attrib': {},
 '_parent': <weakref at 0x7f9a5bfac900; to 'Molecule' at 0x7f99bb853d10>}

Some other important functionality exists to allow quick retrieval of atoms and information regarding atoms.

This example retrieves a tuple of atoms

[9]:
atoms_of_interest = mol1.get_atoms(7,8,9,10)
atoms_of_interest
[9]:
(Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0),
 Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0),
 Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0),
 Atom(element=O, isotope=None, label='O', formal_charge=0, formal_spin=0))

This example retrieves all atoms that are carbon

[10]:
all_carbon_atoms = mol1.get_atoms(*mol1.yield_atoms_by_element(ml.Element.C))
all_carbon_atoms
[10]:
(Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0),
 Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0),
 Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0),
 Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0),
 Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0),
 Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0),
 Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0),
 Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0),
 Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0),
 Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0),
 Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0),
 Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0),
 Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0),
 Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0),
 Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0),
 Atom(element=C, isotope=None, label='C', formal_charge=0, formal_spin=0))

This example retrieves all atoms that contain the label “Important”

[11]:
mol1.get_atoms(*mol1.yield_atoms_by_label("Important"))
[11]:
(Atom(element=N, isotope=None, label='Important', formal_charge=0, formal_spin=0),)
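A note on the `*` in the two calls above: the `yield_atoms_by_*` helpers produce atoms one at a time, while `get_atoms` takes its keys as separate positional arguments, so the star unpacks one into the other. The plain-Python sketch below shows the same pattern with hypothetical helpers (`yield_by_element`, `get_many`) that stand in for the molli methods and make no claim about how those are implemented.

```python
def yield_by_element(atoms, element):
    """Lazily yield the atoms whose element matches."""
    for atom in atoms:
        if atom["element"] == element:
            yield atom


def get_many(*keys):
    """Collect any number of positional arguments into a tuple."""
    return tuple(keys)


atoms = [{"element": "N"}, {"element": "C"}, {"element": "C"}, {"element": "O"}]

# With the star, each yielded atom becomes its own argument.
carbons = get_many(*yield_by_element(atoms, "C"))

# Without it, the generator object itself is passed as a single argument.
wrapped = get_many(yield_by_element(atoms, "C"))

print(len(carbons), len(wrapped))
```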

This retrieves the index of an atom

[12]:
mol1.get_atom_index(ex_atom2)
[12]:
0

This retrieves the indices of atoms of interest

[13]:
mol1.get_atom_indices(*atoms_of_interest)
[13]:
(7, 8, 9, 10)

In addition, atoms can be instantiated entirely independently of a Molecule object

[14]:
# Creates two silicon-29 atoms: a has a tetrahedral geometry, b is typed as an attachment point
a = ml.Atom("Si", isotope=29, geom=ml.AtomGeom.R4_Tetrahedral)
b = ml.Atom("Si", isotope=29, atype=ml.AtomType.AttachmentPoint)
print(a)
b.is_attachment_point
Atom(element=Si, isotope=29, label=None, formal_charge=0, formal_spin=0)
[14]:
True

The elements of atoms can also be rapidly changed

[15]:
a.element = 83
a
[15]:
Atom(element=Bi, isotope=29, label=None, formal_charge=0, formal_spin=0)

Each Atom also carries an attrib dictionary, so values of common Python data types can be attached to it with ordinary dictionary syntax.

[16]:
print('Current attribute list for Example Atom A')
a.attrib["List"] = [0,1,2]
a.attrib["Tuple"] = ('a','b','c')
a.attrib["Dictionary"] = {"Data": (0.1, 0.2, 0.3)}

a.attrib
Current attribute list for Example Atom A
[16]:
{'List': [0, 1, 2],
 'Tuple': ('a', 'b', 'c'),
 'Dictionary': {'Data': (0.1, 0.2, 0.3)}}

The coordinates of a single atom or of a set of atoms can be retrieved by passing the atoms to the molecule

[17]:
atom1_coord = mol1.get_atom_coord(ex_atom1)
print(f'Example Atom 1 Coordinates: {atom1_coord}')

all_carbon_subset = mol1.coord_subset(all_carbon_atoms)
print(f'All Carbon Atom subset:\n{all_carbon_subset}')
Example Atom 1 Coordinates: [-1.0974 -0.4738  1.2059]
All Carbon Atom subset:
[[ 5.7300e-02 -2.2600e-02  2.1227e+00]
 [-1.0974e+00 -4.7380e-01  1.2059e+00]
 [-4.2840e-01 -4.1130e-01 -1.6870e-01]
 [ 8.6830e-01  3.5980e-01 -4.0000e-04]
 [ 2.4562e+00  4.4170e-01  1.8361e+00]
 [-2.4328e+00  2.6590e-01  9.8320e-01]
 [-2.6553e+00  2.4720e-01 -5.8010e-01]
 [-1.2428e+00  5.6260e-01 -1.0163e+00]
 [ 1.5353e+00  5.7900e-01 -1.4179e+00]
 [ 7.8200e-02  2.2291e+00 -2.2166e+00]
 [-5.9770e-01  8.5390e-01 -2.4020e+00]
 [ 7.6020e-01  1.6780e-01 -2.7001e+00]
 [ 7.8120e-01 -1.2141e+00 -3.3773e+00]
 [-2.8280e-01 -1.8835e+00 -5.9570e-01]
 [ 2.1775e+00 -1.8412e+00 -3.3215e+00]
 [ 3.4850e-01 -1.1006e+00 -4.8514e+00]]
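Conceptually, a coordinate subset is a row selection: find the indices of the atoms of interest, then pick the matching rows of the coordinate table. The sketch below does this with plain lists on a hand-picked four-atom fragment of the data above. The stand-alone `coord_subset` function is a hypothetical helper for illustration, not molli's method; with a NumPy array the same selection could be written as `coords[indices]`.

```python
# Hand-picked fragment: elements and coordinates are index-aligned.
elements = ["N", "C", "C", "O"]
coords = [
    (1.2960, -0.2319, 1.2670),
    (0.0573, -0.0226, 2.1227),
    (-1.0974, -0.4738, 1.2059),
    (1.3410, 2.0363, -1.6507),
]


def coord_subset(coords, indices):
    """Return the coordinate rows at the given indices, in the order given."""
    return [coords[i] for i in indices]


carbon_idx = [i for i, elem in enumerate(elements) if elem == "C"]
carbon_rows = coord_subset(coords, carbon_idx)

for row in carbon_rows:
    print(row)
```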