opan.xyz

Module implementing OpenBabel XYZ parsing and interpretation.

The single class OpanXYZ imports molecular geometries in the OpenBabel XYZ format, with the following variations:

  • Coordinates of any precision will be read, not just the \(10.5\) specified by OpenBabel
  • Both atomic symbols and atomic numbers are valid
  • Multiple geometries/frames are supported, but the number of atoms and their sequence in the atoms list must be maintained in all geometries.
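As an illustration of the variations listed above, the following sketch builds a two-frame XYZ string in which one frame uses atomic symbols and the other uses atomic numbers with extra digits of precision, then reads it with a deliberately simple line-based reader. The reader `read_frames` and the small `NUM_TO_SYM` table are hypothetical helpers written for this example; they are not the parsing code used by OpanXYZ.

```python
# Hypothetical lookup table: just enough elements for this example.
NUM_TO_SYM = {1: "H", 8: "O"}

XYZ_TEXT = """3
water, frame 0 (atomic symbols)
O   0.0   0.0     0.117
H   0.0   0.757  -0.469
H   0.0  -0.757  -0.469
3
water, frame 1 (atomic numbers, more digits)
8   0.0   0.0         0.1173001
1   0.0   0.7572154  -0.4692004
1   0.0  -0.7572154  -0.4692004
"""

def read_frames(text):
    """Return (symbols, frames); each frame is a flat list of 3N floats."""
    lines = text.splitlines()
    symbols, frames, pos = None, [], 0
    while pos < len(lines) and lines[pos].strip():
        n_atoms = int(lines[pos])
        # Skip the count line and the description line.
        body = lines[pos + 2 : pos + 2 + n_atoms]
        syms, coords = [], []
        for line in body:
            atom, x, y, z = line.split()
            # Accept either an atomic number or an atomic symbol.
            syms.append(NUM_TO_SYM[int(atom)] if atom.isdigit() else atom.upper())
            coords.extend(float(v) for v in (x, y, z))
        if symbols is None:
            symbols = syms
        elif syms != symbols:
            # Every frame must list the same atoms in the same order.
            raise ValueError("atom sequence differs between frames")
        frames.append(coords)
        pos += 2 + n_atoms
    return symbols, frames

symbols, frames = read_frames(XYZ_TEXT)
```

Both frames resolve to the same atom sequence, so the consistency check passes; reordering the atoms in one frame would trigger the ValueError in this sketch.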

Contents

Class Variables

Instance Variables

Methods

All return values from a single indicated geometry.

geom_single() – Entire geometry vector

displ_single() – Displacement between two atoms

dist_single() – Euclidean distance between two atoms

angle_single() – Spanned angle of three atoms (one central and two distal atoms)

dihed_single() – Dihedral angle among four atoms

Generators

All yielded values are composited from an arbitrary set of geometry and/or atom indices; indexing with negative values is supported.

Each parameter can be either a single index or an iterable of indices. If any iterables are passed, all must be the same length R, and a total of R values will be returned, one for each set of elements across the iterables. (This behavior is loosely similar to Python’s zip() builtin.) Single values are used as provided for all values returned.

As an additional option, a None value can be passed to exactly one parameter, which will then be assigned the full ‘natural’ range of that parameter (range(G) for g_nums and range(N) for ats_#).
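The calling rules above can be sketched as a small index-expansion routine. This is only an illustration of the described behavior, not the library's implementation: the function `expand_indices` and its `natural_ranges` argument are hypothetical names introduced here.

```python
def expand_indices(natural_ranges, **params):
    """Expand int / iterable / None parameters into aligned index tuples.

    natural_ranges maps each parameter name to the length of its 'natural'
    range (G for geometries, N for atoms).
    """
    names = list(params)
    if sum(params[k] is None for k in names) > 1:
        raise ValueError("at most one parameter may be None")
    # Replace the single None (if any) with its full natural range.
    values = {
        k: (range(natural_ranges[k]) if params[k] is None else params[k])
        for k in names
    }
    # Anything that is not a plain int is treated as an iterable of indices.
    iterables = {k: list(v) for k, v in values.items() if not isinstance(v, int)}
    lengths = {len(v) for v in iterables.values()}
    if len(lengths) > 1:
        raise ValueError("all iterables must have the same length")
    count = lengths.pop() if lengths else 1
    # Single ints are repeated for every returned value.
    columns = [iterables[k] if k in iterables else [values[k]] * count
               for k in names]
    return list(zip(*columns))

G, N = 4, 3  # assumed numbers of geometries and atoms for the example
ranges = {"g_nums": G, "ats_1": N, "ats_2": N}
zipped = expand_indices(ranges, g_nums=[0, 2], ats_1=0, ats_2=[1, 2])
all_geoms = expand_indices(ranges, g_nums=None, ats_1=0, ats_2=1)
```

Here `zipped` pairs the two iterables element by element while repeating the single `ats_1` value, and `all_geoms` covers every geometry index because `g_nums` was passed as None.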

If the optional parameter invalid_error is False (the default), any IndexError or ValueError raised in the course of calculating a given value is ignored, and None is returned for that value. If True, the errors are raised as-is.

Note

For angle_iter() and dihed_iter(), using a None parameter with invalid_error == True is guaranteed to raise an error.
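One plausible reading of this note is that a None parameter expands to every atom index, so at least one index set necessarily repeats an atom that is held fixed in another position. The sketch below mimics that situation with a stand-in for the angle calculation; `angle_check` and `checked_iter` are hypothetical helpers, not part of opan.

```python
def angle_check(at_1, at_2, at_3, num_atoms):
    """Stand-in for an angle calculation: performs only the index checks."""
    for at in (at_1, at_2, at_3):
        if not -num_atoms <= at < num_atoms:
            raise IndexError("atom index out of range")
    if at_2 in (at_1, at_3):
        raise ValueError("at_2 must differ from at_1 and at_3")
    return (at_1, at_2, at_3)  # a real implementation would return the angle

def checked_iter(index_sets, num_atoms, invalid_error=False):
    """Yield one result per index set, or None where the indices are invalid."""
    for ats in index_sets:
        try:
            result = angle_check(*ats, num_atoms)
        except (IndexError, ValueError):
            if invalid_error:
                raise
            result = None
        yield result

N = 3  # assumed number of atoms
# ats_1=None expands to range(N); ats_2=1 and ats_3=2 are held fixed.
sets = [(at_1, 1, 2) for at_1 in range(N)]

lenient = list(checked_iter(sets, N))  # invalid sets become None

try:
    list(checked_iter(sets, N, invalid_error=True))
    strict_raised = False
except ValueError:
    strict_raised = True
```

With the default setting the middle entry (where `at_1 == at_2`) comes back as None, while the strict call stops at that same entry with a ValueError.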

geom_iter() – Geometries

displ_iter() – Displacement vectors

dist_iter() – Euclidean distances

angle_iter() – Span angles

dihed_iter() – Dihedral angles


Class Definition

class opan.xyz.OpanXYZ(**kwargs)

Container for OpenBabel XYZ data.

Initializer can be called in one of two forms:

OpanXYZ(path='path/to/file')
OpanXYZ(atom_syms=[{array of atoms}], coords=[{array of coordinates}])

If path is specified, atom_syms and coords are ignored, and the instance will contain all validly formatted geometries present in the OpenBabel file at the indicated path.

Note

Certain types of improperly formatted geometry blocks, such as one with alphabetic characters on the ‘number-of-atoms’ line, may not raise errors during loading, but instead just result in import of fewer geometries/frames than expected.
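The following sketch shows how a regex-based frame search can silently skip a malformed block rather than fail. The pattern `FRAME` is a simplified, hypothetical stand-in written for this example; it is not the actual p_geom pattern.

```python
import re

# Hypothetical frame pattern: count line, description line, then atom lines.
FRAME = re.compile(
    r"^[ \t]*(?P<num>\d+)[ \t]*\n"   # number-of-atoms line: digits only
    r"(?P<desc>[^\n]*)\n"            # free-text description line
    r"(?P<coord>(?:[ \t]*[A-Za-z0-9]+(?:[ \t]+[-+0-9.eE]+){3}[ \t]*\n)+)",
    re.MULTILINE,
)

GOOD = "2\nframe A\nH 0.0 0.0 0.0\nH 0.0 0.0 0.74\n"
BAD = "2x\nframe B\nH 0.0 0.0 0.0\nH 0.0 0.0 0.80\n"  # letter on the count line

# Three blocks are present in the text, but only two match the pattern.
matches = list(FRAME.finditer(GOOD + BAD + GOOD))
descriptions = [m.group("desc") for m in matches]
```

No exception is raised for the malformed block; it simply never matches, so comparing the number of frames found against the number expected is a cheap sanity check after loading.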

If initialized with the atom_syms and coords keyword arguments, the instance will contain the single geometry represented by the provided inputs.

In both forms, the optional keyword argument bohrs can be specified, to indicate the units of the coordinates as Bohrs (True) or Angstroms (False). Angstrom and Bohr units are the default for the path and atom_syms/coords forms, respectively. The units of all coordinates stored in the instance are Bohrs.
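The unit handling described above amounts to a single scaling step. In the minimal sketch below, the helper `to_bohrs` is hypothetical and the conversion factor is the CODATA 2018 Bohr radius; the constant actually used by opan may differ slightly.

```python
# Assumption: CODATA 2018 Bohr radius expressed in Angstroms.
ANG_PER_BOHR = 0.529177210903

def to_bohrs(coords, bohrs):
    """Return coordinates in Bohrs; `bohrs` states the units of the input."""
    if bohrs:
        return list(coords)              # already in Bohrs, copy unchanged
    return [c / ANG_PER_BOHR for c in coords]

from_file = to_bohrs([0.0, 0.0, 0.74], bohrs=False)   # Angstrom input
from_args = to_bohrs([0.0, 0.0, 1.40], bohrs=True)    # Bohr input
```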

‘N’ and ‘G’ in the below documentation refer to the number of atoms per geometry and the number of geometries present in the file, respectively. Note that ALL geometries present MUST contain the same number of atoms, and the elements must all FALL IN THE SAME SEQUENCE in each geometry/frame. No error will be raised if positions of like atoms are swapped, but for obvious reasons this will almost certainly cause semantic difficulties in downstream computations.

Note

In ORCA, ‘.xyz’ files contain the highest-precision geometry information of any output (save perhaps the textual output generated by the program), and are stored in Angstrom units.


Class Variables

p_coords

re.compile() pattern – Retrieves individual lines from coordinate blocks matched by p_geom

p_geom

re.compile() pattern – Retrieves full OpenBabel XYZ geometry frames/blocks

LOAD_DATA_FLAG = 'NOT FILE'

str – Flag for irrelevant data in atom_syms/coords initialization mode.


Instance Variables

Except where indicated, all str and list-of-str values are stored as LOAD_DATA_FLAG when initialized with atom_syms/coords arguments.

atom_syms

length-N str – Atomic symbols for the atoms (all uppercase)

descs

length-G str – Text descriptions for each geometry included in a loaded file

geoms

length-G list of length-3N np.float_ – Molecular geometry/geometries read from file or passed to coords argument

in_str

str – Complete contents of the input file

num_atoms

int – Number of atoms per geometry, N

num_geoms

int – Number of geometries, G

XYZ_path

str – Full path to imported OpenBabel file


Methods

geom_single(g_num)

Retrieve a single geometry.

The atom coordinates are returned with each atom’s x/y/z coordinates grouped together:

[A1x, A1y, A1z, A2x, A2y, A2z, ...]

An alternate method to achieve the same effect as this function is by simply indexing into geoms:

>>> x = opan.xyz.OpanXYZ(path='...')
>>> x.geom_single(g_num)    # One way to do it
>>> x.geoms[g_num]          # Another way
Parameters: g_num – int – Index of the desired geometry
Returns: geom – length-3N np.float_ – Vector of the atomic coordinates for the geometry indicated by g_num
Raises: IndexError – If an invalid (out-of-range) g_num is provided
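Because each geometry is a flat vector with x/y/z grouped per atom, the coordinates of atom k occupy positions 3k through 3k+2. A minimal sketch of that indexing, using a plain Python list in place of the NumPy vector and a hypothetical helper `atom_coords`:

```python
def atom_coords(geom, at):
    """Return [x, y, z] for atom `at` from a flat length-3N vector."""
    num_atoms = len(geom) // 3
    if not -num_atoms <= at < num_atoms:
        raise IndexError("atom index out of range")
    at %= num_atoms                      # map negative indices onto 0..N-1
    return list(geom[3 * at : 3 * at + 3])

# Three atoms: [A1x, A1y, A1z, A2x, A2y, A2z, A3x, A3y, A3z]
geom = [0.0, 0.0, 0.2, 0.0, 1.4, -0.9, 0.0, -1.4, -0.9]
second = atom_coords(geom, 1)
last = atom_coords(geom, -1)
```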
displ_single(g_num, at_1, at_2)

Displacement vector between two atoms.

Returns the displacement vector pointing from at_1 toward at_2 from geometry g_num. If at_1 == at_2 a strict zero vector is returned.

Displacement vector is returned in units of Bohrs.

Parameters:
  • g_num – int – Index of the desired geometry
  • at_1 – int – Index of the first atom
  • at_2 – int – Index of the second atom
Returns:

displ – length-3 np.float_ – Displacement vector from at_1 to at_2

Raises:

IndexError – If an invalid (out-of-range) g_num or at_# is provided
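The displacement described above is the coordinate-wise difference taken as second atom minus first atom. The sketch below works on a plain flat list and uses a hypothetical function `displ`; index range checking is omitted for brevity.

```python
def displ(geom, at_1, at_2):
    """Displacement vector pointing from atom at_1 toward atom at_2."""
    if at_1 == at_2:
        return [0.0, 0.0, 0.0]           # strict zero, no subtraction needed
    p1 = geom[3 * at_1 : 3 * at_1 + 3]
    p2 = geom[3 * at_2 : 3 * at_2 + 3]
    return [b - a for a, b in zip(p1, p2)]

# Two atoms, coordinates in Bohrs (illustrative values).
geom = [0.0, 0.0, 0.0, 1.0, 2.0, 2.0]
forward = displ(geom, 0, 1)
backward = displ(geom, 1, 0)
```

Swapping the atom indices reverses the sign of every component.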

dist_single(g_num, at_1, at_2)

Distance between two atoms.

Parameters:
  • g_num – int – Index of the desired geometry
  • at_1 – int – Index of the first atom
  • at_2 – int – Index of the second atom
Returns:

dist – np.float_ – Distance in Bohrs between at_1 and at_2 from geometry g_num

Raises:

IndexError – If an invalid (out-of-range) g_num or at_# is provided
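The Euclidean distance is the norm of the displacement between the two atoms. A minimal sketch on a flat coordinate list, with a hypothetical function `dist` and no index range checking:

```python
import math

def dist(geom, at_1, at_2):
    """Euclidean distance between atoms at_1 and at_2 of a flat geometry."""
    p1 = geom[3 * at_1 : 3 * at_1 + 3]
    p2 = geom[3 * at_2 : 3 * at_2 + 3]
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(p1, p2)))

# Two atoms, coordinates in Bohrs (illustrative values).
geom = [0.0, 0.0, 0.0, 1.0, 2.0, 2.0]
d = dist(geom, 0, 1)
```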

angle_single(g_num, at_1, at_2, at_3)

Spanning angle among three atoms.

The indices at_1 and at_3 can be the same (yielding a trivial zero angle), but at_2 must be different from both at_1 and at_3.

Parameters:
  • g_num – int – Index of the desired geometry
  • at_1 – int – Index of the first atom
  • at_2 – int – Index of the second atom
  • at_3 – int – Index of the third atom
Returns:

angle – np.float_ – Spanning angle in degrees between at_1-at_2-at_3, from geometry g_num

Raises:
  • IndexError – If an invalid (out-of-range) g_num or at_# is provided
  • ValueError – If at_2 is equal to either at_1 or at_3
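A spanning angle of this kind can be computed from the two vectors that point from the central atom to each distal atom. The following sketch assumes the usual arccosine-of-normalized-dot-product formula, which the source does not spell out; the function `angle` is a hypothetical stand-in and index range checking is omitted.

```python
import math

def angle(geom, at_1, at_2, at_3):
    """Angle in degrees spanned by at_1 - at_2 - at_3 (at_2 is central)."""
    if at_2 in (at_1, at_3):
        raise ValueError("at_2 must differ from at_1 and at_3")
    if at_1 == at_3:
        return 0.0                       # trivial zero angle
    centre = geom[3 * at_2 : 3 * at_2 + 3]
    v1 = [a - c for a, c in zip(geom[3 * at_1 : 3 * at_1 + 3], centre)]
    v3 = [a - c for a, c in zip(geom[3 * at_3 : 3 * at_3 + 3], centre)]
    dot = sum(a * b for a, b in zip(v1, v3))
    norms = math.sqrt(sum(a * a for a in v1)) * math.sqrt(sum(b * b for b in v3))
    # Clamp to guard against round-off pushing the cosine outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / norms))
    return math.degrees(math.acos(cosine))

# Atoms at (1,0,0), (0,0,0) and (0,1,0): a right angle at the second atom.
geom = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0]
right = angle(geom, 0, 1, 2)
```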
dihed_single(g_num, at_1, at_2, at_3, at_4)

Dihedral/out-of-plane angle among four atoms.

Returns the out-of-plane angle among four atoms from geometry g_num, in degrees. The reference plane is spanned by at_1, at_2 and at_3. The out-of-plane angle is defined such that a positive angle represents a counter-clockwise rotation of the projected at_3 \(\rightarrow\) at_4 vector with respect to the reference plane when looking from at_3 toward at_2. Zero rotation corresponds to occlusion of at_1 and at_4; that is, the case where the respective rejections of at_1 \(\rightarrow\) at_2 and at_3 \(\rightarrow\) at_4 onto at_2 \(\rightarrow\) at_3 are ANTI-PARALLEL.

All four atom indices must be distinct. Both of the atom trios 1-2-3 and 2-3-4 must be sufficiently nonlinear, as diagnosed by a bend angle different from 0 or 180 degrees by at least PRM.NON_PARALLEL_TOL.

Parameters:
  • g_num – int – Index of the desired geometry
  • at_1 – int – Index of the first atom
  • at_2 – int – Index of the second atom
  • at_3 – int – Index of the third atom
  • at_4 – int – Index of the fourth atom
Returns:

dihed – np.float_ – Out-of-plane/dihedral angle in degrees for the indicated at_#, drawn from geometry g_num

Raises:
  • IndexError – If an invalid (out-of-range) g_num or at_# is provided
  • ValueError – If any indices at_# are equal
  • XYZError – (typecode DIHED) If either of the atom trios (1-2-3 or 2-3-4) is too close to linearity
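A signed dihedral angle can be computed with the widely used atan2 formulation sketched below. This is an assumption about method, not a description of how opan computes it; the sign convention of this formula (the standard IUPAC torsion convention) appears to coincide with the one described above, with 0 degrees for eclipsed at_1/at_4. The function `dihedral` takes four coordinate triples directly and omits the near-linearity check.

```python
import math

def _sub(a, b):
    return [x - y for x, y in zip(a, b)]

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dihedral(p1, p2, p3, p4):
    """Signed dihedral angle in degrees for the atom chain p1-p2-p3-p4."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1 = _cross(b1, b2)                  # normal of the plane through p1, p2, p3
    n2 = _cross(b2, b3)                  # normal of the plane through p2, p3, p4
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

A, B, C = [1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0]
eclipsed = dihedral(A, B, C, [1.0, 0.0, 1.0])
plus_quarter = dihedral(A, B, C, [0.0, 1.0, 1.0])
minus_quarter = dihedral(A, B, C, [0.0, -1.0, 1.0])
anti = dihedral(A, B, C, [-1.0, 0.0, 1.0])
```

Viewed from the third atom toward the second, the fourth atom in `plus_quarter` sits a quarter turn counter-clockwise from the first atom, which gives a positive angle under this formula.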

Generators

geom_iter(g_nums)

Iterator over a subset of geometries.

The indices of the geometries to be returned are indicated by an iterable of ints passed as g_nums.

As with geom_single(), each geometry is returned as a length-3N np.float_ with each atom’s x/y/z coordinates grouped together:

[A1x, A1y, A1z, A2x, A2y, A2z, ...]

In order to use NumPy slicing or advanced indexing, geoms must first be explicitly converted to np.array, e.g.:

>>> x = opan.xyz.OpanXYZ(path='...')
>>> np.array(x.geoms)[[2,6,9]]
Parameters: g_nums – length-R iterable of int – Indices of the desired geometries
Yields: geom – length-3N np.float_ – Vectors of the atomic coordinates for each geometry indicated in g_nums
Raises: IndexError – If an item in g_nums is invalid (out of range)
displ_iter(g_nums, ats_1, ats_2)

Iterator over indicated displacement vectors.

Displacements are in Bohrs as with displ_single().

See above for more information on calling options.

Parameters:
  • g_nums – int or length-R iterable of int or None – Index/indices of the desired geometry/geometries
  • ats_1 – int or length-R iterable of int or None – Index/indices of the first atom(s)
  • ats_2 – int or length-R iterable of int or None – Index/indices of the second atom(s)
  • invalid_error – bool, optional – If False (the default), None values are returned for results corresponding to invalid indices. If True, exceptions are raised as normal.
Yields:

displ – np.float_ – Displacement vector in Bohrs between each atom pair of ats_1 \(\rightarrow\) ats_2 from the corresponding geometries of g_nums.

Raises:
  • IndexError – If an invalid (out-of-range) g_num or at_# is provided.
  • ValueError – If the iterable objects are not all the same length.
dist_iter(g_nums, ats_1, ats_2)

Iterator over selected interatomic distances.

Distances are in Bohrs as with dist_single().

See above for more information on calling options.

Parameters:
  • g_nums – int or length-R iterable of int or None – Index/indices of the desired geometry/geometries
  • ats_1 – int or length-R iterable of int or None – Index/indices of the first atom(s)
  • ats_2 – int or length-R iterable of int or None – Index/indices of the second atom(s)
  • invalid_error – bool, optional – If False (the default), None values are returned for results corresponding to invalid indices. If True, exceptions are raised as normal.
Yields:

dist – np.float_ – Interatomic distance in Bohrs between each atom pair of ats_1 and ats_2 from the corresponding geometries of g_nums.

Raises:
  • IndexError – If an invalid (out-of-range) g_num or at_# is provided.
  • ValueError – If the iterable objects are not all the same length.
angle_iter(g_nums, ats_1, ats_2, ats_3)

Iterator over selected atomic angles.

Angles are in degrees as with angle_single().

See above for more information on calling options.

Parameters:
  • g_nums – int or length-R iterable of int or None – Index/indices of the desired geometry/geometries
  • ats_1 – int or length-R iterable of int or None – Index/indices of the first atom(s)
  • ats_2 – int or length-R iterable of int or None – Index/indices of the second atom(s)
  • ats_3 – int or length-R iterable of int or None – Index/indices of the third atom(s)
  • invalid_error – bool, optional – If False (the default), None values are returned for results corresponding to invalid indices. If True, exceptions are raised as normal.
Yields:

angle – np.float_ – Spanning angles in degrees between corresponding ats_1-ats_2-ats_3, from geometry/geometries g_nums

Raises:
  • IndexError – If an invalid (out-of-range) g_num or at_# is provided.
  • ValueError – If the iterable objects are not all the same length.
  • ValueError – If any ats_2 element is equal to either the corresponding ats_1 or ats_3 element.
dihed_iter(g_nums, ats_1, ats_2, ats_3, ats_4)

Iterator over selected dihedral angles.

Angles are in degrees as with dihed_single().

See above for more information on calling options.

Parameters:
  • g_nums – int or length-R iterable of int or None – Index/indices of the desired geometry/geometries
  • ats_1 – int or length-R iterable of int or None – Index/indices of the first atom(s)
  • ats_2 – int or length-R iterable of int or None – Index/indices of the second atom(s)
  • ats_3 – int or length-R iterable of int or None – Index/indices of the third atom(s)
  • ats_4 – int or length-R iterable of int or None – Index/indices of the fourth atom(s)
  • invalid_error – bool, optional – If False (the default), None values are returned for results corresponding to invalid indices. If True, exceptions are raised as normal.
Yields:

dihed – np.float_ – Out-of-plane/dihedral angles in degrees for the indicated atom sets ats_1-ats_2-ats_3-ats_4, drawn from the respective g_nums.

Raises:
  • IndexError – If an invalid (out-of-range) g_num or at_# is provided.
  • ValueError – If the iterable objects are not all the same length.
  • ValueError – If any corresponding ats_# indices are equal.
  • XYZError – (typecode DIHED) If either of the atom trios (1-2-3 or 2-3-4) is too close to linearity for any group of ats_#