chython.containers package

Data classes.

class chython.containers.Bond(order: int)
copy() Bond
classmethod from_bond(bond)
property in_ring: bool
property order: int
class chython.containers.MoleculeContainer
add_atom(atom: Element | int | str, *args, charge=0, is_radical=False, xy: Tuple[float, float] = (0.0, 0.0), _skip_hydrogen_calculation=False, **kwargs)

Add new atom.

add_atom_stereo(n: int, env: Tuple[int, ...], mark: bool, *, clean_cache=True)

Add stereo data for specified neighbors bypass. Use it for tetrahedrons or allenes.

Parameters:
  • n – number of tetrahedron atom or central atom of allene.

  • env – numbers of atoms with specified bypass

  • mark – clockwise or anti bypass.

See <https://www.daylight.com/dayhtml/doc/theory/theory.smiles.html> and <http://opensmiles.org/opensmiles.html>

add_bond(n, m, bond: Bond | int, *, _skip_hydrogen_calculation=False)

Connect atoms with bonds.

For molecules in Thiele form, this invalidates the internal state: implicit hydrogen marks will not be set for atoms in aromatic rings. Call kekule() and thiele() in sequence to fix the marks.

add_cis_trans_stereo(n: int, m: int, n1: int, n2: int, mark: bool, *, clean_cache=True)

Add stereo data to cis-trans double bonds (not allenes).

n1/n=m/n2

Parameters:
  • n – number of starting atom of double bonds chain (alkenes or cumulenes)

  • m – number of ending atom of double bonds chain (alkenes or cumulenes)

  • n1 – number of neighboring atom of starting atom

  • n2 – number of neighboring atom of ending atom

  • mark – cis or trans

See <https://www.daylight.com/dayhtml/doc/theory/theory.smiles.html> and <http://opensmiles.org/opensmiles.html>

add_wedge(n: int, m: int, mark: int, *, clean_cache=True)

Add stereo data by wedge notation of bonds. Use it for tetrahedrons or allenes.

Parameters:
  • n – number of atom from which wedge bond started

  • m – number of atom to which wedge bond coming

  • mark – up bond is 1, down is -1

adjacency_matrix(set_bonds=False, /)

Adjacency matrix of Graph.

Parameters:

set_bonds – if True set bond orders instead of 1.
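
The effect of set_bonds can be illustrated with a minimal standalone sketch. It does not call chython: the helper name `adjacency_matrix_sketch`, the nested-dict input (shaped like int_adjacency) and the sorted row order are assumptions made for illustration, and the real method may return a different array type.

```python
from typing import Dict, List


def adjacency_matrix_sketch(int_adjacency: Dict[int, Dict[int, int]],
                            set_bonds: bool = False) -> List[List[int]]:
    """Build a square matrix from a nested dict of atom -> neighbor -> bond order."""
    # Row/column order follows sorted atom numbers (an assumption of this sketch).
    index = {n: i for i, n in enumerate(sorted(int_adjacency))}
    matrix = [[0] * len(index) for _ in index]
    for n, neighbors in int_adjacency.items():
        for m, order in neighbors.items():
            # Store the bond order if requested, otherwise a plain 1.
            matrix[index[n]][index[m]] = order if set_bonds else 1
    return matrix


# Acetic acid CC(=O)O with atoms 1..4; the 2-3 bond is double.
acetic_acid = {1: {2: 1}, 2: {1: 1, 3: 2, 4: 1}, 3: {2: 2}, 4: {2: 1}}
plain = adjacency_matrix_sketch(acetic_acid)
with_orders = adjacency_matrix_sketch(acetic_acid, set_bonds=True)
```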

property aromatic_rings: Tuple[Tuple[int, ...], ...]

Aromatic rings atoms numbers

atom(n: int) Atom
atoms() Iterator[Tuple[int, Atom]]

iterate over all atoms

property atoms_count: int
property atoms_numbers: Iterator[int]
property atoms_order: Dict[int, int]

Morgan like algorithm for graph nodes ordering

Returns:

dict of atom-order pairs

property atoms_rings: Dict[int, Tuple[Tuple[int, ...]]]

Dict mapping each atom to the rings that contain it.

property atoms_rings_sizes: Dict[int, Tuple[int, ...]]

Sizes of rings containing atom.
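
As a rough illustration of how such a mapping relates to a list of rings, the sketch below derives per-atom ring sizes from tuples of atom numbers. The helper `atoms_rings_sizes_sketch` is hypothetical and does not use chython; it assumes the rings are given in the same form as sssr, and that atoms outside any ring are simply absent from the result.

```python
from typing import Dict, List, Tuple


def atoms_rings_sizes_sketch(rings: Tuple[Tuple[int, ...], ...]) -> Dict[int, Tuple[int, ...]]:
    """Map each ring atom to the sizes of the rings that contain it."""
    sizes: Dict[int, List[int]] = {}
    for ring in rings:
        for n in ring:
            sizes.setdefault(n, []).append(len(ring))
    return {n: tuple(s) for n, s in sizes.items()}


# Two fused rings (six- and five-membered) sharing atoms 5 and 6.
fused = ((1, 2, 3, 4, 5, 6), (5, 6, 7, 8, 9))
ring_sizes = atoms_rings_sizes_sketch(fused)
```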

augmented_substructure(atoms: Iterable[int], deep: int = 1, **kwargs) MoleculeContainer

Create substructure containing atoms and their neighbors

Parameters:
  • atoms – list of core atoms in graph

  • deep – number of bonds between atoms and neighbors

augmented_substructures(atoms: Iterable[int], deep: int = 1, **kwargs) List[MoleculeContainer]

Create list of substructures containing atoms and their neighbors

Parameters:
  • atoms – list of core atoms in graph

  • deep – number of bonds between atoms and neighbors

Returns:

list of graphs containing atoms, atoms + first circle, atoms + 1st + 2nd, etc up to deep or while new nodes available
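
The growth of the atom sets described above can be sketched on a plain adjacency dict. This is an illustration only, not chython code: `augmented_layers` is a hypothetical helper that returns sets of atom numbers rather than MoleculeContainer objects.

```python
from typing import Dict, Iterable, List, Set


def augmented_layers(adjacency: Dict[int, Set[int]], atoms: Iterable[int],
                     deep: int = 1) -> List[Set[int]]:
    """Return atom sets: core, core + 1st circle, ... up to `deep` circles."""
    current = set(atoms)
    layers = [set(current)]
    for _ in range(deep):
        # Neighbors of the current set that are not included yet.
        frontier = {m for n in current for m in adjacency[n]} - current
        if not frontier:  # no new nodes available: stop early
            break
        current |= frontier
        layers.append(set(current))
    return layers


# Linear chain 1-2-3-4, starting from atom 1 with a depth larger than the chain.
chain = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
layers = augmented_layers(chain, [1], deep=5)
```

Here the expansion stops after three steps because no new atoms remain, even though deep is 5.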

bond(n: int, m: int) Bond
bonds() Iterator[Tuple[int, int, Bond]]

iterate over all bonds

property bonds_count: int
property brutto: Dict[str, int]

Counted atoms dict

calculate_cis_trans_from_2d(*, clean_cache=True)

Calculate cis-trans stereo bonds from given 2d coordinates. Unusable for SMILES and INCHI.

canonicalize(*, fix_tautomers=True, keep_kekule=False, logging=False, ignore=True) bool | List[Tuple[Tuple[int, ...], int, str]]

Convert molecule to canonical forms of functional groups and aromatic rings without explicit hydrogens.

Parameters:
  • logging – return log.

  • ignore – ignore standardization bugs.

  • fix_tautomers – convert tautomers to canonical forms.

  • keep_kekule – return kekule form.

check_valence() List[int]

Check valences of all atoms.

Returns:

list of invalid atoms

clean2d()

Calculate 2d layout of graph. https://pubs.acs.org/doi/10.1021/acs.jcim.7b00425 JS implementation used.

clean_isotopes() bool

Clean isotope marks from molecule. Return True if any isotope found.

clean_stereo()

Remove stereo data.

compose(other: MoleculeContainer) CGRContainer

Compose 2 graphs to CGR.

property connected_components: Tuple[Tuple[int, ...], ...]

Isolated components of single graph. E.g. salts as ion pair.
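
A minimal sketch of what "isolated components" means, using a plain adjacency dict instead of a MoleculeContainer. The helper `connected_components_sketch` and the sorting of atoms inside each component are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def connected_components_sketch(adjacency: Dict[int, Set[int]]) -> Tuple[Tuple[int, ...], ...]:
    """Group atom numbers into connected components with a depth-first walk."""
    seen: Set[int] = set()
    components: List[Tuple[int, ...]] = []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            n = stack.pop()
            component.append(n)
            for m in adjacency[n]:
                if m not in seen:
                    seen.add(m)
                    stack.append(m)
        components.append(tuple(sorted(component)))
    return tuple(components)


# Sodium acetate drawn as an ion pair: acetate atoms 1-4 plus an isolated Na+ (atom 5).
salt = {1: {2}, 2: {1, 3, 4}, 3: {2}, 4: {2}, 5: set()}
components = connected_components_sketch(salt)
```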

property connected_components_count: int

Number of components in graph

copy() MoleculeContainer

copy of graph

property cumulenes: Tuple[Tuple[int, ...], ...]

Alkenes, allenes and cumulenes atoms numbers.

delete_atom(n: int, *, _skip_hydrogen_calculation=False)

Remove atom.

For molecules in Thiele form, this invalidates the internal state: implicit hydrogen marks will not be set for atoms in aromatic rings. Call kekule() and thiele() in sequence to fix the marks.

delete_bond(n: int, m: int, *, _skip_hydrogen_calculation=False)

Disconnect atoms.

For molecules in Thiele form, this invalidates the internal state: implicit hydrogen marks will not be set for atoms in aromatic rings. Call kekule() and thiele() in sequence to fix the marks.

depict(*, width=None, height=None, clean2d: bool = True, _embedding=False) str

Depict molecule in SVG format.

Parameters:
  • width – set svg width param. by default auto-calculated.

  • height – set svg height param. by default auto-calculated.

  • clean2d – calculate coordinates if necessary.

depict3d(index: int = 0) str

Get X3DOM XML string.

Parameters:

index – index of conformer

enumerate_charged_forms(*, deep: int = 4, limit: int = 1000)

Enumerate protonated and deprotonated ions. Use on neutralized molecules.

Parameters:
  • deep – Maximum amount of added or removed protons.

  • limit – Maximum amount of generated structures.

enumerate_charged_tautomers(*, prepare_molecules=True, partial=False, increase_aromaticity=True, keep_sugars=True, heteroarenes=True, keto_enol=True, deep: int = 4, limit: int = 1000)

Enumerate tautomers and protonated-deprotonated forms. Better to use on neutralized non-ionic molecules.

See enumerate_tautomers and enumerate_charged_forms params description.

enumerate_kekule()

Enumerate all possible kekule forms of molecule.

enumerate_tautomers(*, prepare_molecules=True, zwitter=True, partial=False, increase_aromaticity=True, keep_sugars=True, heteroarenes=True, keto_enol=True, limit: int = 1000) Iterator[MoleculeContainer]

Enumerate all possible tautomeric forms of molecule.

Parameters:
  • prepare_molecules – Standardize structures for correct processing

  • zwitter – Do zwitter-ions enumeration

  • partial – Allow OC=CC=C>>O=CCC=C or O=CC=CC>>OC=C=CC

  • increase_aromaticity – prevent aromatic ring destruction

  • keep_sugars – prevent carbonyl moving in sugars

  • heteroarenes – enumerate heteroarenes

  • keto_enol – enumerate keto-enols

  • limit – Maximum attempts count

environment(atom: int, include_bond: bool = True, include_atom: bool = True) Tuple[Tuple[int, Bond, Element] | Tuple[int, Element] | Tuple[int, Bond] | int, ...]

groups of (atom_number, bond, atom) connected to atom or groups of (atom_number, bond) connected to atom or groups of (atom_number, atom) connected to atom or neighbors atoms connected to atom

Parameters:
  • atom – number

  • include_atom – include atom object

  • include_bond – include bond object

explicify_hydrogens(*, start_map=None, _return_map=False, _fix_stereo=True) int | List[Tuple[int, int]]

Add explicit hydrogens to atoms.

Returns:

number of added atoms

explicit_hydrogens(n: int) int

Number of explicit hydrogen atoms connected to atom.

Take into account any type of bonds with hydrogen atoms.

fix_resonance(*, logging=False, _fix_stereo=True) bool | List[int]

Transform biradical or dipole resonance structures into neutral form. Return True if structure form changed.

Parameters:

logging – return list of changed atoms.

fix_stereo()

Reset stereo marks.

flush_cache()
flush_stereo_cache()

Flush chiral morgan and chiral centers cache.

get_automorphism_mapping() Iterator[Dict[int, int]]

Iterator of all possible automorphism mappings.

get_fast_mapping(other: MoleculeContainer) Dict[int, int] | None

Get self to other fast (suboptimal) structure mapping. Only one possible atoms mapping returned. Effective only for big molecules.

get_mapping(other: Container, **kwargs)

Get self to other Molecule substructure mapping generator.

Parameters:
  • other – Molecule

  • automorphism_filter – Skip matches to the same atoms.

  • searching_scope – substructure atoms list to localize isomorphism.

get_mcs_mapping(other: MoleculeContainer, /, *, limit=10000) Iterator[Dict[int, int]]

Find maximum common substructure. Based on clique searching in product graph.

Parameters:

limit – limit tested cliques

has_atom(n: int) bool
has_bond(n: int, m: int) bool
heteroatoms(n: int) int

Number of neighboring heteroatoms (not carbon or hydrogen), excluding those connected by an any-bond.

hybridization(n: int) int

Atom hybridization.

1 - if atom has zero or only single bonded neighbors, 2 - if has only one double bonded neighbor and any amount of single bonded, 3 - if has one triple bonded and any amount of double and single bonded neighbors or two and more double bonded and any amount of single bonded neighbors, 4 - if atom in aromatic ring.
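
The four codes can be restated as a small decision function. This is a sketch of the rules as written above, not chython's implementation; `hybridization_sketch` and its inputs (a list of neighbor bond orders plus an aromaticity flag) are hypothetical.

```python
from typing import Iterable


def hybridization_sketch(bond_orders: Iterable[int], in_aromatic_ring: bool = False) -> int:
    """Return the hybridization code (1-4) from the bond orders to neighbors."""
    if in_aromatic_ring:
        return 4
    orders = list(bond_orders)
    doubles = orders.count(2)
    triples = orders.count(3)
    if triples or doubles >= 2:
        return 3  # one triple bond, or two and more double bonds
    if doubles == 1:
        return 2  # exactly one double bond
    return 1      # no neighbors or only single bonds


codes = {
    'methane carbon': hybridization_sketch([]),
    'ethene carbon': hybridization_sketch([2]),
    'ethyne carbon': hybridization_sketch([3]),
    'allene central carbon': hybridization_sketch([2, 2]),
    'benzene carbon': hybridization_sketch([], in_aromatic_ring=True),
}
```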

implicify_hydrogens(*, logging=False, _fix_stereo=True) int | Tuple[int, List[int]]

Remove explicit hydrogen if possible. Return number of removed hydrogens. Works only with Kekule forms of aromatic structures. Keeps isotopes of hydrogen.

Parameters:

logging – return list of changed atoms.

implicit_hydrogens(n: int) int | None

Number of implicit hydrogen atoms connected to atom.

Returns None if the count is ambiguous.

property int_adjacency: Dict[int, Dict[int, int]]

Adjacency with integer-coded bonds.

is_automorphic()

Test for automorphism symmetry of graph.

is_equal(other, /) bool

Test self is same structure as other

property is_radical: bool

True if at least one atom is radical

is_ring_bond(n: int, m: int, /) bool

Check whether the bond is in any ring.

is_substructure(other, /) bool

Test self is substructure of other

kekule(*, buffer_size=7) bool

Convert structure to kekule form. Return True if found any aromatic ring. Set implicit hydrogen count and hybridization marks on atoms.

Only one of the possible double/single bond arrangements will be set. To enumerate all arrangements, use enumerate_kekule.

Parameters:

buffer_size – number of attempts of pyridine form searching.

linear_bit_set(min_radius: int = 1, max_radius: int = 4, length: int = 1024, number_active_bits: int = 2, number_bit_pairs: int = 4) Set[int]

Transform structure into set of indexes of True-valued features.

Parameters:
  • min_radius – minimal length of fragments

  • max_radius – maximum length of fragments

  • length – bit string’s length. Should be power of 2

  • number_active_bits – number of active bits for each hashed tuple

  • number_bit_pairs – how many repeated occurrences of a fragment are counted in the hashed fingerprint (if a fragment occurs in the molecule this many times or more, only this number of occurrences is activated). Set to 0 to take all repeated fragments into account.

linear_fingerprint(min_radius: int = 1, max_radius: int = 4, length: int = 1024, number_active_bits: int = 2, number_bit_pairs: int = 4)

Transform structures into array of binary features.

Parameters:
  • min_radius – minimal length of fragments

  • max_radius – maximum length of fragments

  • length – bit string’s length. Should be power of 2

  • number_active_bits – number of active bits for each hashed tuple

  • number_bit_pairs – how many repeated occurrences of a fragment are counted in the hashed fingerprint (if a fragment occurs in the molecule this many times or more, only this number of occurrences is activated). Set to 0 to take all repeated fragments into account.

Returns:

array(n_features)

linear_hash_set(min_radius: int = 1, max_radius: int = 4, number_bit_pairs: int = 4) Set[int]

Transform structure into set of integer hashes of fragments with count information.

Parameters:
  • min_radius – minimal length of fragments

  • max_radius – maximum length of fragments

  • number_bit_pairs – how many repeated occurrences of a fragment are counted in the hashed fingerprint (if a fragment occurs in the molecule this many times or more, only this number of occurrences is activated). Set to 0 to take all repeated fragments into account.

linear_hash_smiles(min_radius: int = 1, max_radius: int = 4, number_bit_pairs: int = 4) Dict[int, List[str]]

Transform structure into dict of integer hashes of fragments with count information and corresponding fragment SMILES.

Parameters:
  • min_radius – minimal length of fragments

  • max_radius – maximum length of fragments

  • number_bit_pairs – how many repeated occurrences of a fragment are counted in the hashed fingerprint (if a fragment occurs in the molecule this many times or more, only this number of occurrences is activated). Set to 0 to take all repeated fragments into account.

linear_smiles_hash(min_radius: int = 1, max_radius: int = 4, number_bit_pairs: int = 4) Dict[str, List[int]]

Transform structure into dict of fragment SMILES and list of corresponding integer hashes of fragments.

Parameters:
  • min_radius – minimal length of fragments

  • max_radius – maximum length of fragments

  • number_bit_pairs – how many repeated occurrences of a fragment are counted in the hashed fingerprint (if a fragment occurs in the molecule this many times or more, only this number of occurrences is activated). Set to 0 to take all repeated fragments into account.

property meta: Dict
property molecular_charge: int

Total charge of molecule

property molecular_mass: float
morgan_bit_set(min_radius: int = 1, max_radius: int = 4, length: int = 1024, number_active_bits: int = 2) Set[int]

Transform structures into set of indexes of True-valued features.

Parameters:
  • min_radius – minimal radius of EC

  • max_radius – maximum radius of EC

  • length – bit string’s length. Should be power of 2

  • number_active_bits – number of active bits for each hashed tuple

morgan_fingerprint(min_radius: int = 1, max_radius: int = 4, length: int = 1024, number_active_bits: int = 2)

Transform structures into array of binary features. Morgan fingerprints. Similar to RDkit implementation.

Parameters:
  • min_radius – minimal radius of EC

  • max_radius – maximum radius of EC

  • length – bit string’s length. Should be power of 2

  • number_active_bits – number of active bits for each hashed tuple

Returns:

array(n_features)

morgan_hash_set(min_radius: int = 1, max_radius: int = 4) Set[int]

Transform structures into integer hashes of atoms with EC.

Parameters:
  • min_radius – minimal radius of EC

  • max_radius – maximum radius of EC

morgan_hash_smiles(min_radius: int = 1, max_radius: int = 4) Dict[int, List[str]]

Transform structures into dictionary of hashes of atoms with EC and corresponding SMILES.

Parameters:
  • min_radius – minimal radius of EC

  • max_radius – maximum radius of EC

morgan_smiles_hash(min_radius: int = 1, max_radius: int = 4) Dict[str, List[int]]

Transform structures into dictionary of smiles and corresponding hashes of atoms with EC.

Parameters:
  • min_radius – minimal radius of EC

  • max_radius – maximum radius of EC

property name: str
neighbors(n: int) int

number of neighbors atoms excluding any-bonded

neutralize(*, keep_charge=True, logging=False, _fix_stereo=True) bool | List[int]

Convert organic salts to neutral form if possible. Only one possible form used for charge unbalanced structures.

Parameters:
  • keep_charge – do partial neutralization to keep total charge of molecule.

  • logging – return changed atoms list.

property not_special_connectivity: Dict[int, Set[int]]

Graph connectivity without special bonds.

pack(*, compressed=True, check=True, version=2, order: List[int] | None = None) bytes

Pack into compressed bytes.

Note:

  • Less than 4096 atoms supported. Atoms mapping should be in range 1-4095.

  • Implicit hydrogens count should be in range 0-6 or unspecified.

  • Isotope shift should be in range -15 - 15 relative to chython.files._mdl.mol.common_isotopes

  • Atoms neighbors should be in range 0-15

Format V2 specification:

Big endian bytes order
8 bit - 0x02 (format specification version)
12 bit - number of atoms
12 bit - cis/trans stereo block size
Atom block 9 bytes (repeated):
12 bit - atom number
4 bit - number of neighbors
2 bit - tetrahedron sign (00 - not stereo, 10 or 11 - has stereo)
2 bit - allene sign
5 bit - isotope (00000 - not specified, over = isotope - common_isotope + 16)
7 bit - atomic number (<=118)
32 bit - XY float16 coordinates
3 bit - hydrogens (0-7). Note: 7 == None
4 bit - charge (charge + 4. possible range -4 - 4)
1 bit - radical state
Connection table: flatten list of neighbors. neighbors count stored in atom block.
For example CC(=O)O - {1: [2], 2: [1, 3, 4], 3: [2], 4: [2]} >> [2, 1, 3, 4, 2, 2].
Repeated block (equal to bonds count).
24 bit - paired 12 bit numbers.
Bonds order block 3 bit per bond zero-padded to full byte at the end.
Cis/trans data block (repeated):
24 bit - atoms pair
7 bit - zero padding. In future can be used for extra bond-level stereo, like atropisomers.
1 bit - sign
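
Two pieces of the V2 layout above can be sketched with the standard library alone: the 4-byte header and the flattened connection table. The helpers `v2_header` and `flatten_connections` are hypothetical names, not chython functions, and the sketch assumes the fields are packed most significant bit first in the listed order.

```python
from typing import Dict, List


def v2_header(atoms_count: int, cis_trans_size: int) -> bytes:
    """Pack 8-bit version 0x02, 12-bit atoms count and 12-bit cis/trans block size."""
    if not (0 <= atoms_count < 4096 and 0 <= cis_trans_size < 4096):
        raise ValueError('12-bit fields only hold values 0..4095')
    value = (0x02 << 24) | (atoms_count << 12) | cis_trans_size
    return value.to_bytes(4, 'big')  # big endian, as the specification states


def flatten_connections(adjacency: Dict[int, List[int]]) -> List[int]:
    """Concatenate neighbor lists; per-atom counts live in the atom blocks."""
    return [m for neighbors in adjacency.values() for m in neighbors]


# CC(=O)O from the specification: 4 atoms, no cis/trans data.
header = v2_header(4, 0)
flat = flatten_connections({1: [2], 2: [1, 3, 4], 3: [2], 4: [2]})
```

The flattened list reproduces the [2, 1, 3, 4, 2, 2] example given in the specification.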

Format V3 specification:

Big endian bytes order
8 bit - 0x03 (format specification version)
Atom block 3 bytes (repeated):
1 bit - atom entrance flag (always 1)
7 bit - atomic number (<=118)
3 bit - hydrogens (0-7). Note: 7 == None
4 bit - charge (charge + 4. possible range -4 - 4)
1 bit - radical state
1 bit - padding
3 bit - tetrahedron/allene sign
    (000 - not stereo or unknown, 001 - pure-unknown-enantiomer, 010 or 011 - has stereo)
4 bit - number of following bonds and CT blocks (0-15)

Bond block 2 bytes (repeated 0-15 times)
12 bit - negative shift from current atom to connected (e.g. 0x001 = -1 - connected to previous atom)
4 bit - bond order: 0000 - single, 0001 - double, 0010 - triple, 0011 - aromatic, 0111 - special

Cis-Trans 2 bytes
12 bit - negative shift from current atom to connected (e.g. 0x001 = -1 - connected to previous atom)
4 bit - CT sign: 1000 or 1001 - to avoid overlap with bond

V2 format is faster than V3. V3 format doesn’t include isotopes, atom numbers and XY coordinates.
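
The V3 atom and bond blocks can be illustrated by packing the listed bit fields by hand. This is a standalone sketch, not chython code: `v3_atom_block` and `v3_bond_block` are hypothetical helpers, and the fields are assumed to be packed most significant bit first in the order the specification lists them.

```python
from typing import Optional


def v3_atom_block(atomic_number: int, hydrogens: Optional[int], charge: int = 0,
                  is_radical: bool = False, stereo: int = 0, blocks: int = 0) -> bytes:
    """Pack the 3-byte V3 atom block from its bit fields."""
    if not (0 < atomic_number <= 118 and -4 <= charge <= 4 and 0 <= blocks <= 15):
        raise ValueError('value outside the range allowed by the format')
    h = 7 if hydrogens is None else hydrogens  # 7 encodes "unspecified"
    fields = (
        (1, 1),                 # atom entrance flag, always 1
        (atomic_number, 7),
        (h, 3),
        (charge + 4, 4),        # charge is stored shifted by 4
        (int(is_radical), 1),
        (0, 1),                 # padding
        (stereo, 3),
        (blocks, 4),            # number of following bond and CT blocks
    )
    value = 0
    for field, width in fields:
        value = (value << width) | field
    return value.to_bytes(3, 'big')


def v3_bond_block(shift: int, order_code: int) -> bytes:
    """Pack 12-bit negative shift to the connected atom and 4-bit bond order code."""
    return ((shift << 4) | order_code).to_bytes(2, 'big')


methyl_carbon = v3_atom_block(6, 3)                      # carbon with 3 hydrogens
oxide = v3_atom_block(8, None, charge=-1, blocks=1)      # O-, hydrogens unspecified
double_to_previous = v3_bond_block(0x001, 0b0001)        # double bond to previous atom
```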

Parameters:
  • compressed – return zlib-compressed pack.

  • check – check molecule for format restrictions.

  • version – format version

  • order – atom order in V3

classmethod pack_len(data: bytes, /, *, compressed=True) int

Returns atoms count in molecule pack.

remap(mapping: Dict[int, int], *, copy: bool = False) MoleculeContainer

Change atom numbers

Parameters:
  • mapping – mapping of old numbers to the new

  • copy – keep original graph
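
The effect of mapping and copy can be sketched on a plain adjacency dict. This illustration does not call chython: `remap_sketch` is a hypothetical helper, and it assumes that atoms missing from the mapping keep their numbers.

```python
from typing import Dict, Set


def remap_sketch(adjacency: Dict[int, Set[int]], mapping: Dict[int, int],
                 copy: bool = False) -> Dict[int, Set[int]]:
    """Renumber atoms; atoms absent from `mapping` keep their numbers (assumption)."""
    remapped = {mapping.get(n, n): {mapping.get(m, m) for m in ms}
                for n, ms in adjacency.items()}
    if len(remapped) != len(adjacency):
        raise ValueError('mapping produces duplicate atom numbers')
    if copy:
        return remapped  # original graph is left untouched
    adjacency.clear()
    adjacency.update(remapped)
    return adjacency


graph = {1: {2}, 2: {1, 3}, 3: {2}}
renumbered = remap_sketch(graph, {1: 10, 3: 30}, copy=True)
original_after_copy = {n: set(ms) for n, ms in graph.items()}
in_place = remap_sketch(graph, {2: 20})
```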

remove_acids(*, logging=False) bool | List[int]

Remove common acids from organic bases salts. Works only for neutral pairs like HA+B. Use neutralize before.

Parameters:

logging – return deleted atoms list.

remove_coordinate_bonds(*, keep_to_terminal=True, _fix_stereo=True) int

Remove coordinate (or hydrogen) bonds marked with 8 (any) bond

Parameters:

keep_to_terminal – Keep any bonds to terminal hydrogens

Returns:

removed bonds count

remove_metals(*, logging=False) bool | List

Remove disconnected S-metals and ammonia.

Parameters:

logging – return deleted atoms list.

property ring_atoms

Atoms in rings. Not SSSR based fast algorithm.

property rings_count: int

SSSR rings count. Ignored rings with special bonds.

saturate(neighbors_distances: Dict[int, Dict[int, float]] | None = None, reset_electrons: bool = True, expected_charge: int = 0, expected_radicals_count: int = 0, allow_errors: bool = True, logging: bool = False) bool | List[str]

Saturate molecules with double and triple bonds and charges and radical states to correct valences of atoms. Note: works only with fully explicit hydrogens!

Parameters:
  • neighbors_distances – If given, the longest bonds can be removed if needed.

  • reset_electrons – Can change charges and radicals if needed.

  • expected_charge – Reset charge to given. Works only with reset_electrons=True.

  • expected_radicals_count – Reset radical atoms count to given. Works only with reset_electrons=True.

  • allow_errors – allow unbalanced result.

  • logging – return log.

property skin_graph: Dict[int, Set[int]]

Graph without terminal atoms. Only rings and linkers
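
One way to obtain such a graph is to strip terminal atoms repeatedly until only rings and the linkers between them remain. The sketch below shows that pruning on a plain adjacency dict; `skin_graph_sketch` is a hypothetical helper, and the iterative removal is an assumption about how "without terminal atoms" is meant.

```python
from typing import Dict, Set


def skin_graph_sketch(adjacency: Dict[int, Set[int]]) -> Dict[int, Set[int]]:
    """Repeatedly remove atoms with at most one neighbor."""
    skin = {n: set(ms) for n, ms in adjacency.items()}
    while True:
        leaves = [n for n, ms in skin.items() if len(ms) <= 1]
        if not leaves:
            break
        for n in leaves:
            for m in skin.pop(n):
                if m in skin:  # the neighbor may already be removed in this round
                    skin[m].discard(n)
    return skin


# Six-membered ring 1..6 with a methyl group (atom 7) on atom 1.
toluene_like = {1: {2, 6, 7}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 6}, 6: {5, 1}, 7: {1}}
skin = skin_graph_sketch(toluene_like)
propane_skin = skin_graph_sketch({1: {2}, 2: {1, 3}, 3: {2}})
```

An acyclic molecule has no rings to keep, so the result for the propane chain is empty.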

property smiles_atoms_order: Tuple[int, ...]

Atoms order in canonic SMILES.

split() List[MoleculeContainer]

Split disconnected structure to connected substructures

split_metal_salts(*, logging=False) bool | List[Tuple[int, int]]

Split connected S-metal/lanthanides/actinides salts to cation/anion pairs.

Parameters:

logging – return deleted bonds list.

property sssr: Tuple[Tuple[int, ...], ...]

Smallest Set of Smallest Rings. Special bonds ignored.

Based on idea of PID matrices from: Lee, C. J., Kang, Y.-M., Cho, K.-H., & No, K. T. (2009). A robust method for searching the smallest set of smallest rings with a path-included distance matrix. Proceedings of the National Academy of Sciences of the United States of America, 106(41), 17355–17358. https://doi.org/10.1073/pnas.0813040106

Returns:

rings atoms numbers

standardize(*, logging=False, ignore=True, fix_tautomers=True, _fix_stereo=True) bool | List[Tuple[Tuple[int, ...], int, str]]

Standardize functional groups. Return True if any non-canonical group found.

Parameters:
  • fix_tautomers – convert tautomers to canonical forms.

  • logging – return list of fixed atoms with matched rules.

  • ignore – ignore standardization bugs.

standardize_charges(*, logging=False, prepare_molecule=True, _fix_stereo=True) bool | List[int]

Set canonical positions of charges in heterocycles and ferrocenes.

Parameters:
  • logging – return list of changed atoms.

  • prepare_molecule – do thiele procedure.

sticky_smiles(left: int, right: int = None, *, remove_left: bool = False, remove_right: bool = False, tries: int = 10)

Generate smiles with fixed left and optionally right terminal atoms. Note: Produce expected results only with acyclic terminal atoms.

Parameters:
  • remove_left – drop terminal atom and corresponding bond

  • remove_right – drop terminal atom and corresponding bond

  • tries – number of attempts to generate smiles

substructure(atoms: Iterable[int], *, as_query: bool = False, recalculate_hydrogens=True, skip_neighbors_marks=False, skip_hybridizations_marks=False, skip_hydrogens_marks=False, skip_rings_sizes_marks=False, skip_heteroatoms_marks=False) MoleculeContainer | QueryContainer

Create substructure containing atoms from atoms list.

For molecules in Thiele form, the resulting Molecule substructure has an invalidated internal state: implicit hydrogen marks will not be set for atoms in aromatic rings. Call kekule() and thiele() in sequence to fix the marks.

Parameters:
  • atoms – list of atoms numbers of substructure

  • as_query – return Query object based on graph substructure

  • recalculate_hydrogens – calculate implicit H count in substructure

  • skip_neighbors_marks – Don’t set neighbors count marks on substructured queries

  • skip_hybridizations_marks – Don’t set hybridizations marks on substructured queries

  • skip_hydrogens_marks – Don’t set hydrogens count marks on substructured queries

  • skip_rings_sizes_marks – Don’t set rings_sizes marks on substructured queries

  • skip_heteroatoms_marks – Don’t set heteroatoms count marks

property tetrahedrons: Tuple[int, ...]

Carbon sp3 atoms numbers.

thiele(*, fix_tautomers=True) bool

Convert structure to aromatic form (Huckel rule ignored). Return True if found any kekule ring. Also marks atoms as aromatic.

Parameters:

fix_tautomers – try to fix condensed rings with pyrroles. N1C=CC2=NC=CC2=C1>>N1C=CC2=CN=CC=C12

total_hydrogens(n: int) int

Number of hydrogen atoms connected to atom.

Take into account any type of bonds with hydrogen atoms.

union(other: MoleculeContainer, *, remap: bool = False, copy: bool = True) MoleculeContainer

Merge Graphs into one.

Parameters:
  • remap – if atom numbers collide, remap the other graph's atoms; otherwise raise an exception.

  • copy – keep original structure and return new object

classmethod unpack(data: bytes | memoryview, /, *, compressed=True, _return_pack_length=False) MoleculeContainer

Unpack from compressed bytes.

Parameters:

compressed – decompress data before processing.

view3d(index: int = 0, width='600px', height='400px')

Jupyter widget for 3D visualization.

Parameters:
  • index – index of conformer

  • width – widget width

  • height – widget height

class chython.containers.QueryBond(order: int | List[int] | Set[int] | Tuple[int, ...], in_ring: bool | None = None)
copy() QueryBond
classmethod from_bond(bond)
property in_ring: bool | None
property order: Tuple[int, ...]
class chython.containers.QueryContainer
add_atom(atom: Query | Element | int | str, *args, neighbors: int | List[int] | Tuple[int, ...] | None = None, hybridization: int | List[int] | Tuple[int, ...] | None = None, hydrogens: int | List[int] | Tuple[int, ...] | None = None, rings_sizes: int | List[int] | Tuple[int, ...] | None = None, heteroatoms: int | List[int] | Tuple[int, ...] | None = None, masked: bool = False, **kwargs)

new atom addition

add_bond(n, m, bond: QueryBond | Bond | int | Tuple[int, ...])

Add bond.

atom(n: int) Atom
atoms() Iterator[Tuple[int, Atom]]

iterate over all atoms

property atoms_count: int
property atoms_numbers: Iterator[int]
property atoms_order: Dict[int, int]

Morgan like algorithm for graph nodes ordering

Returns:

dict of atom-order pairs

property atoms_rings: Dict[int, Tuple[Tuple[int, ...]]]

Dict of atoms rings which contains it.

property atoms_rings_sizes: Dict[int, Tuple[int, ...]]

Sizes of rings containing atom.

bond(n: int, m: int) Bond
bonds() Iterator[Tuple[int, int, Bond]]

iterate over all bonds

property bonds_count: int
clean_stereo()

Remove stereo data.

property connected_components: Tuple[Tuple[int, ...], ...]

Isolated components of single graph. E.g. salts as ion pair.

property connected_components_count: int

Number of components in graph

copy() QueryContainer

copy of graph

property cumulenes: Tuple[Tuple[int, ...], ...]

Alkenes, allenes and cumulenes atoms numbers.

enumerate_queries(*, enumerate_marks: bool = False)

Enumerate complex queries into multiple simple ones. For example [N,O]-C into NC and OC.

Parameters:

enumerate_marks – enumerate multiple marks to separate queries

flush_cache()
get_automorphism_mapping() Iterator[Dict[int, int]]

Iterator of all possible automorphism mappings.

get_mapping(other: Container, **kwargs)

Get self to other Molecule or Query substructure mapping generator.

Parameters:
  • other – Molecule or Query

  • automorphism_filter – Skip matches to the same atoms.

  • searching_scope – substructure atoms list to localize isomorphism.

has_atom(n: int) bool
has_bond(n: int, m: int) bool
property int_adjacency: Dict[int, Dict[int, int]]

Adjacency with integer-coded bonds.

is_automorphic()

Test for automorphism symmetry of graph.

is_equal(other, /) bool

Test self is same structure as other

is_ring_bond(n: int, m: int, /) bool

Check is bond in any ring.

is_substructure(other, /) bool

Test self is substructure of other

property not_special_connectivity: Dict[int, Set[int]]

Graph connectivity without special bonds.

remap(mapping: Dict[int, int], *, copy=False) QueryContainer

Change atom numbers

Parameters:
  • mapping – mapping of old numbers to the new

  • copy – keep original graph

property ring_atoms

Atoms in rings. Not SSSR based fast algorithm.

property rings_count: int

SSSR rings count. Ignored rings with special bonds.

property skin_graph: Dict[int, Set[int]]

Graph without terminal atoms. Only rings and linkers

property smiles_atoms_order: Tuple[int, ...]

Atoms order in canonic SMILES.

property sssr: Tuple[Tuple[int, ...], ...]

Smallest Set of Smallest Rings. Special bonds ignored.

Based on idea of PID matrices from: Lee, C. J., Kang, Y.-M., Cho, K.-H., & No, K. T. (2009). A robust method for searching the smallest set of smallest rings with a path-included distance matrix. Proceedings of the National Academy of Sciences of the United States of America, 106(41), 17355–17358. https://doi.org/10.1073/pnas.0813040106

Returns:

rings atoms numbers

property tetrahedrons: Tuple[int, ...]

Carbon sp3 atoms numbers.

union(other: QueryContainer, *, remap: bool = False, copy: bool = True) QueryContainer

Merge Graphs into one.

Parameters:
  • remap – if atom numbers collide, remap the other graph's atoms; otherwise raise an exception.

  • copy – keep original structure and return new object

class chython.containers.ReactionContainer(reactants: Iterable[MoleculeContainer] = (), products: Iterable[MoleculeContainer] = (), reagents: Iterable[MoleculeContainer] = (), meta: Dict | None = None, name: str | None = None)

Reaction storage. Contains reactants, products and reagents lists.

Reaction storage is hashable and comparable, based on the reaction's unique signature (SMILES).

New reaction object creation

Parameters:
  • reactants – list of MoleculeContainers in left side of reaction

  • products – right side of reaction. see reactants

  • reagents – middle side of reaction: solvents, catalysts, etc. see reactants

  • meta – dictionary of metadata. like DTYPE-DATUM in RDF

canonicalize(*, fix_mapping: bool = True, logging=False, fix_tautomers=True) bool | List[Tuple[int, Tuple[int, ...], int, str]]

Convert molecules to canonical forms of functional groups and aromatic rings without explicit hydrogens. Return True if in any molecule found not canonical group.

Parameters:
  • fix_mapping – Search AAM errors of functional groups.

  • logging – return log from molecules with index of molecule. Otherwise, return True if these groups found in any molecule.

  • fix_tautomers – convert tautomers to canonical forms.

check_valence() List[Tuple[int, Tuple[int, ...]]]

Check valences of all atoms of all molecules.

Works only on molecules with aromatic rings in Kekule form.

Returns:

list of invalid molecules with invalid atoms lists

clean2d()

Recalculate 2d coordinates

clean_isotopes() bool

Clean isotope marks for all molecules in reaction. Returns True if in any molecule found isotope.

clean_stereo()

Remove stereo data

compose() CGRContainer

Get CGR of reaction

Reagents will be presented as unchanged molecules.

Returns:

CGRContainer

contract_ions() bool

Contract ions into salts (Molecules with disconnected components). Note: works only for unambiguous cases. e.g. equal anions/cations and different or equal cations/anions.

Return True if any ions contracted.

copy() ReactionContainer

Get copy of object

depict(*, width=None, height=None, clean2d: bool = True) str

Depict reaction in SVG format.

Parameters:
  • width – set svg width param. by default auto-calculated.

  • height – set svg height param. by default auto-calculated.

  • clean2d – calculate coordinates if necessary.

explicify_hydrogens() int

Add explicit hydrogens to atoms

Returns:

number of added atoms

fix_groups_mapping(*, logging: bool = False) bool | List[Tuple[str, Tuple[int, ...]]]

Fix atom-to-atom mapping of some functional groups. Return True if found AAM errors.

fix_mapping(*, logging: bool = False) bool | List[Tuple[int, str, Tuple[int, ...]]]

Fix mapping by using loaded rules.

fix_positions()

Fix coordinates of molecules in reaction

flush_cache()
implicify_hydrogens() int

Remove explicit hydrogens if possible.

Returns:

number of removed hydrogens.

kekule(*, buffer_size=7) bool

Convert structures to kekule form. Return True if in any molecule found aromatic ring

Parameters:

buffer_size – number of attempts of pyridine form searching.

property meta: Dict

Dictionary of metadata. Like DTYPE-DATUM in RDF

molecules() Iterator[MoleculeContainer]

Iterator of all reaction molecules

property name: str
pack(*, compressed=True, check=True)

Pack into compressed bytes.

Note:
  • Same restrictions as in molecules pack.

  • reactants, reagents and products should contain less than 256 molecules.

Format specification:

Big endian bytes order
8 bit - header byte = 0x01 (current format specification)
8 bit - reactants count
8 bit - reagents count
8 bit - products count
x bit - concatenated molecules packs
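
Reading the four header bytes of this layout back can be sketched with the standard library alone. The helper `parse_reaction_header` is hypothetical and works on an uncompressed pack; a pack produced with compressed=True would have to be decompressed first.

```python
from typing import Tuple


def parse_reaction_header(data: bytes) -> Tuple[int, int, int]:
    """Return (reactants, reagents, products) counts from an uncompressed pack."""
    if len(data) < 4:
        raise ValueError('pack is shorter than the 4-byte header')
    version, reactants, reagents, products = data[:4]
    if version != 0x01:
        raise ValueError(f'unsupported format version: {version}')
    return reactants, reagents, products


# Header for 2 reactants, 0 reagents, 1 product; the molecule packs that would
# follow are replaced by placeholder bytes in this sketch.
raw = bytes([0x01, 2, 0, 1]) + b'\x00' * 8
counts = parse_reaction_header(raw)
```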

Parameters:
  • compressed – return zlib-compressed pack.

  • check – check molecules for format restrictions.

classmethod pack_len(data: bytes, /, *, compressed=True) Tuple[List[int], List[int], List[int]]

Returns reactants, reagents, products molecules atoms count in reaction pack.

property products: Tuple[MoleculeContainer, ...]
property reactants: Tuple[MoleculeContainer, ...]
property reagents: Tuple[MoleculeContainer, ...]
remove_reagents(*, keep_reagents: bool = False, mapping: bool = True) bool

Move molecules that are not reactants to the reagents list. Reagents are molecules whose atoms are not present in products. The mapping-based approach removes molecules without a reaction center. The rule-based approach removes molecules that are equal in reactants and products, as well as predefined reagents.

Parameters:
  • mapping – use atom-to-atom mapping to detect reagents, otherwise use predefined list of common reagents.

  • keep_reagents – delete reagents if False

Return True if any reagent found.

reset_mapping(*, return_score: bool = False, multiplier=1.75, keep_reactants_numbering=False) bool | float

Do atom-to-atom mapping. Return True if mapping changed.

standardize(*, fix_mapping: bool = True, logging=False, fix_tautomers=True) bool | List[Tuple[int, Tuple[int, ...], int, str]]

Fix functional groups representation. Return True if in any molecule fixed group.

Deprecated method. Use canonicalize directly.

Parameters:
  • fix_mapping – Search AAM errors of functional groups.

  • logging – return log from molecules with index of molecule. Otherwise, return True if these groups found in any molecule.

  • fix_tautomers – convert tautomers to canonical forms.

thiele(*, fix_tautomers=True) bool

Convert structures to aromatic form. Return True if in any molecule found kekule ring

Parameters:

fix_tautomers – convert tautomers to canonical forms.

classmethod unpack(data: bytes, /, *, compressed=True) ReactionContainer

Unpack from compressed bytes.

Parameters:

compressed – decompress data before processing.