chython.containers package¶

Data classes.

class chython.containers.Bond(order: int)¶

copy() → Bond¶

classmethod from_bond(bond)¶

property in_ring: bool¶

property order: int¶

class chython.containers.MoleculeContainer¶

add_atom(atom: Element | int | str, *args, charge=0, is_radical=False, xy: Tuple[float, float] = (0.0, 0.0), _skip_hydrogen_calculation=False, **kwargs)¶: Add new atom.

add_atom_stereo(n: int, env: Tuple[int, ...], mark: bool, *, clean_cache=True)¶

Add stereo data for a specified traversal order (bypass) of neighbors. Use it for tetrahedrons or allenes.

Parameters:

n – number of the tetrahedron atom or of the central atom of an allene.
env – numbers of neighboring atoms in the specified traversal order.
mark – clockwise or anticlockwise traversal.

See <https://www.daylight.com/dayhtml/doc/theory/theory.smiles.html> and <http://opensmiles.org/opensmiles.html>

add_bond(n, m, bond: Bond | int, *, _skip_hydrogen_calculation=False)¶

Connect atoms with bonds.

For Thiele forms of a molecule, this invalidates the internal state. Implicit hydrogen marks will not be set for atoms in aromatic rings. Call kekule() and then thiele() to fix the marks.

add_cis_trans_stereo(n: int, m: int, n1: int, n2: int, mark: bool, *, clean_cache=True)¶

Add stereo data to cis-trans double bonds (not allenes).

n1/n=m/n2

Parameters:

n – number of the starting atom of the double bond chain (alkenes or cumulenes)
m – number of the ending atom of the double bond chain (alkenes or cumulenes)
n1 – number of neighboring atom of starting atom
n2 – number of neighboring atom of ending atom
mark – cis or trans

See <https://www.daylight.com/dayhtml/doc/theory/theory.smiles.html> and <http://opensmiles.org/opensmiles.html>

add_wedge(n: int, m: int, mark: int, *, clean_cache=True)¶

Add stereo data by wedge notation of bonds. Use it for tetrahedrons or allenes.

Parameters:

n – number of the atom at which the wedge bond starts
m – number of the atom at which the wedge bond ends
mark – up bond is 1, down is -1

adjacency_matrix(set_bonds=False, /)¶

Adjacency matrix of Graph.

Parameters:: set_bonds – if True set bond orders instead of 1.
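The effect of set_bonds can be illustrated with a small library-independent sketch. The helper name `adjacency_matrix`, the `(n, m, order)` triples and the nested-list return type are assumptions made for illustration; they are not chython's implementation or return type.

```python
def adjacency_matrix(atoms, bonds, set_bonds=False):
    """Build a symmetric matrix from (n, m, order) triples.

    atoms: ordered list of atom numbers; bonds: list of (n, m, order).
    Hypothetical helper, not part of chython.
    """
    index = {n: i for i, n in enumerate(atoms)}  # atom number -> row/column
    size = len(atoms)
    matrix = [[0] * size for _ in range(size)]
    for n, m, order in bonds:
        value = order if set_bonds else 1  # bond order instead of 1 on request
        matrix[index[n]][index[m]] = value
        matrix[index[m]][index[n]] = value
    return matrix


# Acetic acid CC(=O)O with atoms numbered 1-4.
atoms = [1, 2, 3, 4]
bonds = [(1, 2, 1), (2, 3, 2), (2, 4, 1)]
plain = adjacency_matrix(atoms, bonds)
with_orders = adjacency_matrix(atoms, bonds, set_bonds=True)
```

Only the entry for the C=O bond differs between the two matrices: it is 1 in `plain` and 2 in `with_orders`.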

property aromatic_rings: Tuple[Tuple[int, ...], ...]¶: Aromatic rings atoms numbers

atom(n: int) → Atom¶

atoms() → Iterator[Tuple[int, Atom]]¶: iterate over all atoms

property atoms_count: int¶

property atoms_numbers: Iterator[int]¶

property atoms_order: Dict[int, int]¶

Morgan-like algorithm for graph node ordering

Returns:: dict of atom-order pairs

property atoms_rings: Dict[int, Tuple[Tuple[int, ...]]]¶: Dict mapping each atom to the rings that contain it.

property atoms_rings_sizes: Dict[int, Tuple[int, ...]]¶: Sizes of rings containing atom.

augmented_substructure(atoms: Iterable[int], deep: int = 1, **kwargs) → MoleculeContainer¶

Create substructure containing atoms and their neighbors

Parameters:

atoms – list of core atoms in graph
deep – number of bonds between atoms and neighbors

augmented_substructures(atoms: Iterable[int], deep: int = 1, **kwargs) → List[MoleculeContainer]¶

Create list of substructures containing atoms and their neighbors

Parameters:

atoms – list of core atoms in graph
deep – number of bonds between atoms and neighbors

Returns:

list of graphs containing the core atoms, then atoms + first shell of neighbors, atoms + first + second shells, and so on, up to deep or until no new nodes are available
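The growing sequence of atom sets described above can be sketched as a breadth-first expansion over a plain adjacency dict. The function name `augmented_atom_sets` and the dict-of-sets graph are illustrative assumptions; the real method returns MoleculeContainer objects rather than sets.

```python
def augmented_atom_sets(adjacency, atoms, deep=1):
    """Return [core, core + 1st shell, core + 1st + 2nd shell, ...].

    Hypothetical helper operating on {atom: set(neighbors)}.
    """
    current = set(atoms)
    result = [set(current)]
    frontier = set(current)
    for _ in range(deep):
        # neighbors of the last added shell that are not included yet
        frontier = {m for n in frontier for m in adjacency[n]} - current
        if not frontier:  # no new nodes available: stop early
            break
        current |= frontier
        result.append(set(current))
    return result


# Linear chain 1-2-3-4-5, core atom 3.
adjacency = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
shells = augmented_atom_sets(adjacency, [3], deep=5)
```

Although deep=5 is requested, the chain is exhausted after two shells, so only three sets are produced.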

bond(n: int, m: int) → Bond¶

bonds() → Iterator[Tuple[int, int, Bond]]¶: iterate over all bonds

property bonds_count: int¶

property brutto: Dict[str, int]¶: Counted atoms dict

calculate_cis_trans_from_2d(*, clean_cache=True)¶: Calculate cis-trans stereo bonds from given 2D coordinates. Unusable for SMILES and InChI.

canonicalize(*, fix_tautomers=True, keep_kekule=False, logging=False, ignore=True) → bool | List[Tuple[Tuple[int, ...], int, str]]¶

Convert molecule to canonical forms of functional groups and aromatic rings without explicit hydrogens.

Parameters:

logging – return log.
ignore – ignore standardization bugs.
fix_tautomers – convert tautomers to canonical forms.
keep_kekule – return kekule form.

check_valence() → List[int]¶

Check valences of all atoms.

Returns:: list of invalid atoms

clean2d()¶: Calculate 2D layout of the graph. The JS implementation from https://pubs.acs.org/doi/10.1021/acs.jcim.7b00425 is used.

clean_isotopes() → bool¶: Clean isotope marks from molecule. Return True if any isotope found.

clean_stereo()¶: Remove stereo data.

compose(other: MoleculeContainer) → CGRContainer¶: Compose 2 graphs to CGR.

property connected_components: Tuple[Tuple[int, ...], ...]¶: Isolated components of single graph. E.g. salts as ion pair.
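A minimal sketch of how isolated components can be found on a plain adjacency dict, using a depth-first search. The function and the dict-of-sets representation are assumptions for illustration only and do not reproduce chython's internal algorithm.

```python
def connected_components(adjacency):
    """Return a tuple of atom-number tuples, one per isolated component."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        component = []
        stack = [start]
        seen.add(start)
        while stack:
            n = stack.pop()
            component.append(n)
            for m in adjacency[n]:
                if m not in seen:
                    seen.add(m)
                    stack.append(m)
        components.append(tuple(sorted(component)))
    return tuple(components)


# Sodium acetate as an ion pair: acetate atoms 1-4, sodium atom 5.
adjacency = {1: {2}, 2: {1, 3, 4}, 3: {2}, 4: {2}, 5: set()}
components = connected_components(adjacency)
```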

property connected_components_count: int¶: Number of components in graph

copy() → MoleculeContainer¶: copy of graph

property cumulenes: Tuple[Tuple[int, ...], ...]¶: Alkenes, allenes and cumulenes atoms numbers.

delete_atom(n: int, *, _skip_hydrogen_calculation=False)¶

Remove atom.

For Thiele forms of a molecule, this invalidates the internal state. Implicit hydrogen marks will not be set for atoms in aromatic rings. Call kekule() and then thiele() to fix the marks.

delete_bond(n: int, m: int, *, _skip_hydrogen_calculation=False)¶

Disconnect atoms.

For Thiele forms of a molecule, this invalidates the internal state. Implicit hydrogen marks will not be set for atoms in aromatic rings. Call kekule() and then thiele() to fix the marks.

depict(*, width=None, height=None, clean2d: bool = True, _embedding=False) → str¶

Depict molecule in SVG format.

Parameters:

width – set svg width param. by default auto-calculated.
height – set svg height param. by default auto-calculated.
clean2d – calculate coordinates if necessary.

depict3d(index: int = 0) → str¶

Get X3DOM XML string.

Parameters:: index – index of conformer

enumerate_charged_forms(*, deep: int = 4, limit: int = 1000)¶

Enumerate protonated and deprotonated ions. Use on neutralized molecules.

Parameters:

deep – Maximum amount of added or removed protons.
limit – Maximum amount of generated structures.

enumerate_charged_tautomers(*, prepare_molecules=True, partial=False, increase_aromaticity=True, keep_sugars=True, heteroarenes=True, keto_enol=True, deep: int = 4, limit: int = 1000)¶

Enumerate tautomers and protonated-deprotonated forms. Better to use on neutralized non-ionic molecules.

See enumerate_tautomers and enumerate_charged_forms params description.

enumerate_kekule()¶: Enumerate all possible kekule forms of molecule.

enumerate_tautomers(*, prepare_molecules=True, zwitter=True, partial=False, increase_aromaticity=True, keep_sugars=True, heteroarenes=True, keto_enol=True, limit: int = 1000) → Iterator[MoleculeContainer]¶

Enumerate all possible tautomeric forms of molecule.

Parameters:

prepare_molecules – Standardize structures for correct processing
zwitter – Do zwitter-ions enumeration
partial – Allow OC=CC=C>>O=CCC=C or O=CC=CC>>OC=C=CC
increase_aromaticity – prevent aromatic ring destruction
keep_sugars – prevent carbonyl moving in sugars
heteroarenes – enumerate heteroarenes
keto_enol – enumerate keto-enols
limit – Maximum attempts count

environment(atom: int, include_bond: bool = True, include_atom: bool = True) → Tuple[Tuple[int, Bond, Element] | Tuple[int, Element] | Tuple[int, Bond] | int, ...]¶

Return, for each neighbor of atom, a group of (atom_number, bond, atom), (atom_number, bond), (atom_number, atom), or just the neighbor's atom number, depending on include_bond and include_atom.

Parameters:

atom – number
include_atom – include atom object
include_bond – include bond object
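The four return shapes can be illustrated with a standalone sketch. Here atoms are plain element symbols and bonds are integer orders stored in nested dicts; these stand-ins and the helper itself are assumptions for illustration, whereas the real method returns Element and Bond objects.

```python
def environment(atoms, bonds, atom, include_bond=True, include_atom=True):
    """Mimic the four return shapes on plain dicts (hypothetical helper)."""
    if include_atom and include_bond:
        return tuple((m, b, atoms[m]) for m, b in bonds[atom].items())
    if include_atom:
        return tuple((m, atoms[m]) for m in bonds[atom])
    if include_bond:
        return tuple(bonds[atom].items())
    return tuple(bonds[atom])


# Acetic acid CC(=O)O: symbols and bond orders keyed by atom number.
atoms = {1: 'C', 2: 'C', 3: 'O', 4: 'O'}
bonds = {1: {2: 1}, 2: {1: 1, 3: 2, 4: 1}, 3: {2: 2}, 4: {2: 1}}

full = environment(atoms, bonds, 2)
only_atoms = environment(atoms, bonds, 2, include_bond=False)
only_bonds = environment(atoms, bonds, 2, include_atom=False)
numbers = environment(atoms, bonds, 2, include_bond=False, include_atom=False)
```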

explicify_hydrogens(*, start_map=None, _return_map=False, _fix_stereo=True) → int | List[Tuple[int, int]]¶

Add explicit hydrogens to atoms.

Returns:: number of added atoms

explicit_hydrogens(n: int) → int¶

Number of explicit hydrogen atoms connected to atom.

Take into account any type of bonds with hydrogen atoms.

fix_resonance(*, logging=False, _fix_stereo=True) → bool | List[int]¶

Transform biradical or dipole resonance structures into neutral form. Return True if structure form changed.

Parameters:: logging – return list of changed atoms.

fix_stereo()¶: Reset stereo marks.

flush_cache()¶

flush_stereo_cache()¶: Flush chiral morgan and chiral centers cache.

get_automorphism_mapping() → Iterator[Dict[int, int]]¶: Iterator of all possible automorphism mappings.

get_fast_mapping(other: MoleculeContainer) → Dict[int, int] | None¶: Get self to other fast (suboptimal) structure mapping. Only one possible atoms mapping returned. Effective only for big molecules.

get_mapping(other: Container, **kwargs)¶

Get self to other Molecule substructure mapping generator.

Parameters:

other – Molecule
automorphism_filter – Skip matches to the same atoms.
searching_scope – substructure atoms list to localize isomorphism.

get_mcs_mapping(other: MoleculeContainer, /, *, limit=10000) → Iterator[Dict[int, int]]¶

Find maximum common substructure. Based on clique searching in product graph.

Parameters:: limit – limit tested cliques

has_atom(n: int) → bool¶

has_bond(n: int, m: int) → bool¶

heteroatoms(n: int) → int¶: Number of neighboring heteroatoms (neither carbon nor hydrogen), excluding those connected by any-bonds.

hybridization(n: int) → int¶

Atom hybridization.

1 - if atom has zero or only single bonded neighbors.
2 - if atom has only one double bonded neighbor and any amount of single bonded neighbors.
3 - if atom has one triple bonded neighbor and any amount of double and single bonded neighbors, or two or more double bonded neighbors and any amount of single bonded neighbors.
4 - if atom is in an aromatic ring.
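The coding rule above can be written as a short decision function. This is a sketch of the rule as stated, not chython's code; the input (a list of integer bond orders to neighbors plus an `aromatic` flag) is an assumed representation.

```python
def hybridization(bond_orders, aromatic=False):
    """Return the hybridization code 1-4 for one atom.

    bond_orders: list of integer bond orders (1, 2 or 3) to neighbors.
    aromatic: True if the atom is in an aromatic ring.
    """
    if aromatic:
        return 4
    doubles = bond_orders.count(2)
    triples = bond_orders.count(3)
    if triples or doubles >= 2:
        return 3
    if doubles == 1:
        return 2
    return 1  # no neighbors or only single bonds


methane_carbon = hybridization([1, 1, 1, 1])
carbonyl_carbon = hybridization([2, 1, 1])
nitrile_carbon = hybridization([3, 1])
allene_center = hybridization([2, 2])
benzene_carbon = hybridization([1], aromatic=True)
```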

implicify_hydrogens(*, logging=False, _fix_stereo=True) → int | Tuple[int, List[int]]¶

Remove explicit hydrogen if possible. Return number of removed hydrogens. Works only with Kekule forms of aromatic structures. Keeps isotopes of hydrogen.

Parameters:: logging – return list of changed atoms.

implicit_hydrogens(n: int) → int | None¶

Number of implicit hydrogen atoms connected to atom.

Returns None if the count is ambiguous.

property int_adjacency: Dict[int, Dict[int, int]]¶: Adjacency with integer-coded bonds.

is_automorphic()¶: Test for automorphism symmetry of graph.

is_equal(other, /) → bool¶: Test self is same structure as other

property is_radical: bool¶: True if at least one atom is radical

is_ring_bond(n: int, m: int, /) → bool¶: Check whether the bond is in any ring.

is_substructure(other, /) → bool¶: Test self is substructure of other

kekule(*, buffer_size=7) → bool¶

Convert structure to kekule form. Return True if found any aromatic ring. Set implicit hydrogen count and hybridization marks on atoms.

Only one of the possible arrangements of double/single bonds will be set. To enumerate bond positions, use enumerate_kekule.

Parameters:: buffer_size – number of attempts of pyridine form searching.

linear_bit_set(min_radius: int = 1, max_radius: int = 4, length: int = 1024, number_active_bits: int = 2, number_bit_pairs: int = 4) → Set[int]¶

Transform structure into set of indexes of True-valued features.

Parameters:

min_radius – minimal length of fragments
max_radius – maximum length of fragments
length – bit string’s length. Should be power of 2
number_active_bits – number of active bits for each hashed tuple
number_bit_pairs – maximum number of repeats of the same fragment that are counted in the hashed fingerprint: if a fragment occurs in the molecule at least this many times, only this many occurrences are activated. Set to 0 to take all repeating fragments into account.
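One common way to turn integer hashes into bit indexes, and a plausible reason why length should be a power of 2, is to slice each hash into consecutive groups of log2(length) bits. The sketch below shows that folding scheme under this assumption; `fold_hashes` is a hypothetical helper and may differ from what chython does internally.

```python
from math import log2


def fold_hashes(hashes, length=1024, number_active_bits=2):
    """Map each integer hash to `number_active_bits` indexes below `length`."""
    mask = length - 1          # equals a modulo only when length is a power of 2
    shift = int(log2(length))  # number of hash bits consumed per active bit
    active = set()
    for h in hashes:
        for _ in range(number_active_bits):
            active.add(h & mask)  # take the lowest `shift` bits as an index
            h >>= shift           # move on to the next group of bits
    return active


# A hash whose two lowest 10-bit groups are 5 and 3.
example_hash = (1 << 20) | (3 << 10) | 5
bits = fold_hashes([example_hash], length=1024, number_active_bits=2)
```

With a length that is not a power of 2, `h & mask` would no longer cover every index from 0 to length - 1, which is why the mask trick depends on that restriction.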

linear_fingerprint(min_radius: int = 1, max_radius: int = 4, length: int = 1024, number_active_bits: int = 2, number_bit_pairs: int = 4)¶

Transform structures into array of binary features.

Parameters:

min_radius – minimal length of fragments
max_radius – maximum length of fragments
length – bit string’s length. Should be power of 2
number_active_bits – number of active bits for each hashed tuple
number_bit_pairs – maximum number of repeats of the same fragment that are counted in the hashed fingerprint: if a fragment occurs in the molecule at least this many times, only this many occurrences are activated. Set to 0 to take all repeating fragments into account.

Returns:

array(n_features)

linear_hash_set(min_radius: int = 1, max_radius: int = 4, number_bit_pairs: int = 4) → Set[int]¶

Transform structure into set of integer hashes of fragments with count information.

Parameters:

min_radius – minimal length of fragments
max_radius – maximum length of fragments
number_bit_pairs – maximum number of repeats of the same fragment that are counted in the hashed fingerprint: if a fragment occurs in the molecule at least this many times, only this many occurrences are activated. Set to 0 to take all repeating fragments into account.
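The capping behaviour of number_bit_pairs can be sketched as follows. Fragments are represented as tuples of element symbols, and each counted occurrence is kept as a (fragment, index) pair instead of being hashed to an integer; both choices, and the helper name, are assumptions made to keep the example deterministic.

```python
from collections import Counter


def counted_fragments(fragments, number_bit_pairs=4):
    """Return (fragment, occurrence_index) pairs, capped per fragment.

    number_bit_pairs=0 disables the cap and keeps every repeat.
    """
    result = set()
    for fragment, count in Counter(fragments).items():
        limit = count if number_bit_pairs == 0 else min(count, number_bit_pairs)
        for i in range(limit):
            result.add((fragment, i))
    return result


# Six C-C fragments and one C-O fragment.
fragments = [('C', 'C')] * 6 + [('C', 'O')]
capped = counted_fragments(fragments, number_bit_pairs=4)
uncapped = counted_fragments(fragments, number_bit_pairs=0)
```

With the default cap of 4, the six C-C repeats contribute four features; with 0 they contribute all six.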

linear_hash_smiles(min_radius: int = 1, max_radius: int = 4, number_bit_pairs: int = 4) → Dict[int, List[str]]¶

Transform structure into dict of integer hashes of fragments with count information and corresponding fragment SMILES.

Parameters:

min_radius – minimal length of fragments
max_radius – maximum length of fragments
number_bit_pairs – maximum number of repeats of the same fragment that are counted in the hashed fingerprint: if a fragment occurs in the molecule at least this many times, only this many occurrences are activated. Set to 0 to take all repeating fragments into account.

linear_smiles_hash(min_radius: int = 1, max_radius: int = 4, number_bit_pairs: int = 4) → Dict[str, List[int]]¶

Transform structure into dict of fragment SMILES and list of corresponding integer hashes of fragments.

Parameters:

min_radius – minimal length of fragments
max_radius – maximum length of fragments
number_bit_pairs – maximum number of repeats of the same fragment that are counted in the hashed fingerprint: if a fragment occurs in the molecule at least this many times, only this many occurrences are activated. Set to 0 to take all repeating fragments into account.

property meta: Dict¶

property molecular_charge: int¶: Total charge of molecule

property molecular_mass: float¶

morgan_bit_set(min_radius: int = 1, max_radius: int = 4, length: int = 1024, number_active_bits: int = 2) → Set[int]¶

Transform structures into set of indexes of True-valued features.

Parameters:

min_radius – minimal radius of EC
max_radius – maximum radius of EC
length – bit string’s length. Should be power of 2
number_active_bits – number of active bits for each hashed tuple

morgan_fingerprint(min_radius: int = 1, max_radius: int = 4, length: int = 1024, number_active_bits: int = 2)¶

Transform structures into array of binary features. Morgan fingerprints. Similar to the RDKit implementation.

Parameters:

min_radius – minimal radius of EC
max_radius – maximum radius of EC
length – bit string’s length. Should be power of 2
number_active_bits – number of active bits for each hashed tuple

Returns:

array(n_features)

morgan_hash_set(min_radius: int = 1, max_radius: int = 4) → Set[int]¶

Transform structures into integer hashes of atoms with EC.

Parameters:

min_radius – minimal radius of EC
max_radius – maximum radius of EC

morgan_hash_smiles(min_radius: int = 1, max_radius: int = 4) → Dict[int, List[str]]¶

Transform structures into dictionary of hashes of atoms with EC and corresponding SMILES.

Parameters:

min_radius – minimal radius of EC
max_radius – maximum radius of EC

morgan_smiles_hash(min_radius: int = 1, max_radius: int = 4) → Dict[str, List[int]]¶

Transform structures into dictionary of smiles and corresponding hashes of atoms with EC.

Parameters:

min_radius – minimal radius of EC
max_radius – maximum radius of EC

property name: str¶

neighbors(n: int) → int¶: Number of neighboring atoms, excluding any-bonded ones.

neutralize(*, keep_charge=True, logging=False, _fix_stereo=True) → bool | List[int]¶

Convert organic salts to neutral form if possible. Only one possible form used for charge unbalanced structures.

Parameters:

keep_charge – do partial neutralization to keep total charge of molecule.
logging – return changed atoms list.

property not_special_connectivity: Dict[int, Set[int]]¶: Graph connectivity without special bonds.

pack(*, compressed=True, check=True, version=2, order: List[int] | None = None) → bytes¶

Pack into compressed bytes.

Note:

Less than 4096 atoms supported. Atoms mapping should be in range 1-4095.
Implicit hydrogens count should be in range 0-6 or unspecified.
Isotope shift should be in range -15 - 15 relative to chython.files._mdl.mol.common_isotopes
Atoms neighbors should be in range 0-15

Format V2 specification:

Big endian bytes order
bit - 0x02 (format specification version)
bit - number of atoms
bit - cis/trans stereo block size
Atom block 9 bytes (repeated):
bit - atom number
bit - number of neighbors
bit tetrahedron sign (00 - not stereo, 10 or 11 - has stereo)
bit - allene sign
bit - isotope (00000 - not specified, over = isotope - common_isotope + 16)
bit - atomic number (<=118)
bit - XY float16 coordinates
bit - hydrogens (0-7). Note: 7 == None
bit - charge (charge + 4. possible range -4 - 4)
bit - radical state
Connection table: flatten list of neighbors. neighbors count stored in atom block.
For example CC(=O)O - {1: [2], 2: [1, 3, 4], 3: [2], 4: [2]} >> [2, 1, 3, 4, 2, 2].
Repeated block (equal to bonds count).
bit - paired 12 bit numbers.
Bonds order block 3 bit per bond zero-padded to full byte at the end.
Cis/trans data block (repeated):
bit - atoms pair
bit - zero padding. in future can be used for extra bond-level stereo, like atropoisomers.
bit - sign
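A few of the V2 field encodings can be illustrated in isolation: the flattened connection table, the pairing of 12-bit numbers in big-endian order, the charge offset and the hydrogens sentinel. The helper names are hypothetical, and the byte layout of the pairs (two consecutive 12-bit numbers per 3 bytes) is an assumption derived from the description above, not chython's packing code.

```python
def flatten_connection_table(neighbors):
    """Flatten {atom: [neighbors]} in atom order; counts are stored elsewhere."""
    return [m for n in neighbors for m in neighbors[n]]


def pack_pairs(flat):
    """Pack consecutive pairs of 12-bit numbers into 3 big-endian bytes each."""
    if len(flat) % 2:
        raise ValueError('flattened table must have an even length')
    out = bytearray()
    for a, b in zip(flat[::2], flat[1::2]):
        if not (0 <= a < 4096 and 0 <= b < 4096):
            raise ValueError('atom numbers must fit into 12 bits')
        out += (a << 12 | b).to_bytes(3, 'big')
    return bytes(out)


def encode_charge(charge):
    """Store charge as charge + 4 (allowed range -4..4)."""
    if not -4 <= charge <= 4:
        raise ValueError('charge out of range -4..4')
    return charge + 4


def encode_hydrogens(hydrogens):
    """Store the implicit hydrogens count; 7 stands for None (unspecified)."""
    if hydrogens is None:
        return 7
    if not 0 <= hydrogens <= 6:
        raise ValueError('hydrogens out of range 0..6')
    return hydrogens


# CC(=O)O from the specification above.
table = flatten_connection_table({1: [2], 2: [1, 3, 4], 3: [2], 4: [2]})
packed = pack_pairs(table)
```

The flattened list has twice as many entries as there are bonds, so pairing its entries yields one block per bond, matching the "equal to bonds count" remark.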

Format V3 specification:

Big endian bytes order
bit - 0x03 (format specification version)
Atom block 3 bytes (repeated):
bit - atom entrance flag (always 1)
bit - atomic number (<=118)
bit - hydrogens (0-7). Note: 7 == None
bit - charge (charge + 4. possible range -4 - 4)
bit - radical state
bit padding
bit tetrahedron/allene sign
    (000 - not stereo or unknown, 001 - pure-unknown-enantiomer, 010 or 011 - has stereo)
bit - number of following bonds and CT blocks (0-15)

Bond block 2 bytes (repeated 0-15 times)
bit - negative shift from current atom to connected (e.g. 0x001 = -1 - connected to previous atom)
bit - bond order: 0000 - single, 0001 - double, 0010 - triple, 0011 - aromatic, 0111 - special

Cis-Trans 2 bytes
bit - negative shift from current atom to connected (e.g. 0x001 = -1 - connected to previous atom)
bit - CT sign: 1000 or 1001 - to avoid overlap with bond

V2 format is faster than V3. V3 format doesn’t include isotopes, atom numbers and XY coordinates.

Parameters:

compressed – return zlib-compressed pack.
check – check molecule for format restrictions.
version – format version
order – atom order in V3

classmethod pack_len(data: bytes, /, *, compressed=True) → int¶: Returns atoms count in molecule pack.

remap(mapping: Dict[int, int], *, copy: bool = False) → MoleculeContainer¶

Change atom numbers

Parameters:

mapping – mapping of old numbers to the new
copy – keep original graph

remove_acids(*, logging=False) → bool | List[int]¶

Remove common acids from organic bases salts. Works only for neutral pairs like HA+B. Use neutralize before.

Parameters:: logging – return deleted atoms list.

remove_coordinate_bonds(*, keep_to_terminal=True, _fix_stereo=True) → int¶

Remove coordinate (or hydrogen) bonds marked with bond order 8 (any bond).

Parameters:: keep_to_terminal – Keep any bonds to terminal hydrogens
Returns:: removed bonds count

remove_metals(*, logging=False) → bool | List¶

Remove disconnected S-metals and ammonia.

Parameters:: logging – return deleted atoms list.

property ring_atoms¶: Atoms in rings. Not SSSR based fast algorithm.

property rings_count: int¶: SSSR rings count. Rings with special bonds are ignored.

saturate(neighbors_distances: Dict[int, Dict[int, float]] | None = None, reset_electrons: bool = True, expected_charge: int = 0, expected_radicals_count: int = 0, allow_errors: bool = True, logging: bool = False) → bool | List[str]¶

Saturate molecules with double and triple bonds and charges and radical states to correct valences of atoms. Note: works only with fully explicit hydrogens!

Parameters:

neighbors_distances – If given, the longest bonds can be removed if needed.
reset_electrons – Can change charges and radicals if needed.
expected_charge – Reset charge to given. Works only with reset_electrons=True.
expected_radicals_count – Reset radical atoms count to given. Works only with reset_electrons=True.
allow_errors – allow unbalanced result.
logging – return log.

property skin_graph: Dict[int, Set[int]]¶: Graph without terminal atoms. Only rings and linkers
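One way to obtain a graph of only rings and linkers is to strip terminal atoms repeatedly until none remain. The sketch below does this on a plain adjacency dict; it is an illustration under that assumption and not chython's implementation.

```python
def skin_graph(adjacency):
    """Iteratively remove atoms with at most one neighbor."""
    graph = {n: set(ms) for n, ms in adjacency.items()}  # work on a copy
    terminals = [n for n, ms in graph.items() if len(ms) <= 1]
    while terminals:
        n = terminals.pop()
        if n not in graph:  # already removed via another path
            continue
        for m in graph.pop(n):
            graph[m].discard(n)
            if len(graph[m]) <= 1:  # neighbor became terminal
                terminals.append(m)
    return graph


# Two three-membered rings (1-2-3 and 5-6-7) joined through linker atom 4,
# which also carries a two-atom side chain 8-9.
adjacency = {
    1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4},
    4: {3, 5, 8},
    5: {4, 6, 7}, 6: {5, 7}, 7: {5, 6},
    8: {4, 9}, 9: {8},
}
skin = skin_graph(adjacency)
```

The side chain is peeled off atom by atom, while linker atom 4 stays because it connects the two rings.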

property smiles_atoms_order: Tuple[int, ...]¶: Atoms order in canonic SMILES.

split() → List[MoleculeContainer]¶: Split disconnected structure to connected substructures

split_metal_salts(*, logging=False) → bool | List[Tuple[int, int]]¶

Split connected S-metal/lanthanides/actinides salts to cation/anion pairs.

Parameters:: logging – return deleted bonds list.

property sssr: Tuple[Tuple[int, ...], ...]¶

Smallest Set of Smallest Rings. Special bonds ignored.

Based on idea of PID matrices from: Lee, C. J., Kang, Y.-M., Cho, K.-H., & No, K. T. (2009). A robust method for searching the smallest set of smallest rings with a path-included distance matrix. Proceedings of the National Academy of Sciences of the United States of America, 106(41), 17355–17358. https://doi.org/10.1073/pnas.0813040106

Returns:: rings atoms numbers

standardize(*, logging=False, ignore=True, fix_tautomers=True, _fix_stereo=True) → bool | List[Tuple[Tuple[int, ...], int, str]]¶

Standardize functional groups. Return True if any non-canonical group found.

Parameters:

fix_tautomers – convert tautomers to canonical forms.
logging – return list of fixed atoms with matched rules.
ignore – ignore standardization bugs.

standardize_charges(*, logging=False, prepare_molecule=True, _fix_stereo=True) → bool | List[int]¶

Set canonical positions of charges in heterocycles and ferrocenes.

Parameters:

logging – return list of changed atoms.
prepare_molecule – do thiele procedure.

sticky_smiles(left: int, right: int = None, *, remove_left: bool = False, remove_right: bool = False, tries: int = 10)¶

Generate smiles with fixed left and optionally right terminal atoms. Note: Produce expected results only with acyclic terminal atoms.

Parameters:

remove_left – drop terminal atom and corresponding bond
remove_right – drop terminal atom and corresponding bond
tries – number of attempts to generate smiles

substructure(atoms: Iterable[int], *, as_query: bool = False, recalculate_hydrogens=True, skip_neighbors_marks=False, skip_hybridizations_marks=False, skip_hydrogens_marks=False, skip_rings_sizes_marks=False, skip_heteroatoms_marks=False) → MoleculeContainer | QueryContainer¶

Create substructure containing atoms from atoms list.

For Thiele forms of a molecule, creating a Molecule substructure invalidates its internal state. Implicit hydrogen marks will not be set for atoms in aromatic rings. Call kekule() and then thiele() to fix the marks.

Parameters:

atoms – list of atoms numbers of substructure
as_query – return Query object based on graph substructure
recalculate_hydrogens – calculate implicit H count in substructure
skip_neighbors_marks – Don’t set neighbors count marks on substructured queries
skip_hybridizations_marks – Don’t set hybridizations marks on substructured queries
skip_hydrogens_marks – Don’t set hydrogens count marks on substructured queries
skip_rings_sizes_marks – Don’t set rings_sizes marks on substructured queries
skip_heteroatoms_marks – Don’t set heteroatoms count marks

property tetrahedrons: Tuple[int, ...]¶: Carbon sp3 atoms numbers.

thiele(*, fix_tautomers=True) → bool¶

Convert structure to aromatic form (Huckel rule ignored). Return True if found any kekule ring. Also marks atoms as aromatic.

Parameters:: fix_tautomers – try to fix condensed rings with pyrroles. N1C=CC2=NC=CC2=C1>>N1C=CC2=CN=CC=C12

total_hydrogens(n: int) → int¶

Number of hydrogen atoms connected to atom.

Take into account any type of bonds with hydrogen atoms.

union(other: MoleculeContainer, *, remap: bool = False, copy: bool = True) → MoleculeContainer¶

Merge Graphs into one.

Parameters:

remap – if atom numbers collide, remap the atoms of the other graph; otherwise raise an exception.
copy – keep original structure and return new object
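The role of remap can be sketched on plain adjacency dicts. The renumbering strategy used here (shifting the other graph above the largest existing number) and the exception type are assumptions for illustration; the numbers chython actually assigns and the exception it raises may differ.

```python
def union(left, right, remap=False):
    """Merge two {atom: set(neighbors)} graphs into a new dict."""
    if left.keys() & right.keys():  # atom numbers collide
        if not remap:
            raise ValueError('atom numbers collide; pass remap=True')
        start = max(left) + 1
        mapping = {n: i for i, n in enumerate(sorted(right), start=start)}
        right = {mapping[n]: {mapping[m] for m in ms} for n, ms in right.items()}
    merged = {n: set(ms) for n, ms in left.items()}
    merged.update((n, set(ms)) for n, ms in right.items())
    return merged


# Two two-atom graphs that both use numbers 1 and 2.
first = {1: {2}, 2: {1}}
second = {1: {2}, 2: {1}}
merged = union(first, second, remap=True)

try:
    union(first, second)
    collided = False
except ValueError:
    collided = True
```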

classmethod unpack(data: bytes | memoryview, /, *, compressed=True, _return_pack_length=False) → MoleculeContainer¶

Unpack from compressed bytes.

Parameters:: compressed – decompress data before processing.

view3d(index: int = 0, width='600px', height='400px')¶

Jupyter widget for 3D visualization.

Parameters:

index – index of conformer
width – widget width
height – widget height

class chython.containers.QueryBond(order: int | List[int] | Set[int] | Tuple[int, ...], in_ring: bool | None = None)¶

copy() → QueryBond¶

classmethod from_bond(bond)¶

property in_ring: bool | None¶

property order: Tuple[int, ...]¶

class chython.containers.QueryContainer¶

add_atom(atom: Query | Element | int | str, *args, neighbors: int | List[int] | Tuple[int, ...] | None = None, hybridization: int | List[int] | Tuple[int, ...] | None = None, hydrogens: int | List[int] | Tuple[int, ...] | None = None, rings_sizes: int | List[int] | Tuple[int, ...] | None = None, heteroatoms: int | List[int] | Tuple[int, ...] | None = None, masked: bool = False, **kwargs)¶: new atom addition

add_bond(n, m, bond: QueryBond | Bond | int | Tuple[int, ...])¶: Add bond.

atom(n: int) → Atom¶

atoms() → Iterator[Tuple[int, Atom]]¶: iterate over all atoms

property atoms_count: int¶

property atoms_numbers: Iterator[int]¶

property atoms_order: Dict[int, int]¶

Morgan-like algorithm for graph node ordering

Returns:: dict of atom-order pairs

property atoms_rings: Dict[int, Tuple[Tuple[int, ...]]]¶: Dict mapping each atom to the rings that contain it.

property atoms_rings_sizes: Dict[int, Tuple[int, ...]]¶: Sizes of rings containing atom.

bond(n: int, m: int) → Bond¶

bonds() → Iterator[Tuple[int, int, Bond]]¶: iterate over all bonds

property bonds_count: int¶

clean_stereo()¶: Remove stereo data.

property connected_components: Tuple[Tuple[int, ...], ...]¶: Isolated components of single graph. E.g. salts as ion pair.

property connected_components_count: int¶: Number of components in graph

copy() → QueryContainer¶: copy of graph

property cumulenes: Tuple[Tuple[int, ...], ...]¶: Alkenes, allenes and cumulenes atoms numbers.

enumerate_queries(*, enumerate_marks: bool = False)¶

Enumerate complex queries into multiple simple ones. For example [N,O]-C into NC and OC.

Parameters:: enumerate_marks – enumerate multiple marks to separate queries
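The expansion of a complex query such as [N,O]-C into simple ones is a Cartesian product over the alternatives at each atom. The sketch below shows the idea on a dict of alternative element symbols; the representation and the helper name are illustrative assumptions, not the QueryContainer API.

```python
from itertools import product


def enumerate_alternatives(query_atoms):
    """Yield one {atom_number: symbol} dict per combination of alternatives."""
    numbers = sorted(query_atoms)
    for combo in product(*(query_atoms[n] for n in numbers)):
        yield dict(zip(numbers, combo))


# [N,O]-C: atom 1 is N or O, atom 2 is always C.
simple_queries = list(enumerate_alternatives({1: ('N', 'O'), 2: ('C',)}))
```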

flush_cache()¶

get_automorphism_mapping() → Iterator[Dict[int, int]]¶: Iterator of all possible automorphism mappings.

get_mapping(other: Container, **kwargs)¶

Get self to other Molecule or Query substructure mapping generator.

Parameters:

other – Molecule or Query
automorphism_filter – Skip matches to the same atoms.
searching_scope – substructure atoms list to localize isomorphism.

has_atom(n: int) → bool¶

has_bond(n: int, m: int) → bool¶

property int_adjacency: Dict[int, Dict[int, int]]¶: Adjacency with integer-coded bonds.

is_automorphic()¶: Test for automorphism symmetry of graph.

is_equal(other, /) → bool¶: Test self is same structure as other

is_ring_bond(n: int, m: int, /) → bool¶: Check whether the bond is in any ring.

is_substructure(other, /) → bool¶: Test self is substructure of other

property not_special_connectivity: Dict[int, Set[int]]¶: Graph connectivity without special bonds.

remap(mapping: Dict[int, int], *, copy=False) → QueryContainer¶

Change atom numbers

Parameters:

mapping – mapping of old numbers to the new
copy – keep original graph

property ring_atoms¶: Atoms in rings. Not SSSR based fast algorithm.

property rings_count: int¶: SSSR rings count. Rings with special bonds are ignored.

property skin_graph: Dict[int, Set[int]]¶: Graph without terminal atoms. Only rings and linkers

property smiles_atoms_order: Tuple[int, ...]¶: Atoms order in canonic SMILES.

property sssr: Tuple[Tuple[int, ...], ...]¶

Smallest Set of Smallest Rings. Special bonds ignored.

Based on idea of PID matrices from: Lee, C. J., Kang, Y.-M., Cho, K.-H., & No, K. T. (2009). A robust method for searching the smallest set of smallest rings with a path-included distance matrix. Proceedings of the National Academy of Sciences of the United States of America, 106(41), 17355–17358. https://doi.org/10.1073/pnas.0813040106

Returns:: rings atoms numbers

property tetrahedrons: Tuple[int, ...]¶: Carbon sp3 atoms numbers.

union(other: QueryContainer, *, remap: bool = False, copy: bool = True) → QueryContainer¶

Merge Graphs into one.

Parameters:

remap – if atom numbers collide, remap the atoms of the other graph; otherwise raise an exception.
copy – keep original structure and return new object

class chython.containers.ReactionContainer(reactants: Iterable[MoleculeContainer] = (), products: Iterable[MoleculeContainer] = (), reagents: Iterable[MoleculeContainer] = (), meta: Dict | None = None, name: str | None = None)¶

Reaction storage. Contains reactants, products and reagents lists.

Reaction storage is hashable and comparable, based on the reaction's unique signature (SMILES).

New reaction object creation

Parameters:

reactants – list of MoleculeContainers in left side of reaction
products – right side of reaction. see reactants
reagents – middle side of reaction: solvents, catalysts, etc. see reactants
meta – dictionary of metadata. like DTYPE-DATUM in RDF

canonicalize(*, fix_mapping: bool = True, logging=False, fix_tautomers=True) → bool | List[Tuple[int, Tuple[int, ...], int, str]]¶

Convert molecules to canonical forms of functional groups and aromatic rings without explicit hydrogens. Return True if a non-canonical group was found in any molecule.

Parameters:

fix_mapping – Search AAM errors of functional groups.
logging – return log from molecules with index of molecule. Otherwise, return True if these groups found in any molecule.
fix_tautomers – convert tautomers to canonical forms.

check_valence() → List[Tuple[int, Tuple[int, ...]]]¶

Check valences of all atoms of all molecules.

Works only on molecules with aromatic rings in Kekule form.

Returns:: list of invalid molecules with lists of their invalid atoms

clean2d()¶: Recalculate 2d coordinates

clean_isotopes() → bool¶: Clean isotope marks for all molecules in reaction. Returns True if an isotope was found in any molecule.

clean_stereo()¶: Remove stereo data

compose() → CGRContainer¶

Get CGR of reaction

Reagents will be represented as unchanged molecules.

Returns:: CGRContainer

contract_ions() → bool¶

Contract ions into salts (Molecules with disconnected components). Note: works only for unambiguous cases. e.g. equal anions/cations and different or equal cations/anions.

Return True if any ions contracted.

copy() → ReactionContainer¶: Get copy of object

depict(*, width=None, height=None, clean2d: bool = True) → str¶

Depict reaction in SVG format.

Parameters:

width – set svg width param. by default auto-calculated.
height – set svg height param. by default auto-calculated.
clean2d – calculate coordinates if necessary.

explicify_hydrogens() → int¶

Add explicit hydrogens to atoms

Returns:: number of added atoms

fix_groups_mapping(*, logging: bool = False) → bool | List[Tuple[str, Tuple[int, ...]]]¶: Fix atom-to-atom mapping of some functional groups. Return True if found AAM errors.

fix_mapping(*, logging: bool = False) → bool | List[Tuple[int, str, Tuple[int, ...]]]¶: Fix mapping by using loaded rules.

fix_positions()¶: Fix coordinates of molecules in reaction

flush_cache()¶

implicify_hydrogens() → int¶

Remove explicit hydrogens if possible.

Returns:: number of removed hydrogens.

kekule(*, buffer_size=7) → bool¶

Convert structures to kekule form. Return True if an aromatic ring was found in any molecule.

Parameters:: buffer_size – number of attempts of pyridine form searching.

property meta: Dict¶: Dictionary of metadata. Like DTYPE-DATUM in RDF

molecules() → Iterator[MoleculeContainer]¶: Iterator of all reaction molecules

property name: str¶

pack(*, compressed=True, check=True)¶

Pack into compressed bytes.

Note:

Same restrictions as in molecules pack.
Reactants, reagents and products should each contain less than 256 molecules.

Format specification:

Big endian bytes order
8 bit - header byte = 0x01 (current format specification)
8 bit - reactants count
8 bit - reagents count
8 bit - products count
x bit - concatenated molecules packs
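The four header bytes of the reaction pack can be sketched with the standard struct module. The helpers below are hypothetical and work on already packed molecule byte strings without zlib compression; concatenating the molecules in the order reactants, reagents, products is an assumption taken from the order of the header counts.

```python
import struct

HEADER = '>4B'  # big endian: version, reactants, reagents, products counts


def pack_reaction(reactants, reagents, products):
    """Prefix concatenated molecule packs with the 4-byte header."""
    for group in (reactants, reagents, products):
        if len(group) > 255:
            raise ValueError('each group should contain less than 256 molecules')
    header = struct.pack(HEADER, 0x01, len(reactants), len(reagents), len(products))
    return header + b''.join(reactants) + b''.join(reagents) + b''.join(products)


def read_header(data):
    """Return (reactants, reagents, products) counts from a reaction pack."""
    version, n_reactants, n_reagents, n_products = struct.unpack_from(HEADER, data)
    if version != 0x01:
        raise ValueError('unsupported format specification')
    return n_reactants, n_reagents, n_products


# Placeholder byte strings stand in for real molecule packs.
data = pack_reaction([b'a', b'b'], [], [b'c'])
counts = read_header(data)
```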

Parameters:

compressed – return zlib-compressed pack.
check – check molecules for format restrictions.

classmethod pack_len(data: bytes, /, *, compressed=True) → Tuple[List[int], List[int], List[int]]¶: Returns reactants, reagents, products molecules atoms count in reaction pack.

property products: Tuple[MoleculeContainer, ...]¶

property reactants: Tuple[MoleculeContainer, ...]¶

property reagents: Tuple[MoleculeContainer, ...]¶

remove_reagents(*, keep_reagents: bool = False, mapping: bool = True) → bool¶

Move molecules other than reactants to the reagents list. Reagents are molecules whose atoms are not present in the products. The mapping-based approach removes molecules without a reaction center. The rule-based approach removes molecules that are identical in reactants and products, as well as predefined reactants.

Parameters:

mapping – use atom-to-atom mapping to detect reagents, otherwise use predefined list of common reagents.
keep_reagents – delete reagents if False

Return True if any reagent found.
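The definition of a reagent as a molecule whose atoms are not present in the products can be sketched with sets of atom-to-atom mapping numbers. This covers only that simple definition, not the reaction-center analysis or the rule-based list; the helper and the set representation are assumptions for illustration.

```python
def split_reagents(reactants, products):
    """Split reactant-side molecules into participants and reagents.

    Each molecule is a set of atom-to-atom mapping numbers.
    """
    product_atoms = set().union(*products)
    participants = [m for m in reactants if m & product_atoms]
    reagents = [m for m in reactants if not m & product_atoms]
    return participants, reagents


# Esterification-like example: the catalyst atoms 8 and 9 never reach products.
acid, alcohol, catalyst = {1, 2, 3, 4}, {5, 6, 7}, {8, 9}
ester, water = {1, 2, 3, 5, 6, 7}, {4}
participants, reagents = split_reagents([acid, alcohol, catalyst], [ester, water])
```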

reset_mapping(*, return_score: bool = False, multiplier=1.75, keep_reactants_numbering=False) → bool | float¶: Do atom-to-atom mapping. Return True if mapping changed.

standardize(*, fix_mapping: bool = True, logging=False, fix_tautomers=True) → bool | List[Tuple[int, Tuple[int, ...], int, str]]¶

Fix functional groups representation. Return True if a group was fixed in any molecule.

Deprecated method. Use canonicalize directly.

Parameters:

fix_mapping – Search AAM errors of functional groups.
logging – return log from molecules with index of molecule. Otherwise, return True if these groups found in any molecule.
fix_tautomers – convert tautomers to canonical forms.

thiele(*, fix_tautomers=True) → bool¶

Convert structures to aromatic form. Return True if a kekule ring was found in any molecule.

Parameters:: fix_tautomers – convert tautomers to canonical forms.

classmethod unpack(data: bytes, /, *, compressed=True) → ReactionContainer¶

Unpack from compressed bytes.

Parameters:: compressed – decompress data before processing.
