chython.containers package¶
Data classes.
- class chython.containers.Bond(order: int)¶
- classmethod from_bond(bond)¶
- property in_ring: bool¶
- property order: int¶
- class chython.containers.MoleculeContainer¶
- add_atom(atom: Element | int | str, *args, charge=0, is_radical=False, xy: Tuple[float, float] = (0.0, 0.0), _skip_hydrogen_calculation=False, **kwargs)¶
Add new atom.
- add_atom_stereo(n: int, env: Tuple[int, ...], mark: bool, *, clean_cache=True)¶
Add stereo data for the specified bypass (traversal order) of neighbors. Use it for tetrahedrons or allenes.
- Parameters:
n – number of tetrahedron atom or central atom of allene.
env – numbers of neighboring atoms in the specified bypass order.
mark – clockwise or anticlockwise bypass.
See <https://www.daylight.com/dayhtml/doc/theory/theory.smiles.html> and <http://opensmiles.org/opensmiles.html>
- add_bond(n, m, bond: Bond | int, *, _skip_hydrogen_calculation=False)¶
Connect atoms with bonds.
For Thiele (aromatic) forms of a molecule this invalidates the internal state: implicit hydrogen marks will not be set for atoms in aromatic rings. Call kekule() and then thiele() to fix the marks.
- add_cis_trans_stereo(n: int, m: int, n1: int, n2: int, mark: bool, *, clean_cache=True)¶
Add stereo data to cis-trans double bonds (not allenes).
n1/n=m/n2
- Parameters:
n – number of starting atom of double bonds chain (alkenes or cumulenes)
m – number of ending atom of double bonds chain (alkenes or cumulenes)
n1 – number of neighboring atom of starting atom
n2 – number of neighboring atom of ending atom
mark – cis or trans
See <https://www.daylight.com/dayhtml/doc/theory/theory.smiles.html> and <http://opensmiles.org/opensmiles.html>
- add_wedge(n: int, m: int, mark: int, *, clean_cache=True)¶
Add stereo data by wedge notation of bonds. Use it for tetrahedrons or allenes.
- Parameters:
n – number of the atom at which the wedge bond starts
m – number of the atom at which the wedge bond ends
mark – up bond is 1, down is -1
- adjacency_matrix(set_bonds=False, /)¶
Adjacency matrix of Graph.
- Parameters:
set_bonds – if True set bond orders instead of 1.
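The effect of set_bonds can be illustrated with a minimal pure-Python sketch. It does not call chython; the helper name sketch_adjacency_matrix and the input layout (the same shape as int_adjacency, a dict of atom number to neighbor-to-bond-order dicts) are assumptions made for illustration, and the row order simply follows the dict order.

```python
from typing import Dict, List


def sketch_adjacency_matrix(int_adjacency: Dict[int, Dict[int, int]],
                            set_bonds: bool = False) -> List[List[int]]:
    """Build a dense adjacency matrix from {atom: {neighbor: bond_order}}.

    Hypothetical helper for illustration, not chython's implementation.
    """
    # Map arbitrary atom numbers to consecutive row/column indices.
    index = {n: i for i, n in enumerate(int_adjacency)}
    matrix = [[0] * len(index) for _ in index]
    for n, neighbors in int_adjacency.items():
        for m, order in neighbors.items():
            # Store the bond order if requested, otherwise a plain 1.
            matrix[index[n]][index[m]] = order if set_bonds else 1
    return matrix


# Heavy atoms of acetic acid CC(=O)O: 1-2 single, 2=3 double, 2-4 single.
acetic_acid = {1: {2: 1}, 2: {1: 1, 3: 2, 4: 1}, 3: {2: 2}, 4: {2: 1}}
plain = sketch_adjacency_matrix(acetic_acid)
with_orders = sketch_adjacency_matrix(acetic_acid, set_bonds=True)
```

With set_bonds left False every connected pair is marked with 1; with set_bonds=True the C=O entry holds 2 instead.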
- property aromatic_rings: Tuple[Tuple[int, ...], ...]¶
Aromatic rings atoms numbers
- atom(n: int) Atom ¶
- atoms() Iterator[Tuple[int, Atom]] ¶
iterate over all atoms
- property atoms_count: int¶
- property atoms_numbers: Iterator[int]¶
- property atoms_order: Dict[int, int]¶
Morgan like algorithm for graph nodes ordering
- Returns:
dict of atom-order pairs
- property atoms_rings: Dict[int, Tuple[Tuple[int, ...]]]¶
Dict mapping each atom to the rings that contain it.
- property atoms_rings_sizes: Dict[int, Tuple[int, ...]]¶
Sizes of rings containing atom.
- augmented_substructure(atoms: Iterable[int], deep: int = 1, **kwargs) MoleculeContainer ¶
Create substructure containing atoms and their neighbors
- Parameters:
atoms – list of core atoms in graph
deep – number of bonds between atoms and neighbors
- augmented_substructures(atoms: Iterable[int], deep: int = 1, **kwargs) List[MoleculeContainer] ¶
Create list of substructures containing atoms and their neighbors
- Parameters:
atoms – list of core atoms in graph
deep – number of bonds between atoms and neighbors
- Returns:
list of graphs containing atoms, atoms + first circle, atoms + 1st + 2nd, etc up to deep or while new nodes available
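The growing sequence of atom sets described above (core atoms, then core plus first circle, and so on) can be sketched in plain Python. This is only an illustration of the idea on a bare adjacency dict; the function name augmented_shells is hypothetical and the sketch returns sets of atom numbers rather than MoleculeContainer objects.

```python
from typing import Dict, Iterable, List, Set


def augmented_shells(adjacency: Dict[int, Set[int]],
                     atoms: Iterable[int],
                     deep: int = 1) -> List[Set[int]]:
    """Return atom sets: core, core + 1st circle, core + 1st + 2nd, ...

    Hypothetical helper for illustration, not chython's implementation.
    """
    shells = [set(atoms)]
    for _ in range(deep):
        current = shells[-1]
        # Add every neighbor of the atoms collected so far.
        grown = current | {m for n in current for m in adjacency[n]}
        if grown == current:  # no new nodes available, stop early
            break
        shells.append(grown)
    return shells


# Linear chain 1-2-3-4-5, starting from the middle atom.
chain = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
shells = augmented_shells(chain, [3], deep=3)
```

Here deep=3 is requested, but the chain is exhausted after two expansions, so only three atom sets are produced.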
- bond(n: int, m: int) Bond ¶
- bonds() Iterator[Tuple[int, int, Bond]] ¶
iterate over all bonds
- property bonds_count: int¶
- property brutto: Dict[str, int]¶
Counted atoms dict
- calculate_cis_trans_from_2d(*, clean_cache=True)¶
Calculate cis-trans stereo bonds from the given 2D coordinates. Not usable for SMILES and InChI input, which carry no coordinates.
- canonicalize(*, fix_tautomers=True, keep_kekule=False, logging=False, ignore=True) bool | List[Tuple[Tuple[int, ...], int, str]] ¶
Convert molecule to canonical forms of functional groups and aromatic rings without explicit hydrogens.
- Parameters:
logging – return log.
ignore – ignore standardization bugs.
fix_tautomers – convert tautomers to canonical forms.
keep_kekule – return kekule form.
- check_valence() List[int] ¶
Check valences of all atoms.
- Returns:
list of invalid atoms
- clean2d()¶
Calculate 2D layout of graph. Uses a JS implementation of the algorithm from https://pubs.acs.org/doi/10.1021/acs.jcim.7b00425.
- clean_isotopes() bool ¶
Clean isotope marks from molecule. Return True if any isotope found.
- clean_stereo()¶
Remove stereo data.
- compose(other: MoleculeContainer) CGRContainer ¶
Compose 2 graphs to CGR.
- property connected_components: Tuple[Tuple[int, ...], ...]¶
Isolated components of single graph. E.g. salts as ion pair.
- property connected_components_count: int¶
Number of components in graph
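As an illustration of what connected components mean for a salt, here is a small depth-first search over a plain adjacency dict. It is a sketch under assumptions: the function name and the sorted-tuple output are chosen for the example and are not taken from chython.

```python
from typing import Dict, Set, Tuple


def sketch_connected_components(adjacency: Dict[int, Set[int]]) -> Tuple[Tuple[int, ...], ...]:
    """Group atom numbers into isolated components with an iterative DFS.

    Hypothetical helper for illustration, not chython's implementation.
    """
    seen: Set[int] = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            n = stack.pop()
            component.append(n)
            for m in adjacency[n]:
                if m not in seen:
                    seen.add(m)
                    stack.append(m)
        components.append(tuple(sorted(component)))
    return tuple(components)


# Sodium acetate [Na+].CC(=O)[O-]: atom 1 is the isolated cation.
sodium_acetate = {1: set(), 2: {3}, 3: {2, 4, 5}, 4: {3}, 5: {3}}
components = sketch_connected_components(sodium_acetate)
```

The ion pair yields two components, so the component count for this structure is 2.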
- copy() MoleculeContainer ¶
copy of graph
- property cumulenes: Tuple[Tuple[int, ...], ...]¶
Alkenes, allenes and cumulenes atoms numbers.
- delete_atom(n: int, *, _skip_hydrogen_calculation=False)¶
Remove atom.
For Thiele (aromatic) forms of a molecule this invalidates the internal state: implicit hydrogen marks will not be set for atoms in aromatic rings. Call kekule() and then thiele() to fix the marks.
- delete_bond(n: int, m: int, *, _skip_hydrogen_calculation=False)¶
Disconnect atoms.
For Thiele (aromatic) forms of a molecule this invalidates the internal state: implicit hydrogen marks will not be set for atoms in aromatic rings. Call kekule() and then thiele() to fix the marks.
- depict(*, width=None, height=None, clean2d: bool = True, _embedding=False) str ¶
Depict molecule in SVG format.
- Parameters:
width – set svg width param. by default auto-calculated.
height – set svg height param. by default auto-calculated.
clean2d – calculate coordinates if necessary.
- depict3d(index: int = 0) str ¶
Get X3DOM XML string.
- Parameters:
index – index of conformer
- enumerate_charged_forms(*, deep: int = 4, limit: int = 1000)¶
Enumerate protonated and deprotonated ions. Use on neutralized molecules.
- Parameters:
deep – Maximum amount of added or removed protons.
limit – Maximum amount of generated structures.
- enumerate_charged_tautomers(*, prepare_molecules=True, partial=False, increase_aromaticity=True, keep_sugars=True, heteroarenes=True, keto_enol=True, deep: int = 4, limit: int = 1000)¶
Enumerate tautomers and protonated-deprotonated forms. Better to use on neutralized non-ionic molecules.
See enumerate_tautomers and enumerate_charged_forms params description.
- enumerate_kekule()¶
Enumerate all possible kekule forms of molecule.
- enumerate_tautomers(*, prepare_molecules=True, zwitter=True, partial=False, increase_aromaticity=True, keep_sugars=True, heteroarenes=True, keto_enol=True, limit: int = 1000) Iterator[MoleculeContainer] ¶
Enumerate all possible tautomeric forms of molecule.
- Parameters:
prepare_molecules – Standardize structures for correct processing
zwitter – Do zwitter-ions enumeration
partial – Allow OC=CC=C>>O=CCC=C or O=CC=CC>>OC=C=CC
increase_aromaticity – prevent aromatic ring destruction
keep_sugars – prevent carbonyl moving in sugars
heteroarenes – enumerate heteroarenes
keto_enol – enumerate keto-enols
limit – Maximum attempts count
- environment(atom: int, include_bond: bool = True, include_atom: bool = True) Tuple[Tuple[int, Bond, Element] | Tuple[int, Element] | Tuple[int, Bond] | int, ...] ¶
Neighbors of the atom as a tuple of (atom_number, bond, atom) groups; (atom_number, bond) groups if include_atom is False; (atom_number, atom) groups if include_bond is False; or bare neighbor atom numbers if both are False.
- Parameters:
atom – number
include_atom – include atom object
include_bond – include bond object
- explicify_hydrogens(*, start_map=None, _return_map=False, _fix_stereo=True) int | List[Tuple[int, int]] ¶
Add explicit hydrogens to atoms.
- Returns:
number of added atoms
- explicit_hydrogens(n: int) int ¶
Number of explicit hydrogen atoms connected to atom.
Take into account any type of bonds with hydrogen atoms.
- fix_resonance(*, logging=False, _fix_stereo=True) bool | List[int] ¶
Transform biradical or dipole resonance structures into neutral form. Return True if structure form changed.
- Parameters:
logging – return list of changed atoms.
- fix_stereo()¶
Reset stereo marks.
- flush_cache()¶
- flush_stereo_cache()¶
Flush chiral morgan and chiral centers cache.
- get_automorphism_mapping() Iterator[Dict[int, int]] ¶
Iterator of all possible automorphism mappings.
- get_fast_mapping(other: MoleculeContainer) Dict[int, int] | None ¶
Get self to other fast (suboptimal) structure mapping. Only one possible atoms mapping returned. Effective only for big molecules.
- get_mapping(other: Container, **kwargs)¶
Get self to other Molecule substructure mapping generator.
- Parameters:
other – Molecule
automorphism_filter – Skip matches to the same atoms.
searching_scope – substructure atoms list to localize isomorphism.
- get_mcs_mapping(other: MoleculeContainer, /, *, limit=10000) Iterator[Dict[int, int]] ¶
Find maximum common substructure. Based on clique searching in product graph.
- Parameters:
limit – limit tested cliques
- has_atom(n: int) bool ¶
- has_bond(n: int, m: int) bool ¶
- heteroatoms(n: int) int ¶
Number of neighboring heteroatoms (neither carbon nor hydrogen), excluding those connected by an any-bond.
- hybridization(n: int) int ¶
Atom hybridization.
1 - if atom has zero or only single bonded neighbors
2 - if has only one double bonded neighbor and any amount of single bonded
3 - if has one triple bonded and any amount of double and single bonded neighbors or two and more double bonded and any amount of single bonded neighbors
4 - if atom in aromatic ring
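The four hybridization codes can be expressed as a short decision function. This sketch assumes the caller already knows the orders of the atom's bonds (1, 2, 3) and whether the atom sits in an aromatic ring; the function name and its arguments are hypothetical and only mirror the rule stated above.

```python
from typing import Iterable


def sketch_hybridization(bond_orders: Iterable[int], in_aromatic_ring: bool = False) -> int:
    """Return the hybridization code 1-4 following the documented rule.

    Hypothetical helper for illustration, not chython's implementation.
    """
    if in_aromatic_ring:
        return 4
    orders = list(bond_orders)
    doubles = orders.count(2)
    triples = orders.count(3)
    if triples or doubles >= 2:
        # one triple bond, or two and more double bonds
        return 3
    if doubles == 1:
        # exactly one double bond plus any number of single bonds
        return 2
    # no neighbors at all or only single bonds
    return 1


codes = {
    'methyl carbon': sketch_hybridization([1]),
    'carbonyl carbon': sketch_hybridization([1, 2, 1]),
    'allene center': sketch_hybridization([2, 2]),
    'nitrile carbon': sketch_hybridization([3, 1]),
    'benzene carbon': sketch_hybridization([], in_aromatic_ring=True),
}
```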
- implicify_hydrogens(*, logging=False, _fix_stereo=True) int | Tuple[int, List[int]] ¶
Remove explicit hydrogen if possible. Return number of removed hydrogens. Works only with Kekule forms of aromatic structures. Keeps isotopes of hydrogen.
- Parameters:
logging – return list of changed atoms.
- implicit_hydrogens(n: int) int | None ¶
Number of implicit hydrogen atoms connected to atom.
Returns None if the count is ambiguous.
- property int_adjacency: Dict[int, Dict[int, int]]¶
Adjacency with integer-coded bonds.
- is_automorphic()¶
Test for automorphism symmetry of graph.
- is_equal(other, /) bool ¶
Test whether self is the same structure as other
- property is_radical: bool¶
True if at least one atom is radical
- is_ring_bond(n: int, m: int, /) bool ¶
Check whether the bond is in any ring.
- is_substructure(other, /) bool ¶
Test whether self is a substructure of other
- kekule(*, buffer_size=7) bool ¶
Convert structure to kekule form. Return True if any aromatic ring was found. Set implicit hydrogen count and hybridization marks on atoms.
Only one of the possible arrangements of double and single bonds will be set. To enumerate bond positions, use enumerate_kekule.
- Parameters:
buffer_size – number of attempts of pyridine form searching.
- linear_bit_set(min_radius: int = 1, max_radius: int = 4, length: int = 1024, number_active_bits: int = 2, number_bit_pairs: int = 4) Set[int] ¶
Transform structure into set of indexes of True-valued features.
- Parameters:
min_radius – minimal length of fragments
max_radius – maximum length of fragments
length – bit string’s length. Should be power of 2
number_active_bits – number of active bits for each hashed tuple
number_bit_pairs – maximum number of repeats of a fragment that are counted in the hashable fingerprint (if a fragment occurs in the molecule this many times or more, only this many occurrences are activated). Set to 0 to take all repeating fragments into account.
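To show how length and number_active_bits can interact, the sketch below folds integer hashes into bit indexes by taking successive log2(length)-bit slices of each hash. This is one common folding scheme, stated here as an assumption: the source does not confirm that chython folds hashes this way, and the function name fold_hashes is hypothetical.

```python
from typing import Iterable, Set


def fold_hashes(hashes: Iterable[int], length: int = 1024, number_active_bits: int = 2) -> Set[int]:
    """Fold integer hashes into indexes of active bits.

    Hypothetical helper for illustration; the actual chython scheme may differ.
    """
    if length < 2 or length & (length - 1):
        raise ValueError('length should be a power of 2')
    mask = length - 1                # selects the lowest log2(length) bits
    shift = length.bit_length() - 1  # log2(length)
    active: Set[int] = set()
    for h in hashes:
        for _ in range(number_active_bits):
            active.add(h & mask)     # current slice becomes a bit index
            h >>= shift              # move on to the next slice
    return active


# A hash whose two lowest 10-bit slices are 7 and 5.
example_hash = (5 << 10) | 7
two_bits = fold_hashes([example_hash], length=1024, number_active_bits=2)
one_bit = fold_hashes([example_hash], length=1024, number_active_bits=1)
```

A power-of-2 length makes the mask a contiguous run of low bits, which is why the parameter description asks for it.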
- linear_fingerprint(min_radius: int = 1, max_radius: int = 4, length: int = 1024, number_active_bits: int = 2, number_bit_pairs: int = 4)¶
Transform structures into array of binary features.
- Parameters:
min_radius – minimal length of fragments
max_radius – maximum length of fragments
length – bit string’s length. Should be power of 2
number_active_bits – number of active bits for each hashed tuple
number_bit_pairs – maximum number of repeats of a fragment that are counted in the hashable fingerprint (if a fragment occurs in the molecule this many times or more, only this many occurrences are activated). Set to 0 to take all repeating fragments into account.
- Returns:
array(n_features)
- linear_hash_set(min_radius: int = 1, max_radius: int = 4, number_bit_pairs: int = 4) Set[int] ¶
Transform structure into set of integer hashes of fragments with count information.
- Parameters:
min_radius – minimal length of fragments
max_radius – maximum length of fragments
number_bit_pairs – maximum number of repeats of a fragment that are counted in the hashable fingerprint (if a fragment occurs in the molecule this many times or more, only this many occurrences are activated). Set to 0 to take all repeating fragments into account.
- linear_hash_smiles(min_radius: int = 1, max_radius: int = 4, number_bit_pairs: int = 4) Dict[int, List[str]] ¶
Transform structure into dict of integer hashes of fragments with count information and corresponding fragment SMILES.
- Parameters:
min_radius – minimal length of fragments
max_radius – maximum length of fragments
number_bit_pairs – maximum number of repeats of a fragment that are counted in the hashable fingerprint (if a fragment occurs in the molecule this many times or more, only this many occurrences are activated). Set to 0 to take all repeating fragments into account.
- linear_smiles_hash(min_radius: int = 1, max_radius: int = 4, number_bit_pairs: int = 4) Dict[str, List[int]] ¶
Transform structure into dict of fragment SMILES and list of corresponding integer hashes of fragments.
- Parameters:
min_radius – minimal length of fragments
max_radius – maximum length of fragments
number_bit_pairs – maximum number of repeats of a fragment that are counted in the hashable fingerprint (if a fragment occurs in the molecule this many times or more, only this many occurrences are activated). Set to 0 to take all repeating fragments into account.
- property meta: Dict¶
- property molecular_charge: int¶
Total charge of molecule
- property molecular_mass: float¶
- morgan_bit_set(min_radius: int = 1, max_radius: int = 4, length: int = 1024, number_active_bits: int = 2) Set[int] ¶
Transform structures into set of indexes of True-valued features.
- Parameters:
min_radius – minimal radius of EC
max_radius – maximum radius of EC
length – bit string’s length. Should be power of 2
number_active_bits – number of active bits for each hashed tuple
- morgan_fingerprint(min_radius: int = 1, max_radius: int = 4, length: int = 1024, number_active_bits: int = 2)¶
Transform structures into array of binary features. Morgan fingerprints. Similar to the RDKit implementation.
- Parameters:
min_radius – minimal radius of EC
max_radius – maximum radius of EC
length – bit string’s length. Should be power of 2
number_active_bits – number of active bits for each hashed tuple
- Returns:
array(n_features)
- morgan_hash_set(min_radius: int = 1, max_radius: int = 4) Set[int] ¶
Transform structures into integer hashes of atoms with EC.
- Parameters:
min_radius – minimal radius of EC
max_radius – maximum radius of EC
- morgan_hash_smiles(min_radius: int = 1, max_radius: int = 4) Dict[int, List[str]] ¶
Transform structures into dictionary of hashes of atoms with EC and corresponding SMILES.
- Parameters:
min_radius – minimal radius of EC
max_radius – maximum radius of EC
- morgan_smiles_hash(min_radius: int = 1, max_radius: int = 4) Dict[str, List[int]] ¶
Transform structures into dictionary of smiles and corresponding hashes of atoms with EC.
- Parameters:
min_radius – minimal radius of EC
max_radius – maximum radius of EC
- property name: str¶
- neighbors(n: int) int ¶
Number of neighboring atoms, excluding any-bonded ones.
- neutralize(*, keep_charge=True, logging=False, _fix_stereo=True) bool | List[int] ¶
Convert organic salts to neutral form if possible. Only one possible form used for charge unbalanced structures.
- Parameters:
keep_charge – do partial neutralization to keep total charge of molecule.
logging – return changed atoms list.
- property not_special_connectivity: Dict[int, Set[int]]¶
Graph connectivity without special bonds.
- pack(*, compressed=True, check=True, version=2, order: List[int] | None = None) bytes ¶
Pack into compressed bytes.
Note:
Fewer than 4096 atoms are supported. Atom mapping numbers should be in range 1-4095.
Implicit hydrogens count should be in range 0-6 or unspecified.
Isotope shift should be in range -15 - 15 relative to chython.files._mdl.mol.common_isotopes
Atom neighbors count should be in range 0-15
Format V2 specification:
Big endian bytes order
8 bit - 0x02 (format specification version)
12 bit - number of atoms
12 bit - cis/trans stereo block size
Atom block 9 bytes (repeated):
12 bit - atom number
4 bit - number of neighbors
2 bit - tetrahedron sign (00 - not stereo, 10 or 11 - has stereo)
2 bit - allene sign
5 bit - isotope (00000 - not specified, over = isotope - common_isotope + 16)
7 bit - atomic number (<=118)
32 bit - XY float16 coordinates
3 bit - hydrogens (0-7). Note: 7 == None
4 bit - charge (charge + 4. possible range -4 - 4)
1 bit - radical state
Connection table: flatten list of neighbors. neighbors count stored in atom block.
For example CC(=O)O - {1: [2], 2: [1, 3, 4], 3: [2], 4: [2]} >> [2, 1, 3, 4, 2, 2].
Repeated block (equal to bonds count).
24 bit - paired 12 bit numbers.
Bonds order block 3 bit per bond zero-padded to full byte at the end.
Cis/trans data block (repeated):
24 bit - atoms pair
7 bit - zero padding. in future can be used for extra bond-level stereo, like atropoisomers.
1 bit - sign
Format V3 specification:
Big endian bytes order
8 bit - 0x03 (format specification version)
Atom block 3 bytes (repeated):
1 bit - atom entrance flag (always 1)
7 bit - atomic number (<=118)
3 bit - hydrogens (0-7). Note: 7 == None
4 bit - charge (charge + 4. possible range -4 - 4)
1 bit - radical state
1 bit - padding
3 bit - tetrahedron/allene sign (000 - not stereo or unknown, 001 - pure-unknown-enantiomer, 010 or 011 - has stereo)
4 bit - number of following bonds and CT blocks (0-15)
Bond block 2 bytes (repeated 0-15 times):
12 bit - negative shift from current atom to connected (e.g. 0x001 = -1 - connected to previous atom)
4 bit - bond order: 0000 - single, 0001 - double, 0010 - triple, 0011 - aromatic, 0111 - special
Cis-Trans 2 bytes:
12 bit - negative shift from current atom to connected (e.g. 0x001 = -1 - connected to previous atom)
4 bit - CT sign: 1000 or 1001 - to avoid overlap with bond
V2 format is faster than V3. V3 format doesn’t include isotopes, atom numbers and XY coordinates.
- Parameters:
compressed – return zlib-compressed pack.
check – check molecule for format restrictions.
version – format version
order – atom order in V3
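A few of the V2 field encodings listed in the format specification can be reproduced in plain Python. The sketch below is not chython code: all function names are hypothetical, and the placement of the atoms count in the high 12 bits of the 24-bit header field is an assumption based on the order in which the fields are listed.

```python
from typing import Dict, List, Optional


def flatten_connection_table(adjacency: Dict[int, List[int]]) -> List[int]:
    """Flatten neighbor lists in atom order; counts live in the atom blocks."""
    return [m for neighbors in adjacency.values() for m in neighbors]


def encode_charge(charge: int) -> int:
    """4-bit charge field: stored as charge + 4, for charges from -4 to 4."""
    if not -4 <= charge <= 4:
        raise ValueError('charge should be in range -4 - 4')
    return charge + 4


def encode_hydrogens(hydrogens: Optional[int]) -> int:
    """3-bit hydrogens field: 0-6 stored as is, 7 means unspecified (None)."""
    if hydrogens is None:
        return 7
    if not 0 <= hydrogens <= 6:
        raise ValueError('implicit hydrogens count should be in range 0-6')
    return hydrogens


def v2_header(atoms_count: int, cis_trans_size: int) -> bytes:
    """8 bit version + 12 bit atoms count + 12 bit cis/trans block size.

    Assumption: the atoms count occupies the higher 12 bits of the pair.
    """
    if not 0 <= atoms_count < 4096 or not 0 <= cis_trans_size < 4096:
        raise ValueError('both values should fit into 12 bits')
    return bytes([0x02]) + ((atoms_count << 12) | cis_trans_size).to_bytes(3, 'big')


# Acetic acid CC(=O)O, the example from the specification.
table = flatten_connection_table({1: [2], 2: [1, 3, 4], 3: [2], 4: [2]})
header = v2_header(atoms_count=4, cis_trans_size=0)
```

The flattened table reproduces the [2, 1, 3, 4, 2, 2] example from the specification, and the header is 4 bytes long before any atom blocks follow.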
- classmethod pack_len(data: bytes, /, *, compressed=True) int ¶
Returns atoms count in molecule pack.
- remap(mapping: Dict[int, int], *, copy: bool = False) MoleculeContainer ¶
Change atom numbers
- Parameters:
mapping – mapping of old numbers to the new
copy – keep original graph
- remove_acids(*, logging=False) bool | List[int] ¶
Remove common acids from salts of organic bases. Works only for neutral pairs like HA+B. Use neutralize before.
- Parameters:
logging – return deleted atoms list.
- remove_coordinate_bonds(*, keep_to_terminal=True, _fix_stereo=True) int ¶
Remove coordinate (or hydrogen) bonds marked as 8 (any) bonds
- Parameters:
keep_to_terminal – Keep any bonds to terminal hydrogens
- Returns:
removed bonds count
- remove_metals(*, logging=False) bool | List ¶
Remove disconnected S-metals and ammonia.
- Parameters:
logging – return deleted atoms list.
- property ring_atoms¶
Atoms in rings. Not SSSR based fast algorithm.
- property rings_count: int¶
SSSR rings count. Rings with special bonds are ignored.
- saturate(neighbors_distances: Dict[int, Dict[int, float]] | None = None, reset_electrons: bool = True, expected_charge: int = 0, expected_radicals_count: int = 0, allow_errors: bool = True, logging: bool = False) bool | List[str] ¶
Saturate molecules with double and triple bonds and charges and radical states to correct valences of atoms. Note: works only with fully explicit hydrogens!
- Parameters:
neighbors_distances – If given, the longest bonds can be removed if needed.
reset_electrons – Can change charges and radicals if needed.
expected_charge – Reset charge to given. Works only with reset_electrons=True.
expected_radicals_count – Reset radical atoms count to given. Works only with reset_electrons=True.
allow_errors – allow unbalanced result.
logging – return log.
- property skin_graph: Dict[int, Set[int]]¶
Graph without terminal atoms. Only rings and linkers
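The idea of a graph reduced to rings and linkers can be sketched as repeated pruning of atoms with at most one neighbor. The code below works on a plain adjacency dict and is only an illustration; the function name is hypothetical, and the treatment of isolated atoms (removed here) is an assumption.

```python
from typing import Dict, Set


def sketch_skin_graph(adjacency: Dict[int, Set[int]]) -> Dict[int, Set[int]]:
    """Iteratively strip terminal atoms until only rings and linkers remain.

    Hypothetical helper for illustration, not chython's implementation.
    """
    graph = {n: set(ms) for n, ms in adjacency.items()}  # work on a copy
    terminals = [n for n, ms in graph.items() if len(ms) <= 1]
    while terminals:
        n = terminals.pop()
        if n not in graph:  # already removed via another path
            continue
        for m in graph.pop(n):
            graph[m].discard(n)
            if len(graph[m]) <= 1:  # neighbor became terminal
                terminals.append(m)
    return graph


# Cyclopropane ring 1-2-3 with an ethyl chain 4-5 attached to atom 1.
ethylcyclopropane = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2}, 4: {1, 5}, 5: {4}}
skin = sketch_skin_graph(ethylcyclopropane)
```

The ethyl chain is peeled off atom by atom, leaving only the three ring atoms; a purely acyclic input reduces to an empty graph.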
- property smiles_atoms_order: Tuple[int, ...]¶
Atoms order in canonic SMILES.
- split() List[MoleculeContainer] ¶
Split disconnected structure to connected substructures
- split_metal_salts(*, logging=False) bool | List[Tuple[int, int]] ¶
Split connected S-metal/lanthanides/actinides salts to cation/anion pairs.
- Parameters:
logging – return deleted bonds list.
- property sssr: Tuple[Tuple[int, ...], ...]¶
Smallest Set of Smallest Rings. Special bonds ignored.
Based on idea of PID matrices from: Lee, C. J., Kang, Y.-M., Cho, K.-H., & No, K. T. (2009). A robust method for searching the smallest set of smallest rings with a path-included distance matrix. Proceedings of the National Academy of Sciences of the United States of America, 106(41), 17355–17358. https://doi.org/10.1073/pnas.0813040106
- Returns:
rings atoms numbers
- standardize(*, logging=False, ignore=True, fix_tautomers=True, _fix_stereo=True) bool | List[Tuple[Tuple[int, ...], int, str]] ¶
Standardize functional groups. Return True if any non-canonical group found.
- Parameters:
fix_tautomers – convert tautomers to canonical forms.
logging – return list of fixed atoms with matched rules.
ignore – ignore standardization bugs.
- standardize_charges(*, logging=False, prepare_molecule=True, _fix_stereo=True) bool | List[int] ¶
Set canonical positions of charges in heterocycles and ferrocenes.
- Parameters:
logging – return list of changed atoms.
prepare_molecule – do thiele procedure.
- sticky_smiles(left: int, right: int = None, *, remove_left: bool = False, remove_right: bool = False, tries: int = 10)¶
Generate smiles with fixed left and optionally right terminal atoms. Note: Produces expected results only with acyclic terminal atoms.
- Parameters:
remove_left – drop terminal atom and corresponding bond
remove_right – drop terminal atom and corresponding bond
tries – number of attempts to generate smiles
- substructure(atoms: Iterable[int], *, as_query: bool = False, recalculate_hydrogens=True, skip_neighbors_marks=False, skip_hybridizations_marks=False, skip_hydrogens_marks=False, skip_rings_sizes_marks=False, skip_heteroatoms_marks=False) MoleculeContainer | QueryContainer ¶
Create substructure containing atoms from atoms list.
For Thiele (aromatic) forms of a molecule, the resulting Molecule substructure has an invalidated internal state: implicit hydrogen marks will not be set for atoms in aromatic rings. Call kekule() and then thiele() to fix the marks.
- Parameters:
atoms – list of atoms numbers of substructure
as_query – return Query object based on graph substructure
recalculate_hydrogens – calculate implicit H count in substructure
skip_neighbors_marks – Don’t set neighbors count marks on substructured queries
skip_hybridizations_marks – Don’t set hybridizations marks on substructured queries
skip_hydrogens_marks – Don’t set hydrogens count marks on substructured queries
skip_rings_sizes_marks – Don’t set rings_sizes marks on substructured queries
skip_heteroatoms_marks – Don’t set heteroatoms count marks
- property tetrahedrons: Tuple[int, ...]¶
Carbon sp3 atoms numbers.
- thiele(*, fix_tautomers=True) bool ¶
Convert structure to aromatic form (Huckel rule ignored). Return True if any kekule ring was found. Also marks atoms as aromatic.
- Parameters:
fix_tautomers – try to fix condensed rings with pyrroles. N1C=CC2=NC=CC2=C1>>N1C=CC2=CN=CC=C12
- total_hydrogens(n: int) int ¶
Number of hydrogen atoms connected to atom.
Take into account any type of bonds with hydrogen atoms.
- union(other: MoleculeContainer, *, remap: bool = False, copy: bool = True) MoleculeContainer ¶
Merge Graphs into one.
- Parameters:
remap – if atom numbers collide, remap the atoms of the other graph; otherwise raise an exception.
copy – keep original structure and return new object
- classmethod unpack(data: bytes | memoryview, /, *, compressed=True, _return_pack_length=False) MoleculeContainer ¶
Unpack from compressed bytes.
- Parameters:
compressed – decompress data before processing.
- view3d(index: int = 0, width='600px', height='400px')¶
Jupyter widget for 3D visualization.
- Parameters:
index – index of conformer
width – widget width
height – widget height
- class chython.containers.QueryBond(order: int | List[int] | Set[int] | Tuple[int, ...], in_ring: bool | None = None)¶
- classmethod from_bond(bond)¶
- property in_ring: bool | None¶
- property order: Tuple[int, ...]¶
- class chython.containers.QueryContainer¶
- add_atom(atom: Query | Element | int | str, *args, neighbors: int | List[int] | Tuple[int, ...] | None = None, hybridization: int | List[int] | Tuple[int, ...] | None = None, hydrogens: int | List[int] | Tuple[int, ...] | None = None, rings_sizes: int | List[int] | Tuple[int, ...] | None = None, heteroatoms: int | List[int] | Tuple[int, ...] | None = None, masked: bool = False, **kwargs)¶
new atom addition
- atom(n: int) Atom ¶
- atoms() Iterator[Tuple[int, Atom]] ¶
iterate over all atoms
- property atoms_count: int¶
- property atoms_numbers: Iterator[int]¶
- property atoms_order: Dict[int, int]¶
Morgan like algorithm for graph nodes ordering
- Returns:
dict of atom-order pairs
- property atoms_rings: Dict[int, Tuple[Tuple[int, ...]]]¶
Dict mapping each atom to the rings that contain it.
- property atoms_rings_sizes: Dict[int, Tuple[int, ...]]¶
Sizes of rings containing atom.
- bond(n: int, m: int) Bond ¶
- bonds() Iterator[Tuple[int, int, Bond]] ¶
iterate over all bonds
- property bonds_count: int¶
- clean_stereo()¶
Remove stereo data.
- property connected_components: Tuple[Tuple[int, ...], ...]¶
Isolated components of single graph. E.g. salts as ion pair.
- property connected_components_count: int¶
Number of components in graph
- copy() QueryContainer ¶
copy of graph
- property cumulenes: Tuple[Tuple[int, ...], ...]¶
Alkenes, allenes and cumulenes atoms numbers.
- enumerate_queries(*, enumerate_marks: bool = False)¶
Enumerate complex queries into multiple simple ones. For example [N,O]-C into NC and OC.
- Parameters:
enumerate_marks – enumerate multiple marks to separate queries
- flush_cache()¶
- get_automorphism_mapping() Iterator[Dict[int, int]] ¶
Iterator of all possible automorphism mappings.
- get_mapping(other: Container, **kwargs)¶
Get self to other Molecule or Query substructure mapping generator.
- Parameters:
other – Molecule or Query
automorphism_filter – Skip matches to the same atoms.
searching_scope – substructure atoms list to localize isomorphism.
- has_atom(n: int) bool ¶
- has_bond(n: int, m: int) bool ¶
- property int_adjacency: Dict[int, Dict[int, int]]¶
Adjacency with integer-coded bonds.
- is_automorphic()¶
Test for automorphism symmetry of graph.
- is_equal(other, /) bool ¶
Test whether self is the same structure as other
- is_ring_bond(n: int, m: int, /) bool ¶
Check whether the bond is in any ring.
- is_substructure(other, /) bool ¶
Test whether self is a substructure of other
- property not_special_connectivity: Dict[int, Set[int]]¶
Graph connectivity without special bonds.
- remap(mapping: Dict[int, int], *, copy=False) QueryContainer ¶
Change atom numbers
- Parameters:
mapping – mapping of old numbers to the new
copy – keep original graph
- property ring_atoms¶
Atoms in rings. Not SSSR based fast algorithm.
- property rings_count: int¶
SSSR rings count. Rings with special bonds are ignored.
- property skin_graph: Dict[int, Set[int]]¶
Graph without terminal atoms. Only rings and linkers
- property smiles_atoms_order: Tuple[int, ...]¶
Atoms order in canonic SMILES.
- property sssr: Tuple[Tuple[int, ...], ...]¶
Smallest Set of Smallest Rings. Special bonds ignored.
Based on idea of PID matrices from: Lee, C. J., Kang, Y.-M., Cho, K.-H., & No, K. T. (2009). A robust method for searching the smallest set of smallest rings with a path-included distance matrix. Proceedings of the National Academy of Sciences of the United States of America, 106(41), 17355–17358. https://doi.org/10.1073/pnas.0813040106
- Returns:
rings atoms numbers
- property tetrahedrons: Tuple[int, ...]¶
Carbon sp3 atoms numbers.
- union(other: QueryContainer, *, remap: bool = False, copy: bool = True) QueryContainer ¶
Merge Graphs into one.
- Parameters:
remap – if atom numbers collide, remap the atoms of the other graph; otherwise raise an exception.
copy – keep original structure and return new object
- class chython.containers.ReactionContainer(reactants: Iterable[MoleculeContainer] = (), products: Iterable[MoleculeContainer] = (), reagents: Iterable[MoleculeContainer] = (), meta: Dict | None = None, name: str | None = None)¶
Reaction storage. Contains reactants, products and reagents lists.
Reaction storage is hashable and comparable, based on the reaction's unique signature (SMILES).
New reaction object creation
- Parameters:
reactants – list of MoleculeContainers in left side of reaction
products – right side of reaction. see reactants
reagents – middle side of reaction: solvents, catalysts, etc. see reactants
meta – dictionary of metadata. like DTYPE-DATUM in RDF
- canonicalize(*, fix_mapping: bool = True, logging=False, fix_tautomers=True) bool | List[Tuple[int, Tuple[int, ...], int, str]] ¶
Convert molecules to canonical forms of functional groups and aromatic rings without explicit hydrogens. Return True if a non-canonical group was found in any molecule.
- Parameters:
fix_mapping – Search AAM errors of functional groups.
logging – return log from molecules with index of molecule. Otherwise, return True if these groups found in any molecule.
fix_tautomers – convert tautomers to canonical forms.
- check_valence() List[Tuple[int, Tuple[int, ...]]] ¶
Check valences of all atoms of all molecules.
Works only on molecules with aromatic rings in Kekule form.
- Returns:
list of invalid molecules with invalid atoms lists
- clean2d()¶
Recalculate 2d coordinates
- clean_isotopes() bool ¶
Clean isotope marks for all molecules in reaction. Returns True if an isotope was found in any molecule.
- clean_stereo()¶
Remove stereo data
- compose() CGRContainer ¶
Get CGR of reaction
Reagents will be presented as unchanged molecules
- Returns:
CGRContainer
- contract_ions() bool ¶
Contract ions into salts (Molecules with disconnected components). Note: works only for unambiguous cases. e.g. equal anions/cations and different or equal cations/anions.
Return True if any ions contracted.
- copy() ReactionContainer ¶
Get copy of object
- depict(*, width=None, height=None, clean2d: bool = True) str ¶
Depict reaction in SVG format.
- Parameters:
width – set svg width param. by default auto-calculated.
height – set svg height param. by default auto-calculated.
clean2d – calculate coordinates if necessary.
- explicify_hydrogens() int ¶
Add explicit hydrogens to atoms
- Returns:
number of added atoms
- fix_groups_mapping(*, logging: bool = False) bool | List[Tuple[str, Tuple[int, ...]]] ¶
Fix atom-to-atom mapping of some functional groups. Return True if found AAM errors.
- fix_mapping(*, logging: bool = False) bool | List[Tuple[int, str, Tuple[int, ...]]] ¶
Fix mapping by using loaded rules.
- fix_positions()¶
Fix coordinates of molecules in reaction
- flush_cache()¶
- implicify_hydrogens() int ¶
Remove explicit hydrogens if possible.
- Returns:
number of removed hydrogens.
- kekule(*, buffer_size=7) bool ¶
Convert structures to kekule form. Return True if an aromatic ring was found in any molecule
- Parameters:
buffer_size – number of attempts of pyridine form searching.
- property meta: Dict¶
Dictionary of metadata. Like DTYPE-DATUM in RDF
- molecules() Iterator[MoleculeContainer] ¶
Iterator of all reaction molecules
- property name: str¶
- pack(*, compressed=True, check=True)¶
Pack into compressed bytes.
- Note:
Same restrictions as in molecules pack.
Reactants, reagents and products should each contain fewer than 256 molecules.
Format specification:
Big endian bytes order
8 bit - header byte = 0x01 (current format specification)
8 bit - reactants count
8 bit - reagents count
8 bit - products count
x bit - concatenated molecules packs
- Parameters:
compressed – return zlib-compressed pack.
check – check molecules for format restrictions.
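The four-byte reaction header from the format specification can be built and read back with a few lines of plain Python. The helper names below are hypothetical, and the sketch operates on uncompressed bytes only; it illustrates the layout rather than chython's own packing code.

```python
from typing import Tuple


def reaction_header(reactants: int, reagents: int, products: int) -> bytes:
    """Header byte 0x01 followed by three 8-bit molecule counts."""
    for count in (reactants, reagents, products):
        if not 0 <= count < 256:
            raise ValueError('each side should contain less than 256 molecules')
    return bytes([0x01, reactants, reagents, products])


def parse_reaction_header(data: bytes) -> Tuple[int, int, int, bytes]:
    """Return (reactants, reagents, products, remaining molecule packs)."""
    if len(data) < 4 or data[0] != 0x01:
        raise ValueError('not a reaction pack of format 0x01')
    return data[1], data[2], data[3], data[4:]


# Two reactants, one reagent, one product; the payload is a stand-in.
payload = b'<molecule packs>'
packed = reaction_header(2, 1, 1) + payload
counts = parse_reaction_header(packed)
```

Because each count occupies a single byte, the limit of fewer than 256 molecules per side follows directly from the layout.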
- classmethod pack_len(data: bytes, /, *, compressed=True) Tuple[List[int], List[int], List[int]] ¶
Returns reactants, reagents, products molecules atoms count in reaction pack.
- property products: Tuple[MoleculeContainer, ...]¶
- property reactants: Tuple[MoleculeContainer, ...]¶
- property reagents: Tuple[MoleculeContainer, ...]¶
- remove_reagents(*, keep_reagents: bool = False, mapping: bool = True) bool ¶
Move molecules that are not reactants to the reagents list. Reagents are molecules whose atoms are not present in the products. The mapping-based approach removes molecules without a reaction center. The rule-based approach removes molecules that are equal in reactants and products, and predefined reactants.
- Parameters:
mapping – use atom-to-atom mapping to detect reagents, otherwise use predefined list of common reagents.
keep_reagents – delete reagents if False
Return True if any reagent found.
- reset_mapping(*, return_score: bool = False, multiplier=1.75, keep_reactants_numbering=False) bool | float ¶
Do atom-to-atom mapping. Return True if mapping changed.
- standardize(*, fix_mapping: bool = True, logging=False, fix_tautomers=True) bool | List[Tuple[int, Tuple[int, ...], int, str]] ¶
Fix functional groups representation. Return True if a group was fixed in any molecule.
Deprecated method. Use canonicalize directly.
- Parameters:
fix_mapping – Search AAM errors of functional groups.
logging – return log from molecules with index of molecule. Otherwise, return True if these groups found in any molecule.
fix_tautomers – convert tautomers to canonical forms.
- thiele(*, fix_tautomers=True) bool ¶
Convert structures to aromatic form. Return True if a kekule ring was found in any molecule
- Parameters:
fix_tautomers – convert tautomers to canonical forms.
- classmethod unpack(data: bytes, /, *, compressed=True) ReactionContainer ¶
Unpack from compressed bytes.
- Parameters:
compressed – decompress data before processing.