chython.files package

Available file parsers and writers:

class chython.files.ERDFWrite(file, *, append: bool = False, mapping: bool = True)

MDL V3000 RDF file writer. Works like a file object opened for writing and supports the context manager protocol. On initialization it accepts a file opened for writing in text mode, a string path to a file, a pathlib.Path object, or another buffered writer object.

Parameters:
  • append – append to an existing file (True) or overwrite it (False). For a buffered writer object, append = False writes the RDF header and append = True omits it.

  • mapping – write atom mapping.

write(data: ReactionContainer | MoleculeContainer)

class chython.files.ESDFWrite(file, *, mapping: bool = True, append: bool = False)

MDL V3000 SDF file writer. Works like a file object opened for writing and supports the context manager protocol. On initialization it accepts a file opened for writing in text mode, a string path to a file, a pathlib.Path object, or another buffered writer object.

Parameters:
  • mapping – write atom mapping.

  • append – open file path in append mode.

escape_map = {'<': '&lt;', '>': '&gt;'}

write(data: MoleculeContainer, write3d: int | None = None)

Write a single molecule into the file.

Parameters:

write3d – write conformer coordinates with given index
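
The writer classes on this page share one usage pattern: open the writer in a with block and call write once per record. The following is a minimal sketch under the assumption that chython is installed. The helper name dump_sdf_v3000 is hypothetical and not part of chython, and the molecules are built with the chython.files.smiles parser documented on this page.

```python
from io import StringIO


def dump_sdf_v3000(smiles_strings):
    """Return V3000 SDF text for the given SMILES (hypothetical helper)."""
    # Imported lazily so the sketch can be loaded without chython installed.
    from chython.files import ESDFWrite, smiles

    buffer = StringIO()  # a text-mode buffered writer object
    with ESDFWrite(buffer, mapping=False) as out:
        for text in smiles_strings:
            out.write(smiles(text))  # one record per molecule
    # Assumption: leaving the with block does not close an externally
    # created buffer, so its content can still be read here.
    return buffer.getvalue()
```

Calling dump_sdf_v3000(["CCO", "CC(=O)O"]) should yield two records. Passing a string path or pathlib.Path instead of the buffer would write to disk.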

class chython.files.MRVRead(file, *, ignore: bool = True, remap: bool = False, calc_cis_trans: bool = False, ignore_stereo: bool = False, ignore_bad_isotopes: bool = False)

ChemAxon MRV file reader. Works like an opened file object and supports the context manager protocol. On initialization it accepts a file opened in binary mode, a string path to a file, a pathlib.Path object, or another binary buffered reader object.

Parameters:
  • ignore – Skip some checks of data or try to fix some errors.

  • remap – Remap atom numbers, starting from one.

  • calc_cis_trans – Calculate cis/trans marks from 2d coordinates.

  • ignore_stereo – Ignore stereo data.

  • ignore_bad_isotopes – reset invalid isotope mark to non-isotopic.

close(force: bool = False)

Close opened file

Parameters:

force – force closing of externally opened file or buffer

molecule_cls

alias of MoleculeContainer

reaction_cls

alias of ReactionContainer

read(amount: int | None = None) → List[ReactionContainer | MoleculeContainer]

Parse whole file

Parameters:

amount – number of records to read; with the default None the whole file is parsed

read_metadata(*, current: bool = True) → Dict[str, str]

Read metadata block

read_structure(*, current: bool = True)

Read Reaction or Molecule container.

Parameters:

current – return current structure if already parsed, otherwise read next

tell()

Number of records processed from the original file

class chython.files.MRVWrite(file, mapping: bool = True)

ChemAxon MRV file writer. Works like a file object opened for writing and supports the context manager protocol. On initialization it accepts a file opened for writing in text mode, a string path to a file, a pathlib.Path object, or another buffered writer object.

Parameters:

mapping – write atom mapping.

close(force=False)

Write close tag of MRV file and close opened file

Parameters:

force – force closing of externally opened file or buffer

write(data: ReactionContainer | MoleculeContainer)

Write single molecule or reaction into file

class chython.files.PDBRead(file, *, buffer_size=10000, ignore: bool = True, element_name_priority: bool = False, parse_as_single: bool = False, atom_name_map=None, charge_map: Sequence[int] | None = None, radical_map: Sequence[int] | None = None, radius_multiplier: float = 1.25)

PDB file reader. Works like an opened file object and supports the context manager protocol. On initialization it accepts a file opened in text mode, a string path to a file, a pathlib.Path object, or another buffered reader object.

Multiple structures in the same file, separated by ENDMDL, are supported. Only ATOM and HETATM records are parsed. END or ENDMDL is required at the end of the file.

Parameters:
  • ignore – Skip some checks of data or try to fix some errors.

  • element_name_priority – For ligands use element symbol column value and ignore atom name column.

  • parse_as_single – Useful if all models in the file describe the same structure. The 2D graph is restored from the first model only.

  • atom_name_map – dictionary of atom name replacements, e.g. {'Ow': 'O'}. Keys should be capitalized.

  • charge_map – iterable with total charges of each model in file.

  • radical_map – iterable with total radicals count of each model in file.
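
Because atom_name_map keys should be capitalized, a mapping collected from raw PDB atom names may need normalizing first. A minimal sketch, assuming "capitalized" means first character upper-case and the rest lower-case, as in the 'Ow' example. The helper name normalize_atom_name_map is hypothetical and not part of chython.

```python
def normalize_atom_name_map(mapping):
    """Return a copy of the mapping with capitalized keys, e.g. 'OW' -> 'Ow'."""
    # str.capitalize upper-cases the first character and lower-cases the rest.
    return {name.capitalize(): element for name, element in mapping.items()}
```

The result can then be passed as PDBRead(path, atom_name_map=normalize_atom_name_map({'OW': 'O'})).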

close(force: bool = False)

Close opened file

Parameters:

force – force closing of externally opened file or buffer

molecule_cls

alias of MoleculeContainer

read(amount: int | None = None) → List[MoleculeContainer]

Parse whole file

Parameters:

amount – number of records to read; with the default None the whole file is parsed

read_structure(*, current: bool = True) → MoleculeContainer

Read Molecule container.

Parameters:

current – return current structure if already parsed, otherwise read next

tell()

Number of records processed from the original file

class chython.files.RDFRead(*args, **kwargs)

MDL RDF file reader. Works like an opened file object and supports the context manager protocol. On initialization it accepts a file opened in text mode, a string path to a file, a pathlib.Path object, or another buffered reader object.

Parameters:
  • buffer_size – readahead size. Increase it if you have big molecules or metadata records.

  • indexable

    If True: seek, tell, object size and subscription are supported, and the object behaves like a normal open file. This works only with a real file (the path to the file is specified), because the external grep utility is used; supported on Unix-like OSes.

    If False: works like a generator that converts each record into a ReactionContainer or MoleculeContainer and returns the objects in order; records with errors are skipped.

  • ignore – Skip some checks of data or try to fix some errors.

  • remap – Remap atom numbers, starting from one.

  • calc_cis_trans – Calculate cis/trans marks from 2d coordinates.

  • ignore_stereo – Ignore stereo data.

  • ignore_bad_isotopes – reset invalid isotope mark to non-isotopic.

molecule_cls

alias of MoleculeContainer

reaction_cls

alias of ReactionContainer

read_metadata(*, current=True) → Dict[str, str]

Read metadata block

read_mol(n: int, /, *, current: bool = True) → str

Read requested MOL block

read_rxn(*, current: bool = True) → str

Read rxn block without metadata

read_structure(*, current=True) → ReactionContainer | MoleculeContainer

Read Reaction or Molecule container.

Parameters:

current – return current structure if already parsed, otherwise read next

reset_index()

Create (or rebuild) the index table. Implemented only for a real file (the path to the file is specified), because the external grep utility is used.

seek(offset)

Shift to a given record number

class chython.files.RDFWrite(file, *, append: bool = False, mapping: bool = True)

MDL RDF file writer. Works like a file object opened for writing and supports the context manager protocol. On initialization it accepts a file opened for writing in text mode, a string path to a file, a pathlib.Path object, or another buffered writer object.

Parameters:
  • append – append to an existing file (True) or overwrite it (False). For a buffered writer object, append = False writes the RDF header and append = True omits it.

  • mapping – write atom mapping.
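
The effect of append on a buffered writer object can be checked with an in-memory buffer. This is a sketch under the assumption that chython is installed. The helper name rdf_text is hypothetical, and the molecule comes from the chython.files.smiles parser documented on this page.

```python
from io import StringIO


def rdf_text(smiles_string, append):
    """Write one record to an in-memory RDF and return the text (hypothetical helper)."""
    # Imported lazily so the sketch can be loaded without chython installed.
    from chython.files import RDFWrite, smiles

    buffer = StringIO()
    with RDFWrite(buffer, append=append) as out:
        out.write(smiles(smiles_string))
    # Assumption: the externally created buffer stays open after the with block.
    return buffer.getvalue()
```

Per the parameter description, rdf_text("CCO", append=False) is expected to start with the RDF header, while rdf_text("CCO", append=True) should contain the record only, which is the variant to use when adding records to a buffer that already holds a header.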

write(data: ReactionContainer | MoleculeContainer)

class chython.files.SDFRead(*args, **kwargs)

MDL SDF file reader. Works like an opened file object and supports the context manager protocol. On initialization it accepts a file opened in text mode, a string path to a file, a pathlib.Path object, or another buffered reader object.

Parameters:
  • buffer_size – readahead size. Increase it if you have big molecules or metadata records.

  • indexable

    If True: seek, tell, object size and subscription are supported, and the object behaves like a normal open file. This works only with a real file (the path to the file is specified), because the external grep utility is used; supported on Unix-like OSes.

    If False: works like a generator that converts each record into a MoleculeContainer and returns the objects in order; records with errors are skipped.

  • ignore – Skip some checks of data or try to fix some errors.

  • remap – Remap atom numbers, starting from one.

  • calc_cis_trans – Calculate cis/trans marks from 2d coordinates.

  • ignore_stereo – Ignore stereo data.

  • ignore_bad_isotopes – reset invalid isotope mark to non-isotopic.

escape_map = {'&gt;': '>', '&lt;': '<'}

molecule_cls

alias of MoleculeContainer

read_metadata(*, current=True)

Read metadata block

read_mol(*, current: bool = True) → str

Read MOL block without metadata

read_structure(*, current=True) → MoleculeContainer

Read Molecule container.

Parameters:

current – return current structure if already parsed, otherwise read next

reset_index()

Create (or rebuild) the index table. Implemented only for a real file (the path to the file is specified), because the external grep utility is used.

seek(offset)

Shift to a given record number
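
To make the idea of an index table concrete, here is a standard-library sketch that records the byte offset at which each SDF record starts, relying on the `$$$$` line that terminates every SDF record. It is not chython's implementation, which uses the external grep utility as noted above. The names build_sdf_index and read_record are hypothetical.

```python
def build_sdf_index(stream):
    """Return the byte offset of the first line of every record in a binary stream."""
    offsets = [stream.tell()]
    for line in iter(stream.readline, b""):
        if line.rstrip() == b"$$$$":
            offsets.append(stream.tell())  # the next record starts here
    # Drop the last offset when nothing follows the final delimiter.
    if offsets[-1] >= stream.tell():
        offsets.pop()
    return offsets


def read_record(stream, offsets, number):
    """Seek to the record with the given number and return its text."""
    stream.seek(offsets[number])
    lines = []
    for line in iter(stream.readline, b""):
        lines.append(line)
        if line.rstrip() == b"$$$$":
            break
    return b"".join(lines).decode()
```

With such a table, jumping to record n costs one seek instead of a scan from the start of the file, which is the behavior that seek and subscription provide in indexable mode.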

class chython.files.SDFWrite(file, *, mapping: bool = True, append: bool = False)

MDL SDF file writer. Works like a file object opened for writing and supports the context manager protocol. On initialization it accepts a file opened for writing in text mode, a string path to a file, a pathlib.Path object, or another buffered writer object.

Parameters:
  • mapping – write atom mapping.

  • append – open file path in append mode.

escape_map = {'<': '&lt;', '>': '&gt;'}

write(data: MoleculeContainer, write3d: int | None = None)

Write a single molecule into the file.

Parameters:

write3d – write conformer coordinates with given index
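
The escape_map of SDFWrite and the escape_map of SDFRead are inverses of each other. Angle brackets delimit field names in SDF data headers (`>  <NAME>`), which is presumably why they are escaped. Which parts of a record chython applies the maps to is not stated here, so the sketch below only demonstrates the round trip. The translate helper is hypothetical.

```python
WRITE_ESCAPE = {'<': '&lt;', '>': '&gt;'}  # same content as SDFWrite.escape_map
READ_ESCAPE = {'&gt;': '>', '&lt;': '<'}   # same content as SDFRead.escape_map


def translate(text, table):
    """Replace every key of the table found in text with its value."""
    for old, new in table.items():
        text = text.replace(old, new)
    return text


escaped = translate("IC50 <nM>", WRITE_ESCAPE)
restored = translate(escaped, READ_ESCAPE)
```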

chython.files.inchi(data, /, *, ignore_stereo: bool = False, _cls=<class 'chython.containers.molecule.MoleculeContainer'>) → MoleculeContainer

InChI string parser

chython.files.mdl_mol(data: str, /, *, ignore=True, calc_cis_trans=False, ignore_stereo=False, remap=False, ignore_bad_isotopes=False, _cls=<class 'chython.containers.molecule.MoleculeContainer'>) → MoleculeContainer

Parse a string containing a MOL file.

chython.files.mdl_rxn(data: str, /, *, ignore=True, calc_cis_trans=False, ignore_stereo=False, remap=False, ignore_bad_isotopes=False, _r_cls=<class 'chython.containers.reaction.ReactionContainer'>, _m_cls=<class 'chython.containers.molecule.MoleculeContainer'>) → ReactionContainer

Parse a string containing an RXN file.

chython.files.smarts(data: str)

Parse SMARTS string.

  • Stereo is ignored.

  • Only the D, a, h, r and !R atom primitives are supported.

  • Bond order lists and negated (not) bonds are supported.

  • Ring and non-ring bond marks are supported only in combination with explicit bonds, negated bonds and bond order lists.

  • Mapping, charge and isotopes are supported.

  • Lists of elements are supported.

  • A is treated as any element; its meaning as the aliphatic primitive <A> is ignored.

  • M is treated as any metal.

  • The <&> logic operator is unsupported.

  • The <;> logic operator is mandatory, except for charge, isotope and stereo marks, where it is optional but preferable.

  • CXSMARTS radicals are supported.

  • Hybridization and heteroatom count are coded in CXSMARTS atomProp notation as the <hyb> and <het> keys.

  • Masked atom: a chython.Reactor-specific mark that protects reactant atoms from deletion.

    Coded in CXSMARTS atomProp as the <msk> key with any value.

For example:

[C;r5,r6;a]-;!@[C;h1,h2] |^1:1,atomProp:1.hyb.24:1.het.0| - an aromatic C that is a member of a 5- or 6-atom ring,
connected by a non-ring single bond to an aromatic or sp2 radical C with 1 or 2 hydrogens and a heteroatom count of zero (het.0).

Alternative hybridization, heteroatoms and masks coding:

  • primitive <xN> - heteroatoms (e.g. x2 - two heteroatoms)

  • primitive <zN> - hybridization (N = 1 - sp3, 2 - sp2, 3 - sp, 4 - aromatic)

  • primitive <M> - masked atom
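
The numeric hybridization codes can be decoded with a small lookup. This standard-library sketch assumes that the value of the hyb key is a string of the same digits listed for <zN>, which is what the hyb.24 example (aromatic or sp2) suggests. The helpers decode_hyb and atom_props are hypothetical, and atom_props handles only the atomProp field of the CXSMARTS block.

```python
# Hybridization codes as listed for the <zN> primitive.
HYBRIDIZATION = {1: "sp3", 2: "sp2", 3: "sp", 4: "aromatic"}


def decode_hyb(value):
    """Decode a hyb value such as '24' into a list of hybridization names.

    Assumption: each digit stands for one allowed hybridization.
    """
    return [HYBRIDIZATION[int(digit)] for digit in str(value)]


def atom_props(cx_block):
    """Extract {atom_index: {key: value}} from the atomProp field of a CXSMARTS block."""
    props = {}
    for field in cx_block.strip("|").split(","):
        if not field.startswith("atomProp:"):
            continue  # other fields, e.g. radicals, are skipped
        for item in field[len("atomProp:"):].split(":"):
            index, key, value = item.split(".", 2)
            props.setdefault(int(index), {})[key] = value
    return props


example = atom_props("|^1:1,atomProp:1.hyb.24:1.het.0|")
```

For the example above, atom 1 carries hyb "24" and het "0", and decode_hyb("24") gives sp2 and aromatic.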

Note: atom numbers greater than 10 ** 9 are forbidden and reserved for masked atom numbering. In multiprocess mode, reaction enumeration can hit bugs when templates are built from components prepared in different processes. To avoid this, prepare templates in a single process and then share them.

chython.files.smiles(data, /, *, ignore: bool = True, remap: bool = False, ignore_stereo: bool = False, ignore_bad_isotopes: bool = False, keep_implicit: bool = False, ignore_carbon_radicals: bool = False, ignore_aromatic_radicals: bool = True, _r_cls=<class 'chython.containers.reaction.ReactionContainer'>, _m_cls=<class 'chython.containers.molecule.MoleculeContainer'>) → MoleculeContainer | ReactionContainer

SMILES string parser

Parameters:
  • ignore – Skip some checks of data or try to fix some errors.

  • remap – Remap atom numbers, starting from one.

  • ignore_stereo – Ignore stereo data.

  • keep_implicit – keep the implicit hydrogen count given in the SMILES; otherwise it is ignored on a valence error.

  • ignore_bad_isotopes – reset invalid isotope mark to non-isotopic.

  • ignore_carbon_radicals – fill carbon radicals with hydrogen (X[C](X)X case).

  • ignore_aromatic_radicals – don’t treat aromatic tokens like c[c]c as radicals.
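
Since the parser returns either a MoleculeContainer or a ReactionContainer, callers that accept mixed input usually need to branch on the result type. A minimal sketch, assuming chython is installed. The helper name parse_smiles_batch is hypothetical, and the import path of ReactionContainer is taken from the signature above.

```python
def parse_smiles_batch(strings, **options):
    """Parse SMILES strings and split the results into molecules and reactions."""
    # Imported lazily so the sketch can be loaded without chython installed.
    from chython.containers.reaction import ReactionContainer
    from chython.files import smiles

    molecules, reactions = [], []
    for text in strings:
        parsed = smiles(text, **options)  # options are the keyword flags listed above
        if isinstance(parsed, ReactionContainer):
            reactions.append(parsed)
        else:
            molecules.append(parsed)
    return molecules, reactions
```

For example, parse_smiles_batch(["CCO", "CC>>CO"], ignore_stereo=True) should return one molecule and one reaction.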

chython.files.xyz(matrix: Sequence[Tuple[str, float, float, float]], charge=0, radical=0, radius_multiplier=1.25, atom_charge: Sequence[int] | None = None, _cls=<class 'chython.containers.molecule.MoleculeContainer'>) → MoleculeContainer

chython.files.xyz_file(data) → MoleculeContainer
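
The matrix argument of xyz is a sequence of (symbol, x, y, z) tuples. The sketch below builds such a matrix from text in the common XYZ layout (atom count, comment line, then one atom per line) using only the standard library. parse_xyz_text and to_molecule are hypothetical helpers, and to_molecule assumes chython is installed.

```python
def parse_xyz_text(text):
    """Convert XYZ-format text into a list of (symbol, x, y, z) tuples."""
    lines = text.strip().splitlines()
    count = int(lines[0])            # first line: number of atoms
    matrix = []
    for line in lines[2:2 + count]:  # second line is a free-form comment
        symbol, x, y, z = line.split()[:4]
        matrix.append((symbol, float(x), float(y), float(z)))
    return matrix


def to_molecule(matrix, charge=0):
    """Build a MoleculeContainer from the matrix via chython.files.xyz."""
    from chython.files import xyz  # lazy import; requires chython
    return xyz(matrix, charge=charge)
```

Keeping the text parsing separate from the xyz call makes it easy to inspect or filter coordinates before the molecular graph is restored.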