cgsmiles.read_fragments module

Functions for reading the fragment list.

class cgsmiles.read_fragments.PeekIter(collection)[source]

Bases: object

Custom iterator that allows looking ahead without advancing the underlying iterator.

peek()[source]
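
As an illustration of the look-ahead idea, here is a minimal sketch of a peekable iterator. It is not the cgsmiles implementation: the class name `PeekableSketch`, the one-item buffer, and the choice to return `None` once nothing is left are all assumptions made for this example.

```python
class PeekableSketch:
    """Hypothetical stand-in illustrating look-ahead; not cgsmiles' PeekIter."""

    _EMPTY = object()  # sentinel meaning "nothing buffered yet"

    def __init__(self, collection):
        self._iter = iter(collection)
        self._next = self._EMPTY

    def __iter__(self):
        return self

    def __next__(self):
        # Hand out the buffered item first if peek() already fetched one.
        if self._next is not self._EMPTY:
            item, self._next = self._next, self._EMPTY
            return item
        return next(self._iter)

    def peek(self):
        """Return the upcoming item without consuming it.

        Returning None at the end is an assumption of this sketch.
        """
        if self._next is self._EMPTY:
            try:
                self._next = next(self._iter)
            except StopIteration:
                return None
        return self._next


tokens = PeekableSketch("C1CC1")
first = next(tokens)      # consumes "C"
upcoming = tokens.peek()  # looks at "1" without consuming it
second = next(tokens)     # still yields "1"
```

Buffering a single item keeps the sketch usable with any iterable, not only indexable sequences.
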
cgsmiles.read_fragments.collect_ring_number(smile_iter, token, node_count, rings)[source]

When a ring identifier is found, this function will add the current node to the rings dict.

Parameters:
Returns:

  • PeekIter – the advanced smile_iter

  • str – the current token being processed

  • str – the ring id

  • dict[list] – the updated rings dict

cgsmiles.read_fragments.fragment_iter(fragment_str, all_atom=True)[source]

Iterates over the fragments defined in a CGBigSmile string. Fragments are named residues, each consisting of a single SMILES string together with the BigSmile-specific bonding descriptors. For each fragment, the function yields its name and a plain nx.Graph of the molecule described by the SMILES string. Bonding descriptors are annotated as node attributes under the keyword bonding.

Parameters:
  • fragment_str (str) – the string describing the fragments

  • all_atom (bool) – whether the fragments are all-atom and follow the OpenSmiles syntax (True) or follow the CGsmiles syntax (False)

Yields:

str, nx.Graph

cgsmiles.read_fragments.read_fragments(fragment_str, all_atom=True, fragment_dict=None)[source]

Collects the fragments defined in a CGsmiles fragment string as networkx.Graph objects and returns them in a dict. Bonding descriptors are annotated as node attributes.

Parameters:
  • fragment_str (str) – string using CGsmiles fragment syntax

  • all_atom (bool) – Whether the fragment strings are all-atom and follow the OpenSmiles syntax. Defaults to True; if set to False, the fragments follow the CGsmiles syntax.

  • fragment_dict (dict) – A dict of existing fragments. Only unique new fragments are added to it.

Returns:

a dict of fragments and their names

Return type:

dict
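
The effect of `fragment_dict` can be pictured with a small, self-contained sketch. It assumes that "unique" means "a name not already present in the dict" and that the dict is keyed by fragment name. The helper `merge_fragments_sketch`, the fragment names, and the string placeholders standing in for graphs are hypothetical and not part of cgsmiles.

```python
def merge_fragments_sketch(new_fragments, fragment_dict=None):
    """Add (name, graph) pairs to fragment_dict, keeping existing entries.

    Hypothetical helper; plain strings stand in for networkx graphs.
    """
    if fragment_dict is None:
        fragment_dict = {}
    for name, graph in new_fragments:
        # Assumption: a fragment whose name is already known is left untouched.
        if name not in fragment_dict:
            fragment_dict[name] = graph
    return fragment_dict


existing = {"PEO": "graph-of-PEO"}
merged = merge_fragments_sketch(
    [("PEO", "another-PEO-graph"), ("OHter", "graph-of-OHter")],
    fragment_dict=existing,
)
```

Under these assumptions the existing `PEO` entry is kept as it was and only `OHter` is new.
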

cgsmiles.read_fragments.strip_bonding_descriptors(fragment_string)[source]

Processes a CGsmiles fragment string by stripping the bonding descriptors and storing them in a dict with reference to the atom they refer to. Furthermore, a cleaned SMILES or CGsmiles string is returned.

Parameters:

fragment_string (str) – a CGsmiles fragment string

Returns:

  • str – the cleaned SMILES or CGsmiles string, with the bonding descriptors removed

  • dict – a dict mapping bonding descriptors to the nodes within the string
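
To make the stripping step concrete, the following is a deliberately simplified sketch and not the library's parser. It assumes that bonding descriptors are bracketed tokens starting with `$`, `<`, or `>`, that every atom is a single letter outside brackets, and that a descriptor belongs to the atom written just before it (or to the first atom when it leads the string). Branches, ring closures, bracket atoms, and two-letter elements are ignored. The function name, the index-keyed layout of the returned dict, and the example strings are assumptions of this sketch.

```python
from collections import defaultdict

# Assumed set of characters that open a bonding descriptor.
DESCRIPTOR_SYMBOLS = ("$", "<", ">")


def strip_descriptors_sketch(fragment_string):
    """Return (cleaned string, {atom index: [descriptors]}); simplified."""
    cleaned = []
    descriptors = defaultdict(list)
    atom_count = 0
    i = 0
    while i < len(fragment_string):
        char = fragment_string[i]
        if char == "[" and fragment_string[i + 1:i + 2] in DESCRIPTOR_SYMBOLS:
            end = fragment_string.index("]", i)
            # Attach to the previous atom, or to atom 0 if none was seen yet.
            node = max(atom_count - 1, 0)
            descriptors[node].append(fragment_string[i + 1:end])
            i = end + 1
            continue
        if char.isalpha():
            atom_count += 1  # single-letter atoms only (assumption)
        cleaned.append(char)
        i += 1
    return "".join(cleaned), dict(descriptors)


smiles, bonding = strip_descriptors_sketch("[$]COC[$]")
```

Here the descriptors are keyed by atom index for convenience; the real function may organise the dict differently.
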