
Chemical Identity - Generating InChIKeys and SMILES strings:


Recipe metadata

identifier: RX.X

version: v1.0

Difficulty level

Reading Time

15 minutes

Recipe Type

Hands-on

Executable Code

Yes

Intended Audience

Chemoinformaticians

Data Curators

Data Managers

Data Scientists


Standards:

Databases:

Not applicable.

Identifiers:

  • InChI

Tools:

  • Programming Language: Groovy
  • Dependencies: CDK 2.3 # FAIRPlus SDF tools

Requirements

To run the scripts below, you need a Groovy installation. The scripts use version 2.3 of the Chemistry Development Kit (CDK; see also doi:10.1186/s13321-017-0220-4). This library and its use in Groovy are further explained in the book Groovy Cheminformatics with the Chemistry Development Kit.


Record validation

When generating InChIs, the InChI library may return two states that indicate issues with the compound record in the SD file: WARNING and ERROR. The first script reports such issues:

groovy badRecords.groovy -f foo.sdf
  • Input: SD file
  • Output: Reports validation issues
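The check described above can be sketched as follows. This is a minimal illustration, not the recipe's actual badRecords.groovy: the helper name `inchiIssue`, the `@Grab` coordinates for CDK 2.3, and the report format are assumptions. It relies on CDK 2.3 exposing the InChI return status as `net.sf.jniinchi.INCHI_RET`.

```groovy
// Hypothetical sketch; the real badRecords.groovy may differ.
@Grab(group='org.openscience.cdk', module='cdk-bundle', version='2.3')
import net.sf.jniinchi.INCHI_RET
import org.openscience.cdk.CDKConstants
import org.openscience.cdk.inchi.InChIGeneratorFactory
import org.openscience.cdk.interfaces.IAtomContainer
import org.openscience.cdk.io.iterator.IteratingSDFReader
import org.openscience.cdk.silent.SilentChemObjectBuilder

// Returns a description of the issue, or null when the InChI library reports OKAY.
String inchiIssue(IAtomContainer mol) {
    def generator = InChIGeneratorFactory.getInstance().getInChIGenerator(mol)
    def status = generator.getReturnStatus()
    if (status == INCHI_RET.OKAY) return null
    return "${status}: ${generator.getMessage()}"
}

// Minimal handling of the -f option used in the commands of this recipe.
def argList = args.toList()
int flag = argList.indexOf('-f')
if (flag < 0 || flag + 1 >= argList.size()) {
    println 'Usage: groovy <script> -f <file.sdf>'
} else {
    def reader = new IteratingSDFReader(
        new FileReader(argList[flag + 1]), SilentChemObjectBuilder.getInstance())
    int record = 0
    while (reader.hasNext()) {
        IAtomContainer mol = reader.next()
        record++
        String issue = inchiIssue(mol)
        if (issue != null) {
            // Report the record number and the title line of the SD record.
            println "Record ${record} (${mol.getProperty(CDKConstants.TITLE)}): ${issue}"
        }
    }
    reader.close()
}
```

Records for which the library returns OKAY are skipped silently, so an empty report would suggest that no issues were detected.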

Calculate InChIs

Similarly, InChIKeys can be generated:

groovy inchikeys.groovy -f foo.sdf
  • Input: SD file
  • Output: list of InChIKeys

When the returned state is ERROR, no output is produced.
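A minimal sketch of how InChIKey generation with this ERROR handling might look is given below. It is not the recipe's actual inchikeys.groovy: the helper name `inchiKeyOrNull` and the `@Grab` coordinates are assumptions, and the sketch additionally assumes that only the OKAY and WARNING states should produce output.

```groovy
// Hypothetical sketch; the real inchikeys.groovy may differ.
@Grab(group='org.openscience.cdk', module='cdk-bundle', version='2.3')
import net.sf.jniinchi.INCHI_RET
import org.openscience.cdk.inchi.InChIGeneratorFactory
import org.openscience.cdk.interfaces.IAtomContainer
import org.openscience.cdk.io.iterator.IteratingSDFReader
import org.openscience.cdk.silent.SilentChemObjectBuilder

// Returns the InChIKey, or null when the InChI library reports ERROR
// (assumption: any state other than OKAY or WARNING is treated the same way).
String inchiKeyOrNull(IAtomContainer mol) {
    def generator = InChIGeneratorFactory.getInstance().getInChIGenerator(mol)
    def status = generator.getReturnStatus()
    if (!(status in [INCHI_RET.OKAY, INCHI_RET.WARNING])) return null
    return generator.getInchiKey()
}

// Minimal handling of the -f option used in the commands of this recipe.
def argList = args.toList()
int flag = argList.indexOf('-f')
if (flag < 0 || flag + 1 >= argList.size()) {
    println 'Usage: groovy <script> -f <file.sdf>'
} else {
    def reader = new IteratingSDFReader(
        new FileReader(argList[flag + 1]), SilentChemObjectBuilder.getInstance())
    while (reader.hasNext()) {
        String key = inchiKeyOrNull(reader.next())
        // Nothing is printed for records that could not be processed.
        if (key != null) println key
    }
    reader.close()
}
```

Because failed records are skipped, the number of output lines can be smaller than the number of records in the SD file; running the validation script first shows which records are affected.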

Calculate SMILES strings

The last script calculates a SMILES for each entry in the SD file:

groovy smiles.groovy -f foo.sdf
  • Input: SD file
  • Output: list of SMILES strings
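The SMILES step can be sketched in the same way. Again, this is an illustration rather than the recipe's actual smiles.groovy: the helper name `toSmiles` and the `@Grab` coordinates are assumptions, and the choice of `SmiFlavor.Absolute` (canonical SMILES with stereochemistry and isotopes) is a guess at a reasonable default; the original script may use a different flavor.

```groovy
// Hypothetical sketch; the real smiles.groovy may differ.
@Grab(group='org.openscience.cdk', module='cdk-bundle', version='2.3')
import org.openscience.cdk.interfaces.IAtomContainer
import org.openscience.cdk.io.iterator.IteratingSDFReader
import org.openscience.cdk.silent.SilentChemObjectBuilder
import org.openscience.cdk.smiles.SmiFlavor
import org.openscience.cdk.smiles.SmilesGenerator

// Creates a SMILES string for one molecule.
// Assumption: SmiFlavor.Absolute is the desired flavor.
String toSmiles(IAtomContainer mol) {
    return new SmilesGenerator(SmiFlavor.Absolute).create(mol)
}

// Minimal handling of the -f option used in the commands of this recipe.
def argList = args.toList()
int flag = argList.indexOf('-f')
if (flag < 0 || flag + 1 >= argList.size()) {
    println 'Usage: groovy <script> -f <file.sdf>'
} else {
    def reader = new IteratingSDFReader(
        new FileReader(argList[flag + 1]), SilentChemObjectBuilder.getInstance())
    while (reader.hasNext()) {
        println toSmiles(reader.next())
    }
    reader.close()
}
```

Other flavors, such as `SmiFlavor.Unique` for canonical SMILES without stereochemistry, can be passed to the `SmilesGenerator` constructor instead.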

Authors:

Name | Affiliation | ORCID | CRediT role
Egon Willighagen | Maastricht University, Department of Bioinformatics, NUTRIM School of Nutrition and Translational Research in Metabolism, Faculty of Health, Medicine and Life Sciences | 0000-0001-7542-0286 | Writing - Original Draft
Reviewer

License:

This page is released under the Creative Commons Attribution 4.0 (CC BY 4.0) license.