Chemical Identity - Generating InChIKeys and SMILES strings

Recipe metadata

identifier: RX.X

version: v1.0

Difficulty level

Reading Time: 15 minutes

Recipe Type: Executable Code

Intended Audience: Data Curator, Data Managers, Data Scientists



Not applicable.


  • InChI


  • Programming Language: Groovy
  • Dependencies: CDK 2.3

FAIRPlus SDF tools


To run the scripts below, you need a Groovy installation. The scripts use version 2.3 of the Chemistry Development Kit (CDK; see also doi:10.1186/s13321-017-0220-4). The library and its use in Groovy are explained further in the book Groovy Cheminformatics with the Chemistry Development Kit.


Record validation

When generating InChIs, the InChI library may return one of two states that signal problems with a compound record in the SD file: WARNING and ERROR. The first script reports such issues:

groovy badRecords.groovy -f foo.sdf
  • Input: SD file
  • Output: report of validation issues
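
To illustrate the kind of check described above, here is a minimal sketch. It is not the actual badRecords.groovy. It assumes that CDK 2.3 is fetched with @Grab (the cdk-bundle artifact), that the SD file is read with CDK's IteratingSDFReader, and that the status reported by CDK's InChI generator is compared by its enum name. The helper name findBadRecords and the report format are hypothetical.

```groovy
// Sketch only: NOT the actual badRecords.groovy.
// Assumption: CDK 2.3 is pulled in with @Grab (needs network access on first run).
@Grab(group='org.openscience.cdk', module='cdk-bundle', version='2.3')
import org.openscience.cdk.inchi.InChIGeneratorFactory
import org.openscience.cdk.io.iterator.IteratingSDFReader
import org.openscience.cdk.silent.SilentChemObjectBuilder

// Hypothetical helper: returns one report line for every record whose InChI
// generation did not finish with status OKAY (for example WARNING or ERROR).
List<String> findBadRecords(Reader input) {
    def factory = InChIGeneratorFactory.getInstance()
    def reader = new IteratingSDFReader(input, SilentChemObjectBuilder.getInstance())
    List<String> issues = []
    int record = 0  // counts the records the reader was able to parse
    try {
        while (reader.hasNext()) {
            record++
            def generator = factory.getInChIGenerator(reader.next())
            String status = generator.getReturnStatus().name()
            if (status != 'OKAY') {
                String message = generator.getMessage() ?: ''
                issues << ('record ' + record + ': ' + status + ' ' + message)
            }
        }
    } finally {
        reader.close()
    }
    return issues
}

// Mimic the "-f foo.sdf" command line; do nothing when no file is given.
List<String> cli = binding.hasVariable('args') ? (args as List<String>) : []
int f = cli.indexOf('-f')
if (f >= 0 && f + 1 < cli.size()) {
    new File(cli[f + 1]).withReader { r -> findBadRecords(r).each { println it } }
}
```

Keeping the check in a function that takes a Reader makes it possible to try the sketch on a small in-memory SD record before running it on a full file.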

Calculate InChIKeys

Similarly, InChIKeys can be generated:

groovy inchikeys.groovy -f foo.sdf
  • Input: SD file
  • Output: list of InChIKeys

When the InChI library returns ERROR, nothing is output.
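
The behaviour described here could be sketched roughly as follows. This is not the actual inchikeys.groovy. It assumes CDK 2.3 is loaded with @Grab, and that a key is emitted only when the generator reports OKAY or WARNING. The helper name inchiKeys is hypothetical.

```groovy
// Sketch only: NOT the actual inchikeys.groovy.
// Assumption: CDK 2.3 is pulled in with @Grab (needs network access on first run).
@Grab(group='org.openscience.cdk', module='cdk-bundle', version='2.3')
import org.openscience.cdk.inchi.InChIGeneratorFactory
import org.openscience.cdk.io.iterator.IteratingSDFReader
import org.openscience.cdk.silent.SilentChemObjectBuilder

// Hypothetical helper: returns the InChIKey of every record for which
// InChI generation ended with OKAY or WARNING; other records are skipped.
List<String> inchiKeys(Reader input) {
    def factory = InChIGeneratorFactory.getInstance()
    def reader = new IteratingSDFReader(input, SilentChemObjectBuilder.getInstance())
    List<String> keys = []
    try {
        while (reader.hasNext()) {
            def generator = factory.getInChIGenerator(reader.next())
            String status = generator.getReturnStatus().name()
            if (status == 'OKAY' || status == 'WARNING') {
                keys << generator.getInchiKey()
            }
        }
    } finally {
        reader.close()
    }
    return keys
}

// Mimic the "-f foo.sdf" command line; do nothing when no file is given.
List<String> cli = binding.hasVariable('args') ? (args as List<String>) : []
int f = cli.indexOf('-f')
if (f >= 0 && f + 1 < cli.size()) {
    new File(cli[f + 1]).withReader { r -> inchiKeys(r).each { println it } }
}
```

Because records with an ERROR status are skipped silently, the number of keys can be smaller than the number of records in the SD file. Running the validation step first shows which records are affected.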

Calculate SMILES strings

The last script calculates a SMILES string for each entry in the SD file:

groovy smiles.groovy -f foo.sdf
  • Input: SD file
  • Output: list of SMILES strings
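
A SMILES calculation of this kind might look like the sketch below. It is not the actual smiles.groovy. It assumes CDK 2.3 is loaded with @Grab and uses CDK's SmilesGenerator with the SmiFlavor.Absolute flavour (canonical, with stereochemistry). That flavour is an assumption, since the text above does not say which one the script uses. The helper name smilesStrings is hypothetical.

```groovy
// Sketch only: NOT the actual smiles.groovy.
// Assumption: CDK 2.3 is pulled in with @Grab (needs network access on first run).
@Grab(group='org.openscience.cdk', module='cdk-bundle', version='2.3')
import org.openscience.cdk.io.iterator.IteratingSDFReader
import org.openscience.cdk.silent.SilentChemObjectBuilder
import org.openscience.cdk.smiles.SmiFlavor
import org.openscience.cdk.smiles.SmilesGenerator

// Hypothetical helper: returns one SMILES string per record in the SD input.
List<String> smilesStrings(Reader input) {
    // Assumed flavour; the real script may use a different one.
    def generator = new SmilesGenerator(SmiFlavor.Absolute)
    def reader = new IteratingSDFReader(input, SilentChemObjectBuilder.getInstance())
    List<String> result = []
    try {
        while (reader.hasNext()) {
            result << generator.create(reader.next())
        }
    } finally {
        reader.close()
    }
    return result
}

// Mimic the "-f foo.sdf" command line; do nothing when no file is given.
List<String> cli = binding.hasVariable('args') ? (args as List<String>) : []
int f = cli.indexOf('-f')
if (f >= 0 && f + 1 < cli.size()) {
    new File(cli[f + 1]).withReader { r -> smilesStrings(r).each { println it } }
}
```

Unlike the two InChI sketches, this one does not depend on the InChI library, so the WARNING and ERROR states do not apply here.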


Name: Egon Willighagen

Affiliation: Maastricht University, Department of Bioinformatics, NUTRIM School of Nutrition and Translational Research in Metabolism, Faculty of Health, Medicine and Life Sciences

ORCID: 0000-0001-7542-0286

CRediT role: Writing - Original Draft


This page is released under the Creative Commons Attribution 4.0 (CC BY 4.0) license.