ChemDes

An integrated web-based platform for molecular descriptor and fingerprint computation

  Molecular fingerprints library

Source Fingerprint name Description Details
Pybel FP2

A path-based fingerprint that indexes small-molecule fragments based on linear segments of up to 7 atoms (somewhat similar to the Daylight fingerprints):

A molecular structure is analysed to identify linear fragments of 1 to 7 atoms. Single-atom fragments of C, N, and O are ignored. A fragment is terminated when the atoms form a ring. For each fragment, the atoms, the bonding, and whether the fragment constitutes a complete ring are recorded and saved in a set, so that there is only one of each fragment type. Chemically identical versions (i.e. ones with the atoms listed in reverse order, or rings listed starting at different atoms) are identified, and only a single canonical fragment is retained. Each remaining fragment is assigned a hash number from 0 to 1020, which is used to set a bit in a 1024-bit vector.

 
FP3 uses a series of SMARTS queries stored in patterns.txt from Open Babel Details>>
FP4 uses a series of SMARTS queries stored in SMARTS_InteLigand.txt from Open Babel Details>>
MACCS uses the SMARTS patterns in MACCS.txt from Open Babel Details>>
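The FP2 procedure described above (enumerate linear fragments, canonicalise them, hash each one to a bit) can be illustrated with a small pure-Python sketch. This is not Open Babel's implementation: the molecule is a hand-written toy graph, bond orders and ring-closure records are left out, and the CRC32-based hash is an assumption chosen only because it yields a number from 0 to 1020.

```python
import zlib

def canonical_fragments(symbols, adjacency, max_atoms=7):
    """Collect canonical linear fragments as tuples of element symbols."""
    fragments = set()

    def extend(path):
        labels = tuple(symbols[i] for i in path)
        # Single-atom C, N and O fragments are ignored.
        if not (len(labels) == 1 and labels[0] in {"C", "N", "O"}):
            # A path and its reverse are the same fragment: keep one ordering.
            fragments.add(min(labels, labels[::-1]))
        if len(path) == max_atoms:
            return
        for neighbor in adjacency[path[-1]]:
            # Simplification: stop at a ring closure instead of recording it.
            if neighbor not in path:
                extend(path + [neighbor])

    for start in adjacency:
        extend([start])
    return fragments

def fragment_bits(fragments, modulus=1021):
    """Map each fragment to a bit position in 0..1020 (hypothetical hash)."""
    return {zlib.crc32("-".join(f).encode("ascii")) % modulus for f in fragments}

# Heavy-atom graph of ethanol: C0-C1-O2
symbols = ["C", "C", "O"]
adjacency = {0: [1], 1: [0, 2], 2: [1]}
fragments = canonical_fragments(symbols, adjacency)
bits = fragment_bits(fragments)
```

For ethanol the sketch keeps three fragments (C-C, C-O and C-C-O); the reversed paths O-C and O-C-C collapse onto their canonical forms before hashing.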
Chemopy Daylight-type fingerprints (Topological fingerprint) A Daylight-like fingerprint based on hashing molecular subgraphs  
FP4 307 FP4 fingerprints  
MACCS (MACCS keys) Uses the 166 public keys implemented as SMARTS
E-state 79 E-state fingerprints or fragments  
Atom Pairs Atom Pairs fingerprints
Torsions Topological torsion fingerprints  
Morgan Fingerprints based on the Morgan algorithm  
CDK CDK fingerprints Generates a fingerprint for a given AtomContainer. Fingerprints are one-dimensional bit arrays, where bits are set according to the occurrence of a particular structural feature (see, for example, the Daylight Inc. theory manual for more information). Fingerprints allow for a fast screening step to exclude candidates for a substructure search in a database. They are also a means for determining the similarity of chemical structures.
Pubchem fingerprints Generates a Pubchem fingerprint for a molecule. These fingerprints are of the structural key type, of length 881. Details>>
Extended fingerprints Generates an extended fingerprint for a given IAtomContainer that extends the CDK Fingerprinter with additional bits describing ring features.
EState fingerprints This fingerprinter generates 79-bit fingerprints using the E-State fragments.

The E-State fragments are those described in [Hall, L.H. and Kier, L.B., Electrotopological State Indices for Atom Types: A Novel Combination of Electronic, Topological, and Valence State Information, Journal of Chemical Information and Computer Sciences, 1995, 35:1039-1045], and the SMARTS patterns were taken from RDKit.

 
MACCS fingerprints This fingerprinter generates 166-bit MACCS keys.

The SMARTS patterns for each of the features were taken from RDKit. However, given that there is no official and explicit listing of the original key definitions, the results of this implementation may differ from others.

 
Klekota-Roth fingerprints SMARTS-based substructure fingerprint derived from chemical substructures that enrich for biological activity [Klekota, Justin and Roth, Frederick P., Chemical substructures that enrich for biological activity, Bioinformatics, 2008, 24:2518-2525]. Records the presence of 4860 substructures.
GraphOnly fingerprints Specialized version of the CDK Fingerprinter which does not take bond orders into account.  
Hybridization fingerprints Unlike the CDK Fingerprinter, this fingerprinter does not take into account aromaticity. Instead, it takes into account SP2 hybridization states.  
Substructure fingerprints  The fingerprint currently supports 307 substructures. Details>>
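The CDK fingerprints entry above names two uses of bit fingerprints: a fast screening step before substructure search, and similarity calculation. A minimal sketch of both follows, with fingerprints modelled as Python sets of "on" bit positions. The bit values are invented for illustration, and the Tanimoto coefficient is one common similarity measure, assumed here rather than prescribed by the text.

```python
def may_contain(query_bits, target_bits):
    """Screening test: a target can only contain the query as a
    substructure if every bit set in the query is also set in the target.
    Passing the screen is necessary, not sufficient."""
    return query_bits <= target_bits

def tanimoto(a, b):
    """Tanimoto coefficient: shared bits divided by bits set in either.
    Returning 0.0 for two empty fingerprints is a convention chosen here."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

# Hypothetical fingerprints (positions of set bits).
query = {3, 17}
target_a = {3, 17, 42, 99}
target_b = {3, 42}

candidates = [t for t in (target_a, target_b) if may_contain(query, t)]
similarity = tanimoto(query, target_a)
```

Only `target_a` survives the screen, so the expensive substructure match would run on it alone; `target_b` lacks bit 17 and is excluded without any graph matching.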
RDKit RDK fingerprints An RDKit topological fingerprint for a molecule. Generates a topological (Daylight-like) fingerprint for a molecule using an alternate (faster) hashing algorithm. Details>>
Layered fingerprints A layered fingerprint for a molecule. Generates a topological (Daylight-like) fingerprint for a molecule using a layer-based hashing algorithm. Details>>
Atom Pairs fingerprints Returns the atom-pair fingerprint for a molecule. The algorithm used is described here: R.E. Carhart, D.H. Smith, R. Venkataraghavan; "Atom Pairs as Molecular Features in Structure-Activity Studies: Definition and Applications" JCICS 25, 64-73 (1985). Details>>
Morgan fingerprints Returns a Morgan fingerprint for a molecule.

These fingerprints are similar to the well-known ECFP or FCFP fingerprints, depending on which invariants are used.

The algorithm used is described in the paper Rogers, D. & Hahn, M. Extended-Connectivity Fingerprints. JCIM 50:742-54 (2010)

Details>>
MACCSkeys fingerprints Returns the MACCS keys for a molecule. The result is a 167-bit vector. There are 166 public keys, but to maintain consistency with other software packages they are numbered from 1. Details>>
TopologicalTorsion fingerprints Returns the topological-torsion fingerprint for a molecule. Details>>
Pattern fingerprints A fingerprint using SMARTS patterns. Generates a topological fingerprint for a molecule using a series of pre-defined structural patterns. Details>>
E-state fingerprints Generates the EState fingerprints for the molecule. Concept from the paper: Hall and Kier, JCICS 35, 1039-1045 (1995). Details>>
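The 167-bit length noted for the MACCSkeys entry above follows from numbering the 166 public keys from 1: position 0 exists only as a placeholder. The sketch below shows that layout with a plain list; the matched key numbers are hypothetical and no real SMARTS matching is performed.

```python
N_PUBLIC_KEYS = 166

def keys_to_bit_vector(matched_keys):
    """Build a 167-element vector in which index k holds key number k.
    Index 0 is never set; it only keeps the numbering 1-based."""
    vector = [0] * (N_PUBLIC_KEYS + 1)
    for key in matched_keys:
        if not 1 <= key <= N_PUBLIC_KEYS:
            raise ValueError(f"MACCS key out of range: {key}")
        vector[key] = 1
    return vector

# Hypothetical set of keys that matched some molecule.
vector = keys_to_bit_vector([1, 57, 166])
```

When comparing MACCS output across packages it is worth checking whether the other tool emits 166 or 167 positions, since an off-by-one shift makes every bit appear to disagree.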
PaDEL Pubchem fingerprints Pubchem fingerprints Details>>
MACCS fingerprints MACCS keys Details>>
CDK fingerprints Fingerprint of length 1024 and search depth of 8.  
CDK extended fingerprints Extends the Fingerprinter with additional bits describing ring features.  
E-State fingerprints E-State fragments. Hall LH, Kier LB. Electrotopological state indices for atom types: A novel combination of electronic, topological, and valence state information. J Chem Inf Comput Sci, 1995;35:1039-45. Details>>
Klekota-Roth fingerprints Presence of chemical substructures. Klekota J, Roth FP. Chemical substructures that enrich for biological activity. Bioinformatics, 2008;24(21):2518-25. Details>>
Klekota-Roth fingerprint count Count of chemical substructures. Klekota J, Roth FP. Chemical substructures that enrich for biological activity. Bioinformatics, 2008;24(21):2518-25.
CDK graph only fingerprints Specialized version of the Fingerprinter which does not take bond orders into account.  
Substructure fingerprints Presence of SMARTS Patterns for Functional Group Classification by Christian Laggner. Details>>
Substructure fingerprint count Count of SMARTS Patterns for Functional Group Classification by Christian Laggner.
2D atom pairs Presence of atom pairs at various topological distances. Details>>
2D atom pairs count Count of atom pairs at various topological distances.
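Several PaDEL entries above come in two forms, "presence" and "count". The difference can be shown with a toy atom-pair calculation: every pair of atoms is described by its two atom types and the topological (shortest-path) distance between them. This sketch assumes element symbols as atom types, which is simpler than the typing used by real atom-pair implementations, and the molecule is a hand-written graph.

```python
from collections import Counter, deque

def shortest_path_lengths(adjacency, source):
    """Breadth-first search giving the bond distance from source to each atom."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        atom = queue.popleft()
        for neighbor in adjacency[atom]:
            if neighbor not in dist:
                dist[neighbor] = dist[atom] + 1
                queue.append(neighbor)
    return dist

def atom_pair_counts(symbols, adjacency):
    """Count (type, distance, type) descriptors over all unordered atom pairs."""
    counts = Counter()
    for i in adjacency:
        for j, d in shortest_path_lengths(adjacency, i).items():
            if i < j:  # visit each unordered pair once
                a, b = sorted((symbols[i], symbols[j]))
                counts[(a, d, b)] += 1
    return counts

# Heavy-atom graph of diethyl ether: C0-C1-O2-C3-C4
symbols = ["C", "C", "O", "C", "C"]
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

counts = atom_pair_counts(symbols, adjacency)   # count form
presence = set(counts)                          # presence form
```

The count form records that the C-C pair at distance 1 occurs twice, while the presence form only records that it occurs at all.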
JCompoundMapper DFS
All-Path Encoding (DFS). All-path encodings are paths generated by a graph traversal with a modified depth-first search, as proposed by Ralaivola et al. Details>>
ASP The ASP encoding equals the DFS encoding, except that only those paths from an atom are stored that are shortest paths from the root atom to the last atom in the path, which leads to a sparser representation.
AP2D This encoding contains atom types and the shortest path distance information between all pairs of atoms.
AT2D This encoding extends the AP2D encoding by a further atom.
AP3D Geometrical Atom Pairs (AP3D). This encoding is implemented similarly to its topological counterpart AP2D. The only difference is that the geometrical distance matrix is used for the distance information.
AT3D Geometrical Atom Triplets (AT3D). This encoding is implemented similarly to its topological counterpart AT2D. The only difference is that the geometrical distance matrix is used for the distance information.
CATS2D The CATS2D descriptors encode the pairwise topological relationships of PPP (potential pharmacophore point) patterns in a molecular graph by a vector of fixed size.
CATS3D Geometrical CATS fingerprints (CATS3D). This implementation differs from the description of the original CATS3D, which uses the Molecular Operating Environment (MOE, Chemical Computing Group, http://www.chemcomp.com/) patterns to depict surface features of a molecule.
PHAP2POINT2D The PHAP2PT2D encoding is computed similarly to the AP2D. However, instead of atom types, the information of all PPPs of an atom is used to generate the fingerprint.
PHAP3POINT2D Compared with PHAP2PT2D, the PHAP3PT2D encoding uses three points.
PHAP2POINT3D Geometrical pharmacophore fingerprints (PHAP2PT3D). This fingerprint is derived from its topological variant PHAP2PT2D by replacing the topological distance matrix with the geometrical distance matrix.
PHAP3POINT3D Geometrical pharmacophore fingerprints (PHAP3PT3D). This fingerprint is derived from its topological variant PHAP3PT2D by replacing the topological distance matrix with the geometrical distance matrix.
ECFP Each ECFP feature represents a circular substructure around a center atom.
ECFPVariant A variant of the ECFP as described by Rogers and Hahn.
LSTAR Local Path Environments (LSTAR). This fingerprint is a radial fingerprint similar to RAD2D. The major difference is that all paths up to depth d are stored in a shell.
SHED The SHED keys are closely related to the pharmacophore atom-pair-based encodings, with two major differences: first, the number of dimensions is fixed; second, the entries describe not a count but the entropy of the respective atom pair descriptor.
RAD2D Topological Molprint-like fingerprints (RAD2D). This encoding was proposed by Bender et al.
RAD3D Geometrical Molprint-like fingerprints (RAD3D). These encodings are the geometrical counterpart of the topological RAD2D encoding.
MACCS MACCS keys (166)
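The SHED entry above says that each dimension holds an entropy rather than a count. One way to picture this is to take, for a given pair of pharmacophore types, the number of atom pairs found at each topological distance and reduce that distribution to a single Shannon entropy value. The sketch below assumes plain Shannon entropy in bits and invented distance counts; the exact formulation used by jCompoundMapper is not given in the table, so treat this as an illustration only.

```python
import math

def shannon_entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# Hypothetical counts of one pharmacophore pair type at distances 1..4.
spread_out = [1, 1, 1, 1]     # pairs evenly spread over four distances
concentrated = [4, 0, 0, 0]   # all pairs at a single distance

entropy_spread = shannon_entropy(spread_out)
entropy_concentrated = shannon_entropy(concentrated)
```

Both distributions contain four pairs, so a count-based encoding would treat them alike at the pair-type level, whereas the entropy separates the evenly spread case (2 bits) from the concentrated one (0 bits).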


Copyright @ 2012- Computational Biology & Drug Design Group,
School of Pharmaceutical Sciences, Central South University. All rights reserved.

ChemDes by CBDD Group, CSU, China is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. E-mail: biomed@csu.edu.cn