RDKit

Cheminformatics and Machine Learning Software

RDKit Ranking & Summary

  • License: BSD License
  • Price: FREE
  • Publisher Name: Greg Landrum
  • Publisher web site: http://www.rdkit.org/

RDKit Description

RDKit is a Python library with data structures, algorithms, and scripts for cheminformatics.

General Molecular Functionality

  • Input/Output: SMILES, mol, SDF, TDT
  • “Cheminformatics”:
      – Substructure searching with SMARTS
      – Canonical SMILES
      – Chirality support
      – Easy serialization (molecule <-> text)
  • 2D depiction (including constrained depiction)
  • Generation of 2D -> 3D via distance geometry
  • UFF implementation for cleaning up geometries
  • Fingerprinting (Daylight-like, “MACCS keys”, etc.)
  • Similarity/diversity picking

General Molecular Functionality (continued)

  • Subgraph/Fragment analysis
  • Gasteiger charges
  • Shape-based similarity
  • Molecule-molecule alignment
  • Molecular transformations (using SMARTS)

General “QSAR” Functionality

  • Molecular descriptor library:
      – Topological (κ3, Balaban J, etc.)
      – Electrotopological state (EState)
      – ClogP, MR
      – “MOE like” VSA descriptors
      – others
  • Learning:
      – Clustering
      – Decision trees, naïve Bayes*, kNN*
        (*Functional, but not a great implementation)
      – Bagging, random forests
      – Infrastructure:
          • data splitting
          • shuffling
          • out-of-bag classification
          • serializable models
          • enrichment plots, screening, etc.

Command Line Tools

  • ML/BuildComposite.py: build models
  • ML/ScreenComposite.py: screen models
  • ML/EnrichPlot.py: generate enrichment plot data
  • ML/AnalyzeComposite.py: analyze models (descriptor levels)
  • Chem/Fingerprints/FingerprintMols.py: generate 2D fingerprints
  • Chem/BuildFragmentCatalog.py: CASE-type analysis with a hierarchical catalog

Requirements:

  · Python

What's New in This Release:

  · The directory structure of the distribution has been changed to make installation of the RDKit Python modules more straightforward. Specifically, the directory $RDBASE/Python has been renamed to $RDBASE/rdkit, and the Python code now expects $RDBASE to be in your PYTHONPATH. When importing RDKit Python modules, use "from rdkit import Chem" instead of "import Chem".
    Old code will continue to work if you also add $RDBASE/rdkit to your PYTHONPATH, but it is strongly suggested that you update your scripts to reflect the new organization.
  · For C++ programmers: There is a non-backwards-compatible change in the way atoms and bonds are stored on molecules. See the *Other* section for details.

Acknowledgements

  · Kirk DeLisle, Noel O'Boyle, Andrew Dalke, Peter Gedeck, Armin Widmer

Bug Fixes

  · Incorrect handling of 0s as ring closure digits (issues 2525792 and 2690982)
  · Incorrect handling of atoms with explicit Hs in reactions (issue 2540021)
  · SmilesMolSupplier.GetItemText() crashes (issue 2632960)
  · Incorrect handling of dot separations in reaction SMARTS (issue 2690530)
  · Bad charge lines in mol blocks for large molecules (issue 2692246)
  · Order dependence in AssignAtomChiralTagsFromStructure (issue 2705543)
  · Order dependence in the 2D pharmacophore code
  · The LayeredFingerprints now handle non-aromatic single ring bonds between aromatic atoms correctly.

New Features

  · BRICS implementation
  · Morgan/circular fingerprints implementation
  · The 2D pharmacophore code now uses standard RDKit fdef files.
  · Atom parity information in CTABs is now written and read. If present on reading, atom parity flags are stored in the atomic property "molParity".
  · An optional "fromAtoms" argument has been added to the atom pairs and topological torsion fingerprints. If this is provided, only atom pairs including the specified atoms, or torsions that either start or end at the specified atoms, will be included in the fingerprint.
  · Kekulization is now optional when generating CTABs. Since the MDL spec suggests that aromatic bonds not be used, this is primarily intended for debugging purposes.
  · The removeStereochemistry() (RemoveStereoChemistry() from Python) function has been added to remove all stereochemical information from a molecule.

Other

  · The Qt3-based GUI functionality in $RDBASE/rdkit/qtGui and $RDBASE/Projects/SDView is deprecated.
    It should still work, but it will be removed in a future release. Please do not build anything new on this (very old and creaky) framework.
  · The function DaylightFingerprintMol() is now deprecated; use RDKFingerprintMol() instead.
  · For C++ programmers: The ROMol methods getAtomPMap() and getBondPMap() have been removed. The molecules themselves now support an operator[]() method that can be used to convert graph iterators (e.g. ROMol::edge_iterator, ROMol::vertex_iterator, ROMol::adjacency_iterator) to the corresponding Atoms and Bonds.

    New API for looping over an atom's bonds:

        // molPtr is a const ROMol *
        // atomPtr is a const Atom *
        ROMol::OEDGE_ITER beg,end;
        boost::tie(beg,end) = molPtr->getAtomBonds(atomPtr);
        while(beg!=end){
          // operator[] converts the edge iterator to the Bond
          const BOND_SPTR bond=(*molPtr)[*beg];
          // ... do something with the Bond ...
          ++beg;
        }

    New API for looping over a molecule's atoms:

        // mol is an ROMol
        ROMol::VERTEX_ITER atBegin,atEnd;
        boost::tie(atBegin,atEnd) = mol.getVertices();
        while(atBegin!=atEnd){
          // operator[] converts the vertex iterator to the Atom
          ATOM_SPTR at2=mol[*atBegin];
          // ... do something with the Atom ...
          ++atBegin;
        }
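
The release notes ask scripts to switch from "import Chem" to "from rdkit import Chem", while noting that old code keeps working if $RDBASE/rdkit is also on PYTHONPATH. For scripts that have to run against both layouts during a transition, one possible shim is sketched below. The helper name load_chem is hypothetical and not part of RDKit; the sketch assumes only that the Chem module provides MolFromSmiles, MolToSmiles, MolFromSmarts and HasSubstructMatch, which cover the canonical SMILES and SMARTS substructure search features listed in the description.

```python
import importlib


def load_chem():
    """Return RDKit's Chem module, or None if RDKit cannot be imported.

    load_chem is a hypothetical helper for illustration, not an RDKit API.
    """
    # Try the new package layout first, then the pre-reorganization layout.
    for module_name in ("rdkit.Chem", "Chem"):
        try:
            return importlib.import_module(module_name)
        except ImportError:
            continue
    return None


Chem = load_chem()
if Chem is None:
    print("RDKit not found; check that $RDBASE is on PYTHONPATH")
else:
    ethanol = Chem.MolFromSmiles("OCC")
    hydroxyl = Chem.MolFromSmarts("[OX2H]")
    print(Chem.MolToSmiles(ethanol))            # canonical SMILES
    print(ethanol.HasSubstructMatch(hydroxyl))  # SMARTS substructure search
```

New scripts are better served by the plain "from rdkit import Chem" form; the fallback is only useful while older installations are still in circulation.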

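The description lists fingerprinting together with similarity/diversity picking. As a rough illustration of what those terms mean, the sketch below computes Tanimoto similarity on fingerprints represented as sets of on-bit positions and then runs a greedy MaxMin selection. It is plain Python and does not call RDKit; the function names and the toy fingerprints are made up for the example, and the page does not say which picking strategy RDKit itself implements.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    if not fp_a and not fp_b:
        return 1.0
    shared = len(fp_a & fp_b)
    return shared / float(len(fp_a) + len(fp_b) - shared)


def maxmin_pick(fingerprints, n_pick, first=0):
    """Greedy MaxMin selection: repeatedly add the item farthest from the picks."""
    if not fingerprints or n_pick <= 0:
        return []
    picked = [first]
    # Distance from every item to its nearest already-picked item.
    nearest = [1.0 - tanimoto(fp, fingerprints[first]) for fp in fingerprints]
    while len(picked) < min(n_pick, len(fingerprints)):
        remaining = [i for i in range(len(fingerprints)) if i not in picked]
        candidate = max(remaining, key=lambda i: nearest[i])
        picked.append(candidate)
        for i, fp in enumerate(fingerprints):
            dist = 1.0 - tanimoto(fp, fingerprints[candidate])
            nearest[i] = min(nearest[i], dist)
    return picked


# Toy fingerprints: made-up bit positions, not produced by RDKit.
fps = [{1, 2, 3, 4}, {1, 2, 3, 5}, {10, 11, 12}, {1, 2, 10, 11}]
print(tanimoto(fps[0], fps[1]))
print(maxmin_pick(fps, 3))
```

With real molecules the sets would be replaced by fingerprints generated by the toolkit; the selection logic stays the same.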

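ML/EnrichPlot.py is described above as generating enrichment plot data. The sketch below shows one common way such data can be computed: compounds are ranked by model score, and at each rank the fraction of the set screened is paired with the fraction of known actives recovered. It is an assumption-laden illustration in plain Python and does not reproduce the script's actual options or output format; enrichment_curve and the sample scores are hypothetical.

```python
def enrichment_curve(scores, is_active):
    """Return (fraction screened, fraction of actives found) points.

    Compounds are ranked by descending score before counting.
    """
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    total = len(ranked)
    total_actives = sum(1 for _, active in ranked if active)
    if total == 0 or total_actives == 0:
        return []
    points = []
    found = 0
    for rank, (_, active) in enumerate(ranked, start=1):
        if active:
            found += 1
        points.append((rank / float(total), found / float(total_actives)))
    return points


# Made-up model scores and activity labels for five compounds.
scores = [0.9, 0.8, 0.4, 0.3, 0.1]
labels = [True, False, True, False, False]
for screened, recovered in enrichment_curve(scores, labels):
    print("%.2f %.2f" % (screened, recovered))
```

A curve that rises well above the diagonal indicates that the model ranks actives ahead of inactives.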