CDD’s Public Database is a Valuable Resource

CDD Public gives CDD Vault users access to more than 100 well-curated datasets that can be mined and analyzed alongside your own private data. Compounds stored in CDD Public can be searched and filtered using CDD Vault’s easy-to-use, chemically intelligent interface. In addition, CDD has recently developed the ability to build QSAR models. This means that you can now build a model based on any property stored in CDD Public and see how your compounds compare. This may be especially useful for predicting target selectivity or ADME/tox properties.

Working with NCBI-PubChem, CDD has now provided links to supporting data, such as bioactivity, ADME, toxicity, properties, and sources, for the more than one million molecules in CDD Public. CDD also added 94,000 structures that were previously unknown to PubChem. These include a number of structure-activity collections for rare and neglected diseases such as malaria, tuberculosis, and tropical parasitic infections, as well as toxicity and property data. PubChem recently highlighted this announcement on their homepage.


The PubChem database recently passed 50 million known compounds, making it quite possibly the most comprehensive collection of its kind. CDD is proud of our user community for this contribution. If you know of a public dataset that you think would be of value to the CDD community, please let us know and we will add it to the CDD Public database.

Here is a quick sampling of high-interest datasets for CDD users:

  • FDA APPROVED: Approved Drugs – drugs with defined molecular structures, including 763 molecules from the Physicians’ Desk Reference, 780 from DrugBank, 1151 in the Orange Book 2007, and 1007 from Dr. Chris Lipinski’s FDA list.
  • KINASE: ChEMBL Kinase SARfari Compounds & BioAssay Data – makes available a large amount of SAR data for kinase-active compounds against a wide range of kinases in a broad array of assays, much of it manually mined and curated from the literature.
  • GPCR: PDSP Ki Database – has over 47,000 Ki values against 699 GPCR targets from the NIMH Psychoactive Drug Screening Program (PDSP) Database.