CDD Vault Update (November 2023): Importing Text in Data Files, and APIs for Structure Images, QC Report Details and Ambiguous Structures
Handling Non-Numeric Values in a Data Import File
Data files imported into CDD Vault often contain text values in columns that are mapped to numeric fields or readouts. Examples include “ND” for “Not Determined” or “N/A” for “Not Applicable”.
The CDD Vault Import Data Wizard now reports these rows as Suspicious Events in the QC Report, and the user can choose one of two options for them:
ACCEPT - the text values being mapped to a numeric destination will be “blanked out” and the rest of the data on these rows will be successfully imported
REJECT - none of the data on these rows will be imported
As a quick example, importing this data file and mapping the Inhibition column to a Protocol numeric readout definition …
… will result in a Suspicious Event and any row containing textual data will be REJECTED by default.
The default REJECT selection matches the old behavior, and no data from the affected rows will be imported.
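To see in advance which rows are likely to be flagged, a file can be pre-scanned locally before import. The following is a minimal sketch, not part of CDD Vault: the function name and the sample data are hypothetical, and the Import Data Wizard's own rules for what counts as numeric may differ from Python's float().

```python
import csv
import io

def find_non_numeric_rows(csv_text, column):
    """Return (row_number, value) pairs whose value in `column` is not a number.

    Row numbers count data rows from 1 (the header is excluded).
    Empty cells are skipped. Note that float() also accepts strings
    such as "nan" and "inf", which this sketch treats as numeric.
    """
    flagged = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        value = (row.get(column) or "").strip()
        if not value:
            continue
        try:
            float(value)
        except ValueError:
            flagged.append((row_number, value))
    return flagged

# Hypothetical sample file with an Inhibition column.
sample = """Compound,Inhibition
CMP-1,45.2
CMP-2,ND
CMP-3,N/A
CMP-4,88
"""

flagged = find_non_numeric_rows(sample, "Inhibition")
for row_number, value in flagged:
    print(f"row {row_number}: {value!r} is not numeric")
```

Rows listed by this check are the ones to review in the QC Report, where each can be accepted (text blanked out, rest of the row imported) or rejected.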
API Endpoint for Structure Images
The GET Molecules API call has a new /image parameter that will retrieve the image of the registered Molecule.
runs as an async API call
returns an Export ID
retrieves the image of the structure
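The start, poll, download flow implied by an async call that returns an Export ID can be sketched as below. This is an illustration only: the three callables stand in for the actual HTTP requests, whose exact URLs are not given here, and the "finished" status string is an assumption.

```python
import time

def fetch_structure_image(request_image, get_status, download,
                          done_status="finished",
                          poll_seconds=2.0, max_polls=30):
    """Run an async export: start it, poll until done, then download.

    request_image() -> export id returned by the image request
    get_status(export_id) -> current status string of the export
    download(export_id) -> image bytes once the export is done
    `done_status` is an assumed value; check the API documentation.
    """
    export_id = request_image()
    for _ in range(max_polls):
        if get_status(export_id) == done_status:
            return download(export_id)
        time.sleep(poll_seconds)
    raise TimeoutError(f"export {export_id} did not finish in time")

# Stand-in callables so the sketch runs without network access.
statuses = iter(["started", "finished"])
image = fetch_structure_image(
    request_image=lambda: 123,
    get_status=lambda export_id: next(statuses),
    download=lambda export_id: b"fake-image-bytes",
    poll_seconds=0,
)
print(len(image), "bytes received")
```

In real use, each callable would wrap an authenticated request to the Vault API; keeping them as parameters makes the polling logic testable on its own.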
New Parameter for GET Slurps API Endpoint
There is a new show_events parameter on the GET Slurps API endpoint that will show any row that generated a Suspicious Event or an Error. The details of the event are also included.
Once you've done the POST Slurps call, the next step is to use GET Slurps to check the status of the import. Including the new show_events parameter will add the details of any Suspicious Events and Errors to the JSON that is returned.
With JSON like this:
Now returns JSON that includes suspicious/ambiguous events and errors, something like this:
"message": "Record rejected because no batch with External Identifier 'DoesNotExist' exists in your database."
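One way to pull the event details out of the returned JSON is sketched below. Only the "message" key is taken from the example above; the surrounding structure used in the sample (the "hypothetical_events" key) is invented for illustration, so the helper walks the whole response rather than relying on a specific layout.

```python
import json

def collect_messages(payload):
    """Recursively gather every string stored under a "message" key."""
    found = []
    if isinstance(payload, dict):
        for key, value in payload.items():
            if key == "message" and isinstance(value, str):
                found.append(value)
            else:
                found.extend(collect_messages(value))
    elif isinstance(payload, list):
        for item in payload:
            found.extend(collect_messages(item))
    return found

KNOWN_MESSAGE = ("Record rejected because no batch with External Identifier "
                 "'DoesNotExist' exists in your database.")

# Hypothetical response body; the real layout may differ.
response_text = json.dumps({"hypothetical_events": [{"message": KNOWN_MESSAGE}]})

messages = collect_messages(json.loads(response_text))
for message in messages:
    print(message)
```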
Easily Register Ambiguous Structures
Use the new duplicate_resolution parameter to register ambiguous OR structures (structures drawn with the OR enhanced stereo label). This provides a way to register a new Molecule (versus a new Batch of an existing Molecule) via the API.
For the majority of CDD Vaults, which use the chemical registration system, use the duplicate_resolution parameter with the POST Batch API call to register a new Molecule.
By default (no parameter is used), a new Batch of the existing record is created. If more than one matching Molecule exists and no parameter is used, an error is returned and neither a new Batch nor a new Molecule is created.
Specify one of the following options when using this parameter:
results in a new Batch being registered for the first Molecule detected as a potential tautomer or duplicate
results in a new Molecule being registered
results in nothing being registered
matching molecule IDs are returned
For Vaults which do not utilize the chemical registration system, use the duplicate_resolution parameter with the POST Molecule API call.
If this looks familiar, you are correct - these options were previously available for the tautomer_resolution parameter, which is no longer needed because the new parameter handles all forms of duplicates: tautomers, ambiguous stereocenters, and intentional duplicates.
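As a rough sketch of how the parameter might be attached to a registration request, the helper below adds duplicate_resolution to a JSON body only when a value is supplied. Sending it as a top-level JSON field is an assumption, the record fields are hypothetical, and the option value is a placeholder to be replaced with one of the documented options.

```python
import json

def build_registration_body(record, duplicate_resolution=None):
    """Return a JSON request body for a registration call.

    Assumption: duplicate_resolution is sent as a top-level JSON field.
    When it is None the field is omitted, so the default behavior
    described above applies.
    """
    body = dict(record)  # copy so the caller's dict is not modified
    if duplicate_resolution is not None:
        body["duplicate_resolution"] = duplicate_resolution
    return json.dumps(body)

# Hypothetical record; real field names depend on the endpoint used.
record = {"name": "EXAMPLE-001"}

default_body = build_registration_body(record)
explicit_body = build_registration_body(record, "OPTION_FROM_DOCS")  # placeholder value
print(default_body)
print(explicit_body)
```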