Open
Labels: enhancement (New feature or request)
Description
As @brindakv mentioned, python-ihm currently does various sanity checks to ensure the generated mmCIF is self-consistent, e.g. checking the model sequence against that in the struct_ref table. However, we could do additional sanity checks (perhaps as part of the make-mmcif.py script, or another script used as part of the deposition pipeline) that validate external resources. (I would be reluctant to have these done as part of generating every file, since they would make multiple network connections, referenced files might not exist at modeling time, and many issues might be warnings rather than errors or would need manual intervention.) For example, we could:
- Query UniProt and check to make sure that the struct_ref sequence matches (complication: may need to check multiple versions of the UniProt sequence since it does change).
- Ping any DOI referenced in the file to make sure it exists.
- Download any referenced external archive files and make sure that any files referenced inside those archives exist.
- Look up any accessions (e.g. SASBDB, EMDB) to make sure that a) they exist, b) they have been released and c) they match the model (e.g. by checking model fit or checking that both model and data reference the same UniProt sequence).
- Look up any PMIDs and make sure the citation matches.
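
As a rough illustration of the first two checks, here is a minimal sketch of what a standalone validation script might look like. Nothing here is existing python-ihm API: the function names (`parse_fasta`, `check_struct_ref_sequence`, `check_dois`, `url_exists`, `fetch_current_uniprot`) are hypothetical, and the UniProt REST URL and the use of `https://doi.org/` as a resolver are assumptions about the external services. Each check returns a list of warning strings rather than raising, since many problems would need manual review, and the network access is passed in as a callable so the logic can be exercised offline.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, List


def parse_fasta(text: str) -> str:
    """Return the residues of a single-record FASTA string, uppercased."""
    lines = (ln.strip() for ln in text.splitlines())
    return "".join(ln for ln in lines if ln and not ln.startswith(">")).upper()


def fetch_current_uniprot(accession: str) -> List[str]:
    """Fetch only the current UniProt sequence as FASTA (needs network).

    Assumption: the rest.uniprot.org URL layout below. Older sequence
    versions would need a separate lookup, which is not attempted here.
    """
    url = f"https://rest.uniprot.org/uniprotkb/{accession}.fasta"
    with urllib.request.urlopen(url, timeout=10.0) as resp:
        return [resp.read().decode("utf-8")]


def check_struct_ref_sequence(
    accession: str,
    model_seq: str,
    fetch_versions: Callable[[str], Iterable[str]] = fetch_current_uniprot,
) -> List[str]:
    """Warn if model_seq matches none of the fetched FASTA records."""
    known = {parse_fasta(text) for text in fetch_versions(accession)}
    if model_seq.upper() in known:
        return []
    return [f"struct_ref sequence for {accession} matches none of "
            f"{len(known)} UniProt version(s) checked"]


def doi_url(doi: str) -> str:
    """Build a resolver URL for a DOI (assumes the doi.org resolver)."""
    return "https://doi.org/" + doi.strip()


def url_exists(url: str, timeout: float = 10.0) -> bool:
    """Return True if a HEAD request succeeds (needs network).

    Some servers reject HEAD requests, so False is a hint, not proof.
    """
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        return False


def check_dois(
    dois: Iterable[str],
    exists: Callable[[str], bool] = url_exists,
) -> List[str]:
    """Return one warning per DOI whose resolver URL did not respond."""
    return [f"DOI {doi} did not resolve" for doi in dois
            if not exists(doi_url(doi))]
```

A deposition script could collect the warnings from each check and print them for a curator to review, rather than failing the mmCIF generation outright. The remaining checks (archive contents, SASBDB/EMDB accessions, PMIDs) could follow the same pattern of a pure comparison function plus an injectable fetcher.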