[Biojava-dev] fetching obsolete/superseding files

Amr AL-Hossary amr_alhossary at hotmail.com
Tue Apr 26 09:55:05 UTC 2011


The bug is fixed for "replaces", but "replacedBy" is not fixed yet.
Here is the current result:

<idStatus>
<record structureId="1HHB" status="OBSOLETE" replacedBy="4HHB"/>
<record structureId="2HHB" status="CURRENT" replaces="1HHB"/>
<record structureId="3HHB" status="CURRENT" replaces="1HHB"/>
<record structureId="4HHB" status="CURRENT" replaces="1HHB"/>
<record structureId="1CAT" status="OBSOLETE" replacedBy="8CAT"/>
<record structureId="3CAT" status="OBSOLETE" replaces="1CAT" replacedBy="8CAT"/>
<record structureId="7CAT" status="CURRENT" replaces="3CAT"/>
<record structureId="8CAT" status="CURRENT" replaces="3CAT"/>
<record structureId="1KSA" status="OBSOLETE" replacedBy="3ENI"/>
<record structureId="3ENI" status="CURRENT" replaces="1M50 1KSA"/>
<record structureId="1M50" status="OBSOLETE" replacedBy="3ENI"/>
</idStatus>
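
For comparison, here is a minimal sketch of what the "replacedBy" lists would look like if they were derived by inverting the "replaces" attributes in the response above. The class and method names are hypothetical and not part of BioJava, and the data is copied by hand from the listing.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not part of BioJava.
final class ReplacedByCheck {

    /** Inverts "replaces" lists: maps each old ID to the sorted set of IDs that replace it. */
    static Map<String, TreeSet<String>> expectedReplacedBy(Map<String, List<String>> replaces) {
        Map<String, TreeSet<String>> result = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : replaces.entrySet()) {
            for (String oldId : entry.getValue()) {
                result.computeIfAbsent(oldId, k -> new TreeSet<>()).add(entry.getKey());
            }
        }
        return result;
    }

    /** The "replaces" attributes from the response above, hard-coded. */
    static Map<String, List<String>> sampleReplaces() {
        Map<String, List<String>> replaces = new LinkedHashMap<>();
        replaces.put("2HHB", Arrays.asList("1HHB"));
        replaces.put("3HHB", Arrays.asList("1HHB"));
        replaces.put("4HHB", Arrays.asList("1HHB"));
        replaces.put("3CAT", Arrays.asList("1CAT"));
        replaces.put("7CAT", Arrays.asList("3CAT"));
        replaces.put("8CAT", Arrays.asList("3CAT"));
        replaces.put("3ENI", Arrays.asList("1M50", "1KSA"));
        return replaces;
    }

    public static void main(String[] args) {
        expectedReplacedBy(sampleReplaces())
                .forEach((id, successors) -> System.out.println(id + " replacedBy=" + successors));
    }
}
```

With this data the inversion gives 1HHB three successors (2HHB, 3HHB, 4HHB) and 1CAT the single successor 3CAT, which differs from the "replacedBy" values the service currently returns.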

Did you receive my previous mail, Dr. Andreas?

Amr

--------------------------------------------------
From: "Amr AL-Hossary" <amr_alhossary at hotmail.com>
Sent: Tuesday, April 26, 2011 5:03 AM
To: "Spencer Bliven" <sbliven at ucsd.edu>; "Andreas Prlic" <andreas at sdsc.edu>
Cc: <biojava-dev at lists.open-bio.org>
Subject: Re: [Biojava-dev] fetching obsolete/superseding files

> Thanks Spencer,
> This explains a lot.
> This way, the current implementation you provided is right and the 
> recursion flag is totally right.
>
> No, I don't have write access yet, but Dr. Andreas promised to grant 
> me write access after my 2nd participation.
>
>>the list of status messages comes from looking at the internals of the PDB 
>>website
> Do you have access to the Webservice implementation?
>
> Amr
>
>
>  From: Spencer Bliven
>  Sent: Tuesday, April 26, 2011 1:53 AM
>  To: Andreas Prlic
>  Cc: Amr AL-Hossary ; biojava-dev at lists.open-bio.org
>  Subject: Re: [Biojava-dev] fetching obsolete/superseding files
>
>
>  Hey all,
>
>  I think we are converging on a consistent model of PDB precedence. This 
> was obscured previously by the bug in how the idStatus page listed only a 
> single 'replacedBy' entry. Andreas has fixed this and it should go live 
> tomorrow. I'll write some unit tests and update biojava at the same 
> time. Here is how things will work:
>
>  PDB supersessions form a directed acyclic graph, where edges point from 
> an obsolete ID to the entry that directly superseded it. Each record 
> contained by idStatus contains a "replaces" attribute, which consists of a 
> space-delimited list of incoming edges, and a "replacedBy" attribute, 
> which consists of a space-delimited list of outgoing edges. Two examples:
>
>  <idStatus>
>  <record structureId="1CAT" status="OBSOLETE" replacedBy="3CAT"/>
>  <record structureId="3CAT" status="OBSOLETE" replaces="1CAT" replacedBy="8CAT 7CAT"/>
>  <record structureId="7CAT" status="CURRENT" replaces="3CAT"/>
>  <record structureId="8CAT" status="CURRENT" replaces="3CAT"/>
>
>  <record structureId="1KSA" status="OBSOLETE" replacedBy="3ENI"/>
>  <record structureId="3ENI" status="CURRENT" replaces="1M50 1KSA"/>
>  <record structureId="1M50" status="OBSOLETE" replacedBy="3ENI"/>
>  </idStatus>
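
As a rough illustration of reading such a response, the sketch below parses an idStatus document with the JDK's DOM parser and splits the space-delimited "replaces" or "replacedBy" attribute of each record into a list. The class and method names are hypothetical, not BioJava API, and the XML is assumed to have exactly the shape shown in the examples above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; not part of BioJava.
final class IdStatusParser {

    /** Splits a space-delimited attribute value; an absent or empty attribute yields an empty list. */
    static List<String> splitIds(String value) {
        String trimmed = value.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    /** Maps each structureId to the IDs listed in the given attribute ("replaces" or "replacedBy"). */
    static Map<String, List<String>> edges(String xml, String attribute) {
        try {
            NodeList records = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getElementsByTagName("record");
            Map<String, List<String>> result = new LinkedHashMap<>();
            for (int i = 0; i < records.getLength(); i++) {
                Element record = (Element) records.item(i);
                // getAttribute returns "" when the attribute is missing.
                result.put(record.getAttribute("structureId"),
                        splitIds(record.getAttribute(attribute)));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse idStatus XML", e);
        }
    }
}
```

Calling edges(xml, "replacedBy") gives the outgoing edges of each node and edges(xml, "replaces") the incoming ones, in the order the records appear in the document.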
>
>  The non-recursive versions of getReplaces/getReplacement just get the 
> incoming/outgoing edges for a single node and require only a single REST 
> query. The recursive versions will do a depth-first search up/down the 
> graph and return a list of all nodes reached.
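
The difference between the two lookups can be sketched as follows, assuming the edges have already been loaded into a map from ID to neighbouring IDs. This is only an illustration with hypothetical names; the real methods would issue REST queries instead of reading a map.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not part of BioJava.
final class SupersessionSearch {

    /** Non-recursive lookup: only the direct neighbours of one node. */
    static List<String> direct(Map<String, List<String>> edges, String id) {
        return edges.getOrDefault(id, Collections.emptyList());
    }

    /** Recursive lookup: every node reachable from id, in depth-first order, each listed once. */
    static List<String> reachable(Map<String, List<String>> edges, String id) {
        Set<String> seen = new LinkedHashSet<>();
        visit(edges, id, seen);
        return new ArrayList<>(seen);
    }

    private static void visit(Map<String, List<String>> edges, String id, Set<String> seen) {
        for (String next : direct(edges, id)) {
            // The seen set avoids revisiting a node that is reachable along two paths.
            if (seen.add(next)) {
                visit(edges, next, seen);
            }
        }
    }
}
```

Passing a map built from "replacedBy" walks towards newer entries, while a map built from "replaces" walks towards older ones. With the CAT example above, the direct lookup for 1CAT yields only 3CAT, and the recursive one yields 3CAT, 8CAT and 7CAT.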
>
>  Finally, the getCurrent() method should consistently return a single PDB 
> ID from among the results of recursive-getReplacement. To be consistent 
> with the old REST implementation, this will be the PDB ID that occurs last 
> alphabetically. Thus getCurrent(1HHB) will give 4HHB rather than 2HHB or 
> 3HHB, getCurrent(1CAT) will give 8CAT, and getCurrent(7CAT) will give 
> 7CAT.
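
A possible shape for this rule, again over an in-memory "replacedBy" map and with hypothetical names: follow the outgoing edges, collect the nodes that have no further replacement, and return the one that sorts last. Restricting the choice to such end nodes is an assumption of this sketch; the description above only says the result comes from the recursive replacement list. IDs are assumed to be upper case so that plain string ordering applies.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not part of BioJava.
final class CurrentIdSketch {

    /** Returns the alphabetically last ID with no replacement that is reachable from id. */
    static String getCurrent(Map<String, List<String>> replacedBy, String id) {
        TreeSet<String> current = new TreeSet<>();
        collect(replacedBy, id, new HashSet<>(), current);
        // For an acyclic graph the set is never empty: an ID without replacements yields itself.
        return current.last();
    }

    private static void collect(Map<String, List<String>> replacedBy, String id,
                                Set<String> seen, TreeSet<String> current) {
        if (!seen.add(id)) {
            return; // already visited along another path
        }
        List<String> next = replacedBy.getOrDefault(id, Collections.emptyList());
        if (next.isEmpty()) {
            current.add(id); // nothing replaces this entry
            return;
        }
        for (String successor : next) {
            collect(replacedBy, successor, seen, current);
        }
    }
}
```

With the edges from the examples in this thread, this returns 4HHB for 1HHB, 8CAT for 1CAT, and 7CAT for 7CAT.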
>
>  Amr, I understand what you were thinking with the getNewestCurrent 
> method. It is appealing to think of 4HHB as the representative for all 
> four structures. However, there is a good reason that 2HHB and 3HHB are 
> still marked as current, and I think it is misleading to include a method 
> that favors 4HHB over other current IDs because it is alphabetically 
> higher. We should probably leave this method out of biojava.
>
>
>  Does anything seem wrong about this model of supersession? In 
> particular, does this address your question about the need for the 
> recursion flag, Amr? My plan is to commit the biojava changes shortly. 
> Amr, do you mind if I merge in your patch with the caching and 
> PDBFileReader updates (Do you have write access to SVN?)? Great code 
> there!
>
>  Finally, the list of status messages comes from looking at the internals 
> of the PDB website. I haven't come across any examples of them myself to 
> test with. Many seem to be temporary statuses, for publication holds and 
> the like. I'm content to ignore them until someone requests something 
> specific.
>
>  -Spencer
>
>
>
>  On Mon, Apr 25, 2011 at 2:22 PM, Andreas Prlic <andreas at sdsc.edu> wrote:
>
>    Hi Amr,
>
>
>    > And anyway, the webservice returns only ONE PDB ID max per record (please
>    > inspect the result returned by this query
>    > http://www.rcsb.org/pdb/rest/idStatus?structureId=1HHB,2HHB,3HHB,4HHB ).
>
>
>    I believe that is a bug. I just fixed it, and the fix should become
>    available with tomorrow's website update (around 00:00 UTC).
>
>
>    > This way, I believe the best way to get the most recent ID is to read the
>    > isReplacedBy attribute of the superseded record (e.g. from 3HHB to
>    > 1HHB and then from 1HHB to 4HHB).
>
>
>    I hope this will be simpler with the updated URL response ...
>
>
>    Andreas
>
>
>


