[Bioperl-l] Apparent disagreement between EUtilities and web page

Michael Muratet mmuratet at hudsonalpha.org
Tue Feb 2 18:22:30 UTC 2010


On Feb 2, 2010, at 11:54 AM, Chris Fields wrote:

> Just to follow up, it appears to be a disparity between the efetch  
> gene data and the summary information (via esummary).  The folks at  
> NCBI are checking up on it.
>
Chris

Thanks for the help! I put a test for defined into the script and  
managed to pull down the other 9K genes. It was just the one that had  
issues.
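
A minimal sketch of what such a defined test could look like, not necessarily the exact change made here. The helper name location_fields is hypothetical (it is not part of BioPerl), and the sketch assumes get_contents_by_name returns an empty list when a field is absent; the method names are the ones used in the script quoted below.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): return the four location
# fields for a GenomicInfoType item, or an empty list if the item or
# any of the fields is missing.
sub location_fields {
    my ($item) = @_;
    return unless defined $item;    # record had no GenomicInfoType item

    my @values;
    for my $name (qw(ChrLoc ChrAccVer ChrStart ChrStop)) {
        # Take the first value only; $value stays undef if nothing comes back.
        my ($value) = $item->get_contents_by_name($name);
        return unless defined $value;
        push @values, $value;
    }
    return @values;
}

# Intended use inside the DocSum loop (needs a configured $factory):
#
#   while ( my $docsum = $factory->next_DocSum ) {
#       my ($item) = $docsum->get_Items_by_name('GenomicInfoType');
#       my ( $chrloc, $acc, $start, $end ) = location_fields($item)
#           or next;    # skip records with incomplete location data
#       ...
#   }
```

Fetching each field into its own scalar also keeps a missing field from shifting the later values into the wrong variables, which can happen when several list-returning calls are combined in one list assignment.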

Mike
> chris
>
> On Feb 2, 2010, at 8:34 AM, Chris Fields wrote:
>
>> Michael,
>>
>> This is an unusual one and appears to be server-side.  I've seen  
>> this before with older obsoleted gene IDs (see: 435023, which is  
>> the older version of 100039753), so I'm wondering if there is an update  
>> going on that hasn't made its way to the web interface yet.
>>
>> I'm attempting to contact NCBI about the issue, so hopefully we'll  
>> hear back.
>>
>> chris
>>
>> On Feb 2, 2010, at 6:14 AM, Michael Muratet wrote:
>>
>>> Greetings
>>>
>>> I am using EUtilities for the first time to retrieve data from  
>>> NCBI and I could use some help trying to isolate a problem I have.  
>>> I was using a script I put together from examples in the How-To  
>>> and email archives:
>>>
>>>         $factory->set_parameters(-id => \@sub );
>>>
>>>         while ( my $docsum = $factory->next_DocSum ) {
>>>            my ($item) = $docsum->get_Items_by_name('GenomicInfoType');
>>>
>>>            my ( $chrloc, $acc, $start, $end ) = (
>>>                $item->get_contents_by_name('ChrLoc'),
>>>                $item->get_contents_by_name('ChrAccVer'),
>>>                $item->get_contents_by_name('ChrStart'),
>>>                $item->get_contents_by_name('ChrStop'));
>>>
>>> and I found a record that has a suspect ChrStart field and no  
>>> ChrStop field (using another example from the HowTo)
>>>
>>> ID: 100039753
>>> Name-Description-Orgname-Status-CurrentID-Chromosome-GeneticSource- 
>>> MapLocation-OtherAliases-OtherDesignations-NomenclatureSymbol- 
>>> NomenclatureName-NomenclatureStatus-TaxID-Mim-GenomicInfo- 
>>> GeneWeight-Summary-ChrSort-ChrStart
>>> Name                :LOC100039753
>>> Description         :similar to putative
>>> Orgname             :Mus musculus
>>> Chromosome          :Y
>>> GeneticSource       :genomic
>>> MapLocation         :Y B
>>> OtherDesignations   :hypothetical protein LOC100039753
>>> TaxID               :10090
>>> GeneWeight          :401
>>> ChrSort             :~~last
>>> ChrStart            :999999999
>>>
>>> yet on the NCBI web page it shows a defined start and a stop for  
>>> the locus.
>>>
>>> I can fix the Perl code so that it won't barf when the field isn't  
>>> defined, but I'd like to get all the data and I can't figure out  
>>> if the problem is on the client side or the server side. All the  
>>> data comes from the same source, doesn't it?
>>>
>>> Thanks
>>>
>>> Mike
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>

Michael Muratet, Ph.D.
Senior Scientist
HudsonAlpha Institute for Biotechnology
mmuratet at hudsonalpha.org
(256) 327-0473 (p)
(256) 327-0966 (f)

Room 4005
601 Genome Way
Huntsville, Alabama 35806







