[Bioperl-l] Apparent disagreement between EUtilities and web page
Michael Muratet
mmuratet at hudsonalpha.org
Tue Feb 2 18:22:30 UTC 2010
On Feb 2, 2010, at 11:54 AM, Chris Fields wrote:
> Just to follow up, it appears to be a disparity between the efetch
> gene data and the summary information (via esummary). The folks at
> NCBI are checking up on it.
>
Chris
Thanks for the help! I put a test for defined into the script and
managed to pull down the other 9K genes. It was just the one that had
issues.
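
For the archives, the defined test can be sketched roughly as below. This is only an illustration, not the exact change made to the script: location_fields and MockItem are hypothetical names, and MockItem merely stands in for the item returned by $docsum->get_Items_by_name('GenomicInfoType') so the snippet runs without contacting NCBI. It assumes get_contents_by_name returns an empty list when a field is absent.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a BioPerl summary item (assumption: the real
# get_contents_by_name returns an empty list when the field is absent).
package MockItem;
sub new { my ($class, %fields) = @_; return bless {%fields}, $class }
sub get_contents_by_name {
    my ($self, $name) = @_;
    return exists $self->{$name} ? ($self->{$name}) : ();
}

package main;

# Return a hash ref with the four location fields, or undef when the item
# itself is missing or any field is absent. Each field is fetched on its
# own so that an empty return list cannot shift later values into the
# wrong variable, which the single list assignment would allow.
sub location_fields {
    my ($item) = @_;
    return unless defined $item;
    my %loc;
    for my $name (qw(ChrLoc ChrAccVer ChrStart ChrStop)) {
        my ($value) = $item->get_contents_by_name($name);
        return unless defined $value;
        $loc{$name} = $value;
    }
    return \%loc;
}

# Placeholder values only; the accession string is made up.
my $complete = MockItem->new(
    ChrLoc => 'Y', ChrAccVer => 'ACC_PLACEHOLDER.1',
    ChrStart => 100, ChrStop => 200,
);
# Mimics the odd record: a ChrStart but no ChrStop.
my $partial = MockItem->new( ChrStart => 999999999 );

for my $item ( $complete, $partial, undef ) {
    my $loc = location_fields($item);
    my $msg = defined $loc
        ? "location: $loc->{ChrAccVer} $loc->{ChrStart}-$loc->{ChrStop}"
        : "skipped: incomplete location";
    print "$msg\n";
}
```

In the real loop the same check would simply be followed by "next" for records that come back undefined, so one odd record does not stop the remaining IDs from being processed.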
Mike
> chris
>
> On Feb 2, 2010, at 8:34 AM, Chris Fields wrote:
>
>> Michael,
>>
>> This is an unusual one and appears to be server-side. I've seen
>> this before with older obsoleted gene IDs (see: 435023, which is
>> the older version of 100039753), so I'm wondering if an update is
>> going on that hasn't made its way to the web interface yet.
>>
>> I'm attempting to contact NCBI about the issue, so hopefully we'll
>> hear back.
>>
>> chris
>>
>> On Feb 2, 2010, at 6:14 AM, Michael Muratet wrote:
>>
>>> Greetings
>>>
>>> I am using EUtilities for the first time to retrieve data from
>>> NCBI and I could use some help trying to isolate a problem I have.
>>> I was using a script I put together from examples in the How-To
>>> and email archives:
>>>
>>> $factory->set_parameters(-id => \@sub );
>>>
>>> while ( my $docsum = $factory->next_DocSum ) {
>>>     my ($item) = $docsum->get_Items_by_name('GenomicInfoType');
>>>
>>>     my ( $chrloc, $acc, $start, $end ) = (
>>>         $item->get_contents_by_name('ChrLoc'),
>>>         $item->get_contents_by_name('ChrAccVer'),
>>>         $item->get_contents_by_name('ChrStart'),
>>>         $item->get_contents_by_name('ChrStop'));
>>>
>>> and I found a record that has a suspect ChrStart field and no
>>> ChrStop field (using another example from the How-To):
>>>
>>> ID: 100039753
>>> Name-Description-Orgname-Status-CurrentID-Chromosome-GeneticSource-
>>> MapLocation-OtherAliases-OtherDesignations-NomenclatureSymbol-
>>> NomenclatureName-NomenclatureStatus-TaxID-Mim-GenomicInfo-
>>> GeneWeight-Summary-ChrSort-ChrStart
>>> Name :LOC100039753
>>> Description :similar to putative
>>> Orgname :Mus musculus
>>> Chromosome :Y
>>> GeneticSource :genomic
>>> MapLocation :Y B
>>> OtherDesignations :hypothetical protein LOC100039753
>>> TaxID :10090
>>> GeneWeight :401
>>> ChrSort :~~last
>>> ChrStart :999999999
>>>
>>> yet on the NCBI web page it shows a defined start and a stop for
>>> the locus.
>>>
>>> I can fix the Perl code so that it won't barf when the field isn't
>>> defined, but I'd like to get all the data and I can't figure out
>>> if the problem is on the client side or the server side. All the
>>> data comes from the same source, doesn't it?
>>>
>>> Thanks
>>>
>>> Mike
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>
Michael Muratet, Ph.D.
Senior Scientist
HudsonAlpha Institute for Biotechnology
mmuratet at hudsonalpha.org
(256) 327-0473 (p)
(256) 327-0966 (f)
Room 4005
601 Genome Way
Huntsville, Alabama 35806