[Bioperl-l] how to parse the GenPept sequence object to get the 'DBSOURCE' field

Hilmar Lapp hlapp at gmx.net
Fri Mar 18 23:14:24 EST 2005


On Friday, March 18, 2005, at 04:33  AM, Brian Osborne wrote:

> Hilmar,
>
> Excellent. OK, I need some suggestions as to values; this is an annotation
> that I've never constructed. Here's an example:
>
> DATABASE GenBank
>
> PRIMARY_ID AAC12345
>
> OPTIONAL_ID AAC12345.2

No, leave it blank. optional_id is meant for cases where it really
differs from the primary_id.

>
> COMMENT: ?

right, undef

>
> TAGNAME: dblink

Correct.

>
> NAMESPACE: ?

Ignore it. I believe namespace defaults to the database value automatically.

>
> AUTHORITY: ?

right, undef

>
> VERSION: 2

right.
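
For illustration only, here is a minimal sketch of how the DBSOURCE line
from the original question could be split into the fields agreed on above
(database, primary_id, version, tagname). parse_dbsource is a hypothetical
helper, not part of BioPerl and not the code in genbank.pm, and the regex
assumes only the 'REFSEQ: accession NM_000367.1' layout; other DBSOURCE
layouts would need more cases. The hash keys follow BioPerl's
named-argument style so the result could plausibly be handed to
Bio::Annotation::DBLink->new, but that step is left out to keep the sketch
free of dependencies.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): split a GenPept DBSOURCE line
# into the DBLink-style fields discussed in this thread.
# Returns a hash reference, or undef if the line does not match.
sub parse_dbsource {
    my ($line) = @_;

    # Assumed layout: "DBSOURCE    REFSEQ: accession NM_000367.1"
    # The ".1" version suffix is treated as optional.
    my ($db, $acc, $version) =
        $line =~ /^DBSOURCE\s+(\w+):\s+accession\s+([A-Za-z0-9_]+)(?:\.(\d+))?\s*$/
        or return;

    my %fields = (
        '-database'   => $db,       # e.g. REFSEQ
        '-primary_id' => $acc,      # accession without the version suffix
        '-tagname'    => 'dblink',  # tag name confirmed above
    );
    # optional_id, comment and authority are deliberately left unset (undef).
    $fields{'-version'} = $version if defined $version;

    return \%fields;
}

# Usage with the line from the original question.
my $f = parse_dbsource('DBSOURCE    REFSEQ: accession NM_000367.1');
printf "%s %s version %s\n",
    $f->{'-database'}, $f->{'-primary_id'}, $f->{'-version'};
```

Keeping the version out of primary_id mirrors the AAC12345 / version 2
split in Brian's example.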


Cheers,

	-hilmar

>
>
> Brian O.
>
>
> -----Original Message-----
> From: Hilmar Lapp [mailto:hlapp at gmx.net]
> Sent: Thursday, March 17, 2005 7:18 PM
> To: Brian Osborne
> Cc: Leonardo Kenji Shikida; bioperl-l at portal.open-bio.org
> Subject: Re: [Bioperl-l] how to parse the GenPept sequence object to get
> the 'DBSOURCE' field
>
>
> Isn't this a dbxref? So, yes the work should be in genbank.pm but it
> should create a Bio::Annotation::DBLink object instead of a
> SimpleValue. DBLink will also properly represent version, accession,
> and database, instead of just a flat string.
>
> 	-hilmar
>
> On Thursday, March 17, 2005, at 06:06  AM, Brian Osborne wrote:
>
>> K,
>>
>> I've added some code to SeqIO/genbank.pm that appears to work but I can't
>> commit it until I ask the Bioperl designers a question. Namely, it appears
>> that this DBSOURCE field is specific to Genbank Protein, so the work of
>> creating the Annotation::SimpleValue should be in genbank.pm, not
>> RichSeq.pm, right?
>>
>> Brian O.
>>
>> -----Original Message-----
>> From: bioperl-l-bounces at portal.open-bio.org
>> [mailto:bioperl-l-bounces at portal.open-bio.org]On Behalf Of Leonardo
>> Kenji Shikida
>> Sent: Wednesday, March 16, 2005 2:16 PM
>> To: bioperl-l at portal.open-bio.org
>> Subject: [Bioperl-l] how to parse the GenPept sequence object to get
>> the 'DBSOURCE' field
>>
>>
>> Does anyone know how to parse the GenPept sequence object to get the
>> 'DBSOURCE' field?
>>
>> e.g. human.protein.gpff
>>
>> LOCUS       NP_000358                245 aa            linear   PRI 31-OCT-2000
>> DEFINITION  thiopurine S-methyltransferase [Homo sapiens].
>> ACCESSION   NP_000358
>> VERSION     NP_000358.1  GI:4507653
>> DBSOURCE    REFSEQ: accession NM_000367.1  <<==
>> KEYWORDS    .
>> SOURCE      Homo sapiens (human)
>>
>> I found no answer reading the docs, and the same question sits
>> unanswered in this list's archives at
>>
>> http://bioperl.org/pipermail/bioperl-l/2003-June/012438.html
>>
>> thanks in advance
>>
>> K.
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at portal.open-bio.org
>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>
> --
> -------------------------------------------------------------
> Hilmar Lapp                            email: lapp at gnf.org
> GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
> -------------------------------------------------------------
-- 
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------



