[Bioperl-l] Bio::SeqIO::genbank
Mark A. Jensen
maj at fortinbras.us
Thu Apr 8 16:15:55 EDT 2010
FWIW - In my SoapE investigations I found an NCBI-hosted XML schema for INSDC, but never ran across a format descriptor that returns data in that format. -- MAJ
>-----Original Message-----
>From: Chris Fields [mailto:cjfields at illinois.edu]
>Sent: Thursday, April 8, 2010 04:09 PM
>To: 'Dave Messina'
>Cc: 'bioperl-l', 'Wayne Davis'
>Subject: Re: [Bioperl-l] Bio::SeqIO::genbank
>
>On Thu, 2010-04-08 at 21:39 +0200, Dave Messina wrote:
>> Hi Wayne,
>>
>> > if $mol is not in the fixed list of genbank molecule types it should
>> > be set to the default value of 'DNA', or some other smarter way of
>> > forcing the molecule type into the fixed vocabulary would be a help.
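A minimal sketch of the fallback proposed above, assuming a plain hash lookup is enough. The helper name normalize_mol and the vocabulary list are hypothetical: the list is an illustrative subset, not the full set of GenBank molecule types, and none of this is taken from the actual Bio::SeqIO::genbank source.

```perl
use strict;
use warnings;

# Illustrative subset only (assumption); the real GenBank vocabulary is longer.
my %KNOWN_MOL = map { $_ => 1 } qw(DNA RNA mRNA tRNA rRNA);

# Hypothetical helper, not part of Bio::SeqIO::genbank.
# Returns $mol unchanged if it is in the known list, otherwise 'DNA'.
sub normalize_mol {
    my ($mol) = @_;
    return 'DNA' unless defined $mol;
    return $KNOWN_MOL{$mol} ? $mol : 'DNA';
}

print normalize_mol('mRNA'),    "\n";   # known type is kept
print normalize_mol('genomic'), "\n";   # unknown type falls back to DNA
```

A "smarter" mapping (for example, case-insensitive matching) could replace the final lookup without changing the callers.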
>>
>> Sounds good to me. Did you modify your local copy of Bio::SeqIO::genbank and try it out?
>>
>> I will say, though, that GenBank is a tricky format, both to read and to write. Even if BioPerl wrote GenBank records that were fully compliant with the spec, I'm pretty sure they would not be round-trippable*. That is, if you read a GenBank record into BioPerl and then wrote it back out, the output wouldn't exactly match the input.
>
>This is true. Jason and I talked about this recently and arrived pretty
>much at the same conclusion. We're mainly interested in parsing data
>into a usable framework for manipulation. Recreating data isn't our top
>priority.
>
>> I think that NCBI is trying to nudge people toward their XML format. I know it won't help this particular situation, but it might be an option to consider for the future.
>
>The only problem I had with the XML spit out from eutils has been that it
>was an on-the-fly conversion of the ASN.1. Not sure what the status of it
>is now.
>
>What's going on with the INSDC XML format? That was supposed to be an
>international standard and appeared more lightweight (if such a thing
>can be said about XML).
>
>> Speaking of which, what is the current status of the BioPerl GenBank XML parser? Jay, did you ever release that?
>>
>>
>> Dave
>>
>>
>>
>> * not that they were designed to be: http://www.bioperl.org/wiki/HOWTO:SeqIO#Caveats
>
>I think it was in a branch, can't recall.
>
>chris
>
>
>
>_______________________________________________
>Bioperl-l mailing list
>Bioperl-l at lists.open-bio.org
>http://lists.open-bio.org/mailman/listinfo/bioperl-l
>