[Bioperl-l] SeqIO-based parser for Vector NTI sequence files
Chris Fields
cjfields at illinois.edu
Mon Feb 9 12:49:03 EST 2009
I think the best short-term thing may be to wrap the genbank.pm parser
and simply reparse/rework the relevant Bio::Annotation::Comment
instance containing the COMMENT data.
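As a rough illustration of the reparsing step, here is a minimal sketch, written in Python rather than Perl so that it stays self-contained. It assumes the comment text has already been extracted as a single string, and that the extra Vector NTI information appears as `KEY|value|` tokens, one per line. That layout is an assumption made for the example and is not confirmed anywhere in this thread. `parse_vnti_comment` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Assumed token shape: an upper-case key, a pipe, a value, a closing pipe.
# This layout is an assumption for illustration, not a documented format.
TOKEN = re.compile(r"^([A-Z][A-Z0-9_]*)\|(.*)\|$")


def parse_vnti_comment(comment_text):
    """Split a COMMENT string into free text and key/value fields.

    Returns (free_text_lines, fields), where fields maps each key to the
    list of values seen for it, in order of appearance.
    """
    free_text = []
    fields = defaultdict(list)
    for raw in comment_text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        match = TOKEN.match(line)
        if match:
            key, value = match.groups()
            fields[key].append(value)
        else:
            # Anything that does not look like a token is kept as plain text.
            free_text.append(line)
    return free_text, dict(fields)


if __name__ == "__main__":
    # Made-up input; the key name is purely illustrative.
    sample = "plain note\nEXAMPLEKEY|some value|\nEXAMPLEKEY|another|"
    text, fields = parse_vnti_comment(sample)
    print(text)
    print(fields)
```

A wrapper around the existing parser would only need to call something like this on the comment text and attach the resulting fields to the sequence object in whatever form is convenient.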
Long-term, I would like to have an XML-like parser that just takes the
data and passes it in to a handler (so you could customize what
happens to data, create objects, load databases, etc). Along these
lines I've been (very slowly) reworking GenBank/EMBL/UniProt parsing
so it generically parses data and passes it on to a relevant handler
instance (in this case it just generates a Bio::Seq::RichSeq as the
regular parser does).
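The parser/handler split described above can be sketched roughly as follows. This is a simplified Python illustration of the general idea, not the code in the driver modules: the parser only groups lines into (tag, text) chunks and never interprets them, while the handler decides what to build. The class and function names (`DictHandler`, `parse_flat_records`) are hypothetical. The line layout is assumed to be a simplified GenBank-like flat file, where a tag starts in the first column, indented lines continue the current chunk, and `//` ends a record.

```python
class DictHandler:
    """Example handler: collects each record into a plain dict."""

    def __init__(self):
        self.records = []
        self._current = None

    def start_record(self):
        self._current = {}

    def data(self, tag, text):
        # A tag may occur more than once per record, so keep a list.
        self._current.setdefault(tag, []).append(text)

    def end_record(self):
        self.records.append(self._current)
        self._current = None


def parse_flat_records(lines, handler):
    """Group lines into (tag, text) chunks and pass them to the handler."""
    tag, parts, in_record = None, [], False
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue
        if line.startswith("//"):
            # End of record: emit the pending chunk, then close the record.
            if in_record:
                if tag is not None:
                    handler.data(tag, " ".join(parts))
                handler.end_record()
            tag, parts, in_record = None, [], False
            continue
        if not in_record:
            handler.start_record()
            in_record = True
        if line[0].isspace():
            # Indented line: continuation of the current chunk.
            if tag is not None:
                parts.append(line.strip())
            continue
        # A new tag begins: emit the previous chunk first.
        if tag is not None:
            handler.data(tag, " ".join(parts))
        pieces = line.split(None, 1)
        tag = pieces[0]
        parts = [pieces[1].strip()] if len(pieces) > 1 else []
    if in_record:
        # Input ended without a closing '//': emit what we have.
        if tag is not None:
            handler.data(tag, " ".join(parts))
        handler.end_record()


if __name__ == "__main__":
    demo = [
        "LOCUS       X1 10 bp",
        "COMMENT     first line",
        "            second line",
        "//",
    ]
    handler = DictHandler()
    parse_flat_records(demo, handler)
    print(handler.records)
```

Swapping in a different handler (one that builds sequence objects, loads a database, or picks apart COMMENT data) would leave the parsing loop untouched, which is the point of the design.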
It still needs a bit more work, though, particularly the internals.
If you want to test them out, the modules are in the last 1.6.0 release
as Bio::SeqIO::gbdriver/embldriver/swissdriver.
chris
On Feb 9, 2009, at 8:50 AM, Cook, Malcolm wrote:
> Scott,
>
> What do you expect to extract from the COMMENT lines?
>
>
> Malcolm Cook
> Database Applications Manager - Bioinformatics
> Stowers Institute for Medical Research - Kansas City, Missouri
>
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Scott Markel
> Sent: Tuesday, October 21, 2008 3:49 PM
> To: bioperl-ml
> Cc: smarkel at accelrys.com
> Subject: [Bioperl-l] SeqIO-based parser for Vector NTI sequence files
>
> I'm looking for a BioPerl-related solution to parsing Vector NTI
> sequence files. The genbank.pm parser will work, but it doesn't
> parse the COMMENT lines beyond grabbing the simple string value, so
> it misses all of the added information in those lines.
>
> If you know of any existing code, I'd be interested in hearing
> about it. I checked BioPerl, BioJava, and EMBOSS documentation.
> I also checked the Invitrogen web site.
>
> Scott
>
> --
> Scott Markel, Ph.D.
> Principal Bioinformatics Architect email: smarkel at accelrys.com
> Accelrys (SciTegic R&D) mobile: +1 858 205 3653
> 10188 Telesis Court, Suite 100 voice: +1 858 799 5603
> San Diego, CA 92121 fax: +1 858 799 5222
> USA web: http://www.accelrys.com
>
> http://www.linkedin.com/in/smarkel
> Board of Directors: International Society for Computational Biology
> Co-chair: ISCB Publications Committee
> Associate Editor: PLoS Computational Biology
> Editorial Board: Briefings in Bioinformatics
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l