[Bioperl-l] Error parsing TIGR xml

Simon K. Chan skchan at cs.usask.ca
Tue Jun 22 13:54:03 EDT 2004


Hi Fernan, 

Which DTD are you using?  It looks like you have an older version of TIGR XML.  

You can find the newer TIGR XML DTD here:
ftp://ftp.tigr.org/pub/data/a_thaliana/ath1/BACS

The tigr.pm module is written for the newer format (though I've run into a few
problems with it myself), which explains the error message you are getting: in
the newer DTD, ASMBL_ID is no longer specified as an attribute of the <ASSEMBLY>
tag.
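
If you want to check which layout a given file uses before handing it to
Bio::SeqIO, here is a minimal sketch in plain Perl. It only looks at the first
<ASSEMBLY ...> opening tag and reports whether ASMBL_ID appears there as an
attribute. The sub names are made up for illustration, and the script does not
validate anything against the DTD. How the newer DTD carries the ID is not
checked here; the error message suggests a child element, but treat that as an
assumption.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the first <ASSEMBLY ...> opening tag found in the text,
# or undef if there is none. (Hypothetical helper, not part of BioPerl.)
sub first_assembly_tag {
    my ($xml) = @_;
    return $xml =~ /(<ASSEMBLY\b[^>]*>)/s ? $1 : undef;
}

# Return 1 if the first <ASSEMBLY> tag carries ASMBL_ID as an attribute
# (the older layout described above), 0 otherwise.
sub has_asmbl_id_attribute {
    my ($xml) = @_;
    my $tag = first_assembly_tag($xml);
    return 0 unless defined $tag;
    return $tag =~ /\bASMBL_ID\s*=\s*"[^"]*"/ ? 1 : 0;
}

# Command-line use: only runs when a file name is given.
if (!caller && @ARGV) {
    my $file = $ARGV[0];
    open my $fh, '<', $file or die "Cannot open $file: $!";
    local $/;            # read the whole file at once
    my $xml = <$fh>;
    close $fh;
    print has_asmbl_id_attribute($xml)
        ? "$file: ASMBL_ID is an attribute of <ASSEMBLY> (older layout)\n"
        : "$file: no ASMBL_ID attribute on <ASSEMBLY>\n";
}
```

Run it with the file name as the only argument. The opening tag quoted in your
message (ASMBL_ID = "56") would be reported as the older layout.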

TIGR also provides its own XML parser on the FTP site, but I believe it can
only handle certain features.  You can get it here:
ftp://ftp.tigr.org/pub/data/Eukaryotic_Projects/o_sativa/annotation_dbs/tools/TIGR_XML_parser.tar.gz

I'm working on something similar as a side project, so let me know if you have
other concerns.

Cheers,

-- 
Warmest Regards,
Simon K. Chan
Bioinformatics, Crosby Lab
skchan at cs.usask.ca


Quoting Fernan Aguero <fernan at iib.unsam.edu.ar>:

> Hi!
> 
> I'm seeing an error while trying to parse a .coordset file
> from TIGR. This is my first attempt at using this kind of
> file, so perhaps I'm doing something wrong.
> 
> Here's my brief script:
> 
> #!/usr/bin/perl -w
> 
> use strict;
> use Bio::SeqIO;
> 
> my $seqio = Bio::SeqIO->new( -file => $ARGV[0], -format => 'tigr');
> 
> Just trying to create a SeqIO object produces the following error:
> 
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: [2]Required <ASMBL_ID> missing
> STACK: Error::throw
> STACK: Bio::Root::Root::throw /usr/local/lib/perl5/site_perl/5.6.1/Bio/Root/Root.pm:328
> STACK: Bio::SeqIO::tigr::throw /usr/local/lib/perl5/site_perl/5.6.1/Bio/SeqIO/tigr.pm:1338
> STACK: Bio::SeqIO::tigr::_process_assembly /usr/local/lib/perl5/site_perl/5.6.1/Bio/SeqIO/tigr.pm:522
> STACK: Bio::SeqIO::tigr::_process /usr/local/lib/perl5/site_perl/5.6.1/Bio/SeqIO/tigr.pm:423
> STACK: Bio::SeqIO::tigr::_initialize /usr/local/lib/perl5/site_perl/5.6.1/Bio/SeqIO/tigr.pm:90
> STACK: Bio::SeqIO::new /usr/local/lib/perl5/site_perl/5.6.1/Bio/SeqIO.pm:358
> STACK: Bio::SeqIO::new /usr/local/lib/perl5/site_perl/5.6.1/Bio/SeqIO.pm:378
> STACK: ./tigrxml2features.pl:6
> -----------------------------------------------------------
> 
> 
> The file does contain ASMBL_IDs, or at least that is what I
> believe. These are the first lines of the file
> 
> <ASSEMBLY ASMBL_ID = "56" COORDS = "1-2149">
>         <HEADER>
>                 <CLONE_NAME>1047053397923</CLONE_NAME>
>                 <ORGANISM>Trypanosoma cruzi</ORGANISM>
>                 <AUTHOR_LIST CONTACT = "">
>                 </AUTHOR_LIST>
>         </HEADER>
>         <TU FEAT_NAME = "56.t00001" LOCUS = "Tc00.1047053397923.10"
>             PUB_LOCUS = "" ALT_LOCUS = "" COM_NAME = "hypothetical protein"
>             PUB_COMMENT = "" COORDS = "167-586">
>                 <MODEL FEAT_NAME = "56.m00001" COMMENT = "" COORDS = "167-586">
>                         <PROTEIN_SEQ>MKQSSTDGGGKQKGKDSVSSDSMKDAVTDNPGKPTATTIPTSRSGDAQEKEGKDDGTDERPTSKKHNSSPETGNTNDALTASENTPQTAETTATTVAKKNDTTIGDSDGSTAVSDTASPLLLLFLVVVACAAAAAVVAA*</PROTEIN_SEQ>
>                         <EXON FEAT_NAME = "56.e00001" COORDS = "167-586">
>                                 <CDS FEAT_NAME = "56.c00001" COORDS = "167-586"/>
>                         </EXON>
>                 </MODEL>
>         </TU>
> </ASSEMBLY>
> 
> I've found a mention of a tigrxml module by Jason Stajich that
> was supposed to be different from the SeqIO::tigr by Josh
> Lauricha, but I don't seem to have it on my system
> (bioperl-1.4):
> <http://bioperl.org/pipermail/bioperl-l/2004-January/014491.html>
> 
> Thanks in advance,
> 
> Fernan 
> 
> PS: I'm CCing the author of the tigr.pm module, just in
> case. 
> 
> -- 
> F e r n a n   A g u e r o
> http://genoma.unsam.edu.ar/~fernan
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
> 



