[Bioperl-l] SeqIO::refseq

ybcho ybcho at biomics.org
Wed Dec 1 21:59:52 EST 2004


I have been parsing RefSeq gpff files using Bioperl-1.4.

But I found that the taxonomy ID ($taxonomy_id) was missing when printing with the script below.

The statement print "@db_xref\n"; produced the following:

taxon:9606
taxon:9606
taxon:9606
taxon:9606
GeneID:26278 LocusID:26278 MIM:604490
.....

 

 

I cannot figure out why this happened.

 

However, I can get the taxonomy ID with $taxonomy_id = $species->ncbi_taxid;
after removing "&& ( $species->ncbi_taxid())" from line 508 of genbank.pm,
because it has a null value all the time.

 

Can anyone correct this?

 

Cheers.

 

============= refseq parsing script ====================

foreach $feature (@features = $seq->get_SeqFeatures) {
    $location_type = $feature->location->location_type;
    $feature_type  = $feature->primary_tag;

    %seen_tag = ();
    @tags = ();
    foreach $tag (@tags = $feature->get_all_tags) {
        $seen_tag{$tag}++;
    }
    $organism = $db_xref = $taxonomy_id = $strain = $plasmid = ();

    if ($feature_type eq "source") {
        @db_xref = $feature->get_tag_values('db_xref') if exists $seen_tag{'map'};
        print "@db_xref\n";
        ($taxonomy_id) = $db_xref[0] =~ /taxon\:(\d+)/;
        ($strain) = $feature->get_tag_values('strain') if exists $seen_tag{'strain'};
        ($plasmid) = $feature->get_tag_values('plasmid') if exists $seen_tag{'plasmid'};
        print "$internal_id\t$organism\t$strain\t$taxonomy_id\t$plasmid\n"
    }
    .............
}

================================================================
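
One thing that may be worth checking (an assumption, not verified against Bioperl-1.4): in the part of the script shown above, the db_xref assignment is guarded by exists $seen_tag{'map'} rather than 'db_xref', @db_xref itself is not cleared between features, and only $db_xref[0] is searched for the taxon. The sketch below, in plain Perl with no BioPerl dependency, shows one way to pick the taxon ID out of a list of db_xref values regardless of its position. The helper name taxon_id_from_db_xref is hypothetical, and the sample values are copied from the output above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): return the first NCBI taxon ID
# found in a list of db_xref strings, or undef if none is present.
sub taxon_id_from_db_xref {
    my @db_xref = @_;
    for my $xref (@db_xref) {
        return $1 if defined $xref && $xref =~ /^taxon:(\d+)$/;
    }
    return;
}

# Sample values copied from the output shown earlier in this message.
my @source_xrefs = ('taxon:9606');
my @gene_xrefs   = ('GeneID:26278', 'LocusID:26278', 'MIM:604490');

for my $xrefs (\@source_xrefs, \@gene_xrefs) {
    my $taxonomy_id = taxon_id_from_db_xref(@$xrefs);
    print defined $taxonomy_id ? "$taxonomy_id\n" : "no taxon db_xref\n";
}
```

In the script itself, the list passed to the helper would presumably come from $feature->get_tag_values('db_xref'), called only when the feature actually has a 'db_xref' tag.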

 


