[Bioperl-l] BIO::DB::FASTA ID

Jason Stajich jason at bioperl.org
Thu Jun 21 21:19:14 UTC 2007


Hey Nick -
I think
a) your IDs are not unique
b) you need to declare the function make_my_id BEFORE your call  
Bio::DB::Fasta->new if you want your function to be used.

$ grep "^>" T_orthologs_Dpse_genes.fa | awk '{print $1}' | sort |  
uniq | wc -l
1527


-jason
On Jun 21, 2007, at 11:36 AM, Staffa, Nick (NIH/NIEHS) wrote:

> #!/usr/bin/perl
> #
> #
> #
> use strict;
> use Bio::DB::Fasta;
> use Bio::Tools::SeqWords;
> use Bio::Seq;
> use Bio::SeqIO;
> $|=1;
> #
> #
> my $Dpse_UTR_file_for_T_orthologs =
> "/home/staffa/clients/Kari/D_pse_genome/testit/ 
> T_orthologs_Dpse_genes.fa";
> my $db = Bio::DB::Fasta->new
> ('/home/staffa/clients/Kari/D_pse_genome/testit/ 
> T_orthologs_Dpse_genes.fa',
>   -reindex,  -makeid => \&make_my_id);
> my @ids = $db->ids;
> my $number_in = @ids;
> print "number of Dpse IDs = $number_in\n";
> foreach my $id (@ids){
> print "$id\n";
> }
> sub make_my_id {
> #       parse header line:
> #       >Dpse_GA13134 CG14636 NO UTR has 2 TATTTAT 117 145, 0  
> TTATTTATT
>     my $line = shift;
> #    print "line = $line\n";
>     $line =~ />(\w+) /;
>     my $ID = $1;
> #    print "ID = $ID\n";
>     return $ID;
>       }

--
Jason Stajich
jason at bioperl.org
http://jason.open-bio.org/





More information about the Bioperl-l mailing list