[Bioperl-l] Indexing CDS file
Sviya
sviiya at gmail.com
Tue Feb 10 02:28:34 EST 2009
Hello,I am trying to index cds files. The following script is run on
cds_ann_hum_01_r99.dat obtained from EMBL. It is able to read and print the
ids and sequences, but strangely it is unable to find any of the ids! Could
you please explain the mistake I am committing? I have tried variously but
it doesn't work.
--- <prog> ---
#!/usr/bin/perl -w
use strict;
use Bio::SeqIO;
use Bio::Index::Fasta;
my $file = "cds100.dat";
my $idx_file = "cds100.idx";
unlink($idx_file); # delete the file, if exists
my $in_seq = Bio::SeqIO->new( -file => "< $file",
-format => "embl");
while ( my $seq = $in_seq->next_seq) {
print $seq->id, "\n";
}
my $inx = Bio::Index::Fasta->new( -filename => $idx_file,
-write_flag => 1);
$inx->make_index($file); # creates the index file
print "Done indexing!\n";
my $out = Bio::SeqIO->new( '-format' => 'fasta');
$inx = Bio::Index::Fasta->new('-filename' => $idx_file);
my $query_seq=$inx->fetch('EAL24405SV');
if ( ! defined($query_seq) ) {
print "Sequence not found!\n";
exit 1;
} else {
print $query_seq;
}
exit 0;
--- </prog> ---
Output is:
--- <stdout> ---
EAL24405;
EAL24406;
.
.
.
EAL24407;
EAL24408;
Done indexing!
Sequence not found!
--- </stdout> ---
TIA,
Sviya
More information about the Bioperl-l
mailing list