[Bioperl-l] Bio::DB::Fasta problem: unable to fetch all sequences via get_PrimarySeq_stream

Helene RIMBERT helene.rimbert at inra.fr
Mon Nov 14 16:16:17 UTC 2016


Dear BioPerl developers,

I have a question about get_PrimarySeq_stream().

I am using the Bio::DB::Fasta module to access my FASTA sequences and I 
am having a problem with get_PrimarySeq_stream().
When I inspect the db object, all the sequences are indexed 
(that is, I can see all the sequence ids in the offsets hash).

I then use get_PrimarySeq_stream() to loop over all my sequences, but 
only 1 sequence is retrieved from the stream object.
I looked for an explanation, and the only thing I could find 
is that my seq_ids seem to be treated as undef during the 
while($dbstream->next_seq()) statement, on reaching
IndexedBase.pm line 1116.

I also tried looping over all sequence ids using my @seq_ids = 
$self->{fastaObj}->get_all_primary_ids; and that works very well.
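
A minimal sketch of that ID-based loop, written as a standalone script. The helper name count_seqs_by_id and the FASTA path taken from the command line are illustrative assumptions, not part of the original code; only get_all_primary_ids and get_Seq_by_id come from the description above.

```perl
use strict;
use warnings;

# Hypothetical helper: fetch every sequence by id instead of via the stream.
# $db is expected to behave like a Bio::DB::Fasta object.
sub count_seqs_by_id {
    my ($db) = @_;
    my $count = 0;
    for my $seq_id ($db->get_all_primary_ids) {
        my $seq = $db->get_Seq_by_id($seq_id);
        next unless defined $seq;    # skip ids that cannot be fetched
        printf "%s\t%d\n", $seq->display_id, $seq->length;
        $count++;
    }
    return $count;
}

# Only touch BioPerl when a FASTA file is given on the command line.
if (@ARGV) {
    require Bio::DB::Fasta;
    my $db = Bio::DB::Fasta->new($ARGV[0]);
    print "Fetched sequences by id: ", count_seqs_by_id($db), "\n";
}
```

Comparing the count printed here with the number of sequences returned by the stream shows how many sequences the stream loop is skipping.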

I don't understand why the stream object does not retrieve all the 
sequences whereas get_all_primary_ids does!
Is there something wrong with my input FASTA (my ids are very long...), 
or am I missing something?

I would really like to find out why I am not able to use 
get_PrimarySeq_stream()!

Many thanks in advance :)

Regards,

Helene

#----------------------------------
# here is the part of the code that causes the problem:
# initialize the Bio::DB::Fasta object
$self->{fastaObj} = Bio::DB::Fasta->new("test2.fna", -reindex => 1);

# create stream object
my $seq_stream = $self->{fastaObj}->get_PrimarySeq_stream();
$self->{nbSeqFetchedInStream} = 0;

# loop over all seqs in the Bio::DB::Fasta object using the stream object
while ($self->{seq} = $seq_stream->next_seq()) {
#foreach my $seq_id (@seq_ids) {
     #$self->{seq} = $self->{fastaObj}->get_Seq_by_id($seq_id); # to use with foreach loop

     print(" New sequence: ", Dumper $self->{seq});
     $self->{nbSeqFetchedInStream}++;
}
print(" Fetched sequences in _PrimarySeq_stream: $self->{nbSeqFetchedInStream}");
#----------------------------------





-- 

*--> New e-mail address: helene.rimbert at inra.fr <--*

Hélène RIMBERT
Bioinformatic Engineer
helene.rimbert at inra.fr
UMR 1095 INRA/UBP – Site de Crouel
Tel.: +33 (0)4 73 62 43 49
5 chemin de beaulieu
63039 Clermont-Ferrand Cedex 2
France
https://www6.ara.inra.fr/umr1095_eng/
