[Bioperl-l] Retrieving from an indexed fasta file into a LargeSeq object

Jason Stajich jason.stajich at duke.edu
Mon Sep 27 09:35:22 EDT 2004


It depends on how much effort you want to put in here and why you really 
want a LargeSeq, i.e. are you still going to call 'seq' on the LargeSeq 
object to get the sequence as a string?  Yes, it can be done by changing 
the factory which SeqIO uses to create the sequence; have a look at 
the Bio::Index::AbstractSeq module.

You'd want to call (not pretty, I know):

$idx->_get_SeqIO_object->sequence_factory(
    Bio::Seq::SeqFactory->new(-type => 'Bio::Seq::LargeSeq'));


However, I think Lincoln's Bio::DB::Fasta module is better for handling 
this sort of thing, as you can request virtual slices of the sequence 
data.  I bet it will be much faster than the LargeSeq implementation, 
although the two share the idea of using the filesystem instead of 
memory for the sequence storage.  Just make sure your FASTA file is 
consistently formatted (all sequence lines the same length); a quick
'sreformat fasta fafile > newfafile; mv newfafile fafile' can take 
care of that.
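
A minimal sketch of the Bio::DB::Fasta route, assuming BioPerl is installed. The file name, the ID 'contig1' and the sequence are made up for illustration; the script writes its own small FASTA file so that it can be run as-is.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::DB::Fasta;
use File::Temp qw(tempdir);

# Write a tiny, consistently wrapped FASTA file (hypothetical data):
# every sequence line has the same length except the last one.
my $dir   = tempdir(CLEANUP => 1);
my $fasta = "$dir/contigs.fa";
open my $out, '>', $fasta or die "Cannot write $fasta: $!";
print {$out} ">contig1\n", "ACGTACGTAC\n", "GGGGCCCCTT\n", "AAAA\n";
close $out;

# The file is indexed on first use; the index is stored next to it.
my $db = Bio::DB::Fasta->new($fasta);

# Length lookup and slice retrieval; coordinates are 1-based and inclusive.
my $len   = $db->length('contig1');
my $slice = $db->seq('contig1', 8, 13);

# A sequence object can also be fetched and sliced with subseq().
my $obj = $db->get_Seq_by_id('contig1');
my $sub = $obj->subseq(1, 4);

print "length: $len\n";
print "8..13:  $slice\n";
print "1..4:   $sub\n";
```

For real contigs you would point Bio::DB::Fasta->new at the existing FASTA file and only ever request the regions you need, rather than pulling the whole sequence out as a string.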

-jason

On Sep 24, 2004, at 4:39 PM, Christopher Porter wrote:

>
> I have a fasta file containing large contig sequences, which I have 
> indexed using Bio::Index::Fasta. Is there a way to use the index to 
> retrieve sequences into a Bio::Seq::LargeSeq object rather than 
> Bio::Seq?
>
> What I'm currently doing is essentially:
>
> #!/usr/bin/perl
>
> use strict;
> use Bio::SeqIO;
> use Bio::Index::Fasta;
>
> my $idx = Bio::Index::Fasta->new('-filename'=>$hcindex);
>
> foreach my $acc(keys %$foo){
> 	my $seqobj = $idx->fetch($acc);
> 	...
> }
>
> How can I force $seqobj to be a LargeSeq?
>
> (At another point in the script I'm using SeqIO to read short 
> sequences from a non-indexed fasta file - I don't really want to use 
> LargeSeq for that part.)
>
>
> Thanks,
>
> Chris
--
Jason Stajich
jason.stajich at duke.edu
http://www.duke.edu/~jes12/