Bioperl: Re: Bioperl-guts: Reading in a substring of a sequence file

Ewan Birney birney@ebi.ac.uk
Mon, 1 May 2000 14:53:48 +0100 (GMT)


On Mon, 1 May 2000, Shai Shen-Orr wrote:

> Hi all,

[cc'ing this back to the main bioperl list, as it is a general question]

> 
> I'm new to BioPerl, looks like a great set of tools!  

We hope you like it - do comment on things you don't think are great;
probably best done after playing with the package for a couple of weeks
and browsing the documentation.

> 
> I have a question:  Is anyone aware of a way by which I can extract only
> a substring of a sequence from a file (by giving its character start &
> end positions), instead of getting it as a whole sequence and then using
> seq->substr ?
> 

Sadly - this is not (currently) doable. It is in fact quite a hard
problem. In the future I hope that sequences which are made from contigs
of other sequences will be able to cleverly retrieve sub sequences. This
however won't help you ;) (why do I mention it? I guess because I want to
reassure people that I/we are thinking about very, very long sequences in
bioperl).

You could write your own SeqIO module, such as Bio::SeqIO::SubFasta.
The constructor would take extra arguments, -start and -end. Then this
module would be there for everyone to use. 
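
A minimal sketch of the streaming idea behind such a module, written in
Python rather than as an actual Bio::SeqIO subclass. The function name
read_fasta_subseq, the 1-based inclusive coordinates, and the choice to read
only the first record are assumptions made for illustration; none of this is
part of BioPerl. The point is that only the lines overlapping the requested
range are kept, and reading stops as soon as the end position is reached.

```python
import io


def read_fasta_subseq(handle, start, end):
    """Return residues start..end (1-based, inclusive) of the first FASTA record.

    Hypothetical helper: streams the file line by line and keeps only the
    parts of each sequence line that overlap the requested range.
    """
    if start < 1 or end < start:
        raise ValueError("need 1 <= start <= end")

    pieces = []
    pos = 0              # residues consumed so far in this record
    seen_header = False

    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if seen_header:
                break    # a second record begins; stop
            seen_header = True
            continue
        if not seen_header or not line:
            continue

        line_start = pos + 1         # 1-based position of first residue on this line
        line_end = pos + len(line)   # 1-based position of last residue on this line
        pos = line_end

        if line_end < start:
            continue                 # entirely before the requested range

        lo = max(start, line_start) - line_start
        hi = min(end, line_end) - line_start + 1
        pieces.append(line[lo:hi])

        if line_end >= end:
            break                    # range fully covered; skip the rest of the file

    return "".join(pieces)


if __name__ == "__main__":
    fasta = io.StringIO(">chr2\nACGTACGT\nTTGGCCAA\n")
    print(read_fasta_subseq(fasta, 7, 10))  # prints GTTT
```

This still scans the file from the top up to the start position, so it saves
memory rather than read time. If the requested end lies past the end of the
record, the sketch simply returns whatever residues are available.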

I don't think we can handle this (nicely) for every SeqIO parser unless
we change the parsing strategy (I can sense the biopython guys feeling
quite smug now...). It will have to be on a format-by-format basis.




> In this particular instance, I'm trying to extract certain sequences
> from a file containing the complete chromosome 2 of arabidopsis.
> 
> 
> 	Thanks,
> 
> 
> 	Shai
> =========== Bioperl Project Mailing List Message Footer =======
> Project URL: http://bio.perl.org
> For info about how to (un)subscribe, where messages are archived, etc:
> http://www.techfak.uni-bielefeld.de/bcd/Perl/Bio/vsns-bcd-perl-guts.html
> ====================================================================
> 

-----------------------------------------------------------------
Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
<birney@ebi.ac.uk>. 
-----------------------------------------------------------------
