[Bioperl-l] Fwd: Q: batched extraction of sub-sequences and their reverse-complements ?

Chris Fields cjfields at illinois.edu
Mon Apr 18 23:42:49 UTC 2011


Have a look at Bio::Coordinate for various coordinate conversions (I think the specific module to use in this case is probably Bio::Coordinate::GeneMapper).
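
The arithmetic behind the conversion can be sketched in plain Python (this is not BioPerl code and not the Bio::Coordinate API). It assumes 1-based, inclusive coordinates and a sequence of known length; the helper names revcomp, to_reverse_coords, subseq and windows are hypothetical and used only for illustration. The interval start..end on the forward strand should correspond to (length - end + 1)..(length - start + 1) on the reverse-complemented sequence.

```python
# Plain-Python sketch (not BioPerl): 1-based, inclusive coordinates throughout.
# All function names here are hypothetical helpers for illustration only.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def to_reverse_coords(start, end, length):
    """Map a forward-strand interval onto the reverse-complemented sequence."""
    return length - end + 1, length - start + 1

def subseq(seq, start, end):
    """Extract a 1-based, inclusive interval from a string."""
    return seq[start - 1:end]

def windows(length, size):
    """Yield consecutive (start, end) windows; the last one may be shorter."""
    for start in range(1, length + 1, size):
        yield start, min(start + size - 1, length)

chromosome = "ATGCGTACGTTAGC"  # toy stand-in for a chromosome-sized sequence
reverse = revcomp(chromosome)
for start, end in windows(len(chromosome), 5):
    r_start, r_end = to_reverse_coords(start, end, len(chromosome))
    # The converse interval on the reverse strand equals the reverse
    # complement of the forward-strand piece.
    assert subseq(reverse, r_start, r_end) == revcomp(subseq(chromosome, start, end))
    print(start, end, r_start, r_end)
```

If assemblies or database freezes differ in length, the same interval maps to different reverse-strand coordinates, so the length used in the conversion has to come from the exact sequence version being queried.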

chris

On Apr 12, 2011, at 10:23 AM, Dave Messina wrote:

> ---------- Forwarded message ----------
> From: wadim kapulkin <wadim_kapulkin at yahoo.co.uk>
> Date: Tue, Apr 12, 2011 at 17:13
> Subject: Re: [Bioperl-l] Q: batched extraction of sub-sequences and their
> reverse-complements ?
> To: Dave Messina <David.Messina at sbc.su.se>
> 
> 
> Hello Dave
> 
> Thank you very much for your response. Indeed, my question can be split as
> you did :)
> 
> So first:
> Your suggestion below to use Bio::DB::Fasta should do the trick. Thanks
> very much!
> 
> As for the second part: I probably did not explain properly what I had in
> mind. However, the link you included below seems to address this matter.
> Quoting the relevant phrase: 'Although coordinate conversion sounds pretty
> trivial it can get fairly tricky when one includes the possibilities of
> switching to coordinates on negative (i.e. Crick) strands and/or having a
> coordinate system terminate because you have reached the end of a clone or
> contig.' The issue is indeed in the coordinate conversion. In the specific
> example I have been concerned with, I used the C. briggsae chromosomal set
> to run an external program and found that the output sometimes depends on
> strand polarity... (this gets even more complicated when using other
> assemblies/db freezes that offer sequences differing in length). I will need
> a bit more time to describe this specific example.
> 
> Thanks very much again.
> 
> Wadim
> 
> ------------------------------
> *From:* Dave Messina <David.Messina at sbc.su.se>
> *To:* wadim kapulkin <wadim_kapulkin at yahoo.co.uk>
> *Cc:* bioperl-l at lists.open-bio.org
> *Sent:* Sat, 9 April, 2011 4:47:34
> *Subject:* Re: [Bioperl-l] Q: batched extraction of sub-sequences and their
> reverse-complements ?
> 
> Hi Wadim,
> 
>> I would like to extract a batch of subsequences (as FASTA), based on a
>> list of coordinates (i.e. 1-1000, 1001-2000, 2001-3000, etc.), from a
>> given 'large sequence' (i.e. chromosome-sized, >10 MB)
> 
> 
> Take a look at Bio::DB::Fasta.
> 
> 
> 
> 
>> and then, ideally, I would be keen to know how to extract the converse
>> set [i.e. extract the 'same' (I mean corresponding) batch of sequences,
>> based on a list of converse coordinates, from the reverse-complement of
>> the given 'large sequence'].
>> 
> 
> I don't totally understand this part of your question, but this may help:
> 
> http://www.bioperl.org/wiki/BioPerl_Tutorial#Converting_coordinate_systems_.28Coordinate::Pair.2C_RelSegment.29
> 
> 
> Dave
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l