[Biojava-l] Question about cutting a sequence
Thomas Down
td2@sanger.ac.uk
Fri, 20 Sep 2002 14:32:39 +0100
On Fri, Sep 20, 2002 at 03:18:07PM +0200, Stein Aerts wrote:
>
> I've had these questions since I started using Biojava a year ago, and I
> still can't get it right:
>
> 1. How can I take a certain part of an annotated sequence (let's say from bp
> 500 to 750), make a new sequence from this part, while retaining all
> annotations of the piece, in new coordinates? So a feature on the original
> sequence from 600 to 610 has to be the same feature with its annotation, on
> the subsequence from 100-110.
    Sequence bigSequence = ...
    Sequence smallSequence = new SubSequence(bigSequence, 500, 750);
All annotations are `projected' into subsequence coordinates. In
recent versions (can't remember off the top of my head whether this
got in before 1.2 or not) you can also edit the annotation on
the SubSequence (and have changes reflected back to the parent
sequence).
If a feature crosses the boundary of the SubSequence (e.g. a
feature spanning 450..550), you'll get a RemoteFeature covering
only the portion of the complete feature which fits onto the
SubSequence.
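To make the coordinate shift concrete, here is a minimal sketch of the arithmetic only. It does not use BioJava itself: the class and method names are hypothetical, and it assumes 1-based, inclusive coordinates, so that position 500 of the parent becomes position 1 of the subsequence. Under that assumption a feature at 600..610 would land on 101..111 rather than 100..110, and a feature at 450..550 would be clipped to 1..51.

```java
public class ProjectionSketch {
    /**
     * Hypothetical helper (not part of BioJava): projects a feature range
     * given in parent coordinates onto the window [subStart, subEnd].
     * Assumes 1-based, inclusive coordinates. Returns {min, max} in
     * subsequence coordinates, clipped to the window, or null if the
     * feature lies entirely outside the window.
     */
    public static int[] project(int featMin, int featMax, int subStart, int subEnd) {
        if (featMax < subStart || featMin > subEnd) {
            return null; // no overlap with the window
        }
        // Clip the feature to the window, then shift so subStart maps to 1.
        int clippedMin = Math.max(featMin, subStart);
        int clippedMax = Math.min(featMax, subEnd);
        return new int[] { clippedMin - subStart + 1, clippedMax - subStart + 1 };
    }

    public static void main(String[] args) {
        int[] inside = project(600, 610, 500, 750);
        System.out.println(inside[0] + ".." + inside[1]);     // 101..111

        int[] crossing = project(450, 550, 500, 750);
        System.out.println(crossing[0] + ".." + crossing[1]); // 1..51
    }
}
```

This only mirrors the location arithmetic; the real SubSequence also carries over the feature's type, source and annotation.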
> 2. If I'm using DNA sequences, and if I need the strand of my features, then
> I get Exceptions when a certain feature is not stranded. So in the first
> case I need a StrandedFeature, in the second a Feature, but that puzzles me.
The idea here is that, even if you're just considering DNA
sequences, some types of annotation are inherently directional
(exons, promoters, etc.) while others may not be (replication
origins, matrix attachment sites, etc.). So you should use
StrandedFeature only for cases where the strand has some
real meaning.
If you want to use strand information in a safe way, you can do:
    Feature f = ...
    StrandedFeature.Strand strand = StrandedFeature.UNKNOWN;
    if (f instanceof StrandedFeature) {
        strand = ((StrandedFeature) f).getStrand();
    }
[Aside: it might be worth putting this snippet as a convenience
method somewhere...]
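As a rough illustration of what such a convenience method might look like, here is a self-contained sketch. The nested Feature, StrandedFeature and Strand types below are simplified stand-ins written for this example, not the real BioJava interfaces, and the strandOf name is hypothetical. In BioJava itself the strand constants live on StrandedFeature, as in the snippet above.

```java
public class StrandSketch {
    // Simplified stand-ins for illustration only; NOT the real BioJava types.
    public interface Feature {}

    public enum Strand { POSITIVE, NEGATIVE, UNKNOWN }

    public interface StrandedFeature extends Feature {
        Strand getStrand();
    }

    /**
     * Hypothetical convenience method: returns the strand of a feature,
     * or UNKNOWN if the feature carries no strand information.
     */
    public static Strand strandOf(Feature f) {
        if (f instanceof StrandedFeature) {
            return ((StrandedFeature) f).getStrand();
        }
        return Strand.UNKNOWN;
    }

    public static void main(String[] args) {
        Feature origin = new Feature() {};            // e.g. a replication origin
        StrandedFeature exon = () -> Strand.NEGATIVE; // e.g. an exon on the reverse strand

        System.out.println(strandOf(origin)); // UNKNOWN
        System.out.println(strandOf(exon));   // NEGATIVE
    }
}
```

With a helper like this, calling code never needs to cast, so unstranded features no longer trigger exceptions.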
Hope this makes some kind of sense,
Thomas.