[Biojava-dev] Changes to Sequence in BioJava3
George Waldon
gwaldon at geneinfinity.org
Wed Nov 3 15:35:24 UTC 2010
Hi Andy:
Note that the "reverse" of a sequence usually means the same sequence read in reverse order, from the 3' end to the 5' end, without complementing. If your method returns the reversed and complemented sequence, I think you should name it getReverseComplement:
sequence: TGCG
reverse: GCGT
complement: ACGC
reverse & complement: CGCA
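The four lines above can be checked with a few lines of plain Java. This is only a sketch: StrandOps and its methods are hypothetical names, not part of BioJava, and it assumes unambiguous uppercase DNA (A, C, G, T).

```java
import java.util.Map;

// Hypothetical helper, not part of BioJava: plain-string versions of the
// three operations, for unambiguous uppercase DNA (A, C, G, T) only.
class StrandOps {
    private static final Map<Character, Character> PAIR =
            Map.of('A', 'T', 'T', 'A', 'C', 'G', 'G', 'C');

    // Same bases, opposite order; no complementing.
    static String reverse(String seq) {
        return new StringBuilder(seq).reverse().toString();
    }

    // Each base swapped for its pairing partner; order unchanged.
    static String complement(String seq) {
        StringBuilder out = new StringBuilder(seq.length());
        for (char base : seq.toCharArray()) {
            Character paired = PAIR.get(base);
            if (paired == null) {
                throw new IllegalArgumentException("Not an unambiguous DNA base: " + base);
            }
            out.append(paired);
        }
        return out.toString();
    }

    // Complement every base, then reverse the order.
    static String reverseComplement(String seq) {
        return reverse(complement(seq));
    }

    public static void main(String[] args) {
        String seq = "TGCG";
        System.out.println("sequence:             " + seq);
        System.out.println("reverse:              " + reverse(seq));           // GCGT
        System.out.println("complement:           " + complement(seq));        // ACGC
        System.out.println("reverse & complement: " + reverseComplement(seq)); // CGCA
    }
}
```

The order of the two steps does not matter: reversing and then complementing gives the same string as complementing and then reversing.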
Regards,
George
On Tue, Nov 2, 2010 at 8:16 AM, Andy Yates <ayates at ebi.ac.uk> wrote:
Hi everyone,
A heads-up for anyone with implementations already built on the Sequence interface: I'm proposing a few changes to it. They will break binary compatibility and change the methods you need to implement, but I'll sort out the BioJava core end myself.
1). Removal of getSequenceAsString(Integer,Integer,Strand)
** The implementation is patchy & buggy, often exposing data from backing stores
2). Addition of SequenceView<C> getReverse()
** Will return the sequence on the reverse strand
** Also complemented if applicable
3). Addition of isComplementable() to CompoundSet
** Used to support the above function
This means substrings of Sequences are retrieved like so:
DNASequence d = new DNASequence("ATGCGC");
d.getSubSequence(2, 5).getSequenceAsString(); //Returns TGCG
d.getSubSequence(2, 5).getReverse().getSequenceAsString(); //Returns CGCA
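
The behaviour described in the proposal can be mimicked without BioJava on the classpath. The sketch below is an assumption-laden stand-in, not the BioJava API: SimpleSeq is a hypothetical class, its boolean flag stands in for CompoundSet.isComplementable(), coordinates are taken to be 1-based and inclusive (as the "Returns TGCG" comment implies), and it copies strings eagerly instead of returning a SequenceView.

```java
// Hypothetical stand-in, not the BioJava API: method names mirror the calls
// quoted above, but the class and its behaviour are assumptions.
class SimpleSeq {
    private final String residues;
    // Stands in for CompoundSet.isComplementable(): true for DNA, false for protein.
    private final boolean complementable;

    SimpleSeq(String residues, boolean complementable) {
        this.residues = residues;
        this.complementable = complementable;
    }

    // Assumed 1-based, inclusive coordinates.
    SimpleSeq getSubSequence(int start, int end) {
        return new SimpleSeq(residues.substring(start - 1, end), complementable);
    }

    // Reverses the order; also complements each residue when the flag allows it.
    SimpleSeq getReverse() {
        StringBuilder out = new StringBuilder(residues.length());
        for (int i = residues.length() - 1; i >= 0; i--) {
            char c = residues.charAt(i);
            out.append(complementable ? complementOf(c) : c);
        }
        return new SimpleSeq(out.toString(), complementable);
    }

    private static char complementOf(char base) {
        switch (base) {
            case 'A': return 'T';
            case 'T': return 'A';
            case 'C': return 'G';
            case 'G': return 'C';
            default: throw new IllegalArgumentException("Not an unambiguous DNA base: " + base);
        }
    }

    String getSequenceAsString() {
        return residues;
    }

    public static void main(String[] args) {
        SimpleSeq d = new SimpleSeq("ATGCGC", true);
        System.out.println(d.getSubSequence(2, 5).getSequenceAsString());              // TGCG
        System.out.println(d.getSubSequence(2, 5).getReverse().getSequenceAsString()); // CGCA

        // A non-complementable sequence (e.g. protein) is only reversed.
        SimpleSeq p = new SimpleSeq("MKV", false);
        System.out.println(p.getReverse().getSequenceAsString());                      // VKM
    }
}
```

With a complementable sequence, getReverse() here yields the reverse complement, which is why the name getReverseComplement was suggested in the reply.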