[Biopython] sequence coordinate mapping
Peter Rice
pmr at ebi.ac.uk
Thu Jun 24 08:47:34 UTC 2010
On 24/06/2010 09:36, Peter wrote:
> On Wed, Jun 23, 2010 at 5:50 PM, Peter Rice<pmr at ebi.ac.uk> wrote:
>> I have been following the discussion with interest.
>
> It's nice to have BioPerl and EMBOSS folk on the mailing list :)
>
>> This is something we
>> also want to implement in EMBOSS soon after the next release when we
>> seriously tackle mapping and large alignments.
>
> Are you thinking beyond the simple feature mapping which I've had in
> mind here (e.g. in GenBank or EMBL files)?
Well, EMBOSS internals are identical for analysis results and
EMBL/GenBank features, so we would hope to cover anything we might want
to do.
A big effort after this release will include mapping to coordinate
systems (especially reference sequences) so we could align an annotated
sequence (e.g. an EMBL/GenBank entry) to a reference and aim to transfer
the features, or to map features from the reference (using DAS or some
similar protocol to extract just the region of interest) on to the
user's own sequence.
Anything that fails to map completely can be annotated, e.g. with <start
or >end, or with /note="some explanation".
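To make the idea concrete, here is a rough Python sketch of transferring a
feature through a gapped pairwise alignment and flagging partial ends with
< and >. It is only an illustration under stated assumptions: the function
names (build_position_map, map_feature, format_location) are hypothetical
and are not EMBOSS or Biopython API, the alignment is assumed to be given as
two equal-length gapped strings, and feature coordinates are taken as
0-based, half-open on the annotated sequence.

```python
def build_position_map(aligned_query, aligned_ref, gap="-"):
    """Map each ungapped query index to a reference index (or None).

    Both arguments are rows of one pairwise alignment (assumed input).
    """
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    mapping = []
    ref_pos = 0
    for q, r in zip(aligned_query, aligned_ref):
        if q != gap:
            # Query residue: record its reference partner, if it has one.
            mapping.append(ref_pos if r != gap else None)
        if r != gap:
            ref_pos += 1
    return mapping


def map_feature(start, end, mapping):
    """Transfer a 0-based, half-open [start, end) feature to the reference.

    Returns (ref_start, ref_end, start_partial, end_partial) with a
    half-open reference range, or None if no position of the feature maps.
    """
    mapped = [(i, mapping[i]) for i in range(start, end)
              if mapping[i] is not None]
    if not mapped:
        return None
    first_q, first_r = mapped[0]
    last_q, last_r = mapped[-1]
    # An end is partial when the true feature boundary fell in a gap.
    return first_r, last_r + 1, first_q != start, last_q != end - 1


def format_location(ref_start, ref_end, start_partial, end_partial):
    """Write a 1-based EMBL/GenBank style location such as <3..5."""
    return "%s%d..%s%d" % ("<" if start_partial else "", ref_start + 1,
                           ">" if end_partial else "", ref_end)


# Toy alignment: the query has two extra bases, the reference two others.
mapping = build_position_map("ACGTTGCA--TT", "AC--TGCAGGTT")
print(format_location(*map_feature(2, 7, mapping)))  # <3..5
```

A feature that falls entirely in a gap returns None here; in practice that
is where a /note explaining the failure would be attached instead.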
The naming of the reference sequences is also important so the mapping
could hopefully be reversible.
> Sadly I won't be at the Boston BOSC/ISMB 2010, but Brad and others
> will be. Maybe next time I visit the Sanger Centre I'll try and drop by and
> visit you (Peter R) at the EBI?
Great, let me know when you can drop in. Always good to see you.
regards,
Peter