[Biopython] sequence coordinate mapping

Peter Rice pmr at ebi.ac.uk
Thu Jun 24 04:47:34 EDT 2010


On 24/06/2010 09:36, Peter wrote:
> On Wed, Jun 23, 2010 at 5:50 PM, Peter Rice<pmr at ebi.ac.uk>  wrote:
>> I have been following the discussion with interest.
>
> It's nice to have BioPerl and EMBOSS folk on the mailing list :)
>
>> This is something we
>> also want to implement in EMBOSS soon after the next release when we
>> seriously tackle mapping and large alignments.
>
> Are you thinking beyond the simple feature mapping which I've had in
> mind here (e.g. in GenBank or EMBL files)?

Well, EMBOSS internals are identical for analysis results and
EMBL/GenBank features, so we would hope to cover anything we might want
to do.

A big effort after this release will include mapping to coordinate
systems (especially reference sequences). We could then align an
annotated sequence (e.g. an EMBL/GenBank entry) to a reference and aim
to transfer the features, or map features from the reference (using DAS
or some similar protocol to extract just the region of interest) onto
the user's own sequence.

Anything that fails to map completely can be annotated, e.g. with <start
or >end, or with /note="some explanation".
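
To make the idea concrete, here is a rough Python sketch of transferring a
feature through a gapped pairwise alignment and flagging incomplete
mappings with <start, >end and a /note. It is an illustration only, not
EMBOSS or Biopython code: the function names build_column_map and
map_feature, the 1-based inclusive coordinates, and the wording of the
note are all assumptions made for this example.

```python
def build_column_map(aligned_query, aligned_ref, gap="-"):
    """Map each 0-based query position to a 0-based reference position.

    Both arguments are rows of one pairwise alignment (equal length).
    A query residue aligned to a gap in the reference maps to None.
    """
    mapping = {}
    q = r = 0  # running ungapped positions in query and reference
    for qc, rc in zip(aligned_query, aligned_ref):
        if qc != gap:
            mapping[q] = r if rc != gap else None
            q += 1
        if rc != gap:
            r += 1
    return mapping


def map_feature(start, end, mapping):
    """Transfer a 1-based inclusive feature (start, end) to the reference.

    Returns (location, note). The location uses '<' or '>' when the
    corresponding end of the feature could not be mapped; the note is
    None when every position mapped.
    """
    total = end - start + 1
    hits = [(pos, mapping.get(pos - 1)) for pos in range(start, end + 1)]
    hits = [(pos, ref) for pos, ref in hits if ref is not None]
    note = None
    if len(hits) < total:
        note = f'/note="only {len(hits)} of {total} positions mapped"'
    if not hits:
        return None, note  # nothing to place on the reference
    first_pos, first_ref = hits[0]
    last_pos, last_ref = hits[-1]
    start_str = ("<" if first_pos != start else "") + str(first_ref + 1)
    end_str = (">" if last_pos != end else "") + str(last_ref + 1)
    return f"{start_str}..{end_str}", note


# Toy alignment: the query has two leading bases absent from the reference.
mapping = build_column_map("ACGTACGT--", "--GTACGTAA")
print(map_feature(3, 6, mapping))  # feature lies fully inside the aligned region
print(map_feature(1, 4, mapping))  # feature start falls outside the reference
```

Keeping the alignment (or the name of the reference it was made against)
alongside the mapped location is what would let the same table be inverted
to go back from reference to query coordinates.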

The naming of the reference sequences is also important, so that the
mapping could hopefully be reversible.

> Sadly I won't be at the Boston BOSC/ISMB 2010, but Brad and others
> will be. Maybe next time I visit the Sanger Centre I'll try and drop by and
> visit you (Peter R) at the EBI?

Great, let me know when you can drop in. Always good to see you.

regards,

Peter

