[BioPython] Adding startswith and endswith methods to the Seq object

Leighton Pritchard lpritc at scri.ac.uk
Mon Apr 13 10:10:30 EDT 2009


Howdo,

On 13/04/2009 14:47, "Peter" <peter at maubp.freeserve.co.uk> wrote:

> I'm confident there are many possible use cases for this.
> 
> The example which prompted me to work on this was taking SeqRecord
> objects from sequencing reads (a FASTQ file read in with Bio.SeqIO,
> possible with Biopython 1.50 beta or later) where some include a PCR
> primer associated prefix/suffix which I want to strip off (by slicing
> the SeqRecord).  To do this I need to know if a given SeqRecord's
> sequence starts with (or ends with) a given primer sequence (or a
> tuple of primer sequences).
> 
> e.g. I want to be able to do this:
> 
> primer = "TGACCTGAAAAGAC"
> crop = len(primer)
> # record is a SeqRecord object
> if record.seq.startswith(primer):
>     record = record[crop:]
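
For the tuple-of-primers case mentioned in the quoted text, a rough sketch with plain Python strings (not Seq objects) might look like the following. The helper name trim_primer is made up for illustration. str.startswith() does accept a tuple, but it does not report which entry matched, so the sketch loops over the primers to find the crop length.

```python
# Sketch only: plain Python strings stand in for Seq objects here,
# and trim_primer is a hypothetical helper, not part of Biopython.
def trim_primer(sequence, primers):
    """Return sequence with the first matching primer prefix removed."""
    for primer in primers:
        if sequence.startswith(primer):
            # Crop by the length of the primer that actually matched.
            return sequence[len(primer):]
    # No primer matched, so leave the sequence untouched.
    return sequence


read = "TGACCTGAAAAGACGGGTTTCCC"
primers = ("TGACCTGAAAAGAC", "ACGTACGT")
trimmed = trim_primer(read, primers)
print(trimmed)
```

The same loop with endswith() and a slice of sequence[:-len(primer)] would handle a suffix.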

[...]
 
> Does this seem like a sensible addition to the Seq object?  It is
> consistent with making the Seq object more like a python string.

Yes, it does seem sensible.  I'd quite like (eventually) to have the
capability either to supply ambiguity symbols or to query with a regular
expression, along the lines of re.match() (or maybe a hypothetical
re.endmatch(), which does not exist).

Since this isn't implemented yet, maybe there's still time to consider this
potential usage in the implementation?
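
As a rough sketch of the kind of usage meant here (nothing below exists in Biopython; the function names are invented and plain uppercase strings stand in for Seq objects), ambiguity symbols in the query could be expanded to regular-expression character classes. The prefix test then uses re.match(), and the suffix test stands in for the missing re.endmatch() by using re.search() with an end-of-string anchor.

```python
import re

# IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def ambiguous_to_regex(pattern):
    """Expand an ambiguous DNA query (hypothetical helper) into a regex."""
    return "".join(IUPAC_DNA[base] for base in pattern.upper())


def starts_with_ambiguous(sequence, pattern):
    """True if sequence begins with something the ambiguous pattern allows."""
    # re.match() only ever matches at the start of the string.
    return re.match(ambiguous_to_regex(pattern), sequence) is not None


def ends_with_ambiguous(sequence, pattern):
    """True if sequence ends with something the ambiguous pattern allows."""
    # There is no re.endmatch(), so anchor the search with \Z instead.
    return re.search(ambiguous_to_regex(pattern) + r"\Z", sequence) is not None


print(starts_with_ambiguous("TGACCTGA", "TGRC"))  # R allows A or G
print(ends_with_ambiguous("TGACCTGA", "NGA"))     # N allows any base
```

This only handles ambiguity in the query; ambiguity symbols in the sequence itself would need a separate decision about what counts as a match.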

L.

-- 
Dr Leighton Pritchard MRSC
D131, Plant Pathology Programme, SCRI
Errol Road, Invergowrie, Perth and Kinross, Scotland, DD2 5DA
e:lpritc at scri.ac.uk       w:http://www.scri.ac.uk/staff/leightonpritchard
gpg/pgp: 0xFEFC205C       tel:+44(0)1382 562731 x2405

