[Biopython] AssertionError

Peter biopython at maubp.freeserve.co.uk
Wed Jul 21 05:17:21 EDT 2010


On Wed, Jul 21, 2010 at 9:47 AM, Pierre-Yves <pingou at pingoured.fr> wrote:
> Hi,
>
> I am running into a problem which I can't figure why.
>
> I parse a fasta file using biopython:
> for seq_record in SeqIO.parse(fastafile, "fasta"):
>     if seq_record.id == name:
>            print seq_record.id
>            return seq_record
>
> This seq_record becomes seq and I try to extract only a subpart of the
> sequence:
> s = Seq(seq.seq[start:stop], generic_dna)
> seq_out = SeqRecord(s, id = row[col_name])


If seq is a SeqRecord (a variable name I would avoid for a record,
since it is easy to confuse with a Seq), then seq.seq is a Seq
object, and slicing a Seq object gives another Seq object.
This means you shouldn't do this:

Seq(seq.seq[start:stop], generic_dna)

Just do this:

seq.seq[start:stop]
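
As a small sketch of that point: the sequence and the start/stop values
below are made up for illustration, the code uses Python 3 syntax, and the
alphabet argument is left out so that it does not depend on Bio.Alphabet
being available in your Biopython release.

```python
from Bio.Seq import Seq

# Hypothetical sequence and coordinates, for illustration only.
full = Seq("ACGTACGTAC")
start, stop = 2, 6

# Slicing a Seq gives another Seq, so there is no need to wrap it again.
sub = full[start:stop]

print(type(sub).__name__)  # Seq
print(sub)                 # GTAC
```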

Your workaround seems overly complicated:

Seq(str(seq.seq[start:stop]), generic_dna)

If your reason for doing this is to specify the alphabet, just tell
Bio.SeqIO.parse() the alphabet instead.

You can also slice the original SeqRecord instead, to give a new
SeqRecord, and change its id to what you want.
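
A minimal sketch of slicing the record directly, assuming Python 3. The
FASTA text, the record name, the coordinates and the new id are all
invented stand-ins for your file and your row[col_name] value.

```python
from io import StringIO

from Bio import SeqIO

# Hypothetical FASTA content standing in for the real file.
fasta_text = ">chr1 example\nACGTACGTACGT\n>chr2 example\nTTTTGGGGCCCC\n"

# Hypothetical lookup name, coordinates and replacement id.
name, start, stop = "chr2", 4, 8
new_id = "chr2_4_8"

seq_out = None
for record in SeqIO.parse(StringIO(fasta_text), "fasta"):
    if record.id == name:
        # Slicing a SeqRecord gives a new SeqRecord with the sliced Seq.
        seq_out = record[start:stop]
        seq_out.id = new_id
        break

print("%s %s" % (seq_out.id, seq_out.seq))  # chr2_4_8 GGGG
```

This avoids building the Seq and SeqRecord by hand. The sliced record
keeps the original description unless you change that too.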

>
> But the creation of the sequence object fails with the following error:
> Traceback (most recent call last):
>  File "FastaExtractor.py", line 91, in <module>
>    s = Seq(seq.seq[start:stop], generic_dna)
>  File "/usr/lib64/python2.6/site-packages/Bio/Seq.py", line 87, in
> __init__
>    type(data) == type(u""))  # but can be a unicode string
> AssertionError

That should be a clearer error message. The Seq object does not
expect to be given another Seq object, only a string (plain or
unicode).

Peter


