[Biopython] Invalid sequence Error(s) from SeqIO module

Sean Brimer skbrimer at gmail.com
Wed Feb 22 10:37:23 EST 2023


Hi Peter,

Thank you for the help! I have gotten the script to work AND learned more
about biopython. I tried a few different things.

I noticed from the tutorial that when concatenating Seq objects you start
from an empty Seq object, whereas I had just made a list of them. When I
use the list approach and set the id and description, I get the same error
that started all of this. (Hooray, reproducible!)

When I start from an empty Seq object instead, my code fails because a Seq
object has no append attribute.

When I change the code from append to join, I get an empty Seq object. I am
guessing that is because joining on an empty object doesn't work? (I am
reading about this one currently.)
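
One possible explanation for the empty result, offered as a guess because
the modified script is not shown: Seq.join() returns a new Seq and leaves
the object it is called on unchanged, so the return value has to be kept.
A minimal sketch with made-up sequences (the names pieces, unassigned and
combined are illustrative only):

```python
from Bio.Seq import Seq

# Made-up sequences standing in for the contigs read from a FASTA file.
pieces = [Seq("ACGT"), Seq("GGCC")]

# join() builds and returns a new Seq; the object it is called on is untouched.
unassigned = Seq("")
unassigned.join(pieces)  # return value discarded
print(len(unassigned))   # still 0

# Keeping the return value gives the concatenated sequence.
combined = Seq("").join(pieces)
print(combined)
```

Joining on an empty Seq is fine in itself; the empty Seq only acts as the
separator placed between the pieces.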

When I change the code to Seq object += Seq, I get exactly what I was
looking for.
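
The working version is not shown here, so the following is only a sketch
of the += approach under stated assumptions: the two-contig FASTA text,
the helper name concatenate_records and the id "sample_brig" are all
hypothetical, and in-memory handles are used in place of real files.

```python
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# Hypothetical two-contig input standing in for a SPAdes contigs file.
FASTA_TEXT = ">contig_1\nACGT\n>contig_2\nGGCC\n"


def concatenate_records(handle, new_id):
    """Join every sequence in a FASTA handle into a single SeqRecord."""
    combined = Seq("")  # start from an empty Seq, not a list
    for record in SeqIO.parse(handle, "fasta"):
        combined += record.seq  # Seq + Seq gives a new Seq
    # Set id and description explicitly on the combined record.
    return SeqRecord(combined, id=new_id, description="")


merged = concatenate_records(StringIO(FASTA_TEXT), "sample_brig")

# Write the single combined record; a filename could be passed instead.
out = StringIO()
SeqIO.write(merged, out, "fasta")
print(out.getvalue())
```

Because the first argument to SeqRecord is now a Seq rather than a list,
the FASTA writer no longer reports an invalid sequence.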

Thank you again for your help!

Sean

On Sun, Feb 19, 2023 at 6:44 AM Peter Cock <p.j.a.cock at googlemail.com>
wrote:

> You can add the SeqRecord objects together, or add their sequences
> together. Either way you should explicitly set the combined record's id and
> description.
>
> See e.g. "Concatenating or adding sequences" or "Adding SeqRecord objects"
> in the Tutorial
> https://biopython.org/DIST/docs/tutorial/Tutorial.html
>
> Peter
>
> On Fri, Feb 17, 2023 at 4:30 PM Sean Brimer <skbrimer at gmail.com> wrote:
>
>> Hi Peter,
>>
>> Thank you for getting back to me!
>>
>> Overall, I am trying to concatenate all the sequences in an XXX.fasta file
>> and write them to a new file called XXX_brig.fasta.
>>
>> For the fa_seq list, I originally had the same thought. I believed
>> appending the Seq objects to a list would convert them to strings, so I
>> would need to call Seq(str) when writing the file. However, it turns out
>> that it just makes a list of Seq objects, and I got an error when calling
>> SeqRecord(Seq(fa_list)), basically saying it was already a Seq object, so I
>> removed the Seq call.
>>
>> On Fri, Feb 17, 2023 at 6:47 AM Peter Cock <p.j.a.cock at googlemail.com>
>> wrote:
>>
>>> Your variable fa_seq is a list, but the first argument to SeqRecord
>>> should be a Seq object.
>>>
>>> Are you trying to concatenate all the sequences in each XXX_brig.fasta
>>> file?
>>>
>>> Peter
>>>
>>> On Mon, Feb 13, 2023 at 3:09 PM Sean Brimer <skbrimer at gmail.com> wrote:
>>>
>>>> Hello biopython group,
>>>>
>>>> I'm using Biopython 1.78 on an Ubuntu 18.04 LTS system with Python
>>>> 3.6.9, and I am trying to reformat a SPAdes multi-FASTA output file
>>>> (contigs) so I can then use BRIG. BRIG expects only one header per
>>>> FASTA file, so I'm attempting to concatenate all of the Seq records under
>>>> one header and write the result to a new file. From the traceback, I think
>>>> it is telling me that it doesn't like the new FASTA header name, but I
>>>> don't know why.
>>>>
>>>> Any help would be appreciated, thank you.
>>>>
>>>> Sean Brimer
>>>>
>>>> This is my Code:
>>>>
>>>> import glob
>>>> from Bio.SeqRecord import SeqRecord
>>>> from Bio.Seq import Seq
>>>> from Bio import SeqIO
>>>>
>>>> ## Looking for fasta files
>>>> fa_list = [i for i in glob.glob("*.fasta")]
>>>>
>>>> ## New file for the contigs for BRIG
>>>> for i in fa_list:
>>>>     handle = i.rpartition(".")[0]+"_brig.fasta"
>>>>     with open(handle,"w"):
>>>>         fa_seq = []
>>>>         new_head = i.rpartition(".")[0]
>>>>         for record in SeqIO.parse(i,"fasta"):
>>>>             fa_seq.append(record.seq)
>>>>         new_rec = SeqRecord(fa_seq, id = new_head, description = "")
>>>>         SeqIO.write(new_rec, handle, "fasta")
>>>>
>>>>
>>>> This is the Error(s) I am getting:
>>>>
>>>> Traceback (most recent call last):
>>>>   File "/home/sean/.local/lib/python3.6/site-packages/Bio/File.py", line 73, in as_handle
>>>>     yield fp
>>>>   File "/home/sean/.local/lib/python3.6/site-packages/Bio/SeqIO/__init__.py", line 524, in write
>>>>     fp.write(format_function(record))
>>>>   File "/home/sean/.local/lib/python3.6/site-packages/Bio/SeqIO/FastaIO.py", line 389, in as_fasta
>>>>     data = _get_seq_string(record)  # Catches sequence being None
>>>>   File "/home/sean/.local/lib/python3.6/site-packages/Bio/SeqIO/Interfaces.py", line 109, in _get_seq_string
>>>>     raise TypeError("SeqRecord (id=%s) has an invalid sequence." % record.id)
>>>> TypeError: SeqRecord (id=LS22-4780-2.4) has an invalid sequence.
>>>>
>>>> During handling of the above exception, another exception occurred:
>>>>
>>>> Traceback (most recent call last):
>>>>   File "brig_fa_formatter.py", line 24, in <module>
>>>>     SeqIO.write(new_rec, handle, "fasta")
>>>>   File "/home/sean/.local/lib/python3.6/site-packages/Bio/SeqIO/__init__.py", line 525, in write
>>>>     count += 1
>>>>   File "/usr/lib/python3.6/contextlib.py", line 126, in __exit__
>>>>     raise RuntimeError("generator didn't stop after throw()")
>>>> RuntimeError: generator didn't stop after throw()
>>>> _______________________________________________
>>>> Biopython mailing list  -  Biopython at biopython.org
>>>> https://mailman.open-bio.org/mailman/listinfo/biopython
>>>>
>>>