[Biopython] Invalid sequence Error(s) from SeqIO module
Sean Brimer
skbrimer at gmail.com
Wed Feb 22 10:37:23 EST 2023
Hi Peter,
Thank you for the help! I have gotten the script to work AND learned more
about biopython. I tried a few different things.
I noticed from the tutorial that when concatenating Seq objects you start
from an empty Seq object, whereas I had just made a list of them. When I
use the list approach and set the id and description, I get the same error
that started all of this. (Hooray, reproducible!)
When I start from an empty Seq object instead, my code fails because a Seq
object has no append attribute.
When I change the code from append to join I get an empty Seq object. I am
guessing that is because joining on an empty object doesn't work? (I am
reading about this one currently.)
When I change the code to Seq object += Seq I get exactly what I was
looking for.
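
For anyone finding this thread later, here is a minimal sketch of the +=
approach. It is not my actual script: the helper name concatenate_records,
the two made-up contigs, and the "sample_brig" id are placeholders, and it
uses StringIO instead of real files so it runs anywhere Biopython is
installed. The last lines show one possible reason join looked empty for
me (an assumption, since Seq objects are immutable and join returns a new
Seq rather than changing the empty one in place).

```python
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord


def concatenate_records(records, new_id):
    """Join the sequences of all records into one SeqRecord (hypothetical helper)."""
    combined = Seq("")  # start from an empty Seq, as in the tutorial
    for record in records:
        # Seq is immutable: += builds a new Seq and rebinds the name.
        combined += record.seq
    # Set id and description explicitly on the combined record.
    return SeqRecord(combined, id=new_id, description="")


# Two made-up contigs standing in for a multi-FASTA contigs file.
fasta_text = ">contig_1\nACGT\n>contig_2\nGGCC\n"
records = list(SeqIO.parse(StringIO(fasta_text), "fasta"))

merged = concatenate_records(records, "sample_brig")

# Write the single combined record in FASTA format.
out = StringIO()
SeqIO.write(merged, out, "fasta")
fasta_out = out.getvalue()

# join() returns a new Seq; if the result is not assigned,
# the original empty Seq is left unchanged.
empty = Seq("")
empty.join([r.seq for r in records])  # result discarded
joined = Seq("").join([r.seq for r in records])  # result kept
```

With real files, the StringIO objects would be replaced by the input and
output file names passed to SeqIO.parse and SeqIO.write.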
Thank you again for your help!
Sean
On Sun, Feb 19, 2023 at 6:44 AM Peter Cock <p.j.a.cock at googlemail.com>
wrote:
> You can add the SeqRecord objects together, or add their sequences
> together. Either way you should explicitly set the combined record's id and
> description.
>
> See e.g. "Concatenating or adding sequences" or "Adding SeqRecord objects"
> in the Tutorial
> https://biopython.org/DIST/docs/tutorial/Tutorial.html
>
> Peter
>
> On Fri, Feb 17, 2023 at 4:30 PM Sean Brimer <skbrimer at gmail.com> wrote:
>
>> Hi Peter,
>>
>> Thank you for getting back to me!
>>
>> Overall I am trying to concatenate all the sequences in a XXX.fasta file
>> and write them to a new file called XXX_brig.fasta file.
>>
>> For the fa_seq list, originally I had the same thought. I believed
>> appending the Seq objects to a list would convert them to strings and that
>> I would need to call Seq(str) when writing the file. However, it turns out
>> that it just makes a list of Seq objects, and I got an error when calling
>> SeqRecord(Seq(fa_list)), basically saying it was already a Seq object, so
>> I removed the Seq call.
>>
>> On Fri, Feb 17, 2023 at 6:47 AM Peter Cock <p.j.a.cock at googlemail.com>
>> wrote:
>>
>>> Your variable fa_seq is a list, but the first argument to SeqRecord
>>> should be a Seq object.
>>>
>>> Are you trying to concatenate all the sequences in each XXX_brig.fasta
>>> file?
>>>
>>> Peter
>>>
>>> On Mon, Feb 13, 2023 at 3:09 PM Sean Brimer <skbrimer at gmail.com> wrote:
>>>
>>>> Hello biopython group,
>>>>
>>>> I'm using Biopython 1.78 on an Ubuntu 18.04 LTS system with Python
>>>> 3.6.9, and I am trying to reformat a SPAdes multi-fasta output file
>>>> (contigs) so I can then use BRIG. BRIG expects only one header per
>>>> fasta, so I'm attempting to concatenate all of the Seq records under one
>>>> header and write the result to a new file. From the traceback I think it
>>>> is telling me that it doesn't like the new fasta header name, but I don't
>>>> know why.
>>>>
>>>> Any help would be appreciated, thank you.
>>>>
>>>> Sean Brimer
>>>>
>>>> This is my Code:
>>>>
>>>> import glob
>>>> from Bio.SeqRecord import SeqRecord
>>>> from Bio.Seq import Seq
>>>> from Bio import SeqIO
>>>>
>>>> ## Looking for fasta files
>>>> fa_list = [i for i in glob.glob("*.fasta")]
>>>>
>>>> ## New file for the contigs for BRIG
>>>> for i in fa_list:
>>>>     handle = i.rpartition(".")[0]+"_brig.fasta"
>>>>     with open(handle,"w"):
>>>>         fa_seq = []
>>>>         new_head = i.rpartition(".")[0]
>>>>         for record in SeqIO.parse(i,"fasta"):
>>>>             fa_seq.append(record.seq)
>>>>         new_rec = SeqRecord(fa_seq, id = new_head, description = "")
>>>>         SeqIO.write(new_rec, handle, "fasta")
>>>>
>>>>
>>>> This is the Error(s) I am getting:
>>>>
>>>> Traceback (most recent call last):
>>>>   File "/home/sean/.local/lib/python3.6/site-packages/Bio/File.py", line 73, in as_handle
>>>>     yield fp
>>>>   File "/home/sean/.local/lib/python3.6/site-packages/Bio/SeqIO/__init__.py", line 524, in write
>>>>     fp.write(format_function(record))
>>>>   File "/home/sean/.local/lib/python3.6/site-packages/Bio/SeqIO/FastaIO.py", line 389, in as_fasta
>>>>     data = _get_seq_string(record)  # Catches sequence being None
>>>>   File "/home/sean/.local/lib/python3.6/site-packages/Bio/SeqIO/Interfaces.py", line 109, in _get_seq_string
>>>>     raise TypeError("SeqRecord (id=%s) has an invalid sequence." % record.id)
>>>> TypeError: SeqRecord (id=LS22-4780-2.4) has an invalid sequence.
>>>>
>>>> During handling of the above exception, another exception occurred:
>>>>
>>>> Traceback (most recent call last):
>>>>   File "brig_fa_formatter.py", line 24, in <module>
>>>>     SeqIO.write(new_rec, handle, "fasta")
>>>>   File "/home/sean/.local/lib/python3.6/site-packages/Bio/SeqIO/__init__.py", line 525, in write
>>>>     count += 1
>>>>   File "/usr/lib/python3.6/contextlib.py", line 126, in __exit__
>>>>     raise RuntimeError("generator didn't stop after throw()")
>>>> RuntimeError: generator didn't stop after throw()
>>>> _______________________________________________
>>>> Biopython mailing list - Biopython at biopython.org
>>>> https://mailman.open-bio.org/mailman/listinfo/biopython
>>>>
>>>