[Biopython] Write operations for abi format in Bio.SeqIO

Peter Cock p.j.a.cock at googlemail.com
Wed Sep 29 04:54:41 EDT 2021


It seems worth a try - you've come up with multiple ABI modification
examples which make sense to me, not that I have used ABI files much.
The parser and data-structure used may not be ideal, but I guess you
think you can work with that?

I've CC'd Bow who wrote our parser (10 years ago now), in case he has
missed this mailing list thread but has any specific comments.

Peter

On Wed, Sep 29, 2021 at 1:24 AM Rohan Kanchana <rohankanchana at gmail.com> wrote:
>
> Hi Peter,
>
> Thanks for your response. To address your earlier point about confusion in writing ABI files from scratch, I suppose that Biopython could provide default values for the fields users would not usually concern themselves with (such as base order, mobility filenames, and lane number). These values could then, of course, be overridden at the user's discretion.
>
> Here are some use cases for writing/rewriting ABI files:
>
> * Modifying base calls on existing ABI files, like you previously mentioned
> * Modifying peak locations on existing ABI files (sometimes the automated base calling algorithm skips and/or double counts peaks in the sequencing trace)
> * Modifying traces of existing ABI files to remove noise and other sequencing artifacts
> * Generating new ABI files from scratch for the purposes of software testing
> * Converting data that is not in ABI format (such as data from an API) into ABI format, for storage or input into another software pipeline
> * Automated annotation of existing ABI files
>
> In short, most of these use cases revolve around software pipelines or libraries that accept ABI files as input. Oftentimes such software can't be modified to allow alternative forms of input, so the ability to generate/modify/preprocess ABI files would enable more technical users to maximize the potential of such software. Some examples of software that accepts Sanger sequencing input include TIDE, ICE, and DECODR (full disclosure – I worked on DECODR), as well as EditR.
>
> Thanks,
> Rohan
>
> On Tue, Sep 28, 2021 at 9:46 AM Peter Cock <p.j.a.cock at googlemail.com> wrote:
>>
>> Hello Rohan,
>>
>> The ABI format has a vast number of metadata fields which would make a
>> writer rather complicated to use (except for the special case of
>> loading an existing file, modifying it, and writing out the new
>> version).
>>
>> Do you have a use case? The only one I can come up with is redoing the
>> base calling, and then updating the called sequence in the ABI file.
>>
>> Thanks,
>>
>> Peter
>>
>> On Sun, Sep 26, 2021 at 8:34 PM Rohan Kanchana <rohankanchana at gmail.com> wrote:
>> >
>> > Dear all,
>> >
>> > I'm Rohan Kanchana, a student at MIT. I observed that Bio.SeqIO only supports read operations for the "abi" format. I have code that would support a write operation for the "abi" format, so I wanted to ask the following: why are write operations not currently supported by Biopython for the "abi" format? If there is no reason, would making such a contribution to the Biopython library be useful?
>> >
>> > Thanks,
>> > Rohan
>> > _______________________________________________
>> > Biopython mailing list  -  Biopython at biopython.org
>> > https://mailman.open-bio.org/mailman/listinfo/biopython
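
Two ideas from this thread, default values for rarely-edited fields and
editing base calls together with their peak locations, can be sketched in
plain Python. This is only an illustration under stated assumptions, not
Biopython code and not the writer proposed above: a plain dict stands in
for the parsed tag data, the keys follow the ABIF "tag name plus number"
convention (FWO_1, LANE1, PDMF1, PBAS1, PLOC1), the default values are
invented for the example, and build_tags, delete_call and insert_call are
hypothetical helper names.

```python
# Illustrative defaults only; these values are assumptions for the sketch.
DEFAULT_TAGS = {
    "FWO_1": "GATC",  # base order
    "LANE1": 1,       # lane / capillary number
    "PDMF1": "",      # mobility file name
}


def build_tags(user_tags):
    """Return a new tag dict in which user-supplied values override defaults."""
    return {**DEFAULT_TAGS, **user_tags}


def _check_aligned(tags):
    """Base calls and peak locations must stay the same length."""
    if len(tags["PBAS1"]) != len(tags["PLOC1"]):
        raise ValueError("PBAS1 and PLOC1 have different lengths")


def delete_call(tags, index):
    """Drop one base call and its peak, e.g. for a double-counted peak."""
    _check_aligned(tags)
    if not 0 <= index < len(tags["PBAS1"]):
        raise IndexError("base call index out of range")
    edited = dict(tags)  # shallow copy; the input dict is left unchanged
    edited["PBAS1"] = tags["PBAS1"][:index] + tags["PBAS1"][index + 1:]
    edited["PLOC1"] = tags["PLOC1"][:index] + tags["PLOC1"][index + 1:]
    return edited


def insert_call(tags, index, base, peak):
    """Insert one base call and its peak, e.g. for a skipped peak."""
    _check_aligned(tags)
    if not 0 <= index <= len(tags["PBAS1"]):
        raise IndexError("insert position out of range")
    edited = dict(tags)
    edited["PBAS1"] = tags["PBAS1"][:index] + base + tags["PBAS1"][index:]
    edited["PLOC1"] = tags["PLOC1"][:index] + (peak,) + tags["PLOC1"][index:]
    return edited


# Usage: the caller supplies only the fields they care about.
tags = build_tags({"PBAS1": "ACGGT", "PLOC1": (10, 22, 35, 37, 50), "LANE1": 7})
fixed = delete_call(tags, 3)  # remove the call at trace position 37
```

Returning a new dict rather than editing in place keeps the loaded data
intact, which suits the "load an existing file, modify it, write out the new
version" case. Actually serialising the tags back to the binary format is
outside this sketch.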

