[Biojava-l] Sanger sequencing trace files support
Michael Heuer
heuermh at gmail.com
Tue Jul 12 15:46:03 UTC 2016
On Tue, Jul 12, 2016 at 10:26 AM, Jonas Dehairs <jonas.dehairs at gmail.com>
wrote:
> The 4.2 API currently does not have methods for importing and
> handling Sanger sequencing files (ABI, SCF). I'm currently resorting
> to the legacy classes in 1.9.1 (ChromatogramFactory and Chromatogram).
>
> ChromatogramFactory only supports Sanger trace files with standard
> ATGCN characters. It throws an
> UnsupportedChromatogramFormatException upon reading Sanger files with
> IUPAC Ambiguity Codes (for example M = A or C). Even if I would just
> like to access the traces and ignore the base calls, this is
> impossible with the current implementation since we can't even open
> the file if it contains Ambiguity codes.
>
> On a side note, I have been getting more and more questions from users
> about why they can't open their Sanger sequencing files (in my program that
> uses BioJava). I think the popularity of CRISPR and the
> characterization of CRISPR KO clones (which is likely to result in
> heterozygous base calls) is increasing the number of people that have
> these IUPAC Ambiguity Sanger files.
>
> For now, I tell people to go back to the Sanger sequencing software
> that exports the ABI or SCF files and disable IUPAC Ambiguity in the
> export options. In that case the base calling algorithm just picks the
> strongest signals in case of ambiguity and sticks to standard ATGCN
> characters.
>
> Anyway, I am requesting the addition of the Chromatogram classes to
> the new API with support for opening files if they contain IUPAC
> Ambiguity Codes.
>
The biojava 1.9.x codebase is still maintained at
https://github.com/biojava/biojava-legacy
If you create a pull request that adds IUPAC Ambiguity Codes support to
ChromatogramFactory, it could go into the 1.9.3 release planned for later
this summer.
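
For anyone picking this up, here is a rough sketch of the lookup such a
patch would need: each IUPAC nucleotide code expands to the set of bases it
stands for (M = A or C, as in the example above). The class and method names
below are hypothetical and are not part of the BioJava API; this only
illustrates the code table, not how ChromatogramFactory actually parses base
calls.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not a BioJava class: maps IUPAC nucleotide codes
// to the unambiguous bases they represent.
public final class IupacCodes {
    private static final Map<Character, String> EXPANSIONS = new HashMap<>();

    static {
        // Unambiguous bases map to themselves.
        EXPANSIONS.put('A', "A");
        EXPANSIONS.put('C', "C");
        EXPANSIONS.put('G', "G");
        EXPANSIONS.put('T', "T");
        // Two-base ambiguity codes.
        EXPANSIONS.put('R', "AG");
        EXPANSIONS.put('Y', "CT");
        EXPANSIONS.put('S', "CG");
        EXPANSIONS.put('W', "AT");
        EXPANSIONS.put('K', "GT");
        EXPANSIONS.put('M', "AC");
        // Three-base ambiguity codes.
        EXPANSIONS.put('B', "CGT");
        EXPANSIONS.put('D', "AGT");
        EXPANSIONS.put('H', "ACT");
        EXPANSIONS.put('V', "ACG");
        // Any base.
        EXPANSIONS.put('N', "ACGT");
    }

    private IupacCodes() {
    }

    // Returns the bases a code stands for, e.g. 'M' gives "AC".
    public static String expand(char code) {
        String bases = EXPANSIONS.get(Character.toUpperCase(code));
        if (bases == null) {
            throw new IllegalArgumentException("Not an IUPAC nucleotide code: " + code);
        }
        return bases;
    }

    // True when the code stands for more than one base.
    public static boolean isAmbiguous(char code) {
        return expand(code).length() > 1;
    }

    public static void main(String[] args) {
        for (char c : "ACMGT".toCharArray()) {
            System.out.println(c + " -> " + expand(c) + (isAmbiguous(c) ? " (ambiguous)" : ""));
        }
    }
}
```

A reader that accepts these codes instead of rejecting them would let
callers open the file and get at the traces even when some base calls are
heterozygous.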
michael