[Biojava-dev] GSoC - BioJava File Parsers question

Peter Cock p.j.a.cock at googlemail.com
Wed Mar 28 22:09:43 UTC 2012


On Wed, Mar 28, 2012 at 10:05 PM, P. Troshin <to.petr at gmail.com> wrote:
> Well, they are all widely used tools, and as a result of analysis they
> produce files. If you need to process these results further then you'd
> need to parse the result files. Hence the connection.
>
> Regards,
> Peter

Indeed. It is quite common in Bioinformatics for file formats to
be named after the tool which introduced them - even if sometimes
they become much more widely used.

For GenBank, people typically mean the GenBank plain text 'flat file'
format, which is also used by DDBJ (EMBL uses a very similar format
with a common feature table). For UniProt, that could refer to the
old plain text 'SwissProt' file format or the newer UniProt XML
format. For background on these and other sequence file formats you
might find these pages helpful:

http://www.bioperl.org/wiki/HOWTO:SeqIO#Formats
http://biopython.org/wiki/SeqIO

Peter C.


