[Biojava-dev] Biojava.util package?
Michael Heuer
heuermh at gmail.com
Thu Mar 29 16:39:02 UTC 2012
David Felty <davfelty at gmail.com> wrote:
> I've actually been working on something like this for my GSoC proposal,
> here's what I came up with:
>
> public class SeqIO {
>     public static final int FASTA = 0;
>     public static final int FASTQ = 1;
>     public static final Class<DNASequence> DNA = DNASequence.class;
>     public static final Class<ProteinSequence> PROTEIN =
>             ProteinSequence.class;
>
>     public static <S extends Sequence> Iterable<S> parse(InputStream is,
>             int fileFormat, Class<S> seqType) throws Exception {
>         switch (fileFormat) {
>             case FASTA:
>                 if (seqType == DNA) {
>                     return (Iterable<S>)
>                             FastaReaderHelper.readFastaDNASequence(is);
>                 } else if (seqType == PROTEIN) {
>                     // etc...
>                 }
>                 break;
>             case FASTQ:
>                 // etc...
>         }
>     }
> }
>
> It would be used like so:
>
> InputStream is = ...
> Iterable<DNASequence> seqs = SeqIO.parse(is, SeqIO.FASTA, SeqIO.DNA);
> for (DNASequence s : seqs) {
>     // do something
> }
>
> Obviously it's not the prettiest and a lot could be changed, but that's my
> initial design. I tried to base it off BioPython's SeqIO, but static typing
> and the variety of Sequence types forced me to put in some nasty generics.
> Any tips would be appreciated!
Hello David,
You might also want to look at the cookbook examples for FASTQ parsing:
http://biojava.org/wiki/BioJava:Cookbook:SeqIO:FASTQ
http://biojava.org/wiki/BioJava:CookBook3:FASTQ
These APIs differ from the biojava-legacy FASTA parsing APIs
and from the biojava3 FASTA parsing APIs for various historical
reasons. Something good might come from pulling together the best
parts of all of them.
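
As a rough illustration of what a unified entry point could look like,
here is a minimal, self-contained sketch. It does not use any BioJava
classes: SeqIOSketch, Seq, Dna, Protein, SeqType and Format are all
hypothetical names made up for this example, and the FASTA reader is
deliberately naive. The idea is to replace the int constants with an
enum and the Class<S> argument with a typed token that knows how to
build its own sequence type, so the unchecked cast goes away.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

// Hypothetical sketch only; none of these types exist in BioJava.
class SeqIOSketch {

    /** Stand-in for a sequence type: an identifier plus its residues. */
    static abstract class Seq {
        private final String id;
        private final String residues;
        Seq(String id, String residues) { this.id = id; this.residues = residues; }
        String id() { return id; }
        String residues() { return residues; }
    }

    static final class Dna extends Seq {
        Dna(String id, String residues) { super(id, residues); }
    }

    static final class Protein extends Seq {
        Protein(String id, String residues) { super(id, residues); }
    }

    /** Typed token that replaces Class<S>; it carries a factory for S. */
    static final class SeqType<S extends Seq> {
        static final SeqType<Dna> DNA = new SeqType<>(Dna::new);
        static final SeqType<Protein> PROTEIN = new SeqType<>(Protein::new);

        private final BiFunction<String, String, S> factory;
        private SeqType(BiFunction<String, String, S> factory) { this.factory = factory; }
        S create(String id, String residues) { return factory.apply(id, residues); }
    }

    /** File formats as an enum instead of int constants. */
    enum Format { FASTA }

    /** Single entry point; dispatches on format, the token supplies the type. */
    static <S extends Seq> List<S> parse(BufferedReader in, Format format, SeqType<S> type)
            throws IOException {
        switch (format) {
            case FASTA:
                return parseFasta(in, type);
            default:
                throw new IllegalArgumentException("Unsupported format: " + format);
        }
    }

    /** Naive FASTA reader: '>' starts a record, other lines are residues. */
    private static <S extends Seq> List<S> parseFasta(BufferedReader in, SeqType<S> type)
            throws IOException {
        List<S> out = new ArrayList<>();
        String id = null;
        StringBuilder residues = new StringBuilder();
        String line;
        while ((line = in.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.startsWith(">")) {
                // Flush the previous record before starting a new one.
                if (id != null) {
                    out.add(type.create(id, residues.toString()));
                }
                id = line.substring(1).trim();
                residues.setLength(0);
            } else if (id != null) {
                residues.append(line);
            }
        }
        // Flush the last record, if any.
        if (id != null) {
            out.add(type.create(id, residues.toString()));
        }
        return out;
    }
}
```

With this shape, a call such as SeqIOSketch.parse(reader,
SeqIOSketch.Format.FASTA, SeqIOSketch.SeqType.DNA) returns a
List<Dna> without any casts at the call site. Whether a token like
this could be mapped onto the existing biojava3 readers is an open
question; the sketch only shows the typing idea.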
michael