[Biojava-l] Reading and writing Fastq files
Michael Heuer
heuermh at acm.org
Thu Apr 1 03:56:42 UTC 2010
xyz wrote:
> Thank you, it works. But after I extended the code with
> RichSequence.IOTools.writeFasta(outputFasta, trimSeq, ns,
> fastq.getDescription());
> to also get a trimmed fasta file, I got the following error:
>
> Fastq2Fasta.java:51: cannot find symbol
> symbol  : method writeFasta(java.io.FileOutputStream,java.lang.String,org.biojavax.SimpleNamespace,java.lang.String)
> location: class org.biojavax.bio.seq.RichSequence.IOTools
> RichSequence.IOTools.writeFasta(outputFasta, trimSeq, ns,
> fastq.getDescription());
> 1 error
The fastq package has not yet been integrated with biojava core or the
biojavax packages. If you would like to use RichSequence.IOTools, you
would need to create a RichSequence from each Fastq object before writing.
Something like
  import static ...RichSequence.Tools.*;
  import static ...RichSequence.IOTools.*;

  Fastq fastq = ...;
  Namespace namespace = ...;
  // build a RichSequence from the Fastq description and sequence string
  RichSequence richSequence = createRichSequence(
      namespace,
      fastq.getDescription(),
      fastq.getSequence(),
      DNATools.getDNA());
  // write the RichSequence out in FASTA format
  writeFasta(outputStream, richSequence, namespace);
may work.
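If only FASTA output is needed, the conversion can also be sketched with the plain JDK, without going through RichSequence at all. The following is a minimal sketch, not BioJava code: the class FastqToFastaSketch and its convert method are hypothetical names, and it assumes every record is exactly four lines (header, sequence, "+" line, quality) with no wrapped sequence lines.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

/** Hypothetical helper for illustration; not part of BioJava. */
final class FastqToFastaSketch {

    /**
     * Copies each four-line FASTQ record from in to out as a two-line
     * FASTA record and returns the number of records written.
     */
    static int convert(Reader in, Writer out) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        int count = 0;
        String header;
        while ((header = reader.readLine()) != null) {
            if (header.isEmpty()) {
                continue; // tolerate blank lines between records
            }
            String sequence = reader.readLine();
            String separator = reader.readLine();
            String quality = reader.readLine();
            // Assumption: strict four-line records; anything else is rejected.
            if (!header.startsWith("@") || sequence == null || separator == null
                    || !separator.startsWith("+") || quality == null) {
                throw new IOException("malformed FASTQ record near: " + header);
            }
            out.write('>');
            out.write(header.substring(1)); // drop the leading '@'
            out.write('\n');
            out.write(sequence);
            out.write('\n');
            count++;
        }
        out.flush();
        return count;
    }

    public static void main(String[] args) throws IOException {
        String fastq = "@HWI-EAS406:5:1:0:1390#0/1\n"
                + "GGGTGATGGCCGCTGCCGATGGCGTCAAAA\n"
                + "+\n"
                + "OOOOOOOOOOOOOOOOOOOOOOOOOOOOOO\n";
        StringWriter fasta = new StringWriter();
        convert(new StringReader(fastq), fasta);
        System.out.print(fasta);
    }
}
```

The quality line is read and discarded, so this only fits the case where the trimmed sequence is all you want in the FASTA file.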
> Suggestions:
> 1)
> After I trimmed the fastq files the header information for quality
> is empty
>
> @HWI-EAS406:5:1:0:1390#0/1
> GGGTGATGGCCGCTGCCGATGGCGTCAAAA
> +
> OOOOOOOOOOOOOOOOOOOOOOOOOOOOOO
>
> this reduced the size of the files but is it compatible with
> SOAP and TopHat?
Sorry, not sure what you are asking here.
> 2)
> I was using fastq files of up to 6 GBytes, and I have not run any benchmarks
> with different buffer/stream combinations on big text files, so I am not
> sure whether it is enough to use just FileInputStream or
> FileOutputStream. BioJavaX uses BufferedReader br = new
> BufferedReader(new FileReader()); is there any speed difference?
AbstractFastqReader.read(InputStream) uses a BufferedReader, and all the
other read methods pass through that one.
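The general pattern, wrapping a raw InputStream in a BufferedReader so that a plain FileInputStream still gets buffered line reads, looks roughly like this in plain JDK code. This is a sketch of the pattern only, not the AbstractFastqReader source; the class and method names are hypothetical and the US-ASCII charset is an assumption.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not part of BioJava. */
final class BufferedLineCountSketch {

    /** Counts lines from any InputStream through a BufferedReader. */
    static long countLines(InputStream in) throws IOException {
        // The BufferedReader reads the underlying stream in large chunks,
        // so the caller can pass an unbuffered FileInputStream directly.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.US_ASCII))) {
            long lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "@r1\nACGT\n+\nIIII\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(countLines(new ByteArrayInputStream(data)));
    }
}
```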
michael