[Biojava-l] Reading and writing Fastq files
Richard Holland
holland at eaglegenomics.com
Thu Apr 8 11:36:36 UTC 2010
You haven't included the two import static lines in your code. See the first two lines of Michael's example code (expanding the ellipses to the full package name).
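
The "cannot find symbol ... method" error quoted below is what javac reports when a static method is called by its bare name and no matching import static line is in scope. A minimal sketch of that rule follows, using a JDK class so that it compiles without BioJava. StaticImportDemo is a made-up name, and the BioJava lines in the comment are an assumption based on the org.biojavax.bio.seq.RichSequence import in the quoted code.

```java
// Deleting this line makes the bare max(...) call below fail with "cannot find symbol".
import static java.lang.Math.max;

// Assumed BioJava equivalents (package taken from the existing RichSequence import):
//   import static org.biojavax.bio.seq.RichSequence.Tools.*;
//   import static org.biojavax.bio.seq.RichSequence.IOTools.*;

class StaticImportDemo {

    // Qualified call: compiles with or without a static import.
    static int qualified(int a, int b) {
        return Math.max(a, b);
    }

    // Bare call: compiles only because of the import static line above.
    static int unqualified(int a, int b) {
        return max(a, b);
    }

    public static void main(String[] args) {
        System.out.println(qualified(3, 7));   // prints 7
        System.out.println(unqualified(3, 7)); // prints 7
    }
}
```

The qualified form is the other way out: the quoted code already calls RichSequence.IOTools.writeFasta with its class name, and writing the createRichSequence call as RichSequence.Tools.createRichSequence(...) should resolve in the same way.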
On 8 Apr 2010, at 12:30, xyz wrote:
> On Wed, 31 Mar 2010 23:56:42 -0400 (EDT)
> Michael Heuer wrote:
>
>> import static ...RichSequence.Tools.*;
>> import static ...RichSequence.IOTools.*;
>>
>> Fastq fastq = ...;
>> Namespace namespace = ...;
>> RichSequence richSequence = createRichSequence(
>>     namespace,
>>     fastq.getDescription(),
>>     fastq.getSequence(),
>>     DNATools.getDNA());
>>
>> writeFasta(outputStream, richSequence, namespace);
>
> I have tried this but I got this error:
> Fastq2Fasta.java:52: cannot find symbol
> symbol  : method createRichSequence(org.biojavax.SimpleNamespace,java.lang.String,java.lang.String,org.biojava.bio.symbol.FiniteAlphabet)
> location: class Fastq2Fasta
> RichSequence richSequence = createRichSequence(ns,
> 1 error
>
> The complete code looks now :
>
> import java.io.FileInputStream;
> import java.io.FileNotFoundException;
> import java.io.FileOutputStream;
> import java.io.IOException;
> import org.biojava.bio.program.fastq.Fastq;
> import org.biojava.bio.program.fastq.FastqBuilder;
> import org.biojava.bio.program.fastq.FastqReader;
> import org.biojava.bio.program.fastq.FastqVariant;
> import org.biojava.bio.program.fastq.FastqWriter;
> import org.biojava.bio.program.fastq.IlluminaFastqReader;
> import org.biojava.bio.program.fastq.IlluminaFastqWriter;
> import org.biojava.bio.seq.DNATools;
> import org.biojavax.SimpleNamespace;
> import org.biojavax.bio.seq.RichSequence;
>
>
> public class Fastq2Fasta {
>
>     public static void main(String[] args) throws FileNotFoundException,
>             IOException {
>
>         FileInputStream inputFastq = new FileInputStream("fastq2fasta.fastq");
>         FastqReader qReader = new IlluminaFastqReader();
>
>         FileOutputStream outputFastq = new FileOutputStream("fastq2fastaTrim.fastq");
>         FastqWriter qWriter = new IlluminaFastqWriter();
>
>         //SimpleNamespace ns = new SimpleNamespace("biojava");
>
>         FileOutputStream outputFasta = new FileOutputStream("fastq2fastaTrim.fasta");
>
>
>         for (Fastq fastq : qReader.read(inputFastq)) {
>             System.out.println(fastq.getDescription());
>             System.out.println(fastq.getSequence());
>             String trimSeq = fastq.getSequence().substring(0,
>                     fastq.getSequence().length() - 6);
>             System.out.println(trimSeq);
>             System.out.println(fastq.getQuality());
>             String trimQual = fastq.getQuality().substring(0,
>                     fastq.getQuality().length() - 6);
>             System.out.println(trimQual);
>
>             FastqBuilder trimFastq = new FastqBuilder();
>             trimFastq.withVariant(FastqVariant.FASTQ_ILLUMINA);
>             trimFastq.withDescription(fastq.getDescription());
>             trimFastq.appendSequence(trimSeq);
>             trimFastq.appendQuality(trimQual);
>
>             qWriter.write(outputFastq, trimFastq.build());
>
>
>             SimpleNamespace ns = new SimpleNamespace("biojava");
>             RichSequence richSequence = createRichSequence(ns,
>                     fastq.getDescription(), trimSeq, DNATools.getDNA());
>             RichSequence.IOTools.writeFasta(outputFasta, richSequence, ns);
>         }
>     }
> }
>
> What did I do wrong?
>
>
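
A side note on the trimming in the loop quoted above: substring(0, length() - 6) throws a StringIndexOutOfBoundsException for any read shorter than six bases. A guarded helper is sketched below in plain Java; TrimSketch and trimEnd are hypothetical names, not part of BioJava.

```java
class TrimSketch {

    // Returns s without its last n characters, or "" when s has n or fewer characters.
    static String trimEnd(String s, int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must not be negative");
        }
        int keep = Math.max(0, s.length() - n);
        return s.substring(0, keep);
    }

    public static void main(String[] args) {
        String seq = "ACGTACGTAC";
        String qual = "OOOOOOOOOO";
        System.out.println(trimEnd(seq, 6));             // prints ACGT
        System.out.println(trimEnd(qual, 6));            // prints OOOO
        System.out.println(trimEnd("ACG", 6).isEmpty()); // prints true
    }
}
```

Applying the same helper with the same n to both the sequence and the quality string keeps the two the same length, which a FASTQ record requires.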
>>
>>> Suggestions:
>>> 1)
>>> After I trimmed the fastq files the header information for quality
>>> is empty
>>>
>>> @HWI-EAS406:5:1:0:1390#0/1
>>> GGGTGATGGCCGCTGCCGATGGCGTCAAAA
>>> +
>>> OOOOOOOOOOOOOOOOOOOOOOOOOOOOOO
>>>
>>> this reduced the size of the files but is it compatible with
>>> SOAP and TopHat?
>>
>> Sorry, not sure what you are asking here.
>>
> Usually the @-header and the +-header are equal, e.g.
> @HWI-EAS406:5:1:0:1390#0/1
> +HWI-EAS406:5:1:0:1390#0/1
> but after trimming and writing to a fastq file I got this:
> @HWI-EAS406:5:1:0:1390#0/1
> +
> The +-header is empty. Is this OK, and is it compatible with the standard?
>
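
On the +-header question: in the commonly used description of the FASTQ format the title after the "+" is optional, and when it is present it is expected to repeat the "@" title. Whether SOAP and TopHat accept the bare form is not something this sketch can confirm. The plain-Java code below writes a record either way and checks a "+" line against that rule; FastqRecordSketch, format and plusLineOk are hypothetical names, not BioJava API.

```java
class FastqRecordSketch {

    // Builds one four-line FASTQ record; repeatTitle decides whether the
    // '+' line repeats the description or is left bare.
    static String format(String description, String sequence, String quality,
                         boolean repeatTitle) {
        if (sequence.length() != quality.length()) {
            throw new IllegalArgumentException("sequence and quality lengths differ");
        }
        String plusLine = repeatTitle ? "+" + description : "+";
        return "@" + description + "\n" + sequence + "\n" + plusLine + "\n" + quality + "\n";
    }

    // Accepts a bare "+" line, or a "+" line whose text equals the "@" title.
    static boolean plusLineOk(String titleLine, String plusLine) {
        if (!titleLine.startsWith("@") || !plusLine.startsWith("+")) {
            return false;
        }
        return plusLine.length() == 1
                || plusLine.substring(1).equals(titleLine.substring(1));
    }

    public static void main(String[] args) {
        String title = "HWI-EAS406:5:1:0:1390#0/1";
        String seq = "GGGTGATGGCCGCTGCCGATGGCGTCAAAA";
        String qual = "OOOOOOOOOOOOOOOOOOOOOOOOOOOOOO";
        System.out.print(format(title, seq, qual, false)); // bare '+' line
        System.out.print(format(title, seq, qual, true));  // '+' line repeats the title
        System.out.println(plusLineOk("@" + title, "+"));  // prints true
    }
}
```

The bare form saves one copy of the title per record, which matches the smaller file size mentioned in the quoted question.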
> Best regards,
> _______________________________________________
> Biojava-l mailing list - Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
--
Richard Holland, BSc MBCS
Operations and Delivery Director, Eagle Genomics Ltd
T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com
http://www.eaglegenomics.com/