[Bioperl-l] fastq splitter

Fields, Christopher J cjfields at illinois.edu
Wed Feb 29 12:13:03 EST 2012


Sean,

To follow up, just in case it was a bug: I tested with your example sequences and they work as well, so my guess is that something else is wrong locally.

[cjfields at pyrimidine-laptop sean]$ perl test.pl < example2.fastq 
@HWI-ST156:445:C0EDLACXX:4:1101:1496:1039 1:N:0:ATCACG
CTGCTGGTAGTGCCCAAAGACCTCGAATACAATGGGCTTGGTTTTGATGT
+
BCCFFFFEHHHHHJJJJJHIIJIJJIIGIJJJJJJJIJJJI?FHJJIIJA
@HWI-ST156:445:C0EDLACXX:4:2308:20877:199811 2:Y:0:ATCACG
TCATAAAAATAACAAAACCACCACCCCATACAAACTCTACTCATCTCCAC
+
##################################################
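
For reference, here is a minimal sketch of a splitter along the lines of the quoted script. It is an illustration only, not the test.pl run above; the sub name split_mates and the counters are made up for the example. It calls desc on the sequence object ($inseq) rather than on the Bio::SeqIO stream ($seqin), which as far as I know has no desc method. It also assumes desc returns the header text after the read ID (e.g. "1:N:0:ATCACG"), possibly without the leading space, so the pattern accepts both forms.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Bio::SeqIO;

# Split an interleaved paired-end FASTQ file into <infile>_1 and <infile>_2.
# Returns the number of records written to each file.
# (split_mates is a hypothetical name used only for this sketch.)
sub split_mates {
    my ($infile) = @_;

    my $seqin   = Bio::SeqIO->new(-file => "<$infile",     -format => 'fastq');
    my $seqout1 = Bio::SeqIO->new(-file => ">${infile}_1", -format => 'fastq');
    my $seqout2 = Bio::SeqIO->new(-file => ">${infile}_2", -format => 'fastq');

    my ($n1, $n2) = (0, 0);
    while (my $inseq = $seqin->next_seq) {
        # desc is a method of the sequence object, not of the stream.
        my $desc = $inseq->desc;
        $desc = '' unless defined $desc;

        # Assumption: desc looks like "1:N:0:ATCACG"; the pattern also
        # tolerates a leading space in case the parser keeps it.
        if ($desc =~ /(?:^|\s)1:/) {
            $seqout1->write_seq($inseq);
            $n1++;
        } else {
            $seqout2->write_seq($inseq);
            $n2++;
        }
    }

    # Close the output streams so everything is flushed to disk.
    $_->close for ($seqout1, $seqout2);
    return ($n1, $n2);
}

# When run as a script: perl split_mates.pl reads.fastq
split_mates($ARGV[0]) if !caller && @ARGV;
```

As in the original script, any record whose description does not mark it as mate 1 goes to the second file.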

chris

On Feb 28, 2012, at 3:11 PM, Sean O'Keeffe wrote:

> Hi,
> I'm trying to write a quick script to separate one large PE fastq file into
> 2 separate files, one for each mate.
> 
> The file is of the format (mate1)
> @HWI-ST156:445:C0EDLACXX:4:1101:1496:1039 1:N:0:ATCACG
> CTGCTGGTAGTGCCCAAAGACCTCGAATACAATGGGCTTGGTTTTGATGT
> +
> BCCFFFFEHHHHHJJJJJHIIJIJJIIGIJJJJJJJIJJJI?FHJJIIJA
> 
> && (mate2)
> 
> @HWI-ST156:445:C0EDLACXX:4:2308:20877:199811 2:Y:0:ATCACG
> TCATAAAAATAACAAAACCACCACCCCATACAAACTCTACTCATCTCCAC
> +
> ##################################################
> 
> 
> My idea is to separate using a regex such that / 1:/ would be the first
> mate pair and / 2:/ would go in the second mate file.
> I implemented the code below but each output file is empty. Can someone
> spot my error?
> 
> Thanks,
> Sean.
> 
> my $infile   = shift;
> my $outfile1 = $infile."_1";
> my $outfile2 = $infile."_2";
> 
> my $seqin = Bio::SeqIO->new(
>                             -file   => "<$infile",
>                             -format => "fastq",
>                             );
> my $seqout1 = Bio::SeqIO->new(
>                              -file   => ">$outfile1",
>                              -format => "fastq",
>                              );
> 
> my $seqout2 = Bio::SeqIO->new(
>                              -file   => ">$outfile2",
>                              -format => "fastq",
>                              );
> while (my $inseq = $seqin->next_seq) {
>    if ($seqin->desc =~ / 1:/){
>      $seqout1->write_seq($inseq);
>    } else {
>      $seqout2->write_seq($inseq);
>    }
> }