[Bioperl-l] Scripting help to identify adaptors count in reads
Juan Jovel
jovel_juan at hotmail.com
Thu Nov 10 11:06:16 EST 2011
There are many ways to do it.
Perhaps the simplest is to count the number of times the adapter sequence (or part of it) appears in each read.
For example:
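my $adapter_matches = () = $read =~ /\Q$adapter_sequence\E/g;
# $adapter_matches will store the number of times the adapter sequence occurs in $read (one read's sequence).
# A global match in list context counts whole-sequence matches; tr/// would only count single characters.
# For adapters masked as runs of N, as in the reads below, match /N+/g instead.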
You then place that result in a hash bin:
my %adapter_frequency;    # declare this once, before looping over the reads
my $class = "$adapter_matches";
if (exists $adapter_frequency{$class}) {
    $adapter_frequency{$class}++;
} else {
    $adapter_frequency{$class} = 1;
}
# Then you can sort and output your classes
foreach my $class (sort { $a <=> $b } keys %adapter_frequency) {
    print "$class\t$adapter_frequency{$class}\n";
}
You can work out the details, but something like this should work.
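To make the idea concrete, here is a minimal sketch that puts the pieces together. It rests on assumptions that are not in the original question: an adapter is taken to be any run of at least 4 consecutive N characters, and the sub names (count_adapters, tally_reads), the threshold and the four toy reads are all made up for illustration. Treat it as a starting point rather than a finished tool.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count adapter runs in one read.
# Assumption: an adapter is a stretch of at least $min_len N's (default 4).
sub count_adapters {
    my ($seq, $min_len) = @_;
    $min_len = 4 unless defined $min_len;
    # Collect every run of N, then keep only the runs that are long enough.
    my @runs = grep { length($_) >= $min_len } $seq =~ /(N+)/gi;
    return scalar @runs;
}

# Read FASTA records from a filehandle and tally reads by adapter count.
# Returns a hash reference: adapter count => number of reads.
sub tally_reads {
    my ($fh) = @_;
    my (%frequency, $seq);
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^>/) {
            # A new header closes the previous record, if there was one.
            $frequency{ count_adapters($seq) }++ if defined $seq;
            $seq = '';
        } elsif (defined $seq) {
            $seq .= $line;    # a sequence may span several lines
        }
    }
    $frequency{ count_adapters($seq) }++ if defined $seq;    # last record
    return \%frequency;
}

# Toy input (hypothetical reads, not the ones from the original message).
my $fasta = <<'END';
>r1
ACGTNNNNNNACGT
>r2
ACGTNNNNNACGTNNNNNNACGT
>r3
ACGTACGT
>r4
ACNNNNNGT
TTNNNNNAA
END

open my $fh, '<', \$fasta or die "Cannot open in-memory FASTA: $!";
my $frequency = tally_reads($fh);
close $fh;

my $total = 0;
$total += $_ for values %$frequency;

# One line per class: adapter count, number of reads, percentage of all reads.
foreach my $class (sort { $a <=> $b } keys %$frequency) {
    printf "%d\t%d\t%.1f%%\n", $class, $frequency->{$class},
        100 * $frequency->{$class} / $total;
}
```

To run it on real data, open the FASTA file instead of the in-memory string, for example open my $fh, '<', 'reads.fasta' (a hypothetical file name), and adjust the minimum run length to suit your adapters.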
> Date: Thu, 10 Nov 2011 04:29:55 -0800
> From: casaburi at ceinge.unina.it
> To: Bioperl-l at lists.open-bio.org
> Subject: [Bioperl-l] Scripting help to identify adaptors count in reads
>
>
> Hi everybody,
>
> I have some 454 reads that contain adaptors (shown as NNNN...): one, two or three
> adaptors per read, depending on the read. Is there any way to
> establish how many reads have 1 adaptor, how many have 2 and how many have 3,
> out of the total?
>
> >271-88
> GCCTTGCCAGCCGCTCAGATTGATNNNNNNNNNNNNNNNATCAGGTGCCTACG
> >272-88
> GCCTTGCCAGCCGCTCAGATTGATNNNNNNNNNNNNNNNATCANNNNNNNNNNNNNNNCTGATGGCGCGAGGGAGGCGCCTTGCCAGCCCGCTCAGATTGATNNNNNNNNNNNNNNNCTGATGGCGCGAGGGAGGC
> >273-88
> GCCTCCCTCGCGCATCAGATCGTAGGCACCATCAATCTGAGCGGGCTGGCAAGGCGCCTCCCTCGCGCCA
> >274-88
> GCCTTGCCAGCCGCTCAGATTGATNNNNNNNNNNNNNNNCTGATGGCGCGAGGGAGGCGCCTCCCTCGCGCCATCAGATCGTNNNNNNNNNNNNNNNNNNTCGTAGGCACCATCAATCTGAGCGGGCTGGCAAGGCGCCTCCCTCGCGCCATCAGATCGTAGGCACCATCAA
>
> The problem is that some adaptors occur in the middle of the sequences
> because they come from a concatemerization experimental design (the
> miRNAs lie between the NNNNNN... runs). So I am looking for a script or tool that reports
> how many reads have 1 adaptor, how many have 2 (the maximum is 4), relative to the total
> number of reads. Do you know any tool/script that may help? Thanks.
> Can anyone suggest a script for this?
>
> Thank you very much
> --
> View this message in context: http://old.nabble.com/Scripting-help-to-identify-adaptors-count-in-reads-tp32818254p32818254.html
> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l