[Biojava-l] Biojava Parsers : Apply quality values for contig ?
Ashika Umanga Umagiliya
aumanga at biggjapan.com
Wed Feb 25 07:44:24 UTC 2009
Greetings all,
I am using phred/phrap to assemble DNA sequences. For each assembly, phrap
generates a contig file and a contig-quality file.
I now want to parse these two files and generate the final contig by
masking out the bases whose quality value is 0.
For example:
CGACTATG + 0 42 54 59 48 0 0 0 > _GACT___
I want to do this because only this masking gives a contig similar to
the one generated by ChromasPro.
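The masking step itself can be sketched in plain Java, independent of any parser. This is only a minimal illustration, assuming the bases and their quality values have already been read into a String and an int array of equal length; the class name QualityMask and the method mask are hypothetical and are not part of BioJava.

```java
/** Hypothetical helper, not a BioJava class: masks bases that have quality 0. */
final class QualityMask {
    private QualityMask() {}

    /**
     * Returns a copy of {@code bases} in which every position whose
     * quality value is 0 is replaced by '_'. All other bases are kept.
     */
    static String mask(String bases, int[] qualities) {
        if (bases.length() != qualities.length) {
            throw new IllegalArgumentException(
                "Expected one quality value per base, got " + bases.length()
                + " bases and " + qualities.length + " quality values");
        }
        StringBuilder masked = new StringBuilder(bases.length());
        for (int i = 0; i < bases.length(); i++) {
            // Quality 0 marks a base to be masked; everything else passes through.
            masked.append(qualities[i] == 0 ? '_' : bases.charAt(i));
        }
        return masked.toString();
    }
}
```

Calling QualityMask.mask("CGACTATG", new int[] {0, 42, 54, 59, 48, 0, 0, 0}) corresponds to the example above. The length check is there because a contig and its quality record are only meaningful together when they have the same number of entries.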
I can use the FASTA parser to parse the contig file, but I wonder whether
there is any way to parse the quality file in BioJava.
Below I have given the structure of the two file types.
Thanks in advance,
Umanga
contig file:
------------
>seqs_fasta.Contig1
TTGGAGAGTTTGATCCTGGCTCAGATTGAACGCTGGCGGCAGGCCTAACA
CATGCAAGTCGAACGGTAACAGGAAGCAGCTTGCTGCTTTGCTGACGAGT
GGCGGACGGGTGAGTAATGTCTGGGAAACTGCCTGATGGAGGGGGATAAC
TACTGGAAACGGTAGCTAATACCGCATAACGTCGCAAGACCAAAGAGGGG
GACCTTCGGGCCTCTTGCCATCGGATGTGCCCAGATGGGATTAGCTAGTA
GGTGGGGTAACGGCTCACCTAGGCGACGATCCCTAGCTGGTCTGAGAGGA
TGACCAGCCACACTGGAACTGAGACACGGTCCAGACTCCTACGGGAGGCA
GCAGTGGGGAATATTGCACAATGGGCGCAAGCCTGATGCAGCCATGCCGC
GTGTATGAAGAAGGCCTTCGGGTTGTAAAGTACTTTCAGCGGGGAGGAAG
GGAGTAAAGTTAATACCTTTGCTCATTGACGTTACCCGCAGAAGAAGCAC
CGGCTAATTCCGTGCCAGCAGCCGCGGTAATATTNTTATTCTTTATGTAT
ACATATTCTTTTTACTTTATTCTATTAAATTTATTCTTTCATAATTAAAC
CTTCCCTTACACCCATTCCACCTCCCATCCCTCTTCCCCTCCCACTCTCC
ATCTCATATGGCGTTCGCGCCTCTCTCTTCATCTCCTCCTATATTTATTC
TAACTTCTTTCATCTCAATCATTTCTTCTGTCTCATCCTTCCATTCTTTC
CATGATCTCCCCCATTGTCATGTCTTCAAAAAACCACACAAAACACTAGA
ATCTTTTCTTATTACACACAAGTATATACAATTTTTAACAATCCATTAAA
ACACACACAACACCTAGCAATCAACAACGCTACCATCCCCAATATTCTCT
GTTCTCCTCTCTTTCTCCGCGTGCATCTGCGCACTACTCTCTAATTTCAT
CTCTATTATCTTTTTTTCTTAACTCATCCGCATACATCCAAGACTCTAGA
CCCATTTCTCGCCTCTTTCATTTACTGCCGATACAGAGCTTATAAATTCT
ATATCATTTATCCACACTCATTATTAAATAGGCTGACACCTCTAACCGTC
CACTACACCACCTTTCCCATGCCATCTCCCTAACACTGCACTCATCCGTA
ACTTCCTACTCTACCCTCTCTTTCTTTCCTTACTTTCTTTTCTTTCTCTT
ACATTTTTATTTAAAATTCCTCTTTTAGCCTCTATTTTCTGTTATCTACT
TTTCTCCTAAATTCCCCCTATTCTTCACGTCCCATACCTATCCCTACCAC
CACCACTACCACCCCTCTCTTCATTCTACTCGCTCTAAACCCTCCACCCT
CCCCTCCTTGCTCTTATGTATCTCCTCATCTTTTAAT
quality file:
------------
>seqs_fasta.Contig1
0 23 23 33 33 33 33 33 31 41 47 47 47 47 47 47 47 50 47 47 57 59 59 59
42 42 35 42 42 54 59 48 48 48 48 48 48 54 57 57 57 57 57 54 54 57 54 54
54 74
74 74 74 59 57 57 57 57 72 72 84 76 73 72 72 72 79 81 74 74 62 50 50 50
59 39 43 32 35 32 43 58 44 48 70 70 58 73 55 69 67 87 87 90 90 90 90 90
90 90
90 90 90 90 90 90 90 90 89 90 90 90 90 90 90 90 85 87 87 90 90 90 90 90
90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
90 90
90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
90 90 90 90 90 90 90 90 90 90 90 77 77 77 81 81 90 90 90 90 90 90 90 90
90 90
90 90 90 90 90 87 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
90 90
90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
90 90
90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
90 90
90 90 90 90 74 74 85 90 90 90 90 90 90 90 90 90 90 90 90 83 83 90 90 90
90 90 90 90 75 83 83 89 89 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
90 90
90 72 72 72 57 57 43 37 37 43 72 72 72 72 72 72 90 90 90 90 90 90 90 86
90 90 90 90 90 90 90 79 85 83 90 90 90 89 87 87 90 90 90 90 67 67 79 78
90 86
88 82 73 68 65 61 59 63 62 68 71 72 59 56 41 35 30 30 28 32 41 47 40 56
49 42 49 51 50 37 37 39 39 37 52 54 51 46 20 20 27 24 32 24 20 20 21 24
16 19
19 33 29 22 23 12 11 11 12 20 23 40 32 31 28 22 13 13 18 26 28 28 34 28
25 24 28 23 26 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
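A quality file with the layout shown above (a '>' header line followed by whitespace-separated integer scores that may wrap across lines) can be read with a few lines of plain Java. This is a rough sketch, not BioJava's own parser; the class name QualFileParser and the method parse are hypothetical, and it assumes the record name is the first token after '>'.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not a BioJava class: reads FASTA-style quality records. */
final class QualFileParser {
    private QualFileParser() {}

    /** Maps each record name to its quality values, in file order. */
    static Map<String, int[]> parse(Iterable<String> lines) {
        Map<String, List<Integer>> collected = new LinkedHashMap<>();
        List<Integer> current = null;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            if (line.startsWith(">")) {
                // Assumption: the record name is the first token after '>'.
                String name = line.substring(1).trim().split("\\s+")[0];
                current = new ArrayList<>();
                collected.put(name, current);
            } else {
                if (current == null) {
                    throw new IllegalArgumentException(
                        "Quality values found before any '>' header");
                }
                // Scores may wrap over several lines; append them all.
                for (String token : line.split("\\s+")) {
                    current.add(Integer.parseInt(token));
                }
            }
        }
        // Convert the boxed lists to int arrays for convenient indexing.
        Map<String, int[]> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<Integer>> entry : collected.entrySet()) {
            List<Integer> values = entry.getValue();
            int[] scores = new int[values.size()];
            for (int i = 0; i < scores.length; i++) {
                scores[i] = values.get(i);
            }
            result.put(entry.getKey(), scores);
        }
        return result;
    }
}
```

The lines could come from java.nio.file.Files.readAllLines(path), for example. Because the scores are collected per record regardless of line breaks, the uneven wrapping in the file above does not matter, and the resulting array can be compared position by position with the contig sequence of the same name.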
--
Ashika Umanga Umagiliya
International Bioinformatics Research Institute Co., Ltd. (BiGG)
Ando Building 8F, 3-6-9 Kita-Shinagawa, Shinagawa-ku
Tokyo 140-0001, Japan
TEL:03-6679-8763
FAX:03-6679-8764