[Biopython-dev] [Bug 2382] Generic FASTA parser
bugzilla-daemon at portal.open-bio.org
Tue Oct 16 21:58:38 UTC 2007
http://bugzilla.open-bio.org/show_bug.cgi?id=2382
------- Comment #5 from jflatow at northwestern.edu 2007-10-16 17:58 EST -------
Nope, they actually have a file format that looks like this:
Position Consensus Quality Score Depth Signal StdDeviation
>contig00001 1
1 G 64 2 1.00 0.00
2 A 64 2 1.00 0.00
3 G 64 2 1.00 0.00
4 A 64 2 1.00 0.00
5 G 64 2 2.00 0.00
6 G 64 2 2.00 0.00
7 A 64 2 3.00 0.00
8 A 64 2 3.00 0.00
9 A 64 2 3.00 0.00
10 C 64 2 2.00 0.00
11 C 64 2 2.00 0.00
12 T 64 2 1.00 0.00
13 C 64 2 3.00 0.00
14 C 64 2 3.00 0.00
15 C 64 2 3.00 0.00
16 G 64 2 1.00 0.00
17 T 64 2 1.00 0.00
18 G 64 2 1.00 0.00
19 A 64 2 1.00 0.00
20 T 64 2 1.00 0.00
21 C 64 2 2.00 0.00
22 C 64 2 2.00 0.00
Note the file-wide header at the top of the file. A generic FASTA-like parser
could skip ahead to the first '>', or we could strip the header beforehand, but
it would be nice if the parser were smart enough to handle it itself.
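For illustration, here is a minimal Python sketch of the "skip to the first '>'" behaviour. The names fasta_like_records and parse_quality_rows are hypothetical and not part of Biopython, and the row parser assumes the six whitespace-separated columns seen in the sample above.

```python
from typing import Iterable, Iterator, List, Tuple


def fasta_like_records(lines: Iterable[str]) -> Iterator[Tuple[str, List[str]]]:
    """Yield (title, body_lines) for each '>' record.

    Anything before the first '>' (such as a file-wide column header)
    is treated as preamble and skipped.
    """
    title = None
    body: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(">"):
            if title is not None:
                yield title, body
            title = line[1:].strip()
            body = []
        elif title is not None and line.strip():
            body.append(line)
        # Lines seen while title is None are preamble and are ignored.
    if title is not None:
        yield title, body


def parse_quality_rows(body: Iterable[str]) -> List[Tuple[int, str, int, int, float, float]]:
    """Parse rows of: position, consensus, quality, depth, signal, std deviation.

    Assumption: columns are whitespace-separated, as in the sample above.
    """
    rows = []
    for line in body:
        pos, base, qual, depth, signal, stddev = line.split()
        rows.append((int(pos), base, int(qual), int(depth), float(signal), float(stddev)))
    return rows
```

The record splitter knows nothing about the body layout, so the same loop could in principle serve other '>'-delimited formats with a different body parser.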
Also, here is another sample FASTA-like file format they use for pairwise
alignments:
>ERSGEES01EM5WC, 2..30 of 95 and ERSGEES01C1ZV2, 1..29 of 268 (29/29 ident)
2 CGGTGACCCGGGAGATCTGAATTCCTGGT 30
1 CGGTGACCCGGGAGATCTGAATTCCTGGT 29
>ERSGEES01EM5WC, 2..29 of 95 and ERSGEES01DMS5T, 1..28 of 259 (28/28 ident)
2 CGGTGACCCGGGAGATCTGAATTCCTGG 29
1 CGGTGACCCGGGAGATCTGAATTCCTGG 28
>ERSGEES01EM5WC, 29..2 of 95 and ERSGEES01D8GDV, 205..232 of 232 (28/28 ident)
29 CCAGGAATTCAGATCTCCCGGGTCACCG 2
205 CCAGGAATTCAGATCTCCCGGGTCACCG 232
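As a rough sketch of how one of these pair records might be pulled apart in Python: the title pattern below is inferred only from the three sample records above, and parse_pair_alignment is a hypothetical helper, not an existing Biopython function. It assumes each record has exactly two body lines of the form "start sequence end".

```python
import re
from typing import Dict, List

# Pattern inferred from the sample titles only; real files may vary.
TITLE_RE = re.compile(
    r"(?P<id1>[^,\s]+), (?P<start1>\d+)\.\.(?P<end1>\d+) of (?P<len1>\d+) and "
    r"(?P<id2>[^,\s]+), (?P<start2>\d+)\.\.(?P<end2>\d+) of (?P<len2>\d+) "
    r"\((?P<matches>\d+)/(?P<aligned>\d+) ident\)"
)


def parse_pair_alignment(title: str, body: List[str]) -> Dict[str, object]:
    """Parse one pair-alignment record into a dictionary."""
    match = TITLE_RE.fullmatch(title.lstrip(">").strip())
    if match is None:
        raise ValueError("unrecognised title line: %r" % title)
    if len(body) != 2:
        raise ValueError("expected two aligned lines, got %d" % len(body))

    record: Dict[str, object] = {}
    for key, value in match.groupdict().items():
        # The identifiers stay as strings; every other field is numeric.
        record[key] = value if key in ("id1", "id2") else int(value)

    rows = []
    for line in body:
        start, seq, end = line.split()
        rows.append((int(start), seq, int(end)))
    record["rows"] = rows
    return record
```

Coordinates are kept exactly as written, so a reverse-strand hit such as "29..2" comes back with start1 greater than end1 rather than being normalised.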