[BioPython] aligning with clustalw fails

Iddo Friedberg idoerg at burnham.org
Fri Dec 12 12:22:17 EST 2003


Karin,

When I removed the "+" character on the first annotation line, 
everything seemed to work fine. The "+" sign must be confusing Martel in 
some way. Needs to be fixed. Good catch, thanks!!
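
Until that is fixed, a possible workaround is to strip the offending
character from the annotation lines before handing the file to clustalw.
The snippet below is only a rough sketch: sanitize_fasta_headers is a
made-up helper name, not part of Biopython, and it assumes "+" is the only
character that trips the parser, which is all that has been observed here.

```python
def sanitize_fasta_headers(lines, bad_chars="+", replacement=""):
    """Return FASTA lines with bad_chars replaced in header lines only.

    Hypothetical helper: bad_chars defaults to "+" because that is the
    only character reported to cause trouble; extend it if others turn up.
    """
    cleaned = []
    for line in lines:
        # Only annotation (header) lines start with ">"; sequence lines
        # are passed through untouched.
        if line.startswith(">"):
            for ch in bad_chars:
                line = line.replace(ch, replacement)
        cleaned.append(line)
    return cleaned


example = [">+xylR to +malS 3733787 3735125", "CGATGATGAGAATTGTCGGC"]
print(sanitize_fasta_headers(example))
```

Write the cleaned lines to a new file and point MultipleAlignCL at that
file instead of the original.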

Iddo

Karin Lagesen wrote:
> Hi,
> I am trying to align some sequences using the clustalw in biopython.
> For some reason it fails on some files, whereas on others it works
> perfectly. Here is an example of where it fails:
> 
> adenine:11:40> python
> Python 2.2.3 (#1, Oct  8 2003, 10:44:04)
> [GCC 2.96 20000731 (Red Hat Linux 7.3 2.96-112)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> 
> >>> from Bio import Clustalw
> >>> from Bio.Clustalw import MultipleAlignCL
> >>> cline = MultipleAlignCL("/tmp/locatorfiles/11249")
> >>> alignment = Clustalw.do_alignment(cline)
> 
> Traceback (most recent call last):
> File "<stdin>", line 1, in ?
> File
> "/tmp/biopython-1.21/build/lib.linux-i686-2.2/Bio/Clustalw/__init__.py",
> line 121, in do_alignment
> File
> "/tmp/biopython-1.21/build/lib.linux-i686-2.2/Bio/Clustalw/__init__.py",
> line 60, in parse_file
> File "/tmp/biopython-1.21/build/lib.linux-i686-2.2/Martel/Parser.py",
> line 328, in parseFile
> File "/tmp/biopython-1.21/build/lib.linux-i686-2.2/Martel/Parser.py",
> line 361, in parseString
> File "/usr/local/lib/python2.2/xml/sax/handler.py", line 38, in
> fatalError
>     raise exception
> Martel.Parser.ParserIncompleteException: error parsing at or beyond
> character 221 (unparsed text remains)
> 
> 
> When I put the file into clustalw the normal way, everything works
> just fine. The file I am trying to align is attached.
> 
> TIA,
> 
> Karin
> 
> 
> ------------------------------------------------------------------------
> 
>>+xylR to +malS 3733787 3735125
> 
> CGATGATGAGAATTGTCGGCGTCACATCAGGTAACGCTGCGTGGTTGTCGGATGCGGCGT
> GAACGCCTTATCCGACCTACCCGCCAGGCATGATAAAACGCACCAACAACGCTTCAGGCT
> CGTAGCTCAACTGCCTGAGACAAAGTAAAAAGCCTTATCCGACTGACAAGTCGGATAAGG
> CTCAAGGAAATGCAATTACATATGCGCCGCGATTAACCGTTGGTTATCCTGGTACATTGC
> GAACAGGTAGTTGTTATAACTCTTCCCTTTGGTCGAGTAGCCCTTCAGCTTGTGAATCAT
> CGCTGTGGCAGTCACTTCCTGATCCGCTTTACGCAGCTGCGCACGCGATTTACGGAACGA
> AGAGTAAGCCGGGTGCGTATTCAGGTTAGTGACATAGGCGCTCACCGATTCTTTGACAGA
> ACTAAACTGTGAGTACCCTTTCACTTTACCTGGCGCATTGGTACAACGTCCTTTCATCGA
> TTTCATGCCGAACAGGTTGTTGTTGTTGCGCGCCAGCTTCGACGTTCCCCAACCGCTTTC
> TGCTGCAGCCATCGTCGCCACCATACTGGTGGGGATAATGTCTACGCGTTCAAGCAAGGT
> ATTCCACGGGATTTTTCGCGTATTACCGGACCACTTCACCTTGTAGCGTTTGGCGATGTC
> TTTCAGACGCGCACGCTCAGCAGGTGACCATTGGCCCTGATACTGTTTTGAAATGAGCCA
> GTTACGTTCCGCAGTAATGGCCGCATTTTGGCTGGTAATGTAAGGCATTACGGTCCGGAG
> AAACGCCTTTTTCCTTGGTGTTCCGGAAGGGTATTTTCGCAAATCAGGAAGTGAACTACT
> CTTTGCACTATTGCGAGAATACTCTTGTTTACTGCTTACCTGTTTATTACTAGCTTTAGT
> TAAGTGGGACTTTTGACTCGCTGTTGTTGTGTGCGTCTTCGCTAACACCTCACTCGAAAA
> CACCAGAGTGAGTAACATAAGAATCATCGCCCCATATCGTCGTATGGGAGTCAAAATCAT
> CAGGTCTCCTGGTCGGATTTAATCATTCCAACACCTTATATTTTTCACAAATTTGAGAGT
> TGAATCTCAAATCATATCAAAAATAGCTGTCAAGAGCACCCCAAGGAATAGTCCAAATCT
> GAAACTATGTCACGTGTTAACGATTCAGATTGGCGCTAAATCGCAGAAAATGTGGGGGTT
> ATCGCAAAATTCAGCCGTTTTTTGCGCGAGATCGCTCACCCTTGCTTCTCATCCTGTGGA
> CTTACCGCTCAGGGATGAGTTTTGTTTGGCTTATCGCTGGCAAACTGTCTGAAATCGCAG
> CAATAAGGACTCATCCGCC
> 
>>0_-Salmonella_typhi_3998330_3999668
> 
> TTGGCGATAAACGAAATTTCGCAAATGTGCGGCTACCCGTCGCTGCAATATTTCTATTCG
> GTGTTTAAAAAGGAGTACGTCACTACGCCTAAGGAGTATCGCGACCAGCATAGTGAAGCG
> TTGTTGTAGTTTATCCAGCCTACAGATGCCTTGTAAGCCGGATAGCGTAGCGCCATCCGG
> CACACAGAATTACATATGCGCGGCGATCAGACGCTGGTTGTCCTGGTACATCGCGAACAA
> GTAGTTGTTATAGCGCGCCCCCTGGGTAGAGTAACCCTTTAGCTTATGAATCATCGCCGT
> GGCGGTGACTTCCTGATCCGCCTTACGCAGTTGGGCGCGCGACTTGCGGAAAGAAGAATA
> AGCCGGATGCGTGTTCAGGTTGGCGACATACGCGCTCACCGACTCCTCAACCGACGCGAA
> TTGCGAGTAACCTTTCACCTTACCCGGCGTATTGGTACAACGCCCTTTCGTGCACTTCAT
> GCCGAACAGGTTATTGTTGCTACGCGCCAGTTTAGAGGTTCCCCAGCCGCTTTCCGCCGC
> CGCCATCGTCGCCACCATGCTGGTTGGGATAATGTCGACGCGCTCCAGCAACGTATTCCA
> CGGAATACGACGCGTGTTGCCGGACCAGCTCACTTTATAGCGCTTCGCGATATCTTTCAT
> ACGCGCGCGCTCAGACGGCGACCAGCGGTTCTGGTACTGTTTTGAGATCAGCCAATTACG
> GTCCGCGGTAATCGCGGCATTTTGGCTGGTGATGTAAGGCATAACGGTCCGGAGAAACGC
> TTTTTTTCTGGGGGTTCCGGAAGGGTATTTTCGCAAATCAGGAAGTGAACTGCTCTTTGC
> ACTATTGCGAGAATACTCTTGTTTACTGCTAACCTGTTTATAACTTGTCTCAGTTATGTG
> GGACTTTTGACTCGTTTGTGTTGCGCGCGTCTTTGCCAGCACCTCACCCGAAAATGCGAT
> GGTGAGTAACATAAGAATCATTGCCCCATATCGTCGCATGGGAGTCAATATCATTAGGTC
> TCCTGGTCGGATTGATACATTCCAACACCTTTTATTTTTCACGAAGTTGAGGTTTGAACC
> CCAAATTCTAGCAAAAATAGGCTTAAAAAGCACCTCAGGGAATAGTCTTAATCCGAAACT
> ATGTCAACTATTAACGATAACAGACAGCAATAATGCCAATAAAATGCGGCGTTTATCGCA
> AATAGAGGCGTTTTTTTGCGCCTCGTCGCTCACCCTCGCGCCTCCTCCTGCGCGACTCTC
> TGCTGGGGAGGAGTTCATTCGCCTAAAAACCAGGCAAGCTGATGAATATTGCCCACAAAG
> GATAGCGTGATGAAACTTG
> 
>>1_-Salmonella_typhi_4356008_4357346
> 
> TTTATTATCGGCTTCGGCGCTATCATGCTGGTGGGGGCGAATCCCGCCTATAAAGATGCC
> GCAGGCGCGCTGATTGGCGGCAATAACATGGCGGCGGTGCATCTGGCCAACGCGGTAGGC
> GGCAACCTGTTCCTCGGCTTTATTTCGGCAGTGGCGTTTGCCACCATTCTGGCGGTGGTC
> GCAGGTCTGACGCTGGCGGGCGCATCGGCGGTGTCGCATGACTTGTACGCCAACGTGTTC
> CGCAAAGGCGCAACCGAACGTGAAGAGCTGAAGGTGTCGAAAATCACCGTCCTGGTGCTG
> GACGTGATCGCCATTATCCTCGGCGTCCTGTTTGAAAATCAGAACATCGCCTTTATGGTG
> GGCCTGGCATTTGCTATCGCCGCGAGCTGCAACTTCCCCATCATTCTGCTTTCCATGTAC
> TGGTCAAAACTGACCACGCGCGGCGCTATGCTGGGCGGCTGGTTAGGTTTACTGACAGCG
> GTGGTGCTGATGATTCTTGGCCCTACCATTTGGGTGCAGATCCTCGGCCACGAAAAAGCG
> ATCTTCCCGTATGAGTATCCGGCGCTGTTCTCTATCAGCGTGGCGTTCCTGGGGATCTGG
> TTCTTCTCGGCCACCGATAACTCGGCAGAAGGCAACCGTGAACGTGAGCAGTTCCGCGCT
> CAGTTTATCCGCTCCCAAACGGGATTCGGCGTACAACAAGGGCGTGCGCATTAATCTTAC
> CGTTTCCTCCGGCCCTGTGGGTCGGAGGAAAATCAGAACATCACCCTCGCCACCAGCGGG
> GCTGCCAGCACCATCACCACGCCGGACAGCATCATTACCAGACTGGCGACGACGCCCTCC
> TGCTGACCAAGTTCATAGGAACGCGCCGTGCCTGCCCCATGTGACGCCGCGCCGAATCCT
> GCCCCTTTTGCCATGCCTTCCCGGATAGAGAGACGCAAAAATAGCGCGTCGCCGACCGCC
> ATGCCAAACACGCCAGTGACGACCACGAACAGCGCCACCAGATCCGGCTGCCCGCCCAGG
> GGTTCTGCCGCCGCCAGCGCAAACGGCGTAGTAACGGAACGTACCGCCAGACTACGCTGA
> ATCTCATCCGATAACGTGAACAGACGCGCCAGCCAAACGGAACTGGTGACCGCCACCACC
> GTCGCCGTCACTACACCCGCGGTGAGCGACATCCAGTGACGTTTGATAATCGCGAGGTTA
> TCGTACACCGGCACCGCAAAGGCAATGGTCGCCGGGCCGAGCAGCCACAATAGCCAGTGC
> GATTCGCTAATATAGTTTTGCCAGGAGATATGACCGAACACCAGCATTAACACCAGCAGT
> GCCGGCGTCAACACTAATG
> 
>>2_Salmonella_typhi_3302673_3304011
> 
> CACGGTACTGATCCAGATCGCTGCTTTCCAGTTGCTGCTGAACTTTCGCGGCGAATTTTT
> CCAGACGGCGTTTACCCAGCAGTTCTGCGTTCGGCAGTTCCACTTCTGGAATGGTCAGCT
> TCATGGTGCGTTCGATGTTGCGCAGCAGACGACGCTCGCGGTTCTCAACGAACAGCAGCG
> CGCGACCGGCGCGACCCGCACGACCGGTACGACCGATACGGTGAACGTAGGACTCGGAGT
> CCATCGGAATATCGTAGTTCACCACCAGGCTGATACGTTCAACGTCCAGACCACGTGCCG
> CAACATCGGTGGCGATCAGGATGTCCAGACGACCATCTTTCAGGCGTTCCAGAGTCTGCT
> CACGCAGCGCCTGGTTCATATCGCCATTCAACGCGGCGCTGTTGTAGCCGTTACGTTCCA
> GCGCTTCTGCTACTTCCAGGGTCGCGTTTTTGGTACGTACGAAGATAATCGCCGCATCAA
> AGTCTTCCGCTTCCAGGAAACGCACCAGTGCTTCGTTTTTACGCATACCCCAGACAGTCC
> AGTAGCTCTGGCTGATGTCAGGACGGGTAGTCACGCTGGACTGAATGCGCACTTCCTGCG
> GCTCTTTCATAAAGCGGCGGGTAATGCGACGAATCGCTTCCGGCATGGTGGCGGAGAACA
> GAGCGGTTTGATGACCTTCCGGGATCTGCGCCATAATAGTTTCAACGTCTTCGATGAAGC
> CCATACGCAGCATTTCGTCGGCTTCATCCAGCACCAGACCACTCAGTTTAGAGAGGTCGA
> GGGTGCCGCGTTTTAAGTGATCAAGCAGACGTCCCGGCGTACCGACAACGATCTGCGGCC
> CCTGACGCAGGGCGCGTAACTGCACGTCATAACGCTGACCGCCGTACAGGGCAACCACGT
> TTACGCCGCGCATGTGTTTAGAGAAATCCGTCATGGCTTCGGCAACCTGTACCGCCAGTT
> CGCGGGTTGGCGCCAGCACCAGAATCTGAGGTGCCTTCAGCTCAGGATCAAGATTGTTGA
> GCAGCGGTAAAGAGAACGCTGCGGTTTTGCCGCTACCGGTCTGGGCCATGCCCAGCACGT
> CGCGACCGCCCAGCAGATGCGGAATGCACTCTGCCTGGATTGGAGATGGTTTTTCGTAAC
> CCAGATCGGTAAGGGCTTCAAGGATAGGAGCCTTCAGCCCCAGATCTGCAAAAGTGGTTT
> CGAATTCAGCCATGTAGTACGTGTGCCTCAAAATTAATGGCGGCCAGTCTACATAACTCA
> TCATGAAATTGATCTGCAATTTTCATTGAAAAGTGTGAACCGGCTCAAAGTAGGTGTATT
> AACGAACAACAACGCCCTC
> 
>>3_-Salmonella_typhi_767064_768402
> 
> TCTTTCCAATAACTGGCATTCAACTCTTTGATATAATTTTCTCCGGGAAAAAGTTTATCC
> TTTAAAGGCAGTTTTTTGGCTAGCACGATTCTTTCTGCGTTAGATCCCGCGGACTCTGAA
> TGTGTGAGGGAAGATAAGGCGTCCAGTTGAGCGCTTGAGGGTCTGGCTATCAATTCAAAA
> AAAGGTTCTGACAATACAACACTACTAAGTGCTGCATAATTTCCTTTTTGTATGAGCGGT
> TTGACGTACTCAAAAAAAGAAATAATCCCTGATTGCAATCTTGCCGCTTTTTTTAAATAC
> TCAATTTCAGAACCACTGCTATCTTCACGTACAGGAATTATTCCATTATCCGTTTTTTTA
> TAGCCAATTGTTGACCCAGTATTATCAGCCATAGCGAACTCCATTATTTCCACCCCTCCT
> GATAAGAAAAGATCACACTTATCGTGGGGATGGCCATAGTTGGTTAGATAACCAAACATC
> TTATTATAAATAGATCGGTTATCATTGGCGCCAGCAAAAGTTGCCAAATAAAACCCATGT
> ATTTGTTTTTCTGCCCATTGCGCACCTAAAGAACGAGCAAATACTGATTGAATATTACCC
> ATCCATCCTACATCGATAAGTGCTATATTTTTAAAACCTTCACAGGCCGTTTTAAAGTAG
> ATGGAAAGCAGACACTTGTTTAAGTAATAAAGACTTACAAATAGATTCACCCATAATATC
> TTTAGCTTTCCAATGATTAGATATCCCATGATTATACTCTGACCAATCAGCATGGAGAGT
> ATTTATGCCGAGTTTTTTAGCATTCAGAATATCAGCATGAACATTATCGCCAACATGTAT
> CCACGATGCAATATCTACATTTTCATTTTTCTTGACTATTGAAAATAATTTACCACTATT
> TTTAGAGTACCGCTCTTCGCCAGATGAATAAACTGGAATGTTACTGATATCATAGCCACA
> TGATGTTAACAACTCCTTTAATATTGCTGATGGAAGGTACATATCACTAATTAAAATGAC
> TTTGCAACCATCACTAATAGCCTTTTCAAACAAACAGCTTCCACGCGCATTTTTATATAA
> AACAATCTTCTCCATTTGTATTTCCAGATCGATTATCTTTTTTACTGTCGCTGGTGAAAG
> CTGCGGATGCTTTTTTAAAATTTCATCGTATATTTCAGATATAAGTATTTCCGGCTCGCC
> GCCAAAACGCCTGACCCTATTCTCTCTGGCACTTACTTCCGCCTGAACTCTTATCTCTGG
> GAAATTATCAATAATACCAATCTCGTACGCTGATATAAAAAATTTCTCAGTTGCTAAAGT
> TGATTGCATTAATGAAAAC
> 
>>4_-Salmonella_typhi_767064_768402
> 
> TCTTTCCAATAACTGGCATTCAACTCTTTGATATAATTTTCTCCGGGAAAAAGTTTATCC
> TTTAAAGGCAGTTTTTTGGCTAGCACGATTCTTTCTGCGTTAGATCCCGCGGACTCTGAA
> TGTGTGAGGGAAGATAAGGCGTCCAGTTGAGCGCTTGAGGGTCTGGCTATCAATTCAAAA
> AAAGGTTCTGACAATACAACACTACTAAGTGCTGCATAATTTCCTTTTTGTATGAGCGGT
> TTGACGTACTCAAAAAAAGAAATAATCCCTGATTGCAATCTTGCCGCTTTTTTTAAATAC
> TCAATTTCAGAACCACTGCTATCTTCACGTACAGGAATTATTCCATTATCCGTTTTTTTA
> TAGCCAATTGTTGACCCAGTATTATCAGCCATAGCGAACTCCATTATTTCCACCCCTCCT
> GATAAGAAAAGATCACACTTATCGTGGGGATGGCCATAGTTGGTTAGATAACCAAACATC
> TTATTATAAATAGATCGGTTATCATTGGCGCCAGCAAAAGTTGCCAAATAAAACCCATGT
> ATTTGTTTTTCTGCCCATTGCGCACCTAAAGAACGAGCAAATACTGATTGAATATTACCC
> ATCCATCCTACATCGATAAGTGCTATATTTTTAAAACCTTCACAGGCCGTTTTAAAGTAG
> ATGGAAAGCAGACACTTGTTTAAGTAATAAAGACTTACAAATAGATTCACCCATAATATC
> TTTAGCTTTCCAATGATTAGATATCCCATGATTATACTCTGACCAATCAGCATGGAGAGT
> ATTTATGCCGAGTTTTTTAGCATTCAGAATATCAGCATGAACATTATCGCCAACATGTAT
> CCACGATGCAATATCTACATTTTCATTTTTCTTGACTATTGAAAATAATTTACCACTATT
> TTTAGAGTACCGCTCTTCGCCAGATGAATAAACTGGAATGTTACTGATATCATAGCCACA
> TGATGTTAACAACTCCTTTAATATTGCTGATGGAAGGTACATATCACTAATTAAAATGAC
> TTTGCAACCATCACTAATAGCCTTTTCAAACAAACAGCTTCCACGCGCATTTTTATATAA
> AACAATCTTCTCCATTTGTATTTCCAGATCGATTATCTTTTTTACTGTCGCTGGTGAAAG
> CTGCGGATGCTTTTTTAAAATTTCATCGTATATTTCAGATATAAGTATTTCCGGCTCGCC
> GCCAAAACGCCTGACCCTATTCTCTCTGGCACTTACTTCCGCCTGAACTCTTATCTCTGG
> GAAATTATCAATAATACCAATCTCGTACGCTGATATAAAAAATTTCTCAGTTGCTAAAGT
> TGATTGCATTAATGAAAAC
> 
>>5_Salmonella_typhi_2966453_2967791
> 
> AGAGCCTTGCATAGTCCAGAAGTGCGGGTCCGCCAGCGGGATATCGACGGATTGCAGCGA
> CAGCGTATGCCCCATCTGACGCCAGTCGGTGGCGATCATATTGGTGGCCGTCGGTAATCC
> GGTCGCGCGACGGAATTCCGCCATCACTTCACGACCGGAAAAACCCTGCTCCGCGCCGCA
> CGGATCTTCTGCATAGGCCAGAGAACCTTTTAGGTATTTACCAATGCTGATCGCTTCGTT
> CAGCGACCAGGCACCGTTTGGATCGAGCGTGACGCGCGCTTGTGGGAAACGTTTCGCCAG
> CGCCACGATTGACTCGGCCTCTTCTTCGCCCGCCAGCACGCCGCCTTTCAGTTTGAAGTC
> GTTGAAGCCGTATTTTTCATACGCCGCTTCCGCCAGACGCACTACCGTTTCCGGCGTCAT
> CGCCTCTTCATGGCGCAGACGATACCAGTCGCATTGCTCATCCGGCTGGCTTTGATACGG
> CAGCGGCGTAACCTTGCGATTGCCGACAAAGAACAGATAACCCAGCATTTCGACTTCGCT
> GCGCTGCTGACCGTCGCCTAACAGCGAAGCGACGTTGACGCCCAGATGTTGGCCCAAAAG
> GTCAAGCATTGCCGCTTCAATGCCCGTCACCACATGGATAGTGGTACGGAGATCGAACGT
> TTGTAAACCGCGTCCGCCCGCATCGCGATCGGCAAACTGGTTGCGAACGGCGGTCAGGAC
> ATTTTTATATTCATCCAGCGTTTTTCCCACCACCAGTGGAATCGCATCTTCCAGCGTTTT
> GCGAATTTTTTCGCCGCCCGGAATCTCGCCGACCCCAGTATGACCGGAGTTATCTTTAAC
> AATGACGATGTTGCGCGTAAAGAACGGGGCGTGCGCGCCGCTCAGGTTCATCAGCATACT
> GTCATGACCCGCAACCGGGATAACCTGCATTTCAGTCACTACAGGCGTGGTAAATTGAGT
> ACTCATAACTGTGTCCTTATTCAGAATTAGTGACGACCAAAAACAGGGCGTTTGCGGTCA
> AAAGTCCAGCCGGGGATCAGGTATTGCATCGGGCCTGCATCATTACGCGCGCCGCCTGGC
> AGCTTTTTATACGCGTCATGCGCTTTTCGCACCTGTTCCCAGTCAAGCTCTACGCCCAGT
> CCTGGCGCATCCGGAACGGCAATTTTGCCGTTTTTAATTTCCAGCGGATTTTTAGTCAGG
> CGGCAATCGCCCTCCTGCCAGATCCAGTGCGTATCAATAGCGGTGGGTTTGCCTGGCGCC
> GCCGCGCCGACATGGGTAAACATCGCCAGTGAAATATCAAAATGGTTATTCGAATGGCAG
> CCCCAGGTTAGCCCCCAGT
> 
>>6_Salmonella_typhi_3089870_3091208
> 
> TTCAATCGCTAAGGTGGCGATATCAGGCACAGCATCGACGCTTGCTGTACCGGAAGTCAC
> GATGTGCGGGCCTTCCGGCAATTCGCTTGCCTGCGCCGACATTGCACTTAAACCCACTAA
> TGCCGCCAGGGCCATCACTTTGAACTTCACAGTCTCTCCTCCATATTGCAGTCATTGCCC
> TGATATACAGGGCGCATGCAGGCAAGCTTAGCGGATATCGGTCTGTTGTCCATCAGACAA
> CGCTTAGTTGAAAAGCGCGTGCATATGCGCGACGCCTTCCCGCGCCAGTTGGAAGGCGAT
> AAGCCACATCACCACCCCTACCAGAATGTTGATAATGCGTTGAGCTTTCGCCGTACGCAG
> TCGAGGCGCAAGCCAGGCCGCCAGTAGCGCTAACCCAAAAAACCATAAGAACGACGCGCT
> AATCGTGCCGAGAGCAAACCAGCGCTTTGGCTCCATAGCCAGTTGTCCGCCGAGACTGCC
> CAACACGACAAACGTATCCAGATAAACGTGCGGATTCAGCCAGGTTACCGCCAGCATCGT
> AGCAATAATTTTCCAGCGCCCCTGCTTCATAACCTCGGCGCTGGCCAGCTCCAGATTACT
> GCTCATTGCCGTTTTCAGCGCGCCAAAACCGTACCATAACAAGAACGCAACCCCGCCCCA
> TGTGACCAGCGCCAGCAGCCACGGCGACTGCATCAGCAACGCGCTACCGCCAAAAATACC
> GGCGCTAATCAGGACTAAATCACTTAACGCGCAAAGCAGCGCTATCATGAGGTGGTACTG
> GCGACGAATTCCCTGATTCATCACAAACGCATTTTGCGGGCCAAGCGGAAGGATCATGGC
> GGCGCCAAGGGCAACCCCTTGAAAATAATAAGATATCACGTTAACTACCCTGAGCTGTTT
> TTCTTAAAGGCAGACTATAGCGCGGGAATATTATTAGCGGAAATTGATAATTTTAATCAC
> TAATAAGAAAAGCTAATAAAGAGACTGAATAACGGATGGCGGCTACGCTTATCCGCTCAA
> TAACCAGAGAATACTGGAGGCGGATAAGCGCAGCGCCGGCAGGCAGCACGGGAGGAAAAT
> TACTCCGCGGCGTTATCTTTCACGCGTTTAAAATTGACGTCCATCTGCGGGTACGGGAAG
> CTAATACCAGCGGCGTCGAATTCACGTTTAATACGTTCCAGCACGTCCCAATAGACATTT
> TGCAGGTCGCTGCTTTTACTCCAGACACGCACCACAAAATTAATTGATGAGGCGCCCAGC
> TCATTCAAACGCACCGTCATTTCGCGATCTTTTAAAATACGATCGTCGGACTCGATAATC
> GTCGTTAGCAGCTGTTTCA
> 
>>0_-Salmonella_typhi_Ty2_3983042_3984380
> 
> ACGAAATTTCGCAAATGTGCGGCTACCCGTCGCTGCAATATTTCTATTCGGTGTTTAAAA
> AGGAGTACGTCACTACGCCTAAGGAGTATCGCGACCAGCATAGTGAAGCGTTGTTGTAGT
> TTATCCAGCCTACAGATGCCTTGTAAGCCGGATAGCGTAGCGCCATCCGGCACACAGAAT
> TACATATGCGCGGCGATCAGACGCTGGTTGTCCTGGTACATCGCGAACAAGTAGTTGTTA
> TAGCGCGCCCCCTGGGTAGAGTAACCCTTTAGCTTATGAATCATCGCCGTGGCGGTGACT
> TCCTGATCCGCCTTACGCAGTTGGGCGCGCGACTTGCGGAAAGAAGAATAAGCCGGATGC
> GTGTTCAGGTTGGCGACATACGCGCTCACCGACTCCTCAACCGACGCGAATTGCGAGTAA
> CCTTTCACCTTACCCGGCGTATTGGTACAACGCCCTTTCGTGCACTTCATGCCGAACAGG
> TTATTGTTGCTACGCGCCAGTTTAGAGGTTCCCCAGCCGCTTTCCGCCGCCGCCATCGTC
> GCCACCATGCTGGTTGGGATAATGTCGACGCGCTCCAGCAACGTATTCCACGGAATACGA
> CGCGTGTTGCCGGACCAGCTCACTTTATAGCGCTTCGCGATATCTTTCATACGCGCGCGC
> TCAGACGGCGACCAGCGGTTCTGGTACTGTTTTGAGATCAGCCAATTACGGTCCGCGGTA
> ATCGCGGCATTTTGGCTGGTGATGTAAGGCATAACGGTCCGGAGAAACGCTTTTTTTCTG
> GGGGTTCCGGAAGGGTATTTTCGCAAATCAGGAAGTGAACTGCTCTTTGCACTATTGCGA
> GAATACTCTTGTTTACTGCTAACCTGTTTATAACTTGTCTCAGTTATGTGGGACTTTTGA
> CTCGTTTGTGTTGCGCGCGTCTTTGCCAGCACCTCACCCGAAAATGCGATGGTGAGTAAC
> ATAAGAATCATTGCCCCATATCGTCGCATGGGAGTCAATATCATTAGGTCTCCTGGTCGG
> ATTGATACATTCCAACACCTTTTATTTTTCACGAAGTTGAGGTTTGAACCCCAAATTCTA
> GCAAAAATAGGCTTAAAAAGCACCTCAGGGAATAGTCTTAATCCGAAACTATGTCAACTA
> TTAACGATAACAGACAGCAATAATGCCAATAAAATGCGGCGTTTATCGCAAATAGAGGCG
> TTTTTTTGCGCCTCGTCGCTCACCCTCGCGCCTCCTCCTGCGCGACTCTCTGCTGGGGAG
> GAGTTCATTCGCCTAAAAACCAGGCAAGCTGATGAATATTGCCCACAAAGGATAGCGTGA
> TGAAACTTGCCGCCTTCGC
> 
>>1_-Salmonella_typhi_Ty2_4340648_4341986
> 
> GCTTCGGCGCTATCATGCTGGTGGGGGCGAATCCCGCCTATAAAGATGCCGCAGGCGCGC
> TGATTGGCGGCAATAACATGGCGGCGGTGCATCTGGCCAACGCGGTAGGCGGCAACCTGT
> TCCTCGGCTTTATTTCGGCAGTGGCGTTTGCCACCATTCTGGCGGTGGTCGCAGGTCTGA
> CGCTGGCGGGCGCATCGGCGGTGTCGCATGACTTGTACGCCAACGTGTTCCGCAAAGGCG
> CAACCGAACGTGAAGAGCTGAAGGTGTCGAAAATCACCGTCCTGGTGCTGGACGTGATCG
> CCATTATCCTCGGCGTCCTGTTTGAAAATCAGAACATCGCCTTTATGGTGGGCCTGGCAT
> TTGCTATCGCCGCGAGCTGCAACTTCCCCATCATTCTGCTTTCCATGTACTGGTCAAAAC
> TGACCACGCGCGGCGCTATGCTGGGCGGCTGGTTAGGTTTACTGACAGCGGTGGTGCTGA
> TGATTCTTGGCCCTACCATTTGGGTGCAGATCCTCGGCCACGAAAAAGCGATCTTCCCGT
> ATGAGTATCCGGCGCTGTTCTCTATCAGCGTGGCGTTCCTGGGGATCTGGTTCTTCTCGG
> CCACCGATAACTCGGCAGAAGGCAACCGTGAACGTGAGCAGTTCCGCGCTCAGTTTATCC
> GCTCCCAAACGGGATTCGGCGTACAACAAGGGCGTGCGCATTAATCTTACCGTTTCCTCC
> GGCCCTGTGGGTCGGAGGAAAATCAGAACATCACCCTCGCCACCAGCGGGGCTGCCAGCA
> CCATCACCACGCCGGACAGCATCATTACCAGACTGGCGACGACGCCCTCCTGCTGACCAA
> GTTCATAGGAACGCGCCGTGCCTGCCCCATGTGACGCCGCGCCGAATCCTGCCCCTTTTG
> CCATGCCTTCCCGGATAGAGAGACGCAAAAATAGCGCGTCGCCGACCGCCATGCCAAACA
> CGCCAGTGACGACCACGAACAGCGCCACCAGATCCGGCTGCCCGCCCAGGGGTTCTGCCG
> CCGCCAGCGCAAACGGCGTAGTAACGGAACGTACCGCCAGACTACGCTGAATCTCATCCG
> ATAACGTGAACAGACGCGCCAGCCAAACGGAACTGGTGACCGCCACCACCGTCGCCGTCA
> CTACACCCGCGGTGAGCGACATCCAGTGACGTTTGATAATCGCGAGGTTATCGTACACCG
> GCACCGCAAAGGCAATGGTCGCCGGGCCGAGCAGCCACAATAGCCAGTGCGATTCGCTAA
> TATAGTTTTGCCAGGAGATATGACCGAACACCAGCATTAACACCAGCAGTGCCGGCGTCA
> ACACTAATGGCATCAACGG
> 
>>2_Salmonella_typhi_Ty2_3288159_3289497
> 
> CCAGCAGCGCACGGTACTGATCCAGATCGCTGCTTTCCAGTTGCTGCTGAACTTTCGCGG
> CGAATTTTTCCAGACGGCGTTTACCCAGCAGTTCTGCGTTCGGCAGTTCCACTTCTGGAA
> TGGTCAGCTTCATGGTGCGTTCGATGTTGCGCAGCAGACGACGCTCGCGGTTCTCAACGA
> ACAGCAGCGCGCGACCGGCGCGACCCGCACGACCGGTACGACCGATACGGTGAACGTAGG
> ACTCGGAGTCCATCGGAATATCGTAGTTCACCACCAGGCTGATACGTTCAACGTCCAGAC
> CACGTGCCGCAACATCGGTGGCGATCAGGATGTCCAGACGACCATCTTTCAGGCGTTCCA
> GAGTCTGCTCACGCAGCGCCTGGTTCATATCGCCATTCAACGCGGCGCTGTTGTAGCCGT
> TACGTTCCAGCGCTTCTGCTACTTCCAGGGTCGCGTTTTTGGTACGTACGAAGATAATCG
> CCGCATCAAAGTCTTCCGCTTCCAGGAAACGCACCAGTGCTTCGTTTTTACGCATACCCC
> AGACAGTCCAGTAGCTCTGGCTGATGTCAGGACGGGTAGTCACGCTGGACTGAATGCGCA
> CTTCCTGCGGCTCTTTCATAAAGCGGCGGGTAATGCGACGAATCGCTTCCGGCATGGTGG
> CGGAGAACAGAGCGGTTTGATGACCTTCCGGGATCTGCGCCATAATAGTTTCAACGTCTT
> CGATGAAGCCCATACGCAGCATTTCGTCGGCTTCATCCAGCACCAGACCACTCAGTTTAG
> AGAGGTCGAGGGTGCCGCGTTTTAAGTGATCAAGCAGACGTCCCGGCGTACCGACAACGA
> TCTGCGGCCCCTGACGCAGGGCGCGTAACTGCACGTCATAACGCTGACCGCCGTACAGGG
> CAACCACGTTTACGCCGCGCATGTGTTTAGAGAAATCCGTCATGGCTTCGGCAACCTGTA
> CCGCCAGTTCGCGGGTTGGCGCCAGCACCAGAATCTGAGGTGCCTTCAGCTCAGGATCAA
> GATTGTTGAGCAGCGGTAAAGAGAACGCTGCGGTTTTGCCGCTACCGGTCTGGGCCATGC
> CCAGCACGTCGCGACCGCCCAGCAGATGCGGAATGCACTCTGCCTGGATTGGAGATGGTT
> TTTCGTAACCCAGATCGGTAAGGGCTTCAAGGATAGGAGCCTTCAGCCCCAGATCTGCAA
> AAGTGGTTTCGAATTCAGCCATGTAGTACGTGTGCCTCAAAATTAATGGCGGCCAGTCTA
> CATAACTCATCATGAAATTGATCTGCAATTTTCATTGAAAAGTGTGAACCGGCTCAAAGT
> AGGTGTATTAACGAACAAC
> 
>>3_Salmonella_typhi_Ty2_2186964_2188302
> 
> GGCGGCGTAACGTCTTATCTGGCCTACGTGAGCGGTGCCGTCTGTAGGCCTGATAAGCGC
> AGCGCATCAGGCAAGACCCGAACCTGCGGCAGGTTCGAATCTTCCATATCGCAGATAGCA
> AAAAAGCGCCTTTAGGGCGCTTTTTTACATTGGTGGGTCGTGCAGGATTCGAACCTGCGA
> CCAATTGATTAAAAGTCAACTGCTCTACCAACTGAGCTAACGACCCCTTGGATAAGGGTT
> ACTGCTTCAATCATTCAGGATGGTGGGTCGTGCAGGATGACTCGGCTTCGCCTCGCCCTT
> CGGGCCGTTGCTAAAGCAACATTATCCTTCACGTTCTCTACCAGTTACCACCTTGTATAT
> TGGTGGGTCGTGCAGGATTCGAACCTGCGACCAATTGATTAAAAGTCAACTGCTCTACCA
> ACTGAGCTAACGACCCATACGGGTGCTGCCTGAAAGATTTTACTCGGACCATCTAAAGAT
> GGTGGGTCGTGCAGGATGACTCGGCTTCGCCTCGCCCTACGGGCCGTTGCTGACGCAACG
> TTATCCTTCACGTTCAACATCTGAGTTTGATGTTAAATTAGTGGGTCGTGCAGGATTCGA
> ACCTGCGACCAATTGATTAAAAGTCAACTGCTCTACCAACTGAGCTAACGACCCACTTTT
> ACGTTGCTTTCGAGTTGTTTAATATCCCGTGGCAACGGCGGCATATATTACTGATTTCAG
> ATTTGAGCGCAACAAAAATTTCGACGCAGAGGGCTCAACTGCTTAGGAATCGCACGACGC
> GACCAGAAAAAAGGCGTTTTCTGGTCGCATGGTACGCATTACATCGCGTTAAGACGCTTC
> TGCGCCTGTTTCGCGCCATCAGTGCCAGGATATTTGTTAATCACCTGCTGATAAACCGCT
> TTCGCTTTTGCCGTATCACCTTTGTCCTGCATGATAACGCCAACTTTGTACATCGCGTCC
> GCAGCCTTCGGCGACTTAGGATAGTTTTTTACTACCGAGGCGAAATAATAGGCGGCGTCA
> TCTTTTTTACCCTTGTTGTAATTCAACTGGCCCAGCCAATAATTGGCGTTCGGCTGATAA
> GTAGAATCAGGGTATTTCTTGATGAAGTTCTGAAACGCCACAATCGCATCATCCTGGCGA
> GACTTATCCTGCACCAGCGCAATTGCCGCATTGTAATCGGTATTCGCATCGCCACTTTGT
> ACCGGCGCCCCTGAGGTTGCCGTACCGGCATCCGGAGCGGGGGTCGCAGCGGTTGCCGCC
> CCGCTCTGGTCGCCAGCTGCTGGCTGCGCTGCGCCGCCATTATTTAAACTCCCCAGCTGC
> AGCATAATTTGCTTCTGGC
> 
>>4_Salmonella_typhi_Ty2_2186964_2188302
> 
> GGCGGCGTAACGTCTTATCTGGCCTACGTGAGCGGTGCCGTCTGTAGGCCTGATAAGCGC
> AGCGCATCAGGCAAGACCCGAACCTGCGGCAGGTTCGAATCTTCCATATCGCAGATAGCA
> AAAAAGCGCCTTTAGGGCGCTTTTTTACATTGGTGGGTCGTGCAGGATTCGAACCTGCGA
> CCAATTGATTAAAAGTCAACTGCTCTACCAACTGAGCTAACGACCCCTTGGATAAGGGTT
> ACTGCTTCAATCATTCAGGATGGTGGGTCGTGCAGGATGACTCGGCTTCGCCTCGCCCTT
> CGGGCCGTTGCTAAAGCAACATTATCCTTCACGTTCTCTACCAGTTACCACCTTGTATAT
> TGGTGGGTCGTGCAGGATTCGAACCTGCGACCAATTGATTAAAAGTCAACTGCTCTACCA
> ACTGAGCTAACGACCCATACGGGTGCTGCCTGAAAGATTTTACTCGGACCATCTAAAGAT
> GGTGGGTCGTGCAGGATGACTCGGCTTCGCCTCGCCCTACGGGCCGTTGCTGACGCAACG
> TTATCCTTCACGTTCAACATCTGAGTTTGATGTTAAATTAGTGGGTCGTGCAGGATTCGA
> ACCTGCGACCAATTGATTAAAAGTCAACTGCTCTACCAACTGAGCTAACGACCCACTTTT
> ACGTTGCTTTCGAGTTGTTTAATATCCCGTGGCAACGGCGGCATATATTACTGATTTCAG
> ATTTGAGCGCAACAAAAATTTCGACGCAGAGGGCTCAACTGCTTAGGAATCGCACGACGC
> GACCAGAAAAAAGGCGTTTTCTGGTCGCATGGTACGCATTACATCGCGTTAAGACGCTTC
> TGCGCCTGTTTCGCGCCATCAGTGCCAGGATATTTGTTAATCACCTGCTGATAAACCGCT
> TTCGCTTTTGCCGTATCACCTTTGTCCTGCATGATAACGCCAACTTTGTACATCGCGTCC
> GCAGCCTTCGGCGACTTAGGATAGTTTTTTACTACCGAGGCGAAATAATAGGCGGCGTCA
> TCTTTTTTACCCTTGTTGTAATTCAACTGGCCCAGCCAATAATTGGCGTTCGGCTGATAA
> GTAGAATCAGGGTATTTCTTGATGAAGTTCTGAAACGCCACAATCGCATCATCCTGGCGA
> GACTTATCCTGCACCAGCGCAATTGCCGCATTGTAATCGGTATTCGCATCGCCACTTTGT
> ACCGGCGCCCCTGAGGTTGCCGTACCGGCATCCGGAGCGGGGGTCGCAGCGGTTGCCGCC
> CCGCTCTGGTCGCCAGCTGCTGGCTGCGCTGCGCCGCCATTATTTAAACTCCCCAGCTGC
> AGCATAATTTGCTTCTGGC
> 
>>5_Salmonella_typhi_Ty2_2952339_2953677
> 
> CACGCGTACAGAGCCTTGCATAGTCCAGAAGTGCGGGTCCGCCAGCGGGATATCGACGGA
> TTGCAGCGACAGCGTATGCCCCATCTGACGCCAGTCGGTGGCGATCATATTGGTGGCCGT
> CGGTAATCCGGTCGCGCGACGGAATTCCGCCATCACTTCACGACCGGAAAAACCCTGCTC
> CGCGCCGCACGGATCTTCTGCATAGGCCAGAGAACCTTTTAGGTATTTACCAATGCTGAT
> CGCTTCGTTCAGCGACCAGGCACCGTTTGGATCGAGCGTGACGCGCGCTTGTGGGAAACG
> TTTCGCCAGCGCCACGATTGACTCGGCCTCTTCTTCGCCCGCCAGCACGCCGCCTTTCAG
> TTTGAAGTCGTTGAAGCCGTATTTTTCATACGCCGCTTCCGCCAGACGCACTACCGTTTC
> CGGCGTCATCGCCTCTTCATGGCGCAGACGATACCAGTCGCATTGCTCATCCGGCTGGCT
> TTGATACGGCAGCGGCGTAACCTTGCGATTGCCGACAAAGAACAGATAACCCAGCATTTC
> GACTTCGCTGCGCTGCTGACCGTCGCCTAACAGCGAAGCGACGTTGACGCCCAGATGTTG
> GCCCAAAAGGTCAAGCATTGCCGCTTCAATGCCCGTCACCACATGGATAGTGGTACGGAG
> ATCGAACGTTTGTAAACCGCGTCCGCCCGCATCGCGATCGGCAAACTGGTTGCGAACGGC
> GGTCAGGACATTTTTATATTCATCCAGCGTTTTTCCCACCACCAGTGGAATCGCATCTTC
> CAGCGTTTTGCGAATTTTTTCGCCGCCCGGAATCTCGCCGACCCCAGTATGACCGGAGTT
> ATCTTTAACAATGACGATGTTGCGCGTAAAGAACGGGGCGTGCGCGCCGCTCAGGTTCAT
> CAGCATACTGTCATGACCCGCAACCGGGATAACCTGCATTTCAGTCACTACAGGCGTGGT
> AAATTGAGTACTCATAACTGTGTCCTTATTCAGAATTAGTGACGACCAAAAACAGGGCGT
> TTGCGGTCAAAAGTCCAGCCGGGGATCAGGTATTGCATCGGGCCTGCATCATTACGCGCG
> CCGCCTGGCAGCTTTTTATACGCGTCATGCGCTTTTCGCACCTGTTCCCAGTCAAGCTCT
> ACGCCCAGTCCTGGCGCATCCGGAACGGCAATTTTGCCGTTTTTAATTTCCAGCGGATTT
> TTAGTCAGGCGGCAATCGCCCTCCTGCCAGATCCAGTGCGTATCAATAGCGGTGGGTTTG
> CCTGGCGCCGCCGCGCCGACATGGGTAAACATCGCCAGTGAAATATCAAAATGGTTATTC
> GAATGGCAGCCCCAGGTTA
> 
>>6_Salmonella_typhi_Ty2_3075356_3076694
> 
> GACGTTCACTTCAATCGCTAAGGTGGCGATATCAGGCACAGCATCGACGCTTGCTGTACC
> GGAAGTCACGATGTGCGGGCCTTCCGGCAATTCGCTTGCCTGCGCCGACATTGCACTTAA
> ACCCACTAATGCCGCCAGGGCCATCACTTTGAACTTCACAGTCTCTCCTCCATATTGCAG
> TCATTGCCCTGATATACAGGGCGCATGCAGGCAAGCTTAGCGGATATCGGTCTGTTGTCC
> ATCAGACAACGCTTAGTTGAAAAGCGCGTGCATATGCGCGACGCCTTCCCGCGCCAGTTG
> GAAGGCGATAAGCCACATCACCACCCCTACCAGAATGTTGATAATGCGTTGAGCTTTCGC
> CGTACGCAGTCGAGGCGCAAGCCAGGCCGCCAGTAGCGCTAACCCAAAAAACCATAAGAA
> CGACGCGCTAATCGTGCCGAGAGCAAACCAGCGCTTTGGCTCCATAGCCAGTTGTCCGCC
> GAGACTGCCCAACACGACAAACGTATCCAGATAAACGTGCGGATTCAGCCAGGTTACCGC
> CAGCATCGTAGCAATAATTTTCCAGCGCCCCTGCTTCATAACCTCGGCGCTGGCCAGCTC
> CAGATTACTGCTCATTGCCGTTTTCAGCGCGCCAAAACCGTACCATAACAAGAACGCAAC
> CCCGCCCCATGTGACCAGCGCCAGCAGCCACGGCGACTGCATCAGCAACGCGCTACCGCC
> AAAAATACCGGCGCTAATCAGGACTAAATCACTTAACGCGCAAAGCAGCGCTATCATGAG
> GTGGTACTGGCGACGAATTCCCTGATTCATCACAAACGCATTTTGCGGGCCAAGCGGAAG
> GATCATGGCGGCGCCAAGGGCAACCCCTTGAAAATAATAAGATATCACGTTAACTACCCT
> GAGCTGTTTTTCTTAAAGGCAGACTATAGCGCGGGAATATTATTAGCGGAAATTGATAAT
> TTTAATCACTAATAAGAAAAGCTAATAAAGAGACTGAATAACGGATGGCGGCTACGCTTA
> TCCGCTCAATAACCAGAGAATACTGGAGGCGGATAAGCGCAGCGCCGGCAGGCAGCACGG
> GAGGAAAATTACTCCGCGGCGTTATCTTTCACGCGTTTAAAATTGACGTCCATCTGCGGG
> TACGGGAAGCTAATACCAGCGGCGTCGAATTCACGTTTAATACGTTCCAGCACGTCCCAA
> TAGACATTTTGCAGGTCGCTGCTTTTACTCCAGACACGCACCACAAAATTAATTGATGAG
> GCGCCCAGCTCATTCAAACGCACCGTCATTTCGCGATCTTTTAAAATACGATCGTCGGAC
> TCGATAATCGTCGTTAGCA
> 
> 
> ------------------------------------------------------------------------
> 

-- 
Iddo Friedberg, Ph.D.
The Burnham Institute
10901 N. Torrey Pines Rd.
La Jolla, CA 92037
USA
Tel: +1 (858) 646 3100 x3516
Fax: +1 (858) 646 3171
http://ffas.ljcrf.edu/~iddo


