From srini_iyyer_bio at yahoo.com Sat Apr 1 13:13:16 2006
From: srini_iyyer_bio at yahoo.com (Srinivas Iyyer)
Date: Sat, 1 Apr 2006 10:13:16 -0800 (PST)
Subject: [BioPython] How can I retreive FASTA sequences from NCBI
In-Reply-To: <442BFFAD.10103@maubp.freeserve.co.uk>
Message-ID: <20060401181316.62812.qmail@web38113.mail.mud.yahoo.com>

Hi,

I have 151,204 GenBank accession IDs. I want to retrieve the FASTA
sequences from NCBI and compile them for my local BLAST.

I am unable to get the FASTA sequences, and I do not understand why.

Could anyone please help me?

My code:

>>> mylis
['AA035383', 'AA971406', 'N98563']
parser = Fasta.RecordParser()
iterator = Fasta.Iterator(mylis,parser)
rec = iterator.next()
rec = iterator.next()
>>> rec
>>>

rec is empty :-(

The accession IDs are not GIs; they are GenBank accession IDs. I do not
want the sequences in GenBank (long) format. I want them in FASTA
sequence format.

Could anyone please help me?

Thanks,
Srini

From biopython at maubp.freeserve.co.uk Sat Apr 1 14:59:46 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython))
Date: Sat, 01 Apr 2006 20:59:46 +0100
Subject: [BioPython] How can I retreive FASTA sequences from NCBI
In-Reply-To: <20060401181316.62812.qmail@web38113.mail.mud.yahoo.com>
References: <20060401181316.62812.qmail@web38113.mail.mud.yahoo.com>
Message-ID: <442EDBB2.3040105@maubp.freeserve.co.uk>

Srinivas Iyyer wrote:
> Hi ,
> I have 151,204 GenBank Accession IDs.
> I want to retreive FASTA sequences from NCBI and
> compile them for my local blast.
>
> I am unable to get fasta sequences. I do not
> understand.
>
> Could any one please help me.

This should help.

Using the first identifier in your example, AA035383, this is a
nucleotide sequence, available from the NCBI.
By searching the Entrez database you end up here:

http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=1507107

Note, AA035383 --> gi:1507107

Using the web interface, you can choose to view it as FASTA format
rather than the default of GenBank format, and save to file.

You could make a note of that URL, and just change the GI number to
download all the files you want - but you need a simple way to determine
the GI number...

Now, BioPython can help you here:

>>> from Bio import GenBank
>>> gi_list = GenBank.search_for('AA035383', database='nucleotide')
>>> print gi_list
['1507107']

You could use this code to get the GI numbers for each of your 151,204
GenBank Accession IDs. I would check in each case that only one GI
number is returned.

>>> assert len(gi_list)==1
>>> gi_number = gi_list[0]

Once you have the GI number, then you could just download the FASTA file
yourself and then parse it in the normal way. Or, get BioPython to do
all this for you with its rather clever NCBIDictionary object...

>>> from Bio import Fasta
>>> from Bio import GenBank
>>> ncbi_dict = GenBank.NCBIDictionary('nucleotide', 'fasta', \
...                                    parser = Fasta.RecordParser())
>>> gi_number = '1507107'
>>> fasta_rec = ncbi_dict[gi_number]
>>> print fasta_rec
>gi|1507107|gb|AA035383.1|AA035383 zk25e12.r1 Soares_pregnant_uterus_NbHPU Homo sapiens cDNA clone IMAGE:471598 5', mRNA sequence
CTTGAGCCTCAGGAACGAGATGGCGGTTCTCTGGAGGCTGAGTGCCGTTTGCGGTGCCCT
AGGAGGCCGAGCTCTGTTGCTTCGAACTCCAGTGGTCAGACCTGCTCATATCTCAGCATT
TCTTCAGGACCGACCTATCCCAGAATGGTGTGGAGTGCAGCACATACACTTGTCACCCGA
GCCACCATTCTGGCTCCAAGGCTGCATCTCTCCACTGGACTAGCGAGANGGTTGTCANTG
TTTTGCTCCTGGGTCTGCTTCCCGGCTGCTTANTTGAANCCTTGCTCNGCGANGGACTAN
TCCCTGGC

You could use the Fasta.SequenceParser() if you prefer.

I would guess you would then want to save these FASTA records into one
long FASTA file.

Enjoy!
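The last step, saving the records into one long FASTA file, might look
something like the sketch below. It is only an illustration under stated
assumptions: compile_fasta and fetch_fasta are made-up names, not part of
BioPython. fetch_fasta stands for whatever lookup you end up using (for
example a small wrapper around the NCBIDictionary shown above) and is
assumed to return the record as FASTA text, or an empty string when
nothing was found.

```python
def compile_fasta(accessions, fetch_fasta, out_path):
    """Write one FASTA record per accession into a single file.

    fetch_fasta is a hypothetical callable (not a BioPython function):
    it takes an identifier string and returns the record as FASTA text.
    Returns the list of identifiers for which nothing usable came back.
    """
    missing = []
    out = open(out_path, "w")
    try:
        for acc in accessions:
            text = fetch_fasta(acc)
            # Skip empty results and anything that is not FASTA text.
            if not text or not text.lstrip().startswith(">"):
                missing.append(acc)
                continue
            # Exactly one trailing newline, so records do not run together.
            out.write(text.strip() + "\n")
    finally:
        out.close()
    return missing
```

Keeping the list of identifiers that returned nothing is a deliberate
choice: with 151,204 accessions a few failures are likely, and it is
easier to retry just those than to restart the whole download.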
Peter

From halima at mancala.cbio.uct.ac.za Sun Apr 2 09:33:11 2006
From: halima at mancala.cbio.uct.ac.za (Halima Rabiu)
Date: Sun, 2 Apr 2006 15:33:11 +0200 (SAST)
Subject: [BioPython] Need help on NCBIStandaloneblast
In-Reply-To: <442BFFAD.10103@maubp.freeserve.co.uk>
References: <442BFFAD.10103@maubp.freeserve.co.uk>
Message-ID:

Thanks Peter,

I was able to trace the error. When I print error_info.read(), the
error is with my infile. There is a result in my save file now, but I am
still having a problem passing the output file. I will try to figure it
out; it may be a syntax problem.

Thanks

On Thu, 30 Mar 2006, Peter (BioPython List) wrote:

> Halima Rabiu wrote:
> > Hi everyboby ;
> > I am new to biopython having problems with the "NCBIStandalone.blastall".
> > After launching the Blast with "doBlast" it look like runs and end
> > and then I check the output it empty and I try same thing using comand
> > line it work and get result.
> > I attch my code.
>
> Have you checked the paths are correct, e.g.
>
> assert os.path.isfile(data), "Missing database file " + data
> assert os.path.isfile(infile), "Missing input file " + infile
>
> You don't need to check blast_exe yourself, as the blastall command does this
> for you.
>
> If I understood you correctly, the "blast.out" file is empty.
>
> Did blast return any error message? Try:
>
> print error_info.read()
>
> or:
>
> save_file =open("blast.error","w")
> blast_result=error_info.read()
> save_file.write(blast_result)
> save_file.close()
>
> Next question, could you tell us what you typed at the command line which does
> work?
>
> > I also try to go though the previous posts on biopython mailing list fund
> > similar problem post by Andreas but no solution to the problem .
>
> It was worth checking anyway :)
>
> Peter
>
>

From as_nascimento at yahoo.com.br Wed Apr 5 16:35:35 2006
From: as_nascimento at yahoo.com.br (Alessandro S.
Nascimento)
Date: Wed, 05 Apr 2006 17:35:35 -0300
Subject: [BioPython] problems when parsing blast output
In-Reply-To: <43CCD436.7020704@maubp.freeserve.co.uk>
References: <43CC485E.7050702@yahoo.com.br> <43CCC6D4.4020307@maubp.freeserve.co.uk> <43CCCF56.40803@yahoo.com.br> <43CCD436.7020704@maubp.freeserve.co.uk>
Message-ID: <44342A17.4070404@yahoo.com.br>

Hi Peter,

I had some trouble parsing results from a blastpgp output file. My
initial script used to work but isn't working this time. My blast output
file is very, very large. When I run the script, I can see my processor
working at 99% for some minutes, then it returns to the prompt with no
results or information. Any idea what may be happening?

Thanks in advance,

Alessandro

#!/usr/bin/python

import os
from Bio.Blast import NCBIStandalone
from string import *

blast_out = open('blast.output', 'r')
b_parser = NCBIStandalone.PSIBlastParser()
b_record = b_parser.parse(blast_out)
n=0
for round in b_record.rounds:
    for alignment in round.alignments:
        for hsp in alignment.hsps:
            if hsp.identities < 90:
                if hsp.identities > 30:
                    if alignment.length > 200:
                        print "Retrieving sequence query"
                        os.system ("fastacmd -d ..//db/nr -s \'%s\' > test.bl2.%d" % (query, n, ))
                        n=n+1
blast_out.close()

From halima at mancala.cbio.uct.ac.za Thu Apr 13 11:07:52 2006
From: halima at mancala.cbio.uct.ac.za (Halima Rabiu)
Date: Thu, 13 Apr 2006 17:07:52 +0200 (SAST)
Subject: [BioPython] Need help parsing Blastoutput
Message-ID:

Hi All,

I have BLAST output from a local blast, and I need to calculate my %
alignment coverage with regard to my subject. I parsed the blast output
and wanted to print the Sbjct start and Sbjct end, but I could not. Is
there any way I could do this, to get the match coverage between my
query and subject? I don't need Identities, but the total % alignment
for the query or subject.
Thanks
Halimah

From mdehoon at c2b2.columbia.edu Thu Apr 13 11:56:26 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Thu, 13 Apr 2006 11:56:26 -0400
Subject: [BioPython] Need help parsing Blastoutput
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEE9@cgcmail.cgc.cpmc.columbia.edu>

Could you send us the script you were using?

--Michiel.

Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032

-----Original Message-----
From: biopython-bounces at lists.open-bio.org on behalf of Halima Rabiu
Sent: Thu 4/13/2006 11:07 AM
To: biopython at lists.open-bio.org
Subject: [BioPython] Need help parsing Blastoutput

Hi All,
I have a BLAST output from a local blast
I need to calculate my % alignment coverage as regard to my subject
I try parsed the blast output and wanted to print the
sbjct Start and Sbjct end. but I could not is there anyway I could this
try to get mach coverage between my querry and subject I dont need
Identities,but total % alignment for querry or subject.
Thanks
Halimah

_______________________________________________
BioPython mailing list - BioPython at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biopython

From rafael at nbn.ac.za Fri Apr 14 05:52:42 2006
From: rafael at nbn.ac.za (Rafael C. Jimenez)
Date: Fri, 14 Apr 2006 11:52:42 +0200
Subject: [BioPython] Need help parsing Blastoutput
In-Reply-To:
References:
Message-ID: <9ad32945680e91a485c1e0cdb1ca4eb7@nbn.ac.za>

On 13 Apr 2006, at 5:07 PM, Halima Rabiu wrote:

> Hi All,
> I have a BLAST output from a local blast

Well, I would say that there are three alternatives for running blast,
and in one way or another you can use all of them locally.
- Blast web server (Through Blastcl3 or through biopython)
- Blast standalone
- wwwblast

I guess that when you say local blast you mean that you are using blast
standalone with your own local databases.
It makes a difference which of these three you use, because you will
need different modules to parse the output:
- Bio.Blast.NCBIStandalone for Blast standalone outputs
- Bio.Blast.NCBIWWW for Blast web server outputs
- No parser for the wwwblast

> I need to calculate my % alignment coverage as regard to my subject

I am not sure what you mean, but I would say that this % is provided by
the "Identities" field in nucleotide and protein comparisons for each
alignment, and also by the "Positives" field in protein comparisons.
Example:
Identities = 11/26 (42%), Positives = 15/26 (57%)

> I try parsed the blast output and wanted to print the
> sbjct Start and Sbjct end. but I could not is there anyway I could this

# Open your Blast Output file
blastOutput = open("The name of your blast output", 'r')

Once you have parsed the Blast standalone output:

from Bio.Blast import NCBIStandalone
parser = NCBIStandalone.BlastParser()
blastRecord = parser.parse(blastOutput)

.... or the NCBI web server output:

from Bio.Blast import NCBIWWW
parser = NCBIWWW.BlastParser()
blastRecord = parser.parse(blastOutput)

now you can start to recover information using the Bio.Blast.Record
module

import Bio.Blast.Record

# ... for instance you can retrieve the Blast version you used when you got your output ...
print 'header.version:',blastRecord.version
for alignment in blastRecord.alignments:
    # ... or the length of the alignment ...
    print 'alignment.length:', alignment.length
    for hsp in alignment.hsps:
        # ... or the sbjct Start as you want ...
        print 'hsp.sbjct_start:', hsp.sbjct_start

>
> try to get mach coverage between my querry and subject I dont need
> Identities,but total % alignment for querry or subject.
> Thanks
> Halimah

I am working in the NBN central node in UWC, not far away from UCT.
Don't hesitate to visit us if you want help or advice.

Cheers,

Rafael

Rafael C.
Jimenez
-----------------------------------------------------------
National Bioinformatics Network
University of the Western Cape
Private Bag X17
Bellville 7530
South Africa
Tel: +27219592991
rafael at nbn.ac.za
www.nbn.ac.za
-----------------------------------------------------------
Proteomics Services Group
European Bioinformatics Institute (EMBL-EBI)
Wellcome Trust Genome Campus, Hinxton
Cambridge - CB10 1SD - UK
Tel: +441223492610
rafael at ebi.ac.uk
www.ebi.ac.uk
-----------------------------------------------------------

On 13 Apr 2006, at 5:07 PM, Halima Rabiu wrote:

> Hi All,
> I have a BLAST output from a local blast
> I need to calculate my % alignment coverage as regard to my subject
> I try parsed the blast output and wanted to print the
> sbjct Start and Sbjct end. but I could not is there anyway I could this
> try to get mach coverage between my querry and subject I dont need
> Identities,but total % alignment for querry or subject.
> Thanks
> Halimah
>
> _______________________________________________
> BioPython mailing list - BioPython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython

From halima at mancala.cbio.uct.ac.za Tue Apr 18 11:06:02 2006
From: halima at mancala.cbio.uct.ac.za (Halima Rabiu)
Date: Tue, 18 Apr 2006 17:06:02 +0200 (SAST)
Subject: [BioPython] Need help parsing Blastoutput
In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEE9@cgcmail.cgc.cpmc.columbia.edu>
References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEE9@cgcmail.cgc.cpmc.columbia.edu>
Message-ID:

Thanks. Please see the attachment: a copy of my script and a copy of my
Blast output.

Thanks

On Thu, 13 Apr 2006, Michiel De Hoon wrote:

> Could you send us the script you were using?
>
> --Michiel.
>
> Michiel de Hoon
> Center for Computational Biology and Bioinformatics
> Columbia University
> 1150 St Nicholas Avenue
> New York, NY 10032
>
>
>
> -----Original Message-----
> From: biopython-bounces at lists.open-bio.org on behalf of Halima Rabiu
> Sent: Thu 4/13/2006 11:07 AM
> To: biopython at lists.open-bio.org
> Subject: [BioPython] Need help parsing Blastoutput
>
> Hi All,
> I have a BLAST output from a local blast
> I need to calculate my % alignment coverage as regard to my subject
> I try parsed the blast output and wanted to print the
> sbjct Start and Sbjct end. but I could not is there anyway I could this
> try to get mach coverage between my querry and subject I dont need
> Identities,but total % alignment for querry or subject.
> Thanks
> Halimah
>
> _______________________________________________
> BioPython mailing list - BioPython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython
>
>
-------------- next part --------------
#! /usr/local/bin/python2.4
#halimah
#16-04-2006

from string import split
from Bio.Blast import NCBIStandalone

b_out = open('Enterococcus_out','r')
b_parser = NCBIStandalone.BlastParser()
b_iterator = NCBIStandalone.Iterator(b_out,b_parser)
E_VALUE_THRESH = 1.0
while 1:
    b_record = b_iterator.next()
    print "The following results are for query " + b_record.query
    print 'len of query:',b_record.query_letters
    if b_record is None:
        break
    for alignment in b_record.alignments:
        for hsp in alignment.hsps:
            if hsp.expect <= E_VALUE_THRESH:
                print '****Alignment****'
                print 'title:', alignment.title
                print 'length:', alignment.length
                print 'e value:', hsp.expect
                print 'subjectstart:',hsp.sbjct_start
                print 'subject end:', hsp.sbject_end

From mdehoon at c2b2.columbia.edu Tue Apr 18 12:40:05 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Tue, 18 Apr 2006 12:40:05 -0400
Subject: [BioPython] Need help parsing Blastoutput
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEF5@cgcmail.cgc.cpmc.columbia.edu>

Could you also send us the file Enterococcus_out so we can run the script?

From the script, it looks like you're trying to parse text output from Blast.
While this is possible (in theory), the format of Blast text output tends to
change a lot, thereby breaking the parser in Biopython. It is more reliable
to have Blast generate output in XML format, and use the XML parser:

blast_out = open('my_blast.xml', 'r')

from Bio.Blast import NCBIXML

b_parser = NCBIXML.BlastParser()
b_record = b_parser.parse(blast_out)

See section 3.1.2 in the Biopython cookbook, and section 3.1.4 on how to
generate Blast output in XML.

--Michiel.

Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032

-----Original Message-----
From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
Sent: Tue 4/18/2006 11:06 AM
To: Michiel De Hoon
Cc: biopython at lists.open-bio.org
Subject: RE: [BioPython] Need help parsing Blastoutput

thanks
please see the attchment a copy of my script and copy of my Blast output
Thanks

On Thu, 13 Apr 2006, Michiel De Hoon wrote:

> Could you send us the script you were using?
>
> --Michiel.
>
> Michiel de Hoon
> Center for Computational Biology and Bioinformatics
> Columbia University
> 1150 St Nicholas Avenue
> New York, NY 10032
>
>
>
> -----Original Message-----
> From: biopython-bounces at lists.open-bio.org on behalf of Halima Rabiu
> Sent: Thu 4/13/2006 11:07 AM
> To: biopython at lists.open-bio.org
> Subject: [BioPython] Need help parsing Blastoutput
>
> Hi All,
> I have a BLAST output from a local blast
> I need to calculate my % alignment coverage as regard to my subject
> I try parsed the blast output and wanted to print the
> sbjct Start and Sbjct end. but I could not is there anyway I could this
> try to get mach coverage between my querry and subject I dont need
> Identities,but total % alignment for querry or subject.
> Thanks
> Halimah
>
> _______________________________________________
> BioPython mailing list - BioPython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython
>
>

From halima at mancala.cbio.uct.ac.za Wed Apr 19 06:15:15 2006
From: halima at mancala.cbio.uct.ac.za (Halima Rabiu)
Date: Wed, 19 Apr 2006 12:15:15 +0200 (SAST)
Subject: [BioPython] Need help parsing Blastoutput
In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEF5@cgcmail.cgc.cpmc.columbia.edu>
References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEF5@cgcmail.cgc.cpmc.columbia.edu>
Message-ID:

Hi,

Please see the attachment; it is part of my Blast output. Yes, I am
trying to parse text output from Blast. I used another script to run the
local blast whose output I am trying to parse.
NCBIStandalone.BlastParser was working fine without hsp.sbject_end,
which is one of the things I need to print out. On checking the class
diagrams in the cookbook, I found that sbject_end is not included. I
just need another way of printing the int(subject end).

Thanks for your help
Halimah

On Tue, 18 Apr 2006, Michiel De Hoon wrote:

> Could you also send us the file Enterococcus_out so we can run the script?
>
> From the script, it looks like you're trying to parse text output from Blast.
> While this is possible (in theory), the format of Blast text output tends to
> change a lot, thereby breaking the parser in Biopython. It is more reliable
> to have Blast generate output in XML format, and use the XML parser:
>
> blast_out = open('my_blast.xml', 'r')
>
> from Bio.Blast import NCBIXML
>
> b_parser = NCBIXML.BlastParser()
> b_record = b_parser.parse(blast_out)
>
> See section 3.1.2 in the Biopython cookbook, and section 3.1.4 on how to
> generate Blast output in XML.
>
> --Michiel.
>
>
>
> Michiel de Hoon
> Center for Computational Biology and Bioinformatics
> Columbia University
> 1150 St Nicholas Avenue
> New York, NY 10032
>
>
>
> -----Original Message-----
> From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
> Sent: Tue 4/18/2006 11:06 AM
> To: Michiel De Hoon
> Cc: biopython at lists.open-bio.org
> Subject: RE: [BioPython] Need help parsing Blastoutput
>
> thanks
> please see the attchment a copy of my script and copy of my Blast output
> Thanks
>
>
> On Thu, 13 Apr 2006, Michiel De Hoon wrote:
>
> > Could you send us the script you were using?
> >
> > --Michiel.
> >
> > Michiel de Hoon
> > Center for Computational Biology and Bioinformatics
> > Columbia University
> > 1150 St Nicholas Avenue
> > New York, NY 10032
> >
> >
> >
> > -----Original Message-----
> > From: biopython-bounces at lists.open-bio.org on behalf of Halima Rabiu
> > Sent: Thu 4/13/2006 11:07 AM
> > To: biopython at lists.open-bio.org
> > Subject: [BioPython] Need help parsing Blastoutput
> >
> > Hi All,
> > I have a BLAST output from a local blast
> > I need to calculate my % alignment coverage as regard to my subject
> > I try parsed the blast output and wanted to print the
> > sbjct Start and Sbjct end. but I could not is there anyway I could this
> > try to get mach coverage between my querry and subject I dont need
> > Identities,but total % alignment for querry or subject.
> > Thanks
> > Halimah
> >
> > _______________________________________________
> > BioPython mailing list - BioPython at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/biopython
> >
> >
>
>
-------------- next part --------------
BLASTP 2.2.10 [Oct-19-2004]


Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.

Query= ENTFA 3MGH_ENTFA (Q833H5) Putative 3-methyladenine DNA
glycosylase (EC 3.2.2.-)
         (229 letters)

Database: Blastdata.fdb
           240,170 sequences; 77,468,597 total letters

Searching..................................................done

                                                                    Score    E
Sequences producing significant alignments:                         (bits) Value

ENTFA 3MGH_ENTFA (Q833H5) Putative 3-methyladenine DNA glycosyla...   462  e-130
LISMO 3MGH_LISMO (P58621) Putative 3-methyladenine DNA glycosyla...   194  2e-49
STAAM 3MGH_STAAM (P65414) Putative 3-methyladenine DNA glycosyla...   187  3e-47
STAES 3MGH_STAES (Q8CRC1) Putative 3-methyladenine DNA glycosyla...   186  5e-47
LISIN 3MGH_LISIN (Q92D89) Putative 3-methyladenine DNA glycosyla...   185  8e-47
LACPL 3MGH_LACPL (Q88VP8) Putative 3-methyladenine DNA glycosyla...   178  1e-44
BACSU 3MGH_BACSU (P94378) Putative 3-methyladenine DNA glycosyla...   160  3e-39
LACJO Q74LU5 (Q74LU5) 3-methyladenine DNA glycosylase                 155  7e-38
OCEIH 3MGH_OCEIH (Q8ETG4) Putative 3-methyladenine DNA glycosyla...   147  2e-35
BACAN 3MGH_BACAN (Q81UJ9) Putative 3-methyladenine DNA glycosyla...   130  4e-30
BACC1 Q73CV5 (Q73CV5) Methylpurine-DNA glycosylase family protein     125  8e-29
CLOTE 3MGH_CLOTE (Q896H4) Putative 3-methyladenine DNA glycosyla...   124  3e-28
CHLMU Q9PJE9 (Q9PJE9) Succinate dehydrogenase, iron-sulfur protein    113  4e-25
CHLCV 3MGH_CHLCV (Q824B4) Putative 3-methyladenine DNA glycosyla...   111  2e-24
CLOAB 3MGH_CLOAB (Q97EY6) Putative 3-methyladenine DNA glycosyla...   108  1e-23
CLOPE 3MGH_CLOPE (Q8XHA9) Putative 3-methyladenine DNA glycosyla...   107  4e-23
STRMU Q8DRU4 (Q8DRU4) Putative 3-methyladenine DNA glycosylase        103  3e-22
DEIRA 3MGH_DEIRA (Q9RSQ0) Putative 3-methyladenine DNA glycosyla...    86  9e-17
CORGL 3MGH_CORGL (Q8NU33) Putative 3-methyladenine DNA glycosyla...    82  1e-15
STRCO 3MGH_STRCO (Q9S208) Putative 3-methyladenine DNA glycosyla...    80  4e-15
BRAJA 3MGH_BRAJA (Q89LR7) Putative 3-methyladenine DNA glycosyla...    79  1e-14
STRAW 3MGH_STRAW (Q829C5) Putative 3-methyladenine DNA glycosyla...    73  8e-13
COREF 3MGH_COREF (Q8FUA2) Putative 3-methyladenine DNA glycosyla...    69  9e-12
PROAC Q6A5L3 (Q6A5L3) Putative 3-methylpurine DNA glycosylase          66  9e-11
MYCPA Q740F6 (Q740F6) Hypothetical protein                             64  3e-10
MYCLEH 3MGH_MYCTU (P65412) Putative 3-methyladenine DNA glycosyl...    64  5e-10
MYCTU 3MGH_MYCTU (P65412) Putative 3-methyladenine DNA glycosyla...    64  5e-10
MYCBO 3MGH_MYCBO (P65413) Putative 3-methyladenine DNA glycosyla...    64  5e-10
MYCLE 3MGH_MYCLE (O05678) Putative 3-methyladenine DNA glycosyla...    60  5e-09
RICCN 3MGH_RICCN (Q92IE0) Putative 3-methyladenine DNA glycosyla...    52  2e-06
RICPR 3MGH_RICPR (Q9ZDH7) Putative 3-methyladenine DNA glycosyla...    49  1e-05
PSEAE 3MGH_PSEAE (Q9HX17) Putative 3-methyladenine DNA glycosyla...    45  2e-04
PSEPK Q88DL3 (Q88DL3) DNA-3-methyladenine glycosylase, putative        42  0.002
BORPE Q7VWB7 (Q7VWB7) Putative methypurine-DNA glycosylase             40  0.004
BORPA Q7W9Q9 (Q7W9Q9) Putative methypurine-DNA glycosylase             40  0.004
STRAW Q93HJ1 (Q93HJ1) Modular polyketide synthase                      35  0.14
STRAW Q93HJ2 (Q93HJ2) Modular polyketide synthase                      33  0.68
SILPO Q5LKE3 (Q5LKE3) Peptidyl-prolyl cis-trans isomerase, FKBP-...    32  1.5
SILPO Q5LVX1 (Q5LVX1) ABC transporter, transmembrane ATP-binding...    30  4.4
CLOTE DXS_CLOTE (Q894H0) 1-deoxy-D-xylulose-5-phosphate synthase...
30 5.8 BURMA Q9AI54 (Q9AI54) DedA family protein 30 7.5 STAES Q8CST0 (Q8CST0) Hypothetical protein SE0952 29 9.8 SILPO Q5LMI5 (Q5LMI5) ADA regulatory protein 29 9.8 >ENTFA 3MGH_ENTFA (Q833H5) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 229 Score = 462 bits (1190), Expect = e-130 Identities = 229/229 (100%), Positives = 229/229 (100%) Query: 1 MVKEMKETINIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFG 60 MVKEMKETINIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFG Sbjct: 1 MVKEMKETINIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFG 60 Query: 61 LRKTPRLQAMYDKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQ 120 LRKTPRLQAMYDKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQ Sbjct: 61 LRKTPRLQAMYDKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQ 120 Query: 121 GRQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGR 180 GRQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGR Sbjct: 121 GRQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGR 180 Query: 181 WTELPLRYVVAGNPYISKQKRTAVDQIDFGWKDEENEKSNNAHILRGTT 229 WTELPLRYVVAGNPYISKQKRTAVDQIDFGWKDEENEKSNNAHILRGTT Sbjct: 181 WTELPLRYVVAGNPYISKQKRTAVDQIDFGWKDEENEKSNNAHILRGTT 229 >LISMO 3MGH_LISMO (P58621) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 207 Score = 194 bits (492), Expect = 2e-49 Identities = 99/198 (50%), Positives = 134/198 (67%), Gaps = 3/198 (1%) Query: 8 TINIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRL 67 T F +KTT E+A+ +LGM L H+T G+L G IV+ EAYLG D AAHSF +T R Sbjct: 6 TKEFFESKTTIELARDILGMRLVHQTNEGLLSGLIVETEAYLGATDMAAHSFQNLRTKRT 65 Query: 68 QAMYDKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVE-GVDKMIENRQGRQGVE 126 + M+ PGTIY+Y MH ++LN +T +G P+ ++IRAIEP E +M +NR G+ G E Sbjct: 66 EVMFSSPGTIYMYQMHRQVLLNFITMPKGIPEAILIRAIEPDEQAKQQMTQNRHGKTGYE 125 Query: 127 LTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPL 186 LTNGPGKL ALG+ Q YG+++F S++ L E+ K P IEA RIG+PNKG T PL Sbjct: 126 LTNGPGKLTQALGLSMQDYGKTLFDSNIWL--EEAKLPHLIEATNRIGVPNKGIATHYPL 183 Query: 187 
RYVVAGNPYISKQKRTAV 204 R+ V G+PYIS Q++ ++ Sbjct: 184 RFTVKGSPYISGQRKNSI 201 >STAAM 3MGH_STAAM (P65414) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 202 Score = 187 bits (474), Expect = 3e-47 Identities = 91/201 (45%), Positives = 132/201 (65%), Gaps = 1/201 (0%) Query: 12 FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71 F + T + A+ LLG+ + ++ GYIV+ EAYLG D+AAH FG + TP++ ++Y Sbjct: 6 FINQQTTQTAKALLGVKIIYQDDYQTYTGYIVETEAYLGIQDKAAHGFGGKITPKVTSLY 65 Query: 72 DKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGP 131 K GTIY + MHTHL++N VT+ +G P+GV+IRAIEP EG+ M NR G+ G ELTNGP Sbjct: 66 KKGGTIYAHVMHTHLLINFVTRTEGIPEGVLIRAIEPDEGIGAMNVNR-GKSGYELTNGP 124 Query: 132 GKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVA 191 GK A I + + G ++ L + RK+PK I RIGIPNKG WT PLR+ V Sbjct: 125 GKWTKAFNIPRSIDGSTLNDCKLSIDTNHRKYPKTIIESGRIGIPNKGEWTNKPLRFTVK 184 Query: 192 GNPYISKQKRTAVDQIDFGWK 212 GNPY+S+ +++ D WK Sbjct: 185 GNPYVSRMRKSDFQNPDDTWK 205 >LISIN 3MGH_LISIN (Q92D89) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 207 Score = 185 bits (470), Expect = 8e-47 Identities = 96/200 (48%), Positives = 130/200 (65%), Gaps = 3/200 (1%) Query: 6 KETINIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTP 65 K T F +TT E+A+ ++GM L HE L GYIV+ EAYLG D AAHSF +T Sbjct: 4 KITPTFFENRTTIELARDIIGMRLVHEIGNYTLSGYIVETEAYLGATDMAAHSFKNLRTK 63 Query: 66 RLQAMYDKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPV-EGVDKMIENRQGRQG 124 R + M+ PGTIY Y MH ++LN +T +G P+ V+IRA+EP E +++M +NR + G Sbjct: 64 RTEVMFGTPGTIYTYQMHQQVLLNFITMREGIPEAVLIRALEPTKESIEQMEQNRFLKTG 123 Query: 125 VELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTEL 184 ELTNGPGKL ALG+ Q YG+++F S++ L E+ K P IEA RIG+PNKG T Sbjct: 124 FELTNGPGKLTQALGLSMQDYGKTLFDSNIWL--ERAKVPHIIEATNRIGVPNKGIATHY 181 Query: 185 PLRYVVAGNPYISKQKRTAV 204 PLR+ G+PYIS Q++ + Sbjct: 182 PLRFTAKGSPYISAQRKRQI 201 >LACPL 3MGH_LACPL (Q88VP8) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 209 Score = 178 bits (451), 
Expect = 1e-44 Identities = 93/199 (46%), Positives = 127/199 (63%), Gaps = 1/199 (0%) Query: 13 NTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYD 72 +T TT E+A LLG L +T++GVL +I + EAYLG D AH++ +TPR A++ Sbjct: 9 STCTTPEIAVSLLGKQLRLQTSSGVLTAWITETEAYLGARDAGAHAYQNHQTPRNHALWQ 68 Query: 73 KPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPG 132 GTIY+Y M +LN+VTQ G P+ V+IR IEP G+++M + R LTNGPG Sbjct: 69 SAGTIYIYQMRAWCLLNIVTQAAGTPECVLIRGIEPDAGLERMQQQRP-VPIANLTNGPG 127 Query: 133 KLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAG 192 KL+ ALG+DK L GQ++ ++L L + P+++ A PRIGI NKG WT PLRY VAG Sbjct: 128 KLMQALGLDKTLNGQALQPATLSLDLSHYRQPEQVVATPRIGIVNKGEWTTAPLRYFVAG 187 Query: 193 NPYISKQKRTAVDQIDFGW 211 NP++SK R +D GW Sbjct: 188 NPFVSKISRRTIDHEHHGW 206 >BACSU 3MGH_BACSU (P94378) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 196 Score = 160 bits (405), Expect = 3e-39 Identities = 91/198 (45%), Positives = 112/198 (56%), Gaps = 2/198 (1%) Query: 1 MVKEMKETINIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFG 60 M +E F KT E+A LLG L ET G GYIV+ EAY+G D AAHSF Sbjct: 1 MTREKNPLPITFYQKTALELAPSLLGCLLVKETDEGTASGYIVETEAYMGAGDRAAHSFN 60 Query: 61 LRKTPRLQAMYDKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQ 120 R+T R + M+ + G +Y Y MHTH +LN+V E+ PQ V+IRAIEP EG M E R Sbjct: 61 NRRTKRTEIMFAEAGRVYTYVMHTHTLLNVVAAEEDVPQAVLIRAIEPHEGQLLMEERRP 120 Query: 121 GRQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGR 180 GR E TNGPGKL ALG+ YG+ I L + E P+ I PRIGI N G Sbjct: 121 GRSPREWTNGPGKLTKALGVTMNDYGRWITEQPLYI--ESGYTPEAISTGPRIGIDNSGE 178 Query: 181 WTELPLRYVVAGNPYISK 198 + P R+ V GN Y+S+ Sbjct: 179 ARDYPWRFWVTGNRYVSR 196 >LACJO Q74LU5 (Q74LU5) 3-methyladenine DNA glycosylase Length = 208 Score = 155 bits (393), Expect = 7e-38 Identities = 77/192 (40%), Positives = 125/192 (65%), Gaps = 2/192 (1%) Query: 12 FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71 F ++T E+++ LLG L + +L G IV+AEAY+G D AAHS+G R++P + +Y Sbjct: 7 
FTNRSTSEISKDLLGRTLSYNNGEEILSGTIVEAEAYVGVKDRAAHSYGGRRSPANEGLY 66 Query: 72 DKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGP 131 G++Y+Y+ + ++ QE+G+PQGV+IRAI+P+ G+D MI+NR G+ G LTNGP Sbjct: 67 RPGGSLYIYSQRQYFFFDVSCQEEGEPQGVLIRAIDPLTGIDTMIKNRSGKTGPLLTNGP 126 Query: 132 GKLVAALGIDKQLYG-QSIFSSSLRLVPEKRKFPKKIEALPRIGI-PNKGRWTELPLRYV 189 GK++ ALGI + + + S + + ++ ++I ALPR+GI + W + LR++ Sbjct: 127 GKMMQALGITSRKWDLVDLNDSPFDIDIDHKREIEEIVALPRVGINQSDPEWAQKKLRFI 186 Query: 190 VAGNPYISKQKR 201 V+GNPY+S K+ Sbjct: 187 VSGNPYVSDIKK 198 >OCEIH 3MGH_OCEIH (Q8ETG4) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 198 Score = 147 bits (371), Expect = 2e-35 Identities = 74/182 (40%), Positives = 112/182 (61%), Gaps = 2/182 (1%) Query: 17 TEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGT 76 T E+A+ LLG L +T G G IV+ EAYLG D AAH +G R+T R + +Y KPG Sbjct: 19 TLELAKNLLGCILVKQTEEGTSSGVIVETEAYLGNTDRAAHGYGNRRTKRTEILYSKPGY 78 Query: 77 IYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVA 136 Y++ +H H ++N+V+ +G P+ V+IRA+EP G+D+M+ R ++ LT+GPGKL Sbjct: 79 AYVHLIHNHRLINVVSSMEGDPESVLIRAVEPFSGIDEMLMRRPVKKFQNLTSGPGKLTQ 138 Query: 137 ALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAGNPYI 196 A+GI + YG + + L + + K P ++ RIGI N G + P R+ V GNP++ Sbjct: 139 AMGIYMEDYGHFMLAPPLFI--SEGKSPASVKTGSRIGIDNTGEAKDYPYRFWVDGNPFV 196 Query: 197 SK 198 S+ Sbjct: 197 SR 198 >BACAN 3MGH_BACAN (Q81UJ9) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 205 Score = 130 bits (326), Expect = 4e-30 Identities = 80/194 (41%), Positives = 112/194 (57%), Gaps = 11/194 (5%) Query: 17 TEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGT 76 T EVA+ LLG L H G IV+ EAY GPDD+AAHS+G R+T R + M+ PG Sbjct: 12 TLEVAKKLLGQKLVHIVNGIKRSGIIVEVEAYKGPDDKAAHSYGGRRTDRTEVMFGAPGH 71 Query: 77 IYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGV------ELTN 129 Y+Y ++ + N++T G PQGV+IRA+EPV+G++++ R + + LTN Sbjct: 72 AYVYLIYGMYHCFNVITAPVGTPQGVLIRALEPVDGIEEIKLARYNKTDITKAQYKNLTN 131 Query: 130 
GPGKLVAALGIDKQLYGQSIFSSSL--RLVPEKRKFPK--KIEALPRIGIPNKGRWTELP 185 GPGKL ALGI + G S+ S +L LVPE++ KI A PRI I P Sbjct: 132 GPGKLCRALGITLEERGVSLQSDTLHIELVPEEKHISSQYKITAGPRINIDYAEEAVHYP 191 Query: 186 LRYVVAGNPYISKQ 199 R+ G+P++SK+ Sbjct: 192 WRFYYEGHPFVSKK 205 >BACC1 Q73CV5 (Q73CV5) Methylpurine-DNA glycosylase family protein Length = 205 Score = 125 bits (315), Expect = 8e-29 Identities = 79/194 (40%), Positives = 110/194 (56%), Gaps = 11/194 (5%) Query: 17 TEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGT 76 T EVA+ LLG L H G IV+ EAY GPDD+AAHS+G R+T R + M+ PG Sbjct: 12 TLEVAKKLLGQKLVHIVDGIKRSGIIVEVEAYKGPDDKAAHSYGGRRTDRTEVMFGAPGH 71 Query: 77 IYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGV------ELTN 129 Y+Y ++ + N++T G PQGV+IRA+EPV+G++++ R + + LTN Sbjct: 72 AYVYLIYGMYHCFNVITAPVGTPQGVLIRALEPVDGIEEIKLARYNKTDITKAQYKNLTN 131 Query: 130 GPGKLVAALGIDKQLYGQSIFSSSL--RLVPEKRKFPK--KIEALPRIGIPNKGRWTELP 185 GPGKL ALGI + G S+ S +L LV E+ KI A PRI I P Sbjct: 132 GPGKLCRALGITLEERGVSLQSDTLHIELVREEEHISSQYKITAGPRINIDYAEEAVHYP 191 Query: 186 LRYVVAGNPYISKQ 199 R+ G+P++SK+ Sbjct: 192 WRFYYEGHPFVSKK 205 >CLOTE 3MGH_CLOTE (Q896H4) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 203 Score = 124 bits (310), Expect = 3e-28 Identities = 74/197 (37%), Positives = 109/197 (55%), Gaps = 9/197 (4%) Query: 12 FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71 F K+ +VA+YLLG L +E L G IV+ EAY+G D+A+H++G +KT R+ +Y Sbjct: 7 FYEKSALQVAKYLLGKILVNEVEGITLKGKIVETEAYIGAIDKASHAYGGKKTERVMPLY 66 Query: 72 DKPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKM--------IENRQGR 122 KPGT Y+Y ++ + N++T+ +G+ +GV+IRAIEP+EG++KM I Sbjct: 67 GKPGTAYVYLIYGMYHCFNVITKVEGEAEGVLIRAIEPLEGIEKMAYLRYKKPISEISKT 126 Query: 123 QGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWT 182 Q LT GPGKL AL IDK Q + + + K I RIGI Sbjct: 127 QFKNLTTGPGKLCIALNIDKSNNKQDLCNEGTLYIEHNDKEKFNIVESKRIGIEYAEEAK 186 Query: 183 ELPLRYVVAGNPYISKQ 199 + R+ + NP+ISK+ Sbjct: 187 DFLWRFYIEDNPWISKK 203 >CHLMU Q9PJE9 (Q9PJE9) 
Succinate dehydrogenase, iron-sulfur protein Length = 425711 Score = 113 bits (283), Expect = 4e-25 Identities = 72/185 (38%), Positives = 105/185 (56%), Gaps = 5/185 (2%) Query: 10 NIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQA 69 + F ++ +AQ LLG L + GYIV+ EAY GPDD+A H++ RKT R +A Sbjct: 321 HFFLSEDVITLAQQLLGHKLITTHEGLITSGYIVETEAYRGPDDKACHAYNYRKTQRNRA 380 Query: 70 MYDKPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVE-- 126 MY K G+ YLY + H +LN+VT + P V+IRAI P +G + MI+ RQ R Sbjct: 381 MYLKGGSAYLYRCYGMHHLLNVVTGPEDIPHAVLIRAILPDQGKELMIQRRQWRDKPPHL 440 Query: 127 LTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPL 186 LTNGPGK+ ALGI + Q + + +L + K K + A RIGI + ++P Sbjct: 441 LTNGPGKVCQALGISLENNRQRLNTPALYI--SKEKISGTLTATARIGIDYAQEYRDVPW 498 Query: 187 RYVVA 191 R++++ Sbjct: 499 RFLLS 503 >CHLCV 3MGH_CHLCV (Q824B4) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 190 Score = 111 bits (278), Expect = 2e-24 Identities = 67/174 (38%), Positives = 98/174 (56%), Gaps = 5/174 (2%) Query: 20 VAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIYL 79 +A+ LLG L + + + G+IV+ EAY GPDD+A H++ RKT R MY + G Y+ Sbjct: 15 LAKELLGHILITKISGKITSGFIVETEAYRGPDDKACHAYNYRKTKRNSPMYSRGGIAYI 74 Query: 80 YTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVE--LTNGPGKLVA 136 Y + H + N+VT +Q P V+IRAI P EG D MI+ RQ + + LTNGPGK+ Sbjct: 75 YRCYGMHSLFNVVTAKQDLPHAVLIRAILPYEGEDIMIQRRQWQNKPKHLLTNGPGKVCQ 134 Query: 137 ALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVV 190 AL + + ++ S L + K K +I PRIGI +LP R+++ Sbjct: 135 ALNLTLEHNTHALTSPHLHI--SKEKASGRITQTPRIGIDYAEECKDLPWRFLL 186 >CLOAB 3MGH_CLOAB (Q97EY6) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 205 Score = 108 bits (270), Expect = 1e-23 Identities = 70/202 (34%), Positives = 110/202 (54%), Gaps = 10/202 (4%) Query: 9 INIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQ 68 I F ++ T VA+ LLG L HE G IV+ EAY G +D+ AH++G R+TPR + Sbjct: 4 
IREFYSRDTIVVAKELLGKVLVHEVNGIRTSGKIVEVEAYRGINDKGAHAYGGRRTPRTE 63 Query: 69 AMYDKPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGR----- 122 A+Y G Y+Y ++ + +N+V ++G P+GV+IRAIEP+EG++ M E R + Sbjct: 64 ALYGPAGHAYVYFIYGLYYCMNVVAMQEGIPEGVLIRAIEPIEGIEVMSERRFKKLFNDL 123 Query: 123 ---QGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKG 179 Q LTNGP KL +A+ I ++ + L + K + + +EA R+GI Sbjct: 124 TKYQLKNLTNGPSKLCSAMEIRREQNLMDLNGDELYIEEGKNESFEIVEA-KRVGIDYAE 182 Query: 180 RWTELPLRYVVAGNPYISKQKR 201 + R+ + GN +S K+ Sbjct: 183 EAKDYLWRFYIKGNKCVSVLKK 204 >CLOPE 3MGH_CLOPE (Q8XHA9) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 205 Score = 107 bits (266), Expect = 4e-23 Identities = 69/199 (34%), Positives = 107/199 (53%), Gaps = 11/199 (5%) Query: 12 FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71 F + T VA+ LLG L L G IV+ EAY+G D+A+H++G ++T R + +Y Sbjct: 7 FYNRDTVTVAKELLGKVLVRNINGVTLKGKIVETEAYIGAIDKASHAYGGKRTNRTETLY 66 Query: 72 DKPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVEL--- 127 PGT+Y+Y ++ + LN++++E+ GV+IR IEP+EG+++M + R + EL Sbjct: 67 ADPGTVYVYIIYGMYHCLNLISEEKDVAGGVLIRGIEPLEGIEEMSKLRYKKSYEELSNY 126 Query: 128 -----TNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPK--KIEALPRIGIPNKGR 180 +NGP KL ALGIDK G + SS V + K I RIGI Sbjct: 127 EKKNFSNGPSKLCMALGIDKGENGINTISSEEIYVEDDSLIKKDFSIVEAKRIGIDYAEE 186 Query: 181 WTELPLRYVVAGNPYISKQ 199 + R+ + N ++SK+ Sbjct: 187 ARDFLWRFYIKDNKFVSKK 205 >STRMU Q8DRU4 (Q8DRU4) Putative 3-methyladenine DNA glycosylase Length = 192 Score = 103 bits (258), Expect = 3e-22 Identities = 64/173 (36%), Positives = 91/173 (52%), Gaps = 15/173 (8%) Query: 40 GYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIYLYTMHTHLILNMVTQEQGKPQ 99 G IV+ EAYLG D A HS R+TP+ +AMY G Y+Y ++ H +LN+VT+ Q + Sbjct: 34 GRIVETEAYLGSKDSACHSANDRRTPKNEAMYLAAGHWYVYQIYGHQMLNLVTKPQNVAE 93 Query: 100 GVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPE 159 V+IRA+E + G L NGPGKL GIDK G S+ S L L + Sbjct: 94 AVLIRALETAD-------------GHLLANGPGKLTKFAGIDKSFNGDSLQDSRLSL--Q 138 
Query: 160 KRKFPKKIEALPRIGIPNKGRWTELPLRYVVAGNPYISKQKRTAVDQIDFGWK 212 + P++IE RIG+ W + L + V GN ++SK + ++ WK Sbjct: 139 EDLSPQRIEERSRIGVTCTDEWKDALLCFYVRGNQHVSKIAKKSLLTDKETWK 191 >DEIRA 3MGH_DEIRA (Q9RSQ0) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 190 Score = 85.9 bits (211), Expect = 9e-17 Identities = 64/181 (35%), Positives = 97/181 (53%), Gaps = 7/181 (3%) Query: 20 VAQYLLGMYLEHETATGV-LGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIY 78 +A+ LLG L T G L G +V+ EAY P D A + G R M PG Sbjct: 3 LARELLGGTLVRVTPDGHRLSGRVVEVEAYDCPRDPACTA-GRFHAARSAEMAIAPGHWL 61 Query: 79 LYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAAL 138 + H H +L + +++G V+IRA+EP+EG KM++ R + +LT+GP KLV AL Sbjct: 62 FWFAHGHPLLQVACRQEGVSASVLIRALEPLEGAGKMLDYRPVTRQRDLTSGPAKLVYAL 121 Query: 139 GID-KQLYGQSIFSSSLRLV-PEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAGNPYI 196 G+D Q+ + + S L L+ PE ++ R+GI +GR LP R+++ GN ++ Sbjct: 122 GLDPMQISHRPVNSPELHLLAPETPLADDEVTVTARVGI-REGR--NLPWRFLIRGNGWV 178 Query: 197 S 197 S Sbjct: 179 S 179 >CORGL 3MGH_CORGL (Q8NU33) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 189 Score = 82.0 bits (201), Expect = 1e-15 Identities = 66/185 (35%), Positives = 100/185 (54%), Gaps = 16/185 (8%) Query: 20 VAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIYL 79 VA LLG L H G +G I + EAYL DEAAH++ KTPR AM+ G +Y+ Sbjct: 12 VAPQLLGCTLTH----GGVGIRITEVEAYLDSTDEAAHTY-RGKTPRNAAMFGPGGHMYV 66 Query: 80 YTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGV---ELTNGPGKLV 135 Y + H N+V +G QGV++RA E V G + + ++R+G +G+ L GPG Sbjct: 67 YISYGIHRAGNIVCGPEGTGQGVLLRAGEVVSG-ESIAQSRRG-EGIPHARLAQGPGNFG 124 Query: 136 AALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAGNPY 195 ALG++ S+F S L+ ++ + P+ + PRIGI TE LR+ + +P Sbjct: 125 QALGLEISDNHASVFGPSF-LISDRVETPEIVRG-PRIGISKN---TEALLRFWIPNDPT 179 Query: 196 ISKQK 200 +S ++ Sbjct: 180 VSGRR 184 >STRCO 3MGH_STRCO (Q9S208) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 213 Score = 80.5 bits (197), Expect = 4e-15 Identities = 
59/184 (32%), Positives = 91/184 (49%), Gaps = 8/184 (4%) Query: 19 EVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIY 78 EVA LLG L G + + + EAY G +D +H++ R TPR + M+ PG +Y Sbjct: 21 EVAPDLLGRILVRTGPDGPITLRLTEVEAYDGQNDPGSHAYRGR-TPRNEVMFGPPGHVY 79 Query: 79 LY-TMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENR-QGRQGVELTNGPGKLVA 136 +Y T +N+V +G+ V++RA E ++G + R R EL GP +L Sbjct: 80 VYFTYGMWFCMNLVCGPEGRSSAVLLRAGEIIDGAELARTRRLSARNDKELAKGPARLAT 139 Query: 137 ALGIDKQLYGQSIFSSS---LRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAGN 193 ALG+D+ L G +S LR++ ++ PR G+ +G P RY VA + Sbjct: 140 ALGVDRALNGTDACTSQETPLRILTGTPVPGDQVRNGPRTGVAGEG--GVHPWRYWVADD 197 Query: 194 PYIS 197 P +S Sbjct: 198 PTVS 201 >BRAJA 3MGH_BRAJA (Q89LR7) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 200 Score = 79.0 bits (193), Expect = 1e-14 Identities = 68/193 (35%), Positives = 94/193 (48%), Gaps = 22/193 (11%) Query: 12 FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71 F ++ EVA L+G + GV GG IV+ EAY + AAHS+ TPR M+ Sbjct: 20 FFGRSVREVAHDLIGATM---LVDGV-GGLIVEVEAY-HHTEPAAHSYN-GPTPRNHVMF 73 Query: 72 DKPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNG 130 PG Y+Y + H +N V + +G V+IRA+EP G+ M R + L +G Sbjct: 74 GPPGFAYVYRSYGIHWCVNFVCEAEGSAAAVLIRALEPTHGIAAMRRRRHLQDVHALCSG 133 Query: 131 PGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALP-----RIGIPNKGRWTELP 185 PGKL ALGI +I ++L L + E L RIGI + ELP Sbjct: 134 PGKLTEALGI-------TIAHNALPLDRPPIALHARTEDLEVATGIRIGIT---KAVELP 183 Query: 186 LRYVVAGNPYISK 198 RY V G+ ++SK Sbjct: 184 WRYGVKGSKFLSK 196 >STRAW 3MGH_STRAW (Q829C5) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 213 Score = 72.8 bits (177), Expect = 8e-13 Identities = 57/191 (29%), Positives = 88/191 (46%), Gaps = 8/191 (4%) Query: 12 FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71 F + +VA LLG L T G + + + EAY GP D +H++ R T R M+ Sbjct: 14 FFARPVLDVAPDLLGRVLVRTTPDGPIELRVTEVEAYDGPSDPGSHAYRGR-TARNGVMF 72 Query: 72 
DKPGTIYLY-TMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENR-QGRQGVELTN 129 PG +Y+Y T +N+V +G+ V++RA E +EG + R R EL Sbjct: 73 GPPGHVYVYFTYGMWHCMNLVCGPEGRASAVLLRAGEIIEGAELARTRRLSARNDKELAK 132 Query: 130 GPGKLVAALGIDKQLYGQSIFS---SSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPL 186 GP +L AL +D+ L G + L L+ P ++ PR G+ G P Sbjct: 133 GPARLATALEVDRALDGTDACAPEGGPLTLLSGTPVPPDQVRNGPRTGVSGDG--GVHPW 190 Query: 187 RYVVAGNPYIS 197 R+ + +P +S Sbjct: 191 RFWIDNDPTVS 201 >COREF 3MGH_COREF (Q8FUA2) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 190 Score = 69.3 bits (168), Expect = 9e-12 Identities = 58/182 (31%), Positives = 85/182 (46%), Gaps = 11/182 (6%) Query: 20 VAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIYL 79 VA LLG H+ + L + EAYLG +D AAH+ KT R AM+ G +Y+ Sbjct: 12 VAPQLLGCIFTHDGVSIRL----TEVEAYLGAEDAAAHTHR-GKTARNAAMFGPGGHMYI 66 Query: 80 YTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAAL 138 Y + H N+ +G QGV++RA E V G D R L GPG L AL Sbjct: 67 YISYGIHRAGNIACAPEGVGQGVLLRAGEVVAGEDIAYRRRGDVPFTRLAQGPGNLGQAL 126 Query: 139 GIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAGNPYISK 198 I + +L+ E + P+ + PR+GI + PLR+ + G+P +S Sbjct: 127 NFQLSDNHAPINGTDFQLM-EPSERPEWVSG-PRVGITKN---ADAPLRFWIPGDPTVSV 181 Query: 199 QK 200 ++ Sbjct: 182 RR 183 >PROAC Q6A5L3 (Q6A5L3) Putative 3-methylpurine DNA glycosylase Length = 191 Score = 65.9 bits (159), Expect = 9e-11 Identities = 56/190 (29%), Positives = 85/190 (44%), Gaps = 23/190 (12%) Query: 19 EVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIY 78 EVA LLG + G +G + + EAY+G DD A+H+F TPR + M+ P IY Sbjct: 10 EVAPLLLGATIWR----GPVGIRLTEVEAYMGLDDPASHAFR-GPTPRARVMFGPPSHIY 64 Query: 79 LYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAA 137 +Y + H +N+V G+ V++R + + G D R L GPG + +A Sbjct: 65 VYLSYGMHRCVNLVCSPDGEASAVLLRGGQVIAGHDDARRRRGNVAENRLACGPGNMGSA 124 Query: 138 LGIDKQLYGQ----------SIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLR 187 LG + G S L PE +F + PR+GI R + P R Sbjct: 125 
LGASLEESGNPVSIIGNGAISALGWRLEPAPEIAEFRQG----PRVGI---SRNIDAPWR 177 Query: 188 YVVAGNPYIS 197 + + +P +S Sbjct: 178 WWIPQDPTVS 187 >MYCPA Q740F6 (Q740F6) Hypothetical protein Length = 205 Score = 64.3 bits (155), Expect = 3e-10 Identities = 66/198 (33%), Positives = 92/198 (46%), Gaps = 30/198 (15%) Query: 19 EVAQYLLGMYLEHETATGVLGGYIVDAEAYLG-PD----DEAAHSF-GLRKTPRLQAMYD 72 E A+ LLG L T GV G IV+ EAY G PD D AAHS+ GLR R M+ Sbjct: 14 EAARRLLGATL---TGRGV-SGVIVEVEAYGGVPDGPWPDAAAHSYKGLRA--RNFVMFG 67 Query: 73 KPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQG-----VE 126 PG +Y Y H H+ N+ G V++RA +G D +GR+G Sbjct: 68 PPGRLYTYRSHGIHVCANVSCGPDGTAAAVLLRAAALEDGTDVA----RGRRGELVHTAA 123 Query: 127 LTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEAL--PRIGIPNKGRWTEL 184 L GPG L AA+GI G +F P + + + A+ PR+G+ + + Sbjct: 124 LARGPGNLCAAMGITMADNGIDLFDPD---SPVTLRLHEPLTAVCGPRVGV---SQAADR 177 Query: 185 PLRYVVAGNPYISKQKRT 202 P R + G P +S +R+ Sbjct: 178 PWRLWLPGRPEVSAYRRS 195 >MYCLEH 3MGH_MYCTU (P65412) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 203 Score = 63.5 bits (153), Expect = 5e-10 Identities = 55/171 (32%), Positives = 81/171 (47%), Gaps = 16/171 (9%) Query: 42 IVDAEAYLG-PD----DEAAHSFGLRKTPRLQAMYDKPGTIYLYTMH-THLILNMVTQEQ 95 +V+ EAY G PD D AAHS+ R R M+ PG +Y Y H H+ N+ Sbjct: 31 VVEVEAYGGVPDGPWPDAAAHSYRGRNG-RNDVMFGPPGRLYTYRSHGIHVCANVACGPD 89 Query: 96 GKPQGVMIRAIEPVEGVDKMIENR-QGRQGVELTNGPGKLVAALGIDKQLYGQSIF--SS 152 G V++RA +G + R Q + V L GPG L AALGI G +F SS Sbjct: 90 GTAAAVLLRAAAIEDGAELATSRRGQTVRAVALARGPGNLCAALGITMADNGIDLFDPSS 149 Query: 153 SLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAGNPYISKQKRTA 203 +RL + + + PR+G+ + + P R + G P +S +R++ Sbjct: 150 PVRL---RLNDTHRARSGPRVGV---SQAADRPWRLWLTGRPEVSAYRRSS 194 >MYCLE 3MGH_MYCLE (O05678) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 214 Score = 60.1 bits (144), Expect = 5e-09 Identities = 60/190 (31%), Positives = 88/190 (46%), Gaps = 18/190 (9%) Query: 21 AQYLLGMYLEHETATGVLGGYIVDAEAYLG-PD----DEAAHSFGLRKTPRLQAMYDKPG 75 
A LLG + T GV +V+ EAY G PD D AAHS+ R R M+ PG Sbjct: 25 AHRLLGATI---TGRGVCA-IVVEVEAYGGVPDGPWPDAAAHSYHGRND-RNAVMFGPPG 79 Query: 76 TIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGR--QGVELTNGPG 132 +Y Y H H+ N+ G V+IRA G D + +R+G + V L GPG Sbjct: 80 RLYTYCSHGIHVCANVSCGPDGTAAAVLIRAGALENGAD-VARSRRGASVRTVALARGPG 138 Query: 133 KLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAG 192 L +ALGI G +F++ + + + + PR+GI + + P R + G Sbjct: 139 NLCSALGITMDDNGIDVFAADSPVTLVLNEAQEAMSG-PRVGISHA---ADRPWRLWLPG 194 Query: 193 NPYISKQKRT 202 P +S +R+ Sbjct: 195 RPEVSTYRRS 204 >RICCN 3MGH_RICCN (Q92IE0) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 183 Score = 51.6 bits (122), Expect = 2e-06 Identities = 39/131 (29%), Positives = 62/131 (47%), Gaps = 18/131 (13%) Query: 12 FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71 F + T V+ L+G L + T + I + E+Y+G +D A H+ +T R M+ Sbjct: 11 FFARDTNVVSTELIGKALYFQGKTAI----ITETESYIGQNDPACHA-ARGRTKRTDIMF 65 Query: 72 DKPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNG 130 G Y+Y ++ + LN VT+ +G P +IR + + + EN NG Sbjct: 66 GPAGFSYVYLIYGMYYCLNFVTEAKGFPAATLIRGVHVI-----LPENLY-------LNG 113 Query: 131 PGKLVAALGID 141 PGKL LGI+ Sbjct: 114 PGKLCKYLGIN 124 >RICPR 3MGH_RICPR (Q9ZDH7) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 217 Score = 48.9 bits (115), Expect = 1e-05 Identities = 29/96 (30%), Positives = 49/96 (51%), Gaps = 6/96 (6%) Query: 12 FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71 F + T V+ L+G L + T + I + E+Y+G DD A H+ +T R M+ Sbjct: 11 FFARDTNLVSTELIGKVLYFQGTTAI----ITETESYIGEDDPACHA-ARGRTKRTDVMF 65 Query: 72 DKPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAI 106 G Y+Y ++ + LN VT+++G P +IR + Sbjct: 66 GPAGFSYVYLIYGMYYCLNFVTEDEGFPAATLIRGV 101 >PSEAE 3MGH_PSEAE (Q9HX17) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 239 Score = 45.1 bits (105), Expect = 2e-04 Identities = 49/184 (26%), Positives = 80/184 (43%), Gaps = 17/184 (9%) Query: 20 
VAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIYL 79 VA+ LLG + H L I++ EAY +++ +H+ L T + +A++ G IY+ Sbjct: 29 VARELLGKVIRHRQGNLWLAARIIETEAYY-LEEKGSHA-SLGYTEKRKALFLDGGHIYM 86 Query: 80 YTMHTHLILNMVTQEQGKPQGVMIRAIEP----------VEGVDKMIENRQG--RQGVEL 127 Y LN G V+I++ P +E + + + QG R+ L Sbjct: 87 YYARGGDSLNF--SAGGPGNAVLIKSGHPWLDRISDHTALERMQSLNPDSQGRPREIGRL 144 Query: 128 TNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLR 187 G L A+G+ + F V + + P ++ R+GIP KGR LP R Sbjct: 145 CAGQTLLCKAMGLKVPEWDAQRFDPQRLFVDDVGERPSQVIQAARLGIP-KGRDEHLPYR 203 Query: 188 YVVA 191 +V A Sbjct: 204 FVDA 207 >PSEPK Q88DL3 (Q88DL3) DNA-3-methyladenine glycosylase, putative Length = 222 Score = 41.6 bits (96), Expect = 0.002 Identities = 48/192 (25%), Positives = 77/192 (40%), Gaps = 17/192 (8%) Query: 12 FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71 F + + +A+ LLG + H L I++ EAY D + S G T + +A++ Sbjct: 8 FFDRDAQTLAKALLGKVIRHRHGDLWLAARIIETEAYYLSDKGSHASLGY--TEKRKALF 65 Query: 72 DKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEP----VEGVDKMIENRQG------ 121 G IY+Y LN G V+I++ P + G D + + + Sbjct: 66 LDGGHIYMYYARGGDSLNF--SAHGPGNAVLIKSAYPWQDTLSGPDSLAQMQLNNPDASG 123 Query: 122 --RQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKG 179 R L G L ALG+ + F + V + ++ R+GIP+ G Sbjct: 124 NIRPQERLCAGQTLLCRALGLKVPHWDAQRFDAERLYVEDCGNAVPRVIQAARLGIPH-G 182 Query: 180 RWTELPLRYVVA 191 R LP R+V A Sbjct: 183 RDEHLPYRFVDA 194 >BORPE Q7VWB7 (Q7VWB7) Putative methypurine-DNA glycosylase Length = 238 Score = 40.4 bits (93), Expect = 0.004 Identities = 47/190 (24%), Positives = 78/190 (41%), Gaps = 17/190 (8%) Query: 12 FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71 F + +++A+ LLG + H L I++ EAY + + S G T + +A++ Sbjct: 20 FFNRDAQQLARDLLGKVVRHRVDGLWLSARIIETEAYYLAEKGSHASLGY--THKRRALF 77 Query: 72 DKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRA----IEPVEG------VDKMIENRQG 121 G +Y+Y LN G V+I++ ++ V G + ++ + QG Sbjct: 78 MDGGVVYMYYARGGDSLNF--SAAGPGNAVLIKSGHPWVDEVSGPRALACMQRLNPDAQG 135 Query: 122 
--RQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKG 179 R L G L ALG+ + F V + ++ R+GIP G Sbjct: 136 QPRPPARLCAGQTLLCRALGLKVPQWDAKAFDPRRFFVEDVGVRLDRLVRTTRLGIP-AG 194 Query: 180 RWTELPLRYV 189 R LP RYV Sbjct: 195 RDEHLPYRYV 204 >BORPA Q7W9Q9 (Q7W9Q9) Putative methypurine-DNA glycosylase Length = 238 Score = 40.4 bits (93), Expect = 0.004 Identities = 47/190 (24%), Positives = 78/190 (41%), Gaps = 17/190 (8%) Query: 12 FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71 F + +++A+ LLG + H L I++ EAY + + S G T + +A++ Sbjct: 20 FFNRDAQQLARDLLGKVVRHRVDGLWLSARIIETEAYYLAEKGSHASLGY--THKRRALF 77 Query: 72 DKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRA----IEPVEG------VDKMIENRQG 121 G +Y+Y LN G V+I++ ++ V G + ++ + QG Sbjct: 78 MDGGVVYMYYARGGDSLNF--SAAGPGNAVLIKSGHPWVDEVSGPRALACMQRLNPDAQG 135 Query: 122 --RQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKG 179 R L G L ALG+ + F V + ++ R+GIP G Sbjct: 136 QPRPPARLCAGQTLLCRALGLKVPQWDAKAFDPRRFFVEDVGVRLDRLVRTTRLGIP-AG 194 Query: 180 RWTELPLRYV 189 R LP RYV Sbjct: 195 RDEHLPYRYV 204 >STRAW Q93HJ1 (Q93HJ1) Modular polyketide synthase Length = 3613 Score = 35.4 bits (80), Expect = 0.14 Identities = 16/39 (41%), Positives = 23/39 (58%) Query: 99 QGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAA 137 +G M+ PVEGV++ + +GR GV NGPG V + Sbjct: 700 KGGMVSVALPVEGVEERLARFEGRIGVAAVNGPGSAVVS 738 >STRAW Q93HJ2 (Q93HJ2) Modular polyketide synthase Length = 4685 Score = 33.1 bits (74), Expect = 0.68 Identities = 15/39 (38%), Positives = 23/39 (58%) Query: 99 QGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAA 137 +G M+ PVEGV++ + +GR GV NGP +V + Sbjct: 3743 KGGMVSVALPVEGVEERLARFEGRIGVAAVNGPTSVVVS 3781 Score = 33.1 bits (74), Expect = 0.68 Identities = 15/39 (38%), Positives = 23/39 (58%) Query: 99 QGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAA 137 +G M+ PVEGV++ + +GR GV NGP +V + Sbjct: 2223 KGGMVSVALPVEGVEERLARFEGRIGVAAVNGPTSVVVS 2261 Score = 30.4 bits (67), Expect = 4.4 Identities = 14/39 (35%), Positives = 22/39 (56%) Query: 99 QGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAA 
137 +G M+ PV V++ + +GR GV NGPG +V + Sbjct: 695 KGGMVSVALPVGEVEERLARFEGRIGVAAVNGPGSVVVS 733 >SILPO Q5LKE3 (Q5LKE3) Peptidyl-prolyl cis-trans isomerase, FKBP-type Length = 142 Score = 32.0 bits (71), Expect = 1.5 Identities = 20/68 (29%), Positives = 34/68 (50%), Gaps = 6/68 (8%) Query: 114 KMIENRQGRQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKF----PKKIEA 169 K ++ +GR +E T G G+++ G+DK + G VP + P+ +A Sbjct: 22 KTFDSSEGRDPLEFTVGSGQIIP--GLDKAMPGMETGEKKRVEVPCAEAYGPLNPEARQA 79 Query: 170 LPRIGIPN 177 +PR GIP+ Sbjct: 80 IPREGIPD 87 >SILPO Q5LVX1 (Q5LVX1) ABC transporter, transmembrane ATP-binding protein, putative Length = 1032 Score = 30.4 bits (67), Expect = 4.4 Identities = 19/48 (39%), Positives = 26/48 (54%), Gaps = 8/48 (16%) Query: 101 VMIRAIEPVEGVDKMIENRQG----RQGVE----LTNGPGKLVAALGI 140 V ++ EP +G MIE G R+G E +T GPG+LV LG+ Sbjct: 935 VFLKDDEPTDGAYMMIEGEAGLYLPREGQEDQLIVTVGPGRLVGELGL 982 >CLOTE DXS_CLOTE (Q894H0) 1-deoxy-D-xylulose-5-phosphate synthase (EC 2.2.1.7) (1-deoxyxylulose-5-phosphate synthase) (DXP synthase) (DXPS) Length = 620 Score = 30.0 bits (66), Expect = 5.8 Identities = 19/55 (34%), Positives = 28/55 (50%), Gaps = 1/55 (1%) Query: 138 LGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAG 192 + I K + G S + SSLR+ P KF + +E + + IPN G+ L V G Sbjct: 179 MSIGKNVGGLSTYLSSLRIDPNYNKFKRDVEGIIK-KIPNIGKGVAKNLERVKDG 232 >BURMA Q9AI54 (Q9AI54) DedA family protein Length = 1925639 Score = 29.6 bits (65), Expect = 7.5 Identities = 32/136 (23%), Positives = 52/136 (38%), Gaps = 6/136 (4%) Query: 43 VDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGT-IYLYTMHTHLILNMVTQEQGKPQGV 101 V+ A P A ++ + A Y G + + H L + Q K + Sbjct: 1823164 VELVANEAPGSRMAFMHPVKSRAAISAAYFDHGVKTFSFDTHEELAKILDATGQAKDLNL 1823223 Query: 102 MIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAALGIDKQLYGQS--IFSSSLRLVPE 159 ++R EG + G+ GVE+ N P L+AA + L G S + S +R Sbjct: 1823224 IVRMGVQAEGAAYSLS---GKFGVEMHNAPDLLLAARRATQDLMGVSFHVGSQCMRPTAF 1823280 Query: 160 KRKFPKKIEALPRIGI 175 + + AL R G+ Sbjct: 1823281 QAAMAQASRALVRAGV 1823296 >STAES 
Q8CST0 (Q8CST0) Hypothetical protein SE0952 Length = 572 Score = 29.3 bits (64), Expect = 9.8 Identities = 17/75 (22%), Positives = 36/75 (48%), Gaps = 5/75 (6%) Query: 98 PQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAALG-----IDKQLYGQSIFSS 152 P+ M+ + + +IEN++ +G+ LT+G + A+ ID +YG + + Sbjct: 60 PEDEMLGVDIVIPDIQYVIENKERLKGIFLTHGHEHAIGAVSYVLEQIDAPVYGSKLTIA 119 Query: 153 SLRLVPEKRKFPKKI 167 ++ + R KK+ Sbjct: 120 LVKEAMKARNIKKKV 134 >SILPO Q5LMI5 (Q5LMI5) ADA regulatory protein Length = 283 Score = 29.3 bits (64), Expect = 9.8 Identities = 24/103 (23%), Positives = 48/103 (46%), Gaps = 4/103 (3%) Query: 88 LNMVTQEQGKPQGVMIRAIEPVE--GVDKMIENRQGRQGVELTNGPGKLVAALGIDKQLY 145 +N+ T E+G GV+ RAIE ++ G +++ R + + +G+ + Y Sbjct: 1 MNVQTTEEGYHYGVIRRAIELIDAGGESMPLDDLAARMNMSPAHFQRIFSRWVGVSPKKY 60 Query: 146 GQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRY 188 Q + + + E+R +EA +G+ GR +L +R+ Sbjct: 61 QQYLTLGHAKALLEERF--TLLEAAQNVGLSGTGRLHDLFVRW 101 Database: Blastdata.fdb Posted date: Mar 29, 2006 3:30 PM Number of letters in database: 77,468,597 Number of sequences in database: 240,170 Lambda K H 0.316 0.135 0.391 Gapped Lambda K H 0.267 0.0410 0.140 Matrix: BLOSUM62 Gap Penalties: Existence: 11, Extension: 1 Number of Hits to DB: 35,841,668 Number of Sequences: 240170 Number of extensions: 1550248 Number of successful extensions: 3502 Number of sequences better than 10.0: 43 Number of HSP's better than 10.0 without gapping: 24 Number of HSP's successfully gapped in prelim test: 19 Number of HSP's that attempted gapping in prelim test: 3332 Number of HSP's gapped (non-prelim): 140 length of query: 229 length of database: 77,468,597 effective HSP length: 107 effective length of query: 122 effective length of database: 51,770,407 effective search space: 6315989654 effective search space used: 6315989654 T: 11 A: 40 X1: 16 ( 7.3 bits) X2: 38 (14.6 bits) X3: 64 (24.7 bits) S1: 41 (21.6 bits) S2: 64 (29.3 bits) BLASTP 2.2.10 [Oct-19-2004] Reference: Altschul, Stephen F., Thomas L. 
Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402. Query= ENTFA AACA_ENTFA (P0A0C2) Bifunctional AAC/APH [Includes: 6'-aminoglycoside N-acetyltransferase (EC 2.3.1.-) (AAC(6')); 2''-aminoglycoside phosphotransferase (EC 2.7.1.-) (APH(2''))] (479 letters) Database: Blastdata.fdb 240,170 sequences; 77,468,597 total letters Searching..................................................done Score E Sequences producing significant alignments: (bits) Value STAAM AACA_STAAM (P0A0C0) Bifunctional AAC/APH [Includes: 6'-ami... 959 0.0 ENTFA AACA_ENTFA (P0A0C2) Bifunctional AAC/APH [Includes: 6'-ami... 959 0.0 BACAN Q81P54 (Q81P54) Acetyltransferase, GNAT family 168 4e-41 BACC1 Q735Z9 (Q735Z9) Acetyltransferase, GNAT family 159 1e-38 BACAN Q81T90 (Q81T90) Aminoglycoside phophotransferase family pr... 67 1e-10 BACC1 Q73BC3 (Q73BC3) Aminoglycoside phophotransferase family pr... 62 4e-09 BRAJA Q89WN0 (Q89WN0) Bll0648 protein 59 3e-08 BACHD Q9K9M4 (Q9K9M4) BH2621 protein 56 2e-07 BACC1 Q739G2 (Q739G2) 6'-aminoglycoside N-acetyltransferase/2''-... 55 5e-07 THEMA Q9X063 (Q9X063) Hypothetical protein 52 3e-06 CLOTE Q896X4 (Q896X4) Putative acetyltransferase 49 3e-05 BACHD Q9KB15 (Q9KB15) BH2121 protein 48 6e-05 STAES Q8CQT6 (Q8CQT6) Spermidine acetyltransferase 47 1e-04 VIBCH Q9KMC2 (Q9KMC2) Acetyltransferase, putative 45 5e-04 BACSU BLTD_BACSU (P39909) Spermine/spermidine acetyltransferase ... 
45 6e-04 BACC1 Q739N8 (Q739N8) Spermidine acetyltransferase, putative 44 0.001 LACPL Q88SW8 (Q88SW8) Acetyltransferase (Putative) 44 0.001 VIBUY Q7MK89 (Q7MK89) Putative acetyltransferase 43 0.002 DEIRA Q9RW71 (Q9RW71) Acetyltransferase, putative 43 0.002 BACAN Q81RM0 (Q81RM0) Spermidine acetyltransferase, putative 43 0.002 LACJO Q74K74 (Q74K74) Hypothetical protein 42 0.003 BACC1 Q72YV7 (Q72YV7) Acetyltransferase, GNAT family 42 0.003 BACC1 Q734R6 (Q734R6) Acetyltransferase, GNAT family 42 0.004 CLOPE Q8XKH9 (Q8XKH9) Hypothetical protein CPE1416 42 0.005 BACAN Q81SQ8 (Q81SQ8) Acetyltransferase, GNAT family 41 0.007 VIBCH Q9K330 (Q9K330) Acetyltransferase, putative 41 0.009 VIBUY Q7MK96 (Q7MK96) Putative acetyltransferase 40 0.012 WIGBR Q8D3I4 (Q8D3I4) Imp protein 40 0.016 BACSU P94482 (P94482) YnaD 40 0.021 BACC1 Q73A91 (Q73A91) Acetyltransferase, GNAT family 40 0.021 THETN Q8RC99 (Q8RC99) Acetyltransferases 39 0.027 STRAW Q82IB6 (Q82IB6) Putative acetyltransferase 39 0.027 LISIN Q92E38 (Q92E38) Lin0623 protein 39 0.027 STRCO O69977 (O69977) Hypothetical protein SCO5801 39 0.036 STRAW Q82KD8 (Q82KD8) Hypothetical protein 39 0.036 VIBCH Q9KMA3 (Q9KMA3) Acetyltransferase, putative 39 0.046 STAAM Q932M5 (Q932M5) Hypothetical protein SAVP027 39 0.046 LACLA Q9CHJ8 (Q9CHJ8) Acetyl transferase 39 0.046 ENTFA Q82YR8 (Q82YR8) Acetyltransferase, GNAT family 39 0.046 BACC1 Q73AS8 (Q73AS8) Acetyltransferase, GNAT family 39 0.046 BACC1 Q734C3 (Q734C3) Acetyltransferase, GNAT family 39 0.046 BACAN Q81YL2 (Q81YL2) Acetyltransferase, GNAT family 39 0.046 BACAN Q81XB7 (Q81XB7) Spermidine N1-acetyltransferase 39 0.046 SALTI Q8Z6Z1 (Q8Z6Z1) Spermidine N1-acetyltransferase 38 0.061 SALTY Q8ZPJ3 (Q8ZPJ3) Spermidine N1-acetyltransferase (EC 2.3.1.57) 38 0.061 SALCH Q57PD6 (Q57PD6) Spermidine N1-acetyltransferase 38 0.061 MYCPE Q8EWE5 (Q8EWE5) Acetyltransferase GNAT family 38 0.061 BACC1 Q72Y03 (Q72Y03) Spermidine N1-acetyltransferase (EC 2.3.1.57) 38 0.061 DEIRA Q9RW73 (Q9RW73) 
Acetyltransferase, putative 38 0.079 STAAM Q99U68 (Q99U68) Hypothetical protein 37 0.10 RICPR Q9ZE75 (Q9ZE75) TRANSCRIPTIONAL ACTIVATOR PROTEIN CZCR (CzcR) 37 0.10 LACJO Q74J71 (Q74J71) Hypothetical protein 37 0.10 CLOAB Q97D33 (Q97D33) Acetyltransferase (With duplicated domains... 37 0.10 VIBCH Q9KU66 (Q9KU66) Ribosomal-protein-alanine acetyltransferase 37 0.18 STRR6 Q8DNF9 (Q8DNF9) Hypothetical protein spr1760 37 0.18 SALTY Q8ZNU6 (Q8ZNU6) Putative ribose 5-phosphate isomerase (EC ... 37 0.18 SALCH Q57N66 (Q57N66) Putative ribose 5-phosphate isomerase 37 0.18 ECO57 ATDA_ECO57 (P0A952) Spermidine N(1)-acetyltransferase (EC ... 37 0.18 ECOLI ATDA_ECOLI (P0A951) Spermidine N(1)-acetyltransferase (EC ... 37 0.18 BACHD Q9KG16 (Q9KG16) BH0299 protein 37 0.18 AQUAE O66838 (O66838) Ribosomal-protein-alanine acetyltransferase 37 0.18 PROAC Q6ABU2 (Q6ABU2) Acetyltransferase (EC 2.3.1.-) 36 0.23 BACAN Q81QL3 (Q81QL3) Acetyltransferase, GNAT family 36 0.23 STRMU Q8DV67 (Q8DV67) Putative acetyltransferase 36 0.30 STAAM COAD_STAAM (P63818) Phosphopantetheine adenylyltransferase... 36 0.30 LISMO Q8Y9B8 (Q8Y9B8) Lmo0614 protein 36 0.30 THEMA Q9WZ46 (Q9WZ46) Hypothetical protein 35 0.39 STRR6 Q8DQU8 (Q8DQU8) Hypothetical protein spr0490 35 0.39 CLOAB Q97FA0 (Q97FA0) Predicted acetyltransferase 35 0.39 _BUCAP LOLC_BUCAP (Q8K9N8) Lipoprotein-releasing system transmem... 35 0.39 BACC1 Q734M2 (Q734M2) Acetyltransferase, GNAT family 35 0.39 BACAN Q81U09 (Q81U09) Acetyltransferase, GNAT family 35 0.39 YERPE Q8ZHB3 (Q8ZHB3) Putative siderophore biosynthesis protein ... 35 0.51 VIBUY Q7MFH7 (Q7MFH7) Histone acetyltransferase HPA2 35 0.51 RICCN Q92JG6 (Q92JG6) Transcriptional activator protein czcR 35 0.51 CLOAB Q97G03 (Q97G03) Predicted acetyltransferase 35 0.51 BACHD Q9KF00 (Q9KF00) Ribosomal-protein-alanine N-acetyltransferase 35 0.51 BACAN Q81NB6 (Q81NB6) Spermine/spermidine acetyltransferase, put... 
35 0.51 BACAN Q81N07 (Q81N07) Acetyltransferase, GNAT family 35 0.51 STRMU Q8DT36 (Q8DT36) Putative acetyltransferase 35 0.67 PSEAE Q9I3W7 (Q9I3W7) Hypothetical protein 35 0.67 NITEU Q82UT0 (Q82UT0) GCN5-related N-acetyltransferase (EC 2.3.1... 35 0.67 LACLA Q9CF66 (Q9CF66) Spermidine acetyltransferase (EC 2.3.1.57) 35 0.67 BACC1 Q739G7 (Q739G7) Acetyltransferase, GNAT family 35 0.67 MYCPE Q8EWM2 (Q8EWM2) Hypothetical protein MYPE1810 34 0.88 MYCPE Q8EWE6 (Q8EWE6) Acetyltransferase GNAT family 34 0.88 LACLA Q9CHW9 (Q9CHW9) Hypothetical protein yfiL 34 0.88 LACPL Q88SM7 (Q88SM7) Prophage Lp4 protein 8, DNA primase/helicase 34 0.88 ENTFA Q837C4 (Q837C4) Acetyltransferase, GNAT family 34 0.88 CLOTE Q891D4 (Q891D4) Ribosomal-protein-alanine acetyltransferas... 34 0.88 CLOAB Q97I16 (Q97I16) Predicted acetyltransferase domain contain... 34 0.88 BACC1 Q72WY7 (Q72WY7) Hypothetical protein 34 0.88 VIBPA Q87G30 (Q87G30) Putative acetyltransferase 34 1.1 STAES ARGD2_STAES (Q8CSG1) Acetylornithine aminotransferase 2 (E... 
34 1.1 RICPR Q9ZEC5 (Q9ZEC5) 190 KD ANTIGEN (Sca1) 34 1.1 PSEPK Q88NL5 (Q88NL5) Acetyltransferase, GNAT family 34 1.1 LACPL Q88YF8 (Q88YF8) Acetyltransferase (Putative) 34 1.1 BACC1 Q737S7 (Q737S7) Acetyltransferase, GNAT family 34 1.1 BACC1 Q734H0 (Q734H0) Acetyltransferase, GNAT family family 34 1.1 BACAN Q81Q67 (Q81Q67) Acetyltransferase, GNAT family 34 1.1 BACAN Q81P77 (Q81P77) Acetyltransferase, GNAT family 34 1.1 Q8E1I7 (Q8E1I7) Acetyltransferase, GNAT family 33 1.5 Q8DXE6 (Q8DXE6) Hypothetical protein SAG1905 33 1.5 OCEIH Q8ET96 (Q8ET96) Hypothetical conserved protein 33 1.5 LISIN Q929M8 (Q929M8) Lin2246 protein 33 1.5 CLOTE Q895L4 (Q895L4) DNA topoisomerase I (EC 5.99.1.2) 33 1.5 CLOPE Q8XIF5 (Q8XIF5) Ribosomal-protein-alanine N-acetyltransferase 33 1.5 BRUSU Q8FVN1 (Q8FVN1) Acyl-CoA dehydrogenase family protein 33 1.5 BRUME Q8YCN6 (Q8YCN6) ACYL-COA DEHYDROGENASE (EC 1.3.99.-) 33 1.5 BACSU YQAR_BACSU (P45914) Hypothetical protein yqaR 33 1.5 BACAN Q81RF9 (Q81RF9) Acetyltransferase, GNAT family 33 1.5 VIBPA Q87NP1 (Q87NP1) Putative spermine/spermidine acetyltransfe... 33 2.0 THETN Q8RC65 (Q8RC65) Acetyltransferases 33 2.0 STRR6 Q8CYV9 (Q8CYV9) Hypothetical protein spr0850 33 2.0 STAES Q8CU70 (Q8CU70) Spermidine acetyltransferase 33 2.0 STAES Q8CS08 (Q8CS08) Hypothetical protein SE1483 33 2.0 RICPR Q9ZDP9 (Q9ZDP9) Hypothetical protein RP278 33 2.0 OCEIH Q8CXG0 (Q8CXG0) Diamine N-acetyltransferase (Spermine:sper... 33 2.0 CLOAB Q97J70 (Q97J70) Predicted acetyltransferase 33 2.0 BURMA Q9AI54 (Q9AI54) DedA family protein 33 2.0 BRAJA Q89YE3 (Q89YE3) Bll0009 protein 33 2.0 BACAN Q81NK7 (Q81NK7) Streptothricin acetyltransferase, putative 33 2.0 VIBUY Q7MI31 (Q7MI31) Ribosomal-protein-alanine acetyltransferase 33 2.6 OCEIH Q8EMB8 (Q8EMB8) Glucose-6-phosphate 1-dehydrogenase (EC 1.... 
33  2.6
OCEIH Q8EKW6 (Q8EKW6) Acetyltransferase  33  2.6
MYCPE Q8EWC2 (Q8EWC2) Oligoendopeptidase F  33  2.6
LISMO Q8Y8S4 (Q8Y8S4) Lmo0820 protein  33  2.6
CLOPE Q8XKM5 (Q8XKM5) Probable acetyltransferase  33  2.6
BUCAI SPED_BUCAI (P57304) S-adenosylmethionine decarboxylase pro...  33  2.6
AQUAE O67458 (O67458) Hypothetical protein aq_1482  33  2.6
YERPE Q8CZN5 (Q8CZN5) Acyltransferase for 30S ribosomal subunit ...  32  3.3
STRAW Q827N9 (Q827N9) Putative acetyltransferase  32  3.3
STAES Q8CU81 (Q8CU81) Spermidine acetyltransferase  32  3.3
RICPR Q9ZCN0 (Q9ZCN0) RIBOSOMAL-PROTEIN-ALANINE ACETYLTRANSFERAS...  32  3.3
OCEIH Q8ERM1 (Q8ERM1) Acetyltransferase  32  3.3
MYCGE RIBF_MYCGE (P47391) Putative riboflavin biosynthesis prote...  32  3.3
ENTFA Q836M4 (Q836M4) Spermine/spermidine acetyltransferase, put...  32  3.3
CHRVO Q7NRZ7 (Q7NRZ7) Probable acetyltransferase (EC 2.3.1.-)  32  3.3
CHLCV Q824H9 (Q824H9) Acetyltransferase, GNAT family  32  3.3
CHLMU SYR_CHLMU (Q9PJT8) Arginyl-tRNA synthetase (EC 6.1.1.19) (...  32  3.3
BACSU O34376 (O34376) Putative acetyl transferase (YobR protein)  32  3.3
BACC1 Q738E9 (Q738E9) Acetyltransferase, GNAT family  32  3.3
BACC1 Q736C5 (Q736C5) Acetyltransferase, GNAT family  32  3.3
BACC1 Q735P8 (Q735P8) Acetyltransferase, GNAT family  32  3.3
AQUAE YZ34_AQUAE (O66423) Hypothetical protein AA34  32  3.3
YERPE Q8ZCX6 (Q8ZCX6) Spermidine acetyltransferase (EC 2.3.1.57)...  32  4.4
STRR6 Q8DNN2 (Q8DNN2) Hypothetical protein spr1627  32  4.4
STAAM Q99RQ8 (Q99RQ8) Similar to transcription repressor of spor...  32  4.4
LACLA Q9CJA2 (Q9CJA2) Acetyl transferase  32  4.4
CLOTE Q892J2 (Q892J2) Conserved protein  32  4.4
BRAJA Q89UA8 (Q89UA8) Hypothetical acetyltransferase  32  4.4
BACSU O34558 (O34558) YopR protein  32  4.4
BACAN Q81R63 (Q81R63) Hypothetical protein  32  4.4
VIBPA Q87MD0 (Q87MD0) Acetyltransferase-related protein  32  5.7
STRR6 Q8DND0 (Q8DND0) Transcriptional activator  32  5.7
OCEIH Q8CUS9 (Q8CUS9) Hypothetical conserved protein  32  5.7
LISIN Q92E28 (Q92E28) Lin0633 protein  32  5.7
LACPL Q88U49 (Q88U49) Glucose-6-phosphate 1-dehydrogenase (EC 1....  32  5.7
CORDI Q6NHQ0 (Q6NHQ0) Putative acetyltransferase  32  5.7
BURMA Q62J98 (Q62J98) Ribosomal-protein-alanine acetyltransferas...  32  5.7
THETN Q8R764 (Q8R764) LysM-repeat proteins and domains  31  7.4
STRR6 Q8DPX8 (Q8DPX8) Hypothetical protein spr0952  31  7.4
STRMU Q8DVT1 (Q8DVT1) Putative ribosomal-protein-alanine acetylt...  31  7.4
SALTI SPED_SALTI (Q8Z9E3) S-adenosylmethionine decarboxylase pro...  31  7.4
SALTY SPED_SALTY (Q8ZRS4) S-adenosylmethionine decarboxylase pro...  31  7.4
SALCH Q57T90 (Q57T90) S-adenosylmethionine decarboxylase, proenzyme  31  7.4
RICCN Q92JP8 (Q92JP8) Cell surface antigen  31  7.4
NEIGI Q5FAH1 (Q5FAH1) Hypothetical protein  31  7.4
LISIN Q92DJ7 (Q92DJ7) Lin0816 protein  31  7.4
LACJO Q74J74 (Q74J74) Hypothetical protein  31  7.4
GEOSL Q74A59 (Q74A59) Sensory box histidine kinase  31  7.4
ENTFA Q836Z2 (Q836Z2) Acetyltransferase, GNAT family  31  7.4
ENTFA Q82YJ4 (Q82YJ4) Toxin ABC transporter, ATP-binding/permeas...  31  7.4
CLOAB Q97M70 (Q97M70) Predicted metal-dependent hydrolase  31  7.4
CLOAB Q97HI5 (Q97HI5) Predicted flavodoxin  31  7.4
BRAJA Q89R32 (Q89R32) Acyl-CoA dehydrogenase  31  7.4
BACC1 Q735F5 (Q735F5) Streptothricin acetyltransferase, putative  31  7.4
BACAN Q81YS9 (Q81YS9) Acetyltransferase, GNAT family  31  7.4
VIBPA Q87HD3 (Q87HD3) Hypothetical protein VPA1032  31  9.7
VIBCH Q9KL03 (Q9KL03) Spermidine n1-acetyltransferase  31  9.7
THEMA SYE2_THEMA (Q9X2I8) Glutamyl-tRNA synthetase 2 (EC 6.1.1.1...  31  9.7
THETN ISPH_THETN (Q8RA76) 4-hydroxy-3-methylbut-2-enyl diphospha...  31  9.7
STRCO Q9S2L5 (Q9S2L5) Hypothetical protein SCO1988  31  9.7
STRP1 Q99XX8 (Q99XX8) Putative pullulanase  31  9.7
STRP1 Q99YF0 (Q99YF0) Hypothetical protein SPy1734 (Acetyltransf...  31  9.7
STAES Q8CTP7 (Q8CTP7) Hypothetical protein SE0368  31  9.7
STAES COAD_STAES (Q8CSZ5) Phosphopantetheine adenylyltransferase...  31  9.7
MYCLEH Y3433_MYCTU (O06250) Hypothetical protein Rv3433c/MT3539  31  9.7
MYCTU Y3433_MYCTU (O06250) Hypothetical protein Rv3433c/MT3539  31  9.7
MYCBO Q7TWI6 (Q7TWI6) Hypothetical protein Mb3463c  31  9.7
LISMO Q8Y5C4 (Q8Y5C4) Lmo2141 protein  31  9.7
LISIN Q929Z8 (Q929Z8) Lin2125 protein  31  9.7
ENTFA Q837Z5 (Q837Z5) Acetyltransferase, GNAT family  31  9.7
ENTFA Q831M9 (Q831M9) Ribosomal-protein-alanine acetyltransferas...  31  9.7
CLOTE Q895T0 (Q895T0) Acetyltransferase (EC 2.3.1.-)  31  9.7
CLOPE Q8XMF9 (Q8XMF9) Hypothetical protein CPE0730  31  9.7
BRUSU Q8FY73 (Q8FY73) Outer membrane autotransporter  31  9.7
BACSU YCBJ_BACSU (P42242) Hypothetical protein ycbJ  31  9.7
BACHD Q9KE57 (Q9KE57) BH1001 protein  31  9.7
BACC1 Q73CD9 (Q73CD9) Glycerol-3-phosphate dehydrogenase, aerobi...
31 9.7 BACAN Q81VL8 (Q81VL8) Oxidoreductase, FAD-binding 31 9.7 BACAN Q81U57 (Q81U57) Glycerol-3-phosphate dehydrogenase, aerobic 31 9.7 BACAN Q81NU7 (Q81NU7) Acetyltransferase, GNAT family 31 9.7 >STAAM AACA_STAAM (P0A0C0) Bifunctional AAC/APH [Includes: 6'-aminoglycoside N-acetyltransferase (EC 2.3.1.-) (AAC(6')); 2''-aminoglycoside phosphotransferase (EC 2.7.1.-) (APH(2''))] Length = 479 Score = 959 bits (2480), Expect = 0.0 Identities = 467/479 (97%), Positives = 467/479 (97%) Query: 1 MNIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFR 60 MNIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFR Sbjct: 1 MNIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFR 60 Query: 61 VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120 VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI Sbjct: 61 VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120 Query: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYD 180 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYD Sbjct: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYD 180 Query: 181 DNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE 240 DNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE Sbjct: 181 DNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE 240 Query: 241 KAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKR 300 KAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKR Sbjct: 241 KAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKR 300 Query: 301 DIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNA 360 DIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNA Sbjct: 301 DIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNA 360 Query: 361 TTVFEGKKCLCHNDFSCNHLLLDGNNRLTXXXXXXXXXXXXEYCDFIYLLEDSEEEIGTN 420 TTVFEGKKCLCHNDFSCNHLLLDGNNRLT EYCDFIYLLEDSEEEIGTN Sbjct: 361 TTVFEGKKCLCHNDFSCNHLLLDGNNRLTGIIDFGDSGIIDEYCDFIYLLEDSEEEIGTN 420 Query: 421 
FGEDILRMYGNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENGRKEIYKRTYKD 479 FGEDILRMYGNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENGRKEIYKRTYKD Sbjct: 421 FGEDILRMYGNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENGRKEIYKRTYKD 479 >ENTFA AACA_ENTFA (P0A0C2) Bifunctional AAC/APH [Includes: 6'-aminoglycoside N-acetyltransferase (EC 2.3.1.-) (AAC(6')); 2''-aminoglycoside phosphotransferase (EC 2.7.1.-) (APH(2''))] Length = 479 Score = 959 bits (2480), Expect = 0.0 Identities = 467/479 (97%), Positives = 467/479 (97%) Query: 1 MNIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFR 60 MNIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFR Sbjct: 1 MNIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFR 60 Query: 61 VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120 VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI Sbjct: 61 VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120 Query: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYD 180 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYD Sbjct: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYD 180 Query: 181 DNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE 240 DNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE Sbjct: 181 DNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE 240 Query: 241 KAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKR 300 KAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKR Sbjct: 241 KAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKR 300 Query: 301 DIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNA 360 DIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNA Sbjct: 301 DIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNA 360 Query: 361 TTVFEGKKCLCHNDFSCNHLLLDGNNRLTXXXXXXXXXXXXEYCDFIYLLEDSEEEIGTN 420 TTVFEGKKCLCHNDFSCNHLLLDGNNRLT EYCDFIYLLEDSEEEIGTN Sbjct: 361 TTVFEGKKCLCHNDFSCNHLLLDGNNRLTGIIDFGDSGIIDEYCDFIYLLEDSEEEIGTN 420 Query: 421 
FGEDILRMYGNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENGRKEIYKRTYKD 479 FGEDILRMYGNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENGRKEIYKRTYKD Sbjct: 421 FGEDILRMYGNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENGRKEIYKRTYKD 479 >BACAN Q81P54 (Q81P54) Acetyltransferase, GNAT family Length = 177 Score = 168 bits (425), Expect = 4e-41 Identities = 76/174 (43%), Positives = 116/174 (66%), Gaps = 1/174 (0%) Query: 5 ENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRVIIE 64 ++ + +R ++++D P++ KWLTD VL++Y GRD ++E + H+ R +IE Sbjct: 5 KDNVSVRYVVEEDAPIISKWLTDPEVLQYYEGRDDPQSVEMVLNHFIHNPNSPEKRCLIE 64 Query: 65 YNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFL 124 +++VPIGY Q+Y + E T Y Y ++ V+GMDQFIGEP YW KGIGT+++K ++ Sbjct: 65 FDDVPIGYIQMYPVDSESKTLYGYEESQN-VWGMDQFIGEPTYWGKGIGTKFVKAAITYI 123 Query: 125 KKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYR 178 E A A+ +DP NN RAI+ Y+K GF+ ++ L EHELHEG EDC++MEY+ Sbjct: 124 LSEMGAEAIAMDPKVNNERAIKCYEKCGFKKVKILKEHELHEGVLEDCWMMEYK 177 >BACC1 Q735Z9 (Q735Z9) Acetyltransferase, GNAT family Length = 359 Score = 159 bits (403), Expect = 1e-38 Identities = 74/185 (40%), Positives = 118/185 (63%), Gaps = 1/185 (0%) Query: 5 ENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRVIIE 64 ++ + +R + ++D P++ KWLT+ VL++Y GRD +++ + H+ R +IE Sbjct: 5 KDNVSVRYVKEEDAPIISKWLTEPEVLQYYEGRDNPQSVDMVLDHFIHNPNSHEKRCLIE 64 Query: 65 YNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFL 124 +++VPIGY Q+Y + E T Y Y ++ V+GMDQFIGEP YW KGIGT+ ++ ++ Sbjct: 65 FDDVPIGYIQMYPVDSEWKTLYGYEESQH-VWGMDQFIGEPTYWGKGIGTKLVQTAITYI 123 Query: 125 KKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYDDNAT 184 + A A+ +DP NN RAI+ Y+K GF+ ++ L EHELHEG EDC++MEY+ + Sbjct: 124 MENTGAEAIAMDPKVNNERAIKCYEKCGFKKVKVLKEHELHEGVLEDCWMMEYKQRELRE 183 Query: 185 NVKAM 189 KA+ Sbjct: 184 MKKAL 188 >BACAN Q81T90 (Q81T90) Aminoglycoside phophotransferase family protein Length = 300 Score = 67.0 bits (162), Expect = 1e-10 Identities = 51/208 (24%), Positives = 95/208 (45%), Gaps = 12/208 (5%) Query: 190 
KYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNT 249 K I+ N + S + G+D+VA +VN+E +F+ EK + L Sbjct: 5 KQYIKEALPNLSIHSYKQNEEGWDNVAVIVNDELLFRFPRKQEYAMRIPLEKELCTILTQ 64 Query: 250 NLETNVKIP--NIEYSYISDELSILGYKE-IKGTFLTPEIYSTMSEEEQNLLKRDIASFL 306 +L+ +++P ++ Y SDE+ + Y I G L EI + + E+E+ ++ +A+FL Sbjct: 65 SLQ-EIEVPQYHLIYKNESDEVPLCSYYTLIHGEPLKTEIVANLDEKERKIIITQLATFL 123 Query: 307 RQMHGLDYTDISECTIDNKQNVL---EEYILLRETIYNDLTDIEKDYIESFMERL---NA 360 +H + ++ ++ + E L E + N LT +K + E A Sbjct: 124 AALHSIPLKSVTALGFPTEKTLTYWKELQTKLNEYVTNSLTSFQKSTLNRLFENFFACIA 183 Query: 361 TTVFEGKKCLCHNDFSCNHLLLDGNNRL 388 T+ F + H DF+ +H+L D N++ Sbjct: 184 TSAF--PNAIIHADFTHHHILFDKQNKI 209 >BACC1 Q73BC3 (Q73BC3) Aminoglycoside phophotransferase family protein Length = 300 Score = 62.0 bits (149), Expect = 4e-09 Identities = 51/206 (24%), Positives = 92/206 (44%), Gaps = 10/206 (4%) Query: 190 KYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNT 249 K I+ N + S + G+D+VA +VN+E +F+ EK + L+ Sbjct: 5 KQYIKEALPNLSIHSYKQNEEGWDNVAIIVNDELLFRFPRKQEYAMRIPLEKELCTLLSC 64 Query: 250 NL-ETNVKIPNIEYSYISDELSILGYKE-IKGTFLTPEIYSTMSEEEQNLLKRDIASFLR 307 +L E V ++ Y +D + + Y I G L EI +T+ ++E+ L +A+FL Sbjct: 65 SLHEIEVPKYHLFYEKNTDAIPLCSYYTLIHGEPLKTEIVTTLEKQERKALITQLATFLA 124 Query: 308 QMHGLDYTDISECTIDNKQNVL---EEYILLRETIYNDLTDIEKDYIESFMERL---NAT 361 +H + ++ ++ + E L E + N LT +K + E AT Sbjct: 125 ALHSIPLKSVTALGFPIEKTLTYWKELQAKLNEYVTNSLTSFQKSTLNRLFENFFACLAT 184 Query: 362 TVFEGKKCLCHNDFSCNHLLLDGNNR 387 + F+ + H DF+ +H+L D N+ Sbjct: 185 SKFQ--NTIIHADFTHHHILFDKQNK 208 >BRAJA Q89WN0 (Q89WN0) Bll0648 protein Length = 161 Score = 59.3 bits (142), Expect = 3e-08 Identities = 44/145 (30%), Positives = 75/145 (51%), Gaps = 13/145 (8%) Query: 11 RTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRVIIEYNNVPI 70 R + D PL+ +WL + V E++G +++ L S EP D+ I+ + P Sbjct: 8 RPMTAADLPLIRRWLGEAHVREWWGDPGEQFALVS--GDLDEPAMDQF---IVLAGDKPF 62 Query: 71 GYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKE--R 128 GY Q Y++ + P+ 
G+DQFIGE + ++G G+ +I+ +F+ ++ Sbjct: 63 GYLQCYRL--TAWNTGFGPQPGG-TRGIDQFIGESDMIARGHGSAFIR---QFVDEQLRH 116 Query: 129 NANAVILDPHKNNPRAIRAYQKSGF 153 V+ DP N RA+RAY+K+GF Sbjct: 117 GLPRVVTDPDPLNSRAVRAYEKAGF 141 >BACHD Q9K9M4 (Q9K9M4) BH2621 protein Length = 197 Score = 56.2 bits (134), Expect = 2e-07 Identities = 35/159 (22%), Positives = 78/159 (49%), Gaps = 6/159 (3%) Query: 2 NIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRV 61 ++V ++ R + DD ++ W+ +E V+ ++ L KKH D+ + Sbjct: 15 HVVNKKLSFRHVTMDDVDMLHSWMHEEHVIPYW---KLNIPLVDYKKHLQTFLNDDHQTL 71 Query: 62 II-EYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120 ++ N VP+ Y + Y + +++ +Y YP +E G+ IG Y +G+ + I Sbjct: 72 MVGAINGVPMSYWESYWVKEDIIANY-YP-FEEHDQGIHLLIGPQEYLGQGLIYPLLLAI 129 Query: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 + +E + N ++ +P + N + I ++K GF+ ++++ Sbjct: 130 MQQKFQEPDTNTIVAEPDRRNKKMIHVFKKCGFQPVKEV 168 >BACC1 Q739G2 (Q739G2) 6'-aminoglycoside N-acetyltransferase/2''-aminoglycoside phosphotransferase, putative (EC 2.3.1.-) Length = 293 Score = 55.1 bits (131), Expect = 5e-07 Identities = 57/289 (19%), Positives = 125/289 (43%), Gaps = 24/289 (8%) Query: 193 IEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLE 252 ++ + +++S+ I G ++ +VN+ +F+ +KG K + L Sbjct: 11 LQRLYPELQINSVYINEIGQNNDVLIVNDNIVFRFP---KYEKGIQKLRIETQLLEKIRP 67 Query: 253 -TNVKIPNIEYSYISDELS---ILGYKEIKGTFLTPEIYSTMSEEEQ-NLLKRDIASFLR 307 ++IPN Y +E+ GY+ I+G +++ +++E+Q L +A FL+ Sbjct: 68 FITLQIPNPSYQGFQNEVPGKVFAGYEMIEGDPFWKNVFTEINDEKQLQKLAYTLARFLK 127 Query: 308 QMHGLD---YTDISEC-TIDNKQNVLEEYILLRETIYNDLTDI-EKDYIESFMERLNATT 362 ++H + + I +C + D + Y L+E +Y + ++ K+ SF LN ++ Sbjct: 128 ELHEIPLSTFESIMQCDSTDMYSEINSLYSQLKEHVYPFMRNVARKEVSTSFELYLNESS 187 Query: 363 VFEGKKCLCHNDFSCNHLLLDGNNR-LTXXXXXXXXXXXXEYCDFIYLLEDSEEEIGTNF 421 F L H DF ++L + ++ DF +L ++ Sbjct: 188 HFNFTPSLVHGDFGMTNILYSATKKNISGVIDFGGASIGDPAYDFAGIL--------ASY 239 Query: 422 GEDILRMYGNI--DIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENG 468 GE+ L+++ ++E KE + + ++ ++G+ N ++ E 
G Sbjct: 240 GEEFLQLFEAYYPNLEAVKERMYFYKSTFALQEALFGVLNNDKKAFEAG 288 >THEMA Q9X063 (Q9X063) Hypothetical protein Length = 182 Score = 52.4 bits (124), Expect = 3e-06 Identities = 27/75 (36%), Positives = 41/75 (54%), Gaps = 1/75 (1%) Query: 101 FIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLP 160 F+G P YWS+G GT ++++ F+ E N N + L N RA R Y+K GF++ L Sbjct: 94 FLGRP-YWSQGYGTDAMRVLVRFIFNEMNMNKIKLHVFSFNERAKRVYEKIGFKVEGILR 152 Query: 161 EHELHEGKKEDCYLM 175 + EG+ D +M Sbjct: 153 QELFREGRYHDVIVM 167 >CLOTE Q896X4 (Q896X4) Putative acetyltransferase Length = 186 Score = 48.9 bits (115), Expect = 3e-05 Identities = 44/173 (25%), Positives = 73/173 (42%), Gaps = 15/173 (8%) Query: 6 NEICIRTLIDDDFPLMLKWLTDE---RVLEFYGGRDK-KYTLESLKKHYTEPWEDEVFRV 61 + I I L ++D + KW D RV +F K + + + F + Sbjct: 10 DRIKITALREEDIETITKWYEDTNFLRVFDFNPSAPKTSWKIREWLMEEVSSSNNYFFAI 69 Query: 62 IIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIF 121 + N +GY +I K+ + V G+ IG+ + W KG G+ + L Sbjct: 70 RKKDANKILGYVEIEKI-----------NWNNGVGGIAIGIGDSSEWGKGYGSEALSLAM 118 Query: 122 EFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYL 174 +F +E N + + L N RAI++Y+K GF+ E +GK+ D YL Sbjct: 119 DFAFRELNLHRLQLITISYNERAIKSYEKLGFKKEGIYREAVNRDGKRYDIYL 171 >BACHD Q9KB15 (Q9KB15) BH2121 protein Length = 181 Score = 48.1 bits (113), Expect = 6e-05 Identities = 28/78 (35%), Positives = 36/78 (46%), Gaps = 13/78 (16%) Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161 IGE YW KG G ++L+ + E N + V L N +AIR Y+K GF+ Sbjct: 95 IGEKTYWGKGYGFEALRLLLNYAFLEMNLHRVSLRVFSFNKKAIRLYEKLGFK------- 147 Query: 162 HELHEGKKEDCYLMEYRY 179 HEG C YRY Sbjct: 148 ---HEGTSRQCL---YRY 159 >STAES Q8CQT6 (Q8CQT6) Spermidine acetyltransferase Length = 177 Score = 47.0 bits (110), Expect = 1e-04 Identities = 42/169 (24%), Positives = 71/169 (42%), Gaps = 14/169 (8%) Query: 8 ICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFR-VIIEYN 66 + IR L D + L +E + Y + +L L+ YT+ DE R I+E Sbjct: 3 
LIIRALEKTDLSF-IHHLNNEYSIMSYWFEEPYQSLSELENLYTKHILDETERRFIVEEG 61 Query: 67 NVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKK 126 + +G ++ ++ +T E++ +D P Y + G + K+ ++ Sbjct: 62 STSVGVVELLEIN-------FIHRTCEVLIIID-----PQYANNGYAKKAFKMAIDYAFL 109 Query: 127 ERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLM 175 N N V L N +A+ YQ + F I L EH G+ DCY+M Sbjct: 110 VLNMNKVYLYVDIKNEKAVHIYQSNNFEIEGTLKEHFYTRGEYRDCYVM 158 >VIBCH Q9KMC2 (Q9KMC2) Acetyltransferase, putative Length = 158 Score = 45.1 bits (105), Expect = 5e-04 Identities = 37/166 (22%), Positives = 69/166 (41%), Gaps = 18/166 (10%) Query: 15 DDDFPLMLKWLTDERVLEFYGGRDKKY--TLESLKKHYTEPWEDEVFRVIIEYNNVPIGY 72 + DF L++KW+ + + +GG + T E + H ++ EVF +++ G+ Sbjct: 8 ESDFDLLIKWIDSDELNYLWGGPAYVFPLTYEQIHSHCSKA---EVFPYLLKVKGRHAGF 64 Query: 73 GQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANA 132 ++YK+ DE Y FI Y +G+ + L+ + + + +A Sbjct: 65 VELYKVTDEQYRICRV------------FISNA-YRGQGLSKSMLMLLIDKARLDFSATK 111 Query: 133 VILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYR 178 + L + N A + Y+ GF ++ GK D ME R Sbjct: 112 LSLGVFEQNTVARKCYESLGFEVVSTEIGTRAFNGKLWDLVRMEKR 157 >BACSU BLTD_BACSU (P39909) Spermine/spermidine acetyltransferase (EC 2.3.1.57) Length = 152 Score = 44.7 bits (104), Expect = 6e-04 Identities = 40/153 (26%), Positives = 69/153 (45%), Gaps = 16/153 (10%) Query: 8 ICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKK-HYTEPWEDEVFRVIIEYN 66 I I+ + DD+ +L + L + K LE K+ HY +P V + Y Sbjct: 3 INIKAVTDDNRAAILDLHVSQNQLSYI--ESTKVCLEDAKECHYYKP-------VGLYYE 53 Query: 67 NVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKK 126 +G+ MY L+ +Y + V+ +D+F + Y KG+G + +K + + L + Sbjct: 54 GDLVGFA----MYG-LFPEYDEDNKNGRVW-LDRFFIDERYQGKGLGKKMLKALIQHLAE 107 Query: 127 ERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 + L +NN AIR YQ+ GF+ +L Sbjct: 108 LYKCKRIYLSIFENNIHAIRLYQRFGFQFNGEL 140 >BACC1 Q739N8 (Q739N8) Spermidine acetyltransferase, putative Length = 156 Score = 43.9 bits (102), Expect = 0.001 Identities = 18/64 (28%), Positives = 36/64 (56%) 
Query: 98 MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE 157 +D+F+ + Y KG R+++L+ +FL+ + + L H +N A+ Y+ GFR+ Sbjct: 74 LDRFMIDQQYQGKGYAKRFLRLLIQFLQNKFECKTIYLSLHPDNKLAMGLYESFGFRLNG 133 Query: 158 DLPE 161 D+ + Sbjct: 134 DIDD 137 >LACPL Q88SW8 (Q88SW8) Acetyltransferase (Putative) Length = 180 Score = 43.5 bits (101), Expect = 0.001 Identities = 26/74 (35%), Positives = 33/74 (44%) Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161 IG+P+ G GT + LI + E N V LD NP AI YQ SGF Sbjct: 93 IGDPDERGHGYGTETLSLILNYAFNELNLYKVCLDVIATNPAAIAVYQNSGFEFEGTNKR 152 Query: 162 HELHEGKKEDCYLM 175 +G++ D Y M Sbjct: 153 AIKRDGQRIDLYHM 166 >VIBUY Q7MK89 (Q7MK89) Putative acetyltransferase Length = 158 Score = 43.1 bits (100), Expect = 0.002 Identities = 30/140 (21%), Positives = 66/140 (47%), Gaps = 14/140 (10%) Query: 17 DFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRVIIEYNNVPIGYGQIY 76 DF L+++W+ + + +GG + L S ++ ++EVF +++ N G+ ++Y Sbjct: 10 DFHLLIEWIDSDELNYLWGGPAYTFPLTS-EQIIAHCAKEEVFPYLLKVNGQNAGFVELY 68 Query: 77 KMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILD 136 K+ +E Y FI +Y +G+ I L+ + ++ + +A + L Sbjct: 69 KVTNEHYRICRV------------FISN-SYRGQGLSKSMIMLLIDKVRSDFSATMLSLG 115 Query: 137 PHKNNPRAIRAYQKSGFRII 156 ++N A + Y+ GF ++ Sbjct: 116 VFEHNTVARKCYESLGFNVV 135 >DEIRA Q9RW71 (Q9RW71) Acetyltransferase, putative Length = 207 Score = 43.1 bits (100), Expect = 0.002 Identities = 21/70 (30%), Positives = 38/70 (54%) Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161 I +P +W G G + ++L + E +A+ + L N R +RA Q++G+R +PE Sbjct: 107 IYDPAHWGGGFGRQALRLWTDATFAETDAHLITLTTWSGNERMVRAAQRAGYRECARIPE 166 Query: 162 HELHEGKKED 171 L +G++ D Sbjct: 167 ARLWQGQRWD 176 >BACAN Q81RM0 (Q81RM0) Spermidine acetyltransferase, putative Length = 156 Score = 43.1 bits (100), Expect = 0.002 Identities = 18/64 (28%), Positives = 35/64 (54%) Query: 98 MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE 157 +D+F+ + Y KG R+++L+ +FL+ + + L H N A+ 
Y+ GFR+ Sbjct: 74 LDRFMIDQQYQGKGYAKRFLRLLIQFLQHKFECKTIYLSLHPENKLAMGLYESFGFRLNG 133 Query: 158 DLPE 161 D+ + Sbjct: 134 DIDD 137 >LACJO Q74K74 (Q74K74) Hypothetical protein Length = 189 Score = 42.4 bits (98), Expect = 0.003 Identities = 41/162 (25%), Positives = 71/162 (43%), Gaps = 25/162 (15%) Query: 17 DFPLM---LKWLTDERVLEFYGGRDKKYTLESLKKHYTEP-WEDEVFRVIIEYNNV--PI 70 DFPL+ LK + DE ++ + + +K + P + R+ +E +++ PI Sbjct: 9 DFPLVYPILKQIFDEMDMDTIKALPESQFYDLMKHGFYSPHYRYSHNRMWVETDDLDRPI 68 Query: 71 GYGQIYKMYDELYTDYH----YPKT----DEIVYG----------MDQFIGEPNYWSKGI 112 G +Y D+ D YPK D +++ +D P +W KGI Sbjct: 69 GLIVMYGYDDQGLIDISLKSAYPKVGLPLDAVIFSDKEALPHEWYLDAIAVSPKHWGKGI 128 Query: 113 GTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154 G + IK I + ++ + L+ ++NPRA R Y GF+ Sbjct: 129 GQKLIK-IAPGIARQNGYKKISLNVDQDNPRAARLYDYMGFK 169 >BACC1 Q72YV7 (Q72YV7) Acetyltransferase, GNAT family Length = 174 Score = 42.4 bits (98), Expect = 0.003 Identities = 35/152 (23%), Positives = 63/152 (41%), Gaps = 14/152 (9%) Query: 6 NEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKK---HYTEPWEDEVFRVI 62 N I +R +DD KW D V+ KY+ + +K + + + + Sbjct: 5 NRIQLRKFSEDDILTYYKWHNDIDVMSSTTLNLDKYSFQDTEKLCQQFIHSPNAKSYIIE 64 Query: 63 IEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFE 122 + N+PIG + ++ D + + I+ IG+ +YW +G G L+ Sbjct: 65 EKATNLPIGITSL------IHIDSYNRNAECIID-----IGKKDYWGQGYGKEAFTLLLN 113 Query: 123 FLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154 + E N + + L N RAI+ Y+ GF+ Sbjct: 114 YAFLELNLHRLSLRVFSFNDRAIKLYKSLGFQ 145 >BACC1 Q734R6 (Q734R6) Acetyltransferase, GNAT family Length = 176 Score = 42.0 bits (97), Expect = 0.004 Identities = 37/146 (25%), Positives = 63/146 (43%), Gaps = 16/146 (10%) Query: 15 DDDFPLMLKWLTDERVLEFYGGRDKKYTLES--LKKHYTEPWEDE----VFRVIIEYNNV 68 ++DF ++ W+ + +GG + L + LK + +D VF+ I E N+ Sbjct: 9 EEDFQQLIDWIPNAEFSLQWGGPAFTFPLTNAQLKNYLQNANKDNAIKYVFKAIDETNSE 68 Query: 69 PIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKER 128 IG+ + + KT+E IG N KG GT+ + + +F +E Sbjct: 69 
VIGHISLGNV----------DKTNESARIGKVLIGSTNSRGKGYGTQMMTAVLKFAFEEL 118 Query: 129 NANAVILDPHKNNPRAIRAYQKSGFR 154 + V L N AI+ Y+K GF+ Sbjct: 119 KLHKVTLGVFDFNESAIKCYKKVGFQ 144 >CLOPE Q8XKH9 (Q8XKH9) Hypothetical protein CPE1416 Length = 193 Score = 41.6 bits (96), Expect = 0.005 Identities = 31/106 (29%), Positives = 45/106 (42%), Gaps = 5/106 (4%) Query: 82 LYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNN 141 LY Y EI Y + E N+W KG+ + IK I F + + N +I NN Sbjct: 93 LYNIDFYSNNTEIGYTI-----EKNFWRKGVASECIKAIENFAFETLDMNRIIAMIDSNN 147 Query: 142 PRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYDDNATNVK 187 +I+ +K GF L EH ++ K E + Y + VK Sbjct: 148 ISSIKLSEKLGFHRDGILREHYYNKSKDEYINICVYSLIKSDIKVK 193 >BACAN Q81SQ8 (Q81SQ8) Acetyltransferase, GNAT family Length = 157 Score = 41.2 bits (95), Expect = 0.007 Identities = 33/126 (26%), Positives = 54/126 (42%), Gaps = 17/126 (13%) Query: 34 YGGRDKKYTLESLKKHYTEPWEDEV---FRVIIEYNNVPIGYGQIYKMYDELYTDYHYPK 90 Y G+ Y +E+ ++ E DE ++ N IGY + K+ D Sbjct: 22 YEGKYSFYDIEADEEDLAEFLHDESRGDHTFSVKENGTLIGYFTVCKITDG--------- 72 Query: 91 TDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQK 150 T +I G+ PN G G ++I I F K++ N + L N RAI+ Y++ Sbjct: 73 TVDIGLGI-----RPNITGNGFGLQFINAILAFSKEKYGCNYITLSVATFNKRAIKVYKR 127 Query: 151 SGFRII 156 +GF + Sbjct: 128 AGFEAV 133 >VIBCH Q9K330 (Q9K330) Acetyltransferase, putative Length = 178 Score = 40.8 bits (94), Expect = 0.009 Identities = 21/80 (26%), Positives = 40/80 (50%), Gaps = 10/80 (12%) Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161 IG+ +W KG+GT +L+ + +E + + L + +N A++AY+ +G++ Sbjct: 95 IGDKAFWGKGLGTEVTRLVTNYGFRELGLHRIELTAYCDNVAAVKAYENAGYQ------- 147 Query: 162 HELHEGKKEDCYLMEYRYDD 181 HEG K + R+ D Sbjct: 148 ---HEGIKRESGYRNGRFMD 164 >VIBUY Q7MK96 (Q7MK96) Putative acetyltransferase Length = 158 Score = 40.4 bits (93), Expect = 0.012 Identities = 35/166 (21%), Positives = 70/166 (42%), Gaps = 18/166 (10%) Query: 15 DDDFPLMLKWLTDERVLEFYGGRDKKY--TLESLKKHYTEPWEDEVFRVIIEYNNVPIGY 
72 + +F ++ W+ + + +GG + T E + H ++ EVF +++ N G+ Sbjct: 8 ESNFDQLIAWIDSDELNYLWGGPAYVFPLTYEQIHAHCSKA---EVFPYLLKVNGRHAGF 64 Query: 73 GQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANA 132 ++YK+ DE Y VY + + G +G+ + L+ + + + +A Sbjct: 65 VELYKVTDEQYRICR-------VYISNAYRG------RGLSKSMLMLLIDKARLDFSATK 111 Query: 133 VILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYR 178 + L + N A + Y+ GF ++ GK D ME R Sbjct: 112 LSLGVFEQNTVARKCYESLGFEVVSTEIGTRSFNGKLWDLVRMEKR 157 >WIGBR Q8D3I4 (Q8D3I4) Imp protein Length = 723 Score = 40.0 bits (92), Expect = 0.016 Identities = 60/261 (22%), Positives = 104/261 (39%), Gaps = 50/261 (19%) Query: 57 EVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRY 116 +++ + + N+PI Y +K+Y E Y D Y + +I Y + + Y+ K +Y Sbjct: 191 KIWNAKLNFKNIPIFYVPFFKVY-EKYNDIFY--SPKISYKNNNGLSLSFYYKKIFFDKY 247 Query: 117 IKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLME 176 F F+ K + ++L NN + Y S F + KK + Y++ Sbjct: 248 ---FFYFIPKYNSDGTILL----NN----KIYYSSDF------------DKKKINLYIL- 283 Query: 177 YRYDDNATNVKAMKYLIEHYFDNFKVD---------SIEIIGSGYDSVAYLVNNEYI--F 225 +D L ++YF N K+D + I +D + NE + F Sbjct: 284 --FDIKKNKNNWFIDLKQNYFFNKKLDILYIYKKSNNFIIFNKMFDIEKNFLQNEILEKF 341 Query: 226 KTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPE 285 K+ N K + K F N N +K P++ +SY ++ K K F+ Sbjct: 342 NLKYFYNNWKLKLEYKKFIIFDNKNF-NYIKFPHVYFSYFDNK-----NKNFKFNFVGKF 395 Query: 286 IYSTMSEEEQNLLKRDIASFL 306 Y EE++ +L +I FL Sbjct: 396 SY----EEDKKILHINIEPFL 412 >BACSU P94482 (P94482) YnaD Length = 170 Score = 39.7 bits (91), Expect = 0.021 Identities = 39/156 (25%), Positives = 66/156 (42%), Gaps = 17/156 (10%) Query: 1 MNIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWED--EV 58 M+I + IR D+ + ++ +D V+++ + +T E K + D E Sbjct: 1 MHITTKRLLIREFEFKDWQAVYEYTSDSNVMKYIP--EGVFTEEDAKAFVNKNKGDNAEK 58 Query: 59 FRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIK 118 F VI+ + IG+ YK + E T EI + + PNY +KG + + Sbjct: 59 FPVILRDEDCLIGHIVFYKYFGE--------HTYEIGW-----VFNPNYQNKGYASEAAQ 
105 Query: 119 LIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154 I E+ KE N + +I N + R +K G R Sbjct: 106 AILEYGFKEMNLHRIIATCQPENIPSYRVMKKIGMR 141 >BACC1 Q73A91 (Q73A91) Acetyltransferase, GNAT family Length = 177 Score = 39.7 bits (91), Expect = 0.021 Identities = 23/85 (27%), Positives = 37/85 (43%), Gaps = 5/85 (5%) Query: 87 HYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIR 146 H K E+ Y ++G+P YW G GT K + + E + N + NNP + R Sbjct: 86 HIHKRGELAY----WVGKP-YWGNGFGTEAAKTLLHYGFNELHLNKIFAAAFTNNPGSWR 140 Query: 147 AYQKSGFRIIEDLPEHELHEGKKED 171 +K G + +H + G+ D Sbjct: 141 IMEKIGMKHEGTFKQHVVKSGEPMD 165 >THETN Q8RC99 (Q8RC99) Acetyltransferases Length = 149 Score = 39.3 bits (90), Expect = 0.027 Identities = 35/149 (23%), Positives = 66/149 (44%), Gaps = 29/149 (19%) Query: 43 LESLKKHYTEPWEDEVF-----------RVIIEYNNVPIGYGQIYKMYDELYTDYHYPKT 91 +E K +T PW E F ++ E + +GY + + DE + T Sbjct: 18 MEIEKLSFTTPWSREAFVGEVTKNSCARYIVAEVDKKVVGYAGFWVVLDEGHI------T 71 Query: 92 DEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKS 151 + V+ P Y KGIG+R ++ + + L K+ ++ L+ ++N A Y+K Sbjct: 72 NIAVH--------PEYRGKGIGSRLMEGLID-LAKKNGITSMTLEVRESNLVAQNLYKKF 122 Query: 152 GFRIIEDLPEHELHEGKKEDCYLMEYRYD 180 GF+++ ++ ED +M ++YD Sbjct: 123 GFKVLG--RREGYYQDNNEDAIVM-WKYD 148 >STRAW Q82IB6 (Q82IB6) Putative acetyltransferase Length = 168 Score = 39.3 bits (90), Expect = 0.027 Identities = 21/54 (38%), Positives = 33/54 (61%), Gaps = 5/54 (9%) Query: 101 FIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154 ++G P YW++GIG+R + L FL++ER + DP N ++R +K GFR Sbjct: 100 WLGRP-YWARGIGSRALGL---FLRRERT-RPLYADPFHGNTASVRLLEKHGFR 148 >LISIN Q92E38 (Q92E38) Lin0623 protein Length = 177 Score = 39.3 bits (90), Expect = 0.027 Identities = 22/69 (31%), Positives = 33/69 (47%) Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHE 166 +W GIGT ++ + + KK V L+ N RAI Y+K GF ++P E Sbjct: 105 FWGLGIGTLIMEGLIKHAKKTERLKLVYLEAVSENKRAINLYKKFGFIEAGEIPALMQVE 164 Query: 167 GKKEDCYLM 175 G+ D +M Sbjct: 165 
GRYLDVTMM 173 >STRCO O69977 (O69977) Hypothetical protein SCO5801 Length = 231 Score = 38.9 bits (89), Expect = 0.036 Identities = 30/143 (20%), Positives = 65/143 (45%), Gaps = 6/143 (4%) Query: 14 IDDDFPLMLKWLTDERVLEFYG-GRDKKYTLESLKKHYTEPWEDEVFRVIIEYNNVPIGY 72 ++ D PL+ +W+ D V ++ + T + L+ + + + VP+ Y Sbjct: 67 LERDVPLIARWMNDPAVAAYWELTGPQSVTADHLRAQLAG--DGRSVPCVGTLDGVPMSY 124 Query: 73 GQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNA-N 131 +IY+ + Y + + G+ IG+ + +G+GT I+ + + + R A Sbjct: 125 WEIYRADLDPLARYCPVRPHDT--GVHLLIGDGAHRGRGLGTELIRAVVDLVLAGRPACT 182 Query: 132 AVILDPHKNNPRAIRAYQKSGFR 154 V+ +P N +++ A+ +GFR Sbjct: 183 RVLAEPDVRNRQSVAAFLGAGFR 205 >STRAW Q82KD8 (Q82KD8) Hypothetical protein Length = 377 Score = 38.9 bits (89), Expect = 0.036 Identities = 37/150 (24%), Positives = 66/150 (44%), Gaps = 10/150 (6%) Query: 17 DFPLMLKWLTDERVLEFYG-GRDKKYTLESLKKHYTEPWEDEVFRVIIEYNNVPIGYGQI 75 D PL+ +W+ D V F+ D+ T + L+ ++E P+ Y +I Sbjct: 217 DLPLLGRWMNDPAVAAFWKLAGDESVTEQHLRAQLGGDGRSVPCLGVLE--GTPMSYWEI 274 Query: 76 YKM-YDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANA-V 133 Y+ D L HYP G+ IG +G+G+ ++ + + + R + A V Sbjct: 275 YRADLDSLAR--HYPARPHDT-GIHLLIGGVADRGRGLGSTLLRAVADLVLDRRPSCARV 331 Query: 134 ILDPHKNNPRAIRAYQKSGFRIIE--DLPE 161 + +P N ++ A+ +GFR DLP+ Sbjct: 332 VAEPDLRNTSSVSAFLGAGFRFSAEVDLPD 361 >VIBCH Q9KMA3 (Q9KMA3) Acetyltransferase, putative Length = 230 Score = 38.5 bits (88), Expect = 0.046 Identities = 30/144 (20%), Positives = 62/144 (43%), Gaps = 18/144 (12%) Query: 15 DDDFPLMLKWLTDERVLEFYGGRDKKY--TLESLKKHYTEPWEDEVFRVIIEYNNVPIGY 72 + DF L++KW+ + + +G + T E + H ++ EVF +++ G+ Sbjct: 8 ESDFDLLIKWIDSDELNYLWGCPAYVFPLTYEQIHSHCSKA---EVFPYLLKVKGRHAGF 64 Query: 73 GQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANA 132 ++YK+ DE Y FI Y +G+ + L+ + + + +A Sbjct: 65 VELYKVTDEQYRICRV------------FISNA-YRGQGLSKSMLMLLIDKARLDFSATK 111 Query: 133 VILDPHKNNPRAIRAYQKSGFRII 156 + L + N A + Y+ GF ++ Sbjct: 112 LSLGVFEQNTVARKCYESLGFEVV 135 >STAAM Q932M5 
(Q932M5) Hypothetical protein SAVP027 Length = 134 Score = 38.5 bits (88), Expect = 0.046 Identities = 24/77 (31%), Positives = 36/77 (46%), Gaps = 11/77 (14%) Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164 PNY KG G++ + I E+ KE + + L K NPRA Y+K G + Sbjct: 68 PNYQDKGYGSKLLSFIKEY-SKEIGCSEMFLITDKGNPRACHVYEKLGGK---------- 116 Query: 165 HEGKKEDCYLMEYRYDD 181 ++ K E Y+ +Y D Sbjct: 117 NDYKDEIVYVYDYEKGD 133 >LACLA Q9CHJ8 (Q9CHJ8) Acetyl transferase Length = 193 Score = 38.5 bits (88), Expect = 0.046 Identities = 29/106 (27%), Positives = 50/106 (47%), Gaps = 12/106 (11%) Query: 59 FRVIIEYNNVPIGYGQI-YKMYDELYTDYHYPKTD------EIVYGMDQFIGEPNYWSKG 111 F + +Y P+G I K L D H+ K EI Y ++Q NYW++G Sbjct: 56 FSIANDYMKSPLGKWAIELKSEHRLIGDIHFVKISDKNQSAEIGYVLNQ-----NYWNQG 110 Query: 112 IGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE 157 + T +K++ EF ++ +IL K N + + KSG+ +++ Sbjct: 111 LLTEALKVLTEFSFEQFGLKKLILLIDKENVPSKKVALKSGYHLVK 156 >ENTFA Q82YR8 (Q82YR8) Acetyltransferase, GNAT family Length = 130 Score = 38.5 bits (88), Expect = 0.046 Identities = 24/77 (31%), Positives = 36/77 (46%), Gaps = 11/77 (14%) Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164 PNY KG G++ + I E+ KE + + L K NPRA Y+K G + Sbjct: 64 PNYQDKGYGSKLLSFIKEY-SKEIGCSEMFLITDKGNPRACHVYEKLGGK---------- 112 Query: 165 HEGKKEDCYLMEYRYDD 181 ++ K E Y+ +Y D Sbjct: 113 NDYKDEIVYVYDYEKGD 129 >BACC1 Q73AS8 (Q73AS8) Acetyltransferase, GNAT family Length = 157 Score = 38.5 bits (88), Expect = 0.046 Identities = 34/128 (26%), Positives = 55/128 (42%), Gaps = 21/128 (16%) Query: 34 YGGRDKKYTLESLKKHYTEPWEDE-----VFRVIIEYNNVPIGYGQIYKMYDELYTDYHY 88 Y G Y +E+ ++ E DE +F V + + IGY + K+ D Sbjct: 22 YEGEYSFYDIEADEEDLAEFLHDESRGDHIFSV--KEHGTLIGYFTVCKINDG------- 72 Query: 89 PKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAY 148 T +I GM +PN G G ++I I F K++ + L N RAI+ Y Sbjct: 73 --TVDIGLGM-----KPNITGNGFGLQFINAILAFSKEKYGCKYITLSVATFNKRAIKVY 125 Query: 149 QKSGFRII 156 +++GF + Sbjct: 126 
KRAGFEAV 133 >BACC1 Q734C3 (Q734C3) Acetyltransferase, GNAT family Length = 183 Score = 38.5 bits (88), Expect = 0.046 Identities = 27/80 (33%), Positives = 38/80 (47%) Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161 IG+ N KG G I LI ++ E N + V LD N AI Y+K GF++ + E Sbjct: 98 IGDANDRGKGYGREAIHLILKYAFYELNLHRVGLDVISYNKDAIELYKKMGFQMEGCMRE 157 Query: 162 HELHEGKKEDCYLMEYRYDD 181 +GK D +M D+ Sbjct: 158 AVQRDGKCFDRIIMGILRDE 177 >BACAN Q81YL2 (Q81YL2) Acetyltransferase, GNAT family Length = 179 Score = 38.5 bits (88), Expect = 0.046 Identities = 27/80 (33%), Positives = 38/80 (47%) Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161 IG+ N KG G I LI ++ E N + V LD N AI Y+K GF+I + E Sbjct: 96 IGDANDRGKGYGREAIHLILKYAFYELNLHRVGLDVISYNKAAIELYKKMGFQIEGCMRE 155 Query: 162 HELHEGKKEDCYLMEYRYDD 181 +G+ D +M D+ Sbjct: 156 AVQRDGECFDRIIMGILRDE 175 >BACAN Q81XB7 (Q81XB7) Spermidine N1-acetyltransferase Length = 171 Score = 38.5 bits (88), Expect = 0.046 Identities = 27/116 (23%), Positives = 51/116 (43%), Gaps = 12/116 (10%) Query: 60 RVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKL 119 R I+E +N +G ++ ++ DY + +T+ Q I +PNY G +L Sbjct: 57 RFIVEKDNEMVGLVELVEI------DYIHRRTEF------QIIIDPNYQGYGYAAEATRL 104 Query: 120 IFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLM 175 ++ N + + L K N +A+ Y+K GF + +L + +G + M Sbjct: 105 AMDYAFSVLNMHKIYLVVDKENEKAVHVYKKVGFMVEGELLDEFFVDGNYHNAIRM 160 >SALTI Q8Z6Z1 (Q8Z6Z1) Spermidine N1-acetyltransferase Length = 186 Score = 38.1 bits (87), Expect = 0.061 Identities = 25/90 (27%), Positives = 40/90 (44%), Gaps = 4/90 (4%) Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 Q I P Y KG+ +R KL ++ N + L K N +AI Y+K GFR+ +L Sbjct: 87 QIIISPEYQGKGLASRAAKLAMDYGFTVLNLYKLYLIVDKENEKAIHIYRKLGFRVEGEL 146 Query: 160 PEHELHEGKKED----CYLMEYRYDDNATN 185 G+ + C + D++ T+ Sbjct: 147 IHEFFINGEYRNTIRMCIFQQQYLDEHKTS 176 >SALTY Q8ZPJ3 (Q8ZPJ3) Spermidine N1-acetyltransferase (EC 2.3.1.57) 
Length = 186 Score = 38.1 bits (87), Expect = 0.061 Identities = 25/90 (27%), Positives = 40/90 (44%), Gaps = 4/90 (4%) Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 Q I P Y KG+ +R KL ++ N + L K N +AI Y+K GFR+ +L Sbjct: 87 QIIISPEYQGKGLASRAAKLAMDYGFTVLNLYKLYLIVDKENEKAIHIYRKLGFRVEGEL 146 Query: 160 PEHELHEGKKED----CYLMEYRYDDNATN 185 G+ + C + D++ T+ Sbjct: 147 IHEFFINGEYRNTIRMCIFQQQYLDEHKTS 176 >SALCH Q57PD6 (Q57PD6) Spermidine N1-acetyltransferase Length = 186 Score = 38.1 bits (87), Expect = 0.061 Identities = 25/90 (27%), Positives = 40/90 (44%), Gaps = 4/90 (4%) Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 Q I P Y KG+ +R KL ++ N + L K N +AI Y+K GFR+ +L Sbjct: 87 QIIISPEYQGKGLASRAAKLAMDYGFTVLNLYKLYLIVDKENEKAIHIYRKLGFRVEGEL 146 Query: 160 PEHELHEGKKED----CYLMEYRYDDNATN 185 G+ + C + D++ T+ Sbjct: 147 IHEFFINGEYRNTIRMCIFQQQYLDEHKTS 176 >MYCPE Q8EWE5 (Q8EWE5) Acetyltransferase GNAT family Length = 193 Score = 38.1 bits (87), Expect = 0.061 Identities = 34/157 (21%), Positives = 72/157 (45%), Gaps = 20/157 (12%) Query: 2 NIVENE-ICIRTLIDDDFPLMLKWLTDERVLEFYGG---RDKKYTLESLKKHYTEPWEDE 57 NI+E + + +R L +D ++ E V E G +D +Y+ + L K + Sbjct: 8 NIIETKRLYLRPLKIEDLNDFYEFAKVEGVGESAGWFHHKDIEYSKKILIKMINSKQD-- 65 Query: 58 VFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYI 117 + ++ + NN IG I+ Y+ D+++ G F+ +YW+KG+ T + Sbjct: 66 -YAIVYKENNKVIGELGIFNKYEN----------DKLMIG---FVLNKDYWNKGLATEIV 111 Query: 118 KLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154 K + +++ + + + ++N + R +K GF+ Sbjct: 112 KELIDYIFTNTDHQQIYMGHFESNLASKRVVEKCGFK 148 >BACC1 Q72Y03 (Q72Y03) Spermidine N1-acetyltransferase (EC 2.3.1.57) Length = 176 Score = 38.1 bits (87), Expect = 0.061 Identities = 34/177 (19%), Positives = 74/177 (41%), Gaps = 14/177 (7%) Query: 1 MNIVE--NEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEV 58 M ++E E+ +R L +D + + + ++ ++ + +E + + Sbjct: 1 MEVIEMSQELKLRPLEREDLKFVHELNNNAHIMSYWFEEPYEAFVELQDLYDKHIHDQSE 60 Query: 59 
FRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIK 118 R I+E +N +G ++ ++ DY + +T+ Q I +PNY G + Sbjct: 61 RRFIVEKDNEMVGLVELVEI------DYIHRRTEF------QIIIDPNYQGYGYAAEATR 108 Query: 119 LIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLM 175 L ++ N + + L K N +A+ Y+K GF + +L + +G + M Sbjct: 109 LAMDYAFSVLNMHKLYLVVDKENEKAVHVYKKVGFMVEGELLDEFFVDGNYHNAIRM 165 >DEIRA Q9RW73 (Q9RW73) Acetyltransferase, putative Length = 186 Score = 37.7 bits (86), Expect = 0.079 Identities = 45/179 (25%), Positives = 77/179 (43%), Gaps = 35/179 (19%) Query: 8 ICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYT-----LESLKKHY-----TEPWEDE 57 + +R +D P +WLTDER + D YT E+++ + T P DE Sbjct: 9 VVLRDRRPEDLPTFTRWLTDERAA--WREWDAPYTPAAQTSETMQAYIRYLQVTPPDADE 66 Query: 58 VFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYG-----MDQFIGEPNYWSKGI 112 RVI +G GQ+ M + +++E G + I +P YW G+ Sbjct: 67 --RVI------EVG-GQVVGMVN---------RSEEEPAGGGWWDLGILIYDPAYWEGGV 108 Query: 113 GTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKED 171 GTR + L + +A+ + + N R +RA ++ GF+ + E + G++ D Sbjct: 109 GTRALSLWVQDTLDWTDAHTLTVTTWSGNERMMRAARRLGFQECARVREARVVGGQRYD 167 >STAAM Q99U68 (Q99U68) Hypothetical protein Length = 169 Score = 37.4 bits (85), Expect = 0.10 Identities = 31/133 (23%), Positives = 55/133 (41%), Gaps = 17/133 (12%) Query: 44 ESLKKHYTEPWEDEV-------------FRVIIEYNNVPIGYGQIYKMYDELYTDYHYPK 90 E +K+H E W+D+ + ++E N+ G+ + + E Y D +P Sbjct: 22 ELMKEHDNEQWDDQYPLLEHFEEDIAKDYLYVLEENDKIYGFIVVDQDQAEWYDDIDWPV 81 Query: 91 TDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQK 150 E + + + G Y KG T + + + K R A ++ D N A + K Sbjct: 82 NREGAFVIHRLTGSKEY--KGAATELFNYVIDVV-KARGAEVILTDTFALNKPAQGLFAK 138 Query: 151 SGF-RIIEDLPEH 162 GF ++ E L E+ Sbjct: 139 FGFHKVGEQLMEY 151 >RICPR Q9ZE75 (Q9ZE75) TRANSCRIPTIONAL ACTIVATOR PROTEIN CZCR (CzcR) Length = 237 Score = 37.4 bits (85), Expect = 0.10 Identities = 35/130 (26%), Positives = 63/130 (48%), Gaps = 15/130 (11%) Query: 219 VNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLET-NVKIPNIEYSYISDELSILGYKEI 277 V E 
I + K + KG+A ++ ++ NL+T +V++ + + E SIL + Sbjct: 104 VREELIARIKAIVRRSKGHAASIFRFDKISVNLDTRSVEVDGKKLHLTNKEYSILELLIL 163 Query: 278 -KGTFLTPE-----IYSTMSEEEQNLLKRDIASFLRQMH----GLDYTDISECTIDNKQN 327 +GT LT E +YST+ E E ++ I +++ G DY D T+ + Sbjct: 164 RRGTILTKEMFLNHLYSTVDEPEMKIIDVFICKLRKKLSDAAGGRDYID----TVWGRGY 219 Query: 328 VLEEYILLRE 337 +L+EY L++ Sbjct: 220 MLKEYDELQQ 229 >LACJO Q74J71 (Q74J71) Hypothetical protein Length = 181 Score = 37.4 bits (85), Expect = 0.10 Identities = 23/72 (31%), Positives = 32/72 (44%), Gaps = 1/72 (1%) Query: 102 IGEPNYWSKGIGTRYIKL-IFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLP 160 I P YW GIG R +K+ I E + + L NPR I QK GF+ + Sbjct: 98 IYNPTYWHGGIGGRVLKIWISEIFDQYPELEHIGLTTWSGNPRMIHLAQKLGFKKEAQIR 157 Query: 161 EHELHEGKKEDC 172 + ++ K DC Sbjct: 158 KVRFYKEKYYDC 169 >CLOAB Q97D33 (Q97D33) Acetyltransferase (With duplicated domains), possibly RIMI-like protein Length = 292 Score = 37.4 bits (85), Expect = 0.10 Identities = 18/59 (30%), Positives = 32/59 (54%), Gaps = 1/59 (1%) Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHE 163 P Y +G G + ++ E+L ER+ + + L+ NN RA Y+ GF+I ++ +E Sbjct: 225 PEYRGRGFGREMMSMLLEYLI-ERDYDDIALEVDSNNKRAFELYKSIGFQIEREIDYYE 282 >VIBCH Q9KU66 (Q9KU66) Ribosomal-protein-alanine acetyltransferase Length = 161 Score = 36.6 bits (83), Expect = 0.18 Identities = 22/77 (28%), Positives = 42/77 (54%), Gaps = 2/77 (2%) Query: 104 EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE-DLPEH 162 +P KG G + ++ F L +++ A + L+ ++N RA YQ++GF I+ + + Sbjct: 85 DPAQQGKGYGQQLLQH-FIALCEQQKAESAWLEVRESNQRAFALYQRAGFNEIDRRVNYY 143 Query: 163 ELHEGKKEDCYLMEYRY 179 + +GK ED +M Y + Sbjct: 144 PVAKGKSEDAIIMSYLF 160 >STRR6 Q8DNF9 (Q8DNF9) Hypothetical protein spr1760 Length = 172 Score = 36.6 bits (83), Expect = 0.18 Identities = 23/71 (32%), Positives = 34/71 (47%), Gaps = 3/71 (4%) Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEH--EL 164 YW+ G+G+ ++ E+ + + L N A+ YQK GF +IE E + Sbjct: 98 
YWNNGLGSLLLEEAIEWAQASGILRRLQLTVQTRNQAAVHLYQKHGF-VIEGSQERGAYI 156 Query: 165 HEGKKEDCYLM 175 EGK D YLM Sbjct: 157 EEGKFIDVYLM 167 >SALTY Q8ZNU6 (Q8ZNU6) Putative ribose 5-phosphate isomerase (EC 5.3.1.6) Length = 212 Score = 36.6 bits (83), Expect = 0.18 Identities = 24/93 (25%), Positives = 44/93 (47%), Gaps = 4/93 (4%) Query: 219 VNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIK 278 +N +IF+ F+ K +GY E+ N + VK ++ +Y+ D L + + +K Sbjct: 124 LNVRFIFEKAFTGRKGEGYPPERKAPQVRNAGILNQVKAAVVKENYL-DTLRAIDPELVK 182 Query: 279 GTFLTPEIYSTMSEEEQNLLKRDIASFLRQMHG 311 P + E QN ++I +F+R+M G Sbjct: 183 TAVSGPRFQQCLFENGQN---KEIEAFVREMLG 212 >SALCH Q57N66 (Q57N66) Putative ribose 5-phosphate isomerase Length = 212 Score = 36.6 bits (83), Expect = 0.18 Identities = 24/93 (25%), Positives = 44/93 (47%), Gaps = 4/93 (4%) Query: 219 VNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIK 278 +N +IF+ F+ K +GY E+ N + VK ++ +Y+ D L + + +K Sbjct: 124 LNVRFIFEKTFTGRKGEGYPPERKAPQVRNAGILNQVKAAVVKENYL-DTLRAIDPELVK 182 Query: 279 GTFLTPEIYSTMSEEEQNLLKRDIASFLRQMHG 311 P + E QN ++I +F+R+M G Sbjct: 183 TAVSGPRFQQCLFENGQN---KEIEAFVREMLG 212 >ECO57 ATDA_ECO57 (P0A952) Spermidine N(1)-acetyltransferase (EC 2.3.1.57) (Diamine acetyltransferase) (SAT) Length = 185 Score = 36.6 bits (83), Expect = 0.18 Identities = 21/60 (35%), Positives = 29/60 (48%) Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 Q I P Y KG+ TR KL ++ N + L K N +AI Y+K GF + +L Sbjct: 86 QIIISPEYQGKGLATRAAKLAMDYGFTVLNLYKLYLIVDKENEKAIHIYRKLGFSVEGEL 145 >ECOLI ATDA_ECOLI (P0A951) Spermidine N(1)-acetyltransferase (EC 2.3.1.57) (Diamine acetyltransferase) (SAT) Length = 185 Score = 36.6 bits (83), Expect = 0.18 Identities = 21/60 (35%), Positives = 29/60 (48%) Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 Q I P Y KG+ TR KL ++ N + L K N +AI Y+K GF + +L Sbjct: 86 QIIISPEYQGKGLATRAAKLAMDYGFTVLNLYKLYLIVDKENEKAIHIYRKLGFSVEGEL 145 >BACHD Q9KG16 (Q9KG16) BH0299 protein Length = 
305 Score = 36.6 bits (83), Expect = 0.18 Identities = 35/126 (27%), Positives = 52/126 (41%), Gaps = 17/126 (13%) Query: 41 YTLESLKKHYTEPWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQ 100 Y E + K EP +IIE + IGY Y + P+ E G + Sbjct: 185 YDAEEILKKINEPTNK---LLIIEKEQIVIGYA---------YVEVE-PEHGE---GQIE 228 Query: 101 FIG-EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 +IG P+Y +G+ T+ + L + L K N +AIR YQ +GF+ L Sbjct: 229 YIGIAPDYRRQGLATQLLTNALHVLFSYPTVEDITLCVSKQNTKAIRLYQAAGFKKERQL 288 Query: 160 PEHELH 165 EL+ Sbjct: 289 TYFELN 294 >AQUAE O66838 (O66838) Ribosomal-protein-alanine acetyltransferase Length = 154 Score = 36.6 bits (83), Expect = 0.18 Identities = 23/74 (31%), Positives = 37/74 (50%), Gaps = 5/74 (6%) Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164 P Y KG G + ++ L + V+LD K+N RAI Y+K GF+++ E + Sbjct: 75 PGYRGKGYGEKLLREAISRLGDK--VKRVVLDVRKSNLRAINLYKKLGFKVV---TERKG 129 Query: 165 HEGKKEDCYLMEYR 178 + E+ LME + Sbjct: 130 YYSDGENALLMELK 143 >PROAC Q6ABU2 (Q6ABU2) Acetyltransferase (EC 2.3.1.-) Length = 188 Score = 36.2 bits (82), Expect = 0.23 Identities = 21/77 (27%), Positives = 32/77 (41%) Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164 P W +G+G R ++L E +A V L N R I G+R +P+ Sbjct: 111 PTLWGRGVGRRALRLWTEATFATTDAQVVTLTTWSGNGRMIHCAGAVGYRECGRIPQARS 170 Query: 165 HEGKKEDCYLMEYRYDD 181 +G++ D M DD Sbjct: 171 WQGRRWDLVTMALLRDD 187 >BACAN Q81QL3 (Q81QL3) Acetyltransferase, GNAT family Length = 308 Score = 36.2 bits (82), Expect = 0.23 Identities = 44/183 (24%), Positives = 70/183 (38%), Gaps = 31/183 (16%) Query: 44 ESLKKHYTEPWEDEVFRV------IIEYNNVPIGYGQIYKM-YDELYTDYHYPKTDEIVY 96 E L + T +++E R +I+YN P GY + M Y D + DE + Sbjct: 15 EKLTEIMTRTFDEEAERWLCGQGDVIDYNIQPPGYSSVEMMRYSIEELDSYKVIMDEKII 74 Query: 97 G-------------MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPR 143 G +D+ EP Y KGIG+ IKLI R + NN Sbjct: 75 GGIIVTISGKSYGRIDRIFVEPVYQGKGIGSNVIKLIEAEYPSIRIWDLETSSRQINNH- 133 Query: 144 
AIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVD 203 Y+K G++ I E + E CY+ N +V + + ++N ++ Sbjct: 134 --HFYKKMGYQTI--------FESEDEYCYVKRIGTSSNKESVFKNEDMKNSQYENCNLE 183 Query: 204 SIE 206 + E Sbjct: 184 NTE 186 >STRMU Q8DV67 (Q8DV67) Putative acetyltransferase Length = 166 Score = 35.8 bits (81), Expect = 0.30 Identities = 21/52 (40%), Positives = 29/52 (55%), Gaps = 2/52 (3%) Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRII 156 P Y +GIGT +K E L+K + + V L K N A+ YQK+GF+ I Sbjct: 95 PAYRGQGIGTELLKTFLEHLRK-KGYHKVSLSVQKEND-AVNMYQKAGFQTI 144 >STAAM COAD_STAAM (P63818) Phosphopantetheine adenylyltransferase (EC 2.7.7.3) (Pantetheine-phosphate adenylyltransferase) (PPAT) (Dephospho-CoA pyrophosphorylase) Length = 160 Score = 35.8 bits (81), Expect = 0.30 Identities = 28/132 (21%), Positives = 55/132 (41%), Gaps = 13/132 (9%) Query: 164 LHEGKKEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEY 223 L KKE + +E R D +VK + + H F VD E +G+ +++ Sbjct: 38 LKNSKKEGTFSLEERMDLIEQSVKHLPNVKVHQFSGLLVDYCEQVGAKTIIRGLRAVSDF 97 Query: 224 IFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDEL--SILGYKEIKGTF 281 ++ + ++ KK LN +ET + + YS+IS + + Y+ F Sbjct: 98 EYELRLTSMNKK-----------LNNEIETLYMMSSTNYSFISSSIVKEVAAYRADISEF 146 Query: 282 LTPEIYSTMSEE 293 + P + + ++ Sbjct: 147 VPPYVEKALKKK 158 >LISMO Q8Y9B8 (Q8Y9B8) Lmo0614 protein Length = 177 Score = 35.8 bits (81), Expect = 0.30 Identities = 17/54 (31%), Positives = 27/54 (50%) Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLP 160 YW GIGT ++ + ++ K + L+ N RAI Y+K GF ++P Sbjct: 105 YWGLGIGTICMEELIKYAKSSEYLKLIYLEVVTENKRAINLYKKFGFIEAGEIP 158 >THEMA Q9WZ46 (Q9WZ46) Hypothetical protein Length = 179 Score = 35.4 bits (80), Expect = 0.39 Identities = 19/49 (38%), Positives = 28/49 (57%), Gaps = 1/49 (2%) Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRI 155 YW+ GIGTR I E+ ++ + L+ K+N RAI Y+K GF + Sbjct: 106 YWNIGIGTRMITSAIEWARR-NGFIRIQLEVLKSNERAISLYRKLGFEL 153 >STRR6 Q8DQU8 (Q8DQU8) Hypothetical protein 
spr0490 Length = 185 Score = 35.4 bits (80), Expect = 0.39 Identities = 29/128 (22%), Positives = 60/128 (46%), Gaps = 16/128 (12%) Query: 55 EDEVF---RVIIEYN---NVPIGYGQIYKMYDELY--TDYHYPKTDEIVYGMDQFIG--- 103 EDE++ ++ E N N+P GYG + K D++ D+++ D+++ IG Sbjct: 50 EDEIYYLEHILPERNQKENLPAGYGIVVKGTDKIVGSVDFNHRHEDDVLE-----IGYTL 104 Query: 104 EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHE 163 P+YW +G + + + K+ + + L N ++ R +K GF + + + + Sbjct: 105 HPDYWGRGYVPEAARALIDLAFKDLGLHKIELTCFGYNLQSKRVAEKLGFTLEARIRDRK 164 Query: 164 LHEGKKED 171 +G + D Sbjct: 165 DVQGNRCD 172 >CLOAB Q97FA0 (Q97FA0) Predicted acetyltransferase Length = 146 Score = 35.4 bits (80), Expect = 0.39 Identities = 23/94 (24%), Positives = 45/94 (47%), Gaps = 15/94 (15%) Query: 61 VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120 V+I+ NN+ +GYG ++ + DE + + P + GIG + ++ + Sbjct: 46 VVIKNNNLVVGYGGLWLIIDEGH--------------ITNIAVHPEFRGMGIGNKILEEL 91 Query: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154 + +K RN ++ L+ +N A Y+K GF+ Sbjct: 92 IKLCEK-RNIPSMTLEVRISNTIAQNLYKKFGFK 124 >_BUCAP LOLC_BUCAP (Q8K9N8) Lipoprotein-releasing system transmembrane protein lolC Length = 399 Score = 35.4 bits (80), Expect = 0.39 Identities = 44/156 (28%), Positives = 62/156 (39%), Gaps = 26/156 (16%) Query: 189 MKYLIEHYFDNFK----VDSIEIIGSGYDSVAYLVNNEYIFKTKFS------------TN 232 ++YL Y NFK + SI IG G S ++ F+ KF TN Sbjct: 11 LRYLWNPYLPNFKKIIIILSILGIGIGISSTIITISIMNGFQNKFKNDILSFIPHIIITN 70 Query: 233 KKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSE 292 K + K LN ET +K+ N+E I+D +S E K EI + Sbjct: 71 KNRNINK-------LNFPKET-LKLKNVEE--ITDFISKKVIIENKNEINIGEIIGINIK 120 Query: 293 EEQNLLKRDIASFLRQMHGLDYTDISECTIDNKQNV 328 E+NL +I FL +H Y I + K +V Sbjct: 121 NEKNLENYNIKKFLHTLHSRKYNAIIGSELAKKMHV 156 >BACC1 Q734M2 (Q734M2) Acetyltransferase, GNAT family Length = 153 Score = 35.4 bits (80), Expect = 0.39 Identities = 22/91 (24%), Positives = 42/91 (46%), Gaps = 16/91 (17%) Query: 80 
DELYTDYHYPKT-----------DEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKER 128 DE++ Y Y T DE+ + + + P+Y+ KGI T+ + +F+ + Sbjct: 50 DEIFYGYFYEDTLAGFISFKIEKDEV--DIHRLVVSPDYFHKGIATKLLLYVFDMFSPSK 107 Query: 129 NANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 I+ K N A+ Y+K GF ++++ Sbjct: 108 ---TYIVQTGKENTPALSLYKKHGFIEVKEI 135 >BACAN Q81U09 (Q81U09) Acetyltransferase, GNAT family Length = 167 Score = 35.4 bits (80), Expect = 0.39 Identities = 15/47 (31%), Positives = 27/47 (57%) Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153 Y ++GIGT+ I+ + + K++ + L N RAI+ Y++ GF Sbjct: 93 YCNQGIGTKLIEFLIRWAKEQNGLEKICLGVVSVNDRAIKVYKRMGF 139 >YERPE Q8ZHB3 (Q8ZHB3) Putative siderophore biosynthesis protein IucB (Acetyl CoA:N6-hydroxylsyine acetyl transferase) Length = 316 Score = 35.0 bits (79), Expect = 0.51 Identities = 35/159 (22%), Positives = 63/159 (39%), Gaps = 31/159 (19%) Query: 14 IDDDFPLMLKWLTDERVLEFY---GGRDKK--YTLESLKKHYTEPWEDEVFRVIIEYNNV 68 +D D P +W+ RV F+ G D + Y L Y P ++ +++ Sbjct: 151 VDHDAPQFTRWMNSPRVDAFWEMSGPLDVQAAYLQRQLDSPYCYP-------LLGCFDDQ 203 Query: 69 PIGYGQIY-KMYDELYTDYHYPKTDEIVYGMDQFIGEPNY--------WSKGIGTRYIKL 119 P GY ++Y D + Y + D G+ +GE N+ W +G+ T Y+ L Sbjct: 204 PFGYFEVYWAAEDRIGRHYRWQPFDR---GLHMLVGEENWRGAQYIHSWLRGL-THYLYL 259 Query: 120 IFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIED 158 E V+ +P +N R +G+ +++ Sbjct: 260 ------DESRTTRVVAEPRIDNQRLFHHLPAAGYHTLKE 292 >VIBUY Q7MFH7 (Q7MFH7) Histone acetyltransferase HPA2 Length = 166 Score = 35.0 bits (79), Expect = 0.51 Identities = 18/65 (27%), Positives = 33/65 (50%) Query: 111 GIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKE 170 GIG++ I+ + E N + ++ + +N +AI Y+K GF I + + EG+ Sbjct: 94 GIGSKLIETVTELADNWLNVRRIQIEVNVDNEKAISLYKKHGFVIEGEAVDSSFREGRFI 153 Query: 171 DCYLM 175 + Y M Sbjct: 154 NTYYM 158 >RICCN Q92JG6 (Q92JG6) Transcriptional activator protein czcR Length = 237 Score = 35.0 bits (79), Expect = 0.51 Identities = 37/136 (27%), Positives = 63/136 (46%), Gaps = 27/136 (19%) Query: 219 
VNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLET--------NVKIPNIEYSYISDELS 270 V E I + K + KG+A ++ ++ NL+T V + N EY+ + EL Sbjct: 104 VREELIARIKAIVRRSKGHAASVFRFDKVSINLDTRSVEVDGKKVHLTNKEYAIL--ELL 161 Query: 271 ILGYKEIKGTFLTPE-----IYSTMSEEEQNLLKRDIASFLRQMH----GLDYTDISECT 321 IL +GT LT E +YS++ E E ++ I +++ G DY D T Sbjct: 162 ILR----RGTILTKEMFLNHLYSSVDEPEMKIIDVFICKLRKKLSDAAGGRDYID----T 213 Query: 322 IDNKQNVLEEYILLRE 337 + + +L+EY L++ Sbjct: 214 VWGRGYMLKEYDELQQ 229 >CLOAB Q97G03 (Q97G03) Predicted acetyltransferase Length = 167 Score = 35.0 bits (79), Expect = 0.51 Identities = 24/75 (32%), Positives = 37/75 (49%), Gaps = 11/75 (14%) Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHE 166 Y KGIG+ IK +FE+ +E + L+ +N +AI Y+K GF + E Sbjct: 94 YSGKGIGSLIIKRVFEW-AEENAIEKIDLEVFHDNFKAISLYKKFGF----------IEE 142 Query: 167 GKKEDCYLMEYRYDD 181 G+K++ E Y D Sbjct: 143 GRKKNAIKAEDGYKD 157 >BACHD Q9KF00 (Q9KF00) Ribosomal-protein-alanine N-acetyltransferase Length = 190 Score = 35.0 bits (79), Expect = 0.51 Identities = 43/175 (24%), Positives = 76/175 (43%), Gaps = 23/175 (13%) Query: 8 ICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRVIIEYNN 67 + +R + DD +L +L+D+ V++++ G + TLE W + + + Sbjct: 16 LILRKITTDDARSILSYLSDKEVMKYF-GLEPFQTLEDALGEIA--WYESIL-----HEQ 67 Query: 68 VPIGYGQIYKMYDELY--TDYH--YPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI--- 120 I +G K DE+ +H PK G + YW +GI + I+ + Sbjct: 68 TGIRWGITLKGQDEVIGSCGFHQWVPKHHRAEIGFEL---SKLYWGQGIASEAIRAVIQY 124 Query: 121 -FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYL 174 FE L+ +R A+I P+ + R + +K GF L +E GK +D Y+ Sbjct: 125 GFEHLELQR-IQALIEPPNIPSQRLV---EKQGFISEGLLRSYEYTCGKFDDLYM 175 >BACAN Q81NB6 (Q81NB6) Spermine/spermidine acetyltransferase, putative Length = 148 Score = 35.0 bits (79), Expect = 0.51 Identities = 31/119 (26%), Positives = 52/119 (43%), Gaps = 20/119 (16%) Query: 54 WEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPK---TDEIVYGMDQFIGEP---NY 107 WE+ + + E I +Y + + + D Y K +E + G F +P NY Sbjct: 14 
WEEAIKLSVKEEQQTFIA-SNLYSIAEVQFLDNFYAKGIYLEEKMVGFTMFGIDPEDNNY 72 Query: 108 W-----------SKGIGTRYIKLIFEFLKKERNAN--AVILDPHKNNPRAIRAYQKSGF 153 W KGIG + I L+ + +++ NAN +++ N A AY+K+GF Sbjct: 73 WIYRLMIDENFQGKGIGKQAIYLVIDEIRRNNNANFSRIMIGYAPENLTAKFAYKKAGF 131 >BACAN Q81N07 (Q81N07) Acetyltransferase, GNAT family Length = 153 Score = 35.0 bits (79), Expect = 0.51 Identities = 21/89 (23%), Positives = 42/89 (47%), Gaps = 12/89 (13%) Query: 80 DELYTDYHYPKT---------DEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNA 130 DE++ Y Y T D+ + + + P+++ KGI T+ + IF+ ++ Sbjct: 50 DEIFYGYFYEDTLAGFISFKIDKEEVDIHRLVVSPDHFHKGIATKLLLYIFDMFS---SS 106 Query: 131 NAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 I+ K N A+ Y+K GF ++++ Sbjct: 107 KTYIVQTGKENTPALSLYKKHGFIEVQNI 135 >STRMU Q8DT36 (Q8DT36) Putative acetyltransferase Length = 184 Score = 34.7 bits (78), Expect = 0.67 Identities = 16/78 (20%), Positives = 36/78 (46%) Query: 106 NYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELH 165 +YW +G+ T ++ + +E + + + HK N + R +K+GFR++ + + Sbjct: 105 HYWKQGLATEALENLVFLAFQELDLKELEIIVHKENRASARVAEKAGFRLVRQFKGSDRY 164 Query: 166 EGKKEDCYLMEYRYDDNA 183 K D + + D + Sbjct: 165 THKMRDYLKYDLKAGDKS 182 >PSEAE Q9I3W7 (Q9I3W7) Hypothetical protein Length = 177 Score = 34.7 bits (78), Expect = 0.67 Identities = 17/66 (25%), Positives = 33/66 (50%) Query: 110 KGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKK 169 KG+G+R + + + N V L + +N A+ Y+K GF ++ ++ + +G+ Sbjct: 100 KGVGSRLLGELLDIADNWMNLRRVELTVYTDNAPALALYRKFGFETEGEMRDYAVRDGRF 159 Query: 170 EDCYLM 175 D Y M Sbjct: 160 VDVYSM 165 >NITEU Q82UT0 (Q82UT0) GCN5-related N-acetyltransferase (EC 2.3.1.128) Length = 157 Score = 34.7 bits (78), Expect = 0.67 Identities = 17/48 (35%), Positives = 30/48 (62%), Gaps = 1/48 (2%) Query: 109 SKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRII 156 S+G+G + ++ + E L ++ A V+LD ++N AI YQ+ GF+ I Sbjct: 88 SQGLGRKMLRYLIE-LSRKHQAEFVLLDVRESNTGAINLYQRLGFQQI 134 >LACLA Q9CF66 (Q9CF66) Spermidine acetyltransferase (EC 2.3.1.57) Length = 180 Score 
= 34.7 bits (78), Expect = 0.67 Identities = 34/131 (25%), Positives = 55/131 (41%), Gaps = 12/131 (9%) Query: 60 RVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKL 119 R +IE N+ IG ++ + DY + +T EI Q I + KG + +K Sbjct: 62 RFVIEANDTFIGIVELMSI------DYIH-RTCEI-----QIIIISGFSGKGYAQKALKT 109 Query: 120 IFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRY 179 ++ N + V L +N A+ Y+K GF+I + E G+ D Y M Sbjct: 110 GVDYAFNTLNMHKVYLWVDIDNAPAVHIYKKLGFKIEGTIKEQFFAGGRYHDSYFMGILK 169 Query: 180 DDNATNVKAMK 190 + KA+K Sbjct: 170 SEYTQREKAVK 180 >BACC1 Q739G7 (Q739G7) Acetyltransferase, GNAT family Length = 188 Score = 34.7 bits (78), Expect = 0.67 Identities = 33/136 (24%), Positives = 52/136 (38%), Gaps = 21/136 (15%) Query: 47 KKHYTEPWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPN 106 +K T E+ + +IIE+N IG Y + Y T + G+ I P Sbjct: 56 EKMQTRLKEEPLSNLIIEHNGQVIGTVGFY---------WEYKPTRWLEMGI--VIYNPA 104 Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHE 166 YW+ G GT + L + L ++ V L N R ++ +K G + E Sbjct: 105 YWNGGYGTEALTLYRDLLFEKMEIGRVGLTTWSGNERMMKVAEKIGMTL----------E 154 Query: 167 GKKEDCYLMEYRYDDN 182 G+ C Y D+ Sbjct: 155 GRMRKCRYYNGTYYDS 170 >MYCPE Q8EWM2 (Q8EWM2) Hypothetical protein MYPE1810 Length = 300 Score = 34.3 bits (77), Expect = 0.88 Identities = 57/252 (22%), Positives = 99/252 (39%), Gaps = 62/252 (24%) Query: 115 RYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRII-EDL-----PEHELHEGK 168 R+I + +FL K+ + + KN+ + G I+ EDL P L E K Sbjct: 55 RFILNLLDFLYKDNDLIEYKRERSKNDLKFFHFSFSKGLDILLEDLHLNKDPYKWLVETK 114 Query: 169 KEDCYLME-YRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLV-----NNE 222 C+L+ + Y + + + Y E N ++ ++I+ + S+ + NN Sbjct: 115 TRSCFLIGLFLYGGSINSPNSSNYHFEIKIHNTEI--LKIVEKIFSSINIPLLVLNRNNT 172 Query: 223 YIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFL 282 YI K S + ISD L +LG E Sbjct: 173 YIVYIKKSES--------------------------------ISDILKLLGATE------ 194 Query: 283 TPEIYSTMSEEEQNLLKRDIASFLRQMHGLDYTDISECTIDNKQNVLE--EYILLRETIY 340 +M E E+ + RD 
+ + +++ LD +++ + TI+ L+ EY+ ++ Sbjct: 195 ------SMFEYEEKRISRDYTNQMSRLNNLDMSNLKK-TIEASHIQLQNIEYVK-NNNLF 246 Query: 341 NDLTDIEKDYIE 352 N LTD EK Y E Sbjct: 247 NQLTDKEKIYCE 258 >MYCPE Q8EWE6 (Q8EWE6) Acetyltransferase GNAT family Length = 190 Score = 34.3 bits (77), Expect = 0.88 Identities = 26/112 (23%), Positives = 51/112 (45%), Gaps = 13/112 (11%) Query: 48 KHYTEPWEDE-VFRVIIEYNNVPIGYGQIYKMYDELYTDYHYP----KTDEIVYGMDQFI 102 KH+ E E + +++I N Y ++K +++ + KT +I Y + + Sbjct: 45 KHHKNIEETETILKILISGGNF---YALVWKENNKVIGSFGIETPSYKTVKIGYALSK-- 99 Query: 103 GEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154 +YW+ GI T K I +F+ N +++ N + + +KSGF+ Sbjct: 100 ---DYWNLGIMTEVTKHIIDFIFTNSGFNKILVSHFDENTASKKVIEKSGFK 148 >LACLA Q9CHW9 (Q9CHW9) Hypothetical protein yfiL Length = 154 Score = 34.3 bits (77), Expect = 0.88 Identities = 27/97 (27%), Positives = 46/97 (47%), Gaps = 15/97 (15%) Query: 90 KTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVIL--DPHKNNPRAIRA 147 K + + ++ F E ++ +G G + +K + +LK+ A+ +IL D NN + Sbjct: 56 KKQKNTFEIENFAVETSFQGQGFGQQMMKQLITYLKENLAADELILGTDDVSNN---VAF 112 Query: 148 YQKSGFRIIE-------DLPEHELHEGK---KEDCYL 174 Y+K GF I D +H + EGK K+ YL Sbjct: 113 YEKCGFTITHKISNYFLDNCDHPIFEGKVQLKDKIYL 149 >LACPL Q88SM7 (Q88SM7) Prophage Lp4 protein 8, DNA primase/helicase Length = 500 Score = 34.3 bits (77), Expect = 0.88 Identities = 37/160 (23%), Positives = 64/160 (40%), Gaps = 11/160 (6%) Query: 179 YDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEY-----IFKTKFSTNK 233 Y DNAT + I+++FD EI+ ++ + +N Y IF N Sbjct: 174 YRDNATTPNIKGWTIDNWFDELACGDDEIVELLWEVINDCLNGNYTRKKAIFLFSELGNS 233 Query: 234 KKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEE 293 KG +E I N + + +K+ + + ++G G + P+IY S Sbjct: 234 GKGTFQE-LITNLVGMDNVGTLKVNEFDVRF--RLAGLVGKTVCIGDDIAPDIYIKDSSN 290 Query: 294 EQNLLKRDIASFLRQMHGLD-YTDISECTIDNKQNVLEEY 332 +++ D+ + + G D YT CTI N L + Sbjct: 291 FNSVVTGDLVNI--EFKGQDGYTSALRCTIVQSCNGLPNF 328 >ENTFA Q837C4 (Q837C4) Acetyltransferase, GNAT family Length = 144 Score = 
34.3 bits (77), Expect = 0.88 Identities = 17/58 (29%), Positives = 30/58 (51%), Gaps = 6/58 (10%) Query: 104 EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161 +P Y+ KG G I+ + E + + +D +K N A++ YQ GF++I + E Sbjct: 78 DPVYFRKGYGGEIIQKLIE------QESIIFVDANKQNEGAVKFYQSQGFQVIGESKE 129 >CLOTE Q891D4 (Q891D4) Ribosomal-protein-alanine acetyltransferase (EC 2.3.1.128) Length = 152 Score = 34.3 bits (77), Expect = 0.88 Identities = 30/118 (25%), Positives = 54/118 (45%), Gaps = 18/118 (15%) Query: 58 VFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYI 117 ++ V I+ N + +GYG ++ + DE + T+ ++ PNY GI + + Sbjct: 48 LYIVAIKDNKI-LGYGGLWIILDEGHV------TNIAIH--------PNYRQLGIASLVL 92 Query: 118 KLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLM 175 + + K R N++ L+ K+N A YQK GF +E+ + ED +M Sbjct: 93 STLIKE-SKNRGVNSITLEVRKSNSVAQNLYQKFGF--VEEGCRKHYYSDNLEDAIIM 147 >CLOAB Q97I16 (Q97I16) Predicted acetyltransferase domain containing protein Length = 291 Score = 34.3 bits (77), Expect = 0.88 Identities = 31/103 (30%), Positives = 37/103 (35%), Gaps = 20/103 (19%) Query: 68 VPIGYGQIYKMYDELYTDY-----HYPKTDEIVYGMDQFIGEPN------------YWSK 110 +P+ IY YDE Y + DEI G QFI E N Y Sbjct: 177 IPLSIDDIY--YDEAQEYYVDDGAFFISKDEIKIGYGQFIFEHNNITIVNFGIVEQYRGN 234 Query: 111 GIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153 G G ++ I LK R V + NN AI Y GF Sbjct: 235 GYGRYFLSYILNILKN-RGCKVVYIKVDMNNVPAINLYTSMGF 276 >BACC1 Q72WY7 (Q72WY7) Hypothetical protein Length = 186 Score = 34.3 bits (77), Expect = 0.88 Identities = 38/163 (23%), Positives = 70/163 (42%), Gaps = 16/163 (9%) Query: 195 HYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETN 254 H N K++ I + D V L N Y T+ T+ +K K +N + Sbjct: 12 HLEKNIKLEDIPNVDLYVDQVVQLFENTYADTTR--TDDEKVLTK-----TMINNYAKGK 64 Query: 255 VKIPNIEYSYISDELSILG-YKEIKGTFLTPEIYSTMSEEEQNLLKRDIASFLRQMHGLD 313 + IP Y + + ++ ++KG +I S++ +LL D SF M + Sbjct: 65 LFIPIKNKKYSKEHMILISLIYQLKGALSINDIKSSLETINDSLLNDD--SFELNMLYKN 122 Query: 314 
YTDISECTIDN-KQNVLEEYILLRETIYNDLTDIEKDYIESFM 355 Y ++E +++ KQ+V R T N+++ +E +E F+ Sbjct: 123 YLALTESNVESFKQDVNN-----RVTEVNEISSLEDTKLEKFL 160 >VIBPA Q87G30 (Q87G30) Putative acetyltransferase Length = 166 Score = 33.9 bits (76), Expect = 1.1 Identities = 22/77 (28%), Positives = 36/77 (46%), Gaps = 6/77 (7%) Query: 99 DQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIED 158 DQF G G+G++ I+ I E N + L+ + +N AI Y+K GF I + Sbjct: 88 DQFHG------LGVGSKLIETITELADNWLNVRRIQLEVNADNEAAIGLYKKHGFEIEGE 141 Query: 159 LPEHELHEGKKEDCYLM 175 + +G+ + Y M Sbjct: 142 AIDASFRDGEFINTYYM 158 >STAES ARGD2_STAES (Q8CSG1) Acetylornithine aminotransferase 2 (EC 2.6.1.11) (ACOAT 2) Length = 375 Score = 33.9 bits (76), Expect = 1.1 Identities = 31/133 (23%), Positives = 61/133 (45%), Gaps = 11/133 (8%) Query: 193 IEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE--KAIYNFLNTN 250 + + F+N+K D+IE + + + + NN Y+ + G+ E +A+YN LN Sbjct: 1 MSYLFNNYKRDNIEFVDANQNELIDKDNNVYLDFSSGIGVTNLGFNMEIYQAVYNQLNLI 60 Query: 251 LETNVKIPNIEYSYISDELS--ILGYKEIKGTFL---TPEIYSTMSEEEQNLLKRDIASF 305 + PN+ S I +E++ ++G ++ F T + + + K +I +F Sbjct: 61 WHS----PNLYLSSIQEEVAQKLIGQRDYLAFFCNSGTEANEAAIKLARKATGKSEIIAF 116 Query: 306 LRQMHGLDYTDIS 318 + HG Y +S Sbjct: 117 KKSFHGRTYGAMS 129 >RICPR Q9ZEC5 (Q9ZEC5) 190 KD ANTIGEN (Sca1) Length = 347 Score = 33.9 bits (76), Expect = 1.1 Identities = 24/90 (26%), Positives = 43/90 (47%), Gaps = 2/90 (2%) Query: 197 FDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVK 256 F N K + E+IGS S+ + F + + K + K+K Y++ +T ++++VK Sbjct: 123 FKNGKNNDKELIGSKVISIYGQKELQQNFTLQLLVSASKNFIKDKINYSYGDTQIKSHVK 182 Query: 257 IPNIEYSYISDELSILGYKEIKGTFLTPEI 286 N +SY ++ L Y +TP I Sbjct: 183 HHN--HSYNAEALLNYNYLVKNSIIITPNI 210 >PSEPK Q88NL5 (Q88NL5) Acetyltransferase, GNAT family Length = 162 Score = 33.9 bits (76), Expect = 1.1 Identities = 18/59 (30%), Positives = 28/59 (47%), Gaps = 6/59 (10%) Query: 98 MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRII 156 +D P Y +G+G R ++ L NA LD ++ NP+A+ Y 
GF +I Sbjct: 86 VDMLFVAPGYRGQGVGKRLLRYAISEL------NAEYLDVNEQNPKALGFYLHEGFEVI 138 >LACPL Q88YF8 (Q88YF8) Acetyltransferase (Putative) Length = 171 Score = 33.9 bits (76), Expect = 1.1 Identities = 13/47 (27%), Positives = 25/47 (53%) Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153 +W G+GT I+ + ++ + + ++L N RA++ YQ GF Sbjct: 98 FWGMGLGTALIEEVLDWARNYSSLERLVLTVQLRNVRAVKLYQHLGF 144 >BACC1 Q737S7 (Q737S7) Acetyltransferase, GNAT family Length = 282 Score = 33.9 bits (76), Expect = 1.1 Identities = 36/125 (28%), Positives = 59/125 (47%), Gaps = 13/125 (10%) Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANA---VILDPHKNNPRAIRAYQKSGFRIIEDLPE 161 PNY +GIG + +FE K E N + L+ N RAIR Y K G+ + DL Sbjct: 87 PNY--RGIGVS--QKLFELHKDEAIQNGCKQLFLEVIVGNDRAIRFYNKLGYEKVYDLSY 142 Query: 162 HELHEGKK---EDCYLMEYR-YDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAY 217 + L + K ++C +E + + A V+ K+L H+ N++ D I + + Sbjct: 143 YNLKDMTKIIHKECKGIEVKQLEFPAFKVEIQKWL--HFHINWQNDMDYIEKTNHTFYGA 200 Query: 218 LVNNE 222 V+N+ Sbjct: 201 YVDND 205 >BACC1 Q734H0 (Q734H0) Acetyltransferase, GNAT family family Length = 182 Score = 33.9 bits (76), Expect = 1.1 Identities = 45/172 (26%), Positives = 71/172 (41%), Gaps = 24/172 (13%) Query: 8 ICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWED---EVFRVIIE 64 +CI +DD ++ L +++ L G Y LE + + W D E+ R IE Sbjct: 10 LCIEPFTNDDV-CRIRELANDKELANILGLPHPYKLE-----FAQDWVDMQPELIRKGIE 63 Query: 65 YNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQ-----FIGEPNYWSKGIGTRYIKL 119 Y P+G + K E+ T I G ++ +IG+ NYW KG T + Sbjct: 64 Y---PLGI--VSKESREIVGTI----TLRIDKGNNRGELGYWIGK-NYWGKGFATEALNR 113 Query: 120 IFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKED 171 + +F E N + N +I+ +KSG R L ++ L ED Sbjct: 114 MIQFGFIELGLNKIWASAISRNRSSIKVLEKSGLRKEGTLRQNRLLLNTYED 165 >BACAN Q81Q67 (Q81Q67) Acetyltransferase, GNAT family Length = 282 Score = 33.9 bits (76), Expect = 1.1 Identities = 34/125 (27%), Positives = 57/125 (45%), Gaps = 13/125 (10%) Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANA---VILDPHKNNPRAIRAYQKSGFRIIEDLPE 161 PNY G+ 
+ +FE K+E N + L+ N RAIR Y K G+ + DL Sbjct: 87 PNYRGVGVSQK----LFELHKEEALQNECKQLFLEVIVGNDRAIRFYNKLGYEKVYDLSY 142 Query: 162 HELHEGKK---EDCYLMEYR-YDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAY 217 + L + K +C +E + + A V+ K+L H+ N++ D I + + Sbjct: 143 YNLKDMTKIIHRECKGIEVKQLEFAAFKVEIQKWL--HFHINWQNDMDYIEKTNHTFYGA 200 Query: 218 LVNNE 222 V+N+ Sbjct: 201 YVDND 205 >BACAN Q81P77 (Q81P77) Acetyltransferase, GNAT family Length = 181 Score = 33.9 bits (76), Expect = 1.1 Identities = 45/181 (24%), Positives = 76/181 (41%), Gaps = 32/181 (17%) Query: 10 IRTLIDDDFPLMLKWLTDERVLEFYGGRDK--KYTLESLKKHYTEPWEDEVF-------- 59 +R L DD +W D +V + D+ +TLE K+ W + Sbjct: 8 LRELTLDDVEDRYQWSLDTKVTKHLVVSDQYPPFTLEDTKQ-----WIEACINRKNGYEQ 62 Query: 60 RVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKL 119 R I N + IG+ ++ K +D+ K E+ IG YW KG G + Sbjct: 63 RAITAENGIHIGWIEL-KNFDKTN------KNAELGIA----IGNKEYWGKGDGIAALYS 111 Query: 120 IFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHE-LHEGKKEDCYLMEYR 178 + E V L ++N +A ++Y+K+GF + E L ++ L +G+ ++ YR Sbjct: 112 MLHVAFFEFELEKVWLRVDEDNLQARKSYEKAGF-VCEGLMRNDRLRKGR----FIHRYR 166 Query: 179 Y 179 Y Sbjct: 167 Y 167 >Q8E1I7 (Q8E1I7) Acetyltransferase, GNAT family Length = 186 Score = 33.5 bits (75), Expect = 1.5 Identities = 22/64 (34%), Positives = 29/64 (45%), Gaps = 1/64 (1%) Query: 93 EIVYGMDQFIG-EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKS 151 EI + D FI + +YW GIG ++ E+ + L N RAI YQK Sbjct: 97 EIEHVGDLFIAVQKDYWGYGIGHILMEEAIEWASDNDITRRLELSVQGRNERAIHLYQKF 156 Query: 152 GFRI 155 GF I Sbjct: 157 GFEI 160 >Q8DXE6 (Q8DXE6) Hypothetical protein SAG1905 Length = 212 Score = 33.5 bits (75), Expect = 1.5 Identities = 29/93 (31%), Positives = 44/93 (47%), Gaps = 9/93 (9%) Query: 219 VNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIK 278 +N Y+F+ F K GY KE+A+ N + + +K I Y D LS+L KEI Sbjct: 125 LNLRYLFERLFEDEKGGGYPKERAVPEQRNARILSEIK--QITY---RDLLSVL--KEID 177 Query: 279 GTFLTPEIYSTMSEEE--QNLLKRDIASFLRQM 309 FL I +E N ++IA +L+ + Sbjct: 178 
QDFLKETISGEHFQEYFFANCQNQNIADYLKSV 210 >OCEIH Q8ET96 (Q8ET96) Hypothetical conserved protein Length = 167 Score = 33.5 bits (75), Expect = 1.5 Identities = 26/90 (28%), Positives = 41/90 (45%), Gaps = 15/90 (16%) Query: 110 KGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKK 169 KGIG + LI + A+ + LD +N RAI Y+K GF + EG Sbjct: 92 KGIGKEALNLIKIWAFNSYKAHRLWLDVKTDNKRAITIYKKEGFTL----------EGTL 141 Query: 170 EDCYLMEYRYDDNATNVKAMKYLIEHYFDN 199 +C + Y+ ++ M L++H +DN Sbjct: 142 RECLRVGNTYE----SLHVMS-LLKHEYDN 166 >LISIN Q929M8 (Q929M8) Lin2246 protein Length = 157 Score = 33.5 bits (75), Expect = 1.5 Identities = 23/73 (31%), Positives = 38/73 (52%), Gaps = 1/73 (1%) Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164 P+Y +GIG + + E + +E+ + L N +AIR Y+K+GF+ L + + Sbjct: 84 PDYQREGIGQLLMDKMKE-VAREKGFIKISLRVLSINQKAIRFYEKNGFKQEGRLEKEFI 142 Query: 165 HEGKKEDCYLMEY 177 +GK D LM Y Sbjct: 143 IQGKYVDDILMAY 155 >CLOTE Q895L4 (Q895L4) DNA topoisomerase I (EC 5.99.1.2) Length = 696 Score = 33.5 bits (75), Expect = 1.5 Identities = 19/52 (36%), Positives = 28/52 (53%), Gaps = 3/52 (5%) Query: 220 NNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSI 271 N+EYIF+ S K G+ IY +LN + + N+ IP +E + E SI Sbjct: 399 NSEYIFRATGSIVKFDGFM---IIYEYLNEDEKENINIPKLEKGELLKEKSI 447 >CLOPE Q8XIF5 (Q8XIF5) Ribosomal-protein-alanine N-acetyltransferase Length = 148 Score = 33.5 bits (75), Expect = 1.5 Identities = 23/76 (30%), Positives = 39/76 (51%), Gaps = 4/76 (5%) Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164 P Y +G+G I + L KE N N++ L+ ++N A Y+K GF+ E+ Sbjct: 76 PEYRKQGVGNLLIDNLIT-LCKENNINSLTLEVRESNIPAQSLYKKHGFK--EEGIRKNF 132 Query: 165 HEGKKEDCYLMEYRYD 180 + KE+ +M +R+D Sbjct: 133 YNNPKENAIIM-WRHD 147 >BRUSU Q8FVN1 (Q8FVN1) Acyl-CoA dehydrogenase family protein Length = 388 Score = 33.5 bits (75), Expect = 1.5 Identities = 21/79 (26%), Positives = 38/79 (48%), Gaps = 15/79 (18%) Query: 181 
DNATNVKAMKYLIEH-----YFDNFKVDSIEIIG----------SGYDSVAYLVNNEYIF 225 +N ++ ++ ++ H +FDN +V +IG SG ++ L+ E I Sbjct: 197 NNGLTIRPIRTMMNHATTEVFFDNLRVPVSNLIGEEGKGFRYILSGMNAERILIAAECIG 256 Query: 226 KTKFSTNKKKGYAKEKAIY 244 K+ T K YAKE++I+ Sbjct: 257 DAKWFTQKSVNYAKERSIF 275 >BRUME Q8YCN6 (Q8YCN6) ACYL-COA DEHYDROGENASE (EC 1.3.99.-) Length = 395 Score = 33.5 bits (75), Expect = 1.5 Identities = 21/79 (26%), Positives = 38/79 (48%), Gaps = 15/79 (18%) Query: 181 DNATNVKAMKYLIEH-----YFDNFKVDSIEIIG----------SGYDSVAYLVNNEYIF 225 +N ++ ++ ++ H +FDN +V +IG SG ++ L+ E I Sbjct: 204 NNGLTIRPIRTMMNHATTEVFFDNLRVPVSNLIGEEGKGFRYILSGMNAERILIAAECIG 263 Query: 226 KTKFSTNKKKGYAKEKAIY 244 K+ T K YAKE++I+ Sbjct: 264 DAKWFTQKSVNYAKERSIF 282 >BACSU YQAR_BACSU (P45914) Hypothetical protein yqaR Length = 154 Score = 33.5 bits (75), Expect = 1.5 Identities = 36/157 (22%), Positives = 74/157 (47%), Gaps = 11/157 (7%) Query: 201 KVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKK---KGYAKEKAIYNFLNTNLETN--- 254 ++D +I S +V + Y+ + K + KGY IYN N +ET Sbjct: 3 QIDFGTVITSAITAVFFTGGTNYVLQKKNRKGNEIFTKGYILIDEIYNINNKRIETAAAF 62 Query: 255 VKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKRDIASFLRQMHGLDY 314 V N Y+ ++L +KE+ L + +S + ++E N+ ++ ++LR++ Sbjct: 63 VPFYNHPEGYL-EKLHTDYFKELSAFELIVKKFSILFDKELNIKLQEYINYLREVEVALR 121 Query: 315 TDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYI 351 +++ I + N +EYI E + +++T++ K +I Sbjct: 122 GFMNDDPI-IEVNFNQEYI---ERLIDEITNLIKKHI 154 >BACAN Q81RF9 (Q81RF9) Acetyltransferase, GNAT family Length = 185 Score = 33.5 bits (75), Expect = 1.5 Identities = 40/184 (21%), Positives = 71/184 (38%), Gaps = 29/184 (15%) Query: 6 NEICIRTLIDDDFPLMLKWLTDERVLE-------FYGGRDKKYTLESLKKHYTEPWEDEV 58 +++ IRT+ + D + + E E ++ ++Y++ +K T E+ + Sbjct: 9 DKVTIRTIEESDIKTLWNLVFKEENPEWKKWDAPYFSFSMQEYSVYK-EKMQTRLKEEPL 67 Query: 59 FRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIK 118 +IIE N IG Y + Y T + G+ I P YW+ G GT + Sbjct: 68 SNLIIENNGQVIGTVGFY---------WEYKPTRWLEMGI--VIYNPAYWNGGYGTEALT 116 Query: 119 
LIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYR 178 L + L ++ V L N R ++ +K G + EG+ C Sbjct: 117 LYRDLLFEKMEIGRVGLTTWSGNERMMKVAEKIGMSL----------EGRMRKCRYYNGT 166 Query: 179 YDDN 182 Y D+ Sbjct: 167 YYDS 170 >VIBPA Q87NP1 (Q87NP1) Putative spermine/spermidine acetyltransferase BltD Length = 182 Score = 33.1 bits (74), Expect = 2.0 Identities = 21/91 (23%), Positives = 42/91 (46%), Gaps = 3/91 (3%) Query: 85 DYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRA 144 ++H P T + M + P++ KG+G+ + + + N V L+ + N A Sbjct: 89 EFHAPSTGTLWLPMLTIL--PSFKGKGLGSEIVSSVIAVACEYANLQNVGLNVYAENISA 146 Query: 145 IRAYQKSGFRIIEDLPEHELHEGKKEDCYLM 175 R + + GF I + E+ GK+ +C ++ Sbjct: 147 FRFWYRQGFTQIRAF-DQEIEFGKEYNCLVL 176 >THETN Q8RC65 (Q8RC65) Acetyltransferases Length = 200 Score = 33.1 bits (74), Expect = 2.0 Identities = 16/50 (32%), Positives = 31/50 (62%), Gaps = 1/50 (2%) Query: 111 GIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLP 160 G+G++ ++ I + +K + ++LD N +AI+ Y+K G++IIE P Sbjct: 134 GLGSKLLEEIEQEARKLK-CKRIVLDVEIENEKAIKLYEKLGYKIIERSP 182 >STRR6 Q8CYV9 (Q8CYV9) Hypothetical protein spr0850 Length = 166 Score = 33.1 bits (74), Expect = 2.0 Identities = 23/81 (28%), Positives = 40/81 (49%), Gaps = 6/81 (7%) Query: 86 YHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYI-KLIFEFLKKERNANAVILDPHKNNPRA 144 Y YP + + G+ F+ + Y KGIG+ + + + F K R A + K NP++ Sbjct: 80 YAYPDEETVFIGL--FMVDQAYQRKGIGSHIVTEALAYFAKNFRKARLAYV---KGNPQS 134 Query: 145 IRAYQKSGFRIIEDLPEHELH 165 ++K GF+ I + EL+ Sbjct: 135 QHFWEKQGFKSIGCEVKQELY 155 >STAES Q8CU70 (Q8CU70) Spermidine acetyltransferase Length = 165 Score = 33.1 bits (74), Expect = 2.0 Identities = 27/83 (32%), Positives = 37/83 (44%), Gaps = 14/83 (16%) Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHK-------NNPRAIRAYQKSG 152 Q I +P + KG Y K FE K N IL+ HK +N +A+ Y+ G Sbjct: 81 QIIIKPEFSGKG----YAKFAFE---KAINYAFDILNMHKIYLYVDTDNKKAVHIYESQG 133 Query: 153 FRIIEDLPEHELHEGKKEDCYLM 175 F+ L E +GK +D Y M Sbjct: 134 FKTEGLLKEQFYTKGKYKDAYFM 156 >STAES Q8CS08 
(Q8CS08) Hypothetical protein SE1483 Length = 434 Score = 33.1 bits (74), Expect = 2.0 Identities = 38/150 (25%), Positives = 71/150 (47%), Gaps = 24/150 (16%) Query: 224 IFKTKFSTNKKKGYAK----EKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKE--- 276 I KT+FST+K KGY K EK+ N N + + ++ N + I++E+S L Sbjct: 6 ILKTQFSTSKFKGYLKYINDEKS--NKANHD-KKKIQSLNQDIENINNEMSNLNLNSYSS 62 Query: 277 -IKGTFLTPEIYSTMSEEEQNLLKRDIASFLRQMHGLDYTDISECTID----------NK 325 I G I ++ ++ ++KR A F + LD ++++ D N Sbjct: 63 YIIGYMKNNSITKKDNQNKKKVIKRTTAPFNNNSYTLDNKELNKLKDDFDTAEKQGCINY 122 Query: 326 QNVL--EEYILLRETIYNDLTD-IEKDYIE 352 Q+++ + L++ +Y+ TD + +D I+ Sbjct: 123 QDIISFDNDFLIKNHLYDAKTDELNEDVIK 152 >RICPR Q9ZDP9 (Q9ZDP9) Hypothetical protein RP278 Length = 371 Score = 33.1 bits (74), Expect = 2.0 Identities = 34/116 (29%), Positives = 52/116 (44%), Gaps = 19/116 (16%) Query: 231 TNKKKGYAK----EKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEI 286 T K+ Y + E+A+Y+ L + K NI S D+L + +KG LTPE Sbjct: 101 TRLKENYIQYDTVEEALYSLLTKETDLIKKANNIPESLTPDDL-----RRLKGENLTPE- 154 Query: 287 YSTMSEEEQNLLKRDIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYND 342 E+E+ K + S L + +D T S D + N + E L +TI N+ Sbjct: 155 -----EQEEERKKFEYLSILGSI--IDDTKKSNEHYDKRANEINEQ--LNKTIINE 201 >OCEIH Q8CXG0 (Q8CXG0) Diamine N-acetyltransferase (Spermine:spermidine acetyltransferase) (EC 2.3.1.57) Length = 152 Score = 33.1 bits (74), Expect = 2.0 Identities = 26/100 (26%), Positives = 44/100 (44%), Gaps = 12/100 (12%) Query: 66 NNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLK 125 ++ PIGY + +H + + D+F+ + KG +YI LI +++K Sbjct: 53 DDTPIGYAMV---------GFHSQEKQSAWF--DRFMIAAEHQGKGYAHQYIPLILDYIK 101 Query: 126 KERNANAVILDPHKNNPRAIRAYQKSGFRII-EDLPEHEL 164 + ++ L N A Y+K GF + E PE EL Sbjct: 102 MKYQVKSIKLSIIPTNDVAKLLYEKYGFVLTGETDPEGEL 141 >CLOAB Q97J70 (Q97J70) Predicted acetyltransferase Length = 171 Score = 33.1 bits (74), Expect = 2.0 Identities = 19/69 (27%), Positives = 31/69 (44%) Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHE 166 YW 
G+G + I + + KK + L +N RAI+ Y+ GF + L + Sbjct: 98 YWGLGVGRKLIMNLIAWSKKNHIVRKINLRVRTDNYRAIKLYESLGFVNEGTIKRDFLID 157 Query: 167 GKKEDCYLM 175 G+ D + M Sbjct: 158 GEFYDSFSM 166 >BURMA Q9AI54 (Q9AI54) DedA family protein Length = 1925639 Score = 33.1 bits (74), Expect = 2.0 Identities = 50/238 (21%), Positives = 103/238 (43%), Gaps = 28/238 (11%) Query: 12 TLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRVIIEYNNVPIG 71 T+ ++++ M+ ++VLE G ++K +YT ++I+Y N I Sbjct: 546537 TINENZYMEMITKDNLKQVLENLGFKNKNENYVKTINNYT---------LLIDYKNQSIN 546587 Query: 72 YGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFL----KKE 127 Y + K++D+ +++ +P+ + + + + KG Y++L ++ KK Sbjct: 546588 YPKEIKIHDKTTSNFSHPENFVVFECVHRLL------EKGYKAEYLELEPKWNLGRDKKG 546641 Query: 128 RNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYDDNATNVK 187 A+ ++ D ++NNP I + + + E + E + +++ L Y + K Sbjct: 546642 GKADILVKD-NENNPYLIIECKTTDSKNSEFI--KEWNRMQEDGGQLFSYFQQE-----K 546693 Query: 188 AMKYLIEHYFD-NFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIY 244 +KYL + D + K++ I YD+ YL E K S N + + K Y Sbjct: 546694 GVKYLCLYTSDFSDKLEYKNYIIQAYDNEEYLKEKELQNSYKKSNNNIELFKTWKESY 546751 Score = 31.2 bits (69), Expect = 7.4 Identities = 20/73 (27%), Positives = 36/73 (49%), Gaps = 2/73 (2%) Query: 105 PNYWSKGIGTRYIKLIFEFLKK-ERNANAVILDPHKNNPRAIRAYQKSGFRII-EDLPEH 162 P++ +G+G+R + + + + E + L NP A+R Y++ GFR + Sbjct: 1424334 PDHQGRGVGSRLFESLIAWARSAEPEIVRIELAAGAGNPGAVRLYERLGFRHEGRQVARG 1424393 Query: 163 ELHEGKKEDCYLM 175 L +G+ ED LM Sbjct: 1424394 RLPDGRFEDDILM 1424406 >BRAJA Q89YE3 (Q89YE3) Bll0009 protein Length = 250 Score = 33.1 bits (74), Expect = 2.0 Identities = 14/56 (25%), Positives = 31/56 (55%), Gaps = 4/56 (7%) Query: 104 EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 +PN+ KG+GT L+ + + + + ++ +NP I YQ+ GF+++ ++ Sbjct: 165 DPNWVGKGLGT----LLMNYALQRCDEDGIVAYLESSNPENIPFYQRHGFKVVGEI 216 >BACAN Q81NK7 (Q81NK7) Streptothricin acetyltransferase, putative Length = 184 Score = 33.1 bits (74), Expect = 2.0 Identities = 31/119 
(26%), Positives = 54/119 (45%), Gaps = 6/119 (5%) Query: 40 KYTLE---SLKKHYTEPWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVY 96 +YT+E S +K Y + +E+ V EY N P I +++++ K Sbjct: 41 EYTVEDVPSYEKSYLQNDNEEL--VYNEYINKPNQIIYIALLHNQIIGFIVLKKNWNNYA 98 Query: 97 GMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRI 155 ++ + Y + G+G R I ++ K E N ++L+ NN A + Y+K GF I Sbjct: 99 YIEDITVDKKYRTLGVGKRLIAQAKQWAK-EGNMPGIMLETQNNNVAACKFYEKCGFVI 156 >VIBUY Q7MI31 (Q7MI31) Ribosomal-protein-alanine acetyltransferase Length = 150 Score = 32.7 bits (73), Expect = 2.6 Identities = 19/75 (25%), Positives = 34/75 (45%), Gaps = 1/75 (1%) Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164 P KG G + + + + NA + L+ ++N AI YQ+ GF ++ + Sbjct: 74 PKQQGKGYGRQLLDAFIDE-GEAANAESAWLEVRESNVNAIHLYQEMGFNEVDRRRNYYP 132 Query: 165 HEGKKEDCYLMEYRY 179 + KED +M Y + Sbjct: 133 TQSGKEDAIIMSYLF 147 >OCEIH Q8EMB8 (Q8EMB8) Glucose-6-phosphate 1-dehydrogenase (EC 1.1.1.49) (G6PD) Length = 491 Score = 32.7 bits (73), Expect = 2.6 Identities = 45/198 (22%), Positives = 79/198 (39%), Gaps = 25/198 (12%) Query: 53 PWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTD----EIVYGMDQFIG--EPN 106 PW DEV R +E N++ + E + ++Y D E G+++ I E Sbjct: 51 PWTDEVLRENVE-NSIQDALSPDEDL-SEFISHFYYKSFDVTEKESYQGLNEIIQNLEGQ 108 Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHE 166 Y ++G Y+ + +F N + N ++ +IE H+L Sbjct: 109 YQTEGNRLFYLAMAPDFFGAIAN---------QLNDYGLKNTSGWTRLVIEKPFGHDLPS 159 Query: 167 GKKEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFK 226 KK + L +D Y I+HY V +IE+I +L NN +I Sbjct: 160 AKKLNHELQAAFREDQI-------YRIDHYLGKEMVQNIEVIRFANGIFEHLWNNRFISN 212 Query: 227 TKFSTNKKKGYAKEKAIY 244 + ++++ G +E+A Y Sbjct: 213 IQITSSETLG-VEERARY 229 >OCEIH Q8EKW6 (Q8EKW6) Acetyltransferase Length = 166 Score = 32.7 bits (73), Expect = 2.6 Identities = 36/165 (21%), Positives = 64/165 (38%), Gaps = 16/165 (9%) Query: 6 NEICIRTLIDDDFPLMLKWLTDERVLEFYGG---RDKKYTLESLKKHYT--EPWEDEVFR 60 N + R D+DFP + L D V+ F G RD K + 
L+ Y + + Sbjct: 5 NRLTFRPYHDNDFPFLQSLLQDPEVVRFIGDGNVRDDKACNDFLQWIYDTYKNGNGLGLQ 64 Query: 61 VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120 V++ N +G+ + E + EI Y + + +W KG T + Sbjct: 65 VLVNKQNERVGHAGLVPQTVEGKNEI------EIGYWIAK-----KHWGKGYATEAALAL 113 Query: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELH 165 F F +K + VI + N + +K +I +++ + H Sbjct: 114 FAFARKNIEVDRVISLIQRENTASRNVAEKLMMKIEKEIILKDKH 158 >MYCPE Q8EWC2 (Q8EWC2) Oligoendopeptidase F Length = 604 Score = 32.7 bits (73), Expect = 2.6 Identities = 30/121 (24%), Positives = 54/121 (44%), Gaps = 7/121 (5%) Query: 181 DNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE 240 DN T + L ++ F FK+D + I Y+ V+ +N + K + N+K Sbjct: 37 DNGTCYSNLNKLKKYLF--FKLDMVPIENKLYNYVSNKLNEDLANKEMINWNQKLSSKIS 94 Query: 241 KAIYNFLNTNLETNVKIPNIEY--SYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLL 298 + +F N E N+ + N E S+I ++ I ++ E + +EEE+ L+ Sbjct: 95 EFQLSFAN---EINIILDNKELIKSFIENDSEIKKFERFFDLIFKEENHKLSNEEEKLLV 151 Query: 299 K 299 K Sbjct: 152 K 152 >LISMO Q8Y8S4 (Q8Y8S4) Lmo0820 protein Length = 185 Score = 32.7 bits (73), Expect = 2.6 Identities = 21/74 (28%), Positives = 33/74 (44%), Gaps = 3/74 (4%) Query: 98 MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE 157 +D + Y G+GT + + E + E V L+ K NP A R Y++ GF + Sbjct: 113 LDSIVTNEKYRGHGVGTALLAKLTE-IAAEDGEKVVGLNCDKGNPHAKRLYERLGFHVTG 171 Query: 158 D--LPEHELHEGKK 169 + L HE +K Sbjct: 172 EITLSGHEYEHMQK 185 >CLOPE Q8XKM5 (Q8XKM5) Probable acetyltransferase Length = 167 Score = 32.7 bits (73), Expect = 2.6 Identities = 18/76 (23%), Positives = 36/76 (47%), Gaps = 8/76 (10%) Query: 85 DYHYPKTDEIVYGMD-------QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDP 137 DY Y ++I + +D + P++ KG G + I + E + KE+ N++ + Sbjct: 69 DYAYDVYNDIAWQVDGPFLSFHRIAVSPSHRGKGYGRKMIDFV-EEMAKEKKCNSIRISA 127 Query: 138 HKNNPRAIRAYQKSGF 153 + N A+ Y+ G+ Sbjct: 128 YHKNENAVNLYKNLGY 143 >BUCAI SPED_BUCAI (P57304) S-adenosylmethionine decarboxylase proenzyme (EC 4.1.1.50) (AdoMetDC) (SamDC) [Contains: 
S-adenosylmethionine decarboxylase beta chain; S-adenosylmethionine decarboxylase alpha chain] Length = 265 Score = 32.7 bits (73), Expect = 2.6 Identities = 30/103 (29%), Positives = 46/103 (44%), Gaps = 11/103 (10%) Query: 169 KEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTK 228 + D +EYR ++ +K+ I+H ++ + + I S YD V V E IF T+ Sbjct: 159 ESDIVTIEYRVRGFTRDIHGIKHFIDHKINSIQNFMSDDIKSMYDMVDVNVYQENIFHTR 218 Query: 229 FSTNKKKGYAKEKAIYNFL-NTNLETNVKIPNIEYSYISDELS 270 +E + N+L N NLE + E SYI LS Sbjct: 219 M-------LLREFNLKNYLFNINLE---NLEKEERSYIKKLLS 251 >AQUAE O67458 (O67458) Hypothetical protein aq_1482 Length = 161 Score = 32.7 bits (73), Expect = 2.6 Identities = 18/60 (30%), Positives = 30/60 (50%), Gaps = 1/60 (1%) Query: 95 VYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154 V + + + +P Y G+GT + I E+ KK + + L N +AI Y+K GF+ Sbjct: 87 VGAIHEIVVDPEYQGHGVGTALMNTILEYFKK-KGLDTAELWVGDENYKAINFYKKFGFQ 145 >YERPE Q8CZN5 (Q8CZN5) Acyltransferase for 30S ribosomal subunit protein S18 Length = 161 Score = 32.3 bits (72), Expect = 3.3 Identities = 20/72 (27%), Positives = 34/72 (47%), Gaps = 1/72 (1%) Query: 104 EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHE 163 +P Y +G G ++ + E L+ ERN + L+ +N RAI Y+ GF + + Sbjct: 86 DPQYQRQGYGRLLLEHLIEQLE-ERNIVTLWLEVRASNARAIALYESLGFNEVSVRRNYY 144 Query: 164 LHEGKKEDCYLM 175 +ED +M Sbjct: 145 PSANGREDAIMM 156 >STRAW Q827N9 (Q827N9) Putative acetyltransferase Length = 166 Score = 32.3 bits (72), Expect = 3.3 Identities = 19/59 (32%), Positives = 31/59 (52%), Gaps = 1/59 (1%) Query: 109 SKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEG 167 S+GIG+ I+ E L +ER + + L +NPRA Y + G+R + + +EG Sbjct: 88 SRGIGSALIRAAEE-LTRERGLDVIGLGVGTDNPRAAELYARLGYRPLTGYVDRWSYEG 145 >STAES Q8CU81 (Q8CU81) Spermidine acetyltransferase Length = 165 Score = 32.3 bits (72), Expect = 3.3 Identities = 24/80 (30%), Positives = 35/80 (43%), Gaps = 8/80 (10%) Query: 100 QFIGEPNYWSKGIGTRYIKLIFE----FLKKERNANAVILDPHKNNPRAIRAYQKSGFRI 155 Q I +P 
+ KG Y K FE + N + + L +N +AI Y+ GF+ Sbjct: 81 QIIIKPEFSGKG----YAKFAFEKAIIYAFNILNMHKIYLYVDADNKKAIHIYESQGFKT 136 Query: 156 IEDLPEHELHEGKKEDCYLM 175 L E +GK +D Y M Sbjct: 137 EGLLKEQFYTKGKYKDAYFM 156 >RICPR Q9ZCN0 (Q9ZCN0) RIBOSOMAL-PROTEIN-ALANINE ACETYLTRANSFERASE (RimJ) Length = 183 Score = 32.3 bits (72), Expect = 3.3 Identities = 26/85 (30%), Positives = 41/85 (48%), Gaps = 12/85 (14%) Query: 93 EIVYGMDQFIGEPNYWSKGIGTRYIKLIFEF---LKKERNANAVILDPHKNNPRAIRAYQ 149 EI Y +D PN+W +GI + IK I +F + R VI D N R++ + Sbjct: 103 EISYDLD-----PNFWGQGIMLKSIKNILKFADCIGIIRVQATVITD----NFRSVNLLE 153 Query: 150 KSGFRIIEDLPEHELHEGKKEDCYL 174 + GF L ++E+ K +D Y+ Sbjct: 154 RCGFSKEGILKKYEIIANKHKDYYM 178 >OCEIH Q8ERM1 (Q8ERM1) Acetyltransferase Length = 177 Score = 32.3 bits (72), Expect = 3.3 Identities = 39/175 (22%), Positives = 64/175 (36%), Gaps = 15/175 (8%) Query: 5 ENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLK-KHY---TEPWEDEVFR 60 + E+ IR + + D + + + E E+ ++ ES+ +H+ E W D R Sbjct: 4 DQELTIRPIQEKDLKRLWELIYKEDNPEWKQWDAPYFSHESMSYEHFLKEAESWIDAKSR 63 Query: 61 VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120 ++ NN G Y+Y + M E N W KG GT +KL Sbjct: 64 WVVCVNNDVHGT-----------VSYYYEDEQKNWLEMGIIFYEGNNWGKGYGTTALKLW 112 Query: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLM 175 + + V L N R IR +K G + + + G+ D M Sbjct: 113 VNHIFTQLPVVRVGLTTWSGNKRMIRVAEKLGMTMEGRIRNVRYYNGEYYDSIRM 167 >MYCGE RIBF_MYCGE (P47391) Putative riboflavin biosynthesis protein ribF [Includes: Riboflavin kinase (EC 2.7.1.26) (Flavokinase); FMN adenylyltransferase (EC 2.7.7.2) (FAD pyrophosphorylase) (FAD synthetase)] Length = 269 Score = 32.3 bits (72), Expect = 3.3 Identities = 16/44 (36%), Positives = 27/44 (61%), Gaps = 3/44 (6%) Query: 419 TNFGEDILRMY-GNIDIEKAKEYQDIVEEYYPIETIVYGIKNIK 461 TN ++R Y N ++EKA + +VE YY + T+V+G+K + Sbjct: 120 TNLSSSVIRNYLTNNELEKANQL--LVEPYYRVGTVVHGLKKAR 161 >ENTFA Q836M4 (Q836M4) Spermine/spermidine acetyltransferase, putative Length = 148 Score = 32.3 bits (72), Expect 
= 3.3 Identities = 16/56 (28%), Positives = 29/56 (51%) Query: 98 MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153 +D+F+ + + +G G +L+ L ++ N + L + N AIR YQ+ GF Sbjct: 72 LDRFLIDQRFQGQGYGKAACRLLMLKLIEKYQTNKLYLSVYDTNSSAIRLYQQLGF 127 >CHRVO Q7NRZ7 (Q7NRZ7) Probable acetyltransferase (EC 2.3.1.-) Length = 172 Score = 32.3 bits (72), Expect = 3.3 Identities = 20/70 (28%), Positives = 31/70 (44%) Query: 106 NYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELH 165 ++ KG G ++ I + + L +N RAI Y+K GF L + +L Sbjct: 92 DWQGKGAGGAMMRAIIDLADNWLGLIRIELKVIHDNARAIALYEKFGFEYEGRLRQEQLR 151 Query: 166 EGKKEDCYLM 175 GK ED +M Sbjct: 152 AGKLEDVLVM 161 >CHLCV Q824H9 (Q824H9) Acetyltransferase, GNAT family Length = 170 Score = 32.3 bits (72), Expect = 3.3 Identities = 18/52 (34%), Positives = 29/52 (55%), Gaps = 2/52 (3%) Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153 +GEP Y +KGIGT + + K + + L+ ++ NP AI Y++ GF Sbjct: 95 VGEP-YRNKGIGTALLNNLCHLAKSRFHLEILYLEVYEENP-AIELYKRFGF 144 >CHLMU SYR_CHLMU (Q9PJT8) Arginyl-tRNA synthetase (EC 6.1.1.19) (Arginine--tRNA ligase) (ArgRS) Length = 563 Score = 32.3 bits (72), Expect = 3.3 Identities = 56/268 (20%), Positives = 106/268 (39%), Gaps = 63/268 (23%) Query: 204 SIEIIGSGYD----SVAYLVNNEYIFKTKFSTNKKKGY---AKEKAIYNFLNTNLETNVK 256 SIEI G+G+ S +L N IF + + KG+ + +K I +F + N+ ++ Sbjct: 75 SIEIAGAGFINFTFSKEFLANQLQIFSQELA----KGFPVSSPQKVIIDFSSPNIAKDMH 130 Query: 257 IPNIEYSYISDEL----SILGYKEIK-----------GTFLT--PEIYSTMSEEEQNLLK 299 + ++ + I D L S +G+ ++ G +T E T + +NL + Sbjct: 131 VGHLRSTIIGDCLARCFSFVGHDVLRLNHIGDWGTAFGMLITYLQETAQTDIHQLENLTE 190 Query: 300 RDIASFLRQMHGLDYTDISE--------------------CTIDNKQ-----NVLEEYIL 334 + +R ++ S+ C + K ++L+ + Sbjct: 191 LYKKAHVRFAEDPEFKKRSQYNVVALQSGDPQALALWKQICAVSEKSFQKIYSILDVELH 250 Query: 335 LR-ETIYND-LTDIEKDYIESFMERLNATTVFEGKKCLCHNDFSCNHLLL---DGNNRLT 389 R E+ YN L D+ D +E N T+ +G KC+ H +FS ++ G N T Sbjct: 251 TRGESFYNPFLADVVSD-----LESKNLVTLSDGAKCVFHEEFSIPLMIQKSDGGYNYAT 305 Query: 390 
XXXXXXXXXXXXEYCDFIYLLEDSEEEI 417 ++ D I ++ DS + + Sbjct: 306 TDVAAMRYRIQQDHADRILIVTDSGQSL 333 >BACSU O34376 (O34376) Putative acetyl transferase (YobR protein) Length = 247 Score = 32.3 bits (72), Expect = 3.3 Identities = 24/84 (28%), Positives = 37/84 (44%), Gaps = 2/84 (2%) Query: 76 YKMYD-ELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVI 134 +KMYD E T + G+ + + KG GT+ I+++ E+ K A + Sbjct: 158 FKMYDKESLTALGTVSVIDGYGGLSNIVVAEEHRGKGAGTQVIRVLTEWAKNN-GAERMF 216 Query: 135 LDPHKNNPRAIRAYQKSGFRIIED 158 L K N A+ Y K GF I + Sbjct: 217 LQVMKENLAAVSLYGKIGFSPISE 240 >BACC1 Q738E9 (Q738E9) Acetyltransferase, GNAT family Length = 308 Score = 32.3 bits (72), Expect = 3.3 Identities = 38/159 (23%), Positives = 59/159 (37%), Gaps = 25/159 (15%) Query: 62 IIEYNNVPIGYGQIYKM-YDELYTDYHYPKTDEIVYG-------------MDQFIGEPNY 107 +I+YN P GY + M Y D + D + G +D+ EP Y Sbjct: 39 VIDYNIQPPGYSSVEMMRYSIEELDCYKVIMDGKIIGGIIVTISGKSYGRIDRIFVEPVY 98 Query: 108 WSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEG 167 KGIG+ IKLI E R + NN Y+K G+ I + Sbjct: 99 QGKGIGSYVIKLIEEEYPSIRIWDLETSSRQLNNH---HFYKKMGYETI--------FKS 147 Query: 168 KKEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIE 206 + E CY+ + N+ K + ++N + + E Sbjct: 148 EDEYCYVKRITVESAEENLIKNKDMKNSQYENCNLANTE 186 >BACC1 Q736C5 (Q736C5) Acetyltransferase, GNAT family Length = 181 Score = 32.3 bits (72), Expect = 3.3 Identities = 23/79 (29%), Positives = 39/79 (49%), Gaps = 6/79 (7%) Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161 IG YW KG G + + E V L ++N +A ++Y+K+GF + E L Sbjct: 94 IGNKEYWGKGYGIAALYSMLHVAFFEFELEKVWLRVDEDNFQARKSYEKAGF-VCEGLMR 152 Query: 162 HE-LHEGKKEDCYLMEYRY 179 ++ L +G+ ++ YRY Sbjct: 153 NDRLRKGQ----FIHRYRY 167 >BACC1 Q735P8 (Q735P8) Acetyltransferase, GNAT family Length = 149 Score = 32.3 bits (72), Expect = 3.3 Identities = 31/122 (25%), Positives = 49/122 (40%), Gaps = 21/122 (17%) Query: 50 YTEPWEDEVFRVIIE--YNNVP--IGYGQIYKMYDELY---TDYHYPKTDEIVYGMDQFI 102 Y P +E V E YN+ P +G+ + K +L Y K D+IV G F 
Sbjct: 9 YIVPCTEESIHVANEQGYNSGPHIVGHVENVKQDKDLLPWGAWYVIRKEDDIVLGDIGFK 68 Query: 103 GEPN--------------YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAY 148 G+PN YW+KG T ++ + + + +I + N +IR Sbjct: 69 GKPNEEHTVEVGYGFIEKYWNKGYATEAVRELINWAFQTGEVEMIIAETLLENESSIRVL 128 Query: 149 QK 150 +K Sbjct: 129 EK 130 >AQUAE YZ34_AQUAE (O66423) Hypothetical protein AA34 Length = 318 Score = 32.3 bits (72), Expect = 3.3 Identities = 25/71 (35%), Positives = 37/71 (52%), Gaps = 5/71 (7%) Query: 414 EEEIGTNFGEDILRMYGNIDIE-KAKEYQDIVEEYYPI----ETIVYGIKNIKQEFIENG 468 EE IG GE + + + E KAKE + V++ I ET+ Y IK I +E I + Sbjct: 215 EELIGETLGELLEKEIEKLVAEEKAKEIEGKVKKLKEIVSWFETLPYEIKQIAKEVISDN 274 Query: 469 RKEIYKRTYKD 479 +I ++ YKD Sbjct: 275 VLDIAEKFYKD 285 >YERPE Q8ZCX6 (Q8ZCX6) Spermidine acetyltransferase (EC 2.3.1.57) (Spermidine N1-acetyltransferase) Length = 181 Score = 32.0 bits (71), Expect = 4.4 Identities = 21/69 (30%), Positives = 29/69 (42%) Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 Q I +P + KG KL E+ N + L K N +AI Y K GF I +L Sbjct: 87 QIIIDPTHQGKGYAGAAAKLAMEYGFSVLNLYKLYLIVDKENEKAIHIYSKLGFEIEGEL 146 Query: 160 PEHELHEGK 168 + G+ Sbjct: 147 KQEFFINGE 155 >STRR6 Q8DNN2 (Q8DNN2) Hypothetical protein spr1627 Length = 148 Score = 32.0 bits (71), Expect = 4.4 Identities = 14/58 (24%), Positives = 30/58 (51%) Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE 157 +F P +G+G++ ++ + + +++ L+ + N RA YQK GF I++ Sbjct: 76 RFFINPQKQEQGLGSQALRKFVSLAFENEDIDSISLNVFEANQRAQNLYQKEGFEIVQ 133 >STAAM Q99RQ8 (Q99RQ8) Similar to transcription repressor of sporulation, septation and degradation PaiA Length = 171 Score = 32.0 bits (71), Expect = 4.4 Identities = 20/65 (30%), Positives = 35/65 (53%), Gaps = 4/65 (6%) Query: 111 GIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKE 170 G G++ I+L E + +E N + + L ++NPRA Y++ GF+++ EH G Sbjct: 106 GRGSQLIELA-EKIAQEHNKHKIWLGVWEHNPRAQAFYKRHGFKVV---GEHHFQTGDVT 161 Query: 171 DCYLM 175 D L+ Sbjct: 162 DTDLI 166 >LACLA 
Q9CJA2 (Q9CJA2) Acetyl transferase Length = 162 Score = 32.0 bits (71), Expect = 4.4 Identities = 21/69 (30%), Positives = 32/69 (46%), Gaps = 2/69 (2%) Query: 110 KGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKK 169 KG+ T I +F KKE + + +NP A++ Y K GF L + +G+ Sbjct: 89 KGVATTLINFFIDFAKKE-GFKKITIQVMGSNPAALKLYNKLGFVEEGRLKKEFFIDGEY 147 Query: 170 -EDCYLMEY 177 +DC L Y Sbjct: 148 IDDCILAFY 156 >CLOTE Q892J2 (Q892J2) Conserved protein Length = 218 Score = 32.0 bits (71), Expect = 4.4 Identities = 39/154 (25%), Positives = 57/154 (37%), Gaps = 21/154 (13%) Query: 219 VNNEYIFK-------------TKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYI 265 VNN IFK T F++ K +G Y LN N+ N S + Sbjct: 9 VNNTPIFKCNYCGHCSKEIEATSFTSVKNRGCCWYFPKYTLLNIKNILNIGKENFIISLL 68 Query: 266 SDELSILG--YKEIKGTFLTPEIYSTMSEEEQNLLKRDIASFLRQMHGLDYTDISECTID 323 +++ S + + E+KG+F E Y M E E D F R+ + C++D Sbjct: 69 NNKNSNISSYFIEVKGSFEEEEYYKFMRENEYTESSFDYKLFFRK---CSFVTDKGCSLD 125 Query: 324 NKQNVLEEYILLRETIYNDLTDIEKDYIESFMER 357 + L I N +KDY ER Sbjct: 126 FSLRPHPCNLYLCRNIIN---TCDKDYSSFSRER 156 >BRAJA Q89UA8 (Q89UA8) Hypothetical acetyltransferase Length = 148 Score = 32.0 bits (71), Expect = 4.4 Identities = 19/56 (33%), Positives = 33/56 (58%), Gaps = 5/56 (8%) Query: 98 MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153 +DQ + +P W G+ +L+ E K+ + + V L +K+N RAIR Y+++GF Sbjct: 76 LDQLVVDPASW----GSDAARLLVEEAKR-LSPSGVTLLVNKDNTRAIRFYERNGF 126 >BACSU O34558 (O34558) YopR protein Length = 325 Score = 32.0 bits (71), Expect = 4.4 Identities = 17/45 (37%), Positives = 26/45 (57%), Gaps = 5/45 (11%) Query: 211 GYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNV 255 G +LV N+Y+ KTK ++NK G A + F+ TNL T++ Sbjct: 203 GQTKEVFLVENDYVVKTKRTSNKGDGQASK-----FVITNLITDI 242 >BACAN Q81R63 (Q81R63) Hypothetical protein Length = 217 Score = 32.0 bits (71), Expect = 4.4 Identities = 15/45 (33%), Positives = 27/45 (60%) Query: 324 NKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNATTVFEGKK 368 N+ NVL E + +E + L++ +KDYI+S E++ T E ++ Sbjct: 141 
NQMNVLNESVTTQEELQRYLSENKKDYIKSVAEKVYQTATEEKRE 185 >VIBPA Q87MD0 (Q87MD0) Acetyltransferase-related protein Length = 168 Score = 31.6 bits (70), Expect = 5.7 Identities = 15/53 (28%), Positives = 26/53 (49%) Query: 101 FIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153 FI + YW KG+ T +K F +E + V + + N+ ++ +K GF Sbjct: 86 FIFDKAYWGKGLATEALKAFFPKACRELELHKVKANVNSNHQASMAVLEKLGF 138 >STRR6 Q8DND0 (Q8DND0) Transcriptional activator Length = 299 Score = 31.6 bits (70), Expect = 5.7 Identities = 19/81 (23%), Positives = 40/81 (49%), Gaps = 12/81 (14%) Query: 295 QNLLKRDIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEK------ 348 Q L+++D+A F+ Q+ L + + K +E Y ++R+T+ + + +EK Sbjct: 167 QMLIRKDLAKFINQIEKLMLFLLEQ----KKVTQIENYFIIRDTLISGMCCLEKVGVTDC 222 Query: 349 --DYIESFMERLNATTVFEGK 367 DY+ E ++ T ++ K Sbjct: 223 FNDYLSCLQEIMDKTQDYQKK 243 >OCEIH Q8CUS9 (Q8CUS9) Hypothetical conserved protein Length = 161 Score = 31.6 bits (70), Expect = 5.7 Identities = 20/76 (26%), Positives = 36/76 (47%), Gaps = 6/76 (7%) Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164 P++ GIG+ + + + + L+ +NN +A+ Y GF II+D E+ Sbjct: 92 PSHQGIGIGSA----LLHYGVNQLRPREIQLNVEQNNIKALDFYTSKGFEIIKDFQEN-- 145 Query: 165 HEGKKEDCYLMEYRYD 180 +G D Y M ++ D Sbjct: 146 FDGHLLDTYRMSWKLD 161 >LISIN Q92E28 (Q92E28) Lin0633 protein Length = 143 Score = 31.6 bits (70), Expect = 5.7 Identities = 20/80 (25%), Positives = 37/80 (46%), Gaps = 1/80 (1%) Query: 75 IYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVI 134 +Y ++ + Y + DE + F+ + KG GT+ ++ + + L KE + Sbjct: 55 LYSIFTDQKIGYLWFHVDEKHAFIYDFVIFETFRGKGFGTKTLEAL-DVLAKEMGITKIE 113 Query: 135 LDPHKNNPRAIRAYQKSGFR 154 L +N AI+ Y K GF+ Sbjct: 114 LHVFAHNQTAIKLYDKVGFK 133 >LACPL Q88U49 (Q88U49) Glucose-6-phosphate 1-dehydrogenase (EC 1.1.1.49) (G6PD) Length = 494 Score = 31.6 bits (70), Expect = 5.7 Identities = 26/95 (27%), Positives = 45/95 (47%), Gaps = 8/95 (8%) Query: 151 SGF-RIIEDLPEHELHEGKKEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIG 209 +GF 
R+I + P +E KE + +++N Y I+HY + +I I Sbjct: 140 NGFNRVIIEKPFGHDYESAKELNDQLTATFNENQI------YRIDHYLGKEMIQNITAIR 193 Query: 210 SGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIY 244 G + L NN YI + + ++K G +E+A+Y Sbjct: 194 FGNNIWESLWNNRYIDNVQITLSEKLG-VEERAVY 227 >CORDI Q6NHQ0 (Q6NHQ0) Putative acetyltransferase Length = 163 Score = 31.6 bits (70), Expect = 5.7 Identities = 18/53 (33%), Positives = 27/53 (50%), Gaps = 1/53 (1%) Query: 104 EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRII 156 EP Y KG+G+ + E L + A + L NPRA + Y++ GF+ I Sbjct: 98 EPRYRGKGVGSILLNKSLE-LARTLGAPGLSLSVDDGNPRAKKLYERLGFQHI 149 >BURMA Q62J98 (Q62J98) Ribosomal-protein-alanine acetyltransferase (EC 2.3.1.128) Length = 165 Score = 31.6 bits (70), Expect = 5.7 Identities = 15/43 (34%), Positives = 25/43 (58%), Gaps = 1/43 (2%) Query: 111 GIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153 G+G ++ + ER + V+L+ +NPRAIR Y++ GF Sbjct: 87 GVGLALLREAVRIARAER-LDGVLLEVRPSNPRAIRLYERFGF 128 >THETN Q8R764 (Q8R764) LysM-repeat proteins and domains Length = 508 Score = 31.2 bits (69), Expect = 7.4 Identities = 31/141 (21%), Positives = 53/141 (37%), Gaps = 23/141 (16%) Query: 235 KGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEE 294 KGY E F+ E + + +Y+S E++ L KE++ F ++E+E Sbjct: 381 KGYRDEYPFRTFVEIEGEVGEVLTEVSTAYVSYEINSL--KELEFKFAIDSCVEVLTEKE 438 Query: 295 QNLLKRDIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESF 354 L+ D+ E + + V I+ L DI K Y + Sbjct: 439 MTLI----------------YDLKEIEMPRGEEVRHSIIIYMVQKGESLWDIAKRYRVNV 482 Query: 355 MERLNAT-----TVFEGKKCL 370 + + A VFEG+K + Sbjct: 483 EDLITANDLKEDKVFEGEKLI 503 >STRR6 Q8DPX8 (Q8DPX8) Hypothetical protein spr0952 Length = 253 Score = 31.2 bits (69), Expect = 7.4 Identities = 22/106 (20%), Positives = 48/106 (45%), Gaps = 12/106 (11%) Query: 261 EYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKRDIASFLRQMHGLD------- 313 E SY+S +++ Y+E+ + P +E + + + R++ L Sbjct: 148 ELSYLS---TLIRYEELY--IINPNQARATPKEHHDFIVNHLVDNTRKLEELAIFERIQI 202 Query: 314 YTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLN 
359 Y C D+K+N +L+E ++ + + +EK+ ++ +RLN Sbjct: 203 YQRDRSCVYDSKENTTSAADVLQELLFGEWSQVEKEMLQVGEKRLN 248 >STRMU Q8DVT1 (Q8DVT1) Putative ribosomal-protein-alanine acetyltransferase (EC 2.3.1.128) Length = 144 Score = 31.2 bits (69), Expect = 7.4 Identities = 28/124 (22%), Positives = 51/124 (41%), Gaps = 20/124 (16%) Query: 65 YNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYG---MDQFIGEPN---------YWSKGI 112 Y P QI + L DY + D+ + G + +GE Y +G+ Sbjct: 22 YQVSPWSQKQILTDMNRLDVDYFFAYDDKEIVGFLSIQHLVGELELTNIAIKKAYQGQGL 81 Query: 113 GTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDC 172 G++ + ++ ++ + L+ +N A YQK GFR + ++ + KED Sbjct: 82 GSQLLAML------TKDELPIFLEVRASNQAAQALYQKFGFRSLTTRKDY--YHNPKEDA 133 Query: 173 YLME 176 LM+ Sbjct: 134 ILMK 137 >SALTI SPED_SALTI (Q8Z9E3) S-adenosylmethionine decarboxylase proenzyme (EC 4.1.1.50) (AdoMetDC) (SamDC) [Contains: S-adenosylmethionine decarboxylase beta chain; S-adenosylmethionine decarboxylase alpha chain] Length = 264 Score = 31.2 bits (69), Expect = 7.4 Identities = 19/60 (31%), Positives = 29/60 (48%) Query: 169 KEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTK 228 + D ++YR +V MK+ I+H ++ + E + S YD V V E IF TK Sbjct: 157 ESDIVTIDYRVRGFTRDVNGMKHFIDHEINSIQNFMSEDMKSLYDMVDVNVYQENIFHTK 216 >SALTY SPED_SALTY (Q8ZRS4) S-adenosylmethionine decarboxylase proenzyme (EC 4.1.1.50) (AdoMetDC) (SamDC) [Contains: S-adenosylmethionine decarboxylase beta chain; S-adenosylmethionine decarboxylase alpha chain] Length = 264 Score = 31.2 bits (69), Expect = 7.4 Identities = 19/60 (31%), Positives = 29/60 (48%) Query: 169 KEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTK 228 + D ++YR +V MK+ I+H ++ + E + S YD V V E IF TK Sbjct: 157 ESDIVTIDYRVRGFTRDVNGMKHFIDHEINSIQNFMSEDMKSLYDMVDVNVYQENIFHTK 216 >SALCH Q57T90 (Q57T90) S-adenosylmethionine decarboxylase, proenzyme Length = 264 Score = 31.2 bits (69), Expect = 7.4 Identities = 19/60 (31%), Positives = 29/60 (48%) Query: 169 KEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTK 228 + 
D ++YR +V MK+ I+H ++ + E + S YD V V E IF TK Sbjct: 157 ESDIVTIDYRVRGFTRDVNGMKHFIDHEINSIQNFMSEDMKSLYDMVDVNVYQENIFHTK 216 >RICCN Q92JP8 (Q92JP8) Cell surface antigen Length = 1902 Score = 31.2 bits (69), Expect = 7.4 Identities = 24/90 (26%), Positives = 41/90 (45%), Gaps = 2/90 (2%) Query: 197 FDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVK 256 F N K + E+I S S+ F + + K + K+K Y++ +T +++NVK Sbjct: 1678 FKNSKNNDKELINSHVVSIYGQKELPKNFALQALVSASKNFIKDKTTYSYGDTKIKSNVK 1737 Query: 257 IPNIEYSYISDELSILGYKEIKGTFLTPEI 286 N +SY ++ L Y +TP I Sbjct: 1738 HRN--HSYNAEALLHYNYLLQSKLVITPNI 1765 >NEIGI Q5FAH1 (Q5FAH1) Hypothetical protein Length = 177 Score = 31.2 bits (69), Expect = 7.4 Identities = 18/45 (40%), Positives = 25/45 (55%), Gaps = 7/45 (15%) Query: 215 VAYLVNNEYI-------FKTKFSTNKKKGYAKEKAIYNFLNTNLE 252 + YL++NE + FK FSTN+KK EK I FL N++ Sbjct: 69 IDYLISNEILIVRTKFSFKNIFSTNEKKYKEIEKEINKFLYKNMD 113 >LISIN Q92DJ7 (Q92DJ7) Lin0816 protein Length = 185 Score = 31.2 bits (69), Expect = 7.4 Identities = 16/62 (25%), Positives = 28/62 (45%), Gaps = 1/62 (1%) Query: 98 MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE 157 +D + Y G+GT + + E + V L+ K NP A R Y++ GF + Sbjct: 113 LDSIVTNEKYRGHGVGTALLAKLTEIAAND-GEKVVGLNCDKGNPHAKRLYERLGFHVTG 171 Query: 158 DL 159 ++ Sbjct: 172 EI 173 >LACJO Q74J74 (Q74J74) Hypothetical protein Length = 150 Score = 31.2 bits (69), Expect = 7.4 Identities = 21/58 (36%), Positives = 28/58 (48%), Gaps = 5/58 (8%) Query: 104 EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161 +P Y SKGI T IK L++ V L+ +N RA Y+K GF + L E Sbjct: 80 DPIYQSKGIATELIKKALTELERP-----VRLEVFTDNERAKALYRKFGFERVNTLTE 132 >GEOSL Q74A59 (Q74A59) Sensory box histidine kinase Length = 1053 Score = 31.2 bits (69), Expect = 7.4 Identities = 41/188 (21%), Positives = 78/188 (41%), Gaps = 34/188 (18%) Query: 176 EYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGS------------GYDSVAYLVNNE- 222 E+RY D V+A+K E YF ++GS G D LV+ E Sbjct: 106 
EHRYGD----VEALKSRYEAYFRKATELYPRVLGSTDTFLSGEIARLGADGRLILVDFER 161 Query: 223 ----YIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIK 278 Y+ + + + A++ +IY F+ + + P I +++++ L I +E++ Sbjct: 162 MSRDYVTSVEHQIERNRALARDTSIYLFVLFGMVVLLAAPAI--TFVANRLLIRPLEELR 219 Query: 279 GTFLTPEIYSTMSEEEQNLLKRDI--------ASFLRQMHGLDYTDISECTIDNKQNVLE 330 G + ++ S + L D ASF + GL T +S +DN + Sbjct: 220 GMVTS---FAGGSLDLSGLPDYDAGDEIGSLCASFRSMVEGLQETTVSRDYVDNIIESMS 276 Query: 331 EYILLRET 338 + +++ +T Sbjct: 277 DCLIVVDT 284 >ENTFA Q836Z2 (Q836Z2) Acetyltransferase, GNAT family Length = 173 Score = 31.2 bits (69), Expect = 7.4 Identities = 19/75 (25%), Positives = 33/75 (44%), Gaps = 1/75 (1%) Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE-HELH 165 YW G+G+ ++ + + + + L N RAI Y+K GF +P + Sbjct: 99 YWGYGLGSILMEELIRWAHESHVIRRLELTVQDRNQRAIHVYKKLGFETEAIMPRGAKTD 158 Query: 166 EGKKEDCYLMEYRYD 180 +G+ D +LM D Sbjct: 159 QGEFLDVHLMRLLID 173 >ENTFA Q82YJ4 (Q82YJ4) Toxin ABC transporter, ATP-binding/permease protein Length = 700 Score = 31.2 bits (69), Expect = 7.4 Identities = 33/166 (19%), Positives = 68/166 (40%), Gaps = 7/166 (4%) Query: 94 IVYGMDQFIG-EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSG 152 +V ++QF + Y S + + ++ I ++ E +ILD + R R K G Sbjct: 439 VVSSLNQFGSFQAQYESMQVASHRLESILINMENENVCGEIILDKKIESIRCKRVSIKKG 498 Query: 153 FRIIEDLPEHELHEGKKEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGY 212 ++ D E++ GK + + +T +K++ L + Y +++I+I Sbjct: 499 DTLLLDTVNCEIYRGK--NLSIRGENGSGKSTLIKSLVRLDDDYRGQILINNIDIKKINL 556 Query: 213 D----SVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETN 254 D + ++ N + N G+ +I+N L + E N Sbjct: 557 DCLRSKLVFVEPNPKFLEGTIRDNLLLGHKVPNSIFNKLIRDFEIN 602 >CLOAB Q97M70 (Q97M70) Predicted metal-dependent hydrolase Length = 259 Score = 31.2 bits (69), Expect = 7.4 Identities = 13/39 (33%), Positives = 23/39 (58%), Gaps = 5/39 (12%) Query: 47 KKHYTEPWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTD 85 KK+Y E W ++ + +EY Y + YK++DE+Y + Sbjct: 145 KKNYAEKWYKKIAAIELEYL-----YNEKYKIFDEIYDE 178 >CLOAB 
Q97HI5 (Q97HI5) Predicted flavodoxin Length = 180 Score = 31.2 bits (69), Expect = 7.4 Identities = 23/76 (30%), Positives = 35/76 (46%), Gaps = 2/76 (2%) Query: 119 LIFEFLKKERNANAVILDPHKNNP-RAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEY 177 L+F L K R AN IL + NP RA+ YQ S I +++ +++ +G Y + Sbjct: 22 LMFSRLNKPRQANQKILKAKEANPKRALIVYQPSMSSITDEV-ANQIAKGLNTQGYEVTL 80 Query: 178 RYDDNATNVKAMKYLI 193 Y N + Y I Sbjct: 81 NYPSNHLSTNVSDYSI 96 >BRAJA Q89R32 (Q89R32) Acyl-CoA dehydrogenase Length = 455 Score = 31.2 bits (69), Expect = 7.4 Identities = 13/35 (37%), Positives = 23/35 (65%) Query: 235 KGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDEL 269 K AK++ ++NF + ET + N++Y+YI+ EL Sbjct: 107 KNKAKKEGLWNFFLPDDETGQGLKNLDYAYIASEL 141 >BACC1 Q735F5 (Q735F5) Streptothricin acetyltransferase, putative Length = 184 Score = 31.2 bits (69), Expect = 7.4 Identities = 30/122 (24%), Positives = 54/122 (44%), Gaps = 6/122 (4%) Query: 37 RDKKYTLE---SLKKHYTEPWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDE 93 R +YT+E S +K Y + +E+ EY N P I +++++ K Sbjct: 38 RHIEYTVEDVPSYEKSYLQNDNEEL--AYNEYINKPNQIIYIALLHNQIIGFIVLKKNWN 95 Query: 94 IVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153 ++ + Y + G+G R + ++ K E N ++L+ NN A + Y+K GF Sbjct: 96 HYAYIEDITVDKKYRTLGVGKRLVVQAKQWAK-EGNMPGIMLETQNNNVAACKFYEKCGF 154 Query: 154 RI 155 I Sbjct: 155 VI 156 >BACAN Q81YS9 (Q81YS9) Acetyltransferase, GNAT family Length = 288 Score = 31.2 bits (69), Expect = 7.4 Identities = 13/44 (29%), Positives = 25/44 (56%) Query: 110 KGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153 KG+G R ++ +++ + + L + NN RA++ Y+K GF Sbjct: 233 KGVGERLLQAAIQYIFSFQGMREIELCLNTNNDRAVKLYKKVGF 276 >VIBPA Q87HD3 (Q87HD3) Hypothetical protein VPA1032 Length = 265 Score = 30.8 bits (68), Expect = 9.7 Identities = 19/64 (29%), Positives = 31/64 (48%), Gaps = 3/64 (4%) Query: 294 EQNLLKRDIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEK---DY 350 E L + + SF M DY +SE + ++ E+ L +T ++D+ DI+ Y Sbjct: 96 ENEELTKSLVSFNLSMVSQDYEQVSELALQIEELRQEKGFLANDTSFSDVRDIDDRLGGY 155 Query: 351 
IESF 354 IE F Sbjct: 156 IELF 159 >VIBCH Q9KL03 (Q9KL03) Spermidine n1-acetyltransferase Length = 173 Score = 30.8 bits (68), Expect = 9.7 Identities = 20/83 (24%), Positives = 35/83 (42%), Gaps = 3/83 (3%) Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 Q I P + KG I ++ N + + L NP+A+ Y++ GF L Sbjct: 86 QIIIAPEHQGKGFARTLINRALDYSFTILNLHKIYLHVAVENPKAVHLYEECGFVEEGHL 145 Query: 160 PEHELHEGKKED---CYLMEYRY 179 E G+ +D Y+++ +Y Sbjct: 146 VEEFFINGRYQDVKRMYILQSKY 168 >THEMA SYE2_THEMA (Q9X2I8) Glutamyl-tRNA synthetase 2 (EC 6.1.1.17) (Glutamate--tRNA ligase 2) (GluRS 2) Length = 487 Score = 30.8 bits (68), Expect = 9.7 Identities = 16/44 (36%), Positives = 22/44 (50%) Query: 325 KQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNATTVFEGKK 368 K N L + + ND + EKDY+E F++R A V E K Sbjct: 369 KVNTLSQLYDIMYPFMNDDYEYEKDYVEKFLKREEAERVLEEAK 412 >THETN ISPH_THETN (Q8RA76) 4-hydroxy-3-methylbut-2-enyl diphosphate reductase (EC 1.17.1.2) Length = 288 Score = 30.8 bits (68), Expect = 9.7 Identities = 14/55 (25%), Positives = 31/55 (56%) Query: 115 RYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKK 169 R I++ +E L K+++ L +NP+ ++ ++ G R+IE+ +L +G + Sbjct: 17 RAIEIAYEELNKQKDTRLYTLGEIIHNPQVVKDLEEKGVRVIEEEELEKLLKGDR 71 >STRCO Q9S2L5 (Q9S2L5) Hypothetical protein SCO1988 Length = 183 Score = 30.8 bits (68), Expect = 9.7 Identities = 18/47 (38%), Positives = 26/47 (55%), Gaps = 5/47 (10%) Query: 111 GIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE 157 GIG R++ L KER A+ + L + N A R Y++ GFR +E Sbjct: 119 GIGDRFVALA-----KERRADGLSLWTFQVNAPARRFYERHGFRAVE 160 >STRP1 Q99XX8 (Q99XX8) Putative pullulanase Length = 1165 Score = 30.8 bits (68), Expect = 9.7 Identities = 37/171 (21%), Positives = 62/171 (36%), Gaps = 31/171 (18%) Query: 83 YTDYHY----PKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPH 138 YT Y+Y + E V +D + W+ T IK A A +DP Sbjct: 473 YTGYYYLYEITRGQEKVMVLDPYAKSLAAWNDATATDDIK----------TAKAAFIDPS 522 Query: 139 KNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYDDNATNVKAMKYLIEHYFD 198 K P + 
+ + F+ K+ED + E D T+ KA++ + H F Sbjct: 523 KLGPTGLDFAKINNFK-------------KREDAIIYEAHVRD-FTSDKALEGKLTHPFG 568 Query: 199 NFK--VDSIEIIGS-GYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNF 246 F V+ ++ + G V L Y + + ++ Y YN+ Sbjct: 569 TFSAFVEQLDYLKDLGVTHVQLLPVLSYFYANELDKSRSTAYTSSDNNYNW 619 >STRP1 Q99YF0 (Q99YF0) Hypothetical protein SPy1734 (Acetyltransferase) (EC 2.3.1.-) Length = 174 Score = 30.8 bits (68), Expect = 9.7 Identities = 17/65 (26%), Positives = 34/65 (52%), Gaps = 1/65 (1%) Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHE 166 Y GIG +++ ++ ++ ++ LD N +AI Y+K GFR IE + ++++ Sbjct: 99 YRGYGIGQLLLEIALDWAEENPYIESLKLDVQVRNTKAIYLYKKYGFR-IESMRKNDIKS 157 Query: 167 GKKED 171 +D Sbjct: 158 KNGDD 162 >STAES Q8CTP7 (Q8CTP7) Hypothetical protein SE0368 Length = 158 Score = 30.8 bits (68), Expect = 9.7 Identities = 27/108 (25%), Positives = 44/108 (40%), Gaps = 18/108 (16%) Query: 49 HYTEPWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYW 108 H + +++F V E + + +G+ + +ELY HY + P Sbjct: 48 HLKKRLNEQLFLVAEEDSEI-VGFAN-FIYGEELYLSAHYVR--------------PESQ 91 Query: 109 SKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRII 156 +G GTR ++ + K + V L+ NN I YQ GF II Sbjct: 92 HRGYGTRLLEAGLKRFKDQYET--VYLEVDNNNSNGIEYYQNHGFEII 137 >STAES COAD_STAES (Q8CSZ5) Phosphopantetheine adenylyltransferase (EC 2.7.7.3) (Pantetheine-phosphate adenylyltransferase) (PPAT) (Dephospho-CoA pyrophosphorylase) Length = 161 Score = 30.8 bits (68), Expect = 9.7 Identities = 21/111 (18%), Positives = 50/111 (45%), Gaps = 13/111 (11%) Query: 185 NVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIY 244 +VK + + H+F+ VD + +G+ +++ ++ + ++ KK Sbjct: 59 SVKHLPNIQVHHFNGLLVDFCDQVGAKTIIRGLRAVSDFEYELRLTSMNKK--------- 109 Query: 245 NFLNTNLETNVKIPNIEYSYISDEL--SILGYKEIKGTFLTPEIYSTMSEE 293 LN+N+ET + + YS+IS + + Y+ F+ P + + ++ Sbjct: 110 --LNSNIETMYMMTSANYSFISSSIVKEVAAYQADISPFVPPHVERALKKK 158 >MYCLEH Y3433_MYCTU (O06250) Hypothetical protein Rv3433c/MT3539 Length = 473 Score = 30.8 bits (68), Expect = 9.7 
Identities = 12/30 (40%), Positives = 23/30 (76%) Query: 130 ANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 A+AV+L+P + + +A+ A+ KSG R++E + Sbjct: 81 ADAVLLNPDRTHRKALAAFTKSGGRLVESV 110 >MYCTU Y3433_MYCTU (O06250) Hypothetical protein Rv3433c/MT3539 Length = 473 Score = 30.8 bits (68), Expect = 9.7 Identities = 12/30 (40%), Positives = 23/30 (76%) Query: 130 ANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 A+AV+L+P + + +A+ A+ KSG R++E + Sbjct: 81 ADAVLLNPDRTHRKALAAFTKSGGRLVESV 110 >MYCBO Q7TWI6 (Q7TWI6) Hypothetical protein Mb3463c Length = 473 Score = 30.8 bits (68), Expect = 9.7 Identities = 12/30 (40%), Positives = 23/30 (76%) Query: 130 ANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 A+AV+L+P + + +A+ A+ KSG R++E + Sbjct: 81 ADAVLLNPDRTHRKALAAFTKSGGRLVESV 110 >LISMO Q8Y5C4 (Q8Y5C4) Lmo2141 protein Length = 157 Score = 30.8 bits (68), Expect = 9.7 Identities = 28/102 (27%), Positives = 47/102 (46%), Gaps = 9/102 (8%) Query: 76 YKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVIL 135 YK L ++ H + D V+ P+Y GIG + + E + +E+ + L Sbjct: 63 YKSPIPLASNKHVAEIDIAVH--------PDYQRAGIGQLLMDKMKE-VAREKGYIKIAL 113 Query: 136 DPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEY 177 N +AIR Y+K+GF+ L + + +G+ D LM Y Sbjct: 114 RVLSINQKAIRFYEKNGFKQEGLLEKEFIIQGEFVDDILMAY 155 >LISIN Q929Z8 (Q929Z8) Lin2125 protein Length = 231 Score = 30.8 bits (68), Expect = 9.7 Identities = 25/89 (28%), Positives = 38/89 (42%), Gaps = 15/89 (16%) Query: 8 ICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKK----YTLESLKKHYTEP--WEDEVFRV 61 + ++TL+ P + WL DE F G Y L ++ +T P W+ V + Sbjct: 107 LVLKTLVARTRPDSVNWLIDESGFSFPSGHATATAVFYGLAAMFLIFTVPKMWQKIVIGI 166 Query: 62 IIEYNNVPIGYGQI-YKMYDELYTDYHYP 89 IGYG I + MY +Y H+P Sbjct: 167 --------IGYGFILFVMYTRVYLGVHFP 187 >ENTFA Q837Z5 (Q837Z5) Acetyltransferase, GNAT family Length = 184 Score = 30.8 bits (68), Expect = 9.7 Identities = 15/45 (33%), Positives = 24/45 (53%) Query: 110 KGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154 +G G + LI +F E + + L + NN +AI Y+K GF+ Sbjct: 104 QGCGFEAVSLICKFAFYELGLHKIRLAVNSNNQKAIHVYEKVGFK 
148 >ENTFA Q831M9 (Q831M9) Ribosomal-protein-alanine acetyltransferase, putative Length = 154 Score = 30.8 bits (68), Expect = 9.7 Identities = 15/45 (33%), Positives = 28/45 (62%), Gaps = 1/45 (2%) Query: 110 KGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154 +GIG + +K E++K R+ + L+ ++N A + Y+K+GFR Sbjct: 83 QGIGCQLMKAFKEYVKS-RDITQIFLEVRESNILAQKLYEKTGFR 126 >CLOTE Q895T0 (Q895T0) Acetyltransferase (EC 2.3.1.-) Length = 165 Score = 30.8 bits (68), Expect = 9.7 Identities = 17/67 (25%), Positives = 37/67 (55%), Gaps = 3/67 (4%) Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHE 166 Y +G+GT+ + I + L K++ + + D + NP+ +QK G+ + ++ + L++ Sbjct: 97 YRHQGVGTKLLSYI-KTLAKDKKIHLIKSDTYSLNPKMNALFQKCGYEKVGEI--NLLNK 153 Query: 167 GKKEDCY 173 K +CY Sbjct: 154 PYKFNCY 160 >CLOPE Q8XMF9 (Q8XMF9) Hypothetical protein CPE0730 Length = 154 Score = 30.8 bits (68), Expect = 9.7 Identities = 39/162 (24%), Positives = 66/162 (40%), Gaps = 30/162 (18%) Query: 28 ERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYH 87 ERVLE K ++ + K + E + + + EY +K E+ D + Sbjct: 3 ERVLEIR--EPKNCEIDDIMKIWLESTVEAHYFIEEEY----------WKKNYEVVRDIY 50 Query: 88 YPKTDEIVYGMD------------QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVIL 135 P VY + FIG +K G+ K + E++K + + L Sbjct: 51 IPMAKTFVYCDEGKINGFISIIDSNFIGALFVHTKSQGSGIGKSLLEYVKNKYEN--IEL 108 Query: 136 DPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEY 177 +K+N +A+ Y+K F+II++ + G E YLM Y Sbjct: 109 AVYKDNKKAVEFYKKHDFKIIKEQENED--SGHLE--YLMSY 146 >BRUSU Q8FY73 (Q8FY73) Outer membrane autotransporter Length = 1593 Score = 30.8 bits (68), Expect = 9.7 Identities = 18/64 (28%), Positives = 32/64 (50%), Gaps = 3/64 (4%) Query: 414 EEEIGTNFGEDILRMYGNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENGRKEIY 473 E + N + ++ G + ++ +QD + + T +YGI NI QEF+ NGR + Sbjct: 1478 EANVSLNDSDSLIGRAG-VALDYRNAWQDDAGQI--VHTNIYGIANIYQEFMGNGRVGVA 1534 Query: 474 KRTY 477 T+ Sbjct: 1535 DTTF 1538 >BACSU YCBJ_BACSU (P42242) Hypothetical protein ycbJ Length = 306 Score = 30.8 bits (68), Expect = 9.7 
Identities = 36/179 (20%), Positives = 66/179 (36%), Gaps = 26/179 (14%) Query: 300 RDIASFLRQMHGLDYTDISECTI------DNKQNVLEEYILLRETIYNDLTDIEKDYIES 353 R +A L ++HG D + I D +Q + + ++ + + E Sbjct: 129 RTLADILAELHGTDQISAGQSGIEVIRPEDFRQMTADSMVDVKNKL-----GVSTTLWER 183 Query: 354 FMERLNATTVFEGKKCLCHNDFSCNHLLLDGNNRLTXXXXXXXXXXXXEYCDFIYLLEDS 413 + + ++ + G L H D H+L+D N R+T DF+ Sbjct: 184 WQKWVDDDAYWPGFSSLIHGDLHPPHILIDQNGRVTGLLDWTEAKVADPAKDFVL----- 238 Query: 414 EEEIGTNFGED----ILRMY---GNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFI 465 T FGE +L Y G K +E+ ++ YP+E ++ ++E I Sbjct: 239 ---YQTIFGEKETARLLEYYDQAGGRIWAKMQEHISEMQAAYPVEIAKLALQTQQEEHI 294 >BACHD Q9KE57 (Q9KE57) BH1001 protein Length = 448 Score = 30.8 bits (68), Expect = 9.7 Identities = 32/119 (26%), Positives = 54/119 (45%), Gaps = 21/119 (17%) Query: 272 LGYKEIKGTFLTPEIYSTMSEEEQNLL------------KRDIASFLRQMHGLDYTDISE 319 LG+K +GT L ++ TMS EE + D F +++G + T ++E Sbjct: 306 LGFKVERGTLLESKVELTMSFEEDGISFDVGMSVDSTYNYDDAVEF--KLYGQERTTLTE 363 Query: 320 CTIDNKQNVLEEYILLRETIYND-LTDIEKDYIESFM--ERLNATTVFEGKKCLCHNDF 375 +D ++ E E++ ND L D ++DY E + E L E ++ + H DF Sbjct: 364 AELD---DLTYEINWELESLVNDLLADFQEDYYEEELSEEDLALLAAIEAQE-VSHEDF 418 >BACC1 Q73CD9 (Q73CD9) Glycerol-3-phosphate dehydrogenase, aerobic (EC 1.1.99.5) Length = 560 Score = 30.8 bits (68), Expect = 9.7 Identities = 19/52 (36%), Positives = 26/52 (50%), Gaps = 2/52 (3%) Query: 146 RAYQKSGFRIIEDLPEHEL--HEGKKEDCYLMEYRYDDNATNVKAMKYLIEH 195 R+ ++ F E L + L EG K Y +EYR DD ++ MK IEH Sbjct: 133 RSERRKMFNREETLKKEPLVKQEGLKGGGYYVEYRTDDARLTIEVMKEAIEH 184 >BACAN Q81VL8 (Q81VL8) Oxidoreductase, FAD-binding Length = 471 Score = 30.8 bits (68), Expect = 9.7 Identities = 28/96 (29%), Positives = 41/96 (42%), Gaps = 12/96 (12%) Query: 244 YNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKRDIA 303 Y L+ +K+ N E +L YKE F E + +L + +A Sbjct: 188 YGLFGVILDVTLKLTNDEL--YETHTKMLDYKEYTSYF--KEKVKKDANVRMHLARISVA 243 Query: 304 --SFLRQMHGLDYTDISECTIDNKQNVLEEYILLRE 337 SFLR+M+ DY T+ QN+ EEY L+E Sbjct: 
244 PNSFLREMYVTDY------TLAQNQNMREEYSELKE 273 >BACAN Q81U57 (Q81U57) Glycerol-3-phosphate dehydrogenase, aerobic Length = 560 Score = 30.8 bits (68), Expect = 9.7 Identities = 19/52 (36%), Positives = 26/52 (50%), Gaps = 2/52 (3%) Query: 146 RAYQKSGFRIIEDLPEHEL--HEGKKEDCYLMEYRYDDNATNVKAMKYLIEH 195 R+ ++ F E L + L EG K Y +EYR DD ++ MK IEH Sbjct: 133 RSERRKMFNREETLKKEPLVKQEGLKGGGYYVEYRTDDARLTIEVMKEAIEH 184 >BACAN Q81NU7 (Q81NU7) Acetyltransferase, GNAT family Length = 153 Score = 30.8 bits (68), Expect = 9.7 Identities = 20/79 (25%), Positives = 34/79 (43%), Gaps = 14/79 (17%) Query: 86 YHYPKTDEIVYGMDQFIGEPN--------------YWSKGIGTRYIKLIFEFLKKERNAN 131 Y K D+IV G F G+PN YW+KG T ++ + ++ + Sbjct: 52 YVIRKEDDIVLGDIGFKGKPNEEHTVEVGYGFIEKYWNKGYATEAVQELIDWAFQTGEVE 111 Query: 132 AVILDPHKNNPRAIRAYQK 150 +I + +N +IR +K Sbjct: 112 TIIAETLLDNYGSIRVLEK 130 Database: Blastdata.fdb Posted date: Mar 29, 2006 3:30 PM Number of letters in database: 77,468,597 Number of sequences in database: 240,170 Lambda K H 0.318 0.139 0.409 Gapped Lambda K H 0.267 0.0410 0.140 Matrix: BLOSUM62 Gap Penalties: Existence: 11, Extension: 1 Number of Hits to DB: 72,017,968 Number of Sequences: 240170 Number of extensions: 3196375 Number of successful extensions: 9166 Number of sequences better than 10.0: 203 Number of HSP's better than 10.0 without gapping: 69 Number of HSP's successfully gapped in prelim test: 134 Number of HSP's that attempted gapping in prelim test: 8848 Number of HSP's gapped (non-prelim): 424 length of query: 479 length of database: 77,468,597 effective HSP length: 115 effective length of query: 364 effective length of database: 49,849,047 effective search space: 18145053108 effective search space used: 18145053108 T: 11 A: 40 X1: 16 ( 7.3 bits) X2: 38 (14.6 bits) X3: 64 (24.7 bits) S1: 41 (21.7 bits) S2: 68 (30.8 bits) BLASTP 2.2.10 [Oct-19-2004] From mdehoon at c2b2.columbia.edu Wed Apr 19 12:54:33 2006 From: mdehoon at c2b2.columbia.edu (Michiel De Hoon) Date: Wed, 19 Apr 2006 
12:54:33 -0400 Subject: [BioPython] Need help parsing Blastoutput Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEF7@cgcmail.cgc.cpmc.columbia.edu> The Blast parser fails to read your file because the format of Blast output has changed. If I edit the data file so that it corresponds to the old format (add a space here, remove a blank line there, etc.), the Blast parser reads the file without problems. The easiest solution is to repeat the Blast run, using XML for the output format, and use the Blast XML parser in Biopython to parse the results. A general question is if anybody still needs the parser for Blast text output. Currently, we are confusing our users by having a Blast text parser that tends to break. A broken parser may be worse than no parser. --Michiel. Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032 -----Original Message----- From: Halima Rabiu [mailto:halima at cbio.uct.ac.za] Sent: Wed 4/19/2006 6:15 AM To: Michiel De Hoon Cc: biopython at lists.open-bio.org Subject: RE: [BioPython] Need help parsing Blastoutput Hi Please see the attachment,it part of my Blast output. yes I am try to parse text output from Blast ,I have use another script to run my local blast that I am trying to perse the NCBIStandalone.BlastParser was working fine without hsp.sbject_end which is one of what I need to print out . On checking the class diagrams from cookbook, findout that sbject_end is not included .I just need another way of printing the int(subject end). Thanks for your help Halimah On Tue, 18 Apr 2006, Michiel De Hoon wrote: > Could you also send us the file Enterococcus_out so we can run the script? > > From the script, it looks like you're trying to parse text output from Blast. > While this is possible (in theory), the format of Blast text output tends to > change a lot, thereby breaking the parser in Biopython. 
It is more reliable > to have Blast generate output in XML format, and use the XML parser: > > blast_out = open('my_blast.xml', 'r') > > from Bio.Blast import NCBIXML > > b_parser = NCBIXML.BlastParser() > b_record = b_parser.parse(blast_out) > > See section 3.1.2 in the Biopython cookbook, and section 3.1.4 on how to > generate Blast output in XML. > > --Michiel. > > > > Michiel de Hoon > Center for Computational Biology and Bioinformatics > Columbia University > 1150 St Nicholas Avenue > New York, NY 10032 > > > > -----Original Message----- > From: Halima Rabiu [mailto:halima at cbio.uct.ac.za] > Sent: Tue 4/18/2006 11:06 AM > To: Michiel De Hoon > Cc: biopython at lists.open-bio.org > Subject: RE: [BioPython] Need help parsing Blastoutput > > thanks > please see the attchment a copy of my script and copy of my Blast output > Thanks > > > On Thu, 13 Apr 2006, Michiel De Hoon wrote: > > > Could you send us the script you were using? > > > > --Michiel. > > > > Michiel de Hoon > > Center for Computational Biology and Bioinformatics > > Columbia University > > 1150 St Nicholas Avenue > > New York, NY 10032 > > > > > > > > -----Original Message----- > > From: biopython-bounces at lists.open-bio.org on behalf of Halima Rabiu > > Sent: Thu 4/13/2006 11:07 AM > > To: biopython at lists.open-bio.org > > Subject: [BioPython] Need help parsing Blastoutput > > > > Hi All, > > I have a BLAST output from a local blast > > I need to calculate my % alignment coverage as regard to my subject > > I try parsed the blast output and wanted to print the > > sbjct Start and Sbjct end. but I could not is there anyway I could this > > try to get mach coverage between my querry and subject I dont need > > Identities,but total % alignment for querry or subject. 
> > Thanks
> > Halimah
> >
> > _______________________________________________
> > BioPython mailing list - BioPython at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/biopython
> >
> >
>

From elventear at gmail.com Wed Apr 19 21:02:30 2006
From: elventear at gmail.com (Pepe Barbe)
Date: Wed, 19 Apr 2006 20:02:30 -0500
Subject: [BioPython] Parsing and Creating Dictionaries of GenBank files
Message-ID: <3e73596b0604191802g3807f101jb99e41f95208122e@mail.gmail.com>

Hello,

Following the simple steps in the BioPython cookbook, I wanted to
create a dictionary with the following GenBank file:

ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/Escherichia_coli_K12/NC_000913.gbk

Below you can find what I tried executing and the error I got. I would
appreciate any insight into solving the error and correctly producing
the dictionary.

Thanks!
Pepe

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

>>> dict_file = 'NC_000913.gbk'
>>> index_file = 'NC_000913.idx'
>>> from Bio import GenBank
>>> GenBank.index_file(dict_file, index_file)
Traceback (most recent call last):
  File "", line 1, in ?
  File "/sw/lib/python2.4/site-packages/Bio/GenBank/__init__.py", line 1283, in index_file
    SimpleSeqRecord.create_flatdb([filename], indexname, indexer)
  File "/sw/lib/python2.4/site-packages/Bio/Mindy/SimpleSeqRecord.py", line 152, in create_flatdb
    creator.load(filename, builder = builder, fileid_info = {})
  File "/sw/lib/python2.4/site-packages/Bio/Mindy/BaseDB.py", line 36, in load
    raise TypeError("Cannot identify file as a %s format" %
TypeError: Cannot identify file as a unknown format

From biopython at maubp.freeserve.co.uk Thu Apr 20 08:42:34 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython))
Date: Thu, 20 Apr 2006 13:42:34 +0100
Subject: [BioPython] Parsing and Creating Dictionaries of GenBank files
In-Reply-To: <3e73596b0604191802g3807f101jb99e41f95208122e@mail.gmail.com>
References: <3e73596b0604191802g3807f101jb99e41f95208122e@mail.gmail.com>
Message-ID: <444781BA.8080107@maubp.freeserve.co.uk>

Pepe Barbe wrote:
> Hello,
>
> Following the simple steps in the BioPython cookbook, I wanted to
> create a dictionary with the following GenBank file:
>
> ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/Escherichia_coli_K12/NC_000913.gbk
>
> Below you can find what I tried executing and the error I got. I would
> appreciate any insight into solving the error and correctly producing
> the dictionary.

The cookbook tutorial is a little misleading in that regard. Indexing a
GenBank file only makes sense for those files with multiple genbank
record (i.e. multiple LOCUS lines).

For example, you can get multi-record GenBank files with records for
different genes. These tend to be small records, and the Martel based
indexing code copes fine. It doesn't cope very well with large records
like genomes.

Your example (and in my experience all Bacterial Genomes) have just a
single very large record (which will contain many features).

Does this page help?

http://www2.warwick.ac.uk/fac/sci/moac/currentstudents/peter_cock/python/genbank/

I did suggest a change to the documentation but it looks like no one has
made the change...

http://biopython.org/pipermail/biopython-dev/2005-November/002193.html

I had forgotten to chase this up.

Peter

From alpersoyler at yahoo.com Thu Apr 20 08:59:57 2006
From: alpersoyler at yahoo.com (alper soyler)
Date: Thu, 20 Apr 2006 05:59:57 -0700 (PDT)
Subject: [BioPython] Need help!!!
Message-ID: <20060420125957.72247.qmail@web36804.mail.mud.yahoo.com>

Hi All,

I am new to Biopython and have a question. I want to construct a
phylogenetic profile for one organism's proteins. I want to give my
protein to blast to search one organism's genome (e.g. Homo sapiens)
instead of the whole GenBank database. How can I solve my problem?
Thank you in advance.

regards,

Alper

From cy at cymon.org Thu Apr 20 09:41:46 2006
From: cy at cymon.org (Cymon J. Cox)
Date: Thu, 20 Apr 2006 14:41:46 +0100
Subject: [BioPython] Need help!!!
In-Reply-To: <20060420125957.72247.qmail@web36804.mail.mud.yahoo.com>
References: <20060420125957.72247.qmail@web36804.mail.mud.yahoo.com>
Message-ID: <1145540506.11610.17.camel@clintonite.nhm.ac.uk>

Hi Alper,

On Thu, 2006-04-20 at 05:59 -0700, alper soyler wrote:
> Hi All,
>
> I am new to Biopython and have a question. I want to construct a phylogenetic
> profile for one organism's proteins. I want to give my protein to blast to
> search one organism's genome (e.g. Homo sapiens) instead of the whole GenBank
> database. How can I solve my problem? Thank you in advance.

Assuming you want to do this locally, you'll need to download your target
genome, format it with the BLAST distribution programme 'formatdb', and
then feed your query and newly formatted genome BLAST database to
Bio.Blast.NCBIStandalone.

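As a rough illustration of the two steps just described, here is a minimal sketch that only builds the command lines for the legacy NCBI tools (formatdb, then blastall with XML output via -m 7) without running them. The file names genome.fna, my_protein.faa and hits.xml are hypothetical placeholders, and the helper functions are made up for this note; they are not part of Biopython.

```python
# Sketch only: builds legacy NCBI BLAST command lines without running them.
# All file names below are hypothetical placeholders.


def formatdb_command(fasta_path, protein):
    """Command that turns a FASTA file into a BLAST database (legacy formatdb)."""
    return ["formatdb", "-i", fasta_path, "-p", "T" if protein else "F"]


def blastall_command(program, database, query_path, out_path):
    """Command that searches `database` with `query_path`, writing XML (-m 7)."""
    return ["blastall", "-p", program, "-d", database,
            "-i", query_path, "-m", "7", "-o", out_path]


# A protein query against a nucleotide genome calls for tblastn,
# so the database is formatted as nucleotide (-p F).
make_db = formatdb_command("genome.fna", protein=False)
search = blastall_command("tblastn", "genome.fna", "my_protein.faa", "hits.xml")

print(" ".join(make_db))
print(" ".join(search))
```

Once the BLAST binaries are installed, each list could be handed to subprocess.call(); the resulting XML file is then what the Biopython XML parser expects.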
See
http://biopython.org/docs/tutorial/Tutorial004.html#toc10
3.1.4 Running BLAST locally
for details,

Cheers, Cymon

____________________________________________________________________

Cymon J. Cox
Biometry and Molecular Research
Department of Zoology
Natural History Museum
Cromwell Road
London, SW7 5BD

Email: cy at cymon.org, c.cox at nhm.ac.uk, cymon.cox at googlemail.com
Phone : +44 (0)20 7942 6981
HomePage : http://www.duke.edu/~cymon
-8.63/-6.77
_____________________________________________________________________
Fedora Core release 4 (Stentz) clintonite.nhm.ac.uk
14:35:55 up 13 days, 20:42, 8 users, load average: 0.08, 0.16, 0.12

From mcolosimo at mitre.org Thu Apr 20 10:23:19 2006
From: mcolosimo at mitre.org (Marc Colosimo)
Date: Thu, 20 Apr 2006 10:23:19 -0400
Subject: [BioPython] Parsing and Creating Dictionaries of GenBank files
In-Reply-To: <444781BA.8080107@maubp.freeserve.co.uk>
References: <3e73596b0604191802g3807f101jb99e41f95208122e@mail.gmail.com>
 <444781BA.8080107@maubp.freeserve.co.uk>
Message-ID: <65CA5BE4-1C83-4FD7-B998-C97BCF9AA6DE@mitre.org>

While we are on the subject of parsing multiple GenBank files and the
Cookbook, I think a better example (and more pythonish) is the
following:

    from Bio import GenBank

    gb_file = "my_file.gb"
    gb_handle = open(gb_file, 'r')
    feature_parser = GenBank.FeatureParser()
    gb_iterator = GenBank.Iterator(gb_handle, feature_parser)

    for cur_record in gb_iterator:
        # now do something with the record
        print cur_record.seq

which is way nicer (and uses iterators as per pep-234) than

    while 1:
        cur_record = gb_iterator.next()
        if cur_record is None:
            break
        # now do something with the record
        print cur_record.seq

Actually, the above works with the Fasta iterator as well.

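For readers comparing the two loop styles, the sketch below shows that they visit the same records. ToyRecordIterator is a made-up stand-in (not a Biopython class) for an old-style iterator whose next() returns None at the end; the second reader uses Python's built-in two-argument iter(callable, sentinel) form, which is one way to drive such an object with a for loop even if it does not support the iterator protocol itself.

```python
# Sketch only: ToyRecordIterator is a hypothetical stand-in, not Biopython code.


class ToyRecordIterator(object):
    """Old-style iterator: next() returns None once the records run out."""

    def __init__(self, records):
        self._records = list(records)
        self._pos = 0

    def next(self):
        if self._pos >= len(self._records):
            return None
        record = self._records[self._pos]
        self._pos += 1
        return record


def read_with_while(iterator):
    """Cookbook style: call next() until it returns None."""
    seen = []
    while True:
        record = iterator.next()
        if record is None:
            break
        seen.append(record)
    return seen


def read_with_for(iterator):
    """for-loop style: iter(callable, sentinel) stops when next() returns None."""
    seen = []
    for record in iter(iterator.next, None):
        seen.append(record)
    return seen


ids = ["AA035383", "AA971406", "N98563"]
via_while = read_with_while(ToyRecordIterator(ids))
via_for = read_with_for(ToyRecordIterator(ids))
print(via_while == via_for)
```

Both readers do the same amount of work per record, which fits the observation that the choice is mostly about readability rather than speed.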
Times for a GenBank file with 72,358 records (LOCUSs): my way (using iterators): 14m16.886s cookbook way (using next and if): 14m28.547s Surprisingly, this isn't much faster (maybe with -O it would be) Marc On Apr 20, 2006, at 8:42 AM, Peter (BioPython) wrote: > Pepe Barbe wrote: >> Hello, >> >> Following the simple steps in the BioPython cookbook, I wanted to >> create a dictionary with the following GenBank file: >> >> ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/Escherichia_coli_K12/ >> NC_000913.gbk >> >> Below you can find what I tried executing and the error I got. I >> would >> appreciate any insight into solving the error and correctly producing >> the dictionary. > > The cookbook tutorial is a little misleading in that regard. > Indexing a > GenBank file only makes sense for those files with multiple genbank > record (i.e. multiple LOCUS lines). > > For example, you can get multi-record GenBank files with records for > different genes. These tend to be small records, and the Martel based > indexing code copes fine. It doesn't cope very well with large > records > like genomes. > > Your example (and in my experience all Bacterial Genomes) have just a > single very large record (which will contain many features). > > Does this page help? > > http://www2.warwick.ac.uk/fac/sci/moac/currentstudents/peter_cock/ > python/genbank/ > > I did suggest a change to the documentation but it looks like no > one has > made the change... > > http://biopython.org/pipermail/biopython-dev/2005-November/002193.html > > I had forgotten to chase this up. 
>
> Peter
>
> _______________________________________________
> BioPython mailing list - BioPython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython

From elventear at gmail.com Thu Apr 20 12:11:42 2006
From: elventear at gmail.com (Pepe Barbe)
Date: Thu, 20 Apr 2006 11:11:42 -0500
Subject: [BioPython] Parsing and Creating Dictionaries of GenBank files
In-Reply-To: <444781BA.8080107@maubp.freeserve.co.uk>
References: <3e73596b0604191802g3807f101jb99e41f95208122e@mail.gmail.com>
 <444781BA.8080107@maubp.freeserve.co.uk>
Message-ID: <3e73596b0604200911i2e2c481bj306c5d282cae5c75@mail.gmail.com>

On 4/20/06, Peter (BioPython) wrote:
>
> The cookbook tutorial is a little misleading in that regard. Indexing a
> GenBank file only makes sense for those files with multiple genbank
> record (i.e. multiple LOCUS lines).

> Your example (and in my experience all Bacterial Genomes) have just a
> single very large record (which will contain many features).
>
> Does this page help?
>
> http://www2.warwick.ac.uk/fac/sci/moac/currentstudents/peter_cock/python/genbank/

It does help a lot. Thanks!

As an aside, while what I was doing wasn't exactly what I was looking
for, I think it was crashing because of a bug in 1.41. I installed the
latest CVS and it works normally now.

Pepe

From halima at mancala.cbio.uct.ac.za Thu Apr 20 07:57:20 2006
From: halima at mancala.cbio.uct.ac.za (Halima Rabiu)
Date: Thu, 20 Apr 2006 13:57:20 +0200 (SAST)
Subject: [BioPython] Need help parsing Blastoutput
In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEF7@cgcmail.cgc.cpmc.columbia.edu>
References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEF7@cgcmail.cgc.cpmc.columbia.edu>
Message-ID:

Thanks.
I tried using the XML parser and I am still getting errors which I don't
understand. Please see the attachment: a copy of my script and the Blast
XML output.

Here is the error:

Traceback (most recent call last):
  File "Bioperser.py", line 11, in ?
    b_record = b_parser.parse(b_out)
  File "/usr/local/lib/python2.4/site-packages/Bio/Blast/NCBIXML.py", line 112, in parse
    self._parser.parse(handler)
  File "/usr/local//lib/python2.4/xml/sax/expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/local//lib/python2.4/xml/sax/xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "/usr/local//lib/python2.4/xml/sax/expatreader.py", line 211, in feed
    self._err_handler.fatalError(exc)
  File "/usr/local//lib/python2.4/xml/sax/handler.py", line 38, in fatalError
    raise exception

thanks
Halimah

-------------- next part --------------
#! /usr/local/bin/python2.4
#halimah
#16-04-2006
from string import split
from Bio.Blast import NCBIXML
#from Bio.Blast import NCBIStandalone
b_out = open('blast2.xml','r')
b_parser = NCBIXML.BlastParser()
b_record = b_parser.parse(b_out)
E_VALUE_THRESH = 1.0
while 1:
    b_record = b_iterator.next()
    print "The following results are for query " + b_record.query
    print 'len of query:',b_record.query_letters
    if b_record is None:
        break
    for alignment in b_record.alignments:
        for hsp in alignment.hsps:
            if hsp.expect <= E_VALUE_THRESH:
                print '****Alignment****'
                print 'title:', alignment.title
                print 'length:', alignment.length
                print 'e value:', hsp.expect
                print 'subjectstart:',hsp.sbjct_start
                print 'subject end:', hsp.sbject_end
-------------- next part --------------
A non-text attachment was scrubbed...
Name: blast2.xml
Type: text/xml
Size: 151659 bytes
Desc:
Url : http://lists.open-bio.org/pipermail/biopython/attachments/20060420/391af520/attachment-0001.xml

From mdehoon at c2b2.columbia.edu Thu Apr 20 13:37:29 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Thu, 20 Apr 2006 13:37:29 -0400
Subject: [BioPython] Need help parsing Blastoutput
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEFF@cgcmail.cgc.cpmc.columbia.edu>

Could you send us the Blast XML output also?

--Michiel.

Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032

> > > Thanks
> > > Halimah
> > >
> > > _______________________________________________
> > > BioPython mailing list - BioPython at lists.open-bio.org
> > > http://lists.open-bio.org/mailman/listinfo/biopython
> > >
> > >
> > >
> > >

From mdehoon at c2b2.columbia.edu Thu Apr 20 15:15:51 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Thu, 20 Apr 2006 15:15:51 -0400
Subject: [BioPython] Parsing and Creating Dictionaries of GenBank files
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF01@cgcmail.cgc.cpmc.columbia.edu>

> I did suggest a change to the documentation but it looks like no one has
> made the change...
>
> http://biopython.org/pipermail/biopython-dev/2005-November/002193.html

I have now made this update in CVS. I'll put it on the website also as soon as I can figure out how to do that with the new webserver.

--Michiel.

Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032

From alpersoyler at yahoo.com Fri Apr 21 03:07:05 2006
From: alpersoyler at yahoo.com (alper soyler)
Date: Fri, 21 Apr 2006 00:07:05 -0700 (PDT)
Subject: [BioPython] Need help!!!
In-Reply-To: <1145540506.11610.17.camel@clintonite.nhm.ac.uk>
Message-ID: <20060421070705.85335.qmail@web36801.mail.mud.yahoo.com>

Hi Cymon,

Thank you for your reply. However, to construct a phylogenetic profile I need to download approx. 100 completed genomes. I am searching to make it easier (e.g. without downloading genomes). Can I do it by running blast over the internet?

Regards,
Alper Soyler

"Cymon J. Cox" wrote:
Hi Alper,

On Thu, 2006-04-20 at 05:59 -0700, alper soyler wrote:
> Hi All,
>
> I am new to Biopython and have a question. I want to construct a phylogenetic
> profile for one organism's proteins. I want to give my protein to blast to
> search one organism's genome (e.g. Homo sapiens) instead of whole genbank
> database. How can I solve my problem? Thank you in advance.

Assuming you want to do this locally, you'll need to download your target genome, format it with the BLAST distribution programme 'formatdb', and then feed your query and newly formatted genome BLAST database to Bio.Blast.NCBIStandalone.

See http://biopython.org/docs/tutorial/Tutorial004.html#toc10
3.1.4 Running BLAST locally

for details,

Cheers, Cymon

____________________________________________________________________
Cymon J. Cox
Biometry and Molecular Research
Department of Zoology
Natural History Museum
Cromwell Road
London, SW7 5BD
Email: cy at cymon.org, c.cox at nhm.ac.uk, cymon.cox at googlemail.com
Phone : +44 (0)20 7942 6981
HomePage : http://www.duke.edu/~cymon
-8.63/-6.77
_____________________________________________________________________
Fedora Core release 4 (Stentz) clintonite.nhm.ac.uk
14:35:55 up 13 days, 20:42, 8 users, load average: 0.08, 0.16, 0.12

From biopython at maubp.freeserve.co.uk Fri Apr 21 04:44:56 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython))
Date: Fri, 21 Apr 2006 09:44:56 +0100
Subject: [BioPython] Updating the tutorial, was :Parsing and Creating Dictionaries of GenBank files
In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF01@cgcmail.cgc.cpmc.columbia.edu>
References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF01@cgcmail.cgc.cpmc.columbia.edu>
Message-ID: <44489B88.2030801@maubp.freeserve.co.uk>

Michiel De Hoon wrote:
>> I did suggest a change to the documentation but it looks like no
>> one has made the change...
>>
>> http://biopython.org/pipermail/biopython-dev/2005-November/002193.html
>>

Thanks - I was going to look at this today.
Something funny seems to have happened to the plain text version:

http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Doc/Tutorial.txt.diff?r1=1.5&r2=1.6&cvsroot=biopython

(a) The old "Title" is missing above the contents listing
(b) Contents entries contain &nbsp; which is nasty for plain text.
(c) Section references now contain odd text.

Is it possible you only ran the TeX file once? Usually with references TeX should be run twice (and in extreme cases, three times).

In an earlier discussion it was suggested we remove the plain text documentation from CVS, which I objected to as plain text is much easier for non-TeX people to read.

If generating a consistent plain text version is a lot of hassle, then maybe we can live without it?

> I have now made this update in CVS. I'll put it on the website also
> as soon as I can figure out how to do that with the new webserver.

I can't help you there - I was going to post to the Developer mailing list to see if anyone had done this recently.

Have you been able to generate new HTML and Tutorial.pdf files?

Looks like you have also updated the text about the Blast parser :)

Peter

From cy at cymon.org Fri Apr 21 05:38:33 2006
From: cy at cymon.org (Cymon J. Cox)
Date: Fri, 21 Apr 2006 10:38:33 +0100
Subject: [BioPython] Need help!!!
In-Reply-To: <20060421070705.85335.qmail@web36801.mail.mud.yahoo.com>
References: <20060421070705.85335.qmail@web36801.mail.mud.yahoo.com>
Message-ID: <1145612313.4167.15.camel@clintonite.nhm.ac.uk>

Hi Alper,

On Fri, 2006-04-21 at 00:07 -0700, alper soyler wrote:
> Hi Cymon,
>
> Thank you for your reply. However, to construct phylogenetic profile I need to
> download approx. 100 completed genomes. I am searching to make it easier (e.g.
> without downloading genomes). Can I do it by running blast over the internet?

Well, I'm not sure; but here's my take on it and hopefully someone will correct me if I'm wrong.
Assuming you are referring to complete genomes available through NCBI (otherwise you'll almost certainly need to download them), I don't think it's possible with the BioPython interface. Bio.Blast.NCBIWWW uses the qblast interface at NCBI (http://www.ncbi.nlm.nih.gov/BLAST/Doc/urlapi.html) which I think only makes the following db's available: http://www.ncbi.nlm.nih.gov/blast/blast_databases.shtml . From looking at the qblast docs it doesn't seem possible to restrict the search to a particular organism while blast'ing against a particular NCBI db (e.g. nr).

Depending on what you want to do, it may be easier and quicker to use the NCBI web Blast interface to the Genomes db's: http://www.ncbi.nlm.nih.gov/sutils/genom_table.cgi

Else you'll have to bite the proverbial bullet and download and format them individually.

Cheers, Cymon

From biopython at maubp.freeserve.co.uk Fri Apr 21 05:23:12 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython))
Date: Fri, 21 Apr 2006 10:23:12 +0100
Subject: [BioPython] blast against genomes, was: Need help!!!
In-Reply-To: <20060421070705.85335.qmail@web36801.mail.mud.yahoo.com>
References: <20060421070705.85335.qmail@web36801.mail.mud.yahoo.com>
Message-ID: <4448A480.5010805@maubp.freeserve.co.uk>

alper soyler wrote:
> Hi Cymon,
>
> Thank you for your reply. However, to construct phylogenetic profile
> I need to download approx. 100 completed genomes. I am searching to
> make it easier (e.g. without downloading genomes). Can I do it by
> running blast over the internet?

So you want to search 100 completed genomes using your protein as the input query?

As Cymon suggested, downloading the genomes and building your own database is one method. As this is a "big task" you have in mind, the network speed limitations of doing many blast queries may make this a better idea than trying to do it online.

However, the NCBI offer online blast against some (all?) of their completed genomes so it may be possible to do it this way via BioPython.

http://www.ncbi.nlm.nih.gov/BLAST/

The webpage has a nice interface for blast against specific genomes (right hand side, second box down).

You can also use the normal blast pages and the "Limit by entrez query" field, e.g.

mouse[ORGN] OR rat[ORGN]

It should be possible to do this automatically in code but you will need to compile a list of the species names the NCBI will understand...

Peter

From sbassi at gmail.com Fri Apr 21 07:46:49 2006
From: sbassi at gmail.com (Sebastian Bassi)
Date: Fri, 21 Apr 2006 08:46:49 -0300
Subject: [BioPython] Need help!!!
In-Reply-To: <20060421070705.85335.qmail@web36801.mail.mud.yahoo.com>
References: <1145540506.11610.17.camel@clintonite.nhm.ac.uk> <20060421070705.85335.qmail@web36801.mail.mud.yahoo.com>
Message-ID:

On 4/21/06, alper soyler wrote:
> Hi Cymon,
> Thank you for your reply. However, to construct phylogenetic profile I need to download approx. 100 completed genomes. I am searching to make it easier (e.g. without downloading genomes). Can I do it by running blast over the internet?
>

Maybe you could download only the NR db and then make subsets from it. The NCBI utilities (or the local BLAST distribution) include a utility that lets you extract sequences from compiled BLAST DBs. I don't know if this would be enough for your needs.

--
Bioinformatics news: http://www.bioinformatica.info
Lriser: http://www.linspire.com/lraiser_success.php?serial=318

From mdehoon at c2b2.columbia.edu Fri Apr 21 12:26:39 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Fri, 21 Apr 2006 12:26:39 -0400
Subject: [BioPython] Updating the tutorial, was :Parsing and Creating Dictionaries of GenBank files
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF05@cgcmail.cgc.cpmc.columbia.edu>

> Something funny seems to have happened to the plain text version:
>
> http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Doc/Tutorial.txt.diff?r1=1.5&r2=1.6&cvsroot=biopython

The plain text version is generated by hevea, so not by tex directly. The funny output is likely due to having a different hevea version (which I ran a couple of times). I didn't see anything obviously wrong with the Tutorial.tex source file, so I think these errors are due to errors in the Tutorial.tex -> Tutorial.txt translation by hevea.

> If generating a consistent plain text version is a lot of hassle, then
> maybe we can live without it?

Currently, the plain text version is not very useful. It's not a source file, so it should not be in CVS. On the other hand, the plain text version is not available from the Biopython documentation page, and users are better off with the PDF version anyway. So I think nobody will miss the plain text version. Correct me if I'm wrong.

--Michiel.
From srini_iyyer_bio at yahoo.com Fri Apr 21 18:49:28 2006
From: srini_iyyer_bio at yahoo.com (Srinivas Iyyer)
Date: Fri, 21 Apr 2006 15:49:28 -0700 (PDT)
Subject: [BioPython] Creating a graphical interface to database of gene coordinates
Message-ID: <20060421224928.87408.qmail@web38105.mail.mud.yahoo.com>

Dear group,
I am happy that I am slowly finding pythonian projects related to my research area.

Problem:
1. I have a database of human gene coordinates on chromosomes.
2. I have gene expression data from my lab concerning the genes I mentioned above.
3. I want to visualize expression data laid on chromosomes.

Eg.
Coordinates:

Chr  Gene  From  To   Exon
1    x     100   120  exon:1
1    x     200   250  exon:2
1    x     350   450  exon:3

Expression data:

IDent   sample  Chr  From  To   Expression value
xxx_at  lung    1    110   120  100.35
x_s_at  heart   1    225   250  124.35
x_a_at  eye     1    375   400  146.35

What I want:
I want to have a simple window that would connect to my database. I want to give a gene, and this python/tk interface (or whatever) would query the database, draw a graph of the gene according to the exons, and plot the values.

-------_______----------_______-------

-- : exon
__ : regions that are not exons, introns.

My questions to Tutor/BioPython forums:

1. What should I decide to work on
   a. Py/Tk framework
   b. python imaging libraries etc.

2. I do not want to impress any one with this work, except that it should help me understand the relationships, as the number game in the tables above is highly confusing. So, a working version that accurately plots the expression values for as many samples as I have.

3. Are there any available modules to jump-start? Or do I have to create some from scratch, which would be a problem because I am between novice and mediocre level of python programming.

4. Any ideas/suggestions/pointers are highly appreciated.

thanks
Sri

From biopython at maubp.freeserve.co.uk Sat Apr 22 08:32:21 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython))
Date: Sat, 22 Apr 2006 13:32:21 +0100
Subject: [BioPython] Creating a graphical interface to database of gene coordinates
In-Reply-To: <20060421224928.87408.qmail@web38105.mail.mud.yahoo.com>
References: <20060421224928.87408.qmail@web38105.mail.mud.yahoo.com>
Message-ID: <444A2255.6010704@maubp.freeserve.co.uk>

Srinivas Iyyer wrote:
> Dear group,
> I am happy that I am slowly finding pythonian projects
> related to my research area.
>
> Problem:
> 1. I have a database of human gene coordinates on
> chromosomes.
> 2. I have gene expression data from my lab concerning
> the genes I mentioned above.
>
> 3. I want to visualize expression data laid on
> chromosomes.

You may be able to produce chromosome diagrams with Leighton Pritchard and Jennifer White's program genomediagram:

http://bioinf.scri.sari.ac.uk/lp/programs.html#genomediagram

It will do both circular genome diagrams (nice for bacteria) and linear ones - which would make sense for chromosomes. I think I've seen examples with expression data shown in this way... certainly it could be done.

Note that this can produce PDF or bitmap output - but it's not interactive. There is also a GUI to go with it, but I have not looked at this.

----------------------------------------------------------------------

One final suggestion is to consider looking at R/BioConductor - it's a completely different language but I have seen examples where expression data is visualised on chromosomes.
http://www.r-project.org/ http://www.bioconductor.org/ You can even call R from Python, for example using RPy (R from Python),: http://rpy.sourceforge.net/index.html See also RSPython, an R/SPlus - Python Interface which I have not used personally: http://www.omegahat.org/RSPython/ Peter From biopython at maubp.freeserve.co.uk Mon Apr 24 06:56:06 2006 From: biopython at maubp.freeserve.co.uk (Peter (BioPython List)) Date: Mon, 24 Apr 2006 11:56:06 +0100 Subject: [BioPython] Bio.Nexus documentation Message-ID: <444CAEC6.5040703@maubp.freeserve.co.uk> I'm thinking of having a go at using the new Bio.Nexus model in BioPython to do some phylogenetic tree manipulation (from Clustal .dnd files in my case), so I thought I would have a hunt for some examples or help... Back in July 2005, Frank Kauff wrote: > I hope most of the methods have a descriptive title and are easy to use. > Let me know if I can help further. And I promise to write some > documentation, but it won't be before end of August. > > Cheers, > Frank Archive link: http://biopython.org/pipermail/biopython/2005-July/002714.html Was that August 2005, or August 2006, you had in mind? ;) Do you have some simple examples you could share with us instead perhaps? Thanks Peter From fkauff at duke.edu Mon Apr 24 09:32:45 2006 From: fkauff at duke.edu (Frank Kauff) Date: Mon, 24 Apr 2006 09:32:45 -0400 Subject: [BioPython] Bio.Nexus documentation In-Reply-To: <444CAEC6.5040703@maubp.freeserve.co.uk> References: <444CAEC6.5040703@maubp.freeserve.co.uk> Message-ID: <1145885566.2369.6.camel@osiris.biology.duke.edu> An embedded and charset-unspecified text was scrubbed... 
Name: not available Url: http://lists.open-bio.org/pipermail/biopython/attachments/20060424/d8a5de2f/attachment.ksh From halima at mancala.cbio.uct.ac.za Mon Apr 24 04:45:09 2006 From: halima at mancala.cbio.uct.ac.za (Halima Rabiu) Date: Mon, 24 Apr 2006 10:45:09 +0200 (SAST) Subject: [BioPython] Need help parsing Blastoutput In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEFF@cgcmail.cgc.cpmc.columbia.edu> References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEFF@cgcmail.cgc.cpmc.columbia.edu> Message-ID: Hi attch here is the output xml out I also attached it in my previous post thanks Halimah On Thu, 20 Apr 2006, Michiel De Hoon wrote: > Could you send us the Blast XML output also? > > --Michiel. -------------- next part -------------- A non-text attachment was scrubbed... Name: blast2.xml Type: text/xml Size: 151658 bytes Desc: Url : http://lists.open-bio.org/pipermail/biopython/attachments/20060424/af1567dc/attachment-0001.xml From mdehoon at c2b2.columbia.edu Mon Apr 24 14:14:17 2006 From: mdehoon at c2b2.columbia.edu (Michiel De Hoon) Date: Mon, 24 Apr 2006 14:14:17 -0400 Subject: [BioPython] Need help parsing Blastoutput Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF0E@cgcmail.cgc.cpmc.columbia.edu> Ha, I see. My stupid email program was removing the XML file from your email messages for security reasons something or other. Anyway, I got the XML files from the mailing list archives. The XML file from Thursday April 20 is different from the one sent on Monday April 24.
In fact, the latter seems to be damaged; in line 194, it has: while the former has So in the latter a " is missing for some reason. Anyway, the XML parser can read the XML file from Thursday April 20 if you fix a few things in your script: *) Instead of b_record = b_parser.parse(b_out) you need b_iterator = NCBIStandalone.Iterator(b_out, b_parser) (and then you should also import NCBIStandalone) *) You should check if b_record is None immediately after b_record = b_iterator.next(). *) There is no hsp.sbject_end --Michiel. Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032 -----Original Message----- From: Halima Rabiu [mailto:halima at cbio.uct.ac.za] Sent: Mon 4/24/2006 4:45 AM To: Michiel De Hoon Cc: biopython at lists.open-bio.org Subject: RE: [BioPython] Need help parsing Blastoutput Hi attch here is the output xml out I also attached it in my previous post thanks Halimah On Thu, 20 Apr 2006, Michiel De Hoon wrote: > Could you send us the Blast XML output also? > > --Michiel. > > Michiel de Hoon > Center for Computational Biology and Bioinformatics > Columbia University > 1150 St Nicholas Avenue > New York, NY 10032 > > > > -----Original Message----- > From: Halima Rabiu [mailto:halima at cbio.uct.ac.za] > Sent: Thu 4/20/2006 7:57 AM > To: Michiel De Hoon > Cc: biopython at lists.open-bio.org > Subject: RE: [BioPython] Need help parsing Blastoutput > > thanks I try using XML parser and I am still geting errors which I dont > understand . please see the attchmnt copy of my script and Blast XML > output. > here is the error > raceback (most recent call last): > File "Bioperser.py", line 11, in ? 
> b_record = b_parser.parse(b_out) > File "/usr/local/lib/python2.4/site-packages/Bio/Blast/NCBIXML.py", line > 112, in parse > self._parser.parse(handler) > File "/usr/local//lib/python2.4/xml/sax/expatreader.py", line 107, in > parse > xmlreader.IncrementalParser.parse(self, source) > File "/usr/local//lib/python2.4/xml/sax/xmlreader.py", line 123, in > parse > self.feed(buffer) > File "/usr/local//lib/python2.4/xml/sax/expatreader.py", line 211, in > feed > self._err_handler.fatalError(exc) > File "/usr/local//lib/python2.4/xml/sax/handler.py", line 38, in > fatalError > raise exception > thanks > Halimah > > On Wed, 19 Apr 2006, Michiel De Hoon wrote: > > > The Blast parser fails to read your file because the format of Blast output > > has changed. If I edit the data file so that it corresponds to the old > format > > (add a space here, remove a blank line there, etc.), the Blast parser reads > > the file without problems. The easiest solution is to repeat the Blast run, > > using XML for the output format, and use the Blast XML parser in Biopython > to > > parse the results. > > > > A general question is if anybody still needs the parser for Blast text > > output. Currently, we are confusing our users by having a Blast text parser > > that tends to break. A broken parser may be worse than no parser. > > > > --Michiel. > > > > Michiel de Hoon > > Center for Computational Biology and Bioinformatics > > Columbia University > > 1150 St Nicholas Avenue > > New York, NY 10032 > > > > > > > > -----Original Message----- > > From: Halima Rabiu [mailto:halima at cbio.uct.ac.za] > > Sent: Wed 4/19/2006 6:15 AM > > To: Michiel De Hoon > > Cc: biopython at lists.open-bio.org > > Subject: RE: [BioPython] Need help parsing Blastoutput > > > > Hi > > Please see the attachment,it part of my Blast output. 
> > yes I am try to parse text output from Blast ,I have use another script to > > run my local blast that I am trying to perse the NCBIStandalone.BlastParser > > > was working fine without hsp.sbject_end which is one of what I need to > > print out . > > On checking the class diagrams from cookbook, findout that sbject_end is > > not included .I just need another way of printing the int(subject end). > > Thanks for your help > > Halimah > > > > On Tue, 18 Apr 2006, Michiel De Hoon wrote: > > > > > Could you also send us the file Enterococcus_out so we can run the > script? > > > > > > From the script, it looks like you're trying to parse text output from > > Blast. > > > While this is possible (in theory), the format of Blast text output tends > > to > > > change a lot, thereby breaking the parser in Biopython. It is more > reliable > > > to have Blast generate output in XML format, and use the XML parser: > > > > > > blast_out = open('my_blast.xml', 'r') > > > > > > from Bio.Blast import NCBIXML > > > > > > b_parser = NCBIXML.BlastParser() > > > b_record = b_parser.parse(blast_out) > > > > > > See section 3.1.2 in the Biopython cookbook, and section 3.1.4 on how to > > > generate Blast output in XML. > > > > > > --Michiel. > > > > > > > > > > > > Michiel de Hoon > > > Center for Computational Biology and Bioinformatics > > > Columbia University > > > 1150 St Nicholas Avenue > > > New York, NY 10032 > > > > > > > > > > > > -----Original Message----- > > > From: Halima Rabiu [mailto:halima at cbio.uct.ac.za] > > > Sent: Tue 4/18/2006 11:06 AM > > > To: Michiel De Hoon > > > Cc: biopython at lists.open-bio.org > > > Subject: RE: [BioPython] Need help parsing Blastoutput > > > > > > thanks > > > please see the attchment a copy of my script and copy of my Blast output > > > Thanks > > > > > > > > > On Thu, 13 Apr 2006, Michiel De Hoon wrote: > > > > > > > Could you send us the script you were using? > > > > > > > > --Michiel. 
> > > > > > > > Michiel de Hoon > > > > Center for Computational Biology and Bioinformatics > > > > Columbia University > > > > 1150 St Nicholas Avenue > > > > New York, NY 10032 > > > > > > > > > > > > > > > > -----Original Message----- > > > > From: biopython-bounces at lists.open-bio.org on behalf of Halima Rabiu > > > > Sent: Thu 4/13/2006 11:07 AM > > > > To: biopython at lists.open-bio.org > > > > Subject: [BioPython] Need help parsing Blastoutput > > > > > > > > Hi All, > > > > I have a BLAST output from a local blast > > > > I need to calculate my % alignment coverage as regard to my subject > > > > I try parsed the blast output and wanted to print the > > > > sbjct Start and Sbjct end. but I could not is there anyway I could this > > > > > try to get mach coverage between my querry and subject I dont need > > > > Identities,but total % alignment for querry or subject. > > > > Thanks > > > > Halimah > > > > > > > > _______________________________________________ > > > > BioPython mailing list - BioPython at lists.open-bio.org > > > > http://lists.open-bio.org/mailman/listinfo/biopython > > > > > > > > > > > > > > > > > > > > From mdehoon at c2b2.columbia.edu Mon Apr 24 14:27:31 2006 From: mdehoon at c2b2.columbia.edu (Michiel De Hoon) Date: Mon, 24 Apr 2006 14:27:31 -0400 Subject: [BioPython] Need help parsing Blastoutput Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF0F@cgcmail.cgc.cpmc.columbia.edu> Also, make sure you have the latest version of Bio/Blast/NCBIStandalone.py; you can get it from here: http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/*checkout*/biopython/Bio /Blast/NCBIStandalone.py?rev=1.60&cvsroot=biopython&content-type=text/plain --Michiel. 
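On the question that started this thread (percent alignment coverage of the subject): once the subject start and end of each HSP are available, coverage can be taken as the merged length of those ranges divided by the subject length. The following is only a minimal sketch using the standard library. The coordinates and the subject length are made up, and how the start and end values are read from the parsed Blast record is deliberately left open, since that depends on the Biopython version in use.

```python
def merged_length(intervals):
    """Count positions covered by 1-based, inclusive (start, end) ranges.

    Ranges may overlap and may be given in reverse order (start > end),
    as happens for hits on the opposite strand.
    """
    covered = 0
    current_start = current_end = None
    for start, end in sorted((min(s, e), max(s, e)) for s, e in intervals):
        if current_end is None or start > current_end + 1:
            # Gap before this range: close the previous merged range.
            if current_end is not None:
                covered += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged range.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start + 1
    return covered


def percent_coverage(intervals, sequence_length):
    """Percentage of the sequence covered by at least one range."""
    return 100.0 * merged_length(intervals) / sequence_length


# Hypothetical (subject start, subject end) pairs for three HSPs.
hsps = [(1, 100), (51, 150), (300, 201)]
subject_length = 400  # made-up subject length
print("%.1f%% of the subject is covered" % percent_coverage(hsps, subject_length))
```

The same function gives query coverage if it is fed the query start and end of each HSP together with the query length.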
Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032
From biopython at maubp.freeserve.co.uk Tue Apr 25 05:08:33 2006 From: biopython at maubp.freeserve.co.uk (Peter (BioPython List)) Date: Tue, 25 Apr 2006 10:08:33 +0100 Subject: [BioPython] Bio.Nexus documentation In-Reply-To: <1145885566.2369.6.camel@osiris.biology.duke.edu> References: <444CAEC6.5040703@maubp.freeserve.co.uk> <1145885566.2369.6.camel@osiris.biology.duke.edu> Message-ID: <444DE711.8070509@maubp.freeserve.co.uk> > Anyway, I'll get some examples together, and I still want to do some > documentation for the cookbook. It won't be before this weekend, though. > For a quick and dirty anchor point, there's the test module that comes > with the distribution, it naturally has some code that does interesting > things with trees and data. Its certainly shown me that the Nexus file format is a lot more complicated than just holding simple trees. What I actually wanted to do was load a Newick format tree (extension *.dnd files from Clustalw/ClustalX in particular) into BioPython. This doesn't look like is possible. However, I can get Clustalx to save the corresponding alignment in Nexus format, but the parser doesn't seem to like it... Traceback (most recent call last): File "C:\temp\hack_trees_000.py", line 7, in ?
    n=Nexus.Nexus(input_file)
  File "C:\Python23\lib\site-packages\Bio\Nexus\Nexus.py", line 525, in __init__
    self.read(input)
  File "C:\Python23\lib\site-packages\Bio\Nexus\Nexus.py", line 582, in read
    self._parse_nexus_block(title, contents)
  File "C:\Python23\lib\site-packages\Bio\Nexus\Nexus.py", line 623, in _parse_nexus_block
    getattr(self,'_'+line.command)(line.options)
AttributeError: 'Nexus' object has no attribute '_utree'

This looks like it's caused by the penultimate line of the "Nexus Tree file" produced by ClustalX:

..
UTREE PAUP_1= (...);
ENDBLOCK;

Any ideas? I'll happily send you some example tree files off the list if you want.

Peter

From fkauff at duke.edu Tue Apr 25 08:03:16 2006
From: fkauff at duke.edu (Frank)
Date: Tue, 25 Apr 2006 08:03:16 -0400
Subject: [BioPython] Bio.Nexus documentation
In-Reply-To: <444DE711.8070509@maubp.freeserve.co.uk>
References: <444CAEC6.5040703@maubp.freeserve.co.uk> <1145885566.2369.6.camel@osiris.biology.duke.edu> <444DE711.8070509@maubp.freeserve.co.uk>
Message-ID: <1145966596.2276.3.camel@cpe-066-057-048-192.nc.res.rr.com>

Hi Peter,

yes, utree is indeed a nexus command I never heard of... The thing is that nexus is extendible, so programs can in theory define new commands. So, what is utree? Maybe an unrooted tree? And many programs don't care much about the nexus specifications, which are, in turn, not always too precise. If you send the files along, I'd be happy to have a look.

Cheers,
Frank

On Tue, 2006-04-25 at 10:08 +0100, Peter (BioPython List) wrote:
> > Anyway, I'll get some examples together, and I still want to do some
> > documentation for the cookbook. It won't be before this weekend, though.
> > For a quick and dirty anchor point, there's the test module that comes
> > with the distribution, it naturally has some code that does interesting
> > things with trees and data.
>
> Its certainly shown me that the Nexus file format is a lot more
> complicated than just holding simple trees.
From fkauff at duke.edu Tue Apr 25 17:17:23 2006 From: fkauff at duke.edu (Frank Kauff) Date: Tue, 25 Apr 2006 17:17:23 -0400 Subject: [BioPython] Bio.Nexus documentation In-Reply-To: <444DE711.8070509@maubp.freeserve.co.uk> References: <444CAEC6.5040703@maubp.freeserve.co.uk> <1145885566.2369.6.camel@osiris.biology.duke.edu> <444DE711.8070509@maubp.freeserve.co.uk> Message-ID: <1145999843.2365.25.camel@osiris.biology.duke.edu> Ok, I added support for the utree command used in clustal to denote an unrooted tree (in the nexus parser, it is synonym to 'tree', as trees are unrooted by default anyway), and fixed some issues with linebreaks in tree descriptions. Nexus files from Clustal should now be read without problems (famous last words).
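The AttributeError earlier in this thread comes from looking up a handler method by command name (the traceback shows `getattr(self,'_'+line.command)`), so any command without a matching method fails. A minimal sketch of that dispatch pattern, with `utree` treated as a synonym for `tree`, might look like the following. The class, its methods, and the tree strings are hypothetical and only illustrate the idea; this is not the actual Bio.Nexus code.

```python
class TinyCommandParser:
    """Toy parser that dispatches on a command name (hypothetical, not Bio.Nexus)."""

    def __init__(self):
        self.trees = []

    def _tree(self, options):
        # Store "name = newick" as a (name, newick) pair.
        name, newick = options.split("=", 1)
        self.trees.append((name.strip(), newick.strip()))

    # Treat 'utree' as a synonym for 'tree': both names map to one handler.
    _utree = _tree

    def handle(self, command, options):
        # Look the handler up by name, but fail with a clearer message
        # than a bare AttributeError when the command is unknown.
        handler = getattr(self, "_" + command.lower(), None)
        if handler is None:
            raise ValueError("unsupported command: %s" % command)
        handler(options)


parser = TinyCommandParser()
parser.handle("UTREE", "PAUP_1 = (A,(B,C));")
parser.handle("tree", "second = ((A,B),C);")
print(parser.trees)
```

Aliasing the method keeps a single code path for both commands, which matches the description of `utree` as a synonym rather than a separate tree type.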
Cheers, Frank -- Frank Kauff Dept.
of Biology Duke University Box 90338 Durham, NC 27708 USA Phone 919-660-7382 Fax 919-660-7293 Web http://www.lutzonilab.net

From biopython at maubp.freeserve.co.uk Wed Apr 26 10:16:21 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython List))
Date: Wed, 26 Apr 2006 15:16:21 +0100
Subject: [BioPython] Bio.Nexus and Clustal tree files
Message-ID: <444F80B5.60207@maubp.freeserve.co.uk>

Hello again,

I have installed Frank Kauff's recent changes to Bio.Nexus from CVS, and have actually got a tree loaded now :)

Here is my example script, which tries to load two tree files created using ClustalX 1.83 (files previously sent to Frank off list):

(a) demo.dnd - Clustal guide tree in Newick format, no bootstraps
(b) demo.treb - Clustal NJ tree in Nexus format, with bootstraps

Example code starts here:

from Bio.Nexus import Nexus

# Try the Newick guide tree first, then the Nexus tree file.
for filename in [r"C:\TEMP\nexus\demo.dnd", r"C:\TEMP\nexus\demo.treb"] :
    input_file = open(filename,"r")
    n=Nexus.Nexus(input_file)
    input_file.close()
    print "-----------------"
    print "Filename:" + n.filename
    print "Number of taxlabels = %i" % len(n.taxlabels)
    print "Number of trees = %i" % len(n.trees)
    # List the name and the taxa of every tree that was found.
    for tree in n.trees :
        print "Tree name: %s"% tree.name
        print "Tree nodes: " + ", ".join(tree.get_taxa())
print "-----------------"

This gives the following output:

-----------------
Filename:C:\TEMP\nexus\demo.dnd
Number of taxlabels = 0
Number of trees = 0
-----------------
Filename:C:\TEMP\nexus\demo.treb
Number of taxlabels = 0
Number of trees = 1
Tree name: PAUP_1
Tree nodes: V_Harveyi_PATH, B_subtilis_YXEM, B_subtilis_GlnH_homo_YCKK, YA80_HAEIN, FLIY_ECOLI, Deinococcus_radiodurans, HISJ_E_COLI, E_coli_GlnH
-----------------

As you can see, loading the ClustalX NEXUS output (*.treb) seems to work without trouble (although n.taxlabels is an empty list... is this to be expected?).

On the other hand, I don't get the tree for the Clustal guide tree file (*.dnd) which is a pain.
Do I need to load these files differently, as they are Newick format, not NEXUS format? Thank you Peter From fkauff at duke.edu Wed Apr 26 11:17:31 2006 From: fkauff at duke.edu (Frank Kauff) Date: Wed, 26 Apr 2006 11:17:31 -0400 Subject: [BioPython] Bio.Nexus and Clustal tree files In-Reply-To: <444F80B5.60207@maubp.freeserve.co.uk> References: <444F80B5.60207@maubp.freeserve.co.uk> Message-ID: <1146064651.2365.41.camel@osiris.biology.duke.edu> On Wed, 2006-04-26 at 15:16 +0100, Peter (BioPython List) wrote: > Hello again, > > I have installed Frank Kauff's recent changes to Bio.Nexus from CVS, and > have actually got a tree loaded now :) > Excellent! > Here is my example script, which tries to load two tree files created > using ClustalX 1.83 (files previously sent to Frank off list) > > (b) demo.dnd - Clustal guide tree in Newick format, no bootstraps > (b) demo.treb - Clustal NJ tree in Nexus format, with bootstraps > > Example code starts here: > This gives the following output: > > ----------------- > Filename:C:\TEMP\nexus\demo.dnd > Number of taxlabels = 0 > Number of trees = 0 > ----------------- > Filename:C:\TEMP\nexus\demo.treb > Number of taxlabels = 0 > Number of trees = 1 > Tree name: PAUP_1 > Tree nodes: V_Harveyi_PATH, B_subtilis_YXEM, B_subtilis_GlnH_homo_YCKK, > YA80_HAEIN, FLIY_ECOLI, Deinococcus_radiodurans, HISJ_E_COLI, E_coli_GlnH > ----------------- > > As you can see, loading the ClustalX NEXUS output (*.treb) seems to work > without trouble (although n.taxlabels is an empty list... is this to be > expected?). yes, the taxlabels refers to the taxon labels of a nexus data matrix. They are not necessarily identical with the taxa in the tree, but could be a superset or a subset of those. However, the way clustal indicates the no. of supported bootstrap replicates (square brackets after the branchlengths) is unsupported, and thus these values are ignored. 
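Since the bracketed bootstrap values are skipped by the parser, one option is to pull them out of the tree text separately. The sketch below uses only the standard library. The tree string is made up, and the assumption that each value is an integer in square brackets directly after a branch length follows the description above rather than any format specification.

```python
import re

# Hypothetical tree text in the style described above; taxon names,
# branch lengths, and bootstrap counts are invented for illustration.
newick = "((A:0.10,B:0.20):0.05[87],(C:0.30,D:0.40):0.02[100]);"

# An integer inside square brackets, e.g. "[87]".
BOOTSTRAP = re.compile(r"\[(\d+)\]")


def split_bootstraps(tree_text):
    """Return (tree text without bracketed values, list of the values)."""
    values = [int(v) for v in BOOTSTRAP.findall(tree_text)]
    stripped = BOOTSTRAP.sub("", tree_text)
    return stripped, values


stripped, bootstraps = split_bootstraps(newick)
print(stripped)
print(bootstraps)
```

The values come back in the order they appear in the text, so matching them to internal nodes still requires walking the tree in the same order.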
> On the other hand, I don't get the tree for the Clustal guide tree file
> (*.dnd) which is a pain. Do I need to load these files differently, as
> they are Newick format, not NEXUS format?

Yes, the nexus parser reads only nexus. But you can throw the newick tree directly at the Tree class:

>>> from Bio.Nexus import Trees
>>> t=Trees.Tree(open('demo.dnd').read())

Frank

> Thank you
>
> Peter
> _______________________________________________
> BioPython mailing list - BioPython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython

--
Frank Kauff Dept. of Biology Duke University Box 90338 Durham, NC 27708 USA Phone 919-660-7382 Fax 919-660-7293 Web http://www.lutzonilab.net

From dam6278 at yahoo.fr Thu Apr 27 03:53:24 2006
From: dam6278 at yahoo.fr (dam6278)
Date: Thu, 27 Apr 2006 07:53:24 +0000 (GMT)
Subject: [BioPython] GenBank
Message-ID: <20060427075324.13946.qmail@web86913.mail.ukl.yahoo.com>

I have a problem with the GenBank parser. When I execute:

from Bio import GenBank
gi_list = GenBank.search_for("Opuntia AND rpl16")

my output is:

Traceback (most recent call last):
  File "", line 1, in ?
  File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py", line 1398, in search_for
    retstart = start_id, retmax = max_ids)
  File "/usr/lib/python2.4/site-packages/Bio/EUtils/DBIdsClient.py", line 294, in search
    searchinfo = parse.parse_search(infile, [None])
  File "/usr/lib/python2.4/site-packages/Bio/EUtils/parse.py", line 201, in parse_search
    for ele in pom["TranslationStack"]:
  File "/usr/lib/python2.4/site-packages/Bio/EUtils/POM.py", line 355, in __getitem__
    raise IndexError, "no item matches"
IndexError: no item matches

Do you know where my problem is? Thank you for your help.
damien From lpritc at scri.sari.ac.uk Thu Apr 27 04:33:21 2006 From: lpritc at scri.sari.ac.uk (Leighton Pritchard) Date: Thu, 27 Apr 2006 09:33:21 +0100 Subject: [BioPython] Creating a graphical interface to database of gene coordinates In-Reply-To: <444A2255.6010704@maubp.freeserve.co.uk> References: <20060421224928.87408.qmail@web38105.mail.mud.yahoo.com> <444A2255.6010704@maubp.freeserve.co.uk> Message-ID: <1146126802.4725.223.camel@lplinuxdev> Hi guys, On Sat, 2006-04-22 at 13:32 +0100, Peter (BioPython) wrote: > Srinivas Iyyer wrote: > > Dear group, > > I am happy that I am slowly finding pyhonian projects > > related to my research area. > > > > Problem: > > 1. I have a database of human gene coordinates on > > chromosomes. > > 2. I have gene expression data from my lab concerning > > the genes I mentioned above. > > > > 3. I want to visualize expression data laid on > > chromosomes. > > You may be able to produce chromosome diagrams with Leighton Pritchard > and Jennifer White's program genomediagram: > > http://bioinf.scri.sari.ac.uk/lp/programs.html#genomediagram > > It will do both circular genomes diagrams (nice for bacteria) and linear > ones - which would make sense for chromosomes. I think I've seen > examples with expression data shown in this way... certainly it could be > done. We use it ourselves to plot array data against chromosome location, but on the whole chromosome scale and, as you mention, not interactively. It's pretty easy to do, but not what Srinivas is looking for, I think. It sounds, Srinivas, like you're wanting something that will operate more like GeneSpring? Is that right? It's possible that, if you just wanted to present a static image of expression data, you could use GenomeDiagram in this way, but it's not the way I would choose to present the data in a GUI - I'd expect drawing straight onto a canvas (in whichever GUI toolkit suited you) to be more flexible for you. 
> Note that this can produce PDF or bitmap output - but its not > interactive. There is also a GUI to go with it, but I have not looked > at this. The GUI is pretty rudimentary, providing for file selection and just enough document formatting so as to not be entirely useless to the non- programmer. An improved version (but still not interactive) is in a perenially almost-ready state as wxPython widgets in the current source, waiting for a serious fixing and a wxApp to hang from. -- Dr Leighton Pritchard AMRSC D131, Plant-Pathogen Interactions, Scottish Crop Research Institute Invergowrie, Dundee, Scotland, DD2 5DA, UK T: +44 (0)1382 562731 x2405 F: +44 (0)1382 568578 E: lpritc at scri.sari.ac.uk W: http://bioinf.scri.sari.ac.uk/lp GPG/PGP: FEFC205C E58BA41B http://www.keyserver.net (If the signature does not verify, please remove the SCRI disclaimer) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ DISCLAIMER: This email is from the Scottish Crop Research Institute, but the views expressed by the sender are not necessarily the views of SCRI and its subsidiaries. This email and any files transmitted with it are confidential to the intended recipient at the e-mail address to which it has been addressed. It may not be disclosed or used by any other than that addressee. If you are not the intended recipient you are requested to preserve this confidentiality and you must not use, disclose, copy, print or rely on this e-mail in any way. Please notify postmaster at scri.sari.ac.uk quoting the name of the sender and delete the email from your system. Although SCRI has taken reasonable precautions to ensure no viruses are present in this email, neither the Institute nor the sender accepts any responsibility for any viruses, and it is your responsibility to scan the email and the attachments (if any). 
From mdehoon at c2b2.columbia.edu Thu Apr 27 11:31:43 2006 From: mdehoon at c2b2.columbia.edu (Michiel De Hoon) Date: Thu, 27 Apr 2006 11:31:43 -0400 Subject: [BioPython] GenBank Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF1A@cgcmail.cgc.cpmc.columbia.edu> I was not able to replicate this error -- both biopython 1.41 and biopython in CVS worked fine. Perhaps a temporary internet failure? --Michiel. Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032
From bill at barnard-engineering.com Fri Apr 28 00:44:28 2006 From: bill at barnard-engineering.com (Bill Barnard) Date: Thu, 27 Apr 2006 21:44:28 -0700 Subject: [BioPython] Updating the tutorial, was :Parsing and Creating Dictionaries of GenBank files In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF05@cgcmail.cgc.cpmc.columbia.edu> References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF05@cgcmail.cgc.cpmc.columbia.edu> Message-ID: <1146199468.5816.34.camel@lyell.barnard-engineering.com> On Fri, 2006-04-21 at 12:26 -0400, Michiel De Hoon wrote: > > Something funny seems to have happened to the plain text version: > > > > > http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Doc/Tutorial.t > xt.diff?r1=1.5&r2=1.6&cvsroot=biopython > > The plain text version is generated by hevea, so not by tex directly. The > funny output is likely due to having a different hevea version (which I ran a > couple of times). I didn't see anything obviously wrong with the Tutorial.tex > source file, so I think these errors are due to errors in the Tutorial.tex -> > Tutorial.txt translation by hevea. FWIW - I just updated from CVS and ran my updated Doc makefiles (see http://bugzilla.open-bio.org/show_bug.cgi?id=1939 ) and don't see any of the weird artifacts in the generated Tutorial.txt file. My hevea version is 1.06. > > > If generating a consistent plain text version is a lot of hassle, then > > maybe we can live without it? > > Currently, the plain text version is not very useful. It's not a source file, > so it should not be in CVS. On the other hand, the plain text version is not > available from the Biopython documentation page, and users are better off > with the PDF version anyway. So I think nobody will miss the plain text > version. Correct me if I'm wrong.
As long as your release process includes running a make in the Doc tree, then you can generate the txt file from the tex source. Bill From mdehoon at c2b2.columbia.edu Fri Apr 28 12:37:30 2006 From: mdehoon at c2b2.columbia.edu (Michiel De Hoon) Date: Fri, 28 Apr 2006 12:37:30 -0400 Subject: [BioPython] Updating the tutorial, was :Parsing and Creating Dictionaries of GenBank files Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF1E@cgcmail.cgc.cpmc.columbia.edu> > On Fri, 2006-04-21 at 12:26 -0400, Michiel De Hoon wrote: > > > Something funny seems to have happened to the plain text version: > > > > The plain text version is generated by hevea, so not by tex directly. The > > funny output is likely due to having a different hevea version (which I ran a > > couple of times). I didn't see anything obviously wrong with the Tutorial.tex > > source file, so I think these errors are due to errors in the Tutorial.tex -> > > Tutorial.txt translation by hevea. > > FWIW - I just updated from CVS and ran my updated Doc makefiles (see > http://bugzilla.open-bio.org/show_bug.cgi?id=1939 ) and don't see any of > the weird artifacts in the generated Tutorial.txt file. My hevea version > is 1.06. So it's probably a hevea problem -- I'm using version 1.08. > As long as your release process includes running a make in the Doc tree, > then you can generate the txt file from the tex source. That is one of the steps in building a release -- see http://www.biopython.org/docs/developer/build.html --Michiel. From clayton_kd at yahoo.com Sat Apr 29 11:05:09 2006 From: clayton_kd at yahoo.com (Kyle Dent) Date: Sat, 29 Apr 2006 08:05:09 -0700 (PDT) Subject: [BioPython] GenBank parsing Message-ID: <20060429150509.82051.qmail@web31509.mail.mud.yahoo.com> Dear All, My script was successfully implementing the Genbank parser until just today I was trying to get it to parse a genpept file. 
After much experimentation I discovered that it was actually having trouble parsing even newly downloaded GenBank files as well (downloaded from NCBI).

I wanted to ask if anyone is aware of this problem. I understand the flat file format was updated this month, which is probably the cause.

The output which I am getting:

Traceback (most recent call last):
  File "C:\work\GB CDS Extractor.py", line 289, in open1_clicked
    loadGenBank(self, self.gbFilePath)
  File "C:\work\GB CDS Extractor.py", line 75, in loadGenBank
    cur_record = genBank_Iterator.next()
  File "C:\Python24\Lib\site-packages\Bio\GenBank\__init__.py", line 129, in next
    return self._parser.parse(File.StringHandle(data))
  File "C:\Python24\Lib\site-packages\Bio\GenBank\__init__.py", line 219, in parse
    self._scanner.feed(handle, self._consumer)
  File "C:\Python24\Lib\site-packages\Bio\GenBank\__init__.py", line 1259, in feed
    self._parser.parseFile(handle)
  File "C:\Python24\Lib\site-packages\Martel\Parser.py", line 328, in parseFile
    self.parseString(fileobj.read())
  File "C:\Python24\Lib\site-packages\Martel\Parser.py", line 356, in parseString
    self._err_handler.fatalError(result)
  File "C:\Python24\lib\xml\sax\handler.py", line 38, in fatalError
    raise exception
Martel.Parser.ParserPositionException: error parsing at or beyond character 136

With thanks,
Kyle

From biopython at maubp.freeserve.co.uk Sat Apr 29 17:54:59 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython))
Date: Sat, 29 Apr 2006 22:54:59 +0100
Subject: [BioPython] GenBank parsing
In-Reply-To: <20060429150509.82051.qmail@web31509.mail.mud.yahoo.com>
References: <20060429150509.82051.qmail@web31509.mail.mud.yahoo.com>
Message-ID: <4453E0B3.9040409@maubp.freeserve.co.uk>

Kyle Dent wrote:
> Dear All,
>
> My script was successfully implementing the Genbank
> parser until just today I was trying to get it to
> parse a genpept file. After much experimentation I
> discovered that it was actually having trouble parsing
> even newly downloaded GenBank files as well
> (downloaded of NCBI).
>
> I wanted to ask if anyone is aware of this problem, I
> understand the flat file format was updated this month
> and is probably the cause of this.

I'm aware that earlier in 2006 a new project line was added. I haven't been aware of any further changes... on the other hand, I don't think I've ever used a "genpept" file either.

Anyway, from the error message you are using the "old" Martel based parser shipped with BioPython 1.41. We recommend you update to the current CVS parser which is (a) more up to date, (b) faster, and (c) should give slightly more helpful error messages if it does get stuck.

For most cases you can simply download this file, replacing your Bio/GenBank/__init__.py after making a backup of the old version:

http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Bio/GenBank/__init__.py?cvsroot=biopython

If you see errors about ReseekFile then you will need to make a few other changes...

If you are still having trouble, or need further help making the update, please reply back. Including the GenBank reference of any problem file would be handy.
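For an error of the form "error parsing at or beyond character 136", it can help to look at the text on either side of that offset before reporting the problem. The following is a rough sketch using only the standard library. It assumes the offset counts characters from the start of the text handed to the parser, which may not hold for a file with several records; the helper name and the file name "problem.gb" are hypothetical.

```python
def context_at(text, offset, width=40):
    """Return the text just before and just after a character offset."""
    start = max(0, offset - width)
    return text[start:offset], text[offset:offset + width]


# Hypothetical usage on the file that failed to parse:
#
#     handle = open("problem.gb")
#     before, after = context_at(handle.read(), 136)
#     handle.close()
#
# A made-up record fragment is used here so the sketch runs on its own.
sample = "LOCUS       AB000001   120 bp\nDEFINITION  made-up record\n"
before, after = context_at(sample, 30, width=10)
print(repr(before))
print(repr(after))
```

Printing with `repr` makes line breaks and stray whitespace visible, which is often where a format change shows up.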
Thank you Peter From srini_iyyer_bio at yahoo.com Sat Apr 1 18:13:16 2006 From: srini_iyyer_bio at yahoo.com (Srinivas Iyyer) Date: Sat, 1 Apr 2006 10:13:16 -0800 (PST) Subject: [BioPython] How can I retreive FASTA sequences from NCBI In-Reply-To: <442BFFAD.10103@maubp.freeserve.co.uk> Message-ID: <20060401181316.62812.qmail@web38113.mail.mud.yahoo.com> Hi , I have 151,204 GenBank Accession IDs. I want to retreive FASTA sequences from NCBI and compile them for my local blast. I am unable to get fasta sequences. I do not understand. Could any one please help me. my code: >>> mylis ['AA035383', 'AA971406', 'N98563'] parser = Fasta.RecordParser() iterator = Fasta.Iterator(mylis,parser) rec = iterator.next() rec = iterator.next() >>> rec >>> rec is empty :-( Accession IDs are not GIs. They are GenBank accession Ids. I do not want sequences in GenBank (long format). I want them in FASTA sequence format. Could any one pleast help me. Thanks Srini __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com From biopython at maubp.freeserve.co.uk Sat Apr 1 19:59:46 2006 From: biopython at maubp.freeserve.co.uk (Peter (BioPython)) Date: Sat, 01 Apr 2006 20:59:46 +0100 Subject: [BioPython] How can I retreive FASTA sequences from NCBI In-Reply-To: <20060401181316.62812.qmail@web38113.mail.mud.yahoo.com> References: <20060401181316.62812.qmail@web38113.mail.mud.yahoo.com> Message-ID: <442EDBB2.3040105@maubp.freeserve.co.uk> Srinivas Iyyer wrote: > Hi , > I have 151,204 GenBank Accession IDs. > I want to retreive FASTA sequences from NCBI and > compile them for my local blast. > > I am unable to get fasta sequences. I do not > understand. > > Could any one please help me. This should help. Using the first identifier in your example, AA035383, this is a nucleotide sequence, available from the NCBI. 
By searching the Entrez database you end up here:- http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=1507107 Note, AA035383 --> gi:1507107 Using the web interface, you can choose to view it as FASTA format rather than the default of GenBank format, and save to file. You could make a note of that URL, and just change the GI number to download all the files you want - but you need a simple way to determine the GI number... Now, BioPython can help you here: >>> from Bio import GenBank >>> gi_list = GenBank.search_for('AA035383', database='nucleotide') >>> print gi_list ['1507107'] You could use this code to get the GI numbers for each of your 151,204 GenBank Accession IDs. I would check in each case that only one GI number is returned. >>> assert len(gi_list)==1 >>> gi_number = gi_list[0] Once you have the GI number, then you could just download the FASTA file yourself and then parse it in the normal way. Or, get BioPython to do all this for you with its rather clever NCBIDictionary object... >>> from Bio import Fasta >>> from Bio import GenBank >>> ncbi_dict = GenBank.NCBIDictionary('nucleotide', 'fasta', \ ... parser = Fasta.RecordParser()) >>> gi_number = '1507107' >>> fasta_rec = ncbi_dict[gi_number] >>> print fasta_rec >gi|1507107|gb|AA035383.1|AA035383 zk25e12.r1 Soares_pregnant_uterus_NbHPU Homo sapiens cDNA clone IMAGE:471598 5', mRNA sequence CTTGAGCCTCAGGAACGAGATGGCGGTTCTCTGGAGGCTGAGTGCCGTTTGCGGTGCCCT AGGAGGCCGAGCTCTGTTGCTTCGAACTCCAGTGGTCAGACCTGCTCATATCTCAGCATT TCTTCAGGACCGACCTATCCCAGAATGGTGTGGAGTGCAGCACATACACTTGTCACCCGA GCCACCATTCTGGCTCCAAGGCTGCATCTCTCCACTGGACTAGCGAGANGGTTGTCANTG TTTTGCTCCTGGGTCTGCTTCCCGGCTGCTTANTTGAANCCTTGCTCNGCGANGGACTAN TCCCTGGC You could use the Fasta.SequenceParser() if you prefer. I would guess you would then want to save these FASTA records into one long FASTA file. Enjoy! 
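
A minimal sketch of that last step, assuming each downloaded record has already been reduced to a plain (title, sequence) pair. format_fasta and write_fasta are hypothetical helpers, not Biopython functions, the sample records are invented, and the 60-column wrap simply mirrors the record printed above. The code targets a current Python 3 interpreter rather than the Python 2 prompt used in the thread.

```python
import io


def format_fasta(title, sequence, width=60):
    """Return one FASTA record: a '>' title line plus the sequence wrapped at `width` columns."""
    lines = [">" + title]
    for start in range(0, len(sequence), width):
        lines.append(sequence[start:start + width])
    return "\n".join(lines) + "\n"


def write_fasta(records, handle, width=60):
    """Write every (title, sequence) pair to an open handle; return how many were written."""
    count = 0
    for title, sequence in records:
        handle.write(format_fasta(title, sequence, width))
        count += 1
    return count


# Invented records standing in for what was fetched from NCBI.
example_records = [
    ("example1 first made-up sequence", "ACGT" * 20),
    ("example2 second made-up sequence", "TTGACA"),
]

# An in-memory buffer keeps the demo self-contained; a real run would
# pass an opened output file (for example "all_sequences.fasta") instead.
buffer = io.StringIO()
written = write_fasta(example_records, buffer)
```

Writing each record as soon as it is retrieved, rather than collecting all 151,204 in memory first, keeps the run restartable if a download fails partway through.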
Peter

From halima at mancala.cbio.uct.ac.za Sun Apr 2 13:33:11 2006
From: halima at mancala.cbio.uct.ac.za (Halima Rabiu)
Date: Sun, 2 Apr 2006 15:33:11 +0200 (SAST)
Subject: [BioPython] Need help on NCBIStandaloneblast
In-Reply-To: <442BFFAD.10103@maubp.freeserve.co.uk>
References: <442BFFAD.10103@maubp.freeserve.co.uk>
Message-ID:

Thanks Peter,
I have been able to trace the error. When I print error_info.read(), the error is with my infile.
There is a result in my save file now, but I am still having a problem parsing the output file. I will try to figure it out; it may be a syntax problem.
Thanks

On Thu, 30 Mar 2006, Peter (BioPython List) wrote:

> Halima Rabiu wrote:
> > Hi everyboby ;
> > I am new to biopython having problems with the "NCBIStandalone.blastall".
> > After launching the Blast with "doBlast" it look like runs and end
> > and then I check the output it empty and I try same thing using comand
> > line it work and get result.
> > I attch my code.
>
> Have you checked the paths are correct, e.g.
>
> assert os.path.isfile(data), "Missing database file " + data
> assert os.path.isfile(infile), "Missing input file " + infile
>
> You don't need to check blast_exe yourself, as the blastall command does this
> for you.
>
> If I understood you correctly, the "blast.out" file is empty.
>
> Did blast return any error message? Try:
>
> print error_info.read()
>
> or:
>
> save_file =open("blast.error","w")
> blast_result=error_info.read()
> save_file.write(blast_result)
> save_file.close()
>
> Next question, could you tell us what you typed at the command line which does
> work?
>
> > I also try to go though the previous posts on biopython mailing list fund
> > similar problem post by Andreas but no solution to the problem .
>
> It was worth checking anyway :)
>
> Peter
>
>

From as_nascimento at yahoo.com.br Wed Apr 5 20:35:35 2006
From: as_nascimento at yahoo.com.br (Alessandro S. Nascimento)
Date: Wed, 05 Apr 2006 17:35:35 -0300
Subject: [BioPython] problems when parsing blast output
In-Reply-To: <43CCD436.7020704@maubp.freeserve.co.uk>
References: <43CC485E.7050702@yahoo.com.br> <43CCC6D4.4020307@maubp.freeserve.co.uk> <43CCCF56.40803@yahoo.com.br> <43CCD436.7020704@maubp.freeserve.co.uk>
Message-ID: <44342A17.4070404@yahoo.com.br>

Hi Peter

I had some trouble parsing results from a blastpgp output file. My initial script used to work but isn't working this time. My blast output file is very, very large. When I run the script, I can see my processor working at 99% for some minutes, then it returns to the prompt with no results or information. Any idea of what may be happening?

Thanks in advance,

Alessandro

#!/usr/bin/python

import os
from Bio.Blast import NCBIStandalone
from string import *

blast_out = open('blast.output', 'r')
b_parser = NCBIStandalone.PSIBlastParser()
b_record = b_parser.parse(blast_out)
n=0
for round in b_record.rounds:
    for alignment in round.alignments:
        for hsp in alignment.hsps:
            if hsp.identities < 90:
                if hsp.identities > 30:
                    if alignment.length > 200:
                        print "Retrieving sequence query"
                        os.system ("fastacmd -d ..//db/nr -s \'%s\' > test.bl2.%d" % (query, n, ))
                        n=n+1
blast_out.close()

From halima at mancala.cbio.uct.ac.za Thu Apr 13 15:07:52 2006
From: halima at mancala.cbio.uct.ac.za (Halima Rabiu)
Date: Thu, 13 Apr 2006 17:07:52 +0200 (SAST)
Subject: [BioPython] Need help parsing Blastoutput
Message-ID:

Hi All,
I have BLAST output from a local BLAST run, and I need to calculate the % alignment coverage with respect to my subject.
I tried parsing the BLAST output and wanted to print the Sbjct start and Sbjct end, but I could not. Is there any way I could do this?
I am trying to get the match coverage between my query and subject. I don't need Identities, but the total % alignment for query or subject.
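
As a rough illustration of the coverage being asked about (a sketch, not code from the thread): for a single HSP, coverage can be taken as the span between the start and end coordinates divided by the full length of the sequence. coverage_percent and end_from_alignment are hypothetical helper names. The second helper assumes the parser exposes only the start coordinate plus the aligned string with '-' for gaps, and that each aligned character is one residue, so it would not suit translated searches. The coordinates are read off the LISMO hit in the BLAST report posted later in this thread.

```python
def coverage_percent(start, end, total_length):
    """Percent of a sequence spanned by one HSP, given 1-based inclusive coordinates."""
    if total_length <= 0:
        raise ValueError("total_length must be positive")
    low, high = sorted((start, end))  # tolerate end < start (reverse-strand hits)
    return 100.0 * (high - low + 1) / total_length


def end_from_alignment(start, aligned_sequence):
    """End coordinate of an aligned segment: start plus residue count, ignoring '-' gaps."""
    residues = len(aligned_sequence) - aligned_sequence.count("-")
    return start + residues - 1


# Query: 8..204 of 229 letters; Sbjct: 6..201 of Length = 207 (LISMO hit).
query_coverage = coverage_percent(8, 204, 229)
subject_coverage = coverage_percent(6, 201, 207)
```

When a hit has several HSPs, the spans would need to be merged before dividing, otherwise overlapping regions are counted twice; this sketch deliberately handles only the single-HSP case.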
Thanks Halimah From mdehoon at c2b2.columbia.edu Thu Apr 13 15:56:26 2006 From: mdehoon at c2b2.columbia.edu (Michiel De Hoon) Date: Thu, 13 Apr 2006 11:56:26 -0400 Subject: [BioPython] Need help parsing Blastoutput Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEE9@cgcmail.cgc.cpmc.columbia.edu> Could you send us the script you were using? --Michiel. Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032 -----Original Message----- From: biopython-bounces at lists.open-bio.org on behalf of Halima Rabiu Sent: Thu 4/13/2006 11:07 AM To: biopython at lists.open-bio.org Subject: [BioPython] Need help parsing Blastoutput Hi All, I have a BLAST output from a local blast I need to calculate my % alignment coverage as regard to my subject I try parsed the blast output and wanted to print the sbjct Start and Sbjct end. but I could not is there anyway I could this try to get mach coverage between my querry and subject I dont need Identities,but total % alignment for querry or subject. Thanks Halimah _______________________________________________ BioPython mailing list - BioPython at lists.open-bio.org http://lists.open-bio.org/mailman/listinfo/biopython From rafael at nbn.ac.za Fri Apr 14 09:52:42 2006 From: rafael at nbn.ac.za (Rafael C. Jimenez) Date: Fri, 14 Apr 2006 11:52:42 +0200 Subject: [BioPython] Need help parsing Blastoutput In-Reply-To: References: Message-ID: <9ad32945680e91a485c1e0cdb1ca4eb7@nbn.ac.za> On 13 Apr 2006, at 5:07 PM, Halima Rabiu wrote: > Hi All, > I have a BLAST output from a local blast Well, I would say that you can use three alternatives to run blast, and somehow you can use all of them locally. - Blast web server (Through Blastcl3 or through biopython) - Blast standalone - wwwblast I guess that when you say local blast you want to say you are using blast standalone to use your own local databases. 
It makes a difference which of these three you use, because you will need a different module to parse the output:
- Bio.Blast.NCBIStandalone for Blast standalone outputs
- Bio.Blast.NCBIWWW for Blast web server outputs
- No parser for the wwwblast

> I need to calculate my % alignment coverage as regard to my subject

I am not sure what you mean, but I would say that this % is provided by the "Identities" field in nucleotide and protein comparisons for each alignment, and also by the "Positives" field in protein comparisons.
Example: Identities = 11/26 (42%), Positives = 15/26 (57%)

> I try parsed the blast output and wanted to print the
> sbjct Start and Sbjct end. but I could not is there anyway I could this

# Open your Blast Output file
blastOutput = open("The name of your blast output", 'r')

Then parse it, either as Blast standalone output:

from Bio.Blast import NCBIStandalone
parser = NCBIStandalone.BlastParser()
blastRecord = parser.parse(blastOutput)

.... or as NCBI web server output:

from Bio.Blast import NCBIWWW
parser = NCBIWWW.BlastParser()
blastRecord = parser.parse(blastOutput)

Now you can start to recover information using the Bio.Blast.Record module:

import Bio.Blast.Record

# ... for instance you can retrieve the Blast version you used when you got your output ...
print 'header.version:',blastRecord.version
for alignment in blastRecord.alignments:
    # ... or the length of the alignment ...
    print 'alignment.length:', alignment.length
    for hsp in alignment.hsps:
        # ... or the sbjct Start as you want ...
        print 'hsp.sbjct_start:', hsp.sbjct_start

>
> try to get mach coverage between my querry and subject I dont need
> Identities,but total % alignment for querry or subject.
> Thanks
> Halimah

I am working in the NBN central node in UWC, not far away from UCT. Don't hesitate to visit us if you want help or advice.

Cheers,

Rafael

Rafael C.
Jimenez ----------------------------------------------------------- National Bioinformatics Network University of the Western Cape Private Bag X17 Bellville 7530 South Africa Tel: +27219592991 rafael at nbn.ac.za www.nbn.ac.za ----------------------------------------------------------- Proteomics Services Group European Bioinformatics Institute (EMBL-EBI) Wellcome Trust Genome Campus, Hinxton Cambridge - CB10 1SD - UK Tel: +441223492610 rafael at ebi.ac.uk www.ebi.ac.uk ----------------------------------------------------------- On 13 Apr 2006, at 5:07 PM, Halima Rabiu wrote: > Hi All, > I have a BLAST output from a local blast > I need to calculate my % alignment coverage as regard to my subject > I try parsed the blast output and wanted to print the > sbjct Start and Sbjct end. but I could not is there anyway I could this > try to get mach coverage between my querry and subject I dont need > Identities,but total % alignment for querry or subject. > Thanks > Halimah > > _______________________________________________ > BioPython mailing list - BioPython at lists.open-bio.org > http://lists.open-bio.org/mailman/listinfo/biopython From halima at mancala.cbio.uct.ac.za Tue Apr 18 15:06:02 2006 From: halima at mancala.cbio.uct.ac.za (Halima Rabiu) Date: Tue, 18 Apr 2006 17:06:02 +0200 (SAST) Subject: [BioPython] Need help parsing Blastoutput In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEE9@cgcmail.cgc.cpmc.columbia.edu> References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEE9@cgcmail.cgc.cpmc.columbia.edu> Message-ID: thanks please see the attchment a copy of my script and copy of my Blast output Thanks On Thu, 13 Apr 2006, Michiel De Hoon wrote: > Could you send us the script you were using? > > --Michiel. 
>
> Michiel de Hoon
> Center for Computational Biology and Bioinformatics
> Columbia University
> 1150 St Nicholas Avenue
> New York, NY 10032
>
>
>
> -----Original Message-----
> From: biopython-bounces at lists.open-bio.org on behalf of Halima Rabiu
> Sent: Thu 4/13/2006 11:07 AM
> To: biopython at lists.open-bio.org
> Subject: [BioPython] Need help parsing Blastoutput
>
> Hi All,
> I have a BLAST output from a local blast
> I need to calculate my % alignment coverage as regard to my subject
> I try parsed the blast output and wanted to print the
> sbjct Start and Sbjct end. but I could not is there anyway I could this
> try to get mach coverage between my querry and subject I dont need
> Identities,but total % alignment for querry or subject.
> Thanks
> Halimah
>
> _______________________________________________
> BioPython mailing list - BioPython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython
>
>
-------------- next part --------------
#! /usr/local/bin/python2.4
#halimah
#16-04-2006
from string import split
from Bio.Blast import NCBIStandalone
b_out = open('Enterococcus_out','r')
b_parser = NCBIStandalone.BlastParser()
b_iterator = NCBIStandalone.Iterator(b_out,b_parser)
E_VALUE_THRESH = 1.0
while 1:
    b_record = b_iterator.next()
    print "The following results are for query " + b_record.query
    print 'len of query:',b_record.query_letters
    if b_record is None:
        break
    for alignment in b_record.alignments:
        for hsp in alignment.hsps:
            if hsp.expect <= E_VALUE_THRESH:
                print '****Alignment****'
                print 'title:', alignment.title
                print 'length:', alignment.length
                print 'e value:', hsp.expect
                print 'subjectstart:',hsp.sbjct_start
                print 'subject end:', hsp.sbject_end

From mdehoon at c2b2.columbia.edu Tue Apr 18 16:40:05 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Tue, 18 Apr 2006 12:40:05 -0400
Subject: [BioPython] Need help parsing Blastoutput
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEF5@cgcmail.cgc.cpmc.columbia.edu>

Could you also send us the file Enterococcus_out so we can run the script?

From the script, it looks like you're trying to parse text output from Blast.
While this is possible (in theory), the format of Blast text output tends to
change a lot, thereby breaking the parser in Biopython. It is more reliable
to have Blast generate output in XML format, and use the XML parser:

blast_out = open('my_blast.xml', 'r')

from Bio.Blast import NCBIXML

b_parser = NCBIXML.BlastParser()
b_record = b_parser.parse(blast_out)

See section 3.1.2 in the Biopython cookbook, and section 3.1.4 on how to
generate Blast output in XML.

--Michiel.

Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032

-----Original Message-----
From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
Sent: Tue 4/18/2006 11:06 AM
To: Michiel De Hoon
Cc: biopython at lists.open-bio.org
Subject: RE: [BioPython] Need help parsing Blastoutput

thanks
please see the attchment a copy of my script and copy of my Blast output
Thanks

On Thu, 13 Apr 2006, Michiel De Hoon wrote:

> Could you send us the script you were using?
>
> --Michiel.
>
> Michiel de Hoon
> Center for Computational Biology and Bioinformatics
> Columbia University
> 1150 St Nicholas Avenue
> New York, NY 10032
>
>
>
> -----Original Message-----
> From: biopython-bounces at lists.open-bio.org on behalf of Halima Rabiu
> Sent: Thu 4/13/2006 11:07 AM
> To: biopython at lists.open-bio.org
> Subject: [BioPython] Need help parsing Blastoutput
>
> Hi All,
> I have a BLAST output from a local blast
> I need to calculate my % alignment coverage as regard to my subject
> I try parsed the blast output and wanted to print the
> sbjct Start and Sbjct end. but I could not is there anyway I could this
> try to get mach coverage between my querry and subject I dont need
> Identities,but total % alignment for querry or subject.
> Thanks
> Halimah
>
> _______________________________________________
> BioPython mailing list - BioPython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython
>
>

From halima at mancala.cbio.uct.ac.za Wed Apr 19 10:15:15 2006
From: halima at mancala.cbio.uct.ac.za (Halima Rabiu)
Date: Wed, 19 Apr 2006 12:15:15 +0200 (SAST)
Subject: [BioPython] Need help parsing Blastoutput
In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEF5@cgcmail.cgc.cpmc.columbia.edu>
References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEF5@cgcmail.cgc.cpmc.columbia.edu>
Message-ID:

Hi
Please see the attachment; it is part of my Blast output.
Yes, I am trying to parse text output from Blast. I used another script to run my local blast, and its output is what I am trying to parse.
The NCBIStandalone.BlastParser was working fine without hsp.sbject_end, which is one of the things I need to print out.
On checking the class diagrams in the cookbook, I found that sbject_end is not included. I just need another way of printing the int(subject end).
Thanks for your help
Halimah

On Tue, 18 Apr 2006, Michiel De Hoon wrote:

> Could you also send us the file Enterococcus_out so we can run the script?
>
> From the script, it looks like you're trying to parse text output from Blast.
> While this is possible (in theory), the format of Blast text output tends to
> change a lot, thereby breaking the parser in Biopython. It is more reliable
> to have Blast generate output in XML format, and use the XML parser:
>
> blast_out = open('my_blast.xml', 'r')
>
> from Bio.Blast import NCBIXML
>
> b_parser = NCBIXML.BlastParser()
> b_record = b_parser.parse(blast_out)
>
> See section 3.1.2 in the Biopython cookbook, and section 3.1.4 on how to
> generate Blast output in XML.
>
> --Michiel.
> > > > Michiel de Hoon > Center for Computational Biology and Bioinformatics > Columbia University > 1150 St Nicholas Avenue > New York, NY 10032 > > > > -----Original Message----- > From: Halima Rabiu [mailto:halima at cbio.uct.ac.za] > Sent: Tue 4/18/2006 11:06 AM > To: Michiel De Hoon > Cc: biopython at lists.open-bio.org > Subject: RE: [BioPython] Need help parsing Blastoutput > > thanks > please see the attchment a copy of my script and copy of my Blast output > Thanks > > > On Thu, 13 Apr 2006, Michiel De Hoon wrote: > > > Could you send us the script you were using? > > > > --Michiel. > > > > Michiel de Hoon > > Center for Computational Biology and Bioinformatics > > Columbia University > > 1150 St Nicholas Avenue > > New York, NY 10032 > > > > > > > > -----Original Message----- > > From: biopython-bounces at lists.open-bio.org on behalf of Halima Rabiu > > Sent: Thu 4/13/2006 11:07 AM > > To: biopython at lists.open-bio.org > > Subject: [BioPython] Need help parsing Blastoutput > > > > Hi All, > > I have a BLAST output from a local blast > > I need to calculate my % alignment coverage as regard to my subject > > I try parsed the blast output and wanted to print the > > sbjct Start and Sbjct end. but I could not is there anyway I could this > > try to get mach coverage between my querry and subject I dont need > > Identities,but total % alignment for querry or subject. > > Thanks > > Halimah > > > > _______________________________________________ > > BioPython mailing list - BioPython at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/biopython > > > > > > -------------- next part -------------- BLASTP 2.2.10 [Oct-19-2004] Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402. 
Query= ENTFA 3MGH_ENTFA (Q833H5) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) (229 letters) Database: Blastdata.fdb 240,170 sequences; 77,468,597 total letters Searching..................................................done Score E Sequences producing significant alignments: (bits) Value ENTFA 3MGH_ENTFA (Q833H5) Putative 3-methyladenine DNA glycosyla... 462 e-130 LISMO 3MGH_LISMO (P58621) Putative 3-methyladenine DNA glycosyla... 194 2e-49 STAAM 3MGH_STAAM (P65414) Putative 3-methyladenine DNA glycosyla... 187 3e-47 STAES 3MGH_STAES (Q8CRC1) Putative 3-methyladenine DNA glycosyla... 186 5e-47 LISIN 3MGH_LISIN (Q92D89) Putative 3-methyladenine DNA glycosyla... 185 8e-47 LACPL 3MGH_LACPL (Q88VP8) Putative 3-methyladenine DNA glycosyla... 178 1e-44 BACSU 3MGH_BACSU (P94378) Putative 3-methyladenine DNA glycosyla... 160 3e-39 LACJO Q74LU5 (Q74LU5) 3-methyladenine DNA glycosylase 155 7e-38 OCEIH 3MGH_OCEIH (Q8ETG4) Putative 3-methyladenine DNA glycosyla... 147 2e-35 BACAN 3MGH_BACAN (Q81UJ9) Putative 3-methyladenine DNA glycosyla... 130 4e-30 BACC1 Q73CV5 (Q73CV5) Methylpurine-DNA glycosylase family protein 125 8e-29 CLOTE 3MGH_CLOTE (Q896H4) Putative 3-methyladenine DNA glycosyla... 124 3e-28 CHLMU Q9PJE9 (Q9PJE9) Succinate dehydrogenase, iron-sulfur protein 113 4e-25 CHLCV 3MGH_CHLCV (Q824B4) Putative 3-methyladenine DNA glycosyla... 111 2e-24 CLOAB 3MGH_CLOAB (Q97EY6) Putative 3-methyladenine DNA glycosyla... 108 1e-23 CLOPE 3MGH_CLOPE (Q8XHA9) Putative 3-methyladenine DNA glycosyla... 107 4e-23 STRMU Q8DRU4 (Q8DRU4) Putative 3-methyladenine DNA glycosylase 103 3e-22 DEIRA 3MGH_DEIRA (Q9RSQ0) Putative 3-methyladenine DNA glycosyla... 86 9e-17 CORGL 3MGH_CORGL (Q8NU33) Putative 3-methyladenine DNA glycosyla... 82 1e-15 STRCO 3MGH_STRCO (Q9S208) Putative 3-methyladenine DNA glycosyla... 80 4e-15 BRAJA 3MGH_BRAJA (Q89LR7) Putative 3-methyladenine DNA glycosyla... 79 1e-14 STRAW 3MGH_STRAW (Q829C5) Putative 3-methyladenine DNA glycosyla... 
73 8e-13 COREF 3MGH_COREF (Q8FUA2) Putative 3-methyladenine DNA glycosyla... 69 9e-12 PROAC Q6A5L3 (Q6A5L3) Putative 3-methylpurine DNA glycosylase 66 9e-11 MYCPA Q740F6 (Q740F6) Hypothetical protein 64 3e-10 MYCLEH 3MGH_MYCTU (P65412) Putative 3-methyladenine DNA glycosyl... 64 5e-10 MYCTU 3MGH_MYCTU (P65412) Putative 3-methyladenine DNA glycosyla... 64 5e-10 MYCBO 3MGH_MYCBO (P65413) Putative 3-methyladenine DNA glycosyla... 64 5e-10 MYCLE 3MGH_MYCLE (O05678) Putative 3-methyladenine DNA glycosyla... 60 5e-09 RICCN 3MGH_RICCN (Q92IE0) Putative 3-methyladenine DNA glycosyla... 52 2e-06 RICPR 3MGH_RICPR (Q9ZDH7) Putative 3-methyladenine DNA glycosyla... 49 1e-05 PSEAE 3MGH_PSEAE (Q9HX17) Putative 3-methyladenine DNA glycosyla... 45 2e-04 PSEPK Q88DL3 (Q88DL3) DNA-3-methyladenine glycosylase, putative 42 0.002 BORPE Q7VWB7 (Q7VWB7) Putative methypurine-DNA glycosylase 40 0.004 BORPA Q7W9Q9 (Q7W9Q9) Putative methypurine-DNA glycosylase 40 0.004 STRAW Q93HJ1 (Q93HJ1) Modular polyketide synthase 35 0.14 STRAW Q93HJ2 (Q93HJ2) Modular polyketide synthase 33 0.68 SILPO Q5LKE3 (Q5LKE3) Peptidyl-prolyl cis-trans isomerase, FKBP-... 32 1.5 SILPO Q5LVX1 (Q5LVX1) ABC transporter, transmembrane ATP-binding... 30 4.4 CLOTE DXS_CLOTE (Q894H0) 1-deoxy-D-xylulose-5-phosphate synthase... 
30 5.8 BURMA Q9AI54 (Q9AI54) DedA family protein 30 7.5 STAES Q8CST0 (Q8CST0) Hypothetical protein SE0952 29 9.8 SILPO Q5LMI5 (Q5LMI5) ADA regulatory protein 29 9.8 >ENTFA 3MGH_ENTFA (Q833H5) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 229 Score = 462 bits (1190), Expect = e-130 Identities = 229/229 (100%), Positives = 229/229 (100%) Query: 1 MVKEMKETINIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFG 60 MVKEMKETINIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFG Sbjct: 1 MVKEMKETINIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFG 60 Query: 61 LRKTPRLQAMYDKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQ 120 LRKTPRLQAMYDKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQ Sbjct: 61 LRKTPRLQAMYDKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQ 120 Query: 121 GRQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGR 180 GRQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGR Sbjct: 121 GRQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGR 180 Query: 181 WTELPLRYVVAGNPYISKQKRTAVDQIDFGWKDEENEKSNNAHILRGTT 229 WTELPLRYVVAGNPYISKQKRTAVDQIDFGWKDEENEKSNNAHILRGTT Sbjct: 181 WTELPLRYVVAGNPYISKQKRTAVDQIDFGWKDEENEKSNNAHILRGTT 229 >LISMO 3MGH_LISMO (P58621) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 207 Score = 194 bits (492), Expect = 2e-49 Identities = 99/198 (50%), Positives = 134/198 (67%), Gaps = 3/198 (1%) Query: 8 TINIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRL 67 T F +KTT E+A+ +LGM L H+T G+L G IV+ EAYLG D AAHSF +T R Sbjct: 6 TKEFFESKTTIELARDILGMRLVHQTNEGLLSGLIVETEAYLGATDMAAHSFQNLRTKRT 65 Query: 68 QAMYDKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVE-GVDKMIENRQGRQGVE 126 + M+ PGTIY+Y MH ++LN +T +G P+ ++IRAIEP E +M +NR G+ G E Sbjct: 66 EVMFSSPGTIYMYQMHRQVLLNFITMPKGIPEAILIRAIEPDEQAKQQMTQNRHGKTGYE 125 Query: 127 LTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPL 186 LTNGPGKL ALG+ Q YG+++F S++ L E+ K P IEA RIG+PNKG T PL Sbjct: 126 LTNGPGKLTQALGLSMQDYGKTLFDSNIWL--EEAKLPHLIEATNRIGVPNKGIATHYPL 183 Query: 187 
RYVVAGNPYISKQKRTAV 204 R+ V G+PYIS Q++ ++ Sbjct: 184 RFTVKGSPYISGQRKNSI 201 >STAAM 3MGH_STAAM (P65414) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 202 Score = 187 bits (474), Expect = 3e-47 Identities = 91/201 (45%), Positives = 132/201 (65%), Gaps = 1/201 (0%) Query: 12 FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71 F + T + A+ LLG+ + ++ GYIV+ EAYLG D+AAH FG + TP++ ++Y Sbjct: 6 FINQQTTQTAKALLGVKIIYQDDYQTYTGYIVETEAYLGIQDKAAHGFGGKITPKVTSLY 65 Query: 72 DKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGP 131 K GTIY + MHTHL++N VT+ +G P+GV+IRAIEP EG+ M NR G+ G ELTNGP Sbjct: 66 KKGGTIYAHVMHTHLLINFVTRTEGIPEGVLIRAIEPDEGIGAMNVNR-GKSGYELTNGP 124 Query: 132 GKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVA 191 GK A I + + G ++ L + RK+PK I RIGIPNKG WT PLR+ V Sbjct: 125 GKWTKAFNIPRSIDGSTLNDCKLSIDTNHRKYPKTIIESGRIGIPNKGEWTNKPLRFTVK 184 Query: 192 GNPYISKQKRTAVDQIDFGWK 212 GNPY+S+ +++ D WK Sbjct: 185 GNPYVSRMRKSDFQNPDDTWK 205 >LISIN 3MGH_LISIN (Q92D89) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 207 Score = 185 bits (470), Expect = 8e-47 Identities = 96/200 (48%), Positives = 130/200 (65%), Gaps = 3/200 (1%) Query: 6 KETINIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTP 65 K T F +TT E+A+ ++GM L HE L GYIV+ EAYLG D AAHSF +T Sbjct: 4 KITPTFFENRTTIELARDIIGMRLVHEIGNYTLSGYIVETEAYLGATDMAAHSFKNLRTK 63 Query: 66 RLQAMYDKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPV-EGVDKMIENRQGRQG 124 R + M+ PGTIY Y MH ++LN +T +G P+ V+IRA+EP E +++M +NR + G Sbjct: 64 RTEVMFGTPGTIYTYQMHQQVLLNFITMREGIPEAVLIRALEPTKESIEQMEQNRFLKTG 123 Query: 125 VELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTEL 184 ELTNGPGKL ALG+ Q YG+++F S++ L E+ K P IEA RIG+PNKG T Sbjct: 124 FELTNGPGKLTQALGLSMQDYGKTLFDSNIWL--ERAKVPHIIEATNRIGVPNKGIATHY 181 Query: 185 PLRYVVAGNPYISKQKRTAV 204 PLR+ G+PYIS Q++ + Sbjct: 182 PLRFTAKGSPYISAQRKRQI 201 >LACPL 3MGH_LACPL (Q88VP8) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 209 Score = 178 bits (451), 
Expect = 1e-44 Identities = 93/199 (46%), Positives = 127/199 (63%), Gaps = 1/199 (0%) Query: 13 NTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYD 72 +T TT E+A LLG L +T++GVL +I + EAYLG D AH++ +TPR A++ Sbjct: 9 STCTTPEIAVSLLGKQLRLQTSSGVLTAWITETEAYLGARDAGAHAYQNHQTPRNHALWQ 68 Query: 73 KPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPG 132 GTIY+Y M +LN+VTQ G P+ V+IR IEP G+++M + R LTNGPG Sbjct: 69 SAGTIYIYQMRAWCLLNIVTQAAGTPECVLIRGIEPDAGLERMQQQRP-VPIANLTNGPG 127 Query: 133 KLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAG 192 KL+ ALG+DK L GQ++ ++L L + P+++ A PRIGI NKG WT PLRY VAG Sbjct: 128 KLMQALGLDKTLNGQALQPATLSLDLSHYRQPEQVVATPRIGIVNKGEWTTAPLRYFVAG 187 Query: 193 NPYISKQKRTAVDQIDFGW 211 NP++SK R +D GW Sbjct: 188 NPFVSKISRRTIDHEHHGW 206 >BACSU 3MGH_BACSU (P94378) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 196 Score = 160 bits (405), Expect = 3e-39 Identities = 91/198 (45%), Positives = 112/198 (56%), Gaps = 2/198 (1%) Query: 1 MVKEMKETINIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFG 60 M +E F KT E+A LLG L ET G GYIV+ EAY+G D AAHSF Sbjct: 1 MTREKNPLPITFYQKTALELAPSLLGCLLVKETDEGTASGYIVETEAYMGAGDRAAHSFN 60 Query: 61 LRKTPRLQAMYDKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQ 120 R+T R + M+ + G +Y Y MHTH +LN+V E+ PQ V+IRAIEP EG M E R Sbjct: 61 NRRTKRTEIMFAEAGRVYTYVMHTHTLLNVVAAEEDVPQAVLIRAIEPHEGQLLMEERRP 120 Query: 121 GRQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGR 180 GR E TNGPGKL ALG+ YG+ I L + E P+ I PRIGI N G Sbjct: 121 GRSPREWTNGPGKLTKALGVTMNDYGRWITEQPLYI--ESGYTPEAISTGPRIGIDNSGE 178 Query: 181 WTELPLRYVVAGNPYISK 198 + P R+ V GN Y+S+ Sbjct: 179 ARDYPWRFWVTGNRYVSR 196 >LACJO Q74LU5 (Q74LU5) 3-methyladenine DNA glycosylase Length = 208 Score = 155 bits (393), Expect = 7e-38 Identities = 77/192 (40%), Positives = 125/192 (65%), Gaps = 2/192 (1%) Query: 12 FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71 F ++T E+++ LLG L + +L G IV+AEAY+G D AAHS+G R++P + +Y Sbjct: 7 
FTNRSTSEISKDLLGRTLSYNNGEEILSGTIVEAEAYVGVKDRAAHSYGGRRSPANEGLY 66 Query: 72 DKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGP 131 G++Y+Y+ + ++ QE+G+PQGV+IRAI+P+ G+D MI+NR G+ G LTNGP Sbjct: 67 RPGGSLYIYSQRQYFFFDVSCQEEGEPQGVLIRAIDPLTGIDTMIKNRSGKTGPLLTNGP 126 Query: 132 GKLVAALGIDKQLYG-QSIFSSSLRLVPEKRKFPKKIEALPRIGI-PNKGRWTELPLRYV 189 GK++ ALGI + + + S + + ++ ++I ALPR+GI + W + LR++ Sbjct: 127 GKMMQALGITSRKWDLVDLNDSPFDIDIDHKREIEEIVALPRVGINQSDPEWAQKKLRFI 186 Query: 190 VAGNPYISKQKR 201 V+GNPY+S K+ Sbjct: 187 VSGNPYVSDIKK 198 >OCEIH 3MGH_OCEIH (Q8ETG4) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 198 Score = 147 bits (371), Expect = 2e-35 Identities = 74/182 (40%), Positives = 112/182 (61%), Gaps = 2/182 (1%) Query: 17 TEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGT 76 T E+A+ LLG L +T G G IV+ EAYLG D AAH +G R+T R + +Y KPG Sbjct: 19 TLELAKNLLGCILVKQTEEGTSSGVIVETEAYLGNTDRAAHGYGNRRTKRTEILYSKPGY 78 Query: 77 IYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVA 136 Y++ +H H ++N+V+ +G P+ V+IRA+EP G+D+M+ R ++ LT+GPGKL Sbjct: 79 AYVHLIHNHRLINVVSSMEGDPESVLIRAVEPFSGIDEMLMRRPVKKFQNLTSGPGKLTQ 138 Query: 137 ALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAGNPYI 196 A+GI + YG + + L + + K P ++ RIGI N G + P R+ V GNP++ Sbjct: 139 AMGIYMEDYGHFMLAPPLFI--SEGKSPASVKTGSRIGIDNTGEAKDYPYRFWVDGNPFV 196 Query: 197 SK 198 S+ Sbjct: 197 SR 198 >BACAN 3MGH_BACAN (Q81UJ9) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 205 Score = 130 bits (326), Expect = 4e-30 Identities = 80/194 (41%), Positives = 112/194 (57%), Gaps = 11/194 (5%) Query: 17 TEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGT 76 T EVA+ LLG L H G IV+ EAY GPDD+AAHS+G R+T R + M+ PG Sbjct: 12 TLEVAKKLLGQKLVHIVNGIKRSGIIVEVEAYKGPDDKAAHSYGGRRTDRTEVMFGAPGH 71 Query: 77 IYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGV------ELTN 129 Y+Y ++ + N++T G PQGV+IRA+EPV+G++++ R + + LTN Sbjct: 72 AYVYLIYGMYHCFNVITAPVGTPQGVLIRALEPVDGIEEIKLARYNKTDITKAQYKNLTN 131 Query: 130 
GPGKLVAALGIDKQLYGQSIFSSSL--RLVPEKRKFPK--KIEALPRIGIPNKGRWTELP 185 GPGKL ALGI + G S+ S +L LVPE++ KI A PRI I P Sbjct: 132 GPGKLCRALGITLEERGVSLQSDTLHIELVPEEKHISSQYKITAGPRINIDYAEEAVHYP 191 Query: 186 LRYVVAGNPYISKQ 199 R+ G+P++SK+ Sbjct: 192 WRFYYEGHPFVSKK 205 >BACC1 Q73CV5 (Q73CV5) Methylpurine-DNA glycosylase family protein Length = 205 Score = 125 bits (315), Expect = 8e-29 Identities = 79/194 (40%), Positives = 110/194 (56%), Gaps = 11/194 (5%) Query: 17 TEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGT 76 T EVA+ LLG L H G IV+ EAY GPDD+AAHS+G R+T R + M+ PG Sbjct: 12 TLEVAKKLLGQKLVHIVDGIKRSGIIVEVEAYKGPDDKAAHSYGGRRTDRTEVMFGAPGH 71 Query: 77 IYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGV------ELTN 129 Y+Y ++ + N++T G PQGV+IRA+EPV+G++++ R + + LTN Sbjct: 72 AYVYLIYGMYHCFNVITAPVGTPQGVLIRALEPVDGIEEIKLARYNKTDITKAQYKNLTN 131 Query: 130 GPGKLVAALGIDKQLYGQSIFSSSL--RLVPEKRKFPK--KIEALPRIGIPNKGRWTELP 185 GPGKL ALGI + G S+ S +L LV E+ KI A PRI I P Sbjct: 132 GPGKLCRALGITLEERGVSLQSDTLHIELVREEEHISSQYKITAGPRINIDYAEEAVHYP 191 Query: 186 LRYVVAGNPYISKQ 199 R+ G+P++SK+ Sbjct: 192 WRFYYEGHPFVSKK 205 >CLOTE 3MGH_CLOTE (Q896H4) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 203 Score = 124 bits (310), Expect = 3e-28 Identities = 74/197 (37%), Positives = 109/197 (55%), Gaps = 9/197 (4%) Query: 12 FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71 F K+ +VA+YLLG L +E L G IV+ EAY+G D+A+H++G +KT R+ +Y Sbjct: 7 FYEKSALQVAKYLLGKILVNEVEGITLKGKIVETEAYIGAIDKASHAYGGKKTERVMPLY 66 Query: 72 DKPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKM--------IENRQGR 122 KPGT Y+Y ++ + N++T+ +G+ +GV+IRAIEP+EG++KM I Sbjct: 67 GKPGTAYVYLIYGMYHCFNVITKVEGEAEGVLIRAIEPLEGIEKMAYLRYKKPISEISKT 126 Query: 123 QGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWT 182 Q LT GPGKL AL IDK Q + + + K I RIGI Sbjct: 127 QFKNLTTGPGKLCIALNIDKSNNKQDLCNEGTLYIEHNDKEKFNIVESKRIGIEYAEEAK 186 Query: 183 ELPLRYVVAGNPYISKQ 199 + R+ + NP+ISK+ Sbjct: 187 DFLWRFYIEDNPWISKK 203 >CHLMU Q9PJE9 (Q9PJE9) 
Succinate dehydrogenase, iron-sulfur protein Length = 425711 Score = 113 bits (283), Expect = 4e-25 Identities = 72/185 (38%), Positives = 105/185 (56%), Gaps = 5/185 (2%) Query: 10 NIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQA 69 + F ++ +AQ LLG L + GYIV+ EAY GPDD+A H++ RKT R +A Sbjct: 321 HFFLSEDVITLAQQLLGHKLITTHEGLITSGYIVETEAYRGPDDKACHAYNYRKTQRNRA 380 Query: 70 MYDKPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVE-- 126 MY K G+ YLY + H +LN+VT + P V+IRAI P +G + MI+ RQ R Sbjct: 381 MYLKGGSAYLYRCYGMHHLLNVVTGPEDIPHAVLIRAILPDQGKELMIQRRQWRDKPPHL 440 Query: 127 LTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPL 186 LTNGPGK+ ALGI + Q + + +L + K K + A RIGI + ++P Sbjct: 441 LTNGPGKVCQALGISLENNRQRLNTPALYI--SKEKISGTLTATARIGIDYAQEYRDVPW 498 Query: 187 RYVVA 191 R++++ Sbjct: 499 RFLLS 503 >CHLCV 3MGH_CHLCV (Q824B4) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 190 Score = 111 bits (278), Expect = 2e-24 Identities = 67/174 (38%), Positives = 98/174 (56%), Gaps = 5/174 (2%) Query: 20 VAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIYL 79 +A+ LLG L + + + G+IV+ EAY GPDD+A H++ RKT R MY + G Y+ Sbjct: 15 LAKELLGHILITKISGKITSGFIVETEAYRGPDDKACHAYNYRKTKRNSPMYSRGGIAYI 74 Query: 80 YTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVE--LTNGPGKLVA 136 Y + H + N+VT +Q P V+IRAI P EG D MI+ RQ + + LTNGPGK+ Sbjct: 75 YRCYGMHSLFNVVTAKQDLPHAVLIRAILPYEGEDIMIQRRQWQNKPKHLLTNGPGKVCQ 134 Query: 137 ALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVV 190 AL + + ++ S L + K K +I PRIGI +LP R+++ Sbjct: 135 ALNLTLEHNTHALTSPHLHI--SKEKASGRITQTPRIGIDYAEECKDLPWRFLL 186 >CLOAB 3MGH_CLOAB (Q97EY6) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 205 Score = 108 bits (270), Expect = 1e-23 Identities = 70/202 (34%), Positives = 110/202 (54%), Gaps = 10/202 (4%) Query: 9 INIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQ 68 I F ++ T VA+ LLG L HE G IV+ EAY G +D+ AH++G R+TPR + Sbjct: 4 
IREFYSRDTIVVAKELLGKVLVHEVNGIRTSGKIVEVEAYRGINDKGAHAYGGRRTPRTE 63 Query: 69 AMYDKPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGR----- 122 A+Y G Y+Y ++ + +N+V ++G P+GV+IRAIEP+EG++ M E R + Sbjct: 64 ALYGPAGHAYVYFIYGLYYCMNVVAMQEGIPEGVLIRAIEPIEGIEVMSERRFKKLFNDL 123 Query: 123 ---QGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKG 179 Q LTNGP KL +A+ I ++ + L + K + + +EA R+GI Sbjct: 124 TKYQLKNLTNGPSKLCSAMEIRREQNLMDLNGDELYIEEGKNESFEIVEA-KRVGIDYAE 182 Query: 180 RWTELPLRYVVAGNPYISKQKR 201 + R+ + GN +S K+ Sbjct: 183 EAKDYLWRFYIKGNKCVSVLKK 204 >CLOPE 3MGH_CLOPE (Q8XHA9) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 205 Score = 107 bits (266), Expect = 4e-23 Identities = 69/199 (34%), Positives = 107/199 (53%), Gaps = 11/199 (5%) Query: 12 FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71 F + T VA+ LLG L L G IV+ EAY+G D+A+H++G ++T R + +Y Sbjct: 7 FYNRDTVTVAKELLGKVLVRNINGVTLKGKIVETEAYIGAIDKASHAYGGKRTNRTETLY 66 Query: 72 DKPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVEL--- 127 PGT+Y+Y ++ + LN++++E+ GV+IR IEP+EG+++M + R + EL Sbjct: 67 ADPGTVYVYIIYGMYHCLNLISEEKDVAGGVLIRGIEPLEGIEEMSKLRYKKSYEELSNY 126 Query: 128 -----TNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPK--KIEALPRIGIPNKGR 180 +NGP KL ALGIDK G + SS V + K I RIGI Sbjct: 127 EKKNFSNGPSKLCMALGIDKGENGINTISSEEIYVEDDSLIKKDFSIVEAKRIGIDYAEE 186 Query: 181 WTELPLRYVVAGNPYISKQ 199 + R+ + N ++SK+ Sbjct: 187 ARDFLWRFYIKDNKFVSKK 205 >STRMU Q8DRU4 (Q8DRU4) Putative 3-methyladenine DNA glycosylase Length = 192 Score = 103 bits (258), Expect = 3e-22 Identities = 64/173 (36%), Positives = 91/173 (52%), Gaps = 15/173 (8%) Query: 40 GYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIYLYTMHTHLILNMVTQEQGKPQ 99 G IV+ EAYLG D A HS R+TP+ +AMY G Y+Y ++ H +LN+VT+ Q + Sbjct: 34 GRIVETEAYLGSKDSACHSANDRRTPKNEAMYLAAGHWYVYQIYGHQMLNLVTKPQNVAE 93 Query: 100 GVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPE 159 V+IRA+E + G L NGPGKL GIDK G S+ S L L + Sbjct: 94 AVLIRALETAD-------------GHLLANGPGKLTKFAGIDKSFNGDSLQDSRLSL--Q 138 
Query: 160 KRKFPKKIEALPRIGIPNKGRWTELPLRYVVAGNPYISKQKRTAVDQIDFGWK 212 + P++IE RIG+ W + L + V GN ++SK + ++ WK Sbjct: 139 EDLSPQRIEERSRIGVTCTDEWKDALLCFYVRGNQHVSKIAKKSLLTDKETWK 191 >DEIRA 3MGH_DEIRA (Q9RSQ0) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 190 Score = 85.9 bits (211), Expect = 9e-17 Identities = 64/181 (35%), Positives = 97/181 (53%), Gaps = 7/181 (3%) Query: 20 VAQYLLGMYLEHETATGV-LGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIY 78 +A+ LLG L T G L G +V+ EAY P D A + G R M PG Sbjct: 3 LARELLGGTLVRVTPDGHRLSGRVVEVEAYDCPRDPACTA-GRFHAARSAEMAIAPGHWL 61 Query: 79 LYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAAL 138 + H H +L + +++G V+IRA+EP+EG KM++ R + +LT+GP KLV AL Sbjct: 62 FWFAHGHPLLQVACRQEGVSASVLIRALEPLEGAGKMLDYRPVTRQRDLTSGPAKLVYAL 121 Query: 139 GID-KQLYGQSIFSSSLRLV-PEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAGNPYI 196 G+D Q+ + + S L L+ PE ++ R+GI +GR LP R+++ GN ++ Sbjct: 122 GLDPMQISHRPVNSPELHLLAPETPLADDEVTVTARVGI-REGR--NLPWRFLIRGNGWV 178 Query: 197 S 197 S Sbjct: 179 S 179 >CORGL 3MGH_CORGL (Q8NU33) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 189 Score = 82.0 bits (201), Expect = 1e-15 Identities = 66/185 (35%), Positives = 100/185 (54%), Gaps = 16/185 (8%) Query: 20 VAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIYL 79 VA LLG L H G +G I + EAYL DEAAH++ KTPR AM+ G +Y+ Sbjct: 12 VAPQLLGCTLTH----GGVGIRITEVEAYLDSTDEAAHTY-RGKTPRNAAMFGPGGHMYV 66 Query: 80 YTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGV---ELTNGPGKLV 135 Y + H N+V +G QGV++RA E V G + + ++R+G +G+ L GPG Sbjct: 67 YISYGIHRAGNIVCGPEGTGQGVLLRAGEVVSG-ESIAQSRRG-EGIPHARLAQGPGNFG 124 Query: 136 AALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAGNPY 195 ALG++ S+F S L+ ++ + P+ + PRIGI TE LR+ + +P Sbjct: 125 QALGLEISDNHASVFGPSF-LISDRVETPEIVRG-PRIGISKN---TEALLRFWIPNDPT 179 Query: 196 ISKQK 200 +S ++ Sbjct: 180 VSGRR 184 >STRCO 3MGH_STRCO (Q9S208) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 213 Score = 80.5 bits (197), Expect = 4e-15 Identities = 
59/184 (32%), Positives = 91/184 (49%), Gaps = 8/184 (4%) Query: 19 EVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIY 78 EVA LLG L G + + + EAY G +D +H++ R TPR + M+ PG +Y Sbjct: 21 EVAPDLLGRILVRTGPDGPITLRLTEVEAYDGQNDPGSHAYRGR-TPRNEVMFGPPGHVY 79 Query: 79 LY-TMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENR-QGRQGVELTNGPGKLVA 136 +Y T +N+V +G+ V++RA E ++G + R R EL GP +L Sbjct: 80 VYFTYGMWFCMNLVCGPEGRSSAVLLRAGEIIDGAELARTRRLSARNDKELAKGPARLAT 139 Query: 137 ALGIDKQLYGQSIFSSS---LRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAGN 193 ALG+D+ L G +S LR++ ++ PR G+ +G P RY VA + Sbjct: 140 ALGVDRALNGTDACTSQETPLRILTGTPVPGDQVRNGPRTGVAGEG--GVHPWRYWVADD 197 Query: 194 PYIS 197 P +S Sbjct: 198 PTVS 201 >BRAJA 3MGH_BRAJA (Q89LR7) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 200 Score = 79.0 bits (193), Expect = 1e-14 Identities = 68/193 (35%), Positives = 94/193 (48%), Gaps = 22/193 (11%) Query: 12 FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71 F ++ EVA L+G + GV GG IV+ EAY + AAHS+ TPR M+ Sbjct: 20 FFGRSVREVAHDLIGATM---LVDGV-GGLIVEVEAY-HHTEPAAHSYN-GPTPRNHVMF 73 Query: 72 DKPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNG 130 PG Y+Y + H +N V + +G V+IRA+EP G+ M R + L +G Sbjct: 74 GPPGFAYVYRSYGIHWCVNFVCEAEGSAAAVLIRALEPTHGIAAMRRRRHLQDVHALCSG 133 Query: 131 PGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALP-----RIGIPNKGRWTELP 185 PGKL ALGI +I ++L L + E L RIGI + ELP Sbjct: 134 PGKLTEALGI-------TIAHNALPLDRPPIALHARTEDLEVATGIRIGIT---KAVELP 183 Query: 186 LRYVVAGNPYISK 198 RY V G+ ++SK Sbjct: 184 WRYGVKGSKFLSK 196 >STRAW 3MGH_STRAW (Q829C5) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 213 Score = 72.8 bits (177), Expect = 8e-13 Identities = 57/191 (29%), Positives = 88/191 (46%), Gaps = 8/191 (4%) Query: 12 FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71 F + +VA LLG L T G + + + EAY GP D +H++ R T R M+ Sbjct: 14 FFARPVLDVAPDLLGRVLVRTTPDGPIELRVTEVEAYDGPSDPGSHAYRGR-TARNGVMF 72 Query: 72 
DKPGTIYLY-TMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENR-QGRQGVELTN 129 PG +Y+Y T +N+V +G+ V++RA E +EG + R R EL Sbjct: 73 GPPGHVYVYFTYGMWHCMNLVCGPEGRASAVLLRAGEIIEGAELARTRRLSARNDKELAK 132 Query: 130 GPGKLVAALGIDKQLYGQSIFS---SSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPL 186 GP +L AL +D+ L G + L L+ P ++ PR G+ G P Sbjct: 133 GPARLATALEVDRALDGTDACAPEGGPLTLLSGTPVPPDQVRNGPRTGVSGDG--GVHPW 190 Query: 187 RYVVAGNPYIS 197 R+ + +P +S Sbjct: 191 RFWIDNDPTVS 201 >COREF 3MGH_COREF (Q8FUA2) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 190 Score = 69.3 bits (168), Expect = 9e-12 Identities = 58/182 (31%), Positives = 85/182 (46%), Gaps = 11/182 (6%) Query: 20 VAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIYL 79 VA LLG H+ + L + EAYLG +D AAH+ KT R AM+ G +Y+ Sbjct: 12 VAPQLLGCIFTHDGVSIRL----TEVEAYLGAEDAAAHTHR-GKTARNAAMFGPGGHMYI 66 Query: 80 YTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAAL 138 Y + H N+ +G QGV++RA E V G D R L GPG L AL Sbjct: 67 YISYGIHRAGNIACAPEGVGQGVLLRAGEVVAGEDIAYRRRGDVPFTRLAQGPGNLGQAL 126 Query: 139 GIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAGNPYISK 198 I + +L+ E + P+ + PR+GI + PLR+ + G+P +S Sbjct: 127 NFQLSDNHAPINGTDFQLM-EPSERPEWVSG-PRVGITKN---ADAPLRFWIPGDPTVSV 181 Query: 199 QK 200 ++ Sbjct: 182 RR 183 >PROAC Q6A5L3 (Q6A5L3) Putative 3-methylpurine DNA glycosylase Length = 191 Score = 65.9 bits (159), Expect = 9e-11 Identities = 56/190 (29%), Positives = 85/190 (44%), Gaps = 23/190 (12%) Query: 19 EVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIY 78 EVA LLG + G +G + + EAY+G DD A+H+F TPR + M+ P IY Sbjct: 10 EVAPLLLGATIWR----GPVGIRLTEVEAYMGLDDPASHAFR-GPTPRARVMFGPPSHIY 64 Query: 79 LYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAA 137 +Y + H +N+V G+ V++R + + G D R L GPG + +A Sbjct: 65 VYLSYGMHRCVNLVCSPDGEASAVLLRGGQVIAGHDDARRRRGNVAENRLACGPGNMGSA 124 Query: 138 LGIDKQLYGQ----------SIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLR 187 LG + G S L PE +F + PR+GI R + P R Sbjct: 125 
LGASLEESGNPVSIIGNGAISALGWRLEPAPEIAEFRQG----PRVGI---SRNIDAPWR 177 Query: 188 YVVAGNPYIS 197 + + +P +S Sbjct: 178 WWIPQDPTVS 187 >MYCPA Q740F6 (Q740F6) Hypothetical protein Length = 205 Score = 64.3 bits (155), Expect = 3e-10 Identities = 66/198 (33%), Positives = 92/198 (46%), Gaps = 30/198 (15%) Query: 19 EVAQYLLGMYLEHETATGVLGGYIVDAEAYLG-PD----DEAAHSF-GLRKTPRLQAMYD 72 E A+ LLG L T GV G IV+ EAY G PD D AAHS+ GLR R M+ Sbjct: 14 EAARRLLGATL---TGRGV-SGVIVEVEAYGGVPDGPWPDAAAHSYKGLRA--RNFVMFG 67 Query: 73 KPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQG-----VE 126 PG +Y Y H H+ N+ G V++RA +G D +GR+G Sbjct: 68 PPGRLYTYRSHGIHVCANVSCGPDGTAAAVLLRAAALEDGTDVA----RGRRGELVHTAA 123 Query: 127 LTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEAL--PRIGIPNKGRWTEL 184 L GPG L AA+GI G +F P + + + A+ PR+G+ + + Sbjct: 124 LARGPGNLCAAMGITMADNGIDLFDPD---SPVTLRLHEPLTAVCGPRVGV---SQAADR 177 Query: 185 PLRYVVAGNPYISKQKRT 202 P R + G P +S +R+ Sbjct: 178 PWRLWLPGRPEVSAYRRS 195 >MYCLEH 3MGH_MYCTU (P65412) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 203 Score = 63.5 bits (153), Expect = 5e-10 Identities = 55/171 (32%), Positives = 81/171 (47%), Gaps = 16/171 (9%) Query: 42 IVDAEAYLG-PD----DEAAHSFGLRKTPRLQAMYDKPGTIYLYTMH-THLILNMVTQEQ 95 +V+ EAY G PD D AAHS+ R R M+ PG +Y Y H H+ N+ Sbjct: 31 VVEVEAYGGVPDGPWPDAAAHSYRGRNG-RNDVMFGPPGRLYTYRSHGIHVCANVACGPD 89 Query: 96 GKPQGVMIRAIEPVEGVDKMIENR-QGRQGVELTNGPGKLVAALGIDKQLYGQSIF--SS 152 G V++RA +G + R Q + V L GPG L AALGI G +F SS Sbjct: 90 GTAAAVLLRAAAIEDGAELATSRRGQTVRAVALARGPGNLCAALGITMADNGIDLFDPSS 149 Query: 153 SLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAGNPYISKQKRTA 203 +RL + + + PR+G+ + + P R + G P +S +R++ Sbjct: 150 PVRL---RLNDTHRARSGPRVGV---SQAADRPWRLWLTGRPEVSAYRRSS 194 >MYCLE 3MGH_MYCLE (O05678) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 214 Score = 60.1 bits (144), Expect = 5e-09 Identities = 60/190 (31%), Positives = 88/190 (46%), Gaps = 18/190 (9%) Query: 21 AQYLLGMYLEHETATGVLGGYIVDAEAYLG-PD----DEAAHSFGLRKTPRLQAMYDKPG 75 
A LLG + T GV +V+ EAY G PD D AAHS+ R R M+ PG Sbjct: 25 AHRLLGATI---TGRGVCA-IVVEVEAYGGVPDGPWPDAAAHSYHGRND-RNAVMFGPPG 79 Query: 76 TIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGR--QGVELTNGPG 132 +Y Y H H+ N+ G V+IRA G D + +R+G + V L GPG Sbjct: 80 RLYTYCSHGIHVCANVSCGPDGTAAAVLIRAGALENGAD-VARSRRGASVRTVALARGPG 138 Query: 133 KLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAG 192 L +ALGI G +F++ + + + + PR+GI + + P R + G Sbjct: 139 NLCSALGITMDDNGIDVFAADSPVTLVLNEAQEAMSG-PRVGISHA---ADRPWRLWLPG 194 Query: 193 NPYISKQKRT 202 P +S +R+ Sbjct: 195 RPEVSTYRRS 204 >RICCN 3MGH_RICCN (Q92IE0) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 183 Score = 51.6 bits (122), Expect = 2e-06 Identities = 39/131 (29%), Positives = 62/131 (47%), Gaps = 18/131 (13%) Query: 12 FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71 F + T V+ L+G L + T + I + E+Y+G +D A H+ +T R M+ Sbjct: 11 FFARDTNVVSTELIGKALYFQGKTAI----ITETESYIGQNDPACHA-ARGRTKRTDIMF 65 Query: 72 DKPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNG 130 G Y+Y ++ + LN VT+ +G P +IR + + + EN NG Sbjct: 66 GPAGFSYVYLIYGMYYCLNFVTEAKGFPAATLIRGVHVI-----LPENLY-------LNG 113 Query: 131 PGKLVAALGID 141 PGKL LGI+ Sbjct: 114 PGKLCKYLGIN 124 >RICPR 3MGH_RICPR (Q9ZDH7) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 217 Score = 48.9 bits (115), Expect = 1e-05 Identities = 29/96 (30%), Positives = 49/96 (51%), Gaps = 6/96 (6%) Query: 12 FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71 F + T V+ L+G L + T + I + E+Y+G DD A H+ +T R M+ Sbjct: 11 FFARDTNLVSTELIGKVLYFQGTTAI----ITETESYIGEDDPACHA-ARGRTKRTDVMF 65 Query: 72 DKPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAI 106 G Y+Y ++ + LN VT+++G P +IR + Sbjct: 66 GPAGFSYVYLIYGMYYCLNFVTEDEGFPAATLIRGV 101 >PSEAE 3MGH_PSEAE (Q9HX17) Putative 3-methyladenine DNA glycosylase (EC 3.2.2.-) Length = 239 Score = 45.1 bits (105), Expect = 2e-04 Identities = 49/184 (26%), Positives = 80/184 (43%), Gaps = 17/184 (9%) Query: 20 
VAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIYL 79 VA+ LLG + H L I++ EAY +++ +H+ L T + +A++ G IY+ Sbjct: 29 VARELLGKVIRHRQGNLWLAARIIETEAYY-LEEKGSHA-SLGYTEKRKALFLDGGHIYM 86 Query: 80 YTMHTHLILNMVTQEQGKPQGVMIRAIEP----------VEGVDKMIENRQG--RQGVEL 127 Y LN G V+I++ P +E + + + QG R+ L Sbjct: 87 YYARGGDSLNF--SAGGPGNAVLIKSGHPWLDRISDHTALERMQSLNPDSQGRPREIGRL 144 Query: 128 TNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLR 187 G L A+G+ + F V + + P ++ R+GIP KGR LP R Sbjct: 145 CAGQTLLCKAMGLKVPEWDAQRFDPQRLFVDDVGERPSQVIQAARLGIP-KGRDEHLPYR 203 Query: 188 YVVA 191 +V A Sbjct: 204 FVDA 207 >PSEPK Q88DL3 (Q88DL3) DNA-3-methyladenine glycosylase, putative Length = 222 Score = 41.6 bits (96), Expect = 0.002 Identities = 48/192 (25%), Positives = 77/192 (40%), Gaps = 17/192 (8%) Query: 12 FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71 F + + +A+ LLG + H L I++ EAY D + S G T + +A++ Sbjct: 8 FFDRDAQTLAKALLGKVIRHRHGDLWLAARIIETEAYYLSDKGSHASLGY--TEKRKALF 65 Query: 72 DKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEP----VEGVDKMIENRQG------ 121 G IY+Y LN G V+I++ P + G D + + + Sbjct: 66 LDGGHIYMYYARGGDSLNF--SAHGPGNAVLIKSAYPWQDTLSGPDSLAQMQLNNPDASG 123 Query: 122 --RQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKG 179 R L G L ALG+ + F + V + ++ R+GIP+ G Sbjct: 124 NIRPQERLCAGQTLLCRALGLKVPHWDAQRFDAERLYVEDCGNAVPRVIQAARLGIPH-G 182 Query: 180 RWTELPLRYVVA 191 R LP R+V A Sbjct: 183 RDEHLPYRFVDA 194 >BORPE Q7VWB7 (Q7VWB7) Putative methypurine-DNA glycosylase Length = 238 Score = 40.4 bits (93), Expect = 0.004 Identities = 47/190 (24%), Positives = 78/190 (41%), Gaps = 17/190 (8%) Query: 12 FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71 F + +++A+ LLG + H L I++ EAY + + S G T + +A++ Sbjct: 20 FFNRDAQQLARDLLGKVVRHRVDGLWLSARIIETEAYYLAEKGSHASLGY--THKRRALF 77 Query: 72 DKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRA----IEPVEG------VDKMIENRQG 121 G +Y+Y LN G V+I++ ++ V G + ++ + QG Sbjct: 78 MDGGVVYMYYARGGDSLNF--SAAGPGNAVLIKSGHPWVDEVSGPRALACMQRLNPDAQG 135 Query: 122 
--RQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKG 179 R L G L ALG+ + F V + ++ R+GIP G Sbjct: 136 QPRPPARLCAGQTLLCRALGLKVPQWDAKAFDPRRFFVEDVGVRLDRLVRTTRLGIP-AG 194 Query: 180 RWTELPLRYV 189 R LP RYV Sbjct: 195 RDEHLPYRYV 204 >BORPA Q7W9Q9 (Q7W9Q9) Putative methypurine-DNA glycosylase Length = 238 Score = 40.4 bits (93), Expect = 0.004 Identities = 47/190 (24%), Positives = 78/190 (41%), Gaps = 17/190 (8%) Query: 12 FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71 F + +++A+ LLG + H L I++ EAY + + S G T + +A++ Sbjct: 20 FFNRDAQQLARDLLGKVVRHRVDGLWLSARIIETEAYYLAEKGSHASLGY--THKRRALF 77 Query: 72 DKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRA----IEPVEG------VDKMIENRQG 121 G +Y+Y LN G V+I++ ++ V G + ++ + QG Sbjct: 78 MDGGVVYMYYARGGDSLNF--SAAGPGNAVLIKSGHPWVDEVSGPRALACMQRLNPDAQG 135 Query: 122 --RQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKG 179 R L G L ALG+ + F V + ++ R+GIP G Sbjct: 136 QPRPPARLCAGQTLLCRALGLKVPQWDAKAFDPRRFFVEDVGVRLDRLVRTTRLGIP-AG 194 Query: 180 RWTELPLRYV 189 R LP RYV Sbjct: 195 RDEHLPYRYV 204 >STRAW Q93HJ1 (Q93HJ1) Modular polyketide synthase Length = 3613 Score = 35.4 bits (80), Expect = 0.14 Identities = 16/39 (41%), Positives = 23/39 (58%) Query: 99 QGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAA 137 +G M+ PVEGV++ + +GR GV NGPG V + Sbjct: 700 KGGMVSVALPVEGVEERLARFEGRIGVAAVNGPGSAVVS 738 >STRAW Q93HJ2 (Q93HJ2) Modular polyketide synthase Length = 4685 Score = 33.1 bits (74), Expect = 0.68 Identities = 15/39 (38%), Positives = 23/39 (58%) Query: 99 QGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAA 137 +G M+ PVEGV++ + +GR GV NGP +V + Sbjct: 3743 KGGMVSVALPVEGVEERLARFEGRIGVAAVNGPTSVVVS 3781 Score = 33.1 bits (74), Expect = 0.68 Identities = 15/39 (38%), Positives = 23/39 (58%) Query: 99 QGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAA 137 +G M+ PVEGV++ + +GR GV NGP +V + Sbjct: 2223 KGGMVSVALPVEGVEERLARFEGRIGVAAVNGPTSVVVS 2261 Score = 30.4 bits (67), Expect = 4.4 Identities = 14/39 (35%), Positives = 22/39 (56%) Query: 99 QGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAA 
137 +G M+ PV V++ + +GR GV NGPG +V + Sbjct: 695 KGGMVSVALPVGEVEERLARFEGRIGVAAVNGPGSVVVS 733 >SILPO Q5LKE3 (Q5LKE3) Peptidyl-prolyl cis-trans isomerase, FKBP-type Length = 142 Score = 32.0 bits (71), Expect = 1.5 Identities = 20/68 (29%), Positives = 34/68 (50%), Gaps = 6/68 (8%) Query: 114 KMIENRQGRQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKF----PKKIEA 169 K ++ +GR +E T G G+++ G+DK + G VP + P+ +A Sbjct: 22 KTFDSSEGRDPLEFTVGSGQIIP--GLDKAMPGMETGEKKRVEVPCAEAYGPLNPEARQA 79 Query: 170 LPRIGIPN 177 +PR GIP+ Sbjct: 80 IPREGIPD 87 >SILPO Q5LVX1 (Q5LVX1) ABC transporter, transmembrane ATP-binding protein, putative Length = 1032 Score = 30.4 bits (67), Expect = 4.4 Identities = 19/48 (39%), Positives = 26/48 (54%), Gaps = 8/48 (16%) Query: 101 VMIRAIEPVEGVDKMIENRQG----RQGVE----LTNGPGKLVAALGI 140 V ++ EP +G MIE G R+G E +T GPG+LV LG+ Sbjct: 935 VFLKDDEPTDGAYMMIEGEAGLYLPREGQEDQLIVTVGPGRLVGELGL 982 >CLOTE DXS_CLOTE (Q894H0) 1-deoxy-D-xylulose-5-phosphate synthase (EC 2.2.1.7) (1-deoxyxylulose-5-phosphate synthase) (DXP synthase) (DXPS) Length = 620 Score = 30.0 bits (66), Expect = 5.8 Identities = 19/55 (34%), Positives = 28/55 (50%), Gaps = 1/55 (1%) Query: 138 LGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAG 192 + I K + G S + SSLR+ P KF + +E + + IPN G+ L V G Sbjct: 179 MSIGKNVGGLSTYLSSLRIDPNYNKFKRDVEGIIK-KIPNIGKGVAKNLERVKDG 232 >BURMA Q9AI54 (Q9AI54) DedA family protein Length = 1925639 Score = 29.6 bits (65), Expect = 7.5 Identities = 32/136 (23%), Positives = 52/136 (38%), Gaps = 6/136 (4%) Query: 43 VDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGT-IYLYTMHTHLILNMVTQEQGKPQGV 101 V+ A P A ++ + A Y G + + H L + Q K + Sbjct: 1823164 VELVANEAPGSRMAFMHPVKSRAAISAAYFDHGVKTFSFDTHEELAKILDATGQAKDLNL 1823223 Query: 102 MIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAALGIDKQLYGQS--IFSSSLRLVPE 159 ++R EG + G+ GVE+ N P L+AA + L G S + S +R Sbjct: 1823224 IVRMGVQAEGAAYSLS---GKFGVEMHNAPDLLLAARRATQDLMGVSFHVGSQCMRPTAF 1823280 Query: 160 KRKFPKKIEALPRIGI 175 + + AL R G+ Sbjct: 1823281 QAAMAQASRALVRAGV 1823296 >STAES 
Q8CST0 (Q8CST0) Hypothetical protein SE0952 Length = 572 Score = 29.3 bits (64), Expect = 9.8 Identities = 17/75 (22%), Positives = 36/75 (48%), Gaps = 5/75 (6%) Query: 98 PQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAALG-----IDKQLYGQSIFSS 152 P+ M+ + + +IEN++ +G+ LT+G + A+ ID +YG + + Sbjct: 60 PEDEMLGVDIVIPDIQYVIENKERLKGIFLTHGHEHAIGAVSYVLEQIDAPVYGSKLTIA 119 Query: 153 SLRLVPEKRKFPKKI 167 ++ + R KK+ Sbjct: 120 LVKEAMKARNIKKKV 134 >SILPO Q5LMI5 (Q5LMI5) ADA regulatory protein Length = 283 Score = 29.3 bits (64), Expect = 9.8 Identities = 24/103 (23%), Positives = 48/103 (46%), Gaps = 4/103 (3%) Query: 88 LNMVTQEQGKPQGVMIRAIEPVE--GVDKMIENRQGRQGVELTNGPGKLVAALGIDKQLY 145 +N+ T E+G GV+ RAIE ++ G +++ R + + +G+ + Y Sbjct: 1 MNVQTTEEGYHYGVIRRAIELIDAGGESMPLDDLAARMNMSPAHFQRIFSRWVGVSPKKY 60 Query: 146 GQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRY 188 Q + + + E+R +EA +G+ GR +L +R+ Sbjct: 61 QQYLTLGHAKALLEERF--TLLEAAQNVGLSGTGRLHDLFVRW 101 Database: Blastdata.fdb Posted date: Mar 29, 2006 3:30 PM Number of letters in database: 77,468,597 Number of sequences in database: 240,170 Lambda K H 0.316 0.135 0.391 Gapped Lambda K H 0.267 0.0410 0.140 Matrix: BLOSUM62 Gap Penalties: Existence: 11, Extension: 1 Number of Hits to DB: 35,841,668 Number of Sequences: 240170 Number of extensions: 1550248 Number of successful extensions: 3502 Number of sequences better than 10.0: 43 Number of HSP's better than 10.0 without gapping: 24 Number of HSP's successfully gapped in prelim test: 19 Number of HSP's that attempted gapping in prelim test: 3332 Number of HSP's gapped (non-prelim): 140 length of query: 229 length of database: 77,468,597 effective HSP length: 107 effective length of query: 122 effective length of database: 51,770,407 effective search space: 6315989654 effective search space used: 6315989654 T: 11 A: 40 X1: 16 ( 7.3 bits) X2: 38 (14.6 bits) X3: 64 (24.7 bits) S1: 41 (21.6 bits) S2: 64 (29.3 bits) BLASTP 2.2.10 [Oct-19-2004] Reference: Altschul, Stephen F., Thomas L. 
Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402. Query= ENTFA AACA_ENTFA (P0A0C2) Bifunctional AAC/APH [Includes: 6'-aminoglycoside N-acetyltransferase (EC 2.3.1.-) (AAC(6')); 2''-aminoglycoside phosphotransferase (EC 2.7.1.-) (APH(2''))] (479 letters) Database: Blastdata.fdb 240,170 sequences; 77,468,597 total letters Searching..................................................done Score E Sequences producing significant alignments: (bits) Value STAAM AACA_STAAM (P0A0C0) Bifunctional AAC/APH [Includes: 6'-ami... 959 0.0 ENTFA AACA_ENTFA (P0A0C2) Bifunctional AAC/APH [Includes: 6'-ami... 959 0.0 BACAN Q81P54 (Q81P54) Acetyltransferase, GNAT family 168 4e-41 BACC1 Q735Z9 (Q735Z9) Acetyltransferase, GNAT family 159 1e-38 BACAN Q81T90 (Q81T90) Aminoglycoside phophotransferase family pr... 67 1e-10 BACC1 Q73BC3 (Q73BC3) Aminoglycoside phophotransferase family pr... 62 4e-09 BRAJA Q89WN0 (Q89WN0) Bll0648 protein 59 3e-08 BACHD Q9K9M4 (Q9K9M4) BH2621 protein 56 2e-07 BACC1 Q739G2 (Q739G2) 6'-aminoglycoside N-acetyltransferase/2''-... 55 5e-07 THEMA Q9X063 (Q9X063) Hypothetical protein 52 3e-06 CLOTE Q896X4 (Q896X4) Putative acetyltransferase 49 3e-05 BACHD Q9KB15 (Q9KB15) BH2121 protein 48 6e-05 STAES Q8CQT6 (Q8CQT6) Spermidine acetyltransferase 47 1e-04 VIBCH Q9KMC2 (Q9KMC2) Acetyltransferase, putative 45 5e-04 BACSU BLTD_BACSU (P39909) Spermine/spermidine acetyltransferase ... 
45 6e-04 BACC1 Q739N8 (Q739N8) Spermidine acetyltransferase, putative 44 0.001 LACPL Q88SW8 (Q88SW8) Acetyltransferase (Putative) 44 0.001 VIBUY Q7MK89 (Q7MK89) Putative acetyltransferase 43 0.002 DEIRA Q9RW71 (Q9RW71) Acetyltransferase, putative 43 0.002 BACAN Q81RM0 (Q81RM0) Spermidine acetyltransferase, putative 43 0.002 LACJO Q74K74 (Q74K74) Hypothetical protein 42 0.003 BACC1 Q72YV7 (Q72YV7) Acetyltransferase, GNAT family 42 0.003 BACC1 Q734R6 (Q734R6) Acetyltransferase, GNAT family 42 0.004 CLOPE Q8XKH9 (Q8XKH9) Hypothetical protein CPE1416 42 0.005 BACAN Q81SQ8 (Q81SQ8) Acetyltransferase, GNAT family 41 0.007 VIBCH Q9K330 (Q9K330) Acetyltransferase, putative 41 0.009 VIBUY Q7MK96 (Q7MK96) Putative acetyltransferase 40 0.012 WIGBR Q8D3I4 (Q8D3I4) Imp protein 40 0.016 BACSU P94482 (P94482) YnaD 40 0.021 BACC1 Q73A91 (Q73A91) Acetyltransferase, GNAT family 40 0.021 THETN Q8RC99 (Q8RC99) Acetyltransferases 39 0.027 STRAW Q82IB6 (Q82IB6) Putative acetyltransferase 39 0.027 LISIN Q92E38 (Q92E38) Lin0623 protein 39 0.027 STRCO O69977 (O69977) Hypothetical protein SCO5801 39 0.036 STRAW Q82KD8 (Q82KD8) Hypothetical protein 39 0.036 VIBCH Q9KMA3 (Q9KMA3) Acetyltransferase, putative 39 0.046 STAAM Q932M5 (Q932M5) Hypothetical protein SAVP027 39 0.046 LACLA Q9CHJ8 (Q9CHJ8) Acetyl transferase 39 0.046 ENTFA Q82YR8 (Q82YR8) Acetyltransferase, GNAT family 39 0.046 BACC1 Q73AS8 (Q73AS8) Acetyltransferase, GNAT family 39 0.046 BACC1 Q734C3 (Q734C3) Acetyltransferase, GNAT family 39 0.046 BACAN Q81YL2 (Q81YL2) Acetyltransferase, GNAT family 39 0.046 BACAN Q81XB7 (Q81XB7) Spermidine N1-acetyltransferase 39 0.046 SALTI Q8Z6Z1 (Q8Z6Z1) Spermidine N1-acetyltransferase 38 0.061 SALTY Q8ZPJ3 (Q8ZPJ3) Spermidine N1-acetyltransferase (EC 2.3.1.57) 38 0.061 SALCH Q57PD6 (Q57PD6) Spermidine N1-acetyltransferase 38 0.061 MYCPE Q8EWE5 (Q8EWE5) Acetyltransferase GNAT family 38 0.061 BACC1 Q72Y03 (Q72Y03) Spermidine N1-acetyltransferase (EC 2.3.1.57) 38 0.061 DEIRA Q9RW73 (Q9RW73) 
Acetyltransferase, putative 38 0.079 STAAM Q99U68 (Q99U68) Hypothetical protein 37 0.10 RICPR Q9ZE75 (Q9ZE75) TRANSCRIPTIONAL ACTIVATOR PROTEIN CZCR (CzcR) 37 0.10 LACJO Q74J71 (Q74J71) Hypothetical protein 37 0.10 CLOAB Q97D33 (Q97D33) Acetyltransferase (With duplicated domains... 37 0.10 VIBCH Q9KU66 (Q9KU66) Ribosomal-protein-alanine acetyltransferase 37 0.18 STRR6 Q8DNF9 (Q8DNF9) Hypothetical protein spr1760 37 0.18 SALTY Q8ZNU6 (Q8ZNU6) Putative ribose 5-phosphate isomerase (EC ... 37 0.18 SALCH Q57N66 (Q57N66) Putative ribose 5-phosphate isomerase 37 0.18 ECO57 ATDA_ECO57 (P0A952) Spermidine N(1)-acetyltransferase (EC ... 37 0.18 ECOLI ATDA_ECOLI (P0A951) Spermidine N(1)-acetyltransferase (EC ... 37 0.18 BACHD Q9KG16 (Q9KG16) BH0299 protein 37 0.18 AQUAE O66838 (O66838) Ribosomal-protein-alanine acetyltransferase 37 0.18 PROAC Q6ABU2 (Q6ABU2) Acetyltransferase (EC 2.3.1.-) 36 0.23 BACAN Q81QL3 (Q81QL3) Acetyltransferase, GNAT family 36 0.23 STRMU Q8DV67 (Q8DV67) Putative acetyltransferase 36 0.30 STAAM COAD_STAAM (P63818) Phosphopantetheine adenylyltransferase... 36 0.30 LISMO Q8Y9B8 (Q8Y9B8) Lmo0614 protein 36 0.30 THEMA Q9WZ46 (Q9WZ46) Hypothetical protein 35 0.39 STRR6 Q8DQU8 (Q8DQU8) Hypothetical protein spr0490 35 0.39 CLOAB Q97FA0 (Q97FA0) Predicted acetyltransferase 35 0.39 _BUCAP LOLC_BUCAP (Q8K9N8) Lipoprotein-releasing system transmem... 35 0.39 BACC1 Q734M2 (Q734M2) Acetyltransferase, GNAT family 35 0.39 BACAN Q81U09 (Q81U09) Acetyltransferase, GNAT family 35 0.39 YERPE Q8ZHB3 (Q8ZHB3) Putative siderophore biosynthesis protein ... 35 0.51 VIBUY Q7MFH7 (Q7MFH7) Histone acetyltransferase HPA2 35 0.51 RICCN Q92JG6 (Q92JG6) Transcriptional activator protein czcR 35 0.51 CLOAB Q97G03 (Q97G03) Predicted acetyltransferase 35 0.51 BACHD Q9KF00 (Q9KF00) Ribosomal-protein-alanine N-acetyltransferase 35 0.51 BACAN Q81NB6 (Q81NB6) Spermine/spermidine acetyltransferase, put... 
35 0.51 BACAN Q81N07 (Q81N07) Acetyltransferase, GNAT family 35 0.51 STRMU Q8DT36 (Q8DT36) Putative acetyltransferase 35 0.67 PSEAE Q9I3W7 (Q9I3W7) Hypothetical protein 35 0.67 NITEU Q82UT0 (Q82UT0) GCN5-related N-acetyltransferase (EC 2.3.1... 35 0.67 LACLA Q9CF66 (Q9CF66) Spermidine acetyltransferase (EC 2.3.1.57) 35 0.67 BACC1 Q739G7 (Q739G7) Acetyltransferase, GNAT family 35 0.67 MYCPE Q8EWM2 (Q8EWM2) Hypothetical protein MYPE1810 34 0.88 MYCPE Q8EWE6 (Q8EWE6) Acetyltransferase GNAT family 34 0.88 LACLA Q9CHW9 (Q9CHW9) Hypothetical protein yfiL 34 0.88 LACPL Q88SM7 (Q88SM7) Prophage Lp4 protein 8, DNA primase/helicase 34 0.88 ENTFA Q837C4 (Q837C4) Acetyltransferase, GNAT family 34 0.88 CLOTE Q891D4 (Q891D4) Ribosomal-protein-alanine acetyltransferas... 34 0.88 CLOAB Q97I16 (Q97I16) Predicted acetyltransferase domain contain... 34 0.88 BACC1 Q72WY7 (Q72WY7) Hypothetical protein 34 0.88 VIBPA Q87G30 (Q87G30) Putative acetyltransferase 34 1.1 STAES ARGD2_STAES (Q8CSG1) Acetylornithine aminotransferase 2 (E... 
34 1.1 RICPR Q9ZEC5 (Q9ZEC5) 190 KD ANTIGEN (Sca1) 34 1.1 PSEPK Q88NL5 (Q88NL5) Acetyltransferase, GNAT family 34 1.1 LACPL Q88YF8 (Q88YF8) Acetyltransferase (Putative) 34 1.1 BACC1 Q737S7 (Q737S7) Acetyltransferase, GNAT family 34 1.1 BACC1 Q734H0 (Q734H0) Acetyltransferase, GNAT family family 34 1.1 BACAN Q81Q67 (Q81Q67) Acetyltransferase, GNAT family 34 1.1 BACAN Q81P77 (Q81P77) Acetyltransferase, GNAT family 34 1.1 Q8E1I7 (Q8E1I7) Acetyltransferase, GNAT family 33 1.5 Q8DXE6 (Q8DXE6) Hypothetical protein SAG1905 33 1.5 OCEIH Q8ET96 (Q8ET96) Hypothetical conserved protein 33 1.5 LISIN Q929M8 (Q929M8) Lin2246 protein 33 1.5 CLOTE Q895L4 (Q895L4) DNA topoisomerase I (EC 5.99.1.2) 33 1.5 CLOPE Q8XIF5 (Q8XIF5) Ribosomal-protein-alanine N-acetyltransferase 33 1.5 BRUSU Q8FVN1 (Q8FVN1) Acyl-CoA dehydrogenase family protein 33 1.5 BRUME Q8YCN6 (Q8YCN6) ACYL-COA DEHYDROGENASE (EC 1.3.99.-) 33 1.5 BACSU YQAR_BACSU (P45914) Hypothetical protein yqaR 33 1.5 BACAN Q81RF9 (Q81RF9) Acetyltransferase, GNAT family 33 1.5 VIBPA Q87NP1 (Q87NP1) Putative spermine/spermidine acetyltransfe... 33 2.0 THETN Q8RC65 (Q8RC65) Acetyltransferases 33 2.0 STRR6 Q8CYV9 (Q8CYV9) Hypothetical protein spr0850 33 2.0 STAES Q8CU70 (Q8CU70) Spermidine acetyltransferase 33 2.0 STAES Q8CS08 (Q8CS08) Hypothetical protein SE1483 33 2.0 RICPR Q9ZDP9 (Q9ZDP9) Hypothetical protein RP278 33 2.0 OCEIH Q8CXG0 (Q8CXG0) Diamine N-acetyltransferase (Spermine:sper... 33 2.0 CLOAB Q97J70 (Q97J70) Predicted acetyltransferase 33 2.0 BURMA Q9AI54 (Q9AI54) DedA family protein 33 2.0 BRAJA Q89YE3 (Q89YE3) Bll0009 protein 33 2.0 BACAN Q81NK7 (Q81NK7) Streptothricin acetyltransferase, putative 33 2.0 VIBUY Q7MI31 (Q7MI31) Ribosomal-protein-alanine acetyltransferase 33 2.6 OCEIH Q8EMB8 (Q8EMB8) Glucose-6-phosphate 1-dehydrogenase (EC 1.... 
33 2.6 OCEIH Q8EKW6 (Q8EKW6) Acetyltransferase 33 2.6 MYCPE Q8EWC2 (Q8EWC2) Oligoendopeptidase F 33 2.6 LISMO Q8Y8S4 (Q8Y8S4) Lmo0820 protein 33 2.6 CLOPE Q8XKM5 (Q8XKM5) Probable acetyltransferase 33 2.6 BUCAI SPED_BUCAI (P57304) S-adenosylmethionine decarboxylase pro... 33 2.6 AQUAE O67458 (O67458) Hypothetical protein aq_1482 33 2.6 YERPE Q8CZN5 (Q8CZN5) Acyltransferase for 30S ribosomal subunit ... 32 3.3 STRAW Q827N9 (Q827N9) Putative acetyltransferase 32 3.3 STAES Q8CU81 (Q8CU81) Spermidine acetyltransferase 32 3.3 RICPR Q9ZCN0 (Q9ZCN0) RIBOSOMAL-PROTEIN-ALANINE ACETYLTRANSFERAS... 32 3.3 OCEIH Q8ERM1 (Q8ERM1) Acetyltransferase 32 3.3 MYCGE RIBF_MYCGE (P47391) Putative riboflavin biosynthesis prote... 32 3.3 ENTFA Q836M4 (Q836M4) Spermine/spermidine acetyltransferase, put... 32 3.3 CHRVO Q7NRZ7 (Q7NRZ7) Probable acetyltransferase (EC 2.3.1.-) 32 3.3 CHLCV Q824H9 (Q824H9) Acetyltransferase, GNAT family 32 3.3 CHLMU SYR_CHLMU (Q9PJT8) Arginyl-tRNA synthetase (EC 6.1.1.19) (... 32 3.3 BACSU O34376 (O34376) Putative acetyl transferase (YobR protein) 32 3.3 BACC1 Q738E9 (Q738E9) Acetyltransferase, GNAT family 32 3.3 BACC1 Q736C5 (Q736C5) Acetyltransferase, GNAT family 32 3.3 BACC1 Q735P8 (Q735P8) Acetyltransferase, GNAT family 32 3.3 AQUAE YZ34_AQUAE (O66423) Hypothetical protein AA34 32 3.3 YERPE Q8ZCX6 (Q8ZCX6) Spermidine acetyltransferase (EC 2.3.1.57)... 32 4.4 STRR6 Q8DNN2 (Q8DNN2) Hypothetical protein spr1627 32 4.4 STAAM Q99RQ8 (Q99RQ8) Similar to transcription repressor of spor... 
32 4.4 LACLA Q9CJA2 (Q9CJA2) Acetyl transferase 32 4.4 CLOTE Q892J2 (Q892J2) Conserved protein 32 4.4 BRAJA Q89UA8 (Q89UA8) Hypothetical acetyltransferase 32 4.4 BACSU O34558 (O34558) YopR protein 32 4.4 BACAN Q81R63 (Q81R63) Hypothetical protein 32 4.4 VIBPA Q87MD0 (Q87MD0) Acetyltransferase-related protein 32 5.7 STRR6 Q8DND0 (Q8DND0) Transcriptional activator 32 5.7 OCEIH Q8CUS9 (Q8CUS9) Hypothetical conserved protein 32 5.7 LISIN Q92E28 (Q92E28) Lin0633 protein 32 5.7 LACPL Q88U49 (Q88U49) Glucose-6-phosphate 1-dehydrogenase (EC 1.... 32 5.7 CORDI Q6NHQ0 (Q6NHQ0) Putative acetyltransferase 32 5.7 BURMA Q62J98 (Q62J98) Ribosomal-protein-alanine acetyltransferas... 32 5.7 THETN Q8R764 (Q8R764) LysM-repeat proteins and domains 31 7.4 STRR6 Q8DPX8 (Q8DPX8) Hypothetical protein spr0952 31 7.4 STRMU Q8DVT1 (Q8DVT1) Putative ribosomal-protein-alanine acetylt... 31 7.4 SALTI SPED_SALTI (Q8Z9E3) S-adenosylmethionine decarboxylase pro... 31 7.4 SALTY SPED_SALTY (Q8ZRS4) S-adenosylmethionine decarboxylase pro... 31 7.4 SALCH Q57T90 (Q57T90) S-adenosylmethionine decarboxylase, proenzyme 31 7.4 RICCN Q92JP8 (Q92JP8) Cell surface antigen 31 7.4 NEIGI Q5FAH1 (Q5FAH1) Hypothetical protein 31 7.4 LISIN Q92DJ7 (Q92DJ7) Lin0816 protein 31 7.4 LACJO Q74J74 (Q74J74) Hypothetical protein 31 7.4 GEOSL Q74A59 (Q74A59) Sensory box histidine kinase 31 7.4 ENTFA Q836Z2 (Q836Z2) Acetyltransferase, GNAT family 31 7.4 ENTFA Q82YJ4 (Q82YJ4) Toxin ABC transporter, ATP-binding/permeas... 31 7.4 CLOAB Q97M70 (Q97M70) Predicted metal-dependent hydrolase 31 7.4 CLOAB Q97HI5 (Q97HI5) Predicted flavodoxin 31 7.4 BRAJA Q89R32 (Q89R32) Acyl-CoA dehydrogenase 31 7.4 BACC1 Q735F5 (Q735F5) Streptothricin acetyltransferase, putative 31 7.4 BACAN Q81YS9 (Q81YS9) Acetyltransferase, GNAT family 31 7.4 VIBPA Q87HD3 (Q87HD3) Hypothetical protein VPA1032 31 9.7 VIBCH Q9KL03 (Q9KL03) Spermidine n1-acetyltransferase 31 9.7 THEMA SYE2_THEMA (Q9X2I8) Glutamyl-tRNA synthetase 2 (EC 6.1.1.1... 
31 9.7 THETN ISPH_THETN (Q8RA76) 4-hydroxy-3-methylbut-2-enyl diphospha... 31 9.7 STRCO Q9S2L5 (Q9S2L5) Hypothetical protein SCO1988 31 9.7 STRP1 Q99XX8 (Q99XX8) Putative pullulanase 31 9.7 STRP1 Q99YF0 (Q99YF0) Hypothetical protein SPy1734 (Acetyltransf... 31 9.7 STAES Q8CTP7 (Q8CTP7) Hypothetical protein SE0368 31 9.7 STAES COAD_STAES (Q8CSZ5) Phosphopantetheine adenylyltransferase... 31 9.7 MYCLEH Y3433_MYCTU (O06250) Hypothetical protein Rv3433c/MT3539 31 9.7 MYCTU Y3433_MYCTU (O06250) Hypothetical protein Rv3433c/MT3539 31 9.7 MYCBO Q7TWI6 (Q7TWI6) Hypothetical protein Mb3463c 31 9.7 LISMO Q8Y5C4 (Q8Y5C4) Lmo2141 protein 31 9.7 LISIN Q929Z8 (Q929Z8) Lin2125 protein 31 9.7 ENTFA Q837Z5 (Q837Z5) Acetyltransferase, GNAT family 31 9.7 ENTFA Q831M9 (Q831M9) Ribosomal-protein-alanine acetyltransferas... 31 9.7 CLOTE Q895T0 (Q895T0) Acetyltransferase (EC 2.3.1.-) 31 9.7 CLOPE Q8XMF9 (Q8XMF9) Hypothetical protein CPE0730 31 9.7 BRUSU Q8FY73 (Q8FY73) Outer membrane autotransporter 31 9.7 BACSU YCBJ_BACSU (P42242) Hypothetical protein ycbJ 31 9.7 BACHD Q9KE57 (Q9KE57) BH1001 protein 31 9.7 BACC1 Q73CD9 (Q73CD9) Glycerol-3-phosphate dehydrogenase, aerobi... 
31 9.7 BACAN Q81VL8 (Q81VL8) Oxidoreductase, FAD-binding 31 9.7 BACAN Q81U57 (Q81U57) Glycerol-3-phosphate dehydrogenase, aerobic 31 9.7 BACAN Q81NU7 (Q81NU7) Acetyltransferase, GNAT family 31 9.7 >STAAM AACA_STAAM (P0A0C0) Bifunctional AAC/APH [Includes: 6'-aminoglycoside N-acetyltransferase (EC 2.3.1.-) (AAC(6')); 2''-aminoglycoside phosphotransferase (EC 2.7.1.-) (APH(2''))] Length = 479 Score = 959 bits (2480), Expect = 0.0 Identities = 467/479 (97%), Positives = 467/479 (97%) Query: 1 MNIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFR 60 MNIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFR Sbjct: 1 MNIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFR 60 Query: 61 VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120 VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI Sbjct: 61 VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120 Query: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYD 180 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYD Sbjct: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYD 180 Query: 181 DNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE 240 DNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE Sbjct: 181 DNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE 240 Query: 241 KAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKR 300 KAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKR Sbjct: 241 KAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKR 300 Query: 301 DIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNA 360 DIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNA Sbjct: 301 DIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNA 360 Query: 361 TTVFEGKKCLCHNDFSCNHLLLDGNNRLTXXXXXXXXXXXXEYCDFIYLLEDSEEEIGTN 420 TTVFEGKKCLCHNDFSCNHLLLDGNNRLT EYCDFIYLLEDSEEEIGTN Sbjct: 361 TTVFEGKKCLCHNDFSCNHLLLDGNNRLTGIIDFGDSGIIDEYCDFIYLLEDSEEEIGTN 420 Query: 421 
FGEDILRMYGNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENGRKEIYKRTYKD 479 FGEDILRMYGNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENGRKEIYKRTYKD Sbjct: 421 FGEDILRMYGNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENGRKEIYKRTYKD 479 >ENTFA AACA_ENTFA (P0A0C2) Bifunctional AAC/APH [Includes: 6'-aminoglycoside N-acetyltransferase (EC 2.3.1.-) (AAC(6')); 2''-aminoglycoside phosphotransferase (EC 2.7.1.-) (APH(2''))] Length = 479 Score = 959 bits (2480), Expect = 0.0 Identities = 467/479 (97%), Positives = 467/479 (97%) Query: 1 MNIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFR 60 MNIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFR Sbjct: 1 MNIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFR 60 Query: 61 VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120 VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI Sbjct: 61 VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120 Query: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYD 180 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYD Sbjct: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYD 180 Query: 181 DNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE 240 DNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE Sbjct: 181 DNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE 240 Query: 241 KAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKR 300 KAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKR Sbjct: 241 KAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKR 300 Query: 301 DIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNA 360 DIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNA Sbjct: 301 DIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNA 360 Query: 361 TTVFEGKKCLCHNDFSCNHLLLDGNNRLTXXXXXXXXXXXXEYCDFIYLLEDSEEEIGTN 420 TTVFEGKKCLCHNDFSCNHLLLDGNNRLT EYCDFIYLLEDSEEEIGTN Sbjct: 361 TTVFEGKKCLCHNDFSCNHLLLDGNNRLTGIIDFGDSGIIDEYCDFIYLLEDSEEEIGTN 420 Query: 421 
FGEDILRMYGNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENGRKEIYKRTYKD 479 FGEDILRMYGNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENGRKEIYKRTYKD Sbjct: 421 FGEDILRMYGNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENGRKEIYKRTYKD 479 >BACAN Q81P54 (Q81P54) Acetyltransferase, GNAT family Length = 177 Score = 168 bits (425), Expect = 4e-41 Identities = 76/174 (43%), Positives = 116/174 (66%), Gaps = 1/174 (0%) Query: 5 ENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRVIIE 64 ++ + +R ++++D P++ KWLTD VL++Y GRD ++E + H+ R +IE Sbjct: 5 KDNVSVRYVVEEDAPIISKWLTDPEVLQYYEGRDDPQSVEMVLNHFIHNPNSPEKRCLIE 64 Query: 65 YNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFL 124 +++VPIGY Q+Y + E T Y Y ++ V+GMDQFIGEP YW KGIGT+++K ++ Sbjct: 65 FDDVPIGYIQMYPVDSESKTLYGYEESQN-VWGMDQFIGEPTYWGKGIGTKFVKAAITYI 123 Query: 125 KKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYR 178 E A A+ +DP NN RAI+ Y+K GF+ ++ L EHELHEG EDC++MEY+ Sbjct: 124 LSEMGAEAIAMDPKVNNERAIKCYEKCGFKKVKILKEHELHEGVLEDCWMMEYK 177 >BACC1 Q735Z9 (Q735Z9) Acetyltransferase, GNAT family Length = 359 Score = 159 bits (403), Expect = 1e-38 Identities = 74/185 (40%), Positives = 118/185 (63%), Gaps = 1/185 (0%) Query: 5 ENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRVIIE 64 ++ + +R + ++D P++ KWLT+ VL++Y GRD +++ + H+ R +IE Sbjct: 5 KDNVSVRYVKEEDAPIISKWLTEPEVLQYYEGRDNPQSVDMVLDHFIHNPNSHEKRCLIE 64 Query: 65 YNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFL 124 +++VPIGY Q+Y + E T Y Y ++ V+GMDQFIGEP YW KGIGT+ ++ ++ Sbjct: 65 FDDVPIGYIQMYPVDSEWKTLYGYEESQH-VWGMDQFIGEPTYWGKGIGTKLVQTAITYI 123 Query: 125 KKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYDDNAT 184 + A A+ +DP NN RAI+ Y+K GF+ ++ L EHELHEG EDC++MEY+ + Sbjct: 124 MENTGAEAIAMDPKVNNERAIKCYEKCGFKKVKVLKEHELHEGVLEDCWMMEYKQRELRE 183 Query: 185 NVKAM 189 KA+ Sbjct: 184 MKKAL 188 >BACAN Q81T90 (Q81T90) Aminoglycoside phophotransferase family protein Length = 300 Score = 67.0 bits (162), Expect = 1e-10 Identities = 51/208 (24%), Positives = 95/208 (45%), Gaps = 12/208 (5%) Query: 190 
KYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNT 249 K I+ N + S + G+D+VA +VN+E +F+ EK + L Sbjct: 5 KQYIKEALPNLSIHSYKQNEEGWDNVAVIVNDELLFRFPRKQEYAMRIPLEKELCTILTQ 64 Query: 250 NLETNVKIP--NIEYSYISDELSILGYKE-IKGTFLTPEIYSTMSEEEQNLLKRDIASFL 306 +L+ +++P ++ Y SDE+ + Y I G L EI + + E+E+ ++ +A+FL Sbjct: 65 SLQ-EIEVPQYHLIYKNESDEVPLCSYYTLIHGEPLKTEIVANLDEKERKIIITQLATFL 123 Query: 307 RQMHGLDYTDISECTIDNKQNVL---EEYILLRETIYNDLTDIEKDYIESFMERL---NA 360 +H + ++ ++ + E L E + N LT +K + E A Sbjct: 124 AALHSIPLKSVTALGFPTEKTLTYWKELQTKLNEYVTNSLTSFQKSTLNRLFENFFACIA 183 Query: 361 TTVFEGKKCLCHNDFSCNHLLLDGNNRL 388 T+ F + H DF+ +H+L D N++ Sbjct: 184 TSAF--PNAIIHADFTHHHILFDKQNKI 209 >BACC1 Q73BC3 (Q73BC3) Aminoglycoside phophotransferase family protein Length = 300 Score = 62.0 bits (149), Expect = 4e-09 Identities = 51/206 (24%), Positives = 92/206 (44%), Gaps = 10/206 (4%) Query: 190 KYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNT 249 K I+ N + S + G+D+VA +VN+E +F+ EK + L+ Sbjct: 5 KQYIKEALPNLSIHSYKQNEEGWDNVAIIVNDELLFRFPRKQEYAMRIPLEKELCTLLSC 64 Query: 250 NL-ETNVKIPNIEYSYISDELSILGYKE-IKGTFLTPEIYSTMSEEEQNLLKRDIASFLR 307 +L E V ++ Y +D + + Y I G L EI +T+ ++E+ L +A+FL Sbjct: 65 SLHEIEVPKYHLFYEKNTDAIPLCSYYTLIHGEPLKTEIVTTLEKQERKALITQLATFLA 124 Query: 308 QMHGLDYTDISECTIDNKQNVL---EEYILLRETIYNDLTDIEKDYIESFMERL---NAT 361 +H + ++ ++ + E L E + N LT +K + E AT Sbjct: 125 ALHSIPLKSVTALGFPIEKTLTYWKELQAKLNEYVTNSLTSFQKSTLNRLFENFFACLAT 184 Query: 362 TVFEGKKCLCHNDFSCNHLLLDGNNR 387 + F+ + H DF+ +H+L D N+ Sbjct: 185 SKFQ--NTIIHADFTHHHILFDKQNK 208 >BRAJA Q89WN0 (Q89WN0) Bll0648 protein Length = 161 Score = 59.3 bits (142), Expect = 3e-08 Identities = 44/145 (30%), Positives = 75/145 (51%), Gaps = 13/145 (8%) Query: 11 RTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRVIIEYNNVPI 70 R + D PL+ +WL + V E++G +++ L S EP D+ I+ + P Sbjct: 8 RPMTAADLPLIRRWLGEAHVREWWGDPGEQFALVS--GDLDEPAMDQF---IVLAGDKPF 62 Query: 71 GYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKE--R 128 GY Q Y++ + P+ 
G+DQFIGE + ++G G+ +I+ +F+ ++ Sbjct: 63 GYLQCYRL--TAWNTGFGPQPGG-TRGIDQFIGESDMIARGHGSAFIR---QFVDEQLRH 116 Query: 129 NANAVILDPHKNNPRAIRAYQKSGF 153 V+ DP N RA+RAY+K+GF Sbjct: 117 GLPRVVTDPDPLNSRAVRAYEKAGF 141 >BACHD Q9K9M4 (Q9K9M4) BH2621 protein Length = 197 Score = 56.2 bits (134), Expect = 2e-07 Identities = 35/159 (22%), Positives = 78/159 (49%), Gaps = 6/159 (3%) Query: 2 NIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRV 61 ++V ++ R + DD ++ W+ +E V+ ++ L KKH D+ + Sbjct: 15 HVVNKKLSFRHVTMDDVDMLHSWMHEEHVIPYW---KLNIPLVDYKKHLQTFLNDDHQTL 71 Query: 62 II-EYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120 ++ N VP+ Y + Y + +++ +Y YP +E G+ IG Y +G+ + I Sbjct: 72 MVGAINGVPMSYWESYWVKEDIIANY-YP-FEEHDQGIHLLIGPQEYLGQGLIYPLLLAI 129 Query: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 + +E + N ++ +P + N + I ++K GF+ ++++ Sbjct: 130 MQQKFQEPDTNTIVAEPDRRNKKMIHVFKKCGFQPVKEV 168 >BACC1 Q739G2 (Q739G2) 6'-aminoglycoside N-acetyltransferase/2''-aminoglycoside phosphotransferase, putative (EC 2.3.1.-) Length = 293 Score = 55.1 bits (131), Expect = 5e-07 Identities = 57/289 (19%), Positives = 125/289 (43%), Gaps = 24/289 (8%) Query: 193 IEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLE 252 ++ + +++S+ I G ++ +VN+ +F+ +KG K + L Sbjct: 11 LQRLYPELQINSVYINEIGQNNDVLIVNDNIVFRFP---KYEKGIQKLRIETQLLEKIRP 67 Query: 253 -TNVKIPNIEYSYISDELS---ILGYKEIKGTFLTPEIYSTMSEEEQ-NLLKRDIASFLR 307 ++IPN Y +E+ GY+ I+G +++ +++E+Q L +A FL+ Sbjct: 68 FITLQIPNPSYQGFQNEVPGKVFAGYEMIEGDPFWKNVFTEINDEKQLQKLAYTLARFLK 127 Query: 308 QMHGLD---YTDISEC-TIDNKQNVLEEYILLRETIYNDLTDI-EKDYIESFMERLNATT 362 ++H + + I +C + D + Y L+E +Y + ++ K+ SF LN ++ Sbjct: 128 ELHEIPLSTFESIMQCDSTDMYSEINSLYSQLKEHVYPFMRNVARKEVSTSFELYLNESS 187 Query: 363 VFEGKKCLCHNDFSCNHLLLDGNNR-LTXXXXXXXXXXXXEYCDFIYLLEDSEEEIGTNF 421 F L H DF ++L + ++ DF +L ++ Sbjct: 188 HFNFTPSLVHGDFGMTNILYSATKKNISGVIDFGGASIGDPAYDFAGIL--------ASY 239 Query: 422 GEDILRMYGNI--DIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENG 468 GE+ L+++ ++E KE + + ++ ++G+ N ++ E 
G Sbjct: 240 GEEFLQLFEAYYPNLEAVKERMYFYKSTFALQEALFGVLNNDKKAFEAG 288 >THEMA Q9X063 (Q9X063) Hypothetical protein Length = 182 Score = 52.4 bits (124), Expect = 3e-06 Identities = 27/75 (36%), Positives = 41/75 (54%), Gaps = 1/75 (1%) Query: 101 FIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLP 160 F+G P YWS+G GT ++++ F+ E N N + L N RA R Y+K GF++ L Sbjct: 94 FLGRP-YWSQGYGTDAMRVLVRFIFNEMNMNKIKLHVFSFNERAKRVYEKIGFKVEGILR 152 Query: 161 EHELHEGKKEDCYLM 175 + EG+ D +M Sbjct: 153 QELFREGRYHDVIVM 167 >CLOTE Q896X4 (Q896X4) Putative acetyltransferase Length = 186 Score = 48.9 bits (115), Expect = 3e-05 Identities = 44/173 (25%), Positives = 73/173 (42%), Gaps = 15/173 (8%) Query: 6 NEICIRTLIDDDFPLMLKWLTDE---RVLEFYGGRDK-KYTLESLKKHYTEPWEDEVFRV 61 + I I L ++D + KW D RV +F K + + + F + Sbjct: 10 DRIKITALREEDIETITKWYEDTNFLRVFDFNPSAPKTSWKIREWLMEEVSSSNNYFFAI 69 Query: 62 IIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIF 121 + N +GY +I K+ + V G+ IG+ + W KG G+ + L Sbjct: 70 RKKDANKILGYVEIEKI-----------NWNNGVGGIAIGIGDSSEWGKGYGSEALSLAM 118 Query: 122 EFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYL 174 +F +E N + + L N RAI++Y+K GF+ E +GK+ D YL Sbjct: 119 DFAFRELNLHRLQLITISYNERAIKSYEKLGFKKEGIYREAVNRDGKRYDIYL 171 >BACHD Q9KB15 (Q9KB15) BH2121 protein Length = 181 Score = 48.1 bits (113), Expect = 6e-05 Identities = 28/78 (35%), Positives = 36/78 (46%), Gaps = 13/78 (16%) Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161 IGE YW KG G ++L+ + E N + V L N +AIR Y+K GF+ Sbjct: 95 IGEKTYWGKGYGFEALRLLLNYAFLEMNLHRVSLRVFSFNKKAIRLYEKLGFK------- 147 Query: 162 HELHEGKKEDCYLMEYRY 179 HEG C YRY Sbjct: 148 ---HEGTSRQCL---YRY 159 >STAES Q8CQT6 (Q8CQT6) Spermidine acetyltransferase Length = 177 Score = 47.0 bits (110), Expect = 1e-04 Identities = 42/169 (24%), Positives = 71/169 (42%), Gaps = 14/169 (8%) Query: 8 ICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFR-VIIEYN 66 + IR L D + L +E + Y + +L L+ YT+ DE R I+E Sbjct: 3 
LIIRALEKTDLSF-IHHLNNEYSIMSYWFEEPYQSLSELENLYTKHILDETERRFIVEEG 61 Query: 67 NVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKK 126 + +G ++ ++ +T E++ +D P Y + G + K+ ++ Sbjct: 62 STSVGVVELLEIN-------FIHRTCEVLIIID-----PQYANNGYAKKAFKMAIDYAFL 109 Query: 127 ERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLM 175 N N V L N +A+ YQ + F I L EH G+ DCY+M Sbjct: 110 VLNMNKVYLYVDIKNEKAVHIYQSNNFEIEGTLKEHFYTRGEYRDCYVM 158 >VIBCH Q9KMC2 (Q9KMC2) Acetyltransferase, putative Length = 158 Score = 45.1 bits (105), Expect = 5e-04 Identities = 37/166 (22%), Positives = 69/166 (41%), Gaps = 18/166 (10%) Query: 15 DDDFPLMLKWLTDERVLEFYGGRDKKY--TLESLKKHYTEPWEDEVFRVIIEYNNVPIGY 72 + DF L++KW+ + + +GG + T E + H ++ EVF +++ G+ Sbjct: 8 ESDFDLLIKWIDSDELNYLWGGPAYVFPLTYEQIHSHCSKA---EVFPYLLKVKGRHAGF 64 Query: 73 GQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANA 132 ++YK+ DE Y FI Y +G+ + L+ + + + +A Sbjct: 65 VELYKVTDEQYRICRV------------FISNA-YRGQGLSKSMLMLLIDKARLDFSATK 111 Query: 133 VILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYR 178 + L + N A + Y+ GF ++ GK D ME R Sbjct: 112 LSLGVFEQNTVARKCYESLGFEVVSTEIGTRAFNGKLWDLVRMEKR 157 >BACSU BLTD_BACSU (P39909) Spermine/spermidine acetyltransferase (EC 2.3.1.57) Length = 152 Score = 44.7 bits (104), Expect = 6e-04 Identities = 40/153 (26%), Positives = 69/153 (45%), Gaps = 16/153 (10%) Query: 8 ICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKK-HYTEPWEDEVFRVIIEYN 66 I I+ + DD+ +L + L + K LE K+ HY +P V + Y Sbjct: 3 INIKAVTDDNRAAILDLHVSQNQLSYI--ESTKVCLEDAKECHYYKP-------VGLYYE 53 Query: 67 NVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKK 126 +G+ MY L+ +Y + V+ +D+F + Y KG+G + +K + + L + Sbjct: 54 GDLVGFA----MYG-LFPEYDEDNKNGRVW-LDRFFIDERYQGKGLGKKMLKALIQHLAE 107 Query: 127 ERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 + L +NN AIR YQ+ GF+ +L Sbjct: 108 LYKCKRIYLSIFENNIHAIRLYQRFGFQFNGEL 140 >BACC1 Q739N8 (Q739N8) Spermidine acetyltransferase, putative Length = 156 Score = 43.9 bits (102), Expect = 0.001 Identities = 18/64 (28%), Positives = 36/64 (56%) 
Query: 98 MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE 157 +D+F+ + Y KG R+++L+ +FL+ + + L H +N A+ Y+ GFR+ Sbjct: 74 LDRFMIDQQYQGKGYAKRFLRLLIQFLQNKFECKTIYLSLHPDNKLAMGLYESFGFRLNG 133 Query: 158 DLPE 161 D+ + Sbjct: 134 DIDD 137 >LACPL Q88SW8 (Q88SW8) Acetyltransferase (Putative) Length = 180 Score = 43.5 bits (101), Expect = 0.001 Identities = 26/74 (35%), Positives = 33/74 (44%) Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161 IG+P+ G GT + LI + E N V LD NP AI YQ SGF Sbjct: 93 IGDPDERGHGYGTETLSLILNYAFNELNLYKVCLDVIATNPAAIAVYQNSGFEFEGTNKR 152 Query: 162 HELHEGKKEDCYLM 175 +G++ D Y M Sbjct: 153 AIKRDGQRIDLYHM 166 >VIBUY Q7MK89 (Q7MK89) Putative acetyltransferase Length = 158 Score = 43.1 bits (100), Expect = 0.002 Identities = 30/140 (21%), Positives = 66/140 (47%), Gaps = 14/140 (10%) Query: 17 DFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRVIIEYNNVPIGYGQIY 76 DF L+++W+ + + +GG + L S ++ ++EVF +++ N G+ ++Y Sbjct: 10 DFHLLIEWIDSDELNYLWGGPAYTFPLTS-EQIIAHCAKEEVFPYLLKVNGQNAGFVELY 68 Query: 77 KMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILD 136 K+ +E Y FI +Y +G+ I L+ + ++ + +A + L Sbjct: 69 KVTNEHYRICRV------------FISN-SYRGQGLSKSMIMLLIDKVRSDFSATMLSLG 115 Query: 137 PHKNNPRAIRAYQKSGFRII 156 ++N A + Y+ GF ++ Sbjct: 116 VFEHNTVARKCYESLGFNVV 135 >DEIRA Q9RW71 (Q9RW71) Acetyltransferase, putative Length = 207 Score = 43.1 bits (100), Expect = 0.002 Identities = 21/70 (30%), Positives = 38/70 (54%) Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161 I +P +W G G + ++L + E +A+ + L N R +RA Q++G+R +PE Sbjct: 107 IYDPAHWGGGFGRQALRLWTDATFAETDAHLITLTTWSGNERMVRAAQRAGYRECARIPE 166 Query: 162 HELHEGKKED 171 L +G++ D Sbjct: 167 ARLWQGQRWD 176 >BACAN Q81RM0 (Q81RM0) Spermidine acetyltransferase, putative Length = 156 Score = 43.1 bits (100), Expect = 0.002 Identities = 18/64 (28%), Positives = 35/64 (54%) Query: 98 MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE 157 +D+F+ + Y KG R+++L+ +FL+ + + L H N A+ 
Y+ GFR+ Sbjct: 74 LDRFMIDQQYQGKGYAKRFLRLLIQFLQHKFECKTIYLSLHPENKLAMGLYESFGFRLNG 133 Query: 158 DLPE 161 D+ + Sbjct: 134 DIDD 137 >LACJO Q74K74 (Q74K74) Hypothetical protein Length = 189 Score = 42.4 bits (98), Expect = 0.003 Identities = 41/162 (25%), Positives = 71/162 (43%), Gaps = 25/162 (15%) Query: 17 DFPLM---LKWLTDERVLEFYGGRDKKYTLESLKKHYTEP-WEDEVFRVIIEYNNV--PI 70 DFPL+ LK + DE ++ + + +K + P + R+ +E +++ PI Sbjct: 9 DFPLVYPILKQIFDEMDMDTIKALPESQFYDLMKHGFYSPHYRYSHNRMWVETDDLDRPI 68 Query: 71 GYGQIYKMYDELYTDYH----YPKT----DEIVYG----------MDQFIGEPNYWSKGI 112 G +Y D+ D YPK D +++ +D P +W KGI Sbjct: 69 GLIVMYGYDDQGLIDISLKSAYPKVGLPLDAVIFSDKEALPHEWYLDAIAVSPKHWGKGI 128 Query: 113 GTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154 G + IK I + ++ + L+ ++NPRA R Y GF+ Sbjct: 129 GQKLIK-IAPGIARQNGYKKISLNVDQDNPRAARLYDYMGFK 169 >BACC1 Q72YV7 (Q72YV7) Acetyltransferase, GNAT family Length = 174 Score = 42.4 bits (98), Expect = 0.003 Identities = 35/152 (23%), Positives = 63/152 (41%), Gaps = 14/152 (9%) Query: 6 NEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKK---HYTEPWEDEVFRVI 62 N I +R +DD KW D V+ KY+ + +K + + + + Sbjct: 5 NRIQLRKFSEDDILTYYKWHNDIDVMSSTTLNLDKYSFQDTEKLCQQFIHSPNAKSYIIE 64 Query: 63 IEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFE 122 + N+PIG + ++ D + + I+ IG+ +YW +G G L+ Sbjct: 65 EKATNLPIGITSL------IHIDSYNRNAECIID-----IGKKDYWGQGYGKEAFTLLLN 113 Query: 123 FLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154 + E N + + L N RAI+ Y+ GF+ Sbjct: 114 YAFLELNLHRLSLRVFSFNDRAIKLYKSLGFQ 145 >BACC1 Q734R6 (Q734R6) Acetyltransferase, GNAT family Length = 176 Score = 42.0 bits (97), Expect = 0.004 Identities = 37/146 (25%), Positives = 63/146 (43%), Gaps = 16/146 (10%) Query: 15 DDDFPLMLKWLTDERVLEFYGGRDKKYTLES--LKKHYTEPWEDE----VFRVIIEYNNV 68 ++DF ++ W+ + +GG + L + LK + +D VF+ I E N+ Sbjct: 9 EEDFQQLIDWIPNAEFSLQWGGPAFTFPLTNAQLKNYLQNANKDNAIKYVFKAIDETNSE 68 Query: 69 PIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKER 128 IG+ + + KT+E IG N KG GT+ + + +F +E Sbjct: 69 
VIGHISLGNV----------DKTNESARIGKVLIGSTNSRGKGYGTQMMTAVLKFAFEEL 118 Query: 129 NANAVILDPHKNNPRAIRAYQKSGFR 154 + V L N AI+ Y+K GF+ Sbjct: 119 KLHKVTLGVFDFNESAIKCYKKVGFQ 144 >CLOPE Q8XKH9 (Q8XKH9) Hypothetical protein CPE1416 Length = 193 Score = 41.6 bits (96), Expect = 0.005 Identities = 31/106 (29%), Positives = 45/106 (42%), Gaps = 5/106 (4%) Query: 82 LYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNN 141 LY Y EI Y + E N+W KG+ + IK I F + + N +I NN Sbjct: 93 LYNIDFYSNNTEIGYTI-----EKNFWRKGVASECIKAIENFAFETLDMNRIIAMIDSNN 147 Query: 142 PRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYDDNATNVK 187 +I+ +K GF L EH ++ K E + Y + VK Sbjct: 148 ISSIKLSEKLGFHRDGILREHYYNKSKDEYINICVYSLIKSDIKVK 193 >BACAN Q81SQ8 (Q81SQ8) Acetyltransferase, GNAT family Length = 157 Score = 41.2 bits (95), Expect = 0.007 Identities = 33/126 (26%), Positives = 54/126 (42%), Gaps = 17/126 (13%) Query: 34 YGGRDKKYTLESLKKHYTEPWEDEV---FRVIIEYNNVPIGYGQIYKMYDELYTDYHYPK 90 Y G+ Y +E+ ++ E DE ++ N IGY + K+ D Sbjct: 22 YEGKYSFYDIEADEEDLAEFLHDESRGDHTFSVKENGTLIGYFTVCKITDG--------- 72 Query: 91 TDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQK 150 T +I G+ PN G G ++I I F K++ N + L N RAI+ Y++ Sbjct: 73 TVDIGLGI-----RPNITGNGFGLQFINAILAFSKEKYGCNYITLSVATFNKRAIKVYKR 127 Query: 151 SGFRII 156 +GF + Sbjct: 128 AGFEAV 133 >VIBCH Q9K330 (Q9K330) Acetyltransferase, putative Length = 178 Score = 40.8 bits (94), Expect = 0.009 Identities = 21/80 (26%), Positives = 40/80 (50%), Gaps = 10/80 (12%) Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161 IG+ +W KG+GT +L+ + +E + + L + +N A++AY+ +G++ Sbjct: 95 IGDKAFWGKGLGTEVTRLVTNYGFRELGLHRIELTAYCDNVAAVKAYENAGYQ------- 147 Query: 162 HELHEGKKEDCYLMEYRYDD 181 HEG K + R+ D Sbjct: 148 ---HEGIKRESGYRNGRFMD 164 >VIBUY Q7MK96 (Q7MK96) Putative acetyltransferase Length = 158 Score = 40.4 bits (93), Expect = 0.012 Identities = 35/166 (21%), Positives = 70/166 (42%), Gaps = 18/166 (10%) Query: 15 DDDFPLMLKWLTDERVLEFYGGRDKKY--TLESLKKHYTEPWEDEVFRVIIEYNNVPIGY 
72 + +F ++ W+ + + +GG + T E + H ++ EVF +++ N G+ Sbjct: 8 ESNFDQLIAWIDSDELNYLWGGPAYVFPLTYEQIHAHCSKA---EVFPYLLKVNGRHAGF 64 Query: 73 GQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANA 132 ++YK+ DE Y VY + + G +G+ + L+ + + + +A Sbjct: 65 VELYKVTDEQYRICR-------VYISNAYRG------RGLSKSMLMLLIDKARLDFSATK 111 Query: 133 VILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYR 178 + L + N A + Y+ GF ++ GK D ME R Sbjct: 112 LSLGVFEQNTVARKCYESLGFEVVSTEIGTRSFNGKLWDLVRMEKR 157 >WIGBR Q8D3I4 (Q8D3I4) Imp protein Length = 723 Score = 40.0 bits (92), Expect = 0.016 Identities = 60/261 (22%), Positives = 104/261 (39%), Gaps = 50/261 (19%) Query: 57 EVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRY 116 +++ + + N+PI Y +K+Y E Y D Y + +I Y + + Y+ K +Y Sbjct: 191 KIWNAKLNFKNIPIFYVPFFKVY-EKYNDIFY--SPKISYKNNNGLSLSFYYKKIFFDKY 247 Query: 117 IKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLME 176 F F+ K + ++L NN + Y S F + KK + Y++ Sbjct: 248 ---FFYFIPKYNSDGTILL----NN----KIYYSSDF------------DKKKINLYIL- 283 Query: 177 YRYDDNATNVKAMKYLIEHYFDNFKVD---------SIEIIGSGYDSVAYLVNNEYI--F 225 +D L ++YF N K+D + I +D + NE + F Sbjct: 284 --FDIKKNKNNWFIDLKQNYFFNKKLDILYIYKKSNNFIIFNKMFDIEKNFLQNEILEKF 341 Query: 226 KTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPE 285 K+ N K + K F N N +K P++ +SY ++ K K F+ Sbjct: 342 NLKYFYNNWKLKLEYKKFIIFDNKNF-NYIKFPHVYFSYFDNK-----NKNFKFNFVGKF 395 Query: 286 IYSTMSEEEQNLLKRDIASFL 306 Y EE++ +L +I FL Sbjct: 396 SY----EEDKKILHINIEPFL 412 >BACSU P94482 (P94482) YnaD Length = 170 Score = 39.7 bits (91), Expect = 0.021 Identities = 39/156 (25%), Positives = 66/156 (42%), Gaps = 17/156 (10%) Query: 1 MNIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWED--EV 58 M+I + IR D+ + ++ +D V+++ + +T E K + D E Sbjct: 1 MHITTKRLLIREFEFKDWQAVYEYTSDSNVMKYIP--EGVFTEEDAKAFVNKNKGDNAEK 58 Query: 59 FRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIK 118 F VI+ + IG+ YK + E T EI + + PNY +KG + + Sbjct: 59 FPVILRDEDCLIGHIVFYKYFGE--------HTYEIGW-----VFNPNYQNKGYASEAAQ 
105 Query: 119 LIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154 I E+ KE N + +I N + R +K G R Sbjct: 106 AILEYGFKEMNLHRIIATCQPENIPSYRVMKKIGMR 141 >BACC1 Q73A91 (Q73A91) Acetyltransferase, GNAT family Length = 177 Score = 39.7 bits (91), Expect = 0.021 Identities = 23/85 (27%), Positives = 37/85 (43%), Gaps = 5/85 (5%) Query: 87 HYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIR 146 H K E+ Y ++G+P YW G GT K + + E + N + NNP + R Sbjct: 86 HIHKRGELAY----WVGKP-YWGNGFGTEAAKTLLHYGFNELHLNKIFAAAFTNNPGSWR 140 Query: 147 AYQKSGFRIIEDLPEHELHEGKKED 171 +K G + +H + G+ D Sbjct: 141 IMEKIGMKHEGTFKQHVVKSGEPMD 165 >THETN Q8RC99 (Q8RC99) Acetyltransferases Length = 149 Score = 39.3 bits (90), Expect = 0.027 Identities = 35/149 (23%), Positives = 66/149 (44%), Gaps = 29/149 (19%) Query: 43 LESLKKHYTEPWEDEVF-----------RVIIEYNNVPIGYGQIYKMYDELYTDYHYPKT 91 +E K +T PW E F ++ E + +GY + + DE + T Sbjct: 18 MEIEKLSFTTPWSREAFVGEVTKNSCARYIVAEVDKKVVGYAGFWVVLDEGHI------T 71 Query: 92 DEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKS 151 + V+ P Y KGIG+R ++ + + L K+ ++ L+ ++N A Y+K Sbjct: 72 NIAVH--------PEYRGKGIGSRLMEGLID-LAKKNGITSMTLEVRESNLVAQNLYKKF 122 Query: 152 GFRIIEDLPEHELHEGKKEDCYLMEYRYD 180 GF+++ ++ ED +M ++YD Sbjct: 123 GFKVLG--RREGYYQDNNEDAIVM-WKYD 148 >STRAW Q82IB6 (Q82IB6) Putative acetyltransferase Length = 168 Score = 39.3 bits (90), Expect = 0.027 Identities = 21/54 (38%), Positives = 33/54 (61%), Gaps = 5/54 (9%) Query: 101 FIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154 ++G P YW++GIG+R + L FL++ER + DP N ++R +K GFR Sbjct: 100 WLGRP-YWARGIGSRALGL---FLRRERT-RPLYADPFHGNTASVRLLEKHGFR 148 >LISIN Q92E38 (Q92E38) Lin0623 protein Length = 177 Score = 39.3 bits (90), Expect = 0.027 Identities = 22/69 (31%), Positives = 33/69 (47%) Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHE 166 +W GIGT ++ + + KK V L+ N RAI Y+K GF ++P E Sbjct: 105 FWGLGIGTLIMEGLIKHAKKTERLKLVYLEAVSENKRAINLYKKFGFIEAGEIPALMQVE 164 Query: 167 GKKEDCYLM 175 G+ D +M Sbjct: 165 
GRYLDVTMM 173 >STRCO O69977 (O69977) Hypothetical protein SCO5801 Length = 231 Score = 38.9 bits (89), Expect = 0.036 Identities = 30/143 (20%), Positives = 65/143 (45%), Gaps = 6/143 (4%) Query: 14 IDDDFPLMLKWLTDERVLEFYG-GRDKKYTLESLKKHYTEPWEDEVFRVIIEYNNVPIGY 72 ++ D PL+ +W+ D V ++ + T + L+ + + + VP+ Y Sbjct: 67 LERDVPLIARWMNDPAVAAYWELTGPQSVTADHLRAQLAG--DGRSVPCVGTLDGVPMSY 124 Query: 73 GQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNA-N 131 +IY+ + Y + + G+ IG+ + +G+GT I+ + + + R A Sbjct: 125 WEIYRADLDPLARYCPVRPHDT--GVHLLIGDGAHRGRGLGTELIRAVVDLVLAGRPACT 182 Query: 132 AVILDPHKNNPRAIRAYQKSGFR 154 V+ +P N +++ A+ +GFR Sbjct: 183 RVLAEPDVRNRQSVAAFLGAGFR 205 >STRAW Q82KD8 (Q82KD8) Hypothetical protein Length = 377 Score = 38.9 bits (89), Expect = 0.036 Identities = 37/150 (24%), Positives = 66/150 (44%), Gaps = 10/150 (6%) Query: 17 DFPLMLKWLTDERVLEFYG-GRDKKYTLESLKKHYTEPWEDEVFRVIIEYNNVPIGYGQI 75 D PL+ +W+ D V F+ D+ T + L+ ++E P+ Y +I Sbjct: 217 DLPLLGRWMNDPAVAAFWKLAGDESVTEQHLRAQLGGDGRSVPCLGVLE--GTPMSYWEI 274 Query: 76 YKM-YDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANA-V 133 Y+ D L HYP G+ IG +G+G+ ++ + + + R + A V Sbjct: 275 YRADLDSLAR--HYPARPHDT-GIHLLIGGVADRGRGLGSTLLRAVADLVLDRRPSCARV 331 Query: 134 ILDPHKNNPRAIRAYQKSGFRIIE--DLPE 161 + +P N ++ A+ +GFR DLP+ Sbjct: 332 VAEPDLRNTSSVSAFLGAGFRFSAEVDLPD 361 >VIBCH Q9KMA3 (Q9KMA3) Acetyltransferase, putative Length = 230 Score = 38.5 bits (88), Expect = 0.046 Identities = 30/144 (20%), Positives = 62/144 (43%), Gaps = 18/144 (12%) Query: 15 DDDFPLMLKWLTDERVLEFYGGRDKKY--TLESLKKHYTEPWEDEVFRVIIEYNNVPIGY 72 + DF L++KW+ + + +G + T E + H ++ EVF +++ G+ Sbjct: 8 ESDFDLLIKWIDSDELNYLWGCPAYVFPLTYEQIHSHCSKA---EVFPYLLKVKGRHAGF 64 Query: 73 GQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANA 132 ++YK+ DE Y FI Y +G+ + L+ + + + +A Sbjct: 65 VELYKVTDEQYRICRV------------FISNA-YRGQGLSKSMLMLLIDKARLDFSATK 111 Query: 133 VILDPHKNNPRAIRAYQKSGFRII 156 + L + N A + Y+ GF ++ Sbjct: 112 LSLGVFEQNTVARKCYESLGFEVV 135 >STAAM Q932M5 
(Q932M5) Hypothetical protein SAVP027 Length = 134 Score = 38.5 bits (88), Expect = 0.046 Identities = 24/77 (31%), Positives = 36/77 (46%), Gaps = 11/77 (14%) Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164 PNY KG G++ + I E+ KE + + L K NPRA Y+K G + Sbjct: 68 PNYQDKGYGSKLLSFIKEY-SKEIGCSEMFLITDKGNPRACHVYEKLGGK---------- 116 Query: 165 HEGKKEDCYLMEYRYDD 181 ++ K E Y+ +Y D Sbjct: 117 NDYKDEIVYVYDYEKGD 133 >LACLA Q9CHJ8 (Q9CHJ8) Acetyl transferase Length = 193 Score = 38.5 bits (88), Expect = 0.046 Identities = 29/106 (27%), Positives = 50/106 (47%), Gaps = 12/106 (11%) Query: 59 FRVIIEYNNVPIGYGQI-YKMYDELYTDYHYPKTD------EIVYGMDQFIGEPNYWSKG 111 F + +Y P+G I K L D H+ K EI Y ++Q NYW++G Sbjct: 56 FSIANDYMKSPLGKWAIELKSEHRLIGDIHFVKISDKNQSAEIGYVLNQ-----NYWNQG 110 Query: 112 IGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE 157 + T +K++ EF ++ +IL K N + + KSG+ +++ Sbjct: 111 LLTEALKVLTEFSFEQFGLKKLILLIDKENVPSKKVALKSGYHLVK 156 >ENTFA Q82YR8 (Q82YR8) Acetyltransferase, GNAT family Length = 130 Score = 38.5 bits (88), Expect = 0.046 Identities = 24/77 (31%), Positives = 36/77 (46%), Gaps = 11/77 (14%) Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164 PNY KG G++ + I E+ KE + + L K NPRA Y+K G + Sbjct: 64 PNYQDKGYGSKLLSFIKEY-SKEIGCSEMFLITDKGNPRACHVYEKLGGK---------- 112 Query: 165 HEGKKEDCYLMEYRYDD 181 ++ K E Y+ +Y D Sbjct: 113 NDYKDEIVYVYDYEKGD 129 >BACC1 Q73AS8 (Q73AS8) Acetyltransferase, GNAT family Length = 157 Score = 38.5 bits (88), Expect = 0.046 Identities = 34/128 (26%), Positives = 55/128 (42%), Gaps = 21/128 (16%) Query: 34 YGGRDKKYTLESLKKHYTEPWEDE-----VFRVIIEYNNVPIGYGQIYKMYDELYTDYHY 88 Y G Y +E+ ++ E DE +F V + + IGY + K+ D Sbjct: 22 YEGEYSFYDIEADEEDLAEFLHDESRGDHIFSV--KEHGTLIGYFTVCKINDG------- 72 Query: 89 PKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAY 148 T +I GM +PN G G ++I I F K++ + L N RAI+ Y Sbjct: 73 --TVDIGLGM-----KPNITGNGFGLQFINAILAFSKEKYGCKYITLSVATFNKRAIKVY 125 Query: 149 QKSGFRII 156 +++GF + Sbjct: 126 
KRAGFEAV 133 >BACC1 Q734C3 (Q734C3) Acetyltransferase, GNAT family Length = 183 Score = 38.5 bits (88), Expect = 0.046 Identities = 27/80 (33%), Positives = 38/80 (47%) Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161 IG+ N KG G I LI ++ E N + V LD N AI Y+K GF++ + E Sbjct: 98 IGDANDRGKGYGREAIHLILKYAFYELNLHRVGLDVISYNKDAIELYKKMGFQMEGCMRE 157 Query: 162 HELHEGKKEDCYLMEYRYDD 181 +GK D +M D+ Sbjct: 158 AVQRDGKCFDRIIMGILRDE 177 >BACAN Q81YL2 (Q81YL2) Acetyltransferase, GNAT family Length = 179 Score = 38.5 bits (88), Expect = 0.046 Identities = 27/80 (33%), Positives = 38/80 (47%) Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161 IG+ N KG G I LI ++ E N + V LD N AI Y+K GF+I + E Sbjct: 96 IGDANDRGKGYGREAIHLILKYAFYELNLHRVGLDVISYNKAAIELYKKMGFQIEGCMRE 155 Query: 162 HELHEGKKEDCYLMEYRYDD 181 +G+ D +M D+ Sbjct: 156 AVQRDGECFDRIIMGILRDE 175 >BACAN Q81XB7 (Q81XB7) Spermidine N1-acetyltransferase Length = 171 Score = 38.5 bits (88), Expect = 0.046 Identities = 27/116 (23%), Positives = 51/116 (43%), Gaps = 12/116 (10%) Query: 60 RVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKL 119 R I+E +N +G ++ ++ DY + +T+ Q I +PNY G +L Sbjct: 57 RFIVEKDNEMVGLVELVEI------DYIHRRTEF------QIIIDPNYQGYGYAAEATRL 104 Query: 120 IFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLM 175 ++ N + + L K N +A+ Y+K GF + +L + +G + M Sbjct: 105 AMDYAFSVLNMHKIYLVVDKENEKAVHVYKKVGFMVEGELLDEFFVDGNYHNAIRM 160 >SALTI Q8Z6Z1 (Q8Z6Z1) Spermidine N1-acetyltransferase Length = 186 Score = 38.1 bits (87), Expect = 0.061 Identities = 25/90 (27%), Positives = 40/90 (44%), Gaps = 4/90 (4%) Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 Q I P Y KG+ +R KL ++ N + L K N +AI Y+K GFR+ +L Sbjct: 87 QIIISPEYQGKGLASRAAKLAMDYGFTVLNLYKLYLIVDKENEKAIHIYRKLGFRVEGEL 146 Query: 160 PEHELHEGKKED----CYLMEYRYDDNATN 185 G+ + C + D++ T+ Sbjct: 147 IHEFFINGEYRNTIRMCIFQQQYLDEHKTS 176 >SALTY Q8ZPJ3 (Q8ZPJ3) Spermidine N1-acetyltransferase (EC 2.3.1.57) 
Length = 186 Score = 38.1 bits (87), Expect = 0.061 Identities = 25/90 (27%), Positives = 40/90 (44%), Gaps = 4/90 (4%) Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 Q I P Y KG+ +R KL ++ N + L K N +AI Y+K GFR+ +L Sbjct: 87 QIIISPEYQGKGLASRAAKLAMDYGFTVLNLYKLYLIVDKENEKAIHIYRKLGFRVEGEL 146 Query: 160 PEHELHEGKKED----CYLMEYRYDDNATN 185 G+ + C + D++ T+ Sbjct: 147 IHEFFINGEYRNTIRMCIFQQQYLDEHKTS 176 >SALCH Q57PD6 (Q57PD6) Spermidine N1-acetyltransferase Length = 186 Score = 38.1 bits (87), Expect = 0.061 Identities = 25/90 (27%), Positives = 40/90 (44%), Gaps = 4/90 (4%) Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 Q I P Y KG+ +R KL ++ N + L K N +AI Y+K GFR+ +L Sbjct: 87 QIIISPEYQGKGLASRAAKLAMDYGFTVLNLYKLYLIVDKENEKAIHIYRKLGFRVEGEL 146 Query: 160 PEHELHEGKKED----CYLMEYRYDDNATN 185 G+ + C + D++ T+ Sbjct: 147 IHEFFINGEYRNTIRMCIFQQQYLDEHKTS 176 >MYCPE Q8EWE5 (Q8EWE5) Acetyltransferase GNAT family Length = 193 Score = 38.1 bits (87), Expect = 0.061 Identities = 34/157 (21%), Positives = 72/157 (45%), Gaps = 20/157 (12%) Query: 2 NIVENE-ICIRTLIDDDFPLMLKWLTDERVLEFYGG---RDKKYTLESLKKHYTEPWEDE 57 NI+E + + +R L +D ++ E V E G +D +Y+ + L K + Sbjct: 8 NIIETKRLYLRPLKIEDLNDFYEFAKVEGVGESAGWFHHKDIEYSKKILIKMINSKQD-- 65 Query: 58 VFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYI 117 + ++ + NN IG I+ Y+ D+++ G F+ +YW+KG+ T + Sbjct: 66 -YAIVYKENNKVIGELGIFNKYEN----------DKLMIG---FVLNKDYWNKGLATEIV 111 Query: 118 KLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154 K + +++ + + + ++N + R +K GF+ Sbjct: 112 KELIDYIFTNTDHQQIYMGHFESNLASKRVVEKCGFK 148 >BACC1 Q72Y03 (Q72Y03) Spermidine N1-acetyltransferase (EC 2.3.1.57) Length = 176 Score = 38.1 bits (87), Expect = 0.061 Identities = 34/177 (19%), Positives = 74/177 (41%), Gaps = 14/177 (7%) Query: 1 MNIVE--NEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEV 58 M ++E E+ +R L +D + + + ++ ++ + +E + + Sbjct: 1 MEVIEMSQELKLRPLEREDLKFVHELNNNAHIMSYWFEEPYEAFVELQDLYDKHIHDQSE 60 Query: 59 
FRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIK 118 R I+E +N +G ++ ++ DY + +T+ Q I +PNY G + Sbjct: 61 RRFIVEKDNEMVGLVELVEI------DYIHRRTEF------QIIIDPNYQGYGYAAEATR 108 Query: 119 LIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLM 175 L ++ N + + L K N +A+ Y+K GF + +L + +G + M Sbjct: 109 LAMDYAFSVLNMHKLYLVVDKENEKAVHVYKKVGFMVEGELLDEFFVDGNYHNAIRM 165 >DEIRA Q9RW73 (Q9RW73) Acetyltransferase, putative Length = 186 Score = 37.7 bits (86), Expect = 0.079 Identities = 45/179 (25%), Positives = 77/179 (43%), Gaps = 35/179 (19%) Query: 8 ICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYT-----LESLKKHY-----TEPWEDE 57 + +R +D P +WLTDER + D YT E+++ + T P DE Sbjct: 9 VVLRDRRPEDLPTFTRWLTDERAA--WREWDAPYTPAAQTSETMQAYIRYLQVTPPDADE 66 Query: 58 VFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYG-----MDQFIGEPNYWSKGI 112 RVI +G GQ+ M + +++E G + I +P YW G+ Sbjct: 67 --RVI------EVG-GQVVGMVN---------RSEEEPAGGGWWDLGILIYDPAYWEGGV 108 Query: 113 GTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKED 171 GTR + L + +A+ + + N R +RA ++ GF+ + E + G++ D Sbjct: 109 GTRALSLWVQDTLDWTDAHTLTVTTWSGNERMMRAARRLGFQECARVREARVVGGQRYD 167 >STAAM Q99U68 (Q99U68) Hypothetical protein Length = 169 Score = 37.4 bits (85), Expect = 0.10 Identities = 31/133 (23%), Positives = 55/133 (41%), Gaps = 17/133 (12%) Query: 44 ESLKKHYTEPWEDEV-------------FRVIIEYNNVPIGYGQIYKMYDELYTDYHYPK 90 E +K+H E W+D+ + ++E N+ G+ + + E Y D +P Sbjct: 22 ELMKEHDNEQWDDQYPLLEHFEEDIAKDYLYVLEENDKIYGFIVVDQDQAEWYDDIDWPV 81 Query: 91 TDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQK 150 E + + + G Y KG T + + + K R A ++ D N A + K Sbjct: 82 NREGAFVIHRLTGSKEY--KGAATELFNYVIDVV-KARGAEVILTDTFALNKPAQGLFAK 138 Query: 151 SGF-RIIEDLPEH 162 GF ++ E L E+ Sbjct: 139 FGFHKVGEQLMEY 151 >RICPR Q9ZE75 (Q9ZE75) TRANSCRIPTIONAL ACTIVATOR PROTEIN CZCR (CzcR) Length = 237 Score = 37.4 bits (85), Expect = 0.10 Identities = 35/130 (26%), Positives = 63/130 (48%), Gaps = 15/130 (11%) Query: 219 VNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLET-NVKIPNIEYSYISDELSILGYKEI 277 V E 
I + K + KG+A ++ ++ NL+T +V++ + + E SIL + Sbjct: 104 VREELIARIKAIVRRSKGHAASIFRFDKISVNLDTRSVEVDGKKLHLTNKEYSILELLIL 163 Query: 278 -KGTFLTPE-----IYSTMSEEEQNLLKRDIASFLRQMH----GLDYTDISECTIDNKQN 327 +GT LT E +YST+ E E ++ I +++ G DY D T+ + Sbjct: 164 RRGTILTKEMFLNHLYSTVDEPEMKIIDVFICKLRKKLSDAAGGRDYID----TVWGRGY 219 Query: 328 VLEEYILLRE 337 +L+EY L++ Sbjct: 220 MLKEYDELQQ 229 >LACJO Q74J71 (Q74J71) Hypothetical protein Length = 181 Score = 37.4 bits (85), Expect = 0.10 Identities = 23/72 (31%), Positives = 32/72 (44%), Gaps = 1/72 (1%) Query: 102 IGEPNYWSKGIGTRYIKL-IFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLP 160 I P YW GIG R +K+ I E + + L NPR I QK GF+ + Sbjct: 98 IYNPTYWHGGIGGRVLKIWISEIFDQYPELEHIGLTTWSGNPRMIHLAQKLGFKKEAQIR 157 Query: 161 EHELHEGKKEDC 172 + ++ K DC Sbjct: 158 KVRFYKEKYYDC 169 >CLOAB Q97D33 (Q97D33) Acetyltransferase (With duplicated domains), possibly RIMI-like protein Length = 292 Score = 37.4 bits (85), Expect = 0.10 Identities = 18/59 (30%), Positives = 32/59 (54%), Gaps = 1/59 (1%) Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHE 163 P Y +G G + ++ E+L ER+ + + L+ NN RA Y+ GF+I ++ +E Sbjct: 225 PEYRGRGFGREMMSMLLEYLI-ERDYDDIALEVDSNNKRAFELYKSIGFQIEREIDYYE 282 >VIBCH Q9KU66 (Q9KU66) Ribosomal-protein-alanine acetyltransferase Length = 161 Score = 36.6 bits (83), Expect = 0.18 Identities = 22/77 (28%), Positives = 42/77 (54%), Gaps = 2/77 (2%) Query: 104 EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE-DLPEH 162 +P KG G + ++ F L +++ A + L+ ++N RA YQ++GF I+ + + Sbjct: 85 DPAQQGKGYGQQLLQH-FIALCEQQKAESAWLEVRESNQRAFALYQRAGFNEIDRRVNYY 143 Query: 163 ELHEGKKEDCYLMEYRY 179 + +GK ED +M Y + Sbjct: 144 PVAKGKSEDAIIMSYLF 160 >STRR6 Q8DNF9 (Q8DNF9) Hypothetical protein spr1760 Length = 172 Score = 36.6 bits (83), Expect = 0.18 Identities = 23/71 (32%), Positives = 34/71 (47%), Gaps = 3/71 (4%) Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEH--EL 164 YW+ G+G+ ++ E+ + + L N A+ YQK GF +IE E + Sbjct: 98 
YWNNGLGSLLLEEAIEWAQASGILRRLQLTVQTRNQAAVHLYQKHGF-VIEGSQERGAYI 156 Query: 165 HEGKKEDCYLM 175 EGK D YLM Sbjct: 157 EEGKFIDVYLM 167 >SALTY Q8ZNU6 (Q8ZNU6) Putative ribose 5-phosphate isomerase (EC 5.3.1.6) Length = 212 Score = 36.6 bits (83), Expect = 0.18 Identities = 24/93 (25%), Positives = 44/93 (47%), Gaps = 4/93 (4%) Query: 219 VNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIK 278 +N +IF+ F+ K +GY E+ N + VK ++ +Y+ D L + + +K Sbjct: 124 LNVRFIFEKAFTGRKGEGYPPERKAPQVRNAGILNQVKAAVVKENYL-DTLRAIDPELVK 182 Query: 279 GTFLTPEIYSTMSEEEQNLLKRDIASFLRQMHG 311 P + E QN ++I +F+R+M G Sbjct: 183 TAVSGPRFQQCLFENGQN---KEIEAFVREMLG 212 >SALCH Q57N66 (Q57N66) Putative ribose 5-phosphate isomerase Length = 212 Score = 36.6 bits (83), Expect = 0.18 Identities = 24/93 (25%), Positives = 44/93 (47%), Gaps = 4/93 (4%) Query: 219 VNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIK 278 +N +IF+ F+ K +GY E+ N + VK ++ +Y+ D L + + +K Sbjct: 124 LNVRFIFEKTFTGRKGEGYPPERKAPQVRNAGILNQVKAAVVKENYL-DTLRAIDPELVK 182 Query: 279 GTFLTPEIYSTMSEEEQNLLKRDIASFLRQMHG 311 P + E QN ++I +F+R+M G Sbjct: 183 TAVSGPRFQQCLFENGQN---KEIEAFVREMLG 212 >ECO57 ATDA_ECO57 (P0A952) Spermidine N(1)-acetyltransferase (EC 2.3.1.57) (Diamine acetyltransferase) (SAT) Length = 185 Score = 36.6 bits (83), Expect = 0.18 Identities = 21/60 (35%), Positives = 29/60 (48%) Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 Q I P Y KG+ TR KL ++ N + L K N +AI Y+K GF + +L Sbjct: 86 QIIISPEYQGKGLATRAAKLAMDYGFTVLNLYKLYLIVDKENEKAIHIYRKLGFSVEGEL 145 >ECOLI ATDA_ECOLI (P0A951) Spermidine N(1)-acetyltransferase (EC 2.3.1.57) (Diamine acetyltransferase) (SAT) Length = 185 Score = 36.6 bits (83), Expect = 0.18 Identities = 21/60 (35%), Positives = 29/60 (48%) Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 Q I P Y KG+ TR KL ++ N + L K N +AI Y+K GF + +L Sbjct: 86 QIIISPEYQGKGLATRAAKLAMDYGFTVLNLYKLYLIVDKENEKAIHIYRKLGFSVEGEL 145 >BACHD Q9KG16 (Q9KG16) BH0299 protein Length = 
305 Score = 36.6 bits (83), Expect = 0.18 Identities = 35/126 (27%), Positives = 52/126 (41%), Gaps = 17/126 (13%) Query: 41 YTLESLKKHYTEPWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQ 100 Y E + K EP +IIE + IGY Y + P+ E G + Sbjct: 185 YDAEEILKKINEPTNK---LLIIEKEQIVIGYA---------YVEVE-PEHGE---GQIE 228 Query: 101 FIG-EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 +IG P+Y +G+ T+ + L + L K N +AIR YQ +GF+ L Sbjct: 229 YIGIAPDYRRQGLATQLLTNALHVLFSYPTVEDITLCVSKQNTKAIRLYQAAGFKKERQL 288 Query: 160 PEHELH 165 EL+ Sbjct: 289 TYFELN 294 >AQUAE O66838 (O66838) Ribosomal-protein-alanine acetyltransferase Length = 154 Score = 36.6 bits (83), Expect = 0.18 Identities = 23/74 (31%), Positives = 37/74 (50%), Gaps = 5/74 (6%) Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164 P Y KG G + ++ L + V+LD K+N RAI Y+K GF+++ E + Sbjct: 75 PGYRGKGYGEKLLREAISRLGDK--VKRVVLDVRKSNLRAINLYKKLGFKVV---TERKG 129 Query: 165 HEGKKEDCYLMEYR 178 + E+ LME + Sbjct: 130 YYSDGENALLMELK 143 >PROAC Q6ABU2 (Q6ABU2) Acetyltransferase (EC 2.3.1.-) Length = 188 Score = 36.2 bits (82), Expect = 0.23 Identities = 21/77 (27%), Positives = 32/77 (41%) Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164 P W +G+G R ++L E +A V L N R I G+R +P+ Sbjct: 111 PTLWGRGVGRRALRLWTEATFATTDAQVVTLTTWSGNGRMIHCAGAVGYRECGRIPQARS 170 Query: 165 HEGKKEDCYLMEYRYDD 181 +G++ D M DD Sbjct: 171 WQGRRWDLVTMALLRDD 187 >BACAN Q81QL3 (Q81QL3) Acetyltransferase, GNAT family Length = 308 Score = 36.2 bits (82), Expect = 0.23 Identities = 44/183 (24%), Positives = 70/183 (38%), Gaps = 31/183 (16%) Query: 44 ESLKKHYTEPWEDEVFRV------IIEYNNVPIGYGQIYKM-YDELYTDYHYPKTDEIVY 96 E L + T +++E R +I+YN P GY + M Y D + DE + Sbjct: 15 EKLTEIMTRTFDEEAERWLCGQGDVIDYNIQPPGYSSVEMMRYSIEELDSYKVIMDEKII 74 Query: 97 G-------------MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPR 143 G +D+ EP Y KGIG+ IKLI R + NN Sbjct: 75 GGIIVTISGKSYGRIDRIFVEPVYQGKGIGSNVIKLIEAEYPSIRIWDLETSSRQINNH- 133 Query: 144 
AIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVD 203 Y+K G++ I E + E CY+ N +V + + ++N ++ Sbjct: 134 --HFYKKMGYQTI--------FESEDEYCYVKRIGTSSNKESVFKNEDMKNSQYENCNLE 183 Query: 204 SIE 206 + E Sbjct: 184 NTE 186 >STRMU Q8DV67 (Q8DV67) Putative acetyltransferase Length = 166 Score = 35.8 bits (81), Expect = 0.30 Identities = 21/52 (40%), Positives = 29/52 (55%), Gaps = 2/52 (3%) Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRII 156 P Y +GIGT +K E L+K + + V L K N A+ YQK+GF+ I Sbjct: 95 PAYRGQGIGTELLKTFLEHLRK-KGYHKVSLSVQKEND-AVNMYQKAGFQTI 144 >STAAM COAD_STAAM (P63818) Phosphopantetheine adenylyltransferase (EC 2.7.7.3) (Pantetheine-phosphate adenylyltransferase) (PPAT) (Dephospho-CoA pyrophosphorylase) Length = 160 Score = 35.8 bits (81), Expect = 0.30 Identities = 28/132 (21%), Positives = 55/132 (41%), Gaps = 13/132 (9%) Query: 164 LHEGKKEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEY 223 L KKE + +E R D +VK + + H F VD E +G+ +++ Sbjct: 38 LKNSKKEGTFSLEERMDLIEQSVKHLPNVKVHQFSGLLVDYCEQVGAKTIIRGLRAVSDF 97 Query: 224 IFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDEL--SILGYKEIKGTF 281 ++ + ++ KK LN +ET + + YS+IS + + Y+ F Sbjct: 98 EYELRLTSMNKK-----------LNNEIETLYMMSSTNYSFISSSIVKEVAAYRADISEF 146 Query: 282 LTPEIYSTMSEE 293 + P + + ++ Sbjct: 147 VPPYVEKALKKK 158 >LISMO Q8Y9B8 (Q8Y9B8) Lmo0614 protein Length = 177 Score = 35.8 bits (81), Expect = 0.30 Identities = 17/54 (31%), Positives = 27/54 (50%) Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLP 160 YW GIGT ++ + ++ K + L+ N RAI Y+K GF ++P Sbjct: 105 YWGLGIGTICMEELIKYAKSSEYLKLIYLEVVTENKRAINLYKKFGFIEAGEIP 158 >THEMA Q9WZ46 (Q9WZ46) Hypothetical protein Length = 179 Score = 35.4 bits (80), Expect = 0.39 Identities = 19/49 (38%), Positives = 28/49 (57%), Gaps = 1/49 (2%) Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRI 155 YW+ GIGTR I E+ ++ + L+ K+N RAI Y+K GF + Sbjct: 106 YWNIGIGTRMITSAIEWARR-NGFIRIQLEVLKSNERAISLYRKLGFEL 153 >STRR6 Q8DQU8 (Q8DQU8) Hypothetical protein 
spr0490 Length = 185 Score = 35.4 bits (80), Expect = 0.39 Identities = 29/128 (22%), Positives = 60/128 (46%), Gaps = 16/128 (12%) Query: 55 EDEVF---RVIIEYN---NVPIGYGQIYKMYDELY--TDYHYPKTDEIVYGMDQFIG--- 103 EDE++ ++ E N N+P GYG + K D++ D+++ D+++ IG Sbjct: 50 EDEIYYLEHILPERNQKENLPAGYGIVVKGTDKIVGSVDFNHRHEDDVLE-----IGYTL 104 Query: 104 EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHE 163 P+YW +G + + + K+ + + L N ++ R +K GF + + + + Sbjct: 105 HPDYWGRGYVPEAARALIDLAFKDLGLHKIELTCFGYNLQSKRVAEKLGFTLEARIRDRK 164 Query: 164 LHEGKKED 171 +G + D Sbjct: 165 DVQGNRCD 172 >CLOAB Q97FA0 (Q97FA0) Predicted acetyltransferase Length = 146 Score = 35.4 bits (80), Expect = 0.39 Identities = 23/94 (24%), Positives = 45/94 (47%), Gaps = 15/94 (15%) Query: 61 VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120 V+I+ NN+ +GYG ++ + DE + + P + GIG + ++ + Sbjct: 46 VVIKNNNLVVGYGGLWLIIDEGH--------------ITNIAVHPEFRGMGIGNKILEEL 91 Query: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154 + +K RN ++ L+ +N A Y+K GF+ Sbjct: 92 IKLCEK-RNIPSMTLEVRISNTIAQNLYKKFGFK 124 >_BUCAP LOLC_BUCAP (Q8K9N8) Lipoprotein-releasing system transmembrane protein lolC Length = 399 Score = 35.4 bits (80), Expect = 0.39 Identities = 44/156 (28%), Positives = 62/156 (39%), Gaps = 26/156 (16%) Query: 189 MKYLIEHYFDNFK----VDSIEIIGSGYDSVAYLVNNEYIFKTKFS------------TN 232 ++YL Y NFK + SI IG G S ++ F+ KF TN Sbjct: 11 LRYLWNPYLPNFKKIIIILSILGIGIGISSTIITISIMNGFQNKFKNDILSFIPHIIITN 70 Query: 233 KKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSE 292 K + K LN ET +K+ N+E I+D +S E K EI + Sbjct: 71 KNRNINK-------LNFPKET-LKLKNVEE--ITDFISKKVIIENKNEINIGEIIGINIK 120 Query: 293 EEQNLLKRDIASFLRQMHGLDYTDISECTIDNKQNV 328 E+NL +I FL +H Y I + K +V Sbjct: 121 NEKNLENYNIKKFLHTLHSRKYNAIIGSELAKKMHV 156 >BACC1 Q734M2 (Q734M2) Acetyltransferase, GNAT family Length = 153 Score = 35.4 bits (80), Expect = 0.39 Identities = 22/91 (24%), Positives = 42/91 (46%), Gaps = 16/91 (17%) Query: 80 
DELYTDYHYPKT-----------DEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKER 128 DE++ Y Y T DE+ + + + P+Y+ KGI T+ + +F+ + Sbjct: 50 DEIFYGYFYEDTLAGFISFKIEKDEV--DIHRLVVSPDYFHKGIATKLLLYVFDMFSPSK 107 Query: 129 NANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 I+ K N A+ Y+K GF ++++ Sbjct: 108 ---TYIVQTGKENTPALSLYKKHGFIEVKEI 135 >BACAN Q81U09 (Q81U09) Acetyltransferase, GNAT family Length = 167 Score = 35.4 bits (80), Expect = 0.39 Identities = 15/47 (31%), Positives = 27/47 (57%) Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153 Y ++GIGT+ I+ + + K++ + L N RAI+ Y++ GF Sbjct: 93 YCNQGIGTKLIEFLIRWAKEQNGLEKICLGVVSVNDRAIKVYKRMGF 139 >YERPE Q8ZHB3 (Q8ZHB3) Putative siderophore biosynthesis protein IucB (Acetyl CoA:N6-hydroxylsyine acetyl transferase) Length = 316 Score = 35.0 bits (79), Expect = 0.51 Identities = 35/159 (22%), Positives = 63/159 (39%), Gaps = 31/159 (19%) Query: 14 IDDDFPLMLKWLTDERVLEFY---GGRDKK--YTLESLKKHYTEPWEDEVFRVIIEYNNV 68 +D D P +W+ RV F+ G D + Y L Y P ++ +++ Sbjct: 151 VDHDAPQFTRWMNSPRVDAFWEMSGPLDVQAAYLQRQLDSPYCYP-------LLGCFDDQ 203 Query: 69 PIGYGQIY-KMYDELYTDYHYPKTDEIVYGMDQFIGEPNY--------WSKGIGTRYIKL 119 P GY ++Y D + Y + D G+ +GE N+ W +G+ T Y+ L Sbjct: 204 PFGYFEVYWAAEDRIGRHYRWQPFDR---GLHMLVGEENWRGAQYIHSWLRGL-THYLYL 259 Query: 120 IFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIED 158 E V+ +P +N R +G+ +++ Sbjct: 260 ------DESRTTRVVAEPRIDNQRLFHHLPAAGYHTLKE 292 >VIBUY Q7MFH7 (Q7MFH7) Histone acetyltransferase HPA2 Length = 166 Score = 35.0 bits (79), Expect = 0.51 Identities = 18/65 (27%), Positives = 33/65 (50%) Query: 111 GIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKE 170 GIG++ I+ + E N + ++ + +N +AI Y+K GF I + + EG+ Sbjct: 94 GIGSKLIETVTELADNWLNVRRIQIEVNVDNEKAISLYKKHGFVIEGEAVDSSFREGRFI 153 Query: 171 DCYLM 175 + Y M Sbjct: 154 NTYYM 158 >RICCN Q92JG6 (Q92JG6) Transcriptional activator protein czcR Length = 237 Score = 35.0 bits (79), Expect = 0.51 Identities = 37/136 (27%), Positives = 63/136 (46%), Gaps = 27/136 (19%) Query: 219 
VNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLET--------NVKIPNIEYSYISDELS 270 V E I + K + KG+A ++ ++ NL+T V + N EY+ + EL Sbjct: 104 VREELIARIKAIVRRSKGHAASVFRFDKVSINLDTRSVEVDGKKVHLTNKEYAIL--ELL 161 Query: 271 ILGYKEIKGTFLTPE-----IYSTMSEEEQNLLKRDIASFLRQMH----GLDYTDISECT 321 IL +GT LT E +YS++ E E ++ I +++ G DY D T Sbjct: 162 ILR----RGTILTKEMFLNHLYSSVDEPEMKIIDVFICKLRKKLSDAAGGRDYID----T 213 Query: 322 IDNKQNVLEEYILLRE 337 + + +L+EY L++ Sbjct: 214 VWGRGYMLKEYDELQQ 229 >CLOAB Q97G03 (Q97G03) Predicted acetyltransferase Length = 167 Score = 35.0 bits (79), Expect = 0.51 Identities = 24/75 (32%), Positives = 37/75 (49%), Gaps = 11/75 (14%) Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHE 166 Y KGIG+ IK +FE+ +E + L+ +N +AI Y+K GF + E Sbjct: 94 YSGKGIGSLIIKRVFEW-AEENAIEKIDLEVFHDNFKAISLYKKFGF----------IEE 142 Query: 167 GKKEDCYLMEYRYDD 181 G+K++ E Y D Sbjct: 143 GRKKNAIKAEDGYKD 157 >BACHD Q9KF00 (Q9KF00) Ribosomal-protein-alanine N-acetyltransferase Length = 190 Score = 35.0 bits (79), Expect = 0.51 Identities = 43/175 (24%), Positives = 76/175 (43%), Gaps = 23/175 (13%) Query: 8 ICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRVIIEYNN 67 + +R + DD +L +L+D+ V++++ G + TLE W + + + Sbjct: 16 LILRKITTDDARSILSYLSDKEVMKYF-GLEPFQTLEDALGEIA--WYESIL-----HEQ 67 Query: 68 VPIGYGQIYKMYDELY--TDYH--YPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI--- 120 I +G K DE+ +H PK G + YW +GI + I+ + Sbjct: 68 TGIRWGITLKGQDEVIGSCGFHQWVPKHHRAEIGFEL---SKLYWGQGIASEAIRAVIQY 124 Query: 121 -FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYL 174 FE L+ +R A+I P+ + R + +K GF L +E GK +D Y+ Sbjct: 125 GFEHLELQR-IQALIEPPNIPSQRLV---EKQGFISEGLLRSYEYTCGKFDDLYM 175 >BACAN Q81NB6 (Q81NB6) Spermine/spermidine acetyltransferase, putative Length = 148 Score = 35.0 bits (79), Expect = 0.51 Identities = 31/119 (26%), Positives = 52/119 (43%), Gaps = 20/119 (16%) Query: 54 WEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPK---TDEIVYGMDQFIGEP---NY 107 WE+ + + E I +Y + + + D Y K +E + G F +P NY Sbjct: 14 
WEEAIKLSVKEEQQTFIA-SNLYSIAEVQFLDNFYAKGIYLEEKMVGFTMFGIDPEDNNY 72 Query: 108 W-----------SKGIGTRYIKLIFEFLKKERNAN--AVILDPHKNNPRAIRAYQKSGF 153 W KGIG + I L+ + +++ NAN +++ N A AY+K+GF Sbjct: 73 WIYRLMIDENFQGKGIGKQAIYLVIDEIRRNNNANFSRIMIGYAPENLTAKFAYKKAGF 131 >BACAN Q81N07 (Q81N07) Acetyltransferase, GNAT family Length = 153 Score = 35.0 bits (79), Expect = 0.51 Identities = 21/89 (23%), Positives = 42/89 (47%), Gaps = 12/89 (13%) Query: 80 DELYTDYHYPKT---------DEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNA 130 DE++ Y Y T D+ + + + P+++ KGI T+ + IF+ ++ Sbjct: 50 DEIFYGYFYEDTLAGFISFKIDKEEVDIHRLVVSPDHFHKGIATKLLLYIFDMFS---SS 106 Query: 131 NAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 I+ K N A+ Y+K GF ++++ Sbjct: 107 KTYIVQTGKENTPALSLYKKHGFIEVQNI 135 >STRMU Q8DT36 (Q8DT36) Putative acetyltransferase Length = 184 Score = 34.7 bits (78), Expect = 0.67 Identities = 16/78 (20%), Positives = 36/78 (46%) Query: 106 NYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELH 165 +YW +G+ T ++ + +E + + + HK N + R +K+GFR++ + + Sbjct: 105 HYWKQGLATEALENLVFLAFQELDLKELEIIVHKENRASARVAEKAGFRLVRQFKGSDRY 164 Query: 166 EGKKEDCYLMEYRYDDNA 183 K D + + D + Sbjct: 165 THKMRDYLKYDLKAGDKS 182 >PSEAE Q9I3W7 (Q9I3W7) Hypothetical protein Length = 177 Score = 34.7 bits (78), Expect = 0.67 Identities = 17/66 (25%), Positives = 33/66 (50%) Query: 110 KGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKK 169 KG+G+R + + + N V L + +N A+ Y+K GF ++ ++ + +G+ Sbjct: 100 KGVGSRLLGELLDIADNWMNLRRVELTVYTDNAPALALYRKFGFETEGEMRDYAVRDGRF 159 Query: 170 EDCYLM 175 D Y M Sbjct: 160 VDVYSM 165 >NITEU Q82UT0 (Q82UT0) GCN5-related N-acetyltransferase (EC 2.3.1.128) Length = 157 Score = 34.7 bits (78), Expect = 0.67 Identities = 17/48 (35%), Positives = 30/48 (62%), Gaps = 1/48 (2%) Query: 109 SKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRII 156 S+G+G + ++ + E L ++ A V+LD ++N AI YQ+ GF+ I Sbjct: 88 SQGLGRKMLRYLIE-LSRKHQAEFVLLDVRESNTGAINLYQRLGFQQI 134 >LACLA Q9CF66 (Q9CF66) Spermidine acetyltransferase (EC 2.3.1.57) Length = 180 Score 
= 34.7 bits (78), Expect = 0.67 Identities = 34/131 (25%), Positives = 55/131 (41%), Gaps = 12/131 (9%) Query: 60 RVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKL 119 R +IE N+ IG ++ + DY + +T EI Q I + KG + +K Sbjct: 62 RFVIEANDTFIGIVELMSI------DYIH-RTCEI-----QIIIISGFSGKGYAQKALKT 109 Query: 120 IFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRY 179 ++ N + V L +N A+ Y+K GF+I + E G+ D Y M Sbjct: 110 GVDYAFNTLNMHKVYLWVDIDNAPAVHIYKKLGFKIEGTIKEQFFAGGRYHDSYFMGILK 169 Query: 180 DDNATNVKAMK 190 + KA+K Sbjct: 170 SEYTQREKAVK 180 >BACC1 Q739G7 (Q739G7) Acetyltransferase, GNAT family Length = 188 Score = 34.7 bits (78), Expect = 0.67 Identities = 33/136 (24%), Positives = 52/136 (38%), Gaps = 21/136 (15%) Query: 47 KKHYTEPWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPN 106 +K T E+ + +IIE+N IG Y + Y T + G+ I P Sbjct: 56 EKMQTRLKEEPLSNLIIEHNGQVIGTVGFY---------WEYKPTRWLEMGI--VIYNPA 104 Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHE 166 YW+ G GT + L + L ++ V L N R ++ +K G + E Sbjct: 105 YWNGGYGTEALTLYRDLLFEKMEIGRVGLTTWSGNERMMKVAEKIGMTL----------E 154 Query: 167 GKKEDCYLMEYRYDDN 182 G+ C Y D+ Sbjct: 155 GRMRKCRYYNGTYYDS 170 >MYCPE Q8EWM2 (Q8EWM2) Hypothetical protein MYPE1810 Length = 300 Score = 34.3 bits (77), Expect = 0.88 Identities = 57/252 (22%), Positives = 99/252 (39%), Gaps = 62/252 (24%) Query: 115 RYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRII-EDL-----PEHELHEGK 168 R+I + +FL K+ + + KN+ + G I+ EDL P L E K Sbjct: 55 RFILNLLDFLYKDNDLIEYKRERSKNDLKFFHFSFSKGLDILLEDLHLNKDPYKWLVETK 114 Query: 169 KEDCYLME-YRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLV-----NNE 222 C+L+ + Y + + + Y E N ++ ++I+ + S+ + NN Sbjct: 115 TRSCFLIGLFLYGGSINSPNSSNYHFEIKIHNTEI--LKIVEKIFSSINIPLLVLNRNNT 172 Query: 223 YIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFL 282 YI K S + ISD L +LG E Sbjct: 173 YIVYIKKSES--------------------------------ISDILKLLGATE------ 194 Query: 283 TPEIYSTMSEEEQNLLKRDIASFLRQMHGLDYTDISECTIDNKQNVLE--EYILLRETIY 340 +M E E+ + RD 
+ + +++ LD +++ + TI+ L+ EY+ ++ Sbjct: 195 ------SMFEYEEKRISRDYTNQMSRLNNLDMSNLKK-TIEASHIQLQNIEYVK-NNNLF 246 Query: 341 NDLTDIEKDYIE 352 N LTD EK Y E Sbjct: 247 NQLTDKEKIYCE 258 >MYCPE Q8EWE6 (Q8EWE6) Acetyltransferase GNAT family Length = 190 Score = 34.3 bits (77), Expect = 0.88 Identities = 26/112 (23%), Positives = 51/112 (45%), Gaps = 13/112 (11%) Query: 48 KHYTEPWEDE-VFRVIIEYNNVPIGYGQIYKMYDELYTDYHYP----KTDEIVYGMDQFI 102 KH+ E E + +++I N Y ++K +++ + KT +I Y + + Sbjct: 45 KHHKNIEETETILKILISGGNF---YALVWKENNKVIGSFGIETPSYKTVKIGYALSK-- 99 Query: 103 GEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154 +YW+ GI T K I +F+ N +++ N + + +KSGF+ Sbjct: 100 ---DYWNLGIMTEVTKHIIDFIFTNSGFNKILVSHFDENTASKKVIEKSGFK 148 >LACLA Q9CHW9 (Q9CHW9) Hypothetical protein yfiL Length = 154 Score = 34.3 bits (77), Expect = 0.88 Identities = 27/97 (27%), Positives = 46/97 (47%), Gaps = 15/97 (15%) Query: 90 KTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVIL--DPHKNNPRAIRA 147 K + + ++ F E ++ +G G + +K + +LK+ A+ +IL D NN + Sbjct: 56 KKQKNTFEIENFAVETSFQGQGFGQQMMKQLITYLKENLAADELILGTDDVSNN---VAF 112 Query: 148 YQKSGFRIIE-------DLPEHELHEGK---KEDCYL 174 Y+K GF I D +H + EGK K+ YL Sbjct: 113 YEKCGFTITHKISNYFLDNCDHPIFEGKVQLKDKIYL 149 >LACPL Q88SM7 (Q88SM7) Prophage Lp4 protein 8, DNA primase/helicase Length = 500 Score = 34.3 bits (77), Expect = 0.88 Identities = 37/160 (23%), Positives = 64/160 (40%), Gaps = 11/160 (6%) Query: 179 YDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEY-----IFKTKFSTNK 233 Y DNAT + I+++FD EI+ ++ + +N Y IF N Sbjct: 174 YRDNATTPNIKGWTIDNWFDELACGDDEIVELLWEVINDCLNGNYTRKKAIFLFSELGNS 233 Query: 234 KKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEE 293 KG +E I N + + +K+ + + ++G G + P+IY S Sbjct: 234 GKGTFQE-LITNLVGMDNVGTLKVNEFDVRF--RLAGLVGKTVCIGDDIAPDIYIKDSSN 290 Query: 294 EQNLLKRDIASFLRQMHGLD-YTDISECTIDNKQNVLEEY 332 +++ D+ + + G D YT CTI N L + Sbjct: 291 FNSVVTGDLVNI--EFKGQDGYTSALRCTIVQSCNGLPNF 328 >ENTFA Q837C4 (Q837C4) Acetyltransferase, GNAT family Length = 144 Score = 
34.3 bits (77), Expect = 0.88 Identities = 17/58 (29%), Positives = 30/58 (51%), Gaps = 6/58 (10%) Query: 104 EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161 +P Y+ KG G I+ + E + + +D +K N A++ YQ GF++I + E Sbjct: 78 DPVYFRKGYGGEIIQKLIE------QESIIFVDANKQNEGAVKFYQSQGFQVIGESKE 129 >CLOTE Q891D4 (Q891D4) Ribosomal-protein-alanine acetyltransferase (EC 2.3.1.128) Length = 152 Score = 34.3 bits (77), Expect = 0.88 Identities = 30/118 (25%), Positives = 54/118 (45%), Gaps = 18/118 (15%) Query: 58 VFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYI 117 ++ V I+ N + +GYG ++ + DE + T+ ++ PNY GI + + Sbjct: 48 LYIVAIKDNKI-LGYGGLWIILDEGHV------TNIAIH--------PNYRQLGIASLVL 92 Query: 118 KLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLM 175 + + K R N++ L+ K+N A YQK GF +E+ + ED +M Sbjct: 93 STLIKE-SKNRGVNSITLEVRKSNSVAQNLYQKFGF--VEEGCRKHYYSDNLEDAIIM 147 >CLOAB Q97I16 (Q97I16) Predicted acetyltransferase domain containing protein Length = 291 Score = 34.3 bits (77), Expect = 0.88 Identities = 31/103 (30%), Positives = 37/103 (35%), Gaps = 20/103 (19%) Query: 68 VPIGYGQIYKMYDELYTDY-----HYPKTDEIVYGMDQFIGEPN------------YWSK 110 +P+ IY YDE Y + DEI G QFI E N Y Sbjct: 177 IPLSIDDIY--YDEAQEYYVDDGAFFISKDEIKIGYGQFIFEHNNITIVNFGIVEQYRGN 234 Query: 111 GIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153 G G ++ I LK R V + NN AI Y GF Sbjct: 235 GYGRYFLSYILNILKN-RGCKVVYIKVDMNNVPAINLYTSMGF 276 >BACC1 Q72WY7 (Q72WY7) Hypothetical protein Length = 186 Score = 34.3 bits (77), Expect = 0.88 Identities = 38/163 (23%), Positives = 70/163 (42%), Gaps = 16/163 (9%) Query: 195 HYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETN 254 H N K++ I + D V L N Y T+ T+ +K K +N + Sbjct: 12 HLEKNIKLEDIPNVDLYVDQVVQLFENTYADTTR--TDDEKVLTK-----TMINNYAKGK 64 Query: 255 VKIPNIEYSYISDELSILG-YKEIKGTFLTPEIYSTMSEEEQNLLKRDIASFLRQMHGLD 313 + IP Y + + ++ ++KG +I S++ +LL D SF M + Sbjct: 65 LFIPIKNKKYSKEHMILISLIYQLKGALSINDIKSSLETINDSLLNDD--SFELNMLYKN 122 Query: 314 
YTDISECTIDN-KQNVLEEYILLRETIYNDLTDIEKDYIESFM 355 Y ++E +++ KQ+V R T N+++ +E +E F+ Sbjct: 123 YLALTESNVESFKQDVNN-----RVTEVNEISSLEDTKLEKFL 160 >VIBPA Q87G30 (Q87G30) Putative acetyltransferase Length = 166 Score = 33.9 bits (76), Expect = 1.1 Identities = 22/77 (28%), Positives = 36/77 (46%), Gaps = 6/77 (7%) Query: 99 DQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIED 158 DQF G G+G++ I+ I E N + L+ + +N AI Y+K GF I + Sbjct: 88 DQFHG------LGVGSKLIETITELADNWLNVRRIQLEVNADNEAAIGLYKKHGFEIEGE 141 Query: 159 LPEHELHEGKKEDCYLM 175 + +G+ + Y M Sbjct: 142 AIDASFRDGEFINTYYM 158 >STAES ARGD2_STAES (Q8CSG1) Acetylornithine aminotransferase 2 (EC 2.6.1.11) (ACOAT 2) Length = 375 Score = 33.9 bits (76), Expect = 1.1 Identities = 31/133 (23%), Positives = 61/133 (45%), Gaps = 11/133 (8%) Query: 193 IEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE--KAIYNFLNTN 250 + + F+N+K D+IE + + + + NN Y+ + G+ E +A+YN LN Sbjct: 1 MSYLFNNYKRDNIEFVDANQNELIDKDNNVYLDFSSGIGVTNLGFNMEIYQAVYNQLNLI 60 Query: 251 LETNVKIPNIEYSYISDELS--ILGYKEIKGTFL---TPEIYSTMSEEEQNLLKRDIASF 305 + PN+ S I +E++ ++G ++ F T + + + K +I +F Sbjct: 61 WHS----PNLYLSSIQEEVAQKLIGQRDYLAFFCNSGTEANEAAIKLARKATGKSEIIAF 116 Query: 306 LRQMHGLDYTDIS 318 + HG Y +S Sbjct: 117 KKSFHGRTYGAMS 129 >RICPR Q9ZEC5 (Q9ZEC5) 190 KD ANTIGEN (Sca1) Length = 347 Score = 33.9 bits (76), Expect = 1.1 Identities = 24/90 (26%), Positives = 43/90 (47%), Gaps = 2/90 (2%) Query: 197 FDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVK 256 F N K + E+IGS S+ + F + + K + K+K Y++ +T ++++VK Sbjct: 123 FKNGKNNDKELIGSKVISIYGQKELQQNFTLQLLVSASKNFIKDKINYSYGDTQIKSHVK 182 Query: 257 IPNIEYSYISDELSILGYKEIKGTFLTPEI 286 N +SY ++ L Y +TP I Sbjct: 183 HHN--HSYNAEALLNYNYLVKNSIIITPNI 210 >PSEPK Q88NL5 (Q88NL5) Acetyltransferase, GNAT family Length = 162 Score = 33.9 bits (76), Expect = 1.1 Identities = 18/59 (30%), Positives = 28/59 (47%), Gaps = 6/59 (10%) Query: 98 MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRII 156 +D P Y +G+G R ++ L NA LD ++ NP+A+ Y 
GF +I Sbjct: 86 VDMLFVAPGYRGQGVGKRLLRYAISEL------NAEYLDVNEQNPKALGFYLHEGFEVI 138 >LACPL Q88YF8 (Q88YF8) Acetyltransferase (Putative) Length = 171 Score = 33.9 bits (76), Expect = 1.1 Identities = 13/47 (27%), Positives = 25/47 (53%) Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153 +W G+GT I+ + ++ + + ++L N RA++ YQ GF Sbjct: 98 FWGMGLGTALIEEVLDWARNYSSLERLVLTVQLRNVRAVKLYQHLGF 144 >BACC1 Q737S7 (Q737S7) Acetyltransferase, GNAT family Length = 282 Score = 33.9 bits (76), Expect = 1.1 Identities = 36/125 (28%), Positives = 59/125 (47%), Gaps = 13/125 (10%) Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANA---VILDPHKNNPRAIRAYQKSGFRIIEDLPE 161 PNY +GIG + +FE K E N + L+ N RAIR Y K G+ + DL Sbjct: 87 PNY--RGIGVS--QKLFELHKDEAIQNGCKQLFLEVIVGNDRAIRFYNKLGYEKVYDLSY 142 Query: 162 HELHEGKK---EDCYLMEYR-YDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAY 217 + L + K ++C +E + + A V+ K+L H+ N++ D I + + Sbjct: 143 YNLKDMTKIIHKECKGIEVKQLEFPAFKVEIQKWL--HFHINWQNDMDYIEKTNHTFYGA 200 Query: 218 LVNNE 222 V+N+ Sbjct: 201 YVDND 205 >BACC1 Q734H0 (Q734H0) Acetyltransferase, GNAT family family Length = 182 Score = 33.9 bits (76), Expect = 1.1 Identities = 45/172 (26%), Positives = 71/172 (41%), Gaps = 24/172 (13%) Query: 8 ICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWED---EVFRVIIE 64 +CI +DD ++ L +++ L G Y LE + + W D E+ R IE Sbjct: 10 LCIEPFTNDDV-CRIRELANDKELANILGLPHPYKLE-----FAQDWVDMQPELIRKGIE 63 Query: 65 YNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQ-----FIGEPNYWSKGIGTRYIKL 119 Y P+G + K E+ T I G ++ +IG+ NYW KG T + Sbjct: 64 Y---PLGI--VSKESREIVGTI----TLRIDKGNNRGELGYWIGK-NYWGKGFATEALNR 113 Query: 120 IFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKED 171 + +F E N + N +I+ +KSG R L ++ L ED Sbjct: 114 MIQFGFIELGLNKIWASAISRNRSSIKVLEKSGLRKEGTLRQNRLLLNTYED 165 >BACAN Q81Q67 (Q81Q67) Acetyltransferase, GNAT family Length = 282 Score = 33.9 bits (76), Expect = 1.1 Identities = 34/125 (27%), Positives = 57/125 (45%), Gaps = 13/125 (10%) Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANA---VILDPHKNNPRAIRAYQKSGFRIIEDLPE 161 PNY G+ 
+ +FE K+E N + L+ N RAIR Y K G+ + DL Sbjct: 87 PNYRGVGVSQK----LFELHKEEALQNECKQLFLEVIVGNDRAIRFYNKLGYEKVYDLSY 142 Query: 162 HELHEGKK---EDCYLMEYR-YDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAY 217 + L + K +C +E + + A V+ K+L H+ N++ D I + + Sbjct: 143 YNLKDMTKIIHRECKGIEVKQLEFAAFKVEIQKWL--HFHINWQNDMDYIEKTNHTFYGA 200 Query: 218 LVNNE 222 V+N+ Sbjct: 201 YVDND 205 >BACAN Q81P77 (Q81P77) Acetyltransferase, GNAT family Length = 181 Score = 33.9 bits (76), Expect = 1.1 Identities = 45/181 (24%), Positives = 76/181 (41%), Gaps = 32/181 (17%) Query: 10 IRTLIDDDFPLMLKWLTDERVLEFYGGRDK--KYTLESLKKHYTEPWEDEVF-------- 59 +R L DD +W D +V + D+ +TLE K+ W + Sbjct: 8 LRELTLDDVEDRYQWSLDTKVTKHLVVSDQYPPFTLEDTKQ-----WIEACINRKNGYEQ 62 Query: 60 RVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKL 119 R I N + IG+ ++ K +D+ K E+ IG YW KG G + Sbjct: 63 RAITAENGIHIGWIEL-KNFDKTN------KNAELGIA----IGNKEYWGKGDGIAALYS 111 Query: 120 IFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHE-LHEGKKEDCYLMEYR 178 + E V L ++N +A ++Y+K+GF + E L ++ L +G+ ++ YR Sbjct: 112 MLHVAFFEFELEKVWLRVDEDNLQARKSYEKAGF-VCEGLMRNDRLRKGR----FIHRYR 166 Query: 179 Y 179 Y Sbjct: 167 Y 167 >Q8E1I7 (Q8E1I7) Acetyltransferase, GNAT family Length = 186 Score = 33.5 bits (75), Expect = 1.5 Identities = 22/64 (34%), Positives = 29/64 (45%), Gaps = 1/64 (1%) Query: 93 EIVYGMDQFIG-EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKS 151 EI + D FI + +YW GIG ++ E+ + L N RAI YQK Sbjct: 97 EIEHVGDLFIAVQKDYWGYGIGHILMEEAIEWASDNDITRRLELSVQGRNERAIHLYQKF 156 Query: 152 GFRI 155 GF I Sbjct: 157 GFEI 160 >Q8DXE6 (Q8DXE6) Hypothetical protein SAG1905 Length = 212 Score = 33.5 bits (75), Expect = 1.5 Identities = 29/93 (31%), Positives = 44/93 (47%), Gaps = 9/93 (9%) Query: 219 VNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIK 278 +N Y+F+ F K GY KE+A+ N + + +K I Y D LS+L KEI Sbjct: 125 LNLRYLFERLFEDEKGGGYPKERAVPEQRNARILSEIK--QITY---RDLLSVL--KEID 177 Query: 279 GTFLTPEIYSTMSEEE--QNLLKRDIASFLRQM 309 FL I +E N ++IA +L+ + Sbjct: 178 
QDFLKETISGEHFQEYFFANCQNQNIADYLKSV 210 >OCEIH Q8ET96 (Q8ET96) Hypothetical conserved protein Length = 167 Score = 33.5 bits (75), Expect = 1.5 Identities = 26/90 (28%), Positives = 41/90 (45%), Gaps = 15/90 (16%) Query: 110 KGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKK 169 KGIG + LI + A+ + LD +N RAI Y+K GF + EG Sbjct: 92 KGIGKEALNLIKIWAFNSYKAHRLWLDVKTDNKRAITIYKKEGFTL----------EGTL 141 Query: 170 EDCYLMEYRYDDNATNVKAMKYLIEHYFDN 199 +C + Y+ ++ M L++H +DN Sbjct: 142 RECLRVGNTYE----SLHVMS-LLKHEYDN 166 >LISIN Q929M8 (Q929M8) Lin2246 protein Length = 157 Score = 33.5 bits (75), Expect = 1.5 Identities = 23/73 (31%), Positives = 38/73 (52%), Gaps = 1/73 (1%) Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164 P+Y +GIG + + E + +E+ + L N +AIR Y+K+GF+ L + + Sbjct: 84 PDYQREGIGQLLMDKMKE-VAREKGFIKISLRVLSINQKAIRFYEKNGFKQEGRLEKEFI 142 Query: 165 HEGKKEDCYLMEY 177 +GK D LM Y Sbjct: 143 IQGKYVDDILMAY 155 >CLOTE Q895L4 (Q895L4) DNA topoisomerase I (EC 5.99.1.2) Length = 696 Score = 33.5 bits (75), Expect = 1.5 Identities = 19/52 (36%), Positives = 28/52 (53%), Gaps = 3/52 (5%) Query: 220 NNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSI 271 N+EYIF+ S K G+ IY +LN + + N+ IP +E + E SI Sbjct: 399 NSEYIFRATGSIVKFDGFM---IIYEYLNEDEKENINIPKLEKGELLKEKSI 447 >CLOPE Q8XIF5 (Q8XIF5) Ribosomal-protein-alanine N-acetyltransferase Length = 148 Score = 33.5 bits (75), Expect = 1.5 Identities = 23/76 (30%), Positives = 39/76 (51%), Gaps = 4/76 (5%) Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164 P Y +G+G I + L KE N N++ L+ ++N A Y+K GF+ E+ Sbjct: 76 PEYRKQGVGNLLIDNLIT-LCKENNINSLTLEVRESNIPAQSLYKKHGFK--EEGIRKNF 132 Query: 165 HEGKKEDCYLMEYRYD 180 + KE+ +M +R+D Sbjct: 133 YNNPKENAIIM-WRHD 147 >BRUSU Q8FVN1 (Q8FVN1) Acyl-CoA dehydrogenase family protein Length = 388 Score = 33.5 bits (75), Expect = 1.5 Identities = 21/79 (26%), Positives = 38/79 (48%), Gaps = 15/79 (18%) Query: 181 
DNATNVKAMKYLIEH-----YFDNFKVDSIEIIG----------SGYDSVAYLVNNEYIF 225 +N ++ ++ ++ H +FDN +V +IG SG ++ L+ E I Sbjct: 197 NNGLTIRPIRTMMNHATTEVFFDNLRVPVSNLIGEEGKGFRYILSGMNAERILIAAECIG 256 Query: 226 KTKFSTNKKKGYAKEKAIY 244 K+ T K YAKE++I+ Sbjct: 257 DAKWFTQKSVNYAKERSIF 275 >BRUME Q8YCN6 (Q8YCN6) ACYL-COA DEHYDROGENASE (EC 1.3.99.-) Length = 395 Score = 33.5 bits (75), Expect = 1.5 Identities = 21/79 (26%), Positives = 38/79 (48%), Gaps = 15/79 (18%) Query: 181 DNATNVKAMKYLIEH-----YFDNFKVDSIEIIG----------SGYDSVAYLVNNEYIF 225 +N ++ ++ ++ H +FDN +V +IG SG ++ L+ E I Sbjct: 204 NNGLTIRPIRTMMNHATTEVFFDNLRVPVSNLIGEEGKGFRYILSGMNAERILIAAECIG 263 Query: 226 KTKFSTNKKKGYAKEKAIY 244 K+ T K YAKE++I+ Sbjct: 264 DAKWFTQKSVNYAKERSIF 282 >BACSU YQAR_BACSU (P45914) Hypothetical protein yqaR Length = 154 Score = 33.5 bits (75), Expect = 1.5 Identities = 36/157 (22%), Positives = 74/157 (47%), Gaps = 11/157 (7%) Query: 201 KVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKK---KGYAKEKAIYNFLNTNLETN--- 254 ++D +I S +V + Y+ + K + KGY IYN N +ET Sbjct: 3 QIDFGTVITSAITAVFFTGGTNYVLQKKNRKGNEIFTKGYILIDEIYNINNKRIETAAAF 62 Query: 255 VKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKRDIASFLRQMHGLDY 314 V N Y+ ++L +KE+ L + +S + ++E N+ ++ ++LR++ Sbjct: 63 VPFYNHPEGYL-EKLHTDYFKELSAFELIVKKFSILFDKELNIKLQEYINYLREVEVALR 121 Query: 315 TDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYI 351 +++ I + N +EYI E + +++T++ K +I Sbjct: 122 GFMNDDPI-IEVNFNQEYI---ERLIDEITNLIKKHI 154 >BACAN Q81RF9 (Q81RF9) Acetyltransferase, GNAT family Length = 185 Score = 33.5 bits (75), Expect = 1.5 Identities = 40/184 (21%), Positives = 71/184 (38%), Gaps = 29/184 (15%) Query: 6 NEICIRTLIDDDFPLMLKWLTDERVLE-------FYGGRDKKYTLESLKKHYTEPWEDEV 58 +++ IRT+ + D + + E E ++ ++Y++ +K T E+ + Sbjct: 9 DKVTIRTIEESDIKTLWNLVFKEENPEWKKWDAPYFSFSMQEYSVYK-EKMQTRLKEEPL 67 Query: 59 FRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIK 118 +IIE N IG Y + Y T + G+ I P YW+ G GT + Sbjct: 68 SNLIIENNGQVIGTVGFY---------WEYKPTRWLEMGI--VIYNPAYWNGGYGTEALT 116 Query: 119 
LIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYR 178 L + L ++ V L N R ++ +K G + EG+ C Sbjct: 117 LYRDLLFEKMEIGRVGLTTWSGNERMMKVAEKIGMSL----------EGRMRKCRYYNGT 166 Query: 179 YDDN 182 Y D+ Sbjct: 167 YYDS 170 >VIBPA Q87NP1 (Q87NP1) Putative spermine/spermidine acetyltransferase BltD Length = 182 Score = 33.1 bits (74), Expect = 2.0 Identities = 21/91 (23%), Positives = 42/91 (46%), Gaps = 3/91 (3%) Query: 85 DYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRA 144 ++H P T + M + P++ KG+G+ + + + N V L+ + N A Sbjct: 89 EFHAPSTGTLWLPMLTIL--PSFKGKGLGSEIVSSVIAVACEYANLQNVGLNVYAENISA 146 Query: 145 IRAYQKSGFRIIEDLPEHELHEGKKEDCYLM 175 R + + GF I + E+ GK+ +C ++ Sbjct: 147 FRFWYRQGFTQIRAF-DQEIEFGKEYNCLVL 176 >THETN Q8RC65 (Q8RC65) Acetyltransferases Length = 200 Score = 33.1 bits (74), Expect = 2.0 Identities = 16/50 (32%), Positives = 31/50 (62%), Gaps = 1/50 (2%) Query: 111 GIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLP 160 G+G++ ++ I + +K + ++LD N +AI+ Y+K G++IIE P Sbjct: 134 GLGSKLLEEIEQEARKLK-CKRIVLDVEIENEKAIKLYEKLGYKIIERSP 182 >STRR6 Q8CYV9 (Q8CYV9) Hypothetical protein spr0850 Length = 166 Score = 33.1 bits (74), Expect = 2.0 Identities = 23/81 (28%), Positives = 40/81 (49%), Gaps = 6/81 (7%) Query: 86 YHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYI-KLIFEFLKKERNANAVILDPHKNNPRA 144 Y YP + + G+ F+ + Y KGIG+ + + + F K R A + K NP++ Sbjct: 80 YAYPDEETVFIGL--FMVDQAYQRKGIGSHIVTEALAYFAKNFRKARLAYV---KGNPQS 134 Query: 145 IRAYQKSGFRIIEDLPEHELH 165 ++K GF+ I + EL+ Sbjct: 135 QHFWEKQGFKSIGCEVKQELY 155 >STAES Q8CU70 (Q8CU70) Spermidine acetyltransferase Length = 165 Score = 33.1 bits (74), Expect = 2.0 Identities = 27/83 (32%), Positives = 37/83 (44%), Gaps = 14/83 (16%) Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHK-------NNPRAIRAYQKSG 152 Q I +P + KG Y K FE K N IL+ HK +N +A+ Y+ G Sbjct: 81 QIIIKPEFSGKG----YAKFAFE---KAINYAFDILNMHKIYLYVDTDNKKAVHIYESQG 133 Query: 153 FRIIEDLPEHELHEGKKEDCYLM 175 F+ L E +GK +D Y M Sbjct: 134 FKTEGLLKEQFYTKGKYKDAYFM 156 >STAES Q8CS08 
(Q8CS08) Hypothetical protein SE1483 Length = 434 Score = 33.1 bits (74), Expect = 2.0 Identities = 38/150 (25%), Positives = 71/150 (47%), Gaps = 24/150 (16%) Query: 224 IFKTKFSTNKKKGYAK----EKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKE--- 276 I KT+FST+K KGY K EK+ N N + + ++ N + I++E+S L Sbjct: 6 ILKTQFSTSKFKGYLKYINDEKS--NKANHD-KKKIQSLNQDIENINNEMSNLNLNSYSS 62 Query: 277 -IKGTFLTPEIYSTMSEEEQNLLKRDIASFLRQMHGLDYTDISECTID----------NK 325 I G I ++ ++ ++KR A F + LD ++++ D N Sbjct: 63 YIIGYMKNNSITKKDNQNKKKVIKRTTAPFNNNSYTLDNKELNKLKDDFDTAEKQGCINY 122 Query: 326 QNVL--EEYILLRETIYNDLTD-IEKDYIE 352 Q+++ + L++ +Y+ TD + +D I+ Sbjct: 123 QDIISFDNDFLIKNHLYDAKTDELNEDVIK 152 >RICPR Q9ZDP9 (Q9ZDP9) Hypothetical protein RP278 Length = 371 Score = 33.1 bits (74), Expect = 2.0 Identities = 34/116 (29%), Positives = 52/116 (44%), Gaps = 19/116 (16%) Query: 231 TNKKKGYAK----EKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEI 286 T K+ Y + E+A+Y+ L + K NI S D+L + +KG LTPE Sbjct: 101 TRLKENYIQYDTVEEALYSLLTKETDLIKKANNIPESLTPDDL-----RRLKGENLTPE- 154 Query: 287 YSTMSEEEQNLLKRDIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYND 342 E+E+ K + S L + +D T S D + N + E L +TI N+ Sbjct: 155 -----EQEEERKKFEYLSILGSI--IDDTKKSNEHYDKRANEINEQ--LNKTIINE 201 >OCEIH Q8CXG0 (Q8CXG0) Diamine N-acetyltransferase (Spermine:spermidine acetyltransferase) (EC 2.3.1.57) Length = 152 Score = 33.1 bits (74), Expect = 2.0 Identities = 26/100 (26%), Positives = 44/100 (44%), Gaps = 12/100 (12%) Query: 66 NNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLK 125 ++ PIGY + +H + + D+F+ + KG +YI LI +++K Sbjct: 53 DDTPIGYAMV---------GFHSQEKQSAWF--DRFMIAAEHQGKGYAHQYIPLILDYIK 101 Query: 126 KERNANAVILDPHKNNPRAIRAYQKSGFRII-EDLPEHEL 164 + ++ L N A Y+K GF + E PE EL Sbjct: 102 MKYQVKSIKLSIIPTNDVAKLLYEKYGFVLTGETDPEGEL 141 >CLOAB Q97J70 (Q97J70) Predicted acetyltransferase Length = 171 Score = 33.1 bits (74), Expect = 2.0 Identities = 19/69 (27%), Positives = 31/69 (44%) Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHE 166 YW 
G+G + I + + KK + L +N RAI+ Y+ GF + L + Sbjct: 98 YWGLGVGRKLIMNLIAWSKKNHIVRKINLRVRTDNYRAIKLYESLGFVNEGTIKRDFLID 157 Query: 167 GKKEDCYLM 175 G+ D + M Sbjct: 158 GEFYDSFSM 166 >BURMA Q9AI54 (Q9AI54) DedA family protein Length = 1925639 Score = 33.1 bits (74), Expect = 2.0 Identities = 50/238 (21%), Positives = 103/238 (43%), Gaps = 28/238 (11%) Query: 12 TLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRVIIEYNNVPIG 71 T+ ++++ M+ ++VLE G ++K +YT ++I+Y N I Sbjct: 546537 TINENZYMEMITKDNLKQVLENLGFKNKNENYVKTINNYT---------LLIDYKNQSIN 546587 Query: 72 YGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFL----KKE 127 Y + K++D+ +++ +P+ + + + + KG Y++L ++ KK Sbjct: 546588 YPKEIKIHDKTTSNFSHPENFVVFECVHRLL------EKGYKAEYLELEPKWNLGRDKKG 546641 Query: 128 RNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYDDNATNVK 187 A+ ++ D ++NNP I + + + E + E + +++ L Y + K Sbjct: 546642 GKADILVKD-NENNPYLIIECKTTDSKNSEFI--KEWNRMQEDGGQLFSYFQQE-----K 546693 Query: 188 AMKYLIEHYFD-NFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIY 244 +KYL + D + K++ I YD+ YL E K S N + + K Y Sbjct: 546694 GVKYLCLYTSDFSDKLEYKNYIIQAYDNEEYLKEKELQNSYKKSNNNIELFKTWKESY 546751 Score = 31.2 bits (69), Expect = 7.4 Identities = 20/73 (27%), Positives = 36/73 (49%), Gaps = 2/73 (2%) Query: 105 PNYWSKGIGTRYIKLIFEFLKK-ERNANAVILDPHKNNPRAIRAYQKSGFRII-EDLPEH 162 P++ +G+G+R + + + + E + L NP A+R Y++ GFR + Sbjct: 1424334 PDHQGRGVGSRLFESLIAWARSAEPEIVRIELAAGAGNPGAVRLYERLGFRHEGRQVARG 1424393 Query: 163 ELHEGKKEDCYLM 175 L +G+ ED LM Sbjct: 1424394 RLPDGRFEDDILM 1424406 >BRAJA Q89YE3 (Q89YE3) Bll0009 protein Length = 250 Score = 33.1 bits (74), Expect = 2.0 Identities = 14/56 (25%), Positives = 31/56 (55%), Gaps = 4/56 (7%) Query: 104 EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 +PN+ KG+GT L+ + + + + ++ +NP I YQ+ GF+++ ++ Sbjct: 165 DPNWVGKGLGT----LLMNYALQRCDEDGIVAYLESSNPENIPFYQRHGFKVVGEI 216 >BACAN Q81NK7 (Q81NK7) Streptothricin acetyltransferase, putative Length = 184 Score = 33.1 bits (74), Expect = 2.0 Identities = 31/119 
(26%), Positives = 54/119 (45%), Gaps = 6/119 (5%) Query: 40 KYTLE---SLKKHYTEPWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVY 96 +YT+E S +K Y + +E+ V EY N P I +++++ K Sbjct: 41 EYTVEDVPSYEKSYLQNDNEEL--VYNEYINKPNQIIYIALLHNQIIGFIVLKKNWNNYA 98 Query: 97 GMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRI 155 ++ + Y + G+G R I ++ K E N ++L+ NN A + Y+K GF I Sbjct: 99 YIEDITVDKKYRTLGVGKRLIAQAKQWAK-EGNMPGIMLETQNNNVAACKFYEKCGFVI 156 >VIBUY Q7MI31 (Q7MI31) Ribosomal-protein-alanine acetyltransferase Length = 150 Score = 32.7 bits (73), Expect = 2.6 Identities = 19/75 (25%), Positives = 34/75 (45%), Gaps = 1/75 (1%) Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164 P KG G + + + + NA + L+ ++N AI YQ+ GF ++ + Sbjct: 74 PKQQGKGYGRQLLDAFIDE-GEAANAESAWLEVRESNVNAIHLYQEMGFNEVDRRRNYYP 132 Query: 165 HEGKKEDCYLMEYRY 179 + KED +M Y + Sbjct: 133 TQSGKEDAIIMSYLF 147 >OCEIH Q8EMB8 (Q8EMB8) Glucose-6-phosphate 1-dehydrogenase (EC 1.1.1.49) (G6PD) Length = 491 Score = 32.7 bits (73), Expect = 2.6 Identities = 45/198 (22%), Positives = 79/198 (39%), Gaps = 25/198 (12%) Query: 53 PWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTD----EIVYGMDQFIG--EPN 106 PW DEV R +E N++ + E + ++Y D E G+++ I E Sbjct: 51 PWTDEVLRENVE-NSIQDALSPDEDL-SEFISHFYYKSFDVTEKESYQGLNEIIQNLEGQ 108 Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHE 166 Y ++G Y+ + +F N + N ++ +IE H+L Sbjct: 109 YQTEGNRLFYLAMAPDFFGAIAN---------QLNDYGLKNTSGWTRLVIEKPFGHDLPS 159 Query: 167 GKKEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFK 226 KK + L +D Y I+HY V +IE+I +L NN +I Sbjct: 160 AKKLNHELQAAFREDQI-------YRIDHYLGKEMVQNIEVIRFANGIFEHLWNNRFISN 212 Query: 227 TKFSTNKKKGYAKEKAIY 244 + ++++ G +E+A Y Sbjct: 213 IQITSSETLG-VEERARY 229 >OCEIH Q8EKW6 (Q8EKW6) Acetyltransferase Length = 166 Score = 32.7 bits (73), Expect = 2.6 Identities = 36/165 (21%), Positives = 64/165 (38%), Gaps = 16/165 (9%) Query: 6 NEICIRTLIDDDFPLMLKWLTDERVLEFYGG---RDKKYTLESLKKHYT--EPWEDEVFR 60 N + R D+DFP + L D V+ F G RD K + 
L+ Y + + Sbjct: 5 NRLTFRPYHDNDFPFLQSLLQDPEVVRFIGDGNVRDDKACNDFLQWIYDTYKNGNGLGLQ 64 Query: 61 VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120 V++ N +G+ + E + EI Y + + +W KG T + Sbjct: 65 VLVNKQNERVGHAGLVPQTVEGKNEI------EIGYWIAK-----KHWGKGYATEAALAL 113 Query: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELH 165 F F +K + VI + N + +K +I +++ + H Sbjct: 114 FAFARKNIEVDRVISLIQRENTASRNVAEKLMMKIEKEIILKDKH 158 >MYCPE Q8EWC2 (Q8EWC2) Oligoendopeptidase F Length = 604 Score = 32.7 bits (73), Expect = 2.6 Identities = 30/121 (24%), Positives = 54/121 (44%), Gaps = 7/121 (5%) Query: 181 DNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE 240 DN T + L ++ F FK+D + I Y+ V+ +N + K + N+K Sbjct: 37 DNGTCYSNLNKLKKYLF--FKLDMVPIENKLYNYVSNKLNEDLANKEMINWNQKLSSKIS 94 Query: 241 KAIYNFLNTNLETNVKIPNIEY--SYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLL 298 + +F N E N+ + N E S+I ++ I ++ E + +EEE+ L+ Sbjct: 95 EFQLSFAN---EINIILDNKELIKSFIENDSEIKKFERFFDLIFKEENHKLSNEEEKLLV 151 Query: 299 K 299 K Sbjct: 152 K 152 >LISMO Q8Y8S4 (Q8Y8S4) Lmo0820 protein Length = 185 Score = 32.7 bits (73), Expect = 2.6 Identities = 21/74 (28%), Positives = 33/74 (44%), Gaps = 3/74 (4%) Query: 98 MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE 157 +D + Y G+GT + + E + E V L+ K NP A R Y++ GF + Sbjct: 113 LDSIVTNEKYRGHGVGTALLAKLTE-IAAEDGEKVVGLNCDKGNPHAKRLYERLGFHVTG 171 Query: 158 D--LPEHELHEGKK 169 + L HE +K Sbjct: 172 EITLSGHEYEHMQK 185 >CLOPE Q8XKM5 (Q8XKM5) Probable acetyltransferase Length = 167 Score = 32.7 bits (73), Expect = 2.6 Identities = 18/76 (23%), Positives = 36/76 (47%), Gaps = 8/76 (10%) Query: 85 DYHYPKTDEIVYGMD-------QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDP 137 DY Y ++I + +D + P++ KG G + I + E + KE+ N++ + Sbjct: 69 DYAYDVYNDIAWQVDGPFLSFHRIAVSPSHRGKGYGRKMIDFV-EEMAKEKKCNSIRISA 127 Query: 138 HKNNPRAIRAYQKSGF 153 + N A+ Y+ G+ Sbjct: 128 YHKNENAVNLYKNLGY 143 >BUCAI SPED_BUCAI (P57304) S-adenosylmethionine decarboxylase proenzyme (EC 4.1.1.50) (AdoMetDC) (SamDC) [Contains: 
S-adenosylmethionine decarboxylase beta chain; S-adenosylmethionine decarboxylase alpha chain] Length = 265 Score = 32.7 bits (73), Expect = 2.6 Identities = 30/103 (29%), Positives = 46/103 (44%), Gaps = 11/103 (10%) Query: 169 KEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTK 228 + D +EYR ++ +K+ I+H ++ + + I S YD V V E IF T+ Sbjct: 159 ESDIVTIEYRVRGFTRDIHGIKHFIDHKINSIQNFMSDDIKSMYDMVDVNVYQENIFHTR 218 Query: 229 FSTNKKKGYAKEKAIYNFL-NTNLETNVKIPNIEYSYISDELS 270 +E + N+L N NLE + E SYI LS Sbjct: 219 M-------LLREFNLKNYLFNINLE---NLEKEERSYIKKLLS 251 >AQUAE O67458 (O67458) Hypothetical protein aq_1482 Length = 161 Score = 32.7 bits (73), Expect = 2.6 Identities = 18/60 (30%), Positives = 30/60 (50%), Gaps = 1/60 (1%) Query: 95 VYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154 V + + + +P Y G+GT + I E+ KK + + L N +AI Y+K GF+ Sbjct: 87 VGAIHEIVVDPEYQGHGVGTALMNTILEYFKK-KGLDTAELWVGDENYKAINFYKKFGFQ 145 >YERPE Q8CZN5 (Q8CZN5) Acyltransferase for 30S ribosomal subunit protein S18 Length = 161 Score = 32.3 bits (72), Expect = 3.3 Identities = 20/72 (27%), Positives = 34/72 (47%), Gaps = 1/72 (1%) Query: 104 EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHE 163 +P Y +G G ++ + E L+ ERN + L+ +N RAI Y+ GF + + Sbjct: 86 DPQYQRQGYGRLLLEHLIEQLE-ERNIVTLWLEVRASNARAIALYESLGFNEVSVRRNYY 144 Query: 164 LHEGKKEDCYLM 175 +ED +M Sbjct: 145 PSANGREDAIMM 156 >STRAW Q827N9 (Q827N9) Putative acetyltransferase Length = 166 Score = 32.3 bits (72), Expect = 3.3 Identities = 19/59 (32%), Positives = 31/59 (52%), Gaps = 1/59 (1%) Query: 109 SKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEG 167 S+GIG+ I+ E L +ER + + L +NPRA Y + G+R + + +EG Sbjct: 88 SRGIGSALIRAAEE-LTRERGLDVIGLGVGTDNPRAAELYARLGYRPLTGYVDRWSYEG 145 >STAES Q8CU81 (Q8CU81) Spermidine acetyltransferase Length = 165 Score = 32.3 bits (72), Expect = 3.3 Identities = 24/80 (30%), Positives = 35/80 (43%), Gaps = 8/80 (10%) Query: 100 QFIGEPNYWSKGIGTRYIKLIFE----FLKKERNANAVILDPHKNNPRAIRAYQKSGFRI 155 Q I +P 
+ KG Y K FE + N + + L +N +AI Y+ GF+ Sbjct: 81 QIIIKPEFSGKG----YAKFAFEKAIIYAFNILNMHKIYLYVDADNKKAIHIYESQGFKT 136 Query: 156 IEDLPEHELHEGKKEDCYLM 175 L E +GK +D Y M Sbjct: 137 EGLLKEQFYTKGKYKDAYFM 156 >RICPR Q9ZCN0 (Q9ZCN0) RIBOSOMAL-PROTEIN-ALANINE ACETYLTRANSFERASE (RimJ) Length = 183 Score = 32.3 bits (72), Expect = 3.3 Identities = 26/85 (30%), Positives = 41/85 (48%), Gaps = 12/85 (14%) Query: 93 EIVYGMDQFIGEPNYWSKGIGTRYIKLIFEF---LKKERNANAVILDPHKNNPRAIRAYQ 149 EI Y +D PN+W +GI + IK I +F + R VI D N R++ + Sbjct: 103 EISYDLD-----PNFWGQGIMLKSIKNILKFADCIGIIRVQATVITD----NFRSVNLLE 153 Query: 150 KSGFRIIEDLPEHELHEGKKEDCYL 174 + GF L ++E+ K +D Y+ Sbjct: 154 RCGFSKEGILKKYEIIANKHKDYYM 178 >OCEIH Q8ERM1 (Q8ERM1) Acetyltransferase Length = 177 Score = 32.3 bits (72), Expect = 3.3 Identities = 39/175 (22%), Positives = 64/175 (36%), Gaps = 15/175 (8%) Query: 5 ENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLK-KHY---TEPWEDEVFR 60 + E+ IR + + D + + + E E+ ++ ES+ +H+ E W D R Sbjct: 4 DQELTIRPIQEKDLKRLWELIYKEDNPEWKQWDAPYFSHESMSYEHFLKEAESWIDAKSR 63 Query: 61 VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120 ++ NN G Y+Y + M E N W KG GT +KL Sbjct: 64 WVVCVNNDVHGT-----------VSYYYEDEQKNWLEMGIIFYEGNNWGKGYGTTALKLW 112 Query: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLM 175 + + V L N R IR +K G + + + G+ D M Sbjct: 113 VNHIFTQLPVVRVGLTTWSGNKRMIRVAEKLGMTMEGRIRNVRYYNGEYYDSIRM 167 >MYCGE RIBF_MYCGE (P47391) Putative riboflavin biosynthesis protein ribF [Includes: Riboflavin kinase (EC 2.7.1.26) (Flavokinase); FMN adenylyltransferase (EC 2.7.7.2) (FAD pyrophosphorylase) (FAD synthetase)] Length = 269 Score = 32.3 bits (72), Expect = 3.3 Identities = 16/44 (36%), Positives = 27/44 (61%), Gaps = 3/44 (6%) Query: 419 TNFGEDILRMY-GNIDIEKAKEYQDIVEEYYPIETIVYGIKNIK 461 TN ++R Y N ++EKA + +VE YY + T+V+G+K + Sbjct: 120 TNLSSSVIRNYLTNNELEKANQL--LVEPYYRVGTVVHGLKKAR 161 >ENTFA Q836M4 (Q836M4) Spermine/spermidine acetyltransferase, putative Length = 148 Score = 32.3 bits (72), Expect 
= 3.3 Identities = 16/56 (28%), Positives = 29/56 (51%) Query: 98 MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153 +D+F+ + + +G G +L+ L ++ N + L + N AIR YQ+ GF Sbjct: 72 LDRFLIDQRFQGQGYGKAACRLLMLKLIEKYQTNKLYLSVYDTNSSAIRLYQQLGF 127 >CHRVO Q7NRZ7 (Q7NRZ7) Probable acetyltransferase (EC 2.3.1.-) Length = 172 Score = 32.3 bits (72), Expect = 3.3 Identities = 20/70 (28%), Positives = 31/70 (44%) Query: 106 NYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELH 165 ++ KG G ++ I + + L +N RAI Y+K GF L + +L Sbjct: 92 DWQGKGAGGAMMRAIIDLADNWLGLIRIELKVIHDNARAIALYEKFGFEYEGRLRQEQLR 151 Query: 166 EGKKEDCYLM 175 GK ED +M Sbjct: 152 AGKLEDVLVM 161 >CHLCV Q824H9 (Q824H9) Acetyltransferase, GNAT family Length = 170 Score = 32.3 bits (72), Expect = 3.3 Identities = 18/52 (34%), Positives = 29/52 (55%), Gaps = 2/52 (3%) Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153 +GEP Y +KGIGT + + K + + L+ ++ NP AI Y++ GF Sbjct: 95 VGEP-YRNKGIGTALLNNLCHLAKSRFHLEILYLEVYEENP-AIELYKRFGF 144 >CHLMU SYR_CHLMU (Q9PJT8) Arginyl-tRNA synthetase (EC 6.1.1.19) (Arginine--tRNA ligase) (ArgRS) Length = 563 Score = 32.3 bits (72), Expect = 3.3 Identities = 56/268 (20%), Positives = 106/268 (39%), Gaps = 63/268 (23%) Query: 204 SIEIIGSGYD----SVAYLVNNEYIFKTKFSTNKKKGY---AKEKAIYNFLNTNLETNVK 256 SIEI G+G+ S +L N IF + + KG+ + +K I +F + N+ ++ Sbjct: 75 SIEIAGAGFINFTFSKEFLANQLQIFSQELA----KGFPVSSPQKVIIDFSSPNIAKDMH 130 Query: 257 IPNIEYSYISDEL----SILGYKEIK-----------GTFLT--PEIYSTMSEEEQNLLK 299 + ++ + I D L S +G+ ++ G +T E T + +NL + Sbjct: 131 VGHLRSTIIGDCLARCFSFVGHDVLRLNHIGDWGTAFGMLITYLQETAQTDIHQLENLTE 190 Query: 300 RDIASFLRQMHGLDYTDISE--------------------CTIDNKQ-----NVLEEYIL 334 + +R ++ S+ C + K ++L+ + Sbjct: 191 LYKKAHVRFAEDPEFKKRSQYNVVALQSGDPQALALWKQICAVSEKSFQKIYSILDVELH 250 Query: 335 LR-ETIYND-LTDIEKDYIESFMERLNATTVFEGKKCLCHNDFSCNHLLL---DGNNRLT 389 R E+ YN L D+ D +E N T+ +G KC+ H +FS ++ G N T Sbjct: 251 TRGESFYNPFLADVVSD-----LESKNLVTLSDGAKCVFHEEFSIPLMIQKSDGGYNYAT 305 Query: 390 
XXXXXXXXXXXXEYCDFIYLLEDSEEEI 417 ++ D I ++ DS + + Sbjct: 306 TDVAAMRYRIQQDHADRILIVTDSGQSL 333 >BACSU O34376 (O34376) Putative acetyl transferase (YobR protein) Length = 247 Score = 32.3 bits (72), Expect = 3.3 Identities = 24/84 (28%), Positives = 37/84 (44%), Gaps = 2/84 (2%) Query: 76 YKMYD-ELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVI 134 +KMYD E T + G+ + + KG GT+ I+++ E+ K A + Sbjct: 158 FKMYDKESLTALGTVSVIDGYGGLSNIVVAEEHRGKGAGTQVIRVLTEWAKNN-GAERMF 216 Query: 135 LDPHKNNPRAIRAYQKSGFRIIED 158 L K N A+ Y K GF I + Sbjct: 217 LQVMKENLAAVSLYGKIGFSPISE 240 >BACC1 Q738E9 (Q738E9) Acetyltransferase, GNAT family Length = 308 Score = 32.3 bits (72), Expect = 3.3 Identities = 38/159 (23%), Positives = 59/159 (37%), Gaps = 25/159 (15%) Query: 62 IIEYNNVPIGYGQIYKM-YDELYTDYHYPKTDEIVYG-------------MDQFIGEPNY 107 +I+YN P GY + M Y D + D + G +D+ EP Y Sbjct: 39 VIDYNIQPPGYSSVEMMRYSIEELDCYKVIMDGKIIGGIIVTISGKSYGRIDRIFVEPVY 98 Query: 108 WSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEG 167 KGIG+ IKLI E R + NN Y+K G+ I + Sbjct: 99 QGKGIGSYVIKLIEEEYPSIRIWDLETSSRQLNNH---HFYKKMGYETI--------FKS 147 Query: 168 KKEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIE 206 + E CY+ + N+ K + ++N + + E Sbjct: 148 EDEYCYVKRITVESAEENLIKNKDMKNSQYENCNLANTE 186 >BACC1 Q736C5 (Q736C5) Acetyltransferase, GNAT family Length = 181 Score = 32.3 bits (72), Expect = 3.3 Identities = 23/79 (29%), Positives = 39/79 (49%), Gaps = 6/79 (7%) Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161 IG YW KG G + + E V L ++N +A ++Y+K+GF + E L Sbjct: 94 IGNKEYWGKGYGIAALYSMLHVAFFEFELEKVWLRVDEDNFQARKSYEKAGF-VCEGLMR 152 Query: 162 HE-LHEGKKEDCYLMEYRY 179 ++ L +G+ ++ YRY Sbjct: 153 NDRLRKGQ----FIHRYRY 167 >BACC1 Q735P8 (Q735P8) Acetyltransferase, GNAT family Length = 149 Score = 32.3 bits (72), Expect = 3.3 Identities = 31/122 (25%), Positives = 49/122 (40%), Gaps = 21/122 (17%) Query: 50 YTEPWEDEVFRVIIE--YNNVP--IGYGQIYKMYDELY---TDYHYPKTDEIVYGMDQFI 102 Y P +E V E YN+ P +G+ + K +L Y K D+IV G F 
Sbjct: 9 YIVPCTEESIHVANEQGYNSGPHIVGHVENVKQDKDLLPWGAWYVIRKEDDIVLGDIGFK 68 Query: 103 GEPN--------------YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAY 148 G+PN YW+KG T ++ + + + +I + N +IR Sbjct: 69 GKPNEEHTVEVGYGFIEKYWNKGYATEAVRELINWAFQTGEVEMIIAETLLENESSIRVL 128 Query: 149 QK 150 +K Sbjct: 129 EK 130 >AQUAE YZ34_AQUAE (O66423) Hypothetical protein AA34 Length = 318 Score = 32.3 bits (72), Expect = 3.3 Identities = 25/71 (35%), Positives = 37/71 (52%), Gaps = 5/71 (7%) Query: 414 EEEIGTNFGEDILRMYGNIDIE-KAKEYQDIVEEYYPI----ETIVYGIKNIKQEFIENG 468 EE IG GE + + + E KAKE + V++ I ET+ Y IK I +E I + Sbjct: 215 EELIGETLGELLEKEIEKLVAEEKAKEIEGKVKKLKEIVSWFETLPYEIKQIAKEVISDN 274 Query: 469 RKEIYKRTYKD 479 +I ++ YKD Sbjct: 275 VLDIAEKFYKD 285 >YERPE Q8ZCX6 (Q8ZCX6) Spermidine acetyltransferase (EC 2.3.1.57) (Spermidine N1-acetyltransferase) Length = 181 Score = 32.0 bits (71), Expect = 4.4 Identities = 21/69 (30%), Positives = 29/69 (42%) Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 Q I +P + KG KL E+ N + L K N +AI Y K GF I +L Sbjct: 87 QIIIDPTHQGKGYAGAAAKLAMEYGFSVLNLYKLYLIVDKENEKAIHIYSKLGFEIEGEL 146 Query: 160 PEHELHEGK 168 + G+ Sbjct: 147 KQEFFINGE 155 >STRR6 Q8DNN2 (Q8DNN2) Hypothetical protein spr1627 Length = 148 Score = 32.0 bits (71), Expect = 4.4 Identities = 14/58 (24%), Positives = 30/58 (51%) Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE 157 +F P +G+G++ ++ + + +++ L+ + N RA YQK GF I++ Sbjct: 76 RFFINPQKQEQGLGSQALRKFVSLAFENEDIDSISLNVFEANQRAQNLYQKEGFEIVQ 133 >STAAM Q99RQ8 (Q99RQ8) Similar to transcription repressor of sporulation, septation and degradation PaiA Length = 171 Score = 32.0 bits (71), Expect = 4.4 Identities = 20/65 (30%), Positives = 35/65 (53%), Gaps = 4/65 (6%) Query: 111 GIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKE 170 G G++ I+L E + +E N + + L ++NPRA Y++ GF+++ EH G Sbjct: 106 GRGSQLIELA-EKIAQEHNKHKIWLGVWEHNPRAQAFYKRHGFKVV---GEHHFQTGDVT 161 Query: 171 DCYLM 175 D L+ Sbjct: 162 DTDLI 166 >LACLA 
Q9CJA2 (Q9CJA2) Acetyl transferase Length = 162 Score = 32.0 bits (71), Expect = 4.4 Identities = 21/69 (30%), Positives = 32/69 (46%), Gaps = 2/69 (2%) Query: 110 KGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKK 169 KG+ T I +F KKE + + +NP A++ Y K GF L + +G+ Sbjct: 89 KGVATTLINFFIDFAKKE-GFKKITIQVMGSNPAALKLYNKLGFVEEGRLKKEFFIDGEY 147 Query: 170 -EDCYLMEY 177 +DC L Y Sbjct: 148 IDDCILAFY 156 >CLOTE Q892J2 (Q892J2) Conserved protein Length = 218 Score = 32.0 bits (71), Expect = 4.4 Identities = 39/154 (25%), Positives = 57/154 (37%), Gaps = 21/154 (13%) Query: 219 VNNEYIFK-------------TKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYI 265 VNN IFK T F++ K +G Y LN N+ N S + Sbjct: 9 VNNTPIFKCNYCGHCSKEIEATSFTSVKNRGCCWYFPKYTLLNIKNILNIGKENFIISLL 68 Query: 266 SDELSILG--YKEIKGTFLTPEIYSTMSEEEQNLLKRDIASFLRQMHGLDYTDISECTID 323 +++ S + + E+KG+F E Y M E E D F R+ + C++D Sbjct: 69 NNKNSNISSYFIEVKGSFEEEEYYKFMRENEYTESSFDYKLFFRK---CSFVTDKGCSLD 125 Query: 324 NKQNVLEEYILLRETIYNDLTDIEKDYIESFMER 357 + L I N +KDY ER Sbjct: 126 FSLRPHPCNLYLCRNIIN---TCDKDYSSFSRER 156 >BRAJA Q89UA8 (Q89UA8) Hypothetical acetyltransferase Length = 148 Score = 32.0 bits (71), Expect = 4.4 Identities = 19/56 (33%), Positives = 33/56 (58%), Gaps = 5/56 (8%) Query: 98 MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153 +DQ + +P W G+ +L+ E K+ + + V L +K+N RAIR Y+++GF Sbjct: 76 LDQLVVDPASW----GSDAARLLVEEAKR-LSPSGVTLLVNKDNTRAIRFYERNGF 126 >BACSU O34558 (O34558) YopR protein Length = 325 Score = 32.0 bits (71), Expect = 4.4 Identities = 17/45 (37%), Positives = 26/45 (57%), Gaps = 5/45 (11%) Query: 211 GYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNV 255 G +LV N+Y+ KTK ++NK G A + F+ TNL T++ Sbjct: 203 GQTKEVFLVENDYVVKTKRTSNKGDGQASK-----FVITNLITDI 242 >BACAN Q81R63 (Q81R63) Hypothetical protein Length = 217 Score = 32.0 bits (71), Expect = 4.4 Identities = 15/45 (33%), Positives = 27/45 (60%) Query: 324 NKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNATTVFEGKK 368 N+ NVL E + +E + L++ +KDYI+S E++ T E ++ Sbjct: 141 
NQMNVLNESVTTQEELQRYLSENKKDYIKSVAEKVYQTATEEKRE 185 >VIBPA Q87MD0 (Q87MD0) Acetyltransferase-related protein Length = 168 Score = 31.6 bits (70), Expect = 5.7 Identities = 15/53 (28%), Positives = 26/53 (49%) Query: 101 FIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153 FI + YW KG+ T +K F +E + V + + N+ ++ +K GF Sbjct: 86 FIFDKAYWGKGLATEALKAFFPKACRELELHKVKANVNSNHQASMAVLEKLGF 138 >STRR6 Q8DND0 (Q8DND0) Transcriptional activator Length = 299 Score = 31.6 bits (70), Expect = 5.7 Identities = 19/81 (23%), Positives = 40/81 (49%), Gaps = 12/81 (14%) Query: 295 QNLLKRDIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEK------ 348 Q L+++D+A F+ Q+ L + + K +E Y ++R+T+ + + +EK Sbjct: 167 QMLIRKDLAKFINQIEKLMLFLLEQ----KKVTQIENYFIIRDTLISGMCCLEKVGVTDC 222 Query: 349 --DYIESFMERLNATTVFEGK 367 DY+ E ++ T ++ K Sbjct: 223 FNDYLSCLQEIMDKTQDYQKK 243 >OCEIH Q8CUS9 (Q8CUS9) Hypothetical conserved protein Length = 161 Score = 31.6 bits (70), Expect = 5.7 Identities = 20/76 (26%), Positives = 36/76 (47%), Gaps = 6/76 (7%) Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164 P++ GIG+ + + + + L+ +NN +A+ Y GF II+D E+ Sbjct: 92 PSHQGIGIGSA----LLHYGVNQLRPREIQLNVEQNNIKALDFYTSKGFEIIKDFQEN-- 145 Query: 165 HEGKKEDCYLMEYRYD 180 +G D Y M ++ D Sbjct: 146 FDGHLLDTYRMSWKLD 161 >LISIN Q92E28 (Q92E28) Lin0633 protein Length = 143 Score = 31.6 bits (70), Expect = 5.7 Identities = 20/80 (25%), Positives = 37/80 (46%), Gaps = 1/80 (1%) Query: 75 IYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVI 134 +Y ++ + Y + DE + F+ + KG GT+ ++ + + L KE + Sbjct: 55 LYSIFTDQKIGYLWFHVDEKHAFIYDFVIFETFRGKGFGTKTLEAL-DVLAKEMGITKIE 113 Query: 135 LDPHKNNPRAIRAYQKSGFR 154 L +N AI+ Y K GF+ Sbjct: 114 LHVFAHNQTAIKLYDKVGFK 133 >LACPL Q88U49 (Q88U49) Glucose-6-phosphate 1-dehydrogenase (EC 1.1.1.49) (G6PD) Length = 494 Score = 31.6 bits (70), Expect = 5.7 Identities = 26/95 (27%), Positives = 45/95 (47%), Gaps = 8/95 (8%) Query: 151 SGF-RIIEDLPEHELHEGKKEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIG 209 +GF 
R+I + P +E KE + +++N Y I+HY + +I I Sbjct: 140 NGFNRVIIEKPFGHDYESAKELNDQLTATFNENQI------YRIDHYLGKEMIQNITAIR 193 Query: 210 SGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIY 244 G + L NN YI + + ++K G +E+A+Y Sbjct: 194 FGNNIWESLWNNRYIDNVQITLSEKLG-VEERAVY 227 >CORDI Q6NHQ0 (Q6NHQ0) Putative acetyltransferase Length = 163 Score = 31.6 bits (70), Expect = 5.7 Identities = 18/53 (33%), Positives = 27/53 (50%), Gaps = 1/53 (1%) Query: 104 EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRII 156 EP Y KG+G+ + E L + A + L NPRA + Y++ GF+ I Sbjct: 98 EPRYRGKGVGSILLNKSLE-LARTLGAPGLSLSVDDGNPRAKKLYERLGFQHI 149 >BURMA Q62J98 (Q62J98) Ribosomal-protein-alanine acetyltransferase (EC 2.3.1.128) Length = 165 Score = 31.6 bits (70), Expect = 5.7 Identities = 15/43 (34%), Positives = 25/43 (58%), Gaps = 1/43 (2%) Query: 111 GIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153 G+G ++ + ER + V+L+ +NPRAIR Y++ GF Sbjct: 87 GVGLALLREAVRIARAER-LDGVLLEVRPSNPRAIRLYERFGF 128 >THETN Q8R764 (Q8R764) LysM-repeat proteins and domains Length = 508 Score = 31.2 bits (69), Expect = 7.4 Identities = 31/141 (21%), Positives = 53/141 (37%), Gaps = 23/141 (16%) Query: 235 KGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEE 294 KGY E F+ E + + +Y+S E++ L KE++ F ++E+E Sbjct: 381 KGYRDEYPFRTFVEIEGEVGEVLTEVSTAYVSYEINSL--KELEFKFAIDSCVEVLTEKE 438 Query: 295 QNLLKRDIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESF 354 L+ D+ E + + V I+ L DI K Y + Sbjct: 439 MTLI----------------YDLKEIEMPRGEEVRHSIIIYMVQKGESLWDIAKRYRVNV 482 Query: 355 MERLNAT-----TVFEGKKCL 370 + + A VFEG+K + Sbjct: 483 EDLITANDLKEDKVFEGEKLI 503 >STRR6 Q8DPX8 (Q8DPX8) Hypothetical protein spr0952 Length = 253 Score = 31.2 bits (69), Expect = 7.4 Identities = 22/106 (20%), Positives = 48/106 (45%), Gaps = 12/106 (11%) Query: 261 EYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKRDIASFLRQMHGLD------- 313 E SY+S +++ Y+E+ + P +E + + + R++ L Sbjct: 148 ELSYLS---TLIRYEELY--IINPNQARATPKEHHDFIVNHLVDNTRKLEELAIFERIQI 202 Query: 314 YTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLN 
359 Y C D+K+N +L+E ++ + + +EK+ ++ +RLN Sbjct: 203 YQRDRSCVYDSKENTTSAADVLQELLFGEWSQVEKEMLQVGEKRLN 248 >STRMU Q8DVT1 (Q8DVT1) Putative ribosomal-protein-alanine acetyltransferase (EC 2.3.1.128) Length = 144 Score = 31.2 bits (69), Expect = 7.4 Identities = 28/124 (22%), Positives = 51/124 (41%), Gaps = 20/124 (16%) Query: 65 YNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYG---MDQFIGEPN---------YWSKGI 112 Y P QI + L DY + D+ + G + +GE Y +G+ Sbjct: 22 YQVSPWSQKQILTDMNRLDVDYFFAYDDKEIVGFLSIQHLVGELELTNIAIKKAYQGQGL 81 Query: 113 GTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDC 172 G++ + ++ ++ + L+ +N A YQK GFR + ++ + KED Sbjct: 82 GSQLLAML------TKDELPIFLEVRASNQAAQALYQKFGFRSLTTRKDY--YHNPKEDA 133 Query: 173 YLME 176 LM+ Sbjct: 134 ILMK 137 >SALTI SPED_SALTI (Q8Z9E3) S-adenosylmethionine decarboxylase proenzyme (EC 4.1.1.50) (AdoMetDC) (SamDC) [Contains: S-adenosylmethionine decarboxylase beta chain; S-adenosylmethionine decarboxylase alpha chain] Length = 264 Score = 31.2 bits (69), Expect = 7.4 Identities = 19/60 (31%), Positives = 29/60 (48%) Query: 169 KEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTK 228 + D ++YR +V MK+ I+H ++ + E + S YD V V E IF TK Sbjct: 157 ESDIVTIDYRVRGFTRDVNGMKHFIDHEINSIQNFMSEDMKSLYDMVDVNVYQENIFHTK 216 >SALTY SPED_SALTY (Q8ZRS4) S-adenosylmethionine decarboxylase proenzyme (EC 4.1.1.50) (AdoMetDC) (SamDC) [Contains: S-adenosylmethionine decarboxylase beta chain; S-adenosylmethionine decarboxylase alpha chain] Length = 264 Score = 31.2 bits (69), Expect = 7.4 Identities = 19/60 (31%), Positives = 29/60 (48%) Query: 169 KEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTK 228 + D ++YR +V MK+ I+H ++ + E + S YD V V E IF TK Sbjct: 157 ESDIVTIDYRVRGFTRDVNGMKHFIDHEINSIQNFMSEDMKSLYDMVDVNVYQENIFHTK 216 >SALCH Q57T90 (Q57T90) S-adenosylmethionine decarboxylase, proenzyme Length = 264 Score = 31.2 bits (69), Expect = 7.4 Identities = 19/60 (31%), Positives = 29/60 (48%) Query: 169 KEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTK 228 + 
D ++YR +V MK+ I+H ++ + E + S YD V V E IF TK Sbjct: 157 ESDIVTIDYRVRGFTRDVNGMKHFIDHEINSIQNFMSEDMKSLYDMVDVNVYQENIFHTK 216 >RICCN Q92JP8 (Q92JP8) Cell surface antigen Length = 1902 Score = 31.2 bits (69), Expect = 7.4 Identities = 24/90 (26%), Positives = 41/90 (45%), Gaps = 2/90 (2%) Query: 197 FDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVK 256 F N K + E+I S S+ F + + K + K+K Y++ +T +++NVK Sbjct: 1678 FKNSKNNDKELINSHVVSIYGQKELPKNFALQALVSASKNFIKDKTTYSYGDTKIKSNVK 1737 Query: 257 IPNIEYSYISDELSILGYKEIKGTFLTPEI 286 N +SY ++ L Y +TP I Sbjct: 1738 HRN--HSYNAEALLHYNYLLQSKLVITPNI 1765 >NEIGI Q5FAH1 (Q5FAH1) Hypothetical protein Length = 177 Score = 31.2 bits (69), Expect = 7.4 Identities = 18/45 (40%), Positives = 25/45 (55%), Gaps = 7/45 (15%) Query: 215 VAYLVNNEYI-------FKTKFSTNKKKGYAKEKAIYNFLNTNLE 252 + YL++NE + FK FSTN+KK EK I FL N++ Sbjct: 69 IDYLISNEILIVRTKFSFKNIFSTNEKKYKEIEKEINKFLYKNMD 113 >LISIN Q92DJ7 (Q92DJ7) Lin0816 protein Length = 185 Score = 31.2 bits (69), Expect = 7.4 Identities = 16/62 (25%), Positives = 28/62 (45%), Gaps = 1/62 (1%) Query: 98 MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE 157 +D + Y G+GT + + E + V L+ K NP A R Y++ GF + Sbjct: 113 LDSIVTNEKYRGHGVGTALLAKLTEIAAND-GEKVVGLNCDKGNPHAKRLYERLGFHVTG 171 Query: 158 DL 159 ++ Sbjct: 172 EI 173 >LACJO Q74J74 (Q74J74) Hypothetical protein Length = 150 Score = 31.2 bits (69), Expect = 7.4 Identities = 21/58 (36%), Positives = 28/58 (48%), Gaps = 5/58 (8%) Query: 104 EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161 +P Y SKGI T IK L++ V L+ +N RA Y+K GF + L E Sbjct: 80 DPIYQSKGIATELIKKALTELERP-----VRLEVFTDNERAKALYRKFGFERVNTLTE 132 >GEOSL Q74A59 (Q74A59) Sensory box histidine kinase Length = 1053 Score = 31.2 bits (69), Expect = 7.4 Identities = 41/188 (21%), Positives = 78/188 (41%), Gaps = 34/188 (18%) Query: 176 EYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGS------------GYDSVAYLVNNE- 222 E+RY D V+A+K E YF ++GS G D LV+ E Sbjct: 106 
EHRYGD----VEALKSRYEAYFRKATELYPRVLGSTDTFLSGEIARLGADGRLILVDFER 161 Query: 223 ----YIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIK 278 Y+ + + + A++ +IY F+ + + P I +++++ L I +E++ Sbjct: 162 MSRDYVTSVEHQIERNRALARDTSIYLFVLFGMVVLLAAPAI--TFVANRLLIRPLEELR 219 Query: 279 GTFLTPEIYSTMSEEEQNLLKRDI--------ASFLRQMHGLDYTDISECTIDNKQNVLE 330 G + ++ S + L D ASF + GL T +S +DN + Sbjct: 220 GMVTS---FAGGSLDLSGLPDYDAGDEIGSLCASFRSMVEGLQETTVSRDYVDNIIESMS 276 Query: 331 EYILLRET 338 + +++ +T Sbjct: 277 DCLIVVDT 284 >ENTFA Q836Z2 (Q836Z2) Acetyltransferase, GNAT family Length = 173 Score = 31.2 bits (69), Expect = 7.4 Identities = 19/75 (25%), Positives = 33/75 (44%), Gaps = 1/75 (1%) Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE-HELH 165 YW G+G+ ++ + + + + L N RAI Y+K GF +P + Sbjct: 99 YWGYGLGSILMEELIRWAHESHVIRRLELTVQDRNQRAIHVYKKLGFETEAIMPRGAKTD 158 Query: 166 EGKKEDCYLMEYRYD 180 +G+ D +LM D Sbjct: 159 QGEFLDVHLMRLLID 173 >ENTFA Q82YJ4 (Q82YJ4) Toxin ABC transporter, ATP-binding/permease protein Length = 700 Score = 31.2 bits (69), Expect = 7.4 Identities = 33/166 (19%), Positives = 68/166 (40%), Gaps = 7/166 (4%) Query: 94 IVYGMDQFIG-EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSG 152 +V ++QF + Y S + + ++ I ++ E +ILD + R R K G Sbjct: 439 VVSSLNQFGSFQAQYESMQVASHRLESILINMENENVCGEIILDKKIESIRCKRVSIKKG 498 Query: 153 FRIIEDLPEHELHEGKKEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGY 212 ++ D E++ GK + + +T +K++ L + Y +++I+I Sbjct: 499 DTLLLDTVNCEIYRGK--NLSIRGENGSGKSTLIKSLVRLDDDYRGQILINNIDIKKINL 556 Query: 213 D----SVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETN 254 D + ++ N + N G+ +I+N L + E N Sbjct: 557 DCLRSKLVFVEPNPKFLEGTIRDNLLLGHKVPNSIFNKLIRDFEIN 602 >CLOAB Q97M70 (Q97M70) Predicted metal-dependent hydrolase Length = 259 Score = 31.2 bits (69), Expect = 7.4 Identities = 13/39 (33%), Positives = 23/39 (58%), Gaps = 5/39 (12%) Query: 47 KKHYTEPWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTD 85 KK+Y E W ++ + +EY Y + YK++DE+Y + Sbjct: 145 KKNYAEKWYKKIAAIELEYL-----YNEKYKIFDEIYDE 178 >CLOAB 
Q97HI5 (Q97HI5) Predicted flavodoxin Length = 180 Score = 31.2 bits (69), Expect = 7.4 Identities = 23/76 (30%), Positives = 35/76 (46%), Gaps = 2/76 (2%) Query: 119 LIFEFLKKERNANAVILDPHKNNP-RAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEY 177 L+F L K R AN IL + NP RA+ YQ S I +++ +++ +G Y + Sbjct: 22 LMFSRLNKPRQANQKILKAKEANPKRALIVYQPSMSSITDEV-ANQIAKGLNTQGYEVTL 80 Query: 178 RYDDNATNVKAMKYLI 193 Y N + Y I Sbjct: 81 NYPSNHLSTNVSDYSI 96 >BRAJA Q89R32 (Q89R32) Acyl-CoA dehydrogenase Length = 455 Score = 31.2 bits (69), Expect = 7.4 Identities = 13/35 (37%), Positives = 23/35 (65%) Query: 235 KGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDEL 269 K AK++ ++NF + ET + N++Y+YI+ EL Sbjct: 107 KNKAKKEGLWNFFLPDDETGQGLKNLDYAYIASEL 141 >BACC1 Q735F5 (Q735F5) Streptothricin acetyltransferase, putative Length = 184 Score = 31.2 bits (69), Expect = 7.4 Identities = 30/122 (24%), Positives = 54/122 (44%), Gaps = 6/122 (4%) Query: 37 RDKKYTLE---SLKKHYTEPWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDE 93 R +YT+E S +K Y + +E+ EY N P I +++++ K Sbjct: 38 RHIEYTVEDVPSYEKSYLQNDNEEL--AYNEYINKPNQIIYIALLHNQIIGFIVLKKNWN 95 Query: 94 IVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153 ++ + Y + G+G R + ++ K E N ++L+ NN A + Y+K GF Sbjct: 96 HYAYIEDITVDKKYRTLGVGKRLVVQAKQWAK-EGNMPGIMLETQNNNVAACKFYEKCGF 154 Query: 154 RI 155 I Sbjct: 155 VI 156 >BACAN Q81YS9 (Q81YS9) Acetyltransferase, GNAT family Length = 288 Score = 31.2 bits (69), Expect = 7.4 Identities = 13/44 (29%), Positives = 25/44 (56%) Query: 110 KGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153 KG+G R ++ +++ + + L + NN RA++ Y+K GF Sbjct: 233 KGVGERLLQAAIQYIFSFQGMREIELCLNTNNDRAVKLYKKVGF 276 >VIBPA Q87HD3 (Q87HD3) Hypothetical protein VPA1032 Length = 265 Score = 30.8 bits (68), Expect = 9.7 Identities = 19/64 (29%), Positives = 31/64 (48%), Gaps = 3/64 (4%) Query: 294 EQNLLKRDIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEK---DY 350 E L + + SF M DY +SE + ++ E+ L +T ++D+ DI+ Y Sbjct: 96 ENEELTKSLVSFNLSMVSQDYEQVSELALQIEELRQEKGFLANDTSFSDVRDIDDRLGGY 155 Query: 351 
IESF 354 IE F Sbjct: 156 IELF 159 >VIBCH Q9KL03 (Q9KL03) Spermidine n1-acetyltransferase Length = 173 Score = 30.8 bits (68), Expect = 9.7 Identities = 20/83 (24%), Positives = 35/83 (42%), Gaps = 3/83 (3%) Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 Q I P + KG I ++ N + + L NP+A+ Y++ GF L Sbjct: 86 QIIIAPEHQGKGFARTLINRALDYSFTILNLHKIYLHVAVENPKAVHLYEECGFVEEGHL 145 Query: 160 PEHELHEGKKED---CYLMEYRY 179 E G+ +D Y+++ +Y Sbjct: 146 VEEFFINGRYQDVKRMYILQSKY 168 >THEMA SYE2_THEMA (Q9X2I8) Glutamyl-tRNA synthetase 2 (EC 6.1.1.17) (Glutamate--tRNA ligase 2) (GluRS 2) Length = 487 Score = 30.8 bits (68), Expect = 9.7 Identities = 16/44 (36%), Positives = 22/44 (50%) Query: 325 KQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNATTVFEGKK 368 K N L + + ND + EKDY+E F++R A V E K Sbjct: 369 KVNTLSQLYDIMYPFMNDDYEYEKDYVEKFLKREEAERVLEEAK 412 >THETN ISPH_THETN (Q8RA76) 4-hydroxy-3-methylbut-2-enyl diphosphate reductase (EC 1.17.1.2) Length = 288 Score = 30.8 bits (68), Expect = 9.7 Identities = 14/55 (25%), Positives = 31/55 (56%) Query: 115 RYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKK 169 R I++ +E L K+++ L +NP+ ++ ++ G R+IE+ +L +G + Sbjct: 17 RAIEIAYEELNKQKDTRLYTLGEIIHNPQVVKDLEEKGVRVIEEEELEKLLKGDR 71 >STRCO Q9S2L5 (Q9S2L5) Hypothetical protein SCO1988 Length = 183 Score = 30.8 bits (68), Expect = 9.7 Identities = 18/47 (38%), Positives = 26/47 (55%), Gaps = 5/47 (10%) Query: 111 GIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE 157 GIG R++ L KER A+ + L + N A R Y++ GFR +E Sbjct: 119 GIGDRFVALA-----KERRADGLSLWTFQVNAPARRFYERHGFRAVE 160 >STRP1 Q99XX8 (Q99XX8) Putative pullulanase Length = 1165 Score = 30.8 bits (68), Expect = 9.7 Identities = 37/171 (21%), Positives = 62/171 (36%), Gaps = 31/171 (18%) Query: 83 YTDYHY----PKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPH 138 YT Y+Y + E V +D + W+ T IK A A +DP Sbjct: 473 YTGYYYLYEITRGQEKVMVLDPYAKSLAAWNDATATDDIK----------TAKAAFIDPS 522 Query: 139 KNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYDDNATNVKAMKYLIEHYFD 198 K P + 
+ + F+ K+ED + E D T+ KA++ + H F Sbjct: 523 KLGPTGLDFAKINNFK-------------KREDAIIYEAHVRD-FTSDKALEGKLTHPFG 568 Query: 199 NFK--VDSIEIIGS-GYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNF 246 F V+ ++ + G V L Y + + ++ Y YN+ Sbjct: 569 TFSAFVEQLDYLKDLGVTHVQLLPVLSYFYANELDKSRSTAYTSSDNNYNW 619 >STRP1 Q99YF0 (Q99YF0) Hypothetical protein SPy1734 (Acetyltransferase) (EC 2.3.1.-) Length = 174 Score = 30.8 bits (68), Expect = 9.7 Identities = 17/65 (26%), Positives = 34/65 (52%), Gaps = 1/65 (1%) Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHE 166 Y GIG +++ ++ ++ ++ LD N +AI Y+K GFR IE + ++++ Sbjct: 99 YRGYGIGQLLLEIALDWAEENPYIESLKLDVQVRNTKAIYLYKKYGFR-IESMRKNDIKS 157 Query: 167 GKKED 171 +D Sbjct: 158 KNGDD 162 >STAES Q8CTP7 (Q8CTP7) Hypothetical protein SE0368 Length = 158 Score = 30.8 bits (68), Expect = 9.7 Identities = 27/108 (25%), Positives = 44/108 (40%), Gaps = 18/108 (16%) Query: 49 HYTEPWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYW 108 H + +++F V E + + +G+ + +ELY HY + P Sbjct: 48 HLKKRLNEQLFLVAEEDSEI-VGFAN-FIYGEELYLSAHYVR--------------PESQ 91 Query: 109 SKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRII 156 +G GTR ++ + K + V L+ NN I YQ GF II Sbjct: 92 HRGYGTRLLEAGLKRFKDQYET--VYLEVDNNNSNGIEYYQNHGFEII 137 >STAES COAD_STAES (Q8CSZ5) Phosphopantetheine adenylyltransferase (EC 2.7.7.3) (Pantetheine-phosphate adenylyltransferase) (PPAT) (Dephospho-CoA pyrophosphorylase) Length = 161 Score = 30.8 bits (68), Expect = 9.7 Identities = 21/111 (18%), Positives = 50/111 (45%), Gaps = 13/111 (11%) Query: 185 NVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIY 244 +VK + + H+F+ VD + +G+ +++ ++ + ++ KK Sbjct: 59 SVKHLPNIQVHHFNGLLVDFCDQVGAKTIIRGLRAVSDFEYELRLTSMNKK--------- 109 Query: 245 NFLNTNLETNVKIPNIEYSYISDEL--SILGYKEIKGTFLTPEIYSTMSEE 293 LN+N+ET + + YS+IS + + Y+ F+ P + + ++ Sbjct: 110 --LNSNIETMYMMTSANYSFISSSIVKEVAAYQADISPFVPPHVERALKKK 158 >MYCLEH Y3433_MYCTU (O06250) Hypothetical protein Rv3433c/MT3539 Length = 473 Score = 30.8 bits (68), Expect = 9.7 
Identities = 12/30 (40%), Positives = 23/30 (76%) Query: 130 ANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 A+AV+L+P + + +A+ A+ KSG R++E + Sbjct: 81 ADAVLLNPDRTHRKALAAFTKSGGRLVESV 110 >MYCTU Y3433_MYCTU (O06250) Hypothetical protein Rv3433c/MT3539 Length = 473 Score = 30.8 bits (68), Expect = 9.7 Identities = 12/30 (40%), Positives = 23/30 (76%) Query: 130 ANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 A+AV+L+P + + +A+ A+ KSG R++E + Sbjct: 81 ADAVLLNPDRTHRKALAAFTKSGGRLVESV 110 >MYCBO Q7TWI6 (Q7TWI6) Hypothetical protein Mb3463c Length = 473 Score = 30.8 bits (68), Expect = 9.7 Identities = 12/30 (40%), Positives = 23/30 (76%) Query: 130 ANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159 A+AV+L+P + + +A+ A+ KSG R++E + Sbjct: 81 ADAVLLNPDRTHRKALAAFTKSGGRLVESV 110 >LISMO Q8Y5C4 (Q8Y5C4) Lmo2141 protein Length = 157 Score = 30.8 bits (68), Expect = 9.7 Identities = 28/102 (27%), Positives = 47/102 (46%), Gaps = 9/102 (8%) Query: 76 YKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVIL 135 YK L ++ H + D V+ P+Y GIG + + E + +E+ + L Sbjct: 63 YKSPIPLASNKHVAEIDIAVH--------PDYQRAGIGQLLMDKMKE-VAREKGYIKIAL 113 Query: 136 DPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEY 177 N +AIR Y+K+GF+ L + + +G+ D LM Y Sbjct: 114 RVLSINQKAIRFYEKNGFKQEGLLEKEFIIQGEFVDDILMAY 155 >LISIN Q929Z8 (Q929Z8) Lin2125 protein Length = 231 Score = 30.8 bits (68), Expect = 9.7 Identities = 25/89 (28%), Positives = 38/89 (42%), Gaps = 15/89 (16%) Query: 8 ICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKK----YTLESLKKHYTEP--WEDEVFRV 61 + ++TL+ P + WL DE F G Y L ++ +T P W+ V + Sbjct: 107 LVLKTLVARTRPDSVNWLIDESGFSFPSGHATATAVFYGLAAMFLIFTVPKMWQKIVIGI 166 Query: 62 IIEYNNVPIGYGQI-YKMYDELYTDYHYP 89 IGYG I + MY +Y H+P Sbjct: 167 --------IGYGFILFVMYTRVYLGVHFP 187 >ENTFA Q837Z5 (Q837Z5) Acetyltransferase, GNAT family Length = 184 Score = 30.8 bits (68), Expect = 9.7 Identities = 15/45 (33%), Positives = 24/45 (53%) Query: 110 KGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154 +G G + LI +F E + + L + NN +AI Y+K GF+ Sbjct: 104 QGCGFEAVSLICKFAFYELGLHKIRLAVNSNNQKAIHVYEKVGFK 
148 >ENTFA Q831M9 (Q831M9) Ribosomal-protein-alanine acetyltransferase, putative Length = 154 Score = 30.8 bits (68), Expect = 9.7 Identities = 15/45 (33%), Positives = 28/45 (62%), Gaps = 1/45 (2%) Query: 110 KGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154 +GIG + +K E++K R+ + L+ ++N A + Y+K+GFR Sbjct: 83 QGIGCQLMKAFKEYVKS-RDITQIFLEVRESNILAQKLYEKTGFR 126 >CLOTE Q895T0 (Q895T0) Acetyltransferase (EC 2.3.1.-) Length = 165 Score = 30.8 bits (68), Expect = 9.7 Identities = 17/67 (25%), Positives = 37/67 (55%), Gaps = 3/67 (4%) Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHE 166 Y +G+GT+ + I + L K++ + + D + NP+ +QK G+ + ++ + L++ Sbjct: 97 YRHQGVGTKLLSYI-KTLAKDKKIHLIKSDTYSLNPKMNALFQKCGYEKVGEI--NLLNK 153 Query: 167 GKKEDCY 173 K +CY Sbjct: 154 PYKFNCY 160 >CLOPE Q8XMF9 (Q8XMF9) Hypothetical protein CPE0730 Length = 154 Score = 30.8 bits (68), Expect = 9.7 Identities = 39/162 (24%), Positives = 66/162 (40%), Gaps = 30/162 (18%) Query: 28 ERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYH 87 ERVLE K ++ + K + E + + + EY +K E+ D + Sbjct: 3 ERVLEIR--EPKNCEIDDIMKIWLESTVEAHYFIEEEY----------WKKNYEVVRDIY 50 Query: 88 YPKTDEIVYGMD------------QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVIL 135 P VY + FIG +K G+ K + E++K + + L Sbjct: 51 IPMAKTFVYCDEGKINGFISIIDSNFIGALFVHTKSQGSGIGKSLLEYVKNKYEN--IEL 108 Query: 136 DPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEY 177 +K+N +A+ Y+K F+II++ + G E YLM Y Sbjct: 109 AVYKDNKKAVEFYKKHDFKIIKEQENED--SGHLE--YLMSY 146 >BRUSU Q8FY73 (Q8FY73) Outer membrane autotransporter Length = 1593 Score = 30.8 bits (68), Expect = 9.7 Identities = 18/64 (28%), Positives = 32/64 (50%), Gaps = 3/64 (4%) Query: 414 EEEIGTNFGEDILRMYGNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENGRKEIY 473 E + N + ++ G + ++ +QD + + T +YGI NI QEF+ NGR + Sbjct: 1478 EANVSLNDSDSLIGRAG-VALDYRNAWQDDAGQI--VHTNIYGIANIYQEFMGNGRVGVA 1534 Query: 474 KRTY 477 T+ Sbjct: 1535 DTTF 1538 >BACSU YCBJ_BACSU (P42242) Hypothetical protein ycbJ Length = 306 Score = 30.8 bits (68), Expect = 9.7 
Identities = 36/179 (20%), Positives = 66/179 (36%), Gaps = 26/179 (14%) Query: 300 RDIASFLRQMHGLDYTDISECTI------DNKQNVLEEYILLRETIYNDLTDIEKDYIES 353 R +A L ++HG D + I D +Q + + ++ + + E Sbjct: 129 RTLADILAELHGTDQISAGQSGIEVIRPEDFRQMTADSMVDVKNKL-----GVSTTLWER 183 Query: 354 FMERLNATTVFEGKKCLCHNDFSCNHLLLDGNNRLTXXXXXXXXXXXXEYCDFIYLLEDS 413 + + ++ + G L H D H+L+D N R+T DF+ Sbjct: 184 WQKWVDDDAYWPGFSSLIHGDLHPPHILIDQNGRVTGLLDWTEAKVADPAKDFVL----- 238 Query: 414 EEEIGTNFGED----ILRMY---GNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFI 465 T FGE +L Y G K +E+ ++ YP+E ++ ++E I Sbjct: 239 ---YQTIFGEKETARLLEYYDQAGGRIWAKMQEHISEMQAAYPVEIAKLALQTQQEEHI 294 >BACHD Q9KE57 (Q9KE57) BH1001 protein Length = 448 Score = 30.8 bits (68), Expect = 9.7 Identities = 32/119 (26%), Positives = 54/119 (45%), Gaps = 21/119 (17%) Query: 272 LGYKEIKGTFLTPEIYSTMSEEEQNLL------------KRDIASFLRQMHGLDYTDISE 319 LG+K +GT L ++ TMS EE + D F +++G + T ++E Sbjct: 306 LGFKVERGTLLESKVELTMSFEEDGISFDVGMSVDSTYNYDDAVEF--KLYGQERTTLTE 363 Query: 320 CTIDNKQNVLEEYILLRETIYND-LTDIEKDYIESFM--ERLNATTVFEGKKCLCHNDF 375 +D ++ E E++ ND L D ++DY E + E L E ++ + H DF Sbjct: 364 AELD---DLTYEINWELESLVNDLLADFQEDYYEEELSEEDLALLAAIEAQE-VSHEDF 418 >BACC1 Q73CD9 (Q73CD9) Glycerol-3-phosphate dehydrogenase, aerobic (EC 1.1.99.5) Length = 560 Score = 30.8 bits (68), Expect = 9.7 Identities = 19/52 (36%), Positives = 26/52 (50%), Gaps = 2/52 (3%) Query: 146 RAYQKSGFRIIEDLPEHEL--HEGKKEDCYLMEYRYDDNATNVKAMKYLIEH 195 R+ ++ F E L + L EG K Y +EYR DD ++ MK IEH Sbjct: 133 RSERRKMFNREETLKKEPLVKQEGLKGGGYYVEYRTDDARLTIEVMKEAIEH 184 >BACAN Q81VL8 (Q81VL8) Oxidoreductase, FAD-binding Length = 471 Score = 30.8 bits (68), Expect = 9.7 Identities = 28/96 (29%), Positives = 41/96 (42%), Gaps = 12/96 (12%) Query: 244 YNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKRDIA 303 Y L+ +K+ N E +L YKE F E + +L + +A Sbjct: 188 YGLFGVILDVTLKLTNDEL--YETHTKMLDYKEYTSYF--KEKVKKDANVRMHLARISVA 243 Query: 304 --SFLRQMHGLDYTDISECTIDNKQNVLEEYILLRE 337 SFLR+M+ DY T+ QN+ EEY L+E Sbjct: 
244 PNSFLREMYVTDY------TLAQNQNMREEYSELKE 273 >BACAN Q81U57 (Q81U57) Glycerol-3-phosphate dehydrogenase, aerobic Length = 560 Score = 30.8 bits (68), Expect = 9.7 Identities = 19/52 (36%), Positives = 26/52 (50%), Gaps = 2/52 (3%) Query: 146 RAYQKSGFRIIEDLPEHEL--HEGKKEDCYLMEYRYDDNATNVKAMKYLIEH 195 R+ ++ F E L + L EG K Y +EYR DD ++ MK IEH Sbjct: 133 RSERRKMFNREETLKKEPLVKQEGLKGGGYYVEYRTDDARLTIEVMKEAIEH 184 >BACAN Q81NU7 (Q81NU7) Acetyltransferase, GNAT family Length = 153 Score = 30.8 bits (68), Expect = 9.7 Identities = 20/79 (25%), Positives = 34/79 (43%), Gaps = 14/79 (17%) Query: 86 YHYPKTDEIVYGMDQFIGEPN--------------YWSKGIGTRYIKLIFEFLKKERNAN 131 Y K D+IV G F G+PN YW+KG T ++ + ++ + Sbjct: 52 YVIRKEDDIVLGDIGFKGKPNEEHTVEVGYGFIEKYWNKGYATEAVQELIDWAFQTGEVE 111 Query: 132 AVILDPHKNNPRAIRAYQK 150 +I + +N +IR +K Sbjct: 112 TIIAETLLDNYGSIRVLEK 130 Database: Blastdata.fdb Posted date: Mar 29, 2006 3:30 PM Number of letters in database: 77,468,597 Number of sequences in database: 240,170 Lambda K H 0.318 0.139 0.409 Gapped Lambda K H 0.267 0.0410 0.140 Matrix: BLOSUM62 Gap Penalties: Existence: 11, Extension: 1 Number of Hits to DB: 72,017,968 Number of Sequences: 240170 Number of extensions: 3196375 Number of successful extensions: 9166 Number of sequences better than 10.0: 203 Number of HSP's better than 10.0 without gapping: 69 Number of HSP's successfully gapped in prelim test: 134 Number of HSP's that attempted gapping in prelim test: 8848 Number of HSP's gapped (non-prelim): 424 length of query: 479 length of database: 77,468,597 effective HSP length: 115 effective length of query: 364 effective length of database: 49,849,047 effective search space: 18145053108 effective search space used: 18145053108 T: 11 A: 40 X1: 16 ( 7.3 bits) X2: 38 (14.6 bits) X3: 64 (24.7 bits) S1: 41 (21.7 bits) S2: 68 (30.8 bits) BLASTP 2.2.10 [Oct-19-2004] From mdehoon at c2b2.columbia.edu Wed Apr 19 16:54:33 2006 From: mdehoon at c2b2.columbia.edu (Michiel De Hoon) Date: Wed, 19 Apr 2006 
12:54:33 -0400 Subject: [BioPython] Need help parsing Blastoutput Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEF7@cgcmail.cgc.cpmc.columbia.edu> The Blast parser fails to read your file because the format of Blast output has changed. If I edit the data file so that it corresponds to the old format (add a space here, remove a blank line there, etc.), the Blast parser reads the file without problems. The easiest solution is to repeat the Blast run, using XML for the output format, and use the Blast XML parser in Biopython to parse the results. A general question is if anybody still needs the parser for Blast text output. Currently, we are confusing our users by having a Blast text parser that tends to break. A broken parser may be worse than no parser. --Michiel. Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032 -----Original Message----- From: Halima Rabiu [mailto:halima at cbio.uct.ac.za] Sent: Wed 4/19/2006 6:15 AM To: Michiel De Hoon Cc: biopython at lists.open-bio.org Subject: RE: [BioPython] Need help parsing Blastoutput Hi Please see the attachment,it part of my Blast output. yes I am try to parse text output from Blast ,I have use another script to run my local blast that I am trying to perse the NCBIStandalone.BlastParser was working fine without hsp.sbject_end which is one of what I need to print out . On checking the class diagrams from cookbook, findout that sbject_end is not included .I just need another way of printing the int(subject end). Thanks for your help Halimah On Tue, 18 Apr 2006, Michiel De Hoon wrote: > Could you also send us the file Enterococcus_out so we can run the script? > > From the script, it looks like you're trying to parse text output from Blast. > While this is possible (in theory), the format of Blast text output tends to > change a lot, thereby breaking the parser in Biopython. 
It is more reliable > to have Blast generate output in XML format, and use the XML parser: > > blast_out = open('my_blast.xml', 'r') > > from Bio.Blast import NCBIXML > > b_parser = NCBIXML.BlastParser() > b_record = b_parser.parse(blast_out) > > See section 3.1.2 in the Biopython cookbook, and section 3.1.4 on how to > generate Blast output in XML. > > --Michiel. > > > > Michiel de Hoon > Center for Computational Biology and Bioinformatics > Columbia University > 1150 St Nicholas Avenue > New York, NY 10032 > > > > -----Original Message----- > From: Halima Rabiu [mailto:halima at cbio.uct.ac.za] > Sent: Tue 4/18/2006 11:06 AM > To: Michiel De Hoon > Cc: biopython at lists.open-bio.org > Subject: RE: [BioPython] Need help parsing Blastoutput > > thanks > please see the attchment a copy of my script and copy of my Blast output > Thanks > > > On Thu, 13 Apr 2006, Michiel De Hoon wrote: > > > Could you send us the script you were using? > > > > --Michiel. > > > > Michiel de Hoon > > Center for Computational Biology and Bioinformatics > > Columbia University > > 1150 St Nicholas Avenue > > New York, NY 10032 > > > > > > > > -----Original Message----- > > From: biopython-bounces at lists.open-bio.org on behalf of Halima Rabiu > > Sent: Thu 4/13/2006 11:07 AM > > To: biopython at lists.open-bio.org > > Subject: [BioPython] Need help parsing Blastoutput > > > > Hi All, > > I have a BLAST output from a local blast > > I need to calculate my % alignment coverage as regard to my subject > > I try parsed the blast output and wanted to print the > > sbjct Start and Sbjct end. but I could not is there anyway I could this > > try to get mach coverage between my querry and subject I dont need > > Identities,but total % alignment for querry or subject. 
> > Thanks > > Halimah > > > > _______________________________________________ > > BioPython mailing list - BioPython at lists.open-bio.org > > http://lists.open-bio.org/mailman/listinfo/biopython > > > > > > From elventear at gmail.com Thu Apr 20 01:02:30 2006 From: elventear at gmail.com (Pepe Barbe) Date: Wed, 19 Apr 2006 20:02:30 -0500 Subject: [BioPython] Parsing and Creating Dictionaries of GenBank files Message-ID: <3e73596b0604191802g3807f101jb99e41f95208122e@mail.gmail.com> Hello, Following the simple steps in the BioPython cookbook, I wanted to create a dictionary with the following GenBank file: ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/Escherichia_coli_K12/NC_000913.gbk Below you can find what I tried executing and the error I got. I would appreciate any insight into solving the error and correctly producing the dictionary. Thanks! Pepe ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>> dict_file = 'NC_000913.gbk' >>> index_file = 'NC_000913.idx' >>> from Bio import GenBank >>> GenBank.index_file(dict_file, index_file) Traceback (most recent call last): File "", line 1, in ? 
  File "/sw/lib/python2.4/site-packages/Bio/GenBank/__init__.py", line 1283, in index_file
    SimpleSeqRecord.create_flatdb([filename], indexname, indexer)
  File "/sw/lib/python2.4/site-packages/Bio/Mindy/SimpleSeqRecord.py", line 152, in create_flatdb
    creator.load(filename, builder = builder, fileid_info = {})
  File "/sw/lib/python2.4/site-packages/Bio/Mindy/BaseDB.py", line 36, in load
    raise TypeError("Cannot identify file as a %s format" %
TypeError: Cannot identify file as a unknown format

From biopython at maubp.freeserve.co.uk Thu Apr 20 12:42:34 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython))
Date: Thu, 20 Apr 2006 13:42:34 +0100
Subject: [BioPython] Parsing and Creating Dictionaries of GenBank files
In-Reply-To: <3e73596b0604191802g3807f101jb99e41f95208122e@mail.gmail.com>
References: <3e73596b0604191802g3807f101jb99e41f95208122e@mail.gmail.com>
Message-ID: <444781BA.8080107@maubp.freeserve.co.uk>

Pepe Barbe wrote:
> Hello,
>
> Following the simple steps in the BioPython cookbook, I wanted to
> create a dictionary with the following GenBank file:
>
> ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/Escherichia_coli_K12/NC_000913.gbk
>
> Below you can find what I tried executing and the error I got. I would
> appreciate any insight into solving the error and correctly producing
> the dictionary.

The cookbook tutorial is a little misleading in that regard. Indexing a
GenBank file only makes sense for those files with multiple genbank
record (i.e. multiple LOCUS lines).

For example, you can get multi-record GenBank files with records for
different genes. These tend to be small records, and the Martel based
indexing code copes fine. It doesn't cope very well with large records
like genomes.

Your example (and in my experience all Bacterial Genomes) have just a
single very large record (which will contain many features).

Does this page help?

http://www2.warwick.ac.uk/fac/sci/moac/currentstudents/peter_cock/python/genbank/

I did suggest a change to the documentation but it looks like no one has
made the change...

http://biopython.org/pipermail/biopython-dev/2005-November/002193.html

I had forgotten to chase this up.

Peter

From alpersoyler at yahoo.com Thu Apr 20 12:59:57 2006
From: alpersoyler at yahoo.com (alper soyler)
Date: Thu, 20 Apr 2006 05:59:57 -0700 (PDT)
Subject: [BioPython] Need help!!!
Message-ID: <20060420125957.72247.qmail@web36804.mail.mud.yahoo.com>

Hi All,

I am new to Biopython and have a question. I want to construct a
phylogenetic profile for one organism's proteins. I want to give my
protein to blast to search one organism's genome (e.g. Homo sapiens)
instead of the whole GenBank database. How can I solve my problem?
Thank you in advance.

regards,

Alper

From cy at cymon.org Thu Apr 20 13:41:46 2006
From: cy at cymon.org (Cymon J. Cox)
Date: Thu, 20 Apr 2006 14:41:46 +0100
Subject: [BioPython] Need help!!!
In-Reply-To: <20060420125957.72247.qmail@web36804.mail.mud.yahoo.com>
References: <20060420125957.72247.qmail@web36804.mail.mud.yahoo.com>
Message-ID: <1145540506.11610.17.camel@clintonite.nhm.ac.uk>

Hi Alper,

On Thu, 2006-04-20 at 05:59 -0700, alper soyler wrote:
> Hi All,
>
> I am new to Biopython and have a question. I want to construct a phylogenetic
> profile for one organism's proteins. I want to give my protein to blast to
> search one organism's genome (e.g. Homo sapiens) instead of the whole GenBank
> database. How can I solve my problem? Thank you in advance.

Assuming you want to do this locally, you'll need to download your target
genome, format it with the BLAST distribution programme 'formatdb', and
then feed your query and newly formatted genome BLAST database to
Bio.Blast.NCBIStandalone.

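The steps above could be sketched roughly as follows. This is only an illustration, assuming the legacy NCBI command-line tools 'formatdb' and 'blastall' are installed; the helper functions and file names are made up for the example, and the commands are only assembled here, not executed. Since the query is a protein and the target is a nucleotide genome, the sketch assumes 'tblastn' is the program wanted.

```python
# Sketch only: builds (but does not run) command lines for the legacy NCBI
# tools. Helper names and file names are hypothetical placeholders.


def formatdb_command(fasta_path, protein=False):
    """Return the argument list for indexing a FASTA file with formatdb.

    -i names the input FASTA file; -p is T for protein, F for nucleotide.
    """
    return ["formatdb", "-i", fasta_path, "-p", "T" if protein else "F"]


def blastall_command(program, database, query_path, out_path, expect=10.0):
    """Return the argument list for a blastall search with XML output (-m 7)."""
    return [
        "blastall",
        "-p", program,      # e.g. tblastn: protein query, nucleotide database
        "-d", database,     # database name as given to formatdb
        "-i", query_path,   # query sequence(s) in FASTA format
        "-o", out_path,     # where the report is written
        "-e", str(expect),  # expectation value cutoff
        "-m", "7",          # XML report, suitable for the Blast XML parser
    ]


if __name__ == "__main__":
    print(" ".join(formatdb_command("target_genome.fasta")))
    print(" ".join(blastall_command("tblastn", "target_genome.fasta",
                                    "my_protein.fasta", "my_blast.xml")))
```

The resulting lists could be handed to something like subprocess, or the same options passed through Bio.Blast.NCBIStandalone as described in the tutorial section referenced in this message.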
See http://biopython.org/docs/tutorial/Tutorial004.html#toc10
3.1.4 Running BLAST locally
for details,

Cheers, Cymon

____________________________________________________________________

Cymon J. Cox
Biometry and Molecular Research
Department of Zoology
Natural History Museum
Cromwell Road
London, SW7 5BD

Email: cy at cymon.org, c.cox at nhm.ac.uk, cymon.cox at googlemail.com
Phone : +44 (0)20 7942 6981
HomePage : http://www.duke.edu/~cymon

-8.63/-6.77
_____________________________________________________________________
Fedora Core release 4 (Stentz) clintonite.nhm.ac.uk
14:35:55 up 13 days, 20:42, 8 users, load average: 0.08, 0.16, 0.12

From mcolosimo at mitre.org Thu Apr 20 14:23:19 2006
From: mcolosimo at mitre.org (Marc Colosimo)
Date: Thu, 20 Apr 2006 10:23:19 -0400
Subject: [BioPython] Parsing and Creating Dictionaries of GenBank files
In-Reply-To: <444781BA.8080107@maubp.freeserve.co.uk>
References: <3e73596b0604191802g3807f101jb99e41f95208122e@mail.gmail.com>
 <444781BA.8080107@maubp.freeserve.co.uk>
Message-ID: <65CA5BE4-1C83-4FD7-B998-C97BCF9AA6DE@mitre.org>

While we are on the subject of parsing multiple GenBank files and the
Cookbook, I think a better example (and more pythonish) is the
following:

from Bio import GenBank

gb_file = "my_file.gb"
gb_handle = open(gb_file, 'r')
feature_parser = GenBank.FeatureParser()
gb_iterator = GenBank.Iterator(gb_handle, feature_parser)

for cur_record in gb_iterator:
    # now do something with the record
    print cur_record.seq

which is way nicer (and uses iterators as per pep-234 and ) than

while 1:
    cur_record = gb_iterator.next()
    if cur_record is None:
        break
    # now do something with the record
    print cur_record.seq

Actually, the above works with the Fasta iterator as well.

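For anyone curious how both loop styles can sit on top of the same object, here is a small self-contained sketch. RecordReader is a hypothetical stand-in, not the Biopython class: it only assumes a next() method that returns None once the records run out, and it uses the two-argument form of the built-in iter() to turn that into something a for loop can consume.

```python
class RecordReader(object):
    """Hypothetical stand-in for a parser iterator; next() returns None at the end."""

    def __init__(self, records):
        self._records = list(records)
        self._pos = 0

    def next(self):
        # Hand back one record per call, then None once exhausted.
        if self._pos >= len(self._records):
            return None
        record = self._records[self._pos]
        self._pos += 1
        return record

    def __iter__(self):
        # iter(callable, sentinel) keeps calling next() until it returns None.
        return iter(self.next, None)


def read_with_while(reader):
    """Cookbook style: explicit next() calls and a None check."""
    seen = []
    while True:
        record = reader.next()
        if record is None:
            break
        seen.append(record)
    return seen


def read_with_for(reader):
    """Iterator style: the for loop handles termination itself."""
    return [record for record in reader]


if __name__ == "__main__":
    data = ["rec1", "rec2", "rec3"]
    same = read_with_while(RecordReader(data)) == read_with_for(RecordReader(data))
    print(same)
```

Both functions visit the same records in the same order, so the choice is mostly about readability.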
Times for a GenBank file with 72,358 records (LOCUSs):

my way (using iterators): 14m16.886s
cookbook way (using next and if): 14m28.547s

Surprisingly, this isn't much faster (maybe with -O it would be)

Marc

On Apr 20, 2006, at 8:42 AM, Peter (BioPython) wrote:

> Pepe Barbe wrote:
>> Hello,
>>
>> Following the simple steps in the BioPython cookbook, I wanted to
>> create a dictionary with the following GenBank file:
>>
>> ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/Escherichia_coli_K12/
>> NC_000913.gbk
>>
>> Below you can find what I tried executing and the error I got. I
>> would appreciate any insight into solving the error and correctly
>> producing the dictionary.
>
> The cookbook tutorial is a little misleading in that regard.
> Indexing a GenBank file only makes sense for those files with
> multiple genbank record (i.e. multiple LOCUS lines).
>
> For example, you can get multi-record GenBank files with records for
> different genes. These tend to be small records, and the Martel based
> indexing code copes fine. It doesn't cope very well with large
> records like genomes.
>
> Your example (and in my experience all Bacterial Genomes) have just a
> single very large record (which will contain many features).
>
> Does this page help?
>
> http://www2.warwick.ac.uk/fac/sci/moac/currentstudents/peter_cock/
> python/genbank/
>
> I did suggest a change to the documentation but it looks like no
> one has made the change...
>
> http://biopython.org/pipermail/biopython-dev/2005-November/002193.html
>
> I had forgotten to chase this up.
>
> Peter

From elventear at gmail.com Thu Apr 20 16:11:42 2006
From: elventear at gmail.com (Pepe Barbe)
Date: Thu, 20 Apr 2006 11:11:42 -0500
Subject: [BioPython] Parsing and Creating Dictionaries of GenBank files
In-Reply-To: <444781BA.8080107@maubp.freeserve.co.uk>
References: <3e73596b0604191802g3807f101jb99e41f95208122e@mail.gmail.com>
 <444781BA.8080107@maubp.freeserve.co.uk>
Message-ID: <3e73596b0604200911i2e2c481bj306c5d282cae5c75@mail.gmail.com>

On 4/20/06, Peter (BioPython) wrote:
>
> The cookbook tutorial is a little misleading in that regard. Indexing a
> GenBank file only makes sense for those files with multiple genbank
> record (i.e. multiple LOCUS lines).

> Your example (and in my experience all Bacterial Genomes) have just a
> single very large record (which will contain many features).
>
> Does this page help?
>
> http://www2.warwick.ac.uk/fac/sci/moac/currentstudents/peter_cock/python/genbank/

It does help a lot. Thanks!

As an aside, while what I was doing wasn't exactly what I was looking
for, I think it was crashing because of a bug in 1.41. I installed the
latest CVS and it works normally now.

Pepe

From halima at mancala.cbio.uct.ac.za Thu Apr 20 11:57:20 2006
From: halima at mancala.cbio.uct.ac.za (Halima Rabiu)
Date: Thu, 20 Apr 2006 13:57:20 +0200 (SAST)
Subject: [BioPython] Need help parsing Blastoutput
In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEF7@cgcmail.cgc.cpmc.columbia.edu>
References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEF7@cgcmail.cgc.cpmc.columbia.edu>
Message-ID:

Thanks.
I tried using the XML parser and I am still getting errors which I don't
understand. Please see the attached copy of my script and Blast XML
output. Here is the error:

Traceback (most recent call last):
  File "Bioperser.py", line 11, in ?
    b_record = b_parser.parse(b_out)
  File "/usr/local/lib/python2.4/site-packages/Bio/Blast/NCBIXML.py", line 112, in parse
    self._parser.parse(handler)
  File "/usr/local//lib/python2.4/xml/sax/expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/local//lib/python2.4/xml/sax/xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "/usr/local//lib/python2.4/xml/sax/expatreader.py", line 211, in feed
    self._err_handler.fatalError(exc)
  File "/usr/local//lib/python2.4/xml/sax/handler.py", line 38, in fatalError
    raise exception

thanks
Halimah

On Wed, 19 Apr 2006, Michiel De Hoon wrote:

> The Blast parser fails to read your file because the format of Blast output
> has changed. If I edit the data file so that it corresponds to the old format
> (add a space here, remove a blank line there, etc.), the Blast parser reads
> the file without problems. The easiest solution is to repeat the Blast run,
> using XML for the output format, and use the Blast XML parser in Biopython to
> parse the results.
>
> A general question is if anybody still needs the parser for Blast text
> output. Currently, we are confusing our users by having a Blast text parser
> that tends to break. A broken parser may be worse than no parser.
>
> --Michiel.

-------------- next part --------------
#! /usr/local/bin/python2.4
#halimah
#16-04-2006
from string import split
from Bio.Blast import NCBIXML
#from Bio.Blast import NCBIStandalone

b_out = open('blast2.xml','r')

b_parser = NCBIXML.BlastParser()
b_record = b_parser.parse(b_out)
E_VALUE_THRESH = 1.0

# NCBIXML.BlastParser.parse() returns a single record for the run,
# so no iterator loop is needed here.
print "The following results are for query " + b_record.query
print 'len of query:',b_record.query_letters
for alignment in b_record.alignments:
    for hsp in alignment.hsps:
        if hsp.expect <= E_VALUE_THRESH:
            print '****Alignment****'
            print 'title:', alignment.title
            print 'length:', alignment.length
            print 'e value:', hsp.expect
            print 'subjectstart:',hsp.sbjct_start
            # spelled sbjct_end, matching sbjct_start; whether it is
            # filled in depends on the Biopython version in use
            print 'subject end:', hsp.sbjct_end
-------------- next part --------------
A non-text attachment was scrubbed...
Name: blast2.xml
Type: text/xml
Size: 151659 bytes
Desc: 
URL: 

From mdehoon at c2b2.columbia.edu Thu Apr 20 17:37:29 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Thu, 20 Apr 2006 13:37:29 -0400
Subject: [BioPython] Need help parsing Blastoutput
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEFF@cgcmail.cgc.cpmc.columbia.edu>

Could you send us the Blast XML output also?

--Michiel.

Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032



-----Original Message-----
From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
Sent: Thu 4/20/2006 7:57 AM
To: Michiel De Hoon
Cc: biopython at lists.open-bio.org
Subject: RE: [BioPython] Need help parsing Blastoutput

thanks I try using XML parser and I am still geting errors which I dont
understand . please see the attchmnt copy of my script and Blast XML
output.
here is the error
Traceback (most recent call last):
  File "Bioperser.py", line 11, in ?
    b_record = b_parser.parse(b_out)
  File "/usr/local/lib/python2.4/site-packages/Bio/Blast/NCBIXML.py", line 112, in parse
    self._parser.parse(handler)
  File "/usr/local//lib/python2.4/xml/sax/expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/local//lib/python2.4/xml/sax/xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "/usr/local//lib/python2.4/xml/sax/expatreader.py", line 211, in feed
    self._err_handler.fatalError(exc)
  File "/usr/local//lib/python2.4/xml/sax/handler.py", line 38, in fatalError
    raise exception
thanks
Halimah

On Wed, 19 Apr 2006, Michiel De Hoon wrote:

> The Blast parser fails to read your file because the format of Blast output
> has changed. If I edit the data file so that it corresponds to the old format
> (add a space here, remove a blank line there, etc.), the Blast parser reads
> the file without problems. The easiest solution is to repeat the Blast run,
> using XML for the output format, and use the Blast XML parser in Biopython to
> parse the results.
>
> A general question is if anybody still needs the parser for Blast text
> output. Currently, we are confusing our users by having a Blast text parser
> that tends to break. A broken parser may be worse than no parser.
>
> --Michiel.
>
> Michiel de Hoon
> Center for Computational Biology and Bioinformatics
> Columbia University
> 1150 St Nicholas Avenue
> New York, NY 10032
>
>
>
> -----Original Message-----
> From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
> Sent: Wed 4/19/2006 6:15 AM
> To: Michiel De Hoon
> Cc: biopython at lists.open-bio.org
> Subject: RE: [BioPython] Need help parsing Blastoutput
>
> Hi
> Please see the attachment,it part of my Blast output.
> yes I am try to parse text output from Blast ,I have use another script to
> run my local blast that I am trying to perse the NCBIStandalone.BlastParser
> was working fine without hsp.sbject_end which is one of what I need to
> print out .
> On checking the class diagrams from cookbook, findout that sbject_end is
> not included .I just need another way of printing the int(subject end).
> Thanks for your help
> Halimah
>
> On Tue, 18 Apr 2006, Michiel De Hoon wrote:
>
> > Could you also send us the file Enterococcus_out so we can run the script?
> >
> > From the script, it looks like you're trying to parse text output from Blast.
> > While this is possible (in theory), the format of Blast text output tends to
> > change a lot, thereby breaking the parser in Biopython. It is more reliable
> > to have Blast generate output in XML format, and use the XML parser:
> >
> > blast_out = open('my_blast.xml', 'r')
> >
> > from Bio.Blast import NCBIXML
> >
> > b_parser = NCBIXML.BlastParser()
> > b_record = b_parser.parse(blast_out)
> >
> > See section 3.1.2 in the Biopython cookbook, and section 3.1.4 on how to
> > generate Blast output in XML.
> >
> > --Michiel.
> >
> >
> >
> > Michiel de Hoon
> > Center for Computational Biology and Bioinformatics
> > Columbia University
> > 1150 St Nicholas Avenue
> > New York, NY 10032
> >
> >
> >
> > -----Original Message-----
> > From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
> > Sent: Tue 4/18/2006 11:06 AM
> > To: Michiel De Hoon
> > Cc: biopython at lists.open-bio.org
> > Subject: RE: [BioPython] Need help parsing Blastoutput
> >
> > thanks
> > please see the attchment a copy of my script and copy of my Blast output
> > Thanks
> >
> >
> > On Thu, 13 Apr 2006, Michiel De Hoon wrote:
> >
> > > Could you send us the script you were using?
> > >
> > > --Michiel.
> > >
> > > Michiel de Hoon
> > > Center for Computational Biology and Bioinformatics
> > > Columbia University
> > > 1150 St Nicholas Avenue
> > > New York, NY 10032
> > >
> > >
> > >
> > > -----Original Message-----
> > > From: biopython-bounces at lists.open-bio.org on behalf of Halima Rabiu
> > > Sent: Thu 4/13/2006 11:07 AM
> > > To: biopython at lists.open-bio.org
> > > Subject: [BioPython] Need help parsing Blastoutput
> > >
> > > Hi All,
> > > I have a BLAST output from a local blast
> > > I need to calculate my % alignment coverage as regard to my subject
> > > I try parsed the blast output and wanted to print the
> > > sbjct Start and Sbjct end. but I could not is there anyway I could this
> > > try to get mach coverage between my querry and subject I dont need
> > > Identities,but total % alignment for querry or subject.
> > > Thanks
> > > Halimah
> > >
> > > _______________________________________________
> > > BioPython mailing list - BioPython at lists.open-bio.org
> > > http://lists.open-bio.org/mailman/listinfo/biopython
> > >
> > >
> > >

From mdehoon at c2b2.columbia.edu Thu Apr 20 19:15:51 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Thu, 20 Apr 2006 15:15:51 -0400
Subject: [BioPython] Parsing and Creating Dictionaries of GenBank files
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF01@cgcmail.cgc.cpmc.columbia.edu>

> I did suggest a change to the documentation but it looks like no one has
> made the change...
>
> http://biopython.org/pipermail/biopython-dev/2005-November/002193.html

I have now made this update in CVS. I'll put it on the website as well, as
soon as I can figure out how to do that with the new webserver.

--Michiel.

Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032

From alpersoyler at yahoo.com Fri Apr 21 07:07:05 2006
From: alpersoyler at yahoo.com (alper soyler)
Date: Fri, 21 Apr 2006 00:07:05 -0700 (PDT)
Subject: [BioPython] Need help!!!
In-Reply-To: <1145540506.11610.17.camel@clintonite.nhm.ac.uk>
Message-ID: <20060421070705.85335.qmail@web36801.mail.mud.yahoo.com>

Hi Cymon,

Thank you for your reply. However, to construct a phylogenetic profile I need
to download approx. 100 completed genomes. I am looking for an easier way
(e.g. without downloading the genomes). Can I do it by running BLAST over the
internet?

Regards,
Alper Soyler

"Cymon J. Cox" wrote:
Hi Alper,

On Thu, 2006-04-20 at 05:59 -0700, alper soyler wrote:
> Hi All,
>
> I am new to Biopython and have a question. I want to construct a phylogenetic
> profile for one organism's proteins. I want to give my protein to blast to
> search one organism's genome (e.g. Homo sapiens) instead of whole genbank
> database. How can I solve my problem? Thank you in advance.
Assuming you want to do this locally, you'll need to download your target
genome, format it with the BLAST distribution programme 'formatdb', and
then feed your query and newly formatted genome BLAST database to
Bio.Blast.NCBIStandalone.

See http://biopython.org/docs/tutorial/Tutorial004.html#toc10
3.1.4 Running BLAST locally

for details,

Cheers, Cymon

____________________________________________________________________

Cymon J. Cox
Biometry and Molecular Research
Department of Zoology
Natural History Museum
Cromwell Road
London, SW7 5BD

Email: cy at cymon.org, c.cox at nhm.ac.uk, cymon.cox at googlemail.com
Phone : +44 (0)20 7942 6981
HomePage : http://www.duke.edu/~cymon
-8.63/-6.77
_____________________________________________________________________
Fedora Core release 4 (Stentz) clintonite.nhm.ac.uk
14:35:55 up 13 days, 20:42, 8 users, load average: 0.08, 0.16, 0.12

From biopython at maubp.freeserve.co.uk Fri Apr 21 08:44:56 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython))
Date: Fri, 21 Apr 2006 09:44:56 +0100
Subject: [BioPython] Updating the tutorial, was :Parsing and Creating Dictionaries of GenBank files
In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF01@cgcmail.cgc.cpmc.columbia.edu>
References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF01@cgcmail.cgc.cpmc.columbia.edu>
Message-ID: <44489B88.2030801@maubp.freeserve.co.uk>

Michiel De Hoon wrote:
>> I did suggest a change to the documentation but it looks like no
>> one has made the change...
>>
>> http://biopython.org/pipermail/biopython-dev/2005-November/002193.html
>>

Thanks - I was going to look at this today.

Something funny seems to have happened to the plain text version:

http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Doc/Tutorial.txt.diff?r1=1.5&r2=1.6&cvsroot=biopython

(a) The old "Title" is missing above the contents listing
(b) Contents entries contain &nbsp; which is nasty for plain text.
(c) Section references now contain odd text.

Is it possible you only ran the TeX file once? Usually with references
TeX should be run twice (and in extreme cases, three times).

In an earlier discussion it was suggested we remove the plain text
documentation from CVS, which I objected to as plain text is much easier
for non-TeX people to read.

If generating a consistent plain text version is a lot of hassle, then
maybe we can live without it?

> I have now made this update in CVS. I'll put it on the website also
> as soon as I can figure out how to do that with the new webserver.

I can't help you there - I was going to post to the Developer mailing
list to see if anyone had done this recently.

Have you been able to generate new HTML and Tutorial.pdf files?

Looks like you have also updated the text about the Blast parser :)

Peter

From cy at cymon.org Fri Apr 21 09:38:33 2006
From: cy at cymon.org (Cymon J. Cox)
Date: Fri, 21 Apr 2006 10:38:33 +0100
Subject: [BioPython] Need help!!!
In-Reply-To: <20060421070705.85335.qmail@web36801.mail.mud.yahoo.com>
References: <20060421070705.85335.qmail@web36801.mail.mud.yahoo.com>
Message-ID: <1145612313.4167.15.camel@clintonite.nhm.ac.uk>

Hi Alper,

On Fri, 2006-04-21 at 00:07 -0700, alper soyler wrote:
> Hi Cymon,
>
> Thank you for your reply. However, to construct phylogenetic profile I need to
> download approx. 100 completed genomes. I am searching to make it easier (e.g.
> without downloading genomes). Can I do it by running blast over the internet?

Well, I'm not sure; but here's my take on it and hopefully someone will
correct me if I'm wrong.

Assuming you are referring to complete genomes available through NCBI
(otherwise you'll almost certainly need to download them), I don't think
it's possible with the BioPython interface. Bio.Blast.NCBIWWW uses the
qblast interface at NCBI
(http://www.ncbi.nlm.nih.gov/BLAST/Doc/urlapi.html), which I think only
makes the following databases available:
http://www.ncbi.nlm.nih.gov/blast/blast_databases.shtml . From looking
at the qblast docs, it doesn't seem possible to restrict the search to a
particular organism while BLASTing against a particular NCBI db (e.g.
nr).

Depending on what you want to do, it may be easier and quicker to use the
NCBI web Blast interface to the Genomes databases:
http://www.ncbi.nlm.nih.gov/sutils/genom_table.cgi

Otherwise you'll have to bite the proverbial bullet and download and
format them individually.

Cheers, Cymon

>
> Regards,
> Alper Soyler
>
> "Cymon J. Cox" wrote:
> Hi Alper,
>
> On Thu, 2006-04-20 at 05:59 -0700, alper soyler wrote:
> > Hi All,
> >
> > I am new to Biopython and have a question. I want to construct a phylogenetic
> > profile for one organism's proteins. I want to give my protein to blast to
> > search one organism's genome (e.g. Homo sapiens) instead of whole genbank
> > database. How can I solve my problem? Thank you in advance.
>
> Assuming you want to do this locally, you'll need to download your target
> genome, format it with the BLAST distribution programme 'formatdb', and
> then feed your query and newly formatted genome BLAST database to
> Bio.Blast.NCBIStandalone.
>
> See http://biopython.org/docs/tutorial/Tutorial004.html#toc10
> 3.1.4 Running BLAST locally
>
> for details,
>
> Cheers, Cymon

From biopython at maubp.freeserve.co.uk Fri Apr 21 09:23:12 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython))
Date: Fri, 21 Apr 2006 10:23:12 +0100
Subject: [BioPython] blast against genomes, was: Need help!!!
In-Reply-To: <20060421070705.85335.qmail@web36801.mail.mud.yahoo.com>
References: <20060421070705.85335.qmail@web36801.mail.mud.yahoo.com>
Message-ID: <4448A480.5010805@maubp.freeserve.co.uk>

alper soyler wrote:
> Hi Cymon,
>
> Thank you for your reply. However, to construct phylogenetic profile
> I need to download approx. 100 completed genomes. I am searching to
> make it easier (e.g. without downloading genomes). Can I do it by
> running blast over the internet?

So you want to search 100 completed genomes using your protein as the
input query?

As Cymon suggested, downloading the genomes and building your own
database is one method. As this is a "big task" you have in mind, the
network speed limitations of doing many blast queries may make this a
better idea than trying to do it online.

However, the NCBI offer online blast against some (all?) of their
completed genomes so it may be possible to do it this way via BioPython.

http://www.ncbi.nlm.nih.gov/BLAST/

The webpage has a nice interface for blast against specific genomes
(right hand side, second box down).

You can also use the normal blast pages and the "Limit by entrez query"
field, e.g. mouse[ORGN] OR rat[ORGN]

It should be possible to do this automatically in code but you will need
to compile a list of the species names the NCBI will understand...

Peter

From sbassi at gmail.com Fri Apr 21 11:46:49 2006
From: sbassi at gmail.com (Sebastian Bassi)
Date: Fri, 21 Apr 2006 08:46:49 -0300
Subject: [BioPython] Need help!!!
In-Reply-To: <20060421070705.85335.qmail@web36801.mail.mud.yahoo.com>
References: <1145540506.11610.17.camel@clintonite.nhm.ac.uk>
 <20060421070705.85335.qmail@web36801.mail.mud.yahoo.com>
Message-ID: 

On 4/21/06, alper soyler wrote:
> Hi Cymon,
> Thank you for your reply. However, to construct phylogenetic profile I need to download approx. 100 completed genomes. I am searching to make it easier (e.g. without downloading genomes). Can I do it by running blast over the internet?
>

Maybe you could download only the NR db and then make subsets from it.
The NCBI utilities or the local BLAST distribution include a utility that
allows you to extract sequences from compiled BLAST DBs.
I don't know if this would be enough for your needs.

--
Bioinformatics news: http://www.bioinformatica.info
Lriser: http://www.linspire.com/lraiser_success.php?serial=318

From mdehoon at c2b2.columbia.edu Fri Apr 21 16:26:39 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Fri, 21 Apr 2006 12:26:39 -0400
Subject: [BioPython] Updating the tutorial, was :Parsing and Creating Dictionaries of GenBank files
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF05@cgcmail.cgc.cpmc.columbia.edu>

> Something funny seems to have happened to the plain text version:
>
> http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Doc/Tutorial.txt.diff?r1=1.5&r2=1.6&cvsroot=biopython

The plain text version is generated by hevea, not by tex directly. The
funny output is likely due to my having a different hevea version (which I
ran a couple of times). I didn't see anything obviously wrong with the
Tutorial.tex source file, so I think these errors come from the
Tutorial.tex -> Tutorial.txt translation by hevea.

> If generating a consistent plain text version is a lot of hassle, then
> maybe we can live without it?

Currently, the plain text version is not very useful. It's not a source file,
so it should not be in CVS. On the other hand, the plain text version is not
available from the Biopython documentation page, and users are better off
with the PDF version anyway. So I think nobody will miss the plain text
version. Correct me if I'm wrong.

--Michiel.
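
A note on Peter's "Limit by entrez query" suggestion in the "blast against
genomes" message above: the limit string is just each species name tagged
with [ORGN] and joined with OR. Below is a minimal sketch in plain Python
that makes no Biopython or NCBI calls. The helper name organism_limit and
the choice to wrap multi-word names in double quotes are assumptions for
illustration only, and the names still have to be ones the NCBI recognises.

```python
def organism_limit(species_names):
    """Build a limit string such as 'mouse[ORGN] OR rat[ORGN]'.

    Hypothetical helper for illustration; not part of Biopython.
    """
    terms = []
    for name in species_names:
        name = name.strip()
        if not name:
            continue  # skip blank entries in the species list
        if " " in name:
            # Assumption: multi-word names are wrapped in double quotes.
            name = '"%s"' % name
        terms.append("%s[ORGN]" % name)
    return " OR ".join(terms)


# Reproduces the example string from Peter's message.
print(organism_limit(["mouse", "rat"]))  # prints: mouse[ORGN] OR rat[ORGN]
```

The resulting string would be pasted into (or submitted as) the limit field;
how it is submitted from code is left open here, since that depends on the
Biopython version installed.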

From srini_iyyer_bio at yahoo.com Fri Apr 21 22:49:28 2006
From: srini_iyyer_bio at yahoo.com (Srinivas Iyyer)
Date: Fri, 21 Apr 2006 15:49:28 -0700 (PDT)
Subject: [BioPython] Creating a graphical interface to database of gene coordinates
Message-ID: <20060421224928.87408.qmail@web38105.mail.mud.yahoo.com>

Dear group,
I am happy that I am slowly finding Pythonian projects
related to my research area.

Problem:
1. I have a database of human gene coordinates on
chromosomes.
2. I have gene expression data from my lab concerning
the genes I mentioned above.

3. I want to visualize expression data laid on
chromosomes.

Eg.

Coordinates:
Chr  Gene  From  To   Exon
1    x     100   120  exon:1
1    x     200   250  exon:2
1    x     350   450  exon:3

Expression data:
IDent   sample  Chr  From  To   Expression value
xxx_at  lung    1    110   120  100.35
x_s_at  heart   1    225   250  124.35
x_a_at  eye     1    375   400  146.35

What I want:
I want to have a simple window that would connect to
my database. I want to give it a gene, and this
Python/Tk interface (or whatever) would query the
database, draw a graph of the gene according to the
exons, and plot the values.

-------_______----------_______-------

-- : exon
__ : regions that are not exons, introns.

My questions to Tutor/BioPython forums:

1. What should I decide to work on
   a. Py/Tk framework
   b. Python imaging libraries etc.

2. I do not want to impress anyone with this work,
except that it should help me understand the
relationships, as the number game in the tables above
is highly confusing. So, a working version that
accurately plots the expression values for as many
samples as I have.

3. Are there any available modules to jump-start, or
do I have to create some from scratch? That would be a
problem because I am somewhere between a novice and a
mediocre level of Python programming.

4. Any ideas/suggestions/pointers are highly
appreciated.

thanks
Sri

From biopython at maubp.freeserve.co.uk Sat Apr 22 12:32:21 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython))
Date: Sat, 22 Apr 2006 13:32:21 +0100
Subject: [BioPython] Creating a graphical interface to database of gene coordinates
In-Reply-To: <20060421224928.87408.qmail@web38105.mail.mud.yahoo.com>
References: <20060421224928.87408.qmail@web38105.mail.mud.yahoo.com>
Message-ID: <444A2255.6010704@maubp.freeserve.co.uk>

Srinivas Iyyer wrote:
> Dear group,
> I am happy that I am slowly finding Pythonian projects
> related to my research area.
>
> Problem:
> 1. I have a database of human gene coordinates on
> chromosomes.
> 2. I have gene expression data from my lab concerning
> the genes I mentioned above.
>
> 3. I want to visualize expression data laid on
> chromosomes.

You may be able to produce chromosome diagrams with Leighton Pritchard
and Jennifer White's program genomediagram:

http://bioinf.scri.sari.ac.uk/lp/programs.html#genomediagram

It will do both circular genome diagrams (nice for bacteria) and linear
ones - which would make sense for chromosomes. I think I've seen
examples with expression data shown in this way... certainly it could be
done.

Note that this can produce PDF or bitmap output - but it's not
interactive. There is also a GUI to go with it, but I have not looked
at this.

----------------------------------------------------------------------

One final suggestion is to consider looking at R/BioConductor - it's a
completely different language but I have seen examples where expression
data is visualised on chromosomes.

http://www.r-project.org/
http://www.bioconductor.org/

You can even call R from Python, for example using RPy (R from Python):

http://rpy.sourceforge.net/index.html

See also RSPython, an R/SPlus - Python Interface which I have not used
personally:

http://www.omegahat.org/RSPython/

Peter

From biopython at maubp.freeserve.co.uk Mon Apr 24 10:56:06 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython List))
Date: Mon, 24 Apr 2006 11:56:06 +0100
Subject: [BioPython] Bio.Nexus documentation
Message-ID: <444CAEC6.5040703@maubp.freeserve.co.uk>

I'm thinking of having a go at using the new Bio.Nexus module in
BioPython to do some phylogenetic tree manipulation (from Clustal .dnd
files in my case), so I thought I would have a hunt for some examples or
help...

Back in July 2005, Frank Kauff wrote:
> I hope most of the methods have a descriptive title and are easy to use.
> Let me know if I can help further. And I promise to write some
> documentation, but it won't be before end of August.
>
> Cheers,
> Frank

Archive link:
http://biopython.org/pipermail/biopython/2005-July/002714.html

Was that August 2005, or August 2006, you had in mind? ;)

Do you have some simple examples you could share with us instead perhaps?

Thanks

Peter

From fkauff at duke.edu Mon Apr 24 13:32:45 2006
From: fkauff at duke.edu (Frank Kauff)
Date: Mon, 24 Apr 2006 09:32:45 -0400
Subject: [BioPython] Bio.Nexus documentation
In-Reply-To: <444CAEC6.5040703@maubp.freeserve.co.uk>
References: <444CAEC6.5040703@maubp.freeserve.co.uk>
Message-ID: <1145885566.2369.6.camel@osiris.biology.duke.edu>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: 

From halima at mancala.cbio.uct.ac.za Mon Apr 24 08:45:09 2006
From: halima at mancala.cbio.uct.ac.za (Halima Rabiu)
Date: Mon, 24 Apr 2006 10:45:09 +0200 (SAST)
Subject: [BioPython] Need help parsing Blastoutput
In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEFF@cgcmail.cgc.cpmc.columbia.edu>
References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEFF@cgcmail.cgc.cpmc.columbia.edu>
Message-ID: 

Hi
attch here is the output xml out I also attached it in my previous post
thanks
Halimah

On Thu, 20 Apr 2006, Michiel De Hoon wrote:

> Could you send us the Blast XML output also?
>
> --Michiel.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: blast2.xml
Type: text/xml
Size: 151658 bytes
Desc: 
URL: 

From mdehoon at c2b2.columbia.edu Mon Apr 24 18:14:17 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Mon, 24 Apr 2006 14:14:17 -0400
Subject: [BioPython] Need help parsing Blastoutput
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF0E@cgcmail.cgc.cpmc.columbia.edu>

Ha, I see. My stupid email program was removing the XML file from your email
messages for security reasons something or other. Anyway, I got the XML files
from the mailing list archives.

The XML file from Thursday April 20 is different from the one sent on Monday
April 24. In fact, the latter seems to be damaged; in line 194, it has:

while the former has

So in the latter a " is missing for some reason.
Anyway, the XML parser can read the XML file from Thursday April 20 if you fix a few things in your script: *) Instead of b_record = b_parser.parse(b_out) you need b_iterator = NCBIStandalone.Iterator(b_out, b_parser) (and then you should also import NCBIStandalone) *) You should check if b_record is None immediately after b_record = b_iterator.next(). *) There is no hsp.sbject_end --Michiel. Michiel de Hoon Center for Computational Biology and Bioinformatics Columbia University 1150 St Nicholas Avenue New York, NY 10032 -----Original Message----- From: Halima Rabiu [mailto:halima at cbio.uct.ac.za] Sent: Mon 4/24/2006 4:45 AM To: Michiel De Hoon Cc: biopython at lists.open-bio.org Subject: RE: [BioPython] Need help parsing Blastoutput Hi attch here is the output xml out I also attached it in my previous post thanks Halimah On Thu, 20 Apr 2006, Michiel De Hoon wrote: > Could you send us the Blast XML output also? > > --Michiel. > > Michiel de Hoon > Center for Computational Biology and Bioinformatics > Columbia University > 1150 St Nicholas Avenue > New York, NY 10032 > > > > -----Original Message----- > From: Halima Rabiu [mailto:halima at cbio.uct.ac.za] > Sent: Thu 4/20/2006 7:57 AM > To: Michiel De Hoon > Cc: biopython at lists.open-bio.org > Subject: RE: [BioPython] Need help parsing Blastoutput > > thanks I try using XML parser and I am still geting errors which I dont > understand . please see the attchmnt copy of my script and Blast XML > output. > here is the error > raceback (most recent call last): > File "Bioperser.py", line 11, in ? 
> b_record = b_parser.parse(b_out) > File "/usr/local/lib/python2.4/site-packages/Bio/Blast/NCBIXML.py", line > 112, in parse > self._parser.parse(handler) > File "/usr/local//lib/python2.4/xml/sax/expatreader.py", line 107, in > parse > xmlreader.IncrementalParser.parse(self, source) > File "/usr/local//lib/python2.4/xml/sax/xmlreader.py", line 123, in > parse > self.feed(buffer) > File "/usr/local//lib/python2.4/xml/sax/expatreader.py", line 211, in > feed > self._err_handler.fatalError(exc) > File "/usr/local//lib/python2.4/xml/sax/handler.py", line 38, in > fatalError > raise exception > thanks > Halimah > > On Wed, 19 Apr 2006, Michiel De Hoon wrote: > > > The Blast parser fails to read your file because the format of Blast output > > has changed. If I edit the data file so that it corresponds to the old > format > > (add a space here, remove a blank line there, etc.), the Blast parser reads > > the file without problems. The easiest solution is to repeat the Blast run, > > using XML for the output format, and use the Blast XML parser in Biopython > to > > parse the results. > > > > A general question is if anybody still needs the parser for Blast text > > output. Currently, we are confusing our users by having a Blast text parser > > that tends to break. A broken parser may be worse than no parser. > > > > --Michiel. > > > > Michiel de Hoon > > Center for Computational Biology and Bioinformatics > > Columbia University > > 1150 St Nicholas Avenue > > New York, NY 10032 > > > > > > > > -----Original Message----- > > From: Halima Rabiu [mailto:halima at cbio.uct.ac.za] > > Sent: Wed 4/19/2006 6:15 AM > > To: Michiel De Hoon > > Cc: biopython at lists.open-bio.org > > Subject: RE: [BioPython] Need help parsing Blastoutput > > > > Hi > > Please see the attachment,it part of my Blast output. 
> > yes I am try to parse text output from Blast ,I have use another script to > > run my local blast that I am trying to perse the NCBIStandalone.BlastParser > > > was working fine without hsp.sbject_end which is one of what I need to > > print out . > > On checking the class diagrams from cookbook, findout that sbject_end is > > not included .I just need another way of printing the int(subject end). > > Thanks for your help > > Halimah > > > > On Tue, 18 Apr 2006, Michiel De Hoon wrote: > > > > > Could you also send us the file Enterococcus_out so we can run the > script? > > > > > > From the script, it looks like you're trying to parse text output from > > Blast. > > > While this is possible (in theory), the format of Blast text output tends > > to > > > change a lot, thereby breaking the parser in Biopython. It is more > reliable > > > to have Blast generate output in XML format, and use the XML parser: > > > > > > blast_out = open('my_blast.xml', 'r') > > > > > > from Bio.Blast import NCBIXML > > > > > > b_parser = NCBIXML.BlastParser() > > > b_record = b_parser.parse(blast_out) > > > > > > See section 3.1.2 in the Biopython cookbook, and section 3.1.4 on how to > > > generate Blast output in XML. > > > > > > --Michiel. > > > > > > > > > > > > Michiel de Hoon > > > Center for Computational Biology and Bioinformatics > > > Columbia University > > > 1150 St Nicholas Avenue > > > New York, NY 10032 > > > > > > > > > > > > -----Original Message----- > > > From: Halima Rabiu [mailto:halima at cbio.uct.ac.za] > > > Sent: Tue 4/18/2006 11:06 AM > > > To: Michiel De Hoon > > > Cc: biopython at lists.open-bio.org > > > Subject: RE: [BioPython] Need help parsing Blastoutput > > > > > > thanks > > > please see the attchment a copy of my script and copy of my Blast output > > > Thanks > > > > > > > > > On Thu, 13 Apr 2006, Michiel De Hoon wrote: > > > > > > > Could you send us the script you were using? > > > > > > > > --Michiel. 
From mdehoon at c2b2.columbia.edu Mon Apr 24 18:27:31 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Mon, 24 Apr 2006 14:27:31 -0400
Subject: [BioPython] Need help parsing Blastoutput
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF0F@cgcmail.cgc.cpmc.columbia.edu>

Also, make sure you have the latest version of Bio/Blast/NCBIStandalone.py; you can get it from here:

http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/*checkout*/biopython/Bio/Blast/NCBIStandalone.py?rev=1.60&cvsroot=biopython&content-type=text/plain

--Michiel.
Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032

-----Original Message-----
From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
Sent: Mon 4/24/2006 4:45 AM
To: Michiel De Hoon
Cc: biopython at lists.open-bio.org
Subject: RE: [BioPython] Need help parsing Blastoutput

Hi,
Attached here is the XML output. I also attached it in my previous post.
Thanks
Halimah

On Thu, 20 Apr 2006, Michiel De Hoon wrote:

> Could you send us the Blast XML output also?
>
> --Michiel.
>
> -----Original Message-----
> From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
> Sent: Thu 4/20/2006 7:57 AM
> To: Michiel De Hoon
> Cc: biopython at lists.open-bio.org
> Subject: RE: [BioPython] Need help parsing Blastoutput
>
> Thanks. I tried using the XML parser and I am still getting errors which I
> don't understand. Please see the attached copy of my script and Blast XML
> output. Here is the error:
>
> Traceback (most recent call last):
>   File "Bioperser.py", line 11, in ?
>     b_record = b_parser.parse(b_out)
>   File "/usr/local/lib/python2.4/site-packages/Bio/Blast/NCBIXML.py", line 112, in parse
>     self._parser.parse(handler)
>   File "/usr/local//lib/python2.4/xml/sax/expatreader.py", line 107, in parse
>     xmlreader.IncrementalParser.parse(self, source)
>   File "/usr/local//lib/python2.4/xml/sax/xmlreader.py", line 123, in parse
>     self.feed(buffer)
>   File "/usr/local//lib/python2.4/xml/sax/expatreader.py", line 211, in feed
>     self._err_handler.fatalError(exc)
>   File "/usr/local//lib/python2.4/xml/sax/handler.py", line 38, in fatalError
>     raise exception
>
> Thanks
> Halimah
>
> On Wed, 19 Apr 2006, Michiel De Hoon wrote:
>
> > The Blast parser fails to read your file because the format of Blast output
> > has changed. If I edit the data file so that it corresponds to the old format
> > (add a space here, remove a blank line there, etc.), the Blast parser reads
> > the file without problems. The easiest solution is to repeat the Blast run,
> > using XML for the output format, and use the Blast XML parser in Biopython to
> > parse the results.
> >
> > A general question is if anybody still needs the parser for Blast text
> > output. Currently, we are confusing our users by having a Blast text parser
> > that tends to break. A broken parser may be worse than no parser.
> >
> > --Michiel.
> >
> > -----Original Message-----
> > From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
> > Sent: Wed 4/19/2006 6:15 AM
> > To: Michiel De Hoon
> > Cc: biopython at lists.open-bio.org
> > Subject: RE: [BioPython] Need help parsing Blastoutput
> >
> > Hi,
> > Please see the attachment; it is part of my Blast output.
> > Yes, I am trying to parse text output from Blast. I used another script to
> > run my local Blast, whose output I am trying to parse. The
> > NCBIStandalone.BlastParser was working fine without hsp.sbject_end, which
> > is one of the things I need to print out.
> > On checking the class diagrams in the cookbook, I found that sbject_end is
> > not included. I just need another way of printing the int(subject end).
> > Thanks for your help
> > Halimah
> >
> > On Tue, 18 Apr 2006, Michiel De Hoon wrote:
> >
> > > Could you also send us the file Enterococcus_out so we can run the script?
> > >
> > > From the script, it looks like you're trying to parse text output from Blast.
> > > While this is possible (in theory), the format of Blast text output tends to
> > > change a lot, thereby breaking the parser in Biopython. It is more reliable
> > > to have Blast generate output in XML format, and use the XML parser:
> > >
> > > blast_out = open('my_blast.xml', 'r')
> > >
> > > from Bio.Blast import NCBIXML
> > >
> > > b_parser = NCBIXML.BlastParser()
> > > b_record = b_parser.parse(blast_out)
> > >
> > > See section 3.1.2 in the Biopython cookbook, and section 3.1.4 on how to
> > > generate Blast output in XML.
> > >
> > > --Michiel.
> > >
> > > -----Original Message-----
> > > From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
> > > Sent: Tue 4/18/2006 11:06 AM
> > > To: Michiel De Hoon
> > > Cc: biopython at lists.open-bio.org
> > > Subject: RE: [BioPython] Need help parsing Blastoutput
> > >
> > > Thanks.
> > > Please see the attachment: a copy of my script and a copy of my Blast output.
> > > Thanks
> > >
> > > On Thu, 13 Apr 2006, Michiel De Hoon wrote:
> > >
> > > > Could you send us the script you were using?
> > > >
> > > > --Michiel.
> > > >
> > > > -----Original Message-----
> > > > From: biopython-bounces at lists.open-bio.org on behalf of Halima Rabiu
> > > > Sent: Thu 4/13/2006 11:07 AM
> > > > To: biopython at lists.open-bio.org
> > > > Subject: [BioPython] Need help parsing Blastoutput
> > > >
> > > > Hi All,
> > > > I have BLAST output from a local Blast run.
> > > > I need to calculate my % alignment coverage with regard to my subject.
> > > > I tried parsing the Blast output and wanted to print the
> > > > sbjct Start and Sbjct end.
> > > > But I could not. Is there any way I could do this? I am
> > > > trying to get match coverage between my query and subject. I don't need
> > > > identities, but total % alignment for query or subject.
> > > > Thanks
> > > > Halimah

From biopython at maubp.freeserve.co.uk Tue Apr 25 09:08:33 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython List))
Date: Tue, 25 Apr 2006 10:08:33 +0100
Subject: [BioPython] Bio.Nexus documentation
In-Reply-To: <1145885566.2369.6.camel@osiris.biology.duke.edu>
References: <444CAEC6.5040703@maubp.freeserve.co.uk> <1145885566.2369.6.camel@osiris.biology.duke.edu>
Message-ID: <444DE711.8070509@maubp.freeserve.co.uk>

> Anyway, I'll get some examples together, and I still want to do some
> documentation for the cookbook. It won't be before this weekend, though.
> For a quick and dirty anchor point, there's the test module that comes
> with the distribution, it naturally has some code that does interesting
> things with trees and data.

It's certainly shown me that the Nexus file format is a lot more complicated than just holding simple trees.

What I actually wanted to do was load a Newick format tree (extension *.dnd files from Clustalw/ClustalX in particular) into BioPython. This doesn't look like it is possible.

However, I can get Clustalx to save the corresponding alignment in Nexus format, but the parser doesn't seem to like it...

Traceback (most recent call last):
  File "C:\temp\hack_trees_000.py", line 7, in ?
    n=Nexus.Nexus(input_file)
  File "C:\Python23\lib\site-packages\Bio\Nexus\Nexus.py", line 525, in __init__
    self.read(input)
  File "C:\Python23\lib\site-packages\Bio\Nexus\Nexus.py", line 582, in read
    self._parse_nexus_block(title, contents)
  File "C:\Python23\lib\site-packages\Bio\Nexus\Nexus.py", line 623, in _parse_nexus_block
    getattr(self,'_'+line.command)(line.options)
AttributeError: 'Nexus' object has no attribute '_utree'

This looks like it's caused by the penultimate line of the "Nexus Tree file" produced by ClustalX:

..
UTREE PAUP_1= (...);
ENDBLOCK;

Any ideas? I'll happily send you some example tree files off the list if you want.

Peter

From fkauff at duke.edu Tue Apr 25 12:03:16 2006
From: fkauff at duke.edu (Frank)
Date: Tue, 25 Apr 2006 08:03:16 -0400
Subject: [BioPython] Bio.Nexus documentation
In-Reply-To: <444DE711.8070509@maubp.freeserve.co.uk>
References: <444CAEC6.5040703@maubp.freeserve.co.uk> <1145885566.2369.6.camel@osiris.biology.duke.edu> <444DE711.8070509@maubp.freeserve.co.uk>
Message-ID: <1145966596.2276.3.camel@cpe-066-057-048-192.nc.res.rr.com>

Hi Peter,

Yes, utree is indeed a nexus command I never heard of... The thing is that nexus is extensible, so programs can in theory define new commands. So, what is utree? Maybe an unrooted tree? Also, many programs don't care much about the nexus specifications, which are, in turn, not always too precise.

If you send the files along, I'd be happy to have a look.

Cheers,
Frank

From fkauff at duke.edu Tue Apr 25 21:17:23 2006
From: fkauff at duke.edu (Frank Kauff)
Date: Tue, 25 Apr 2006 17:17:23 -0400
Subject: [BioPython] Bio.Nexus documentation
In-Reply-To: <444DE711.8070509@maubp.freeserve.co.uk>
References: <444CAEC6.5040703@maubp.freeserve.co.uk> <1145885566.2369.6.camel@osiris.biology.duke.edu> <444DE711.8070509@maubp.freeserve.co.uk>
Message-ID: <1145999843.2365.25.camel@osiris.biology.duke.edu>

Ok, I added support for the utree command used in clustal to denote an unrooted tree (in the nexus parser, it is a synonym for 'tree', as trees are unrooted by default anyway), and fixed some issues with linebreaks in tree descriptions. Nexus files from Clustal should now be read without problems (famous last words).

Cheers,
Frank

--
Frank Kauff
Dept. of Biology
Duke University
Box 90338
Durham, NC 27708
USA

Phone 919-660-7382
Fax 919-660-7293
Web http://www.lutzonilab.net

From biopython at maubp.freeserve.co.uk Wed Apr 26 14:16:21 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython List))
Date: Wed, 26 Apr 2006 15:16:21 +0100
Subject: [BioPython] Bio.Nexus and Clustal tree files
Message-ID: <444F80B5.60207@maubp.freeserve.co.uk>

Hello again,

I have installed Frank Kauff's recent changes to Bio.Nexus from CVS, and have actually got a tree loaded now :)

Here is my example script, which tries to load two tree files created using ClustalX 1.83 (files previously sent to Frank off list):

(a) demo.dnd - Clustal guide tree in Newick format, no bootstraps
(b) demo.treb - Clustal NJ tree in Nexus format, with bootstraps

Example code starts here:

from Bio.Nexus import Nexus

for filename in [r"C:\TEMP\nexus\demo.dnd", r"C:\TEMP\nexus\demo.treb"] :
    input_file = open(filename,"r")
    n=Nexus.Nexus(input_file)
    input_file.close()
    print "-----------------"
    print "Filename:" + n.filename
    print "Number of taxlabels = %i" % len(n.taxlabels)
    print "Number of trees = %i" % len(n.trees)
    for tree in n.trees :
        print "Tree name: %s"% tree.name
        print "Tree nodes: " + ", ".join(tree.get_taxa())
print "-----------------"

This gives the following output:

-----------------
Filename:C:\TEMP\nexus\demo.dnd
Number of taxlabels = 0
Number of trees = 0
-----------------
Filename:C:\TEMP\nexus\demo.treb
Number of taxlabels = 0
Number of trees = 1
Tree name: PAUP_1
Tree nodes: V_Harveyi_PATH, B_subtilis_YXEM, B_subtilis_GlnH_homo_YCKK, YA80_HAEIN, FLIY_ECOLI, Deinococcus_radiodurans, HISJ_E_COLI, E_coli_GlnH
-----------------

As you can see, loading the ClustalX NEXUS output (*.treb) seems to work without trouble (although n.taxlabels is an empty list... is this to be expected?).

On the other hand, I don't get the tree for the Clustal guide tree file (*.dnd) which is a pain.
Do I need to load these files differently, as they are Newick format, not NEXUS format?

Thank you

Peter

From fkauff at duke.edu Wed Apr 26 15:17:31 2006
From: fkauff at duke.edu (Frank Kauff)
Date: Wed, 26 Apr 2006 11:17:31 -0400
Subject: [BioPython] Bio.Nexus and Clustal tree files
In-Reply-To: <444F80B5.60207@maubp.freeserve.co.uk>
References: <444F80B5.60207@maubp.freeserve.co.uk>
Message-ID: <1146064651.2365.41.camel@osiris.biology.duke.edu>

On Wed, 2006-04-26 at 15:16 +0100, Peter (BioPython List) wrote:
> Hello again,
>
> I have installed Frank Kauff's recent changes to Bio.Nexus from CVS, and
> have actually got a tree loaded now :)

Excellent!

> Here is my example script, which tries to load two tree files created
> using ClustalX 1.83 (files previously sent to Frank off list)
>
> (a) demo.dnd - Clustal guide tree in Newick format, no bootstraps
> (b) demo.treb - Clustal NJ tree in Nexus format, with bootstraps
>
> Example code starts here:
> This gives the following output:
>
> -----------------
> Filename:C:\TEMP\nexus\demo.dnd
> Number of taxlabels = 0
> Number of trees = 0
> -----------------
> Filename:C:\TEMP\nexus\demo.treb
> Number of taxlabels = 0
> Number of trees = 1
> Tree name: PAUP_1
> Tree nodes: V_Harveyi_PATH, B_subtilis_YXEM, B_subtilis_GlnH_homo_YCKK,
> YA80_HAEIN, FLIY_ECOLI, Deinococcus_radiodurans, HISJ_E_COLI, E_coli_GlnH
> -----------------
>
> As you can see, loading the ClustalX NEXUS output (*.treb) seems to work
> without trouble (although n.taxlabels is an empty list... is this to be
> expected?).

Yes, taxlabels refers to the taxon labels of a nexus data matrix. They are not necessarily identical to the taxa in the tree, but could be a superset or a subset of those. However, the way clustal indicates the number of supported bootstrap replicates (square brackets after the branch lengths) is unsupported, and thus these values are ignored.

> On the other hand, I don't get the tree for the Clustal guide tree file
> (*.dnd) which is a pain. Do I need to load these files differently, as
> they are Newick format, not NEXUS format?

Yes, the nexus parser reads only nexus. But you can throw the newick tree directly at the Tree class:

>>> from Bio.Nexus import Trees
>>> t=Trees.Tree(open('demo.dnd').read())

Frank

> Thank you
>
> Peter

--
Frank Kauff
Dept. of Biology
Duke University
Box 90338
Durham, NC 27708
USA

Phone 919-660-7382
Fax 919-660-7293
Web http://www.lutzonilab.net

From dam6278 at yahoo.fr Thu Apr 27 07:53:24 2006
From: dam6278 at yahoo.fr (dam6278)
Date: Thu, 27 Apr 2006 07:53:24 +0000 (GMT)
Subject: [BioPython] GenBank
Message-ID: <20060427075324.13946.qmail@web86913.mail.ukl.yahoo.com>

I have a problem with the GenBank parser.

When I execute:

from Bio import GenBank
gi_list = GenBank.search_for("Opuntia AND rpl16")

my output is:

Traceback (most recent call last):
  File "", line 1, in ?
  File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py", line 1398, in search_for
    retstart = start_id, retmax = max_ids)
  File "/usr/lib/python2.4/site-packages/Bio/EUtils/DBIdsClient.py", line 294, in search
    searchinfo = parse.parse_search(infile, [None])
  File "/usr/lib/python2.4/site-packages/Bio/EUtils/parse.py", line 201, in parse_search
    for ele in pom["TranslationStack"]:
  File "/usr/lib/python2.4/site-packages/Bio/EUtils/POM.py", line 355, in __getitem__
    raise IndexError, "no item matches"
IndexError: no item matches

Do you know where my problem is?

Thank you for your help.
damien

From lpritc at scri.sari.ac.uk Thu Apr 27 08:33:21 2006
From: lpritc at scri.sari.ac.uk (Leighton Pritchard)
Date: Thu, 27 Apr 2006 09:33:21 +0100
Subject: [BioPython] Creating a graphical interface to database of gene coordinates
In-Reply-To: <444A2255.6010704@maubp.freeserve.co.uk>
References: <20060421224928.87408.qmail@web38105.mail.mud.yahoo.com> <444A2255.6010704@maubp.freeserve.co.uk>
Message-ID: <1146126802.4725.223.camel@lplinuxdev>

Hi guys,

On Sat, 2006-04-22 at 13:32 +0100, Peter (BioPython) wrote:
> Srinivas Iyyer wrote:
> > Dear group,
> > I am happy that I am slowly finding pythonian projects
> > related to my research area.
> >
> > Problem:
> > 1. I have a database of human gene coordinates on
> > chromosomes.
> > 2. I have gene expression data from my lab concerning
> > the genes I mentioned above.
> >
> > 3. I want to visualize expression data laid on
> > chromosomes.
>
> You may be able to produce chromosome diagrams with Leighton Pritchard
> and Jennifer White's program genomediagram:
>
> http://bioinf.scri.sari.ac.uk/lp/programs.html#genomediagram
>
> It will do both circular genome diagrams (nice for bacteria) and linear
> ones - which would make sense for chromosomes. I think I've seen
> examples with expression data shown in this way... certainly it could be
> done.

We use it ourselves to plot array data against chromosome location, but on the whole chromosome scale and, as you mention, not interactively. It's pretty easy to do, but not what Srinivas is looking for, I think.

It sounds, Srinivas, like you're wanting something that will operate more like GeneSpring? Is that right? It's possible that, if you just wanted to present a static image of expression data, you could use GenomeDiagram in this way, but it's not the way I would choose to present the data in a GUI - I'd expect drawing straight onto a canvas (in whichever GUI toolkit suited you) to be more flexible for you.

> Note that this can produce PDF or bitmap output - but it's not
> interactive. There is also a GUI to go with it, but I have not looked
> at this.

The GUI is pretty rudimentary, providing for file selection and just enough document formatting so as to not be entirely useless to the non-programmer. An improved version (but still not interactive) is in a perennially almost-ready state as wxPython widgets in the current source, waiting for a serious fixing and a wxApp to hang from.

--
Dr Leighton Pritchard AMRSC
D131, Plant-Pathogen Interactions, Scottish Crop Research Institute
Invergowrie, Dundee, Scotland, DD2 5DA, UK
T: +44 (0)1382 562731 x2405 F: +44 (0)1382 568578
E: lpritc at scri.sari.ac.uk W: http://bioinf.scri.sari.ac.uk/lp
GPG/PGP: FEFC205C E58BA41B http://www.keyserver.net

From mdehoon at c2b2.columbia.edu Thu Apr 27 15:31:43 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Thu, 27 Apr 2006 11:31:43 -0400
Subject: [BioPython] GenBank
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF1A@cgcmail.cgc.cpmc.columbia.edu>

I was not able to replicate this error -- both biopython 1.41 and biopython in CVS worked fine. Perhaps a temporary internet failure?

--Michiel.

Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032

From bill at barnard-engineering.com Fri Apr 28 04:44:28 2006
From: bill at barnard-engineering.com (Bill Barnard)
Date: Thu, 27 Apr 2006 21:44:28 -0700
Subject: [BioPython] Updating the tutorial, was :Parsing and Creating Dictionaries of GenBank files
In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF05@cgcmail.cgc.cpmc.columbia.edu>
References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF05@cgcmail.cgc.cpmc.columbia.edu>
Message-ID: <1146199468.5816.34.camel@lyell.barnard-engineering.com>

On Fri, 2006-04-21 at 12:26 -0400, Michiel De Hoon wrote:
> > Something funny seems to have happened to the plain text version:
> >
> > http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Doc/Tutorial.txt.diff?r1=1.5&r2=1.6&cvsroot=biopython
>
> The plain text version is generated by hevea, so not by tex directly. The
> funny output is likely due to having a different hevea version (which I ran a
> couple of times). I didn't see anything obviously wrong with the Tutorial.tex
> source file, so I think these errors are due to errors in the Tutorial.tex ->
> Tutorial.txt translation by hevea.

FWIW - I just updated from CVS and ran my updated Doc makefiles (see http://bugzilla.open-bio.org/show_bug.cgi?id=1939 ) and don't see any of the weird artifacts in the generated Tutorial.txt file. My hevea version is 1.06.

> > If generating a consistent plain text version is a lot of hassle, then
> > maybe we can live without it?
>
> Currently, the plain text version is not very useful. It's not a source file,
> so it should not be in CVS. On the other hand, the plain text version is not
> available from the Biopython documentation page, and users are better off
> with the PDF version anyway. So I think nobody will miss the plain text
> version. Correct me if I'm wrong.

As long as your release process includes running a make in the Doc tree, then you can generate the txt file from the tex source.

Bill

From mdehoon at c2b2.columbia.edu Fri Apr 28 16:37:30 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Fri, 28 Apr 2006 12:37:30 -0400
Subject: [BioPython] Updating the tutorial, was :Parsing and Creating Dictionaries of GenBank files
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF1E@cgcmail.cgc.cpmc.columbia.edu>

> On Fri, 2006-04-21 at 12:26 -0400, Michiel De Hoon wrote:
> > > Something funny seems to have happened to the plain text version:
> >
> > The plain text version is generated by hevea, so not by tex directly. The
> > funny output is likely due to having a different hevea version (which I ran a
> > couple of times). I didn't see anything obviously wrong with the Tutorial.tex
> > source file, so I think these errors are due to errors in the Tutorial.tex ->
> > Tutorial.txt translation by hevea.
>
> FWIW - I just updated from CVS and ran my updated Doc makefiles (see
> http://bugzilla.open-bio.org/show_bug.cgi?id=1939 ) and don't see any of
> the weird artifacts in the generated Tutorial.txt file. My hevea version
> is 1.06.

So it's probably a hevea problem -- I'm using version 1.08.

> As long as your release process includes running a make in the Doc tree,
> then you can generate the txt file from the tex source.

That is one of the steps in building a release -- see http://www.biopython.org/docs/developer/build.html

--Michiel.

From clayton_kd at yahoo.com Sat Apr 29 15:05:09 2006
From: clayton_kd at yahoo.com (Kyle Dent)
Date: Sat, 29 Apr 2006 08:05:09 -0700 (PDT)
Subject: [BioPython] GenBank parsing
Message-ID: <20060429150509.82051.qmail@web31509.mail.mud.yahoo.com>

Dear All,

My script was successfully using the GenBank parser until just today, when I was trying to get it to parse a genpept file.
After much experimentation I discovered that it was actually having trouble parsing even newly downloaded GenBank files as well (downloaded from NCBI).

I wanted to ask if anyone is aware of this problem. I understand the flat file format was updated this month, which is probably the cause.

The output which I am getting:

Traceback (most recent call last):
  File "C:\work\GB CDS Extractor.py", line 289, in open1_clicked
    loadGenBank(self, self.gbFilePath)
  File "C:\work\GB CDS Extractor.py", line 75, in loadGenBank
    cur_record = genBank_Iterator.next()
  File "C:\Python24\Lib\site-packages\Bio\GenBank\__init__.py", line 129, in next
    return self._parser.parse(File.StringHandle(data))
  File "C:\Python24\Lib\site-packages\Bio\GenBank\__init__.py", line 219, in parse
    self._scanner.feed(handle, self._consumer)
  File "C:\Python24\Lib\site-packages\Bio\GenBank\__init__.py", line 1259, in feed
    self._parser.parseFile(handle)
  File "C:\Python24\Lib\site-packages\Martel\Parser.py", line 328, in parseFile
    self.parseString(fileobj.read())
  File "C:\Python24\Lib\site-packages\Martel\Parser.py", line 356, in parseString
    self._err_handler.fatalError(result)
  File "C:\Python24\lib\xml\sax\handler.py", line 38, in fatalError
    raise exception
Martel.Parser.ParserPositionException: error parsing at or beyond character 136

With thanks,
Kyle

From biopython at maubp.freeserve.co.uk Sat Apr 29 21:54:59 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython))
Date: Sat, 29 Apr 2006 22:54:59 +0100
Subject: [BioPython] GenBank parsing
In-Reply-To: <20060429150509.82051.qmail@web31509.mail.mud.yahoo.com>
References: <20060429150509.82051.qmail@web31509.mail.mud.yahoo.com>
Message-ID: <4453E0B3.9040409@maubp.freeserve.co.uk>

Kyle Dent wrote:
> Dear All,
>
> My script was successfully implementing the Genbank
> parser until just today I was trying to get it to
> parse a genpept file. After much experimentation I
> discovered that it was actually having trouble parsing
> even newly downloaded GenBank files as well
> (downloaded of NCBI).
>
> I wanted to ask if anyone is aware of this problem, I
> understand the flat file format was updated this month
> and is probably the cause of this.

I'm aware that earlier in 2006, there was a new project line added. I haven't been aware of any further changes... on the other hand, I don't think I've ever used a "genpept" file either.

Anyway, from the error message you are using the "old" Martel based parser shipped with BioPython 1.41.

We recommend you update to the current CVS parser which is (a) more up to date, (b) faster, (c) should give slightly more helpful error messages if it does get stuck.

For most cases you can simply download this file, replacing your Bio/GenBank/__init__.py after making a backup of the old version:

http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Bio/GenBank/__init__.py?cvsroot=biopython

If you see errors about ReseekFile then you will need to make a few other changes...

If you are still having trouble, or need further help making the update, please reply back. Including the GenBank reference of any problem file would be handy.

Thank you

Peter
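
[Editor's sketch] The "make a backup, then replace the file" step in the message above could be scripted roughly as follows. This is only a minimal sketch, assuming modern Python 3 and the standard library; the function name backup_then_replace is hypothetical, and the new __init__.py is assumed to have been downloaded separately by hand. It is not part of Biopython.

```python
import shutil
from pathlib import Path


def backup_then_replace(installed: Path, downloaded: Path) -> Path:
    """Copy `installed` to `<name>.bak`, then overwrite it with `downloaded`.

    Hypothetical helper for illustration only. Returns the backup path.
    """
    backup = installed.with_name(installed.name + ".bak")
    if backup.exists():
        # Never clobber an earlier backup; the user should move it first.
        raise FileExistsError(f"refusing to overwrite existing backup: {backup}")
    shutil.copy2(installed, backup)      # keep the old parser (and its timestamps)
    shutil.copy2(downloaded, installed)  # drop the new file into place
    return backup
```

In use, `installed` would be the Bio/GenBank/__init__.py inside your own site-packages directory and `downloaded` the file saved from the CVS link; both paths depend on your installation. Restoring the old parser is then just a matter of copying the .bak file back.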