From srini_iyyer_bio at yahoo.com  Sat Apr  1 13:13:16 2006
From: srini_iyyer_bio at yahoo.com (Srinivas Iyyer)
Date: Sat, 1 Apr 2006 10:13:16 -0800 (PST)
Subject: [BioPython] How can I retreive FASTA sequences from NCBI
In-Reply-To: <442BFFAD.10103@maubp.freeserve.co.uk>
Message-ID: <20060401181316.62812.qmail@web38113.mail.mud.yahoo.com>

Hi,
I have 151,204 GenBank accession IDs.
I want to retrieve their FASTA sequences from NCBI and
compile them for my local BLAST.


I am unable to get the FASTA sequences, and I do not
understand why.

Could anyone please help me?

my code:
>>> mylis
['AA035383', 'AA971406', 'N98563']
parser = Fasta.RecordParser()
iterator = Fasta.Iterator(mylis,parser)
rec = iterator.next()
rec = iterator.next()
>>> rec
>>>

rec is empty :-(


The accession IDs are not GIs; they are GenBank accession
IDs.

I do not want the sequences in GenBank (long) format. I
want them in FASTA format.

Could anyone please help me?

Thanks
Srini



From biopython at maubp.freeserve.co.uk  Sat Apr  1 14:59:46 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython))
Date: Sat, 01 Apr 2006 20:59:46 +0100
Subject: [BioPython] How can I retreive FASTA sequences from NCBI
In-Reply-To: <20060401181316.62812.qmail@web38113.mail.mud.yahoo.com>
References: <20060401181316.62812.qmail@web38113.mail.mud.yahoo.com>
Message-ID: <442EDBB2.3040105@maubp.freeserve.co.uk>

Srinivas Iyyer wrote:
> Hi , 
> I have 151,204 GenBank Accession IDs. 
> I want to retreive FASTA sequences from NCBI and
> compile them for my local blast. 
 >
 > I am unable to get fasta sequences. I do not
 > understand.
 >
 > Could any one please help me.

This should help.  The first identifier in your example, AA035383, is a 
nucleotide sequence available from the NCBI.  Searching the Entrez 
database for it leads here:

http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=1507107

Note, AA035383 --> gi:1507107

Using the web interface, you can choose to view it as FASTA format 
rather than the default of GenBank format, and save to file.

You could make a note of that URL, and just change the GI number to 
download all the files you want - but you need a simple way to determine 
the GI number...

Now, BioPython can help you here:

 >>> from Bio import GenBank
 >>> gi_list = GenBank.search_for('AA035383', database='nucleotide')
 >>> print gi_list
['1507107']

You could use this code to get the GI numbers for each of your 151,204 
GenBank Accession IDs.  I would check in each case that only one GI 
number is returned.

 >>> assert len(gi_list)==1
 >>> gi_number = gi_list[0]

Once you have the GI number, then you could just download the FASTA file 
yourself and then parse it in the normal way.  Or, get BioPython to do 
all this for you with its rather clever NCBIDictionary object...

 >>> from Bio import Fasta
 >>> from Bio import GenBank
 >>> ncbi_dict = GenBank.NCBIDictionary('nucleotide', 'fasta', \
...             parser =  Fasta.RecordParser())
 >>> gi_number = '1507107'
 >>> fasta_rec = ncbi_dict[gi_number]
 >>> print fasta_rec
 >gi|1507107|gb|AA035383.1|AA035383 zk25e12.r1 Soares_pregnant_uterus_NbHPU Homo sapiens cDNA clone IMAGE:471598 5', mRNA sequence
CTTGAGCCTCAGGAACGAGATGGCGGTTCTCTGGAGGCTGAGTGCCGTTTGCGGTGCCCT
AGGAGGCCGAGCTCTGTTGCTTCGAACTCCAGTGGTCAGACCTGCTCATATCTCAGCATT
TCTTCAGGACCGACCTATCCCAGAATGGTGTGGAGTGCAGCACATACACTTGTCACCCGA
GCCACCATTCTGGCTCCAAGGCTGCATCTCTCCACTGGACTAGCGAGANGGTTGTCANTG
TTTTGCTCCTGGGTCTGCTTCCCGGCTGCTTANTTGAANCCTTGCTCNGCGANGGACTAN
TCCCTGGC

You could use the Fasta.SequenceParser() if you prefer.  I would guess 
you would then want to save these FASTA records into one long FASTA file.
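
Collecting the records into one FASTA file can be sketched in plain Python 3 as below. This is only an illustration that does not use Biopython: `format_fasta` and `write_fasta_file` are hypothetical helper names, the 60-column line width is an assumption, and the two records are invented examples rather than real NCBI downloads.

```python
import io


def format_fasta(title, sequence, width=60):
    """Return one FASTA record as text: a '>' title line, then wrapped sequence."""
    lines = [">" + title]
    for start in range(0, len(sequence), width):
        lines.append(sequence[start:start + width])
    return "\n".join(lines) + "\n"


def write_fasta_file(handle, records, width=60):
    """Write (title, sequence) pairs to an open text handle; return the count."""
    count = 0
    for title, sequence in records:
        handle.write(format_fasta(title, sequence, width))
        count += 1
    return count


# Illustrative data only: both titles and sequences are made up.
records = [("example_1 first made-up record", "ACGT" * 20),
           ("example_2 second made-up record", "TTGACA")]
buffer = io.StringIO()  # stands in for open("all_records.fasta", "w")
written = write_fasta_file(buffer, records)
print(written)
print(buffer.getvalue())
```

Writing to one open handle in a loop means the full set of sequences never has to be held in memory at once, which matters with a list of this size.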

Enjoy!

Peter


From halima at mancala.cbio.uct.ac.za  Sun Apr  2 09:33:11 2006
From: halima at mancala.cbio.uct.ac.za (Halima Rabiu)
Date: Sun, 2 Apr 2006 15:33:11 +0200 (SAST)
Subject: [BioPython] Need help on NCBIStandaloneblast
In-Reply-To: <442BFFAD.10103@maubp.freeserve.co.uk>
References: <Pine.LNX.4.58.0603300915090.7802@mancala.cbio.uct.ac.za>
	<442BFFAD.10103@maubp.freeserve.co.uk>
Message-ID: <Pine.LNX.4.58.0604021523010.10948@mancala.cbio.uct.ac.za>

Thanks Peter. I was able to trace the error by printing 
error_info.read(): the problem was with my infile.
There is now a result in my save file, but I am still having a problem 
parsing the output file. I will try to figure it out; it may be a syntax 
problem.
Thanks
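
The check that located the problem here, reading what the program wrote to its error stream, can be sketched with the standard `subprocess` module in Python 3. This is an illustration under stated assumptions rather than the `NCBIStandalone.blastall` call discussed in this thread: `run_and_capture` is a hypothetical helper, and a tiny Python child process stands in for the Blast command line.

```python
import subprocess
import sys


def run_and_capture(command):
    """Run a command given as an argument list; return (exit code, stdout, stderr)."""
    completed = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode both streams to text
    )
    return completed.returncode, completed.stdout, completed.stderr


# Stand-in for a failing Blast run: the child writes to stderr and exits with 2.
child = [sys.executable, "-c",
         "import sys; sys.stderr.write('missing input file'); sys.exit(2)"]
code, out, err = run_and_capture(child)
if code != 0 or not out:
    # With no normal output, the error stream is the first place to look.
    print("exit code:", code)
    print("error output:", err)
```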

On Thu, 30 Mar 2006, Peter (BioPython List) wrote:

> Halima Rabiu wrote:
> > Hi everyboby ;
> > I am new to biopython having problems with the "NCBIStandalone.blastall".
> > After launching the Blast with "doBlast" it look like runs and end
> > and then I check the output it empty and I try same thing using comand
> > line it work and get result.
> > I attch my code.
> 
> Have you checked the paths are correct, e.g.
> 
> assert os.path.isfile(data), "Missing database file " + data
> assert os.path.isfile(infile), "Missing input file " + infile
> 
> You don't need to check blast_exe yourself, as the blastall command does this
> for you.
> 
> If I understood you correctly, the "blast.out" file is empty.
> 
> Did blast return any error message?  Try:
> 
> print error_info.read()
> 
> or:
> 
> save_file =open("blast.error","w")
> blast_result=error_info.read()
> save_file.write(blast_result)
> save_file.close()
> 
> Next question, could you tell us what you typed at the command line which does
> work?
> 
> > I also try to go though the previous posts on biopython mailing list fund
> > similar problem post by Andreas but no solution to the problem .
> 
> It was worth checking anyway :)
> 
> Peter
> 
> 

From as_nascimento at yahoo.com.br  Wed Apr  5 16:35:35 2006
From: as_nascimento at yahoo.com.br (Alessandro S. Nascimento)
Date: Wed, 05 Apr 2006 17:35:35 -0300
Subject: [BioPython] problems when parsing blast output
In-Reply-To: <43CCD436.7020704@maubp.freeserve.co.uk>
References: <43CC485E.7050702@yahoo.com.br>
	<43CCC6D4.4020307@maubp.freeserve.co.uk>
	<43CCCF56.40803@yahoo.com.br>
	<43CCD436.7020704@maubp.freeserve.co.uk>
Message-ID: <44342A17.4070404@yahoo.com.br>

Hi Peter

I had some trouble parsing the results from a blastpgp output 
file. My initial script used to work, but it isn't working this time. My 
blast output file is very, very large.
When I try to run it, I can see my processor working at 99% for some 
minutes, then it returns to the prompt with no results or information. Any 
idea what may be happening?

Thanks in advance,


Alessandro


#!/usr/bin/python

import os
from Bio.Blast import NCBIStandalone
from string import *

blast_out = open('blast.output', 'r')

b_parser = NCBIStandalone.PSIBlastParser()

b_record = b_parser.parse(blast_out)

n=0
for round in b_record.rounds:
    for alignment in round.alignments:
        for hsp in alignment.hsps:
            if hsp.identities < 90:
                if hsp.identities > 30:
                        if alignment.length > 200:
                                print "Retrieving sequence query"
                                os.system ("fastacmd -d ..//db/nr -s \'%s\' > test.bl2.%d" % (query, n, ))
                                n=n+1

blast_out.close()
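
One thing that may be worth checking in a filter like this is the type of `hsp.identities`. The sketch below assumes that the plain-text Blast parsers report it as a pair (identical positions, aligned positions) rather than as a percentage; if that assumption holds, comparing the pair directly with 30 or 90 would not do what the script intends. `percent_identity` and `passes_filter` are hypothetical names, the thresholds are copied from the script above, and the code is written for Python 3.

```python
def percent_identity(identities):
    """Convert an (identical, aligned) pair into a percentage."""
    identical, aligned = identities
    return 100.0 * identical / aligned


def passes_filter(identities, subject_length,
                  min_identity=30, max_identity=90, min_length=200):
    """Apply the script's thresholds to a percentage instead of the raw pair."""
    percent = percent_identity(identities)
    return min_identity < percent < max_identity and subject_length > min_length


# Values in the style of a Blast report line "Identities = 99/198 (50%)".
print(percent_identity((99, 198)))
print(passes_filter((99, 198), 207))
print(passes_filter((229, 229), 229))
```

Printing `type(hsp.identities)` for the first HSP would confirm or rule out this assumption for a given Biopython version.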


From halima at mancala.cbio.uct.ac.za  Thu Apr 13 11:07:52 2006
From: halima at mancala.cbio.uct.ac.za (Halima Rabiu)
Date: Thu, 13 Apr 2006 17:07:52 +0200 (SAST)
Subject: [BioPython] Need help parsing Blastoutput
Message-ID: <Pine.LNX.4.58.0604131640020.23647@mancala.cbio.uct.ac.za>

Hi All,
I have BLAST output from a local BLAST run.
I need to calculate the % alignment coverage with respect to my subject.
I tried parsing the BLAST output and wanted to print the
Sbjct start and Sbjct end, but I could not. Is there any way I could do 
this? I am trying to get the match coverage between my query and subject. 
I don't need Identities, but the total % alignment for the query or subject.
Thanks
Halimah


From mdehoon at c2b2.columbia.edu  Thu Apr 13 11:56:26 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Thu, 13 Apr 2006 11:56:26 -0400
Subject: [BioPython] Need help parsing Blastoutput
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEE9@cgcmail.cgc.cpmc.columbia.edu>

Could you send us the script you were using?

--Michiel.

Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032


_______________________________________________
BioPython mailing list  -  BioPython at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/biopython


From rafael at nbn.ac.za  Fri Apr 14 05:52:42 2006
From: rafael at nbn.ac.za (Rafael C. Jimenez)
Date: Fri, 14 Apr 2006 11:52:42 +0200
Subject: [BioPython] Need help parsing Blastoutput
In-Reply-To: <Pine.LNX.4.58.0604131640020.23647@mancala.cbio.uct.ac.za>
References: <Pine.LNX.4.58.0604131640020.23647@mancala.cbio.uct.ac.za>
Message-ID: <9ad32945680e91a485c1e0cdb1ca4eb7@nbn.ac.za>

On 13 Apr 2006, at 5:07 PM, Halima Rabiu wrote:

> Hi All,
> I have a BLAST output from a local blast

Well, I would say there are three alternatives for running Blast, and 
in one way or another all of them can be used locally:
  - Blast web server (through Blastcl3 or through Biopython)
  - Blast standalone
  - wwwblast

I guess that by "local blast" you mean that you are using Blast 
standalone with your own local databases. It matters which of the 
three you use, because each needs a different module to parse the 
output:
  - Bio.Blast.NCBIStandalone for Blast standalone output
  - Bio.Blast.NCBIWWW for Blast web server output
  - No parser for wwwblast

> I need to calculate my % alignment coverage as regard to my subject

I am not sure what you mean, but I would say that this % is provided by 
the "Identities" field in nucleotide and protein comparisons for each 
alignment, and also by the "Positives" field in protein comparisons.
Example: Identities = 11/26 (42%), Positives = 15/26 (57%)

> I try parsed the blast output and wanted to print the
> sbjct Start and Sbjct end. but I could not is there anyway I could this

# Open your Blast output file
blastOutput = open("The name of your blast output", 'r')

Then parse it, either as Blast standalone output:
     from Bio.Blast import NCBIStandalone
     parser = NCBIStandalone.BlastParser()
     blastRecord = parser.parse(blastOutput)


.... or as NCBI web server output:
     from Bio.Blast import NCBIWWW
     parser = NCBIWWW.BlastParser()
     blastRecord = parser.parse(blastOutput)


Now you can start to recover information using the Bio.Blast.Record 
module:

     import Bio.Blast.Record
     # ... for instance you can retrieve the Blast version you used when
     # you got your output ...
     print 'header.version:', blastRecord.version
     for alignment in blastRecord.alignments:
         # ... or the length of the subject sequence (the "Length =" line) ...
         print 'alignment.length:', alignment.length
         for hsp in alignment.hsps:
             # ... or the sbjct start as you want ...
             print 'hsp.sbjct_start:', hsp.sbjct_start
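
If what is wanted is coverage rather than identity, one possible definition is the fraction of the subject (or query) sequence that falls inside at least one HSP. The Python 3 sketch below assumes 1-based inclusive start and end coordinates as printed in Blast reports; `covered_length` and `percent_coverage` are hypothetical helper names, and the sample numbers are read from the LISMO alignment in the Blast report attached later in this thread (Sbjct 6 to 201 of a 207-residue subject, Query 8 to 204 of a 229-residue query).

```python
def covered_length(intervals):
    """Count positions covered by at least one 1-based inclusive (start, end) pair."""
    total = 0
    current_end = 0
    # Normalise each pair so start <= end, then sweep from left to right.
    for start, end in sorted((min(s, e), max(s, e)) for s, e in intervals):
        if end <= current_end:
            continue  # this interval lies entirely inside what was already counted
        total += end - max(start, current_end + 1) + 1
        current_end = end
    return total


def percent_coverage(intervals, sequence_length):
    """Percentage of a sequence of the given length covered by the intervals."""
    return 100.0 * covered_length(intervals) / sequence_length


subject_hsps = [(6, 201)]   # Sbjct start and end of the single HSP
query_hsps = [(8, 204)]     # Query start and end of the same HSP
print("subject coverage: %.1f%%" % percent_coverage(subject_hsps, 207))
print("query coverage: %.1f%%" % percent_coverage(query_hsps, 229))
```

Merging the intervals first avoids counting a region twice when several HSPs of the same hit overlap.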

>
> try to get mach coverage between my querry and subject I dont need
> Identities,but total % alignment for querry or subject.
> Thanks
> Halimah

I am working in the NBN central node in UWC, not far away from UCT. 
Don't hesitate to visit us if you want help or advice.

Cheers,
Rafael

Rafael C. Jimenez
-----------------------------------------------------------
National Bioinformatics Network
University of the Western Cape
Private Bag X17
Bellville 7530
South Africa
Tel: +27219592991
rafael at nbn.ac.za
www.nbn.ac.za
-----------------------------------------------------------
Proteomics Services Group
European Bioinformatics Institute (EMBL-EBI)
Wellcome Trust Genome Campus, Hinxton
Cambridge - CB10 1SD - UK
Tel: +441223492610
rafael at ebi.ac.uk
www.ebi.ac.uk
-----------------------------------------------------------



From halima at mancala.cbio.uct.ac.za  Tue Apr 18 11:06:02 2006
From: halima at mancala.cbio.uct.ac.za (Halima Rabiu)
Date: Tue, 18 Apr 2006 17:06:02 +0200 (SAST)
Subject: [BioPython] Need help parsing Blastoutput
In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEE9@cgcmail.cgc.cpmc.columbia.edu>
References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEE9@cgcmail.cgc.cpmc.columbia.edu>
Message-ID: <Pine.LNX.4.58.0604181704570.29563@mancala.cbio.uct.ac.za>

Thanks.
Please see the attachment: a copy of my script and a copy of my Blast output.
Thanks


On Thu, 13 Apr 2006, Michiel De Hoon wrote:

> Could you send us the script you were using?
> 
> --Michiel.
> 
> Michiel de Hoon
> Center for Computational Biology and Bioinformatics
> Columbia University
> 1150 St Nicholas Avenue
> New York, NY 10032
-------------- next part --------------
#! /usr/local/bin/python2.4
# halimah
# 16-04-2006

from Bio.Blast import NCBIStandalone

b_out = open('Enterococcus_out', 'r')
b_parser = NCBIStandalone.BlastParser()
b_iterator = NCBIStandalone.Iterator(b_out, b_parser)

E_VALUE_THRESH = 1.0

while 1:
    b_record = b_iterator.next()
    # The iterator returns None once there are no more records, so this
    # check has to come before b_record is used.
    if b_record is None:
        break
    print "The following results are for query " + b_record.query
    print 'len of query:', b_record.query_letters
    for alignment in b_record.alignments:
        for hsp in alignment.hsps:
            if hsp.expect <= E_VALUE_THRESH:
                print '****Alignment****'
                print 'title:', alignment.title
                print 'length:', alignment.length
                print 'e value:', hsp.expect
                print 'subject start:', hsp.sbjct_start
                # Bio.Blast.Record.HSP spells this attribute sbjct_end
                # (the script as sent used sbject_end); whether it is
                # filled in may depend on the Biopython version.
                print 'subject end:', hsp.sbjct_end

From mdehoon at c2b2.columbia.edu  Tue Apr 18 12:40:05 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Tue, 18 Apr 2006 12:40:05 -0400
Subject: [BioPython] Need help parsing Blastoutput
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEF5@cgcmail.cgc.cpmc.columbia.edu>

Could you also send us the file Enterococcus_out so we can run the script?

From the script, it looks like you're trying to parse text output from Blast.
While this is possible (in theory), the format of Blast text output tends to
change a lot, thereby breaking the parser in Biopython. It is more reliable
to have Blast generate output in XML format, and use the XML parser:

blast_out = open('my_blast.xml', 'r')

from Bio.Blast import NCBIXML

b_parser = NCBIXML.BlastParser()
b_record = b_parser.parse(blast_out)

See section 3.1.2 in the Biopython cookbook, and section 3.1.4 on how to
generate Blast output in XML.
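
As a rough illustration of why the XML route is convenient, the coordinates can even be pulled out with the standard library's `xml.etree.ElementTree`, without the Biopython parser shown above. The element names (`Hit`, `Hit_def`, `Hit_len`, `Hsp`, `Hsp_hit-from`, `Hsp_hit-to`, `Hsp_evalue`) follow NCBI's BlastOutput XML format as I understand it; the fragment itself is hand-written and heavily trimmed, not real Blast output, and `subject_ranges` is a hypothetical helper name. The code is written for Python 3.

```python
import xml.etree.ElementTree as ET

# Hand-written, trimmed fragment for illustration; a real file has many more fields.
SAMPLE_XML = """<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_def>LISMO 3MGH_LISMO (P58621)</Hit_def>
          <Hit_len>207</Hit_len>
          <Hit_hsps>
            <Hsp>
              <Hsp_evalue>2e-49</Hsp_evalue>
              <Hsp_query-from>8</Hsp_query-from>
              <Hsp_query-to>204</Hsp_query-to>
              <Hsp_hit-from>6</Hsp_hit-from>
              <Hsp_hit-to>201</Hsp_hit-to>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""


def subject_ranges(xml_text):
    """Return (title, subject length, subject start, subject end, e-value) per HSP."""
    root = ET.fromstring(xml_text)
    results = []
    for hit in root.iter("Hit"):
        title = hit.findtext("Hit_def")
        length = int(hit.findtext("Hit_len"))
        for hsp in hit.iter("Hsp"):
            results.append((title, length,
                            int(hsp.findtext("Hsp_hit-from")),
                            int(hsp.findtext("Hsp_hit-to")),
                            float(hsp.findtext("Hsp_evalue"))))
    return results


for row in subject_ranges(SAMPLE_XML):
    print(row)
```

Because every value sits in its own named element, small changes in the layout of the report do not affect this kind of lookup, which is the reliability argument made above.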

--Michiel.


Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032




From halima at mancala.cbio.uct.ac.za  Wed Apr 19 06:15:15 2006
From: halima at mancala.cbio.uct.ac.za (Halima Rabiu)
Date: Wed, 19 Apr 2006 12:15:15 +0200 (SAST)
Subject: [BioPython] Need help parsing Blastoutput
In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEF5@cgcmail.cgc.cpmc.columbia.edu>
References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEF5@cgcmail.cgc.cpmc.columbia.edu>
Message-ID: <Pine.LNX.4.58.0604191200150.29563@mancala.cbio.uct.ac.za>

Hi
Please see the attachment; it is part of my Blast output.
Yes, I am trying to parse text output from Blast. I used another script to 
run my local blast, whose output I am now trying to parse. The 
NCBIStandalone.BlastParser was working fine without hsp.sbject_end, which 
is one of the things I need to print out.
On checking the class diagrams in the cookbook, I found that sbject_end is 
not included. I just need another way of printing the int(subject end).
Thanks for your help
Halimah
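
One possible way to get the subject end without a dedicated attribute is to derive it from the subject start and the aligned subject string, since every non-gap character advances the position by one. The Python 3 sketch below assumes a protein search such as BLASTP (coordinates always increasing) and that the aligned subject text is available as a string with `-` for gaps (in Biopython this is, I believe, the `sbjct` attribute of the HSP object). `subject_end` is a hypothetical helper, and the sample strings are the last two Sbjct lines of the LISMO alignment in the attached report.

```python
def subject_end(sbjct_start, aligned_sbjct, gap_char="-"):
    """Return the 1-based end coordinate of an aligned subject segment."""
    residues = len(aligned_sbjct) - aligned_sbjct.count(gap_char)
    return sbjct_start + residues - 1


# "Sbjct: 126 ... 183" in the report; the line contains two gap characters.
print(subject_end(
    126, "LTNGPGKLTQALGLSMQDYGKTLFDSNIWL--EEAKLPHLIEATNRIGVPNKGIATHYPL"))  # 183
# "Sbjct: 184 ... 201" in the report; no gaps.
print(subject_end(184, "RFTVKGSPYISGQRKNSI"))  # 201
```

For a whole HSP the same function can be applied to the full aligned subject string together with the HSP's subject start.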

On Tue, 18 Apr 2006, Michiel De Hoon wrote:

> Could you also send us the file Enterococcus_out so we can run the script?
> 
> From the script, it looks like you're trying to parse text output from Blast.
> While this is possible (in theory), the format of Blast text output tends to
> change a lot, thereby breaking the parser in Biopython. It is more reliable
> to have Blast generate output in XML format, and use the XML parser:
> 
> blast_out = open('my_blast.xml', 'r')
> 
> from Bio.Blast import NCBIXML
> 
> b_parser = NCBIXML.BlastParser()
> b_record = b_parser.parse(blast_out)
> 
> See section 3.1.2 in the Biopython cookbook, and section 3.1.4 on how to
> generate Blast output in XML.
> 
> --Michiel.
> 
> 
> 
> Michiel de Hoon
> Center for Computational Biology and Bioinformatics
> Columbia University
> 1150 St Nicholas Avenue
> New York, NY 10032
> 
> 
> 
-------------- next part --------------
BLASTP 2.2.10 [Oct-19-2004]


Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs",  Nucleic Acids Res. 25:3389-3402.

Query= ENTFA 3MGH_ENTFA (Q833H5) Putative 3-methyladenine DNA
glycosylase (EC 3.2.2.-)
         (229 letters)

Database: Blastdata.fdb
           240,170 sequences; 77,468,597 total letters

Searching..................................................done

                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

ENTFA 3MGH_ENTFA (Q833H5) Putative 3-methyladenine DNA glycosyla...   462   e-130
LISMO 3MGH_LISMO (P58621) Putative 3-methyladenine DNA glycosyla...   194   2e-49
STAAM 3MGH_STAAM (P65414) Putative 3-methyladenine DNA glycosyla...   187   3e-47
STAES 3MGH_STAES (Q8CRC1) Putative 3-methyladenine DNA glycosyla...   186   5e-47
LISIN 3MGH_LISIN (Q92D89) Putative 3-methyladenine DNA glycosyla...   185   8e-47
LACPL 3MGH_LACPL (Q88VP8) Putative 3-methyladenine DNA glycosyla...   178   1e-44
BACSU 3MGH_BACSU (P94378) Putative 3-methyladenine DNA glycosyla...   160   3e-39
LACJO Q74LU5 (Q74LU5) 3-methyladenine DNA glycosylase                 155   7e-38
OCEIH 3MGH_OCEIH (Q8ETG4) Putative 3-methyladenine DNA glycosyla...   147   2e-35
BACAN 3MGH_BACAN (Q81UJ9) Putative 3-methyladenine DNA glycosyla...   130   4e-30
BACC1 Q73CV5 (Q73CV5) Methylpurine-DNA glycosylase family protein     125   8e-29
CLOTE 3MGH_CLOTE (Q896H4) Putative 3-methyladenine DNA glycosyla...   124   3e-28
CHLMU Q9PJE9 (Q9PJE9) Succinate dehydrogenase, iron-sulfur protein    113   4e-25
CHLCV 3MGH_CHLCV (Q824B4) Putative 3-methyladenine DNA glycosyla...   111   2e-24
CLOAB 3MGH_CLOAB (Q97EY6) Putative 3-methyladenine DNA glycosyla...   108   1e-23
CLOPE 3MGH_CLOPE (Q8XHA9) Putative 3-methyladenine DNA glycosyla...   107   4e-23
STRMU Q8DRU4 (Q8DRU4) Putative 3-methyladenine DNA glycosylase        103   3e-22
DEIRA 3MGH_DEIRA (Q9RSQ0) Putative 3-methyladenine DNA glycosyla...    86   9e-17
CORGL 3MGH_CORGL (Q8NU33) Putative 3-methyladenine DNA glycosyla...    82   1e-15
STRCO 3MGH_STRCO (Q9S208) Putative 3-methyladenine DNA glycosyla...    80   4e-15
BRAJA 3MGH_BRAJA (Q89LR7) Putative 3-methyladenine DNA glycosyla...    79   1e-14
STRAW 3MGH_STRAW (Q829C5) Putative 3-methyladenine DNA glycosyla...    73   8e-13
COREF 3MGH_COREF (Q8FUA2) Putative 3-methyladenine DNA glycosyla...    69   9e-12
PROAC Q6A5L3 (Q6A5L3) Putative 3-methylpurine DNA glycosylase          66   9e-11
MYCPA Q740F6 (Q740F6) Hypothetical protein                             64   3e-10
MYCLEH 3MGH_MYCTU (P65412) Putative 3-methyladenine DNA glycosyl...    64   5e-10
MYCTU 3MGH_MYCTU (P65412) Putative 3-methyladenine DNA glycosyla...    64   5e-10
MYCBO 3MGH_MYCBO (P65413) Putative 3-methyladenine DNA glycosyla...    64   5e-10
MYCLE 3MGH_MYCLE (O05678) Putative 3-methyladenine DNA glycosyla...    60   5e-09
RICCN 3MGH_RICCN (Q92IE0) Putative 3-methyladenine DNA glycosyla...    52   2e-06
RICPR 3MGH_RICPR (Q9ZDH7) Putative 3-methyladenine DNA glycosyla...    49   1e-05
PSEAE 3MGH_PSEAE (Q9HX17) Putative 3-methyladenine DNA glycosyla...    45   2e-04
PSEPK Q88DL3 (Q88DL3) DNA-3-methyladenine glycosylase, putative        42   0.002
BORPE Q7VWB7 (Q7VWB7) Putative methypurine-DNA glycosylase             40   0.004
BORPA Q7W9Q9 (Q7W9Q9) Putative methypurine-DNA glycosylase             40   0.004
STRAW Q93HJ1 (Q93HJ1) Modular polyketide synthase                      35   0.14
STRAW Q93HJ2 (Q93HJ2) Modular polyketide synthase                      33   0.68
SILPO Q5LKE3 (Q5LKE3) Peptidyl-prolyl cis-trans isomerase, FKBP-...    32   1.5
SILPO Q5LVX1 (Q5LVX1) ABC transporter, transmembrane ATP-binding...    30   4.4
CLOTE DXS_CLOTE (Q894H0) 1-deoxy-D-xylulose-5-phosphate synthase...    30   5.8
BURMA Q9AI54 (Q9AI54) DedA family protein                              30   7.5
STAES Q8CST0 (Q8CST0) Hypothetical protein SE0952                      29   9.8
SILPO Q5LMI5 (Q5LMI5) ADA regulatory protein                           29   9.8

>ENTFA 3MGH_ENTFA (Q833H5) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 229

 Score =  462 bits (1190), Expect = e-130
 Identities = 229/229 (100%), Positives = 229/229 (100%)

Query: 1   MVKEMKETINIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFG 60
           MVKEMKETINIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFG
Sbjct: 1   MVKEMKETINIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFG 60

Query: 61  LRKTPRLQAMYDKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQ 120
           LRKTPRLQAMYDKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQ
Sbjct: 61  LRKTPRLQAMYDKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQ 120

Query: 121 GRQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGR 180
           GRQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGR
Sbjct: 121 GRQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGR 180

Query: 181 WTELPLRYVVAGNPYISKQKRTAVDQIDFGWKDEENEKSNNAHILRGTT 229
           WTELPLRYVVAGNPYISKQKRTAVDQIDFGWKDEENEKSNNAHILRGTT
Sbjct: 181 WTELPLRYVVAGNPYISKQKRTAVDQIDFGWKDEENEKSNNAHILRGTT 229


>LISMO 3MGH_LISMO (P58621) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 207

 Score =  194 bits (492), Expect = 2e-49
 Identities = 99/198 (50%), Positives = 134/198 (67%), Gaps = 3/198 (1%)

Query: 8   TINIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRL 67
           T   F +KTT E+A+ +LGM L H+T  G+L G IV+ EAYLG  D AAHSF   +T R
Sbjct: 6   TKEFFESKTTIELARDILGMRLVHQTNEGLLSGLIVETEAYLGATDMAAHSFQNLRTKRT 65

Query: 68  QAMYDKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVE-GVDKMIENRQGRQGVE 126
           + M+  PGTIY+Y MH  ++LN +T  +G P+ ++IRAIEP E    +M +NR G+ G E
Sbjct: 66  EVMFSSPGTIYMYQMHRQVLLNFITMPKGIPEAILIRAIEPDEQAKQQMTQNRHGKTGYE 125

Query: 127 LTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPL 186
           LTNGPGKL  ALG+  Q YG+++F S++ L  E+ K P  IEA  RIG+PNKG  T  PL
Sbjct: 126 LTNGPGKLTQALGLSMQDYGKTLFDSNIWL--EEAKLPHLIEATNRIGVPNKGIATHYPL 183

Query: 187 RYVVAGNPYISKQKRTAV 204
           R+ V G+PYIS Q++ ++
Sbjct: 184 RFTVKGSPYISGQRKNSI 201


>STAAM 3MGH_STAAM (P65414) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 202

 Score =  187 bits (474), Expect = 3e-47
 Identities = 91/201 (45%), Positives = 132/201 (65%), Gaps = 1/201 (0%)

Query: 12  FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71
           F  + T + A+ LLG+ + ++       GYIV+ EAYLG  D+AAH FG + TP++ ++Y
Sbjct: 6   FINQQTTQTAKALLGVKIIYQDDYQTYTGYIVETEAYLGIQDKAAHGFGGKITPKVTSLY 65

Query: 72  DKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGP 131
            K GTIY + MHTHL++N VT+ +G P+GV+IRAIEP EG+  M  NR G+ G ELTNGP
Sbjct: 66  KKGGTIYAHVMHTHLLINFVTRTEGIPEGVLIRAIEPDEGIGAMNVNR-GKSGYELTNGP 124

Query: 132 GKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVA 191
           GK   A  I + + G ++    L +    RK+PK I    RIGIPNKG WT  PLR+ V
Sbjct: 125 GKWTKAFNIPRSIDGSTLNDCKLSIDTNHRKYPKTIIESGRIGIPNKGEWTNKPLRFTVK 184

Query: 192 GNPYISKQKRTAVDQIDFGWK 212
           GNPY+S+ +++     D  WK
Sbjct: 185 GNPYVSRMRKSDFQNPDDTWK 205


>LISIN 3MGH_LISIN (Q92D89) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 207

 Score =  185 bits (470), Expect = 8e-47
 Identities = 96/200 (48%), Positives = 130/200 (65%), Gaps = 3/200 (1%)

Query: 6   KETINIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTP 65
           K T   F  +TT E+A+ ++GM L HE     L GYIV+ EAYLG  D AAHSF   +T
Sbjct: 4   KITPTFFENRTTIELARDIIGMRLVHEIGNYTLSGYIVETEAYLGATDMAAHSFKNLRTK 63

Query: 66  RLQAMYDKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPV-EGVDKMIENRQGRQG 124
           R + M+  PGTIY Y MH  ++LN +T  +G P+ V+IRA+EP  E +++M +NR  + G
Sbjct: 64  RTEVMFGTPGTIYTYQMHQQVLLNFITMREGIPEAVLIRALEPTKESIEQMEQNRFLKTG 123

Query: 125 VELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTEL 184
            ELTNGPGKL  ALG+  Q YG+++F S++ L  E+ K P  IEA  RIG+PNKG  T
Sbjct: 124 FELTNGPGKLTQALGLSMQDYGKTLFDSNIWL--ERAKVPHIIEATNRIGVPNKGIATHY 181

Query: 185 PLRYVVAGNPYISKQKRTAV 204
           PLR+   G+PYIS Q++  +
Sbjct: 182 PLRFTAKGSPYISAQRKRQI 201


>LACPL 3MGH_LACPL (Q88VP8) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 209

 Score =  178 bits (451), Expect = 1e-44
 Identities = 93/199 (46%), Positives = 127/199 (63%), Gaps = 1/199 (0%)

Query: 13  NTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYD 72
           +T TT E+A  LLG  L  +T++GVL  +I + EAYLG  D  AH++   +TPR  A++
Sbjct: 9   STCTTPEIAVSLLGKQLRLQTSSGVLTAWITETEAYLGARDAGAHAYQNHQTPRNHALWQ 68

Query: 73  KPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPG 132
             GTIY+Y M    +LN+VTQ  G P+ V+IR IEP  G+++M + R       LTNGPG
Sbjct: 69  SAGTIYIYQMRAWCLLNIVTQAAGTPECVLIRGIEPDAGLERMQQQRP-VPIANLTNGPG 127

Query: 133 KLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAG 192
           KL+ ALG+DK L GQ++  ++L L     + P+++ A PRIGI NKG WT  PLRY VAG
Sbjct: 128 KLMQALGLDKTLNGQALQPATLSLDLSHYRQPEQVVATPRIGIVNKGEWTTAPLRYFVAG 187

Query: 193 NPYISKQKRTAVDQIDFGW 211
           NP++SK  R  +D    GW
Sbjct: 188 NPFVSKISRRTIDHEHHGW 206


>BACSU 3MGH_BACSU (P94378) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 196

 Score =  160 bits (405), Expect = 3e-39
 Identities = 91/198 (45%), Positives = 112/198 (56%), Gaps = 2/198 (1%)

Query: 1   MVKEMKETINIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFG 60
           M +E       F  KT  E+A  LLG  L  ET  G   GYIV+ EAY+G  D AAHSF
Sbjct: 1   MTREKNPLPITFYQKTALELAPSLLGCLLVKETDEGTASGYIVETEAYMGAGDRAAHSFN 60

Query: 61  LRKTPRLQAMYDKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQ 120
            R+T R + M+ + G +Y Y MHTH +LN+V  E+  PQ V+IRAIEP EG   M E R
Sbjct: 61  NRRTKRTEIMFAEAGRVYTYVMHTHTLLNVVAAEEDVPQAVLIRAIEPHEGQLLMEERRP 120

Query: 121 GRQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGR 180
           GR   E TNGPGKL  ALG+    YG+ I    L +  E    P+ I   PRIGI N G
Sbjct: 121 GRSPREWTNGPGKLTKALGVTMNDYGRWITEQPLYI--ESGYTPEAISTGPRIGIDNSGE 178

Query: 181 WTELPLRYVVAGNPYISK 198
             + P R+ V GN Y+S+
Sbjct: 179 ARDYPWRFWVTGNRYVSR 196


>LACJO Q74LU5 (Q74LU5) 3-methyladenine DNA glycosylase
          Length = 208

 Score =  155 bits (393), Expect = 7e-38
 Identities = 77/192 (40%), Positives = 125/192 (65%), Gaps = 2/192 (1%)

Query: 12  FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71
           F  ++T E+++ LLG  L +     +L G IV+AEAY+G  D AAHS+G R++P  + +Y
Sbjct: 7   FTNRSTSEISKDLLGRTLSYNNGEEILSGTIVEAEAYVGVKDRAAHSYGGRRSPANEGLY 66

Query: 72  DKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGP 131
              G++Y+Y+   +   ++  QE+G+PQGV+IRAI+P+ G+D MI+NR G+ G  LTNGP
Sbjct: 67  RPGGSLYIYSQRQYFFFDVSCQEEGEPQGVLIRAIDPLTGIDTMIKNRSGKTGPLLTNGP 126

Query: 132 GKLVAALGIDKQLYG-QSIFSSSLRLVPEKRKFPKKIEALPRIGI-PNKGRWTELPLRYV 189
           GK++ ALGI  + +    +  S   +  + ++  ++I ALPR+GI  +   W +  LR++
Sbjct: 127 GKMMQALGITSRKWDLVDLNDSPFDIDIDHKREIEEIVALPRVGINQSDPEWAQKKLRFI 186

Query: 190 VAGNPYISKQKR 201
           V+GNPY+S  K+
Sbjct: 187 VSGNPYVSDIKK 198


>OCEIH 3MGH_OCEIH (Q8ETG4) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 198

 Score =  147 bits (371), Expect = 2e-35
 Identities = 74/182 (40%), Positives = 112/182 (61%), Gaps = 2/182 (1%)

Query: 17  TEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGT 76
           T E+A+ LLG  L  +T  G   G IV+ EAYLG  D AAH +G R+T R + +Y KPG
Sbjct: 19  TLELAKNLLGCILVKQTEEGTSSGVIVETEAYLGNTDRAAHGYGNRRTKRTEILYSKPGY 78

Query: 77  IYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVA 136
            Y++ +H H ++N+V+  +G P+ V+IRA+EP  G+D+M+  R  ++   LT+GPGKL
Sbjct: 79  AYVHLIHNHRLINVVSSMEGDPESVLIRAVEPFSGIDEMLMRRPVKKFQNLTSGPGKLTQ 138

Query: 137 ALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAGNPYI 196
           A+GI  + YG  + +  L +   + K P  ++   RIGI N G   + P R+ V GNP++
Sbjct: 139 AMGIYMEDYGHFMLAPPLFI--SEGKSPASVKTGSRIGIDNTGEAKDYPYRFWVDGNPFV 196

Query: 197 SK 198
           S+
Sbjct: 197 SR 198


>BACAN 3MGH_BACAN (Q81UJ9) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 205

 Score =  130 bits (326), Expect = 4e-30
 Identities = 80/194 (41%), Positives = 112/194 (57%), Gaps = 11/194 (5%)

Query: 17  TEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGT 76
           T EVA+ LLG  L H        G IV+ EAY GPDD+AAHS+G R+T R + M+  PG
Sbjct: 12  TLEVAKKLLGQKLVHIVNGIKRSGIIVEVEAYKGPDDKAAHSYGGRRTDRTEVMFGAPGH 71

Query: 77  IYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGV------ELTN 129
            Y+Y ++  +   N++T   G PQGV+IRA+EPV+G++++   R  +  +       LTN
Sbjct: 72  AYVYLIYGMYHCFNVITAPVGTPQGVLIRALEPVDGIEEIKLARYNKTDITKAQYKNLTN 131

Query: 130 GPGKLVAALGIDKQLYGQSIFSSSL--RLVPEKRKFPK--KIEALPRIGIPNKGRWTELP 185
           GPGKL  ALGI  +  G S+ S +L   LVPE++      KI A PRI I         P
Sbjct: 132 GPGKLCRALGITLEERGVSLQSDTLHIELVPEEKHISSQYKITAGPRINIDYAEEAVHYP 191

Query: 186 LRYVVAGNPYISKQ 199
            R+   G+P++SK+
Sbjct: 192 WRFYYEGHPFVSKK 205


>BACC1 Q73CV5 (Q73CV5) Methylpurine-DNA glycosylase family protein
          Length = 205

 Score =  125 bits (315), Expect = 8e-29
 Identities = 79/194 (40%), Positives = 110/194 (56%), Gaps = 11/194 (5%)

Query: 17  TEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGT 76
           T EVA+ LLG  L H        G IV+ EAY GPDD+AAHS+G R+T R + M+  PG
Sbjct: 12  TLEVAKKLLGQKLVHIVDGIKRSGIIVEVEAYKGPDDKAAHSYGGRRTDRTEVMFGAPGH 71

Query: 77  IYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGV------ELTN 129
            Y+Y ++  +   N++T   G PQGV+IRA+EPV+G++++   R  +  +       LTN
Sbjct: 72  AYVYLIYGMYHCFNVITAPVGTPQGVLIRALEPVDGIEEIKLARYNKTDITKAQYKNLTN 131

Query: 130 GPGKLVAALGIDKQLYGQSIFSSSL--RLVPEKRKFPK--KIEALPRIGIPNKGRWTELP 185
           GPGKL  ALGI  +  G S+ S +L   LV E+       KI A PRI I         P
Sbjct: 132 GPGKLCRALGITLEERGVSLQSDTLHIELVREEEHISSQYKITAGPRINIDYAEEAVHYP 191

Query: 186 LRYVVAGNPYISKQ 199
            R+   G+P++SK+
Sbjct: 192 WRFYYEGHPFVSKK 205


>CLOTE 3MGH_CLOTE (Q896H4) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 203

 Score =  124 bits (310), Expect = 3e-28
 Identities = 74/197 (37%), Positives = 109/197 (55%), Gaps = 9/197 (4%)

Query: 12  FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71
           F  K+  +VA+YLLG  L +E     L G IV+ EAY+G  D+A+H++G +KT R+  +Y
Sbjct: 7   FYEKSALQVAKYLLGKILVNEVEGITLKGKIVETEAYIGAIDKASHAYGGKKTERVMPLY 66

Query: 72  DKPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKM--------IENRQGR 122
            KPGT Y+Y ++  +   N++T+ +G+ +GV+IRAIEP+EG++KM        I
Sbjct: 67  GKPGTAYVYLIYGMYHCFNVITKVEGEAEGVLIRAIEPLEGIEKMAYLRYKKPISEISKT 126

Query: 123 QGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWT 182
           Q   LT GPGKL  AL IDK    Q + +     +    K    I    RIGI
Sbjct: 127 QFKNLTTGPGKLCIALNIDKSNNKQDLCNEGTLYIEHNDKEKFNIVESKRIGIEYAEEAK 186

Query: 183 ELPLRYVVAGNPYISKQ 199
           +   R+ +  NP+ISK+
Sbjct: 187 DFLWRFYIEDNPWISKK 203


>CHLMU Q9PJE9 (Q9PJE9) Succinate dehydrogenase, iron-sulfur protein
          Length = 425711

 Score =  113 bits (283), Expect = 4e-25
 Identities = 72/185 (38%), Positives = 105/185 (56%), Gaps = 5/185 (2%)

Query: 10  NIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQA 69
           + F ++    +AQ LLG  L       +  GYIV+ EAY GPDD+A H++  RKT R +A
Sbjct: 321 HFFLSEDVITLAQQLLGHKLITTHEGLITSGYIVETEAYRGPDDKACHAYNYRKTQRNRA 380

Query: 70  MYDKPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVE-- 126
           MY K G+ YLY  +  H +LN+VT  +  P  V+IRAI P +G + MI+ RQ R
Sbjct: 381 MYLKGGSAYLYRCYGMHHLLNVVTGPEDIPHAVLIRAILPDQGKELMIQRRQWRDKPPHL 440

Query: 127 LTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPL 186
           LTNGPGK+  ALGI  +   Q + + +L +   K K    + A  RIGI     + ++P
Sbjct: 441 LTNGPGKVCQALGISLENNRQRLNTPALYI--SKEKISGTLTATARIGIDYAQEYRDVPW 498

Query: 187 RYVVA 191
           R++++
Sbjct: 499 RFLLS 503


>CHLCV 3MGH_CHLCV (Q824B4) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 190

 Score =  111 bits (278), Expect = 2e-24
 Identities = 67/174 (38%), Positives = 98/174 (56%), Gaps = 5/174 (2%)

Query: 20  VAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIYL 79
           +A+ LLG  L  + +  +  G+IV+ EAY GPDD+A H++  RKT R   MY + G  Y+
Sbjct: 15  LAKELLGHILITKISGKITSGFIVETEAYRGPDDKACHAYNYRKTKRNSPMYSRGGIAYI 74

Query: 80  YTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVE--LTNGPGKLVA 136
           Y  +  H + N+VT +Q  P  V+IRAI P EG D MI+ RQ +   +  LTNGPGK+
Sbjct: 75  YRCYGMHSLFNVVTAKQDLPHAVLIRAILPYEGEDIMIQRRQWQNKPKHLLTNGPGKVCQ 134

Query: 137 ALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVV 190
           AL +  +    ++ S  L +   K K   +I   PRIGI       +LP R+++
Sbjct: 135 ALNLTLEHNTHALTSPHLHI--SKEKASGRITQTPRIGIDYAEECKDLPWRFLL 186


>CLOAB 3MGH_CLOAB (Q97EY6) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 205

 Score =  108 bits (270), Expect = 1e-23
 Identities = 70/202 (34%), Positives = 110/202 (54%), Gaps = 10/202 (4%)

Query: 9   INIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQ 68
           I  F ++ T  VA+ LLG  L HE       G IV+ EAY G +D+ AH++G R+TPR +
Sbjct: 4   IREFYSRDTIVVAKELLGKVLVHEVNGIRTSGKIVEVEAYRGINDKGAHAYGGRRTPRTE 63

Query: 69  AMYDKPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGR----- 122
           A+Y   G  Y+Y ++  +  +N+V  ++G P+GV+IRAIEP+EG++ M E R  +
Sbjct: 64  ALYGPAGHAYVYFIYGLYYCMNVVAMQEGIPEGVLIRAIEPIEGIEVMSERRFKKLFNDL 123

Query: 123 ---QGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKG 179
              Q   LTNGP KL +A+ I ++     +    L +   K +  + +EA  R+GI
Sbjct: 124 TKYQLKNLTNGPSKLCSAMEIRREQNLMDLNGDELYIEEGKNESFEIVEA-KRVGIDYAE 182

Query: 180 RWTELPLRYVVAGNPYISKQKR 201
              +   R+ + GN  +S  K+
Sbjct: 183 EAKDYLWRFYIKGNKCVSVLKK 204


>CLOPE 3MGH_CLOPE (Q8XHA9) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 205

 Score =  107 bits (266), Expect = 4e-23
 Identities = 69/199 (34%), Positives = 107/199 (53%), Gaps = 11/199 (5%)

Query: 12  FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71
           F  + T  VA+ LLG  L        L G IV+ EAY+G  D+A+H++G ++T R + +Y
Sbjct: 7   FYNRDTVTVAKELLGKVLVRNINGVTLKGKIVETEAYIGAIDKASHAYGGKRTNRTETLY 66

Query: 72  DKPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVEL--- 127
             PGT+Y+Y ++  +  LN++++E+    GV+IR IEP+EG+++M + R  +   EL
Sbjct: 67  ADPGTVYVYIIYGMYHCLNLISEEKDVAGGVLIRGIEPLEGIEEMSKLRYKKSYEELSNY 126

Query: 128 -----TNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPK--KIEALPRIGIPNKGR 180
                +NGP KL  ALGIDK   G +  SS    V +     K   I    RIGI
Sbjct: 127 EKKNFSNGPSKLCMALGIDKGENGINTISSEEIYVEDDSLIKKDFSIVEAKRIGIDYAEE 186

Query: 181 WTELPLRYVVAGNPYISKQ 199
             +   R+ +  N ++SK+
Sbjct: 187 ARDFLWRFYIKDNKFVSKK 205


>STRMU Q8DRU4 (Q8DRU4) Putative 3-methyladenine DNA glycosylase
          Length = 192

 Score =  103 bits (258), Expect = 3e-22
 Identities = 64/173 (36%), Positives = 91/173 (52%), Gaps = 15/173 (8%)

Query: 40  GYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIYLYTMHTHLILNMVTQEQGKPQ 99
           G IV+ EAYLG  D A HS   R+TP+ +AMY   G  Y+Y ++ H +LN+VT+ Q   +
Sbjct: 34  GRIVETEAYLGSKDSACHSANDRRTPKNEAMYLAAGHWYVYQIYGHQMLNLVTKPQNVAE 93

Query: 100 GVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPE 159
            V+IRA+E  +             G  L NGPGKL    GIDK   G S+  S L L  +
Sbjct: 94  AVLIRALETAD-------------GHLLANGPGKLTKFAGIDKSFNGDSLQDSRLSL--Q 138

Query: 160 KRKFPKKIEALPRIGIPNKGRWTELPLRYVVAGNPYISKQKRTAVDQIDFGWK 212
           +   P++IE   RIG+     W +  L + V GN ++SK  + ++      WK
Sbjct: 139 EDLSPQRIEERSRIGVTCTDEWKDALLCFYVRGNQHVSKIAKKSLLTDKETWK 191


>DEIRA 3MGH_DEIRA (Q9RSQ0) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 190

 Score = 85.9 bits (211), Expect = 9e-17
 Identities = 64/181 (35%), Positives = 97/181 (53%), Gaps = 7/181 (3%)

Query: 20  VAQYLLGMYLEHETATGV-LGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIY 78
           +A+ LLG  L   T  G  L G +V+ EAY  P D A  + G     R   M   PG
Sbjct: 3   LARELLGGTLVRVTPDGHRLSGRVVEVEAYDCPRDPACTA-GRFHAARSAEMAIAPGHWL 61

Query: 79  LYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAAL 138
            +  H H +L +  +++G    V+IRA+EP+EG  KM++ R   +  +LT+GP KLV AL
Sbjct: 62  FWFAHGHPLLQVACRQEGVSASVLIRALEPLEGAGKMLDYRPVTRQRDLTSGPAKLVYAL 121

Query: 139 GID-KQLYGQSIFSSSLRLV-PEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAGNPYI 196
           G+D  Q+  + + S  L L+ PE      ++    R+GI  +GR   LP R+++ GN ++
Sbjct: 122 GLDPMQISHRPVNSPELHLLAPETPLADDEVTVTARVGI-REGR--NLPWRFLIRGNGWV 178

Query: 197 S 197
           S
Sbjct: 179 S 179


>CORGL 3MGH_CORGL (Q8NU33) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 189

 Score = 82.0 bits (201), Expect = 1e-15
 Identities = 66/185 (35%), Positives = 100/185 (54%), Gaps = 16/185 (8%)

Query: 20  VAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIYL 79
           VA  LLG  L H    G +G  I + EAYL   DEAAH++   KTPR  AM+   G +Y+
Sbjct: 12  VAPQLLGCTLTH----GGVGIRITEVEAYLDSTDEAAHTY-RGKTPRNAAMFGPGGHMYV 66

Query: 80  YTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGV---ELTNGPGKLV 135
           Y  +  H   N+V   +G  QGV++RA E V G + + ++R+G +G+    L  GPG
Sbjct: 67  YISYGIHRAGNIVCGPEGTGQGVLLRAGEVVSG-ESIAQSRRG-EGIPHARLAQGPGNFG 124

Query: 136 AALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAGNPY 195
            ALG++      S+F  S  L+ ++ + P+ +   PRIGI      TE  LR+ +  +P
Sbjct: 125 QALGLEISDNHASVFGPSF-LISDRVETPEIVRG-PRIGISKN---TEALLRFWIPNDPT 179

Query: 196 ISKQK 200
           +S ++
Sbjct: 180 VSGRR 184


>STRCO 3MGH_STRCO (Q9S208) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 213

 Score = 80.5 bits (197), Expect = 4e-15
 Identities = 59/184 (32%), Positives = 91/184 (49%), Gaps = 8/184 (4%)

Query: 19  EVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIY 78
           EVA  LLG  L      G +   + + EAY G +D  +H++  R TPR + M+  PG +Y
Sbjct: 21  EVAPDLLGRILVRTGPDGPITLRLTEVEAYDGQNDPGSHAYRGR-TPRNEVMFGPPGHVY 79

Query: 79  LY-TMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENR-QGRQGVELTNGPGKLVA 136
           +Y T      +N+V   +G+   V++RA E ++G +     R   R   EL  GP +L
Sbjct: 80  VYFTYGMWFCMNLVCGPEGRSSAVLLRAGEIIDGAELARTRRLSARNDKELAKGPARLAT 139

Query: 137 ALGIDKQLYGQSIFSSS---LRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAGN 193
           ALG+D+ L G    +S    LR++        ++   PR G+  +G     P RY VA +
Sbjct: 140 ALGVDRALNGTDACTSQETPLRILTGTPVPGDQVRNGPRTGVAGEG--GVHPWRYWVADD 197

Query: 194 PYIS 197
           P +S
Sbjct: 198 PTVS 201


>BRAJA 3MGH_BRAJA (Q89LR7) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 200

 Score = 79.0 bits (193), Expect = 1e-14
 Identities = 68/193 (35%), Positives = 94/193 (48%), Gaps = 22/193 (11%)

Query: 12  FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71
           F  ++  EVA  L+G  +      GV GG IV+ EAY    + AAHS+    TPR   M+
Sbjct: 20  FFGRSVREVAHDLIGATM---LVDGV-GGLIVEVEAY-HHTEPAAHSYN-GPTPRNHVMF 73

Query: 72  DKPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNG 130
             PG  Y+Y  +  H  +N V + +G    V+IRA+EP  G+  M   R  +    L +G
Sbjct: 74  GPPGFAYVYRSYGIHWCVNFVCEAEGSAAAVLIRALEPTHGIAAMRRRRHLQDVHALCSG 133

Query: 131 PGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALP-----RIGIPNKGRWTELP 185
           PGKL  ALGI       +I  ++L L         + E L      RIGI    +  ELP
Sbjct: 134 PGKLTEALGI-------TIAHNALPLDRPPIALHARTEDLEVATGIRIGIT---KAVELP 183

Query: 186 LRYVVAGNPYISK 198
            RY V G+ ++SK
Sbjct: 184 WRYGVKGSKFLSK 196


>STRAW 3MGH_STRAW (Q829C5) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 213

 Score = 72.8 bits (177), Expect = 8e-13
 Identities = 57/191 (29%), Positives = 88/191 (46%), Gaps = 8/191 (4%)

Query: 12  FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71
           F  +   +VA  LLG  L   T  G +   + + EAY GP D  +H++  R T R   M+
Sbjct: 14  FFARPVLDVAPDLLGRVLVRTTPDGPIELRVTEVEAYDGPSDPGSHAYRGR-TARNGVMF 72

Query: 72  DKPGTIYLY-TMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENR-QGRQGVELTN 129
             PG +Y+Y T      +N+V   +G+   V++RA E +EG +     R   R   EL
Sbjct: 73  GPPGHVYVYFTYGMWHCMNLVCGPEGRASAVLLRAGEIIEGAELARTRRLSARNDKELAK 132

Query: 130 GPGKLVAALGIDKQLYGQSIFS---SSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPL 186
           GP +L  AL +D+ L G    +     L L+      P ++   PR G+   G     P
Sbjct: 133 GPARLATALEVDRALDGTDACAPEGGPLTLLSGTPVPPDQVRNGPRTGVSGDG--GVHPW 190

Query: 187 RYVVAGNPYIS 197
           R+ +  +P +S
Sbjct: 191 RFWIDNDPTVS 201


>COREF 3MGH_COREF (Q8FUA2) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 190

 Score = 69.3 bits (168), Expect = 9e-12
 Identities = 58/182 (31%), Positives = 85/182 (46%), Gaps = 11/182 (6%)

Query: 20  VAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIYL 79
           VA  LLG    H+  +  L     + EAYLG +D AAH+    KT R  AM+   G +Y+
Sbjct: 12  VAPQLLGCIFTHDGVSIRL----TEVEAYLGAEDAAAHTHR-GKTARNAAMFGPGGHMYI 66

Query: 80  YTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAAL 138
           Y  +  H   N+    +G  QGV++RA E V G D     R       L  GPG L  AL
Sbjct: 67  YISYGIHRAGNIACAPEGVGQGVLLRAGEVVAGEDIAYRRRGDVPFTRLAQGPGNLGQAL 126

Query: 139 GIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAGNPYISK 198
                     I  +  +L+ E  + P+ +   PR+GI       + PLR+ + G+P +S
Sbjct: 127 NFQLSDNHAPINGTDFQLM-EPSERPEWVSG-PRVGITKN---ADAPLRFWIPGDPTVSV 181

Query: 199 QK 200
           ++
Sbjct: 182 RR 183


>PROAC Q6A5L3 (Q6A5L3) Putative 3-methylpurine DNA glycosylase
          Length = 191

 Score = 65.9 bits (159), Expect = 9e-11
 Identities = 56/190 (29%), Positives = 85/190 (44%), Gaps = 23/190 (12%)

Query: 19  EVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIY 78
           EVA  LLG  +      G +G  + + EAY+G DD A+H+F    TPR + M+  P  IY
Sbjct: 10  EVAPLLLGATIWR----GPVGIRLTEVEAYMGLDDPASHAFR-GPTPRARVMFGPPSHIY 64

Query: 79  LYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAA 137
           +Y  +  H  +N+V    G+   V++R  + + G D     R       L  GPG + +A
Sbjct: 65  VYLSYGMHRCVNLVCSPDGEASAVLLRGGQVIAGHDDARRRRGNVAENRLACGPGNMGSA 124

Query: 138 LGIDKQLYGQ----------SIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLR 187
           LG   +  G           S     L   PE  +F +     PR+GI    R  + P R
Sbjct: 125 LGASLEESGNPVSIIGNGAISALGWRLEPAPEIAEFRQG----PRVGI---SRNIDAPWR 177

Query: 188 YVVAGNPYIS 197
           + +  +P +S
Sbjct: 178 WWIPQDPTVS 187


>MYCPA Q740F6 (Q740F6) Hypothetical protein
          Length = 205

 Score = 64.3 bits (155), Expect = 3e-10
 Identities = 66/198 (33%), Positives = 92/198 (46%), Gaps = 30/198 (15%)

Query: 19  EVAQYLLGMYLEHETATGVLGGYIVDAEAYLG-PD----DEAAHSF-GLRKTPRLQAMYD 72
           E A+ LLG  L   T  GV  G IV+ EAY G PD    D AAHS+ GLR   R   M+
Sbjct: 14  EAARRLLGATL---TGRGV-SGVIVEVEAYGGVPDGPWPDAAAHSYKGLRA--RNFVMFG 67

Query: 73  KPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQG-----VE 126
            PG +Y Y  H  H+  N+     G    V++RA    +G D      +GR+G
Sbjct: 68  PPGRLYTYRSHGIHVCANVSCGPDGTAAAVLLRAAALEDGTDVA----RGRRGELVHTAA 123

Query: 127 LTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEAL--PRIGIPNKGRWTEL 184
           L  GPG L AA+GI     G  +F       P   +  + + A+  PR+G+    +  +
Sbjct: 124 LARGPGNLCAAMGITMADNGIDLFDPD---SPVTLRLHEPLTAVCGPRVGV---SQAADR 177

Query: 185 PLRYVVAGNPYISKQKRT 202
           P R  + G P +S  +R+
Sbjct: 178 PWRLWLPGRPEVSAYRRS 195


>MYCLEH 3MGH_MYCTU (P65412) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 203

 Score = 63.5 bits (153), Expect = 5e-10
 Identities = 55/171 (32%), Positives = 81/171 (47%), Gaps = 16/171 (9%)

Query: 42  IVDAEAYLG-PD----DEAAHSFGLRKTPRLQAMYDKPGTIYLYTMH-THLILNMVTQEQ 95
           +V+ EAY G PD    D AAHS+  R   R   M+  PG +Y Y  H  H+  N+
Sbjct: 31  VVEVEAYGGVPDGPWPDAAAHSYRGRNG-RNDVMFGPPGRLYTYRSHGIHVCANVACGPD 89

Query: 96  GKPQGVMIRAIEPVEGVDKMIENR-QGRQGVELTNGPGKLVAALGIDKQLYGQSIF--SS 152
           G    V++RA    +G +     R Q  + V L  GPG L AALGI     G  +F  SS
Sbjct: 90  GTAAAVLLRAAAIEDGAELATSRRGQTVRAVALARGPGNLCAALGITMADNGIDLFDPSS 149

Query: 153 SLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAGNPYISKQKRTA 203
            +RL   +     +  + PR+G+    +  + P R  + G P +S  +R++
Sbjct: 150 PVRL---RLNDTHRARSGPRVGV---SQAADRPWRLWLTGRPEVSAYRRSS 194


>MYCLE 3MGH_MYCLE (O05678) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 214

 Score = 60.1 bits (144), Expect = 5e-09
 Identities = 60/190 (31%), Positives = 88/190 (46%), Gaps = 18/190 (9%)

Query: 21  AQYLLGMYLEHETATGVLGGYIVDAEAYLG-PD----DEAAHSFGLRKTPRLQAMYDKPG 75
           A  LLG  +   T  GV    +V+ EAY G PD    D AAHS+  R   R   M+  PG
Sbjct: 25  AHRLLGATI---TGRGVCA-IVVEVEAYGGVPDGPWPDAAAHSYHGRND-RNAVMFGPPG 79

Query: 76  TIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGR--QGVELTNGPG 132
            +Y Y  H  H+  N+     G    V+IRA     G D +  +R+G   + V L  GPG
Sbjct: 80  RLYTYCSHGIHVCANVSCGPDGTAAAVLIRAGALENGAD-VARSRRGASVRTVALARGPG 138

Query: 133 KLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAG 192
            L +ALGI     G  +F++   +     +  + +   PR+GI +     + P R  + G
Sbjct: 139 NLCSALGITMDDNGIDVFAADSPVTLVLNEAQEAMSG-PRVGISHA---ADRPWRLWLPG 194

Query: 193 NPYISKQKRT 202
            P +S  +R+
Sbjct: 195 RPEVSTYRRS 204


>RICCN 3MGH_RICCN (Q92IE0) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 183

 Score = 51.6 bits (122), Expect = 2e-06
 Identities = 39/131 (29%), Positives = 62/131 (47%), Gaps = 18/131 (13%)

Query: 12  FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71
           F  + T  V+  L+G  L  +  T +    I + E+Y+G +D A H+    +T R   M+
Sbjct: 11  FFARDTNVVSTELIGKALYFQGKTAI----ITETESYIGQNDPACHA-ARGRTKRTDIMF 65

Query: 72  DKPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNG 130
              G  Y+Y ++  +  LN VT+ +G P   +IR +  +     + EN          NG
Sbjct: 66  GPAGFSYVYLIYGMYYCLNFVTEAKGFPAATLIRGVHVI-----LPENLY-------LNG 113

Query: 131 PGKLVAALGID 141
           PGKL   LGI+
Sbjct: 114 PGKLCKYLGIN 124


>RICPR 3MGH_RICPR (Q9ZDH7) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 217

 Score = 48.9 bits (115), Expect = 1e-05
 Identities = 29/96 (30%), Positives = 49/96 (51%), Gaps = 6/96 (6%)

Query: 12  FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71
           F  + T  V+  L+G  L  +  T +    I + E+Y+G DD A H+    +T R   M+
Sbjct: 11  FFARDTNLVSTELIGKVLYFQGTTAI----ITETESYIGEDDPACHA-ARGRTKRTDVMF 65

Query: 72  DKPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAI 106
              G  Y+Y ++  +  LN VT+++G P   +IR +
Sbjct: 66  GPAGFSYVYLIYGMYYCLNFVTEDEGFPAATLIRGV 101


>PSEAE 3MGH_PSEAE (Q9HX17) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 239

 Score = 45.1 bits (105), Expect = 2e-04
 Identities = 49/184 (26%), Positives = 80/184 (43%), Gaps = 17/184 (9%)

Query: 20  VAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIYL 79
           VA+ LLG  + H      L   I++ EAY   +++ +H+  L  T + +A++   G IY+
Sbjct: 29  VARELLGKVIRHRQGNLWLAARIIETEAYY-LEEKGSHA-SLGYTEKRKALFLDGGHIYM 86

Query: 80  YTMHTHLILNMVTQEQGKPQGVMIRAIEP----------VEGVDKMIENRQG--RQGVEL 127
           Y       LN      G    V+I++  P          +E +  +  + QG  R+   L
Sbjct: 87  YYARGGDSLNF--SAGGPGNAVLIKSGHPWLDRISDHTALERMQSLNPDSQGRPREIGRL 144

Query: 128 TNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLR 187
             G   L  A+G+    +    F      V +  + P ++    R+GIP KGR   LP R
Sbjct: 145 CAGQTLLCKAMGLKVPEWDAQRFDPQRLFVDDVGERPSQVIQAARLGIP-KGRDEHLPYR 203

Query: 188 YVVA 191
           +V A
Sbjct: 204 FVDA 207


>PSEPK Q88DL3 (Q88DL3) DNA-3-methyladenine glycosylase, putative
          Length = 222

 Score = 41.6 bits (96), Expect = 0.002
 Identities = 48/192 (25%), Positives = 77/192 (40%), Gaps = 17/192 (8%)

Query: 12  FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71
           F  +  + +A+ LLG  + H      L   I++ EAY   D  +  S G   T + +A++
Sbjct: 8   FFDRDAQTLAKALLGKVIRHRHGDLWLAARIIETEAYYLSDKGSHASLGY--TEKRKALF 65

Query: 72  DKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEP----VEGVDKMIENRQG------ 121
              G IY+Y       LN      G    V+I++  P    + G D + + +
Sbjct: 66  LDGGHIYMYYARGGDSLNF--SAHGPGNAVLIKSAYPWQDTLSGPDSLAQMQLNNPDASG 123

Query: 122 --RQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKG 179
             R    L  G   L  ALG+    +    F +    V +      ++    R+GIP+ G
Sbjct: 124 NIRPQERLCAGQTLLCRALGLKVPHWDAQRFDAERLYVEDCGNAVPRVIQAARLGIPH-G 182

Query: 180 RWTELPLRYVVA 191
           R   LP R+V A
Sbjct: 183 RDEHLPYRFVDA 194


>BORPE Q7VWB7 (Q7VWB7) Putative methypurine-DNA glycosylase
          Length = 238

 Score = 40.4 bits (93), Expect = 0.004
 Identities = 47/190 (24%), Positives = 78/190 (41%), Gaps = 17/190 (8%)

Query: 12  FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71
           F  +  +++A+ LLG  + H      L   I++ EAY   +  +  S G   T + +A++
Sbjct: 20  FFNRDAQQLARDLLGKVVRHRVDGLWLSARIIETEAYYLAEKGSHASLGY--THKRRALF 77

Query: 72  DKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRA----IEPVEG------VDKMIENRQG 121
              G +Y+Y       LN      G    V+I++    ++ V G      + ++  + QG
Sbjct: 78  MDGGVVYMYYARGGDSLNF--SAAGPGNAVLIKSGHPWVDEVSGPRALACMQRLNPDAQG 135

Query: 122 --RQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKG 179
             R    L  G   L  ALG+    +    F      V +      ++    R+GIP  G
Sbjct: 136 QPRPPARLCAGQTLLCRALGLKVPQWDAKAFDPRRFFVEDVGVRLDRLVRTTRLGIP-AG 194

Query: 180 RWTELPLRYV 189
           R   LP RYV
Sbjct: 195 RDEHLPYRYV 204


>BORPA Q7W9Q9 (Q7W9Q9) Putative methypurine-DNA glycosylase
          Length = 238

 Score = 40.4 bits (93), Expect = 0.004
 Identities = 47/190 (24%), Positives = 78/190 (41%), Gaps = 17/190 (8%)

Query: 12  FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71
           F  +  +++A+ LLG  + H      L   I++ EAY   +  +  S G   T + +A++
Sbjct: 20  FFNRDAQQLARDLLGKVVRHRVDGLWLSARIIETEAYYLAEKGSHASLGY--THKRRALF 77

Query: 72  DKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRA----IEPVEG------VDKMIENRQG 121
              G +Y+Y       LN      G    V+I++    ++ V G      + ++  + QG
Sbjct: 78  MDGGVVYMYYARGGDSLNF--SAAGPGNAVLIKSGHPWVDEVSGPRALACMQRLNPDAQG 135

Query: 122 --RQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKG 179
             R    L  G   L  ALG+    +    F      V +      ++    R+GIP  G
Sbjct: 136 QPRPPARLCAGQTLLCRALGLKVPQWDAKAFDPRRFFVEDVGVRLDRLVRTTRLGIP-AG 194

Query: 180 RWTELPLRYV 189
           R   LP RYV
Sbjct: 195 RDEHLPYRYV 204


>STRAW Q93HJ1 (Q93HJ1) Modular polyketide synthase
          Length = 3613

 Score = 35.4 bits (80), Expect = 0.14
 Identities = 16/39 (41%), Positives = 23/39 (58%)

Query: 99  QGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAA 137
           +G M+    PVEGV++ +   +GR GV   NGPG  V +
Sbjct: 700 KGGMVSVALPVEGVEERLARFEGRIGVAAVNGPGSAVVS 738


>STRAW Q93HJ2 (Q93HJ2) Modular polyketide synthase
          Length = 4685

 Score = 33.1 bits (74), Expect = 0.68
 Identities = 15/39 (38%), Positives = 23/39 (58%)

Query: 99   QGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAA 137
            +G M+    PVEGV++ +   +GR GV   NGP  +V +
Sbjct: 3743 KGGMVSVALPVEGVEERLARFEGRIGVAAVNGPTSVVVS 3781


 Score = 33.1 bits (74), Expect = 0.68
 Identities = 15/39 (38%), Positives = 23/39 (58%)

Query: 99   QGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAA 137
            +G M+    PVEGV++ +   +GR GV   NGP  +V +
Sbjct: 2223 KGGMVSVALPVEGVEERLARFEGRIGVAAVNGPTSVVVS 2261


 Score = 30.4 bits (67), Expect = 4.4
 Identities = 14/39 (35%), Positives = 22/39 (56%)

Query: 99  QGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAA 137
           +G M+    PV  V++ +   +GR GV   NGPG +V +
Sbjct: 695 KGGMVSVALPVGEVEERLARFEGRIGVAAVNGPGSVVVS 733


>SILPO Q5LKE3 (Q5LKE3) Peptidyl-prolyl cis-trans isomerase,
           FKBP-type
          Length = 142

 Score = 32.0 bits (71), Expect = 1.5
 Identities = 20/68 (29%), Positives = 34/68 (50%), Gaps = 6/68 (8%)

Query: 114 KMIENRQGRQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKF----PKKIEA 169
           K  ++ +GR  +E T G G+++   G+DK + G          VP    +    P+  +A
Sbjct: 22  KTFDSSEGRDPLEFTVGSGQIIP--GLDKAMPGMETGEKKRVEVPCAEAYGPLNPEARQA 79

Query: 170 LPRIGIPN 177
           +PR GIP+
Sbjct: 80  IPREGIPD 87


>SILPO Q5LVX1 (Q5LVX1) ABC transporter, transmembrane ATP-binding
           protein, putative
          Length = 1032

 Score = 30.4 bits (67), Expect = 4.4
 Identities = 19/48 (39%), Positives = 26/48 (54%), Gaps = 8/48 (16%)

Query: 101 VMIRAIEPVEGVDKMIENRQG----RQGVE----LTNGPGKLVAALGI 140
           V ++  EP +G   MIE   G    R+G E    +T GPG+LV  LG+
Sbjct: 935 VFLKDDEPTDGAYMMIEGEAGLYLPREGQEDQLIVTVGPGRLVGELGL 982


>CLOTE DXS_CLOTE (Q894H0) 1-deoxy-D-xylulose-5-phosphate synthase
           (EC 2.2.1.7) (1-deoxyxylulose-5-phosphate synthase) (DXP
           synthase) (DXPS)
          Length = 620

 Score = 30.0 bits (66), Expect = 5.8
 Identities = 19/55 (34%), Positives = 28/55 (50%), Gaps = 1/55 (1%)

Query: 138 LGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAG 192
           + I K + G S + SSLR+ P   KF + +E + +  IPN G+     L  V  G
Sbjct: 179 MSIGKNVGGLSTYLSSLRIDPNYNKFKRDVEGIIK-KIPNIGKGVAKNLERVKDG 232


>BURMA Q9AI54 (Q9AI54) DedA family protein
          Length = 1925639

 Score = 29.6 bits (65), Expect = 7.5
 Identities = 32/136 (23%), Positives = 52/136 (38%), Gaps = 6/136 (4%)

Query: 43      VDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGT-IYLYTMHTHLILNMVTQEQGKPQGV 101
               V+  A   P    A    ++    + A Y   G   + +  H  L   +    Q K   +
Sbjct: 1823164 VELVANEAPGSRMAFMHPVKSRAAISAAYFDHGVKTFSFDTHEELAKILDATGQAKDLNL 1823223

Query: 102     MIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAALGIDKQLYGQS--IFSSSLRLVPE 159
               ++R     EG    +    G+ GVE+ N P  L+AA    + L G S  + S  +R
Sbjct: 1823224 IVRMGVQAEGAAYSLS---GKFGVEMHNAPDLLLAARRATQDLMGVSFHVGSQCMRPTAF 1823280

Query: 160     KRKFPKKIEALPRIGI 175
               +    +   AL R G+
Sbjct: 1823281 QAAMAQASRALVRAGV 1823296


>STAES Q8CST0 (Q8CST0) Hypothetical protein SE0952
          Length = 572

 Score = 29.3 bits (64), Expect = 9.8
 Identities = 17/75 (22%), Positives = 36/75 (48%), Gaps = 5/75 (6%)

Query: 98  PQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAALG-----IDKQLYGQSIFSS 152
           P+  M+     +  +  +IEN++  +G+ LT+G    + A+      ID  +YG  +  +
Sbjct: 60  PEDEMLGVDIVIPDIQYVIENKERLKGIFLTHGHEHAIGAVSYVLEQIDAPVYGSKLTIA 119

Query: 153 SLRLVPEKRKFPKKI 167
            ++   + R   KK+
Sbjct: 120 LVKEAMKARNIKKKV 134


>SILPO Q5LMI5 (Q5LMI5) ADA regulatory protein
          Length = 283

 Score = 29.3 bits (64), Expect = 9.8
 Identities = 24/103 (23%), Positives = 48/103 (46%), Gaps = 4/103 (3%)

Query: 88  LNMVTQEQGKPQGVMIRAIEPVE--GVDKMIENRQGRQGVELTNGPGKLVAALGIDKQLY 145
           +N+ T E+G   GV+ RAIE ++  G    +++   R  +   +        +G+  + Y
Sbjct: 1   MNVQTTEEGYHYGVIRRAIELIDAGGESMPLDDLAARMNMSPAHFQRIFSRWVGVSPKKY 60

Query: 146 GQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRY 188
            Q +     + + E+R     +EA   +G+   GR  +L +R+
Sbjct: 61  QQYLTLGHAKALLEERF--TLLEAAQNVGLSGTGRLHDLFVRW 101


  Database: Blastdata.fdb
    Posted date:  Mar 29, 2006  3:30 PM
  Number of letters in database: 77,468,597
  Number of sequences in database:  240,170

Lambda     K      H
   0.316    0.135    0.391
Gapped
Lambda     K      H
   0.267   0.0410    0.140


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Hits to DB: 35,841,668
Number of Sequences: 240170
Number of extensions: 1550248
Number of successful extensions: 3502
Number of sequences better than 10.0: 43
Number of HSP's better than 10.0 without gapping: 24
Number of HSP's successfully gapped in prelim test: 19
Number of HSP's that attempted gapping in prelim test: 3332
Number of HSP's gapped (non-prelim): 140
length of query: 229
length of database: 77,468,597
effective HSP length: 107
effective length of query: 122
effective length of database: 51,770,407
effective search space: 6315989654
effective search space used: 6315989654
T: 11
A: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 64 (29.3 bits)
BLASTP 2.2.10 [Oct-19-2004]


Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs",  Nucleic Acids Res. 25:3389-3402.

Query= ENTFA AACA_ENTFA (P0A0C2) Bifunctional AAC/APH [Includes:
6'-aminoglycoside N-acetyltransferase (EC 2.3.1.-) (AAC(6'));
2''-aminoglycoside phosphotransferase (EC 2.7.1.-) (APH(2''))]
         (479 letters)

Database: Blastdata.fdb
           240,170 sequences; 77,468,597 total letters

Searching..................................................done

                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

STAAM AACA_STAAM (P0A0C0) Bifunctional AAC/APH [Includes: 6'-ami...   959   0.0
ENTFA AACA_ENTFA (P0A0C2) Bifunctional AAC/APH [Includes: 6'-ami...   959   0.0
BACAN Q81P54 (Q81P54) Acetyltransferase, GNAT family                  168   4e-41
BACC1 Q735Z9 (Q735Z9) Acetyltransferase, GNAT family                  159   1e-38
BACAN Q81T90 (Q81T90) Aminoglycoside phophotransferase family pr...    67   1e-10
BACC1 Q73BC3 (Q73BC3) Aminoglycoside phophotransferase family pr...    62   4e-09
BRAJA Q89WN0 (Q89WN0) Bll0648 protein                                  59   3e-08
BACHD Q9K9M4 (Q9K9M4) BH2621 protein                                   56   2e-07
BACC1 Q739G2 (Q739G2) 6'-aminoglycoside N-acetyltransferase/2''-...    55   5e-07
THEMA Q9X063 (Q9X063) Hypothetical protein                             52   3e-06
CLOTE Q896X4 (Q896X4) Putative acetyltransferase                       49   3e-05
BACHD Q9KB15 (Q9KB15) BH2121 protein                                   48   6e-05
STAES Q8CQT6 (Q8CQT6) Spermidine acetyltransferase                     47   1e-04
VIBCH Q9KMC2 (Q9KMC2) Acetyltransferase, putative                      45   5e-04
BACSU BLTD_BACSU (P39909) Spermine/spermidine acetyltransferase ...    45   6e-04
BACC1 Q739N8 (Q739N8) Spermidine acetyltransferase, putative           44   0.001
LACPL Q88SW8 (Q88SW8) Acetyltransferase (Putative)                     44   0.001
VIBUY Q7MK89 (Q7MK89) Putative acetyltransferase                       43   0.002
DEIRA Q9RW71 (Q9RW71) Acetyltransferase, putative                      43   0.002
BACAN Q81RM0 (Q81RM0) Spermidine acetyltransferase, putative           43   0.002
LACJO Q74K74 (Q74K74) Hypothetical protein                             42   0.003
BACC1 Q72YV7 (Q72YV7) Acetyltransferase, GNAT family                   42   0.003
BACC1 Q734R6 (Q734R6) Acetyltransferase, GNAT family                   42   0.004
CLOPE Q8XKH9 (Q8XKH9) Hypothetical protein CPE1416                     42   0.005
BACAN Q81SQ8 (Q81SQ8) Acetyltransferase, GNAT family                   41   0.007
VIBCH Q9K330 (Q9K330) Acetyltransferase, putative                      41   0.009
VIBUY Q7MK96 (Q7MK96) Putative acetyltransferase                       40   0.012
WIGBR Q8D3I4 (Q8D3I4) Imp protein                                      40   0.016
BACSU P94482 (P94482) YnaD                                             40   0.021
BACC1 Q73A91 (Q73A91) Acetyltransferase, GNAT family                   40   0.021
THETN Q8RC99 (Q8RC99) Acetyltransferases                               39   0.027
STRAW Q82IB6 (Q82IB6) Putative acetyltransferase                       39   0.027
LISIN Q92E38 (Q92E38) Lin0623 protein                                  39   0.027
STRCO O69977 (O69977) Hypothetical protein SCO5801                     39   0.036
STRAW Q82KD8 (Q82KD8) Hypothetical protein                             39   0.036
VIBCH Q9KMA3 (Q9KMA3) Acetyltransferase, putative                      39   0.046
STAAM Q932M5 (Q932M5) Hypothetical protein SAVP027                     39   0.046
LACLA Q9CHJ8 (Q9CHJ8) Acetyl transferase                               39   0.046
ENTFA Q82YR8 (Q82YR8) Acetyltransferase, GNAT family                   39   0.046
BACC1 Q73AS8 (Q73AS8) Acetyltransferase, GNAT family                   39   0.046
BACC1 Q734C3 (Q734C3) Acetyltransferase, GNAT family                   39   0.046
BACAN Q81YL2 (Q81YL2) Acetyltransferase, GNAT family                   39   0.046
BACAN Q81XB7 (Q81XB7) Spermidine N1-acetyltransferase                  39   0.046
SALTI Q8Z6Z1 (Q8Z6Z1) Spermidine N1-acetyltransferase                  38   0.061
SALTY Q8ZPJ3 (Q8ZPJ3) Spermidine N1-acetyltransferase (EC 2.3.1.57)    38   0.061
SALCH Q57PD6 (Q57PD6) Spermidine N1-acetyltransferase                  38   0.061
MYCPE Q8EWE5 (Q8EWE5) Acetyltransferase GNAT family                    38   0.061
BACC1 Q72Y03 (Q72Y03) Spermidine N1-acetyltransferase (EC 2.3.1.57)    38   0.061
DEIRA Q9RW73 (Q9RW73) Acetyltransferase, putative                      38   0.079
STAAM Q99U68 (Q99U68) Hypothetical protein                             37   0.10
RICPR Q9ZE75 (Q9ZE75) TRANSCRIPTIONAL ACTIVATOR PROTEIN CZCR (CzcR)    37   0.10
LACJO Q74J71 (Q74J71) Hypothetical protein                             37   0.10
CLOAB Q97D33 (Q97D33) Acetyltransferase (With duplicated domains...    37   0.10
VIBCH Q9KU66 (Q9KU66) Ribosomal-protein-alanine acetyltransferase      37   0.18
STRR6 Q8DNF9 (Q8DNF9) Hypothetical protein spr1760                     37   0.18
SALTY Q8ZNU6 (Q8ZNU6) Putative ribose 5-phosphate isomerase (EC ...    37   0.18
SALCH Q57N66 (Q57N66) Putative ribose 5-phosphate isomerase            37   0.18
ECO57 ATDA_ECO57 (P0A952) Spermidine N(1)-acetyltransferase (EC ...    37   0.18
ECOLI ATDA_ECOLI (P0A951) Spermidine N(1)-acetyltransferase (EC ...    37   0.18
BACHD Q9KG16 (Q9KG16) BH0299 protein                                   37   0.18
AQUAE O66838 (O66838) Ribosomal-protein-alanine acetyltransferase      37   0.18
PROAC Q6ABU2 (Q6ABU2) Acetyltransferase (EC 2.3.1.-)                   36   0.23
BACAN Q81QL3 (Q81QL3) Acetyltransferase, GNAT family                   36   0.23
STRMU Q8DV67 (Q8DV67) Putative acetyltransferase                       36   0.30
STAAM COAD_STAAM (P63818) Phosphopantetheine adenylyltransferase...    36   0.30
LISMO Q8Y9B8 (Q8Y9B8) Lmo0614 protein                                  36   0.30
THEMA Q9WZ46 (Q9WZ46) Hypothetical protein                             35   0.39
STRR6 Q8DQU8 (Q8DQU8) Hypothetical protein spr0490                     35   0.39
CLOAB Q97FA0 (Q97FA0) Predicted acetyltransferase                      35   0.39
_BUCAP LOLC_BUCAP (Q8K9N8) Lipoprotein-releasing system transmem...    35   0.39
BACC1 Q734M2 (Q734M2) Acetyltransferase, GNAT family                   35   0.39
BACAN Q81U09 (Q81U09) Acetyltransferase, GNAT family                   35   0.39
YERPE Q8ZHB3 (Q8ZHB3) Putative siderophore biosynthesis protein ...    35   0.51
VIBUY Q7MFH7 (Q7MFH7) Histone acetyltransferase HPA2                   35   0.51
RICCN Q92JG6 (Q92JG6) Transcriptional activator protein czcR           35   0.51
CLOAB Q97G03 (Q97G03) Predicted acetyltransferase                      35   0.51
BACHD Q9KF00 (Q9KF00) Ribosomal-protein-alanine N-acetyltransferase    35   0.51
BACAN Q81NB6 (Q81NB6) Spermine/spermidine acetyltransferase, put...    35   0.51
BACAN Q81N07 (Q81N07) Acetyltransferase, GNAT family                   35   0.51
STRMU Q8DT36 (Q8DT36) Putative acetyltransferase                       35   0.67
PSEAE Q9I3W7 (Q9I3W7) Hypothetical protein                             35   0.67
NITEU Q82UT0 (Q82UT0) GCN5-related N-acetyltransferase (EC 2.3.1...    35   0.67
LACLA Q9CF66 (Q9CF66) Spermidine acetyltransferase (EC 2.3.1.57)       35   0.67
BACC1 Q739G7 (Q739G7) Acetyltransferase, GNAT family                   35   0.67
MYCPE Q8EWM2 (Q8EWM2) Hypothetical protein MYPE1810                    34   0.88
MYCPE Q8EWE6 (Q8EWE6) Acetyltransferase GNAT family                    34   0.88
LACLA Q9CHW9 (Q9CHW9) Hypothetical protein yfiL                        34   0.88
LACPL Q88SM7 (Q88SM7) Prophage Lp4 protein 8, DNA primase/helicase     34   0.88
ENTFA Q837C4 (Q837C4) Acetyltransferase, GNAT family                   34   0.88
CLOTE Q891D4 (Q891D4) Ribosomal-protein-alanine acetyltransferas...    34   0.88
CLOAB Q97I16 (Q97I16) Predicted acetyltransferase domain contain...    34   0.88
BACC1 Q72WY7 (Q72WY7) Hypothetical protein                             34   0.88
VIBPA Q87G30 (Q87G30) Putative acetyltransferase                       34   1.1
STAES ARGD2_STAES (Q8CSG1) Acetylornithine aminotransferase 2 (E...    34   1.1
RICPR Q9ZEC5 (Q9ZEC5) 190 KD ANTIGEN (Sca1)                            34   1.1
PSEPK Q88NL5 (Q88NL5) Acetyltransferase, GNAT family                   34   1.1
LACPL Q88YF8 (Q88YF8) Acetyltransferase (Putative)                     34   1.1
BACC1 Q737S7 (Q737S7) Acetyltransferase, GNAT family                   34   1.1
BACC1 Q734H0 (Q734H0) Acetyltransferase, GNAT family family            34   1.1
BACAN Q81Q67 (Q81Q67) Acetyltransferase, GNAT family                   34   1.1
BACAN Q81P77 (Q81P77) Acetyltransferase, GNAT family                   34   1.1
Q8E1I7 (Q8E1I7) Acetyltransferase, GNAT family                         33   1.5
Q8DXE6 (Q8DXE6) Hypothetical protein SAG1905                           33   1.5
OCEIH Q8ET96 (Q8ET96) Hypothetical conserved protein                   33   1.5
LISIN Q929M8 (Q929M8) Lin2246 protein                                  33   1.5
CLOTE Q895L4 (Q895L4) DNA topoisomerase I (EC 5.99.1.2)                33   1.5
CLOPE Q8XIF5 (Q8XIF5) Ribosomal-protein-alanine N-acetyltransferase    33   1.5
BRUSU Q8FVN1 (Q8FVN1) Acyl-CoA dehydrogenase family protein            33   1.5
BRUME Q8YCN6 (Q8YCN6) ACYL-COA DEHYDROGENASE (EC 1.3.99.-)             33   1.5
BACSU YQAR_BACSU (P45914) Hypothetical protein yqaR                    33   1.5
BACAN Q81RF9 (Q81RF9) Acetyltransferase, GNAT family                   33   1.5
VIBPA Q87NP1 (Q87NP1) Putative spermine/spermidine acetyltransfe...    33   2.0
THETN Q8RC65 (Q8RC65) Acetyltransferases                               33   2.0
STRR6 Q8CYV9 (Q8CYV9) Hypothetical protein spr0850                     33   2.0
STAES Q8CU70 (Q8CU70) Spermidine acetyltransferase                     33   2.0
STAES Q8CS08 (Q8CS08) Hypothetical protein SE1483                      33   2.0
RICPR Q9ZDP9 (Q9ZDP9) Hypothetical protein RP278                       33   2.0
OCEIH Q8CXG0 (Q8CXG0) Diamine N-acetyltransferase (Spermine:sper...    33   2.0
CLOAB Q97J70 (Q97J70) Predicted acetyltransferase                      33   2.0
BURMA Q9AI54 (Q9AI54) DedA family protein                              33   2.0
BRAJA Q89YE3 (Q89YE3) Bll0009 protein                                  33   2.0
BACAN Q81NK7 (Q81NK7) Streptothricin acetyltransferase, putative       33   2.0
VIBUY Q7MI31 (Q7MI31) Ribosomal-protein-alanine acetyltransferase      33   2.6
OCEIH Q8EMB8 (Q8EMB8) Glucose-6-phosphate 1-dehydrogenase (EC 1....    33   2.6
OCEIH Q8EKW6 (Q8EKW6) Acetyltransferase                                33   2.6
MYCPE Q8EWC2 (Q8EWC2) Oligoendopeptidase F                             33   2.6
LISMO Q8Y8S4 (Q8Y8S4) Lmo0820 protein                                  33   2.6
CLOPE Q8XKM5 (Q8XKM5) Probable acetyltransferase                       33   2.6
BUCAI SPED_BUCAI (P57304) S-adenosylmethionine decarboxylase pro...    33   2.6
AQUAE O67458 (O67458) Hypothetical protein aq_1482                     33   2.6
YERPE Q8CZN5 (Q8CZN5) Acyltransferase for 30S ribosomal subunit ...    32   3.3
STRAW Q827N9 (Q827N9) Putative acetyltransferase                       32   3.3
STAES Q8CU81 (Q8CU81) Spermidine acetyltransferase                     32   3.3
RICPR Q9ZCN0 (Q9ZCN0) RIBOSOMAL-PROTEIN-ALANINE ACETYLTRANSFERAS...    32   3.3
OCEIH Q8ERM1 (Q8ERM1) Acetyltransferase                                32   3.3
MYCGE RIBF_MYCGE (P47391) Putative riboflavin biosynthesis prote...    32   3.3
ENTFA Q836M4 (Q836M4) Spermine/spermidine acetyltransferase, put...    32   3.3
CHRVO Q7NRZ7 (Q7NRZ7) Probable acetyltransferase (EC 2.3.1.-)          32   3.3
CHLCV Q824H9 (Q824H9) Acetyltransferase, GNAT family                   32   3.3
CHLMU SYR_CHLMU (Q9PJT8) Arginyl-tRNA synthetase (EC 6.1.1.19) (...    32   3.3
BACSU O34376 (O34376) Putative acetyl transferase (YobR protein)       32   3.3
BACC1 Q738E9 (Q738E9) Acetyltransferase, GNAT family                   32   3.3
BACC1 Q736C5 (Q736C5) Acetyltransferase, GNAT family                   32   3.3
BACC1 Q735P8 (Q735P8) Acetyltransferase, GNAT family                   32   3.3
AQUAE YZ34_AQUAE (O66423) Hypothetical protein AA34                    32   3.3
YERPE Q8ZCX6 (Q8ZCX6) Spermidine acetyltransferase (EC 2.3.1.57)...    32   4.4
STRR6 Q8DNN2 (Q8DNN2) Hypothetical protein spr1627                     32   4.4
STAAM Q99RQ8 (Q99RQ8) Similar to transcription repressor of spor...    32   4.4
LACLA Q9CJA2 (Q9CJA2) Acetyl transferase                               32   4.4
CLOTE Q892J2 (Q892J2) Conserved protein                                32   4.4
BRAJA Q89UA8 (Q89UA8) Hypothetical acetyltransferase                   32   4.4
BACSU O34558 (O34558) YopR protein                                     32   4.4
BACAN Q81R63 (Q81R63) Hypothetical protein                             32   4.4
VIBPA Q87MD0 (Q87MD0) Acetyltransferase-related protein                32   5.7
STRR6 Q8DND0 (Q8DND0) Transcriptional activator                        32   5.7
OCEIH Q8CUS9 (Q8CUS9) Hypothetical conserved protein                   32   5.7
LISIN Q92E28 (Q92E28) Lin0633 protein                                  32   5.7
LACPL Q88U49 (Q88U49) Glucose-6-phosphate 1-dehydrogenase (EC 1....    32   5.7
CORDI Q6NHQ0 (Q6NHQ0) Putative acetyltransferase                       32   5.7
BURMA Q62J98 (Q62J98) Ribosomal-protein-alanine acetyltransferas...    32   5.7
THETN Q8R764 (Q8R764) LysM-repeat proteins and domains                 31   7.4
STRR6 Q8DPX8 (Q8DPX8) Hypothetical protein spr0952                     31   7.4
STRMU Q8DVT1 (Q8DVT1) Putative ribosomal-protein-alanine acetylt...    31   7.4
SALTI SPED_SALTI (Q8Z9E3) S-adenosylmethionine decarboxylase pro...    31   7.4
SALTY SPED_SALTY (Q8ZRS4) S-adenosylmethionine decarboxylase pro...    31   7.4
SALCH Q57T90 (Q57T90) S-adenosylmethionine decarboxylase, proenzyme    31   7.4
RICCN Q92JP8 (Q92JP8) Cell surface antigen                             31   7.4
NEIGI Q5FAH1 (Q5FAH1) Hypothetical protein                             31   7.4
LISIN Q92DJ7 (Q92DJ7) Lin0816 protein                                  31   7.4
LACJO Q74J74 (Q74J74) Hypothetical protein                             31   7.4
GEOSL Q74A59 (Q74A59) Sensory box histidine kinase                     31   7.4
ENTFA Q836Z2 (Q836Z2) Acetyltransferase, GNAT family                   31   7.4
ENTFA Q82YJ4 (Q82YJ4) Toxin ABC transporter, ATP-binding/permeas...    31   7.4
CLOAB Q97M70 (Q97M70) Predicted metal-dependent hydrolase              31   7.4
CLOAB Q97HI5 (Q97HI5) Predicted flavodoxin                             31   7.4
BRAJA Q89R32 (Q89R32) Acyl-CoA dehydrogenase                           31   7.4
BACC1 Q735F5 (Q735F5) Streptothricin acetyltransferase, putative       31   7.4
BACAN Q81YS9 (Q81YS9) Acetyltransferase, GNAT family                   31   7.4
VIBPA Q87HD3 (Q87HD3) Hypothetical protein VPA1032                     31   9.7
VIBCH Q9KL03 (Q9KL03) Spermidine n1-acetyltransferase                  31   9.7
THEMA SYE2_THEMA (Q9X2I8) Glutamyl-tRNA synthetase 2 (EC 6.1.1.1...    31   9.7
THETN ISPH_THETN (Q8RA76) 4-hydroxy-3-methylbut-2-enyl diphospha...    31   9.7
STRCO Q9S2L5 (Q9S2L5) Hypothetical protein SCO1988                     31   9.7
STRP1 Q99XX8 (Q99XX8) Putative pullulanase                             31   9.7
STRP1 Q99YF0 (Q99YF0) Hypothetical protein SPy1734 (Acetyltransf...    31   9.7
STAES Q8CTP7 (Q8CTP7) Hypothetical protein SE0368                      31   9.7
STAES COAD_STAES (Q8CSZ5) Phosphopantetheine adenylyltransferase...    31   9.7
MYCLEH Y3433_MYCTU (O06250) Hypothetical protein Rv3433c/MT3539        31   9.7
MYCTU Y3433_MYCTU (O06250) Hypothetical protein Rv3433c/MT3539         31   9.7
MYCBO Q7TWI6 (Q7TWI6) Hypothetical protein Mb3463c                     31   9.7
LISMO Q8Y5C4 (Q8Y5C4) Lmo2141 protein                                  31   9.7
LISIN Q929Z8 (Q929Z8) Lin2125 protein                                  31   9.7
ENTFA Q837Z5 (Q837Z5) Acetyltransferase, GNAT family                   31   9.7
ENTFA Q831M9 (Q831M9) Ribosomal-protein-alanine acetyltransferas...    31   9.7
CLOTE Q895T0 (Q895T0) Acetyltransferase (EC 2.3.1.-)                   31   9.7
CLOPE Q8XMF9 (Q8XMF9) Hypothetical protein CPE0730                     31   9.7
BRUSU Q8FY73 (Q8FY73) Outer membrane autotransporter                   31   9.7
BACSU YCBJ_BACSU (P42242) Hypothetical protein ycbJ                    31   9.7
BACHD Q9KE57 (Q9KE57) BH1001 protein                                   31   9.7
BACC1 Q73CD9 (Q73CD9) Glycerol-3-phosphate dehydrogenase, aerobi...    31   9.7
BACAN Q81VL8 (Q81VL8) Oxidoreductase, FAD-binding                      31   9.7
BACAN Q81U57 (Q81U57) Glycerol-3-phosphate dehydrogenase, aerobic      31   9.7
BACAN Q81NU7 (Q81NU7) Acetyltransferase, GNAT family                   31   9.7

>STAAM AACA_STAAM (P0A0C0) Bifunctional AAC/APH [Includes:
           6'-aminoglycoside N-acetyltransferase (EC 2.3.1.-)
           (AAC(6')); 2''-aminoglycoside phosphotransferase (EC
           2.7.1.-) (APH(2''))]
          Length = 479

 Score =  959 bits (2480), Expect = 0.0
 Identities = 467/479 (97%), Positives = 467/479 (97%)

Query: 1   MNIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFR 60
           MNIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFR
Sbjct: 1   MNIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFR 60

Query: 61  VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120
           VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI
Sbjct: 61  VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120

Query: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYD 180
           FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYD
Sbjct: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYD 180

Query: 181 DNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE 240
           DNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE
Sbjct: 181 DNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE 240

Query: 241 KAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKR 300
           KAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKR
Sbjct: 241 KAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKR 300

Query: 301 DIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNA 360
           DIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNA
Sbjct: 301 DIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNA 360

Query: 361 TTVFEGKKCLCHNDFSCNHLLLDGNNRLTXXXXXXXXXXXXEYCDFIYLLEDSEEEIGTN 420
           TTVFEGKKCLCHNDFSCNHLLLDGNNRLT            EYCDFIYLLEDSEEEIGTN
Sbjct: 361 TTVFEGKKCLCHNDFSCNHLLLDGNNRLTGIIDFGDSGIIDEYCDFIYLLEDSEEEIGTN 420

Query: 421 FGEDILRMYGNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENGRKEIYKRTYKD 479
           FGEDILRMYGNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENGRKEIYKRTYKD
Sbjct: 421 FGEDILRMYGNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENGRKEIYKRTYKD 479


>ENTFA AACA_ENTFA (P0A0C2) Bifunctional AAC/APH [Includes:
           6'-aminoglycoside N-acetyltransferase (EC 2.3.1.-)
           (AAC(6')); 2''-aminoglycoside phosphotransferase (EC
           2.7.1.-) (APH(2''))]
          Length = 479

 Score =  959 bits (2480), Expect = 0.0
 Identities = 467/479 (97%), Positives = 467/479 (97%)

Query: 1   MNIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFR 60
           MNIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFR
Sbjct: 1   MNIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFR 60

Query: 61  VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120
           VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI
Sbjct: 61  VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120

Query: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYD 180
           FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYD
Sbjct: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYD 180

Query: 181 DNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE 240
           DNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE
Sbjct: 181 DNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE 240

Query: 241 KAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKR 300
           KAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKR
Sbjct: 241 KAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKR 300

Query: 301 DIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNA 360
           DIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNA
Sbjct: 301 DIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNA 360

Query: 361 TTVFEGKKCLCHNDFSCNHLLLDGNNRLTXXXXXXXXXXXXEYCDFIYLLEDSEEEIGTN 420
           TTVFEGKKCLCHNDFSCNHLLLDGNNRLT            EYCDFIYLLEDSEEEIGTN
Sbjct: 361 TTVFEGKKCLCHNDFSCNHLLLDGNNRLTGIIDFGDSGIIDEYCDFIYLLEDSEEEIGTN 420

Query: 421 FGEDILRMYGNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENGRKEIYKRTYKD 479
           FGEDILRMYGNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENGRKEIYKRTYKD
Sbjct: 421 FGEDILRMYGNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENGRKEIYKRTYKD 479


>BACAN Q81P54 (Q81P54) Acetyltransferase, GNAT family
          Length = 177

 Score =  168 bits (425), Expect = 4e-41
 Identities = 76/174 (43%), Positives = 116/174 (66%), Gaps = 1/174 (0%)

Query: 5   ENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRVIIE 64
           ++ + +R ++++D P++ KWLTD  VL++Y GRD   ++E +  H+         R +IE
Sbjct: 5   KDNVSVRYVVEEDAPIISKWLTDPEVLQYYEGRDDPQSVEMVLNHFIHNPNSPEKRCLIE 64

Query: 65  YNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFL 124
           +++VPIGY Q+Y +  E  T Y Y ++   V+GMDQFIGEP YW KGIGT+++K    ++
Sbjct: 65  FDDVPIGYIQMYPVDSESKTLYGYEESQN-VWGMDQFIGEPTYWGKGIGTKFVKAAITYI 123

Query: 125 KKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYR 178
             E  A A+ +DP  NN RAI+ Y+K GF+ ++ L EHELHEG  EDC++MEY+
Sbjct: 124 LSEMGAEAIAMDPKVNNERAIKCYEKCGFKKVKILKEHELHEGVLEDCWMMEYK 177


>BACC1 Q735Z9 (Q735Z9) Acetyltransferase, GNAT family
          Length = 359

 Score =  159 bits (403), Expect = 1e-38
 Identities = 74/185 (40%), Positives = 118/185 (63%), Gaps = 1/185 (0%)

Query: 5   ENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRVIIE 64
           ++ + +R + ++D P++ KWLT+  VL++Y GRD   +++ +  H+         R +IE
Sbjct: 5   KDNVSVRYVKEEDAPIISKWLTEPEVLQYYEGRDNPQSVDMVLDHFIHNPNSHEKRCLIE 64

Query: 65  YNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFL 124
           +++VPIGY Q+Y +  E  T Y Y ++   V+GMDQFIGEP YW KGIGT+ ++    ++
Sbjct: 65  FDDVPIGYIQMYPVDSEWKTLYGYEESQH-VWGMDQFIGEPTYWGKGIGTKLVQTAITYI 123

Query: 125 KKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYDDNAT 184
            +   A A+ +DP  NN RAI+ Y+K GF+ ++ L EHELHEG  EDC++MEY+  +
Sbjct: 124 MENTGAEAIAMDPKVNNERAIKCYEKCGFKKVKVLKEHELHEGVLEDCWMMEYKQRELRE 183

Query: 185 NVKAM 189
             KA+
Sbjct: 184 MKKAL 188


>BACAN Q81T90 (Q81T90) Aminoglycoside phophotransferase family
           protein
          Length = 300

 Score = 67.0 bits (162), Expect = 1e-10
 Identities = 51/208 (24%), Positives = 95/208 (45%), Gaps = 12/208 (5%)

Query: 190 KYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNT 249
           K  I+    N  + S +    G+D+VA +VN+E +F+             EK +   L
Sbjct: 5   KQYIKEALPNLSIHSYKQNEEGWDNVAVIVNDELLFRFPRKQEYAMRIPLEKELCTILTQ 64

Query: 250 NLETNVKIP--NIEYSYISDELSILGYKE-IKGTFLTPEIYSTMSEEEQNLLKRDIASFL 306
           +L+  +++P  ++ Y   SDE+ +  Y   I G  L  EI + + E+E+ ++   +A+FL
Sbjct: 65  SLQ-EIEVPQYHLIYKNESDEVPLCSYYTLIHGEPLKTEIVANLDEKERKIIITQLATFL 123

Query: 307 RQMHGLDYTDISECTIDNKQNVL---EEYILLRETIYNDLTDIEKDYIESFMERL---NA 360
             +H +    ++      ++ +    E    L E + N LT  +K  +    E      A
Sbjct: 124 AALHSIPLKSVTALGFPTEKTLTYWKELQTKLNEYVTNSLTSFQKSTLNRLFENFFACIA 183

Query: 361 TTVFEGKKCLCHNDFSCNHLLLDGNNRL 388
           T+ F     + H DF+ +H+L D  N++
Sbjct: 184 TSAF--PNAIIHADFTHHHILFDKQNKI 209


>BACC1 Q73BC3 (Q73BC3) Aminoglycoside phophotransferase family
           protein
          Length = 300

 Score = 62.0 bits (149), Expect = 4e-09
 Identities = 51/206 (24%), Positives = 92/206 (44%), Gaps = 10/206 (4%)

Query: 190 KYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNT 249
           K  I+    N  + S +    G+D+VA +VN+E +F+             EK +   L+
Sbjct: 5   KQYIKEALPNLSIHSYKQNEEGWDNVAIIVNDELLFRFPRKQEYAMRIPLEKELCTLLSC 64

Query: 250 NL-ETNVKIPNIEYSYISDELSILGYKE-IKGTFLTPEIYSTMSEEEQNLLKRDIASFLR 307
           +L E  V   ++ Y   +D + +  Y   I G  L  EI +T+ ++E+  L   +A+FL
Sbjct: 65  SLHEIEVPKYHLFYEKNTDAIPLCSYYTLIHGEPLKTEIVTTLEKQERKALITQLATFLA 124

Query: 308 QMHGLDYTDISECTIDNKQNVL---EEYILLRETIYNDLTDIEKDYIESFMERL---NAT 361
            +H +    ++      ++ +    E    L E + N LT  +K  +    E      AT
Sbjct: 125 ALHSIPLKSVTALGFPIEKTLTYWKELQAKLNEYVTNSLTSFQKSTLNRLFENFFACLAT 184

Query: 362 TVFEGKKCLCHNDFSCNHLLLDGNNR 387
           + F+    + H DF+ +H+L D  N+
Sbjct: 185 SKFQ--NTIIHADFTHHHILFDKQNK 208


>BRAJA Q89WN0 (Q89WN0) Bll0648 protein
          Length = 161

 Score = 59.3 bits (142), Expect = 3e-08
 Identities = 44/145 (30%), Positives = 75/145 (51%), Gaps = 13/145 (8%)

Query: 11  RTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRVIIEYNNVPI 70
           R +   D PL+ +WL +  V E++G   +++ L S      EP  D+    I+   + P
Sbjct: 8   RPMTAADLPLIRRWLGEAHVREWWGDPGEQFALVS--GDLDEPAMDQF---IVLAGDKPF 62

Query: 71  GYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKE--R 128
           GY Q Y++    +     P+      G+DQFIGE +  ++G G+ +I+   +F+ ++
Sbjct: 63  GYLQCYRL--TAWNTGFGPQPGG-TRGIDQFIGESDMIARGHGSAFIR---QFVDEQLRH 116

Query: 129 NANAVILDPHKNNPRAIRAYQKSGF 153
               V+ DP   N RA+RAY+K+GF
Sbjct: 117 GLPRVVTDPDPLNSRAVRAYEKAGF 141


>BACHD Q9K9M4 (Q9K9M4) BH2621 protein
          Length = 197

 Score = 56.2 bits (134), Expect = 2e-07
 Identities = 35/159 (22%), Positives = 78/159 (49%), Gaps = 6/159 (3%)

Query: 2   NIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRV 61
           ++V  ++  R +  DD  ++  W+ +E V+ ++        L   KKH      D+   +
Sbjct: 15  HVVNKKLSFRHVTMDDVDMLHSWMHEEHVIPYW---KLNIPLVDYKKHLQTFLNDDHQTL 71

Query: 62  II-EYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120
           ++   N VP+ Y + Y + +++  +Y YP  +E   G+   IG   Y  +G+    +  I
Sbjct: 72  MVGAINGVPMSYWESYWVKEDIIANY-YP-FEEHDQGIHLLIGPQEYLGQGLIYPLLLAI 129

Query: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
            +   +E + N ++ +P + N + I  ++K GF+ ++++
Sbjct: 130 MQQKFQEPDTNTIVAEPDRRNKKMIHVFKKCGFQPVKEV 168


>BACC1 Q739G2 (Q739G2) 6'-aminoglycoside
           N-acetyltransferase/2''-aminoglycoside
           phosphotransferase, putative (EC 2.3.1.-)
          Length = 293

 Score = 55.1 bits (131), Expect = 5e-07
 Identities = 57/289 (19%), Positives = 125/289 (43%), Gaps = 24/289 (8%)

Query: 193 IEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLE 252
           ++  +   +++S+ I   G ++   +VN+  +F+       +KG  K +     L
Sbjct: 11  LQRLYPELQINSVYINEIGQNNDVLIVNDNIVFRFP---KYEKGIQKLRIETQLLEKIRP 67

Query: 253 -TNVKIPNIEYSYISDELS---ILGYKEIKGTFLTPEIYSTMSEEEQ-NLLKRDIASFLR 307
              ++IPN  Y    +E+      GY+ I+G      +++ +++E+Q   L   +A FL+
Sbjct: 68  FITLQIPNPSYQGFQNEVPGKVFAGYEMIEGDPFWKNVFTEINDEKQLQKLAYTLARFLK 127

Query: 308 QMHGLD---YTDISEC-TIDNKQNVLEEYILLRETIYNDLTDI-EKDYIESFMERLNATT 362
           ++H +    +  I +C + D    +   Y  L+E +Y  + ++  K+   SF   LN ++
Sbjct: 128 ELHEIPLSTFESIMQCDSTDMYSEINSLYSQLKEHVYPFMRNVARKEVSTSFELYLNESS 187

Query: 363 VFEGKKCLCHNDFSCNHLLLDGNNR-LTXXXXXXXXXXXXEYCDFIYLLEDSEEEIGTNF 421
            F     L H DF   ++L     + ++               DF  +L         ++
Sbjct: 188 HFNFTPSLVHGDFGMTNILYSATKKNISGVIDFGGASIGDPAYDFAGIL--------ASY 239

Query: 422 GEDILRMYGNI--DIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENG 468
           GE+ L+++     ++E  KE     +  + ++  ++G+ N  ++  E G
Sbjct: 240 GEEFLQLFEAYYPNLEAVKERMYFYKSTFALQEALFGVLNNDKKAFEAG 288


>THEMA Q9X063 (Q9X063) Hypothetical protein
          Length = 182

 Score = 52.4 bits (124), Expect = 3e-06
 Identities = 27/75 (36%), Positives = 41/75 (54%), Gaps = 1/75 (1%)

Query: 101 FIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLP 160
           F+G P YWS+G GT  ++++  F+  E N N + L     N RA R Y+K GF++   L
Sbjct: 94  FLGRP-YWSQGYGTDAMRVLVRFIFNEMNMNKIKLHVFSFNERAKRVYEKIGFKVEGILR 152

Query: 161 EHELHEGKKEDCYLM 175
           +    EG+  D  +M
Sbjct: 153 QELFREGRYHDVIVM 167


>CLOTE Q896X4 (Q896X4) Putative acetyltransferase
          Length = 186

 Score = 48.9 bits (115), Expect = 3e-05
 Identities = 44/173 (25%), Positives = 73/173 (42%), Gaps = 15/173 (8%)

Query: 6   NEICIRTLIDDDFPLMLKWLTDE---RVLEFYGGRDK-KYTLESLKKHYTEPWEDEVFRV 61
           + I I  L ++D   + KW  D    RV +F     K  + +            +  F +
Sbjct: 10  DRIKITALREEDIETITKWYEDTNFLRVFDFNPSAPKTSWKIREWLMEEVSSSNNYFFAI 69

Query: 62  IIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIF 121
             +  N  +GY +I K+             +  V G+   IG+ + W KG G+  + L
Sbjct: 70  RKKDANKILGYVEIEKI-----------NWNNGVGGIAIGIGDSSEWGKGYGSEALSLAM 118

Query: 122 EFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYL 174
           +F  +E N + + L     N RAI++Y+K GF+      E    +GK+ D YL
Sbjct: 119 DFAFRELNLHRLQLITISYNERAIKSYEKLGFKKEGIYREAVNRDGKRYDIYL 171


>BACHD Q9KB15 (Q9KB15) BH2121 protein
          Length = 181

 Score = 48.1 bits (113), Expect = 6e-05
 Identities = 28/78 (35%), Positives = 36/78 (46%), Gaps = 13/78 (16%)

Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161
           IGE  YW KG G   ++L+  +   E N + V L     N +AIR Y+K GF+
Sbjct: 95  IGEKTYWGKGYGFEALRLLLNYAFLEMNLHRVSLRVFSFNKKAIRLYEKLGFK------- 147

Query: 162 HELHEGKKEDCYLMEYRY 179
              HEG    C    YRY
Sbjct: 148 ---HEGTSRQCL---YRY 159


>STAES Q8CQT6 (Q8CQT6) Spermidine acetyltransferase
          Length = 177

 Score = 47.0 bits (110), Expect = 1e-04
 Identities = 42/169 (24%), Positives = 71/169 (42%), Gaps = 14/169 (8%)

Query: 8   ICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFR-VIIEYN 66
           + IR L   D    +  L +E  +  Y   +   +L  L+  YT+   DE  R  I+E
Sbjct: 3   LIIRALEKTDLSF-IHHLNNEYSIMSYWFEEPYQSLSELENLYTKHILDETERRFIVEEG 61

Query: 67  NVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKK 126
           +  +G  ++ ++           +T E++  +D     P Y + G   +  K+  ++
Sbjct: 62  STSVGVVELLEIN-------FIHRTCEVLIIID-----PQYANNGYAKKAFKMAIDYAFL 109

Query: 127 ERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLM 175
             N N V L     N +A+  YQ + F I   L EH    G+  DCY+M
Sbjct: 110 VLNMNKVYLYVDIKNEKAVHIYQSNNFEIEGTLKEHFYTRGEYRDCYVM 158


>VIBCH Q9KMC2 (Q9KMC2) Acetyltransferase, putative
          Length = 158

 Score = 45.1 bits (105), Expect = 5e-04
 Identities = 37/166 (22%), Positives = 69/166 (41%), Gaps = 18/166 (10%)

Query: 15  DDDFPLMLKWLTDERVLEFYGGRDKKY--TLESLKKHYTEPWEDEVFRVIIEYNNVPIGY 72
           + DF L++KW+  + +   +GG    +  T E +  H ++    EVF  +++      G+
Sbjct: 8   ESDFDLLIKWIDSDELNYLWGGPAYVFPLTYEQIHSHCSKA---EVFPYLLKVKGRHAGF 64

Query: 73  GQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANA 132
            ++YK+ DE Y                 FI    Y  +G+    + L+ +  + + +A
Sbjct: 65  VELYKVTDEQYRICRV------------FISNA-YRGQGLSKSMLMLLIDKARLDFSATK 111

Query: 133 VILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYR 178
           + L   + N  A + Y+  GF ++          GK  D   ME R
Sbjct: 112 LSLGVFEQNTVARKCYESLGFEVVSTEIGTRAFNGKLWDLVRMEKR 157


>BACSU BLTD_BACSU (P39909) Spermine/spermidine acetyltransferase (EC
           2.3.1.57)
          Length = 152

 Score = 44.7 bits (104), Expect = 6e-04
 Identities = 40/153 (26%), Positives = 69/153 (45%), Gaps = 16/153 (10%)

Query: 8   ICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKK-HYTEPWEDEVFRVIIEYN 66
           I I+ + DD+   +L     +  L +      K  LE  K+ HY +P       V + Y
Sbjct: 3   INIKAVTDDNRAAILDLHVSQNQLSYI--ESTKVCLEDAKECHYYKP-------VGLYYE 53

Query: 67  NVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKK 126
              +G+     MY  L+ +Y     +  V+ +D+F  +  Y  KG+G + +K + + L +
Sbjct: 54  GDLVGFA----MYG-LFPEYDEDNKNGRVW-LDRFFIDERYQGKGLGKKMLKALIQHLAE 107

Query: 127 ERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
                 + L   +NN  AIR YQ+ GF+   +L
Sbjct: 108 LYKCKRIYLSIFENNIHAIRLYQRFGFQFNGEL 140


>BACC1 Q739N8 (Q739N8) Spermidine acetyltransferase, putative
          Length = 156

 Score = 43.9 bits (102), Expect = 0.001
 Identities = 18/64 (28%), Positives = 36/64 (56%)

Query: 98  MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE 157
           +D+F+ +  Y  KG   R+++L+ +FL+ +     + L  H +N  A+  Y+  GFR+
Sbjct: 74  LDRFMIDQQYQGKGYAKRFLRLLIQFLQNKFECKTIYLSLHPDNKLAMGLYESFGFRLNG 133

Query: 158 DLPE 161
           D+ +
Sbjct: 134 DIDD 137


>LACPL Q88SW8 (Q88SW8) Acetyltransferase (Putative)
          Length = 180

 Score = 43.5 bits (101), Expect = 0.001
 Identities = 26/74 (35%), Positives = 33/74 (44%)

Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161
           IG+P+    G GT  + LI  +   E N   V LD    NP AI  YQ SGF
Sbjct: 93  IGDPDERGHGYGTETLSLILNYAFNELNLYKVCLDVIATNPAAIAVYQNSGFEFEGTNKR 152

Query: 162 HELHEGKKEDCYLM 175
               +G++ D Y M
Sbjct: 153 AIKRDGQRIDLYHM 166


>VIBUY Q7MK89 (Q7MK89) Putative acetyltransferase
          Length = 158

 Score = 43.1 bits (100), Expect = 0.002
 Identities = 30/140 (21%), Positives = 66/140 (47%), Gaps = 14/140 (10%)

Query: 17  DFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRVIIEYNNVPIGYGQIY 76
           DF L+++W+  + +   +GG    + L S ++      ++EVF  +++ N    G+ ++Y
Sbjct: 10  DFHLLIEWIDSDELNYLWGGPAYTFPLTS-EQIIAHCAKEEVFPYLLKVNGQNAGFVELY 68

Query: 77  KMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILD 136
           K+ +E Y                 FI   +Y  +G+    I L+ + ++ + +A  + L
Sbjct: 69  KVTNEHYRICRV------------FISN-SYRGQGLSKSMIMLLIDKVRSDFSATMLSLG 115

Query: 137 PHKNNPRAIRAYQKSGFRII 156
             ++N  A + Y+  GF ++
Sbjct: 116 VFEHNTVARKCYESLGFNVV 135


>DEIRA Q9RW71 (Q9RW71) Acetyltransferase, putative
          Length = 207

 Score = 43.1 bits (100), Expect = 0.002
 Identities = 21/70 (30%), Positives = 38/70 (54%)

Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161
           I +P +W  G G + ++L  +    E +A+ + L     N R +RA Q++G+R    +PE
Sbjct: 107 IYDPAHWGGGFGRQALRLWTDATFAETDAHLITLTTWSGNERMVRAAQRAGYRECARIPE 166

Query: 162 HELHEGKKED 171
             L +G++ D
Sbjct: 167 ARLWQGQRWD 176


>BACAN Q81RM0 (Q81RM0) Spermidine acetyltransferase, putative
          Length = 156

 Score = 43.1 bits (100), Expect = 0.002
 Identities = 18/64 (28%), Positives = 35/64 (54%)

Query: 98  MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE 157
           +D+F+ +  Y  KG   R+++L+ +FL+ +     + L  H  N  A+  Y+  GFR+
Sbjct: 74  LDRFMIDQQYQGKGYAKRFLRLLIQFLQHKFECKTIYLSLHPENKLAMGLYESFGFRLNG 133

Query: 158 DLPE 161
           D+ +
Sbjct: 134 DIDD 137


>LACJO Q74K74 (Q74K74) Hypothetical protein
          Length = 189

 Score = 42.4 bits (98), Expect = 0.003
 Identities = 41/162 (25%), Positives = 71/162 (43%), Gaps = 25/162 (15%)

Query: 17  DFPLM---LKWLTDERVLEFYGGRDKKYTLESLKKHYTEP-WEDEVFRVIIEYNNV--PI 70
           DFPL+   LK + DE  ++      +    + +K  +  P +     R+ +E +++  PI
Sbjct: 9   DFPLVYPILKQIFDEMDMDTIKALPESQFYDLMKHGFYSPHYRYSHNRMWVETDDLDRPI 68

Query: 71  GYGQIYKMYDELYTDYH----YPKT----DEIVYG----------MDQFIGEPNYWSKGI 112
           G   +Y   D+   D      YPK     D +++           +D     P +W KGI
Sbjct: 69  GLIVMYGYDDQGLIDISLKSAYPKVGLPLDAVIFSDKEALPHEWYLDAIAVSPKHWGKGI 128

Query: 113 GTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154
           G + IK I   + ++     + L+  ++NPRA R Y   GF+
Sbjct: 129 GQKLIK-IAPGIARQNGYKKISLNVDQDNPRAARLYDYMGFK 169


>BACC1 Q72YV7 (Q72YV7) Acetyltransferase, GNAT family
          Length = 174

 Score = 42.4 bits (98), Expect = 0.003
 Identities = 35/152 (23%), Positives = 63/152 (41%), Gaps = 14/152 (9%)

Query: 6   NEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKK---HYTEPWEDEVFRVI 62
           N I +R   +DD     KW  D  V+        KY+ +  +K    +      + + +
Sbjct: 5   NRIQLRKFSEDDILTYYKWHNDIDVMSSTTLNLDKYSFQDTEKLCQQFIHSPNAKSYIIE 64

Query: 63  IEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFE 122
            +  N+PIG   +      ++ D +    + I+      IG+ +YW +G G     L+
Sbjct: 65  EKATNLPIGITSL------IHIDSYNRNAECIID-----IGKKDYWGQGYGKEAFTLLLN 113

Query: 123 FLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154
           +   E N + + L     N RAI+ Y+  GF+
Sbjct: 114 YAFLELNLHRLSLRVFSFNDRAIKLYKSLGFQ 145


>BACC1 Q734R6 (Q734R6) Acetyltransferase, GNAT family
          Length = 176

 Score = 42.0 bits (97), Expect = 0.004
 Identities = 37/146 (25%), Positives = 63/146 (43%), Gaps = 16/146 (10%)

Query: 15  DDDFPLMLKWLTDERVLEFYGGRDKKYTLES--LKKHYTEPWEDE----VFRVIIEYNNV 68
           ++DF  ++ W+ +      +GG    + L +  LK +     +D     VF+ I E N+
Sbjct: 9   EEDFQQLIDWIPNAEFSLQWGGPAFTFPLTNAQLKNYLQNANKDNAIKYVFKAIDETNSE 68

Query: 69  PIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKER 128
            IG+  +  +           KT+E        IG  N   KG GT+ +  + +F  +E
Sbjct: 69  VIGHISLGNV----------DKTNESARIGKVLIGSTNSRGKGYGTQMMTAVLKFAFEEL 118

Query: 129 NANAVILDPHKNNPRAIRAYQKSGFR 154
             + V L     N  AI+ Y+K GF+
Sbjct: 119 KLHKVTLGVFDFNESAIKCYKKVGFQ 144


>CLOPE Q8XKH9 (Q8XKH9) Hypothetical protein CPE1416
          Length = 193

 Score = 41.6 bits (96), Expect = 0.005
 Identities = 31/106 (29%), Positives = 45/106 (42%), Gaps = 5/106 (4%)

Query: 82  LYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNN 141
           LY    Y    EI Y +     E N+W KG+ +  IK I  F  +  + N +I     NN
Sbjct: 93  LYNIDFYSNNTEIGYTI-----EKNFWRKGVASECIKAIENFAFETLDMNRIIAMIDSNN 147

Query: 142 PRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYDDNATNVK 187
             +I+  +K GF     L EH  ++ K E   +  Y    +   VK
Sbjct: 148 ISSIKLSEKLGFHRDGILREHYYNKSKDEYINICVYSLIKSDIKVK 193


>BACAN Q81SQ8 (Q81SQ8) Acetyltransferase, GNAT family
          Length = 157

 Score = 41.2 bits (95), Expect = 0.007
 Identities = 33/126 (26%), Positives = 54/126 (42%), Gaps = 17/126 (13%)

Query: 34  YGGRDKKYTLESLKKHYTEPWEDEV---FRVIIEYNNVPIGYGQIYKMYDELYTDYHYPK 90
           Y G+   Y +E+ ++   E   DE        ++ N   IGY  + K+ D
Sbjct: 22  YEGKYSFYDIEADEEDLAEFLHDESRGDHTFSVKENGTLIGYFTVCKITDG--------- 72

Query: 91  TDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQK 150
           T +I  G+      PN    G G ++I  I  F K++   N + L     N RAI+ Y++
Sbjct: 73  TVDIGLGI-----RPNITGNGFGLQFINAILAFSKEKYGCNYITLSVATFNKRAIKVYKR 127

Query: 151 SGFRII 156
           +GF  +
Sbjct: 128 AGFEAV 133


>VIBCH Q9K330 (Q9K330) Acetyltransferase, putative
          Length = 178

 Score = 40.8 bits (94), Expect = 0.009
 Identities = 21/80 (26%), Positives = 40/80 (50%), Gaps = 10/80 (12%)

Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161
           IG+  +W KG+GT   +L+  +  +E   + + L  + +N  A++AY+ +G++
Sbjct: 95  IGDKAFWGKGLGTEVTRLVTNYGFRELGLHRIELTAYCDNVAAVKAYENAGYQ------- 147

Query: 162 HELHEGKKEDCYLMEYRYDD 181
              HEG K +      R+ D
Sbjct: 148 ---HEGIKRESGYRNGRFMD 164


>VIBUY Q7MK96 (Q7MK96) Putative acetyltransferase
          Length = 158

 Score = 40.4 bits (93), Expect = 0.012
 Identities = 35/166 (21%), Positives = 70/166 (42%), Gaps = 18/166 (10%)

Query: 15  DDDFPLMLKWLTDERVLEFYGGRDKKY--TLESLKKHYTEPWEDEVFRVIIEYNNVPIGY 72
           + +F  ++ W+  + +   +GG    +  T E +  H ++    EVF  +++ N    G+
Sbjct: 8   ESNFDQLIAWIDSDELNYLWGGPAYVFPLTYEQIHAHCSKA---EVFPYLLKVNGRHAGF 64

Query: 73  GQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANA 132
            ++YK+ DE Y           VY  + + G      +G+    + L+ +  + + +A
Sbjct: 65  VELYKVTDEQYRICR-------VYISNAYRG------RGLSKSMLMLLIDKARLDFSATK 111

Query: 133 VILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYR 178
           + L   + N  A + Y+  GF ++          GK  D   ME R
Sbjct: 112 LSLGVFEQNTVARKCYESLGFEVVSTEIGTRSFNGKLWDLVRMEKR 157


>WIGBR Q8D3I4 (Q8D3I4) Imp protein
          Length = 723

 Score = 40.0 bits (92), Expect = 0.016
 Identities = 60/261 (22%), Positives = 104/261 (39%), Gaps = 50/261 (19%)

Query: 57  EVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRY 116
           +++   + + N+PI Y   +K+Y E Y D  Y  + +I Y  +  +    Y+ K    +Y
Sbjct: 191 KIWNAKLNFKNIPIFYVPFFKVY-EKYNDIFY--SPKISYKNNNGLSLSFYYKKIFFDKY 247

Query: 117 IKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLME 176
               F F+ K  +   ++L    NN    + Y  S F            + KK + Y++
Sbjct: 248 ---FFYFIPKYNSDGTILL----NN----KIYYSSDF------------DKKKINLYIL- 283

Query: 177 YRYDDNATNVKAMKYLIEHYFDNFKVD---------SIEIIGSGYDSVAYLVNNEYI--F 225
             +D           L ++YF N K+D         +  I    +D     + NE +  F
Sbjct: 284 --FDIKKNKNNWFIDLKQNYFFNKKLDILYIYKKSNNFIIFNKMFDIEKNFLQNEILEKF 341

Query: 226 KTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPE 285
             K+  N  K   + K    F N N    +K P++ +SY  ++      K  K  F+
Sbjct: 342 NLKYFYNNWKLKLEYKKFIIFDNKNF-NYIKFPHVYFSYFDNK-----NKNFKFNFVGKF 395

Query: 286 IYSTMSEEEQNLLKRDIASFL 306
            Y    EE++ +L  +I  FL
Sbjct: 396 SY----EEDKKILHINIEPFL 412


>BACSU P94482 (P94482) YnaD
          Length = 170

 Score = 39.7 bits (91), Expect = 0.021
 Identities = 39/156 (25%), Positives = 66/156 (42%), Gaps = 17/156 (10%)

Query: 1   MNIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWED--EV 58
           M+I    + IR     D+  + ++ +D  V+++    +  +T E  K    +   D  E
Sbjct: 1   MHITTKRLLIREFEFKDWQAVYEYTSDSNVMKYIP--EGVFTEEDAKAFVNKNKGDNAEK 58

Query: 59  FRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIK 118
           F VI+   +  IG+   YK + E         T EI +     +  PNY +KG  +   +
Sbjct: 59  FPVILRDEDCLIGHIVFYKYFGE--------HTYEIGW-----VFNPNYQNKGYASEAAQ 105

Query: 119 LIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154
            I E+  KE N + +I      N  + R  +K G R
Sbjct: 106 AILEYGFKEMNLHRIIATCQPENIPSYRVMKKIGMR 141


>BACC1 Q73A91 (Q73A91) Acetyltransferase, GNAT family
          Length = 177

 Score = 39.7 bits (91), Expect = 0.021
 Identities = 23/85 (27%), Positives = 37/85 (43%), Gaps = 5/85 (5%)

Query: 87  HYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIR 146
           H  K  E+ Y    ++G+P YW  G GT   K +  +   E + N +      NNP + R
Sbjct: 86  HIHKRGELAY----WVGKP-YWGNGFGTEAAKTLLHYGFNELHLNKIFAAAFTNNPGSWR 140

Query: 147 AYQKSGFRIIEDLPEHELHEGKKED 171
             +K G +      +H +  G+  D
Sbjct: 141 IMEKIGMKHEGTFKQHVVKSGEPMD 165


>THETN Q8RC99 (Q8RC99) Acetyltransferases
          Length = 149

 Score = 39.3 bits (90), Expect = 0.027
 Identities = 35/149 (23%), Positives = 66/149 (44%), Gaps = 29/149 (19%)

Query: 43  LESLKKHYTEPWEDEVF-----------RVIIEYNNVPIGYGQIYKMYDELYTDYHYPKT 91
           +E  K  +T PW  E F            ++ E +   +GY   + + DE +       T
Sbjct: 18  MEIEKLSFTTPWSREAFVGEVTKNSCARYIVAEVDKKVVGYAGFWVVLDEGHI------T 71

Query: 92  DEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKS 151
           +  V+        P Y  KGIG+R ++ + + L K+    ++ L+  ++N  A   Y+K
Sbjct: 72  NIAVH--------PEYRGKGIGSRLMEGLID-LAKKNGITSMTLEVRESNLVAQNLYKKF 122

Query: 152 GFRIIEDLPEHELHEGKKEDCYLMEYRYD 180
           GF+++        ++   ED  +M ++YD
Sbjct: 123 GFKVLG--RREGYYQDNNEDAIVM-WKYD 148


>STRAW Q82IB6 (Q82IB6) Putative acetyltransferase
          Length = 168

 Score = 39.3 bits (90), Expect = 0.027
 Identities = 21/54 (38%), Positives = 33/54 (61%), Gaps = 5/54 (9%)

Query: 101 FIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154
           ++G P YW++GIG+R + L   FL++ER    +  DP   N  ++R  +K GFR
Sbjct: 100 WLGRP-YWARGIGSRALGL---FLRRERT-RPLYADPFHGNTASVRLLEKHGFR 148


>LISIN Q92E38 (Q92E38) Lin0623 protein
          Length = 177

 Score = 39.3 bits (90), Expect = 0.027
 Identities = 22/69 (31%), Positives = 33/69 (47%)

Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHE 166
           +W  GIGT  ++ + +  KK      V L+    N RAI  Y+K GF    ++P     E
Sbjct: 105 FWGLGIGTLIMEGLIKHAKKTERLKLVYLEAVSENKRAINLYKKFGFIEAGEIPALMQVE 164

Query: 167 GKKEDCYLM 175
           G+  D  +M
Sbjct: 165 GRYLDVTMM 173


>STRCO O69977 (O69977) Hypothetical protein SCO5801
          Length = 231

 Score = 38.9 bits (89), Expect = 0.036
 Identities = 30/143 (20%), Positives = 65/143 (45%), Gaps = 6/143 (4%)

Query: 14  IDDDFPLMLKWLTDERVLEFYG-GRDKKYTLESLKKHYTEPWEDEVFRVIIEYNNVPIGY 72
           ++ D PL+ +W+ D  V  ++     +  T + L+       +      +   + VP+ Y
Sbjct: 67  LERDVPLIARWMNDPAVAAYWELTGPQSVTADHLRAQLAG--DGRSVPCVGTLDGVPMSY 124

Query: 73  GQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNA-N 131
            +IY+   +    Y   +  +   G+   IG+  +  +G+GT  I+ + + +   R A
Sbjct: 125 WEIYRADLDPLARYCPVRPHDT--GVHLLIGDGAHRGRGLGTELIRAVVDLVLAGRPACT 182

Query: 132 AVILDPHKNNPRAIRAYQKSGFR 154
            V+ +P   N +++ A+  +GFR
Sbjct: 183 RVLAEPDVRNRQSVAAFLGAGFR 205


>STRAW Q82KD8 (Q82KD8) Hypothetical protein
          Length = 377

 Score = 38.9 bits (89), Expect = 0.036
 Identities = 37/150 (24%), Positives = 66/150 (44%), Gaps = 10/150 (6%)

Query: 17  DFPLMLKWLTDERVLEFYG-GRDKKYTLESLKKHYTEPWEDEVFRVIIEYNNVPIGYGQI 75
           D PL+ +W+ D  V  F+    D+  T + L+              ++E    P+ Y +I
Sbjct: 217 DLPLLGRWMNDPAVAAFWKLAGDESVTEQHLRAQLGGDGRSVPCLGVLE--GTPMSYWEI 274

Query: 76  YKM-YDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANA-V 133
           Y+   D L    HYP       G+   IG      +G+G+  ++ + + +   R + A V
Sbjct: 275 YRADLDSLAR--HYPARPHDT-GIHLLIGGVADRGRGLGSTLLRAVADLVLDRRPSCARV 331

Query: 134 ILDPHKNNPRAIRAYQKSGFRIIE--DLPE 161
           + +P   N  ++ A+  +GFR     DLP+
Sbjct: 332 VAEPDLRNTSSVSAFLGAGFRFSAEVDLPD 361


>VIBCH Q9KMA3 (Q9KMA3) Acetyltransferase, putative
          Length = 230

 Score = 38.5 bits (88), Expect = 0.046
 Identities = 30/144 (20%), Positives = 62/144 (43%), Gaps = 18/144 (12%)

Query: 15  DDDFPLMLKWLTDERVLEFYGGRDKKY--TLESLKKHYTEPWEDEVFRVIIEYNNVPIGY 72
           + DF L++KW+  + +   +G     +  T E +  H ++    EVF  +++      G+
Sbjct: 8   ESDFDLLIKWIDSDELNYLWGCPAYVFPLTYEQIHSHCSKA---EVFPYLLKVKGRHAGF 64

Query: 73  GQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANA 132
            ++YK+ DE Y                 FI    Y  +G+    + L+ +  + + +A
Sbjct: 65  VELYKVTDEQYRICRV------------FISNA-YRGQGLSKSMLMLLIDKARLDFSATK 111

Query: 133 VILDPHKNNPRAIRAYQKSGFRII 156
           + L   + N  A + Y+  GF ++
Sbjct: 112 LSLGVFEQNTVARKCYESLGFEVV 135


>STAAM Q932M5 (Q932M5) Hypothetical protein SAVP027
          Length = 134

 Score = 38.5 bits (88), Expect = 0.046
 Identities = 24/77 (31%), Positives = 36/77 (46%), Gaps = 11/77 (14%)

Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164
           PNY  KG G++ +  I E+  KE   + + L   K NPRA   Y+K G +
Sbjct: 68  PNYQDKGYGSKLLSFIKEY-SKEIGCSEMFLITDKGNPRACHVYEKLGGK---------- 116

Query: 165 HEGKKEDCYLMEYRYDD 181
           ++ K E  Y+ +Y   D
Sbjct: 117 NDYKDEIVYVYDYEKGD 133


>LACLA Q9CHJ8 (Q9CHJ8) Acetyl transferase
          Length = 193

 Score = 38.5 bits (88), Expect = 0.046
 Identities = 29/106 (27%), Positives = 50/106 (47%), Gaps = 12/106 (11%)

Query: 59  FRVIIEYNNVPIGYGQI-YKMYDELYTDYHYPKTD------EIVYGMDQFIGEPNYWSKG 111
           F +  +Y   P+G   I  K    L  D H+ K        EI Y ++Q     NYW++G
Sbjct: 56  FSIANDYMKSPLGKWAIELKSEHRLIGDIHFVKISDKNQSAEIGYVLNQ-----NYWNQG 110

Query: 112 IGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE 157
           + T  +K++ EF  ++     +IL   K N  + +   KSG+ +++
Sbjct: 111 LLTEALKVLTEFSFEQFGLKKLILLIDKENVPSKKVALKSGYHLVK 156


>ENTFA Q82YR8 (Q82YR8) Acetyltransferase, GNAT family
          Length = 130

 Score = 38.5 bits (88), Expect = 0.046
 Identities = 24/77 (31%), Positives = 36/77 (46%), Gaps = 11/77 (14%)

Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164
           PNY  KG G++ +  I E+  KE   + + L   K NPRA   Y+K G +
Sbjct: 64  PNYQDKGYGSKLLSFIKEY-SKEIGCSEMFLITDKGNPRACHVYEKLGGK---------- 112

Query: 165 HEGKKEDCYLMEYRYDD 181
           ++ K E  Y+ +Y   D
Sbjct: 113 NDYKDEIVYVYDYEKGD 129


>BACC1 Q73AS8 (Q73AS8) Acetyltransferase, GNAT family
          Length = 157

 Score = 38.5 bits (88), Expect = 0.046
 Identities = 34/128 (26%), Positives = 55/128 (42%), Gaps = 21/128 (16%)

Query: 34  YGGRDKKYTLESLKKHYTEPWEDE-----VFRVIIEYNNVPIGYGQIYKMYDELYTDYHY 88
           Y G    Y +E+ ++   E   DE     +F V  + +   IGY  + K+ D
Sbjct: 22  YEGEYSFYDIEADEEDLAEFLHDESRGDHIFSV--KEHGTLIGYFTVCKINDG------- 72

Query: 89  PKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAY 148
             T +I  GM     +PN    G G ++I  I  F K++     + L     N RAI+ Y
Sbjct: 73  --TVDIGLGM-----KPNITGNGFGLQFINAILAFSKEKYGCKYITLSVATFNKRAIKVY 125

Query: 149 QKSGFRII 156
           +++GF  +
Sbjct: 126 KRAGFEAV 133


>BACC1 Q734C3 (Q734C3) Acetyltransferase, GNAT family
          Length = 183

 Score = 38.5 bits (88), Expect = 0.046
 Identities = 27/80 (33%), Positives = 38/80 (47%)

Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161
           IG+ N   KG G   I LI ++   E N + V LD    N  AI  Y+K GF++   + E
Sbjct: 98  IGDANDRGKGYGREAIHLILKYAFYELNLHRVGLDVISYNKDAIELYKKMGFQMEGCMRE 157

Query: 162 HELHEGKKEDCYLMEYRYDD 181
               +GK  D  +M    D+
Sbjct: 158 AVQRDGKCFDRIIMGILRDE 177


>BACAN Q81YL2 (Q81YL2) Acetyltransferase, GNAT family
          Length = 179

 Score = 38.5 bits (88), Expect = 0.046
 Identities = 27/80 (33%), Positives = 38/80 (47%)

Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161
           IG+ N   KG G   I LI ++   E N + V LD    N  AI  Y+K GF+I   + E
Sbjct: 96  IGDANDRGKGYGREAIHLILKYAFYELNLHRVGLDVISYNKAAIELYKKMGFQIEGCMRE 155

Query: 162 HELHEGKKEDCYLMEYRYDD 181
               +G+  D  +M    D+
Sbjct: 156 AVQRDGECFDRIIMGILRDE 175


>BACAN Q81XB7 (Q81XB7) Spermidine N1-acetyltransferase
          Length = 171

 Score = 38.5 bits (88), Expect = 0.046
 Identities = 27/116 (23%), Positives = 51/116 (43%), Gaps = 12/116 (10%)

Query: 60  RVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKL 119
           R I+E +N  +G  ++ ++      DY + +T+       Q I +PNY   G      +L
Sbjct: 57  RFIVEKDNEMVGLVELVEI------DYIHRRTEF------QIIIDPNYQGYGYAAEATRL 104

Query: 120 IFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLM 175
             ++     N + + L   K N +A+  Y+K GF +  +L +    +G   +   M
Sbjct: 105 AMDYAFSVLNMHKIYLVVDKENEKAVHVYKKVGFMVEGELLDEFFVDGNYHNAIRM 160


>SALTI Q8Z6Z1 (Q8Z6Z1) Spermidine N1-acetyltransferase
          Length = 186

 Score = 38.1 bits (87), Expect = 0.061
 Identities = 25/90 (27%), Positives = 40/90 (44%), Gaps = 4/90 (4%)

Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
           Q I  P Y  KG+ +R  KL  ++     N   + L   K N +AI  Y+K GFR+  +L
Sbjct: 87  QIIISPEYQGKGLASRAAKLAMDYGFTVLNLYKLYLIVDKENEKAIHIYRKLGFRVEGEL 146

Query: 160 PEHELHEGKKED----CYLMEYRYDDNATN 185
                  G+  +    C   +   D++ T+
Sbjct: 147 IHEFFINGEYRNTIRMCIFQQQYLDEHKTS 176


>SALTY Q8ZPJ3 (Q8ZPJ3) Spermidine N1-acetyltransferase (EC 2.3.1.57)
          Length = 186

 Score = 38.1 bits (87), Expect = 0.061
 Identities = 25/90 (27%), Positives = 40/90 (44%), Gaps = 4/90 (4%)

Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
           Q I  P Y  KG+ +R  KL  ++     N   + L   K N +AI  Y+K GFR+  +L
Sbjct: 87  QIIISPEYQGKGLASRAAKLAMDYGFTVLNLYKLYLIVDKENEKAIHIYRKLGFRVEGEL 146

Query: 160 PEHELHEGKKED----CYLMEYRYDDNATN 185
                  G+  +    C   +   D++ T+
Sbjct: 147 IHEFFINGEYRNTIRMCIFQQQYLDEHKTS 176


>SALCH Q57PD6 (Q57PD6) Spermidine N1-acetyltransferase
          Length = 186

 Score = 38.1 bits (87), Expect = 0.061
 Identities = 25/90 (27%), Positives = 40/90 (44%), Gaps = 4/90 (4%)

Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
           Q I  P Y  KG+ +R  KL  ++     N   + L   K N +AI  Y+K GFR+  +L
Sbjct: 87  QIIISPEYQGKGLASRAAKLAMDYGFTVLNLYKLYLIVDKENEKAIHIYRKLGFRVEGEL 146

Query: 160 PEHELHEGKKED----CYLMEYRYDDNATN 185
                  G+  +    C   +   D++ T+
Sbjct: 147 IHEFFINGEYRNTIRMCIFQQQYLDEHKTS 176


>MYCPE Q8EWE5 (Q8EWE5) Acetyltransferase GNAT family
          Length = 193

 Score = 38.1 bits (87), Expect = 0.061
 Identities = 34/157 (21%), Positives = 72/157 (45%), Gaps = 20/157 (12%)

Query: 2   NIVENE-ICIRTLIDDDFPLMLKWLTDERVLEFYGG---RDKKYTLESLKKHYTEPWEDE 57
           NI+E + + +R L  +D     ++   E V E  G    +D +Y+ + L K      +
Sbjct: 8   NIIETKRLYLRPLKIEDLNDFYEFAKVEGVGESAGWFHHKDIEYSKKILIKMINSKQD-- 65

Query: 58  VFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYI 117
            + ++ + NN  IG   I+  Y+           D+++ G   F+   +YW+KG+ T  +
Sbjct: 66  -YAIVYKENNKVIGELGIFNKYEN----------DKLMIG---FVLNKDYWNKGLATEIV 111

Query: 118 KLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154
           K + +++    +   + +   ++N  + R  +K GF+
Sbjct: 112 KELIDYIFTNTDHQQIYMGHFESNLASKRVVEKCGFK 148


>BACC1 Q72Y03 (Q72Y03) Spermidine N1-acetyltransferase (EC 2.3.1.57)
          Length = 176

 Score = 38.1 bits (87), Expect = 0.061
 Identities = 34/177 (19%), Positives = 74/177 (41%), Gaps = 14/177 (7%)

Query: 1   MNIVE--NEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEV 58
           M ++E   E+ +R L  +D   + +   +  ++ ++     +  +E    +     +
Sbjct: 1   MEVIEMSQELKLRPLEREDLKFVHELNNNAHIMSYWFEEPYEAFVELQDLYDKHIHDQSE 60

Query: 59  FRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIK 118
            R I+E +N  +G  ++ ++      DY + +T+       Q I +PNY   G      +
Sbjct: 61  RRFIVEKDNEMVGLVELVEI------DYIHRRTEF------QIIIDPNYQGYGYAAEATR 108

Query: 119 LIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLM 175
           L  ++     N + + L   K N +A+  Y+K GF +  +L +    +G   +   M
Sbjct: 109 LAMDYAFSVLNMHKLYLVVDKENEKAVHVYKKVGFMVEGELLDEFFVDGNYHNAIRM 165


>DEIRA Q9RW73 (Q9RW73) Acetyltransferase, putative
          Length = 186

 Score = 37.7 bits (86), Expect = 0.079
 Identities = 45/179 (25%), Positives = 77/179 (43%), Gaps = 35/179 (19%)

Query: 8   ICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYT-----LESLKKHY-----TEPWEDE 57
           + +R    +D P   +WLTDER    +   D  YT      E+++ +      T P  DE
Sbjct: 9   VVLRDRRPEDLPTFTRWLTDERAA--WREWDAPYTPAAQTSETMQAYIRYLQVTPPDADE 66

Query: 58  VFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYG-----MDQFIGEPNYWSKGI 112
             RVI       +G GQ+  M +         +++E   G     +   I +P YW  G+
Sbjct: 67  --RVI------EVG-GQVVGMVN---------RSEEEPAGGGWWDLGILIYDPAYWEGGV 108

Query: 113 GTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKED 171
           GTR + L  +      +A+ + +     N R +RA ++ GF+    + E  +  G++ D
Sbjct: 109 GTRALSLWVQDTLDWTDAHTLTVTTWSGNERMMRAARRLGFQECARVREARVVGGQRYD 167


>STAAM Q99U68 (Q99U68) Hypothetical protein
          Length = 169

 Score = 37.4 bits (85), Expect = 0.10
 Identities = 31/133 (23%), Positives = 55/133 (41%), Gaps = 17/133 (12%)

Query: 44  ESLKKHYTEPWEDEV-------------FRVIIEYNNVPIGYGQIYKMYDELYTDYHYPK 90
           E +K+H  E W+D+              +  ++E N+   G+  + +   E Y D  +P
Sbjct: 22  ELMKEHDNEQWDDQYPLLEHFEEDIAKDYLYVLEENDKIYGFIVVDQDQAEWYDDIDWPV 81

Query: 91  TDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQK 150
             E  + + +  G   Y  KG  T     + + + K R A  ++ D    N  A   + K
Sbjct: 82  NREGAFVIHRLTGSKEY--KGAATELFNYVIDVV-KARGAEVILTDTFALNKPAQGLFAK 138

Query: 151 SGF-RIIEDLPEH 162
            GF ++ E L E+
Sbjct: 139 FGFHKVGEQLMEY 151


>RICPR Q9ZE75 (Q9ZE75) TRANSCRIPTIONAL ACTIVATOR PROTEIN CZCR (CzcR)
          Length = 237

 Score = 37.4 bits (85), Expect = 0.10
 Identities = 35/130 (26%), Positives = 63/130 (48%), Gaps = 15/130 (11%)

Query: 219 VNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLET-NVKIPNIEYSYISDELSILGYKEI 277
           V  E I + K    + KG+A     ++ ++ NL+T +V++   +    + E SIL    +
Sbjct: 104 VREELIARIKAIVRRSKGHAASIFRFDKISVNLDTRSVEVDGKKLHLTNKEYSILELLIL 163

Query: 278 -KGTFLTPE-----IYSTMSEEEQNLLKRDIASFLRQMH----GLDYTDISECTIDNKQN 327
            +GT LT E     +YST+ E E  ++   I    +++     G DY D    T+  +
Sbjct: 164 RRGTILTKEMFLNHLYSTVDEPEMKIIDVFICKLRKKLSDAAGGRDYID----TVWGRGY 219

Query: 328 VLEEYILLRE 337
           +L+EY  L++
Sbjct: 220 MLKEYDELQQ 229


>LACJO Q74J71 (Q74J71) Hypothetical protein
          Length = 181

 Score = 37.4 bits (85), Expect = 0.10
 Identities = 23/72 (31%), Positives = 32/72 (44%), Gaps = 1/72 (1%)

Query: 102 IGEPNYWSKGIGTRYIKL-IFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLP 160
           I  P YW  GIG R +K+ I E   +      + L     NPR I   QK GF+    +
Sbjct: 98  IYNPTYWHGGIGGRVLKIWISEIFDQYPELEHIGLTTWSGNPRMIHLAQKLGFKKEAQIR 157

Query: 161 EHELHEGKKEDC 172
           +   ++ K  DC
Sbjct: 158 KVRFYKEKYYDC 169


>CLOAB Q97D33 (Q97D33) Acetyltransferase (With duplicated domains),
           possibly RIMI-like protein
          Length = 292

 Score = 37.4 bits (85), Expect = 0.10
 Identities = 18/59 (30%), Positives = 32/59 (54%), Gaps = 1/59 (1%)

Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHE 163
           P Y  +G G   + ++ E+L  ER+ + + L+   NN RA   Y+  GF+I  ++  +E
Sbjct: 225 PEYRGRGFGREMMSMLLEYLI-ERDYDDIALEVDSNNKRAFELYKSIGFQIEREIDYYE 282


>VIBCH Q9KU66 (Q9KU66) Ribosomal-protein-alanine acetyltransferase
          Length = 161

 Score = 36.6 bits (83), Expect = 0.18
 Identities = 22/77 (28%), Positives = 42/77 (54%), Gaps = 2/77 (2%)

Query: 104 EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE-DLPEH 162
           +P    KG G + ++  F  L +++ A +  L+  ++N RA   YQ++GF  I+  +  +
Sbjct: 85  DPAQQGKGYGQQLLQH-FIALCEQQKAESAWLEVRESNQRAFALYQRAGFNEIDRRVNYY 143

Query: 163 ELHEGKKEDCYLMEYRY 179
            + +GK ED  +M Y +
Sbjct: 144 PVAKGKSEDAIIMSYLF 160


>STRR6 Q8DNF9 (Q8DNF9) Hypothetical protein spr1760
          Length = 172

 Score = 36.6 bits (83), Expect = 0.18
 Identities = 23/71 (32%), Positives = 34/71 (47%), Gaps = 3/71 (4%)

Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEH--EL 164
           YW+ G+G+  ++   E+ +       + L     N  A+  YQK GF +IE   E    +
Sbjct: 98  YWNNGLGSLLLEEAIEWAQASGILRRLQLTVQTRNQAAVHLYQKHGF-VIEGSQERGAYI 156

Query: 165 HEGKKEDCYLM 175
            EGK  D YLM
Sbjct: 157 EEGKFIDVYLM 167


>SALTY Q8ZNU6 (Q8ZNU6) Putative ribose 5-phosphate isomerase (EC
           5.3.1.6)
          Length = 212

 Score = 36.6 bits (83), Expect = 0.18
 Identities = 24/93 (25%), Positives = 44/93 (47%), Gaps = 4/93 (4%)

Query: 219 VNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIK 278
           +N  +IF+  F+  K +GY  E+      N  +   VK   ++ +Y+ D L  +  + +K
Sbjct: 124 LNVRFIFEKAFTGRKGEGYPPERKAPQVRNAGILNQVKAAVVKENYL-DTLRAIDPELVK 182

Query: 279 GTFLTPEIYSTMSEEEQNLLKRDIASFLRQMHG 311
                P     + E  QN   ++I +F+R+M G
Sbjct: 183 TAVSGPRFQQCLFENGQN---KEIEAFVREMLG 212


>SALCH Q57N66 (Q57N66) Putative ribose 5-phosphate isomerase
          Length = 212

 Score = 36.6 bits (83), Expect = 0.18
 Identities = 24/93 (25%), Positives = 44/93 (47%), Gaps = 4/93 (4%)

Query: 219 VNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIK 278
           +N  +IF+  F+  K +GY  E+      N  +   VK   ++ +Y+ D L  +  + +K
Sbjct: 124 LNVRFIFEKTFTGRKGEGYPPERKAPQVRNAGILNQVKAAVVKENYL-DTLRAIDPELVK 182

Query: 279 GTFLTPEIYSTMSEEEQNLLKRDIASFLRQMHG 311
                P     + E  QN   ++I +F+R+M G
Sbjct: 183 TAVSGPRFQQCLFENGQN---KEIEAFVREMLG 212


>ECO57 ATDA_ECO57 (P0A952) Spermidine N(1)-acetyltransferase (EC
           2.3.1.57) (Diamine acetyltransferase) (SAT)
          Length = 185

 Score = 36.6 bits (83), Expect = 0.18
 Identities = 21/60 (35%), Positives = 29/60 (48%)

Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
           Q I  P Y  KG+ TR  KL  ++     N   + L   K N +AI  Y+K GF +  +L
Sbjct: 86  QIIISPEYQGKGLATRAAKLAMDYGFTVLNLYKLYLIVDKENEKAIHIYRKLGFSVEGEL 145


>ECOLI ATDA_ECOLI (P0A951) Spermidine N(1)-acetyltransferase (EC
           2.3.1.57) (Diamine acetyltransferase) (SAT)
          Length = 185

 Score = 36.6 bits (83), Expect = 0.18
 Identities = 21/60 (35%), Positives = 29/60 (48%)

Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
           Q I  P Y  KG+ TR  KL  ++     N   + L   K N +AI  Y+K GF +  +L
Sbjct: 86  QIIISPEYQGKGLATRAAKLAMDYGFTVLNLYKLYLIVDKENEKAIHIYRKLGFSVEGEL 145


>BACHD Q9KG16 (Q9KG16) BH0299 protein
          Length = 305

 Score = 36.6 bits (83), Expect = 0.18
 Identities = 35/126 (27%), Positives = 52/126 (41%), Gaps = 17/126 (13%)

Query: 41  YTLESLKKHYTEPWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQ 100
           Y  E + K   EP       +IIE   + IGY          Y +   P+  E   G  +
Sbjct: 185 YDAEEILKKINEPTNK---LLIIEKEQIVIGYA---------YVEVE-PEHGE---GQIE 228

Query: 101 FIG-EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
           +IG  P+Y  +G+ T+ +      L        + L   K N +AIR YQ +GF+    L
Sbjct: 229 YIGIAPDYRRQGLATQLLTNALHVLFSYPTVEDITLCVSKQNTKAIRLYQAAGFKKERQL 288

Query: 160 PEHELH 165
              EL+
Sbjct: 289 TYFELN 294


>AQUAE O66838 (O66838) Ribosomal-protein-alanine acetyltransferase
          Length = 154

 Score = 36.6 bits (83), Expect = 0.18
 Identities = 23/74 (31%), Positives = 37/74 (50%), Gaps = 5/74 (6%)

Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164
           P Y  KG G + ++     L  +     V+LD  K+N RAI  Y+K GF+++    E +
Sbjct: 75  PGYRGKGYGEKLLREAISRLGDK--VKRVVLDVRKSNLRAINLYKKLGFKVV---TERKG 129

Query: 165 HEGKKEDCYLMEYR 178
           +    E+  LME +
Sbjct: 130 YYSDGENALLMELK 143


>PROAC Q6ABU2 (Q6ABU2) Acetyltransferase (EC 2.3.1.-)
          Length = 188

 Score = 36.2 bits (82), Expect = 0.23
 Identities = 21/77 (27%), Positives = 32/77 (41%)

Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164
           P  W +G+G R ++L  E      +A  V L     N R I      G+R    +P+
Sbjct: 111 PTLWGRGVGRRALRLWTEATFATTDAQVVTLTTWSGNGRMIHCAGAVGYRECGRIPQARS 170

Query: 165 HEGKKEDCYLMEYRYDD 181
            +G++ D   M    DD
Sbjct: 171 WQGRRWDLVTMALLRDD 187


>BACAN Q81QL3 (Q81QL3) Acetyltransferase, GNAT family
          Length = 308

 Score = 36.2 bits (82), Expect = 0.23
 Identities = 44/183 (24%), Positives = 70/183 (38%), Gaps = 31/183 (16%)

Query: 44  ESLKKHYTEPWEDEVFRV------IIEYNNVPIGYGQIYKM-YDELYTDYHYPKTDEIVY 96
           E L +  T  +++E  R       +I+YN  P GY  +  M Y     D +    DE +
Sbjct: 15  EKLTEIMTRTFDEEAERWLCGQGDVIDYNIQPPGYSSVEMMRYSIEELDSYKVIMDEKII 74

Query: 97  G-------------MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPR 143
           G             +D+   EP Y  KGIG+  IKLI       R  +        NN
Sbjct: 75  GGIIVTISGKSYGRIDRIFVEPVYQGKGIGSNVIKLIEAEYPSIRIWDLETSSRQINNH- 133

Query: 144 AIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVD 203
               Y+K G++ I         E + E CY+       N  +V   + +    ++N  ++
Sbjct: 134 --HFYKKMGYQTI--------FESEDEYCYVKRIGTSSNKESVFKNEDMKNSQYENCNLE 183

Query: 204 SIE 206
           + E
Sbjct: 184 NTE 186


>STRMU Q8DV67 (Q8DV67) Putative acetyltransferase
          Length = 166

 Score = 35.8 bits (81), Expect = 0.30
 Identities = 21/52 (40%), Positives = 29/52 (55%), Gaps = 2/52 (3%)

Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRII 156
           P Y  +GIGT  +K   E L+K +  + V L   K N  A+  YQK+GF+ I
Sbjct: 95  PAYRGQGIGTELLKTFLEHLRK-KGYHKVSLSVQKEND-AVNMYQKAGFQTI 144


>STAAM COAD_STAAM (P63818) Phosphopantetheine adenylyltransferase
           (EC 2.7.7.3) (Pantetheine-phosphate adenylyltransferase)
           (PPAT) (Dephospho-CoA pyrophosphorylase)
          Length = 160

 Score = 35.8 bits (81), Expect = 0.30
 Identities = 28/132 (21%), Positives = 55/132 (41%), Gaps = 13/132 (9%)

Query: 164 LHEGKKEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEY 223
           L   KKE  + +E R D    +VK +  +  H F    VD  E +G+          +++
Sbjct: 38  LKNSKKEGTFSLEERMDLIEQSVKHLPNVKVHQFSGLLVDYCEQVGAKTIIRGLRAVSDF 97

Query: 224 IFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDEL--SILGYKEIKGTF 281
            ++ + ++  KK           LN  +ET   + +  YS+IS  +   +  Y+     F
Sbjct: 98  EYELRLTSMNKK-----------LNNEIETLYMMSSTNYSFISSSIVKEVAAYRADISEF 146

Query: 282 LTPEIYSTMSEE 293
           + P +   + ++
Sbjct: 147 VPPYVEKALKKK 158


>LISMO Q8Y9B8 (Q8Y9B8) Lmo0614 protein
          Length = 177

 Score = 35.8 bits (81), Expect = 0.30
 Identities = 17/54 (31%), Positives = 27/54 (50%)

Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLP 160
           YW  GIGT  ++ + ++ K       + L+    N RAI  Y+K GF    ++P
Sbjct: 105 YWGLGIGTICMEELIKYAKSSEYLKLIYLEVVTENKRAINLYKKFGFIEAGEIP 158


>THEMA Q9WZ46 (Q9WZ46) Hypothetical protein
          Length = 179

 Score = 35.4 bits (80), Expect = 0.39
 Identities = 19/49 (38%), Positives = 28/49 (57%), Gaps = 1/49 (2%)

Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRI 155
           YW+ GIGTR I    E+ ++      + L+  K+N RAI  Y+K GF +
Sbjct: 106 YWNIGIGTRMITSAIEWARR-NGFIRIQLEVLKSNERAISLYRKLGFEL 153


>STRR6 Q8DQU8 (Q8DQU8) Hypothetical protein spr0490
          Length = 185

 Score = 35.4 bits (80), Expect = 0.39
 Identities = 29/128 (22%), Positives = 60/128 (46%), Gaps = 16/128 (12%)

Query: 55  EDEVF---RVIIEYN---NVPIGYGQIYKMYDELY--TDYHYPKTDEIVYGMDQFIG--- 103
           EDE++    ++ E N   N+P GYG + K  D++    D+++   D+++      IG
Sbjct: 50  EDEIYYLEHILPERNQKENLPAGYGIVVKGTDKIVGSVDFNHRHEDDVLE-----IGYTL 104

Query: 104 EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHE 163
            P+YW +G      + + +   K+   + + L     N ++ R  +K GF +   + + +
Sbjct: 105 HPDYWGRGYVPEAARALIDLAFKDLGLHKIELTCFGYNLQSKRVAEKLGFTLEARIRDRK 164

Query: 164 LHEGKKED 171
             +G + D
Sbjct: 165 DVQGNRCD 172


>CLOAB Q97FA0 (Q97FA0) Predicted acetyltransferase
          Length = 146

 Score = 35.4 bits (80), Expect = 0.39
 Identities = 23/94 (24%), Positives = 45/94 (47%), Gaps = 15/94 (15%)

Query: 61  VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120
           V+I+ NN+ +GYG ++ + DE +              +      P +   GIG + ++ +
Sbjct: 46  VVIKNNNLVVGYGGLWLIIDEGH--------------ITNIAVHPEFRGMGIGNKILEEL 91

Query: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154
            +  +K RN  ++ L+   +N  A   Y+K GF+
Sbjct: 92  IKLCEK-RNIPSMTLEVRISNTIAQNLYKKFGFK 124


>BUCAP LOLC_BUCAP (Q8K9N8) Lipoprotein-releasing system
           transmembrane protein lolC
          Length = 399

 Score = 35.4 bits (80), Expect = 0.39
 Identities = 44/156 (28%), Positives = 62/156 (39%), Gaps = 26/156 (16%)

Query: 189 MKYLIEHYFDNFK----VDSIEIIGSGYDSVAYLVNNEYIFKTKFS------------TN 232
           ++YL   Y  NFK    + SI  IG G  S    ++    F+ KF             TN
Sbjct: 11  LRYLWNPYLPNFKKIIIILSILGIGIGISSTIITISIMNGFQNKFKNDILSFIPHIIITN 70

Query: 233 KKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSE 292
           K +   K       LN   ET +K+ N+E   I+D +S     E K      EI     +
Sbjct: 71  KNRNINK-------LNFPKET-LKLKNVEE--ITDFISKKVIIENKNEINIGEIIGINIK 120

Query: 293 EEQNLLKRDIASFLRQMHGLDYTDISECTIDNKQNV 328
            E+NL   +I  FL  +H   Y  I    +  K +V
Sbjct: 121 NEKNLENYNIKKFLHTLHSRKYNAIIGSELAKKMHV 156


>BACC1 Q734M2 (Q734M2) Acetyltransferase, GNAT family
          Length = 153

 Score = 35.4 bits (80), Expect = 0.39
 Identities = 22/91 (24%), Positives = 42/91 (46%), Gaps = 16/91 (17%)

Query: 80  DELYTDYHYPKT-----------DEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKER 128
           DE++  Y Y  T           DE+   + + +  P+Y+ KGI T+ +  +F+     +
Sbjct: 50  DEIFYGYFYEDTLAGFISFKIEKDEV--DIHRLVVSPDYFHKGIATKLLLYVFDMFSPSK 107

Query: 129 NANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
                I+   K N  A+  Y+K GF  ++++
Sbjct: 108 ---TYIVQTGKENTPALSLYKKHGFIEVKEI 135


>BACAN Q81U09 (Q81U09) Acetyltransferase, GNAT family
          Length = 167

 Score = 35.4 bits (80), Expect = 0.39
 Identities = 15/47 (31%), Positives = 27/47 (57%)

Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153
           Y ++GIGT+ I+ +  + K++     + L     N RAI+ Y++ GF
Sbjct: 93  YCNQGIGTKLIEFLIRWAKEQNGLEKICLGVVSVNDRAIKVYKRMGF 139


>YERPE Q8ZHB3 (Q8ZHB3) Putative siderophore biosynthesis protein
           IucB (Acetyl CoA:N6-hydroxylsyine acetyl transferase)
          Length = 316

 Score = 35.0 bits (79), Expect = 0.51
 Identities = 35/159 (22%), Positives = 63/159 (39%), Gaps = 31/159 (19%)

Query: 14  IDDDFPLMLKWLTDERVLEFY---GGRDKK--YTLESLKKHYTEPWEDEVFRVIIEYNNV 68
           +D D P   +W+   RV  F+   G  D +  Y    L   Y  P       ++  +++
Sbjct: 151 VDHDAPQFTRWMNSPRVDAFWEMSGPLDVQAAYLQRQLDSPYCYP-------LLGCFDDQ 203

Query: 69  PIGYGQIY-KMYDELYTDYHYPKTDEIVYGMDQFIGEPNY--------WSKGIGTRYIKL 119
           P GY ++Y    D +   Y +   D    G+   +GE N+        W +G+ T Y+ L
Sbjct: 204 PFGYFEVYWAAEDRIGRHYRWQPFDR---GLHMLVGEENWRGAQYIHSWLRGL-THYLYL 259

Query: 120 IFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIED 158
                  E     V+ +P  +N R       +G+  +++
Sbjct: 260 ------DESRTTRVVAEPRIDNQRLFHHLPAAGYHTLKE 292


>VIBUY Q7MFH7 (Q7MFH7) Histone acetyltransferase HPA2
          Length = 166

 Score = 35.0 bits (79), Expect = 0.51
 Identities = 18/65 (27%), Positives = 33/65 (50%)

Query: 111 GIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKE 170
           GIG++ I+ + E      N   + ++ + +N +AI  Y+K GF I  +  +    EG+
Sbjct: 94  GIGSKLIETVTELADNWLNVRRIQIEVNVDNEKAISLYKKHGFVIEGEAVDSSFREGRFI 153

Query: 171 DCYLM 175
           + Y M
Sbjct: 154 NTYYM 158


>RICCN Q92JG6 (Q92JG6) Transcriptional activator protein czcR
          Length = 237

 Score = 35.0 bits (79), Expect = 0.51
 Identities = 37/136 (27%), Positives = 63/136 (46%), Gaps = 27/136 (19%)

Query: 219 VNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLET--------NVKIPNIEYSYISDELS 270
           V  E I + K    + KG+A     ++ ++ NL+T         V + N EY+ +  EL
Sbjct: 104 VREELIARIKAIVRRSKGHAASVFRFDKVSINLDTRSVEVDGKKVHLTNKEYAIL--ELL 161

Query: 271 ILGYKEIKGTFLTPE-----IYSTMSEEEQNLLKRDIASFLRQMH----GLDYTDISECT 321
           IL     +GT LT E     +YS++ E E  ++   I    +++     G DY D    T
Sbjct: 162 ILR----RGTILTKEMFLNHLYSSVDEPEMKIIDVFICKLRKKLSDAAGGRDYID----T 213

Query: 322 IDNKQNVLEEYILLRE 337
           +  +  +L+EY  L++
Sbjct: 214 VWGRGYMLKEYDELQQ 229


>CLOAB Q97G03 (Q97G03) Predicted acetyltransferase
          Length = 167

 Score = 35.0 bits (79), Expect = 0.51
 Identities = 24/75 (32%), Positives = 37/75 (49%), Gaps = 11/75 (14%)

Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHE 166
           Y  KGIG+  IK +FE+  +E     + L+   +N +AI  Y+K GF          + E
Sbjct: 94  YSGKGIGSLIIKRVFEW-AEENAIEKIDLEVFHDNFKAISLYKKFGF----------IEE 142

Query: 167 GKKEDCYLMEYRYDD 181
           G+K++    E  Y D
Sbjct: 143 GRKKNAIKAEDGYKD 157


>BACHD Q9KF00 (Q9KF00) Ribosomal-protein-alanine N-acetyltransferase
          Length = 190

 Score = 35.0 bits (79), Expect = 0.51
 Identities = 43/175 (24%), Positives = 76/175 (43%), Gaps = 23/175 (13%)

Query: 8   ICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRVIIEYNN 67
           + +R +  DD   +L +L+D+ V++++ G +   TLE         W + +      +
Sbjct: 16  LILRKITTDDARSILSYLSDKEVMKYF-GLEPFQTLEDALGEIA--WYESIL-----HEQ 67

Query: 68  VPIGYGQIYKMYDELY--TDYH--YPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI--- 120
             I +G   K  DE+     +H   PK      G +       YW +GI +  I+ +
Sbjct: 68  TGIRWGITLKGQDEVIGSCGFHQWVPKHHRAEIGFEL---SKLYWGQGIASEAIRAVIQY 124

Query: 121 -FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYL 174
            FE L+ +R   A+I  P+  + R +   +K GF     L  +E   GK +D Y+
Sbjct: 125 GFEHLELQR-IQALIEPPNIPSQRLV---EKQGFISEGLLRSYEYTCGKFDDLYM 175


>BACAN Q81NB6 (Q81NB6) Spermine/spermidine acetyltransferase,
           putative
          Length = 148

 Score = 35.0 bits (79), Expect = 0.51
 Identities = 31/119 (26%), Positives = 52/119 (43%), Gaps = 20/119 (16%)

Query: 54  WEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPK---TDEIVYGMDQFIGEP---NY 107
           WE+ +   + E     I    +Y + +  + D  Y K    +E + G   F  +P   NY
Sbjct: 14  WEEAIKLSVKEEQQTFIA-SNLYSIAEVQFLDNFYAKGIYLEEKMVGFTMFGIDPEDNNY 72

Query: 108 W-----------SKGIGTRYIKLIFEFLKKERNAN--AVILDPHKNNPRAIRAYQKSGF 153
           W            KGIG + I L+ + +++  NAN   +++     N  A  AY+K+GF
Sbjct: 73  WIYRLMIDENFQGKGIGKQAIYLVIDEIRRNNNANFSRIMIGYAPENLTAKFAYKKAGF 131


>BACAN Q81N07 (Q81N07) Acetyltransferase, GNAT family
          Length = 153

 Score = 35.0 bits (79), Expect = 0.51
 Identities = 21/89 (23%), Positives = 42/89 (47%), Gaps = 12/89 (13%)

Query: 80  DELYTDYHYPKT---------DEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNA 130
           DE++  Y Y  T         D+    + + +  P+++ KGI T+ +  IF+      ++
Sbjct: 50  DEIFYGYFYEDTLAGFISFKIDKEEVDIHRLVVSPDHFHKGIATKLLLYIFDMFS---SS 106

Query: 131 NAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
              I+   K N  A+  Y+K GF  ++++
Sbjct: 107 KTYIVQTGKENTPALSLYKKHGFIEVQNI 135


>STRMU Q8DT36 (Q8DT36) Putative acetyltransferase
          Length = 184

 Score = 34.7 bits (78), Expect = 0.67
 Identities = 16/78 (20%), Positives = 36/78 (46%)

Query: 106 NYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELH 165
           +YW +G+ T  ++ +     +E +   + +  HK N  + R  +K+GFR++      + +
Sbjct: 105 HYWKQGLATEALENLVFLAFQELDLKELEIIVHKENRASARVAEKAGFRLVRQFKGSDRY 164

Query: 166 EGKKEDCYLMEYRYDDNA 183
             K  D    + +  D +
Sbjct: 165 THKMRDYLKYDLKAGDKS 182


>PSEAE Q9I3W7 (Q9I3W7) Hypothetical protein
          Length = 177

 Score = 34.7 bits (78), Expect = 0.67
 Identities = 17/66 (25%), Positives = 33/66 (50%)

Query: 110 KGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKK 169
           KG+G+R +  + +      N   V L  + +N  A+  Y+K GF    ++ ++ + +G+
Sbjct: 100 KGVGSRLLGELLDIADNWMNLRRVELTVYTDNAPALALYRKFGFETEGEMRDYAVRDGRF 159

Query: 170 EDCYLM 175
            D Y M
Sbjct: 160 VDVYSM 165


>NITEU Q82UT0 (Q82UT0) GCN5-related N-acetyltransferase (EC
           2.3.1.128)
          Length = 157

 Score = 34.7 bits (78), Expect = 0.67
 Identities = 17/48 (35%), Positives = 30/48 (62%), Gaps = 1/48 (2%)

Query: 109 SKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRII 156
           S+G+G + ++ + E L ++  A  V+LD  ++N  AI  YQ+ GF+ I
Sbjct: 88  SQGLGRKMLRYLIE-LSRKHQAEFVLLDVRESNTGAINLYQRLGFQQI 134


>LACLA Q9CF66 (Q9CF66) Spermidine acetyltransferase (EC 2.3.1.57)
          Length = 180

 Score = 34.7 bits (78), Expect = 0.67
 Identities = 34/131 (25%), Positives = 55/131 (41%), Gaps = 12/131 (9%)

Query: 60  RVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKL 119
           R +IE N+  IG  ++  +      DY + +T EI     Q I    +  KG   + +K
Sbjct: 62  RFVIEANDTFIGIVELMSI------DYIH-RTCEI-----QIIIISGFSGKGYAQKALKT 109

Query: 120 IFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRY 179
             ++     N + V L    +N  A+  Y+K GF+I   + E     G+  D Y M
Sbjct: 110 GVDYAFNTLNMHKVYLWVDIDNAPAVHIYKKLGFKIEGTIKEQFFAGGRYHDSYFMGILK 169

Query: 180 DDNATNVKAMK 190
            +     KA+K
Sbjct: 170 SEYTQREKAVK 180


>BACC1 Q739G7 (Q739G7) Acetyltransferase, GNAT family
          Length = 188

 Score = 34.7 bits (78), Expect = 0.67
 Identities = 33/136 (24%), Positives = 52/136 (38%), Gaps = 21/136 (15%)

Query: 47  KKHYTEPWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPN 106
           +K  T   E+ +  +IIE+N   IG    Y         + Y  T  +  G+   I  P
Sbjct: 56  EKMQTRLKEEPLSNLIIEHNGQVIGTVGFY---------WEYKPTRWLEMGI--VIYNPA 104

Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHE 166
           YW+ G GT  + L  + L ++     V L     N R ++  +K G  +          E
Sbjct: 105 YWNGGYGTEALTLYRDLLFEKMEIGRVGLTTWSGNERMMKVAEKIGMTL----------E 154

Query: 167 GKKEDCYLMEYRYDDN 182
           G+   C      Y D+
Sbjct: 155 GRMRKCRYYNGTYYDS 170


>MYCPE Q8EWM2 (Q8EWM2) Hypothetical protein MYPE1810
          Length = 300

 Score = 34.3 bits (77), Expect = 0.88
 Identities = 57/252 (22%), Positives = 99/252 (39%), Gaps = 62/252 (24%)

Query: 115 RYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRII-EDL-----PEHELHEGK 168
           R+I  + +FL K+ +      +  KN+ +        G  I+ EDL     P   L E K
Sbjct: 55  RFILNLLDFLYKDNDLIEYKRERSKNDLKFFHFSFSKGLDILLEDLHLNKDPYKWLVETK 114

Query: 169 KEDCYLME-YRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLV-----NNE 222
              C+L+  + Y  +  +  +  Y  E    N ++  ++I+   + S+   +     NN
Sbjct: 115 TRSCFLIGLFLYGGSINSPNSSNYHFEIKIHNTEI--LKIVEKIFSSINIPLLVLNRNNT 172

Query: 223 YIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFL 282
           YI   K S +                                ISD L +LG  E
Sbjct: 173 YIVYIKKSES--------------------------------ISDILKLLGATE------ 194

Query: 283 TPEIYSTMSEEEQNLLKRDIASFLRQMHGLDYTDISECTIDNKQNVLE--EYILLRETIY 340
                 +M E E+  + RD  + + +++ LD +++ + TI+     L+  EY+     ++
Sbjct: 195 ------SMFEYEEKRISRDYTNQMSRLNNLDMSNLKK-TIEASHIQLQNIEYVK-NNNLF 246

Query: 341 NDLTDIEKDYIE 352
           N LTD EK Y E
Sbjct: 247 NQLTDKEKIYCE 258


>MYCPE Q8EWE6 (Q8EWE6) Acetyltransferase GNAT family
          Length = 190

 Score = 34.3 bits (77), Expect = 0.88
 Identities = 26/112 (23%), Positives = 51/112 (45%), Gaps = 13/112 (11%)

Query: 48  KHYTEPWEDE-VFRVIIEYNNVPIGYGQIYKMYDELYTDYHYP----KTDEIVYGMDQFI 102
           KH+    E E + +++I   N    Y  ++K  +++   +       KT +I Y + +
Sbjct: 45  KHHKNIEETETILKILISGGNF---YALVWKENNKVIGSFGIETPSYKTVKIGYALSK-- 99

Query: 103 GEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154
              +YW+ GI T   K I +F+      N +++     N  + +  +KSGF+
Sbjct: 100 ---DYWNLGIMTEVTKHIIDFIFTNSGFNKILVSHFDENTASKKVIEKSGFK 148


>LACLA Q9CHW9 (Q9CHW9) Hypothetical protein yfiL
          Length = 154

 Score = 34.3 bits (77), Expect = 0.88
 Identities = 27/97 (27%), Positives = 46/97 (47%), Gaps = 15/97 (15%)

Query: 90  KTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVIL--DPHKNNPRAIRA 147
           K  +  + ++ F  E ++  +G G + +K +  +LK+   A+ +IL  D   NN   +
Sbjct: 56  KKQKNTFEIENFAVETSFQGQGFGQQMMKQLITYLKENLAADELILGTDDVSNN---VAF 112

Query: 148 YQKSGFRIIE-------DLPEHELHEGK---KEDCYL 174
           Y+K GF I         D  +H + EGK   K+  YL
Sbjct: 113 YEKCGFTITHKISNYFLDNCDHPIFEGKVQLKDKIYL 149


>LACPL Q88SM7 (Q88SM7) Prophage Lp4 protein 8, DNA primase/helicase
          Length = 500

 Score = 34.3 bits (77), Expect = 0.88
 Identities = 37/160 (23%), Positives = 64/160 (40%), Gaps = 11/160 (6%)

Query: 179 YDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEY-----IFKTKFSTNK 233
           Y DNAT      + I+++FD       EI+   ++ +   +N  Y     IF      N
Sbjct: 174 YRDNATTPNIKGWTIDNWFDELACGDDEIVELLWEVINDCLNGNYTRKKAIFLFSELGNS 233

Query: 234 KKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEE 293
            KG  +E  I N +  +    +K+   +  +      ++G     G  + P+IY   S
Sbjct: 234 GKGTFQE-LITNLVGMDNVGTLKVNEFDVRF--RLAGLVGKTVCIGDDIAPDIYIKDSSN 290

Query: 294 EQNLLKRDIASFLRQMHGLD-YTDISECTIDNKQNVLEEY 332
             +++  D+ +   +  G D YT    CTI    N L  +
Sbjct: 291 FNSVVTGDLVNI--EFKGQDGYTSALRCTIVQSCNGLPNF 328


>ENTFA Q837C4 (Q837C4) Acetyltransferase, GNAT family
          Length = 144

 Score = 34.3 bits (77), Expect = 0.88
 Identities = 17/58 (29%), Positives = 30/58 (51%), Gaps = 6/58 (10%)

Query: 104 EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161
           +P Y+ KG G   I+ + E        + + +D +K N  A++ YQ  GF++I +  E
Sbjct: 78  DPVYFRKGYGGEIIQKLIE------QESIIFVDANKQNEGAVKFYQSQGFQVIGESKE 129


>CLOTE Q891D4 (Q891D4) Ribosomal-protein-alanine acetyltransferase
           (EC 2.3.1.128)
          Length = 152

 Score = 34.3 bits (77), Expect = 0.88
 Identities = 30/118 (25%), Positives = 54/118 (45%), Gaps = 18/118 (15%)

Query: 58  VFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYI 117
           ++ V I+ N + +GYG ++ + DE +       T+  ++        PNY   GI +  +
Sbjct: 48  LYIVAIKDNKI-LGYGGLWIILDEGHV------TNIAIH--------PNYRQLGIASLVL 92

Query: 118 KLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLM 175
             + +   K R  N++ L+  K+N  A   YQK GF  +E+      +    ED  +M
Sbjct: 93  STLIKE-SKNRGVNSITLEVRKSNSVAQNLYQKFGF--VEEGCRKHYYSDNLEDAIIM 147


>CLOAB Q97I16 (Q97I16) Predicted acetyltransferase domain containing
           protein
          Length = 291

 Score = 34.3 bits (77), Expect = 0.88
 Identities = 31/103 (30%), Positives = 37/103 (35%), Gaps = 20/103 (19%)

Query: 68  VPIGYGQIYKMYDELYTDY-----HYPKTDEIVYGMDQFIGEPN------------YWSK 110
           +P+    IY  YDE    Y      +   DEI  G  QFI E N            Y
Sbjct: 177 IPLSIDDIY--YDEAQEYYVDDGAFFISKDEIKIGYGQFIFEHNNITIVNFGIVEQYRGN 234

Query: 111 GIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153
           G G  ++  I   LK  R    V +    NN  AI  Y   GF
Sbjct: 235 GYGRYFLSYILNILKN-RGCKVVYIKVDMNNVPAINLYTSMGF 276


>BACC1 Q72WY7 (Q72WY7) Hypothetical protein
          Length = 186

 Score = 34.3 bits (77), Expect = 0.88
 Identities = 38/163 (23%), Positives = 70/163 (42%), Gaps = 16/163 (9%)

Query: 195 HYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETN 254
           H   N K++ I  +    D V  L  N Y   T+  T+ +K   K       +N   +
Sbjct: 12  HLEKNIKLEDIPNVDLYVDQVVQLFENTYADTTR--TDDEKVLTK-----TMINNYAKGK 64

Query: 255 VKIPNIEYSYISDELSILG-YKEIKGTFLTPEIYSTMSEEEQNLLKRDIASFLRQMHGLD 313
           + IP     Y  + + ++    ++KG     +I S++     +LL  D  SF   M   +
Sbjct: 65  LFIPIKNKKYSKEHMILISLIYQLKGALSINDIKSSLETINDSLLNDD--SFELNMLYKN 122

Query: 314 YTDISECTIDN-KQNVLEEYILLRETIYNDLTDIEKDYIESFM 355
           Y  ++E  +++ KQ+V       R T  N+++ +E   +E F+
Sbjct: 123 YLALTESNVESFKQDVNN-----RVTEVNEISSLEDTKLEKFL 160


>VIBPA Q87G30 (Q87G30) Putative acetyltransferase
          Length = 166

 Score = 33.9 bits (76), Expect = 1.1
 Identities = 22/77 (28%), Positives = 36/77 (46%), Gaps = 6/77 (7%)

Query: 99  DQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIED 158
           DQF G       G+G++ I+ I E      N   + L+ + +N  AI  Y+K GF I  +
Sbjct: 88  DQFHG------LGVGSKLIETITELADNWLNVRRIQLEVNADNEAAIGLYKKHGFEIEGE 141

Query: 159 LPEHELHEGKKEDCYLM 175
             +    +G+  + Y M
Sbjct: 142 AIDASFRDGEFINTYYM 158


>STAES ARGD2_STAES (Q8CSG1) Acetylornithine aminotransferase 2 (EC
           2.6.1.11) (ACOAT 2)
          Length = 375

 Score = 33.9 bits (76), Expect = 1.1
 Identities = 31/133 (23%), Positives = 61/133 (45%), Gaps = 11/133 (8%)

Query: 193 IEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE--KAIYNFLNTN 250
           + + F+N+K D+IE + +  + +    NN Y+  +        G+  E  +A+YN LN
Sbjct: 1   MSYLFNNYKRDNIEFVDANQNELIDKDNNVYLDFSSGIGVTNLGFNMEIYQAVYNQLNLI 60

Query: 251 LETNVKIPNIEYSYISDELS--ILGYKEIKGTFL---TPEIYSTMSEEEQNLLKRDIASF 305
             +    PN+  S I +E++  ++G ++    F    T    + +    +   K +I +F
Sbjct: 61  WHS----PNLYLSSIQEEVAQKLIGQRDYLAFFCNSGTEANEAAIKLARKATGKSEIIAF 116

Query: 306 LRQMHGLDYTDIS 318
            +  HG  Y  +S
Sbjct: 117 KKSFHGRTYGAMS 129


>RICPR Q9ZEC5 (Q9ZEC5) 190 KD ANTIGEN (Sca1)
          Length = 347

 Score = 33.9 bits (76), Expect = 1.1
 Identities = 24/90 (26%), Positives = 43/90 (47%), Gaps = 2/90 (2%)

Query: 197 FDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVK 256
           F N K +  E+IGS   S+      +  F  +   +  K + K+K  Y++ +T ++++VK
Sbjct: 123 FKNGKNNDKELIGSKVISIYGQKELQQNFTLQLLVSASKNFIKDKINYSYGDTQIKSHVK 182

Query: 257 IPNIEYSYISDELSILGYKEIKGTFLTPEI 286
             N  +SY ++ L    Y       +TP I
Sbjct: 183 HHN--HSYNAEALLNYNYLVKNSIIITPNI 210


>PSEPK Q88NL5 (Q88NL5) Acetyltransferase, GNAT family
          Length = 162

 Score = 33.9 bits (76), Expect = 1.1
 Identities = 18/59 (30%), Positives = 28/59 (47%), Gaps = 6/59 (10%)

Query: 98  MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRII 156
           +D     P Y  +G+G R ++     L      NA  LD ++ NP+A+  Y   GF +I
Sbjct: 86  VDMLFVAPGYRGQGVGKRLLRYAISEL------NAEYLDVNEQNPKALGFYLHEGFEVI 138


>LACPL Q88YF8 (Q88YF8) Acetyltransferase (Putative)
          Length = 171

 Score = 33.9 bits (76), Expect = 1.1
 Identities = 13/47 (27%), Positives = 25/47 (53%)

Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153
           +W  G+GT  I+ + ++ +   +   ++L     N RA++ YQ  GF
Sbjct: 98  FWGMGLGTALIEEVLDWARNYSSLERLVLTVQLRNVRAVKLYQHLGF 144


>BACC1 Q737S7 (Q737S7) Acetyltransferase, GNAT family
          Length = 282

 Score = 33.9 bits (76), Expect = 1.1
 Identities = 36/125 (28%), Positives = 59/125 (47%), Gaps = 13/125 (10%)

Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANA---VILDPHKNNPRAIRAYQKSGFRIIEDLPE 161
           PNY  +GIG    + +FE  K E   N    + L+    N RAIR Y K G+  + DL
Sbjct: 87  PNY--RGIGVS--QKLFELHKDEAIQNGCKQLFLEVIVGNDRAIRFYNKLGYEKVYDLSY 142

Query: 162 HELHEGKK---EDCYLMEYR-YDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAY 217
           + L +  K   ++C  +E +  +  A  V+  K+L  H+  N++ D   I  + +
Sbjct: 143 YNLKDMTKIIHKECKGIEVKQLEFPAFKVEIQKWL--HFHINWQNDMDYIEKTNHTFYGA 200

Query: 218 LVNNE 222
            V+N+
Sbjct: 201 YVDND 205


>BACC1 Q734H0 (Q734H0) Acetyltransferase, GNAT family family
          Length = 182

 Score = 33.9 bits (76), Expect = 1.1
 Identities = 45/172 (26%), Positives = 71/172 (41%), Gaps = 24/172 (13%)

Query: 8   ICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWED---EVFRVIIE 64
           +CI    +DD    ++ L +++ L    G    Y LE     + + W D   E+ R  IE
Sbjct: 10  LCIEPFTNDDV-CRIRELANDKELANILGLPHPYKLE-----FAQDWVDMQPELIRKGIE 63

Query: 65  YNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQ-----FIGEPNYWSKGIGTRYIKL 119
           Y   P+G   + K   E+        T  I  G ++     +IG+ NYW KG  T  +
Sbjct: 64  Y---PLGI--VSKESREIVGTI----TLRIDKGNNRGELGYWIGK-NYWGKGFATEALNR 113

Query: 120 IFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKED 171
           + +F   E   N +       N  +I+  +KSG R    L ++ L     ED
Sbjct: 114 MIQFGFIELGLNKIWASAISRNRSSIKVLEKSGLRKEGTLRQNRLLLNTYED 165


>BACAN Q81Q67 (Q81Q67) Acetyltransferase, GNAT family
          Length = 282

 Score = 33.9 bits (76), Expect = 1.1
 Identities = 34/125 (27%), Positives = 57/125 (45%), Gaps = 13/125 (10%)

Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANA---VILDPHKNNPRAIRAYQKSGFRIIEDLPE 161
           PNY   G+  +    +FE  K+E   N    + L+    N RAIR Y K G+  + DL
Sbjct: 87  PNYRGVGVSQK----LFELHKEEALQNECKQLFLEVIVGNDRAIRFYNKLGYEKVYDLSY 142

Query: 162 HELHEGKK---EDCYLMEYR-YDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAY 217
           + L +  K    +C  +E +  +  A  V+  K+L  H+  N++ D   I  + +
Sbjct: 143 YNLKDMTKIIHRECKGIEVKQLEFAAFKVEIQKWL--HFHINWQNDMDYIEKTNHTFYGA 200

Query: 218 LVNNE 222
            V+N+
Sbjct: 201 YVDND 205


>BACAN Q81P77 (Q81P77) Acetyltransferase, GNAT family
          Length = 181

 Score = 33.9 bits (76), Expect = 1.1
 Identities = 45/181 (24%), Positives = 76/181 (41%), Gaps = 32/181 (17%)

Query: 10  IRTLIDDDFPLMLKWLTDERVLEFYGGRDK--KYTLESLKKHYTEPWEDEVF-------- 59
           +R L  DD     +W  D +V +     D+   +TLE  K+     W +
Sbjct: 8   LRELTLDDVEDRYQWSLDTKVTKHLVVSDQYPPFTLEDTKQ-----WIEACINRKNGYEQ 62

Query: 60  RVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKL 119
           R I   N + IG+ ++ K +D+        K  E+       IG   YW KG G   +
Sbjct: 63  RAITAENGIHIGWIEL-KNFDKTN------KNAELGIA----IGNKEYWGKGDGIAALYS 111

Query: 120 IFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHE-LHEGKKEDCYLMEYR 178
           +      E     V L   ++N +A ++Y+K+GF + E L  ++ L +G+    ++  YR
Sbjct: 112 MLHVAFFEFELEKVWLRVDEDNLQARKSYEKAGF-VCEGLMRNDRLRKGR----FIHRYR 166

Query: 179 Y 179
           Y
Sbjct: 167 Y 167


>Q8E1I7 (Q8E1I7) Acetyltransferase, GNAT family
          Length = 186

 Score = 33.5 bits (75), Expect = 1.5
 Identities = 22/64 (34%), Positives = 29/64 (45%), Gaps = 1/64 (1%)

Query: 93  EIVYGMDQFIG-EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKS 151
           EI +  D FI  + +YW  GIG   ++   E+         + L     N RAI  YQK
Sbjct: 97  EIEHVGDLFIAVQKDYWGYGIGHILMEEAIEWASDNDITRRLELSVQGRNERAIHLYQKF 156

Query: 152 GFRI 155
           GF I
Sbjct: 157 GFEI 160


>Q8DXE6 (Q8DXE6) Hypothetical protein SAG1905
          Length = 212

 Score = 33.5 bits (75), Expect = 1.5
 Identities = 29/93 (31%), Positives = 44/93 (47%), Gaps = 9/93 (9%)

Query: 219 VNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIK 278
           +N  Y+F+  F   K  GY KE+A+    N  + + +K   I Y    D LS+L  KEI
Sbjct: 125 LNLRYLFERLFEDEKGGGYPKERAVPEQRNARILSEIK--QITY---RDLLSVL--KEID 177

Query: 279 GTFLTPEIYSTMSEEE--QNLLKRDIASFLRQM 309
             FL   I     +E    N   ++IA +L+ +
Sbjct: 178 QDFLKETISGEHFQEYFFANCQNQNIADYLKSV 210


>OCEIH Q8ET96 (Q8ET96) Hypothetical conserved protein
          Length = 167

 Score = 33.5 bits (75), Expect = 1.5
 Identities = 26/90 (28%), Positives = 41/90 (45%), Gaps = 15/90 (16%)

Query: 110 KGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKK 169
           KGIG   + LI  +      A+ + LD   +N RAI  Y+K GF +          EG
Sbjct: 92  KGIGKEALNLIKIWAFNSYKAHRLWLDVKTDNKRAITIYKKEGFTL----------EGTL 141

Query: 170 EDCYLMEYRYDDNATNVKAMKYLIEHYFDN 199
            +C  +   Y+    ++  M  L++H +DN
Sbjct: 142 RECLRVGNTYE----SLHVMS-LLKHEYDN 166


>LISIN Q929M8 (Q929M8) Lin2246 protein
          Length = 157

 Score = 33.5 bits (75), Expect = 1.5
 Identities = 23/73 (31%), Positives = 38/73 (52%), Gaps = 1/73 (1%)

Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164
           P+Y  +GIG   +  + E + +E+    + L     N +AIR Y+K+GF+    L +  +
Sbjct: 84  PDYQREGIGQLLMDKMKE-VAREKGFIKISLRVLSINQKAIRFYEKNGFKQEGRLEKEFI 142

Query: 165 HEGKKEDCYLMEY 177
            +GK  D  LM Y
Sbjct: 143 IQGKYVDDILMAY 155


>CLOTE Q895L4 (Q895L4) DNA topoisomerase I (EC 5.99.1.2)
          Length = 696

 Score = 33.5 bits (75), Expect = 1.5
 Identities = 19/52 (36%), Positives = 28/52 (53%), Gaps = 3/52 (5%)

Query: 220 NNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSI 271
           N+EYIF+   S  K  G+     IY +LN + + N+ IP +E   +  E SI
Sbjct: 399 NSEYIFRATGSIVKFDGFM---IIYEYLNEDEKENINIPKLEKGELLKEKSI 447


>CLOPE Q8XIF5 (Q8XIF5) Ribosomal-protein-alanine N-acetyltransferase
          Length = 148

 Score = 33.5 bits (75), Expect = 1.5
 Identities = 23/76 (30%), Positives = 39/76 (51%), Gaps = 4/76 (5%)

Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164
           P Y  +G+G   I  +   L KE N N++ L+  ++N  A   Y+K GF+  E+
Sbjct: 76  PEYRKQGVGNLLIDNLIT-LCKENNINSLTLEVRESNIPAQSLYKKHGFK--EEGIRKNF 132

Query: 165 HEGKKEDCYLMEYRYD 180
           +   KE+  +M +R+D
Sbjct: 133 YNNPKENAIIM-WRHD 147


>BRUSU Q8FVN1 (Q8FVN1) Acyl-CoA dehydrogenase family protein
          Length = 388

 Score = 33.5 bits (75), Expect = 1.5
 Identities = 21/79 (26%), Positives = 38/79 (48%), Gaps = 15/79 (18%)

Query: 181 DNATNVKAMKYLIEH-----YFDNFKVDSIEIIG----------SGYDSVAYLVNNEYIF 225
           +N   ++ ++ ++ H     +FDN +V    +IG          SG ++   L+  E I
Sbjct: 197 NNGLTIRPIRTMMNHATTEVFFDNLRVPVSNLIGEEGKGFRYILSGMNAERILIAAECIG 256

Query: 226 KTKFSTNKKKGYAKEKAIY 244
             K+ T K   YAKE++I+
Sbjct: 257 DAKWFTQKSVNYAKERSIF 275


>BRUME Q8YCN6 (Q8YCN6) ACYL-COA DEHYDROGENASE (EC 1.3.99.-)
          Length = 395

 Score = 33.5 bits (75), Expect = 1.5
 Identities = 21/79 (26%), Positives = 38/79 (48%), Gaps = 15/79 (18%)

Query: 181 DNATNVKAMKYLIEH-----YFDNFKVDSIEIIG----------SGYDSVAYLVNNEYIF 225
           +N   ++ ++ ++ H     +FDN +V    +IG          SG ++   L+  E I
Sbjct: 204 NNGLTIRPIRTMMNHATTEVFFDNLRVPVSNLIGEEGKGFRYILSGMNAERILIAAECIG 263

Query: 226 KTKFSTNKKKGYAKEKAIY 244
             K+ T K   YAKE++I+
Sbjct: 264 DAKWFTQKSVNYAKERSIF 282


>BACSU YQAR_BACSU (P45914) Hypothetical protein yqaR
          Length = 154

 Score = 33.5 bits (75), Expect = 1.5
 Identities = 36/157 (22%), Positives = 74/157 (47%), Gaps = 11/157 (7%)

Query: 201 KVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKK---KGYAKEKAIYNFLNTNLETN--- 254
           ++D   +I S   +V +     Y+ + K     +   KGY     IYN  N  +ET
Sbjct: 3   QIDFGTVITSAITAVFFTGGTNYVLQKKNRKGNEIFTKGYILIDEIYNINNKRIETAAAF 62

Query: 255 VKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKRDIASFLRQMHGLDY 314
           V   N    Y+ ++L    +KE+    L  + +S + ++E N+  ++  ++LR++
Sbjct: 63  VPFYNHPEGYL-EKLHTDYFKELSAFELIVKKFSILFDKELNIKLQEYINYLREVEVALR 121

Query: 315 TDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYI 351
             +++  I  + N  +EYI   E + +++T++ K +I
Sbjct: 122 GFMNDDPI-IEVNFNQEYI---ERLIDEITNLIKKHI 154


>BACAN Q81RF9 (Q81RF9) Acetyltransferase, GNAT family
          Length = 185

 Score = 33.5 bits (75), Expect = 1.5
 Identities = 40/184 (21%), Positives = 71/184 (38%), Gaps = 29/184 (15%)

Query: 6   NEICIRTLIDDDFPLMLKWLTDERVLE-------FYGGRDKKYTLESLKKHYTEPWEDEV 58
           +++ IRT+ + D   +   +  E   E       ++    ++Y++   +K  T   E+ +
Sbjct: 9   DKVTIRTIEESDIKTLWNLVFKEENPEWKKWDAPYFSFSMQEYSVYK-EKMQTRLKEEPL 67

Query: 59  FRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIK 118
             +IIE N   IG    Y         + Y  T  +  G+   I  P YW+ G GT  +
Sbjct: 68  SNLIIENNGQVIGTVGFY---------WEYKPTRWLEMGI--VIYNPAYWNGGYGTEALT 116

Query: 119 LIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYR 178
           L  + L ++     V L     N R ++  +K G  +          EG+   C
Sbjct: 117 LYRDLLFEKMEIGRVGLTTWSGNERMMKVAEKIGMSL----------EGRMRKCRYYNGT 166

Query: 179 YDDN 182
           Y D+
Sbjct: 167 YYDS 170


>VIBPA Q87NP1 (Q87NP1) Putative spermine/spermidine
           acetyltransferase BltD
          Length = 182

 Score = 33.1 bits (74), Expect = 2.0
 Identities = 21/91 (23%), Positives = 42/91 (46%), Gaps = 3/91 (3%)

Query: 85  DYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRA 144
           ++H P T  +   M   +  P++  KG+G+  +  +     +  N   V L+ +  N  A
Sbjct: 89  EFHAPSTGTLWLPMLTIL--PSFKGKGLGSEIVSSVIAVACEYANLQNVGLNVYAENISA 146

Query: 145 IRAYQKSGFRIIEDLPEHELHEGKKEDCYLM 175
            R + + GF  I    + E+  GK+ +C ++
Sbjct: 147 FRFWYRQGFTQIRAF-DQEIEFGKEYNCLVL 176


>THETN Q8RC65 (Q8RC65) Acetyltransferases
          Length = 200

 Score = 33.1 bits (74), Expect = 2.0
 Identities = 16/50 (32%), Positives = 31/50 (62%), Gaps = 1/50 (2%)

Query: 111 GIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLP 160
           G+G++ ++ I +  +K +    ++LD    N +AI+ Y+K G++IIE  P
Sbjct: 134 GLGSKLLEEIEQEARKLK-CKRIVLDVEIENEKAIKLYEKLGYKIIERSP 182


>STRR6 Q8CYV9 (Q8CYV9) Hypothetical protein spr0850
          Length = 166

 Score = 33.1 bits (74), Expect = 2.0
 Identities = 23/81 (28%), Positives = 40/81 (49%), Gaps = 6/81 (7%)

Query: 86  YHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYI-KLIFEFLKKERNANAVILDPHKNNPRA 144
           Y YP  + +  G+  F+ +  Y  KGIG+  + + +  F K  R A    +   K NP++
Sbjct: 80  YAYPDEETVFIGL--FMVDQAYQRKGIGSHIVTEALAYFAKNFRKARLAYV---KGNPQS 134

Query: 145 IRAYQKSGFRIIEDLPEHELH 165
              ++K GF+ I    + EL+
Sbjct: 135 QHFWEKQGFKSIGCEVKQELY 155


>STAES Q8CU70 (Q8CU70) Spermidine acetyltransferase
          Length = 165

 Score = 33.1 bits (74), Expect = 2.0
 Identities = 27/83 (32%), Positives = 37/83 (44%), Gaps = 14/83 (16%)

Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHK-------NNPRAIRAYQKSG 152
           Q I +P +  KG    Y K  FE   K  N    IL+ HK       +N +A+  Y+  G
Sbjct: 81  QIIIKPEFSGKG----YAKFAFE---KAINYAFDILNMHKIYLYVDTDNKKAVHIYESQG 133

Query: 153 FRIIEDLPEHELHEGKKEDCYLM 175
           F+    L E    +GK +D Y M
Sbjct: 134 FKTEGLLKEQFYTKGKYKDAYFM 156


>STAES Q8CS08 (Q8CS08) Hypothetical protein SE1483
          Length = 434

 Score = 33.1 bits (74), Expect = 2.0
 Identities = 38/150 (25%), Positives = 71/150 (47%), Gaps = 24/150 (16%)

Query: 224 IFKTKFSTNKKKGYAK----EKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKE--- 276
           I KT+FST+K KGY K    EK+  N  N + +  ++  N +   I++E+S L
Sbjct: 6   ILKTQFSTSKFKGYLKYINDEKS--NKANHD-KKKIQSLNQDIENINNEMSNLNLNSYSS 62

Query: 277 -IKGTFLTPEIYSTMSEEEQNLLKRDIASFLRQMHGLDYTDISECTID----------NK 325
            I G      I    ++ ++ ++KR  A F    + LD  ++++   D          N
Sbjct: 63  YIIGYMKNNSITKKDNQNKKKVIKRTTAPFNNNSYTLDNKELNKLKDDFDTAEKQGCINY 122

Query: 326 QNVL--EEYILLRETIYNDLTD-IEKDYIE 352
           Q+++  +   L++  +Y+  TD + +D I+
Sbjct: 123 QDIISFDNDFLIKNHLYDAKTDELNEDVIK 152


>RICPR Q9ZDP9 (Q9ZDP9) Hypothetical protein RP278
          Length = 371

 Score = 33.1 bits (74), Expect = 2.0
 Identities = 34/116 (29%), Positives = 52/116 (44%), Gaps = 19/116 (16%)

Query: 231 TNKKKGYAK----EKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEI 286
           T  K+ Y +    E+A+Y+ L    +   K  NI  S   D+L     + +KG  LTPE
Sbjct: 101 TRLKENYIQYDTVEEALYSLLTKETDLIKKANNIPESLTPDDL-----RRLKGENLTPE- 154

Query: 287 YSTMSEEEQNLLKRDIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYND 342
                E+E+   K +  S L  +  +D T  S    D + N + E   L +TI N+
Sbjct: 155 -----EQEEERKKFEYLSILGSI--IDDTKKSNEHYDKRANEINEQ--LNKTIINE 201


>OCEIH Q8CXG0 (Q8CXG0) Diamine N-acetyltransferase
           (Spermine:spermidine acetyltransferase) (EC 2.3.1.57)
          Length = 152

 Score = 33.1 bits (74), Expect = 2.0
 Identities = 26/100 (26%), Positives = 44/100 (44%), Gaps = 12/100 (12%)

Query: 66  NNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLK 125
           ++ PIGY  +          +H  +     +  D+F+    +  KG   +YI LI +++K
Sbjct: 53  DDTPIGYAMV---------GFHSQEKQSAWF--DRFMIAAEHQGKGYAHQYIPLILDYIK 101

Query: 126 KERNANAVILDPHKNNPRAIRAYQKSGFRII-EDLPEHEL 164
            +    ++ L     N  A   Y+K GF +  E  PE EL
Sbjct: 102 MKYQVKSIKLSIIPTNDVAKLLYEKYGFVLTGETDPEGEL 141


>CLOAB Q97J70 (Q97J70) Predicted acetyltransferase
          Length = 171

 Score = 33.1 bits (74), Expect = 2.0
 Identities = 19/69 (27%), Positives = 31/69 (44%)

Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHE 166
           YW  G+G + I  +  + KK      + L    +N RAI+ Y+  GF     +    L +
Sbjct: 98  YWGLGVGRKLIMNLIAWSKKNHIVRKINLRVRTDNYRAIKLYESLGFVNEGTIKRDFLID 157

Query: 167 GKKEDCYLM 175
           G+  D + M
Sbjct: 158 GEFYDSFSM 166


>BURMA Q9AI54 (Q9AI54) DedA family protein
          Length = 1925639

 Score = 33.1 bits (74), Expect = 2.0
 Identities = 50/238 (21%), Positives = 103/238 (43%), Gaps = 28/238 (11%)

Query: 12     TLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRVIIEYNNVPIG 71
              T+ ++++  M+     ++VLE  G ++K         +YT         ++I+Y N  I
Sbjct: 546537 TINENZYMEMITKDNLKQVLENLGFKNKNENYVKTINNYT---------LLIDYKNQSIN 546587

Query: 72     YGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFL----KKE 127
              Y +  K++D+  +++ +P+   +   + + +       KG    Y++L  ++     KK
Sbjct: 546588 YPKEIKIHDKTTSNFSHPENFVVFECVHRLL------EKGYKAEYLELEPKWNLGRDKKG 546641

Query: 128    RNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYDDNATNVK 187
                A+ ++ D ++NNP  I   + +  +  E +   E +  +++   L  Y   +     K
Sbjct: 546642 GKADILVKD-NENNPYLIIECKTTDSKNSEFI--KEWNRMQEDGGQLFSYFQQE-----K 546693

Query: 188    AMKYLIEHYFD-NFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIY 244
               +KYL  +  D + K++    I   YD+  YL   E     K S N  + +   K  Y
Sbjct: 546694 GVKYLCLYTSDFSDKLEYKNYIIQAYDNEEYLKEKELQNSYKKSNNNIELFKTWKESY 546751


 Score = 31.2 bits (69), Expect = 7.4
 Identities = 20/73 (27%), Positives = 36/73 (49%), Gaps = 2/73 (2%)

Query: 105     PNYWSKGIGTRYIKLIFEFLKK-ERNANAVILDPHKNNPRAIRAYQKSGFRII-EDLPEH 162
               P++  +G+G+R  + +  + +  E     + L     NP A+R Y++ GFR     +
Sbjct: 1424334 PDHQGRGVGSRLFESLIAWARSAEPEIVRIELAAGAGNPGAVRLYERLGFRHEGRQVARG 1424393

Query: 163     ELHEGKKEDCYLM 175
                L +G+ ED  LM
Sbjct: 1424394 RLPDGRFEDDILM 1424406


>BRAJA Q89YE3 (Q89YE3) Bll0009 protein
          Length = 250

 Score = 33.1 bits (74), Expect = 2.0
 Identities = 14/56 (25%), Positives = 31/56 (55%), Gaps = 4/56 (7%)

Query: 104 EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
           +PN+  KG+GT    L+  +  +  + + ++     +NP  I  YQ+ GF+++ ++
Sbjct: 165 DPNWVGKGLGT----LLMNYALQRCDEDGIVAYLESSNPENIPFYQRHGFKVVGEI 216


>BACAN Q81NK7 (Q81NK7) Streptothricin acetyltransferase, putative
          Length = 184

 Score = 33.1 bits (74), Expect = 2.0
 Identities = 31/119 (26%), Positives = 54/119 (45%), Gaps = 6/119 (5%)

Query: 40  KYTLE---SLKKHYTEPWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVY 96
           +YT+E   S +K Y +   +E+  V  EY N P     I  +++++       K
Sbjct: 41  EYTVEDVPSYEKSYLQNDNEEL--VYNEYINKPNQIIYIALLHNQIIGFIVLKKNWNNYA 98

Query: 97  GMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRI 155
            ++    +  Y + G+G R I    ++ K E N   ++L+   NN  A + Y+K GF I
Sbjct: 99  YIEDITVDKKYRTLGVGKRLIAQAKQWAK-EGNMPGIMLETQNNNVAACKFYEKCGFVI 156


>VIBUY Q7MI31 (Q7MI31) Ribosomal-protein-alanine acetyltransferase
          Length = 150

 Score = 32.7 bits (73), Expect = 2.6
 Identities = 19/75 (25%), Positives = 34/75 (45%), Gaps = 1/75 (1%)

Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164
           P    KG G + +    +   +  NA +  L+  ++N  AI  YQ+ GF  ++    +
Sbjct: 74  PKQQGKGYGRQLLDAFIDE-GEAANAESAWLEVRESNVNAIHLYQEMGFNEVDRRRNYYP 132

Query: 165 HEGKKEDCYLMEYRY 179
            +  KED  +M Y +
Sbjct: 133 TQSGKEDAIIMSYLF 147


>OCEIH Q8EMB8 (Q8EMB8) Glucose-6-phosphate 1-dehydrogenase (EC
           1.1.1.49) (G6PD)
          Length = 491

 Score = 32.7 bits (73), Expect = 2.6
 Identities = 45/198 (22%), Positives = 79/198 (39%), Gaps = 25/198 (12%)

Query: 53  PWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTD----EIVYGMDQFIG--EPN 106
           PW DEV R  +E N++         +  E  + ++Y   D    E   G+++ I   E
Sbjct: 51  PWTDEVLRENVE-NSIQDALSPDEDL-SEFISHFYYKSFDVTEKESYQGLNEIIQNLEGQ 108

Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHE 166
           Y ++G    Y+ +  +F     N         + N   ++        +IE    H+L
Sbjct: 109 YQTEGNRLFYLAMAPDFFGAIAN---------QLNDYGLKNTSGWTRLVIEKPFGHDLPS 159

Query: 167 GKKEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFK 226
            KK +  L     +D         Y I+HY     V +IE+I        +L NN +I
Sbjct: 160 AKKLNHELQAAFREDQI-------YRIDHYLGKEMVQNIEVIRFANGIFEHLWNNRFISN 212

Query: 227 TKFSTNKKKGYAKEKAIY 244
            + ++++  G  +E+A Y
Sbjct: 213 IQITSSETLG-VEERARY 229


>OCEIH Q8EKW6 (Q8EKW6) Acetyltransferase
          Length = 166

 Score = 32.7 bits (73), Expect = 2.6
 Identities = 36/165 (21%), Positives = 64/165 (38%), Gaps = 16/165 (9%)

Query: 6   NEICIRTLIDDDFPLMLKWLTDERVLEFYGG---RDKKYTLESLKKHYT--EPWEDEVFR 60
           N +  R   D+DFP +   L D  V+ F G    RD K   + L+  Y   +       +
Sbjct: 5   NRLTFRPYHDNDFPFLQSLLQDPEVVRFIGDGNVRDDKACNDFLQWIYDTYKNGNGLGLQ 64

Query: 61  VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120
           V++   N  +G+  +     E   +       EI Y + +      +W KG  T     +
Sbjct: 65  VLVNKQNERVGHAGLVPQTVEGKNEI------EIGYWIAK-----KHWGKGYATEAALAL 113

Query: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELH 165
           F F +K    + VI    + N  +    +K   +I +++   + H
Sbjct: 114 FAFARKNIEVDRVISLIQRENTASRNVAEKLMMKIEKEIILKDKH 158


>MYCPE Q8EWC2 (Q8EWC2) Oligoendopeptidase F
          Length = 604

 Score = 32.7 bits (73), Expect = 2.6
 Identities = 30/121 (24%), Positives = 54/121 (44%), Gaps = 7/121 (5%)

Query: 181 DNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE 240
           DN T    +  L ++ F  FK+D + I    Y+ V+  +N +   K   + N+K
Sbjct: 37  DNGTCYSNLNKLKKYLF--FKLDMVPIENKLYNYVSNKLNEDLANKEMINWNQKLSSKIS 94

Query: 241 KAIYNFLNTNLETNVKIPNIEY--SYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLL 298
           +   +F N   E N+ + N E   S+I ++  I  ++         E +   +EEE+ L+
Sbjct: 95  EFQLSFAN---EINIILDNKELIKSFIENDSEIKKFERFFDLIFKEENHKLSNEEEKLLV 151

Query: 299 K 299
           K
Sbjct: 152 K 152


>LISMO Q8Y8S4 (Q8Y8S4) Lmo0820 protein
          Length = 185

 Score = 32.7 bits (73), Expect = 2.6
 Identities = 21/74 (28%), Positives = 33/74 (44%), Gaps = 3/74 (4%)

Query: 98  MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE 157
           +D  +    Y   G+GT  +  + E +  E     V L+  K NP A R Y++ GF +
Sbjct: 113 LDSIVTNEKYRGHGVGTALLAKLTE-IAAEDGEKVVGLNCDKGNPHAKRLYERLGFHVTG 171

Query: 158 D--LPEHELHEGKK 169
           +  L  HE    +K
Sbjct: 172 EITLSGHEYEHMQK 185


>CLOPE Q8XKM5 (Q8XKM5) Probable acetyltransferase
          Length = 167

 Score = 32.7 bits (73), Expect = 2.6
 Identities = 18/76 (23%), Positives = 36/76 (47%), Gaps = 8/76 (10%)

Query: 85  DYHYPKTDEIVYGMD-------QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDP 137
           DY Y   ++I + +D       +    P++  KG G + I  + E + KE+  N++ +
Sbjct: 69  DYAYDVYNDIAWQVDGPFLSFHRIAVSPSHRGKGYGRKMIDFV-EEMAKEKKCNSIRISA 127

Query: 138 HKNNPRAIRAYQKSGF 153
           +  N  A+  Y+  G+
Sbjct: 128 YHKNENAVNLYKNLGY 143


>BUCAI SPED_BUCAI (P57304) S-adenosylmethionine decarboxylase
           proenzyme (EC 4.1.1.50) (AdoMetDC) (SamDC) [Contains:
           S-adenosylmethionine decarboxylase beta chain;
           S-adenosylmethionine decarboxylase alpha chain]
          Length = 265

 Score = 32.7 bits (73), Expect = 2.6
 Identities = 30/103 (29%), Positives = 46/103 (44%), Gaps = 11/103 (10%)

Query: 169 KEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTK 228
           + D   +EYR      ++  +K+ I+H  ++ +    + I S YD V   V  E IF T+
Sbjct: 159 ESDIVTIEYRVRGFTRDIHGIKHFIDHKINSIQNFMSDDIKSMYDMVDVNVYQENIFHTR 218

Query: 229 FSTNKKKGYAKEKAIYNFL-NTNLETNVKIPNIEYSYISDELS 270
                     +E  + N+L N NLE    +   E SYI   LS
Sbjct: 219 M-------LLREFNLKNYLFNINLE---NLEKEERSYIKKLLS 251


>AQUAE O67458 (O67458) Hypothetical protein aq_1482
          Length = 161

 Score = 32.7 bits (73), Expect = 2.6
 Identities = 18/60 (30%), Positives = 30/60 (50%), Gaps = 1/60 (1%)

Query: 95  VYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154
           V  + + + +P Y   G+GT  +  I E+ KK +  +   L     N +AI  Y+K GF+
Sbjct: 87  VGAIHEIVVDPEYQGHGVGTALMNTILEYFKK-KGLDTAELWVGDENYKAINFYKKFGFQ 145


>YERPE Q8CZN5 (Q8CZN5) Acyltransferase for 30S ribosomal subunit
           protein S18
          Length = 161

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 20/72 (27%), Positives = 34/72 (47%), Gaps = 1/72 (1%)

Query: 104 EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHE 163
           +P Y  +G G   ++ + E L+ ERN   + L+   +N RAI  Y+  GF  +     +
Sbjct: 86  DPQYQRQGYGRLLLEHLIEQLE-ERNIVTLWLEVRASNARAIALYESLGFNEVSVRRNYY 144

Query: 164 LHEGKKEDCYLM 175
                +ED  +M
Sbjct: 145 PSANGREDAIMM 156


>STRAW Q827N9 (Q827N9) Putative acetyltransferase
          Length = 166

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 19/59 (32%), Positives = 31/59 (52%), Gaps = 1/59 (1%)

Query: 109 SKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEG 167
           S+GIG+  I+   E L +ER  + + L    +NPRA   Y + G+R +    +   +EG
Sbjct: 88  SRGIGSALIRAAEE-LTRERGLDVIGLGVGTDNPRAAELYARLGYRPLTGYVDRWSYEG 145


>STAES Q8CU81 (Q8CU81) Spermidine acetyltransferase
          Length = 165

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 24/80 (30%), Positives = 35/80 (43%), Gaps = 8/80 (10%)

Query: 100 QFIGEPNYWSKGIGTRYIKLIFE----FLKKERNANAVILDPHKNNPRAIRAYQKSGFRI 155
           Q I +P +  KG    Y K  FE    +     N + + L    +N +AI  Y+  GF+
Sbjct: 81  QIIIKPEFSGKG----YAKFAFEKAIIYAFNILNMHKIYLYVDADNKKAIHIYESQGFKT 136

Query: 156 IEDLPEHELHEGKKEDCYLM 175
              L E    +GK +D Y M
Sbjct: 137 EGLLKEQFYTKGKYKDAYFM 156


>RICPR Q9ZCN0 (Q9ZCN0) RIBOSOMAL-PROTEIN-ALANINE ACETYLTRANSFERASE
           (RimJ)
          Length = 183

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 26/85 (30%), Positives = 41/85 (48%), Gaps = 12/85 (14%)

Query: 93  EIVYGMDQFIGEPNYWSKGIGTRYIKLIFEF---LKKERNANAVILDPHKNNPRAIRAYQ 149
           EI Y +D     PN+W +GI  + IK I +F   +   R    VI D    N R++   +
Sbjct: 103 EISYDLD-----PNFWGQGIMLKSIKNILKFADCIGIIRVQATVITD----NFRSVNLLE 153

Query: 150 KSGFRIIEDLPEHELHEGKKEDCYL 174
           + GF     L ++E+   K +D Y+
Sbjct: 154 RCGFSKEGILKKYEIIANKHKDYYM 178


>OCEIH Q8ERM1 (Q8ERM1) Acetyltransferase
          Length = 177

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 39/175 (22%), Positives = 64/175 (36%), Gaps = 15/175 (8%)

Query: 5   ENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLK-KHY---TEPWEDEVFR 60
           + E+ IR + + D   + + +  E   E+       ++ ES+  +H+    E W D   R
Sbjct: 4   DQELTIRPIQEKDLKRLWELIYKEDNPEWKQWDAPYFSHESMSYEHFLKEAESWIDAKSR 63

Query: 61  VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120
            ++  NN   G              Y+Y    +    M     E N W KG GT  +KL
Sbjct: 64  WVVCVNNDVHGT-----------VSYYYEDEQKNWLEMGIIFYEGNNWGKGYGTTALKLW 112

Query: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLM 175
              +  +     V L     N R IR  +K G  +   +     + G+  D   M
Sbjct: 113 VNHIFTQLPVVRVGLTTWSGNKRMIRVAEKLGMTMEGRIRNVRYYNGEYYDSIRM 167


>MYCGE RIBF_MYCGE (P47391) Putative riboflavin biosynthesis protein
           ribF [Includes: Riboflavin kinase (EC 2.7.1.26)
           (Flavokinase); FMN adenylyltransferase (EC 2.7.7.2) (FAD
           pyrophosphorylase) (FAD synthetase)]
          Length = 269

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 16/44 (36%), Positives = 27/44 (61%), Gaps = 3/44 (6%)

Query: 419 TNFGEDILRMY-GNIDIEKAKEYQDIVEEYYPIETIVYGIKNIK 461
           TN    ++R Y  N ++EKA +   +VE YY + T+V+G+K  +
Sbjct: 120 TNLSSSVIRNYLTNNELEKANQL--LVEPYYRVGTVVHGLKKAR 161


>ENTFA Q836M4 (Q836M4) Spermine/spermidine acetyltransferase,
           putative
          Length = 148

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 16/56 (28%), Positives = 29/56 (51%)

Query: 98  MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153
           +D+F+ +  +  +G G    +L+   L ++   N + L  +  N  AIR YQ+ GF
Sbjct: 72  LDRFLIDQRFQGQGYGKAACRLLMLKLIEKYQTNKLYLSVYDTNSSAIRLYQQLGF 127


>CHRVO Q7NRZ7 (Q7NRZ7) Probable acetyltransferase (EC 2.3.1.-)
          Length = 172

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 20/70 (28%), Positives = 31/70 (44%)

Query: 106 NYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELH 165
           ++  KG G   ++ I +          + L    +N RAI  Y+K GF     L + +L
Sbjct: 92  DWQGKGAGGAMMRAIIDLADNWLGLIRIELKVIHDNARAIALYEKFGFEYEGRLRQEQLR 151

Query: 166 EGKKEDCYLM 175
            GK ED  +M
Sbjct: 152 AGKLEDVLVM 161


>CHLCV Q824H9 (Q824H9) Acetyltransferase, GNAT family
          Length = 170

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 18/52 (34%), Positives = 29/52 (55%), Gaps = 2/52 (3%)

Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153
           +GEP Y +KGIGT  +  +    K   +   + L+ ++ NP AI  Y++ GF
Sbjct: 95  VGEP-YRNKGIGTALLNNLCHLAKSRFHLEILYLEVYEENP-AIELYKRFGF 144


>CHLMU SYR_CHLMU (Q9PJT8) Arginyl-tRNA synthetase (EC 6.1.1.19)
           (Arginine--tRNA ligase) (ArgRS)
          Length = 563

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 56/268 (20%), Positives = 106/268 (39%), Gaps = 63/268 (23%)

Query: 204 SIEIIGSGYD----SVAYLVNNEYIFKTKFSTNKKKGY---AKEKAIYNFLNTNLETNVK 256
           SIEI G+G+     S  +L N   IF  + +    KG+   + +K I +F + N+  ++
Sbjct: 75  SIEIAGAGFINFTFSKEFLANQLQIFSQELA----KGFPVSSPQKVIIDFSSPNIAKDMH 130

Query: 257 IPNIEYSYISDEL----SILGYKEIK-----------GTFLT--PEIYSTMSEEEQNLLK 299
           + ++  + I D L    S +G+  ++           G  +T   E   T   + +NL +
Sbjct: 131 VGHLRSTIIGDCLARCFSFVGHDVLRLNHIGDWGTAFGMLITYLQETAQTDIHQLENLTE 190

Query: 300 RDIASFLRQMHGLDYTDISE--------------------CTIDNKQ-----NVLEEYIL 334
               + +R     ++   S+                    C +  K      ++L+  +
Sbjct: 191 LYKKAHVRFAEDPEFKKRSQYNVVALQSGDPQALALWKQICAVSEKSFQKIYSILDVELH 250

Query: 335 LR-ETIYND-LTDIEKDYIESFMERLNATTVFEGKKCLCHNDFSCNHLLL---DGNNRLT 389
            R E+ YN  L D+  D     +E  N  T+ +G KC+ H +FS   ++     G N  T
Sbjct: 251 TRGESFYNPFLADVVSD-----LESKNLVTLSDGAKCVFHEEFSIPLMIQKSDGGYNYAT 305

Query: 390 XXXXXXXXXXXXEYCDFIYLLEDSEEEI 417
                       ++ D I ++ DS + +
Sbjct: 306 TDVAAMRYRIQQDHADRILIVTDSGQSL 333


>BACSU O34376 (O34376) Putative acetyl transferase (YobR protein)
          Length = 247

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 24/84 (28%), Positives = 37/84 (44%), Gaps = 2/84 (2%)

Query: 76  YKMYD-ELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVI 134
           +KMYD E  T        +   G+   +    +  KG GT+ I+++ E+ K    A  +
Sbjct: 158 FKMYDKESLTALGTVSVIDGYGGLSNIVVAEEHRGKGAGTQVIRVLTEWAKNN-GAERMF 216

Query: 135 LDPHKNNPRAIRAYQKSGFRIIED 158
           L   K N  A+  Y K GF  I +
Sbjct: 217 LQVMKENLAAVSLYGKIGFSPISE 240


>BACC1 Q738E9 (Q738E9) Acetyltransferase, GNAT family
          Length = 308

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 38/159 (23%), Positives = 59/159 (37%), Gaps = 25/159 (15%)

Query: 62  IIEYNNVPIGYGQIYKM-YDELYTDYHYPKTDEIVYG-------------MDQFIGEPNY 107
           +I+YN  P GY  +  M Y     D +    D  + G             +D+   EP Y
Sbjct: 39  VIDYNIQPPGYSSVEMMRYSIEELDCYKVIMDGKIIGGIIVTISGKSYGRIDRIFVEPVY 98

Query: 108 WSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEG 167
             KGIG+  IKLI E     R  +        NN      Y+K G+  I         +
Sbjct: 99  QGKGIGSYVIKLIEEEYPSIRIWDLETSSRQLNNH---HFYKKMGYETI--------FKS 147

Query: 168 KKEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIE 206
           + E CY+     +    N+   K +    ++N  + + E
Sbjct: 148 EDEYCYVKRITVESAEENLIKNKDMKNSQYENCNLANTE 186


>BACC1 Q736C5 (Q736C5) Acetyltransferase, GNAT family
          Length = 181

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 23/79 (29%), Positives = 39/79 (49%), Gaps = 6/79 (7%)

Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161
           IG   YW KG G   +  +      E     V L   ++N +A ++Y+K+GF + E L
Sbjct: 94  IGNKEYWGKGYGIAALYSMLHVAFFEFELEKVWLRVDEDNFQARKSYEKAGF-VCEGLMR 152

Query: 162 HE-LHEGKKEDCYLMEYRY 179
           ++ L +G+    ++  YRY
Sbjct: 153 NDRLRKGQ----FIHRYRY 167


>BACC1 Q735P8 (Q735P8) Acetyltransferase, GNAT family
          Length = 149

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 31/122 (25%), Positives = 49/122 (40%), Gaps = 21/122 (17%)

Query: 50  YTEPWEDEVFRVIIE--YNNVP--IGYGQIYKMYDELY---TDYHYPKTDEIVYGMDQFI 102
           Y  P  +E   V  E  YN+ P  +G+ +  K   +L      Y   K D+IV G   F
Sbjct: 9   YIVPCTEESIHVANEQGYNSGPHIVGHVENVKQDKDLLPWGAWYVIRKEDDIVLGDIGFK 68

Query: 103 GEPN--------------YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAY 148
           G+PN              YW+KG  T  ++ +  +  +      +I +    N  +IR
Sbjct: 69  GKPNEEHTVEVGYGFIEKYWNKGYATEAVRELINWAFQTGEVEMIIAETLLENESSIRVL 128

Query: 149 QK 150
           +K
Sbjct: 129 EK 130


>AQUAE YZ34_AQUAE (O66423) Hypothetical protein AA34
          Length = 318

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 25/71 (35%), Positives = 37/71 (52%), Gaps = 5/71 (7%)

Query: 414 EEEIGTNFGEDILRMYGNIDIE-KAKEYQDIVEEYYPI----ETIVYGIKNIKQEFIENG 468
           EE IG   GE + +    +  E KAKE +  V++   I    ET+ Y IK I +E I +
Sbjct: 215 EELIGETLGELLEKEIEKLVAEEKAKEIEGKVKKLKEIVSWFETLPYEIKQIAKEVISDN 274

Query: 469 RKEIYKRTYKD 479
             +I ++ YKD
Sbjct: 275 VLDIAEKFYKD 285


>YERPE Q8ZCX6 (Q8ZCX6) Spermidine acetyltransferase (EC 2.3.1.57)
           (Spermidine N1-acetyltransferase)
          Length = 181

 Score = 32.0 bits (71), Expect = 4.4
 Identities = 21/69 (30%), Positives = 29/69 (42%)

Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
           Q I +P +  KG      KL  E+     N   + L   K N +AI  Y K GF I  +L
Sbjct: 87  QIIIDPTHQGKGYAGAAAKLAMEYGFSVLNLYKLYLIVDKENEKAIHIYSKLGFEIEGEL 146

Query: 160 PEHELHEGK 168
            +     G+
Sbjct: 147 KQEFFINGE 155


>STRR6 Q8DNN2 (Q8DNN2) Hypothetical protein spr1627
          Length = 148

 Score = 32.0 bits (71), Expect = 4.4
 Identities = 14/58 (24%), Positives = 30/58 (51%)

Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE 157
           +F   P    +G+G++ ++       +  + +++ L+  + N RA   YQK GF I++
Sbjct: 76  RFFINPQKQEQGLGSQALRKFVSLAFENEDIDSISLNVFEANQRAQNLYQKEGFEIVQ 133


>STAAM Q99RQ8 (Q99RQ8) Similar to transcription repressor of
           sporulation, septation and degradation PaiA
          Length = 171

 Score = 32.0 bits (71), Expect = 4.4
 Identities = 20/65 (30%), Positives = 35/65 (53%), Gaps = 4/65 (6%)

Query: 111 GIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKE 170
           G G++ I+L  E + +E N + + L   ++NPRA   Y++ GF+++    EH    G
Sbjct: 106 GRGSQLIELA-EKIAQEHNKHKIWLGVWEHNPRAQAFYKRHGFKVV---GEHHFQTGDVT 161

Query: 171 DCYLM 175
           D  L+
Sbjct: 162 DTDLI 166


>LACLA Q9CJA2 (Q9CJA2) Acetyl transferase
          Length = 162

 Score = 32.0 bits (71), Expect = 4.4
 Identities = 21/69 (30%), Positives = 32/69 (46%), Gaps = 2/69 (2%)

Query: 110 KGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKK 169
           KG+ T  I    +F KKE     + +    +NP A++ Y K GF     L +    +G+
Sbjct: 89  KGVATTLINFFIDFAKKE-GFKKITIQVMGSNPAALKLYNKLGFVEEGRLKKEFFIDGEY 147

Query: 170 -EDCYLMEY 177
            +DC L  Y
Sbjct: 148 IDDCILAFY 156


>CLOTE Q892J2 (Q892J2) Conserved protein
          Length = 218

 Score = 32.0 bits (71), Expect = 4.4
 Identities = 39/154 (25%), Positives = 57/154 (37%), Gaps = 21/154 (13%)

Query: 219 VNNEYIFK-------------TKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYI 265
           VNN  IFK             T F++ K +G       Y  LN     N+   N   S +
Sbjct: 9   VNNTPIFKCNYCGHCSKEIEATSFTSVKNRGCCWYFPKYTLLNIKNILNIGKENFIISLL 68

Query: 266 SDELSILG--YKEIKGTFLTPEIYSTMSEEEQNLLKRDIASFLRQMHGLDYTDISECTID 323
           +++ S +   + E+KG+F   E Y  M E E      D   F R+     +     C++D
Sbjct: 69  NNKNSNISSYFIEVKGSFEEEEYYKFMRENEYTESSFDYKLFFRK---CSFVTDKGCSLD 125

Query: 324 NKQNVLEEYILLRETIYNDLTDIEKDYIESFMER 357
                    + L   I N     +KDY     ER
Sbjct: 126 FSLRPHPCNLYLCRNIIN---TCDKDYSSFSRER 156


>BRAJA Q89UA8 (Q89UA8) Hypothetical acetyltransferase
          Length = 148

 Score = 32.0 bits (71), Expect = 4.4
 Identities = 19/56 (33%), Positives = 33/56 (58%), Gaps = 5/56 (8%)

Query: 98  MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153
           +DQ + +P  W    G+   +L+ E  K+  + + V L  +K+N RAIR Y+++GF
Sbjct: 76  LDQLVVDPASW----GSDAARLLVEEAKR-LSPSGVTLLVNKDNTRAIRFYERNGF 126


>BACSU O34558 (O34558) YopR protein
          Length = 325

 Score = 32.0 bits (71), Expect = 4.4
 Identities = 17/45 (37%), Positives = 26/45 (57%), Gaps = 5/45 (11%)

Query: 211 GYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNV 255
           G     +LV N+Y+ KTK ++NK  G A +     F+ TNL T++
Sbjct: 203 GQTKEVFLVENDYVVKTKRTSNKGDGQASK-----FVITNLITDI 242


>BACAN Q81R63 (Q81R63) Hypothetical protein
          Length = 217

 Score = 32.0 bits (71), Expect = 4.4
 Identities = 15/45 (33%), Positives = 27/45 (60%)

Query: 324 NKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNATTVFEGKK 368
           N+ NVL E +  +E +   L++ +KDYI+S  E++  T   E ++
Sbjct: 141 NQMNVLNESVTTQEELQRYLSENKKDYIKSVAEKVYQTATEEKRE 185


>VIBPA Q87MD0 (Q87MD0) Acetyltransferase-related protein
          Length = 168

 Score = 31.6 bits (70), Expect = 5.7
 Identities = 15/53 (28%), Positives = 26/53 (49%)

Query: 101 FIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153
           FI +  YW KG+ T  +K  F    +E   + V  + + N+  ++   +K GF
Sbjct: 86  FIFDKAYWGKGLATEALKAFFPKACRELELHKVKANVNSNHQASMAVLEKLGF 138


>STRR6 Q8DND0 (Q8DND0) Transcriptional activator
          Length = 299

 Score = 31.6 bits (70), Expect = 5.7
 Identities = 19/81 (23%), Positives = 40/81 (49%), Gaps = 12/81 (14%)

Query: 295 QNLLKRDIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEK------ 348
           Q L+++D+A F+ Q+  L    + +     K   +E Y ++R+T+ + +  +EK
Sbjct: 167 QMLIRKDLAKFINQIEKLMLFLLEQ----KKVTQIENYFIIRDTLISGMCCLEKVGVTDC 222

Query: 349 --DYIESFMERLNATTVFEGK 367
             DY+    E ++ T  ++ K
Sbjct: 223 FNDYLSCLQEIMDKTQDYQKK 243


>OCEIH Q8CUS9 (Q8CUS9) Hypothetical conserved protein
          Length = 161

 Score = 31.6 bits (70), Expect = 5.7
 Identities = 20/76 (26%), Positives = 36/76 (47%), Gaps = 6/76 (7%)

Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164
           P++   GIG+     +  +   +     + L+  +NN +A+  Y   GF II+D  E+
Sbjct: 92  PSHQGIGIGSA----LLHYGVNQLRPREIQLNVEQNNIKALDFYTSKGFEIIKDFQEN-- 145

Query: 165 HEGKKEDCYLMEYRYD 180
            +G   D Y M ++ D
Sbjct: 146 FDGHLLDTYRMSWKLD 161


>LISIN Q92E28 (Q92E28) Lin0633 protein
          Length = 143

 Score = 31.6 bits (70), Expect = 5.7
 Identities = 20/80 (25%), Positives = 37/80 (46%), Gaps = 1/80 (1%)

Query: 75  IYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVI 134
           +Y ++ +    Y +   DE    +  F+    +  KG GT+ ++ + + L KE     +
Sbjct: 55  LYSIFTDQKIGYLWFHVDEKHAFIYDFVIFETFRGKGFGTKTLEAL-DVLAKEMGITKIE 113

Query: 135 LDPHKNNPRAIRAYQKSGFR 154
           L    +N  AI+ Y K GF+
Sbjct: 114 LHVFAHNQTAIKLYDKVGFK 133


>LACPL Q88U49 (Q88U49) Glucose-6-phosphate 1-dehydrogenase (EC
           1.1.1.49) (G6PD)
          Length = 494

 Score = 31.6 bits (70), Expect = 5.7
 Identities = 26/95 (27%), Positives = 45/95 (47%), Gaps = 8/95 (8%)

Query: 151 SGF-RIIEDLPEHELHEGKKEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIG 209
           +GF R+I + P    +E  KE    +   +++N        Y I+HY     + +I  I
Sbjct: 140 NGFNRVIIEKPFGHDYESAKELNDQLTATFNENQI------YRIDHYLGKEMIQNITAIR 193

Query: 210 SGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIY 244
            G +    L NN YI   + + ++K G  +E+A+Y
Sbjct: 194 FGNNIWESLWNNRYIDNVQITLSEKLG-VEERAVY 227


>CORDI Q6NHQ0 (Q6NHQ0) Putative acetyltransferase
          Length = 163

 Score = 31.6 bits (70), Expect = 5.7
 Identities = 18/53 (33%), Positives = 27/53 (50%), Gaps = 1/53 (1%)

Query: 104 EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRII 156
           EP Y  KG+G+  +    E L +   A  + L     NPRA + Y++ GF+ I
Sbjct: 98  EPRYRGKGVGSILLNKSLE-LARTLGAPGLSLSVDDGNPRAKKLYERLGFQHI 149


>BURMA Q62J98 (Q62J98) Ribosomal-protein-alanine acetyltransferase
           (EC 2.3.1.128)
          Length = 165

 Score = 31.6 bits (70), Expect = 5.7
 Identities = 15/43 (34%), Positives = 25/43 (58%), Gaps = 1/43 (2%)

Query: 111 GIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153
           G+G   ++      + ER  + V+L+   +NPRAIR Y++ GF
Sbjct: 87  GVGLALLREAVRIARAER-LDGVLLEVRPSNPRAIRLYERFGF 128


>THETN Q8R764 (Q8R764) LysM-repeat proteins and domains
          Length = 508

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 31/141 (21%), Positives = 53/141 (37%), Gaps = 23/141 (16%)

Query: 235 KGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEE 294
           KGY  E     F+    E    +  +  +Y+S E++ L  KE++  F        ++E+E
Sbjct: 381 KGYRDEYPFRTFVEIEGEVGEVLTEVSTAYVSYEINSL--KELEFKFAIDSCVEVLTEKE 438

Query: 295 QNLLKRDIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESF 354
             L+                 D+ E  +   + V    I+        L DI K Y  +
Sbjct: 439 MTLI----------------YDLKEIEMPRGEEVRHSIIIYMVQKGESLWDIAKRYRVNV 482

Query: 355 MERLNAT-----TVFEGKKCL 370
            + + A       VFEG+K +
Sbjct: 483 EDLITANDLKEDKVFEGEKLI 503


>STRR6 Q8DPX8 (Q8DPX8) Hypothetical protein spr0952
          Length = 253

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 22/106 (20%), Positives = 48/106 (45%), Gaps = 12/106 (11%)

Query: 261 EYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKRDIASFLRQMHGLD------- 313
           E SY+S   +++ Y+E+    + P       +E  + +   +    R++  L
Sbjct: 148 ELSYLS---TLIRYEELY--IINPNQARATPKEHHDFIVNHLVDNTRKLEELAIFERIQI 202

Query: 314 YTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLN 359
           Y     C  D+K+N      +L+E ++ + + +EK+ ++   +RLN
Sbjct: 203 YQRDRSCVYDSKENTTSAADVLQELLFGEWSQVEKEMLQVGEKRLN 248


>STRMU Q8DVT1 (Q8DVT1) Putative ribosomal-protein-alanine
           acetyltransferase (EC 2.3.1.128)
          Length = 144

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 28/124 (22%), Positives = 51/124 (41%), Gaps = 20/124 (16%)

Query: 65  YNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYG---MDQFIGEPN---------YWSKGI 112
           Y   P    QI    + L  DY +   D+ + G   +   +GE           Y  +G+
Sbjct: 22  YQVSPWSQKQILTDMNRLDVDYFFAYDDKEIVGFLSIQHLVGELELTNIAIKKAYQGQGL 81

Query: 113 GTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDC 172
           G++ + ++       ++   + L+   +N  A   YQK GFR +    ++  +   KED
Sbjct: 82  GSQLLAML------TKDELPIFLEVRASNQAAQALYQKFGFRSLTTRKDY--YHNPKEDA 133

Query: 173 YLME 176
            LM+
Sbjct: 134 ILMK 137


>SALTI SPED_SALTI (Q8Z9E3) S-adenosylmethionine decarboxylase
           proenzyme (EC 4.1.1.50) (AdoMetDC) (SamDC) [Contains:
           S-adenosylmethionine decarboxylase beta chain;
           S-adenosylmethionine decarboxylase alpha chain]
          Length = 264

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 19/60 (31%), Positives = 29/60 (48%)

Query: 169 KEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTK 228
           + D   ++YR      +V  MK+ I+H  ++ +    E + S YD V   V  E IF TK
Sbjct: 157 ESDIVTIDYRVRGFTRDVNGMKHFIDHEINSIQNFMSEDMKSLYDMVDVNVYQENIFHTK 216


>SALTY SPED_SALTY (Q8ZRS4) S-adenosylmethionine decarboxylase
           proenzyme (EC 4.1.1.50) (AdoMetDC) (SamDC) [Contains:
           S-adenosylmethionine decarboxylase beta chain;
           S-adenosylmethionine decarboxylase alpha chain]
          Length = 264

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 19/60 (31%), Positives = 29/60 (48%)

Query: 169 KEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTK 228
           + D   ++YR      +V  MK+ I+H  ++ +    E + S YD V   V  E IF TK
Sbjct: 157 ESDIVTIDYRVRGFTRDVNGMKHFIDHEINSIQNFMSEDMKSLYDMVDVNVYQENIFHTK 216


>SALCH Q57T90 (Q57T90) S-adenosylmethionine decarboxylase, proenzyme
          Length = 264

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 19/60 (31%), Positives = 29/60 (48%)

Query: 169 KEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTK 228
           + D   ++YR      +V  MK+ I+H  ++ +    E + S YD V   V  E IF TK
Sbjct: 157 ESDIVTIDYRVRGFTRDVNGMKHFIDHEINSIQNFMSEDMKSLYDMVDVNVYQENIFHTK 216


>RICCN Q92JP8 (Q92JP8) Cell surface antigen
          Length = 1902

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 24/90 (26%), Positives = 41/90 (45%), Gaps = 2/90 (2%)

Query: 197  FDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVK 256
            F N K +  E+I S   S+         F  +   +  K + K+K  Y++ +T +++NVK
Sbjct: 1678 FKNSKNNDKELINSHVVSIYGQKELPKNFALQALVSASKNFIKDKTTYSYGDTKIKSNVK 1737

Query: 257  IPNIEYSYISDELSILGYKEIKGTFLTPEI 286
              N  +SY ++ L    Y       +TP I
Sbjct: 1738 HRN--HSYNAEALLHYNYLLQSKLVITPNI 1765


>NEIGI Q5FAH1 (Q5FAH1) Hypothetical protein
          Length = 177

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 18/45 (40%), Positives = 25/45 (55%), Gaps = 7/45 (15%)

Query: 215 VAYLVNNEYI-------FKTKFSTNKKKGYAKEKAIYNFLNTNLE 252
           + YL++NE +       FK  FSTN+KK    EK I  FL  N++
Sbjct: 69  IDYLISNEILIVRTKFSFKNIFSTNEKKYKEIEKEINKFLYKNMD 113


>LISIN Q92DJ7 (Q92DJ7) Lin0816 protein
          Length = 185

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 16/62 (25%), Positives = 28/62 (45%), Gaps = 1/62 (1%)

Query: 98  MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE 157
           +D  +    Y   G+GT  +  + E    +     V L+  K NP A R Y++ GF +
Sbjct: 113 LDSIVTNEKYRGHGVGTALLAKLTEIAAND-GEKVVGLNCDKGNPHAKRLYERLGFHVTG 171

Query: 158 DL 159
           ++
Sbjct: 172 EI 173


>LACJO Q74J74 (Q74J74) Hypothetical protein
          Length = 150

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 21/58 (36%), Positives = 28/58 (48%), Gaps = 5/58 (8%)

Query: 104 EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161
           +P Y SKGI T  IK     L++      V L+   +N RA   Y+K GF  +  L E
Sbjct: 80  DPIYQSKGIATELIKKALTELERP-----VRLEVFTDNERAKALYRKFGFERVNTLTE 132


>GEOSL Q74A59 (Q74A59) Sensory box histidine kinase
          Length = 1053

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 41/188 (21%), Positives = 78/188 (41%), Gaps = 34/188 (18%)

Query: 176 EYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGS------------GYDSVAYLVNNE- 222
           E+RY D    V+A+K   E YF         ++GS            G D    LV+ E
Sbjct: 106 EHRYGD----VEALKSRYEAYFRKATELYPRVLGSTDTFLSGEIARLGADGRLILVDFER 161

Query: 223 ----YIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIK 278
               Y+   +    + +  A++ +IY F+   +   +  P I  +++++ L I   +E++
Sbjct: 162 MSRDYVTSVEHQIERNRALARDTSIYLFVLFGMVVLLAAPAI--TFVANRLLIRPLEELR 219

Query: 279 GTFLTPEIYSTMSEEEQNLLKRDI--------ASFLRQMHGLDYTDISECTIDNKQNVLE 330
           G   +   ++  S +   L   D         ASF   + GL  T +S   +DN    +
Sbjct: 220 GMVTS---FAGGSLDLSGLPDYDAGDEIGSLCASFRSMVEGLQETTVSRDYVDNIIESMS 276

Query: 331 EYILLRET 338
           + +++ +T
Sbjct: 277 DCLIVVDT 284


>ENTFA Q836Z2 (Q836Z2) Acetyltransferase, GNAT family
          Length = 173

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 19/75 (25%), Positives = 33/75 (44%), Gaps = 1/75 (1%)

Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE-HELH 165
           YW  G+G+  ++ +  +  +      + L     N RAI  Y+K GF     +P   +
Sbjct: 99  YWGYGLGSILMEELIRWAHESHVIRRLELTVQDRNQRAIHVYKKLGFETEAIMPRGAKTD 158

Query: 166 EGKKEDCYLMEYRYD 180
           +G+  D +LM    D
Sbjct: 159 QGEFLDVHLMRLLID 173


>ENTFA Q82YJ4 (Q82YJ4) Toxin ABC transporter, ATP-binding/permease
           protein
          Length = 700

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 33/166 (19%), Positives = 68/166 (40%), Gaps = 7/166 (4%)

Query: 94  IVYGMDQFIG-EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSG 152
           +V  ++QF   +  Y S  + +  ++ I   ++ E     +ILD    + R  R   K G
Sbjct: 439 VVSSLNQFGSFQAQYESMQVASHRLESILINMENENVCGEIILDKKIESIRCKRVSIKKG 498

Query: 153 FRIIEDLPEHELHEGKKEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGY 212
             ++ D    E++ GK  +  +        +T +K++  L + Y     +++I+I
Sbjct: 499 DTLLLDTVNCEIYRGK--NLSIRGENGSGKSTLIKSLVRLDDDYRGQILINNIDIKKINL 556

Query: 213 D----SVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETN 254
           D     + ++  N    +     N   G+    +I+N L  + E N
Sbjct: 557 DCLRSKLVFVEPNPKFLEGTIRDNLLLGHKVPNSIFNKLIRDFEIN 602


>CLOAB Q97M70 (Q97M70) Predicted metal-dependent hydrolase
          Length = 259

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 13/39 (33%), Positives = 23/39 (58%), Gaps = 5/39 (12%)

Query: 47  KKHYTEPWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTD 85
           KK+Y E W  ++  + +EY      Y + YK++DE+Y +
Sbjct: 145 KKNYAEKWYKKIAAIELEYL-----YNEKYKIFDEIYDE 178


>CLOAB Q97HI5 (Q97HI5) Predicted flavodoxin
          Length = 180

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 23/76 (30%), Positives = 35/76 (46%), Gaps = 2/76 (2%)

Query: 119 LIFEFLKKERNANAVILDPHKNNP-RAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEY 177
           L+F  L K R AN  IL   + NP RA+  YQ S   I +++  +++ +G     Y +
Sbjct: 22  LMFSRLNKPRQANQKILKAKEANPKRALIVYQPSMSSITDEV-ANQIAKGLNTQGYEVTL 80

Query: 178 RYDDNATNVKAMKYLI 193
            Y  N  +     Y I
Sbjct: 81  NYPSNHLSTNVSDYSI 96


>BRAJA Q89R32 (Q89R32) Acyl-CoA dehydrogenase
          Length = 455

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 13/35 (37%), Positives = 23/35 (65%)

Query: 235 KGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDEL 269
           K  AK++ ++NF   + ET   + N++Y+YI+ EL
Sbjct: 107 KNKAKKEGLWNFFLPDDETGQGLKNLDYAYIASEL 141


>BACC1 Q735F5 (Q735F5) Streptothricin acetyltransferase, putative
          Length = 184

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 30/122 (24%), Positives = 54/122 (44%), Gaps = 6/122 (4%)

Query: 37  RDKKYTLE---SLKKHYTEPWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDE 93
           R  +YT+E   S +K Y +   +E+     EY N P     I  +++++       K
Sbjct: 38  RHIEYTVEDVPSYEKSYLQNDNEEL--AYNEYINKPNQIIYIALLHNQIIGFIVLKKNWN 95

Query: 94  IVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153
               ++    +  Y + G+G R +    ++ K E N   ++L+   NN  A + Y+K GF
Sbjct: 96  HYAYIEDITVDKKYRTLGVGKRLVVQAKQWAK-EGNMPGIMLETQNNNVAACKFYEKCGF 154

Query: 154 RI 155
            I
Sbjct: 155 VI 156


>BACAN Q81YS9 (Q81YS9) Acetyltransferase, GNAT family
          Length = 288

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 13/44 (29%), Positives = 25/44 (56%)

Query: 110 KGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153
           KG+G R ++   +++   +    + L  + NN RA++ Y+K GF
Sbjct: 233 KGVGERLLQAAIQYIFSFQGMREIELCLNTNNDRAVKLYKKVGF 276


>VIBPA Q87HD3 (Q87HD3) Hypothetical protein VPA1032
          Length = 265

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 19/64 (29%), Positives = 31/64 (48%), Gaps = 3/64 (4%)

Query: 294 EQNLLKRDIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEK---DY 350
           E   L + + SF   M   DY  +SE  +  ++   E+  L  +T ++D+ DI+     Y
Sbjct: 96  ENEELTKSLVSFNLSMVSQDYEQVSELALQIEELRQEKGFLANDTSFSDVRDIDDRLGGY 155

Query: 351 IESF 354
           IE F
Sbjct: 156 IELF 159


>VIBCH Q9KL03 (Q9KL03) Spermidine n1-acetyltransferase
          Length = 173

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 20/83 (24%), Positives = 35/83 (42%), Gaps = 3/83 (3%)

Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
           Q I  P +  KG     I    ++     N + + L     NP+A+  Y++ GF     L
Sbjct: 86  QIIIAPEHQGKGFARTLINRALDYSFTILNLHKIYLHVAVENPKAVHLYEECGFVEEGHL 145

Query: 160 PEHELHEGKKED---CYLMEYRY 179
            E     G+ +D    Y+++ +Y
Sbjct: 146 VEEFFINGRYQDVKRMYILQSKY 168


>THEMA SYE2_THEMA (Q9X2I8) Glutamyl-tRNA synthetase 2 (EC 6.1.1.17)
           (Glutamate--tRNA ligase 2) (GluRS 2)
          Length = 487

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 16/44 (36%), Positives = 22/44 (50%)

Query: 325 KQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNATTVFEGKK 368
           K N L +   +     ND  + EKDY+E F++R  A  V E  K
Sbjct: 369 KVNTLSQLYDIMYPFMNDDYEYEKDYVEKFLKREEAERVLEEAK 412


>THETN ISPH_THETN (Q8RA76) 4-hydroxy-3-methylbut-2-enyl diphosphate
           reductase (EC 1.17.1.2)
          Length = 288

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 14/55 (25%), Positives = 31/55 (56%)

Query: 115 RYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKK 169
           R I++ +E L K+++     L    +NP+ ++  ++ G R+IE+    +L +G +
Sbjct: 17  RAIEIAYEELNKQKDTRLYTLGEIIHNPQVVKDLEEKGVRVIEEEELEKLLKGDR 71


>STRCO Q9S2L5 (Q9S2L5) Hypothetical protein SCO1988
          Length = 183

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 18/47 (38%), Positives = 26/47 (55%), Gaps = 5/47 (10%)

Query: 111 GIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE 157
           GIG R++ L      KER A+ + L   + N  A R Y++ GFR +E
Sbjct: 119 GIGDRFVALA-----KERRADGLSLWTFQVNAPARRFYERHGFRAVE 160


>STRP1 Q99XX8 (Q99XX8) Putative pullulanase
          Length = 1165

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 37/171 (21%), Positives = 62/171 (36%), Gaps = 31/171 (18%)

Query: 83  YTDYHY----PKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPH 138
           YT Y+Y     +  E V  +D +      W+    T  IK           A A  +DP
Sbjct: 473 YTGYYYLYEITRGQEKVMVLDPYAKSLAAWNDATATDDIK----------TAKAAFIDPS 522

Query: 139 KNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYDDNATNVKAMKYLIEHYFD 198
           K  P  +   + + F+             K+ED  + E    D  T+ KA++  + H F
Sbjct: 523 KLGPTGLDFAKINNFK-------------KREDAIIYEAHVRD-FTSDKALEGKLTHPFG 568

Query: 199 NFK--VDSIEIIGS-GYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNF 246
            F   V+ ++ +   G   V  L    Y +  +   ++   Y      YN+
Sbjct: 569 TFSAFVEQLDYLKDLGVTHVQLLPVLSYFYANELDKSRSTAYTSSDNNYNW 619


>STRP1 Q99YF0 (Q99YF0) Hypothetical protein SPy1734
           (Acetyltransferase) (EC 2.3.1.-)
          Length = 174

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 17/65 (26%), Positives = 34/65 (52%), Gaps = 1/65 (1%)

Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHE 166
           Y   GIG   +++  ++ ++     ++ LD    N +AI  Y+K GFR IE + ++++
Sbjct: 99  YRGYGIGQLLLEIALDWAEENPYIESLKLDVQVRNTKAIYLYKKYGFR-IESMRKNDIKS 157

Query: 167 GKKED 171
              +D
Sbjct: 158 KNGDD 162


>STAES Q8CTP7 (Q8CTP7) Hypothetical protein SE0368
          Length = 158

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 27/108 (25%), Positives = 44/108 (40%), Gaps = 18/108 (16%)

Query: 49  HYTEPWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYW 108
           H  +   +++F V  E + + +G+   +   +ELY   HY +              P
Sbjct: 48  HLKKRLNEQLFLVAEEDSEI-VGFAN-FIYGEELYLSAHYVR--------------PESQ 91

Query: 109 SKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRII 156
            +G GTR ++   +  K +     V L+   NN   I  YQ  GF II
Sbjct: 92  HRGYGTRLLEAGLKRFKDQYET--VYLEVDNNNSNGIEYYQNHGFEII 137


>STAES COAD_STAES (Q8CSZ5) Phosphopantetheine adenylyltransferase
           (EC 2.7.7.3) (Pantetheine-phosphate adenylyltransferase)
           (PPAT) (Dephospho-CoA pyrophosphorylase)
          Length = 161

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 21/111 (18%), Positives = 50/111 (45%), Gaps = 13/111 (11%)

Query: 185 NVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIY 244
           +VK +  +  H+F+   VD  + +G+          +++ ++ + ++  KK
Sbjct: 59  SVKHLPNIQVHHFNGLLVDFCDQVGAKTIIRGLRAVSDFEYELRLTSMNKK--------- 109

Query: 245 NFLNTNLETNVKIPNIEYSYISDEL--SILGYKEIKGTFLTPEIYSTMSEE 293
             LN+N+ET   + +  YS+IS  +   +  Y+     F+ P +   + ++
Sbjct: 110 --LNSNIETMYMMTSANYSFISSSIVKEVAAYQADISPFVPPHVERALKKK 158


>MYCLEH Y3433_MYCTU (O06250) Hypothetical protein Rv3433c/MT3539
          Length = 473

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 12/30 (40%), Positives = 23/30 (76%)

Query: 130 ANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
           A+AV+L+P + + +A+ A+ KSG R++E +
Sbjct: 81  ADAVLLNPDRTHRKALAAFTKSGGRLVESV 110


>MYCTU Y3433_MYCTU (O06250) Hypothetical protein Rv3433c/MT3539
          Length = 473

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 12/30 (40%), Positives = 23/30 (76%)

Query: 130 ANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
           A+AV+L+P + + +A+ A+ KSG R++E +
Sbjct: 81  ADAVLLNPDRTHRKALAAFTKSGGRLVESV 110


>MYCBO Q7TWI6 (Q7TWI6) Hypothetical protein Mb3463c
          Length = 473

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 12/30 (40%), Positives = 23/30 (76%)

Query: 130 ANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
           A+AV+L+P + + +A+ A+ KSG R++E +
Sbjct: 81  ADAVLLNPDRTHRKALAAFTKSGGRLVESV 110


>LISMO Q8Y5C4 (Q8Y5C4) Lmo2141 protein
          Length = 157

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 28/102 (27%), Positives = 47/102 (46%), Gaps = 9/102 (8%)

Query: 76  YKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVIL 135
           YK    L ++ H  + D  V+        P+Y   GIG   +  + E + +E+    + L
Sbjct: 63  YKSPIPLASNKHVAEIDIAVH--------PDYQRAGIGQLLMDKMKE-VAREKGYIKIAL 113

Query: 136 DPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEY 177
                N +AIR Y+K+GF+    L +  + +G+  D  LM Y
Sbjct: 114 RVLSINQKAIRFYEKNGFKQEGLLEKEFIIQGEFVDDILMAY 155


>LISIN Q929Z8 (Q929Z8) Lin2125 protein
          Length = 231

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 25/89 (28%), Positives = 38/89 (42%), Gaps = 15/89 (16%)

Query: 8   ICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKK----YTLESLKKHYTEP--WEDEVFRV 61
           + ++TL+    P  + WL DE    F  G        Y L ++   +T P  W+  V  +
Sbjct: 107 LVLKTLVARTRPDSVNWLIDESGFSFPSGHATATAVFYGLAAMFLIFTVPKMWQKIVIGI 166

Query: 62  IIEYNNVPIGYGQI-YKMYDELYTDYHYP 89
                   IGYG I + MY  +Y   H+P
Sbjct: 167 --------IGYGFILFVMYTRVYLGVHFP 187


>ENTFA Q837Z5 (Q837Z5) Acetyltransferase, GNAT family
          Length = 184

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 15/45 (33%), Positives = 24/45 (53%)

Query: 110 KGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154
           +G G   + LI +F   E   + + L  + NN +AI  Y+K GF+
Sbjct: 104 QGCGFEAVSLICKFAFYELGLHKIRLAVNSNNQKAIHVYEKVGFK 148


>ENTFA Q831M9 (Q831M9) Ribosomal-protein-alanine acetyltransferase,
           putative
          Length = 154

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 15/45 (33%), Positives = 28/45 (62%), Gaps = 1/45 (2%)

Query: 110 KGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154
           +GIG + +K   E++K  R+   + L+  ++N  A + Y+K+GFR
Sbjct: 83  QGIGCQLMKAFKEYVKS-RDITQIFLEVRESNILAQKLYEKTGFR 126


>CLOTE Q895T0 (Q895T0) Acetyltransferase (EC 2.3.1.-)
          Length = 165

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 17/67 (25%), Positives = 37/67 (55%), Gaps = 3/67 (4%)

Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHE 166
           Y  +G+GT+ +  I + L K++  + +  D +  NP+    +QK G+  + ++  + L++
Sbjct: 97  YRHQGVGTKLLSYI-KTLAKDKKIHLIKSDTYSLNPKMNALFQKCGYEKVGEI--NLLNK 153

Query: 167 GKKEDCY 173
             K +CY
Sbjct: 154 PYKFNCY 160


>CLOPE Q8XMF9 (Q8XMF9) Hypothetical protein CPE0730
          Length = 154

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 39/162 (24%), Positives = 66/162 (40%), Gaps = 30/162 (18%)

Query: 28  ERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYH 87
           ERVLE      K   ++ + K + E   +  + +  EY          +K   E+  D +
Sbjct: 3   ERVLEIR--EPKNCEIDDIMKIWLESTVEAHYFIEEEY----------WKKNYEVVRDIY 50

Query: 88  YPKTDEIVYGMD------------QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVIL 135
            P     VY  +             FIG     +K  G+   K + E++K +     + L
Sbjct: 51  IPMAKTFVYCDEGKINGFISIIDSNFIGALFVHTKSQGSGIGKSLLEYVKNKYEN--IEL 108

Query: 136 DPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEY 177
             +K+N +A+  Y+K  F+II++    +   G  E  YLM Y
Sbjct: 109 AVYKDNKKAVEFYKKHDFKIIKEQENED--SGHLE--YLMSY 146


>BRUSU Q8FY73 (Q8FY73) Outer membrane autotransporter
          Length = 1593

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 18/64 (28%), Positives = 32/64 (50%), Gaps = 3/64 (4%)

Query: 414  EEEIGTNFGEDILRMYGNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENGRKEIY 473
            E  +  N  + ++   G + ++    +QD   +   + T +YGI NI QEF+ NGR  +
Sbjct: 1478 EANVSLNDSDSLIGRAG-VALDYRNAWQDDAGQI--VHTNIYGIANIYQEFMGNGRVGVA 1534

Query: 474  KRTY 477
              T+
Sbjct: 1535 DTTF 1538


>BACSU YCBJ_BACSU (P42242) Hypothetical protein ycbJ
          Length = 306

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 36/179 (20%), Positives = 66/179 (36%), Gaps = 26/179 (14%)

Query: 300 RDIASFLRQMHGLDYTDISECTI------DNKQNVLEEYILLRETIYNDLTDIEKDYIES 353
           R +A  L ++HG D     +  I      D +Q   +  + ++  +      +     E
Sbjct: 129 RTLADILAELHGTDQISAGQSGIEVIRPEDFRQMTADSMVDVKNKL-----GVSTTLWER 183

Query: 354 FMERLNATTVFEGKKCLCHNDFSCNHLLLDGNNRLTXXXXXXXXXXXXEYCDFIYLLEDS 413
           + + ++    + G   L H D    H+L+D N R+T               DF+
Sbjct: 184 WQKWVDDDAYWPGFSSLIHGDLHPPHILIDQNGRVTGLLDWTEAKVADPAKDFVL----- 238

Query: 414 EEEIGTNFGED----ILRMY---GNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFI 465
                T FGE     +L  Y   G     K +E+   ++  YP+E     ++  ++E I
Sbjct: 239 ---YQTIFGEKETARLLEYYDQAGGRIWAKMQEHISEMQAAYPVEIAKLALQTQQEEHI 294


>BACHD Q9KE57 (Q9KE57) BH1001 protein
          Length = 448

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 32/119 (26%), Positives = 54/119 (45%), Gaps = 21/119 (17%)

Query: 272 LGYKEIKGTFLTPEIYSTMSEEEQNLL------------KRDIASFLRQMHGLDYTDISE 319
           LG+K  +GT L  ++  TMS EE  +               D   F  +++G + T ++E
Sbjct: 306 LGFKVERGTLLESKVELTMSFEEDGISFDVGMSVDSTYNYDDAVEF--KLYGQERTTLTE 363

Query: 320 CTIDNKQNVLEEYILLRETIYND-LTDIEKDYIESFM--ERLNATTVFEGKKCLCHNDF 375
             +D   ++  E     E++ ND L D ++DY E  +  E L      E ++ + H DF
Sbjct: 364 AELD---DLTYEINWELESLVNDLLADFQEDYYEEELSEEDLALLAAIEAQE-VSHEDF 418


>BACC1 Q73CD9 (Q73CD9) Glycerol-3-phosphate dehydrogenase, aerobic
           (EC 1.1.99.5)
          Length = 560

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 19/52 (36%), Positives = 26/52 (50%), Gaps = 2/52 (3%)

Query: 146 RAYQKSGFRIIEDLPEHEL--HEGKKEDCYLMEYRYDDNATNVKAMKYLIEH 195
           R+ ++  F   E L +  L   EG K   Y +EYR DD    ++ MK  IEH
Sbjct: 133 RSERRKMFNREETLKKEPLVKQEGLKGGGYYVEYRTDDARLTIEVMKEAIEH 184


>BACAN Q81VL8 (Q81VL8) Oxidoreductase, FAD-binding
          Length = 471

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 28/96 (29%), Positives = 41/96 (42%), Gaps = 12/96 (12%)

Query: 244 YNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKRDIA 303
           Y      L+  +K+ N E         +L YKE    F   E     +    +L +  +A
Sbjct: 188 YGLFGVILDVTLKLTNDEL--YETHTKMLDYKEYTSYF--KEKVKKDANVRMHLARISVA 243

Query: 304 --SFLRQMHGLDYTDISECTIDNKQNVLEEYILLRE 337
             SFLR+M+  DY      T+   QN+ EEY  L+E
Sbjct: 244 PNSFLREMYVTDY------TLAQNQNMREEYSELKE 273


>BACAN Q81U57 (Q81U57) Glycerol-3-phosphate dehydrogenase, aerobic
          Length = 560

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 19/52 (36%), Positives = 26/52 (50%), Gaps = 2/52 (3%)

Query: 146 RAYQKSGFRIIEDLPEHEL--HEGKKEDCYLMEYRYDDNATNVKAMKYLIEH 195
           R+ ++  F   E L +  L   EG K   Y +EYR DD    ++ MK  IEH
Sbjct: 133 RSERRKMFNREETLKKEPLVKQEGLKGGGYYVEYRTDDARLTIEVMKEAIEH 184


>BACAN Q81NU7 (Q81NU7) Acetyltransferase, GNAT family
          Length = 153

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 20/79 (25%), Positives = 34/79 (43%), Gaps = 14/79 (17%)

Query: 86  YHYPKTDEIVYGMDQFIGEPN--------------YWSKGIGTRYIKLIFEFLKKERNAN 131
           Y   K D+IV G   F G+PN              YW+KG  T  ++ + ++  +
Sbjct: 52  YVIRKEDDIVLGDIGFKGKPNEEHTVEVGYGFIEKYWNKGYATEAVQELIDWAFQTGEVE 111

Query: 132 AVILDPHKNNPRAIRAYQK 150
            +I +   +N  +IR  +K
Sbjct: 112 TIIAETLLDNYGSIRVLEK 130


  Database: Blastdata.fdb
    Posted date:  Mar 29, 2006  3:30 PM
  Number of letters in database: 77,468,597
  Number of sequences in database:  240,170

Lambda     K      H
   0.318    0.139    0.409

Gapped
Lambda     K      H
   0.267   0.0410    0.140


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Hits to DB: 72,017,968
Number of Sequences: 240170
Number of extensions: 3196375
Number of successful extensions: 9166
Number of sequences better than 10.0: 203
Number of HSP's better than 10.0 without gapping: 69
Number of HSP's successfully gapped in prelim test: 134
Number of HSP's that attempted gapping in prelim test: 8848
Number of HSP's gapped (non-prelim): 424
length of query: 479
length of database: 77,468,597
effective HSP length: 115
effective length of query: 364
effective length of database: 49,849,047
effective search space: 18145053108
effective search space used: 18145053108
T: 11
A: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 68 (30.8 bits)
BLASTP 2.2.10 [Oct-19-2004]


From mdehoon at c2b2.columbia.edu  Wed Apr 19 12:54:33 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Wed, 19 Apr 2006 12:54:33 -0400
Subject: [BioPython] Need help parsing Blastoutput
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEF7@cgcmail.cgc.cpmc.columbia.edu>

The Blast parser fails to read your file because the format of Blast output
has changed. If I edit the data file so that it corresponds to the old format
(add a space here, remove a blank line there, etc.), the Blast parser reads
the file without problems. The easiest solution is to repeat the Blast run,
using XML for the output format, and use the Blast XML parser in Biopython to
parse the results.
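
For the subject start/end and coverage asked about earlier in this thread, a rough sketch using only the Python standard library (not the Biopython parser) might look like the one below. It assumes the usual NCBI BLAST XML element names (Hit_def, Hit_len, Hsp_evalue, Hsp_hit-from, Hsp_hit-to); the function name subject_coverage and the tiny sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written sample in the shape of NCBI BLAST XML (illustration only).
SAMPLE_XML = """<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_def>example subject</Hit_def>
          <Hit_len>200</Hit_len>
          <Hit_hsps>
            <Hsp>
              <Hsp_evalue>1e-20</Hsp_evalue>
              <Hsp_hit-from>11</Hsp_hit-from>
              <Hsp_hit-to>60</Hsp_hit-to>
            </Hsp>
            <Hsp>
              <Hsp_evalue>5.0</Hsp_evalue>
              <Hsp_hit-from>100</Hsp_hit-from>
              <Hsp_hit-to>120</Hsp_hit-to>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""


def subject_coverage(xml_text, evalue_thresh=1.0):
    """Return a list of (hit_def, sbjct_start, sbjct_end, percent_of_subject)."""
    rows = []
    root = ET.fromstring(xml_text)
    for hit in root.iter("Hit"):
        hit_def = hit.findtext("Hit_def")
        hit_len = int(hit.findtext("Hit_len"))
        for hsp in hit.iter("Hsp"):
            # Skip HSPs whose e-value is above the threshold.
            if float(hsp.findtext("Hsp_evalue")) > evalue_thresh:
                continue
            start = int(hsp.findtext("Hsp_hit-from"))
            end = int(hsp.findtext("Hsp_hit-to"))
            # Coordinates are 1-based and inclusive; abs() also covers the
            # case where start is larger than end.
            aligned = abs(end - start) + 1
            rows.append((hit_def, start, end, 100.0 * aligned / hit_len))
    return rows


for row in subject_coverage(SAMPLE_XML):
    print("%s: subject %d-%d covers %.1f%% of the subject" % row)
```

This reports coverage per HSP; if a hit has several overlapping HSPs, the intervals would need to be merged before summing.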

A general question is if anybody still needs the parser for Blast text
output. Currently, we are confusing our users by having a Blast text parser
that tends to break. A broken parser may be worse than no parser.

--Michiel.

Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032


-----Original Message-----
From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
Sent: Wed 4/19/2006 6:15 AM
To: Michiel De Hoon
Cc: biopython at lists.open-bio.org
Subject: RE: [BioPython] Need help parsing Blastoutput
 
Hi,
Please see the attachment; it is part of my Blast output.
Yes, I am trying to parse text output from Blast. I used another script to
run my local Blast, and that output is what I am trying to parse. The
NCBIStandalone.BlastParser was working fine without hsp.sbject_end, which is
one of the values I need to print out.
On checking the class diagrams in the cookbook, I found that sbject_end is
not included. I just need another way of printing the int(subject end).
Thanks for your help
Halimah

On Tue, 18 Apr 2006, Michiel De Hoon wrote:

> Could you also send us the file Enterococcus_out so we can run the script?
> 
> From the script, it looks like you're trying to parse text output from Blast.
> While this is possible (in theory), the format of Blast text output tends to
> change a lot, thereby breaking the parser in Biopython. It is more reliable
> to have Blast generate output in XML format, and use the XML parser:
> 
> blast_out = open('my_blast.xml', 'r')
> 
> from Bio.Blast import NCBIXML
> 
> b_parser = NCBIXML.BlastParser()
> b_record = b_parser.parse(blast_out)
> 
> See section 3.1.2 in the Biopython cookbook, and section 3.1.4 on how to
> generate Blast output in XML.
> 
> --Michiel.
> 
> 
> 
> Michiel de Hoon
> Center for Computational Biology and Bioinformatics
> Columbia University
> 1150 St Nicholas Avenue
> New York, NY 10032
> 
> 
> 
> -----Original Message-----
> From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
> Sent: Tue 4/18/2006 11:06 AM
> To: Michiel De Hoon
> Cc: biopython at lists.open-bio.org
> Subject: RE: [BioPython] Need help parsing Blastoutput
>  
> Thanks.
> Please see the attachment: a copy of my script and a copy of my Blast output.
> Thanks
> 
> 
> On Thu, 13 Apr 2006, Michiel De Hoon wrote:
> 
> > Could you send us the script you were using?
> > 
> > --Michiel.
> > 
> > Michiel de Hoon
> > Center for Computational Biology and Bioinformatics
> > Columbia University
> > 1150 St Nicholas Avenue
> > New York, NY 10032
> > 
> > 
> > 
> > -----Original Message-----
> > From: biopython-bounces at lists.open-bio.org on behalf of Halima Rabiu
> > Sent: Thu 4/13/2006 11:07 AM
> > To: biopython at lists.open-bio.org
> > Subject: [BioPython] Need help parsing Blastoutput
> >  
> > Hi All,
> > I have BLAST output from a local BLAST run.
> > I need to calculate my % alignment coverage with regard to my subject.
> > I tried parsing the BLAST output and wanted to print the
> > sbjct start and sbjct end, but I could not. Is there any way I could do
> > this? I am trying to get the match coverage between my query and subject.
> > I don't need identities, but the total % alignment for query or subject.
> > Thanks
> > Halimah
> > 
> > _______________________________________________
> > BioPython mailing list  -  BioPython at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/biopython
> > 
> > 
> 
> 


From elventear at gmail.com  Wed Apr 19 21:02:30 2006
From: elventear at gmail.com (Pepe Barbe)
Date: Wed, 19 Apr 2006 20:02:30 -0500
Subject: [BioPython] Parsing and Creating Dictionaries of GenBank files
Message-ID: <3e73596b0604191802g3807f101jb99e41f95208122e@mail.gmail.com>

Hello,

Following the simple steps in the BioPython cookbook, I wanted to
create a dictionary with the following GenBank file:

ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/Escherichia_coli_K12/NC_000913.gbk

Below you can find what I tried executing and the error I got. I would
appreciate any insight into solving the error and correctly producing
the dictionary.

Thanks!
Pepe
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> dict_file = 'NC_000913.gbk'
>>> index_file = 'NC_000913.idx'
>>> from Bio import GenBank
>>> GenBank.index_file(dict_file, index_file)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/sw/lib/python2.4/site-packages/Bio/GenBank/__init__.py", line 1283, in index_file
    SimpleSeqRecord.create_flatdb([filename], indexname, indexer)
  File "/sw/lib/python2.4/site-packages/Bio/Mindy/SimpleSeqRecord.py", line 152, in create_flatdb
    creator.load(filename, builder = builder, fileid_info = {})
  File "/sw/lib/python2.4/site-packages/Bio/Mindy/BaseDB.py", line 36, in load
    raise TypeError("Cannot identify file as a %s format" %
TypeError: Cannot identify file as a unknown format


From biopython at maubp.freeserve.co.uk  Thu Apr 20 08:42:34 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython))
Date: Thu, 20 Apr 2006 13:42:34 +0100
Subject: [BioPython] Parsing and Creating Dictionaries of GenBank files
In-Reply-To: <3e73596b0604191802g3807f101jb99e41f95208122e@mail.gmail.com>
References: <3e73596b0604191802g3807f101jb99e41f95208122e@mail.gmail.com>
Message-ID: <444781BA.8080107@maubp.freeserve.co.uk>

Pepe Barbe wrote:
> Hello,
> 
> Following the simple steps in the BioPython cookbook, I wanted to
> create a dictionary with the following GenBank file:
> 
> ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/Escherichia_coli_K12/NC_000913.gbk
> 
> Below you can find what I tried executing and the error I got. I would
> appreciate any insight into solving the error and correctly producing
> the dictionary.

The cookbook tutorial is a little misleading in that regard.  Indexing a 
GenBank file only makes sense for files with multiple GenBank 
records (i.e. multiple LOCUS lines).

For example, you can get multi-record GenBank files with records for 
different genes.  These tend to be small records, and the Martel based 
indexing code copes fine.  It doesn't cope very well with large records 
like genomes.

Your example (and in my experience all bacterial genomes) has just a 
single very large record (which will contain many features).
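
A quick way to check which case a file falls into is to count its LOCUS lines before deciding whether indexing is worthwhile. A minimal sketch using only the standard library; the function name count_genbank_records and the abbreviated sample records are made up for illustration:

```python
import io


def count_genbank_records(handle):
    """Count records by counting lines that start with 'LOCUS'."""
    return sum(1 for line in handle if line.startswith("LOCUS"))


# Hypothetical two-record file, heavily abbreviated for illustration.
sample = io.StringIO(
    "LOCUS       AB000001   100 bp\n"
    "ORIGIN\n"
    "//\n"
    "LOCUS       AB000002   200 bp\n"
    "ORIGIN\n"
    "//\n"
)

n_records = count_genbank_records(sample)
if n_records > 1:
    print("%d records: indexing may be worthwhile" % n_records)
else:
    print("single record: just parse it directly")
```

For a real file, pass an open file handle (for example open('NC_000913.gbk')) instead of the StringIO sample.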

Does this page help?

http://www2.warwick.ac.uk/fac/sci/moac/currentstudents/peter_cock/python/genbank/

I did suggest a change to the documentation but it looks like no one has 
made the change...

http://biopython.org/pipermail/biopython-dev/2005-November/002193.html

I had forgotten to chase this up.

Peter


From alpersoyler at yahoo.com  Thu Apr 20 08:59:57 2006
From: alpersoyler at yahoo.com (alper soyler)
Date: Thu, 20 Apr 2006 05:59:57 -0700 (PDT)
Subject: [BioPython] Need help!!!
Message-ID: <20060420125957.72247.qmail@web36804.mail.mud.yahoo.com>

Hi All,
 
 I am new to Biopython and have a question. I want to construct a phylogenetic profile for one organism's proteins. I want to BLAST my protein against a single organism's genome (e.g. Homo sapiens) instead of the whole GenBank database. How can I do this? Thank you in advance.
 
 regards,
 Alper
 
		

From cy at cymon.org  Thu Apr 20 09:41:46 2006
From: cy at cymon.org (Cymon J. Cox)
Date: Thu, 20 Apr 2006 14:41:46 +0100
Subject: [BioPython] Need help!!!
In-Reply-To: <20060420125957.72247.qmail@web36804.mail.mud.yahoo.com>
References: <20060420125957.72247.qmail@web36804.mail.mud.yahoo.com>
Message-ID: <1145540506.11610.17.camel@clintonite.nhm.ac.uk>

Hi Alper,

On Thu, 2006-04-20 at 05:59 -0700, alper soyler wrote:
> Hi All,
>  
>  I am new to Biopython and have a question. I want to construct a phylogenetic
>  profile for one organism's proteins. I want to give my protein to blast to
>  search one organism's genome (e.g. Homo sapiens) instead of whole genbank
>  database. How can I solve my problem? Thank you in advance.

Assuming you want to do this locally, you'll need to download your target
genome, format it with the BLAST distribution programme 'formatdb', and
then feed your query and newly formatted genome BLAST database to
Bio.Blast.NCBIStandalone.
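
As a rough sketch of those steps, the snippet below only builds the command lines for the legacy NCBI tools (formatdb and blastall) and prints them; it does not run anything. The file names are placeholders and build_commands is a made-up helper. It assumes a protein query, searched either against a protein set (blastp) or against nucleotide sequence (tblastn).

```python
def build_commands(genome_fasta, query_fasta, out_xml, protein=True):
    """Build formatdb and blastall argument lists (nothing is executed here)."""
    is_protein = "T" if protein else "F"
    program = "blastp" if protein else "tblastn"
    formatdb_cmd = ["formatdb", "-i", genome_fasta, "-p", is_protein]
    blastall_cmd = [
        "blastall", "-p", program,
        "-d", genome_fasta,   # formatdb names the database after the input file
        "-i", query_fasta,
        "-m", "7",            # XML output
        "-o", out_xml,
    ]
    return formatdb_cmd, blastall_cmd


# File names below are placeholders.
formatdb_cmd, blastall_cmd = build_commands(
    "human_proteins.faa", "my_protein.faa", "result.xml")
print(" ".join(formatdb_cmd))
print(" ".join(blastall_cmd))
```

Each list can be handed to subprocess.check_call() once the tools are installed, or the same options can be passed through Bio.Blast.NCBIStandalone as described in the tutorial section.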

See http://biopython.org/docs/tutorial/Tutorial004.html#toc10
3.1.4  Running BLAST locally

for details,

Cheers, Cymon
____________________________________________________________________

Cymon J. Cox

Biometry and Molecular Research
Department of Zoology
Natural History Museum
Cromwell Road
London, SW7 5BD

Email: cy at cymon.org, c.cox at nhm.ac.uk, cymon.cox at googlemail.com 
Phone : +44 (0)20 7942 6981
HomePage : http://www.duke.edu/~cymon



From mcolosimo at mitre.org  Thu Apr 20 10:23:19 2006
From: mcolosimo at mitre.org (Marc Colosimo)
Date: Thu, 20 Apr 2006 10:23:19 -0400
Subject: [BioPython] Parsing and Creating Dictionaries of GenBank files
In-Reply-To: <444781BA.8080107@maubp.freeserve.co.uk>
References: <3e73596b0604191802g3807f101jb99e41f95208122e@mail.gmail.com>
	<444781BA.8080107@maubp.freeserve.co.uk>
Message-ID: <65CA5BE4-1C83-4FD7-B998-C97BCF9AA6DE@mitre.org>

While we are on the subject of parsing multiple GenBank files and the  
Cookbook, I think a better example (and more pythonish) is the  
following:

from Bio import GenBank

gb_file = "my_file.gb"
gb_handle = open(gb_file, 'r')

feature_parser = GenBank.FeatureParser()

gb_iterator = GenBank.Iterator(gb_handle, feature_parser)

for cur_record in gb_iterator:
    # now do something with the record
    print cur_record.seq

which is way nicer (and uses iterators as per PEP 234) than

while 1:
    cur_record = gb_iterator.next()

    if cur_record is None:
        break

    # now do something with the record
    print cur_record.seq

Actually, the above works with the Fasta iterator as well.

Times for a GenBank file with 72,358 records (LOCUSs):
my way (using iterators): 14m16.886s
cookbook way (using next and if):  14m28.547s

Surprisingly, this isn't much faster (maybe with -O it would be)
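
For iterator classes that only offer the "next() returns None when done" protocol, the built-in two-argument iter() gives the same for-loop style. A small sketch; OldStyleIterator is a made-up stand-in, not a Biopython class:

```python
class OldStyleIterator(object):
    """Stand-in for an iterator whose next() returns None when exhausted."""

    def __init__(self, items):
        self._items = list(items)

    def next(self):
        if self._items:
            return self._items.pop(0)
        return None


old_iter = OldStyleIterator(["rec1", "rec2", "rec3"])

# iter(callable, sentinel) keeps calling old_iter.next() until it returns None.
records = []
for record in iter(old_iter.next, None):
    records.append(record)

print(records)
```

The sentinel form stops at the first None, so it only suits iterators that never yield None as a real value.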

Marc



From elventear at gmail.com  Thu Apr 20 12:11:42 2006
From: elventear at gmail.com (Pepe Barbe)
Date: Thu, 20 Apr 2006 11:11:42 -0500
Subject: [BioPython] Parsing and Creating Dictionaries of GenBank files
In-Reply-To: <444781BA.8080107@maubp.freeserve.co.uk>
References: <3e73596b0604191802g3807f101jb99e41f95208122e@mail.gmail.com>
	<444781BA.8080107@maubp.freeserve.co.uk>
Message-ID: <3e73596b0604200911i2e2c481bj306c5d282cae5c75@mail.gmail.com>

On 4/20/06, Peter (BioPython) <biopython at maubp.freeserve.co.uk> wrote:
>
> The cookbook tutorial is a little misleading in that regard.  Indexing a
> GenBank file only makes sense for files with multiple GenBank
> records (i.e. multiple LOCUS lines).
<snip>
> Your example (and in my experience all bacterial genomes) has just a
> single very large record (which will contain many features).
>
> Does this page help?
>
> http://www2.warwick.ac.uk/fac/sci/moac/currentstudents/peter_cock/python/genbank/

It does help a lot. Thanks!

As an aside, while what I was doing wasn't exactly what I was looking
for, I think it was crashing because of a bug in 1.41. I installed the
latest CVS and it works normally now.

Pepe


From halima at mancala.cbio.uct.ac.za  Thu Apr 20 07:57:20 2006
From: halima at mancala.cbio.uct.ac.za (Halima Rabiu)
Date: Thu, 20 Apr 2006 13:57:20 +0200 (SAST)
Subject: [BioPython] Need help parsing Blastoutput
In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEF7@cgcmail.cgc.cpmc.columbia.edu>
References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEF7@cgcmail.cgc.cpmc.columbia.edu>
Message-ID: <Pine.LNX.4.58.0604201350220.10334@mancala.cbio.uct.ac.za>

Thanks. I tried using the XML parser and I am still getting errors which I
don't understand. Please see the attached copy of my script and Blast XML
output.
Here is the error:
Traceback (most recent call last):
  File "Bioperser.py", line 11, in ?
    b_record = b_parser.parse(b_out)
  File "/usr/local/lib/python2.4/site-packages/Bio/Blast/NCBIXML.py", line 112, in parse
    self._parser.parse(handler)
  File "/usr/local//lib/python2.4/xml/sax/expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/local//lib/python2.4/xml/sax/xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "/usr/local//lib/python2.4/xml/sax/expatreader.py", line 211, in feed
    self._err_handler.fatalError(exc)
  File "/usr/local//lib/python2.4/xml/sax/handler.py", line 38, in fatalError
    raise exception
thanks
Halimah
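
A traceback that ends inside the SAX fatalError handler usually means expat found the XML not well-formed (for example a truncated file). A small standard-library sketch to locate the first such error; find_xml_error is a made-up helper name and the two sample strings are only illustrations:

```python
import xml.parsers.expat


def find_xml_error(data):
    """Return (line, column, message) for the first XML error, or None."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError as err:
        return err.lineno, err.offset, xml.parsers.expat.ErrorString(err.code)
    return None


good = "<BlastOutput><Hit>ok</Hit></BlastOutput>"
bad = "<BlastOutput><Hit>truncated</BlastOutput>"

print(find_xml_error(good))
print(find_xml_error(bad))
```

For a real file, read it in binary mode and pass the bytes, e.g. find_xml_error(open('blast2.xml', 'rb').read()).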

-------------- next part --------------
#! /usr/local/bin/python2.4
#halimah
#16-04-2006
from string import split
from Bio.Blast import NCBIXML
#from Bio.Blast import NCBIStandalone

b_out = open('blast2.xml','r')
b_parser = NCBIXML.BlastParser()

b_record = b_parser.parse(b_out)
E_VALUE_THRESH = 1.0

while 1:
    b_record = b_iterator.next()
    print "The following results are for query " + b_record.query
    print 'len of query:',b_record.query_letters
    if b_record is None:
        break

    for alignment in b_record.alignments:
        for hsp in alignment.hsps:
            if hsp.expect <= E_VALUE_THRESH:
                print '****Alignment****'
                print 'title:', alignment.title
                print 'length:', alignment.length
                print 'e value:', hsp.expect
                print 'subjectstart:',hsp.sbjct_start
                print 'subject end:', hsp.sbject_end

-------------- next part --------------
A non-text attachment was scrubbed...
Name: blast2.xml
Type: text/xml
Size: 151659 bytes
Desc: 
Url : http://lists.open-bio.org/pipermail/biopython/attachments/20060420/391af520/attachment-0001.xml 

From mdehoon at c2b2.columbia.edu  Thu Apr 20 13:37:29 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Thu, 20 Apr 2006 13:37:29 -0400
Subject: [BioPython] Need help parsing Blastoutput
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEFF@cgcmail.cgc.cpmc.columbia.edu>

Could you send us the Blast XML output also?

--Michiel.

Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032


-----Original Message-----
From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
Sent: Thu 4/20/2006 7:57 AM
To: Michiel De Hoon
Cc: biopython at lists.open-bio.org
Subject: RE: [BioPython] Need help parsing Blastoutput
 
Thanks. I tried using the XML parser and I am still getting errors which I
don't understand. Please see the attached copy of my script and Blast XML
output.
Here is the error:
Traceback (most recent call last):
  File "Bioperser.py", line 11, in ?
    b_record = b_parser.parse(b_out)
  File "/usr/local/lib/python2.4/site-packages/Bio/Blast/NCBIXML.py", line 112, in parse
    self._parser.parse(handler)
  File "/usr/local//lib/python2.4/xml/sax/expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/local//lib/python2.4/xml/sax/xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "/usr/local//lib/python2.4/xml/sax/expatreader.py", line 211, in feed
    self._err_handler.fatalError(exc)
  File "/usr/local//lib/python2.4/xml/sax/handler.py", line 38, in fatalError
    raise exception
Thanks
Halimah

On Wed, 19 Apr 2006, Michiel De Hoon wrote:

> The Blast parser fails to read your file because the format of Blast output
> has changed. If I edit the data file so that it corresponds to the old
> format (add a space here, remove a blank line there, etc.), the Blast parser
> reads the file without problems. The easiest solution is to repeat the Blast
> run, using XML for the output format, and use the Blast XML parser in
> Biopython to parse the results.
> 
> A general question is if anybody still needs the parser for Blast text
> output. Currently, we are confusing our users by having a Blast text parser
> that tends to break. A broken parser may be worse than no parser.
> 
> --Michiel.
> 
> Michiel de Hoon
> Center for Computational Biology and Bioinformatics
> Columbia University
> 1150 St Nicholas Avenue
> New York, NY 10032
> 
> 
> 
> -----Original Message-----
> From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
> Sent: Wed 4/19/2006 6:15 AM
> To: Michiel De Hoon
> Cc: biopython at lists.open-bio.org
> Subject: RE: [BioPython] Need help parsing Blastoutput
>  
> Hi,
> Please see the attachment; it is part of my Blast output.
> Yes, I am trying to parse text output from Blast. I used another script to
> run the local Blast whose output I am trying to parse. NCBIStandalone.BlastParser
> was working fine without hsp.sbject_end, which is one of the things I need to
> print out.
> On checking the class diagrams in the cookbook, I found that sbject_end is
> not included. I just need another way of printing the int(subject end).
> Thanks for your help
> Halimah
> 


From mdehoon at c2b2.columbia.edu  Thu Apr 20 15:15:51 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Thu, 20 Apr 2006 15:15:51 -0400
Subject: [BioPython] Parsing and Creating Dictionaries of GenBank files
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF01@cgcmail.cgc.cpmc.columbia.edu>

> I did suggest a change to the documentation but it looks like no one has 
> made the change...
> 
> http://biopython.org/pipermail/biopython-dev/2005-November/002193.html

I have now made this update in CVS. I'll put it on the website also as soon
as I can figure out how to do that with the new webserver.

--Michiel.


Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032


From alpersoyler at yahoo.com  Fri Apr 21 03:07:05 2006
From: alpersoyler at yahoo.com (alper soyler)
Date: Fri, 21 Apr 2006 00:07:05 -0700 (PDT)
Subject: [BioPython] Need help!!!
In-Reply-To: <1145540506.11610.17.camel@clintonite.nhm.ac.uk>
Message-ID: <20060421070705.85335.qmail@web36801.mail.mud.yahoo.com>

Hi Cymon,
   
  Thank you for your reply. However, to construct phylogenetic profile I need to download approx. 100 completed genomes. I am searching to make it easier (e.g. without downloading genomes). Can I do it by running blast over the internet?
   
  Regards,
  Alper Soyler 

"Cymon J. Cox" <cy at cymon.org> wrote:
  Hi Alper,

On Thu, 2006-04-20 at 05:59 -0700, alper soyler wrote:
> Hi All,
> 
> I am new to Biopython and have a question. I want to construct a phylogenetic
> profile for one organism's proteins. I want to give my protein to blast to
> search one organism's genome (e.g. Homo sapiens) instead of whole genbank
> database. How can I solve my problem? Thank you in advance.

Assuming you want to do this locally, you'll need to download your target
genome, format it with the BLAST distribution programme 'formatdb', and
then feed your query and newly formatted genome BLAST database to
Bio.Blast.NCBIStandalone.

See http://biopython.org/docs/tutorial/Tutorial004.html#toc10
3.1.4 Running BLAST locally

for details,

Cheers, Cymon
____________________________________________________________________

Cymon J. Cox

Biometry and Molecular Research
Department of Zoology
Natural History Museum
Cromwell Road
London, SW7 5BD

Email: cy at cymon.org, c.cox at nhm.ac.uk, cymon.cox at googlemail.com 
Phone : +44 (0)20 7942 6981
HomePage : http://www.duke.edu/~cymon

-8.63/-6.77
_____________________________________________________________________
Fedora Core release 4 (Stentz) clintonite.nhm.ac.uk 14:35:55 up 13 days,
20:42, 8 users, load average: 0.08, 0.16, 0.12



From biopython at maubp.freeserve.co.uk  Fri Apr 21 04:44:56 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython))
Date: Fri, 21 Apr 2006 09:44:56 +0100
Subject: [BioPython] Updating the tutorial,
 was :Parsing and Creating Dictionaries of GenBank files
In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF01@cgcmail.cgc.cpmc.columbia.edu>
References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF01@cgcmail.cgc.cpmc.columbia.edu>
Message-ID: <44489B88.2030801@maubp.freeserve.co.uk>

Michiel De Hoon wrote:
>> I did suggest a change to the documentation but it looks like no
>> one has made the change...
>> 
>> http://biopython.org/pipermail/biopython-dev/2005-November/002193.html
>> 

Thanks - I was going to look at this today.

Something funny seems to have happened to the plain text version:

http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Doc/Tutorial.txt.diff?r1=1.5&r2=1.6&cvsroot=biopython

(a) The old "Title" is missing above the contents listing

(b) Contents entries contain &nbsp; which is nasty for plain text.

(c) Section references now contain odd text.  Is it possible you only
ran the TeX file once?  Usually with references TeX should be run twice
(and in extreme cases, three times)

In an earlier discussion it was suggested we remove the plain text 
documentation from CVS, which I objected to as plain text is much easier 
for non-TeX people to read.

If generating a consistent plain text version is a lot of hassle, then 
maybe we can live without it?

> I have now made this update in CVS. I'll put it on the website also
> as soon as I can figure out how to do that with the new webserver.

I can't help you there - I was going to post to the Developer mailing 
list to see if anyone had done this recently.  Have you been able to 
generate new HTML and Tutorial.pdf files?

Looks like you have also updated the text about the Blast parser :)

Peter


From cy at cymon.org  Fri Apr 21 05:38:33 2006
From: cy at cymon.org (Cymon J. Cox)
Date: Fri, 21 Apr 2006 10:38:33 +0100
Subject: [BioPython] Need help!!!
In-Reply-To: <20060421070705.85335.qmail@web36801.mail.mud.yahoo.com>
References: <20060421070705.85335.qmail@web36801.mail.mud.yahoo.com>
Message-ID: <1145612313.4167.15.camel@clintonite.nhm.ac.uk>

Hi Alper,

On Fri, 2006-04-21 at 00:07 -0700, alper soyler wrote:
> Hi Cymon,
>    
>   Thank you for your reply. However, to construct phylogenetic profile I need to
>  download approx. 100 completed genomes. I am searching to make it easier (e.g.
>  without downloading genomes). Can I do it by running blast over the internet?

Well, I'm not sure; but here's my take on it and hopefully someone will
correct me if I'm wrong.

Assuming you are referring to complete genomes available through NCBI
(otherwise you'll almost certainly need to download them), I don't think
it's possible with the BioPython interface. Bio.Blast.NCBIWWW uses the
qblast interface at NCBI
(http://www.ncbi.nlm.nih.gov/BLAST/Doc/urlapi.html) which I think only
makes the following db's available:
http://www.ncbi.nlm.nih.gov/blast/blast_databases.shtml . From looking
at the qblast docs it doesn't seem possible to restrict the search to a
particular organism while blast'ing against a particular NCBI db (e.g.
nr).

Depending on what you want to do, it may be easier and quicker to use the
NCBI web Blast interface to the Genomes db's:
http://www.ncbi.nlm.nih.gov/sutils/genom_table.cgi

Else you'll have to bite the proverbial bullet and download and format
them individually.

Cheers, Cymon

>    
>   Regards,
>   Alper Soyler 
> 


From biopython at maubp.freeserve.co.uk  Fri Apr 21 05:23:12 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython))
Date: Fri, 21 Apr 2006 10:23:12 +0100
Subject: [BioPython] blast against genomes, was: Need help!!!
In-Reply-To: <20060421070705.85335.qmail@web36801.mail.mud.yahoo.com>
References: <20060421070705.85335.qmail@web36801.mail.mud.yahoo.com>
Message-ID: <4448A480.5010805@maubp.freeserve.co.uk>

alper soyler wrote:
> Hi Cymon,
> 
> Thank you for your reply. However, to construct phylogenetic profile
> I need to download approx. 100 completed genomes. I am searching to
> make it easier (e.g. without downloading genomes). Can I do it by
> running blast over the internet?

So you want to search 100 completed genomes using your protein as the 
input query?

As Cymon suggested, downloading the genomes and building your own 
database is one method.  As this is a "big task" you have in mind, the 
network speed limitations of doing many blast queries may make this a 
better idea than trying to do it online.
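
For the local route, a rough sketch of the two command lines involved may
help. It assumes the legacy NCBI tools 'formatdb' and 'blastall' are installed
and on your PATH; the file names are placeholders I made up, and 'tblastn' is
chosen only because the query is a protein and the genome is assumed to be a
nucleotide FASTA file. It is written for a current Python and only prints the
commands rather than running them.

```python
# Hypothetical file names; swap in your own genome and query files.
GENOME_FASTA = "target_genome.fna"
QUERY_FASTA = "my_protein.faa"
XML_OUT = "my_blast.xml"


def formatdb_command(fasta_path, protein):
    """Command line that formats a FASTA file as a BLAST database."""
    return ["formatdb", "-i", fasta_path, "-p", "T" if protein else "F"]


def blastall_command(program, database, query_path, xml_out):
    """Command line for a local search; '-m 7' asks blastall for XML output."""
    return ["blastall", "-p", program, "-d", database,
            "-i", query_path, "-m", "7", "-o", xml_out]


commands = [
    formatdb_command(GENOME_FASTA, protein=False),
    blastall_command("tblastn", GENOME_FASTA, QUERY_FASTA, XML_OUT),
]
for command in commands:
    # Only print here; pass each list to subprocess.call() to really run it.
    print(" ".join(command))
```

Repeating this per genome (one formatdb call and one blastall call each) gives
one XML file per organism, which can then be parsed offline.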

However, the NCBI offer online blast against some (all?) of their 
completed genomes so it may be possible to do it this way via BioPython.

http://www.ncbi.nlm.nih.gov/BLAST/

The webpage has a nice interface for blast against specific genomes 
(right hand side, second box down).

You can also use the normal blast pages and the "Limit by entrez query" 
field, e.g. mouse[ORGN] OR rat[ORGN]

It should be possible to do this automatically in code but you will need 
to compile a list of the species names the NCBI will understand...
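
Building the "Limit by entrez query" text from a species list is the easy part
and could be sketched like this. The function name is made up for
illustration, and how the string is then handed to NCBI is deliberately left
out; I believe later Biopython releases accept such a string through an
entrez_query argument to NCBIWWW.qblast, but please check your version.

```python
def organism_query(species_names):
    """Join species names into an Entrez organism filter.

    Each name gets the [ORGN] field tag and the terms are combined with OR,
    matching the hand-written example: mouse[ORGN] OR rat[ORGN]
    """
    terms = ["%s[ORGN]" % name.strip() for name in species_names if name.strip()]
    return " OR ".join(terms)


print(organism_query(["mouse", "rat"]))
```

The names still have to be ones the NCBI recognises, so the list is worth
checking by hand on the web page first.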

Peter


From sbassi at gmail.com  Fri Apr 21 07:46:49 2006
From: sbassi at gmail.com (Sebastian Bassi)
Date: Fri, 21 Apr 2006 08:46:49 -0300
Subject: [BioPython] Need help!!!
In-Reply-To: <20060421070705.85335.qmail@web36801.mail.mud.yahoo.com>
References: <1145540506.11610.17.camel@clintonite.nhm.ac.uk>
	<20060421070705.85335.qmail@web36801.mail.mud.yahoo.com>
Message-ID: <b43bf2080604210446xc452797q9b853aa11e66f84c@mail.gmail.com>

On 4/21/06, alper soyler <alpersoyler at yahoo.com> wrote:
> Hi Cymon,
>   Thank you for your reply. However, to construct phylogenetic profile I need to download approx. 100 completed genomes. I am searching to make it easier (e.g. without downloading genomes). Can I do it by running blast over the internet?
>

Maybe you could download only NR db and then make subsets from it.
NCBI utilities or the local BLAST has one utility that allows you to
extract sequences from BLAST compiled DBs. I don't know if this would
be enough for your needs.

--
Bioinformatics news: http://www.bioinformatica.info
Lriser: http://www.linspire.com/lraiser_success.php?serial=318


From mdehoon at c2b2.columbia.edu  Fri Apr 21 12:26:39 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Fri, 21 Apr 2006 12:26:39 -0400
Subject: [BioPython] Updating the tutorial,
	was :Parsing and Creating Dictionaries of GenBank files
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF05@cgcmail.cgc.cpmc.columbia.edu>

> Something funny seems to have happened to the plain text version:
> 
> http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Doc/Tutorial.txt.diff?r1=1.5&r2=1.6&cvsroot=biopython

The plain text version is generated by hevea, so not by tex directly. The
funny output is likely due to having a different hevea version (which I ran a
couple of times). I didn't see anything obviously wrong with the Tutorial.tex
source file, so I think these errors are due to errors in the Tutorial.tex ->
Tutorial.txt translation by hevea.

> If generating a consistent plain text version is a lot of hassle, then 
> maybe we can live without it?

Currently, the plain text version is not very useful. It's not a source file,
so it should not be in CVS. On the other hand, the plain text version is not
available from the Biopython documentation page, and users are better off
with the PDF version anyway. So I think nobody will miss the plain text
version. Correct me if I'm wrong.

--Michiel.


From srini_iyyer_bio at yahoo.com  Fri Apr 21 18:49:28 2006
From: srini_iyyer_bio at yahoo.com (Srinivas Iyyer)
Date: Fri, 21 Apr 2006 15:49:28 -0700 (PDT)
Subject: [BioPython] Creating a graphical interface to database of gene
	coordinates
Message-ID: <20060421224928.87408.qmail@web38105.mail.mud.yahoo.com>

Dear group, 
 I am happy that I am slowly finding Python-based projects
related to my research area. 

Problem:
1. I have a database of human gene coordinates on
chromosomes.
2. I have gene expression data from my lab concerning
the genes I mentioned above. 

3. I want to visualize expression data laid on
chromosomes.

Eg. 
Coordinates:
Chr      Gene       From      To     Exon
1         x         100       120    exon:1
1         x         200       250    exon:2
1         x         350       450    exon:3


Expression data:

IDent   sample  Chr    From     To     Expression value
xxx_at  lung     1     110      120     100.35
x_s_at  heart    1     225      250     124.35
x_a_at  eye      1     375      400     146.35

What I want:

I want to have a simple window that would connect to
my database.  I want to give it a gene, and this Python/Tk
interface (or whatever) would query the database,
draw a graph of the gene according to its exons and plot the
values.

-------_______----------_______-------

-- : exon
__: regions that are not exons, introns.


My questions to Tutor/BioPython forums:

1. What should I decide to work with: a. the Py/Tk framework,
b. Python imaging libraries, etc.?

2. I do not want to impress anyone with this work;
it should just help me understand the
relationships, as the number game in the tables above
is highly confusing. So a working version that
accurately plots the expression values for as many
samples as I have would be enough.

3. Are there any available modules to jump-start with, or
do I have to create some from scratch? That would be
a problem because my Python programming is somewhere
between novice and mediocre.

4. Any ideas/suggestions/pointers are highly
appreciated. 

thanks
Sri


From biopython at maubp.freeserve.co.uk  Sat Apr 22 08:32:21 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython))
Date: Sat, 22 Apr 2006 13:32:21 +0100
Subject: [BioPython] Creating a graphical interface to database of gene
 coordinates
In-Reply-To: <20060421224928.87408.qmail@web38105.mail.mud.yahoo.com>
References: <20060421224928.87408.qmail@web38105.mail.mud.yahoo.com>
Message-ID: <444A2255.6010704@maubp.freeserve.co.uk>

Srinivas Iyyer wrote:
> Dear group, 
>  I am happy that I am slowly finding Python-based projects
> related to my research area. 
> 
> Problem:
> 1. I have a database of human gene coordinates on
> chromosomes.
> 2. I have gene expression data from my lab concerning
> the genes I mentioned above. 
> 
> 3. I want to visualize expression data laid on
> chromosomes.

You may be able to produce chromosome diagrams with Leighton Pritchard 
and Jennifer White's program genomediagram:

http://bioinf.scri.sari.ac.uk/lp/programs.html#genomediagram

It will do both circular genomes diagrams (nice for bacteria) and linear 
ones - which would make sense for chromosomes.  I think I've seen 
examples with expression data shown in this way... certainly it could be 
done.

Note that this can produce PDF or bitmap output - but it's not 
interactive.  There is also a GUI to go with it, but I have not looked 
at this.
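
Whatever does the drawing, each expression value first has to be matched to
the exon it falls in. A minimal sketch of that step, using the example rows
from the question; the tuple layout and function name are my own assumptions,
and a real version would read the rows from the database instead.

```python
# Example rows copied from the question: (chr, gene, from, to, exon label).
EXONS = [
    ("1", "x", 100, 120, "exon:1"),
    ("1", "x", 200, 250, "exon:2"),
    ("1", "x", 350, 450, "exon:3"),
]

# (ident, sample, chr, from, to, expression value)
EXPRESSION = [
    ("xxx_at", "lung", "1", 110, 120, 100.35),
    ("x_s_at", "heart", "1", 225, 250, 124.35),
    ("x_a_at", "eye", "1", 375, 400, 146.35),
]


def overlapping_exons(chrom, start, end, exons):
    """Labels of exons on the same chromosome that overlap start..end."""
    return [label for (c, gene, s, e, label) in exons
            if c == chrom and s <= end and start <= e]


for ident, sample, chrom, start, end, value in EXPRESSION:
    print(ident, sample, value, overlapping_exons(chrom, start, end, EXONS))
```

With the exon label attached to each value, the plotting side only needs the
exon coordinates and one number per sample.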

----------------------------------------------------------------------

One final suggestion is to consider looking at R/BioConductor - it's a 
completely different language but I have seen examples where expression 
data is visualised on chromosomes.

http://www.r-project.org/
http://www.bioconductor.org/

You can even call R from Python, for example using RPy (R from Python):

http://rpy.sourceforge.net/index.html

See also RSPython, an R/SPlus - Python Interface which I have not used 
personally:

http://www.omegahat.org/RSPython/

Peter


From biopython at maubp.freeserve.co.uk  Mon Apr 24 06:56:06 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython List))
Date: Mon, 24 Apr 2006 11:56:06 +0100
Subject: [BioPython] Bio.Nexus documentation
Message-ID: <444CAEC6.5040703@maubp.freeserve.co.uk>

I'm thinking of having a go at using the new Bio.Nexus model in 
BioPython to do some phylogenetic tree manipulation (from Clustal .dnd 
files in my case), so I thought I would have a hunt for some examples or 
help...

Back in July 2005, Frank Kauff wrote:
> I hope most of the methods have a descriptive title and are easy to use.
> Let me know if I can help further. And I promise to write some
> documentation, but it won't be before end of August.
> 
> Cheers,
> Frank 

Archive link:
http://biopython.org/pipermail/biopython/2005-July/002714.html

Was that August 2005, or August 2006, you had in mind? ;)

Do you have some simple examples you could share with us instead perhaps?

Thanks

Peter

From fkauff at duke.edu  Mon Apr 24 09:32:45 2006
From: fkauff at duke.edu (Frank Kauff)
Date: Mon, 24 Apr 2006 09:32:45 -0400
Subject: [BioPython] Bio.Nexus documentation
In-Reply-To: <444CAEC6.5040703@maubp.freeserve.co.uk>
References: <444CAEC6.5040703@maubp.freeserve.co.uk>
Message-ID: <1145885566.2369.6.camel@osiris.biology.duke.edu>

An embedded and charset-unspecified text was scrubbed...
Name: not available
Url: http://lists.open-bio.org/pipermail/biopython/attachments/20060424/d8a5de2f/attachment.ksh 

From halima at mancala.cbio.uct.ac.za  Mon Apr 24 04:45:09 2006
From: halima at mancala.cbio.uct.ac.za (Halima Rabiu)
Date: Mon, 24 Apr 2006 10:45:09 +0200 (SAST)
Subject: [BioPython] Need help parsing Blastoutput
In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEFF@cgcmail.cgc.cpmc.columbia.edu>
References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEFF@cgcmail.cgc.cpmc.columbia.edu>
Message-ID: <Pine.LNX.4.58.0604241036290.18039@mancala.cbio.uct.ac.za>

Hi,
Attached here is the XML output; I also attached it to my previous post.
Thanks
Halimah

On Thu, 20 Apr 2006, Michiel De Hoon wrote:

> Could you send us the Blast XML output also?
> 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: blast2.xml
Type: text/xml
Size: 151658 bytes
Desc: 
Url : http://lists.open-bio.org/pipermail/biopython/attachments/20060424/af1567dc/attachment-0001.xml 

From mdehoon at c2b2.columbia.edu  Mon Apr 24 14:14:17 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Mon, 24 Apr 2006 14:14:17 -0400
Subject: [BioPython] Need help parsing Blastoutput
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF0E@cgcmail.cgc.cpmc.columbia.edu>

Ha, I see. My stupid email program was removing the XML file from your email
messages for security reasons something or other.
Anyway, I got the XML files from the mailing list archives.

The XML file from Thursday April 20 is different from the one sent on Monday
April 24. In fact, the latter seems to be damaged; in line 194, it has:

<?xml version="1.1?>

while the former has

<?xml version="1.0"?>

So in the latter a " is missing for some reason.

Anyway, the XML parser can read the XML file from Thursday April 20 if you
fix a few things in your script:

*) Instead of
b_record = b_parser.parse(b_out)
you need
b_iterator = NCBIStandalone.Iterator(b_out, b_parser)
(and then you should also import NCBIStandalone)

*) You should check if b_record is None immediately after b_record =
b_iterator.next().

*) There is no hsp.sbject_end
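
For the subject end and the coverage figure originally asked about, here is a
minimal sketch that sidesteps the Biopython parser and reads the XML with
Python's standard library. It assumes a single well-formed XML document and
the usual BLAST XML element names (Hit_def, Hit_len, Hsp_evalue, Hsp_hit-from,
Hsp_hit-to); the sample data and the helper names are made up for
illustration, and it is written for a current Python rather than 2.4.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written stand-in for BLAST XML output (hypothetical data).
SAMPLE_XML = """<?xml version="1.0"?>
<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_def>example subject</Hit_def>
          <Hit_len>200</Hit_len>
          <Hit_hsps>
            <Hsp>
              <Hsp_evalue>1e-20</Hsp_evalue>
              <Hsp_hit-from>11</Hsp_hit-from>
              <Hsp_hit-to>60</Hsp_hit-to>
            </Hsp>
            <Hsp>
              <Hsp_evalue>5.0</Hsp_evalue>
              <Hsp_hit-from>100</Hsp_hit-from>
              <Hsp_hit-to>150</Hsp_hit-to>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>
"""

E_VALUE_THRESH = 1.0


def subject_ranges(xml_text, e_value_thresh=E_VALUE_THRESH):
    """Return (title, subject_length, start, end) per HSP under the threshold."""
    root = ET.fromstring(xml_text)
    rows = []
    for hit in root.iter("Hit"):
        title = hit.findtext("Hit_def")
        length = int(hit.findtext("Hit_len"))
        for hsp in hit.iter("Hsp"):
            if float(hsp.findtext("Hsp_evalue")) <= e_value_thresh:
                start = int(hsp.findtext("Hsp_hit-from"))
                end = int(hsp.findtext("Hsp_hit-to"))
                rows.append((title, length, start, end))
    return rows


def percent_coverage(ranges, subject_length):
    """Percent of the subject covered by the union of 1-based inclusive ranges."""
    covered = 0
    last_end = 0
    # Normalise reversed coordinates, then sweep left to right.
    for start, end in sorted((min(s, e), max(s, e)) for s, e in ranges):
        start = max(start, last_end + 1)  # skip positions already counted
        if end >= start:
            covered += end - start + 1
            last_end = end
    return 100.0 * covered / subject_length


rows = subject_ranges(SAMPLE_XML)
for title, length, start, end in rows:
    print("title:", title, "length:", length,
          "subject start:", start, "subject end:", end)
print("coverage %:", percent_coverage([(s, e) for _, _, s, e in rows], 200))
```

Overlapping HSPs are counted once, so the figure is the fraction of subject
positions touched by at least one alignment, not a sum of alignment lengths.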


--Michiel.

Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032


-----Original Message-----
From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
Sent: Mon 4/24/2006 4:45 AM
To: Michiel De Hoon
Cc: biopython at lists.open-bio.org
Subject: RE: [BioPython] Need help parsing Blastoutput
 
Hi,
Attached here is the XML output; I also attached it to my previous post.
Thanks
Halimah

On Thu, 20 Apr 2006, Michiel De Hoon wrote:

> Could you send us the Blast XML output also?
> 
> --Michiel.
> 
> Michiel de Hoon
> Center for Computational Biology and Bioinformatics
> Columbia University
> 1150 St Nicholas Avenue
> New York, NY 10032
> 
> 
> 
> -----Original Message-----
> From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
> Sent: Thu 4/20/2006 7:57 AM
> To: Michiel De Hoon
> Cc: biopython at lists.open-bio.org
> Subject: RE: [BioPython] Need help parsing Blastoutput
>  
> Thanks. I tried using the XML parser and I am still getting errors which I
> don't understand. Please see the attached copy of my script and the Blast
> XML output.
> Here is the error:
> Traceback (most recent call last):
>   File "Bioperser.py", line 11, in ?
>     b_record = b_parser.parse(b_out)
>   File "/usr/local/lib/python2.4/site-packages/Bio/Blast/NCBIXML.py", line 112, in parse
>     self._parser.parse(handler)
>   File "/usr/local//lib/python2.4/xml/sax/expatreader.py", line 107, in parse
>     xmlreader.IncrementalParser.parse(self, source)
>   File "/usr/local//lib/python2.4/xml/sax/xmlreader.py", line 123, in parse
>     self.feed(buffer)
>   File "/usr/local//lib/python2.4/xml/sax/expatreader.py", line 211, in feed
>     self._err_handler.fatalError(exc)
>   File "/usr/local//lib/python2.4/xml/sax/handler.py", line 38, in fatalError
>     raise exception
> thanks
> Halimah
> 
> On Wed, 19 Apr 2006, Michiel De Hoon wrote:
> 
> > The Blast parser fails to read your file because the format of Blast output
> > has changed. If I edit the data file so that it corresponds to the old format
> > (add a space here, remove a blank line there, etc.), the Blast parser reads
> > the file without problems. The easiest solution is to repeat the Blast run,
> > using XML for the output format, and use the Blast XML parser in Biopython to
> > parse the results.
> > 
> > A general question is if anybody still needs the parser for Blast text
> > output. Currently, we are confusing our users by having a Blast text parser
> > that tends to break. A broken parser may be worse than no parser.
> > 
> > --Michiel.
> > 
> > Michiel de Hoon
> > Center for Computational Biology and Bioinformatics
> > Columbia University
> > 1150 St Nicholas Avenue
> > New York, NY 10032
> > 
> > 
> > 
> > -----Original Message-----
> > From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
> > Sent: Wed 4/19/2006 6:15 AM
> > To: Michiel De Hoon
> > Cc: biopython at lists.open-bio.org
> > Subject: RE: [BioPython] Need help parsing Blastoutput
> >  
> > Hi,
> > Please see the attachment; it is part of my Blast output.
> > Yes, I am trying to parse text output from Blast. I used another script to
> > run the local Blast whose output I am trying to parse. The
> > NCBIStandalone.BlastParser was working fine without hsp.sbject_end, which is
> > one of the things I need to print out.
> > On checking the class diagrams in the cookbook, I found that sbject_end is
> > not included. I just need another way of printing int(subject end).
> > Thanks for your help
> > Halimah
> > 
> > On Tue, 18 Apr 2006, Michiel De Hoon wrote:
> > 
> > > Could you also send us the file Enterococcus_out so we can run the script?
> > > 
> > > From the script, it looks like you're trying to parse text output from Blast.
> > > While this is possible (in theory), the format of Blast text output tends to
> > > change a lot, thereby breaking the parser in Biopython. It is more reliable
> > > to have Blast generate output in XML format, and use the XML parser:
> > > 
> > > blast_out = open('my_blast.xml', 'r')
> > > 
> > > from Bio.Blast import NCBIXML
> > > 
> > > b_parser = NCBIXML.BlastParser()
> > > b_record = b_parser.parse(blast_out)
> > > 
> > > See section 3.1.2 in the Biopython cookbook, and section 3.1.4 on how to
> > > generate Blast output in XML.
> > > 
> > > --Michiel.
> > > 
> > > 
> > > 
> > > Michiel de Hoon
> > > Center for Computational Biology and Bioinformatics
> > > Columbia University
> > > 1150 St Nicholas Avenue
> > > New York, NY 10032
> > > 
> > > 
> > > 
> > > -----Original Message-----
> > > From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
> > > Sent: Tue 4/18/2006 11:06 AM
> > > To: Michiel De Hoon
> > > Cc: biopython at lists.open-bio.org
> > > Subject: RE: [BioPython] Need help parsing Blastoutput
> > >  
> > > Thanks.
> > > Please see the attachment: a copy of my script and a copy of my Blast output.
> > > Thanks
> > > 
> > > 
> > > On Thu, 13 Apr 2006, Michiel De Hoon wrote:
> > > 
> > > > Could you send us the script you were using?
> > > > 
> > > > --Michiel.
> > > > 
> > > > Michiel de Hoon
> > > > Center for Computational Biology and Bioinformatics
> > > > Columbia University
> > > > 1150 St Nicholas Avenue
> > > > New York, NY 10032
> > > > 
> > > > 
> > > > 
> > > > -----Original Message-----
> > > > From: biopython-bounces at lists.open-bio.org on behalf of Halima Rabiu
> > > > Sent: Thu 4/13/2006 11:07 AM
> > > > To: biopython at lists.open-bio.org
> > > > Subject: [BioPython] Need help parsing Blastoutput
> > > >  
> > > > Hi All,
> > > > I have BLAST output from a local Blast run.
> > > > I need to calculate my % alignment coverage with regard to my subject.
> > > > I tried parsing the Blast output and wanted to print the
> > > > Sbjct start and Sbjct end, but I could not. Is there any way I could do this?
> > > > I am trying to get the match coverage between my query and subject. I don't
> > > > need identities, but the total % alignment for query or subject.
> > > > Thanks
> > > > Halimah
> > > > 
> > > > _______________________________________________
> > > > BioPython mailing list  -  BioPython at lists.open-bio.org
> > > > http://lists.open-bio.org/mailman/listinfo/biopython
> > > > 
> > > > 
> > > 
> > > 
> > 
> > 
> 
> 


From mdehoon at c2b2.columbia.edu  Mon Apr 24 14:27:31 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Mon, 24 Apr 2006 14:27:31 -0400
Subject: [BioPython] Need help parsing Blastoutput
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF0F@cgcmail.cgc.cpmc.columbia.edu>

Also, make sure you have the latest version of Bio/Blast/NCBIStandalone.py;
you can get it from here:

http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/*checkout*/biopython/Bio/Blast/NCBIStandalone.py?rev=1.60&cvsroot=biopython&content-type=text/plain

--Michiel.

Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032




From biopython at maubp.freeserve.co.uk  Tue Apr 25 05:08:33 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython List))
Date: Tue, 25 Apr 2006 10:08:33 +0100
Subject: [BioPython] Bio.Nexus documentation
In-Reply-To: <1145885566.2369.6.camel@osiris.biology.duke.edu>
References: <444CAEC6.5040703@maubp.freeserve.co.uk>
	<1145885566.2369.6.camel@osiris.biology.duke.edu>
Message-ID: <444DE711.8070509@maubp.freeserve.co.uk>

> Anyway, I'll get some examples together, and I still want to do some
> documentation for the cookbook. It won't be before this weekend, though.
> For a quick and dirty anchor point, there's the test module that comes
> with the distribution, it naturally has some code that does interesting
> things with trees and data.

It's certainly shown me that the Nexus file format is a lot more 
complicated than just holding simple trees.

What I actually wanted to do was load a Newick format tree (*.dnd 
files from Clustalw/ClustalX in particular) into BioPython.  This 
doesn't look like it is possible.

However, I can get Clustalx to save the corresponding alignment in Nexus 
format, but the parser doesn't seem to like it...

Traceback (most recent call last):
   File "C:\temp\hack_trees_000.py", line 7, in ?
     n=Nexus.Nexus(input_file)
   File "C:\Python23\lib\site-packages\Bio\Nexus\Nexus.py", line 525, in 
__init__
     self.read(input)
   File "C:\Python23\lib\site-packages\Bio\Nexus\Nexus.py", line 582, in 
read
     self._parse_nexus_block(title, contents)
   File "C:\Python23\lib\site-packages\Bio\Nexus\Nexus.py", line 623, in 
_parse_nexus_block
     getattr(self,'_'+line.command)(line.options)
AttributeError: 'Nexus' object has no attribute '_utree'

This looks like it's caused by the penultimate line of the "Nexus Tree 
file" produced by ClustalX:

..
	UTREE PAUP_1= (...);
ENDBLOCK;

Any ideas?  I'll happily send you some example tree files off the list 
if you want.

Peter


From fkauff at duke.edu  Tue Apr 25 08:03:16 2006
From: fkauff at duke.edu (Frank)
Date: Tue, 25 Apr 2006 08:03:16 -0400
Subject: [BioPython] Bio.Nexus documentation
In-Reply-To: <444DE711.8070509@maubp.freeserve.co.uk>
References: <444CAEC6.5040703@maubp.freeserve.co.uk>
	<1145885566.2369.6.camel@osiris.biology.duke.edu>
	<444DE711.8070509@maubp.freeserve.co.uk>
Message-ID: <1145966596.2276.3.camel@cpe-066-057-048-192.nc.res.rr.com>

Hi Peter,

yes, utree is indeed a nexus command I never heard of... The thing is
that nexus is extensible, so programs can in theory define new commands.
So, what is utree? Maybe an unrooted tree?
And many programs don't care much about the nexus specifications, which
are, in turn, not always too precise. 
If you send the files along, I'd be happy to have a look.

Cheers,
Frank



From fkauff at duke.edu  Tue Apr 25 17:17:23 2006
From: fkauff at duke.edu (Frank Kauff)
Date: Tue, 25 Apr 2006 17:17:23 -0400
Subject: [BioPython] Bio.Nexus documentation
In-Reply-To: <444DE711.8070509@maubp.freeserve.co.uk>
References: <444CAEC6.5040703@maubp.freeserve.co.uk>
	<1145885566.2369.6.camel@osiris.biology.duke.edu>
	<444DE711.8070509@maubp.freeserve.co.uk>
Message-ID: <1145999843.2365.25.camel@osiris.biology.duke.edu>

Ok, I added support for the utree command used in clustal to denote an
unrooted tree (in the nexus parser, it is treated as a synonym for 'tree',
as trees are unrooted by default anyway), and fixed some issues with
linebreaks in tree descriptions. Nexus files from Clustal should now be
read without problems (famous last words).
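
For anyone curious how such an alias can fit into a getattr-based dispatch
like the getattr(self,'_'+line.command) call in Peter's traceback, here is a
toy sketch. It is not the actual Bio.Nexus code; the class, the aliases
table and the handle method are all made up for illustration.

```python
class CommandDispatcher(object):
    """Toy parser that looks up a handler method named '_' + command."""

    # Hypothetical alias table: extra command names mapped onto the
    # handler of an existing command.
    aliases = {"utree": "tree"}

    def __init__(self):
        self.trees = []

    def _tree(self, options):
        # A real parser would build a tree object here; the sketch
        # just stores the raw option string.
        self.trees.append(options)

    def handle(self, command, options):
        name = command.lower()
        name = self.aliases.get(name, name)
        handler = getattr(self, "_" + name, None)
        if handler is None:
            raise ValueError("Unknown NEXUS command: %s" % command)
        handler(options)


d = CommandDispatcher()
d.handle("UTREE", "PAUP_1= (a,b);")   # routed to _tree via the alias
d.handle("tree", "t2 = (c,d);")
```

Raising a clear error for unknown commands, instead of letting getattr fail
with an AttributeError, also makes it easier to see which program-specific
command tripped the parser.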

Cheers,
Frank


-- 
Frank Kauff
Dept. of Biology
Duke University
Box 90338
Durham, NC 27708
USA

Phone 919-660-7382
Fax 919-660-7293
Web http://www.lutzonilab.net


From biopython at maubp.freeserve.co.uk  Wed Apr 26 10:16:21 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython List))
Date: Wed, 26 Apr 2006 15:16:21 +0100
Subject: [BioPython] Bio.Nexus and Clustal tree files
Message-ID: <444F80B5.60207@maubp.freeserve.co.uk>

Hello again,

I have installed Frank Kauff's recent changes to Bio.Nexus from CVS, and 
have actually got a tree loaded now :)

Here is my example script, which tries to load two tree files created 
using ClustalX 1.83 (files previously sent to Frank off list)

(a) demo.dnd - Clustal guide tree in Newick format, no bootstraps
(b) demo.treb - Clustal NJ tree in Nexus format, with bootstraps

Example code starts here:

from Bio.Nexus import Nexus

for filename in [r"C:\TEMP\nexus\demo.dnd",
              r"C:\TEMP\nexus\demo.treb"] :

     input_file = open(filename,"r")
     n=Nexus.Nexus(input_file)
     input_file.close()

     print "-----------------"
     print "Filename:" + n.filename
     print "Number of taxlabels = %i" % len(n.taxlabels)
     print "Number of trees = %i" % len(n.trees)
     for tree in n.trees :
         print "Tree name: %s"% tree.name
         print "Tree nodes: " +  ", ".join(tree.get_taxa())
print "-----------------"


This gives the following output:

-----------------
Filename:C:\TEMP\nexus\demo.dnd
Number of taxlabels = 0
Number of trees = 0
-----------------
Filename:C:\TEMP\nexus\demo.treb
Number of taxlabels = 0
Number of trees = 1
Tree name: PAUP_1
Tree nodes: V_Harveyi_PATH, B_subtilis_YXEM, B_subtilis_GlnH_homo_YCKK, 
YA80_HAEIN, FLIY_ECOLI, Deinococcus_radiodurans, HISJ_E_COLI, E_coli_GlnH
-----------------

As you can see, loading the ClustalX NEXUS output (*.treb) seems to work 
without trouble (although n.taxlabels is an empty list... is this to be 
expected?).

On the other hand, I don't get the tree for the Clustal guide tree file 
(*.dnd) which is a pain.  Do I need to load these files differently, as 
they are Newick format, not NEXUS format?

Thank you

Peter

From fkauff at duke.edu  Wed Apr 26 11:17:31 2006
From: fkauff at duke.edu (Frank Kauff)
Date: Wed, 26 Apr 2006 11:17:31 -0400
Subject: [BioPython] Bio.Nexus and Clustal tree files
In-Reply-To: <444F80B5.60207@maubp.freeserve.co.uk>
References: <444F80B5.60207@maubp.freeserve.co.uk>
Message-ID: <1146064651.2365.41.camel@osiris.biology.duke.edu>

On Wed, 2006-04-26 at 15:16 +0100, Peter (BioPython List) wrote:
> Hello again,
> 
> I have installed Frank Kauff's recent changes to Bio.Nexus from CVS, and 
> have actually got a tree loaded now :)
> 
Excellent!

> Here is my example script, which tries to load two tree files created 
> using ClustalX 1.83 (files previously sent to Frank off list)
> 
> (a) demo.dnd - Clustal guide tree in Newick format, no bootstraps
> (b) demo.treb - Clustal NJ tree in Nexus format, with bootstraps
> 
> Example code starts here:

> This gives the following output:
> 
> -----------------
> Filename:C:\TEMP\nexus\demo.dnd
> Number of taxlabels = 0
> Number of trees = 0
> -----------------
> Filename:C:\TEMP\nexus\demo.treb
> Number of taxlabels = 0
> Number of trees = 1
> Tree name: PAUP_1
> Tree nodes: V_Harveyi_PATH, B_subtilis_YXEM, B_subtilis_GlnH_homo_YCKK, 
> YA80_HAEIN, FLIY_ECOLI, Deinococcus_radiodurans, HISJ_E_COLI, E_coli_GlnH
> -----------------
> 
> As you can see, loading the ClustalX NEXUS output (*.treb) seems to work 
> without trouble (although n.taxlabels is an empty list... is this to be 
> expected?).

Yes, taxlabels refers to the taxon labels of a nexus data matrix.
They are not necessarily identical to the taxa in the tree, but could
be a superset or a subset of those.

However, the way clustal indicates the no. of supported bootstrap
replicates (square brackets after the branchlengths) is unsupported, and
thus these values are ignored. 
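
If you need those bootstrap counts in the meantime, you could pull them out
of the tree string yourself before handing it to the parser. A minimal
sketch, assuming the counts appear as an integer in square brackets right
after a branch length (something like ":0.05[87]"); the function names and
the example tree are invented, so check the pattern against your own file:

```python
import re

# A branch length followed directly by a bracketed integer, e.g. ":0.05[87]".
# NJ trees can have negative branch lengths, hence the optional minus sign.
BRANCH_WITH_SUPPORT = re.compile(r":(-?\d+(?:\.\d+)?)\[(\d+)\]")


def bootstrap_counts(newick):
    """Return (branch_length, bootstrap_count) pairs in order of appearance."""
    return [(float(length), int(count))
            for length, count in BRANCH_WITH_SUPPORT.findall(newick)]


def strip_bootstrap(newick):
    """Return the tree string with the bracketed counts removed."""
    return BRANCH_WITH_SUPPORT.sub(r":\1", newick)


example = "((A:0.1,B:0.2):0.05[87],C:0.3);"
counts = bootstrap_counts(example)
plain = strip_bootstrap(example)
```

This only recovers the numbers in the order they appear; it does not attach
them to nodes of a parsed tree.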

> 
> On the other hand, I don't get the tree for the Clustal guide tree file 
> (*.dnd) which is a pain.  Do I need to load these files differently, as 
> they are Newick format, not NEXUS format?
> 
Yes, the nexus parser reads only nexus. But you can throw the newick
tree directly at the Tree class
>>> from Bio.Nexus import Trees
>>> t=Trees.Tree(open('demo.dnd').read())
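
If your script has to cope with both kinds of file, you could look at the
content before deciding which route to take. A small sketch (the function
names are made up): NEXUS files begin with a "#NEXUS" line, and the second
check is just a rough guess that a bare Newick tree starts with "(" and ends
with ";".

```python
def looks_like_nexus(text):
    """True if the text starts with the #NEXUS header (case-insensitive)."""
    return text.lstrip().upper().startswith("#NEXUS")


def tree_source_kind(text):
    """Classify file content as 'nexus', 'newick' or 'unknown'."""
    if looks_like_nexus(text):
        return "nexus"
    stripped = text.strip()
    if stripped.startswith("(") and stripped.endswith(";"):
        return "newick"
    return "unknown"


# Made-up file contents for illustration.
kind_a = tree_source_kind("#NEXUS\nBEGIN TREES;\nENDBLOCK;\n")
kind_b = tree_source_kind("(A:0.1,(B:0.2,C:0.3):0.05);\n")
```

For "nexus" you would go through Nexus.Nexus as in your script, and for
"newick" through Trees.Tree as above.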

Frank


> Thank you
> 
> Peter
-- 
Frank Kauff
Dept. of Biology
Duke University
Box 90338
Durham, NC 27708
USA

Phone 919-660-7382
Fax 919-660-7293
Web http://www.lutzonilab.net


From dam6278 at yahoo.fr  Thu Apr 27 03:53:24 2006
From: dam6278 at yahoo.fr (dam6278)
Date: Thu, 27 Apr 2006 07:53:24 +0000 (GMT)
Subject: [BioPython] GenBank
Message-ID: <20060427075324.13946.qmail@web86913.mail.ukl.yahoo.com>

I have a problem with the GenBank parser:
  
  When I execute:
  
  from Bio import GenBank
  gi_list = GenBank.search_for("Opuntia AND rpl16")
  
  My output is :
  
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
    File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py", line 1398, in search_for
      retstart = start_id, retmax = max_ids)
    File "/usr/lib/python2.4/site-packages/Bio/EUtils/DBIdsClient.py", line 294, in search
      searchinfo = parse.parse_search(infile, [None])
    File "/usr/lib/python2.4/site-packages/Bio/EUtils/parse.py", line 201, in parse_search
      for ele in pom["TranslationStack"]:
    File "/usr/lib/python2.4/site-packages/Bio/EUtils/POM.py", line 355, in __getitem__
      raise IndexError, "no item matches"
  IndexError: no item matches
  
  Do you know where my problem is?
  
  Thank you for your help.
  
  damien
 

From lpritc at scri.sari.ac.uk  Thu Apr 27 04:33:21 2006
From: lpritc at scri.sari.ac.uk (Leighton Pritchard)
Date: Thu, 27 Apr 2006 09:33:21 +0100
Subject: [BioPython] Creating a graphical interface to database of
	gene	coordinates
In-Reply-To: <444A2255.6010704@maubp.freeserve.co.uk>
References: <20060421224928.87408.qmail@web38105.mail.mud.yahoo.com>
	<444A2255.6010704@maubp.freeserve.co.uk>
Message-ID: <1146126802.4725.223.camel@lplinuxdev>

Hi guys,

On Sat, 2006-04-22 at 13:32 +0100, Peter (BioPython) wrote:
> Srinivas Iyyer wrote:
> > Dear group, 
> >  I am happy that I am slowly finding pyhonian projects
> > related to my research area. 
> > 
> > Problem:
> > 1. I have a database of human gene coordinates on
> > chromosomes.
> > 2. I have gene expression data from my lab concerning
> > the genes I mentioned above. 
> > 
> > 3. I want to visualize expression data laid on
> > chromosomes.
> 
> You may be able to produce chromosome diagrams with Leighton Pritchard 
> and Jennifer White's program genomediagram:
> 
> http://bioinf.scri.sari.ac.uk/lp/programs.html#genomediagram
> 
> It will do both circular genomes diagrams (nice for bacteria) and linear 
> ones - which would make sense for chromosomes.  I think I've seen 
> examples with expression data shown in this way... certainly it could be 
> done.

We use it ourselves to plot array data against chromosome location, but
on the whole chromosome scale and, as you mention, not interactively.
It's pretty easy to do, but not what Srinivas is looking for, I think.
It sounds, Srinivas, like you're wanting something that will operate
more like GeneSpring?  Is that right?

It's possible that, if you just wanted to present a static image of
expression data, you could use GenomeDiagram in this way, but it's not
the way I would choose to present the data in a GUI - I'd expect drawing
straight onto a canvas (in whichever GUI toolkit suited you) to be more
flexible for you.

> Note that this can produce PDF or bitmap output - but its not 
> interactive.  There is also a GUI to go with it, but I have not looked 
> at this.

The GUI is pretty rudimentary, providing for file selection and just
enough document formatting so as to not be entirely useless to the non-
programmer.  An improved version (but still not interactive) is in a
perennially almost-ready state as wxPython widgets in the current source,
waiting for some serious fixing and a wxApp to hang from.

-- 
Dr Leighton Pritchard AMRSC
D131, Plant-Pathogen Interactions, Scottish Crop Research Institute
Invergowrie, Dundee, Scotland, DD2 5DA, UK
T: +44 (0)1382 562731 x2405 F: +44 (0)1382 568578
E: lpritc at scri.sari.ac.uk   W: http://bioinf.scri.sari.ac.uk/lp
GPG/PGP: FEFC205C E58BA41B  http://www.keyserver.net             
(If the signature does not verify, please remove the SCRI disclaimer)


From mdehoon at c2b2.columbia.edu  Thu Apr 27 11:31:43 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Thu, 27 Apr 2006 11:31:43 -0400
Subject: [BioPython] GenBank
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF1A@cgcmail.cgc.cpmc.columbia.edu>

I was not able to replicate this error -- both biopython 1.41 and biopython
in CVS worked fine. Perhaps a temporary internet failure?
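
If it does turn out to be a transient network problem, simply trying again
after a short pause may be enough. Below is a generic retry helper as a
sketch; it is not part of Biopython, the name retry and its parameters are
made up, and it only helps if the failure really is temporary.

```python
import time


def retry(func, attempts=3, delay=2.0,
          exceptions=(IOError, IndexError), sleep=time.sleep):
    """Call func() up to `attempts` times, pausing `delay` seconds between tries.

    Only the listed exception types trigger another attempt; IndexError is
    included because that is what the traceback in this thread ends with.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except exceptions as err:
            last_error = err
            if attempt < attempts - 1:
                sleep(delay)
    raise last_error


# Possible use with the call from the original message:
# gi_list = retry(lambda: GenBank.search_for("Opuntia AND rpl16"))
```

If the error still appears after several attempts, it is more likely a
problem with the installation or with the data coming back from NCBI.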

--Michiel.

Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032




From bill at barnard-engineering.com  Fri Apr 28 00:44:28 2006
From: bill at barnard-engineering.com (Bill Barnard)
Date: Thu, 27 Apr 2006 21:44:28 -0700
Subject: [BioPython] Updating the tutorial,
	was :Parsing and Creating	Dictionaries of GenBank files
In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF05@cgcmail.cgc.cpmc.columbia.edu>
References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF05@cgcmail.cgc.cpmc.columbia.edu>
Message-ID: <1146199468.5816.34.camel@lyell.barnard-engineering.com>

On Fri, 2006-04-21 at 12:26 -0400, Michiel De Hoon wrote:
> > Something funny seems to have happened to the plain text version:
> > 
> > http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Doc/Tutorial.txt.diff?r1=1.5&r2=1.6&cvsroot=biopython
> 
> The plain text version is generated by hevea, so not by tex directly. The
> funny output is likely due to having a different hevea version (which I ran a
> couple of times). I didn't see anything obviously wrong with the Tutorial.tex
> source file, so I think these errors are due to errors in the Tutorial.tex ->
> Tutorial.txt translation by hevea.

FWIW - I just updated from CVS and ran my updated Doc makefiles (see
http://bugzilla.open-bio.org/show_bug.cgi?id=1939 ) and don't see any of
the weird artifacts in the generated Tutorial.txt file. My hevea version
is 1.06.

> 
> > If generating a consistent plain text version is a lot of hassle, then 
> > maybe we can live without it?
> 
> Currently, the plain text version is not very useful. It's not a source file,
> so it should not be in CVS. On the other hand, the plain text version is not
> available from the Biopython documentation page, and users are better off
> with the PDF version anyway. So I think nobody will miss the plain text
> version. Correct me if I'm wrong.

As long as your release process includes running a make in the Doc tree,
then you can generate the txt file from the tex source.

Bill

From mdehoon at c2b2.columbia.edu  Fri Apr 28 12:37:30 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Fri, 28 Apr 2006 12:37:30 -0400
Subject: [BioPython] Updating the tutorial,
	was :Parsing and Creating	Dictionaries of GenBank files
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF1E@cgcmail.cgc.cpmc.columbia.edu>

> On Fri, 2006-04-21 at 12:26 -0400, Michiel De Hoon wrote:
> > > Something funny seems to have happened to the plain text version:
> > 
> > The plain text version is generated by hevea, so not by tex directly. The
> > funny output is likely due to having a different hevea version (which I ran a
> > couple of times). I didn't see anything obviously wrong with the Tutorial.tex
> > source file, so I think these errors are due to errors in the Tutorial.tex ->
> > Tutorial.txt translation by hevea.
> 
> FWIW - I just updated from CVS and ran my updated Doc makefiles (see
> http://bugzilla.open-bio.org/show_bug.cgi?id=1939 ) and don't see any of
> the weird artifacts in the generated Tutorial.txt file. My hevea version
> is 1.06.

So it's probably a hevea problem -- I'm using version 1.08.

> As long as your release process includes running a make in the Doc tree,
> then you can generate the txt file from the tex source.

That is one of the steps in building a release -- see
http://www.biopython.org/docs/developer/build.html

--Michiel.


From clayton_kd at yahoo.com  Sat Apr 29 11:05:09 2006
From: clayton_kd at yahoo.com (Kyle Dent)
Date: Sat, 29 Apr 2006 08:05:09 -0700 (PDT)
Subject: [BioPython] GenBank parsing
Message-ID: <20060429150509.82051.qmail@web31509.mail.mud.yahoo.com>

Dear All,

My script was using the GenBank parser successfully until today,
when I tried to get it to parse a genpept file. After much
experimentation I discovered that it was also having trouble
parsing newly downloaded GenBank files (downloaded from NCBI).

Is anyone aware of this problem? I understand the flat file
format was updated this month, which is probably the cause.

The output which I am getting:

Traceback (most recent call last):
  File "C:\work\GB CDS Extractor.py", line 289, in open1_clicked
    loadGenBank(self, self.gbFilePath)
  File "C:\work\GB CDS Extractor.py", line 75, in loadGenBank
    cur_record = genBank_Iterator.next()
  File "C:\Python24\Lib\site-packages\Bio\GenBank\__init__.py", line 129, in next
    return self._parser.parse(File.StringHandle(data))
  File "C:\Python24\Lib\site-packages\Bio\GenBank\__init__.py", line 219, in parse
    self._scanner.feed(handle, self._consumer)
  File "C:\Python24\Lib\site-packages\Bio\GenBank\__init__.py", line 1259, in feed
    self._parser.parseFile(handle)
  File "C:\Python24\Lib\site-packages\Martel\Parser.py", line 328, in parseFile
    self.parseString(fileobj.read())
  File "C:\Python24\Lib\site-packages\Martel\Parser.py", line 356, in parseString
    self._err_handler.fatalError(result)
  File "C:\Python24\lib\xml\sax\handler.py", line 38, in fatalError
    raise exception
Martel.Parser.ParserPositionException: error parsing at or beyond character 136

With thanks,
Kyle



From biopython at maubp.freeserve.co.uk  Sat Apr 29 17:54:59 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython))
Date: Sat, 29 Apr 2006 22:54:59 +0100
Subject: [BioPython] GenBank parsing
In-Reply-To: <20060429150509.82051.qmail@web31509.mail.mud.yahoo.com>
References: <20060429150509.82051.qmail@web31509.mail.mud.yahoo.com>
Message-ID: <4453E0B3.9040409@maubp.freeserve.co.uk>

Kyle Dent wrote:
> Dear All,
> 
> My script was successfully implementing the Genbank
> parser until just today I was trying to get it to
> parse a genpept file. After much experimentation I
> discovered that it was actually having trouble parsing
> even newly downloaded GenBank files as well
> (downloaded of NCBI).
> 
> I wanted to ask if anyone is aware of this problem, I
> understand the flat file format was updated this month
> and is probably the cause of this.

I'm aware that earlier in 2006, there was a new project line added.  I 
haven't been aware of any further changes... on the other hand, I don't 
think I've ever used a "genpept" file either.

Anyway, judging from the error message, you are using the "old" Martel-based
parser shipped with BioPython 1.41.

We recommend you update to the current CVS parser which is (a) more up 
to date, (b) faster, (c) should give slightly more helpful error 
messages if it does get stuck.

For most cases you can simply download this file, replacing your 
Bio/GenBank/__init__.py after making a backup of the old version:

http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Bio/GenBank/__init__.py?cvsroot=biopython

If you see errors about ReseekFile then you will need to make a few 
other changes...

If you are still having trouble, or need further help making the update, 
please reply back.  Including the GenBank reference of any problem file 
would be handy.

Thank you

Peter


From srini_iyyer_bio at yahoo.com  Sat Apr  1 18:13:16 2006
From: srini_iyyer_bio at yahoo.com (Srinivas Iyyer)
Date: Sat, 1 Apr 2006 10:13:16 -0800 (PST)
Subject: [BioPython] How can I retreive FASTA sequences from NCBI
In-Reply-To: <442BFFAD.10103@maubp.freeserve.co.uk>
Message-ID: <20060401181316.62812.qmail@web38113.mail.mud.yahoo.com>

Hi , 
 I have 151,204 GenBank Accession IDs. 
I want to retreive FASTA sequences from NCBI and
compile them for my local blast. 


I am unable to get fasta sequences. I do not
understand. 

Could any one please help me. 

my code:
>>> mylis
['AA035383', 'AA971406', 'N98563']
parser = Fasta.RecordParser()
iterator = Fasta.Iterator(mylis,parser)
rec = iterator.next()
rec = iterator.next()
>>> rec
>>>

rec is empty :-(


Accession IDs are not GIs. They are GenBank accession
Ids.

I do not want sequences in GenBank (long format). I
want them in FASTA sequence format. 

Could any one pleast help me. 

Thanks
Srini




From biopython at maubp.freeserve.co.uk  Sat Apr  1 19:59:46 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython))
Date: Sat, 01 Apr 2006 20:59:46 +0100
Subject: [BioPython] How can I retreive FASTA sequences from NCBI
In-Reply-To: <20060401181316.62812.qmail@web38113.mail.mud.yahoo.com>
References: <20060401181316.62812.qmail@web38113.mail.mud.yahoo.com>
Message-ID: <442EDBB2.3040105@maubp.freeserve.co.uk>

Srinivas Iyyer wrote:
> Hi , 
> I have 151,204 GenBank Accession IDs. 
> I want to retreive FASTA sequences from NCBI and
> compile them for my local blast. 
 >
 > I am unable to get fasta sequences. I do not
 > understand.
 >
 > Could any one please help me.

This should help.  Take the first identifier in your example, AA035383:
it is a nucleotide sequence, available from the NCBI.  Searching
the Entrez database leads you here:-

http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=1507107

Note, AA035383 --> gi:1507107

Using the web interface, you can choose to view it as FASTA format 
rather than the default of GenBank format, and save to file.

You could make a note of that URL, and just change the GI number to 
download all the files you want - but you need a simple way to determine 
the GI number...

Now, BioPython can help you here:

 >>> from Bio import GenBank
 >>> gi_list = GenBank.search_for('AA035383', database='nucleotide')
 >>> print gi_list
['1507107']

You could use this code to get the GI numbers for each of your 151,204 
GenBank Accession IDs.  I would check in each case that only one GI 
number is returned.

 >>> assert len(gi_list)==1
 >>> gi_number = gi_list[0]
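
The same check can be folded into a loop over the whole accession list. What follows is only a minimal sketch and not part of the original session: resolve_accessions and its lookup argument are hypothetical names, and lookup stands in for whatever call returns the list of GI numbers for one accession (for instance a thin wrapper around the GenBank.search_for call shown above). Accessions that do not map to exactly one GI are set aside instead of stopping the run with an assert.

```python
def resolve_accessions(accessions, lookup):
    """Map each accession to its single GI number.

    `lookup` is a hypothetical callable: accession -> list of GI strings.
    Returns (resolved, problems), where `problems` holds every accession
    that gave zero or several GI numbers.
    """
    resolved = {}
    problems = {}
    for accession in accessions:
        gi_list = lookup(accession)
        if len(gi_list) == 1:
            resolved[accession] = gi_list[0]
        else:
            problems[accession] = list(gi_list)
    return resolved, problems


# Stand-in lookup table for illustration only. The AA035383 entry comes
# from the session above; the other two results are made up to show the
# "no hit" and "several hits" cases.
_fake_results = {
    'AA035383': ['1507107'],
    'AA971406': [],
    'N98563': ['111', '222'],
}
resolved, problems = resolve_accessions(
    ['AA035383', 'AA971406', 'N98563'],
    lambda accession: _fake_results[accession],
)
```

Keeping the problem cases in a separate dictionary means a long run over many accessions can finish and the odd ones can be inspected by hand afterwards.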

Once you have the GI number, then you could just download the FASTA file 
yourself and then parse it in the normal way.  Or, get BioPython to do 
all this for you with its rather clever NCBIDictionary object...

 >>> from Bio import Fasta
 >>> from Bio import GenBank
 >>> ncbi_dict = GenBank.NCBIDictionary('nucleotide', 'fasta', \
...             parser =  Fasta.RecordParser())
 >>> gi_number = '1507107'
 >>> fasta_rec = ncbi_dict[gi_number]
 >>> print fasta_rec
 >gi|1507107|gb|AA035383.1|AA035383 zk25e12.r1 
Soares_pregnant_uterus_NbHPU Homo sapiens cDNA clone IMAGE:471598 5', 
mRNA sequence
CTTGAGCCTCAGGAACGAGATGGCGGTTCTCTGGAGGCTGAGTGCCGTTTGCGGTGCCCT
AGGAGGCCGAGCTCTGTTGCTTCGAACTCCAGTGGTCAGACCTGCTCATATCTCAGCATT
TCTTCAGGACCGACCTATCCCAGAATGGTGTGGAGTGCAGCACATACACTTGTCACCCGA
GCCACCATTCTGGCTCCAAGGCTGCATCTCTCCACTGGACTAGCGAGANGGTTGTCANTG
TTTTGCTCCTGGGTCTGCTTCCCGGCTGCTTANTTGAANCCTTGCTCNGCGANGGACTAN
TCCCTGGC

You could use the Fasta.SequenceParser() if you prefer.  I would guess 
you would then want to save these FASTA records into one long FASTA file.
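
Saving everything into one long FASTA file could look roughly like the sketch below. It assumes each record has already been reduced to a (title, sequence) pair of plain strings; write_fasta is a hypothetical helper, not a BioPython function, and the two records in the example are made up.

```python
import io


def write_fasta(records, handle, width=60):
    """Write (title, sequence) pairs to `handle` in FASTA format.

    Each sequence is wrapped at `width` letters per line, as in the
    NCBI output shown above.
    """
    for title, sequence in records:
        handle.write(">%s\n" % title)
        for start in range(0, len(sequence), width):
            handle.write(sequence[start:start + width] + "\n")


# Example with two short made-up records written to an in-memory file;
# a real run would pass a file opened for writing instead.
out = io.StringIO()
write_fasta([("seq1 example", "ACGT" * 20), ("seq2 example", "TTGCA")], out)
fasta_text = out.getvalue()
```

Because the helper only calls handle.write, the same function can keep appending to one open file while the records are fetched one at a time.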

Enjoy!

Peter


From halima at mancala.cbio.uct.ac.za  Sun Apr  2 13:33:11 2006
From: halima at mancala.cbio.uct.ac.za (Halima Rabiu)
Date: Sun, 2 Apr 2006 15:33:11 +0200 (SAST)
Subject: [BioPython] Need help on NCBIStandaloneblast
In-Reply-To: <442BFFAD.10103@maubp.freeserve.co.uk>
References: <Pine.LNX.4.58.0603300915090.7802@mancala.cbio.uct.ac.za>
	<442BFFAD.10103@maubp.freeserve.co.uk>
Message-ID: <Pine.LNX.4.58.0604021523010.10948@mancala.cbio.uct.ac.za>

Thanks Peter. I was able to trace the error by printing
error_info.read(): the error is with my infile.
There is a result in my save file now, but I am still having a problem parsing
the output file. I will try to figure it out; it may be a syntax problem.
Thanks

On Thu, 30 Mar 2006, Peter (BioPython List) wrote:

> Halima Rabiu wrote:
> > Hi everyboby ;
> > I am new to biopython having problems with the "NCBIStandalone.blastall".
> > After launching the Blast with "doBlast" it look like runs and end
> > and then I check the output it empty and I try same thing using comand
> > line it work and get result.
> > I attch my code.
> 
> Have you checked the paths are correct, e.g.
> 
> assert os.path.isfile(data), "Missing database file " + data
> assert os.path.isfile(infile), "Missing input file " + infile
> 
> You don't need to check blast_exe yourself, as the blastall command does this
> for you.
> 
> If I understood you correctly, the "blast.out" file is empty.
> 
> Did blast return any error message?  Try:
> 
> print error_info.read()
> 
> or:
> 
> save_file =open("blast.error","w")
> blast_result=error_info.read()
> save_file.write(blast_result)
> save_file.close()
> 
> Next question, could you tell us what you typed at the command line which does
> work?
> 
> > I also try to go though the previous posts on biopython mailing list fund
> > similar problem post by Andreas but no solution to the problem .
> 
> It was worth checking anyway :)
> 
> Peter
> 
> 


From as_nascimento at yahoo.com.br  Wed Apr  5 20:35:35 2006
From: as_nascimento at yahoo.com.br (Alessandro S. Nascimento)
Date: Wed, 05 Apr 2006 17:35:35 -0300
Subject: [BioPython] problems when parsing blast output
In-Reply-To: <43CCD436.7020704@maubp.freeserve.co.uk>
References: <43CC485E.7050702@yahoo.com.br>
	<43CCC6D4.4020307@maubp.freeserve.co.uk>
	<43CCCF56.40803@yahoo.com.br>
	<43CCD436.7020704@maubp.freeserve.co.uk>
Message-ID: <44342A17.4070404@yahoo.com.br>

Hi Peter

I am having trouble parsing the results in a blastpgp output file. My
script used to work, but it isn't working this time. My blast output
file is very, very large.
When I try to run it, I can see my processor working at 99% for some
minutes, then it returns to the prompt with no results or information.
Any idea what may be happening?

Thanks in advance,


Alessandro


#!/usr/bin/python

import os
from Bio.Blast import NCBIStandalone
from string import *

blast_out = open('blast.output', 'r')

b_parser = NCBIStandalone.PSIBlastParser()

b_record = b_parser.parse(blast_out)

n=0
for round in b_record.rounds:
    for alignment in round.alignments:
        for hsp in alignment.hsps:
            if hsp.identities < 90:
                if hsp.identities > 30:
                        if alignment.length > 200:
                                print "Retrieving sequence query"
                                os.system ("fastacmd -d ..//db/nr -s \'%s\' > test.bl2.%d" % (query, n, ))
                                n=n+1

blast_out.close()


From halima at mancala.cbio.uct.ac.za  Thu Apr 13 15:07:52 2006
From: halima at mancala.cbio.uct.ac.za (Halima Rabiu)
Date: Thu, 13 Apr 2006 17:07:52 +0200 (SAST)
Subject: [BioPython] Need help parsing Blastoutput
Message-ID: <Pine.LNX.4.58.0604131640020.23647@mancala.cbio.uct.ac.za>

Hi All,
I have BLAST output from a local blast run.
I need to calculate the % alignment coverage with respect to my subject.
I tried parsing the blast output and wanted to print the Sbjct start and
Sbjct end, but I could not. Is there any way I could do this to get the
match coverage between my query and subject? I don't need Identities, but
the total % alignment for query or subject.
Thanks
Halimah


From mdehoon at c2b2.columbia.edu  Thu Apr 13 15:56:26 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Thu, 13 Apr 2006 11:56:26 -0400
Subject: [BioPython] Need help parsing Blastoutput
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEE9@cgcmail.cgc.cpmc.columbia.edu>

Could you send us the script you were using?

--Michiel.

Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032




From rafael at nbn.ac.za  Fri Apr 14 09:52:42 2006
From: rafael at nbn.ac.za (Rafael C. Jimenez)
Date: Fri, 14 Apr 2006 11:52:42 +0200
Subject: [BioPython] Need help parsing Blastoutput
In-Reply-To: <Pine.LNX.4.58.0604131640020.23647@mancala.cbio.uct.ac.za>
References: <Pine.LNX.4.58.0604131640020.23647@mancala.cbio.uct.ac.za>
Message-ID: <9ad32945680e91a485c1e0cdb1ca4eb7@nbn.ac.za>

On 13 Apr 2006, at 5:07 PM, Halima Rabiu wrote:

> Hi All,
> I have a BLAST output from a local blast

Well, I would say that you can use three alternatives to run blast, and 
somehow you can use all of them locally.
  - Blast web server (Through Blastcl3 or through biopython)
  - Blast standalone
  - wwwblast

I guess that by local blast you mean that you are using blast
standalone with your own local databases. Which of the three you use
matters, because each needs a different module to parse the output:
  - Bio.Blast.NCBIStandalone for Blast standalone outputs
  - Bio.Blast.NCBIWWW for Blast web server outputs
  - No parser for the wwwblast

> I need to calculate my % alignment coverage as regard to my subject

I am not sure what you mean, but I would say that this % is provided by 
the "Identities" field in nucleotide and protein comparisons for each 
alignment, and also by the "Positives" field in protein comparisons.
Example: Identities = 11/26 (42%), Positives = 15/26 (57%)

> I try parsed the blast output and wanted to print the
> sbjct Start and Sbjct end. but I could not is there anyway I could this

# Open your Blast Output file
blastOutput = open("The name of your blast output", 'r')

Once you have parsed the Blast standalone output:
     from Bio.Blast import NCBIStandalone
     parser = NCBIStandalone.BlastParser()
     blastRecord = parser.parse(blastOutput)


.... or the NCBI web server output:
     from Bio.Blast import NCBIWWW
     parser = NCBIWWW.BlastParser()
     blastRecord = parser.parse(blastOutput)


now you can start to recover information using the Bio.Blast.Record 
module

     import Bio.Blast.Record
     # ... for instance you can retrieve the Blast version you used when
     # you got your output ...
     print 'header.version:', blastRecord.version
     for alignment in blastRecord.alignments:
         # ... or the length of the alignment ...
         print 'alignment.length:', alignment.length
         for hsp in alignment.hsps:
             # ... or the sbjct start as you want ...
             print 'hsp.sbjct_start:', hsp.sbjct_start
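
For the coverage question itself, one possible calculation is sketched here. It is a sketch under assumptions, not something taken from the Bio.Blast API: subject_coverage is a hypothetical helper, the (start, end) pairs are taken to be 1-based inclusive subject coordinates with one pair per HSP, and the subject length is taken to be the alignment.length value printed in the loop above. The end coordinates are assumed to be known already, for example read off the "Sbjct:" lines of the report.

```python
def subject_coverage(intervals, subject_length):
    """Percentage of the subject covered by a set of HSP intervals.

    `intervals` is a list of (start, end) pairs in 1-based inclusive
    subject coordinates. Overlapping or adjacent HSPs are merged so
    that no subject position is counted twice.
    """
    # Normalise so start <= end (minus-strand nucleotide hits run backwards).
    spans = sorted((min(s, e), max(s, e)) for s, e in intervals)
    covered = 0
    current_start, current_end = None, None
    for start, end in spans:
        if current_end is None or start > current_end + 1:
            # Disjoint from the running span: bank it and start a new one.
            if current_end is not None:
                covered += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            # Overlapping or touching: extend the running span.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start + 1
    return 100.0 * covered / subject_length


# A single HSP covering subject positions 6 to 201 of a 207-letter subject.
print(subject_coverage([(6, 201)], 207))
```

For that single HSP the result is 196 of 207 positions, about 94.7%. The same function gives query coverage if it is fed query coordinates and the query length instead.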

>
> try to get mach coverage between my querry and subject I dont need
> Identities,but total % alignment for querry or subject.
> Thanks
> Halimah

I am working in the NBN central node in UWC, not far away from UCT. 
Don't hesitate to visit us if you want help or advice.

Cheers,
Rafael

Rafael C. Jimenez
-----------------------------------------------------------
National Bioinformatics Network
University of the Western Cape
Private Bag X17
Bellville 7530
South Africa
Tel: +27219592991
rafael at nbn.ac.za
www.nbn.ac.za
-----------------------------------------------------------
Proteomics Services Group
European Bioinformatics Institute (EMBL-EBI)
Wellcome Trust Genome Campus, Hinxton
Cambridge - CB10 1SD - UK
Tel: +441223492610
rafael at ebi.ac.uk
www.ebi.ac.uk
-----------------------------------------------------------



From halima at mancala.cbio.uct.ac.za  Tue Apr 18 15:06:02 2006
From: halima at mancala.cbio.uct.ac.za (Halima Rabiu)
Date: Tue, 18 Apr 2006 17:06:02 +0200 (SAST)
Subject: [BioPython] Need help parsing Blastoutput
In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEE9@cgcmail.cgc.cpmc.columbia.edu>
References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEE9@cgcmail.cgc.cpmc.columbia.edu>
Message-ID: <Pine.LNX.4.58.0604181704570.29563@mancala.cbio.uct.ac.za>

Thanks.
Please see the attachment: a copy of my script and a copy of my Blast output.
Thanks


On Thu, 13 Apr 2006, Michiel De Hoon wrote:

> Could you send us the script you were using?
> 
-------------- next part --------------
#! /usr/local/bin/python2.4
# halimah
# 16-04-2006

from string import split
from Bio.Blast import NCBIStandalone

b_out = open('Enterococcus_out', 'r')
b_parser = NCBIStandalone.BlastParser()
b_iterator = NCBIStandalone.Iterator(b_out, b_parser)

E_VALUE_THRESH = 1.0

while 1:
    b_record = b_iterator.next()
    # The iterator returns None once the input is exhausted, so test
    # for that before touching any attribute of the record.
    if b_record is None:
        break
    print "The following results are for query " + b_record.query
    print 'len of query:', b_record.query_letters
    for alignment in b_record.alignments:
        for hsp in alignment.hsps:
            if hsp.expect <= E_VALUE_THRESH:
                print '****Alignment****'
                print 'title:', alignment.title
                print 'length:', alignment.length
                print 'e value:', hsp.expect
                print 'subjectstart:', hsp.sbjct_start
                print 'subject end:', hsp.sbject_end
		     			  
From mdehoon at c2b2.columbia.edu  Tue Apr 18 16:40:05 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Tue, 18 Apr 2006 12:40:05 -0400
Subject: [BioPython] Need help parsing Blastoutput
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEF5@cgcmail.cgc.cpmc.columbia.edu>

Could you also send us the file Enterococcus_out so we can run the script?

From the script, it looks like you're trying to parse text output from Blast.
While this is possible (in theory), the format of Blast text output tends to
change a lot, thereby breaking the parser in Biopython. It is more reliable
to have Blast generate output in XML format, and use the XML parser:

blast_out = open('my_blast.xml', 'r')

from Bio.Blast import NCBIXML

b_parser = NCBIXML.BlastParser()
b_record = b_parser.parse(blast_out)

See section 3.1.2 in the Biopython cookbook, and section 3.1.4 on how to
generate Blast output in XML.
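
As a rough illustration of the XML route, the sketch below builds a command line for the legacy blastall program, where alignment view 7 (-m 7) selects XML output. blastall_xml_command is a hypothetical helper, and the program name, database name and query file name are placeholders to be replaced with your own.

```python
def blastall_xml_command(program, database, query_file, output_file,
                         blastall="blastall"):
    """Build the argument list for a legacy blastall run with XML output."""
    return [
        blastall,
        "-p", program,      # search program, e.g. blastp
        "-d", database,     # formatted BLAST database
        "-i", query_file,   # FASTA file with the query sequence(s)
        "-m", "7",          # alignment view 7 = XML output
        "-o", output_file,  # where the XML report is written
    ]


# Placeholder names; only my_blast.xml matches the parser example above.
cmd = blastall_xml_command("blastp", "Blastdata.fdb", "query.fasta",
                           "my_blast.xml")
print(" ".join(cmd))
```

The list can be handed to subprocess.call(cmd) as it stands, and the resulting my_blast.xml is then the file opened in the parser example above.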

--Michiel.


Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032




From halima at mancala.cbio.uct.ac.za  Wed Apr 19 10:15:15 2006
From: halima at mancala.cbio.uct.ac.za (Halima Rabiu)
Date: Wed, 19 Apr 2006 12:15:15 +0200 (SAST)
Subject: [BioPython] Need help parsing Blastoutput
In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEF5@cgcmail.cgc.cpmc.columbia.edu>
References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEF5@cgcmail.cgc.cpmc.columbia.edu>
Message-ID: <Pine.LNX.4.58.0604191200150.29563@mancala.cbio.uct.ac.za>

Hi,
Please see the attachment; it is part of my Blast output.
Yes, I am trying to parse text output from Blast. I used another script to
run the local blast whose output I am trying to parse. NCBIStandalone.BlastParser
was working fine without hsp.sbject_end, which is one of the values I need to
print out.
On checking the class diagrams in the cookbook, I found that sbject_end is
not included. I just need another way of printing the int(subject end).
Thanks for your help
Halimah
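
One possible way to get the subject end without such an attribute is to derive it from the start coordinate and the aligned subject string. This is only a sketch under assumptions: hsp_end is a hypothetical helper, the aligned string is assumed to be what the parser stores in hsp.sbjct with "-" marking gaps, and the arithmetic assumes a protein or plus-strand search where coordinates increase by one per letter.

```python
def hsp_end(start, aligned_seq, gap_char="-"):
    """1-based end coordinate of an aligned sequence that begins at `start`.

    Gap characters occupy alignment columns but no sequence positions,
    so they are not counted.
    """
    residues = len(aligned_seq) - aligned_seq.count(gap_char)
    return start + residues - 1


# One subject row from the LISMO hit in the attached report: it starts
# at position 126 and contains a two-column gap.
sbjct_row = "LTNGPGKLTQALGLSMQDYGKTLFDSNIWL--EEAKLPHLIEATNRIGVPNKGIATHYPL"
end = hsp_end(126, sbjct_row)
print(end)
```

The row holds 58 letters plus 2 gap columns, so the computed end is 183, which agrees with the number printed at the end of that row in the report. In the script this would be called as hsp_end(hsp.sbjct_start, hsp.sbjct).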

On Tue, 18 Apr 2006, Michiel De Hoon wrote:

> Could you also send us the file Enterococcus_out so we can run the script?
> 
> From the script, it looks like you're trying to parse text output from Blast.
> While this is possible (in theory), the format of Blast text output tends to
> change a lot, thereby breaking the parser in Biopython. It is more reliable
> to have Blast generate output in XML format, and use the XML parser:
> 
> blast_out = open('my_blast.xml', 'r')
> 
> from Bio.Blast import NCBIXML
> 
> b_parser = NCBIXML.BlastParser()
> b_record = b_parser.parse(blast_out)
> 
> See section 3.1.2 in the Biopython cookbook, and section 3.1.4 on how to
> generate Blast output in XML.
> 
> --Michiel.
> 
> 
> 
-------------- next part --------------
BLASTP 2.2.10 [Oct-19-2004]


Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs",  Nucleic Acids Res. 25:3389-3402.

Query= ENTFA 3MGH_ENTFA (Q833H5) Putative 3-methyladenine DNA
glycosylase (EC 3.2.2.-)
         (229 letters)

Database: Blastdata.fdb
           240,170 sequences; 77,468,597 total letters

Searching..................................................done

                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

ENTFA 3MGH_ENTFA (Q833H5) Putative 3-methyladenine DNA glycosyla...   462   e-130
LISMO 3MGH_LISMO (P58621) Putative 3-methyladenine DNA glycosyla...   194   2e-49
STAAM 3MGH_STAAM (P65414) Putative 3-methyladenine DNA glycosyla...   187   3e-47
STAES 3MGH_STAES (Q8CRC1) Putative 3-methyladenine DNA glycosyla...   186   5e-47
LISIN 3MGH_LISIN (Q92D89) Putative 3-methyladenine DNA glycosyla...   185   8e-47
LACPL 3MGH_LACPL (Q88VP8) Putative 3-methyladenine DNA glycosyla...   178   1e-44
BACSU 3MGH_BACSU (P94378) Putative 3-methyladenine DNA glycosyla...   160   3e-39
LACJO Q74LU5 (Q74LU5) 3-methyladenine DNA glycosylase                 155   7e-38
OCEIH 3MGH_OCEIH (Q8ETG4) Putative 3-methyladenine DNA glycosyla...   147   2e-35
BACAN 3MGH_BACAN (Q81UJ9) Putative 3-methyladenine DNA glycosyla...   130   4e-30
BACC1 Q73CV5 (Q73CV5) Methylpurine-DNA glycosylase family protein     125   8e-29
CLOTE 3MGH_CLOTE (Q896H4) Putative 3-methyladenine DNA glycosyla...   124   3e-28
CHLMU Q9PJE9 (Q9PJE9) Succinate dehydrogenase, iron-sulfur protein    113   4e-25
CHLCV 3MGH_CHLCV (Q824B4) Putative 3-methyladenine DNA glycosyla...   111   2e-24
CLOAB 3MGH_CLOAB (Q97EY6) Putative 3-methyladenine DNA glycosyla...   108   1e-23
CLOPE 3MGH_CLOPE (Q8XHA9) Putative 3-methyladenine DNA glycosyla...   107   4e-23
STRMU Q8DRU4 (Q8DRU4) Putative 3-methyladenine DNA glycosylase        103   3e-22
DEIRA 3MGH_DEIRA (Q9RSQ0) Putative 3-methyladenine DNA glycosyla...    86   9e-17
CORGL 3MGH_CORGL (Q8NU33) Putative 3-methyladenine DNA glycosyla...    82   1e-15
STRCO 3MGH_STRCO (Q9S208) Putative 3-methyladenine DNA glycosyla...    80   4e-15
BRAJA 3MGH_BRAJA (Q89LR7) Putative 3-methyladenine DNA glycosyla...    79   1e-14
STRAW 3MGH_STRAW (Q829C5) Putative 3-methyladenine DNA glycosyla...    73   8e-13
COREF 3MGH_COREF (Q8FUA2) Putative 3-methyladenine DNA glycosyla...    69   9e-12
PROAC Q6A5L3 (Q6A5L3) Putative 3-methylpurine DNA glycosylase          66   9e-11
MYCPA Q740F6 (Q740F6) Hypothetical protein                             64   3e-10
MYCLEH 3MGH_MYCTU (P65412) Putative 3-methyladenine DNA glycosyl...    64   5e-10
MYCTU 3MGH_MYCTU (P65412) Putative 3-methyladenine DNA glycosyla...    64   5e-10
MYCBO 3MGH_MYCBO (P65413) Putative 3-methyladenine DNA glycosyla...    64   5e-10
MYCLE 3MGH_MYCLE (O05678) Putative 3-methyladenine DNA glycosyla...    60   5e-09
RICCN 3MGH_RICCN (Q92IE0) Putative 3-methyladenine DNA glycosyla...    52   2e-06
RICPR 3MGH_RICPR (Q9ZDH7) Putative 3-methyladenine DNA glycosyla...    49   1e-05
PSEAE 3MGH_PSEAE (Q9HX17) Putative 3-methyladenine DNA glycosyla...    45   2e-04
PSEPK Q88DL3 (Q88DL3) DNA-3-methyladenine glycosylase, putative        42   0.002
BORPE Q7VWB7 (Q7VWB7) Putative methypurine-DNA glycosylase             40   0.004
BORPA Q7W9Q9 (Q7W9Q9) Putative methypurine-DNA glycosylase             40   0.004
STRAW Q93HJ1 (Q93HJ1) Modular polyketide synthase                      35   0.14
STRAW Q93HJ2 (Q93HJ2) Modular polyketide synthase                      33   0.68
SILPO Q5LKE3 (Q5LKE3) Peptidyl-prolyl cis-trans isomerase, FKBP-...    32   1.5
SILPO Q5LVX1 (Q5LVX1) ABC transporter, transmembrane ATP-binding...    30   4.4
CLOTE DXS_CLOTE (Q894H0) 1-deoxy-D-xylulose-5-phosphate synthase...    30   5.8
BURMA Q9AI54 (Q9AI54) DedA family protein                              30   7.5
STAES Q8CST0 (Q8CST0) Hypothetical protein SE0952                      29   9.8
SILPO Q5LMI5 (Q5LMI5) ADA regulatory protein                           29   9.8

>ENTFA 3MGH_ENTFA (Q833H5) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 229

 Score =  462 bits (1190), Expect = e-130
 Identities = 229/229 (100%), Positives = 229/229 (100%)

Query: 1   MVKEMKETINIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFG 60
           MVKEMKETINIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFG
Sbjct: 1   MVKEMKETINIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFG 60

Query: 61  LRKTPRLQAMYDKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQ 120
           LRKTPRLQAMYDKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQ
Sbjct: 61  LRKTPRLQAMYDKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQ 120

Query: 121 GRQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGR 180
           GRQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGR
Sbjct: 121 GRQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGR 180

Query: 181 WTELPLRYVVAGNPYISKQKRTAVDQIDFGWKDEENEKSNNAHILRGTT 229
           WTELPLRYVVAGNPYISKQKRTAVDQIDFGWKDEENEKSNNAHILRGTT
Sbjct: 181 WTELPLRYVVAGNPYISKQKRTAVDQIDFGWKDEENEKSNNAHILRGTT 229


>LISMO 3MGH_LISMO (P58621) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 207

 Score =  194 bits (492), Expect = 2e-49
 Identities = 99/198 (50%), Positives = 134/198 (67%), Gaps = 3/198 (1%)

Query: 8   TINIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRL 67
           T   F +KTT E+A+ +LGM L H+T  G+L G IV+ EAYLG  D AAHSF   +T R
Sbjct: 6   TKEFFESKTTIELARDILGMRLVHQTNEGLLSGLIVETEAYLGATDMAAHSFQNLRTKRT 65

Query: 68  QAMYDKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVE-GVDKMIENRQGRQGVE 126
           + M+  PGTIY+Y MH  ++LN +T  +G P+ ++IRAIEP E    +M +NR G+ G E
Sbjct: 66  EVMFSSPGTIYMYQMHRQVLLNFITMPKGIPEAILIRAIEPDEQAKQQMTQNRHGKTGYE 125

Query: 127 LTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPL 186
           LTNGPGKL  ALG+  Q YG+++F S++ L  E+ K P  IEA  RIG+PNKG  T  PL
Sbjct: 126 LTNGPGKLTQALGLSMQDYGKTLFDSNIWL--EEAKLPHLIEATNRIGVPNKGIATHYPL 183

Query: 187 RYVVAGNPYISKQKRTAV 204
           R+ V G+PYIS Q++ ++
Sbjct: 184 RFTVKGSPYISGQRKNSI 201


>STAAM 3MGH_STAAM (P65414) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 202

 Score =  187 bits (474), Expect = 3e-47
 Identities = 91/201 (45%), Positives = 132/201 (65%), Gaps = 1/201 (0%)

Query: 12  FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71
           F  + T + A+ LLG+ + ++       GYIV+ EAYLG  D+AAH FG + TP++ ++Y
Sbjct: 6   FINQQTTQTAKALLGVKIIYQDDYQTYTGYIVETEAYLGIQDKAAHGFGGKITPKVTSLY 65

Query: 72  DKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGP 131
            K GTIY + MHTHL++N VT+ +G P+GV+IRAIEP EG+  M  NR G+ G ELTNGP
Sbjct: 66  KKGGTIYAHVMHTHLLINFVTRTEGIPEGVLIRAIEPDEGIGAMNVNR-GKSGYELTNGP 124

Query: 132 GKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVA 191
           GK   A  I + + G ++    L +    RK+PK I    RIGIPNKG WT  PLR+ V
Sbjct: 125 GKWTKAFNIPRSIDGSTLNDCKLSIDTNHRKYPKTIIESGRIGIPNKGEWTNKPLRFTVK 184

Query: 192 GNPYISKQKRTAVDQIDFGWK 212
           GNPY+S+ +++     D  WK
Sbjct: 185 GNPYVSRMRKSDFQNPDDTWK 205


>LISIN 3MGH_LISIN (Q92D89) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 207

 Score =  185 bits (470), Expect = 8e-47
 Identities = 96/200 (48%), Positives = 130/200 (65%), Gaps = 3/200 (1%)

Query: 6   KETINIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTP 65
           K T   F  +TT E+A+ ++GM L HE     L GYIV+ EAYLG  D AAHSF   +T
Sbjct: 4   KITPTFFENRTTIELARDIIGMRLVHEIGNYTLSGYIVETEAYLGATDMAAHSFKNLRTK 63

Query: 66  RLQAMYDKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPV-EGVDKMIENRQGRQG 124
           R + M+  PGTIY Y MH  ++LN +T  +G P+ V+IRA+EP  E +++M +NR  + G
Sbjct: 64  RTEVMFGTPGTIYTYQMHQQVLLNFITMREGIPEAVLIRALEPTKESIEQMEQNRFLKTG 123

Query: 125 VELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTEL 184
            ELTNGPGKL  ALG+  Q YG+++F S++ L  E+ K P  IEA  RIG+PNKG  T
Sbjct: 124 FELTNGPGKLTQALGLSMQDYGKTLFDSNIWL--ERAKVPHIIEATNRIGVPNKGIATHY 181

Query: 185 PLRYVVAGNPYISKQKRTAV 204
           PLR+   G+PYIS Q++  +
Sbjct: 182 PLRFTAKGSPYISAQRKRQI 201


>LACPL 3MGH_LACPL (Q88VP8) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 209

 Score =  178 bits (451), Expect = 1e-44
 Identities = 93/199 (46%), Positives = 127/199 (63%), Gaps = 1/199 (0%)

Query: 13  NTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYD 72
           +T TT E+A  LLG  L  +T++GVL  +I + EAYLG  D  AH++   +TPR  A++
Sbjct: 9   STCTTPEIAVSLLGKQLRLQTSSGVLTAWITETEAYLGARDAGAHAYQNHQTPRNHALWQ 68

Query: 73  KPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPG 132
             GTIY+Y M    +LN+VTQ  G P+ V+IR IEP  G+++M + R       LTNGPG
Sbjct: 69  SAGTIYIYQMRAWCLLNIVTQAAGTPECVLIRGIEPDAGLERMQQQRP-VPIANLTNGPG 127

Query: 133 KLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAG 192
           KL+ ALG+DK L GQ++  ++L L     + P+++ A PRIGI NKG WT  PLRY VAG
Sbjct: 128 KLMQALGLDKTLNGQALQPATLSLDLSHYRQPEQVVATPRIGIVNKGEWTTAPLRYFVAG 187

Query: 193 NPYISKQKRTAVDQIDFGW 211
           NP++SK  R  +D    GW
Sbjct: 188 NPFVSKISRRTIDHEHHGW 206


>BACSU 3MGH_BACSU (P94378) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 196

 Score =  160 bits (405), Expect = 3e-39
 Identities = 91/198 (45%), Positives = 112/198 (56%), Gaps = 2/198 (1%)

Query: 1   MVKEMKETINIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFG 60
           M +E       F  KT  E+A  LLG  L  ET  G   GYIV+ EAY+G  D AAHSF
Sbjct: 1   MTREKNPLPITFYQKTALELAPSLLGCLLVKETDEGTASGYIVETEAYMGAGDRAAHSFN 60

Query: 61  LRKTPRLQAMYDKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQ 120
            R+T R + M+ + G +Y Y MHTH +LN+V  E+  PQ V+IRAIEP EG   M E R
Sbjct: 61  NRRTKRTEIMFAEAGRVYTYVMHTHTLLNVVAAEEDVPQAVLIRAIEPHEGQLLMEERRP 120

Query: 121 GRQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGR 180
           GR   E TNGPGKL  ALG+    YG+ I    L +  E    P+ I   PRIGI N G
Sbjct: 121 GRSPREWTNGPGKLTKALGVTMNDYGRWITEQPLYI--ESGYTPEAISTGPRIGIDNSGE 178

Query: 181 WTELPLRYVVAGNPYISK 198
             + P R+ V GN Y+S+
Sbjct: 179 ARDYPWRFWVTGNRYVSR 196


>LACJO Q74LU5 (Q74LU5) 3-methyladenine DNA glycosylase
          Length = 208

 Score =  155 bits (393), Expect = 7e-38
 Identities = 77/192 (40%), Positives = 125/192 (65%), Gaps = 2/192 (1%)

Query: 12  FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71
           F  ++T E+++ LLG  L +     +L G IV+AEAY+G  D AAHS+G R++P  + +Y
Sbjct: 7   FTNRSTSEISKDLLGRTLSYNNGEEILSGTIVEAEAYVGVKDRAAHSYGGRRSPANEGLY 66

Query: 72  DKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGP 131
              G++Y+Y+   +   ++  QE+G+PQGV+IRAI+P+ G+D MI+NR G+ G  LTNGP
Sbjct: 67  RPGGSLYIYSQRQYFFFDVSCQEEGEPQGVLIRAIDPLTGIDTMIKNRSGKTGPLLTNGP 126

Query: 132 GKLVAALGIDKQLYG-QSIFSSSLRLVPEKRKFPKKIEALPRIGI-PNKGRWTELPLRYV 189
           GK++ ALGI  + +    +  S   +  + ++  ++I ALPR+GI  +   W +  LR++
Sbjct: 127 GKMMQALGITSRKWDLVDLNDSPFDIDIDHKREIEEIVALPRVGINQSDPEWAQKKLRFI 186

Query: 190 VAGNPYISKQKR 201
           V+GNPY+S  K+
Sbjct: 187 VSGNPYVSDIKK 198


>OCEIH 3MGH_OCEIH (Q8ETG4) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 198

 Score =  147 bits (371), Expect = 2e-35
 Identities = 74/182 (40%), Positives = 112/182 (61%), Gaps = 2/182 (1%)

Query: 17  TEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGT 76
           T E+A+ LLG  L  +T  G   G IV+ EAYLG  D AAH +G R+T R + +Y KPG
Sbjct: 19  TLELAKNLLGCILVKQTEEGTSSGVIVETEAYLGNTDRAAHGYGNRRTKRTEILYSKPGY 78

Query: 77  IYLYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVA 136
            Y++ +H H ++N+V+  +G P+ V+IRA+EP  G+D+M+  R  ++   LT+GPGKL
Sbjct: 79  AYVHLIHNHRLINVVSSMEGDPESVLIRAVEPFSGIDEMLMRRPVKKFQNLTSGPGKLTQ 138

Query: 137 ALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAGNPYI 196
           A+GI  + YG  + +  L +   + K P  ++   RIGI N G   + P R+ V GNP++
Sbjct: 139 AMGIYMEDYGHFMLAPPLFI--SEGKSPASVKTGSRIGIDNTGEAKDYPYRFWVDGNPFV 196

Query: 197 SK 198
           S+
Sbjct: 197 SR 198


>BACAN 3MGH_BACAN (Q81UJ9) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 205

 Score =  130 bits (326), Expect = 4e-30
 Identities = 80/194 (41%), Positives = 112/194 (57%), Gaps = 11/194 (5%)

Query: 17  TEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGT 76
           T EVA+ LLG  L H        G IV+ EAY GPDD+AAHS+G R+T R + M+  PG
Sbjct: 12  TLEVAKKLLGQKLVHIVNGIKRSGIIVEVEAYKGPDDKAAHSYGGRRTDRTEVMFGAPGH 71

Query: 77  IYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGV------ELTN 129
            Y+Y ++  +   N++T   G PQGV+IRA+EPV+G++++   R  +  +       LTN
Sbjct: 72  AYVYLIYGMYHCFNVITAPVGTPQGVLIRALEPVDGIEEIKLARYNKTDITKAQYKNLTN 131

Query: 130 GPGKLVAALGIDKQLYGQSIFSSSL--RLVPEKRKFPK--KIEALPRIGIPNKGRWTELP 185
           GPGKL  ALGI  +  G S+ S +L   LVPE++      KI A PRI I         P
Sbjct: 132 GPGKLCRALGITLEERGVSLQSDTLHIELVPEEKHISSQYKITAGPRINIDYAEEAVHYP 191

Query: 186 LRYVVAGNPYISKQ 199
            R+   G+P++SK+
Sbjct: 192 WRFYYEGHPFVSKK 205


>BACC1 Q73CV5 (Q73CV5) Methylpurine-DNA glycosylase family protein
          Length = 205

 Score =  125 bits (315), Expect = 8e-29
 Identities = 79/194 (40%), Positives = 110/194 (56%), Gaps = 11/194 (5%)

Query: 17  TEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGT 76
           T EVA+ LLG  L H        G IV+ EAY GPDD+AAHS+G R+T R + M+  PG
Sbjct: 12  TLEVAKKLLGQKLVHIVDGIKRSGIIVEVEAYKGPDDKAAHSYGGRRTDRTEVMFGAPGH 71

Query: 77  IYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGV------ELTN 129
            Y+Y ++  +   N++T   G PQGV+IRA+EPV+G++++   R  +  +       LTN
Sbjct: 72  AYVYLIYGMYHCFNVITAPVGTPQGVLIRALEPVDGIEEIKLARYNKTDITKAQYKNLTN 131

Query: 130 GPGKLVAALGIDKQLYGQSIFSSSL--RLVPEKRKFPK--KIEALPRIGIPNKGRWTELP 185
           GPGKL  ALGI  +  G S+ S +L   LV E+       KI A PRI I         P
Sbjct: 132 GPGKLCRALGITLEERGVSLQSDTLHIELVREEEHISSQYKITAGPRINIDYAEEAVHYP 191

Query: 186 LRYVVAGNPYISKQ 199
            R+   G+P++SK+
Sbjct: 192 WRFYYEGHPFVSKK 205


>CLOTE 3MGH_CLOTE (Q896H4) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 203

 Score =  124 bits (310), Expect = 3e-28
 Identities = 74/197 (37%), Positives = 109/197 (55%), Gaps = 9/197 (4%)

Query: 12  FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71
           F  K+  +VA+YLLG  L +E     L G IV+ EAY+G  D+A+H++G +KT R+  +Y
Sbjct: 7   FYEKSALQVAKYLLGKILVNEVEGITLKGKIVETEAYIGAIDKASHAYGGKKTERVMPLY 66

Query: 72  DKPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKM--------IENRQGR 122
            KPGT Y+Y ++  +   N++T+ +G+ +GV+IRAIEP+EG++KM        I
Sbjct: 67  GKPGTAYVYLIYGMYHCFNVITKVEGEAEGVLIRAIEPLEGIEKMAYLRYKKPISEISKT 126

Query: 123 QGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWT 182
           Q   LT GPGKL  AL IDK    Q + +     +    K    I    RIGI
Sbjct: 127 QFKNLTTGPGKLCIALNIDKSNNKQDLCNEGTLYIEHNDKEKFNIVESKRIGIEYAEEAK 186

Query: 183 ELPLRYVVAGNPYISKQ 199
           +   R+ +  NP+ISK+
Sbjct: 187 DFLWRFYIEDNPWISKK 203


>CHLMU Q9PJE9 (Q9PJE9) Succinate dehydrogenase, iron-sulfur protein
          Length = 425711

 Score =  113 bits (283), Expect = 4e-25
 Identities = 72/185 (38%), Positives = 105/185 (56%), Gaps = 5/185 (2%)

Query: 10  NIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQA 69
           + F ++    +AQ LLG  L       +  GYIV+ EAY GPDD+A H++  RKT R +A
Sbjct: 321 HFFLSEDVITLAQQLLGHKLITTHEGLITSGYIVETEAYRGPDDKACHAYNYRKTQRNRA 380

Query: 70  MYDKPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVE-- 126
           MY K G+ YLY  +  H +LN+VT  +  P  V+IRAI P +G + MI+ RQ R
Sbjct: 381 MYLKGGSAYLYRCYGMHHLLNVVTGPEDIPHAVLIRAILPDQGKELMIQRRQWRDKPPHL 440

Query: 127 LTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPL 186
           LTNGPGK+  ALGI  +   Q + + +L +   K K    + A  RIGI     + ++P
Sbjct: 441 LTNGPGKVCQALGISLENNRQRLNTPALYI--SKEKISGTLTATARIGIDYAQEYRDVPW 498

Query: 187 RYVVA 191
           R++++
Sbjct: 499 RFLLS 503


>CHLCV 3MGH_CHLCV (Q824B4) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 190

 Score =  111 bits (278), Expect = 2e-24
 Identities = 67/174 (38%), Positives = 98/174 (56%), Gaps = 5/174 (2%)

Query: 20  VAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIYL 79
           +A+ LLG  L  + +  +  G+IV+ EAY GPDD+A H++  RKT R   MY + G  Y+
Sbjct: 15  LAKELLGHILITKISGKITSGFIVETEAYRGPDDKACHAYNYRKTKRNSPMYSRGGIAYI 74

Query: 80  YTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVE--LTNGPGKLVA 136
           Y  +  H + N+VT +Q  P  V+IRAI P EG D MI+ RQ +   +  LTNGPGK+
Sbjct: 75  YRCYGMHSLFNVVTAKQDLPHAVLIRAILPYEGEDIMIQRRQWQNKPKHLLTNGPGKVCQ 134

Query: 137 ALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVV 190
           AL +  +    ++ S  L +   K K   +I   PRIGI       +LP R+++
Sbjct: 135 ALNLTLEHNTHALTSPHLHI--SKEKASGRITQTPRIGIDYAEECKDLPWRFLL 186


>CLOAB 3MGH_CLOAB (Q97EY6) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 205

 Score =  108 bits (270), Expect = 1e-23
 Identities = 70/202 (34%), Positives = 110/202 (54%), Gaps = 10/202 (4%)

Query: 9   INIFNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQ 68
           I  F ++ T  VA+ LLG  L HE       G IV+ EAY G +D+ AH++G R+TPR +
Sbjct: 4   IREFYSRDTIVVAKELLGKVLVHEVNGIRTSGKIVEVEAYRGINDKGAHAYGGRRTPRTE 63

Query: 69  AMYDKPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGR----- 122
           A+Y   G  Y+Y ++  +  +N+V  ++G P+GV+IRAIEP+EG++ M E R  +
Sbjct: 64  ALYGPAGHAYVYFIYGLYYCMNVVAMQEGIPEGVLIRAIEPIEGIEVMSERRFKKLFNDL 123

Query: 123 ---QGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKG 179
              Q   LTNGP KL +A+ I ++     +    L +   K +  + +EA  R+GI
Sbjct: 124 TKYQLKNLTNGPSKLCSAMEIRREQNLMDLNGDELYIEEGKNESFEIVEA-KRVGIDYAE 182

Query: 180 RWTELPLRYVVAGNPYISKQKR 201
              +   R+ + GN  +S  K+
Sbjct: 183 EAKDYLWRFYIKGNKCVSVLKK 204


>CLOPE 3MGH_CLOPE (Q8XHA9) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 205

 Score =  107 bits (266), Expect = 4e-23
 Identities = 69/199 (34%), Positives = 107/199 (53%), Gaps = 11/199 (5%)

Query: 12  FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71
           F  + T  VA+ LLG  L        L G IV+ EAY+G  D+A+H++G ++T R + +Y
Sbjct: 7   FYNRDTVTVAKELLGKVLVRNINGVTLKGKIVETEAYIGAIDKASHAYGGKRTNRTETLY 66

Query: 72  DKPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVEL--- 127
             PGT+Y+Y ++  +  LN++++E+    GV+IR IEP+EG+++M + R  +   EL
Sbjct: 67  ADPGTVYVYIIYGMYHCLNLISEEKDVAGGVLIRGIEPLEGIEEMSKLRYKKSYEELSNY 126

Query: 128 -----TNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPK--KIEALPRIGIPNKGR 180
                +NGP KL  ALGIDK   G +  SS    V +     K   I    RIGI
Sbjct: 127 EKKNFSNGPSKLCMALGIDKGENGINTISSEEIYVEDDSLIKKDFSIVEAKRIGIDYAEE 186

Query: 181 WTELPLRYVVAGNPYISKQ 199
             +   R+ +  N ++SK+
Sbjct: 187 ARDFLWRFYIKDNKFVSKK 205


>STRMU Q8DRU4 (Q8DRU4) Putative 3-methyladenine DNA glycosylase
          Length = 192

 Score =  103 bits (258), Expect = 3e-22
 Identities = 64/173 (36%), Positives = 91/173 (52%), Gaps = 15/173 (8%)

Query: 40  GYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIYLYTMHTHLILNMVTQEQGKPQ 99
           G IV+ EAYLG  D A HS   R+TP+ +AMY   G  Y+Y ++ H +LN+VT+ Q   +
Sbjct: 34  GRIVETEAYLGSKDSACHSANDRRTPKNEAMYLAAGHWYVYQIYGHQMLNLVTKPQNVAE 93

Query: 100 GVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPE 159
            V+IRA+E  +             G  L NGPGKL    GIDK   G S+  S L L  +
Sbjct: 94  AVLIRALETAD-------------GHLLANGPGKLTKFAGIDKSFNGDSLQDSRLSL--Q 138

Query: 160 KRKFPKKIEALPRIGIPNKGRWTELPLRYVVAGNPYISKQKRTAVDQIDFGWK 212
           +   P++IE   RIG+     W +  L + V GN ++SK  + ++      WK
Sbjct: 139 EDLSPQRIEERSRIGVTCTDEWKDALLCFYVRGNQHVSKIAKKSLLTDKETWK 191


>DEIRA 3MGH_DEIRA (Q9RSQ0) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 190

 Score = 85.9 bits (211), Expect = 9e-17
 Identities = 64/181 (35%), Positives = 97/181 (53%), Gaps = 7/181 (3%)

Query: 20  VAQYLLGMYLEHETATGV-LGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIY 78
           +A+ LLG  L   T  G  L G +V+ EAY  P D A  + G     R   M   PG
Sbjct: 3   LARELLGGTLVRVTPDGHRLSGRVVEVEAYDCPRDPACTA-GRFHAARSAEMAIAPGHWL 61

Query: 79  LYTMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAAL 138
            +  H H +L +  +++G    V+IRA+EP+EG  KM++ R   +  +LT+GP KLV AL
Sbjct: 62  FWFAHGHPLLQVACRQEGVSASVLIRALEPLEGAGKMLDYRPVTRQRDLTSGPAKLVYAL 121

Query: 139 GID-KQLYGQSIFSSSLRLV-PEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAGNPYI 196
           G+D  Q+  + + S  L L+ PE      ++    R+GI  +GR   LP R+++ GN ++
Sbjct: 122 GLDPMQISHRPVNSPELHLLAPETPLADDEVTVTARVGI-REGR--NLPWRFLIRGNGWV 178

Query: 197 S 197
           S
Sbjct: 179 S 179


>CORGL 3MGH_CORGL (Q8NU33) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 189

 Score = 82.0 bits (201), Expect = 1e-15
 Identities = 66/185 (35%), Positives = 100/185 (54%), Gaps = 16/185 (8%)

Query: 20  VAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIYL 79
           VA  LLG  L H    G +G  I + EAYL   DEAAH++   KTPR  AM+   G +Y+
Sbjct: 12  VAPQLLGCTLTH----GGVGIRITEVEAYLDSTDEAAHTY-RGKTPRNAAMFGPGGHMYV 66

Query: 80  YTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGV---ELTNGPGKLV 135
           Y  +  H   N+V   +G  QGV++RA E V G + + ++R+G +G+    L  GPG
Sbjct: 67  YISYGIHRAGNIVCGPEGTGQGVLLRAGEVVSG-ESIAQSRRG-EGIPHARLAQGPGNFG 124

Query: 136 AALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAGNPY 195
            ALG++      S+F  S  L+ ++ + P+ +   PRIGI      TE  LR+ +  +P
Sbjct: 125 QALGLEISDNHASVFGPSF-LISDRVETPEIVRG-PRIGISKN---TEALLRFWIPNDPT 179

Query: 196 ISKQK 200
           +S ++
Sbjct: 180 VSGRR 184


>STRCO 3MGH_STRCO (Q9S208) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 213

 Score = 80.5 bits (197), Expect = 4e-15
 Identities = 59/184 (32%), Positives = 91/184 (49%), Gaps = 8/184 (4%)

Query: 19  EVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIY 78
           EVA  LLG  L      G +   + + EAY G +D  +H++  R TPR + M+  PG +Y
Sbjct: 21  EVAPDLLGRILVRTGPDGPITLRLTEVEAYDGQNDPGSHAYRGR-TPRNEVMFGPPGHVY 79

Query: 79  LY-TMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENR-QGRQGVELTNGPGKLVA 136
           +Y T      +N+V   +G+   V++RA E ++G +     R   R   EL  GP +L
Sbjct: 80  VYFTYGMWFCMNLVCGPEGRSSAVLLRAGEIIDGAELARTRRLSARNDKELAKGPARLAT 139

Query: 137 ALGIDKQLYGQSIFSSS---LRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAGN 193
           ALG+D+ L G    +S    LR++        ++   PR G+  +G     P RY VA +
Sbjct: 140 ALGVDRALNGTDACTSQETPLRILTGTPVPGDQVRNGPRTGVAGEG--GVHPWRYWVADD 197

Query: 194 PYIS 197
           P +S
Sbjct: 198 PTVS 201


>BRAJA 3MGH_BRAJA (Q89LR7) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 200

 Score = 79.0 bits (193), Expect = 1e-14
 Identities = 68/193 (35%), Positives = 94/193 (48%), Gaps = 22/193 (11%)

Query: 12  FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71
           F  ++  EVA  L+G  +      GV GG IV+ EAY    + AAHS+    TPR   M+
Sbjct: 20  FFGRSVREVAHDLIGATM---LVDGV-GGLIVEVEAY-HHTEPAAHSYN-GPTPRNHVMF 73

Query: 72  DKPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNG 130
             PG  Y+Y  +  H  +N V + +G    V+IRA+EP  G+  M   R  +    L +G
Sbjct: 74  GPPGFAYVYRSYGIHWCVNFVCEAEGSAAAVLIRALEPTHGIAAMRRRRHLQDVHALCSG 133

Query: 131 PGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALP-----RIGIPNKGRWTELP 185
           PGKL  ALGI       +I  ++L L         + E L      RIGI    +  ELP
Sbjct: 134 PGKLTEALGI-------TIAHNALPLDRPPIALHARTEDLEVATGIRIGIT---KAVELP 183

Query: 186 LRYVVAGNPYISK 198
            RY V G+ ++SK
Sbjct: 184 WRYGVKGSKFLSK 196


>STRAW 3MGH_STRAW (Q829C5) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 213

 Score = 72.8 bits (177), Expect = 8e-13
 Identities = 57/191 (29%), Positives = 88/191 (46%), Gaps = 8/191 (4%)

Query: 12  FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71
           F  +   +VA  LLG  L   T  G +   + + EAY GP D  +H++  R T R   M+
Sbjct: 14  FFARPVLDVAPDLLGRVLVRTTPDGPIELRVTEVEAYDGPSDPGSHAYRGR-TARNGVMF 72

Query: 72  DKPGTIYLY-TMHTHLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENR-QGRQGVELTN 129
             PG +Y+Y T      +N+V   +G+   V++RA E +EG +     R   R   EL
Sbjct: 73  GPPGHVYVYFTYGMWHCMNLVCGPEGRASAVLLRAGEIIEGAELARTRRLSARNDKELAK 132

Query: 130 GPGKLVAALGIDKQLYGQSIFS---SSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPL 186
           GP +L  AL +D+ L G    +     L L+      P ++   PR G+   G     P
Sbjct: 133 GPARLATALEVDRALDGTDACAPEGGPLTLLSGTPVPPDQVRNGPRTGVSGDG--GVHPW 190

Query: 187 RYVVAGNPYIS 197
           R+ +  +P +S
Sbjct: 191 RFWIDNDPTVS 201


>COREF 3MGH_COREF (Q8FUA2) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 190

 Score = 69.3 bits (168), Expect = 9e-12
 Identities = 58/182 (31%), Positives = 85/182 (46%), Gaps = 11/182 (6%)

Query: 20  VAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIYL 79
           VA  LLG    H+  +  L     + EAYLG +D AAH+    KT R  AM+   G +Y+
Sbjct: 12  VAPQLLGCIFTHDGVSIRL----TEVEAYLGAEDAAAHTHR-GKTARNAAMFGPGGHMYI 66

Query: 80  YTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAAL 138
           Y  +  H   N+    +G  QGV++RA E V G D     R       L  GPG L  AL
Sbjct: 67  YISYGIHRAGNIACAPEGVGQGVLLRAGEVVAGEDIAYRRRGDVPFTRLAQGPGNLGQAL 126

Query: 139 GIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAGNPYISK 198
                     I  +  +L+ E  + P+ +   PR+GI       + PLR+ + G+P +S
Sbjct: 127 NFQLSDNHAPINGTDFQLM-EPSERPEWVSG-PRVGITKN---ADAPLRFWIPGDPTVSV 181

Query: 199 QK 200
           ++
Sbjct: 182 RR 183


>PROAC Q6A5L3 (Q6A5L3) Putative 3-methylpurine DNA glycosylase
          Length = 191

 Score = 65.9 bits (159), Expect = 9e-11
 Identities = 56/190 (29%), Positives = 85/190 (44%), Gaps = 23/190 (12%)

Query: 19  EVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIY 78
           EVA  LLG  +      G +G  + + EAY+G DD A+H+F    TPR + M+  P  IY
Sbjct: 10  EVAPLLLGATIWR----GPVGIRLTEVEAYMGLDDPASHAFR-GPTPRARVMFGPPSHIY 64

Query: 79  LYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAA 137
           +Y  +  H  +N+V    G+   V++R  + + G D     R       L  GPG + +A
Sbjct: 65  VYLSYGMHRCVNLVCSPDGEASAVLLRGGQVIAGHDDARRRRGNVAENRLACGPGNMGSA 124

Query: 138 LGIDKQLYGQ----------SIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLR 187
           LG   +  G           S     L   PE  +F +     PR+GI    R  + P R
Sbjct: 125 LGASLEESGNPVSIIGNGAISALGWRLEPAPEIAEFRQG----PRVGI---SRNIDAPWR 177

Query: 188 YVVAGNPYIS 197
           + +  +P +S
Sbjct: 178 WWIPQDPTVS 187


>MYCPA Q740F6 (Q740F6) Hypothetical protein
          Length = 205

 Score = 64.3 bits (155), Expect = 3e-10
 Identities = 66/198 (33%), Positives = 92/198 (46%), Gaps = 30/198 (15%)

Query: 19  EVAQYLLGMYLEHETATGVLGGYIVDAEAYLG-PD----DEAAHSF-GLRKTPRLQAMYD 72
           E A+ LLG  L   T  GV  G IV+ EAY G PD    D AAHS+ GLR   R   M+
Sbjct: 14  EAARRLLGATL---TGRGV-SGVIVEVEAYGGVPDGPWPDAAAHSYKGLRA--RNFVMFG 67

Query: 73  KPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQG-----VE 126
            PG +Y Y  H  H+  N+     G    V++RA    +G D      +GR+G
Sbjct: 68  PPGRLYTYRSHGIHVCANVSCGPDGTAAAVLLRAAALEDGTDVA----RGRRGELVHTAA 123

Query: 127 LTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEAL--PRIGIPNKGRWTEL 184
           L  GPG L AA+GI     G  +F       P   +  + + A+  PR+G+    +  +
Sbjct: 124 LARGPGNLCAAMGITMADNGIDLFDPD---SPVTLRLHEPLTAVCGPRVGV---SQAADR 177

Query: 185 PLRYVVAGNPYISKQKRT 202
           P R  + G P +S  +R+
Sbjct: 178 PWRLWLPGRPEVSAYRRS 195


>MYCLEH 3MGH_MYCTU (P65412) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 203

 Score = 63.5 bits (153), Expect = 5e-10
 Identities = 55/171 (32%), Positives = 81/171 (47%), Gaps = 16/171 (9%)

Query: 42  IVDAEAYLG-PD----DEAAHSFGLRKTPRLQAMYDKPGTIYLYTMH-THLILNMVTQEQ 95
           +V+ EAY G PD    D AAHS+  R   R   M+  PG +Y Y  H  H+  N+
Sbjct: 31  VVEVEAYGGVPDGPWPDAAAHSYRGRNG-RNDVMFGPPGRLYTYRSHGIHVCANVACGPD 89

Query: 96  GKPQGVMIRAIEPVEGVDKMIENR-QGRQGVELTNGPGKLVAALGIDKQLYGQSIF--SS 152
           G    V++RA    +G +     R Q  + V L  GPG L AALGI     G  +F  SS
Sbjct: 90  GTAAAVLLRAAAIEDGAELATSRRGQTVRAVALARGPGNLCAALGITMADNGIDLFDPSS 149

Query: 153 SLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAGNPYISKQKRTA 203
            +RL   +     +  + PR+G+    +  + P R  + G P +S  +R++
Sbjct: 150 PVRL---RLNDTHRARSGPRVGV---SQAADRPWRLWLTGRPEVSAYRRSS 194


>MYCLE 3MGH_MYCLE (O05678) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 214

 Score = 60.1 bits (144), Expect = 5e-09
 Identities = 60/190 (31%), Positives = 88/190 (46%), Gaps = 18/190 (9%)

Query: 21  AQYLLGMYLEHETATGVLGGYIVDAEAYLG-PD----DEAAHSFGLRKTPRLQAMYDKPG 75
           A  LLG  +   T  GV    +V+ EAY G PD    D AAHS+  R   R   M+  PG
Sbjct: 25  AHRLLGATI---TGRGVCA-IVVEVEAYGGVPDGPWPDAAAHSYHGRND-RNAVMFGPPG 79

Query: 76  TIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGR--QGVELTNGPG 132
            +Y Y  H  H+  N+     G    V+IRA     G D +  +R+G   + V L  GPG
Sbjct: 80  RLYTYCSHGIHVCANVSCGPDGTAAAVLIRAGALENGAD-VARSRRGASVRTVALARGPG 138

Query: 133 KLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAG 192
            L +ALGI     G  +F++   +     +  + +   PR+GI +     + P R  + G
Sbjct: 139 NLCSALGITMDDNGIDVFAADSPVTLVLNEAQEAMSG-PRVGISHA---ADRPWRLWLPG 194

Query: 193 NPYISKQKRT 202
            P +S  +R+
Sbjct: 195 RPEVSTYRRS 204


>RICCN 3MGH_RICCN (Q92IE0) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 183

 Score = 51.6 bits (122), Expect = 2e-06
 Identities = 39/131 (29%), Positives = 62/131 (47%), Gaps = 18/131 (13%)

Query: 12  FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71
           F  + T  V+  L+G  L  +  T +    I + E+Y+G +D A H+    +T R   M+
Sbjct: 11  FFARDTNVVSTELIGKALYFQGKTAI----ITETESYIGQNDPACHA-ARGRTKRTDIMF 65

Query: 72  DKPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAIEPVEGVDKMIENRQGRQGVELTNG 130
              G  Y+Y ++  +  LN VT+ +G P   +IR +  +     + EN          NG
Sbjct: 66  GPAGFSYVYLIYGMYYCLNFVTEAKGFPAATLIRGVHVI-----LPENLY-------LNG 113

Query: 131 PGKLVAALGID 141
           PGKL   LGI+
Sbjct: 114 PGKLCKYLGIN 124


>RICPR 3MGH_RICPR (Q9ZDH7) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 217

 Score = 48.9 bits (115), Expect = 1e-05
 Identities = 29/96 (30%), Positives = 49/96 (51%), Gaps = 6/96 (6%)

Query: 12  FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71
           F  + T  V+  L+G  L  +  T +    I + E+Y+G DD A H+    +T R   M+
Sbjct: 11  FFARDTNLVSTELIGKVLYFQGTTAI----ITETESYIGEDDPACHA-ARGRTKRTDVMF 65

Query: 72  DKPGTIYLYTMH-THLILNMVTQEQGKPQGVMIRAI 106
              G  Y+Y ++  +  LN VT+++G P   +IR +
Sbjct: 66  GPAGFSYVYLIYGMYYCLNFVTEDEGFPAATLIRGV 101


>PSEAE 3MGH_PSEAE (Q9HX17) Putative 3-methyladenine DNA glycosylase
           (EC 3.2.2.-)
          Length = 239

 Score = 45.1 bits (105), Expect = 2e-04
 Identities = 49/184 (26%), Positives = 80/184 (43%), Gaps = 17/184 (9%)

Query: 20  VAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGTIYL 79
           VA+ LLG  + H      L   I++ EAY   +++ +H+  L  T + +A++   G IY+
Sbjct: 29  VARELLGKVIRHRQGNLWLAARIIETEAYY-LEEKGSHA-SLGYTEKRKALFLDGGHIYM 86

Query: 80  YTMHTHLILNMVTQEQGKPQGVMIRAIEP----------VEGVDKMIENRQG--RQGVEL 127
           Y       LN      G    V+I++  P          +E +  +  + QG  R+   L
Sbjct: 87  YYARGGDSLNF--SAGGPGNAVLIKSGHPWLDRISDHTALERMQSLNPDSQGRPREIGRL 144

Query: 128 TNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLR 187
             G   L  A+G+    +    F      V +  + P ++    R+GIP KGR   LP R
Sbjct: 145 CAGQTLLCKAMGLKVPEWDAQRFDPQRLFVDDVGERPSQVIQAARLGIP-KGRDEHLPYR 203

Query: 188 YVVA 191
           +V A
Sbjct: 204 FVDA 207


>PSEPK Q88DL3 (Q88DL3) DNA-3-methyladenine glycosylase, putative
          Length = 222

 Score = 41.6 bits (96), Expect = 0.002
 Identities = 48/192 (25%), Positives = 77/192 (40%), Gaps = 17/192 (8%)

Query: 12  FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71
           F  +  + +A+ LLG  + H      L   I++ EAY   D  +  S G   T + +A++
Sbjct: 8   FFDRDAQTLAKALLGKVIRHRHGDLWLAARIIETEAYYLSDKGSHASLGY--TEKRKALF 65

Query: 72  DKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRAIEP----VEGVDKMIENRQG------ 121
              G IY+Y       LN      G    V+I++  P    + G D + + +
Sbjct: 66  LDGGHIYMYYARGGDSLNF--SAHGPGNAVLIKSAYPWQDTLSGPDSLAQMQLNNPDASG 123

Query: 122 --RQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKG 179
             R    L  G   L  ALG+    +    F +    V +      ++    R+GIP+ G
Sbjct: 124 NIRPQERLCAGQTLLCRALGLKVPHWDAQRFDAERLYVEDCGNAVPRVIQAARLGIPH-G 182

Query: 180 RWTELPLRYVVA 191
           R   LP R+V A
Sbjct: 183 RDEHLPYRFVDA 194


>BORPE Q7VWB7 (Q7VWB7) Putative methypurine-DNA glycosylase
          Length = 238

 Score = 40.4 bits (93), Expect = 0.004
 Identities = 47/190 (24%), Positives = 78/190 (41%), Gaps = 17/190 (8%)

Query: 12  FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71
           F  +  +++A+ LLG  + H      L   I++ EAY   +  +  S G   T + +A++
Sbjct: 20  FFNRDAQQLARDLLGKVVRHRVDGLWLSARIIETEAYYLAEKGSHASLGY--THKRRALF 77

Query: 72  DKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRA----IEPVEG------VDKMIENRQG 121
              G +Y+Y       LN      G    V+I++    ++ V G      + ++  + QG
Sbjct: 78  MDGGVVYMYYARGGDSLNF--SAAGPGNAVLIKSGHPWVDEVSGPRALACMQRLNPDAQG 135

Query: 122 --RQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKG 179
             R    L  G   L  ALG+    +    F      V +      ++    R+GIP  G
Sbjct: 136 QPRPPARLCAGQTLLCRALGLKVPQWDAKAFDPRRFFVEDVGVRLDRLVRTTRLGIP-AG 194

Query: 180 RWTELPLRYV 189
           R   LP RYV
Sbjct: 195 RDEHLPYRYV 204


>BORPA Q7W9Q9 (Q7W9Q9) Putative methypurine-DNA glycosylase
          Length = 238

 Score = 40.4 bits (93), Expect = 0.004
 Identities = 47/190 (24%), Positives = 78/190 (41%), Gaps = 17/190 (8%)

Query: 12  FNTKTTEEVAQYLLGMYLEHETATGVLGGYIVDAEAYLGPDDEAAHSFGLRKTPRLQAMY 71
           F  +  +++A+ LLG  + H      L   I++ EAY   +  +  S G   T + +A++
Sbjct: 20  FFNRDAQQLARDLLGKVVRHRVDGLWLSARIIETEAYYLAEKGSHASLGY--THKRRALF 77

Query: 72  DKPGTIYLYTMHTHLILNMVTQEQGKPQGVMIRA----IEPVEG------VDKMIENRQG 121
              G +Y+Y       LN      G    V+I++    ++ V G      + ++  + QG
Sbjct: 78  MDGGVVYMYYARGGDSLNF--SAAGPGNAVLIKSGHPWVDEVSGPRALACMQRLNPDAQG 135

Query: 122 --RQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKG 179
             R    L  G   L  ALG+    +    F      V +      ++    R+GIP  G
Sbjct: 136 QPRPPARLCAGQTLLCRALGLKVPQWDAKAFDPRRFFVEDVGVRLDRLVRTTRLGIP-AG 194

Query: 180 RWTELPLRYV 189
           R   LP RYV
Sbjct: 195 RDEHLPYRYV 204


>STRAW Q93HJ1 (Q93HJ1) Modular polyketide synthase
          Length = 3613

 Score = 35.4 bits (80), Expect = 0.14
 Identities = 16/39 (41%), Positives = 23/39 (58%)

Query: 99  QGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAA 137
           +G M+    PVEGV++ +   +GR GV   NGPG  V +
Sbjct: 700 KGGMVSVALPVEGVEERLARFEGRIGVAAVNGPGSAVVS 738


>STRAW Q93HJ2 (Q93HJ2) Modular polyketide synthase
          Length = 4685

 Score = 33.1 bits (74), Expect = 0.68
 Identities = 15/39 (38%), Positives = 23/39 (58%)

Query: 99   QGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAA 137
            +G M+    PVEGV++ +   +GR GV   NGP  +V +
Sbjct: 3743 KGGMVSVALPVEGVEERLARFEGRIGVAAVNGPTSVVVS 3781


 Score = 33.1 bits (74), Expect = 0.68
 Identities = 15/39 (38%), Positives = 23/39 (58%)

Query: 99   QGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAA 137
            +G M+    PVEGV++ +   +GR GV   NGP  +V +
Sbjct: 2223 KGGMVSVALPVEGVEERLARFEGRIGVAAVNGPTSVVVS 2261


 Score = 30.4 bits (67), Expect = 4.4
 Identities = 14/39 (35%), Positives = 22/39 (56%)

Query: 99  QGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAA 137
           +G M+    PV  V++ +   +GR GV   NGPG +V +
Sbjct: 695 KGGMVSVALPVGEVEERLARFEGRIGVAAVNGPGSVVVS 733


>SILPO Q5LKE3 (Q5LKE3) Peptidyl-prolyl cis-trans isomerase,
           FKBP-type
          Length = 142

 Score = 32.0 bits (71), Expect = 1.5
 Identities = 20/68 (29%), Positives = 34/68 (50%), Gaps = 6/68 (8%)

Query: 114 KMIENRQGRQGVELTNGPGKLVAALGIDKQLYGQSIFSSSLRLVPEKRKF----PKKIEA 169
           K  ++ +GR  +E T G G+++   G+DK + G          VP    +    P+  +A
Sbjct: 22  KTFDSSEGRDPLEFTVGSGQIIP--GLDKAMPGMETGEKKRVEVPCAEAYGPLNPEARQA 79

Query: 170 LPRIGIPN 177
           +PR GIP+
Sbjct: 80  IPREGIPD 87


>SILPO Q5LVX1 (Q5LVX1) ABC transporter, transmembrane ATP-binding
           protein, putative
          Length = 1032

 Score = 30.4 bits (67), Expect = 4.4
 Identities = 19/48 (39%), Positives = 26/48 (54%), Gaps = 8/48 (16%)

Query: 101 VMIRAIEPVEGVDKMIENRQG----RQGVE----LTNGPGKLVAALGI 140
           V ++  EP +G   MIE   G    R+G E    +T GPG+LV  LG+
Sbjct: 935 VFLKDDEPTDGAYMMIEGEAGLYLPREGQEDQLIVTVGPGRLVGELGL 982


>CLOTE DXS_CLOTE (Q894H0) 1-deoxy-D-xylulose-5-phosphate synthase
           (EC 2.2.1.7) (1-deoxyxylulose-5-phosphate synthase) (DXP
           synthase) (DXPS)
          Length = 620

 Score = 30.0 bits (66), Expect = 5.8
 Identities = 19/55 (34%), Positives = 28/55 (50%), Gaps = 1/55 (1%)

Query: 138 LGIDKQLYGQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRYVVAG 192
           + I K + G S + SSLR+ P   KF + +E + +  IPN G+     L  V  G
Sbjct: 179 MSIGKNVGGLSTYLSSLRIDPNYNKFKRDVEGIIK-KIPNIGKGVAKNLERVKDG 232


>BURMA Q9AI54 (Q9AI54) DedA family protein
          Length = 1925639

 Score = 29.6 bits (65), Expect = 7.5
 Identities = 32/136 (23%), Positives = 52/136 (38%), Gaps = 6/136 (4%)

Query: 43      VDAEAYLGPDDEAAHSFGLRKTPRLQAMYDKPGT-IYLYTMHTHLILNMVTQEQGKPQGV 101
               V+  A   P    A    ++    + A Y   G   + +  H  L   +    Q K   +
Sbjct: 1823164 VELVANEAPGSRMAFMHPVKSRAAISAAYFDHGVKTFSFDTHEELAKILDATGQAKDLNL 1823223

Query: 102     MIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAALGIDKQLYGQS--IFSSSLRLVPE 159
               ++R     EG    +    G+ GVE+ N P  L+AA    + L G S  + S  +R
Sbjct: 1823224 IVRMGVQAEGAAYSLS---GKFGVEMHNAPDLLLAARRATQDLMGVSFHVGSQCMRPTAF 1823280

Query: 160     KRKFPKKIEALPRIGI 175
               +    +   AL R G+
Sbjct: 1823281 QAAMAQASRALVRAGV 1823296


>STAES Q8CST0 (Q8CST0) Hypothetical protein SE0952
          Length = 572

 Score = 29.3 bits (64), Expect = 9.8
 Identities = 17/75 (22%), Positives = 36/75 (48%), Gaps = 5/75 (6%)

Query: 98  PQGVMIRAIEPVEGVDKMIENRQGRQGVELTNGPGKLVAALG-----IDKQLYGQSIFSS 152
           P+  M+     +  +  +IEN++  +G+ LT+G    + A+      ID  +YG  +  +
Sbjct: 60  PEDEMLGVDIVIPDIQYVIENKERLKGIFLTHGHEHAIGAVSYVLEQIDAPVYGSKLTIA 119

Query: 153 SLRLVPEKRKFPKKI 167
            ++   + R   KK+
Sbjct: 120 LVKEAMKARNIKKKV 134


>SILPO Q5LMI5 (Q5LMI5) ADA regulatory protein
          Length = 283

 Score = 29.3 bits (64), Expect = 9.8
 Identities = 24/103 (23%), Positives = 48/103 (46%), Gaps = 4/103 (3%)

Query: 88  LNMVTQEQGKPQGVMIRAIEPVE--GVDKMIENRQGRQGVELTNGPGKLVAALGIDKQLY 145
           +N+ T E+G   GV+ RAIE ++  G    +++   R  +   +        +G+  + Y
Sbjct: 1   MNVQTTEEGYHYGVIRRAIELIDAGGESMPLDDLAARMNMSPAHFQRIFSRWVGVSPKKY 60

Query: 146 GQSIFSSSLRLVPEKRKFPKKIEALPRIGIPNKGRWTELPLRY 188
            Q +     + + E+R     +EA   +G+   GR  +L +R+
Sbjct: 61  QQYLTLGHAKALLEERF--TLLEAAQNVGLSGTGRLHDLFVRW 101


  Database: Blastdata.fdb
    Posted date:  Mar 29, 2006  3:30 PM
  Number of letters in database: 77,468,597
  Number of sequences in database:  240,170

Lambda     K      H
   0.316    0.135    0.391
Gapped
Lambda     K      H
   0.267   0.0410    0.140


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Hits to DB: 35,841,668
Number of Sequences: 240170
Number of extensions: 1550248
Number of successful extensions: 3502
Number of sequences better than 10.0: 43
Number of HSP's better than 10.0 without gapping: 24
Number of HSP's successfully gapped in prelim test: 19
Number of HSP's that attempted gapping in prelim test: 3332
Number of HSP's gapped (non-prelim): 140
length of query: 229
length of database: 77,468,597
effective HSP length: 107
effective length of query: 122
effective length of database: 51,770,407
effective search space: 6315989654
effective search space used: 6315989654
T: 11
A: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 64 (29.3 bits)
BLASTP 2.2.10 [Oct-19-2004]


Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs",  Nucleic Acids Res. 25:3389-3402.

Query= ENTFA AACA_ENTFA (P0A0C2) Bifunctional AAC/APH [Includes:
6'-aminoglycoside N-acetyltransferase (EC 2.3.1.-) (AAC(6'));
2''-aminoglycoside phosphotransferase (EC 2.7.1.-) (APH(2''))]
         (479 letters)

Database: Blastdata.fdb
           240,170 sequences; 77,468,597 total letters

Searching..................................................done

                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

STAAM AACA_STAAM (P0A0C0) Bifunctional AAC/APH [Includes: 6'-ami...   959   0.0
ENTFA AACA_ENTFA (P0A0C2) Bifunctional AAC/APH [Includes: 6'-ami...   959   0.0
BACAN Q81P54 (Q81P54) Acetyltransferase, GNAT family                  168   4e-41
BACC1 Q735Z9 (Q735Z9) Acetyltransferase, GNAT family                  159   1e-38
BACAN Q81T90 (Q81T90) Aminoglycoside phophotransferase family pr...    67   1e-10
BACC1 Q73BC3 (Q73BC3) Aminoglycoside phophotransferase family pr...    62   4e-09
BRAJA Q89WN0 (Q89WN0) Bll0648 protein                                  59   3e-08
BACHD Q9K9M4 (Q9K9M4) BH2621 protein                                   56   2e-07
BACC1 Q739G2 (Q739G2) 6'-aminoglycoside N-acetyltransferase/2''-...    55   5e-07
THEMA Q9X063 (Q9X063) Hypothetical protein                             52   3e-06
CLOTE Q896X4 (Q896X4) Putative acetyltransferase                       49   3e-05
BACHD Q9KB15 (Q9KB15) BH2121 protein                                   48   6e-05
STAES Q8CQT6 (Q8CQT6) Spermidine acetyltransferase                     47   1e-04
VIBCH Q9KMC2 (Q9KMC2) Acetyltransferase, putative                      45   5e-04
BACSU BLTD_BACSU (P39909) Spermine/spermidine acetyltransferase ...    45   6e-04
BACC1 Q739N8 (Q739N8) Spermidine acetyltransferase, putative           44   0.001
LACPL Q88SW8 (Q88SW8) Acetyltransferase (Putative)                     44   0.001
VIBUY Q7MK89 (Q7MK89) Putative acetyltransferase                       43   0.002
DEIRA Q9RW71 (Q9RW71) Acetyltransferase, putative                      43   0.002
BACAN Q81RM0 (Q81RM0) Spermidine acetyltransferase, putative           43   0.002
LACJO Q74K74 (Q74K74) Hypothetical protein                             42   0.003
BACC1 Q72YV7 (Q72YV7) Acetyltransferase, GNAT family                   42   0.003
BACC1 Q734R6 (Q734R6) Acetyltransferase, GNAT family                   42   0.004
CLOPE Q8XKH9 (Q8XKH9) Hypothetical protein CPE1416                     42   0.005
BACAN Q81SQ8 (Q81SQ8) Acetyltransferase, GNAT family                   41   0.007
VIBCH Q9K330 (Q9K330) Acetyltransferase, putative                      41   0.009
VIBUY Q7MK96 (Q7MK96) Putative acetyltransferase                       40   0.012
WIGBR Q8D3I4 (Q8D3I4) Imp protein                                      40   0.016
BACSU P94482 (P94482) YnaD                                             40   0.021
BACC1 Q73A91 (Q73A91) Acetyltransferase, GNAT family                   40   0.021
THETN Q8RC99 (Q8RC99) Acetyltransferases                               39   0.027
STRAW Q82IB6 (Q82IB6) Putative acetyltransferase                       39   0.027
LISIN Q92E38 (Q92E38) Lin0623 protein                                  39   0.027
STRCO O69977 (O69977) Hypothetical protein SCO5801                     39   0.036
STRAW Q82KD8 (Q82KD8) Hypothetical protein                             39   0.036
VIBCH Q9KMA3 (Q9KMA3) Acetyltransferase, putative                      39   0.046
STAAM Q932M5 (Q932M5) Hypothetical protein SAVP027                     39   0.046
LACLA Q9CHJ8 (Q9CHJ8) Acetyl transferase                               39   0.046
ENTFA Q82YR8 (Q82YR8) Acetyltransferase, GNAT family                   39   0.046
BACC1 Q73AS8 (Q73AS8) Acetyltransferase, GNAT family                   39   0.046
BACC1 Q734C3 (Q734C3) Acetyltransferase, GNAT family                   39   0.046
BACAN Q81YL2 (Q81YL2) Acetyltransferase, GNAT family                   39   0.046
BACAN Q81XB7 (Q81XB7) Spermidine N1-acetyltransferase                  39   0.046
SALTI Q8Z6Z1 (Q8Z6Z1) Spermidine N1-acetyltransferase                  38   0.061
SALTY Q8ZPJ3 (Q8ZPJ3) Spermidine N1-acetyltransferase (EC 2.3.1.57)    38   0.061
SALCH Q57PD6 (Q57PD6) Spermidine N1-acetyltransferase                  38   0.061
MYCPE Q8EWE5 (Q8EWE5) Acetyltransferase GNAT family                    38   0.061
BACC1 Q72Y03 (Q72Y03) Spermidine N1-acetyltransferase (EC 2.3.1.57)    38   0.061
DEIRA Q9RW73 (Q9RW73) Acetyltransferase, putative                      38   0.079
STAAM Q99U68 (Q99U68) Hypothetical protein                             37   0.10
RICPR Q9ZE75 (Q9ZE75) TRANSCRIPTIONAL ACTIVATOR PROTEIN CZCR (CzcR)    37   0.10
LACJO Q74J71 (Q74J71) Hypothetical protein                             37   0.10
CLOAB Q97D33 (Q97D33) Acetyltransferase (With duplicated domains...    37   0.10
VIBCH Q9KU66 (Q9KU66) Ribosomal-protein-alanine acetyltransferase      37   0.18
STRR6 Q8DNF9 (Q8DNF9) Hypothetical protein spr1760                     37   0.18
SALTY Q8ZNU6 (Q8ZNU6) Putative ribose 5-phosphate isomerase (EC ...    37   0.18
SALCH Q57N66 (Q57N66) Putative ribose 5-phosphate isomerase            37   0.18
ECO57 ATDA_ECO57 (P0A952) Spermidine N(1)-acetyltransferase (EC ...    37   0.18
ECOLI ATDA_ECOLI (P0A951) Spermidine N(1)-acetyltransferase (EC ...    37   0.18
BACHD Q9KG16 (Q9KG16) BH0299 protein                                   37   0.18
AQUAE O66838 (O66838) Ribosomal-protein-alanine acetyltransferase      37   0.18
PROAC Q6ABU2 (Q6ABU2) Acetyltransferase (EC 2.3.1.-)                   36   0.23
BACAN Q81QL3 (Q81QL3) Acetyltransferase, GNAT family                   36   0.23
STRMU Q8DV67 (Q8DV67) Putative acetyltransferase                       36   0.30
STAAM COAD_STAAM (P63818) Phosphopantetheine adenylyltransferase...    36   0.30
LISMO Q8Y9B8 (Q8Y9B8) Lmo0614 protein                                  36   0.30
THEMA Q9WZ46 (Q9WZ46) Hypothetical protein                             35   0.39
STRR6 Q8DQU8 (Q8DQU8) Hypothetical protein spr0490                     35   0.39
CLOAB Q97FA0 (Q97FA0) Predicted acetyltransferase                      35   0.39
_BUCAP LOLC_BUCAP (Q8K9N8) Lipoprotein-releasing system transmem...    35   0.39
BACC1 Q734M2 (Q734M2) Acetyltransferase, GNAT family                   35   0.39
BACAN Q81U09 (Q81U09) Acetyltransferase, GNAT family                   35   0.39
YERPE Q8ZHB3 (Q8ZHB3) Putative siderophore biosynthesis protein ...    35   0.51
VIBUY Q7MFH7 (Q7MFH7) Histone acetyltransferase HPA2                   35   0.51
RICCN Q92JG6 (Q92JG6) Transcriptional activator protein czcR           35   0.51
CLOAB Q97G03 (Q97G03) Predicted acetyltransferase                      35   0.51
BACHD Q9KF00 (Q9KF00) Ribosomal-protein-alanine N-acetyltransferase    35   0.51
BACAN Q81NB6 (Q81NB6) Spermine/spermidine acetyltransferase, put...    35   0.51
BACAN Q81N07 (Q81N07) Acetyltransferase, GNAT family                   35   0.51
STRMU Q8DT36 (Q8DT36) Putative acetyltransferase                       35   0.67
PSEAE Q9I3W7 (Q9I3W7) Hypothetical protein                             35   0.67
NITEU Q82UT0 (Q82UT0) GCN5-related N-acetyltransferase (EC 2.3.1...    35   0.67
LACLA Q9CF66 (Q9CF66) Spermidine acetyltransferase (EC 2.3.1.57)       35   0.67
BACC1 Q739G7 (Q739G7) Acetyltransferase, GNAT family                   35   0.67
MYCPE Q8EWM2 (Q8EWM2) Hypothetical protein MYPE1810                    34   0.88
MYCPE Q8EWE6 (Q8EWE6) Acetyltransferase GNAT family                    34   0.88
LACLA Q9CHW9 (Q9CHW9) Hypothetical protein yfiL                        34   0.88
LACPL Q88SM7 (Q88SM7) Prophage Lp4 protein 8, DNA primase/helicase     34   0.88
ENTFA Q837C4 (Q837C4) Acetyltransferase, GNAT family                   34   0.88
CLOTE Q891D4 (Q891D4) Ribosomal-protein-alanine acetyltransferas...    34   0.88
CLOAB Q97I16 (Q97I16) Predicted acetyltransferase domain contain...    34   0.88
BACC1 Q72WY7 (Q72WY7) Hypothetical protein                             34   0.88
VIBPA Q87G30 (Q87G30) Putative acetyltransferase                       34   1.1
STAES ARGD2_STAES (Q8CSG1) Acetylornithine aminotransferase 2 (E...    34   1.1
RICPR Q9ZEC5 (Q9ZEC5) 190 KD ANTIGEN (Sca1)                            34   1.1
PSEPK Q88NL5 (Q88NL5) Acetyltransferase, GNAT family                   34   1.1
LACPL Q88YF8 (Q88YF8) Acetyltransferase (Putative)                     34   1.1
BACC1 Q737S7 (Q737S7) Acetyltransferase, GNAT family                   34   1.1
BACC1 Q734H0 (Q734H0) Acetyltransferase, GNAT family family            34   1.1
BACAN Q81Q67 (Q81Q67) Acetyltransferase, GNAT family                   34   1.1
BACAN Q81P77 (Q81P77) Acetyltransferase, GNAT family                   34   1.1
Q8E1I7 (Q8E1I7) Acetyltransferase, GNAT family                         33   1.5
Q8DXE6 (Q8DXE6) Hypothetical protein SAG1905                           33   1.5
OCEIH Q8ET96 (Q8ET96) Hypothetical conserved protein                   33   1.5
LISIN Q929M8 (Q929M8) Lin2246 protein                                  33   1.5
CLOTE Q895L4 (Q895L4) DNA topoisomerase I (EC 5.99.1.2)                33   1.5
CLOPE Q8XIF5 (Q8XIF5) Ribosomal-protein-alanine N-acetyltransferase    33   1.5
BRUSU Q8FVN1 (Q8FVN1) Acyl-CoA dehydrogenase family protein            33   1.5
BRUME Q8YCN6 (Q8YCN6) ACYL-COA DEHYDROGENASE (EC 1.3.99.-)             33   1.5
BACSU YQAR_BACSU (P45914) Hypothetical protein yqaR                    33   1.5
BACAN Q81RF9 (Q81RF9) Acetyltransferase, GNAT family                   33   1.5
VIBPA Q87NP1 (Q87NP1) Putative spermine/spermidine acetyltransfe...    33   2.0
THETN Q8RC65 (Q8RC65) Acetyltransferases                               33   2.0
STRR6 Q8CYV9 (Q8CYV9) Hypothetical protein spr0850                     33   2.0
STAES Q8CU70 (Q8CU70) Spermidine acetyltransferase                     33   2.0
STAES Q8CS08 (Q8CS08) Hypothetical protein SE1483                      33   2.0
RICPR Q9ZDP9 (Q9ZDP9) Hypothetical protein RP278                       33   2.0
OCEIH Q8CXG0 (Q8CXG0) Diamine N-acetyltransferase (Spermine:sper...    33   2.0
CLOAB Q97J70 (Q97J70) Predicted acetyltransferase                      33   2.0
BURMA Q9AI54 (Q9AI54) DedA family protein                              33   2.0
BRAJA Q89YE3 (Q89YE3) Bll0009 protein                                  33   2.0
BACAN Q81NK7 (Q81NK7) Streptothricin acetyltransferase, putative       33   2.0
VIBUY Q7MI31 (Q7MI31) Ribosomal-protein-alanine acetyltransferase      33   2.6
OCEIH Q8EMB8 (Q8EMB8) Glucose-6-phosphate 1-dehydrogenase (EC 1....    33   2.6
OCEIH Q8EKW6 (Q8EKW6) Acetyltransferase                                33   2.6
MYCPE Q8EWC2 (Q8EWC2) Oligoendopeptidase F                             33   2.6
LISMO Q8Y8S4 (Q8Y8S4) Lmo0820 protein                                  33   2.6
CLOPE Q8XKM5 (Q8XKM5) Probable acetyltransferase                       33   2.6
BUCAI SPED_BUCAI (P57304) S-adenosylmethionine decarboxylase pro...    33   2.6
AQUAE O67458 (O67458) Hypothetical protein aq_1482                     33   2.6
YERPE Q8CZN5 (Q8CZN5) Acyltransferase for 30S ribosomal subunit ...    32   3.3
STRAW Q827N9 (Q827N9) Putative acetyltransferase                       32   3.3
STAES Q8CU81 (Q8CU81) Spermidine acetyltransferase                     32   3.3
RICPR Q9ZCN0 (Q9ZCN0) RIBOSOMAL-PROTEIN-ALANINE ACETYLTRANSFERAS...    32   3.3
OCEIH Q8ERM1 (Q8ERM1) Acetyltransferase                                32   3.3
MYCGE RIBF_MYCGE (P47391) Putative riboflavin biosynthesis prote...    32   3.3
ENTFA Q836M4 (Q836M4) Spermine/spermidine acetyltransferase, put...    32   3.3
CHRVO Q7NRZ7 (Q7NRZ7) Probable acetyltransferase (EC 2.3.1.-)          32   3.3
CHLCV Q824H9 (Q824H9) Acetyltransferase, GNAT family                   32   3.3
CHLMU SYR_CHLMU (Q9PJT8) Arginyl-tRNA synthetase (EC 6.1.1.19) (...    32   3.3
BACSU O34376 (O34376) Putative acetyl transferase (YobR protein)       32   3.3
BACC1 Q738E9 (Q738E9) Acetyltransferase, GNAT family                   32   3.3
BACC1 Q736C5 (Q736C5) Acetyltransferase, GNAT family                   32   3.3
BACC1 Q735P8 (Q735P8) Acetyltransferase, GNAT family                   32   3.3
AQUAE YZ34_AQUAE (O66423) Hypothetical protein AA34                    32   3.3
YERPE Q8ZCX6 (Q8ZCX6) Spermidine acetyltransferase (EC 2.3.1.57)...    32   4.4
STRR6 Q8DNN2 (Q8DNN2) Hypothetical protein spr1627                     32   4.4
STAAM Q99RQ8 (Q99RQ8) Similar to transcription repressor of spor...    32   4.4
LACLA Q9CJA2 (Q9CJA2) Acetyl transferase                               32   4.4
CLOTE Q892J2 (Q892J2) Conserved protein                                32   4.4
BRAJA Q89UA8 (Q89UA8) Hypothetical acetyltransferase                   32   4.4
BACSU O34558 (O34558) YopR protein                                     32   4.4
BACAN Q81R63 (Q81R63) Hypothetical protein                             32   4.4
VIBPA Q87MD0 (Q87MD0) Acetyltransferase-related protein                32   5.7
STRR6 Q8DND0 (Q8DND0) Transcriptional activator                        32   5.7
OCEIH Q8CUS9 (Q8CUS9) Hypothetical conserved protein                   32   5.7
LISIN Q92E28 (Q92E28) Lin0633 protein                                  32   5.7
LACPL Q88U49 (Q88U49) Glucose-6-phosphate 1-dehydrogenase (EC 1....    32   5.7
CORDI Q6NHQ0 (Q6NHQ0) Putative acetyltransferase                       32   5.7
BURMA Q62J98 (Q62J98) Ribosomal-protein-alanine acetyltransferas...    32   5.7
THETN Q8R764 (Q8R764) LysM-repeat proteins and domains                 31   7.4
STRR6 Q8DPX8 (Q8DPX8) Hypothetical protein spr0952                     31   7.4
STRMU Q8DVT1 (Q8DVT1) Putative ribosomal-protein-alanine acetylt...    31   7.4
SALTI SPED_SALTI (Q8Z9E3) S-adenosylmethionine decarboxylase pro...    31   7.4
SALTY SPED_SALTY (Q8ZRS4) S-adenosylmethionine decarboxylase pro...    31   7.4
SALCH Q57T90 (Q57T90) S-adenosylmethionine decarboxylase, proenzyme    31   7.4
RICCN Q92JP8 (Q92JP8) Cell surface antigen                             31   7.4
NEIGI Q5FAH1 (Q5FAH1) Hypothetical protein                             31   7.4
LISIN Q92DJ7 (Q92DJ7) Lin0816 protein                                  31   7.4
LACJO Q74J74 (Q74J74) Hypothetical protein                             31   7.4
GEOSL Q74A59 (Q74A59) Sensory box histidine kinase                     31   7.4
ENTFA Q836Z2 (Q836Z2) Acetyltransferase, GNAT family                   31   7.4
ENTFA Q82YJ4 (Q82YJ4) Toxin ABC transporter, ATP-binding/permeas...    31   7.4
CLOAB Q97M70 (Q97M70) Predicted metal-dependent hydrolase              31   7.4
CLOAB Q97HI5 (Q97HI5) Predicted flavodoxin                             31   7.4
BRAJA Q89R32 (Q89R32) Acyl-CoA dehydrogenase                           31   7.4
BACC1 Q735F5 (Q735F5) Streptothricin acetyltransferase, putative       31   7.4
BACAN Q81YS9 (Q81YS9) Acetyltransferase, GNAT family                   31   7.4
VIBPA Q87HD3 (Q87HD3) Hypothetical protein VPA1032                     31   9.7
VIBCH Q9KL03 (Q9KL03) Spermidine n1-acetyltransferase                  31   9.7
THEMA SYE2_THEMA (Q9X2I8) Glutamyl-tRNA synthetase 2 (EC 6.1.1.1...    31   9.7
THETN ISPH_THETN (Q8RA76) 4-hydroxy-3-methylbut-2-enyl diphospha...    31   9.7
STRCO Q9S2L5 (Q9S2L5) Hypothetical protein SCO1988                     31   9.7
STRP1 Q99XX8 (Q99XX8) Putative pullulanase                             31   9.7
STRP1 Q99YF0 (Q99YF0) Hypothetical protein SPy1734 (Acetyltransf...    31   9.7
STAES Q8CTP7 (Q8CTP7) Hypothetical protein SE0368                      31   9.7
STAES COAD_STAES (Q8CSZ5) Phosphopantetheine adenylyltransferase...    31   9.7
MYCLEH Y3433_MYCTU (O06250) Hypothetical protein Rv3433c/MT3539        31   9.7
MYCTU Y3433_MYCTU (O06250) Hypothetical protein Rv3433c/MT3539         31   9.7
MYCBO Q7TWI6 (Q7TWI6) Hypothetical protein Mb3463c                     31   9.7
LISMO Q8Y5C4 (Q8Y5C4) Lmo2141 protein                                  31   9.7
LISIN Q929Z8 (Q929Z8) Lin2125 protein                                  31   9.7
ENTFA Q837Z5 (Q837Z5) Acetyltransferase, GNAT family                   31   9.7
ENTFA Q831M9 (Q831M9) Ribosomal-protein-alanine acetyltransferas...    31   9.7
CLOTE Q895T0 (Q895T0) Acetyltransferase (EC 2.3.1.-)                   31   9.7
CLOPE Q8XMF9 (Q8XMF9) Hypothetical protein CPE0730                     31   9.7
BRUSU Q8FY73 (Q8FY73) Outer membrane autotransporter                   31   9.7
BACSU YCBJ_BACSU (P42242) Hypothetical protein ycbJ                    31   9.7
BACHD Q9KE57 (Q9KE57) BH1001 protein                                   31   9.7
BACC1 Q73CD9 (Q73CD9) Glycerol-3-phosphate dehydrogenase, aerobi...    31   9.7
BACAN Q81VL8 (Q81VL8) Oxidoreductase, FAD-binding                      31   9.7
BACAN Q81U57 (Q81U57) Glycerol-3-phosphate dehydrogenase, aerobic      31   9.7
BACAN Q81NU7 (Q81NU7) Acetyltransferase, GNAT family                   31   9.7

>STAAM AACA_STAAM (P0A0C0) Bifunctional AAC/APH [Includes:
           6'-aminoglycoside N-acetyltransferase (EC 2.3.1.-)
           (AAC(6')); 2''-aminoglycoside phosphotransferase (EC
           2.7.1.-) (APH(2''))]
          Length = 479

 Score =  959 bits (2480), Expect = 0.0
 Identities = 467/479 (97%), Positives = 467/479 (97%)

Query: 1   MNIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFR 60
           MNIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFR
Sbjct: 1   MNIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFR 60

Query: 61  VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120
           VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI
Sbjct: 61  VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120

Query: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYD 180
           FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYD
Sbjct: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYD 180

Query: 181 DNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE 240
           DNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE
Sbjct: 181 DNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE 240

Query: 241 KAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKR 300
           KAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKR
Sbjct: 241 KAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKR 300

Query: 301 DIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNA 360
           DIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNA
Sbjct: 301 DIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNA 360

Query: 361 TTVFEGKKCLCHNDFSCNHLLLDGNNRLTXXXXXXXXXXXXEYCDFIYLLEDSEEEIGTN 420
           TTVFEGKKCLCHNDFSCNHLLLDGNNRLT            EYCDFIYLLEDSEEEIGTN
Sbjct: 361 TTVFEGKKCLCHNDFSCNHLLLDGNNRLTGIIDFGDSGIIDEYCDFIYLLEDSEEEIGTN 420

Query: 421 FGEDILRMYGNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENGRKEIYKRTYKD 479
           FGEDILRMYGNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENGRKEIYKRTYKD
Sbjct: 421 FGEDILRMYGNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENGRKEIYKRTYKD 479


>ENTFA AACA_ENTFA (P0A0C2) Bifunctional AAC/APH [Includes:
           6'-aminoglycoside N-acetyltransferase (EC 2.3.1.-)
           (AAC(6')); 2''-aminoglycoside phosphotransferase (EC
           2.7.1.-) (APH(2''))]
          Length = 479

 Score =  959 bits (2480), Expect = 0.0
 Identities = 467/479 (97%), Positives = 467/479 (97%)

Query: 1   MNIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFR 60
           MNIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFR
Sbjct: 1   MNIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFR 60

Query: 61  VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120
           VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI
Sbjct: 61  VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120

Query: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYD 180
           FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYD
Sbjct: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYD 180

Query: 181 DNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE 240
           DNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE
Sbjct: 181 DNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE 240

Query: 241 KAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKR 300
           KAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKR
Sbjct: 241 KAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKR 300

Query: 301 DIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNA 360
           DIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNA
Sbjct: 301 DIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNA 360

Query: 361 TTVFEGKKCLCHNDFSCNHLLLDGNNRLTXXXXXXXXXXXXEYCDFIYLLEDSEEEIGTN 420
           TTVFEGKKCLCHNDFSCNHLLLDGNNRLT            EYCDFIYLLEDSEEEIGTN
Sbjct: 361 TTVFEGKKCLCHNDFSCNHLLLDGNNRLTGIIDFGDSGIIDEYCDFIYLLEDSEEEIGTN 420

Query: 421 FGEDILRMYGNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENGRKEIYKRTYKD 479
           FGEDILRMYGNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENGRKEIYKRTYKD
Sbjct: 421 FGEDILRMYGNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENGRKEIYKRTYKD 479


>BACAN Q81P54 (Q81P54) Acetyltransferase, GNAT family
          Length = 177

 Score =  168 bits (425), Expect = 4e-41
 Identities = 76/174 (43%), Positives = 116/174 (66%), Gaps = 1/174 (0%)

Query: 5   ENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRVIIE 64
           ++ + +R ++++D P++ KWLTD  VL++Y GRD   ++E +  H+         R +IE
Sbjct: 5   KDNVSVRYVVEEDAPIISKWLTDPEVLQYYEGRDDPQSVEMVLNHFIHNPNSPEKRCLIE 64

Query: 65  YNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFL 124
           +++VPIGY Q+Y +  E  T Y Y ++   V+GMDQFIGEP YW KGIGT+++K    ++
Sbjct: 65  FDDVPIGYIQMYPVDSESKTLYGYEESQN-VWGMDQFIGEPTYWGKGIGTKFVKAAITYI 123

Query: 125 KKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYR 178
             E  A A+ +DP  NN RAI+ Y+K GF+ ++ L EHELHEG  EDC++MEY+
Sbjct: 124 LSEMGAEAIAMDPKVNNERAIKCYEKCGFKKVKILKEHELHEGVLEDCWMMEYK 177


>BACC1 Q735Z9 (Q735Z9) Acetyltransferase, GNAT family
          Length = 359

 Score =  159 bits (403), Expect = 1e-38
 Identities = 74/185 (40%), Positives = 118/185 (63%), Gaps = 1/185 (0%)

Query: 5   ENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRVIIE 64
           ++ + +R + ++D P++ KWLT+  VL++Y GRD   +++ +  H+         R +IE
Sbjct: 5   KDNVSVRYVKEEDAPIISKWLTEPEVLQYYEGRDNPQSVDMVLDHFIHNPNSHEKRCLIE 64

Query: 65  YNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFL 124
           +++VPIGY Q+Y +  E  T Y Y ++   V+GMDQFIGEP YW KGIGT+ ++    ++
Sbjct: 65  FDDVPIGYIQMYPVDSEWKTLYGYEESQH-VWGMDQFIGEPTYWGKGIGTKLVQTAITYI 123

Query: 125 KKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYDDNAT 184
            +   A A+ +DP  NN RAI+ Y+K GF+ ++ L EHELHEG  EDC++MEY+  +
Sbjct: 124 MENTGAEAIAMDPKVNNERAIKCYEKCGFKKVKVLKEHELHEGVLEDCWMMEYKQRELRE 183

Query: 185 NVKAM 189
             KA+
Sbjct: 184 MKKAL 188


>BACAN Q81T90 (Q81T90) Aminoglycoside phophotransferase family
           protein
          Length = 300

 Score = 67.0 bits (162), Expect = 1e-10
 Identities = 51/208 (24%), Positives = 95/208 (45%), Gaps = 12/208 (5%)

Query: 190 KYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNT 249
           K  I+    N  + S +    G+D+VA +VN+E +F+             EK +   L
Sbjct: 5   KQYIKEALPNLSIHSYKQNEEGWDNVAVIVNDELLFRFPRKQEYAMRIPLEKELCTILTQ 64

Query: 250 NLETNVKIP--NIEYSYISDELSILGYKE-IKGTFLTPEIYSTMSEEEQNLLKRDIASFL 306
           +L+  +++P  ++ Y   SDE+ +  Y   I G  L  EI + + E+E+ ++   +A+FL
Sbjct: 65  SLQ-EIEVPQYHLIYKNESDEVPLCSYYTLIHGEPLKTEIVANLDEKERKIIITQLATFL 123

Query: 307 RQMHGLDYTDISECTIDNKQNVL---EEYILLRETIYNDLTDIEKDYIESFMERL---NA 360
             +H +    ++      ++ +    E    L E + N LT  +K  +    E      A
Sbjct: 124 AALHSIPLKSVTALGFPTEKTLTYWKELQTKLNEYVTNSLTSFQKSTLNRLFENFFACIA 183

Query: 361 TTVFEGKKCLCHNDFSCNHLLLDGNNRL 388
           T+ F     + H DF+ +H+L D  N++
Sbjct: 184 TSAF--PNAIIHADFTHHHILFDKQNKI 209


>BACC1 Q73BC3 (Q73BC3) Aminoglycoside phophotransferase family
           protein
          Length = 300

 Score = 62.0 bits (149), Expect = 4e-09
 Identities = 51/206 (24%), Positives = 92/206 (44%), Gaps = 10/206 (4%)

Query: 190 KYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNT 249
           K  I+    N  + S +    G+D+VA +VN+E +F+             EK +   L+
Sbjct: 5   KQYIKEALPNLSIHSYKQNEEGWDNVAIIVNDELLFRFPRKQEYAMRIPLEKELCTLLSC 64

Query: 250 NL-ETNVKIPNIEYSYISDELSILGYKE-IKGTFLTPEIYSTMSEEEQNLLKRDIASFLR 307
           +L E  V   ++ Y   +D + +  Y   I G  L  EI +T+ ++E+  L   +A+FL
Sbjct: 65  SLHEIEVPKYHLFYEKNTDAIPLCSYYTLIHGEPLKTEIVTTLEKQERKALITQLATFLA 124

Query: 308 QMHGLDYTDISECTIDNKQNVL---EEYILLRETIYNDLTDIEKDYIESFMERL---NAT 361
            +H +    ++      ++ +    E    L E + N LT  +K  +    E      AT
Sbjct: 125 ALHSIPLKSVTALGFPIEKTLTYWKELQAKLNEYVTNSLTSFQKSTLNRLFENFFACLAT 184

Query: 362 TVFEGKKCLCHNDFSCNHLLLDGNNR 387
           + F+    + H DF+ +H+L D  N+
Sbjct: 185 SKFQ--NTIIHADFTHHHILFDKQNK 208


>BRAJA Q89WN0 (Q89WN0) Bll0648 protein
          Length = 161

 Score = 59.3 bits (142), Expect = 3e-08
 Identities = 44/145 (30%), Positives = 75/145 (51%), Gaps = 13/145 (8%)

Query: 11  RTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRVIIEYNNVPI 70
           R +   D PL+ +WL +  V E++G   +++ L S      EP  D+    I+   + P
Sbjct: 8   RPMTAADLPLIRRWLGEAHVREWWGDPGEQFALVS--GDLDEPAMDQF---IVLAGDKPF 62

Query: 71  GYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKE--R 128
           GY Q Y++    +     P+      G+DQFIGE +  ++G G+ +I+   +F+ ++
Sbjct: 63  GYLQCYRL--TAWNTGFGPQPGG-TRGIDQFIGESDMIARGHGSAFIR---QFVDEQLRH 116

Query: 129 NANAVILDPHKNNPRAIRAYQKSGF 153
               V+ DP   N RA+RAY+K+GF
Sbjct: 117 GLPRVVTDPDPLNSRAVRAYEKAGF 141


>BACHD Q9K9M4 (Q9K9M4) BH2621 protein
          Length = 197

 Score = 56.2 bits (134), Expect = 2e-07
 Identities = 35/159 (22%), Positives = 78/159 (49%), Gaps = 6/159 (3%)

Query: 2   NIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRV 61
           ++V  ++  R +  DD  ++  W+ +E V+ ++        L   KKH      D+   +
Sbjct: 15  HVVNKKLSFRHVTMDDVDMLHSWMHEEHVIPYW---KLNIPLVDYKKHLQTFLNDDHQTL 71

Query: 62  II-EYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120
           ++   N VP+ Y + Y + +++  +Y YP  +E   G+   IG   Y  +G+    +  I
Sbjct: 72  MVGAINGVPMSYWESYWVKEDIIANY-YP-FEEHDQGIHLLIGPQEYLGQGLIYPLLLAI 129

Query: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
            +   +E + N ++ +P + N + I  ++K GF+ ++++
Sbjct: 130 MQQKFQEPDTNTIVAEPDRRNKKMIHVFKKCGFQPVKEV 168


>BACC1 Q739G2 (Q739G2) 6'-aminoglycoside
           N-acetyltransferase/2''-aminoglycoside
           phosphotransferase, putative (EC 2.3.1.-)
          Length = 293

 Score = 55.1 bits (131), Expect = 5e-07
 Identities = 57/289 (19%), Positives = 125/289 (43%), Gaps = 24/289 (8%)

Query: 193 IEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLE 252
           ++  +   +++S+ I   G ++   +VN+  +F+       +KG  K +     L
Sbjct: 11  LQRLYPELQINSVYINEIGQNNDVLIVNDNIVFRFP---KYEKGIQKLRIETQLLEKIRP 67

Query: 253 -TNVKIPNIEYSYISDELS---ILGYKEIKGTFLTPEIYSTMSEEEQ-NLLKRDIASFLR 307
              ++IPN  Y    +E+      GY+ I+G      +++ +++E+Q   L   +A FL+
Sbjct: 68  FITLQIPNPSYQGFQNEVPGKVFAGYEMIEGDPFWKNVFTEINDEKQLQKLAYTLARFLK 127

Query: 308 QMHGLD---YTDISEC-TIDNKQNVLEEYILLRETIYNDLTDI-EKDYIESFMERLNATT 362
           ++H +    +  I +C + D    +   Y  L+E +Y  + ++  K+   SF   LN ++
Sbjct: 128 ELHEIPLSTFESIMQCDSTDMYSEINSLYSQLKEHVYPFMRNVARKEVSTSFELYLNESS 187

Query: 363 VFEGKKCLCHNDFSCNHLLLDGNNR-LTXXXXXXXXXXXXEYCDFIYLLEDSEEEIGTNF 421
            F     L H DF   ++L     + ++               DF  +L         ++
Sbjct: 188 HFNFTPSLVHGDFGMTNILYSATKKNISGVIDFGGASIGDPAYDFAGIL--------ASY 239

Query: 422 GEDILRMYGNI--DIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENG 468
           GE+ L+++     ++E  KE     +  + ++  ++G+ N  ++  E G
Sbjct: 240 GEEFLQLFEAYYPNLEAVKERMYFYKSTFALQEALFGVLNNDKKAFEAG 288


>THEMA Q9X063 (Q9X063) Hypothetical protein
          Length = 182

 Score = 52.4 bits (124), Expect = 3e-06
 Identities = 27/75 (36%), Positives = 41/75 (54%), Gaps = 1/75 (1%)

Query: 101 FIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLP 160
           F+G P YWS+G GT  ++++  F+  E N N + L     N RA R Y+K GF++   L
Sbjct: 94  FLGRP-YWSQGYGTDAMRVLVRFIFNEMNMNKIKLHVFSFNERAKRVYEKIGFKVEGILR 152

Query: 161 EHELHEGKKEDCYLM 175
           +    EG+  D  +M
Sbjct: 153 QELFREGRYHDVIVM 167


>CLOTE Q896X4 (Q896X4) Putative acetyltransferase
          Length = 186

 Score = 48.9 bits (115), Expect = 3e-05
 Identities = 44/173 (25%), Positives = 73/173 (42%), Gaps = 15/173 (8%)

Query: 6   NEICIRTLIDDDFPLMLKWLTDE---RVLEFYGGRDK-KYTLESLKKHYTEPWEDEVFRV 61
           + I I  L ++D   + KW  D    RV +F     K  + +            +  F +
Sbjct: 10  DRIKITALREEDIETITKWYEDTNFLRVFDFNPSAPKTSWKIREWLMEEVSSSNNYFFAI 69

Query: 62  IIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIF 121
             +  N  +GY +I K+             +  V G+   IG+ + W KG G+  + L
Sbjct: 70  RKKDANKILGYVEIEKI-----------NWNNGVGGIAIGIGDSSEWGKGYGSEALSLAM 118

Query: 122 EFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYL 174
           +F  +E N + + L     N RAI++Y+K GF+      E    +GK+ D YL
Sbjct: 119 DFAFRELNLHRLQLITISYNERAIKSYEKLGFKKEGIYREAVNRDGKRYDIYL 171


>BACHD Q9KB15 (Q9KB15) BH2121 protein
          Length = 181

 Score = 48.1 bits (113), Expect = 6e-05
 Identities = 28/78 (35%), Positives = 36/78 (46%), Gaps = 13/78 (16%)

Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161
           IGE  YW KG G   ++L+  +   E N + V L     N +AIR Y+K GF+
Sbjct: 95  IGEKTYWGKGYGFEALRLLLNYAFLEMNLHRVSLRVFSFNKKAIRLYEKLGFK------- 147

Query: 162 HELHEGKKEDCYLMEYRY 179
              HEG    C    YRY
Sbjct: 148 ---HEGTSRQCL---YRY 159


>STAES Q8CQT6 (Q8CQT6) Spermidine acetyltransferase
          Length = 177

 Score = 47.0 bits (110), Expect = 1e-04
 Identities = 42/169 (24%), Positives = 71/169 (42%), Gaps = 14/169 (8%)

Query: 8   ICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFR-VIIEYN 66
           + IR L   D    +  L +E  +  Y   +   +L  L+  YT+   DE  R  I+E
Sbjct: 3   LIIRALEKTDLSF-IHHLNNEYSIMSYWFEEPYQSLSELENLYTKHILDETERRFIVEEG 61

Query: 67  NVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKK 126
           +  +G  ++ ++           +T E++  +D     P Y + G   +  K+  ++
Sbjct: 62  STSVGVVELLEIN-------FIHRTCEVLIIID-----PQYANNGYAKKAFKMAIDYAFL 109

Query: 127 ERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLM 175
             N N V L     N +A+  YQ + F I   L EH    G+  DCY+M
Sbjct: 110 VLNMNKVYLYVDIKNEKAVHIYQSNNFEIEGTLKEHFYTRGEYRDCYVM 158


>VIBCH Q9KMC2 (Q9KMC2) Acetyltransferase, putative
          Length = 158

 Score = 45.1 bits (105), Expect = 5e-04
 Identities = 37/166 (22%), Positives = 69/166 (41%), Gaps = 18/166 (10%)

Query: 15  DDDFPLMLKWLTDERVLEFYGGRDKKY--TLESLKKHYTEPWEDEVFRVIIEYNNVPIGY 72
           + DF L++KW+  + +   +GG    +  T E +  H ++    EVF  +++      G+
Sbjct: 8   ESDFDLLIKWIDSDELNYLWGGPAYVFPLTYEQIHSHCSKA---EVFPYLLKVKGRHAGF 64

Query: 73  GQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANA 132
            ++YK+ DE Y                 FI    Y  +G+    + L+ +  + + +A
Sbjct: 65  VELYKVTDEQYRICRV------------FISNA-YRGQGLSKSMLMLLIDKARLDFSATK 111

Query: 133 VILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYR 178
           + L   + N  A + Y+  GF ++          GK  D   ME R
Sbjct: 112 LSLGVFEQNTVARKCYESLGFEVVSTEIGTRAFNGKLWDLVRMEKR 157


>BACSU BLTD_BACSU (P39909) Spermine/spermidine acetyltransferase (EC
           2.3.1.57)
          Length = 152

 Score = 44.7 bits (104), Expect = 6e-04
 Identities = 40/153 (26%), Positives = 69/153 (45%), Gaps = 16/153 (10%)

Query: 8   ICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKK-HYTEPWEDEVFRVIIEYN 66
           I I+ + DD+   +L     +  L +      K  LE  K+ HY +P       V + Y
Sbjct: 3   INIKAVTDDNRAAILDLHVSQNQLSYI--ESTKVCLEDAKECHYYKP-------VGLYYE 53

Query: 67  NVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKK 126
              +G+     MY  L+ +Y     +  V+ +D+F  +  Y  KG+G + +K + + L +
Sbjct: 54  GDLVGFA----MYG-LFPEYDEDNKNGRVW-LDRFFIDERYQGKGLGKKMLKALIQHLAE 107

Query: 127 ERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
                 + L   +NN  AIR YQ+ GF+   +L
Sbjct: 108 LYKCKRIYLSIFENNIHAIRLYQRFGFQFNGEL 140


>BACC1 Q739N8 (Q739N8) Spermidine acetyltransferase, putative
          Length = 156

 Score = 43.9 bits (102), Expect = 0.001
 Identities = 18/64 (28%), Positives = 36/64 (56%)

Query: 98  MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE 157
           +D+F+ +  Y  KG   R+++L+ +FL+ +     + L  H +N  A+  Y+  GFR+
Sbjct: 74  LDRFMIDQQYQGKGYAKRFLRLLIQFLQNKFECKTIYLSLHPDNKLAMGLYESFGFRLNG 133

Query: 158 DLPE 161
           D+ +
Sbjct: 134 DIDD 137


>LACPL Q88SW8 (Q88SW8) Acetyltransferase (Putative)
          Length = 180

 Score = 43.5 bits (101), Expect = 0.001
 Identities = 26/74 (35%), Positives = 33/74 (44%)

Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161
           IG+P+    G GT  + LI  +   E N   V LD    NP AI  YQ SGF
Sbjct: 93  IGDPDERGHGYGTETLSLILNYAFNELNLYKVCLDVIATNPAAIAVYQNSGFEFEGTNKR 152

Query: 162 HELHEGKKEDCYLM 175
               +G++ D Y M
Sbjct: 153 AIKRDGQRIDLYHM 166


>VIBUY Q7MK89 (Q7MK89) Putative acetyltransferase
          Length = 158

 Score = 43.1 bits (100), Expect = 0.002
 Identities = 30/140 (21%), Positives = 66/140 (47%), Gaps = 14/140 (10%)

Query: 17  DFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRVIIEYNNVPIGYGQIY 76
           DF L+++W+  + +   +GG    + L S ++      ++EVF  +++ N    G+ ++Y
Sbjct: 10  DFHLLIEWIDSDELNYLWGGPAYTFPLTS-EQIIAHCAKEEVFPYLLKVNGQNAGFVELY 68

Query: 77  KMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILD 136
           K+ +E Y                 FI   +Y  +G+    I L+ + ++ + +A  + L
Sbjct: 69  KVTNEHYRICRV------------FISN-SYRGQGLSKSMIMLLIDKVRSDFSATMLSLG 115

Query: 137 PHKNNPRAIRAYQKSGFRII 156
             ++N  A + Y+  GF ++
Sbjct: 116 VFEHNTVARKCYESLGFNVV 135


>DEIRA Q9RW71 (Q9RW71) Acetyltransferase, putative
          Length = 207

 Score = 43.1 bits (100), Expect = 0.002
 Identities = 21/70 (30%), Positives = 38/70 (54%)

Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161
           I +P +W  G G + ++L  +    E +A+ + L     N R +RA Q++G+R    +PE
Sbjct: 107 IYDPAHWGGGFGRQALRLWTDATFAETDAHLITLTTWSGNERMVRAAQRAGYRECARIPE 166

Query: 162 HELHEGKKED 171
             L +G++ D
Sbjct: 167 ARLWQGQRWD 176


>BACAN Q81RM0 (Q81RM0) Spermidine acetyltransferase, putative
          Length = 156

 Score = 43.1 bits (100), Expect = 0.002
 Identities = 18/64 (28%), Positives = 35/64 (54%)

Query: 98  MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE 157
           +D+F+ +  Y  KG   R+++L+ +FL+ +     + L  H  N  A+  Y+  GFR+
Sbjct: 74  LDRFMIDQQYQGKGYAKRFLRLLIQFLQHKFECKTIYLSLHPENKLAMGLYESFGFRLNG 133

Query: 158 DLPE 161
           D+ +
Sbjct: 134 DIDD 137


>LACJO Q74K74 (Q74K74) Hypothetical protein
          Length = 189

 Score = 42.4 bits (98), Expect = 0.003
 Identities = 41/162 (25%), Positives = 71/162 (43%), Gaps = 25/162 (15%)

Query: 17  DFPLM---LKWLTDERVLEFYGGRDKKYTLESLKKHYTEP-WEDEVFRVIIEYNNV--PI 70
           DFPL+   LK + DE  ++      +    + +K  +  P +     R+ +E +++  PI
Sbjct: 9   DFPLVYPILKQIFDEMDMDTIKALPESQFYDLMKHGFYSPHYRYSHNRMWVETDDLDRPI 68

Query: 71  GYGQIYKMYDELYTDYH----YPKT----DEIVYG----------MDQFIGEPNYWSKGI 112
           G   +Y   D+   D      YPK     D +++           +D     P +W KGI
Sbjct: 69  GLIVMYGYDDQGLIDISLKSAYPKVGLPLDAVIFSDKEALPHEWYLDAIAVSPKHWGKGI 128

Query: 113 GTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154
           G + IK I   + ++     + L+  ++NPRA R Y   GF+
Sbjct: 129 GQKLIK-IAPGIARQNGYKKISLNVDQDNPRAARLYDYMGFK 169


>BACC1 Q72YV7 (Q72YV7) Acetyltransferase, GNAT family
          Length = 174

 Score = 42.4 bits (98), Expect = 0.003
 Identities = 35/152 (23%), Positives = 63/152 (41%), Gaps = 14/152 (9%)

Query: 6   NEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKK---HYTEPWEDEVFRVI 62
           N I +R   +DD     KW  D  V+        KY+ +  +K    +      + + +
Sbjct: 5   NRIQLRKFSEDDILTYYKWHNDIDVMSSTTLNLDKYSFQDTEKLCQQFIHSPNAKSYIIE 64

Query: 63  IEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFE 122
            +  N+PIG   +      ++ D +    + I+      IG+ +YW +G G     L+
Sbjct: 65  EKATNLPIGITSL------IHIDSYNRNAECIID-----IGKKDYWGQGYGKEAFTLLLN 113

Query: 123 FLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154
           +   E N + + L     N RAI+ Y+  GF+
Sbjct: 114 YAFLELNLHRLSLRVFSFNDRAIKLYKSLGFQ 145


>BACC1 Q734R6 (Q734R6) Acetyltransferase, GNAT family
          Length = 176

 Score = 42.0 bits (97), Expect = 0.004
 Identities = 37/146 (25%), Positives = 63/146 (43%), Gaps = 16/146 (10%)

Query: 15  DDDFPLMLKWLTDERVLEFYGGRDKKYTLES--LKKHYTEPWEDE----VFRVIIEYNNV 68
           ++DF  ++ W+ +      +GG    + L +  LK +     +D     VF+ I E N+
Sbjct: 9   EEDFQQLIDWIPNAEFSLQWGGPAFTFPLTNAQLKNYLQNANKDNAIKYVFKAIDETNSE 68

Query: 69  PIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKER 128
            IG+  +  +           KT+E        IG  N   KG GT+ +  + +F  +E
Sbjct: 69  VIGHISLGNV----------DKTNESARIGKVLIGSTNSRGKGYGTQMMTAVLKFAFEEL 118

Query: 129 NANAVILDPHKNNPRAIRAYQKSGFR 154
             + V L     N  AI+ Y+K GF+
Sbjct: 119 KLHKVTLGVFDFNESAIKCYKKVGFQ 144


>CLOPE Q8XKH9 (Q8XKH9) Hypothetical protein CPE1416
          Length = 193

 Score = 41.6 bits (96), Expect = 0.005
 Identities = 31/106 (29%), Positives = 45/106 (42%), Gaps = 5/106 (4%)

Query: 82  LYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNN 141
           LY    Y    EI Y +     E N+W KG+ +  IK I  F  +  + N +I     NN
Sbjct: 93  LYNIDFYSNNTEIGYTI-----EKNFWRKGVASECIKAIENFAFETLDMNRIIAMIDSNN 147

Query: 142 PRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYDDNATNVK 187
             +I+  +K GF     L EH  ++ K E   +  Y    +   VK
Sbjct: 148 ISSIKLSEKLGFHRDGILREHYYNKSKDEYINICVYSLIKSDIKVK 193


>BACAN Q81SQ8 (Q81SQ8) Acetyltransferase, GNAT family
          Length = 157

 Score = 41.2 bits (95), Expect = 0.007
 Identities = 33/126 (26%), Positives = 54/126 (42%), Gaps = 17/126 (13%)

Query: 34  YGGRDKKYTLESLKKHYTEPWEDEV---FRVIIEYNNVPIGYGQIYKMYDELYTDYHYPK 90
           Y G+   Y +E+ ++   E   DE        ++ N   IGY  + K+ D
Sbjct: 22  YEGKYSFYDIEADEEDLAEFLHDESRGDHTFSVKENGTLIGYFTVCKITDG--------- 72

Query: 91  TDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQK 150
           T +I  G+      PN    G G ++I  I  F K++   N + L     N RAI+ Y++
Sbjct: 73  TVDIGLGI-----RPNITGNGFGLQFINAILAFSKEKYGCNYITLSVATFNKRAIKVYKR 127

Query: 151 SGFRII 156
           +GF  +
Sbjct: 128 AGFEAV 133


>VIBCH Q9K330 (Q9K330) Acetyltransferase, putative
          Length = 178

 Score = 40.8 bits (94), Expect = 0.009
 Identities = 21/80 (26%), Positives = 40/80 (50%), Gaps = 10/80 (12%)

Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161
           IG+  +W KG+GT   +L+  +  +E   + + L  + +N  A++AY+ +G++
Sbjct: 95  IGDKAFWGKGLGTEVTRLVTNYGFRELGLHRIELTAYCDNVAAVKAYENAGYQ------- 147

Query: 162 HELHEGKKEDCYLMEYRYDD 181
              HEG K +      R+ D
Sbjct: 148 ---HEGIKRESGYRNGRFMD 164


>VIBUY Q7MK96 (Q7MK96) Putative acetyltransferase
          Length = 158

 Score = 40.4 bits (93), Expect = 0.012
 Identities = 35/166 (21%), Positives = 70/166 (42%), Gaps = 18/166 (10%)

Query: 15  DDDFPLMLKWLTDERVLEFYGGRDKKY--TLESLKKHYTEPWEDEVFRVIIEYNNVPIGY 72
           + +F  ++ W+  + +   +GG    +  T E +  H ++    EVF  +++ N    G+
Sbjct: 8   ESNFDQLIAWIDSDELNYLWGGPAYVFPLTYEQIHAHCSKA---EVFPYLLKVNGRHAGF 64

Query: 73  GQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANA 132
            ++YK+ DE Y           VY  + + G      +G+    + L+ +  + + +A
Sbjct: 65  VELYKVTDEQYRICR-------VYISNAYRG------RGLSKSMLMLLIDKARLDFSATK 111

Query: 133 VILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYR 178
           + L   + N  A + Y+  GF ++          GK  D   ME R
Sbjct: 112 LSLGVFEQNTVARKCYESLGFEVVSTEIGTRSFNGKLWDLVRMEKR 157


>WIGBR Q8D3I4 (Q8D3I4) Imp protein
          Length = 723

 Score = 40.0 bits (92), Expect = 0.016
 Identities = 60/261 (22%), Positives = 104/261 (39%), Gaps = 50/261 (19%)

Query: 57  EVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRY 116
           +++   + + N+PI Y   +K+Y E Y D  Y  + +I Y  +  +    Y+ K    +Y
Sbjct: 191 KIWNAKLNFKNIPIFYVPFFKVY-EKYNDIFY--SPKISYKNNNGLSLSFYYKKIFFDKY 247

Query: 117 IKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLME 176
               F F+ K  +   ++L    NN    + Y  S F            + KK + Y++
Sbjct: 248 ---FFYFIPKYNSDGTILL----NN----KIYYSSDF------------DKKKINLYIL- 283

Query: 177 EYRYDDNATNVKAMKYLIEHYFDNFKVD---------SIEIIGSGYDSVAYLVNNEYI--F 225
             +D           L ++YF N K+D         +  I    +D     + NE +  F
Sbjct: 284 --FDIKKNKNNWFIDLKQNYFFNKKLDILYIYKKSNNFIIFNKMFDIEKNFLQNEILEKF 341

Query: 226 KTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPE 285
             K+  N  K   + K    F N N    +K P++ +SY  ++      K  K  F+
Sbjct: 342 NLKYFYNNWKLKLEYKKFIIFDNKNF-NYIKFPHVYFSYFDNK-----NKNFKFNFVGKF 395

Query: 286 IYSTMSEEEQNLLKRDIASFL 306
            Y    EE++ +L  +I  FL
Sbjct: 396 SY----EEDKKILHINIEPFL 412


>BACSU P94482 (P94482) YnaD
          Length = 170

 Score = 39.7 bits (91), Expect = 0.021
 Identities = 39/156 (25%), Positives = 66/156 (42%), Gaps = 17/156 (10%)

Query: 1   MNIVENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWED--EV 58
           M+I    + IR     D+  + ++ +D  V+++    +  +T E  K    +   D  E
Sbjct: 1   MHITTKRLLIREFEFKDWQAVYEYTSDSNVMKYIP--EGVFTEEDAKAFVNKNKGDNAEK 58

Query: 59  FRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIK 118
           F VI+   +  IG+   YK + E         T EI +     +  PNY +KG  +   +
Sbjct: 59  FPVILRDEDCLIGHIVFYKYFGE--------HTYEIGW-----VFNPNYQNKGYASEAAQ 105

Query: 119 LIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154
            I E+  KE N + +I      N  + R  +K G R
Sbjct: 106 AILEYGFKEMNLHRIIATCQPENIPSYRVMKKIGMR 141


>BACC1 Q73A91 (Q73A91) Acetyltransferase, GNAT family
          Length = 177

 Score = 39.7 bits (91), Expect = 0.021
 Identities = 23/85 (27%), Positives = 37/85 (43%), Gaps = 5/85 (5%)

Query: 87  HYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIR 146
           H  K  E+ Y    ++G+P YW  G GT   K +  +   E + N +      NNP + R
Sbjct: 86  HIHKRGELAY----WVGKP-YWGNGFGTEAAKTLLHYGFNELHLNKIFAAAFTNNPGSWR 140

Query: 147 AYQKSGFRIIEDLPEHELHEGKKED 171
             +K G +      +H +  G+  D
Sbjct: 141 IMEKIGMKHEGTFKQHVVKSGEPMD 165


>THETN Q8RC99 (Q8RC99) Acetyltransferases
          Length = 149

 Score = 39.3 bits (90), Expect = 0.027
 Identities = 35/149 (23%), Positives = 66/149 (44%), Gaps = 29/149 (19%)

Query: 43  LESLKKHYTEPWEDEVF-----------RVIIEYNNVPIGYGQIYKMYDELYTDYHYPKT 91
           +E  K  +T PW  E F            ++ E +   +GY   + + DE +       T
Sbjct: 18  MEIEKLSFTTPWSREAFVGEVTKNSCARYIVAEVDKKVVGYAGFWVVLDEGHI------T 71

Query: 92  DEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKS 151
           +  V+        P Y  KGIG+R ++ + + L K+    ++ L+  ++N  A   Y+K
Sbjct: 72  NIAVH--------PEYRGKGIGSRLMEGLID-LAKKNGITSMTLEVRESNLVAQNLYKKF 122

Query: 152 GFRIIEDLPEHELHEGKKEDCYLMEYRYD 180
           GF+++        ++   ED  +M ++YD
Sbjct: 123 GFKVLG--RREGYYQDNNEDAIVM-WKYD 148


>STRAW Q82IB6 (Q82IB6) Putative acetyltransferase
          Length = 168

 Score = 39.3 bits (90), Expect = 0.027
 Identities = 21/54 (38%), Positives = 33/54 (61%), Gaps = 5/54 (9%)

Query: 101 FIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154
           ++G P YW++GIG+R + L   FL++ER    +  DP   N  ++R  +K GFR
Sbjct: 100 WLGRP-YWARGIGSRALGL---FLRRERT-RPLYADPFHGNTASVRLLEKHGFR 148


>LISIN Q92E38 (Q92E38) Lin0623 protein
          Length = 177

 Score = 39.3 bits (90), Expect = 0.027
 Identities = 22/69 (31%), Positives = 33/69 (47%)

Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHE 166
           +W  GIGT  ++ + +  KK      V L+    N RAI  Y+K GF    ++P     E
Sbjct: 105 FWGLGIGTLIMEGLIKHAKKTERLKLVYLEAVSENKRAINLYKKFGFIEAGEIPALMQVE 164

Query: 167 GKKEDCYLM 175
           G+  D  +M
Sbjct: 165 GRYLDVTMM 173


>STRCO O69977 (O69977) Hypothetical protein SCO5801
          Length = 231

 Score = 38.9 bits (89), Expect = 0.036
 Identities = 30/143 (20%), Positives = 65/143 (45%), Gaps = 6/143 (4%)

Query: 14  IDDDFPLMLKWLTDERVLEFYG-GRDKKYTLESLKKHYTEPWEDEVFRVIIEYNNVPIGY 72
           ++ D PL+ +W+ D  V  ++     +  T + L+       +      +   + VP+ Y
Sbjct: 67  LERDVPLIARWMNDPAVAAYWELTGPQSVTADHLRAQLAG--DGRSVPCVGTLDGVPMSY 124

Query: 73  GQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNA-N 131
            +IY+   +    Y   +  +   G+   IG+  +  +G+GT  I+ + + +   R A
Sbjct: 125 WEIYRADLDPLARYCPVRPHDT--GVHLLIGDGAHRGRGLGTELIRAVVDLVLAGRPACT 182

Query: 132 AVILDPHKNNPRAIRAYQKSGFR 154
            V+ +P   N +++ A+  +GFR
Sbjct: 183 RVLAEPDVRNRQSVAAFLGAGFR 205


>STRAW Q82KD8 (Q82KD8) Hypothetical protein
          Length = 377

 Score = 38.9 bits (89), Expect = 0.036
 Identities = 37/150 (24%), Positives = 66/150 (44%), Gaps = 10/150 (6%)

Query: 17  DFPLMLKWLTDERVLEFYG-GRDKKYTLESLKKHYTEPWEDEVFRVIIEYNNVPIGYGQI 75
           D PL+ +W+ D  V  F+    D+  T + L+              ++E    P+ Y +I
Sbjct: 217 DLPLLGRWMNDPAVAAFWKLAGDESVTEQHLRAQLGGDGRSVPCLGVLE--GTPMSYWEI 274

Query: 76  YKM-YDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANA-V 133
           Y+   D L    HYP       G+   IG      +G+G+  ++ + + +   R + A V
Sbjct: 275 YRADLDSLAR--HYPARPHDT-GIHLLIGGVADRGRGLGSTLLRAVADLVLDRRPSCARV 331

Query: 134 ILDPHKNNPRAIRAYQKSGFRIIE--DLPE 161
           + +P   N  ++ A+  +GFR     DLP+
Sbjct: 332 VAEPDLRNTSSVSAFLGAGFRFSAEVDLPD 361


>VIBCH Q9KMA3 (Q9KMA3) Acetyltransferase, putative
          Length = 230

 Score = 38.5 bits (88), Expect = 0.046
 Identities = 30/144 (20%), Positives = 62/144 (43%), Gaps = 18/144 (12%)

Query: 15  DDDFPLMLKWLTDERVLEFYGGRDKKY--TLESLKKHYTEPWEDEVFRVIIEYNNVPIGY 72
           + DF L++KW+  + +   +G     +  T E +  H ++    EVF  +++      G+
Sbjct: 8   ESDFDLLIKWIDSDELNYLWGCPAYVFPLTYEQIHSHCSKA---EVFPYLLKVKGRHAGF 64

Query: 73  GQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANA 132
            ++YK+ DE Y                 FI    Y  +G+    + L+ +  + + +A
Sbjct: 65  VELYKVTDEQYRICRV------------FISNA-YRGQGLSKSMLMLLIDKARLDFSATK 111

Query: 133 VILDPHKNNPRAIRAYQKSGFRII 156
           + L   + N  A + Y+  GF ++
Sbjct: 112 LSLGVFEQNTVARKCYESLGFEVV 135


>STAAM Q932M5 (Q932M5) Hypothetical protein SAVP027
          Length = 134

 Score = 38.5 bits (88), Expect = 0.046
 Identities = 24/77 (31%), Positives = 36/77 (46%), Gaps = 11/77 (14%)

Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164
           PNY  KG G++ +  I E+  KE   + + L   K NPRA   Y+K G +
Sbjct: 68  PNYQDKGYGSKLLSFIKEY-SKEIGCSEMFLITDKGNPRACHVYEKLGGK---------- 116

Query: 165 HEGKKEDCYLMEYRYDD 181
           ++ K E  Y+ +Y   D
Sbjct: 117 NDYKDEIVYVYDYEKGD 133


>LACLA Q9CHJ8 (Q9CHJ8) Acetyl transferase
          Length = 193

 Score = 38.5 bits (88), Expect = 0.046
 Identities = 29/106 (27%), Positives = 50/106 (47%), Gaps = 12/106 (11%)

Query: 59  FRVIIEYNNVPIGYGQI-YKMYDELYTDYHYPKTD------EIVYGMDQFIGEPNYWSKG 111
           F +  +Y   P+G   I  K    L  D H+ K        EI Y ++Q     NYW++G
Sbjct: 56  FSIANDYMKSPLGKWAIELKSEHRLIGDIHFVKISDKNQSAEIGYVLNQ-----NYWNQG 110

Query: 112 IGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE 157
           + T  +K++ EF  ++     +IL   K N  + +   KSG+ +++
Sbjct: 111 LLTEALKVLTEFSFEQFGLKKLILLIDKENVPSKKVALKSGYHLVK 156


>ENTFA Q82YR8 (Q82YR8) Acetyltransferase, GNAT family
          Length = 130

 Score = 38.5 bits (88), Expect = 0.046
 Identities = 24/77 (31%), Positives = 36/77 (46%), Gaps = 11/77 (14%)

Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164
           PNY  KG G++ +  I E+  KE   + + L   K NPRA   Y+K G +
Sbjct: 64  PNYQDKGYGSKLLSFIKEY-SKEIGCSEMFLITDKGNPRACHVYEKLGGK---------- 112

Query: 165 HEGKKEDCYLMEYRYDD 181
           ++ K E  Y+ +Y   D
Sbjct: 113 NDYKDEIVYVYDYEKGD 129


>BACC1 Q73AS8 (Q73AS8) Acetyltransferase, GNAT family
          Length = 157

 Score = 38.5 bits (88), Expect = 0.046
 Identities = 34/128 (26%), Positives = 55/128 (42%), Gaps = 21/128 (16%)

Query: 34  YGGRDKKYTLESLKKHYTEPWEDE-----VFRVIIEYNNVPIGYGQIYKMYDELYTDYHY 88
           Y G    Y +E+ ++   E   DE     +F V  + +   IGY  + K+ D
Sbjct: 22  YEGEYSFYDIEADEEDLAEFLHDESRGDHIFSV--KEHGTLIGYFTVCKINDG------- 72

Query: 89  PKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAY 148
             T +I  GM     +PN    G G ++I  I  F K++     + L     N RAI+ Y
Sbjct: 73  --TVDIGLGM-----KPNITGNGFGLQFINAILAFSKEKYGCKYITLSVATFNKRAIKVY 125

Query: 149 QKSGFRII 156
           +++GF  +
Sbjct: 126 KRAGFEAV 133


>BACC1 Q734C3 (Q734C3) Acetyltransferase, GNAT family
          Length = 183

 Score = 38.5 bits (88), Expect = 0.046
 Identities = 27/80 (33%), Positives = 38/80 (47%)

Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161
           IG+ N   KG G   I LI ++   E N + V LD    N  AI  Y+K GF++   + E
Sbjct: 98  IGDANDRGKGYGREAIHLILKYAFYELNLHRVGLDVISYNKDAIELYKKMGFQMEGCMRE 157

Query: 162 HELHEGKKEDCYLMEYRYDD 181
               +GK  D  +M    D+
Sbjct: 158 AVQRDGKCFDRIIMGILRDE 177


>BACAN Q81YL2 (Q81YL2) Acetyltransferase, GNAT family
          Length = 179

 Score = 38.5 bits (88), Expect = 0.046
 Identities = 27/80 (33%), Positives = 38/80 (47%)

Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161
           IG+ N   KG G   I LI ++   E N + V LD    N  AI  Y+K GF+I   + E
Sbjct: 96  IGDANDRGKGYGREAIHLILKYAFYELNLHRVGLDVISYNKAAIELYKKMGFQIEGCMRE 155

Query: 162 HELHEGKKEDCYLMEYRYDD 181
               +G+  D  +M    D+
Sbjct: 156 AVQRDGECFDRIIMGILRDE 175


>BACAN Q81XB7 (Q81XB7) Spermidine N1-acetyltransferase
          Length = 171

 Score = 38.5 bits (88), Expect = 0.046
 Identities = 27/116 (23%), Positives = 51/116 (43%), Gaps = 12/116 (10%)

Query: 60  RVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKL 119
           R I+E +N  +G  ++ ++      DY + +T+       Q I +PNY   G      +L
Sbjct: 57  RFIVEKDNEMVGLVELVEI------DYIHRRTEF------QIIIDPNYQGYGYAAEATRL 104

Query: 120 IFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLM 175
             ++     N + + L   K N +A+  Y+K GF +  +L +    +G   +   M
Sbjct: 105 AMDYAFSVLNMHKIYLVVDKENEKAVHVYKKVGFMVEGELLDEFFVDGNYHNAIRM 160


>SALTI Q8Z6Z1 (Q8Z6Z1) Spermidine N1-acetyltransferase
          Length = 186

 Score = 38.1 bits (87), Expect = 0.061
 Identities = 25/90 (27%), Positives = 40/90 (44%), Gaps = 4/90 (4%)

Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
           Q I  P Y  KG+ +R  KL  ++     N   + L   K N +AI  Y+K GFR+  +L
Sbjct: 87  QIIISPEYQGKGLASRAAKLAMDYGFTVLNLYKLYLIVDKENEKAIHIYRKLGFRVEGEL 146

Query: 160 PEHELHEGKKED----CYLMEYRYDDNATN 185
                  G+  +    C   +   D++ T+
Sbjct: 147 IHEFFINGEYRNTIRMCIFQQQYLDEHKTS 176


>SALTY Q8ZPJ3 (Q8ZPJ3) Spermidine N1-acetyltransferase (EC 2.3.1.57)
          Length = 186

 Score = 38.1 bits (87), Expect = 0.061
 Identities = 25/90 (27%), Positives = 40/90 (44%), Gaps = 4/90 (4%)

Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
           Q I  P Y  KG+ +R  KL  ++     N   + L   K N +AI  Y+K GFR+  +L
Sbjct: 87  QIIISPEYQGKGLASRAAKLAMDYGFTVLNLYKLYLIVDKENEKAIHIYRKLGFRVEGEL 146

Query: 160 PEHELHEGKKED----CYLMEYRYDDNATN 185
                  G+  +    C   +   D++ T+
Sbjct: 147 IHEFFINGEYRNTIRMCIFQQQYLDEHKTS 176


>SALCH Q57PD6 (Q57PD6) Spermidine N1-acetyltransferase
          Length = 186

 Score = 38.1 bits (87), Expect = 0.061
 Identities = 25/90 (27%), Positives = 40/90 (44%), Gaps = 4/90 (4%)

Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
           Q I  P Y  KG+ +R  KL  ++     N   + L   K N +AI  Y+K GFR+  +L
Sbjct: 87  QIIISPEYQGKGLASRAAKLAMDYGFTVLNLYKLYLIVDKENEKAIHIYRKLGFRVEGEL 146

Query: 160 PEHELHEGKKED----CYLMEYRYDDNATN 185
                  G+  +    C   +   D++ T+
Sbjct: 147 IHEFFINGEYRNTIRMCIFQQQYLDEHKTS 176


>MYCPE Q8EWE5 (Q8EWE5) Acetyltransferase GNAT family
          Length = 193

 Score = 38.1 bits (87), Expect = 0.061
 Identities = 34/157 (21%), Positives = 72/157 (45%), Gaps = 20/157 (12%)

Query: 2   NIVENE-ICIRTLIDDDFPLMLKWLTDERVLEFYGG---RDKKYTLESLKKHYTEPWEDE 57
           NI+E + + +R L  +D     ++   E V E  G    +D +Y+ + L K      +
Sbjct: 8   NIIETKRLYLRPLKIEDLNDFYEFAKVEGVGESAGWFHHKDIEYSKKILIKMINSKQD-- 65

Query: 58  VFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYI 117
            + ++ + NN  IG   I+  Y+           D+++ G   F+   +YW+KG+ T  +
Sbjct: 66  -YAIVYKENNKVIGELGIFNKYEN----------DKLMIG---FVLNKDYWNKGLATEIV 111

Query: 118 KLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154
           K + +++    +   + +   ++N  + R  +K GF+
Sbjct: 112 KELIDYIFTNTDHQQIYMGHFESNLASKRVVEKCGFK 148


>BACC1 Q72Y03 (Q72Y03) Spermidine N1-acetyltransferase (EC 2.3.1.57)
          Length = 176

 Score = 38.1 bits (87), Expect = 0.061
 Identities = 34/177 (19%), Positives = 74/177 (41%), Gaps = 14/177 (7%)

Query: 1   MNIVE--NEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEV 58
           M ++E   E+ +R L  +D   + +   +  ++ ++     +  +E    +     +
Sbjct: 1   MEVIEMSQELKLRPLEREDLKFVHELNNNAHIMSYWFEEPYEAFVELQDLYDKHIHDQSE 60

Query: 59  FRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIK 118
            R I+E +N  +G  ++ ++      DY + +T+       Q I +PNY   G      +
Sbjct: 61  RRFIVEKDNEMVGLVELVEI------DYIHRRTEF------QIIIDPNYQGYGYAAEATR 108

Query: 119 LIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLM 175
           L  ++     N + + L   K N +A+  Y+K GF +  +L +    +G   +   M
Sbjct: 109 LAMDYAFSVLNMHKLYLVVDKENEKAVHVYKKVGFMVEGELLDEFFVDGNYHNAIRM 165


>DEIRA Q9RW73 (Q9RW73) Acetyltransferase, putative
          Length = 186

 Score = 37.7 bits (86), Expect = 0.079
 Identities = 45/179 (25%), Positives = 77/179 (43%), Gaps = 35/179 (19%)

Query: 8   ICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYT-----LESLKKHY-----TEPWEDE 57
           + +R    +D P   +WLTDER    +   D  YT      E+++ +      T P  DE
Sbjct: 9   VVLRDRRPEDLPTFTRWLTDERAA--WREWDAPYTPAAQTSETMQAYIRYLQVTPPDADE 66

Query: 58  VFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYG-----MDQFIGEPNYWSKGI 112
             RVI       +G GQ+  M +         +++E   G     +   I +P YW  G+
Sbjct: 67  --RVI------EVG-GQVVGMVN---------RSEEEPAGGGWWDLGILIYDPAYWEGGV 108

Query: 113 GTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKED 171
           GTR + L  +      +A+ + +     N R +RA ++ GF+    + E  +  G++ D
Sbjct: 109 GTRALSLWVQDTLDWTDAHTLTVTTWSGNERMMRAARRLGFQECARVREARVVGGQRYD 167


>STAAM Q99U68 (Q99U68) Hypothetical protein
          Length = 169

 Score = 37.4 bits (85), Expect = 0.10
 Identities = 31/133 (23%), Positives = 55/133 (41%), Gaps = 17/133 (12%)

Query: 44  ESLKKHYTEPWEDEV-------------FRVIIEYNNVPIGYGQIYKMYDELYTDYHYPK 90
           E +K+H  E W+D+              +  ++E N+   G+  + +   E Y D  +P
Sbjct: 22  ELMKEHDNEQWDDQYPLLEHFEEDIAKDYLYVLEENDKIYGFIVVDQDQAEWYDDIDWPV 81

Query: 91  TDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQK 150
             E  + + +  G   Y  KG  T     + + + K R A  ++ D    N  A   + K
Sbjct: 82  NREGAFVIHRLTGSKEY--KGAATELFNYVIDVV-KARGAEVILTDTFALNKPAQGLFAK 138

Query: 151 SGF-RIIEDLPEH 162
            GF ++ E L E+
Sbjct: 139 FGFHKVGEQLMEY 151


>RICPR Q9ZE75 (Q9ZE75) TRANSCRIPTIONAL ACTIVATOR PROTEIN CZCR (CzcR)
          Length = 237

 Score = 37.4 bits (85), Expect = 0.10
 Identities = 35/130 (26%), Positives = 63/130 (48%), Gaps = 15/130 (11%)

Query: 219 VNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLET-NVKIPNIEYSYISDELSILGYKEI 277
           V  E I + K    + KG+A     ++ ++ NL+T +V++   +    + E SIL    +
Sbjct: 104 VREELIARIKAIVRRSKGHAASIFRFDKISVNLDTRSVEVDGKKLHLTNKEYSILELLIL 163

Query: 278 -KGTFLTPE-----IYSTMSEEEQNLLKRDIASFLRQMH----GLDYTDISECTIDNKQN 327
            +GT LT E     +YST+ E E  ++   I    +++     G DY D    T+  +
Sbjct: 164 RRGTILTKEMFLNHLYSTVDEPEMKIIDVFICKLRKKLSDAAGGRDYID----TVWGRGY 219

Query: 328 VLEEYILLRE 337
           +L+EY  L++
Sbjct: 220 MLKEYDELQQ 229


>LACJO Q74J71 (Q74J71) Hypothetical protein
          Length = 181

 Score = 37.4 bits (85), Expect = 0.10
 Identities = 23/72 (31%), Positives = 32/72 (44%), Gaps = 1/72 (1%)

Query: 102 IGEPNYWSKGIGTRYIKL-IFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLP 160
           I  P YW  GIG R +K+ I E   +      + L     NPR I   QK GF+    +
Sbjct: 98  IYNPTYWHGGIGGRVLKIWISEIFDQYPELEHIGLTTWSGNPRMIHLAQKLGFKKEAQIR 157

Query: 161 EHELHEGKKEDC 172
           +   ++ K  DC
Sbjct: 158 KVRFYKEKYYDC 169


>CLOAB Q97D33 (Q97D33) Acetyltransferase (With duplicated domains),
           possibly RIMI-like protein
          Length = 292

 Score = 37.4 bits (85), Expect = 0.10
 Identities = 18/59 (30%), Positives = 32/59 (54%), Gaps = 1/59 (1%)

Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHE 163
           P Y  +G G   + ++ E+L  ER+ + + L+   NN RA   Y+  GF+I  ++  +E
Sbjct: 225 PEYRGRGFGREMMSMLLEYLI-ERDYDDIALEVDSNNKRAFELYKSIGFQIEREIDYYE 282


>VIBCH Q9KU66 (Q9KU66) Ribosomal-protein-alanine acetyltransferase
          Length = 161

 Score = 36.6 bits (83), Expect = 0.18
 Identities = 22/77 (28%), Positives = 42/77 (54%), Gaps = 2/77 (2%)

Query: 104 EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE-DLPEH 162
           +P    KG G + ++  F  L +++ A +  L+  ++N RA   YQ++GF  I+  +  +
Sbjct: 85  DPAQQGKGYGQQLLQH-FIALCEQQKAESAWLEVRESNQRAFALYQRAGFNEIDRRVNYY 143

Query: 163 ELHEGKKEDCYLMEYRY 179
            + +GK ED  +M Y +
Sbjct: 144 PVAKGKSEDAIIMSYLF 160


>STRR6 Q8DNF9 (Q8DNF9) Hypothetical protein spr1760
          Length = 172

 Score = 36.6 bits (83), Expect = 0.18
 Identities = 23/71 (32%), Positives = 34/71 (47%), Gaps = 3/71 (4%)

Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEH--EL 164
           YW+ G+G+  ++   E+ +       + L     N  A+  YQK GF +IE   E    +
Sbjct: 98  YWNNGLGSLLLEEAIEWAQASGILRRLQLTVQTRNQAAVHLYQKHGF-VIEGSQERGAYI 156

Query: 165 HEGKKEDCYLM 175
            EGK  D YLM
Sbjct: 157 EEGKFIDVYLM 167


>SALTY Q8ZNU6 (Q8ZNU6) Putative ribose 5-phosphate isomerase (EC
           5.3.1.6)
          Length = 212

 Score = 36.6 bits (83), Expect = 0.18
 Identities = 24/93 (25%), Positives = 44/93 (47%), Gaps = 4/93 (4%)

Query: 219 VNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIK 278
           +N  +IF+  F+  K +GY  E+      N  +   VK   ++ +Y+ D L  +  + +K
Sbjct: 124 LNVRFIFEKAFTGRKGEGYPPERKAPQVRNAGILNQVKAAVVKENYL-DTLRAIDPELVK 182

Query: 279 GTFLTPEIYSTMSEEEQNLLKRDIASFLRQMHG 311
                P     + E  QN   ++I +F+R+M G
Sbjct: 183 TAVSGPRFQQCLFENGQN---KEIEAFVREMLG 212


>SALCH Q57N66 (Q57N66) Putative ribose 5-phosphate isomerase
          Length = 212

 Score = 36.6 bits (83), Expect = 0.18
 Identities = 24/93 (25%), Positives = 44/93 (47%), Gaps = 4/93 (4%)

Query: 219 VNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIK 278
           +N  +IF+  F+  K +GY  E+      N  +   VK   ++ +Y+ D L  +  + +K
Sbjct: 124 LNVRFIFEKTFTGRKGEGYPPERKAPQVRNAGILNQVKAAVVKENYL-DTLRAIDPELVK 182

Query: 279 GTFLTPEIYSTMSEEEQNLLKRDIASFLRQMHG 311
                P     + E  QN   ++I +F+R+M G
Sbjct: 183 TAVSGPRFQQCLFENGQN---KEIEAFVREMLG 212


>ECO57 ATDA_ECO57 (P0A952) Spermidine N(1)-acetyltransferase (EC
           2.3.1.57) (Diamine acetyltransferase) (SAT)
          Length = 185

 Score = 36.6 bits (83), Expect = 0.18
 Identities = 21/60 (35%), Positives = 29/60 (48%)

Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
           Q I  P Y  KG+ TR  KL  ++     N   + L   K N +AI  Y+K GF +  +L
Sbjct: 86  QIIISPEYQGKGLATRAAKLAMDYGFTVLNLYKLYLIVDKENEKAIHIYRKLGFSVEGEL 145


>ECOLI ATDA_ECOLI (P0A951) Spermidine N(1)-acetyltransferase (EC
           2.3.1.57) (Diamine acetyltransferase) (SAT)
          Length = 185

 Score = 36.6 bits (83), Expect = 0.18
 Identities = 21/60 (35%), Positives = 29/60 (48%)

Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
           Q I  P Y  KG+ TR  KL  ++     N   + L   K N +AI  Y+K GF +  +L
Sbjct: 86  QIIISPEYQGKGLATRAAKLAMDYGFTVLNLYKLYLIVDKENEKAIHIYRKLGFSVEGEL 145


>BACHD Q9KG16 (Q9KG16) BH0299 protein
          Length = 305

 Score = 36.6 bits (83), Expect = 0.18
 Identities = 35/126 (27%), Positives = 52/126 (41%), Gaps = 17/126 (13%)

Query: 41  YTLESLKKHYTEPWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQ 100
           Y  E + K   EP       +IIE   + IGY          Y +   P+  E   G  +
Sbjct: 185 YDAEEILKKINEPTNK---LLIIEKEQIVIGYA---------YVEVE-PEHGE---GQIE 228

Query: 101 FIG-EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
           +IG  P+Y  +G+ T+ +      L        + L   K N +AIR YQ +GF+    L
Sbjct: 229 YIGIAPDYRRQGLATQLLTNALHVLFSYPTVEDITLCVSKQNTKAIRLYQAAGFKKERQL 288

Query: 160 PEHELH 165
              EL+
Sbjct: 289 TYFELN 294


>AQUAE O66838 (O66838) Ribosomal-protein-alanine acetyltransferase
          Length = 154

 Score = 36.6 bits (83), Expect = 0.18
 Identities = 23/74 (31%), Positives = 37/74 (50%), Gaps = 5/74 (6%)

Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164
           P Y  KG G + ++     L  +     V+LD  K+N RAI  Y+K GF+++    E +
Sbjct: 75  PGYRGKGYGEKLLREAISRLGDK--VKRVVLDVRKSNLRAINLYKKLGFKVV---TERKG 129

Query: 165 HEGKKEDCYLMEYR 178
           +    E+  LME +
Sbjct: 130 YYSDGENALLMELK 143


>PROAC Q6ABU2 (Q6ABU2) Acetyltransferase (EC 2.3.1.-)
          Length = 188

 Score = 36.2 bits (82), Expect = 0.23
 Identities = 21/77 (27%), Positives = 32/77 (41%)

Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164
           P  W +G+G R ++L  E      +A  V L     N R I      G+R    +P+
Sbjct: 111 PTLWGRGVGRRALRLWTEATFATTDAQVVTLTTWSGNGRMIHCAGAVGYRECGRIPQARS 170

Query: 165 HEGKKEDCYLMEYRYDD 181
            +G++ D   M    DD
Sbjct: 171 WQGRRWDLVTMALLRDD 187


>BACAN Q81QL3 (Q81QL3) Acetyltransferase, GNAT family
          Length = 308

 Score = 36.2 bits (82), Expect = 0.23
 Identities = 44/183 (24%), Positives = 70/183 (38%), Gaps = 31/183 (16%)

Query: 44  ESLKKHYTEPWEDEVFRV------IIEYNNVPIGYGQIYKM-YDELYTDYHYPKTDEIVY 96
           E L +  T  +++E  R       +I+YN  P GY  +  M Y     D +    DE +
Sbjct: 15  EKLTEIMTRTFDEEAERWLCGQGDVIDYNIQPPGYSSVEMMRYSIEELDSYKVIMDEKII 74

Query: 97  G-------------MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPR 143
           G             +D+   EP Y  KGIG+  IKLI       R  +        NN
Sbjct: 75  GGIIVTISGKSYGRIDRIFVEPVYQGKGIGSNVIKLIEAEYPSIRIWDLETSSRQINNH- 133

Query: 144 AIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVD 203
               Y+K G++ I         E + E CY+       N  +V   + +    ++N  ++
Sbjct: 134 --HFYKKMGYQTI--------FESEDEYCYVKRIGTSSNKESVFKNEDMKNSQYENCNLE 183

Query: 204 SIE 206
           + E
Sbjct: 184 NTE 186


>STRMU Q8DV67 (Q8DV67) Putative acetyltransferase
          Length = 166

 Score = 35.8 bits (81), Expect = 0.30
 Identities = 21/52 (40%), Positives = 29/52 (55%), Gaps = 2/52 (3%)

Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRII 156
           P Y  +GIGT  +K   E L+K +  + V L   K N  A+  YQK+GF+ I
Sbjct: 95  PAYRGQGIGTELLKTFLEHLRK-KGYHKVSLSVQKEND-AVNMYQKAGFQTI 144


>STAAM COAD_STAAM (P63818) Phosphopantetheine adenylyltransferase
           (EC 2.7.7.3) (Pantetheine-phosphate adenylyltransferase)
           (PPAT) (Dephospho-CoA pyrophosphorylase)
          Length = 160

 Score = 35.8 bits (81), Expect = 0.30
 Identities = 28/132 (21%), Positives = 55/132 (41%), Gaps = 13/132 (9%)

Query: 164 LHEGKKEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEY 223
           L   KKE  + +E R D    +VK +  +  H F    VD  E +G+          +++
Sbjct: 38  LKNSKKEGTFSLEERMDLIEQSVKHLPNVKVHQFSGLLVDYCEQVGAKTIIRGLRAVSDF 97

Query: 224 IFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDEL--SILGYKEIKGTF 281
            ++ + ++  KK           LN  +ET   + +  YS+IS  +   +  Y+     F
Sbjct: 98  EYELRLTSMNKK-----------LNNEIETLYMMSSTNYSFISSSIVKEVAAYRADISEF 146

Query: 282 LTPEIYSTMSEE 293
           + P +   + ++
Sbjct: 147 VPPYVEKALKKK 158


>LISMO Q8Y9B8 (Q8Y9B8) Lmo0614 protein
          Length = 177

 Score = 35.8 bits (81), Expect = 0.30
 Identities = 17/54 (31%), Positives = 27/54 (50%)

Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLP 160
           YW  GIGT  ++ + ++ K       + L+    N RAI  Y+K GF    ++P
Sbjct: 105 YWGLGIGTICMEELIKYAKSSEYLKLIYLEVVTENKRAINLYKKFGFIEAGEIP 158


>THEMA Q9WZ46 (Q9WZ46) Hypothetical protein
          Length = 179

 Score = 35.4 bits (80), Expect = 0.39
 Identities = 19/49 (38%), Positives = 28/49 (57%), Gaps = 1/49 (2%)

Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRI 155
           YW+ GIGTR I    E+ ++      + L+  K+N RAI  Y+K GF +
Sbjct: 106 YWNIGIGTRMITSAIEWARR-NGFIRIQLEVLKSNERAISLYRKLGFEL 153


>STRR6 Q8DQU8 (Q8DQU8) Hypothetical protein spr0490
          Length = 185

 Score = 35.4 bits (80), Expect = 0.39
 Identities = 29/128 (22%), Positives = 60/128 (46%), Gaps = 16/128 (12%)

Query: 55  EDEVF---RVIIEYN---NVPIGYGQIYKMYDELY--TDYHYPKTDEIVYGMDQFIG--- 103
           EDE++    ++ E N   N+P GYG + K  D++    D+++   D+++      IG
Sbjct: 50  EDEIYYLEHILPERNQKENLPAGYGIVVKGTDKIVGSVDFNHRHEDDVLE-----IGYTL 104

Query: 104 EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHE 163
            P+YW +G      + + +   K+   + + L     N ++ R  +K GF +   + + +
Sbjct: 105 HPDYWGRGYVPEAARALIDLAFKDLGLHKIELTCFGYNLQSKRVAEKLGFTLEARIRDRK 164

Query: 164 LHEGKKED 171
             +G + D
Sbjct: 165 DVQGNRCD 172


>CLOAB Q97FA0 (Q97FA0) Predicted acetyltransferase
          Length = 146

 Score = 35.4 bits (80), Expect = 0.39
 Identities = 23/94 (24%), Positives = 45/94 (47%), Gaps = 15/94 (15%)

Query: 61  VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120
           V+I+ NN+ +GYG ++ + DE +              +      P +   GIG + ++ +
Sbjct: 46  VVIKNNNLVVGYGGLWLIIDEGH--------------ITNIAVHPEFRGMGIGNKILEEL 91

Query: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154
            +  +K RN  ++ L+   +N  A   Y+K GF+
Sbjct: 92  IKLCEK-RNIPSMTLEVRISNTIAQNLYKKFGFK 124


>_BUCAP LOLC_BUCAP (Q8K9N8) Lipoprotein-releasing system
           transmembrane protein lolC
          Length = 399

 Score = 35.4 bits (80), Expect = 0.39
 Identities = 44/156 (28%), Positives = 62/156 (39%), Gaps = 26/156 (16%)

Query: 189 MKYLIEHYFDNFK----VDSIEIIGSGYDSVAYLVNNEYIFKTKFS------------TN 232
           ++YL   Y  NFK    + SI  IG G  S    ++    F+ KF             TN
Sbjct: 11  LRYLWNPYLPNFKKIIIILSILGIGIGISSTIITISIMNGFQNKFKNDILSFIPHIIITN 70

Query: 233 KKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSE 292
           K +   K       LN   ET +K+ N+E   I+D +S     E K      EI     +
Sbjct: 71  KNRNINK-------LNFPKET-LKLKNVEE--ITDFISKKVIIENKNEINIGEIIGINIK 120

Query: 293 EEQNLLKRDIASFLRQMHGLDYTDISECTIDNKQNV 328
            E+NL   +I  FL  +H   Y  I    +  K +V
Sbjct: 121 NEKNLENYNIKKFLHTLHSRKYNAIIGSELAKKMHV 156


>BACC1 Q734M2 (Q734M2) Acetyltransferase, GNAT family
          Length = 153

 Score = 35.4 bits (80), Expect = 0.39
 Identities = 22/91 (24%), Positives = 42/91 (46%), Gaps = 16/91 (17%)

Query: 80  DELYTDYHYPKT-----------DEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKER 128
           DE++  Y Y  T           DE+   + + +  P+Y+ KGI T+ +  +F+     +
Sbjct: 50  DEIFYGYFYEDTLAGFISFKIEKDEV--DIHRLVVSPDYFHKGIATKLLLYVFDMFSPSK 107

Query: 129 NANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
                I+   K N  A+  Y+K GF  ++++
Sbjct: 108 ---TYIVQTGKENTPALSLYKKHGFIEVKEI 135


>BACAN Q81U09 (Q81U09) Acetyltransferase, GNAT family
          Length = 167

 Score = 35.4 bits (80), Expect = 0.39
 Identities = 15/47 (31%), Positives = 27/47 (57%)

Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153
           Y ++GIGT+ I+ +  + K++     + L     N RAI+ Y++ GF
Sbjct: 93  YCNQGIGTKLIEFLIRWAKEQNGLEKICLGVVSVNDRAIKVYKRMGF 139


>YERPE Q8ZHB3 (Q8ZHB3) Putative siderophore biosynthesis protein
           IucB (Acetyl CoA:N6-hydroxylsyine acetyl transferase)
          Length = 316

 Score = 35.0 bits (79), Expect = 0.51
 Identities = 35/159 (22%), Positives = 63/159 (39%), Gaps = 31/159 (19%)

Query: 14  IDDDFPLMLKWLTDERVLEFY---GGRDKK--YTLESLKKHYTEPWEDEVFRVIIEYNNV 68
           +D D P   +W+   RV  F+   G  D +  Y    L   Y  P       ++  +++
Sbjct: 151 VDHDAPQFTRWMNSPRVDAFWEMSGPLDVQAAYLQRQLDSPYCYP-------LLGCFDDQ 203

Query: 69  PIGYGQIY-KMYDELYTDYHYPKTDEIVYGMDQFIGEPNY--------WSKGIGTRYIKL 119
           P GY ++Y    D +   Y +   D    G+   +GE N+        W +G+ T Y+ L
Sbjct: 204 PFGYFEVYWAAEDRIGRHYRWQPFDR---GLHMLVGEENWRGAQYIHSWLRGL-THYLYL 259

Query: 120 IFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIED 158
                  E     V+ +P  +N R       +G+  +++
Sbjct: 260 ------DESRTTRVVAEPRIDNQRLFHHLPAAGYHTLKE 292


>VIBUY Q7MFH7 (Q7MFH7) Histone acetyltransferase HPA2
          Length = 166

 Score = 35.0 bits (79), Expect = 0.51
 Identities = 18/65 (27%), Positives = 33/65 (50%)

Query: 111 GIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKE 170
           GIG++ I+ + E      N   + ++ + +N +AI  Y+K GF I  +  +    EG+
Sbjct: 94  GIGSKLIETVTELADNWLNVRRIQIEVNVDNEKAISLYKKHGFVIEGEAVDSSFREGRFI 153

Query: 171 DCYLM 175
           + Y M
Sbjct: 154 NTYYM 158


>RICCN Q92JG6 (Q92JG6) Transcriptional activator protein czcR
          Length = 237

 Score = 35.0 bits (79), Expect = 0.51
 Identities = 37/136 (27%), Positives = 63/136 (46%), Gaps = 27/136 (19%)

Query: 219 VNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLET--------NVKIPNIEYSYISDELS 270
           V  E I + K    + KG+A     ++ ++ NL+T         V + N EY+ +  EL
Sbjct: 104 VREELIARIKAIVRRSKGHAASVFRFDKVSINLDTRSVEVDGKKVHLTNKEYAIL--ELL 161

Query: 271 ILGYKEIKGTFLTPE-----IYSTMSEEEQNLLKRDIASFLRQMH----GLDYTDISECT 321
           IL     +GT LT E     +YS++ E E  ++   I    +++     G DY D    T
Sbjct: 162 ILR----RGTILTKEMFLNHLYSSVDEPEMKIIDVFICKLRKKLSDAAGGRDYID----T 213

Query: 322 IDNKQNVLEEYILLRE 337
           +  +  +L+EY  L++
Sbjct: 214 VWGRGYMLKEYDELQQ 229


>CLOAB Q97G03 (Q97G03) Predicted acetyltransferase
          Length = 167

 Score = 35.0 bits (79), Expect = 0.51
 Identities = 24/75 (32%), Positives = 37/75 (49%), Gaps = 11/75 (14%)

Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHE 166
           Y  KGIG+  IK +FE+  +E     + L+   +N +AI  Y+K GF          + E
Sbjct: 94  YSGKGIGSLIIKRVFEW-AEENAIEKIDLEVFHDNFKAISLYKKFGF----------IEE 142

Query: 167 GKKEDCYLMEYRYDD 181
           G+K++    E  Y D
Sbjct: 143 GRKKNAIKAEDGYKD 157


>BACHD Q9KF00 (Q9KF00) Ribosomal-protein-alanine N-acetyltransferase
          Length = 190

 Score = 35.0 bits (79), Expect = 0.51
 Identities = 43/175 (24%), Positives = 76/175 (43%), Gaps = 23/175 (13%)

Query: 8   ICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRVIIEYNN 67
           + +R +  DD   +L +L+D+ V++++ G +   TLE         W + +      +
Sbjct: 16  LILRKITTDDARSILSYLSDKEVMKYF-GLEPFQTLEDALGEIA--WYESIL-----HEQ 67

Query: 68  VPIGYGQIYKMYDELY--TDYH--YPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI--- 120
             I +G   K  DE+     +H   PK      G +       YW +GI +  I+ +
Sbjct: 68  TGIRWGITLKGQDEVIGSCGFHQWVPKHHRAEIGFEL---SKLYWGQGIASEAIRAVIQY 124

Query: 121 -FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYL 174
            FE L+ +R   A+I  P+  + R +   +K GF     L  +E   GK +D Y+
Sbjct: 125 GFEHLELQR-IQALIEPPNIPSQRLV---EKQGFISEGLLRSYEYTCGKFDDLYM 175


>BACAN Q81NB6 (Q81NB6) Spermine/spermidine acetyltransferase,
           putative
          Length = 148

 Score = 35.0 bits (79), Expect = 0.51
 Identities = 31/119 (26%), Positives = 52/119 (43%), Gaps = 20/119 (16%)

Query: 54  WEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPK---TDEIVYGMDQFIGEP---NY 107
           WE+ +   + E     I    +Y + +  + D  Y K    +E + G   F  +P   NY
Sbjct: 14  WEEAIKLSVKEEQQTFIA-SNLYSIAEVQFLDNFYAKGIYLEEKMVGFTMFGIDPEDNNY 72

Query: 108 W-----------SKGIGTRYIKLIFEFLKKERNAN--AVILDPHKNNPRAIRAYQKSGF 153
           W            KGIG + I L+ + +++  NAN   +++     N  A  AY+K+GF
Sbjct: 73  WIYRLMIDENFQGKGIGKQAIYLVIDEIRRNNNANFSRIMIGYAPENLTAKFAYKKAGF 131


>BACAN Q81N07 (Q81N07) Acetyltransferase, GNAT family
          Length = 153

 Score = 35.0 bits (79), Expect = 0.51
 Identities = 21/89 (23%), Positives = 42/89 (47%), Gaps = 12/89 (13%)

Query: 80  DELYTDYHYPKT---------DEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNA 130
           DE++  Y Y  T         D+    + + +  P+++ KGI T+ +  IF+      ++
Sbjct: 50  DEIFYGYFYEDTLAGFISFKIDKEEVDIHRLVVSPDHFHKGIATKLLLYIFDMFS---SS 106

Query: 131 NAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
              I+   K N  A+  Y+K GF  ++++
Sbjct: 107 KTYIVQTGKENTPALSLYKKHGFIEVQNI 135


>STRMU Q8DT36 (Q8DT36) Putative acetyltransferase
          Length = 184

 Score = 34.7 bits (78), Expect = 0.67
 Identities = 16/78 (20%), Positives = 36/78 (46%)

Query: 106 NYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELH 165
           +YW +G+ T  ++ +     +E +   + +  HK N  + R  +K+GFR++      + +
Sbjct: 105 HYWKQGLATEALENLVFLAFQELDLKELEIIVHKENRASARVAEKAGFRLVRQFKGSDRY 164

Query: 166 EGKKEDCYLMEYRYDDNA 183
             K  D    + +  D +
Sbjct: 165 THKMRDYLKYDLKAGDKS 182


>PSEAE Q9I3W7 (Q9I3W7) Hypothetical protein
          Length = 177

 Score = 34.7 bits (78), Expect = 0.67
 Identities = 17/66 (25%), Positives = 33/66 (50%)

Query: 110 KGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKK 169
           KG+G+R +  + +      N   V L  + +N  A+  Y+K GF    ++ ++ + +G+
Sbjct: 100 KGVGSRLLGELLDIADNWMNLRRVELTVYTDNAPALALYRKFGFETEGEMRDYAVRDGRF 159

Query: 170 EDCYLM 175
            D Y M
Sbjct: 160 VDVYSM 165


>NITEU Q82UT0 (Q82UT0) GCN5-related N-acetyltransferase (EC
           2.3.1.128)
          Length = 157

 Score = 34.7 bits (78), Expect = 0.67
 Identities = 17/48 (35%), Positives = 30/48 (62%), Gaps = 1/48 (2%)

Query: 109 SKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRII 156
           S+G+G + ++ + E L ++  A  V+LD  ++N  AI  YQ+ GF+ I
Sbjct: 88  SQGLGRKMLRYLIE-LSRKHQAEFVLLDVRESNTGAINLYQRLGFQQI 134


>LACLA Q9CF66 (Q9CF66) Spermidine acetyltransferase (EC 2.3.1.57)
          Length = 180

 Score = 34.7 bits (78), Expect = 0.67
 Identities = 34/131 (25%), Positives = 55/131 (41%), Gaps = 12/131 (9%)

Query: 60  RVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKL 119
           R +IE N+  IG  ++  +      DY + +T EI     Q I    +  KG   + +K
Sbjct: 62  RFVIEANDTFIGIVELMSI------DYIH-RTCEI-----QIIIISGFSGKGYAQKALKT 109

Query: 120 IFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRY 179
             ++     N + V L    +N  A+  Y+K GF+I   + E     G+  D Y M
Sbjct: 110 GVDYAFNTLNMHKVYLWVDIDNAPAVHIYKKLGFKIEGTIKEQFFAGGRYHDSYFMGILK 169

Query: 180 DDNATNVKAMK 190
            +     KA+K
Sbjct: 170 SEYTQREKAVK 180


>BACC1 Q739G7 (Q739G7) Acetyltransferase, GNAT family
          Length = 188

 Score = 34.7 bits (78), Expect = 0.67
 Identities = 33/136 (24%), Positives = 52/136 (38%), Gaps = 21/136 (15%)

Query: 47  KKHYTEPWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPN 106
           +K  T   E+ +  +IIE+N   IG    Y         + Y  T  +  G+   I  P
Sbjct: 56  EKMQTRLKEEPLSNLIIEHNGQVIGTVGFY---------WEYKPTRWLEMGI--VIYNPA 104

Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHE 166
           YW+ G GT  + L  + L ++     V L     N R ++  +K G  +          E
Sbjct: 105 YWNGGYGTEALTLYRDLLFEKMEIGRVGLTTWSGNERMMKVAEKIGMTL----------E 154

Query: 167 GKKEDCYLMEYRYDDN 182
           G+   C      Y D+
Sbjct: 155 GRMRKCRYYNGTYYDS 170


>MYCPE Q8EWM2 (Q8EWM2) Hypothetical protein MYPE1810
          Length = 300

 Score = 34.3 bits (77), Expect = 0.88
 Identities = 57/252 (22%), Positives = 99/252 (39%), Gaps = 62/252 (24%)

Query: 115 RYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRII-EDL-----PEHELHEGK 168
           R+I  + +FL K+ +      +  KN+ +        G  I+ EDL     P   L E K
Sbjct: 55  RFILNLLDFLYKDNDLIEYKRERSKNDLKFFHFSFSKGLDILLEDLHLNKDPYKWLVETK 114

Query: 169 KEDCYLME-YRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLV-----NNE 222
              C+L+  + Y  +  +  +  Y  E    N ++  ++I+   + S+   +     NN
Sbjct: 115 TRSCFLIGLFLYGGSINSPNSSNYHFEIKIHNTEI--LKIVEKIFSSINIPLLVLNRNNT 172

Query: 223 YIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFL 282
           YI   K S +                                ISD L +LG  E
Sbjct: 173 YIVYIKKSES--------------------------------ISDILKLLGATE------ 194

Query: 283 TPEIYSTMSEEEQNLLKRDIASFLRQMHGLDYTDISECTIDNKQNVLE--EYILLRETIY 340
                 +M E E+  + RD  + + +++ LD +++ + TI+     L+  EY+     ++
Sbjct: 195 ------SMFEYEEKRISRDYTNQMSRLNNLDMSNLKK-TIEASHIQLQNIEYVK-NNNLF 246

Query: 341 NDLTDIEKDYIE 352
           N LTD EK Y E
Sbjct: 247 NQLTDKEKIYCE 258


>MYCPE Q8EWE6 (Q8EWE6) Acetyltransferase GNAT family
          Length = 190

 Score = 34.3 bits (77), Expect = 0.88
 Identities = 26/112 (23%), Positives = 51/112 (45%), Gaps = 13/112 (11%)

Query: 48  KHYTEPWEDE-VFRVIIEYNNVPIGYGQIYKMYDELYTDYHYP----KTDEIVYGMDQFI 102
           KH+    E E + +++I   N    Y  ++K  +++   +       KT +I Y + +
Sbjct: 45  KHHKNIEETETILKILISGGNF---YALVWKENNKVIGSFGIETPSYKTVKIGYALSK-- 99

Query: 103 GEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154
              +YW+ GI T   K I +F+      N +++     N  + +  +KSGF+
Sbjct: 100 ---DYWNLGIMTEVTKHIIDFIFTNSGFNKILVSHFDENTASKKVIEKSGFK 148


>LACLA Q9CHW9 (Q9CHW9) Hypothetical protein yfiL
          Length = 154

 Score = 34.3 bits (77), Expect = 0.88
 Identities = 27/97 (27%), Positives = 46/97 (47%), Gaps = 15/97 (15%)

Query: 90  KTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVIL--DPHKNNPRAIRA 147
           K  +  + ++ F  E ++  +G G + +K +  +LK+   A+ +IL  D   NN   +
Sbjct: 56  KKQKNTFEIENFAVETSFQGQGFGQQMMKQLITYLKENLAADELILGTDDVSNN---VAF 112

Query: 148 YQKSGFRIIE-------DLPEHELHEGK---KEDCYL 174
           Y+K GF I         D  +H + EGK   K+  YL
Sbjct: 113 YEKCGFTITHKISNYFLDNCDHPIFEGKVQLKDKIYL 149


>LACPL Q88SM7 (Q88SM7) Prophage Lp4 protein 8, DNA primase/helicase
          Length = 500

 Score = 34.3 bits (77), Expect = 0.88
 Identities = 37/160 (23%), Positives = 64/160 (40%), Gaps = 11/160 (6%)

Query: 179 YDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEY-----IFKTKFSTNK 233
           Y DNAT      + I+++FD       EI+   ++ +   +N  Y     IF      N
Sbjct: 174 YRDNATTPNIKGWTIDNWFDELACGDDEIVELLWEVINDCLNGNYTRKKAIFLFSELGNS 233

Query: 234 KKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEE 293
            KG  +E  I N +  +    +K+   +  +      ++G     G  + P+IY   S
Sbjct: 234 GKGTFQE-LITNLVGMDNVGTLKVNEFDVRF--RLAGLVGKTVCIGDDIAPDIYIKDSSN 290

Query: 294 EQNLLKRDIASFLRQMHGLD-YTDISECTIDNKQNVLEEY 332
             +++  D+ +   +  G D YT    CTI    N L  +
Sbjct: 291 FNSVVTGDLVNI--EFKGQDGYTSALRCTIVQSCNGLPNF 328


>ENTFA Q837C4 (Q837C4) Acetyltransferase, GNAT family
          Length = 144

 Score = 34.3 bits (77), Expect = 0.88
 Identities = 17/58 (29%), Positives = 30/58 (51%), Gaps = 6/58 (10%)

Query: 104 EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161
           +P Y+ KG G   I+ + E        + + +D +K N  A++ YQ  GF++I +  E
Sbjct: 78  DPVYFRKGYGGEIIQKLIE------QESIIFVDANKQNEGAVKFYQSQGFQVIGESKE 129


>CLOTE Q891D4 (Q891D4) Ribosomal-protein-alanine acetyltransferase
           (EC 2.3.1.128)
          Length = 152

 Score = 34.3 bits (77), Expect = 0.88
 Identities = 30/118 (25%), Positives = 54/118 (45%), Gaps = 18/118 (15%)

Query: 58  VFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYI 117
           ++ V I+ N + +GYG ++ + DE +       T+  ++        PNY   GI +  +
Sbjct: 48  LYIVAIKDNKI-LGYGGLWIILDEGHV------TNIAIH--------PNYRQLGIASLVL 92

Query: 118 KLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLM 175
             + +   K R  N++ L+  K+N  A   YQK GF  +E+      +    ED  +M
Sbjct: 93  STLIKE-SKNRGVNSITLEVRKSNSVAQNLYQKFGF--VEEGCRKHYYSDNLEDAIIM 147


>CLOAB Q97I16 (Q97I16) Predicted acetyltransferase domain containing
           protein
          Length = 291

 Score = 34.3 bits (77), Expect = 0.88
 Identities = 31/103 (30%), Positives = 37/103 (35%), Gaps = 20/103 (19%)

Query: 68  VPIGYGQIYKMYDELYTDY-----HYPKTDEIVYGMDQFIGEPN------------YWSK 110
           +P+    IY  YDE    Y      +   DEI  G  QFI E N            Y
Sbjct: 177 IPLSIDDIY--YDEAQEYYVDDGAFFISKDEIKIGYGQFIFEHNNITIVNFGIVEQYRGN 234

Query: 111 GIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153
           G G  ++  I   LK  R    V +    NN  AI  Y   GF
Sbjct: 235 GYGRYFLSYILNILKN-RGCKVVYIKVDMNNVPAINLYTSMGF 276


>BACC1 Q72WY7 (Q72WY7) Hypothetical protein
          Length = 186

 Score = 34.3 bits (77), Expect = 0.88
 Identities = 38/163 (23%), Positives = 70/163 (42%), Gaps = 16/163 (9%)

Query: 195 HYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETN 254
           H   N K++ I  +    D V  L  N Y   T+  T+ +K   K       +N   +
Sbjct: 12  HLEKNIKLEDIPNVDLYVDQVVQLFENTYADTTR--TDDEKVLTK-----TMINNYAKGK 64

Query: 255 VKIPNIEYSYISDELSILG-YKEIKGTFLTPEIYSTMSEEEQNLLKRDIASFLRQMHGLD 313
           + IP     Y  + + ++    ++KG     +I S++     +LL  D  SF   M   +
Sbjct: 65  LFIPIKNKKYSKEHMILISLIYQLKGALSINDIKSSLETINDSLLNDD--SFELNMLYKN 122

Query: 314 YTDISECTIDN-KQNVLEEYILLRETIYNDLTDIEKDYIESFM 355
           Y  ++E  +++ KQ+V       R T  N+++ +E   +E F+
Sbjct: 123 YLALTESNVESFKQDVNN-----RVTEVNEISSLEDTKLEKFL 160


>VIBPA Q87G30 (Q87G30) Putative acetyltransferase
          Length = 166

 Score = 33.9 bits (76), Expect = 1.1
 Identities = 22/77 (28%), Positives = 36/77 (46%), Gaps = 6/77 (7%)

Query: 99  DQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIED 158
           DQF G       G+G++ I+ I E      N   + L+ + +N  AI  Y+K GF I  +
Sbjct: 88  DQFHG------LGVGSKLIETITELADNWLNVRRIQLEVNADNEAAIGLYKKHGFEIEGE 141

Query: 159 LPEHELHEGKKEDCYLM 175
             +    +G+  + Y M
Sbjct: 142 AIDASFRDGEFINTYYM 158


>STAES ARGD2_STAES (Q8CSG1) Acetylornithine aminotransferase 2 (EC
           2.6.1.11) (ACOAT 2)
          Length = 375

 Score = 33.9 bits (76), Expect = 1.1
 Identities = 31/133 (23%), Positives = 61/133 (45%), Gaps = 11/133 (8%)

Query: 193 IEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE--KAIYNFLNTN 250
           + + F+N+K D+IE + +  + +    NN Y+  +        G+  E  +A+YN LN
Sbjct: 1   MSYLFNNYKRDNIEFVDANQNELIDKDNNVYLDFSSGIGVTNLGFNMEIYQAVYNQLNLI 60

Query: 251 LETNVKIPNIEYSYISDELS--ILGYKEIKGTFL---TPEIYSTMSEEEQNLLKRDIASF 305
             +    PN+  S I +E++  ++G ++    F    T    + +    +   K +I +F
Sbjct: 61  WHS----PNLYLSSIQEEVAQKLIGQRDYLAFFCNSGTEANEAAIKLARKATGKSEIIAF 116

Query: 306 LRQMHGLDYTDIS 318
            +  HG  Y  +S
Sbjct: 117 KKSFHGRTYGAMS 129


>RICPR Q9ZEC5 (Q9ZEC5) 190 KD ANTIGEN (Sca1)
          Length = 347

 Score = 33.9 bits (76), Expect = 1.1
 Identities = 24/90 (26%), Positives = 43/90 (47%), Gaps = 2/90 (2%)

Query: 197 FDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVK 256
           F N K +  E+IGS   S+      +  F  +   +  K + K+K  Y++ +T ++++VK
Sbjct: 123 FKNGKNNDKELIGSKVISIYGQKELQQNFTLQLLVSASKNFIKDKINYSYGDTQIKSHVK 182

Query: 257 IPNIEYSYISDELSILGYKEIKGTFLTPEI 286
             N  +SY ++ L    Y       +TP I
Sbjct: 183 HHN--HSYNAEALLNYNYLVKNSIIITPNI 210


>PSEPK Q88NL5 (Q88NL5) Acetyltransferase, GNAT family
          Length = 162

 Score = 33.9 bits (76), Expect = 1.1
 Identities = 18/59 (30%), Positives = 28/59 (47%), Gaps = 6/59 (10%)

Query: 98  MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRII 156
           +D     P Y  +G+G R ++     L      NA  LD ++ NP+A+  Y   GF +I
Sbjct: 86  VDMLFVAPGYRGQGVGKRLLRYAISEL------NAEYLDVNEQNPKALGFYLHEGFEVI 138


>LACPL Q88YF8 (Q88YF8) Acetyltransferase (Putative)
          Length = 171

 Score = 33.9 bits (76), Expect = 1.1
 Identities = 13/47 (27%), Positives = 25/47 (53%)

Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153
           +W  G+GT  I+ + ++ +   +   ++L     N RA++ YQ  GF
Sbjct: 98  FWGMGLGTALIEEVLDWARNYSSLERLVLTVQLRNVRAVKLYQHLGF 144


>BACC1 Q737S7 (Q737S7) Acetyltransferase, GNAT family
          Length = 282

 Score = 33.9 bits (76), Expect = 1.1
 Identities = 36/125 (28%), Positives = 59/125 (47%), Gaps = 13/125 (10%)

Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANA---VILDPHKNNPRAIRAYQKSGFRIIEDLPE 161
           PNY  +GIG    + +FE  K E   N    + L+    N RAIR Y K G+  + DL
Sbjct: 87  PNY--RGIGVS--QKLFELHKDEAIQNGCKQLFLEVIVGNDRAIRFYNKLGYEKVYDLSY 142

Query: 162 HELHEGKK---EDCYLMEYR-YDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAY 217
           + L +  K   ++C  +E +  +  A  V+  K+L  H+  N++ D   I  + +
Sbjct: 143 YNLKDMTKIIHKECKGIEVKQLEFPAFKVEIQKWL--HFHINWQNDMDYIEKTNHTFYGA 200

Query: 218 LVNNE 222
            V+N+
Sbjct: 201 YVDND 205


>BACC1 Q734H0 (Q734H0) Acetyltransferase, GNAT family family
          Length = 182

 Score = 33.9 bits (76), Expect = 1.1
 Identities = 45/172 (26%), Positives = 71/172 (41%), Gaps = 24/172 (13%)

Query: 8   ICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWED---EVFRVIIE 64
           +CI    +DD    ++ L +++ L    G    Y LE     + + W D   E+ R  IE
Sbjct: 10  LCIEPFTNDDV-CRIRELANDKELANILGLPHPYKLE-----FAQDWVDMQPELIRKGIE 63

Query: 65  YNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQ-----FIGEPNYWSKGIGTRYIKL 119
           Y   P+G   + K   E+        T  I  G ++     +IG+ NYW KG  T  +
Sbjct: 64  Y---PLGI--VSKESREIVGTI----TLRIDKGNNRGELGYWIGK-NYWGKGFATEALNR 113

Query: 120 IFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKED 171
           + +F   E   N +       N  +I+  +KSG R    L ++ L     ED
Sbjct: 114 MIQFGFIELGLNKIWASAISRNRSSIKVLEKSGLRKEGTLRQNRLLLNTYED 165


>BACAN Q81Q67 (Q81Q67) Acetyltransferase, GNAT family
          Length = 282

 Score = 33.9 bits (76), Expect = 1.1
 Identities = 34/125 (27%), Positives = 57/125 (45%), Gaps = 13/125 (10%)

Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANA---VILDPHKNNPRAIRAYQKSGFRIIEDLPE 161
           PNY   G+  +    +FE  K+E   N    + L+    N RAIR Y K G+  + DL
Sbjct: 87  PNYRGVGVSQK----LFELHKEEALQNECKQLFLEVIVGNDRAIRFYNKLGYEKVYDLSY 142

Query: 162 HELHEGKK---EDCYLMEYR-YDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAY 217
           + L +  K    +C  +E +  +  A  V+  K+L  H+  N++ D   I  + +
Sbjct: 143 YNLKDMTKIIHRECKGIEVKQLEFAAFKVEIQKWL--HFHINWQNDMDYIEKTNHTFYGA 200

Query: 218 LVNNE 222
            V+N+
Sbjct: 201 YVDND 205


>BACAN Q81P77 (Q81P77) Acetyltransferase, GNAT family
          Length = 181

 Score = 33.9 bits (76), Expect = 1.1
 Identities = 45/181 (24%), Positives = 76/181 (41%), Gaps = 32/181 (17%)

Query: 10  IRTLIDDDFPLMLKWLTDERVLEFYGGRDK--KYTLESLKKHYTEPWEDEVF-------- 59
           +R L  DD     +W  D +V +     D+   +TLE  K+     W +
Sbjct: 8   LRELTLDDVEDRYQWSLDTKVTKHLVVSDQYPPFTLEDTKQ-----WIEACINRKNGYEQ 62

Query: 60  RVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKL 119
           R I   N + IG+ ++ K +D+        K  E+       IG   YW KG G   +
Sbjct: 63  RAITAENGIHIGWIEL-KNFDKTN------KNAELGIA----IGNKEYWGKGDGIAALYS 111

Query: 120 IFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHE-LHEGKKEDCYLMEYR 178
           +      E     V L   ++N +A ++Y+K+GF + E L  ++ L +G+    ++  YR
Sbjct: 112 MLHVAFFEFELEKVWLRVDEDNLQARKSYEKAGF-VCEGLMRNDRLRKGR----FIHRYR 166

Query: 179 Y 179
           Y
Sbjct: 167 Y 167


>Q8E1I7 (Q8E1I7) Acetyltransferase, GNAT family
          Length = 186

 Score = 33.5 bits (75), Expect = 1.5
 Identities = 22/64 (34%), Positives = 29/64 (45%), Gaps = 1/64 (1%)

Query: 93  EIVYGMDQFIG-EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKS 151
           EI +  D FI  + +YW  GIG   ++   E+         + L     N RAI  YQK
Sbjct: 97  EIEHVGDLFIAVQKDYWGYGIGHILMEEAIEWASDNDITRRLELSVQGRNERAIHLYQKF 156

Query: 152 GFRI 155
           GF I
Sbjct: 157 GFEI 160


>Q8DXE6 (Q8DXE6) Hypothetical protein SAG1905
          Length = 212

 Score = 33.5 bits (75), Expect = 1.5
 Identities = 29/93 (31%), Positives = 44/93 (47%), Gaps = 9/93 (9%)

Query: 219 VNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIK 278
           +N  Y+F+  F   K  GY KE+A+    N  + + +K   I Y    D LS+L  KEI
Sbjct: 125 LNLRYLFERLFEDEKGGGYPKERAVPEQRNARILSEIK--QITY---RDLLSVL--KEID 177

Query: 279 GTFLTPEIYSTMSEEE--QNLLKRDIASFLRQM 309
             FL   I     +E    N   ++IA +L+ +
Sbjct: 178 QDFLKETISGEHFQEYFFANCQNQNIADYLKSV 210


>OCEIH Q8ET96 (Q8ET96) Hypothetical conserved protein
          Length = 167

 Score = 33.5 bits (75), Expect = 1.5
 Identities = 26/90 (28%), Positives = 41/90 (45%), Gaps = 15/90 (16%)

Query: 110 KGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKK 169
           KGIG   + LI  +      A+ + LD   +N RAI  Y+K GF +          EG
Sbjct: 92  KGIGKEALNLIKIWAFNSYKAHRLWLDVKTDNKRAITIYKKEGFTL----------EGTL 141

Query: 170 EDCYLMEYRYDDNATNVKAMKYLIEHYFDN 199
            +C  +   Y+    ++  M  L++H +DN
Sbjct: 142 RECLRVGNTYE----SLHVMS-LLKHEYDN 166


>LISIN Q929M8 (Q929M8) Lin2246 protein
          Length = 157

 Score = 33.5 bits (75), Expect = 1.5
 Identities = 23/73 (31%), Positives = 38/73 (52%), Gaps = 1/73 (1%)

Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164
           P+Y  +GIG   +  + E + +E+    + L     N +AIR Y+K+GF+    L +  +
Sbjct: 84  PDYQREGIGQLLMDKMKE-VAREKGFIKISLRVLSINQKAIRFYEKNGFKQEGRLEKEFI 142

Query: 165 HEGKKEDCYLMEY 177
            +GK  D  LM Y
Sbjct: 143 IQGKYVDDILMAY 155


>CLOTE Q895L4 (Q895L4) DNA topoisomerase I (EC 5.99.1.2)
          Length = 696

 Score = 33.5 bits (75), Expect = 1.5
 Identities = 19/52 (36%), Positives = 28/52 (53%), Gaps = 3/52 (5%)

Query: 220 NNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSI 271
           N+EYIF+   S  K  G+     IY +LN + + N+ IP +E   +  E SI
Sbjct: 399 NSEYIFRATGSIVKFDGFM---IIYEYLNEDEKENINIPKLEKGELLKEKSI 447


>CLOPE Q8XIF5 (Q8XIF5) Ribosomal-protein-alanine N-acetyltransferase
          Length = 148

 Score = 33.5 bits (75), Expect = 1.5
 Identities = 23/76 (30%), Positives = 39/76 (51%), Gaps = 4/76 (5%)

Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164
           P Y  +G+G   I  +   L KE N N++ L+  ++N  A   Y+K GF+  E+
Sbjct: 76  PEYRKQGVGNLLIDNLIT-LCKENNINSLTLEVRESNIPAQSLYKKHGFK--EEGIRKNF 132

Query: 165 HEGKKEDCYLMEYRYD 180
           +   KE+  +M +R+D
Sbjct: 133 YNNPKENAIIM-WRHD 147


>BRUSU Q8FVN1 (Q8FVN1) Acyl-CoA dehydrogenase family protein
          Length = 388

 Score = 33.5 bits (75), Expect = 1.5
 Identities = 21/79 (26%), Positives = 38/79 (48%), Gaps = 15/79 (18%)

Query: 181 DNATNVKAMKYLIEH-----YFDNFKVDSIEIIG----------SGYDSVAYLVNNEYIF 225
           +N   ++ ++ ++ H     +FDN +V    +IG          SG ++   L+  E I
Sbjct: 197 NNGLTIRPIRTMMNHATTEVFFDNLRVPVSNLIGEEGKGFRYILSGMNAERILIAAECIG 256

Query: 226 KTKFSTNKKKGYAKEKAIY 244
             K+ T K   YAKE++I+
Sbjct: 257 DAKWFTQKSVNYAKERSIF 275


>BRUME Q8YCN6 (Q8YCN6) ACYL-COA DEHYDROGENASE (EC 1.3.99.-)
          Length = 395

 Score = 33.5 bits (75), Expect = 1.5
 Identities = 21/79 (26%), Positives = 38/79 (48%), Gaps = 15/79 (18%)

Query: 181 DNATNVKAMKYLIEH-----YFDNFKVDSIEIIG----------SGYDSVAYLVNNEYIF 225
           +N   ++ ++ ++ H     +FDN +V    +IG          SG ++   L+  E I
Sbjct: 204 NNGLTIRPIRTMMNHATTEVFFDNLRVPVSNLIGEEGKGFRYILSGMNAERILIAAECIG 263

Query: 226 KTKFSTNKKKGYAKEKAIY 244
             K+ T K   YAKE++I+
Sbjct: 264 DAKWFTQKSVNYAKERSIF 282


>BACSU YQAR_BACSU (P45914) Hypothetical protein yqaR
          Length = 154

 Score = 33.5 bits (75), Expect = 1.5
 Identities = 36/157 (22%), Positives = 74/157 (47%), Gaps = 11/157 (7%)

Query: 201 KVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKK---KGYAKEKAIYNFLNTNLETN--- 254
           ++D   +I S   +V +     Y+ + K     +   KGY     IYN  N  +ET
Sbjct: 3   QIDFGTVITSAITAVFFTGGTNYVLQKKNRKGNEIFTKGYILIDEIYNINNKRIETAAAF 62

Query: 255 VKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKRDIASFLRQMHGLDY 314
           V   N    Y+ ++L    +KE+    L  + +S + ++E N+  ++  ++LR++
Sbjct: 63  VPFYNHPEGYL-EKLHTDYFKELSAFELIVKKFSILFDKELNIKLQEYINYLREVEVALR 121

Query: 315 TDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYI 351
             +++  I  + N  +EYI   E + +++T++ K +I
Sbjct: 122 GFMNDDPI-IEVNFNQEYI---ERLIDEITNLIKKHI 154


>BACAN Q81RF9 (Q81RF9) Acetyltransferase, GNAT family
          Length = 185

 Score = 33.5 bits (75), Expect = 1.5
 Identities = 40/184 (21%), Positives = 71/184 (38%), Gaps = 29/184 (15%)

Query: 6   NEICIRTLIDDDFPLMLKWLTDERVLE-------FYGGRDKKYTLESLKKHYTEPWEDEV 58
           +++ IRT+ + D   +   +  E   E       ++    ++Y++   +K  T   E+ +
Sbjct: 9   DKVTIRTIEESDIKTLWNLVFKEENPEWKKWDAPYFSFSMQEYSVYK-EKMQTRLKEEPL 67

Query: 59  FRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIK 118
             +IIE N   IG    Y         + Y  T  +  G+   I  P YW+ G GT  +
Sbjct: 68  SNLIIENNGQVIGTVGFY---------WEYKPTRWLEMGI--VIYNPAYWNGGYGTEALT 116

Query: 119 LIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYR 178
           L  + L ++     V L     N R ++  +K G  +          EG+   C
Sbjct: 117 LYRDLLFEKMEIGRVGLTTWSGNERMMKVAEKIGMSL----------EGRMRKCRYYNGT 166

Query: 179 YDDN 182
           Y D+
Sbjct: 167 YYDS 170


>VIBPA Q87NP1 (Q87NP1) Putative spermine/spermidine
           acetyltransferase BltD
          Length = 182

 Score = 33.1 bits (74), Expect = 2.0
 Identities = 21/91 (23%), Positives = 42/91 (46%), Gaps = 3/91 (3%)

Query: 85  DYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRA 144
           ++H P T  +   M   +  P++  KG+G+  +  +     +  N   V L+ +  N  A
Sbjct: 89  EFHAPSTGTLWLPMLTIL--PSFKGKGLGSEIVSSVIAVACEYANLQNVGLNVYAENISA 146

Query: 145 IRAYQKSGFRIIEDLPEHELHEGKKEDCYLM 175
            R + + GF  I    + E+  GK+ +C ++
Sbjct: 147 FRFWYRQGFTQIRAF-DQEIEFGKEYNCLVL 176


>THETN Q8RC65 (Q8RC65) Acetyltransferases
          Length = 200

 Score = 33.1 bits (74), Expect = 2.0
 Identities = 16/50 (32%), Positives = 31/50 (62%), Gaps = 1/50 (2%)

Query: 111 GIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLP 160
           G+G++ ++ I +  +K +    ++LD    N +AI+ Y+K G++IIE  P
Sbjct: 134 GLGSKLLEEIEQEARKLK-CKRIVLDVEIENEKAIKLYEKLGYKIIERSP 182


>STRR6 Q8CYV9 (Q8CYV9) Hypothetical protein spr0850
          Length = 166

 Score = 33.1 bits (74), Expect = 2.0
 Identities = 23/81 (28%), Positives = 40/81 (49%), Gaps = 6/81 (7%)

Query: 86  YHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYI-KLIFEFLKKERNANAVILDPHKNNPRA 144
           Y YP  + +  G+  F+ +  Y  KGIG+  + + +  F K  R A    +   K NP++
Sbjct: 80  YAYPDEETVFIGL--FMVDQAYQRKGIGSHIVTEALAYFAKNFRKARLAYV---KGNPQS 134

Query: 145 IRAYQKSGFRIIEDLPEHELH 165
              ++K GF+ I    + EL+
Sbjct: 135 QHFWEKQGFKSIGCEVKQELY 155


>STAES Q8CU70 (Q8CU70) Spermidine acetyltransferase
          Length = 165

 Score = 33.1 bits (74), Expect = 2.0
 Identities = 27/83 (32%), Positives = 37/83 (44%), Gaps = 14/83 (16%)

Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHK-------NNPRAIRAYQKSG 152
           Q I +P +  KG    Y K  FE   K  N    IL+ HK       +N +A+  Y+  G
Sbjct: 81  QIIIKPEFSGKG----YAKFAFE---KAINYAFDILNMHKIYLYVDTDNKKAVHIYESQG 133

Query: 153 FRIIEDLPEHELHEGKKEDCYLM 175
           F+    L E    +GK +D Y M
Sbjct: 134 FKTEGLLKEQFYTKGKYKDAYFM 156


>STAES Q8CS08 (Q8CS08) Hypothetical protein SE1483
          Length = 434

 Score = 33.1 bits (74), Expect = 2.0
 Identities = 38/150 (25%), Positives = 71/150 (47%), Gaps = 24/150 (16%)

Query: 224 IFKTKFSTNKKKGYAK----EKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKE--- 276
           I KT+FST+K KGY K    EK+  N  N + +  ++  N +   I++E+S L
Sbjct: 6   ILKTQFSTSKFKGYLKYINDEKS--NKANHD-KKKIQSLNQDIENINNEMSNLNLNSYSS 62

Query: 277 -IKGTFLTPEIYSTMSEEEQNLLKRDIASFLRQMHGLDYTDISECTID----------NK 325
            I G      I    ++ ++ ++KR  A F    + LD  ++++   D          N
Sbjct: 63  YIIGYMKNNSITKKDNQNKKKVIKRTTAPFNNNSYTLDNKELNKLKDDFDTAEKQGCINY 122

Query: 326 QNVL--EEYILLRETIYNDLTD-IEKDYIE 352
           Q+++  +   L++  +Y+  TD + +D I+
Sbjct: 123 QDIISFDNDFLIKNHLYDAKTDELNEDVIK 152


>RICPR Q9ZDP9 (Q9ZDP9) Hypothetical protein RP278
          Length = 371

 Score = 33.1 bits (74), Expect = 2.0
 Identities = 34/116 (29%), Positives = 52/116 (44%), Gaps = 19/116 (16%)

Query: 231 TNKKKGYAK----EKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEI 286
           T  K+ Y +    E+A+Y+ L    +   K  NI  S   D+L     + +KG  LTPE
Sbjct: 101 TRLKENYIQYDTVEEALYSLLTKETDLIKKANNIPESLTPDDL-----RRLKGENLTPE- 154

Query: 287 YSTMSEEEQNLLKRDIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYND 342
                E+E+   K +  S L  +  +D T  S    D + N + E   L +TI N+
Sbjct: 155 -----EQEEERKKFEYLSILGSI--IDDTKKSNEHYDKRANEINEQ--LNKTIINE 201


>OCEIH Q8CXG0 (Q8CXG0) Diamine N-acetyltransferase
           (Spermine:spermidine acetyltransferase) (EC 2.3.1.57)
          Length = 152

 Score = 33.1 bits (74), Expect = 2.0
 Identities = 26/100 (26%), Positives = 44/100 (44%), Gaps = 12/100 (12%)

Query: 66  NNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLK 125
           ++ PIGY  +          +H  +     +  D+F+    +  KG   +YI LI +++K
Sbjct: 53  DDTPIGYAMV---------GFHSQEKQSAWF--DRFMIAAEHQGKGYAHQYIPLILDYIK 101

Query: 126 KERNANAVILDPHKNNPRAIRAYQKSGFRII-EDLPEHEL 164
            +    ++ L     N  A   Y+K GF +  E  PE EL
Sbjct: 102 MKYQVKSIKLSIIPTNDVAKLLYEKYGFVLTGETDPEGEL 141


>CLOAB Q97J70 (Q97J70) Predicted acetyltransferase
          Length = 171

 Score = 33.1 bits (74), Expect = 2.0
 Identities = 19/69 (27%), Positives = 31/69 (44%)

Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHE 166
           YW  G+G + I  +  + KK      + L    +N RAI+ Y+  GF     +    L +
Sbjct: 98  YWGLGVGRKLIMNLIAWSKKNHIVRKINLRVRTDNYRAIKLYESLGFVNEGTIKRDFLID 157

Query: 167 GKKEDCYLM 175
           G+  D + M
Sbjct: 158 GEFYDSFSM 166


>BURMA Q9AI54 (Q9AI54) DedA family protein
          Length = 1925639

 Score = 33.1 bits (74), Expect = 2.0
 Identities = 50/238 (21%), Positives = 103/238 (43%), Gaps = 28/238 (11%)

Query: 12     TLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRVIIEYNNVPIG 71
              T+ ++++  M+     ++VLE  G ++K         +YT         ++I+Y N  I
Sbjct: 546537 TINENZYMEMITKDNLKQVLENLGFKNKNENYVKTINNYT---------LLIDYKNQSIN 546587

Query: 72     YGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFL----KKE 127
              Y +  K++D+  +++ +P+   +   + + +       KG    Y++L  ++     KK
Sbjct: 546588 YPKEIKIHDKTTSNFSHPENFVVFECVHRLL------EKGYKAEYLELEPKWNLGRDKKG 546641

Query: 128    RNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYDDNATNVK 187
                A+ ++ D ++NNP  I   + +  +  E +   E +  +++   L  Y   +     K
Sbjct: 546642 GKADILVKD-NENNPYLIIECKTTDSKNSEFI--KEWNRMQEDGGQLFSYFQQE-----K 546693

Query: 188    AMKYLIEHYFD-NFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIY 244
               +KYL  +  D + K++    I   YD+  YL   E     K S N  + +   K  Y
Sbjct: 546694 GVKYLCLYTSDFSDKLEYKNYIIQAYDNEEYLKEKELQNSYKKSNNNIELFKTWKESY 546751


 Score = 31.2 bits (69), Expect = 7.4
 Identities = 20/73 (27%), Positives = 36/73 (49%), Gaps = 2/73 (2%)

Query: 105     PNYWSKGIGTRYIKLIFEFLKK-ERNANAVILDPHKNNPRAIRAYQKSGFRII-EDLPEH 162
               P++  +G+G+R  + +  + +  E     + L     NP A+R Y++ GFR     +
Sbjct: 1424334 PDHQGRGVGSRLFESLIAWARSAEPEIVRIELAAGAGNPGAVRLYERLGFRHEGRQVARG 1424393

Query: 163     ELHEGKKEDCYLM 175
                L +G+ ED  LM
Sbjct: 1424394 RLPDGRFEDDILM 1424406


>BRAJA Q89YE3 (Q89YE3) Bll0009 protein
          Length = 250

 Score = 33.1 bits (74), Expect = 2.0
 Identities = 14/56 (25%), Positives = 31/56 (55%), Gaps = 4/56 (7%)

Query: 104 EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
           +PN+  KG+GT    L+  +  +  + + ++     +NP  I  YQ+ GF+++ ++
Sbjct: 165 DPNWVGKGLGT----LLMNYALQRCDEDGIVAYLESSNPENIPFYQRHGFKVVGEI 216


>BACAN Q81NK7 (Q81NK7) Streptothricin acetyltransferase, putative
          Length = 184

 Score = 33.1 bits (74), Expect = 2.0
 Identities = 31/119 (26%), Positives = 54/119 (45%), Gaps = 6/119 (5%)

Query: 40  KYTLE---SLKKHYTEPWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVY 96
           +YT+E   S +K Y +   +E+  V  EY N P     I  +++++       K
Sbjct: 41  EYTVEDVPSYEKSYLQNDNEEL--VYNEYINKPNQIIYIALLHNQIIGFIVLKKNWNNYA 98

Query: 97  GMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRI 155
            ++    +  Y + G+G R I    ++ K E N   ++L+   NN  A + Y+K GF I
Sbjct: 99  YIEDITVDKKYRTLGVGKRLIAQAKQWAK-EGNMPGIMLETQNNNVAACKFYEKCGFVI 156


>VIBUY Q7MI31 (Q7MI31) Ribosomal-protein-alanine acetyltransferase
          Length = 150

 Score = 32.7 bits (73), Expect = 2.6
 Identities = 19/75 (25%), Positives = 34/75 (45%), Gaps = 1/75 (1%)

Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164
           P    KG G + +    +   +  NA +  L+  ++N  AI  YQ+ GF  ++    +
Sbjct: 74  PKQQGKGYGRQLLDAFIDE-GEAANAESAWLEVRESNVNAIHLYQEMGFNEVDRRRNYYP 132

Query: 165 HEGKKEDCYLMEYRY 179
            +  KED  +M Y +
Sbjct: 133 TQSGKEDAIIMSYLF 147


>OCEIH Q8EMB8 (Q8EMB8) Glucose-6-phosphate 1-dehydrogenase (EC
           1.1.1.49) (G6PD)
          Length = 491

 Score = 32.7 bits (73), Expect = 2.6
 Identities = 45/198 (22%), Positives = 79/198 (39%), Gaps = 25/198 (12%)

Query: 53  PWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTD----EIVYGMDQFIG--EPN 106
           PW DEV R  +E N++         +  E  + ++Y   D    E   G+++ I   E
Sbjct: 51  PWTDEVLRENVE-NSIQDALSPDEDL-SEFISHFYYKSFDVTEKESYQGLNEIIQNLEGQ 108

Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHE 166
           Y ++G    Y+ +  +F     N         + N   ++        +IE    H+L
Sbjct: 109 YQTEGNRLFYLAMAPDFFGAIAN---------QLNDYGLKNTSGWTRLVIEKPFGHDLPS 159

Query: 167 GKKEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFK 226
            KK +  L     +D         Y I+HY     V +IE+I        +L NN +I
Sbjct: 160 AKKLNHELQAAFREDQI-------YRIDHYLGKEMVQNIEVIRFANGIFEHLWNNRFISN 212

Query: 227 TKFSTNKKKGYAKEKAIY 244
            + ++++  G  +E+A Y
Sbjct: 213 IQITSSETLG-VEERARY 229


>OCEIH Q8EKW6 (Q8EKW6) Acetyltransferase
          Length = 166

 Score = 32.7 bits (73), Expect = 2.6
 Identities = 36/165 (21%), Positives = 64/165 (38%), Gaps = 16/165 (9%)

Query: 6   NEICIRTLIDDDFPLMLKWLTDERVLEFYGG---RDKKYTLESLKKHYT--EPWEDEVFR 60
           N +  R   D+DFP +   L D  V+ F G    RD K   + L+  Y   +       +
Sbjct: 5   NRLTFRPYHDNDFPFLQSLLQDPEVVRFIGDGNVRDDKACNDFLQWIYDTYKNGNGLGLQ 64

Query: 61  VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120
           V++   N  +G+  +     E   +       EI Y + +      +W KG  T     +
Sbjct: 65  VLVNKQNERVGHAGLVPQTVEGKNEI------EIGYWIAK-----KHWGKGYATEAALAL 113

Query: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELH 165
           F F +K    + VI    + N  +    +K   +I +++   + H
Sbjct: 114 FAFARKNIEVDRVISLIQRENTASRNVAEKLMMKIEKEIILKDKH 158


>MYCPE Q8EWC2 (Q8EWC2) Oligoendopeptidase F
          Length = 604

 Score = 32.7 bits (73), Expect = 2.6
 Identities = 30/121 (24%), Positives = 54/121 (44%), Gaps = 7/121 (5%)

Query: 181 DNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKE 240
           DN T    +  L ++ F  FK+D + I    Y+ V+  +N +   K   + N+K
Sbjct: 37  DNGTCYSNLNKLKKYLF--FKLDMVPIENKLYNYVSNKLNEDLANKEMINWNQKLSSKIS 94

Query: 241 KAIYNFLNTNLETNVKIPNIEY--SYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLL 298
           +   +F N   E N+ + N E   S+I ++  I  ++         E +   +EEE+ L+
Sbjct: 95  EFQLSFAN---EINIILDNKELIKSFIENDSEIKKFERFFDLIFKEENHKLSNEEEKLLV 151

Query: 299 K 299
           K
Sbjct: 152 K 152


>LISMO Q8Y8S4 (Q8Y8S4) Lmo0820 protein
          Length = 185

 Score = 32.7 bits (73), Expect = 2.6
 Identities = 21/74 (28%), Positives = 33/74 (44%), Gaps = 3/74 (4%)

Query: 98  MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE 157
           +D  +    Y   G+GT  +  + E +  E     V L+  K NP A R Y++ GF +
Sbjct: 113 LDSIVTNEKYRGHGVGTALLAKLTE-IAAEDGEKVVGLNCDKGNPHAKRLYERLGFHVTG 171

Query: 158 D--LPEHELHEGKK 169
           +  L  HE    +K
Sbjct: 172 EITLSGHEYEHMQK 185


>CLOPE Q8XKM5 (Q8XKM5) Probable acetyltransferase
          Length = 167

 Score = 32.7 bits (73), Expect = 2.6
 Identities = 18/76 (23%), Positives = 36/76 (47%), Gaps = 8/76 (10%)

Query: 85  DYHYPKTDEIVYGMD-------QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDP 137
           DY Y   ++I + +D       +    P++  KG G + I  + E + KE+  N++ +
Sbjct: 69  DYAYDVYNDIAWQVDGPFLSFHRIAVSPSHRGKGYGRKMIDFV-EEMAKEKKCNSIRISA 127

Query: 138 HKNNPRAIRAYQKSGF 153
           +  N  A+  Y+  G+
Sbjct: 128 YHKNENAVNLYKNLGY 143


>BUCAI SPED_BUCAI (P57304) S-adenosylmethionine decarboxylase
           proenzyme (EC 4.1.1.50) (AdoMetDC) (SamDC) [Contains:
           S-adenosylmethionine decarboxylase beta chain;
           S-adenosylmethionine decarboxylase alpha chain]
          Length = 265

 Score = 32.7 bits (73), Expect = 2.6
 Identities = 30/103 (29%), Positives = 46/103 (44%), Gaps = 11/103 (10%)

Query: 169 KEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTK 228
           + D   +EYR      ++  +K+ I+H  ++ +    + I S YD V   V  E IF T+
Sbjct: 159 ESDIVTIEYRVRGFTRDIHGIKHFIDHKINSIQNFMSDDIKSMYDMVDVNVYQENIFHTR 218

Query: 229 FSTNKKKGYAKEKAIYNFL-NTNLETNVKIPNIEYSYISDELS 270
                     +E  + N+L N NLE    +   E SYI   LS
Sbjct: 219 M-------LLREFNLKNYLFNINLE---NLEKEERSYIKKLLS 251


>AQUAE O67458 (O67458) Hypothetical protein aq_1482
          Length = 161

 Score = 32.7 bits (73), Expect = 2.6
 Identities = 18/60 (30%), Positives = 30/60 (50%), Gaps = 1/60 (1%)

Query: 95  VYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154
           V  + + + +P Y   G+GT  +  I E+ KK +  +   L     N +AI  Y+K GF+
Sbjct: 87  VGAIHEIVVDPEYQGHGVGTALMNTILEYFKK-KGLDTAELWVGDENYKAINFYKKFGFQ 145


>YERPE Q8CZN5 (Q8CZN5) Acyltransferase for 30S ribosomal subunit
           protein S18
          Length = 161

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 20/72 (27%), Positives = 34/72 (47%), Gaps = 1/72 (1%)

Query: 104 EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHE 163
           +P Y  +G G   ++ + E L+ ERN   + L+   +N RAI  Y+  GF  +     +
Sbjct: 86  DPQYQRQGYGRLLLEHLIEQLE-ERNIVTLWLEVRASNARAIALYESLGFNEVSVRRNYY 144

Query: 164 LHEGKKEDCYLM 175
                +ED  +M
Sbjct: 145 PSANGREDAIMM 156


>STRAW Q827N9 (Q827N9) Putative acetyltransferase
          Length = 166

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 19/59 (32%), Positives = 31/59 (52%), Gaps = 1/59 (1%)

Query: 109 SKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEG 167
           S+GIG+  I+   E L +ER  + + L    +NPRA   Y + G+R +    +   +EG
Sbjct: 88  SRGIGSALIRAAEE-LTRERGLDVIGLGVGTDNPRAAELYARLGYRPLTGYVDRWSYEG 145


>STAES Q8CU81 (Q8CU81) Spermidine acetyltransferase
          Length = 165

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 24/80 (30%), Positives = 35/80 (43%), Gaps = 8/80 (10%)

Query: 100 QFIGEPNYWSKGIGTRYIKLIFE----FLKKERNANAVILDPHKNNPRAIRAYQKSGFRI 155
           Q I +P +  KG    Y K  FE    +     N + + L    +N +AI  Y+  GF+
Sbjct: 81  QIIIKPEFSGKG----YAKFAFEKAIIYAFNILNMHKIYLYVDADNKKAIHIYESQGFKT 136

Query: 156 IEDLPEHELHEGKKEDCYLM 175
              L E    +GK +D Y M
Sbjct: 137 EGLLKEQFYTKGKYKDAYFM 156


>RICPR Q9ZCN0 (Q9ZCN0) RIBOSOMAL-PROTEIN-ALANINE ACETYLTRANSFERASE
           (RimJ)
          Length = 183

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 26/85 (30%), Positives = 41/85 (48%), Gaps = 12/85 (14%)

Query: 93  EIVYGMDQFIGEPNYWSKGIGTRYIKLIFEF---LKKERNANAVILDPHKNNPRAIRAYQ 149
           EI Y +D     PN+W +GI  + IK I +F   +   R    VI D    N R++   +
Sbjct: 103 EISYDLD-----PNFWGQGIMLKSIKNILKFADCIGIIRVQATVITD----NFRSVNLLE 153

Query: 150 KSGFRIIEDLPEHELHEGKKEDCYL 174
           + GF     L ++E+   K +D Y+
Sbjct: 154 RCGFSKEGILKKYEIIANKHKDYYM 178


>OCEIH Q8ERM1 (Q8ERM1) Acetyltransferase
          Length = 177

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 39/175 (22%), Positives = 64/175 (36%), Gaps = 15/175 (8%)

Query: 5   ENEICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKKYTLESLK-KHY---TEPWEDEVFR 60
           + E+ IR + + D   + + +  E   E+       ++ ES+  +H+    E W D   R
Sbjct: 4   DQELTIRPIQEKDLKRLWELIYKEDNPEWKQWDAPYFSHESMSYEHFLKEAESWIDAKSR 63

Query: 61  VIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLI 120
            ++  NN   G              Y+Y    +    M     E N W KG GT  +KL
Sbjct: 64  WVVCVNNDVHGT-----------VSYYYEDEQKNWLEMGIIFYEGNNWGKGYGTTALKLW 112

Query: 121 FEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLM 175
              +  +     V L     N R IR  +K G  +   +     + G+  D   M
Sbjct: 113 VNHIFTQLPVVRVGLTTWSGNKRMIRVAEKLGMTMEGRIRNVRYYNGEYYDSIRM 167


>MYCGE RIBF_MYCGE (P47391) Putative riboflavin biosynthesis protein
           ribF [Includes: Riboflavin kinase (EC 2.7.1.26)
           (Flavokinase); FMN adenylyltransferase (EC 2.7.7.2) (FAD
           pyrophosphorylase) (FAD synthetase)]
          Length = 269

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 16/44 (36%), Positives = 27/44 (61%), Gaps = 3/44 (6%)

Query: 419 TNFGEDILRMY-GNIDIEKAKEYQDIVEEYYPIETIVYGIKNIK 461
           TN    ++R Y  N ++EKA +   +VE YY + T+V+G+K  +
Sbjct: 120 TNLSSSVIRNYLTNNELEKANQL--LVEPYYRVGTVVHGLKKAR 161


>ENTFA Q836M4 (Q836M4) Spermine/spermidine acetyltransferase,
           putative
          Length = 148

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 16/56 (28%), Positives = 29/56 (51%)

Query: 98  MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153
           +D+F+ +  +  +G G    +L+   L ++   N + L  +  N  AIR YQ+ GF
Sbjct: 72  LDRFLIDQRFQGQGYGKAACRLLMLKLIEKYQTNKLYLSVYDTNSSAIRLYQQLGF 127


>CHRVO Q7NRZ7 (Q7NRZ7) Probable acetyltransferase (EC 2.3.1.-)
          Length = 172

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 20/70 (28%), Positives = 31/70 (44%)

Query: 106 NYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELH 165
           ++  KG G   ++ I +          + L    +N RAI  Y+K GF     L + +L
Sbjct: 92  DWQGKGAGGAMMRAIIDLADNWLGLIRIELKVIHDNARAIALYEKFGFEYEGRLRQEQLR 151

Query: 166 EGKKEDCYLM 175
            GK ED  +M
Sbjct: 152 AGKLEDVLVM 161


>CHLCV Q824H9 (Q824H9) Acetyltransferase, GNAT family
          Length = 170

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 18/52 (34%), Positives = 29/52 (55%), Gaps = 2/52 (3%)

Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153
           +GEP Y +KGIGT  +  +    K   +   + L+ ++ NP AI  Y++ GF
Sbjct: 95  VGEP-YRNKGIGTALLNNLCHLAKSRFHLEILYLEVYEENP-AIELYKRFGF 144


>CHLMU SYR_CHLMU (Q9PJT8) Arginyl-tRNA synthetase (EC 6.1.1.19)
           (Arginine--tRNA ligase) (ArgRS)
          Length = 563

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 56/268 (20%), Positives = 106/268 (39%), Gaps = 63/268 (23%)

Query: 204 SIEIIGSGYD----SVAYLVNNEYIFKTKFSTNKKKGY---AKEKAIYNFLNTNLETNVK 256
           SIEI G+G+     S  +L N   IF  + +    KG+   + +K I +F + N+  ++
Sbjct: 75  SIEIAGAGFINFTFSKEFLANQLQIFSQELA----KGFPVSSPQKVIIDFSSPNIAKDMH 130

Query: 257 IPNIEYSYISDEL----SILGYKEIK-----------GTFLT--PEIYSTMSEEEQNLLK 299
           + ++  + I D L    S +G+  ++           G  +T   E   T   + +NL +
Sbjct: 131 VGHLRSTIIGDCLARCFSFVGHDVLRLNHIGDWGTAFGMLITYLQETAQTDIHQLENLTE 190

Query: 300 RDIASFLRQMHGLDYTDISE--------------------CTIDNKQ-----NVLEEYIL 334
               + +R     ++   S+                    C +  K      ++L+  +
Sbjct: 191 LYKKAHVRFAEDPEFKKRSQYNVVALQSGDPQALALWKQICAVSEKSFQKIYSILDVELH 250

Query: 335 LR-ETIYND-LTDIEKDYIESFMERLNATTVFEGKKCLCHNDFSCNHLLL---DGNNRLT 389
            R E+ YN  L D+  D     +E  N  T+ +G KC+ H +FS   ++     G N  T
Sbjct: 251 TRGESFYNPFLADVVSD-----LESKNLVTLSDGAKCVFHEEFSIPLMIQKSDGGYNYAT 305

Query: 390 XXXXXXXXXXXXEYCDFIYLLEDSEEEI 417
                       ++ D I ++ DS + +
Sbjct: 306 TDVAAMRYRIQQDHADRILIVTDSGQSL 333


>BACSU O34376 (O34376) Putative acetyl transferase (YobR protein)
          Length = 247

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 24/84 (28%), Positives = 37/84 (44%), Gaps = 2/84 (2%)

Query: 76  YKMYD-ELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVI 134
           +KMYD E  T        +   G+   +    +  KG GT+ I+++ E+ K    A  +
Sbjct: 158 FKMYDKESLTALGTVSVIDGYGGLSNIVVAEEHRGKGAGTQVIRVLTEWAKNN-GAERMF 216

Query: 135 LDPHKNNPRAIRAYQKSGFRIIED 158
           L   K N  A+  Y K GF  I +
Sbjct: 217 LQVMKENLAAVSLYGKIGFSPISE 240


>BACC1 Q738E9 (Q738E9) Acetyltransferase, GNAT family
          Length = 308

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 38/159 (23%), Positives = 59/159 (37%), Gaps = 25/159 (15%)

Query: 62  IIEYNNVPIGYGQIYKM-YDELYTDYHYPKTDEIVYG-------------MDQFIGEPNY 107
           +I+YN  P GY  +  M Y     D +    D  + G             +D+   EP Y
Sbjct: 39  VIDYNIQPPGYSSVEMMRYSIEELDCYKVIMDGKIIGGIIVTISGKSYGRIDRIFVEPVY 98

Query: 108 WSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEG 167
             KGIG+  IKLI E     R  +        NN      Y+K G+  I         +
Sbjct: 99  QGKGIGSYVIKLIEEEYPSIRIWDLETSSRQLNNH---HFYKKMGYETI--------FKS 147

Query: 168 KKEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIE 206
           + E CY+     +    N+   K +    ++N  + + E
Sbjct: 148 EDEYCYVKRITVESAEENLIKNKDMKNSQYENCNLANTE 186


>BACC1 Q736C5 (Q736C5) Acetyltransferase, GNAT family
          Length = 181

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 23/79 (29%), Positives = 39/79 (49%), Gaps = 6/79 (7%)

Query: 102 IGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161
           IG   YW KG G   +  +      E     V L   ++N +A ++Y+K+GF + E L
Sbjct: 94  IGNKEYWGKGYGIAALYSMLHVAFFEFELEKVWLRVDEDNFQARKSYEKAGF-VCEGLMR 152

Query: 162 HE-LHEGKKEDCYLMEYRY 179
           ++ L +G+    ++  YRY
Sbjct: 153 NDRLRKGQ----FIHRYRY 167


>BACC1 Q735P8 (Q735P8) Acetyltransferase, GNAT family
          Length = 149

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 31/122 (25%), Positives = 49/122 (40%), Gaps = 21/122 (17%)

Query: 50  YTEPWEDEVFRVIIE--YNNVP--IGYGQIYKMYDELY---TDYHYPKTDEIVYGMDQFI 102
           Y  P  +E   V  E  YN+ P  +G+ +  K   +L      Y   K D+IV G   F
Sbjct: 9   YIVPCTEESIHVANEQGYNSGPHIVGHVENVKQDKDLLPWGAWYVIRKEDDIVLGDIGFK 68

Query: 103 GEPN--------------YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAY 148
           G+PN              YW+KG  T  ++ +  +  +      +I +    N  +IR
Sbjct: 69  GKPNEEHTVEVGYGFIEKYWNKGYATEAVRELINWAFQTGEVEMIIAETLLENESSIRVL 128

Query: 149 QK 150
           +K
Sbjct: 129 EK 130


>AQUAE YZ34_AQUAE (O66423) Hypothetical protein AA34
          Length = 318

 Score = 32.3 bits (72), Expect = 3.3
 Identities = 25/71 (35%), Positives = 37/71 (52%), Gaps = 5/71 (7%)

Query: 414 EEEIGTNFGEDILRMYGNIDIE-KAKEYQDIVEEYYPI----ETIVYGIKNIKQEFIENG 468
           EE IG   GE + +    +  E KAKE +  V++   I    ET+ Y IK I +E I +
Sbjct: 215 EELIGETLGELLEKEIEKLVAEEKAKEIEGKVKKLKEIVSWFETLPYEIKQIAKEVISDN 274

Query: 469 RKEIYKRTYKD 479
             +I ++ YKD
Sbjct: 275 VLDIAEKFYKD 285


>YERPE Q8ZCX6 (Q8ZCX6) Spermidine acetyltransferase (EC 2.3.1.57)
           (Spermidine N1-acetyltransferase)
          Length = 181

 Score = 32.0 bits (71), Expect = 4.4
 Identities = 21/69 (30%), Positives = 29/69 (42%)

Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
           Q I +P +  KG      KL  E+     N   + L   K N +AI  Y K GF I  +L
Sbjct: 87  QIIIDPTHQGKGYAGAAAKLAMEYGFSVLNLYKLYLIVDKENEKAIHIYSKLGFEIEGEL 146

Query: 160 PEHELHEGK 168
            +     G+
Sbjct: 147 KQEFFINGE 155


>STRR6 Q8DNN2 (Q8DNN2) Hypothetical protein spr1627
          Length = 148

 Score = 32.0 bits (71), Expect = 4.4
 Identities = 14/58 (24%), Positives = 30/58 (51%)

Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE 157
           +F   P    +G+G++ ++       +  + +++ L+  + N RA   YQK GF I++
Sbjct: 76  RFFINPQKQEQGLGSQALRKFVSLAFENEDIDSISLNVFEANQRAQNLYQKEGFEIVQ 133


>STAAM Q99RQ8 (Q99RQ8) Similar to transcription repressor of
           sporulation, septation and degradation PaiA
          Length = 171

 Score = 32.0 bits (71), Expect = 4.4
 Identities = 20/65 (30%), Positives = 35/65 (53%), Gaps = 4/65 (6%)

Query: 111 GIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKE 170
           G G++ I+L  E + +E N + + L   ++NPRA   Y++ GF+++    EH    G
Sbjct: 106 GRGSQLIELA-EKIAQEHNKHKIWLGVWEHNPRAQAFYKRHGFKVV---GEHHFQTGDVT 161

Query: 171 DCYLM 175
           D  L+
Sbjct: 162 DTDLI 166


>LACLA Q9CJA2 (Q9CJA2) Acetyl transferase
          Length = 162

 Score = 32.0 bits (71), Expect = 4.4
 Identities = 21/69 (30%), Positives = 32/69 (46%), Gaps = 2/69 (2%)

Query: 110 KGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKK 169
           KG+ T  I    +F KKE     + +    +NP A++ Y K GF     L +    +G+
Sbjct: 89  KGVATTLINFFIDFAKKE-GFKKITIQVMGSNPAALKLYNKLGFVEEGRLKKEFFIDGEY 147

Query: 170 -EDCYLMEY 177
            +DC L  Y
Sbjct: 148 IDDCILAFY 156


>CLOTE Q892J2 (Q892J2) Conserved protein
          Length = 218

 Score = 32.0 bits (71), Expect = 4.4
 Identities = 39/154 (25%), Positives = 57/154 (37%), Gaps = 21/154 (13%)

Query: 219 VNNEYIFK-------------TKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYI 265
           VNN  IFK             T F++ K +G       Y  LN     N+   N   S +
Sbjct: 9   VNNTPIFKCNYCGHCSKEIEATSFTSVKNRGCCWYFPKYTLLNIKNILNIGKENFIISLL 68

Query: 266 SDELSILG--YKEIKGTFLTPEIYSTMSEEEQNLLKRDIASFLRQMHGLDYTDISECTID 323
           +++ S +   + E+KG+F   E Y  M E E      D   F R+     +     C++D
Sbjct: 69  NNKNSNISSYFIEVKGSFEEEEYYKFMRENEYTESSFDYKLFFRK---CSFVTDKGCSLD 125

Query: 324 NKQNVLEEYILLRETIYNDLTDIEKDYIESFMER 357
                    + L   I N     +KDY     ER
Sbjct: 126 FSLRPHPCNLYLCRNIIN---TCDKDYSSFSRER 156


>BRAJA Q89UA8 (Q89UA8) Hypothetical acetyltransferase
          Length = 148

 Score = 32.0 bits (71), Expect = 4.4
 Identities = 19/56 (33%), Positives = 33/56 (58%), Gaps = 5/56 (8%)

Query: 98  MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153
           +DQ + +P  W    G+   +L+ E  K+  + + V L  +K+N RAIR Y+++GF
Sbjct: 76  LDQLVVDPASW----GSDAARLLVEEAKR-LSPSGVTLLVNKDNTRAIRFYERNGF 126


>BACSU O34558 (O34558) YopR protein
          Length = 325

 Score = 32.0 bits (71), Expect = 4.4
 Identities = 17/45 (37%), Positives = 26/45 (57%), Gaps = 5/45 (11%)

Query: 211 GYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNV 255
           G     +LV N+Y+ KTK ++NK  G A +     F+ TNL T++
Sbjct: 203 GQTKEVFLVENDYVVKTKRTSNKGDGQASK-----FVITNLITDI 242


>BACAN Q81R63 (Q81R63) Hypothetical protein
          Length = 217

 Score = 32.0 bits (71), Expect = 4.4
 Identities = 15/45 (33%), Positives = 27/45 (60%)

Query: 324 NKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNATTVFEGKK 368
           N+ NVL E +  +E +   L++ +KDYI+S  E++  T   E ++
Sbjct: 141 NQMNVLNESVTTQEELQRYLSENKKDYIKSVAEKVYQTATEEKRE 185


>VIBPA Q87MD0 (Q87MD0) Acetyltransferase-related protein
          Length = 168

 Score = 31.6 bits (70), Expect = 5.7
 Identities = 15/53 (28%), Positives = 26/53 (49%)

Query: 101 FIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153
           FI +  YW KG+ T  +K  F    +E   + V  + + N+  ++   +K GF
Sbjct: 86  FIFDKAYWGKGLATEALKAFFPKACRELELHKVKANVNSNHQASMAVLEKLGF 138


>STRR6 Q8DND0 (Q8DND0) Transcriptional activator
          Length = 299

 Score = 31.6 bits (70), Expect = 5.7
 Identities = 19/81 (23%), Positives = 40/81 (49%), Gaps = 12/81 (14%)

Query: 295 QNLLKRDIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEK------ 348
           Q L+++D+A F+ Q+  L    + +     K   +E Y ++R+T+ + +  +EK
Sbjct: 167 QMLIRKDLAKFINQIEKLMLFLLEQ----KKVTQIENYFIIRDTLISGMCCLEKVGVTDC 222

Query: 349 --DYIESFMERLNATTVFEGK 367
             DY+    E ++ T  ++ K
Sbjct: 223 FNDYLSCLQEIMDKTQDYQKK 243


>OCEIH Q8CUS9 (Q8CUS9) Hypothetical conserved protein
          Length = 161

 Score = 31.6 bits (70), Expect = 5.7
 Identities = 20/76 (26%), Positives = 36/76 (47%), Gaps = 6/76 (7%)

Query: 105 PNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHEL 164
           P++   GIG+     +  +   +     + L+  +NN +A+  Y   GF II+D  E+
Sbjct: 92  PSHQGIGIGSA----LLHYGVNQLRPREIQLNVEQNNIKALDFYTSKGFEIIKDFQEN-- 145

Query: 165 HEGKKEDCYLMEYRYD 180
            +G   D Y M ++ D
Sbjct: 146 FDGHLLDTYRMSWKLD 161


>LISIN Q92E28 (Q92E28) Lin0633 protein
          Length = 143

 Score = 31.6 bits (70), Expect = 5.7
 Identities = 20/80 (25%), Positives = 37/80 (46%), Gaps = 1/80 (1%)

Query: 75  IYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVI 134
           +Y ++ +    Y +   DE    +  F+    +  KG GT+ ++ + + L KE     +
Sbjct: 55  LYSIFTDQKIGYLWFHVDEKHAFIYDFVIFETFRGKGFGTKTLEAL-DVLAKEMGITKIE 113

Query: 135 LDPHKNNPRAIRAYQKSGFR 154
           L    +N  AI+ Y K GF+
Sbjct: 114 LHVFAHNQTAIKLYDKVGFK 133


>LACPL Q88U49 (Q88U49) Glucose-6-phosphate 1-dehydrogenase (EC
           1.1.1.49) (G6PD)
          Length = 494

 Score = 31.6 bits (70), Expect = 5.7
 Identities = 26/95 (27%), Positives = 45/95 (47%), Gaps = 8/95 (8%)

Query: 151 SGF-RIIEDLPEHELHEGKKEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIG 209
           +GF R+I + P    +E  KE    +   +++N        Y I+HY     + +I  I
Sbjct: 140 NGFNRVIIEKPFGHDYESAKELNDQLTATFNENQI------YRIDHYLGKEMIQNITAIR 193

Query: 210 SGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIY 244
            G +    L NN YI   + + ++K G  +E+A+Y
Sbjct: 194 FGNNIWESLWNNRYIDNVQITLSEKLG-VEERAVY 227


>CORDI Q6NHQ0 (Q6NHQ0) Putative acetyltransferase
          Length = 163

 Score = 31.6 bits (70), Expect = 5.7
 Identities = 18/53 (33%), Positives = 27/53 (50%), Gaps = 1/53 (1%)

Query: 104 EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRII 156
           EP Y  KG+G+  +    E L +   A  + L     NPRA + Y++ GF+ I
Sbjct: 98  EPRYRGKGVGSILLNKSLE-LARTLGAPGLSLSVDDGNPRAKKLYERLGFQHI 149


>BURMA Q62J98 (Q62J98) Ribosomal-protein-alanine acetyltransferase
           (EC 2.3.1.128)
          Length = 165

 Score = 31.6 bits (70), Expect = 5.7
 Identities = 15/43 (34%), Positives = 25/43 (58%), Gaps = 1/43 (2%)

Query: 111 GIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153
           G+G   ++      + ER  + V+L+   +NPRAIR Y++ GF
Sbjct: 87  GVGLALLREAVRIARAER-LDGVLLEVRPSNPRAIRLYERFGF 128


>THETN Q8R764 (Q8R764) LysM-repeat proteins and domains
          Length = 508

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 31/141 (21%), Positives = 53/141 (37%), Gaps = 23/141 (16%)

Query: 235 KGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEE 294
           KGY  E     F+    E    +  +  +Y+S E++ L  KE++  F        ++E+E
Sbjct: 381 KGYRDEYPFRTFVEIEGEVGEVLTEVSTAYVSYEINSL--KELEFKFAIDSCVEVLTEKE 438

Query: 295 QNLLKRDIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESF 354
             L+                 D+ E  +   + V    I+        L DI K Y  +
Sbjct: 439 MTLI----------------YDLKEIEMPRGEEVRHSIIIYMVQKGESLWDIAKRYRVNV 482

Query: 355 MERLNAT-----TVFEGKKCL 370
            + + A       VFEG+K +
Sbjct: 483 EDLITANDLKEDKVFEGEKLI 503


>STRR6 Q8DPX8 (Q8DPX8) Hypothetical protein spr0952
          Length = 253

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 22/106 (20%), Positives = 48/106 (45%), Gaps = 12/106 (11%)

Query: 261 EYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKRDIASFLRQMHGLD------- 313
           E SY+S   +++ Y+E+    + P       +E  + +   +    R++  L
Sbjct: 148 ELSYLS---TLIRYEELY--IINPNQARATPKEHHDFIVNHLVDNTRKLEELAIFERIQI 202

Query: 314 YTDISECTIDNKQNVLEEYILLRETIYNDLTDIEKDYIESFMERLN 359
           Y     C  D+K+N      +L+E ++ + + +EK+ ++   +RLN
Sbjct: 203 YQRDRSCVYDSKENTTSAADVLQELLFGEWSQVEKEMLQVGEKRLN 248


>STRMU Q8DVT1 (Q8DVT1) Putative ribosomal-protein-alanine
           acetyltransferase (EC 2.3.1.128)
          Length = 144

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 28/124 (22%), Positives = 51/124 (41%), Gaps = 20/124 (16%)

Query: 65  YNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYG---MDQFIGEPN---------YWSKGI 112
           Y   P    QI    + L  DY +   D+ + G   +   +GE           Y  +G+
Sbjct: 22  YQVSPWSQKQILTDMNRLDVDYFFAYDDKEIVGFLSIQHLVGELELTNIAIKKAYQGQGL 81

Query: 113 GTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDC 172
           G++ + ++       ++   + L+   +N  A   YQK GFR +    ++  +   KED
Sbjct: 82  GSQLLAML------TKDELPIFLEVRASNQAAQALYQKFGFRSLTTRKDY--YHNPKEDA 133

Query: 173 YLME 176
            LM+
Sbjct: 134 ILMK 137


>SALTI SPED_SALTI (Q8Z9E3) S-adenosylmethionine decarboxylase
           proenzyme (EC 4.1.1.50) (AdoMetDC) (SamDC) [Contains:
           S-adenosylmethionine decarboxylase beta chain;
           S-adenosylmethionine decarboxylase alpha chain]
          Length = 264

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 19/60 (31%), Positives = 29/60 (48%)

Query: 169 KEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTK 228
           + D   ++YR      +V  MK+ I+H  ++ +    E + S YD V   V  E IF TK
Sbjct: 157 ESDIVTIDYRVRGFTRDVNGMKHFIDHEINSIQNFMSEDMKSLYDMVDVNVYQENIFHTK 216


>SALTY SPED_SALTY (Q8ZRS4) S-adenosylmethionine decarboxylase
           proenzyme (EC 4.1.1.50) (AdoMetDC) (SamDC) [Contains:
           S-adenosylmethionine decarboxylase beta chain;
           S-adenosylmethionine decarboxylase alpha chain]
          Length = 264

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 19/60 (31%), Positives = 29/60 (48%)

Query: 169 KEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTK 228
           + D   ++YR      +V  MK+ I+H  ++ +    E + S YD V   V  E IF TK
Sbjct: 157 ESDIVTIDYRVRGFTRDVNGMKHFIDHEINSIQNFMSEDMKSLYDMVDVNVYQENIFHTK 216


>SALCH Q57T90 (Q57T90) S-adenosylmethionine decarboxylase, proenzyme
          Length = 264

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 19/60 (31%), Positives = 29/60 (48%)

Query: 169 KEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTK 228
           + D   ++YR      +V  MK+ I+H  ++ +    E + S YD V   V  E IF TK
Sbjct: 157 ESDIVTIDYRVRGFTRDVNGMKHFIDHEINSIQNFMSEDMKSLYDMVDVNVYQENIFHTK 216


>RICCN Q92JP8 (Q92JP8) Cell surface antigen
          Length = 1902

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 24/90 (26%), Positives = 41/90 (45%), Gaps = 2/90 (2%)

Query: 197  FDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVK 256
            F N K +  E+I S   S+         F  +   +  K + K+K  Y++ +T +++NVK
Sbjct: 1678 FKNSKNNDKELINSHVVSIYGQKELPKNFALQALVSASKNFIKDKTTYSYGDTKIKSNVK 1737

Query: 257  IPNIEYSYISDELSILGYKEIKGTFLTPEI 286
              N  +SY ++ L    Y       +TP I
Sbjct: 1738 HRN--HSYNAEALLHYNYLLQSKLVITPNI 1765


>NEIGI Q5FAH1 (Q5FAH1) Hypothetical protein
          Length = 177

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 18/45 (40%), Positives = 25/45 (55%), Gaps = 7/45 (15%)

Query: 215 VAYLVNNEYI-------FKTKFSTNKKKGYAKEKAIYNFLNTNLE 252
           + YL++NE +       FK  FSTN+KK    EK I  FL  N++
Sbjct: 69  IDYLISNEILIVRTKFSFKNIFSTNEKKYKEIEKEINKFLYKNMD 113


>LISIN Q92DJ7 (Q92DJ7) Lin0816 protein
          Length = 185

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 16/62 (25%), Positives = 28/62 (45%), Gaps = 1/62 (1%)

Query: 98  MDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE 157
           +D  +    Y   G+GT  +  + E    +     V L+  K NP A R Y++ GF +
Sbjct: 113 LDSIVTNEKYRGHGVGTALLAKLTEIAAND-GEKVVGLNCDKGNPHAKRLYERLGFHVTG 171

Query: 158 DL 159
           ++
Sbjct: 172 EI 173


>LACJO Q74J74 (Q74J74) Hypothetical protein
          Length = 150

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 21/58 (36%), Positives = 28/58 (48%), Gaps = 5/58 (8%)

Query: 104 EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE 161
           +P Y SKGI T  IK     L++      V L+   +N RA   Y+K GF  +  L E
Sbjct: 80  DPIYQSKGIATELIKKALTELERP-----VRLEVFTDNERAKALYRKFGFERVNTLTE 132


>GEOSL Q74A59 (Q74A59) Sensory box histidine kinase
          Length = 1053

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 41/188 (21%), Positives = 78/188 (41%), Gaps = 34/188 (18%)

Query: 176 EYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGS------------GYDSVAYLVNNE- 222
           E+RY D    V+A+K   E YF         ++GS            G D    LV+ E
Sbjct: 106 EHRYGD----VEALKSRYEAYFRKATELYPRVLGSTDTFLSGEIARLGADGRLILVDFER 161

Query: 223 ----YIFKTKFSTNKKKGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDELSILGYKEIK 278
               Y+   +    + +  A++ +IY F+   +   +  P I  +++++ L I   +E++
Sbjct: 162 MSRDYVTSVEHQIERNRALARDTSIYLFVLFGMVVLLAAPAI--TFVANRLLIRPLEELR 219

Query: 279 GTFLTPEIYSTMSEEEQNLLKRDI--------ASFLRQMHGLDYTDISECTIDNKQNVLE 330
           G   +   ++  S +   L   D         ASF   + GL  T +S   +DN    +
Sbjct: 220 GMVTS---FAGGSLDLSGLPDYDAGDEIGSLCASFRSMVEGLQETTVSRDYVDNIIESMS 276

Query: 331 EYILLRET 338
           + +++ +T
Sbjct: 277 DCLIVVDT 284


>ENTFA Q836Z2 (Q836Z2) Acetyltransferase, GNAT family
          Length = 173

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 19/75 (25%), Positives = 33/75 (44%), Gaps = 1/75 (1%)

Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPE-HELH 165
           YW  G+G+  ++ +  +  +      + L     N RAI  Y+K GF     +P   +
Sbjct: 99  YWGYGLGSILMEELIRWAHESHVIRRLELTVQDRNQRAIHVYKKLGFETEAIMPRGAKTD 158

Query: 166 EGKKEDCYLMEYRYD 180
           +G+  D +LM    D
Sbjct: 159 QGEFLDVHLMRLLID 173


>ENTFA Q82YJ4 (Q82YJ4) Toxin ABC transporter, ATP-binding/permease
           protein
          Length = 700

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 33/166 (19%), Positives = 68/166 (40%), Gaps = 7/166 (4%)

Query: 94  IVYGMDQFIG-EPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSG 152
           +V  ++QF   +  Y S  + +  ++ I   ++ E     +ILD    + R  R   K G
Sbjct: 439 VVSSLNQFGSFQAQYESMQVASHRLESILINMENENVCGEIILDKKIESIRCKRVSIKKG 498

Query: 153 FRIIEDLPEHELHEGKKEDCYLMEYRYDDNATNVKAMKYLIEHYFDNFKVDSIEIIGSGY 212
             ++ D    E++ GK  +  +        +T +K++  L + Y     +++I+I
Sbjct: 499 DTLLLDTVNCEIYRGK--NLSIRGENGSGKSTLIKSLVRLDDDYRGQILINNIDIKKINL 556

Query: 213 D----SVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNFLNTNLETN 254
           D     + ++  N    +     N   G+    +I+N L  + E N
Sbjct: 557 DCLRSKLVFVEPNPKFLEGTIRDNLLLGHKVPNSIFNKLIRDFEIN 602


>CLOAB Q97M70 (Q97M70) Predicted metal-dependent hydrolase
          Length = 259

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 13/39 (33%), Positives = 23/39 (58%), Gaps = 5/39 (12%)

Query: 47  KKHYTEPWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTD 85
           KK+Y E W  ++  + +EY      Y + YK++DE+Y +
Sbjct: 145 KKNYAEKWYKKIAAIELEYL-----YNEKYKIFDEIYDE 178


>CLOAB Q97HI5 (Q97HI5) Predicted flavodoxin
          Length = 180

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 23/76 (30%), Positives = 35/76 (46%), Gaps = 2/76 (2%)

Query: 119 LIFEFLKKERNANAVILDPHKNNP-RAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEY 177
           L+F  L K R AN  IL   + NP RA+  YQ S   I +++  +++ +G     Y +
Sbjct: 22  LMFSRLNKPRQANQKILKAKEANPKRALIVYQPSMSSITDEV-ANQIAKGLNTQGYEVTL 80

Query: 178 RYDDNATNVKAMKYLI 193
            Y  N  +     Y I
Sbjct: 81  NYPSNHLSTNVSDYSI 96


>BRAJA Q89R32 (Q89R32) Acyl-CoA dehydrogenase
          Length = 455

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 13/35 (37%), Positives = 23/35 (65%)

Query: 235 KGYAKEKAIYNFLNTNLETNVKIPNIEYSYISDEL 269
           K  AK++ ++NF   + ET   + N++Y+YI+ EL
Sbjct: 107 KNKAKKEGLWNFFLPDDETGQGLKNLDYAYIASEL 141


>BACC1 Q735F5 (Q735F5) Streptothricin acetyltransferase, putative
          Length = 184

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 30/122 (24%), Positives = 54/122 (44%), Gaps = 6/122 (4%)

Query: 37  RDKKYTLE---SLKKHYTEPWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDE 93
           R  +YT+E   S +K Y +   +E+     EY N P     I  +++++       K
Sbjct: 38  RHIEYTVEDVPSYEKSYLQNDNEEL--AYNEYINKPNQIIYIALLHNQIIGFIVLKKNWN 95

Query: 94  IVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153
               ++    +  Y + G+G R +    ++ K E N   ++L+   NN  A + Y+K GF
Sbjct: 96  HYAYIEDITVDKKYRTLGVGKRLVVQAKQWAK-EGNMPGIMLETQNNNVAACKFYEKCGF 154

Query: 154 RI 155
            I
Sbjct: 155 VI 156


>BACAN Q81YS9 (Q81YS9) Acetyltransferase, GNAT family
          Length = 288

 Score = 31.2 bits (69), Expect = 7.4
 Identities = 13/44 (29%), Positives = 25/44 (56%)

Query: 110 KGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGF 153
           KG+G R ++   +++   +    + L  + NN RA++ Y+K GF
Sbjct: 233 KGVGERLLQAAIQYIFSFQGMREIELCLNTNNDRAVKLYKKVGF 276


>VIBPA Q87HD3 (Q87HD3) Hypothetical protein VPA1032
          Length = 265

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 19/64 (29%), Positives = 31/64 (48%), Gaps = 3/64 (4%)

Query: 294 EQNLLKRDIASFLRQMHGLDYTDISECTIDNKQNVLEEYILLRETIYNDLTDIEK---DY 350
           E   L + + SF   M   DY  +SE  +  ++   E+  L  +T ++D+ DI+     Y
Sbjct: 96  ENEELTKSLVSFNLSMVSQDYEQVSELALQIEELRQEKGFLANDTSFSDVRDIDDRLGGY 155

Query: 351 IESF 354
           IE F
Sbjct: 156 IELF 159


>VIBCH Q9KL03 (Q9KL03) Spermidine n1-acetyltransferase
          Length = 173

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 20/83 (24%), Positives = 35/83 (42%), Gaps = 3/83 (3%)

Query: 100 QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
           Q I  P +  KG     I    ++     N + + L     NP+A+  Y++ GF     L
Sbjct: 86  QIIIAPEHQGKGFARTLINRALDYSFTILNLHKIYLHVAVENPKAVHLYEECGFVEEGHL 145

Query: 160 PEHELHEGKKED---CYLMEYRY 179
            E     G+ +D    Y+++ +Y
Sbjct: 146 VEEFFINGRYQDVKRMYILQSKY 168


>THEMA SYE2_THEMA (Q9X2I8) Glutamyl-tRNA synthetase 2 (EC 6.1.1.17)
           (Glutamate--tRNA ligase 2) (GluRS 2)
          Length = 487

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 16/44 (36%), Positives = 22/44 (50%)

Query: 325 KQNVLEEYILLRETIYNDLTDIEKDYIESFMERLNATTVFEGKK 368
           K N L +   +     ND  + EKDY+E F++R  A  V E  K
Sbjct: 369 KVNTLSQLYDIMYPFMNDDYEYEKDYVEKFLKREEAERVLEEAK 412


>THETN ISPH_THETN (Q8RA76) 4-hydroxy-3-methylbut-2-enyl diphosphate
           reductase (EC 1.17.1.2)
          Length = 288

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 14/55 (25%), Positives = 31/55 (56%)

Query: 115 RYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKK 169
           R I++ +E L K+++     L    +NP+ ++  ++ G R+IE+    +L +G +
Sbjct: 17  RAIEIAYEELNKQKDTRLYTLGEIIHNPQVVKDLEEKGVRVIEEEELEKLLKGDR 71


>STRCO Q9S2L5 (Q9S2L5) Hypothetical protein SCO1988
          Length = 183

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 18/47 (38%), Positives = 26/47 (55%), Gaps = 5/47 (10%)

Query: 111 GIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIE 157
           GIG R++ L      KER A+ + L   + N  A R Y++ GFR +E
Sbjct: 119 GIGDRFVALA-----KERRADGLSLWTFQVNAPARRFYERHGFRAVE 160


>STRP1 Q99XX8 (Q99XX8) Putative pullulanase
          Length = 1165

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 37/171 (21%), Positives = 62/171 (36%), Gaps = 31/171 (18%)

Query: 83  YTDYHY----PKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVILDPH 138
           YT Y+Y     +  E V  +D +      W+    T  IK           A A  +DP
Sbjct: 473 YTGYYYLYEITRGQEKVMVLDPYAKSLAAWNDATATDDIK----------TAKAAFIDPS 522

Query: 139 KNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEYRYDDNATNVKAMKYLIEHYFD 198
           K  P  +   + + F+             K+ED  + E    D  T+ KA++  + H F
Sbjct: 523 KLGPTGLDFAKINNFK-------------KREDAIIYEAHVRD-FTSDKALEGKLTHPFG 568

Query: 199 NFK--VDSIEIIGS-GYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIYNF 246
            F   V+ ++ +   G   V  L    Y +  +   ++   Y      YN+
Sbjct: 569 TFSAFVEQLDYLKDLGVTHVQLLPVLSYFYANELDKSRSTAYTSSDNNYNW 619


>STRP1 Q99YF0 (Q99YF0) Hypothetical protein SPy1734
           (Acetyltransferase) (EC 2.3.1.-)
          Length = 174

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 17/65 (26%), Positives = 34/65 (52%), Gaps = 1/65 (1%)

Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHE 166
           Y   GIG   +++  ++ ++     ++ LD    N +AI  Y+K GFR IE + ++++
Sbjct: 99  YRGYGIGQLLLEIALDWAEENPYIESLKLDVQVRNTKAIYLYKKYGFR-IESMRKNDIKS 157

Query: 167 GKKED 171
              +D
Sbjct: 158 KNGDD 162


>STAES Q8CTP7 (Q8CTP7) Hypothetical protein SE0368
          Length = 158

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 27/108 (25%), Positives = 44/108 (40%), Gaps = 18/108 (16%)

Query: 49  HYTEPWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYW 108
           H  +   +++F V  E + + +G+   +   +ELY   HY +              P
Sbjct: 48  HLKKRLNEQLFLVAEEDSEI-VGFAN-FIYGEELYLSAHYVR--------------PESQ 91

Query: 109 SKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRII 156
            +G GTR ++   +  K +     V L+   NN   I  YQ  GF II
Sbjct: 92  HRGYGTRLLEAGLKRFKDQYET--VYLEVDNNNSNGIEYYQNHGFEII 137


>STAES COAD_STAES (Q8CSZ5) Phosphopantetheine adenylyltransferase
           (EC 2.7.7.3) (Pantetheine-phosphate adenylyltransferase)
           (PPAT) (Dephospho-CoA pyrophosphorylase)
          Length = 161

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 21/111 (18%), Positives = 50/111 (45%), Gaps = 13/111 (11%)

Query: 185 NVKAMKYLIEHYFDNFKVDSIEIIGSGYDSVAYLVNNEYIFKTKFSTNKKKGYAKEKAIY 244
           +VK +  +  H+F+   VD  + +G+          +++ ++ + ++  KK
Sbjct: 59  SVKHLPNIQVHHFNGLLVDFCDQVGAKTIIRGLRAVSDFEYELRLTSMNKK--------- 109

Query: 245 NFLNTNLETNVKIPNIEYSYISDEL--SILGYKEIKGTFLTPEIYSTMSEE 293
             LN+N+ET   + +  YS+IS  +   +  Y+     F+ P +   + ++
Sbjct: 110 --LNSNIETMYMMTSANYSFISSSIVKEVAAYQADISPFVPPHVERALKKK 158


>MYCLEH Y3433_MYCTU (O06250) Hypothetical protein Rv3433c/MT3539
          Length = 473

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 12/30 (40%), Positives = 23/30 (76%)

Query: 130 ANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
           A+AV+L+P + + +A+ A+ KSG R++E +
Sbjct: 81  ADAVLLNPDRTHRKALAAFTKSGGRLVESV 110


>MYCTU Y3433_MYCTU (O06250) Hypothetical protein Rv3433c/MT3539
          Length = 473

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 12/30 (40%), Positives = 23/30 (76%)

Query: 130 ANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
           A+AV+L+P + + +A+ A+ KSG R++E +
Sbjct: 81  ADAVLLNPDRTHRKALAAFTKSGGRLVESV 110


>MYCBO Q7TWI6 (Q7TWI6) Hypothetical protein Mb3463c
          Length = 473

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 12/30 (40%), Positives = 23/30 (76%)

Query: 130 ANAVILDPHKNNPRAIRAYQKSGFRIIEDL 159
           A+AV+L+P + + +A+ A+ KSG R++E +
Sbjct: 81  ADAVLLNPDRTHRKALAAFTKSGGRLVESV 110


>LISMO Q8Y5C4 (Q8Y5C4) Lmo2141 protein
          Length = 157

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 28/102 (27%), Positives = 47/102 (46%), Gaps = 9/102 (8%)

Query: 76  YKMYDELYTDYHYPKTDEIVYGMDQFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVIL 135
           YK    L ++ H  + D  V+        P+Y   GIG   +  + E + +E+    + L
Sbjct: 63  YKSPIPLASNKHVAEIDIAVH--------PDYQRAGIGQLLMDKMKE-VAREKGYIKIAL 113

Query: 136 DPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEY 177
                N +AIR Y+K+GF+    L +  + +G+  D  LM Y
Sbjct: 114 RVLSINQKAIRFYEKNGFKQEGLLEKEFIIQGEFVDDILMAY 155


>LISIN Q929Z8 (Q929Z8) Lin2125 protein
          Length = 231

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 25/89 (28%), Positives = 38/89 (42%), Gaps = 15/89 (16%)

Query: 8   ICIRTLIDDDFPLMLKWLTDERVLEFYGGRDKK----YTLESLKKHYTEP--WEDEVFRV 61
           + ++TL+    P  + WL DE    F  G        Y L ++   +T P  W+  V  +
Sbjct: 107 LVLKTLVARTRPDSVNWLIDESGFSFPSGHATATAVFYGLAAMFLIFTVPKMWQKIVIGI 166

Query: 62  IIEYNNVPIGYGQI-YKMYDELYTDYHYP 89
                   IGYG I + MY  +Y   H+P
Sbjct: 167 --------IGYGFILFVMYTRVYLGVHFP 187


>ENTFA Q837Z5 (Q837Z5) Acetyltransferase, GNAT family
          Length = 184

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 15/45 (33%), Positives = 24/45 (53%)

Query: 110 KGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154
           +G G   + LI +F   E   + + L  + NN +AI  Y+K GF+
Sbjct: 104 QGCGFEAVSLICKFAFYELGLHKIRLAVNSNNQKAIHVYEKVGFK 148


>ENTFA Q831M9 (Q831M9) Ribosomal-protein-alanine acetyltransferase,
           putative
          Length = 154

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 15/45 (33%), Positives = 28/45 (62%), Gaps = 1/45 (2%)

Query: 110 KGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFR 154
           +GIG + +K   E++K  R+   + L+  ++N  A + Y+K+GFR
Sbjct: 83  QGIGCQLMKAFKEYVKS-RDITQIFLEVRESNILAQKLYEKTGFR 126


>CLOTE Q895T0 (Q895T0) Acetyltransferase (EC 2.3.1.-)
          Length = 165

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 17/67 (25%), Positives = 37/67 (55%), Gaps = 3/67 (4%)

Query: 107 YWSKGIGTRYIKLIFEFLKKERNANAVILDPHKNNPRAIRAYQKSGFRIIEDLPEHELHE 166
           Y  +G+GT+ +  I + L K++  + +  D +  NP+    +QK G+  + ++  + L++
Sbjct: 97  YRHQGVGTKLLSYI-KTLAKDKKIHLIKSDTYSLNPKMNALFQKCGYEKVGEI--NLLNK 153

Query: 167 GKKEDCY 173
             K +CY
Sbjct: 154 PYKFNCY 160


>CLOPE Q8XMF9 (Q8XMF9) Hypothetical protein CPE0730
          Length = 154

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 39/162 (24%), Positives = 66/162 (40%), Gaps = 30/162 (18%)

Query: 28  ERVLEFYGGRDKKYTLESLKKHYTEPWEDEVFRVIIEYNNVPIGYGQIYKMYDELYTDYH 87
           ERVLE      K   ++ + K + E   +  + +  EY          +K   E+  D +
Sbjct: 3   ERVLEIR--EPKNCEIDDIMKIWLESTVEAHYFIEEEY----------WKKNYEVVRDIY 50

Query: 88  YPKTDEIVYGMD------------QFIGEPNYWSKGIGTRYIKLIFEFLKKERNANAVIL 135
            P     VY  +             FIG     +K  G+   K + E++K +     + L
Sbjct: 51  IPMAKTFVYCDEGKINGFISIIDSNFIGALFVHTKSQGSGIGKSLLEYVKNKYEN--IEL 108

Query: 136 DPHKNNPRAIRAYQKSGFRIIEDLPEHELHEGKKEDCYLMEY 177
             +K+N +A+  Y+K  F+II++    +   G  E  YLM Y
Sbjct: 109 AVYKDNKKAVEFYKKHDFKIIKEQENED--SGHLE--YLMSY 146


>BRUSU Q8FY73 (Q8FY73) Outer membrane autotransporter
          Length = 1593

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 18/64 (28%), Positives = 32/64 (50%), Gaps = 3/64 (4%)

Query: 414  EEEIGTNFGEDILRMYGNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFIENGRKEIY 473
            E  +  N  + ++   G + ++    +QD   +   + T +YGI NI QEF+ NGR  +
Sbjct: 1478 EANVSLNDSDSLIGRAG-VALDYRNAWQDDAGQI--VHTNIYGIANIYQEFMGNGRVGVA 1534

Query: 474  KRTY 477
              T+
Sbjct: 1535 DTTF 1538


>BACSU YCBJ_BACSU (P42242) Hypothetical protein ycbJ
          Length = 306

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 36/179 (20%), Positives = 66/179 (36%), Gaps = 26/179 (14%)

Query: 300 RDIASFLRQMHGLDYTDISECTI------DNKQNVLEEYILLRETIYNDLTDIEKDYIES 353
           R +A  L ++HG D     +  I      D +Q   +  + ++  +      +     E
Sbjct: 129 RTLADILAELHGTDQISAGQSGIEVIRPEDFRQMTADSMVDVKNKL-----GVSTTLWER 183

Query: 354 FMERLNATTVFEGKKCLCHNDFSCNHLLLDGNNRLTXXXXXXXXXXXXEYCDFIYLLEDS 413
           + + ++    + G   L H D    H+L+D N R+T               DF+
Sbjct: 184 WQKWVDDDAYWPGFSSLIHGDLHPPHILIDQNGRVTGLLDWTEAKVADPAKDFVL----- 238

Query: 414 EEEIGTNFGED----ILRMY---GNIDIEKAKEYQDIVEEYYPIETIVYGIKNIKQEFI 465
                T FGE     +L  Y   G     K +E+   ++  YP+E     ++  ++E I
Sbjct: 239 ---YQTIFGEKETARLLEYYDQAGGRIWAKMQEHISEMQAAYPVEIAKLALQTQQEEHI 294


>BACHD Q9KE57 (Q9KE57) BH1001 protein
          Length = 448

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 32/119 (26%), Positives = 54/119 (45%), Gaps = 21/119 (17%)

Query: 272 LGYKEIKGTFLTPEIYSTMSEEEQNLL------------KRDIASFLRQMHGLDYTDISE 319
           LG+K  +GT L  ++  TMS EE  +               D   F  +++G + T ++E
Sbjct: 306 LGFKVERGTLLESKVELTMSFEEDGISFDVGMSVDSTYNYDDAVEF--KLYGQERTTLTE 363

Query: 320 CTIDNKQNVLEEYILLRETIYND-LTDIEKDYIESFM--ERLNATTVFEGKKCLCHNDF 375
             +D   ++  E     E++ ND L D ++DY E  +  E L      E ++ + H DF
Sbjct: 364 AELD---DLTYEINWELESLVNDLLADFQEDYYEEELSEEDLALLAAIEAQE-VSHEDF 418


>BACC1 Q73CD9 (Q73CD9) Glycerol-3-phosphate dehydrogenase, aerobic
           (EC 1.1.99.5)
          Length = 560

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 19/52 (36%), Positives = 26/52 (50%), Gaps = 2/52 (3%)

Query: 146 RAYQKSGFRIIEDLPEHEL--HEGKKEDCYLMEYRYDDNATNVKAMKYLIEH 195
           R+ ++  F   E L +  L   EG K   Y +EYR DD    ++ MK  IEH
Sbjct: 133 RSERRKMFNREETLKKEPLVKQEGLKGGGYYVEYRTDDARLTIEVMKEAIEH 184


>BACAN Q81VL8 (Q81VL8) Oxidoreductase, FAD-binding
          Length = 471

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 28/96 (29%), Positives = 41/96 (42%), Gaps = 12/96 (12%)

Query: 244 YNFLNTNLETNVKIPNIEYSYISDELSILGYKEIKGTFLTPEIYSTMSEEEQNLLKRDIA 303
           Y      L+  +K+ N E         +L YKE    F   E     +    +L +  +A
Sbjct: 188 YGLFGVILDVTLKLTNDEL--YETHTKMLDYKEYTSYF--KEKVKKDANVRMHLARISVA 243

Query: 304 --SFLRQMHGLDYTDISECTIDNKQNVLEEYILLRE 337
             SFLR+M+  DY      T+   QN+ EEY  L+E
Sbjct: 244 PNSFLREMYVTDY------TLAQNQNMREEYSELKE 273


>BACAN Q81U57 (Q81U57) Glycerol-3-phosphate dehydrogenase, aerobic
          Length = 560

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 19/52 (36%), Positives = 26/52 (50%), Gaps = 2/52 (3%)

Query: 146 RAYQKSGFRIIEDLPEHEL--HEGKKEDCYLMEYRYDDNATNVKAMKYLIEH 195
           R+ ++  F   E L +  L   EG K   Y +EYR DD    ++ MK  IEH
Sbjct: 133 RSERRKMFNREETLKKEPLVKQEGLKGGGYYVEYRTDDARLTIEVMKEAIEH 184


>BACAN Q81NU7 (Q81NU7) Acetyltransferase, GNAT family
          Length = 153

 Score = 30.8 bits (68), Expect = 9.7
 Identities = 20/79 (25%), Positives = 34/79 (43%), Gaps = 14/79 (17%)

Query: 86  YHYPKTDEIVYGMDQFIGEPN--------------YWSKGIGTRYIKLIFEFLKKERNAN 131
           Y   K D+IV G   F G+PN              YW+KG  T  ++ + ++  +
Sbjct: 52  YVIRKEDDIVLGDIGFKGKPNEEHTVEVGYGFIEKYWNKGYATEAVQELIDWAFQTGEVE 111

Query: 132 AVILDPHKNNPRAIRAYQK 150
            +I +   +N  +IR  +K
Sbjct: 112 TIIAETLLDNYGSIRVLEK 130


  Database: Blastdata.fdb
    Posted date:  Mar 29, 2006  3:30 PM
  Number of letters in database: 77,468,597
  Number of sequences in database:  240,170

Lambda     K      H
   0.318    0.139    0.409

Gapped
Lambda     K      H
   0.267   0.0410    0.140


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Hits to DB: 72,017,968
Number of Sequences: 240170
Number of extensions: 3196375
Number of successful extensions: 9166
Number of sequences better than 10.0: 203
Number of HSP's better than 10.0 without gapping: 69
Number of HSP's successfully gapped in prelim test: 134
Number of HSP's that attempted gapping in prelim test: 8848
Number of HSP's gapped (non-prelim): 424
length of query: 479
length of database: 77,468,597
effective HSP length: 115
effective length of query: 364
effective length of database: 49,849,047
effective search space: 18145053108
effective search space used: 18145053108
T: 11
A: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 68 (30.8 bits)
BLASTP 2.2.10 [Oct-19-2004]


From mdehoon at c2b2.columbia.edu  Wed Apr 19 16:54:33 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Wed, 19 Apr 2006 12:54:33 -0400
Subject: [BioPython] Need help parsing Blastoutput
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEF7@cgcmail.cgc.cpmc.columbia.edu>

The Blast parser fails to read your file because the format of Blast output
has changed. If I edit the data file so that it corresponds to the old format
(add a space here, remove a blank line there, etc.), the Blast parser reads
the file without problems. The easiest solution is to repeat the Blast run,
using XML for the output format, and use the Blast XML parser in Biopython to
parse the results.
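
For the coverage question that started this thread, the subject start and end appear in the XML output as the Hsp_hit-from and Hsp_hit-to elements. Below is a minimal sketch, assuming the legacy NCBI BLAST XML layout (Hit, Hit_len, Hsp, Hsp_hit-from, Hsp_hit-to) and using only the Python standard library rather than the Biopython XML parser. The helper name subject_coverage and the embedded sample are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed sample in the legacy BLAST XML layout.
SAMPLE_XML = """<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_def>example subject</Hit_def>
          <Hit_len>200</Hit_len>
          <Hit_hsps>
            <Hsp><Hsp_hit-from>11</Hsp_hit-from><Hsp_hit-to>60</Hsp_hit-to></Hsp>
            <Hsp><Hsp_hit-from>41</Hsp_hit-from><Hsp_hit-to>110</Hsp_hit-to></Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""


def subject_coverage(hit):
    """Return the percentage of the subject covered by at least one HSP."""
    hit_len = int(hit.findtext("Hit_len"))
    spans = []
    for hsp in hit.iter("Hsp"):
        start = int(hsp.findtext("Hsp_hit-from"))
        end = int(hsp.findtext("Hsp_hit-to"))
        # Normalise so that start <= end even for reverse-strand hits.
        spans.append((min(start, end), max(start, end)))
    covered = 0
    last_end = 0  # highest subject position counted so far (1-based)
    for start, end in sorted(spans):
        start = max(start, last_end + 1)  # skip positions already counted
        if end >= start:
            covered += end - start + 1
            last_end = end
    return 100.0 * covered / hit_len


root = ET.fromstring(SAMPLE_XML)
for hit in root.iter("Hit"):
    print(hit.findtext("Hit_def"), subject_coverage(hit))
```

Overlapping HSPs are merged before counting, so a region hit twice is not counted twice. Query coverage could be computed the same way from Hsp_query-from, Hsp_query-to and the query length.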

A general question is if anybody still needs the parser for Blast text
output. Currently, we are confusing our users by having a Blast text parser
that tends to break. A broken parser may be worse than no parser.

--Michiel.

Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032


-----Original Message-----
From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
Sent: Wed 4/19/2006 6:15 AM
To: Michiel De Hoon
Cc: biopython at lists.open-bio.org
Subject: RE: [BioPython] Need help parsing Blastoutput
 
Hi,
Please see the attachment; it is part of my Blast output.
Yes, I am trying to parse text output from Blast. I used another script to
run my local Blast, and I am now trying to parse its output. The
NCBIStandalone.BlastParser was working fine without hsp.sbject_end, which
is one of the values I need to print out.
On checking the class diagrams in the cookbook, I found that sbject_end is
not included. I just need another way of printing int(subject end).
Thanks for your help,
Halimah

On Tue, 18 Apr 2006, Michiel De Hoon wrote:

> Could you also send us the file Enterococcus_out so we can run the script?
> 
> From the script, it looks like you're trying to parse text output from
> Blast. While this is possible (in theory), the format of Blast text output
> tends to change a lot, thereby breaking the parser in Biopython. It is more
> reliable to have Blast generate output in XML format, and use the XML parser:
> 
> blast_out = open('my_blast.xml', 'r')
> 
> from Bio.Blast import NCBIXML
> 
> b_parser = NCBIXML.BlastParser()
> b_record = b_parser.parse(blast_out)
> 
> See section 3.1.2 in the Biopython cookbook, and section 3.1.4 on how to
> generate Blast output in XML.
> 
> --Michiel.
> 
> 
> 
> Michiel de Hoon
> Center for Computational Biology and Bioinformatics
> Columbia University
> 1150 St Nicholas Avenue
> New York, NY 10032
> 
> 
> 
> -----Original Message-----
> From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
> Sent: Tue 4/18/2006 11:06 AM
> To: Michiel De Hoon
> Cc: biopython at lists.open-bio.org
> Subject: RE: [BioPython] Need help parsing Blastoutput
>  
> Thanks.
> Please see the attachment: a copy of my script and a copy of my Blast output.
> Thanks
> 
> 
> On Thu, 13 Apr 2006, Michiel De Hoon wrote:
> 
> > Could you send us the script you were using?
> > 
> > --Michiel.
> > 
> > Michiel de Hoon
> > Center for Computational Biology and Bioinformatics
> > Columbia University
> > 1150 St Nicholas Avenue
> > New York, NY 10032
> > 
> > 
> > 
> > -----Original Message-----
> > From: biopython-bounces at lists.open-bio.org on behalf of Halima Rabiu
> > Sent: Thu 4/13/2006 11:07 AM
> > To: biopython at lists.open-bio.org
> > Subject: [BioPython] Need help parsing Blastoutput
> >  
> > Hi All,
> > I have BLAST output from a local BLAST run.
> > I need to calculate my % alignment coverage with respect to my subject.
> > I tried parsing the BLAST output and wanted to print the
> > Sbjct start and Sbjct end, but I could not. Is there any way I could
> > get the match coverage between my query and subject? I don't need
> > Identities, but the total % alignment for the query or subject.
> > Thanks
> > Halimah
> > 
> > _______________________________________________
> > BioPython mailing list  -  BioPython at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/biopython
> > 
> > 
> 
> 


From elventear at gmail.com  Thu Apr 20 01:02:30 2006
From: elventear at gmail.com (Pepe Barbe)
Date: Wed, 19 Apr 2006 20:02:30 -0500
Subject: [BioPython] Parsing and Creating Dictionaries of GenBank files
Message-ID: <3e73596b0604191802g3807f101jb99e41f95208122e@mail.gmail.com>

Hello,

Following the simple steps in the BioPython cookbook, I wanted to
create a dictionary with the following GenBank file:

ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/Escherichia_coli_K12/NC_000913.gbk

Below you can find what I tried executing and the error I got. I would
appreciate any insight into solving the error and correctly producing
the dictionary.

Thanks!
Pepe
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> dict_file = 'NC_000913.gbk'
>>> index_file = 'NC_000913.idx'
>>> from Bio import GenBank
>>> GenBank.index_file(dict_file, index_file)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/sw/lib/python2.4/site-packages/Bio/GenBank/__init__.py", line 1283, in index_file
    SimpleSeqRecord.create_flatdb([filename], indexname, indexer)
  File "/sw/lib/python2.4/site-packages/Bio/Mindy/SimpleSeqRecord.py", line 152, in create_flatdb
    creator.load(filename, builder = builder, fileid_info = {})
  File "/sw/lib/python2.4/site-packages/Bio/Mindy/BaseDB.py", line 36, in load
    raise TypeError("Cannot identify file as a %s format" %
TypeError: Cannot identify file as a unknown format


From biopython at maubp.freeserve.co.uk  Thu Apr 20 12:42:34 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython))
Date: Thu, 20 Apr 2006 13:42:34 +0100
Subject: [BioPython] Parsing and Creating Dictionaries of GenBank files
In-Reply-To: <3e73596b0604191802g3807f101jb99e41f95208122e@mail.gmail.com>
References: <3e73596b0604191802g3807f101jb99e41f95208122e@mail.gmail.com>
Message-ID: <444781BA.8080107@maubp.freeserve.co.uk>

Pepe Barbe wrote:
> Hello,
> 
> Following the simple steps in the BioPython cookbook, I wanted to
> create a dictionary with the following GenBank file:
> 
> ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/Escherichia_coli_K12/NC_000913.gbk
> 
> Below you can find what I tried executing and the error I got. I would
> appreciate any insight into solving the error and correctly producing
> the dictionary.

The cookbook tutorial is a little misleading in that regard.  Indexing a
GenBank file only makes sense for files that contain multiple GenBank
records (i.e. multiple LOCUS lines).

For example, you can get multi-record GenBank files with records for 
different genes.  These tend to be small records, and the Martel based 
indexing code copes fine.  It doesn't cope very well with large records 
like genomes.

Your example (and, in my experience, every bacterial genome) has just a
single very large record (which will contain many features).
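
A quick way to see which case a file falls into is to count its LOCUS lines before trying to index it. Below is a minimal sketch using only the standard library. count_genbank_records is a hypothetical helper, not part of Biopython, and it assumes every record starts with a line beginning with LOCUS.

```python
import io


def count_genbank_records(handle):
    """Count records in a GenBank flat file by counting LOCUS header lines."""
    return sum(1 for line in handle if line.startswith("LOCUS"))


# Hypothetical two-record file, trimmed to the lines that matter here.
sample = io.StringIO(
    "LOCUS       REC1   10 bp    DNA\nORIGIN\n        1 acgtacgtac\n//\n"
    "LOCUS       REC2    8 bp    DNA\nORIGIN\n        1 acgtacgt\n//\n"
)
print(count_genbank_records(sample))
```

A count of one means the file holds a single record, such as a whole genome, and indexing it gains nothing.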

Does this page help?

http://www2.warwick.ac.uk/fac/sci/moac/currentstudents/peter_cock/python/genbank/

I did suggest a change to the documentation but it looks like no one has 
made the change...

http://biopython.org/pipermail/biopython-dev/2005-November/002193.html

I had forgotten to chase this up.

Peter


From alpersoyler at yahoo.com  Thu Apr 20 12:59:57 2006
From: alpersoyler at yahoo.com (alper soyler)
Date: Thu, 20 Apr 2006 05:59:57 -0700 (PDT)
Subject: [BioPython] Need help!!!
Message-ID: <20060420125957.72247.qmail@web36804.mail.mud.yahoo.com>

Hi All,
 
 I am new to Biopython and have a question. I want to construct a phylogenetic profile for one organism's proteins. I want to BLAST my protein against a single organism's genome (e.g. Homo sapiens) instead of the whole GenBank database. How can I do this? Thank you in advance.
 
 regards,
 Alper
 
		


From cy at cymon.org  Thu Apr 20 13:41:46 2006
From: cy at cymon.org (Cymon J. Cox)
Date: Thu, 20 Apr 2006 14:41:46 +0100
Subject: [BioPython] Need help!!!
In-Reply-To: <20060420125957.72247.qmail@web36804.mail.mud.yahoo.com>
References: <20060420125957.72247.qmail@web36804.mail.mud.yahoo.com>
Message-ID: <1145540506.11610.17.camel@clintonite.nhm.ac.uk>

Hi Alper,

On Thu, 2006-04-20 at 05:59 -0700, alper soyler wrote:
> Hi All,
>  
>  I am new to Biopython and have a question. I want to construct a phylogenetic
>  profile for one organism's proteins. I want to give my protein to blast to
>  search one organism's genome (e.g. Homo sapiens) instead of whole genbank
>  database. How can I solve my problem? Thank you in advance.

Assuming you want to do this locally, you'll need to download your target
genome, format it with 'formatdb' (a program from the BLAST distribution),
and then feed your query and the newly formatted BLAST database to
Bio.Blast.NCBIStandalone.

See http://biopython.org/docs/tutorial/Tutorial004.html#toc10
3.1.4  Running BLAST locally

for details,
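
The two command-line steps can be sketched as follows. This is a minimal sketch that only builds the command lines for the legacy NCBI tools (formatdb and blastall), not the Bio.Blast.NCBIStandalone wrapper mentioned above, and leaves running them (for example with subprocess) to the reader. The file names are placeholders, and tblastn is assumed because the query is a protein and the database a nucleotide genome.

```python
def formatdb_command(fasta_path, protein=False):
    """Build a formatdb command line for a FASTA file (-p F means nucleotide)."""
    return ["formatdb", "-i", fasta_path, "-p", "T" if protein else "F"]


def blastall_command(program, database, query_path, out_path):
    """Build a blastall command line that writes XML output (-m 7)."""
    return ["blastall", "-p", program, "-d", database,
            "-i", query_path, "-o", out_path, "-m", "7"]


# Placeholder file names, not taken from the thread.
print(" ".join(formatdb_command("genome.fasta")))
print(" ".join(blastall_command("tblastn", "genome.fasta",
                                "query.fasta", "result.xml")))
```

Asking for XML output with -m 7 keeps the result readable by the XML parser rather than the more fragile text parser.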

Cheers, Cymon
____________________________________________________________________

Cymon J. Cox

Biometry and Molecular Research
Department of Zoology
Natural History Museum
Cromwell Road
London, SW7 5BD

Email: cy at cymon.org, c.cox at nhm.ac.uk, cymon.cox at googlemail.com 
Phone : +44 (0)20 7942 6981
HomePage : http://www.duke.edu/~cymon

-8.63/-6.77
_____________________________________________________________________
Fedora Core release 4 (Stentz) clintonite.nhm.ac.uk 14:35:55 up 13 days,
20:42, 8 users, load average: 0.08, 0.16, 0.12


From mcolosimo at mitre.org  Thu Apr 20 14:23:19 2006
From: mcolosimo at mitre.org (Marc Colosimo)
Date: Thu, 20 Apr 2006 10:23:19 -0400
Subject: [BioPython] Parsing and Creating Dictionaries of GenBank files
In-Reply-To: <444781BA.8080107@maubp.freeserve.co.uk>
References: <3e73596b0604191802g3807f101jb99e41f95208122e@mail.gmail.com>
	<444781BA.8080107@maubp.freeserve.co.uk>
Message-ID: <65CA5BE4-1C83-4FD7-B998-C97BCF9AA6DE@mitre.org>

While we are on the subject of parsing multiple GenBank files and the  
Cookbook, I think a better example (and more pythonish) is the  
following:

from Bio import GenBank

gb_file = "my_file.gb"
gb_handle = open(gb_file, 'r')

feature_parser = GenBank.FeatureParser()

gb_iterator = GenBank.Iterator(gb_handle, feature_parser)

for cur_record in gb_iterator:
    # now do something with the record
    print cur_record.seq

which is way nicer (and uses iterators as per PEP 234) than

while 1:
    cur_record = gb_iterator.next()

    if cur_record is None:
        break

    # now do something with the record
    print cur_record.seq

Actually, the above works with the Fasta iterator as well.

Times for a GenBank file with 72,358 records (LOCUSs):
my way (using iterators): 14m16.886s
cookbook way (using next and if):  14m28.547s

Surprisingly, this isn't much faster (maybe with -O it would be)
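
The for loop works on any object that implements the iterator protocol from PEP 234. Below is a minimal sketch of the same pattern with a plain generator, using only the standard library. iter_flatfile_records is a hypothetical stand-in for GenBank.Iterator and simply splits the input on the // terminator lines without parsing the records.

```python
import io


def iter_flatfile_records(handle):
    """Yield each record as a string, splitting on the '//' terminator line."""
    lines = []
    for line in handle:
        lines.append(line)
        if line.rstrip() == "//":
            yield "".join(lines)
            lines = []
    if any(text.strip() for text in lines):
        # Trailing text without a terminator is still returned as a record.
        yield "".join(lines)


sample = io.StringIO("LOCUS       A\n//\nLOCUS       B\n//\n")
for record in iter_flatfile_records(sample):
    # now do something with the record
    print(record.split()[1])
```

Because the generator reads one record at a time, memory use stays flat however many records the file holds.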

Marc

On Apr 20, 2006, at 8:42 AM, Peter (BioPython) wrote:

> Pepe Barbe wrote:
>> Hello,
>>
>> Following the simple steps in the BioPython cookbook, I wanted to
>> create a dictionary with the following GenBank file:
>>
>> ftp://ftp.ncbi.nlm.nih.gov/genomes/Bacteria/Escherichia_coli_K12/NC_000913.gbk
>>
>> Below you can find what I tried executing and the error I got. I would
>> appreciate any insight into solving the error and correctly producing
>> the dictionary.
>
> The cookbook tutorial is a little misleading in that regard.  Indexing a
> GenBank file only makes sense for those files with multiple genbank
> record (i.e. multiple LOCUS lines).
>
> For example, you can get multi-record GenBank files with records for
> different genes.  These tend to be small records, and the Martel based
> indexing code copes fine.  It doesn't cope very well with large records
> like genomes.
>
> Your example (and in my experience all Bacterial Genomes) have just a
> single very large record (which will contain many features).
>
> Does this page help?
>
> http://www2.warwick.ac.uk/fac/sci/moac/currentstudents/peter_cock/python/genbank/
>
> I did suggest a change to the documentation but it looks like no one has
> made the change...
>
> http://biopython.org/pipermail/biopython-dev/2005-November/002193.html
>
> I had forgotten to chase this up.
>
> Peter
>
> _______________________________________________
> BioPython mailing list  -  BioPython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython


From elventear at gmail.com  Thu Apr 20 16:11:42 2006
From: elventear at gmail.com (Pepe Barbe)
Date: Thu, 20 Apr 2006 11:11:42 -0500
Subject: [BioPython] Parsing and Creating Dictionaries of GenBank files
In-Reply-To: <444781BA.8080107@maubp.freeserve.co.uk>
References: <3e73596b0604191802g3807f101jb99e41f95208122e@mail.gmail.com>
	<444781BA.8080107@maubp.freeserve.co.uk>
Message-ID: <3e73596b0604200911i2e2c481bj306c5d282cae5c75@mail.gmail.com>

On 4/20/06, Peter (BioPython) <biopython at maubp.freeserve.co.uk> wrote:
>
> The cookbook tutorial is a little misleading in that regard.  Indexing a
> GenBank file only makes sense for those files with multiple genbank
> record (i.e. multiple LOCUS lines).
<snip>
> Your example (and in my experience all Bacterial Genomes) have just a
> single very large record (which will contain many features).
>
> Does this page help?
>
> http://www2.warwick.ac.uk/fac/sci/moac/currentstudents/peter_cock/python/genbank/

It does help a lot. Thanks!

As an aside, while what I was doing wasn't exactly what I was looking
for, I think it was crashing because of a bug in 1.41. I installed the
latest CVS and it works normally now.

Pepe


From halima at mancala.cbio.uct.ac.za  Thu Apr 20 11:57:20 2006
From: halima at mancala.cbio.uct.ac.za (Halima Rabiu)
Date: Thu, 20 Apr 2006 13:57:20 +0200 (SAST)
Subject: [BioPython] Need help parsing Blastoutput
In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEF7@cgcmail.cgc.cpmc.columbia.edu>
References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEF7@cgcmail.cgc.cpmc.columbia.edu>
Message-ID: <Pine.LNX.4.58.0604201350220.10334@mancala.cbio.uct.ac.za>

Thanks. I tried using the XML parser and I am still getting errors which I
don't understand. Please see the attached copy of my script and Blast XML
output.
Here is the error:
Traceback (most recent call last):
  File "Bioperser.py", line 11, in ?
    b_record = b_parser.parse(b_out)
  File "/usr/local/lib/python2.4/site-packages/Bio/Blast/NCBIXML.py", line 112, in parse
    self._parser.parse(handler)
  File "/usr/local//lib/python2.4/xml/sax/expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/local//lib/python2.4/xml/sax/xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "/usr/local//lib/python2.4/xml/sax/expatreader.py", line 211, in feed
    self._err_handler.fatalError(exc)
  File "/usr/local//lib/python2.4/xml/sax/handler.py", line 38, in fatalError
    raise exception
thanks
Halimah

On Wed, 19 Apr 2006, Michiel De Hoon wrote:

> The Blast parser fails to read your file because the format of Blast output
> has changed. If I edit the data file so that it corresponds to the old format
> (add a space here, remove a blank line there, etc.), the Blast parser reads
> the file without problems. The easiest solution is to repeat the Blast run,
> using XML for the output format, and use the Blast XML parser in Biopython to
> parse the results.
> 
> A general question is if anybody still needs the parser for Blast text
> output. Currently, we are confusing our users by having a Blast text parser
> that tends to break. A broken parser may be worse than no parser.
> 
> --Michiel.
> 
> Michiel de Hoon
> Center for Computational Biology and Bioinformatics
> Columbia University
> 1150 St Nicholas Avenue
> New York, NY 10032
> 
> 
> 
> -----Original Message-----
> From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
> Sent: Wed 4/19/2006 6:15 AM
> To: Michiel De Hoon
> Cc: biopython at lists.open-bio.org
> Subject: RE: [BioPython] Need help parsing Blastoutput
>  
> Hi
> Please see the attachment; it is part of my Blast output.
> Yes, I am trying to parse text output from Blast. I used another script to
> run my local blast, and I am trying to parse its output. The
> NCBIStandalone.BlastParser was working fine without hsp.sbject_end, which
> is one of the things I need to print out.
> On checking the class diagrams in the cookbook, I found that sbject_end is
> not included. I just need another way of printing the int(subject end).
> Thanks for your help
> Halimah
> 
> On Tue, 18 Apr 2006, Michiel De Hoon wrote:
> 
> > Could you also send us the file Enterococcus_out so we can run the script?
> > 
> > From the script, it looks like you're trying to parse text output from
> > Blast. While this is possible (in theory), the format of Blast text
> > output tends to change a lot, thereby breaking the parser in Biopython.
> > It is more reliable to have Blast generate output in XML format, and use
> > the XML parser:
> > 
> > blast_out = open('my_blast.xml', 'r')
> > 
> > from Bio.Blast import NCBIXML
> > 
> > b_parser = NCBIXML.BlastParser()
> > b_record = b_parser.parse(blast_out)
> > 
> > See section 3.1.2 in the Biopython cookbook, and section 3.1.4 on how to
> > generate Blast output in XML.
> > 
> > --Michiel.
> > 
> > 
> > 
> > Michiel de Hoon
> > Center for Computational Biology and Bioinformatics
> > Columbia University
> > 1150 St Nicholas Avenue
> > New York, NY 10032
> > 
> > 
> > 
> > -----Original Message-----
> > From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
> > Sent: Tue 4/18/2006 11:06 AM
> > To: Michiel De Hoon
> > Cc: biopython at lists.open-bio.org
> > Subject: RE: [BioPython] Need help parsing Blastoutput
> >  
> > Thanks,
> > please see the attachment: a copy of my script and a copy of my Blast output.
> > Thanks
> > 
> > 
> > On Thu, 13 Apr 2006, Michiel De Hoon wrote:
> > 
> > > Could you send us the script you were using?
> > > 
> > > --Michiel.
> > > 
> > > Michiel de Hoon
> > > Center for Computational Biology and Bioinformatics
> > > Columbia University
> > > 1150 St Nicholas Avenue
> > > New York, NY 10032
> > > 
> > > 
> > > 
> > > -----Original Message-----
> > > From: biopython-bounces at lists.open-bio.org on behalf of Halima Rabiu
> > > Sent: Thu 4/13/2006 11:07 AM
> > > To: biopython at lists.open-bio.org
> > > Subject: [BioPython] Need help parsing Blastoutput
> > >  
> > > Hi All,
> > > I have BLAST output from a local blast run.
> > > I need to calculate my % alignment coverage with regard to my subject.
> > > I tried to parse the blast output and wanted to print the
> > > sbjct start and sbjct end, but I could not. Is there any way I could
> > > get the match coverage between my query and subject? I don't need
> > > identities, but the total % alignment for the query or subject.
> > > Thanks
> > > Halimah
> > > 
> > > _______________________________________________
> > > BioPython mailing list  -  BioPython at lists.open-bio.org
> > > http://lists.open-bio.org/mailman/listinfo/biopython
> > > 
> > > 
> > 
> > 
> 
> 
-------------- next part --------------
#! /usr/local/bin/python2.4

#halimah

#16-04-2006

from string import split

from Bio.Blast import NCBIXML

#from Bio.Blast import NCBIStandalone

b_out = open('blast2.xml','r')

b_parser = NCBIXML.BlastParser()


b_record = b_parser.parse(b_out)

E_VALUE_THRESH = 1.0


while 1:

	b_record = b_iterator.next()

	print "The following results are for query " + b_record.query

	print 'len of query:',b_record.query_letters

	if b_record is None:

	       	break

	
     	for alignment in b_record.alignments:

        	
             		for hsp in alignment.hsps:

               			if hsp.expect <= E_VALUE_THRESH:

                     			print '****Alignment****'

                   			print 'title:', alignment.title

                    			print 'length:', alignment.length

                    			print 'e value:', hsp.expect

              		                print 'subjectstart:',hsp.sbjct_start

					print 'subject end:', hsp.sbject_end

		     			  
-------------- next part --------------
A non-text attachment was scrubbed...
Name: blast2.xml
Type: text/xml
Size: 151659 bytes
Desc: 
URL: <http://lists.open-bio.org/pipermail/biopython/attachments/20060420/391af520/attachment-0002.xml>
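
For the subject start/end and the % coverage asked about earlier in this
thread, one option that does not depend on any parser attribute names is to
read the coordinates straight from the BLAST XML with the standard library.
The following is a rough sketch in Python 3 syntax, not the Biopython API.
The element names (Hit_len, Hsp_hit-from, Hsp_hit-to, Hsp_evalue) are the
ones used in NCBI BLAST XML output; SAMPLE_XML is invented data trimmed to
just those elements, and the coverage definition (union of HSP ranges over
subject length) is an assumption about what is wanted.

```python
import xml.etree.ElementTree as ET

# Invented miniature BLAST XML, reduced to the elements used below.
SAMPLE_XML = """<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_def>example subject</Hit_def>
          <Hit_len>200</Hit_len>
          <Hit_hsps>
            <Hsp>
              <Hsp_evalue>1e-30</Hsp_evalue>
              <Hsp_hit-from>11</Hsp_hit-from>
              <Hsp_hit-to>60</Hsp_hit-to>
            </Hsp>
            <Hsp>
              <Hsp_evalue>2e-10</Hsp_evalue>
              <Hsp_hit-from>100</Hsp_hit-from>
              <Hsp_hit-to>41</Hsp_hit-to>
            </Hsp>
            <Hsp>
              <Hsp_evalue>5.0</Hsp_evalue>
              <Hsp_hit-from>150</Hsp_hit-from>
              <Hsp_hit-to>200</Hsp_hit-to>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""

E_VALUE_THRESH = 1.0


def covered_length(intervals):
    """Number of positions covered by inclusive (start, end) ranges, overlaps counted once."""
    total = 0
    current_end = 0
    for start, end in sorted(intervals):
        if start > current_end:
            total += end - start + 1
            current_end = end
        elif end > current_end:
            total += end - current_end
            current_end = end
    return total


def subject_coverage(xml_text, max_evalue=E_VALUE_THRESH):
    """Return {subject description: % of the subject covered by HSPs passing the e-value cut}."""
    results = {}
    root = ET.fromstring(xml_text)
    for hit in root.iter("Hit"):
        hit_len = int(hit.findtext("Hit_len"))
        intervals = []
        for hsp in hit.iter("Hsp"):
            if float(hsp.findtext("Hsp_evalue")) > max_evalue:
                continue
            start = int(hsp.findtext("Hsp_hit-from"))
            end = int(hsp.findtext("Hsp_hit-to"))
            # Reverse-strand hits report from > to, so normalise the order.
            intervals.append((min(start, end), max(start, end)))
        results[hit.findtext("Hit_def")] = 100.0 * covered_length(intervals) / hit_len
    return results


coverage = subject_coverage(SAMPLE_XML)
print(coverage)
```

For a real file, ET.parse('blast2.xml').getroot() would take the place of
ET.fromstring(). Query coverage would follow the same pattern using
Hsp_query-from and Hsp_query-to.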

From mdehoon at c2b2.columbia.edu  Thu Apr 20 17:37:29 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Thu, 20 Apr 2006 13:37:29 -0400
Subject: [BioPython] Need help parsing Blastoutput
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEFF@cgcmail.cgc.cpmc.columbia.edu>

Could you send us the Blast XML output also?

--Michiel.

Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032


-----Original Message-----
From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
Sent: Thu 4/20/2006 7:57 AM
To: Michiel De Hoon
Cc: biopython at lists.open-bio.org
Subject: RE: [BioPython] Need help parsing Blastoutput
 
Thanks. I tried using the XML parser and I am still getting errors which I
don't understand. Please see the attached copy of my script and Blast XML
output.
Here is the error:
Traceback (most recent call last):
  File "Bioperser.py", line 11, in ?
    b_record = b_parser.parse(b_out)
  File "/usr/local/lib/python2.4/site-packages/Bio/Blast/NCBIXML.py", line 
112, in parse
    self._parser.parse(handler)
  File "/usr/local//lib/python2.4/xml/sax/expatreader.py", line 107, in 
parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/local//lib/python2.4/xml/sax/xmlreader.py", line 123, in 
parse
    self.feed(buffer)
  File "/usr/local//lib/python2.4/xml/sax/expatreader.py", line 211, in 
feed
    self._err_handler.fatalError(exc)
  File "/usr/local//lib/python2.4/xml/sax/handler.py", line 38, in 
fatalError
    raise exception
thanks
Halimah



From mdehoon at c2b2.columbia.edu  Thu Apr 20 19:15:51 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Thu, 20 Apr 2006 15:15:51 -0400
Subject: [BioPython] Parsing and Creating Dictionaries of GenBank files
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF01@cgcmail.cgc.cpmc.columbia.edu>

> I did suggest a change to the documentation but it looks like no one has 
> made the change...
> 
> http://biopython.org/pipermail/biopython-dev/2005-November/002193.html

I have now made this update in CVS. I'll put it on the website also as soon
as I can figure out how to do that with the new webserver.

--Michiel.


Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032


From alpersoyler at yahoo.com  Fri Apr 21 07:07:05 2006
From: alpersoyler at yahoo.com (alper soyler)
Date: Fri, 21 Apr 2006 00:07:05 -0700 (PDT)
Subject: [BioPython] Need help!!!
In-Reply-To: <1145540506.11610.17.camel@clintonite.nhm.ac.uk>
Message-ID: <20060421070705.85335.qmail@web36801.mail.mud.yahoo.com>

Hi Cymon,
   
  Thank you for your reply. However, to construct a phylogenetic profile I need to download approx. 100 completed genomes. I am looking for an easier way (e.g. without downloading the genomes). Can I do it by running BLAST over the internet?
   
  Regards,
  Alper Soyler 

"Cymon J. Cox" <cy at cymon.org> wrote:
  Hi Alper,

On Thu, 2006-04-20 at 05:59 -0700, alper soyler wrote:
> Hi All,
> 
> I am new to Biopython and have a question. I want to construct a phylogenetic
> profile for one organism's proteins. I want to give my protein to blast to
> search one organism's genome (e.g. Homo sapiens) instead of whole genbank
> database. How can I solve my problem? Thank you in advance.

Assuming you want to do this locally, you'll need to download your target
genome, format it with the BLAST distribution programme 'formatdb', and
then feed your query and newly formatted genome BLAST database to
Bio.Blast.NCBIStandalone.

See http://biopython.org/docs/tutorial/Tutorial004.html#toc10
3.1.4 Running BLAST locally

for details,

Cheers, Cymon
____________________________________________________________________

Cymon J. Cox

Biometry and Molecular Research
Department of Zoology
Natural History Museum
Cromwell Road
London, SW7 5BD

Email: cy at cymon.org, c.cox at nhm.ac.uk, cymon.cox at googlemail.com 
Phone : +44 (0)20 7942 6981
HomePage : http://www.duke.edu/~cymon

-8.63/-6.77
_____________________________________________________________________
Fedora Core release 4 (Stentz) clintonite.nhm.ac.uk 14:35:55 up 13 days,
20:42, 8 users, load average: 0.08, 0.16, 0.12




From biopython at maubp.freeserve.co.uk  Fri Apr 21 08:44:56 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython))
Date: Fri, 21 Apr 2006 09:44:56 +0100
Subject: [BioPython] Updating the tutorial,
 was :Parsing and Creating Dictionaries of GenBank files
In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF01@cgcmail.cgc.cpmc.columbia.edu>
References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF01@cgcmail.cgc.cpmc.columbia.edu>
Message-ID: <44489B88.2030801@maubp.freeserve.co.uk>

Michiel De Hoon wrote:
>> I did suggest a change to the documentation but it looks like no
>> one has made the change...
>> 
>> http://biopython.org/pipermail/biopython-dev/2005-November/002193.html
>> 

Thanks - I was going to look at this today.

Something funny seems to have happened to the plain text version:

http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Doc/Tutorial.txt.diff?r1=1.5&r2=1.6&cvsroot=biopython

(a) The old "Title" is missing above the contents listing

(b) Contents entries contain &nbsp; which is nasty for plain text.

(c) Section references now contain odd text.  Is it possible you only
ran the TeX file once?  Usually with references TeX should be run twice
(and in extreme cases, three times)

In an earlier discussion it was suggested we remove the plain text 
documentation from CVS, which I objected to as plain text is much easier 
for non-TeX people to read.

If generating a consistent plain text version is a lot of hassle, then 
maybe we can live without it?

> I have now made this update in CVS. I'll put it on the website also
> as soon as I can figure out how to do that with the new webserver.

I can't help you there - I was going to post to the Developer mailing 
list to see if anyone had done this recently.  Have you been able to 
generate new HTML and Tutorial.pdf files?

Looks like you have also updated the text about the Blast parser :)

Peter


From cy at cymon.org  Fri Apr 21 09:38:33 2006
From: cy at cymon.org (Cymon J. Cox)
Date: Fri, 21 Apr 2006 10:38:33 +0100
Subject: [BioPython] Need help!!!
In-Reply-To: <20060421070705.85335.qmail@web36801.mail.mud.yahoo.com>
References: <20060421070705.85335.qmail@web36801.mail.mud.yahoo.com>
Message-ID: <1145612313.4167.15.camel@clintonite.nhm.ac.uk>

Hi Alper,

On Fri, 2006-04-21 at 00:07 -0700, alper soyler wrote:
> Hi Cymon,
>    
>   Thank you for your reply. However, to construct phylogenetic profile I need to
>  download approx. 100 completed genomes. I am searching to make it easier (e.g.
>  without downloading genomes). Can I do it by running blast over the internet?

Well, I'm not sure; but here's my take on it and hopefully someone will
correct me if I'm wrong.

Assuming you are referring to complete genomes available through NCBI
(otherwise you'll almost certainly need to download them), I don't think
it's possible with the BioPython interface. Bio.Blast.NCBIWWW uses the
qblast interface at NCBI
(http://www.ncbi.nlm.nih.gov/BLAST/Doc/urlapi.html) which I think only
makes the following db's available:
http://www.ncbi.nlm.nih.gov/blast/blast_databases.shtml . From looking
at the qblast docs it doesn't seem possible to restrict the search to a
particular organism while blast'ing against a particular NCBI db (e.g.
nr).

Depending on what you want to do, it may be easier and quicker to use the
NCBI web Blast interface to the Genomes db's:
http://www.ncbi.nlm.nih.gov/sutils/genom_table.cgi

Else you'll have to bite the proverbial bullet and download and format
them individually.
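
If it does come to formatting them individually, the per-genome steps can
at least be scripted. A rough sketch in Python 3 syntax, assuming the
legacy NCBI command-line tools formatdb and blastall are installed; the
file names are made up, and tblastn is only a guess at the right program
for a protein query against nucleotide genomes. The commands are built
here but not executed.

```python
def formatdb_command(fasta_path, is_protein):
    """Command line to turn a FASTA file into a BLAST database (legacy formatdb)."""
    # -i input file, -p T/F for protein/nucleotide, -o T to parse sequence IDs
    return ["formatdb", "-i", fasta_path, "-p", "T" if is_protein else "F", "-o", "T"]


def blastall_command(program, database, query_path, output_path, expect=1.0):
    """Command line for a legacy blastall search writing XML output (-m 7)."""
    return ["blastall", "-p", program, "-d", database, "-i", query_path,
            "-o", output_path, "-m", "7", "-e", str(expect)]


# Hypothetical file names, for illustration only.
genomes = ["genome_a.fna", "genome_b.fna"]
query = "my_protein.faa"

commands = []
for genome in genomes:
    commands.append(formatdb_command(genome, is_protein=False))
    commands.append(blastall_command("tblastn", genome, query, genome + ".xml"))

for command in commands:
    print(" ".join(command))
```

Each list could be passed to subprocess.run(command, check=True) once the
paths are real; the XML files can then be parsed one genome at a time.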

Cheers, Cymon

>    
>   Regards,
>   Alper Soyler 
> 
> "Cymon J. Cox" <cy at cymon.org> wrote:
>   Hi Alper,
> 
> On Thu, 2006-04-20 at 05:59 -0700, alper soyler wrote:
> > Hi All,
> > 
> > I am new to Biopython and have a question. I want to construct a phylogenetic
> > profile for one organism's proteins. I want to give my protein to blast to
> > search one organism's genome (e.g. Homo sapiens) instead of whole genbank
> > database. How can I solve my problem? Thank you in advance.
> 
> Assuming you want to do this locally, you'll need to download your target
> genome, format it with the BLAST distribution programme 'formatdb', and
> then feed your query and newly formatted genome BLAST database to
> Bio.Blast.NCBIStandalone.
> 
> See http://biopython.org/docs/tutorial/Tutorial004.html#toc10
> 3.1.4 Running BLAST locally
> 
> for details,
> 
> Cheers, Cymon


From biopython at maubp.freeserve.co.uk  Fri Apr 21 09:23:12 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython))
Date: Fri, 21 Apr 2006 10:23:12 +0100
Subject: [BioPython] blast against genomes, was: Need help!!!
In-Reply-To: <20060421070705.85335.qmail@web36801.mail.mud.yahoo.com>
References: <20060421070705.85335.qmail@web36801.mail.mud.yahoo.com>
Message-ID: <4448A480.5010805@maubp.freeserve.co.uk>

alper soyler wrote:
> Hi Cymon,
> 
> Thank you for your reply. However, to construct phylogenetic profile
> I need to download approx. 100 completed genomes. I am searching to
> make it easier (e.g. without downloading genomes). Can I do it by
> running blast over the internet?

So you want to search 100 completed genomes using your protein as the 
input query?

As Cymon suggested, downloading the genomes and building your own 
database is one method.  As this is a "big task" you have in mind, the 
network speed limitations of doing many blast queries may make this a 
better idea than trying to do it online.

However, the NCBI offer online blast against some (all?) of their 
completed genomes so it may be possible to do it this way via BioPython.

http://www.ncbi.nlm.nih.gov/BLAST/

The webpage has a nice interface for blast against specific genomes 
(right hand side, second box down).

You can also use the normal blast pages and the "Limit by entrez query" 
field, e.g. mouse[ORGN] OR rat[ORGN]

It should be possible to do this automatically in code but you will need 
to compile a list of the species names the NCBI will understand...
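
Building the "Limit by entrez query" text from such a list is the easy
part. A minimal sketch in Python 3 syntax; organism_query is a made-up
helper name, and the species list is only an example. Whether a given
name is understood by the NCBI still has to be checked by hand.

```python
def organism_query(species_names):
    """Join species names into an Entrez limit such as 'mouse[ORGN] OR rat[ORGN]'."""
    return " OR ".join("%s[ORGN]" % name.strip() for name in species_names)


query_text = organism_query(["mouse", "rat"])
print(query_text)
```

Later Biopython releases have an entrez_query argument on
NCBIWWW.qblast that takes this kind of string; treat that as an
assumption and check it against the version you have installed.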

Peter


From sbassi at gmail.com  Fri Apr 21 11:46:49 2006
From: sbassi at gmail.com (Sebastian Bassi)
Date: Fri, 21 Apr 2006 08:46:49 -0300
Subject: [BioPython] Need help!!!
In-Reply-To: <20060421070705.85335.qmail@web36801.mail.mud.yahoo.com>
References: <1145540506.11610.17.camel@clintonite.nhm.ac.uk>
	<20060421070705.85335.qmail@web36801.mail.mud.yahoo.com>
Message-ID: <b43bf2080604210446xc452797q9b853aa11e66f84c@mail.gmail.com>

On 4/21/06, alper soyler <alpersoyler at yahoo.com> wrote:
> Hi Cymon,
>   Thank you for your reply. However, to construct phylogenetic profile I need to download approx. 100 completed genomes. I am searching to make it easier (e.g. without downloading genomes). Can I do it by running blast over the internet?
>

Maybe you could download only NR db and then make subsets from it.
NCBI utilities or the local BLAST has one utility that allows you to
extract sequences from BLAST compiled DBs. I don't know if this would
be enough for your needs.

--
Bioinformatics news: http://www.bioinformatica.info
Lriser: http://www.linspire.com/lraiser_success.php?serial=318


From mdehoon at c2b2.columbia.edu  Fri Apr 21 16:26:39 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Fri, 21 Apr 2006 12:26:39 -0400
Subject: [BioPython] Updating the tutorial,
	was :Parsing and Creating Dictionaries of GenBank files
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF05@cgcmail.cgc.cpmc.columbia.edu>

> Something funny seems to have happened to the plain text version:
> 
> http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Doc/Tutorial.txt.diff?r1=1.5&r2=1.6&cvsroot=biopython

The plain text version is generated by hevea, so not by tex directly. The
funny output is likely due to having a different hevea version (which I ran a
couple of times). I didn't see anything obviously wrong with the Tutorial.tex
source file, so I think these errors are due to errors in the Tutorial.tex ->
Tutorial.txt translation by hevea.

> If generating a consistent plain text version is a lot of hassle, then 
> maybe we can live without it?

Currently, the plain text version is not very useful. It's not a source file,
so it should not be in CVS. On the other hand, the plain text version is not
available from the Biopython documentation page, and users are better off
with the PDF version anyway. So I think nobody will miss the plain text
version. Correct me if I'm wrong.

--Michiel.


From srini_iyyer_bio at yahoo.com  Fri Apr 21 22:49:28 2006
From: srini_iyyer_bio at yahoo.com (Srinivas Iyyer)
Date: Fri, 21 Apr 2006 15:49:28 -0700 (PDT)
Subject: [BioPython] Creating a graphical interface to database of gene
	coordinates
Message-ID: <20060421224928.87408.qmail@web38105.mail.mud.yahoo.com>

Dear group, 
 I am happy that I am slowly finding Python-based projects
related to my research area. 

Problem:
1. I have a database of human gene coordinates on
chromosomes.
2. I have gene expression data from my lab concerning
the genes I mentioned above. 

3. I want to visualize expression data laid on
chromosomes.

Eg. 
Coordinates:
Chr      Gene       From      To     Exon
1         x         100       120    exon:1
1         x         200       250    exon:2
1         x         350       450    exon:3


Expression data:

IDent   sample  Chr    From     To     Expression value
xxx_at  lung     1     110      120     100.35
x_s_at  heart    1     225      250     124.35
x_a_at  eye      1     375      400     146.35

What I want:

I want to have a simple window that would connect to
my database. I want to give it a gene, and this Python/Tk
interface (or whatever) would query the database,
draw a graph of the gene according to its exons, and plot
the values.

-------_______----------_______-------

-- : exon
__: regions that are not exons, introns.


My questions to Tutor/BioPython forums:

1. What should I decide to work with: (a) a Py/Tk framework, or
(b) Python imaging libraries etc.?

2. I do not want to impress anyone with this work; it just
needs to help me understand the relationships, as the
number game in the tables above is highly confusing. So a
working version that accurately plots the expression values
for as many samples as I have would be enough.

3. Are there any available modules to jump-start this, or
do I have to create some from scratch? That would be
a problem because I am somewhere between a novice and a
mediocre Python programmer.

4. Any ideas/suggestions/pointers are highly
appreciated. 

thanks
Sri



From biopython at maubp.freeserve.co.uk  Sat Apr 22 12:32:21 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython))
Date: Sat, 22 Apr 2006 13:32:21 +0100
Subject: [BioPython] Creating a graphical interface to database of gene
 coordinates
In-Reply-To: <20060421224928.87408.qmail@web38105.mail.mud.yahoo.com>
References: <20060421224928.87408.qmail@web38105.mail.mud.yahoo.com>
Message-ID: <444A2255.6010704@maubp.freeserve.co.uk>

Srinivas Iyyer wrote:
> Dear group, 
>  I am happy that I am slowly finding pyhonian projects
> related to my research area. 
> 
> Problem:
> 1. I have a database of human gene coordinates on
> chromosomes.
> 2. I have gene expression data from my lab concerning
> the genes I mentioned above. 
> 
> 3. I want to visualize expression data laid on
> chromosomes.

You may be able to produce chromosome diagrams with Leighton Pritchard 
and Jennifer White's program genomediagram:

http://bioinf.scri.sari.ac.uk/lp/programs.html#genomediagram

It will do both circular genomes diagrams (nice for bacteria) and linear 
ones - which would make sense for chromosomes.  I think I've seen 
examples with expression data shown in this way... certainly it could be 
done.

Note that this can produce PDF or bitmap output - but it's not
interactive.  There is also a GUI to go with it, but I have not looked 
at this.
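
Whichever drawing tool is used, the two tables in the question first have
to be joined on their coordinates. A small sketch in Python 3 syntax, using
the example rows from the question; the tuple layout and the function name
overlapping_exons are made up for illustration, and coordinates are treated
as inclusive.

```python
# (exon name, from, to) for gene x on chromosome 1, from the example table
exons = [("exon:1", 100, 120), ("exon:2", 200, 250), ("exon:3", 350, 450)]

# (probe id, sample, from, to, expression value), from the example table
probes = [
    ("xxx_at", "lung", 110, 120, 100.35),
    ("x_s_at", "heart", 225, 250, 124.35),
    ("x_a_at", "eye", 375, 400, 146.35),
]


def overlapping_exons(start, end, exon_list):
    """Names of the exons whose range shares at least one position with start..end."""
    return [name for name, e_start, e_end in exon_list
            if start <= e_end and end >= e_start]


# Collect the expression values that fall on each exon.
expression_by_exon = {name: [] for name, _, _ in exons}
for probe_id, sample, start, end, value in probes:
    for name in overlapping_exons(start, end, exons):
        expression_by_exon[name].append((probe_id, sample, value))

for name, hits in expression_by_exon.items():
    print(name, hits)
```

The resulting dictionary is what a drawing layer (GenomeDiagram, a Tk
canvas, or R) would then take as input; the drawing itself is left out.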

----------------------------------------------------------------------

One final suggestion is to consider looking at R/BioConductor - it's a 
completely different language but I have seen examples where expression 
data is visualised on chromosomes.

http://www.r-project.org/
http://www.bioconductor.org/

You can even call R from Python, for example using RPy (R from Python):

http://rpy.sourceforge.net/index.html

See also RSPython, an R/SPlus - Python Interface which I have not used 
personally:

http://www.omegahat.org/RSPython/

Peter


From biopython at maubp.freeserve.co.uk  Mon Apr 24 10:56:06 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython List))
Date: Mon, 24 Apr 2006 11:56:06 +0100
Subject: [BioPython] Bio.Nexus documentation
Message-ID: <444CAEC6.5040703@maubp.freeserve.co.uk>

I'm thinking of having a go at using the new Bio.Nexus module in 
BioPython to do some phylogenetic tree manipulation (from Clustal .dnd 
files in my case), so I thought I would have a hunt for some examples or 
help...

Back in July 2005, Frank Kauff wrote:
> I hope most of the methods have a descriptive title and are easy to use.
> Let me know if I can help further. And I promise to write some
> documentation, but it won't be before end of August.
> 
> Cheers,
> Frank 

Archive link:
http://biopython.org/pipermail/biopython/2005-July/002714.html

Was that August 2005, or August 2006, you had in mind? ;)

Do you have some simple examples you could share with us instead perhaps?

Thanks

Peter


From fkauff at duke.edu  Mon Apr 24 13:32:45 2006
From: fkauff at duke.edu (Frank Kauff)
Date: Mon, 24 Apr 2006 09:32:45 -0400
Subject: [BioPython] Bio.Nexus documentation
In-Reply-To: <444CAEC6.5040703@maubp.freeserve.co.uk>
References: <444CAEC6.5040703@maubp.freeserve.co.uk>
Message-ID: <1145885566.2369.6.camel@osiris.biology.duke.edu>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <http://lists.open-bio.org/pipermail/biopython/attachments/20060424/d8a5de2f/attachment-0001.ksh>

From halima at mancala.cbio.uct.ac.za  Mon Apr 24 08:45:09 2006
From: halima at mancala.cbio.uct.ac.za (Halima Rabiu)
Date: Mon, 24 Apr 2006 10:45:09 +0200 (SAST)
Subject: [BioPython] Need help parsing Blastoutput
In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEFF@cgcmail.cgc.cpmc.columbia.edu>
References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECEFF@cgcmail.cgc.cpmc.columbia.edu>
Message-ID: <Pine.LNX.4.58.0604241036290.18039@mancala.cbio.uct.ac.za>

Hi 
Attached here is the XML output. I also attached it in my previous post.
thanks
Halimah

On Thu, 20 Apr 2006, Michiel De Hoon wrote:

> Could you send us the Blast XML output also?
> 
> --Michiel.
> 
> Michiel de Hoon
> Center for Computational Biology and Bioinformatics
> Columbia University
> 1150 St Nicholas Avenue
> New York, NY 10032
> 
> 
> 
> -----Original Message-----
> From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
> Sent: Thu 4/20/2006 7:57 AM
> To: Michiel De Hoon
> Cc: biopython at lists.open-bio.org
> Subject: RE: [BioPython] Need help parsing Blastoutput
>  
> Thanks. I tried using the XML parser and I am still getting errors which I
> don't understand. Please see the attached copy of my script and Blast XML
> output.
> Here is the error:
> Traceback (most recent call last):
>   File "Bioperser.py", line 11, in ?
>     b_record = b_parser.parse(b_out)
>   File "/usr/local/lib/python2.4/site-packages/Bio/Blast/NCBIXML.py", line 
> 112, in parse
>     self._parser.parse(handler)
>   File "/usr/local//lib/python2.4/xml/sax/expatreader.py", line 107, in 
> parse
>     xmlreader.IncrementalParser.parse(self, source)
>   File "/usr/local//lib/python2.4/xml/sax/xmlreader.py", line 123, in 
> parse
>     self.feed(buffer)
>   File "/usr/local//lib/python2.4/xml/sax/expatreader.py", line 211, in 
> feed
>     self._err_handler.fatalError(exc)
>   File "/usr/local//lib/python2.4/xml/sax/handler.py", line 38, in 
> fatalError
>     raise exception
> thanks
> Halimah
> 
> On Wed, 19 Apr 2006, Michiel De Hoon wrote:
> 
> > The Blast parser fails to read your file because the format of Blast output
> > has changed. If I edit the data file so that it corresponds to the old format
> > (add a space here, remove a blank line there, etc.), the Blast parser reads
> > the file without problems. The easiest solution is to repeat the Blast run,
> > using XML for the output format, and use the Blast XML parser in Biopython to
> > parse the results.
> > 
> > A general question is if anybody still needs the parser for Blast text
> > output. Currently, we are confusing our users by having a Blast text parser
> > that tends to break. A broken parser may be worse than no parser.
> > 
> > --Michiel.
> > 
> > Michiel de Hoon
> > Center for Computational Biology and Bioinformatics
> > Columbia University
> > 1150 St Nicholas Avenue
> > New York, NY 10032
> > 
> > 
> > 
> > -----Original Message-----
> > From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
> > Sent: Wed 4/19/2006 6:15 AM
> > To: Michiel De Hoon
> > Cc: biopython at lists.open-bio.org
> > Subject: RE: [BioPython] Need help parsing Blastoutput
> >  
> > Hi,
> > Please see the attachment; it is part of my Blast output.
> > Yes, I am trying to parse text output from Blast. I used another script to
> > run my local Blast, whose output I am trying to parse. The NCBIStandalone.BlastParser
> > was working fine without hsp.sbject_end, which is one of the things I need to
> > print out.
> > On checking the class diagrams in the cookbook, I found that sbject_end is
> > not included. I just need another way of printing the int(subject end).
> > Thanks for your help
> > Halimah
> > 
> > On Tue, 18 Apr 2006, Michiel De Hoon wrote:
> > 
> > > Could you also send us the file Enterococcus_out so we can run the script?
> > > 
> > > From the script, it looks like you're trying to parse text output from Blast.
> > > While this is possible (in theory), the format of Blast text output tends to
> > > change a lot, thereby breaking the parser in Biopython. It is more reliable
> > > to have Blast generate output in XML format, and use the XML parser:
> > > 
> > > blast_out = open('my_blast.xml', 'r')
> > > 
> > > from Bio.Blast import NCBIXML
> > > 
> > > b_parser = NCBIXML.BlastParser()
> > > b_record = b_parser.parse(blast_out)
> > > 
> > > See section 3.1.2 in the Biopython cookbook, and section 3.1.4 on how to
> > > generate Blast output in XML.
> > > 
> > > --Michiel.
> > > 
> > > 
> > > 
> > > Michiel de Hoon
> > > Center for Computational Biology and Bioinformatics
> > > Columbia University
> > > 1150 St Nicholas Avenue
> > > New York, NY 10032
> > > 
> > > 
> > > 
> > > -----Original Message-----
> > > From: Halima Rabiu [mailto:halima at cbio.uct.ac.za]
> > > Sent: Tue 4/18/2006 11:06 AM
> > > To: Michiel De Hoon
> > > Cc: biopython at lists.open-bio.org
> > > Subject: RE: [BioPython] Need help parsing Blastoutput
> > >  
> > > Thanks.
> > > Please see the attachment: a copy of my script and a copy of my Blast output.
> > > Thanks
> > > 
> > > 
> > > On Thu, 13 Apr 2006, Michiel De Hoon wrote:
> > > 
> > > > Could you send us the script you were using?
> > > > 
> > > > --Michiel.
> > > > 
> > > > Michiel de Hoon
> > > > Center for Computational Biology and Bioinformatics
> > > > Columbia University
> > > > 1150 St Nicholas Avenue
> > > > New York, NY 10032
> > > > 
> > > > 
> > > > 
> > > > -----Original Message-----
> > > > From: biopython-bounces at lists.open-bio.org on behalf of Halima Rabiu
> > > > Sent: Thu 4/13/2006 11:07 AM
> > > > To: biopython at lists.open-bio.org
> > > > Subject: [BioPython] Need help parsing Blastoutput
> > > >  
> > > > Hi All,
> > > > I have BLAST output from a local Blast run.
> > > > I need to calculate my % alignment coverage with regard to my subject.
> > > > I tried parsing the Blast output and wanted to print the
> > > > sbjct start and sbjct end, but I could not. Is there any way I could do this?
> > > > I am trying to get the match coverage between my query and subject. I don't need
> > > > identities, but the total % alignment for the query or subject.
> > > > Thanks
> > > > Halimah
> > > > 
> > > > _______________________________________________
> > > > BioPython mailing list  -  BioPython at lists.open-bio.org
> > > > http://lists.open-bio.org/mailman/listinfo/biopython
> > > > 
> > > > 
> > > 
> > > 
> > 
> > 
> 
> 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: blast2.xml
Type: text/xml
Size: 151658 bytes
Desc: 
URL: <http://lists.open-bio.org/pipermail/biopython/attachments/20060424/af1567dc/attachment-0002.xml>

From mdehoon at c2b2.columbia.edu  Mon Apr 24 18:14:17 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Mon, 24 Apr 2006 14:14:17 -0400
Subject: [BioPython] Need help parsing Blastoutput
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF0E@cgcmail.cgc.cpmc.columbia.edu>

Ha, I see. My stupid email program was removing the XML file from your email
messages for security reasons or some such.
Anyway, I got the XML files from the mailing list archives.

The XML file from Thursday April 20 is different from the one sent on Monday
April 24. In fact, the latter seems to be damaged; in line 194, it has:

<?xml version="1.1?>

while the former has

<?xml version="1.0"?>

So in the latter a " is missing for some reason.
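
A quick way to spot this kind of damage before handing a file to the parser is
to scan it for XML declarations whose version quotes do not match. The sketch
below uses only the Python standard library; the function name and the sample
lines are made up for illustration, and it only checks the declaration line,
not whether the rest of the file is well formed.

```python
import re

# Accepts declarations whose version value has matching opening and closing quotes.
XML_DECL = re.compile(r'<\?xml\s+version=(["\'])[\d.]+\1.*\?>')


def find_bad_declarations(lines):
    """Return (line_number, text) for every '<?xml' line that looks damaged."""
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if text.startswith("<?xml") and not XML_DECL.match(text):
            bad.append((number, text))
    return bad


# Hypothetical sample: one intact declaration and one with a missing quote.
sample = ['<?xml version="1.0"?>', "<BlastOutput>", '<?xml version="1.1?>']
bad_lines = find_bad_declarations(sample)
print(bad_lines)
```

With a real file you would pass the open file handle instead of the sample
list, since iterating over a handle yields its lines.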

Anyway, the XML parser can read the XML file from Thursday April 20 if you
fix a few things in your script:

*) Instead of
b_record = b_parser.parse(b_out)
you need
b_iterator = NCBIStandalone.Iterator(b_out, b_parser)
(and then you should also import NCBIStandalone)

*) You should check if b_record is None immediately after
b_record = b_iterator.next().

*) There is no hsp.sbject_end
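
For the coverage question raised earlier in the thread, one possible
workaround is to derive the subject end from the subject start and the aligned
subject string, then merge the resulting intervals. This is only a sketch
under assumptions: it takes plain values rather than Biopython objects, it
assumes forward-strand hits with 1-based inclusive coordinates and "-" as the
gap character, and the function names and sample data are invented here.

```python
def subject_end(sbjct_start, aligned_sbjct):
    """End coordinate on the subject (1-based, inclusive) for a forward-strand hit."""
    residues = len(aligned_sbjct.replace("-", ""))  # gaps consume no subject positions
    return sbjct_start + residues - 1


def percent_coverage(intervals, subject_length):
    """Percentage of the subject covered by the union of (start, end) intervals."""
    covered = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end + 1:
            # Close the previous merged interval (if any) and start a new one.
            if current_end is not None:
                covered += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged interval.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start + 1
    return 100.0 * covered / subject_length


# Hypothetical HSPs as (subject start, aligned subject string) pairs.
hsps = [(1, "ACGT-ACGT"), (5, "ACGTAC"), (21, "AC--GT")]
intervals = [(start, subject_end(start, seq)) for start, seq in hsps]
print(intervals)
print(percent_coverage(intervals, 40))
```

Merging the intervals first means overlapping HSPs are not counted twice.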


--Michiel.

Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032




From mdehoon at c2b2.columbia.edu  Mon Apr 24 18:27:31 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Mon, 24 Apr 2006 14:27:31 -0400
Subject: [BioPython] Need help parsing Blastoutput
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF0F@cgcmail.cgc.cpmc.columbia.edu>

Also, make sure you have the latest version of Bio/Blast/NCBIStandalone.py;
you can get it from here:

http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/*checkout*/biopython/Bio
/Blast/NCBIStandalone.py?rev=1.60&cvsroot=biopython&content-type=text/plain

--Michiel.

Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032




From biopython at maubp.freeserve.co.uk  Tue Apr 25 09:08:33 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython List))
Date: Tue, 25 Apr 2006 10:08:33 +0100
Subject: [BioPython] Bio.Nexus documentation
In-Reply-To: <1145885566.2369.6.camel@osiris.biology.duke.edu>
References: <444CAEC6.5040703@maubp.freeserve.co.uk>
	<1145885566.2369.6.camel@osiris.biology.duke.edu>
Message-ID: <444DE711.8070509@maubp.freeserve.co.uk>

> Anyway, I'll get some examples together, and I still want to do some
> documentation for the cookbook. It won't be before this weekend, though.
> For a quick and dirty anchor point, there's the test module that comes
> with the distribution, it naturally has some code that does interesting
> things with trees and data.

It's certainly shown me that the Nexus file format is a lot more
complicated than just holding simple trees.

What I actually wanted to do was load a Newick format tree (extension
*.dnd files from Clustalw/ClustalX in particular) into BioPython.  This
doesn't look like it is possible.

However, I can get Clustalx to save the corresponding alignment in Nexus 
format, but the parser doesn't seem to like it...

Traceback (most recent call last):
   File "C:\temp\hack_trees_000.py", line 7, in ?
     n=Nexus.Nexus(input_file)
   File "C:\Python23\lib\site-packages\Bio\Nexus\Nexus.py", line 525, in 
__init__
     self.read(input)
   File "C:\Python23\lib\site-packages\Bio\Nexus\Nexus.py", line 582, in 
read
     self._parse_nexus_block(title, contents)
   File "C:\Python23\lib\site-packages\Bio\Nexus\Nexus.py", line 623, in 
_parse_nexus_block
     getattr(self,'_'+line.command)(line.options)
AttributeError: 'Nexus' object has no attribute '_utree'

This looks like it's caused by the penultimate line of the "Nexus Tree
file" produced by ClustalX:

..
	UTREE PAUP_1= (...);
ENDBLOCK;

Any ideas?  I'll happily send you some example tree files off the list 
if you want.

Peter


From fkauff at duke.edu  Tue Apr 25 12:03:16 2006
From: fkauff at duke.edu (Frank)
Date: Tue, 25 Apr 2006 08:03:16 -0400
Subject: [BioPython] Bio.Nexus documentation
In-Reply-To: <444DE711.8070509@maubp.freeserve.co.uk>
References: <444CAEC6.5040703@maubp.freeserve.co.uk>
	<1145885566.2369.6.camel@osiris.biology.duke.edu>
	<444DE711.8070509@maubp.freeserve.co.uk>
Message-ID: <1145966596.2276.3.camel@cpe-066-057-048-192.nc.res.rr.com>

Hi Peter,

yes, utree is indeed a nexus command I never heard of... The thing is
that nexus is extensible, so programs can in theory define new commands.
So, what is utree? Maybe an unrooted tree?
And many programs don't care much about the nexus specifications, which
are, in turn, not always too precise.
If you send the files along, I'd be happy to have a look.

Cheers,
Frank



From fkauff at duke.edu  Tue Apr 25 21:17:23 2006
From: fkauff at duke.edu (Frank Kauff)
Date: Tue, 25 Apr 2006 17:17:23 -0400
Subject: [BioPython] Bio.Nexus documentation
In-Reply-To: <444DE711.8070509@maubp.freeserve.co.uk>
References: <444CAEC6.5040703@maubp.freeserve.co.uk>
	<1145885566.2369.6.camel@osiris.biology.duke.edu>
	<444DE711.8070509@maubp.freeserve.co.uk>
Message-ID: <1145999843.2365.25.camel@osiris.biology.duke.edu>

Ok, I added support for the utree command used in clustal to denote an
unrooted tree (in the nexus parser, it is a synonym for 'tree', as trees
are unrooted by default anyway), and fixed some issues with linebreaks
in tree descriptions. Nexus files from Clustal should now be read
without problems (famous last words).
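
To illustrate the idea of one command being a synonym for another, here is a
toy dispatcher that looks up handlers by name with getattr, the same pattern
visible in the traceback. It is a sketch only, not the Bio.Nexus code; the
class and method names are invented for this example.

```python
class MiniBlockParser:
    """Toy command dispatcher; not the real Bio.Nexus.Nexus class."""

    def __init__(self):
        self.trees = []

    def _tree(self, options):
        # Store the raw "name = newick" text; a real parser would build a tree.
        self.trees.append(options.strip())

    # Treat UTREE exactly like TREE by pointing both names at one handler.
    _utree = _tree

    def run(self, command, options):
        # Look up "_<command>" and fail with a clear message if it is unknown.
        handler = getattr(self, "_" + command.lower(), None)
        if handler is None:
            raise ValueError("unsupported command: %s" % command)
        handler(options)


parser = MiniBlockParser()
parser.run("TREE", "t1 = (A,B);")
parser.run("UTREE", "PAUP_1 = (A,(B,C));")
print(parser.trees)
```

An unknown command raises a ValueError here instead of the AttributeError
seen in the traceback, which makes the cause easier to read.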

Cheers,
Frank


-- 
Frank Kauff
Dept. of Biology
Duke University
Box 90338
Durham, NC 27708
USA

Phone 919-660-7382
Fax 919-660-7293
Web http://www.lutzonilab.net


From biopython at maubp.freeserve.co.uk  Wed Apr 26 14:16:21 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython List))
Date: Wed, 26 Apr 2006 15:16:21 +0100
Subject: [BioPython] Bio.Nexus and Clustal tree files
Message-ID: <444F80B5.60207@maubp.freeserve.co.uk>

Hello again,

I have installed Frank Kauff's recent changes to Bio.Nexus from CVS, and 
have actually got a tree loaded now :)

Here is my example script, which tries to load two tree files created 
using ClustalX 1.83 (files previously sent to Frank off list)

(a) demo.dnd - Clustal guide tree in Newick format, no bootstraps
(b) demo.treb - Clustal NJ tree in Nexus format, with bootstraps

Example code starts here:

from Bio.Nexus import Nexus

for filename in [r"C:\TEMP\nexus\demo.dnd",
              r"C:\TEMP\nexus\demo.treb"] :

     input_file = open(filename,"r")
     n=Nexus.Nexus(input_file)
     input_file.close()

     print "-----------------"
     print "Filename:" + n.filename
     print "Number of taxlabels = %i" % len(n.taxlabels)
     print "Number of trees = %i" % len(n.trees)
     for tree in n.trees :
         print "Tree name: %s"% tree.name
         print "Tree nodes: " +  ", ".join(tree.get_taxa())
print "-----------------"


This gives the following output:

-----------------
Filename:C:\TEMP\nexus\demo.dnd
Number of taxlabels = 0
Number of trees = 0
-----------------
Filename:C:\TEMP\nexus\demo.treb
Number of taxlabels = 0
Number of trees = 1
Tree name: PAUP_1
Tree nodes: V_Harveyi_PATH, B_subtilis_YXEM, B_subtilis_GlnH_homo_YCKK, 
YA80_HAEIN, FLIY_ECOLI, Deinococcus_radiodurans, HISJ_E_COLI, E_coli_GlnH
-----------------

As you can see, loading the ClustalX NEXUS output (*.treb) seems to work 
without trouble (although n.taxlabels is an empty list... is this to be 
expected?).

On the other hand, I don't get the tree for the Clustal guide tree file 
(*.dnd) which is a pain.  Do I need to load these files differently, as 
they are Newick format, not NEXUS format?

Thank you

Peter


From fkauff at duke.edu  Wed Apr 26 15:17:31 2006
From: fkauff at duke.edu (Frank Kauff)
Date: Wed, 26 Apr 2006 11:17:31 -0400
Subject: [BioPython] Bio.Nexus and Clustal tree files
In-Reply-To: <444F80B5.60207@maubp.freeserve.co.uk>
References: <444F80B5.60207@maubp.freeserve.co.uk>
Message-ID: <1146064651.2365.41.camel@osiris.biology.duke.edu>

On Wed, 2006-04-26 at 15:16 +0100, Peter (BioPython List) wrote:
> Hello again,
> 
> I have installed Frank Kauff's recent changes to Bio.Nexus from CVS, and 
> have actually got a tree loaded now :)
> 
Excellent!

> Here is my example script, which tries to load two tree files created 
> using ClustalX 1.83 (files previously sent to Frank off list)
> 
> (a) demo.dnd - Clustal guide tree in Newick format, no bootstraps
> (b) demo.treb - Clustal NJ tree in Nexus format, with bootstraps
> 
> Example code starts here:

> This gives the following output:
> 
> -----------------
> Filename:C:\TEMP\nexus\demo.dnd
> Number of taxlabels = 0
> Number of trees = 0
> -----------------
> Filename:C:\TEMP\nexus\demo.treb
> Number of taxlabels = 0
> Number of trees = 1
> Tree name: PAUP_1
> Tree nodes: V_Harveyi_PATH, B_subtilis_YXEM, B_subtilis_GlnH_homo_YCKK, 
> YA80_HAEIN, FLIY_ECOLI, Deinococcus_radiodurans, HISJ_E_COLI, E_coli_GlnH
> -----------------
> 
> As you can see, loading the ClustalX NEXUS output (*.treb) seems to work 
> without trouble (although n.taxlabels is an empty list... is this to be 
> expected?).

Yes, taxlabels refers to the taxon labels of a nexus data matrix.
They are not necessarily identical with the taxa in the tree, but could
be a superset or a subset of those.

However, the way clustal indicates the no. of supported bootstrap
replicates (square brackets after the branchlengths) is unsupported, and
thus these values are ignored. 
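
If the bootstrap values are needed anyway, one option is to pull the bracketed
numbers out of the tree description yourself before parsing it. The sketch
below uses only the standard library; the function name and the sample tree
string are invented for illustration, and it assumes every square-bracketed
integer in the string is a bootstrap count.

```python
import re

# A bracketed integer such as "[87]" following a branch length.
BOOTSTRAP = re.compile(r"\[(\d+)\]")


def split_bootstraps(newick):
    """Return (tree string without bracketed values, list of the values)."""
    values = [int(value) for value in BOOTSTRAP.findall(newick)]
    cleaned = BOOTSTRAP.sub("", newick)
    return cleaned, values


# Hypothetical tree description with one bootstrap count.
tree = "((A:0.1,B:0.2):0.05[87],C:0.3,D:0.4);"
cleaned, values = split_bootstraps(tree)
print(cleaned)
print(values)
```

The values come back in the order they appear in the string, so matching them
to internal nodes is left to the caller.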

> 
> On the other hand, I don't get the tree for the Clustal guide tree file 
> (*.dnd) which is a pain.  Do I need to load these files differently, as 
> they are Newick format, not NEXUS format?
> 
Yes, the nexus parser reads only nexus. But you can throw the newick
tree directly at the Tree class
>>> from Bio.Nexus import Trees
>>> t=Trees.Tree(open('demo.dnd').read())
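
To decide which of the two routes to take for a given file, a simple content
check may be enough: NEXUS files begin with the token #NEXUS, while a bare
Newick tree starts with "(" and ends with ";". This is a rough sketch with
made-up function names, not part of Bio.Nexus, and it will report "unknown"
for anything that fits neither pattern.

```python
def looks_like_nexus(text):
    """True if the text starts with the #NEXUS token (case-insensitive)."""
    return text.lstrip().upper().startswith("#NEXUS")


def looks_like_newick(text):
    """True if the text looks like a single parenthesised tree ending in ';'."""
    stripped = text.strip()
    return stripped.startswith("(") and stripped.endswith(";")


def guess_tree_format(text):
    """Return 'nexus', 'newick' or 'unknown' for the given file contents."""
    if looks_like_nexus(text):
        return "nexus"
    if looks_like_newick(text):
        return "newick"
    return "unknown"


print(guess_tree_format("#NEXUS\nBEGIN TREES;\nEND;\n"))
print(guess_tree_format("(A:0.1,(B:0.2,C:0.3):0.4);\n"))
```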

Frank


> Thank you
> 
> Peter
-- 
Frank Kauff
Dept. of Biology
Duke University
Box 90338
Durham, NC 27708
USA

Phone 919-660-7382
Fax 919-660-7293
Web http://www.lutzonilab.net


From dam6278 at yahoo.fr  Thu Apr 27 07:53:24 2006
From: dam6278 at yahoo.fr (dam6278)
Date: Thu, 27 Apr 2006 07:53:24 +0000 (GMT)
Subject: [BioPython] GenBank
Message-ID: <20060427075324.13946.qmail@web86913.mail.ukl.yahoo.com>

I have a problem with the GenBank parser:
  
  When I execute :
  
  from Bio import GenBank
  gi_list = GenBank.search_for("Opuntia AND rpl16")
  
  My output is :
  
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
    File "/usr/lib/python2.4/site-packages/Bio/GenBank/__init__.py", line 1398, in search_for
      retstart = start_id, retmax = max_ids)
    File "/usr/lib/python2.4/site-packages/Bio/EUtils/DBIdsClient.py", line 294, in search
      searchinfo = parse.parse_search(infile, [None])
    File "/usr/lib/python2.4/site-packages/Bio/EUtils/parse.py", line 201, in parse_search
      for ele in pom["TranslationStack"]:
    File "/usr/lib/python2.4/site-packages/Bio/EUtils/POM.py", line 355, in __getitem__
      raise IndexError, "no item matches"
  IndexError: no item matches
  
  Do you know what my problem is?
  
  Thank you for your help.
  
  damien
 

From lpritc at scri.sari.ac.uk  Thu Apr 27 08:33:21 2006
From: lpritc at scri.sari.ac.uk (Leighton Pritchard)
Date: Thu, 27 Apr 2006 09:33:21 +0100
Subject: [BioPython] Creating a graphical interface to database of
	gene	coordinates
In-Reply-To: <444A2255.6010704@maubp.freeserve.co.uk>
References: <20060421224928.87408.qmail@web38105.mail.mud.yahoo.com>
	<444A2255.6010704@maubp.freeserve.co.uk>
Message-ID: <1146126802.4725.223.camel@lplinuxdev>

Hi guys,

On Sat, 2006-04-22 at 13:32 +0100, Peter (BioPython) wrote:
> Srinivas Iyyer wrote:
> > Dear group, 
> >  I am happy that I am slowly finding Pythonian projects
> > related to my research area. 
> > 
> > Problem:
> > 1. I have a database of human gene coordinates on
> > chromosomes.
> > 2. I have gene expression data from my lab concerning
> > the genes I mentioned above. 
> > 
> > 3. I want to visualize expression data laid on
> > chromosomes.
> 
> You may be able to produce chromosome diagrams with Leighton Pritchard 
> and Jennifer White's program genomediagram:
> 
> http://bioinf.scri.sari.ac.uk/lp/programs.html#genomediagram
> 
> It will do both circular genomes diagrams (nice for bacteria) and linear 
> ones - which would make sense for chromosomes.  I think I've seen 
> examples with expression data shown in this way... certainly it could be 
> done.

We use it ourselves to plot array data against chromosome location, but
on the whole chromosome scale and, as you mention, not interactively.
It's pretty easy to do, but not what Srinivas is looking for, I think.
It sounds, Srinivas, like you're wanting something that will operate
more like GeneSpring?  Is that right?

It's possible that, if you just wanted to present a static image of
expression data, you could use GenomeDiagram in this way, but it's not
the way I would choose to present the data in a GUI - I'd expect drawing
straight onto a canvas (in whichever GUI toolkit suited you) to be more
flexible for you.

> Note that this can produce PDF or bitmap output - but it's not 
> interactive.  There is also a GUI to go with it, but I have not looked 
> at this.

The GUI is pretty rudimentary, providing for file selection and just
enough document formatting so as to not be entirely useless to the non-
programmer.  An improved version (but still not interactive) is in a
perennially almost-ready state as wxPython widgets in the current source,
waiting for a serious fixing and a wxApp to hang from.

-- 
Dr Leighton Pritchard AMRSC
D131, Plant-Pathogen Interactions, Scottish Crop Research Institute
Invergowrie, Dundee, Scotland, DD2 5DA, UK
T: +44 (0)1382 562731 x2405 F: +44 (0)1382 568578
E: lpritc at scri.sari.ac.uk   W: http://bioinf.scri.sari.ac.uk/lp
GPG/PGP: FEFC205C E58BA41B  http://www.keyserver.net             


From mdehoon at c2b2.columbia.edu  Thu Apr 27 15:31:43 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Thu, 27 Apr 2006 11:31:43 -0400
Subject: [BioPython] GenBank
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF1A@cgcmail.cgc.cpmc.columbia.edu>

I was not able to replicate this error -- both Biopython 1.41 and the current
CVS version worked fine for me. Perhaps a temporary internet failure?
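
If it is a transient network problem, simply retrying the call a few times
may be enough. Below is a minimal sketch of that idea. The helper name
call_with_retries and its parameters are hypothetical (not part of
Biopython), and catching IndexError is only an assumption based on the
traceback in the original message.

```python
import time


def call_with_retries(func, attempts=3, delay=2.0,
                      exceptions=(IOError, IndexError)):
    """Call func() up to `attempts` times, pausing `delay` seconds between tries.

    Hypothetical helper for illustration; re-raises the last error if
    every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except exceptions as err:
            last_error = err
            # Wait before the next try, but not after the final failure.
            if attempt < attempts - 1:
                time.sleep(delay)
    raise last_error
```

With the call from the original message this would look something like
call_with_retries(lambda: GenBank.search_for("Opuntia AND rpl16")). If the
error persists across several attempts, it is probably not a network glitch.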

--Michiel.

Michiel de Hoon
Center for Computational Biology and Bioinformatics
Columbia University
1150 St Nicholas Avenue
New York, NY 10032




From bill at barnard-engineering.com  Fri Apr 28 04:44:28 2006
From: bill at barnard-engineering.com (Bill Barnard)
Date: Thu, 27 Apr 2006 21:44:28 -0700
Subject: [BioPython] Updating the tutorial,
	was :Parsing and Creating	Dictionaries of GenBank files
In-Reply-To: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF05@cgcmail.cgc.cpmc.columbia.edu>
References: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF05@cgcmail.cgc.cpmc.columbia.edu>
Message-ID: <1146199468.5816.34.camel@lyell.barnard-engineering.com>

On Fri, 2006-04-21 at 12:26 -0400, Michiel De Hoon wrote:
> > Something funny seems to have happened to the plain text version:
> > 
> >
> http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Doc/Tutorial.txt.diff?r1=1.5&r2=1.6&cvsroot=biopython
> 
> The plain text version is generated by hevea, so not by tex directly. The
> funny output is likely due to having a different hevea version (which I ran a
> couple of times). I didn't see anything obviously wrong with the Tutorial.tex
> source file, so I think these errors are due to errors in the Tutorial.tex ->
> Tutorial.txt translation by hevea.

FWIW - I just updated from CVS and ran my updated Doc makefiles (see
http://bugzilla.open-bio.org/show_bug.cgi?id=1939 ) and don't see any of
the weird artifacts in the generated Tutorial.txt file. My hevea version
is 1.06.

> 
> > If generating a consistent plain text version is a lot of hassle, then 
> > maybe we can live without it?
> 
> Currently, the plain text version is not very useful. It's not a source file,
> so it should not be in CVS. On the other hand, the plain text version is not
> available from the Biopython documentation page, and users are better off
> with the PDF version anyway. So I think nobody will miss the plain text
> version. Correct me if I'm wrong.

As long as your release process includes running a make in the Doc tree,
then you can generate the txt file from the tex source.

Bill


From mdehoon at c2b2.columbia.edu  Fri Apr 28 16:37:30 2006
From: mdehoon at c2b2.columbia.edu (Michiel De Hoon)
Date: Fri, 28 Apr 2006 12:37:30 -0400
Subject: [BioPython] Updating the tutorial,
	was :Parsing and Creating	Dictionaries of GenBank files
Message-ID: <6CA15ADD82E5724F88CB53D50E61C9AE9ECF1E@cgcmail.cgc.cpmc.columbia.edu>

> On Fri, 2006-04-21 at 12:26 -0400, Michiel De Hoon wrote:
> > > Something funny seems to have happened to the plain text version:
> > 
> > The plain text version is generated by hevea, so not by tex directly. The
> > funny output is likely due to having a different hevea version (which I ran
> > a couple of times). I didn't see anything obviously wrong with the
> > Tutorial.tex source file, so I think these errors are due to errors in the
> > Tutorial.tex -> Tutorial.txt translation by hevea.
> 
> FWIW - I just updated from CVS and ran my updated Doc makefiles (see
> http://bugzilla.open-bio.org/show_bug.cgi?id=1939 ) and don't see any of
> the weird artifacts in the generated Tutorial.txt file. My hevea version
> is 1.06.

So it's probably a hevea problem -- I'm using version 1.08.

> As long as your release process includes running a make in the Doc tree,
> then you can generate the txt file from the tex source.

That is one of the steps in building a release -- see
http://www.biopython.org/docs/developer/build.html

--Michiel.


From clayton_kd at yahoo.com  Sat Apr 29 15:05:09 2006
From: clayton_kd at yahoo.com (Kyle Dent)
Date: Sat, 29 Apr 2006 08:05:09 -0700 (PDT)
Subject: [BioPython] GenBank parsing
Message-ID: <20060429150509.82051.qmail@web31509.mail.mud.yahoo.com>

Dear All,

My script was successfully using the GenBank parser
until today, when I tried to get it to parse a GenPept
file. After much experimentation I discovered that it
was also having trouble parsing newly downloaded
GenBank files (downloaded from NCBI).

Is anyone aware of this problem? I understand the
flat file format was updated this month, which is
probably the cause.

The output which I am getting:

Traceback (most recent call last):
  File "C:\work\GB CDS Extractor.py", line 289, in open1_clicked
    loadGenBank(self, self.gbFilePath)
  File "C:\work\GB CDS Extractor.py", line 75, in loadGenBank
    cur_record = genBank_Iterator.next()
  File "C:\Python24\Lib\site-packages\Bio\GenBank\__init__.py", line 129, in next
    return self._parser.parse(File.StringHandle(data))
  File "C:\Python24\Lib\site-packages\Bio\GenBank\__init__.py", line 219, in parse
    self._scanner.feed(handle, self._consumer)
  File "C:\Python24\Lib\site-packages\Bio\GenBank\__init__.py", line 1259, in feed
    self._parser.parseFile(handle)
  File "C:\Python24\Lib\site-packages\Martel\Parser.py", line 328, in parseFile
    self.parseString(fileobj.read())
  File "C:\Python24\Lib\site-packages\Martel\Parser.py", line 356, in parseString
    self._err_handler.fatalError(result)
  File "C:\Python24\lib\xml\sax\handler.py", line 38, in fatalError
    raise exception
Martel.Parser.ParserPositionException: error parsing at or beyond character 136

With thanks,
Kyle




From biopython at maubp.freeserve.co.uk  Sat Apr 29 21:54:59 2006
From: biopython at maubp.freeserve.co.uk (Peter (BioPython))
Date: Sat, 29 Apr 2006 22:54:59 +0100
Subject: [BioPython] GenBank parsing
In-Reply-To: <20060429150509.82051.qmail@web31509.mail.mud.yahoo.com>
References: <20060429150509.82051.qmail@web31509.mail.mud.yahoo.com>
Message-ID: <4453E0B3.9040409@maubp.freeserve.co.uk>

Kyle Dent wrote:
> Dear All,
> 
> My script was successfully using the GenBank parser
> until today, when I tried to get it to parse a GenPept
> file. After much experimentation I discovered that it
> was also having trouble parsing newly downloaded
> GenBank files (downloaded from NCBI).
> 
> Is anyone aware of this problem? I understand the
> flat file format was updated this month, which is
> probably the cause.

I'm aware that earlier in 2006 a new project line was added to the format.  I 
haven't been aware of any further changes... on the other hand, I don't 
think I've ever used a "genpept" file either.

Anyway, judging from the error message, you are using the "old" Martel-based 
parser shipped with BioPython 1.41.

We recommend you update to the current CVS parser, which is (a) more up 
to date, (b) faster, and (c) likely to give slightly more helpful error 
messages if it does get stuck.

For most cases you can simply download this file, replacing your 
Bio/GenBank/__init__.py after making a backup of the old version:

http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Bio/GenBank/__init__.py?cvsroot=biopython
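
The backup step can be sketched in a few lines of Python. This is only an
illustration: the helper name backup_file and the ".bak" suffix are
hypothetical choices, and the path you pass in depends on where Biopython is
installed on your machine.

```python
import os
import shutil


def backup_file(path, suffix=".bak"):
    """Copy `path` to `path + suffix` and return the backup's path.

    Hypothetical helper for illustration. Refuses to overwrite an
    existing backup so an earlier copy is never lost.
    """
    backup_path = path + suffix
    if os.path.exists(backup_path):
        raise IOError("backup already exists: " + backup_path)
    # copy2 also preserves the file's timestamps.
    shutil.copy2(path, backup_path)
    return backup_path
```

Run it on your installed Bio/GenBank/__init__.py first, then save the
downloaded file over the original.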

If you see errors about ReseekFile then you will need to make a few 
other changes...

If you are still having trouble, or need further help making the update, 
please reply back.  Including the GenBank reference of any problem file 
would be handy.
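
To see what the parser was looking at when it gave up, you can print the
text around the reported character position. A minimal sketch follows; the
helper name context_at_offset is hypothetical, and it assumes the offset in
the error message (136 here) counts characters from the start of the text
that was handed to the parser, which may be a single record rather than the
whole file.

```python
def context_at_offset(text, offset, width=40):
    """Return (line_number, snippet) for a character offset in `text`.

    Hypothetical helper for illustration. `line_number` is 1-based and
    `snippet` spans up to `width` characters either side of the offset.
    """
    # Clamp the offset so out-of-range values do not raise.
    offset = max(0, min(offset, len(text)))
    # Lines before the offset are counted by their newline characters.
    line_number = text.count("\n", 0, offset) + 1
    start = max(0, offset - width)
    end = min(len(text), offset + width)
    return line_number, text[start:end]
```

For example, context_at_offset(open(filename).read(), 136) should point at
one of the first few header lines of the record, which is where a format
change would be expected to show up.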

Thank you

Peter