[BioPython] How to detect sequences that not produce alignments

Sebastian Bassi sbassi at gmail.com
Thu Feb 28 17:18:43 UTC 2008


On Wed, Feb 27, 2008 at 12:41 PM, Bruno Santos <bsantos at biocant.pt> wrote:
>  way to detect which blast_records are empty? Or does the module simply ignore
>  these cases and not put them in the blast_records?

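Here is my code (I also put a copy at http://pastebin.com/f74133375 in
case the formatting gets lost in the mail). It collects the sequence names
from the FASTA file and the query names from the BLAST XML output, then
returns the names that appear only in the FASTA file.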

from Bio import SeqIO
from Bio.Blast import NCBIXML

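def blastcomp(fastafile, blastfile):
    """Return FASTA record names that have no record in the BLAST XML file."""
    # Collect the name of every sequence in the FASTA file.
    fastanames = set()
    with open(fastafile) as handle:
        for record in SeqIO.parse(handle, "fasta"):
            fastanames.add(record.name)
    # Collect the query name of every record in the BLAST XML output.
    blastnames = set()
    with open(blastfile) as handle:
        for b_record in NCBIXML.parse(handle):
            blastnames.add(b_record.query)
    # Names found only in the FASTA file have no BLAST record.
    return fastanames.difference(blastnames)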


blastfile="/home/sbassi/bioinfo/INTA/filtracMT.xml"
fastafile='INTA/allfiltrados.txt'
print(blastcomp(fastafile, blastfile))
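
The function above only helps if queries without hits are left out of the
XML file. If your BLAST version does write a record for every query, the
records with no hits should simply have an empty alignments list, and you
can test for that directly. Below is a minimal sketch of that check. The
name queries_without_hits and the FakeRecord stand-in are made up for
illustration so the snippet runs without Biopython; with real data you
would pass the records from NCBIXML.parse(handle) instead, assuming they
expose .query and .alignments as the Biopython Blast records do.

```python
from collections import namedtuple


def queries_without_hits(blast_records):
    """Return the query names of records whose alignment list is empty."""
    return [rec.query for rec in blast_records if not rec.alignments]


# Stand-in records for illustration only (hypothetical). Real records
# would come from NCBIXML.parse(handle).
FakeRecord = namedtuple("FakeRecord", ["query", "alignments"])

demo = [
    FakeRecord("seq1 some description", ["hit_a", "hit_b"]),
    FakeRecord("seq2", []),
    FakeRecord("seq3", ["hit_c"]),
]

empty = queries_without_hits(demo)
print(empty)
```

Only the record with no alignments is reported, so this covers the case
where the empty records are present in the file rather than missing.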

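One thing to watch when comparing the two sets: record.name is only the
first word of the FASTA title line, while the BLAST query string may carry
the whole title line, description included. If that is the case in your
files, the names will never match. A possible workaround, sketched below
under that assumption, is to compare only the first word on both sides.
The helper names first_token and missing_queries are hypothetical, and the
titles are invented sample data.

```python
def first_token(title):
    """Return the first whitespace-delimited word of a title line, or ''."""
    parts = title.split()
    return parts[0] if parts else ""


def missing_queries(fasta_titles, blast_queries):
    """Return IDs present in the FASTA titles but absent from the BLAST queries."""
    fasta_ids = set(first_token(t) for t in fasta_titles)
    blast_ids = set(first_token(q) for q in blast_queries)
    return fasta_ids - blast_ids


# Invented sample data: three FASTA titles, two of which have BLAST records.
demo_missing = missing_queries(
    ["seq1 first protein", "seq2 second protein", "seq3"],
    ["seq1 first protein", "seq3"],
)
print(sorted(demo_missing))
```

In blastcomp this would amount to adding first_token(b_record.query) to
blastnames instead of b_record.query.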

-- 
Curso Biologia Molecular para programadores: http://tinyurl.com/2vv8w6
Bioinformatics news: http://www.bioinformatica.info
Tutorial libre de Python: http://tinyurl.com/2az5d5


