[Biopython] Is this feasible?

Willis, Jordan R jordan.r.willis at Vanderbilt.Edu
Tue Jun 8 03:36:05 UTC 2010


Hello, I'm relatively new to both programming and bioinformatics. I wanted to know if anyone knows how to do something like this:

I have a list of sequences that have evolved away from a given germline sequence. I was going to use Biopython to iteratively find the mutant closest to the germline and pull it out of the list. I would then align these two sequences and give a score. I would then take the remaining sequences, find the one closest to the sequence that was just taken out of the list, and do the same thing until the list is empty.

The output would look something like this:

Prints to screen:
--------------------------------------------------------------------------------------------------------------------
Round one:
Seed(germline)
Closest scoring sequence --> with a bit score of <number from score algorithm>

Round two:
Closest scoring sequence to seed
Next closest scoring sequence(s) ---> with a bit score of....
...
Round N:
Seed
Next to last closest scoring sequence
Last place sequence(s) ---> with a bit score of....

--------------------------------------------------------------------------------------------------------------------


In a way, it's a sort of tractable phylogenetic tree, but with simpler sequences.
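
Stripped of BLAST, the procedure I have in mind is a greedy chain, roughly like the sketch below. It assumes all sequences have the same length and scores a pair by the fraction of identical positions, which is only a stand-in for a real alignment score and not what BLAST reports. The names identity and greedy_chain and the toy sequences are made up for the illustration.

```python
def identity(a, b):
    """Fraction of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("this toy score assumes equal-length sequences")
    if not a:
        return 1.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / float(len(a))


def greedy_chain(seed, mutants, score=identity):
    """Repeatedly pull out the mutant closest to the current seed.

    Returns a list of (previous, picked, score) tuples, one per round.
    """
    remaining = list(mutants)  # copy so the caller's list is untouched
    rounds = []
    current = seed
    while remaining:
        # max() keeps the first of several equally good candidates.
        best = max(remaining, key=lambda s: score(current, s))
        rounds.append((current, best, score(current, best)))
        remaining.remove(best)
        current = best  # the picked sequence becomes the next seed
    return rounds


if __name__ == "__main__":
    # Hypothetical toy data: one germline and three mutants.
    germline = "ACDEFGHIKL"
    mutants = ["ACDEYGHIRW", "ACDEYGHIKL", "ACDEYGHIRL"]
    for number, (prev, picked, s) in enumerate(greedy_chain(germline, mutants), 1):
        print("Round %d: %s -> %s (identity %.2f)" % (number, prev, picked, s))
```

Any other pairwise score could be passed in through the score argument, as long as a higher value means a closer pair. The BLAST-based version I have been trying follows.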


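import subprocess
import sys

from Bio import SeqIO
from Bio.Blast import NCBIXML
from Bio.Blast.Applications import NcbiblastpCommandline


def run_blast(command):
    # Run the BLAST command line; it writes its XML report to tmp.xml.
    subprocess.call(str(command), shell=(sys.platform != "win32"))
    return "tmp.xml"


def main():
    # Mutants still to be placed, keyed by FASTA id.
    database = dict((rec.id, rec) for rec in SeqIO.parse("Input.fasta", "fasta"))
    germline = SeqIO.read("germline.fasta", "fasta")
    while database:
        # BLAST works on files, so write out the query and the remaining mutants.
        SeqIO.write([germline], "query.fasta", "fasta")
        SeqIO.write(list(database.values()), "subjects.fasta", "fasta")
        cline = NcbiblastpCommandline(query="query.fasta", subject="subjects.fasta",
                                      outfmt=5, out="tmp.xml")
        blast_record = NCBIXML.read(open(run_blast(cline)))
        if not blast_record.alignments:
            break  # no remaining mutant gives a hit
        best = blast_record.alignments[0]
        print(best)
        print(germline)
        print("\n\n\n")
        # Assumes hit_def starts with the FASTA id of the matched mutant.
        germline = database.pop(best.hit_def.split()[0])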



I guess my first question is whether this seems logical. Is BLAST the best algorithm to use for this scenario? The other problem is creating my own database. I read the documentation, and it said you could create your own database to run against a local BLAST (which I have installed); I just have no idea how to do that. The second thing is the first alignment in the parsed BLAST record (index 0): will this always be the best-scoring sequence?
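
On that last point, rather than relying on the order of the hits, I could pick the top hit explicitly. Below is a minimal sketch, assuming each alignment has an hsps list and each HSP a bits attribute, as the objects in Bio.Blast.Record do. best_alignment is a hypothetical helper, and the demo uses stand-in objects with made-up scores instead of real BLAST output.

```python
from types import SimpleNamespace


def best_alignment(alignments):
    """Return the alignment whose best HSP has the highest bit score, or None."""
    # Ignore alignments without any HSPs so max() never sees an empty list.
    scored = [a for a in alignments if a.hsps]
    if not scored:
        return None
    return max(scored, key=lambda a: max(h.bits for h in a.hsps))


if __name__ == "__main__":
    # Stand-in objects that mimic the attributes used above.
    hits = [
        SimpleNamespace(title="mutant_a", hsps=[SimpleNamespace(bits=180.2)]),
        SimpleNamespace(title="mutant_b", hsps=[SimpleNamespace(bits=95.0),
                                                SimpleNamespace(bits=201.5)]),
    ]
    print(best_alignment(hits).title)
```

Returning None for an empty hit list also gives the loop a clean way to stop when nothing similar is left.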

Any help would be much appreciated.

Thanks for the help,

Jordan



