[BioPython] Can't download a FASTA file from NCBI to BLAST

Roger Barrette rwbarrette at gmail.com
Tue Jun 19 11:58:19 UTC 2007


I am trying to set up a script that automatically goes to NCBI and retrieves
individual FASTA files based on a list of accession numbers (either gi or
NC).  The code that I have written gets the sequences and saves the file,
but when I run a BLAST against the file, it doesn't work.  Am I not using the
correct parser for preparing to save the file for BLASTing?  I tried to set
the format to "fasta", but I was getting errors saying that gi_list[0]
doesn't contain the argument 'data.seq'.  I also tried the argument
.sequence, and it gave me the same error.  I realize I'm not currently
calling the file in as a FASTA, but this is the only way I've been able to
automate the record retrieval process for the long series of BLAST runs
that I have to do.  I have a separate function for calling BLAST, and it
works fine with manually downloaded FASTA files, so the problem appears to
be here.  Any suggestions for a fix, or even a better way to do this, would
be greatly appreciated.  Thanks.  My code is:


def Get_FASTA_Seq(NC_ID):

    i = NC_ID

## Search for Viruses based on TXID

    from Bio import GenBank
    gi_list = GenBank.search_for(i)
    ncbi_dict = GenBank.NCBIDictionary("nucleotide","genbank")

    fasta_file = open("c:\Current_Query.gbk", "w")

## Extract individual Sequence from NCBI based on gi# or NC#  ##

    gb_record = ncbi_dict[gi_list[0]]
    record_parser = GenBank.FeatureParser()
    ncbi_dict = GenBank.NCBIDictionary("nucleotide", "genbank",
                                       parser=record_parser)
    gb_seqrecord = ncbi_dict[gi_list[0]]

    SeqValue = ncbi_dict[gi_list[0]].seq.data
    NameValue = ncbi_dict[gi_list[0]].annotations["organism"]
    Length = len(SeqValue)
    Seq5 = 0
    Seq3 = Seq5 + Length

    print NameValue
    print Length
    print SeqValue

## Write sequences into the FASTA file ##

    fasta_file.write(">" + i + " " + NameValue + "\n")
    for j in range(0, len(SeqValue[Seq5:Seq3]), Length):
        fasta_file.write(SeqValue[Seq5:Seq3])
        fasta_file.write("\n")
## Close and Save the FASTA file ##
    fasta_file.close()
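
For comparison, here is a minimal sketch of writing a single FASTA record
with the sequence wrapped at a fixed width, which is the conventional layout
for FASTA files.  The code above writes the whole sequence on one line.
Whether line wrapping has anything to do with the BLAST failure is not
established in this post.  The sketch assumes a current Python 3 interpreter
rather than the Python 2 used above, and the helper name write_fasta_record,
the accession, and the sequence are all hypothetical.  It does not contact
NCBI.

```python
import io


def write_fasta_record(handle, identifier, description, sequence, width=60):
    """Write one FASTA record to an open text handle, wrapping the sequence.

    Hypothetical helper for illustration; not part of Biopython.
    """
    # Header line: ">" followed by the identifier and an optional description.
    header = ">" + identifier
    if description:
        header += " " + description
    handle.write(header + "\n")

    # Drop any embedded whitespace so each slice is a clean fixed-width line.
    cleaned = "".join(sequence.split())
    for start in range(0, len(cleaned), width):
        handle.write(cleaned[start:start + width] + "\n")


# Usage with an in-memory buffer; the accession and sequence are made up.
buffer = io.StringIO()
write_fasta_record(buffer, "NC_000000", "Example virus", "ACGT" * 40, width=60)
fasta_text = buffer.getvalue()
```

The same function can be given a real file opened with open(path, "w") in
place of the in-memory buffer, with the identifier, organism name, and
sequence string taken from the retrieved record.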


