[BioPython] Index Error: List index out of range

Quoc-Dien Trinh qdtrinh@yahoo.com
Thu, 19 Jul 2001 15:05:19 -0400


> At 1:03 PM -0400 7/19/01, Quoc-Dien Trinh wrote:
>> I'm running a python script quite similar to the dictionary.py published in
>> the tutorial for biopython. Essentially, the script writes down all
>> sequences matching a certain pattern. When I run the script against a small
>> database of proteins (e.g. a small proteome), there's no problem, but when I
>> try it on bigger proteomes, like the chicken or the human, there's an error
>> message stating the list index is out of range.
>> 
>> Can someone help me on this?
> 
> Not without a better description of the problem.  I searched through
> the tutorial and couldn't find a script called "dictionary.py".

Sorry, it was fasta_dictionary.py.

> Thus, I don't know what your code looks like, which biopython modules
> it's using, or what the error message looks like.  With this little
> information, I can't do much, except say to check your sequence
> indexes carefully.

Here is the code:

import string
from Bio import Fasta
from Bio.Alphabet import IUPAC

def get_accession_num(fasta_record):
    title_atoms = string.split(fasta_record.title)

    accession_atoms = string.split(title_atoms[0], '|')

    gb_name = accession_atoms[1]

    return gb_name

Fasta.index_file("proteome.fasta", "proteome.idx", get_accession_num)

dna_parser = Fasta.SequenceParser(IUPAC.protein)

proteome_dict = Fasta.Dictionary("proteome.idx", dna_parser)

f=open('polya_id.txt','w')

for id_num in proteome_dict.keys():
    my_sequence = proteome_dict[id_num].seq.data
    if my_sequence.find('AAAAA') > -1:
        f.write(id_num + '\n')
...
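
One way to narrow this down might be to scan the FASTA file directly for
title lines that get_accession_num cannot handle, without going through
Biopython at all. The sketch below is plain Python; find_bad_titles is a
hypothetical helper name, and it assumes a record's title is whatever
follows '>' on its header line. It uses str.split(), which for this purpose
behaves like string.split() in the script above.

```python
def find_bad_titles(lines):
    """Return (line_number, line) pairs for FASTA headers that
    get_accession_num could not turn into a key."""
    bad = []
    for line_number, line in enumerate(lines, 1):
        if not line.startswith('>'):
            continue  # sequence data, not a header
        title_atoms = line[1:].split()
        if not title_atoms:
            # Empty title: title_atoms[0] would raise IndexError.
            bad.append((line_number, line.rstrip('\n')))
            continue
        accession_atoms = title_atoms[0].split('|')
        if len(accession_atoms) < 2:
            # No '|' in the first word: accession_atoms[1] would raise IndexError.
            bad.append((line_number, line.rstrip('\n')))
    return bad


# Small made-up sample; the identifiers are invented for illustration.
sample = [
    ">gi|12345|ref|NP_000001.1| hypothetical protein\n",
    "MKVLAAAAAG\n",
    ">\n",
    "MSTNPKPQRK\n",
    ">no_pipe_here some description\n",
    "MAAAAAK\n",
]
for line_number, line in find_bad_titles(sample):
    print(line_number, repr(line))
```

For a real file, the open file object can be passed in place of the sample
list, e.g. find_bad_titles(open("proteome.fasta")).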


proteome.fasta contains every known protein sequence of a species (downloaded
from the Taxonomy section at NCBI). When I run this program with C. elegans
and D. rerio, the script executes without any trouble and returns the desired
list of poly-A proteins. However, when I run it with other species such as
Gallus or Homo sapiens, it fails with the following error
message:

[localhost:biopython-1.00a1/Doc/examples] qdtrinh% python fasta_d*
Traceback (most recent call last):
  File "fasta_dictionary.py", line 14, in ?
    Fasta.index_file("proteome.fasta", "proteome.idx", get_accession_num)
  File "/usr/lib/python2.1/site-packages/Bio/Fasta/__init__.py", line 333, in index_file
    key = rec2key(rec)
  File "fasta_dictionary.py", line 8, in get_accession_num
    accession_atoms = string.split(title_atoms[0], '|')
IndexError: list index out of range
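
The traceback points at title_atoms[0], so for at least one record the split
of the title seems to have produced an empty list, which is what happens with
an empty or whitespace-only title. The sketch below reproduces that in plain
Python and shows one possible defensive variant of the key function.
FakeRecord, get_accession_num_safe and the fallback key are assumptions made
for illustration and are not part of Biopython.

```python
class FakeRecord:
    """Stand-in for a FASTA record; only the .title attribute is used."""
    def __init__(self, title):
        self.title = title


def get_accession_num(fasta_record):
    # Same logic as the script above, written with str methods.
    title_atoms = fasta_record.title.split()
    accession_atoms = title_atoms[0].split('|')
    return accession_atoms[1]


def get_accession_num_safe(fasta_record, fallback):
    """Like get_accession_num, but return `fallback` when the title
    has no usable 'xx|accession|...' field."""
    title_atoms = fasta_record.title.split()
    if not title_atoms:
        return fallback
    accession_atoms = title_atoms[0].split('|')
    if len(accession_atoms) < 2 or not accession_atoms[1]:
        return fallback
    return accession_atoms[1]


# Made-up titles; the identifiers are invented for illustration.
good = FakeRecord("gi|12345|ref|NP_000001.1| hypothetical protein")
empty = FakeRecord("")

print(get_accession_num(good))
try:
    get_accession_num(empty)
except IndexError as err:
    print("IndexError:", err)
print(get_accession_num_safe(empty, "unknown_1"))
```

Since the traceback shows the key function being called as rec2key(rec), with
the record as its only argument, the fallback would have to be bound in a
small wrapper before handing the function to Fasta.index_file.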


 ===========================================================
| Quoc-Dien Trinh         || quoc-dien.trinh@umontreal.ca   |
| Tel.:  (514) 481-2808   || Université de Montréal         |
 ===========================================================



> 
> Jeff

