[BioPython] Index Error: List index out of range
Quoc-Dien Trinh
qdtrinh@yahoo.com
Thu, 19 Jul 2001 15:05:19 -0400
> At 1:03 PM -0400 7/19/01, Quoc-Dien Trinh wrote:
>> I'm running a python script quite similar to the dictionary.py published in
>> the tutorial for biopython. Essentially, the script writes down all
>> sequences matching a certain pattern. When I run the script against a small
>> database of proteins (e.g. a small proteome), there's no problem, but when I
>> try it on bigger proteomes, like the chicken or the human, there's an error
>> message stating the list index is out of range.
>>
>> Can someone help me on this?
>
> Not without a better description of the problem. I searched through
> the tutorial and couldn't find a script called "dictionary.py".
Sorry, it was fasta_dictionary.py
> Thus, I don't know what your code looks like, which biopython modules
> it's using, or what the error message looks like. With this little
> information, I can't do much, except say to check your sequence
> indexes carefully.
Here is the code:
import string
from Bio import Fasta
from Bio.Alphabet import IUPAC

def get_accession_num(fasta_record):
    title_atoms = string.split(fasta_record.title)
    accession_atoms = string.split(title_atoms[0], '|')
    gb_name = accession_atoms[1]
    return gb_name

Fasta.index_file("proteome.fasta", "proteome.idx", get_accession_num)
dna_parser = Fasta.SequenceParser(IUPAC.protein)
proteome_dict = Fasta.Dictionary("proteome.idx", dna_parser)
f = open('polya_id.txt', 'w')
for id_num in proteome_dict.keys():
    my_sequence = proteome_dict[id_num].seq.data
    if my_sequence.find('AAAAA') > -1:
        f.write(id_num + '\n')
...
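To make clear what the key function computes, here is a minimal standalone sketch of the same two splits, applied to a plain string instead of a Fasta record. The name accession_from_title and the example title are made up for illustration, and the sketch uses string methods rather than the string module functions, so it does not depend on Biopython.

```python
def accession_from_title(title):
    """Return the second '|'-separated field of the first whitespace-separated token."""
    # Split the title on whitespace; the identifier block is the first token.
    title_atoms = title.split()
    # Split that token on '|' and take the field at index 1, as the script does.
    accession_atoms = title_atoms[0].split('|')
    return accession_atoms[1]

# Hypothetical title, invented for illustration only.
example_title = "gi|0000|gb|XX000000| example protein"
key = accession_from_title(example_title)
print(key)
```

Both indexing steps assume the title has at least one token and that the token contains at least one '|'.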
The file proteome.fasta contains every known protein sequence of a species
(downloaded from the Taxonomy section at NCBI). When I run this program with
C. elegans and D. rerio, the script executes without any trouble and returns
the desired list of poly-A proteins. However, when I run it with other
species such as Gallus or Homo sapiens, it doesn't work and I get the
following error message:
[localhost:biopython-1.00a1/Doc/examples] qdtrinh% python fasta_d*
Traceback (most recent call last):
  File "fasta_dictionary.py", line 14, in ?
    Fasta.index_file("proteome.fasta", "proteome.idx", get_accession_num)
  File "/usr/lib/python2.1/site-packages/Bio/Fasta/__init__.py", line 333, in index_file
    key = rec2key(rec)
  File "fasta_dictionary.py", line 8, in get_accession_num
    accession_atoms = string.split(title_atoms[0], '|')
IndexError: list index out of range
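One reading of the traceback: the only indexing on the failing line is title_atoms[0], so the split of the record title must have returned an empty list, which is what happens when a title is empty or contains only whitespace. The sketch below is one possible way to look for such records, assuming the file is plain FASTA with '>' header lines. The name find_blank_headers and the sample lines are hypothetical, and the sketch does not use Biopython, so it may not match how the Fasta parser itself builds titles.

```python
def find_blank_headers(lines):
    """Yield 1-based line numbers of '>' header lines that carry no title text."""
    for line_number, line in enumerate(lines, start=1):
        # Splitting an empty or whitespace-only title yields an empty list,
        # which is the condition that makes title_atoms[0] fail.
        if line.startswith('>') and not line[1:].split():
            yield line_number

# Made-up sample lines for illustration; not taken from any real proteome file.
sample = [
    ">gi|0000|gb|XX000000| example protein\n",
    "MKAAAAAL\n",
    ">\n",
    "MKV\n",
    ">   \n",
    "MAAAAAK\n",
]
blank = list(find_blank_headers(sample))
print(blank)

# Against a real file the same function accepts an open file handle, e.g.:
#     with open("proteome.fasta") as handle:
#         print(list(find_blank_headers(handle)))
```

If this reports any line numbers, those records would need either fixing in the file or a guard in the key function before title_atoms[0] is used.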
===========================================================
| Quoc-Dien Trinh || quoc-dien.trinh@umontreal.ca |
| Tel.: (514) 481-2808 || Université de Montréal |
===========================================================
>
> Jeff
> _______________________________________________
> BioPython mailing list - BioPython@biopython.org
> http://biopython.org/mailman/listinfo/biopython