[Biopython-dev] [Bug 1716] New: Fasta.Dictionary should throw KeyError for invalid keys

Wed Dec 8 18:27:14 EST 2004

http://bugzilla.open-bio.org/show_bug.cgi?id=1716

           Summary: Fasta.Dictionary should throw KeyError for invalid keys
           Product: Biopython
           Version: Not Applicable
          Platform: All
        OS/Version: Windows 2000
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Main Distribution
        AssignedTo: biopython-dev at biopython.org
        ReportedBy: biopython-bugzilla at maubp.freeserve.co.uk

Product: Biopython
Version: 1.30
Component: Bio.Fasta
OS: Windows and Linux, assume all

The Fasta.Dictionary object can be used to access sequences by key:

fasta_dict = Fasta.Dictionary(...)
my_seq = fasta_dict[my_key]

Try doing this for an invalid key (e.g. a gene not present in the Fasta file).

Actual result:
Fasta.Dictionary raises a ZeroDivisionError exception (from
Bio\Mindy\FlatDB.py; see the traceback in the reproduction steps).

Expected Result:
By analogy with the Python dictionary, a KeyError exception should occur.
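
To illustrate the expected behaviour, here is a minimal sketch of a wrapper
that turns the failed lookup into a KeyError. It does not use Bio.Fasta at
all: BrokenIndex is a hypothetical stand-in that fails the way the traceback
below does, and KeyErrorDict is an invented name, not part of Biopython.

```python
class BrokenIndex:
    """Hypothetical stand-in for the index: unknown keys fail with ZeroDivisionError."""

    def __init__(self, records):
        self._records = dict(records)

    def __getitem__(self, key):
        if key not in self._records:
            # Mimics the reported symptom only, not the real Bio.Mindy code.
            raise ZeroDivisionError("long division or modulo by zero")
        return self._records[key]


class KeyErrorDict:
    """Wrapper (hypothetical name) that reports unknown keys like a Python dict."""

    def __init__(self, index):
        self._index = index

    def __getitem__(self, key):
        try:
            return self._index[key]
        except ZeroDivisionError:
            # Translate the internal failure into the exception callers expect.
            raise KeyError(key)


fasta_like = KeyErrorDict(BrokenIndex({"Alpha": "AAAAAAAAAAAA", "Delta": "LITTLE"}))
print(fasta_like["Alpha"])
try:
    fasta_like["non_existant_key"]
except KeyError as err:
    print("KeyError: %s" % err)
```

Catching ZeroDivisionError in calling code is only a stopgap; the cleaner fix
would be for the lookup itself to raise KeyError.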

--------------------------------------------------------
Detailed reproduction steps:

(1) Create and change to an empty test directory
(2) Create the file test.faa as below
(3) Create the file test.py as below
(4) Run the test script, test.py
(5) Check you get the following output:

Building FASTA index file using gene name as index, Done
Loading FASTA index file, Done
Valid keys:
['Alpha', 'Beta', 'Delta', 'Gamma']
About to try an use a non-existant key!
Traceback (most recent call last):
  File "C:\Temp\fasta_dict_bug\test.py", line 42, in ?
    seq = fasta_dict['non_existant_key']
  File "c:\python23\Lib\site-packages\Bio\Fasta\__init__.py", line 190, in __getitem__
    seqs = self._index.lookup(aliases = key)
  File "c:\python23\Lib\site-packages\Bio\Mindy\BaseDB.py", line 118, in lookup
    return self[namespace][name]
  File "c:\python23\Lib\site-packages\Bio\Mindy\FlatDB.py", line 351, in __getitem__
    primary_keys = _lookup_alias(id_filename, name)
  File "c:\python23\Lib\site-packages\Bio\Mindy\FlatDB.py", line 264, in _lookup_alias
    lines = _find_range(id_filename, word)
  File "c:\python23\Lib\site-packages\Bio\Mindy\FlatDB.py", line 243, in _find_range
    bf = BisectFile(infile, size)
  File "c:\python23\Lib\site-packages\Bio\Mindy\FlatDB.py", line 216, in __init__
    assert (size - 4) % self.record_size == 0, "record size is wrong"
ZeroDivisionError: long division or modulo by zero
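
Note on the last frame: the ZeroDivisionError is raised while the assert
condition itself is being evaluated, which means self.record_size is zero at
that point. Why it is zero for a missing key is not established here. The
sketch below reproduces only that effect with made-up numbers;
check_record_size is a hypothetical helper, not a function from FlatDB.py.

```python
def check_record_size(size, record_size):
    """Return a label for what the assert does with these (made-up) values."""
    try:
        # Same shape as the assert on line 216 of the traceback.
        assert (size - 4) % record_size == 0, "record size is wrong"
    except AssertionError:
        return "AssertionError"
    except ZeroDivisionError:
        # The modulo fails before the assert can compare anything.
        return "ZeroDivisionError"
    return "ok"


print(check_record_size(20, 8))  # sizes agree, assert passes
print(check_record_size(21, 8))  # sizes disagree, assert fails
print(check_record_size(20, 0))  # zero record size, modulo itself fails
```

So the assert never gets the chance to report "record size is wrong"; the
zero divisor is hit first.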

-------------------------------------------------------
test.faa
-------------------------------------------------------
>Alpha This is the first sample
AAAAAAAAAAAA
>Beta Second sample [sensible data!]
GVMNMTISFLSEHIFI
>Gamma Third sample, with a silly sequence
CASTLEINTHESKY
>Delta Fourth sample, again with a silly sequence
LITTLE
-------------------------------------------------------
test.py
-------------------------------------------------------
import os
import string
import Bio
from Bio import Fasta
from Bio.Alphabet import IUPAC

filename_faa="test.faa"
filename_idx="test.idx"

if os.path.isfile(filename_idx) :
    # The index is a file on older versions of BioPython,
    # mind you Fasta.Dictionary seemed to be broken on
    # BioPython 1.24 so there isn't much point trying.
    pass
elif os.path.isdir(filename_idx) :
    #The index files should exist
    pass
else :
    print "Building FASTA index file using gene name as index,",
    Fasta.index_file(filename_faa, filename_idx, \
                     lambda seq : string.split(seq.title)[0])
    print "Done"

print "Loading FASTA index file,",
fasta_dict = Fasta.Dictionary(filename_idx,
                              Fasta.SequenceParser(IUPAC.ambiguous_dna))
print "Done"

print "Valid keys:"
print fasta_dict.keys()
print "About to try an use a non-existant key!"
seq = fasta_dict['non_existant_key']
print "Done"
