[Biopython-dev] [Bug 1704] New: problem with Bio.Blast.NCBIStandalone

bugzilla-daemon at portal.open-bio.org bugzilla-daemon at portal.open-bio.org
Mon Oct 25 06:56:56 EDT 2004


http://bugzilla.open-bio.org/show_bug.cgi?id=1704

           Summary: problem with Bio.Blast.NCBIStandalone
           Product: Biopython
           Version: Not Applicable
          Platform: All
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Main Distribution
        AssignedTo: biopython-dev at biopython.org
        ReportedBy: gebauer-jung at ice.mpg.de


If the BLAST database is generated from a GI list via a *.nal file like this:

#
TITLE insects
#
DBLIST ./nr
#
GILIST insects.list
#

the database report at the end of the BLAST output file looks like this:

...

Query: 2656 accaacaaaaccaacatca 2674
            |||||||||||||||||||
Sbjct: 85   accaacaaaaccaacatca 67


  Subset of the database(s) listed below
     Number of letters searched: 562,618,960
     Number of sequences searched:  228,924

  Database: insects
    Posted date:  Oct 17, 2004 10:00 PM
  Number of letters in database: 3,987,564,307
  Number of sequences in database:  991,337

  Database: /bio/blast/./nt.01
    Posted date:  Oct 17, 2004 11:04 PM
  Number of letters in database: 3,989,920,418
  Number of sequences in database:  760,163
  
  Database: /bio/blast/./nt.02
    Posted date:  Oct 18, 2004  2:00 AM
  Number of letters in database: 3,989,747,597
  Number of sequences in database:  888,596

  Database: /bio/blast/./nt.03
    Posted date:  Oct 15, 2004  1:00 AM
  Number of letters in database: 14,716,213
  Number of sequences in database:  1558

Lambda     K      H
    1.37    0.711     1.31

Gapped
Lambda     K      H
    1.37    0.711     1.31
...


The 'Subset of the database(s) ...' line causes
_Scanner._scan_database_report() to crash.
Bio.Blast.Record also has no structure to hold this data (should there be
any need to keep it).
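
To illustrate what keeping this data could look like, here is a minimal
sketch that is independent of Biopython. SubsetInfo and parse_subset_header
are hypothetical names made up for this example; nothing like them exists in
Bio.Blast.Record, and the regular expression only assumes the line layout
shown in the report above.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class SubsetInfo:
    """Hypothetical container for the 'Subset of the database(s)' block."""
    letters_searched: int
    sequences_searched: int


def parse_subset_header(lines: Iterable[str]) -> Optional[SubsetInfo]:
    """Collect the two 'searched' counts that precede the first Database line.

    Returns None if either count is missing, e.g. for a report without
    a GI-list subset header.
    """
    pattern = re.compile(r"Number of (letters|sequences) searched:\s*([\d,]+)")
    letters = sequences = None
    for line in lines:
        text = line.strip()
        match = pattern.match(text)
        if match:
            # Counts are printed with thousands separators, e.g. 562,618,960.
            value = int(match.group(2).replace(",", ""))
            if match.group(1) == "letters":
                letters = value
            else:
                sequences = value
        elif text.startswith("Database:"):
            # The regular per-database report starts here; stop looking.
            break
    if letters is None or sequences is None:
        return None
    return SubsetInfo(letters, sequences)
```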

As a work-around I suggest the following change in Bio/Blast/NCBIStandalone.py:

422,423c422,432
<       while 1:
<             read_and_call(uhandle, consumer.database, start='  Database')
---
>
>         while 1:
>         #      read_and_call(uhandle, consumer.database, start='  Database')
>         # work-around to skip:
>         #  Subset of the database(s) listed below
>         #  Number of letters searched: 562,618,960
>         #  Number of sequences searched:  228,924
>         #
>         # even Record.DatabaseRecord does not contain any structure to keep this stuff
>             read_and_call_until(uhandle, consumer.database, start='  Database')
>
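
The idea behind the work-around (read past anything that precedes the first
'  Database' line) can be sketched independently of Biopython. This is only
an illustration: skip_until_prefix is a hypothetical helper, not part of
Bio.Blast or Bio.ParserSupport, and it works on a plain iterator of lines
rather than on the scanner's handle.

```python
from typing import Iterator, List, Optional, Tuple


def skip_until_prefix(
    lines: Iterator[str], prefix: str
) -> Tuple[List[str], Optional[str]]:
    """Consume lines until one starts with `prefix`.

    Returns the skipped lines and the first matching line
    (None if the input ends without a match).
    """
    skipped: List[str] = []
    for line in lines:
        if line.startswith(prefix):
            return skipped, line
        skipped.append(line)
    return skipped, None


# Usage with the tail of the report quoted above.
report = iter([
    "  Subset of the database(s) listed below",
    "     Number of letters searched: 562,618,960",
    "     Number of sequences searched:  228,924",
    "",
    "  Database: insects",
    "    Posted date:  Oct 17, 2004 10:00 PM",
])
skipped, first_database_line = skip_until_prefix(report, "  Database")
```

Because the iterator is consumed in place, the caller can continue reading
the 'Posted date' line from the same iterator afterwards.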



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.


