[EMBOSS] db formatting (?) and parsing issue -- emboss version 5.0.0
George Magklaras
georgios at biotek.uio.no
Wed Jan 28 13:47:40 UTC 2009
Hi Peter, thanks for your reply.
Certainly:
1) For the failed run
(for seq in `cat bhits`; do seqret -debug -filter staphyl68-id:$seq; done),
seqret.dbg contains:
Debug file seqret.dbg buffered:No
ajAcdInitP pgm 'seqret' package ''
ajFileNewIn '/site/share/EMBOSS/acd/seqret.acd'
EOF ajFileGetsL file /site/share/EMBOSS/acd/seqret.acd
closing file '/site/share/EMBOSS/acd/seqret.acd'
ajFileNewIn '/site/share/EMBOSS/acd/codes.english'
EOF ajFileGetsL file /site/share/EMBOSS/acd/codes.english
closing file '/site/share/EMBOSS/acd/codes.english'
ajTableNewFunctionLen hint 25 size 251
ajTableNewFunctionLen hint 25 size 251
ajTableNewFunctionLen hint 25 size 251
ajFileNewIn '/site/share/EMBOSS/acd/knowntypes.standard'
EOF ajFileGetsL file /site/share/EMBOSS/acd/knowntypes.standard
closing file '/site/share/EMBOSS/acd/knowntypes.standard'
Set acdprotein value '$(sequence.protein)'
ajSeqinClear called
' 0..0(N) '' 0 'staphyl68-id:FLTU7OB01AHJ67
'SA to test: 'staphyl68-id:FLTU7OB01AHJ67
format regexp: No list:No
no format specified in USA
...input format not set
dbname dbexp: Yes
'ound dbname 'staphyl68' level: 'id' qry->QryString: 'FLTU7OB01AHJ67
' Field 'id'ng 'FLTU7OB01AHJ67
' acc '' sv '' gi '' des '' org '' key ''
no wildcard in stored qry
database type: 'N' format 'embl'
use access method 'emboss'
Matched seqAccess[1] 'emboss'
seqAccessEmboss type 1
' acc '' hasacc:Yess/u4/tjonasse/mrsa/454/068_reads/' entry 'fltu7ob01ahj67
ajFileNewIn '/div/dias/u4/tjonasse/mrsa/454/068_reads//staphyl68.pxid'
EOF ajFileGetsL file
/div/dias/u4/tjonasse/mrsa/454/068_reads//staphyl68.pxid
closing file '/div/dias/u4/tjonasse/mrsa/454/068_reads//staphyl68.pxid'
ajFileNewIn '/div/dias/u4/tjonasse/mrsa/454/068_reads//staphyl68.ent'
EOF ajFileGetsL file /div/dias/u4/tjonasse/mrsa/454/068_reads//staphyl68.ent
closing file '/div/dias/u4/tjonasse/mrsa/454/068_reads//staphyl68.ent'
' acc: '' hasacc:Yesahj67
B+tree Entry failed
' not foundtry id:'fltu7ob01ahj67
seqEmbossQryClose clean up qryd
Database 'staphyl68' : access method 'emboss' failed
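Several lines in this log start with a stray quote and look partly overwritten (compare 'ound dbname here with found dbname in the successful run), which is what a terminal shows when the query string ends in a carriage return. One thing worth ruling out is therefore that the IDs in bhits carry DOS line endings or trailing whitespace. A minimal sketch, assuming that is the cause; clean_id and fetch_hits are hypothetical helper names, not EMBOSS commands:

```shell
# Remove carriage returns and leading/trailing whitespace from one ID.
clean_id() {
    printf '%s' "$1" | tr -d '\r' | sed 's/^[[:space:]]*//; s/[[:space:]]*$//'
}

# Hypothetical replacement for the original for loop.
# $1 = file with one ID per line, $2 = database name (defaults to staphyl68).
fetch_hits() {
    db=${2:-staphyl68}
    while IFS= read -r seq || [ -n "$seq" ]; do
        id=$(clean_id "$seq")
        [ -n "$id" ] || continue          # skip blank lines
        # stdin is redirected so seqret cannot consume the ID list
        seqret -filter "$db-id:$id" < /dev/null
    done < "$1"
}
```

Running `od -c bhits | head` shows whether the file really contains \r characters; if it does not, this sketch does not apply.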
2) For the successful standalone run
(seqret -debug staphyl68-id:FLTU7OB01AHJ67), seqret.dbg contains:
Debug file seqret.dbg buffered:No
ajAcdInitP pgm 'seqret' package ''
ajFileNewIn '/site/share/EMBOSS/acd/seqret.acd'
EOF ajFileGetsL file /site/share/EMBOSS/acd/seqret.acd
closing file '/site/share/EMBOSS/acd/seqret.acd'
ajFileNewIn '/site/share/EMBOSS/acd/codes.english'
EOF ajFileGetsL file /site/share/EMBOSS/acd/codes.english
closing file '/site/share/EMBOSS/acd/codes.english'
ajTableNewFunctionLen hint 25 size 251
ajTableNewFunctionLen hint 25 size 251
ajTableNewFunctionLen hint 25 size 251
ajFileNewIn '/site/share/EMBOSS/acd/knowntypes.standard'
EOF ajFileGetsL file /site/share/EMBOSS/acd/knowntypes.standard
closing file '/site/share/EMBOSS/acd/knowntypes.standard'
Set acdprotein value '$(sequence.protein)'
ajSeqinClear called
++seqUsaProcess 'staphyl68-id:FLTU7OB01AHJ67' 0..0(N) '' 0
USA to test: 'staphyl68-id:FLTU7OB01AHJ67'
format regexp: No list:No
no format specified in USA
...input format not set
dbname dbexp: Yes
found dbname 'staphyl68' level: 'id' qry->QryString: 'FLTU7OB01AHJ67'
db QryString 'FLTU7OB01AHJ67' Field 'id'
ajSeqQueryWild id 'FLTU7OB01AHJ67' acc '' sv '' gi '' des '' org '' key ''
no wildcard in stored qry
database type: 'N' format 'embl'
use access method 'emboss'
Matched seqAccess[1] 'emboss'
seqAccessEmboss type 1
directory '/div/dias/u4/tjonasse/mrsa/454/068_reads/' entry
'fltu7ob01ahj67' acc '' hasacc:Yes
ajFileNewIn '/div/dias/u4/tjonasse/mrsa/454/068_reads//staphyl68.pxid'
EOF ajFileGetsL file
/div/dias/u4/tjonasse/mrsa/454/068_reads//staphyl68.pxid
closing file '/div/dias/u4/tjonasse/mrsa/454/068_reads//staphyl68.pxid'
ajFileNewIn '/div/dias/u4/tjonasse/mrsa/454/068_reads//staphyl68.ent'
EOF ajFileGetsL file /div/dias/u4/tjonasse/mrsa/454/068_reads//staphyl68.ent
closing file '/div/dias/u4/tjonasse/mrsa/454/068_reads//staphyl68.ent'
entry id: 'fltu7ob01ahj67' acc: '' hasacc:Yes
ajFileNewIn '/div/dias/u4/tjonasse/mrsa/454/068_reads//staphyl68.dat'
seqEmbossQryClose clean up qryd
seqRead: cleared
seqRead: seqin format 3 'embl'
seqRead: one format specified
ajFileBuffNobuff /div/dias/u4/tjonasse/mrsa/454/068_reads//staphyl68.dat
buffsize: 0
++seqRead known format 3
++seqReadFmt format 3 (embl) 'staphyl68-id:FLTU7OB01AHJ67' feat No
seqReadEmbl first line 'ID FLTU7OB01AHJ67; SV 1; linear; unassigned
DNA; STD; UNC; 184 BP.
'
seqReadEmbl ID line found
seqSetName word 'FLTU7OB01AHJ67'
seqSetName 'FLTU7OB01AHJ67' result: 'FLTU7OB01AHJ67'
ajTableNewFunctionLen hint 4 size 251
ajTableNewFunctionLen hint 4 size 251
ajTableNewFunctionLen hint 4 size 251
ajTableNewFunctionLen hint 4 size 251
ajFileBuffClear (0) Nobuff: Yes
size 0: Lines: 0 Curr: 0 Prev: 0 Last: 0 Free: 0 Freelast: 0
ajFileBuffClear
'/div/dias/u4/tjonasse/mrsa/454/068_reads//staphyl68.dat' (0 lines)
Y size: 0 pos: 0 removed 0 lines add to free: 0
Trace buffer file '/div/dias/u4/tjonasse/mrsa/454/068_reads//staphyl68.dat'
Pos: 0 Size: 0 FreeSize: 0 Fpos: 153477365 End: N
Free: 0 Last: -1
seqReadFmt success with format 3 (embl)
seqQueryMatch 'FLTU7OB01AHJ67' id 'fltu7ob01ahj67' acc '' Sv '' Gi ''
Des '' Key '' Org '' Case No Done Yes
seqTypeSet 'N'
ajSeqTypeCheckIn type 'gapany' found (any valid sequence with gaps)
Convert gaps to '-'
ajSeqTypeCheckIn: bad characters test passed, convert
Convert '?' to 'X'
ajSeqTypeCheckIn: OK - no badchars
seqDefine: thys->Db 'staphyl68', seqin->Db 'staphyl68'
seqDefine: thys->Name 'FLTU7OB01AHJ67' type: N
seqDefine: thys->Entryname 'FLTU7OB01AHJ67', seqin->Entryname ''
seqDefine: returns thys->Name 'FLTU7OB01AHJ67' type: N
++ajSeqallread set db: 'staphyl68' => 'staphyl68'
ajSeqallGetName ''
ajSeqIsNuc Type 'N'
ajSeqIsNuc Type 'N'
ajSeqIsProt Type 'N'
ajSeqallGetUsa 'staphyl68-id:FLTU7OB01AHJ67'
ajSeqallGetseqName 'FLTU7OB01AHJ67'
... output format not set, default to 'fasta'
ajSeqoutClear called
... output format not set, default to 'fasta'
ajSeqoutOpen dir '' qrydir ''
seqoutUsaProcess
output USA to test: 'fltu7ob01ahj67.fasta'
format regexp: No
no format specified in USA
file:id regexp: Yes
found filename fltu7ob01ahj67.fasta single: No dir: ''
ajFileNewOutD('' 'fltu7ob01ahj67.fasta')
ajFileNewOutD open name 'fltu7ob01ahj67.fasta'
ajSeqSetRange (len: 184 0..0 old 0..0) rev:No reversed:No
result: (len: 184 0..0)
ajSeqoutWriteSeq 'FLTU7OB01AHJ67' len: 184
ajSeqoutWriteSeq 17 'fasta' single: No feat: No Save: No
seqClone out Setdb '' Db '' seq Setdb '<null>' Db 'staphyl68'
seqClone outseq->Type '' seq->Type 'N'
seqClone 0 .. 0 1 .. 184 len: 184 type: 'N'
Db: 'staphyl68' Name: 'FLTU7OB01AHJ67' Entryname: 'FLTU7OB01AHJ67'
ajSeqTypeCheckS type 'gapany' found (any valid sequence with gaps)
Convert gaps to '-'
Convert '?' to 'X'
ajSeqoutSetNameDefaultS already has a name 'FLTU7OB01AHJ67'
seqWriteFasta outseq Db 'staphyl68' Setdb '' Setoutdb '' Name
'FLTU7OB01AHJ67'
seqoutUfoLocal Features No Ufo 0 ''
ajSeqoutWriteSeq tests features No tabouitisopen No UfoLocal No ftlocal No
ajSeqRead: input file
'/div/dias/u4/tjonasse/mrsa/454/068_reads//staphyl68.dat' still there,
try again
seqRead: cleared
seqRead: single access - count 1 - call access routine again
seqAccessEmboss type 1
seqEmbossQryReuse: query data all finished
seqRead: seqin->Query->Access->Access(seqin) *failed*
ajSeqRead: open buffer usa: 'staphyl68-id:FLTU7OB01AHJ67' returns: No
ajSeqallNext failed
ajSeqinClear called
ajFileBuffClear (-1) Nobuff: Yes
size 0: Lines: 0 Curr: 0 Prev: 0 Last: 0 Free: 0 Freelast: 0
ajFileBuffClear
'/div/dias/u4/tjonasse/mrsa/454/068_reads//staphyl68.dat' (-1 lines)
Y size: 0 pos: 0 removed 0 lines add to free: 0
Trace buffer file '/div/dias/u4/tjonasse/mrsa/454/068_reads//staphyl68.dat'
Pos: 0 Size: 0 FreeSize: 0 Fpos: 153477365 End: N
Free: 0 Last: -1
closing file '/div/dias/u4/tjonasse/mrsa/454/068_reads//staphyl68.dat'
ajSeqoutClose 'fltu7ob01ahj67.fasta'
closing file 'fltu7ob01ahj67.fasta'
ajSeqinDel called usa:''
ajSeqQueryDel db:'' id:''
Final Summary
=============
Table usage : 11 opened, 0 closed, 251 maxsize, 40 maxmem
List usage : 27 opened, 27 closed, 1438 maxsize 2380 nodes
List iterator usage : 4 opened, 4 closed
File usage : 1 opened, 9 closed, 3 max, 10 total
ajNamExit done
Regexp usage (bytes): 168 allocated, 1008 freed, -840 in use (sizes change)
Regexp usage (number): 21 allocated, 21 freed 0 in use
Array usage (bytes): 0 allocated, 0 freed, 0 in use
Array usage (number): 0 allocated, 0 freed, 0 resized, 0 in use
Array usage 2D (bytes): 0 allocated, 0 freed, 0 in use
Array usage 2D (number): 0 allocated, 0 freed, 0 resized, 0 in use
Array usage 3D (bytes): 0 allocated, 0 freed, 0 in use
Array usage 3D (number): 0 allocated, 0 freed, 0 resized, 0 in use
String usage (bytes): 268013 allocated, 268270 freed, -257 in use
String usage (number): 4982 allocated, 4979 freed 3 in use
Memory usage (bytes): 535329 allocated, 640 reallocated 503881 zeroed
Memory usage (number): 14393 allocates, 14405 frees, 10 resizes, -12 in use
closing file 'seqret.dbg'
3) The staphyl68.pxid file contains:
Order 60
Fill 42
Pagesize 2048
Level 2
Cachesize 200
Order2 82
Fill2 99
Count 288506
Kwlimit 15
In addition, the database definition and resource record for the
staphyl68 database in my local .embossrc file are as follows (an idlen
of 20 should accommodate the length of the ID field, shouldn't it?):
DB staphyl68 [
type: N
method: emboss
format: embl
fields: "id,des"
file: staphyl68.dat
indexdirectory: /div/dias/u4/tjonasse/mrsa/454/068_reads/
comment: "mrsa staphyl68 reads"
]
RES staphyl68 [
type: Index
idlen: 20
deslen: 50
]
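Whether idlen: 20 is large enough can be checked directly against the flat file. A minimal sketch, assuming staphyl68.dat uses the EMBL ID line layout seen in the debug log ('ID FLTU7OB01AHJ67; SV 1; ...'); max_id_len is a hypothetical helper name:

```shell
# Print the length of the longest entry ID found on EMBL "ID" lines.
# $1 = path to the EMBL-format flat file.
max_id_len() {
    awk '/^ID / {
             id = $2
             sub(/;$/, "", id)             # drop the trailing semicolon
             if (length(id) > max) max = length(id)
         }
         END { print max + 0 }' "$1"
}
```

For example, `max_id_len staphyl68.dat` should print a value no larger than 20 if the resource record is adequate; the ID FLTU7OB01AHJ67 shown in the log is 14 characters long.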
Best regards,
GM
Peter Rice wrote:
> Hi George,
>
>> Why does the filter mode seqret invoked inside the for loop fail while
>> this one works, and why does the problem exist only for the 'bfile'
>> and not for the 'afile'?
>
> Can you add "-debug" to the seqret commandline and send me the
> seqret.dbg file (it will be for the last seqret run so you'll need some
> way to make sure the last run failed)
>
> and also send the seqret.dbg file for running seqret standalone with the
> same ID that worked.
>
> It would also be useful to see the .pxid file for the staphyl68 database
> (it includes the length of ID that was indexed - your IDs are quite long
> for dbxflat)
>
> regards,
>
> Peter
>