[Bioperl-l] Indexing est fasta file.

Lincoln Stein lstein at cshl.edu
Thu Jul 17 12:02:12 EDT 2003


If you're still having trouble, you might try Bio::DB::Fasta, which uses a 
slightly different index structure and defaults to DB_File if it possibly can.
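
A minimal sketch of that approach, assuming a reasonably recent BioPerl in
which Bio::DB::Fasta->new accepts a -makeid callback. The helper name
gi_from_header and the command-line handling are made up for illustration;
they are not part of BioPerl or of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: pull the gi number out of an NCBI-style header
# such as ">gi|6861423|gb|AW357480.1|AW357480 ...". Falls back to the
# first whitespace-delimited token when no gi number is present.
sub gi_from_header {
    my ($header) = @_;
    return $1 if $header =~ /gi\|(\d+)/;
    return $1 if $header =~ /^>?(\S+)/;
    return;
}

# Only load BioPerl when a FASTA path and an ID are given as arguments.
if (@ARGV == 2) {
    my ($fasta_path, $gi) = @ARGV;
    require Bio::DB::Fasta;

    # Builds the index on first use and reuses it on later runs.
    my $db = Bio::DB::Fasta->new($fasta_path, -makeid => \&gi_from_header);

    my $seq = $db->seq($gi);
    defined $seq or die "No sequence indexed under '$gi'\n";
    printf "%s\t%d bp\n", $gi, $db->length($gi);
    print "$seq\n";
}
```

Run it as "perl script.pl est.fasta 6861423" (file name hypothetical). Each
record gets a single key here, so scripts that look sequences up by the full
"gi|...|gb|..." string would need a second index or a different callback.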

Lincoln

On Thursday 10 July 2003 03:16 pm, Ivan Sendin wrote:
> Brian,
>
> > Puzzling. One thing I'd try if this had happened to me would be to
> > switch from SDBM to DB_File as the indexing method. In order to do
> > this you'll have to install DB_File (from RPM say, or from
> > www.sleepycat.com) and then install the DB_File Perl module. This
> > doesn't sound like a problem with Bioperl per se, it seems like a
> > problem with SDBM, that's my guess.
>
> I'll try DB_File.
>
> > One question though: why do you say that the keys are both
> > "gi|6861423|gb|AW357480.1|AW357480" and "6861423" for that sequence?
>
> Because there are old scripts that search for
> "gi|6861423|gb|AW357480.1|AW357480" and new scripts that use only the
> gi number. I will fix it someday.
>
> I tried indexing the est file again using only the gi as the key, and
> the error happened again:
>
> 15944439
> 15944440
> sdbm store returned -1, errno 22, key "15944440" at
> /usr/local/lib/perl5/site_perl/5.8.0/Bio/Index/Abstract.pm line 713,
> <FASTA> line 63813394.
>
> > Are you using the idparser() method to specify /gi\|(\d+)/ as the
> > key? You might want to show us the code, this is often illuminating.
>
> The code is at the end of the first mail, a few "Page Down"s
> from here.
>
>
> Thanks,
>
> Ivan
>
> > -----Original Message-----
> > From: bioperl-l-bounces at portal.open-bio.org
> > [mailto:bioperl-l-bounces at portal.open-bio.org]On Behalf Of Ivan Sendin
> > Sent: Friday, July 04, 2003 8:35 AM
> > To: bioperl-l at bioperl.org
> > Subject: RE: [Bioperl-l] Indexing est fasta file.
> >
> > --- Brian Osborne <brian_osborne at cognia.com> wrote:
> > > Ivan,
> > >
> > > My Google search says:
> > > On failure, the tie call returns an undefined value and probably sets
> > > $! to contain the reason the file could not be tied.
> > > sdbm store returned -1, errno 22, key "..." at ...
> > > This warning is emitted when you try to store a key or a value that
> > > is too long. It means that the change was not recorded in the
> > > database. See BUGS AND WARNINGS below.
> > > Your key can't be too long - is there something unusual about this
> > > particular sequence or "value"?
> >
> > Brian,
> >
> > There is nothing unusual about this sequence.
> > I ran the script again, printing some debug info:
> > ......
> > gi|6861420|gb|AW357477.1|AW357477 6861420
> > gi|6861421|gb|AW357478.1|AW357478 6861421
> > gi|6861422|gb|AW357479.1|AW357479 6861422
> > gi|6861423|gb|AW357480.1|AW357480 6861423
> > sdbm store returned -1, errno 22, key "6861423" at
> > /usr/local/lib/perl5/site_perl/5.8.0/Bio/Index/Abstract.pm line 713,
> > <FASTA> line 22096655.
> >
> > The keys for the last sequence are "gi|6861423|gb|AW357480.1|AW357480"
> > and "6861423".
> >
> >
> > Ivan
> >
> > > Brian O.
> > >
> > >
> > > -----Original Message-----
> > > From: bioperl-l-bounces at portal.open-bio.org
> > > [mailto:bioperl-l-bounces at portal.open-bio.org]On Behalf Of Ivan Sendin
> > > Sent: Thursday, July 03, 2003 3:47 PM
> > > To: bioperl-l at bioperl.org
> > > Subject: [Bioperl-l] Indexing est fasta file.
> > >
> > > Hi,
> > >
> > > I'm trying to build an index for an est file, but
> > > when I run my script I get this error:
> > >
> > > sdbm store returned -1, errno 22, key "6861423" at
> > > /usr/local/lib/perl5/site_perl/5.8.0/Bio/Index/Abstract.pm line 713,
> > > <FASTA> line 22096655.
> > >
> > >
> > > The script is very simple:
> > >
> > > ...
> > >   my $inx = Bio::Index::Fasta->new(
> > >                                    -filename   => $Index_File_Name,
> > >                                    -write_flag => 1
> > >                                   );
> > >   $inx->id_parser(\&parse_ncbi_id);
> > >   $inx->make_index($fasta);
> > > }
> > >
> > > # Return every key under which a record is indexed: the full
> > > # first token of the header line plus each gi number found in it.
> > > sub parse_ncbi_id {
> > >   my @retvals;
> > >   my $p = $_[0];
> > >   if( $p =~ /^>(\S+)/ ) {
> > >     my $val = $1;
> > >     push @retvals, $val;
> > >     while ( $p =~ /gi\|(\d+)/g ) {
> > >       push(@retvals,$1);
> > >     }
> > >   }
> > >   return @retvals;
> > > }
> > >
> > >
> > > Does anybody know what is wrong?
> > >
> > > Is the size of the est file (11077527557 bytes) an issue?
> > >
> > >
> > > Thanks,
> > >
> > >
> > > Ivan Sendin
> > >
> > > __________________________________
> > > Do you Yahoo!?
> > > SBC Yahoo! DSL - Now only $29.95 per month!
> > > http://sbc.yahoo.com
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l at portal.open-bio.org
> > > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >
> > =====
> > ------------------------------------------------------------
> >  Ivan da Silva Sendin  -   Bioinformatics
> >  Raw IP,TCP & UDP with Java: http://jpacket.sourceforge.net
> >                                Campinas - Brasil
> > ------------------------------------------------------------

-- 
========================================================================
Lincoln D. Stein                           Cold Spring Harbor Laboratory
lstein at cshl.org                                Cold Spring Harbor, NY
========================================================================



