[Biojava-dev] bioperl like blastparser

Michael Gang michaelgang at gmail.com
Sun Dec 23 15:22:24 UTC 2007


Hi all,

I've now added the extraction of the query length.
Can someone explain the procedure for checking code in to BioJava?
I ran the unit tests in the BioJava distribution. Are there additional
tests available?

Best regards,
Michael

On Dec 21, 2007 9:59 AM, Mark Schreiber <markjschreiber at gmail.com> wrote:
> Hi -
>
> It is not required that you turn all Blast results into objects.
> Because it is an event-based parser, you can do what you want with the
> events, including turning them into objects or echoing them to STDOUT.
> Take a look at the examples in the cookbook.
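
To illustrate the event-based idea, here is a minimal sketch that uses only
the SAX classes shipped with the JDK. It is not the BioJava parser and does
not reproduce the events BioJava emits. The class names (BlastEventSketch,
QueryLengthHandler) are made up for this example, the element names are
modelled on NCBI BLAST XML output and should be treated as assumptions, and
the sample values are invented. The listener keeps only the query length and
the hit ids and ignores every other event.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Sketch only: a SAX listener that picks a few values out of the event stream. */
class BlastEventSketch {

    // Hand-written sample; element names are assumptions, values are invented.
    static final String SAMPLE =
        "<Iteration>"
      + "<Iteration_query-len>120</Iteration_query-len>"
      + "<Hit><Hit_id>hitA</Hit_id></Hit>"
      + "<Hit><Hit_id>hitB</Hit_id></Hit>"
      + "</Iteration>";

    /** Hypothetical listener: stores the query length and the hit ids, nothing else. */
    static class QueryLengthHandler extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();
        int queryLength = -1;                       // -1 means "not seen"
        final List<String> hitIds = new ArrayList<>();

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            text.setLength(0);                      // start collecting text for this element
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);         // text may arrive in several chunks
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            if (qName.equals("Iteration_query-len")) {
                queryLength = Integer.parseInt(text.toString().trim());
            } else if (qName.equals("Hit_id")) {
                hitIds.add(text.toString().trim());
            }
            // All other events are simply dropped.
        }
    }

    /** Parses the given XML and returns the listener with whatever it collected. */
    static QueryLengthHandler parse(String xml) {
        try {
            QueryLengthHandler handler = new QueryLengthHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse input", e);
        }
    }

    public static void main(String[] args) {
        QueryLengthHandler h = parse(SAMPLE);
        System.out.println("query length = " + h.queryLength);
        System.out.println("hits = " + h.hitIds);
    }
}
```

Echoing events to STDOUT instead would only mean replacing the body of
endElement with a print statement; the parser side stays the same.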
>
> It may be that the query length is actually parsed but is not passed
> on to the object model by the event listeners.
>
> - Mark
>
>
> On Dec 21, 2007 12:15 AM, Andreas Prlic <ap3 at sanger.ac.uk> wrote:
> > Hi Michael,
> >
> > The blast parser (BlastLikeSAXParser) in BioJava has been around for
> > a while and is frequently used to parse a variety of different blast
> > outputs. Still, it is not complete and cannot parse PSI-BLAST. We have
> > had a number of requests about it lately, so I suppose it needs a
> > little maintenance now.
> >
> > Writing a new blast parser from scratch would take a significant
> > amount of time: fixing all the bugs, adding support for the different
> > blast versions, and writing documentation. Much of this is already
> > available in BioJava, so I would prefer it if you could submit patches
> > for the current blast parser. Would you also be interested in
> > collaborating in this direction?
> > Another feature that would be nice to support is the possibility of
> > sending blast searches off to web services...
> >
> > Cheers,
> > Andreas
> >
> >
> >
> > On 20 Dec 2007, at 12:54, Michael Gang wrote:
> >
> > > Hi All,
> > >
> > > I used the interface of the Java blast parser.
> > > I had two main problems with it:
> > > 1) The blast parser does not parse all the information (for example,
> > > the query length).
> > > 2) The blast parser parses the whole blast report into a list, which
> > > uses a lot of memory.
> > >
> > > I would be interested in writing and contributing a blast parser
> > > that parses all the information in the blast report and does so
> > > iteratively, one result at a time.
> > > Something like the following BioPerl code (just in Java):
> > >     use Bio::SearchIO;
> > >     # format can be 'fasta', 'blast'
> > >     my $searchio = Bio::SearchIO->new( -format => 'blastxml',
> > >                                        -file   => 'blastout.xml' );
> > >     while ( my $result = $searchio->next_result() ) {
> > >         while ( my $hit = $result->next_hit ) {
> > >             # process the Bio::Search::Hit::HitI object
> > >             while ( my $hsp = $hit->next_hsp ) {
> > >                 # process the Bio::Search::HSP::HSPI object
> > >             }
> > >         }
> > >     }
> > >
> > > Would you be interested in such a contribution?
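
As a rough idea of what the next_hit / next_hsp style of iteration could
look like in Java, here is a minimal sketch built on the StAX pull parser
that ships with the JDK. It is not BioJava code and not a proposal for the
actual API: HitIteratorSketch and Hit are hypothetical names, the element
names are modelled on NCBI BLAST XML output and are assumptions, and the
sample values are invented. The point it tries to show is the memory
behaviour: the iterator reads one hit ahead, so only a couple of hits are
alive at any time instead of the whole report.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Sketch only: hands out one hit at a time instead of building a full list. */
class HitIteratorSketch implements Iterator<HitIteratorSketch.Hit> {

    /** Hypothetical value object: a hit id plus the bit scores of its HSPs. */
    static final class Hit {
        String id;
        final List<Double> hspBitScores = new ArrayList<>();
    }

    // Hand-written sample; element names are assumptions, values are invented.
    static final String SAMPLE =
        "<Iteration_hits>"
      + "<Hit><Hit_id>hitA</Hit_id><Hit_hsps>"
      + "<Hsp><Hsp_bit-score>50.5</Hsp_bit-score></Hsp>"
      + "<Hsp><Hsp_bit-score>31.0</Hsp_bit-score></Hsp>"
      + "</Hit_hsps></Hit>"
      + "<Hit><Hit_id>hitB</Hit_id><Hit_hsps>"
      + "<Hsp><Hsp_bit-score>28.0</Hsp_bit-score></Hsp>"
      + "</Hit_hsps></Hit>"
      + "</Iteration_hits>";

    private final XMLStreamReader reader;
    private Hit lookahead; // next hit to hand out, or null once the input is exhausted

    HitIteratorSketch(String xml) {
        try {
            reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            lookahead = readHit();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not read input", e);
        }
    }

    /** Pulls events only until one complete hit has been read; null at end of input. */
    private Hit readHit() throws XMLStreamException {
        Hit hit = null;
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                String name = reader.getLocalName();
                if (name.equals("Hit")) {
                    hit = new Hit();
                } else if (hit != null && name.equals("Hit_id")) {
                    hit.id = reader.getElementText().trim();
                } else if (hit != null && name.equals("Hsp_bit-score")) {
                    hit.hspBitScores.add(Double.parseDouble(reader.getElementText().trim()));
                }
            } else if (event == XMLStreamConstants.END_ELEMENT
                    && hit != null && reader.getLocalName().equals("Hit")) {
                return hit;
            }
        }
        return null;
    }

    @Override
    public boolean hasNext() {
        return lookahead != null;
    }

    @Override
    public Hit next() {
        if (lookahead == null) {
            throw new NoSuchElementException();
        }
        Hit current = lookahead;
        try {
            lookahead = readHit();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not read input", e);
        }
        return current;
    }

    /** Convenience for the demo: "id:number of HSPs" for each hit, in order. */
    static List<String> summarize(String xml) {
        List<String> lines = new ArrayList<>();
        Iterator<Hit> hits = new HitIteratorSketch(xml);
        while (hits.hasNext()) {
            Hit hit = hits.next();
            lines.add(hit.id + ":" + hit.hspBitScores.size());
        }
        return lines;
    }

    public static void main(String[] args) {
        summarize(SAMPLE).forEach(System.out::println);
    }
}
```

A real version would read from a file or stream rather than a String and
would add an outer iterator over results, in the same way next_result wraps
next_hit in the BioPerl loop.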
> > >
> > > Best regards,
> > > Michael
> > > _______________________________________________
> > > biojava-dev mailing list
> > > biojava-dev at lists.open-bio.org
> > > http://lists.open-bio.org/mailman/listinfo/biojava-dev
> >
> > -----------------------------------------------------------------------
> >
> > Andreas Prlic      Wellcome Trust Sanger Institute
> >                               Hinxton, Cambridge CB10 1SA, UK
> >                               +44 (0) 1223 49 6891
> >
> > -----------------------------------------------------------------------
> >
> >
> >
> >
> > --
> >  The Wellcome Trust Sanger Institute is operated by Genome Research
> >  Limited, a charity registered in England with number 1021457 and a
> >  company registered in England with number 2742969, whose registered
> >  office is 215 Euston Road, London, NW1 2BE.
> >
> >
>


