[Biojava-dev] bioperl like blastparser

Mark Schreiber markjschreiber at gmail.com
Fri Dec 21 07:59:27 UTC 2007


Hi -

You are not required to turn all Blast results into objects. Because
it is an event-based parser, you can do what you want with the events,
including turning them into objects or echoing them to STDOUT.
Take a look at the examples in the cookbook.
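
To make the event-based idea concrete, here is a minimal sketch. It is an assumption-laden stand-in, not the BioJava API: it uses the plain JDK SAX parser over NCBI BLAST XML (where each hit carries a Hit_id element), and the class name HitEvents and method forEachHit are hypothetical. The point is only that the caller decides what happens to each event.

```java
import java.io.StringReader;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: plain JDK SAX over BLAST XML, used here as a
// stand-in for the events the BioJava parser hands to its listeners.
class HitEvents extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private final Consumer<String> onHit; // caller decides what to do per hit

    HitEvents(Consumer<String> onHit) {
        this.onHit = onHit;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0); // start collecting character data for this element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("Hit_id".equals(qName)) {
            onHit.accept(text.toString().trim()); // nothing is stored here
        }
    }

    // Streams the report once and passes each hit id to the callback.
    static void forEachHit(String xml, Consumer<String> onHit) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new HitEvents(onHit));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse report", e);
        }
    }
}
```

Calling HitEvents.forEachHit(xml, System.out::println) echoes each hit id to STDOUT without keeping it, while passing someList::add would build objects instead. Memory use then depends on the callback rather than on the parser.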

It may be that the query length is actually parsed but is not passed
on to the object model by the event listeners.
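
If that is the case, one option is to tap the event stream yourself and pick the value up before it is dropped. The sketch below is again an assumption, not BioJava code: it uses the JDK's XMLFilterImpl over NCBI BLAST XML, where the query length appears as the BlastOutput_query-len element. QueryLengthTap and readQueryLength are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: sits between the parser and any downstream listener,
// records the query length, and forwards every event unchanged.
class QueryLengthTap extends XMLFilterImpl {
    private final StringBuilder text = new StringBuilder();
    private int queryLength = -1; // -1 means the element was never seen

    QueryLengthTap(XMLReader parent) {
        super(parent);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        text.setLength(0);
        super.startElement(uri, localName, qName, atts);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        text.append(ch, start, length);
        super.characters(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if ("BlastOutput_query-len".equals(qName)) {
            queryLength = Integer.parseInt(text.toString().trim());
        }
        super.endElement(uri, localName, qName);
    }

    // Parses the report and returns the query length, or -1 if it is absent.
    static int readQueryLength(String xml) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            QueryLengthTap tap = new QueryLengthTap(reader);
            tap.parse(new InputSource(new StringReader(xml)));
            return tap.queryLength;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse report", e);
        }
    }
}
```

Because the filter forwards every event, an existing object-building listener could still be attached with setContentHandler and would see the same stream.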

- Mark

On Dec 21, 2007 12:15 AM, Andreas Prlic <ap3 at sanger.ac.uk> wrote:
> Hi Michael,
>
> The blast parser (BlastLikeSAXParser) in BioJava has been around for
> a while and is frequently used to parse a variety of different blast
> outputs. Still, it is not complete and cannot parse PSI blast. We have
> had a number of requests about it lately, so I suppose it needs a
> little maintenance now.
>
> Writing a new blast parser from scratch would take a significant
> amount of time. It will take time to fix all the bugs, add support
> for the different blast versions and write documentation. Much of
> this is already available in BioJava, so I would prefer it if you
> could submit patches for the current blast parser. Would you also be
> interested in collaborating in this direction?
> Another feature that would be nice to add is support for sending
> blast searches to web services...
>
> Cheers,
> Andreas
>
>
>
> On 20 Dec 2007, at 12:54, Michael Gang wrote:
>
> > Hi All,
> >
> > I used the interface of the Java blast parser.
> > I had two main problems with it:
> > 1) The blast parser does not parse all the information (for example
> > query length)
> > 2) The blast parser parses the whole blast report into a list which
> > eats a lot of memory.
> >
> > I would be interested in writing and contributing a blast parser
> > which parses all the information in the blast report and parses it
> > iteratively.
> > Something like the following code in bioperl (just in Java).
> >     use Bio::SearchIO;
> >     # format can be 'fasta', 'blast'
> >     my $searchio = Bio::SearchIO->new( -format => 'blastxml',
> >                                        -file   => 'blastout.xml' );
> >     # one result per query; hits and HSPs are pulled one at a time
> >     while ( my $result = $searchio->next_result() ) {
> >         while ( my $hit = $result->next_hit ) {
> >             # process the Bio::Search::Hit::HitI object
> >             while ( my $hsp = $hit->next_hsp ) {
> >                 # process the Bio::Search::HSP::HSPI object
> >             }
> >         }
> >     }
> >
> > Would you be interested in such a contribution?
> >
> > Best regards,
> > Michael
> > _______________________________________________
> > biojava-dev mailing list
> > biojava-dev at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/biojava-dev
>
> -----------------------------------------------------------------------
>
> Andreas Prlic      Wellcome Trust Sanger Institute
>                               Hinxton, Cambridge CB10 1SA, UK
>                               +44 (0) 1223 49 6891
>
> -----------------------------------------------------------------------