[Biojava-l] BLAST Parser for extracting all BLAST data?
Sébastien PETIT
great_fred at yahoo.com
Tue Jun 28 07:34:17 EDT 2005
Argh! I didn't find what I wanted!
I used the program you gave me, with a slight modification because it
didn't recognize my XML file.
The parser is now a BlastXMLParserFacade.
It gave me everything it found in the file,
but not what I want! >:(
There is a tag in my XML
file that encloses what I'm looking for: <Hsp_midline>.
Why doesn't the parser see it?
I didn't really understand how the XML parser works. So how can I
modify it to get what I need?
Please, Doc! ;)
Help me!
Thanks for everything.
Sebastien
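
One way to get at the <Hsp_midline> text without modifying the BioJava parser is to read the BLAST XML directly with the JDK's own SAX API. The following is a minimal sketch under that assumption, not BioJava code: the class name HspMidlineExtractor and its extractMidlines methods are made up for illustration, and it assumes the file is BLAST XML in which each HSP carries an <Hsp_midline> element. If the file declares an external DTD, the handler answers the request with an empty source so that parsing does not depend on fetching it.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of BioJava: collects the text of every
// <Hsp_midline> element using only the JDK's SAX parser.
final class HspMidlineExtractor {

    static List<String> extractMidlines(InputSource source)
            throws ParserConfigurationException, SAXException, IOException {
        final List<String> midlines = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            // Non-null only while the parser is inside an <Hsp_midline> element.
            private StringBuilder text;

            @Override
            public InputSource resolveEntity(String publicId, String systemId) {
                // Supply an empty external DTD instead of loading the declared one.
                return new InputSource(new StringReader(""));
            }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if ("Hsp_midline".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Element text may be delivered in several chunks, so append.
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("Hsp_midline".equals(qName)) {
                    midlines.add(text.toString());
                    text = null;
                }
            }
        };

        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(source, handler);
        return midlines;
    }

    // Convenience overload for XML that is already held in memory.
    static List<String> extractMidlines(String xml) {
        try {
            return extractMidlines(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse BLAST XML", e);
        }
    }
}
```

Calling HspMidlineExtractor.extractMidlines(new InputSource(new java.io.FileReader("blast.xml"))) would return one string per HSP, in file order; the file name here is only a placeholder. The same start/characters/end pattern applies to any other element name you want to capture.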
--- mark.schreiber at novartis.com wrote:
> Hi -
>
> Try running this program
> http://www.biojava.org/docs/bj_in_anger/blastecho.htm
>
> If you see what you need in the output then it is being read by the Blast
> parser and emitted as an event (which you could listen for). If it isn't
> then the Blast parser is not emitting those events although someone
> confident with the blast format could probably modify it so it does.
>
> In short, it is possible but it might not be implemented ; )
>
> - Mark
>
>
>
>
>
> Sébastien PETIT <great_fred at yahoo.com>
> Sent by: biojava-l-bounces at portal.open-bio.org
> 06/28/2005 05:11 PM
>
>
> To: biojava-l at biojava.org
> cc: (bcc: Mark Schreiber/GP/Novartis)
> Subject: RE: [Biojava-l] BLAST Parser for extracting all BLAST data?
>
>
> Hi, everybody...
>
> I'm like George: I want to extract data from BLAST files.
> I can get the alignments, no problem. But now I want the alignment line
> between the 2 sequences (the lines with "+", "-" and some letters in
> George's example), because with it we can see at a glance whether the
> alignment between the 2 sequences is really good or not.
>
> Is it possible, Docs?
>
> Thank you.
>
> Sebastien
>
> --- Richard HOLLAND <hollandr at gis.a-star.edu.sg> wrote:
>
> > BioJava's BLAST framework parses files and fires events for every
> > piece of information it finds. The SeqSimilarityAdapter class is an
> > example of how to catch these events and construct basic BLAST result
> > objects (SimpleSeqSimilarityHit), however they are not comprehensive
> > and do not record full details of every hit.
> >
> > If you want the kind of detail you mention below you will have to
> > write your own content handler for BLAST parsing and pass it to the
> > BLASTLikeSAXParser when parsing a file. This event handler should
> > implement the ContentHandler interface. Look at the source of
> > SeqSimilarityAdapter for guidance. You will then receive events for
> > every part of the file, from which you can construct your own custom
> > BLAST result objects to describe them.
> >
> > If you're not sure what tag names to listen for in your
> > ContentHandler the easiest thing to do is just run it once and dump
> > them all out to see what you get.
> >
> > cheers,
> > Richard
> >
> >
> > -----Original Message-----
> > From: biojava-l-bounces at portal.open-bio.org on behalf of Y D Sun
> > Sent: Sun 6/26/2005 5:42 PM
> > To: biojava-l at biojava.org
> > Cc:
> > Subject: [Biojava-l] BLAST Parser for extracting all BLAST data?
> >
> > Hi,
> >
> > I want to extract all data from BLASTP results. In the following hit,
> > for example, I need to get the lengths of query and subject proteins,
> > the identities (including all data 54, 124 and 43%), the positives (all
> > data 79, 124 and 63%), and the gaps (3, 124 and 2%). Can the
> > BLASTLikeSAXParser filter all this information? I can't find the
> > methods in the SeqSimilaritySearchHit and SeqSimilaritySearchSubHit APIs
> > to retrieve these data. Does Biojava provide any methods for this
> > purpose?
> >
> > Thanks,
> >
> > George
> >
> >
> > BLASTP 2.2.5 [Nov-16-2002]
> >
> > Query= Prot0001
> > (138 letters)
> >
> > Database: /work/nys1/fasta/protein/AE000782.pro.fasta
> > 2407 sequences; 662,866 total letters
> >
> > Searching.....done
> >
> >
> >                                                        Score    E
> > Sequences producing significant alignments:            (bits) Value
> >
> > Prot0002                                                 100   1e-23
> > Prot0003                                                  74   2e-15
> > Prot0004                                                  43   3e-06
> >
> > >Prot0002
> >           Length = 138
> >
> >  Score = 100 bits (250), Expect = 1e-23
> >  Identities = 54/124 (43%), Positives = 79/124 (63%), Gaps = 3/124 (2%)
> >
> > Query: 18  NARTKFTDIAKTLNLTEAAIRKRIKKLEENQIIKRYSIDIDYKKLGYNMAIIGLDIDMDY 77
> >            NAR   T IAK LN+TEAA+RKRI  LE  + I  Y   I+YKK+G + ++ G+D+D D
> > Sbjct: 15  NARIPKTRIAKELNVTEAAVRKRIANLERREEILGYKAIINYKKVGLSASLTGVDVDPDK 74
> >
> > Query: 78  FPKIIKELEKRKEFLHIYSSAGDHDIMVIAIYK---DLEEIYNYLKNLKGVKRVCPAIII 134
> >              K+++EL+  +    ++ + GDH IM   I K   +L EI+  +  ++GVKRVCP+II
> > Sbjct: 75  LWKVVEELKDLESVKSLWLTTGDHTIMAEIIAKSVQELSEIHQKIAEMEGVKRVCPSIIT 134
> >
> > Query: 135 DQIK 138
> >            D +K
> > Sbjct: 135 DIVK 138
> >
> > _______________________________________________
> > Biojava-l mailing list - Biojava-l at biojava.org
> > http://biojava.org/mailman/listinfo/biojava-l