[Biojava-dev] Biojava - svn migration was : bioperl like blastparser

Hilmar Lapp hlapp at duke.edu
Wed Dec 26 23:41:54 UTC 2007


On Dec 26, 2007, at 6:29 PM, Andreas Prlic wrote:

>
>> You just need to put the repositor(ies) in
>> /home/svn-repositories/biojava
>
> Thanks for the info.  I now have the new biojava svn repository for  
> developers running

Great, congrats!

> and it is possible to check out (and do commits) via
>
> svn co svn+ssh://dev.open-bio.org/home/svn-repositories/biojava/biojava-svn/biojava-live/trunk/ ./biojava-svn

Is this the directory structure template we should all mirror for the  
different projects? I suppose some consistency isn't bad ...

The URL looks awfully long - did we choose not to use /home/svn
(instead of /home/svn-repositories), and is the intervening
'biojava-svn' directory needed?

I.e., is the root URL of the repository at
/home/svn-repositories/biojava/biojava-svn/biojava-live/, or at
/home/svn-repositories/biojava/ ?


	-hilmar

>
> I am just running final tests to see if all is fine. Access should  
> work for other biojava developers as well.
>
>
> For the anonymous access - who will set this up? I assume there  
> will be a commit hook in the developers repository which will do a  
> svnsync with the anonymous repository?
>
> Andreas
>
>
>
>
>
>>
>> Anyone in the biojava group can write there.
>> You'll want to delete the existing biojava-live that is in there.
>>
>> I'm traveling most of the 26th and will be on vacation most of the
>> week, but will check in when I have a chance.
>>
>> -jason
>>
>> On Dec 25, 2007, at 3:42 PM, Andreas Prlic wrote:
>>
>>> Hi Mark,
>>>
>>> Unfortunately the biojava svn repository is not ready yet.
>>>
>>> George has converted our CVS to an initial svn dump, which I
>>> tested and in which I fixed some details.
>>> This dump has been ready since December 17th (see
>>> dev.open-bio.org:~andreas/biojava-final.svndump.bz2).
>>> The next step is to load this into the public open-bio  
>>> repository, after which (and some more testing)  the new biojava  
>>> repository would be ready for new commits.
>>>
>>> At present I am waiting for somebody who has admin rights on
>>> the open-bio servers to do these final steps
>>> (or to delegate and give permissions to somebody else).
>>>
>>> I tried to contact support at open-bio, root-l, as well as mailing
>>> several people directly,
>>> but so far I have not had a response. It could be that the holiday
>>> season is slowing response times down...
>>>
>>> Andreas
>>>
>>>
>>>
>>> On 25 Dec 2007, at 21:44, Mark Schreiber wrote:
>>>
>>>> Hi -
>>>>
>>>> When will the subversion system be ready for checkin?
>>>>
>>>> - Mark
>>>>
>>>> On Dec 24, 2007 4:29 PM, Michael Gang <michaelgang at gmail.com>  
>>>> wrote:
>>>>> OK,
>>>>> I made four changes.
>>>>> In the package org.biojava.bio.program.sax, in class
>>>>> BlastSaxParser:
>>>>> 1) At line 86 I added the variable
>>>>> private String oQueryLength;
>>>>> 2) In the method private void interpret(String poLine) throws
>>>>> SAXException, inside the block "if (iState == IN_HEADER) {",
>>>>> at line 209 I added
>>>>>
>>>>> if (poLine.startsWith("(", 9) && poLine.endsWith("letters)")) {
>>>>>     StringTokenizer st = new StringTokenizer(poLine);
>>>>>     oQueryLength = st.nextToken().substring(1);
>>>>> }
>>>>> 3) In the function private void emitHeaderIds() throws
>>>>> SAXException {
>>>>> at line 564 I added
>>>>>  oAttQName.setQName("queryLength");
>>>>>        oAtts.addAttribute(oAttQName.getURI(),
>>>>>                           oAttQName.getLocalName(),
>>>>>                           oAttQName.getQName(),
>>>>>                           "CDATA", oQueryLength);
>>>>>
>>>>> In the package org.biojava.bio.program.ssbind, in
>>>>> HeaderStAXHandler.java:
>>>>> 4) In the private class QueryIDStAXHandler, at line 95, I changed the
>>>>> method startElement
>>>>>
>>>>>        public void startElement(String            uri,
>>>>>                                 String            localName,
>>>>>                                 String            qName,
>>>>>                                 Attributes        attr,
>>>>>                                 DelegationManager dm)
>>>>>        throws SAXException
>>>>>        {
>>>>>            ssContext.getSearchContentHandler().setQueryID(attr.getValue("id"));
>>>>>            if (attr.getValue("queryLength") != null)
>>>>>            {
>>>>>                ssContext.getSearchContentHandler().addSearchProperty(
>>>>>                    "queryLength", attr.getValue("queryLength"));
>>>>>            }
>>>>>        }
>>>>>    }
>>>>>
>>>>> Now query length is a property of the annotation  of a blast  
>>>>> result.
>>>>> It is really fun to participate in the biojava project.
>>>>>
>>>>> Best regards,
>>>>> Michael
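
The header check in change 2 can be tried out on its own. The following is a minimal sketch, assuming the BLAST header line has the shape the patch tests for (an opening parenthesis at offset 9 and a trailing "letters)"). The class and method names are hypothetical and are not part of BioJava.

```java
import java.util.StringTokenizer;

public class QueryLengthSketch {

    /**
     * Returns the digits from a header line shaped like
     * "         (1234 letters)", or null if the line has another shape.
     * Hypothetical helper; it only mirrors the test used in the patch.
     */
    static String extractQueryLength(String line) {
        // Same condition as the patch: "(" at offset 9, line ends with "letters)".
        if (line.startsWith("(", 9) && line.endsWith("letters)")) {
            StringTokenizer st = new StringTokenizer(line);
            // The first token is "(1234"; drop the leading parenthesis.
            return st.nextToken().substring(1);
        }
        return null;
    }

    public static void main(String[] args) {
        // Build a sample line with exactly nine leading spaces.
        String header = String.format("%9s(1234 letters)", "");
        System.out.println(extractQueryLength(header));
        System.out.println(extractQueryLength("Query= sample"));
    }
}
```

Note that the value is kept as a String, as in the patch, so any thousands separators in the report would be passed through unchanged.
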
>>>>>
>>>>>
>>>>> On Dec 24, 2007 2:32 AM, Mark Schreiber  
>>>>> <markjschreiber at gmail.com> wrote:
>>>>>> Hi -
>>>>>>
>>>>>> We are currently merging the code base into subversion (from CVS);
>>>>>> after this it will be possible to check in code again. For small
>>>>>> additions it is usually easier to post the code to the dev list (in
>>>>>> the body of the email, as the list doesn't like attachments) or send
>>>>>> it to one of the regular committers and get them to add it.
>>>>>>
>>>>>> The JUnit tests are the standard test package. If you have added new
>>>>>> functionality it would be a good idea to add another test method in
>>>>>> the appropriate JUnit test to make sure it works (and continues to
>>>>>> work in the future).
>>>>>>
>>>>>> - Mark
>>>>>>
>>>>>>
>>>>>> On Dec 23, 2007 11:22 PM, Michael Gang <michaelgang at gmail.com>  
>>>>>> wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I've now added the extraction of the query length.
>>>>>>> Can someone explain to me the procedure for checking in code to
>>>>>>> biojava? I ran the unit tests in the biojava distribution. Are
>>>>>>> there additional tests available?
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Michael
>>>>>>>
>>>>>>>
>>>>>>> On Dec 21, 2007 9:59 AM, Mark Schreiber  
>>>>>>> <markjschreiber at gmail.com> wrote:
>>>>>>>> Hi -
>>>>>>>>
>>>>>>>> It is not required that you turn all Blast results into objects.
>>>>>>>> Because it is an event-based parser, you can do what you want with
>>>>>>>> the events, including turning them into objects or echoing them to
>>>>>>>> STDOUT. Take a look at the examples in the cookbook.
>>>>>>>>
>>>>>>>> It may be that the query length is actually parsed but is not
>>>>>>>> passed on to the object model by the event listeners.
>>>>>>>>
>>>>>>>> - Mark
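
To illustrate the point about event-based parsing, here is a minimal sketch of a handler that echoes events instead of building objects. It uses only the JDK's SAX classes on an XML string; the element and attribute names (Header, QueryId, queryLength) are made up for the example and are not claimed to match what the BioJava parser emits.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class EchoEventsSketch {

    /** Records one line of text per start-element event; no object model is built. */
    static class EchoHandler extends DefaultHandler {
        final StringBuilder out = new StringBuilder();

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes attrs) {
            out.append("start ").append(qName);
            for (int i = 0; i < attrs.getLength(); i++) {
                out.append(' ').append(attrs.getQName(i))
                   .append('=').append(attrs.getValue(i));
            }
            out.append('\n');
        }
    }

    /** Parses the given XML and returns the echoed events as text. */
    static String echo(String xml) {
        EchoHandler handler = new EchoHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse input", e);
        }
        return handler.out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical event stream, for illustration only.
        String xml = "<Header><QueryId id=\"q1\" queryLength=\"1234\"/></Header>";
        System.out.print(echo(xml));
    }
}
```

A listener written this way sees every attribute the parser emits, which is one way to check whether a value such as the query length is parsed but dropped before it reaches the object model.
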
>>>>>>>>
>>>>>>>>
>>>>>>>> On Dec 21, 2007 12:15 AM, Andreas Prlic <ap3 at sanger.ac.uk>  
>>>>>>>> wrote:
>>>>>>>>> Hi Michael,
>>>>>>>>>
>>>>>>>>> The blast parser (BlastLikeSaxParser) in BioJava has been around
>>>>>>>>> for a while and is frequently used to parse a variety of
>>>>>>>>> different blast outputs. Still, it is not complete and cannot
>>>>>>>>> parse PSI blast. We have had a number of requests about it
>>>>>>>>> lately, so I suppose it needs a little maintenance now.
>>>>>>>>>
>>>>>>>>> Writing a new blast parser from scratch will involve a
>>>>>>>>> significant amount of time. It will take time to fix all the
>>>>>>>>> bugs, add support for the different blast versions and write
>>>>>>>>> documentation. Much of this is already available in BioJava, so I
>>>>>>>>> would prefer it if you could submit patches for the current blast
>>>>>>>>> parser. Would you also be interested in collaborating in this
>>>>>>>>> direction?
>>>>>>>>> Another feature that would be nice to add support for is the
>>>>>>>>> possibility to send off blast searches to webservices...
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Andreas
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 20 Dec 2007, at 12:54, Michael Gang wrote:
>>>>>>>>>
>>>>>>>>>> Hi All,
>>>>>>>>>>
>>>>>>>>>> I used the interface of the java blast parser.
>>>>>>>>>> I had mainly two problems with it:
>>>>>>>>>> 1) The blast parser does not parse all the information (for
>>>>>>>>>> example query length)
>>>>>>>>>> 2) The blast parser parses the whole blast report into a list,
>>>>>>>>>> which eats a lot of memory.
>>>>>>>>>>
>>>>>>>>>> I would be interested to write and contribute a blast parser
>>>>>>>>>> which parses all the information of the blast and parses the
>>>>>>>>>> blast iteratively.
>>>>>>>>>> Something like the following code in bioperl (just in Java).
>>>>>>>>>>   use Bio::SearchIO;
>>>>>>>>>>     # format can be 'fasta', 'blast'
>>>>>>>>>>     my $searchio = Bio::SearchIO->new( -format => 'blastxml',
>>>>>>>>>>                                        -file   => 'blastout.xml' );
>>>>>>>>>>     while ( my $result = $searchio->next_result() ) {
>>>>>>>>>>         while ( my $hit = $result->next_hit ) {
>>>>>>>>>>             # process the Bio::Search::Hit::HitI object
>>>>>>>>>>             while ( my $hsp = $hit->next_hsp ) {
>>>>>>>>>>                 # process the Bio::Search::HSP::HSPI object
>>>>>>>>>>             }
>>>>>>>>>>         }
>>>>>>>>>>     }
>>>>>>>>>>
>>>>>>>>>> Would you be interested in such a contribution ?
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>> Michael
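
For comparison, the iterative style asked for above could look roughly like this in Java. This is only a sketch under stated assumptions: it uses the JDK's StAX pull parser, it assumes BLAST XML input in which each hit carries a Hit_id element, and the HitReader class and its nextHitId() method are hypothetical names, not an existing BioJava API.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class PullHitsSketch {

    /** Pull-style reader: each call to nextHitId() advances to the next hit. */
    static class HitReader {
        private final XMLStreamReader reader;

        HitReader(String xml) {
            try {
                reader = XMLInputFactory.newInstance()
                        .createXMLStreamReader(new StringReader(xml));
            } catch (XMLStreamException e) {
                throw new IllegalArgumentException("cannot open XML", e);
            }
        }

        /** Returns the text of the next Hit_id element, or null when none is left. */
        String nextHitId() {
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && "Hit_id".equals(reader.getLocalName())) {
                        return reader.getElementText();
                    }
                }
                return null;
            } catch (XMLStreamException e) {
                throw new IllegalStateException("malformed XML", e);
            }
        }
    }

    public static void main(String[] args) {
        // Tiny hand-written fragment standing in for a real report.
        String xml = "<Iteration_hits>"
                + "<Hit><Hit_id>hit1</Hit_id></Hit>"
                + "<Hit><Hit_id>hit2</Hit_id></Hit>"
                + "</Iteration_hits>";
        HitReader hits = new HitReader(xml);
        String id;
        while ((id = hits.nextHitId()) != null) {
            System.out.println(id);
        }
    }
}
```

Because the reader only advances when asked, at most one hit is held in memory at a time, which addresses the memory concern in point 2.
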
>>>>>>>>>> _______________________________________________
>>>>>>>>>> biojava-dev mailing list
>>>>>>>>>> biojava-dev at lists.open-bio.org
>>>>>>>>>> http://lists.open-bio.org/mailman/listinfo/biojava-dev
>>>>>>>>>
>>>>>>>>> -----------------------------------------------------------------------
>>>>>>>>>
>>>>>>>>> Andreas Prlic      Wellcome Trust Sanger Institute
>>>>>>>>>                    Hinxton, Cambridge CB10 1SA, UK
>>>>>>>>>                    +44 (0) 1223 49 6891
>>>>>>>>>
>>>>>>>>> -----------------------------------------------------------------------
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>>  The Wellcome Trust Sanger Institute is operated by Genome Research
>>>>>>>>>  Limited, a charity registered in England with number 1021457 and a
>>>>>>>>>  company registered in England with number 2742969, whose registered
>>>>>>>>>  office is 215 Euston Road, London, NW1 2BE.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>
>>
>

-- 
===========================================================
: Hilmar Lapp  -:-  Durham, NC  -:- hlapp at duke dot edu :
===========================================================





