[Biojava-dev] Biojava3-Core
Richard Holland
holland at eaglegenomics.com
Wed May 12 12:21:31 UTC 2010
The below is a massive generalisation based on my experience of having worked as a software developer both in academia and in commercial environments (including an airline and a bank).
Standards in academic-generated software tend to be whatever the author (usually a sole author) finds takes the least effort. The library that gets used is the one they've just found on Google that looks like it probably works; the formatting is their editor's default; documentation is minimal (because most users will probably just email and ask questions anyway); user requirements gathering consists of asking friends at coffee; and the code commenting is for their own short-term reference (because most software ceases to be developed any further after the paper about it is published or the author leaves the institute). The standards are chosen to make the author's life easiest because the software exists to support some other research and is usually not the author's main goal: it gets published as a side-effect of their main research interests.
In commercial software, standards tend to be defined by best practice, with authors (usually teams) being asked to adhere strictly to defined customs such as code formatting, and to thoroughly test any new libraries before using them. Detailed user documentation is required because helpdesks are expensive things to run, and proper detailed user requirements gathering is important to minimise subsequent helpdesk contact and improvement requests. Code commenting is also critical to help new members of the team take over when old members leave and there's no overlap to transfer knowledge face-to-face. These processes slow down development and increase the cost of the software produced, but they do ensure that the code is of higher quality and more maintainable, and that the users get more benefit from it.
The above comments might seem like harsh over-generalisations, but I can put my hand up and say I've definitely been guilty in the past of most of those academic charges myself, and I know plenty of others who have been there too. Likewise, I've experienced how depressing it is to work in a commercial environment so strictly locked down that you think development is heading nowhere because it's all just too much effort.
BioJava sits between the two, along with most other Bio* open source projects. It's produced and maintained mostly by a team, as in commercial software, but that team is made up mostly of academics. If we want BioJava to be accepted as quality software we have to adopt at least some of the above commercial software development techniques, and then we need to ensure that everyone who develops for BioJava sticks to them to the letter. That could include being asked to reconfigure default settings on editors, post detailed explanations about choice of external libraries, etc.
cheers,
Richard
On 12 May 2010, at 12:56, Andy Yates wrote:
> I can say that the Eclipse diff tool does care about whitespace and I'm quite sure that the Unix diff cares about it as well. I can't say anything about Netbeans since I haven't used it in years.
>
> With all due respect, if the formatting rules say 4 spaces per indentation & code comes in not behaving in that manner then it'll be reformatted. This is nothing against the person who does the commits, just that they should be aware that it _could_ happen. I should say my personal preference is 2 spaces per indentation; I am flexible about this & will go with what the majority agree. That said, if I'm being flexible about it I hope others will be as well.
>
> Andy
>
> On 12 May 2010, at 12:41, LAW Andrew wrote:
>
>> I think the main thing should be for each of us to find a diff tool that doesn't care about whitespace. The other stuff (camelCase, use of braces [ALWAYS!!!], splitting lines) is more important and can/should be defined, but if you tell me that we must use 2 spaces to indent and my editor of choice uses a tab then I'm probably not going to listen to you in all practical situations.
>>
>> Not being awkward, just pragmatic.
>>
>>
>> On 12 May 2010, at 12:26, Andy Yates wrote:
>>
>>> So long as they are documented then developers shouldn't complain when their formatting changes. I just want to avoid a situation like now with Scooter's check-in which has a lot of formatting changes so the code changes are quite hard to pick out.
>>>
>>> I've gone and started a page about conventions and have started a discussion. Once we are agreed on what should go into the conventions then they will migrate to the wiki page.
>>>
>>> Andy
>>>
>>> On 12 May 2010, at 12:09, LAW Andrew wrote:
>>>
>>>>
>>>> On 12 May 2010, at 11:52, Richard Holland wrote:
>>>>
>>>>> I see a need for a formal coding style here, regardless of what platform people are using. Taking Netbeans as a basis is a good start but it needs to be documented for contributors to read and follow, then enforced across the whole project.
>>>>
>>>> Absolutely. But the pain in making <insert favourite tool name here> use those conventions can be numbing.
>>>>
>>>>
>>>>
>>>> Later,
>>>>
>>>> Andy
>>>> --------
>>>> Yada, yada, yada...
>>>>
>>>> The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336
>>>> Disclaimer: This e-mail and any attachments are confidential and intended solely for the use of the recipient(s) to whom they are addressed. If you have received it in error, please destroy all copies and inform the sender.
>>>>
>>>>
>>>
>>> --
>>> Andrew Yates Ensembl Genomes Engineer
>>> EMBL-EBI Tel: +44-(0)1223-492538
>>> Wellcome Trust Genome Campus Fax: +44-(0)1223-494468
>>> Cambridge CB10 1SD, UK http://www.ensemblgenomes.org/
>>>
>>
>> Later,
>>
>> Andy
>
--
Richard Holland, BSc MBCS
Operations and Delivery Director, Eagle Genomics Ltd
T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com
http://www.eaglegenomics.com/