[Bioperl-l] speedbumps

Ewan Birney birney@ebi.ac.uk
Tue, 3 Dec 2002 09:11:25 +0000 (GMT)


On Mon, 2 Dec 2002, Rob Edwards wrote:

> Hi all,
>
> These are some of my thoughts on speed bumps as Nat called them (I like
> that term).

Thanks.

>
> Some really silly things irked me. For example, sometimes the database is
> called GenBank, sometimes it's Genbank, and sometimes it's genbank (but
> never genBank!). This is really trivial until you are looking through a
> file with a case-sensitive sort (like in less, my pager for perldoc).


So - we need standardised names? (Genbank I would go for, and EMBL all
caps)

>
> For SeqFeatureI.pm (and probably others) it is sometimes not clear what
> will be returned (scalar, array, hash). I know, I should read the pod
> and it will explain what is returned, but it wasn't obvious at first.
> Things that I thought would return a scalar returned an array (even
> though it only had one thing in it). However, it's not clear to me how to
> make this clearer, as the docs for most things already say what will be
> returned. Maybe a line at the start that says read the docs. There are
> just so many of them :)

OK. Can fix.

>
> For something like parsing GenBank files, it is quite easy to call
> something that hasn't been defined. I realize that this is due to the
> haphazard nature of some GenBank files, and that it is not easy to control
> for everything that may be there, but is there a way to define a set of
> basic fields that should be there and return a null string if they are
> missing from a file? At least that would stop myfirstscript.pl from
> throwing an error when it looked for something that was missing (like
> organism).


I think the biggest problem is species. Cue debate about whether we should
always attach the "unknown species" to sequences. Hmmm. Thoughts?
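
To make the "basic fields with a null-string default" idea concrete, here is a minimal sketch. It is written in Python purely for illustration and does not use or describe Bioperl's actual API. The names BASIC_FIELDS and parse_header, the choice of fields, and the simplified header handling (first line of each field only, continuation lines ignored) are all assumptions, and the example record is made up.

```python
# Hypothetical set of fields a beginner's script might expect to exist.
BASIC_FIELDS = ("LOCUS", "DEFINITION", "ACCESSION", "ORGANISM")

# Keywords that end the header part of a GenBank-style flat file.
END_OF_HEADER = ("FEATURES", "ORIGIN", "//")


def parse_header(text):
    """Return a dict holding every basic field, using "" when one is absent.

    Simplified on purpose: the keyword is read from the first 12 columns
    and only the first line of each field is kept.
    """
    record = {field: "" for field in BASIC_FIELDS}
    for line in text.splitlines():
        keyword = line[:12].strip()
        if keyword in END_OF_HEADER:
            break
        if keyword in record:
            record[keyword] = line[12:].strip()
    return record


# A made-up record with no SOURCE/ORGANISM block at all.
example = "\n".join([
    "LOCUS       EXAMPLE1  120 bp  DNA",
    "DEFINITION  Made-up record used only for this illustration.",
    "ACCESSION   EXAMPLE1",
    "//",
])

record = parse_header(example)
print(repr(record["ORGANISM"]))  # the key exists even though the file lacks it
```

The trade-off is the one debated above: a script can no longer tell "missing" apart from "present but empty" unless the parser also records which fields were actually seen.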

>
> In general though, I agree with Nat. There is so much there that just
> finding out what does what is confusing. I ended up extracting all the
> docs into a single file that I look through (with less) to figure out
> how to use something. I know there is the tutorial and whatnot, but
> perhaps a one-line summary of each file would be another good place
> to start.
>
> e.g. xxx.pm: use this to work with yyy and zzz

This is sort of a "hackers guide" or rather "lazy programmer's guide to
Bioperl". I like it.
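
One possible starting point for such a guide, sketched under assumptions: by POD convention a module's "=head1 NAME" section holds a single "Module::Name - short description" line, so a first draft of the one-line-per-file list could be pulled out automatically and then edited by hand. The sketch below is in Python for illustration only; the function names pod_summary and summarise_tree are hypothetical, and it assumes each .pm file follows the NAME convention.

```python
from pathlib import Path


def pod_summary(source):
    """Return the first non-blank line of the POD NAME section, or ""."""
    lines = iter(source.splitlines())
    for line in lines:
        if line.split() == ["=head1", "NAME"]:
            # Keep reading from the same iterator, just past the heading.
            for candidate in lines:
                text = candidate.strip()
                if text.startswith("="):
                    return ""  # next POD command reached: NAME was empty
                if text:
                    return text
    return ""


def summarise_tree(root):
    """Build one 'relative/path.pm: summary' line per module under root."""
    summaries = []
    for path in sorted(Path(root).rglob("*.pm")):
        summary = pod_summary(path.read_text(errors="replace"))
        label = summary or "(no NAME section)"
        summaries.append(f"{path.relative_to(root)}: {label}")
    return summaries


# A made-up module, used only to show the expected input shape.
example = """package Bio::Example;

=head1 NAME

Bio::Example - a made-up module used only for this illustration

=head1 SYNOPSIS
"""

print(pod_summary(example))
```

Calling summarise_tree on the top of a source checkout would give a list that can be paged through with less; modules without a NAME section are flagged rather than silently skipped.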


>
> Now that I have written this it just seems like griping (though it is
> really not meant to be).

It is really useful. No worries.

>
> So... if others think a one-line summary would be a Good Idea, I will
> try to put my money where my mouth is (or at least my text editor) in
> the next week or so and come up with something.


Ha. Brilliant! Go for it.





>
> Rob