[Biopython-dev] Propose: Adding an alias name (gb) for Genbank in SeqIO

Michiel de Hoon mjldehoon at yahoo.com
Wed Apr 15 10:57:43 UTC 2009


I think it's nice to be consistent with NCBI, and I don't see a big problem in having an alias for GenBank in SeqIO. At least, having "gb" in Bio.Entrez but "genbank" in Bio.SeqIO would go against the principle of least surprise.
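
To make the idea concrete, here is a minimal sketch of how such an alias could be resolved before the format name reaches the parser lookup. The names `_FORMAT_ALIASES` and `normalize_format` are hypothetical and are not taken from Bio.SeqIO; this only illustrates the approach and is not a patch against the existing code.

```python
# Hypothetical alias table: maps an alternative name to the canonical
# format name. These names are illustrative, not part of Bio.SeqIO.
_FORMAT_ALIASES = {
    "gb": "genbank",  # NCBI's short name, as used for Entrez EFetch rettype
}


def normalize_format(name):
    """Return the canonical format name for a possibly aliased name.

    Names without an alias entry are returned unchanged, so existing
    callers that pass "genbank" or "fasta" would see no difference.
    """
    return _FORMAT_ALIASES.get(name, name)
```

With something like this, `normalize_format("gb")` and `normalize_format("genbank")` would both give "genbank", and the rest of the code would only ever see the one canonical name.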

--Michiel.


--- On Wed, 4/15/09, Peter <biopython at maubp.freeserve.co.uk> wrote:

> From: Peter <biopython at maubp.freeserve.co.uk>
> Subject: Re: [Biopython-dev] Propose: Adding an alias name (gb) for Genbank in SeqIO
> To: "Sebastian Bassi" <sbassi at clubdelarazon.org>
> Cc: biopython-dev at lists.open-bio.org
> Date: Wednesday, April 15, 2009, 5:40 AM
> On Wed, Apr 15, 2009 at 3:05 AM, Sebastian Bassi
> <sbassi at clubdelarazon.org> wrote:
> > As a follow up to bug 2811 where "gb" is now a valid name in
> > Bio.Entrez, ...
> 
> Just to note that in Entrez EFetch, using rettype=gb (and the related
> rettype=gp for proteins in GenPept format) has always been a valid
> argument (and in fact has always been the documented way to get a
> GenBank/GenPept file back).
> 
> From my point of view it was a nice feature of Entrez EFetch that they
> used to (unofficially) support rettype=genbank, which was consistent
> with Bio.SeqIO.  I suppose you could all try lobbying the NCBI to put
> Entrez EFetch back to the pre-Easter 2009 behavior, but realistically
> we'll just have to live with it.
> 
> Now that Entrez EFetch doesn't support the unofficial rettype=genbank
> argument anymore, we have the current situation where you must use
> "gb" (or "gp") for Bio.Entrez but "genbank" for Bio.SeqIO.  I agree
> this isn't so nice, but as I wrote on Bug 2811, I'm not keen on having
> aliases in Bio.SeqIO (but I may be in a minority here, hence suggesting
> a discussion).  On the plus side, EMBOSS offers "gb" (and "ddbj") as
> alternative aliases for "genbank", so there is precedent.
> 
> In a related approach, I suppose we could have Bio.SeqIO take "genbank"
> to mean GenBank or GenPept as determined from the file or the alphabet
> (as now), and add "gb" meaning (nucleotide) GenBank files, and "gp"
> meaning (protein) GenPept files.
> 
> But again, this breaks the Python ideal of there being one clear way to
> do things (having multiple names for the same format).
> 
> Peter