[Bioperl-l] Bug in SeqIO genbank output

Wes Barris wes.barris at csiro.au
Thu Jan 1 18:46:34 EST 2004


Heikki Lehvaslaiho wrote:

> Wes,
> 
> You did not say which version of bioperl you are using. For some reason 

I am using bioperl-1.2.3

> which I can not quite understand, the current code:
>           $self->_print(sprintf("%-6s%s\n",'ORIGIN',$o ? $o->value : ''));
> 
> does print out the required six spaces after the word ORIGIN. This was
> recently

Really?  How?  In the above line "%-6s" left justifies 'ORIGIN' (which is
already 6 characters).  The '6' needs to be changed to '12' to get six
extra spaces.  See below.
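
Here is a minimal sketch of the padding behaviour in plain Perl, with no
bioperl involved.  The helper name origin_line is made up for illustration;
it only wraps the same sprintf format with a variable field width.

```perl
use strict;
use warnings;

# Hypothetical helper: build the ORIGIN line with a given field width,
# using the same "%-Ns%s\n" format as the line quoted from genbank.pm.
sub origin_line {
    my ($width, $value) = @_;
    return sprintf("%-${width}s%s\n", 'ORIGIN', $value);
}

my $six    = origin_line(6,  '');   # 'ORIGIN' already fills the field
my $twelve = origin_line(12, '');   # padded on the right with six spaces

# Subtract 1 so the trailing newline is not counted.
printf "width  6: %d characters before the newline\n", length($six) - 1;
printf "width 12: %d characters before the newline\n", length($twelve) - 1;
```

With a width of 6 nothing is added, because "%-6s" only pads strings shorter
than six characters.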


> fixed. Now, why doesn't it work for you? Could you check that you do not
> have multiple copies of bioperl on your computer and the older one gets
> accidentally executed?
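
One way to check for duplicate installs, as a rough sketch: walk @INC and
list every directory that holds a copy of the module file.  The helper name
find_copies is hypothetical, and this assumes the copies live on the default
search path rather than being pulled in some other way.

```perl
use strict;
use warnings;

# Hypothetical helper: return every existing path to the named module
# file under the given directories, in search order.
sub find_copies {
    my ($module, @dirs) = @_;
    # @INC may contain code references (hooks); skip those.
    return grep { -e $_ } map { "$_/$module" } grep { !ref $_ } @dirs;
}

# Normally the first path listed is the copy that gets loaded.
print "$_\n" for find_copies('Bio/SeqIO/genbank.pm', @INC);
```

More than one line of output would mean more than one copy is installed.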
> 
> Sorry, I can not come up with any better explanation,
> 
>         -Heikki
> 
> On Tuesday 16 Dec 2003 4:38 am, Wes Barris wrote:
>  > Hi,
>  >
>  > I have just succeeded in tracking down a bug that prevents genbank files
>  > written from bioperl from being properly imported into StackPack
>  > (clustering software).  The problem is due to a subtle difference in
>  > a genbank entry downloaded from NCBI and a genbank entry produced using
>  > genbank.pm.  If you use "od -c" to look at a genbank record from NCBI,
>  > you will notice that the word "ORIGIN" is followed by six space characters.
>  >
>  > ORIGIN
>  >          1 cggccgcgtc gacttttttt ttaggtattt ttctcttatt atttctaaaa tataaatttt
>  >         61 ggacattcaa aagtgcaaca ngttaatgtg cctgtgggga atatcacagt taaaaaaata
>  >
>  > If I process this file using bioperl and then write out a new genbank
>  > format file, the word "ORIGIN" is followed immediately by a carriage return
>  > (newline) character.
>  >
>  > It seems silly to me that spaces should be required after the word
>  > "ORIGIN", but they do exist in files downloaded from NCBI and StackPack
>  > seems to require these space characters in order to import a genbank file.
>  > Is there an official specification for the genbank format?  I have sent a
>  > bug report to the makers of StackPack too.
>  >
>  > In the meantime, I have modified my installed copy of Bio/SeqIO/genbank.pm,
>  > changing this line:
>  >
>  >          $self->_print(sprintf("%-6s%s\n",'ORIGIN',$o ? $o->value : ''));
>  >
>  > to this:
>  >
>  >          $self->_print(sprintf("%-12s%s\n",'ORIGIN      ',$o ? $o->value : ''));
> 
> -- 
> ______ _/      _/_____________________________________________________
>       _/      _/                      http://www.ebi.ac.uk/mutations/
>      _/  _/  _/  Heikki Lehvaslaiho    heikki_at_ebi ac uk
>     _/_/_/_/_/  EMBL Outstation, European Bioinformatics Institute
>    _/  _/  _/  Wellcome Trust Genome Campus, Hinxton
>   _/  _/  _/  Cambs. CB10 1SD, United Kingdom
>      _/      Phone: +44 (0)1223 494 644   FAX: +44 (0)1223 494 468
> ___ _/_/_/_/_/________________________________________________________
> 


-- 
Wes Barris
E-Mail: Wes.Barris at csiro.au


