[Bioperl-l] Extracting patterns

Ken Y. Clark kclark@logsoft.com
Mon, 20 Aug 2001 22:31:48 -0400 (EDT)


On Mon, 20 Aug 2001, Mariana Mondragon wrote:

> Date: Mon, 20 Aug 2001 19:03:25 -0700 (PDT)
> From: Mariana Mondragon <mmondrag@ea.oac.uci.edu>
> To: bioperl-l@bioperl.org
> Subject: [Bioperl-l] Extracting patterns
>
>
> Hi everyone,
>
> I have a list of amino acid sequences in FASTA format. With the script
> pasted below I would like to obtain the ID and length of every
> sequence, as well as the sum of all the sequence lengths, like this:
>
> f11m1513: 572
> F24D78: 967
> T12P1811: 1032
> TOTAL LENGTH = 2571
>
> However I am obtaining something like this:
> :2671
> TOTAL LENGTH= 2671
>
> This is part of the exercises I am using to learn Perl. To fix the
> problem I have tried changing the counters and the order of the
> variable declarations, but this does not seem to work. I copied the
> original script from chapter 17, "Using Perl to facilitate
> biological analysis", of the book Bioinformatics by Baxevanis A. and
> Ouellette B.F.
>
> Hope any of you can shed some light on this. Thanks in advance.
>
> M. Mondragon

> **********************************************************************
> THE SCRIPT:
>
> #!/usr/bin/perl
>
> $id='';                          #holds sequence ID of current sequence
> $length=0;                       #holds length of current sequence
> $total_length=0;                 #tallies aggregate length of all seqs
> while (<>)
> {chomp;
> if (/^>(\S+)$/)                  #found a new description line
>  {print "$id:$length\n" if $length>0;
>  $1=$id;
   ^^^^^^^
Mariana,

Here is your problem.  I think you meant to say:

   $id=$1;

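You don't want to assign to $1. It's a read-only variable that Perl
sets from the first capture group of the last successful match, so it
belongs on the right-hand side of the assignment. As a general rule,
I'd recommend you 'use strict' and run with warnings
('#!/usr/bin/perl -w' or 'use warnings') whenever you write anything
more than a few lines.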
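
For what it's worth, here is a minimal sketch of how the whole loop
might look with that fix and with strict and warnings turned on. Only
the top of your script made it into this message, so the else branch
and the final report lines are my guess at the rest, not the book's
code. The sub name fasta_lengths and the sample sequences are made up
for illustration. I also dropped the trailing $ from the pattern so
that header lines with a description after the ID still match; that is
my own change, not something from the original.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: takes FASTA lines, returns one "id: length" line
# per sequence followed by a "TOTAL LENGTH = n" line.
sub fasta_lengths {
    my @lines = @_;
    my ($id, $length, $total_length) = ('', 0, 0);
    my @report;

    for my $line (@lines) {
        chomp $line;
        if ($line =~ /^>(\S+)/) {        # found a new description line
            # Report the previous sequence, if there was one.
            push @report, "$id: $length" if $id ne '';
            $id     = $1;                # copy the capture INTO $id
            $length = 0;
        }
        else {                           # sequence data
            $line =~ s/\s+//g;           # ignore stray whitespace
            $length       += length $line;
            $total_length += length $line;
        }
    }

    # The last sequence has no following header to trigger its report.
    push @report, "$id: $length" if $id ne '';
    push @report, "TOTAL LENGTH = $total_length";
    return @report;
}

# Made-up sample input, just to show the call.
my @sample = (">seqA\n", "MKV\n", "LAAG\n",
              ">seqB some description\n", "MTT\n");
print "$_\n" for fasta_lengths(@sample);
```

To run it over a real file you could replace the last statement with
something like 'print "$_\n" for fasta_lengths(<>);' and pass the
FASTA file name on the command line.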

ky