[Bioperl-l] Fasta Qual files

hilmar.lapp@pharma.Novartis.com hilmar.lapp@pharma.Novartis.com
Thu, 14 Sep 2000 17:57:30 +0100




> > I have code for reading Phred-produced seq/qual pairs of FASTA-formatted
> > files into Bio::QualSeq objects, which are merely Bio::Seq objects with
> > quality values (being truncated and reversed, too, when you
> > truncate/reverse the seq). The code can also write qual-files.
> >
> > If you think this is useful for you I'll try to put it to the
> > repository.
>
> Sounds very useful. Check it in!

Hi,

Sorry if I'm insulting anyone by stating the
obvious.

I have to deal with quality values for the Sanger
submissions.  It is a good trick to store the
quality array in memory as a string of unsigned
chars.  Perl arrays 100k long start consuming a
lot of memory!  To do this you use pack and
unpack:

  my @qual = (40,34,35,99,99);
  my $qual_str = pack('C*', @qual);
  @qual = unpack('C*', $qual_str);

     James


     So far I have only dealt with quality values for reads, which
     obviously don't get that long. In general, incorporating this
     shouldn't be any problem, but for writing/truncation/reversal the
     array will still have to be unpacked.
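
     As a rough sketch of how the packed representation could be handled
     for these operations: since pack('C*', ...) stores one byte per
     value, substr and a scalar reverse on the string might cover
     truncation and reversal, leaving unpack for the point where the
     values are written out. The variable names are illustrative only,
     and this assumes every quality value fits in 0..255.

```perl
use strict;
use warnings;

my @qual     = (40, 34, 35, 99, 99);
my $qual_str = pack('C*', @qual);           # one byte per quality value

# Truncate to values 2..4 (1-based) while staying in packed form.
my $trunc_str = substr($qual_str, 1, 3);

# Reverse the order of the values; scalar context reverses the bytes.
my $rev_str = scalar reverse $qual_str;

# Unpack only when the numbers are needed, e.g. for a qual-file line.
my $trunc_line = join ' ', unpack('C*', $trunc_str);
my $rev_line   = join ' ', unpack('C*', $rev_str);

print "$trunc_line\n$rev_line\n";
```

     Whether this is worth it for short reads is a separate question; the
     saving only matters for long sequences.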

     James, if you're already handling quality values, does your code
     already provide what seems to be needed? Wouldn't it make sense for
     you to check in your code?

          Hilmar





Date: Thu, 14 Sep 2000 16:47:06 +0100 (BST)
From: James Gilbert <jgrg@sanger.ac.uk>
To: Bioperl <vsns-bcd-perl@lists.uni-bielefeld.de>
Subject: [Bioperl-l] Bio::Root::RootI::rearrange bug & Bio::Index changes


Hi,

I've found a bug in the Bio::Root::RootI::_rearrange method.
When new() was called without parameters, _rearrange()
assigned an empty string to every variable on the
left of the assignment (except for the last one).
I'm changing _rearrange() to just return an empty
list in that case, which works fine -- all tests pass.
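
To illustrate the intended behaviour, here is a minimal sketch of a
_rearrange-style helper. It is a simplified, hypothetical stand-in
(rearrange_sketch is an invented name), not the actual
Bio::Root::RootI code; it only shows why returning an empty list
leaves the caller's variables undefined.

```perl
use strict;
use warnings;

# Hypothetical, simplified stand-in for a _rearrange-style helper.
sub rearrange_sketch {
    my ($order, @params) = @_;

    # No parameters: return an empty list, so every variable on the
    # left of the caller's list assignment stays undef.
    return () unless @params;

    # Positional call style: pass the values through unchanged.
    return @params unless defined $params[0] && $params[0] =~ /^-/;

    # Named call style (-key => value): normalise the keys, then
    # return the values in the requested order.
    my %named;
    while (@params) {
        my $key = uc shift @params;
        $key =~ s/^-//;
        $named{$key} = shift @params;
    }
    return map { $named{ uc $_ } } @$order;
}

my @order = qw(START END STRAND);

# Called with nothing: all three variables remain undefined.
my ($start, $end, $strand) = rearrange_sketch(\@order);

# Called with named parameters in a different order.
my ($s2, $e2, $st2) = rearrange_sketch(\@order, -end => 20, -start => 10);
```

With named parameters the values come back in the order requested,
and anything not supplied is undef rather than an empty string.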

I found this after I tightened up the
Bio::SeqFeature::Generic method to only accept
'1', '-1', or '0'.

I'm making two changes to Bio::Index:

The first is the addition of an
allow_relative_paths() method.  Previously we
insisted that the path to a file being indexed
must be absolute, but this prevents new databases
being indexed in one location, and moved to
another (a popular way of updating databases).
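
The kind of check this relaxes can be sketched as follows. The helper
check_index_path and its flag argument are hypothetical names used
for illustration, not the Bio::Index interface;
File::Spec->file_name_is_absolute is the standard way to test a path.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: accept any path when relative paths are
# allowed, otherwise insist that the path is absolute.
sub check_index_path {
    my ($path, $allow_relative) = @_;

    return $path if $allow_relative;

    die "Path '$path' is not absolute\n"
        unless File::Spec->file_name_is_absolute($path);
    return $path;
}

# With relative paths allowed, the path is stored as given, so the
# index and its files can later be moved together.
my $kept = check_index_path('seqs/embl.dat', 1);

# Without the flag, the same path is rejected.
my $rejected = !eval { check_index_path('seqs/embl.dat', 0); 1 };
```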

The second is to throw an informative exception if
an id is requested that is not in the database.  At
the moment it produces an uninformative exception in
this circumstance, so I don't think I'll be
breaking anything.
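
A minimal sketch of the idea, using a plain hash in place of the real
index. The function fetch_location, the ids and the stored values are
made up for illustration and are not the Bio::Index code.

```perl
use strict;
use warnings;

# A plain hash standing in for the index (illustrative data only).
my %index = (AB000001 => 'seqs.fa:0');

# Hypothetical lookup that fails with a message naming the missing id.
sub fetch_location {
    my ($index, $id) = @_;

    die "Id '$id' is not in the index\n"
        unless exists $index->{$id};
    return $index->{$id};
}

# A known id returns its stored location.
my $loc = fetch_location(\%index, 'AB000001');

# An unknown id dies; the caller can trap it with eval and read $@.
my $ok  = eval { fetch_location(\%index, 'NO_SUCH_ID'); 1 };
my $err = $@;
```

Callers that already wrap the lookup in eval keep working; they just
get a message that names the id.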

	James

James G.R. Gilbert
The Sanger Centre
Wellcome Trust Genome Campus
Hinxton
Cambridge                        Tel: 01223 494906
CB10 1SA                         Fax: 01223 494919