Bioperl: Re: Bio::Tools::Blast

Steve A. Chervitz sac@alberich.Stanford.EDU
Wed, 26 Aug 1998 15:13:33 -0700 (PDT)


Lincoln, 

Blast.pm does not permit spaces in identifiers. In the Fasta
files I've seen, a space separates the identifier from the
description on the header line. Here's how Bio::PreSeq::parse_fasta()
grabs the identifier and description:

 ($self->{"id"}, $self->{"desc"}) = $head =~ /^>[ \t]*(\S*)[ \t]*(.*)$/;
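
As a minimal sketch of what that pattern does with headers like the ones
quoted below: split_fasta_header() is a hypothetical helper written only
for this illustration; it is not part of Bio::PreSeq or Blast.pm, and it
shows the regex alone, not how Blast.pm reads hit lines from a report.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::PreSeq or Blast.pm): applies the
# same pattern to one Fasta header line and returns (id, description).
sub split_fasta_header {
    my ($head) = @_;
    # (\S*) stops at the first whitespace, so everything after the first
    # space or tab ends up in the description, not the identifier.
    my ($id, $desc) = $head =~ /^>[ \t]*(\S*)[ \t]*(.*)$/;
    return ($id, $desc);
}

# Header with spaces: only the first word is captured as the identifier.
my ($id, $desc) = split_fasta_header(">notch4 exon #1");
printf("id=<%s> desc=<%s>\n", $id, $desc);   # id=<notch4> desc=<exon #1>

# Header with dots instead of spaces: the whole string is the identifier.
($id, $desc) = split_fasta_header(">notch4.exon.#1");
printf("id=<%s> desc=<%s>\n", $id, $desc);   # id=<notch4.exon.#1> desc=<>
```

Under this pattern, "exon #1" lands in the description, which is
consistent with getting only "notch4" back as the hit name.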

BTW, I just updated the Blast distribution (now 0.061). It includes   
an important memory management fix that helps when crunching lots of 
reports. 

Steve Chervitz
sac@genome.stanford.edu


On 26 Aug 1998, Lincoln Stein wrote:

> Hi Steve,
> 
> Does Blast.pm not deal correctly with sequence identifiers that
> contain spaces?  I just tried to blast a database made from
> identifiers like this:
> 
> >notch4 exon #1
> atgcagccccagttgctgctgctgctgctcttgccactcaatttccctgtcatcctgacc
> agag
> 
> >notch4 exon #2
> agcttctgtgtggaggatccccagagccctgtgccaacggaggcacctgcctgaggctat
> ctcggggacaagggatctgcca
> 
> >notch4 exon #3
> gtgtgcccctggatttctgggtgagacttgccagtttcctgacccctgcagggataccca
> actctgcaagaatggtggcagctgccaagccctgctccccacacccccaagctcccgtag
> tcctacttctccactgacccctcacttctcctgcacctgcccctctggcttcaccggtga
> tcgatgccaaacccatctggaagagctctgtccaccttctttctgttccaacgggggtca
> ctgctatgttcaggcctcaggccgcccacagtgctcctgcgagcctgggtggacag
> 
> but I only got "notch4" as the hit.  When I changed the spaces to
> dots, I got the full identifier.
> 
> I don't think the FASTA format forbids spaces in the identifiers.
> 
> Oh, this is with 0.06, just downloaded today.
> 
> Lincoln
> 
> -- 
> ========================================================================
> Lincoln D. Stein                           Cold Spring Harbor Laboratory
> lstein@cshl.org                                   Cold Spring Harbor, NY
> ========================================================================
> 