[Bioperl-l] Bio::Index::Fasta foreach loop

Brian Osborne brian_osborne at cognia.com
Thu Apr 10 09:58:19 EDT 2003


Ann,

You could use Bio::DB::Fasta, it's similar to Bio::Index::Fasta. From the
module documentation:

use Bio::DB::Fasta;

 # create database from directory of fasta files
 my $db      = Bio::DB::Fasta->new('/path/to/fasta/files');

 # simple access (for those without Bioperl)
 my $seq     = $db->seq('CHROMOSOME_I',4_000_000 => 4_100_000);
 my $revseq  = $db->seq('CHROMOSOME_I',4_100_000 => 4_000_000);
 my @ids     = $db->ids;
 my $length  = $db->length('CHROMOSOME_I');
 my $alphabet = $db->alphabet('CHROMOSOME_I');
 my $header  = $db->header('CHROMOSOME_I');

 # Bioperl-style access
 my $db      = Bio::DB::Fasta->new('/path/to/fasta/files');

 my $obj     = $db->get_Seq_by_id('CHROMOSOME_I');
 my $seq     = $obj->seq;
 my $subseq  = $obj->subseq(4_000_000 => 4_100_000);
 my $length  = $obj->length;
 # (etc)

 # Bio::SeqIO-style access
 my $stream  =
Bio::DB::Fasta->new('/path/to/fasta/files')->get_PrimarySeq_stre
m;
 while (my $seq = $stream->next_seq) {
   # Bio::PrimarySeqI stuff
 }

 my $fh = Bio::DB::Fasta->newFh('/path/to/fasta/files');
 while (my $seq = <$fh>) {
   # Bio::PrimarySeqI stuff
 }

 # tied hash access
 tie %sequences,'Bio::DB::Fasta','/path/to/fasta/files';
 print $sequences{'CHROMOSOME_I:1,20000'};



Brian O.

-----Original Message-----
From: bioperl-l-bounces at bioperl.org [mailto:bioperl-l-bounces at bioperl.org]On
Behalf Of Ann Hedley
Sent: Thursday, April 10, 2003 8:31 AM
To: bioperl
Subject: [Bioperl-l] Bio::Index::Fasta foreach loop

Hi

I've used Bio::Index::Fasta to make the index $inxdc from a fasta file.
 I now want a  loop to to examine each sequence in turn.

This script prints the correct sequence...

  my $seqdc = $inxdc->fetch("CEC00023");
  print $seqdc->seq;

... so the index file is OK, but how do I put that in a loop for each
sequence $id?

Any ideas much appreciated

Many Thanks
Ann


#!/usr/bin/perl -w

use strict;
use Bio::Index::Fasta; # using fasta file format

##########get decoder seqs into $idxdc
#########################################

  my $Index_File_name = shift;
  my $filedc = "/home/user1/alldc5.fsa";
  my $inxdc = Bio::Index::Fasta->new( -filename   => $filedc . ".idx",
                                    -write_flag => 1 );
  # pass a reference to the critical function to the Bio::Index object
  $inxdc->id_parser(\&get_id);
  # make the index
  $inxdc->make_index($filedc);

###########print sequence and length for all
seqs###############################

foreach $inxdc($id)        #????????????
{
  my $seqdc = $inxdc->fetch("$id");
  print $seqdc->seq;
  print " length is ",$seqdc->length(), ", decoder\n";
}


#########################################################################

# here is where the retrieval key is specified
  sub get_id
  {
     my $header = shift;
     $header =~ /^>(CEC\d{5})/;
     $1;
  }


_______________________________________________
Bioperl-l mailing list
Bioperl-l at bioperl.org
http://bioperl.org/mailman/listinfo/bioperl-l




More information about the Bioperl-l mailing list