[Bioperl-l] Trouble retrieving multiple sequences from NCBI in a single list query

Hotz, Hans-Rudolf hrh at fmi.ch
Wed Nov 4 10:05:17 UTC 2009


Hi

try

my $out = new Bio::SeqIO (-file => ">>extracted_seqs.fasta",
                                     ^

this way you no longer overwrite your existing file on each pass through
the loop, but append the next sequence.
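
To see the difference without network access or BioPerl installed, here is a
minimal sketch that uses plain Perl filehandles as a stand-in for Bio::SeqIO.
The helper names (write_reopening, count_lines) and the temporary file names
are made up for illustration; the only assumption carried over is that the
">" / ">>" prefix in -file selects the same truncate / append modes that
Perl's open uses.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: reopen $path with the given mode for every record,
# the way the original loop builds a new Bio::SeqIO object per sequence.
sub write_reopening {
    my ($path, $mode, @records) = @_;
    foreach my $rec (@records) {
        open(my $fh, $mode, $path) or die "Cannot open $path: $!\n";
        print {$fh} ">$rec\n";    # one FASTA-style header line per record
        close($fh) or die "Cannot close $path: $!\n";
    }
    return;
}

# Hypothetical helper: count the lines that ended up in the file.
sub count_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!\n";
    my @lines = <$fh>;
    close($fh);
    return scalar @lines;
}

my $dir = tempdir(CLEANUP => 1);    # scratch directory, removed at exit
my @ids = qw(YP_003107578.1 YP_003106103.1 YP_003106552.1);

# '>' truncates the file on every open, '>>' appends to it.
write_reopening("$dir/truncate.fasta", '>',  @ids);
write_reopening("$dir/append.fasta",   '>>', @ids);

my $truncated = count_lines("$dir/truncate.fasta");
my $appended  = count_lines("$dir/append.fasta");

print "'>'  kept $truncated of ", scalar(@ids), " records\n";
print "'>>' kept $appended of ", scalar(@ids), " records\n";
```

The file written with ">" ends up holding only the last record, while the one
written with ">>" holds all three. Since ">>" also appends to whatever an
earlier run left behind, it may be worth deleting extracted_seqs.fasta before
rerunning, or creating the Bio::SeqIO object once before the foreach loop
with ">" and only calling write_seq inside the loop.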

Regards, Hans



On 11/4/09 9:43 AM, "jluis.lavin at unavarra.es" <jluis.lavin at unavarra.es>
wrote:

> 
> Hello all,
> 
> I'm a newbie who is having a lot of trouble retrieving a list of
> multiple sequences from NCBI and writing them to a single file in Fasta
> format.
> The code I've written seems to read my list and retrieve the sequences, but
> it overwrites them, so I only get the last sequence on the list.
> I've been told to ask the people on this mailing list for help, since you
> may have come across this problem too, or at least will know how to solve
> it...
> 
> Here is my code. It basically reads the name of the list file from STDIN,
> reads the list into an array, and loops over it (stopping when the
> list ends), retrieving one sequence per iteration and
> writing it to a fasta file. I only get one sequence back,
> although it seems to perform the retrieval for each of the
> sequences on the list...
> 
> 
> #!/usr/bin/perl -w
> use strict;
> use Bio::DB::GenPept;
> use Bio::DB::GenBank;
> use Bio::SeqIO;
> print "Enter your list name:";
> my $archivo=<STDIN>;
> chomp $archivo;
> die ("Can´t open input\n") unless (open(INFILE, $archivo));
> my @lista = <INFILE>;
> foreach my $seq (@lista) {
>     if ($seq eq '') {
>         die ("empty list")
>         }
>     else {
> my $db = new Bio::DB::GenPept("-format" => "Fasta");
> my $seqobj = $db->get_Seq_by_acc($seq);
> my $out = new Bio::SeqIO (-file => ">extracted_seqs.fasta",
> -format => 'fasta');
> $out->write_seq($seqobj);
> }
> }
> exit;
> 
> 
> Here is an example list of sequences:
> 
> YP_003107578.1
> YP_003106103.1
> YP_003106552.1
> YP_003106560.1
> YP_003107053.1
> YP_003107450.1
> YP_003108000.1
> YP_003105023.1
> YP_003105264.1
> 
> Thanks in advance for your help ;)




