[Bioperl-l] genbank
Jason Stajich
jason at bioperl.org
Tue Nov 30 17:06:08 UTC 2010
Great. The whole point of the scripts is really to serve as examples: not
that you need to send patches back showing everything you modified, but
that you adapt them, using the modules and code, to do whatever special
thing you want. The hope is that the modules are flexible enough that you
can write a script to accomplish your goal.
BTW, the one thing you can't recover from the GBK version of the file
is the source database of the accession number. You have hardcoded 'ref',
but it can be 'gb', 'emb', 'sp', etc. Unfortunately this field isn't part
of the GenBank record. One could come up with a pattern based on
knowledge of accession number formats, but I don't know that anyone has
been worried enough about that sort of thing to try to write
something for it.
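
A rough sketch of what such a pattern-based guess could look like, purely as
an illustration and not something that exists in BioPerl. The helper names
guess_db_tag and fasta_header are hypothetical, the sample values are made up,
and the rules are assumptions: RefSeq accessions start with two letters and an
underscore, the second pattern is the published UniProtKB accession format
(which cannot tell Swiss-Prot from TrEMBL, and may collide with some INSDC
accessions), and everything else falls back to 'gb' even though it could
equally be 'emb' or 'dbj'.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: guess the FASTA database tag from the shape of an
# accession number. This is a heuristic, not an authoritative mapping.
sub guess_db_tag {
    my ($acc) = @_;
    $acc =~ s/\.\d+$//;    # ignore a trailing version suffix such as ".1"

    # RefSeq: two uppercase letters followed by an underscore (NP_, NM_, ...).
    return 'ref' if $acc =~ /^[A-Z]{2}_/;

    # UniProtKB accession format. Assumed to mean 'sp' here; the accession
    # alone cannot distinguish Swiss-Prot from TrEMBL entries.
    return 'sp'
        if $acc =~ /^(?:[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})$/;

    # Assumption: anything else is treated as GenBank. Telling 'gb', 'emb'
    # and 'dbj' apart would need a table of accession prefixes.
    return 'gb';
}

# Hypothetical helper: build the header line with the guessed tag instead
# of a hardcoded 'ref'. Takes gi, accession, locus and desc as named args.
sub fasta_header {
    my (%f) = @_;
    my $tag = guess_db_tag( $f{accession} );
    ( my $desc = $f{desc} // '' ) =~ s/\.$//;    # strip one trailing period
    return ">gi|$f{gi}|$tag|$f{accession}|$f{locus} $desc";
}

# Made-up values, only to show the call.
print fasta_header(
    gi        => '12345',
    accession => 'NP_000001',
    locus     => 'EXAMPLE',
    desc      => 'example protein.',
), "\n";
```

In the download script, the values would come from $seq->primary_id,
$seq->accession_number, $seq->display_id and $seq->desc, as in the quoted
loop below.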
> Hi again,
> I managed to solve my problem. It may be dirty, but it works the way I
> want :)
> I reworked 'download_query_genbank.pl' (attached). Now I can get
> the seqs in full FASTA for proteomes and genomes, and the GenPept
> report files for the proteomes.
> For the DB handle I only use GenPept now, because it gives me a stream
> which I can track with Term::ProgressBar.
>
> For the output I use 2 cases:
> -----------------
> while ( my $seq = $stream->next_seq ) {
>     #DIMITAR
>     my ( $gi, $locus, $refnum, $desc, $seqstr );
>     if ( $retformat eq 'fasta' ) {
>         # for the fasta as i want it
>         check_progress( $prgs, $seqnum, $count );
>         $locus  = $seq->display_id;
>         $refnum = $seq->accession_number;
>         $gi     = $seq->primary_id;
>         $desc   = $seq->desc;
>         $desc =~ s/\.$//;    # drop the trailing period
>         $seqstr = $seq->seq;
>         print $fhout ">gi|$gi|ref|$refnum|$locus $desc\n$seqstr\n";
>     }
>     else {
>         # for the genbank reports
>         check_progress( $prgs, $seqnum, $count );
>         $out->write_seq($seq);
>     }
>     $seqnum++;
>     #DIMITAR
>
>     # $out->write_seq($seq);#original
> }
> ------------------
>
> Thank you for your help and time.
>
> Cheers
> Dimitar
>
--
Jason Stajich
jason at bioperl.org