[Bioperl-l] next_seq problem

Hermann Norpois hnorpois at googlemail.com
Mon May 7 16:45:24 EDT 2012


I apologise that I forgot to mention a subject.

2012/5/7 Hermann Norpois <hnorpois at googlemail.com>

> Hello,
>
> I wrote a script to retrieve promoter sequences from genbank (see below)
> that doesnt work in the get_stream modus. Input is an array (or list) of
> geneids. In my output file (test4.fasta) only the last header is followed
> by a sequence (see below). I guess I need a second stream but I do not know
> how to construct.
>
> Thank you very much for helping.
>
> Hermann
>
> *script:
> *#!/bin/perl -w
> use strict;
> use warnings;
> use Bio::DB::EntrezGene;
> use Bio::SeqIO;
> use Bio::DB::GenBank;
>
>
> my $seqio_obj = Bio::SeqIO->new(-file => ">test4.fasta", -format => 'fasta'
> ); # outputfile
>
> my $db = new Bio::DB::EntrezGene;
>
> my $seqio = $db->get_Stream_by_id([54161, 12064, 71661]); # Gene ids
>
>    while ( my $seq = $seqio->next_seq ) {
>
>            my $ac = $seq->annotation;
>
>            for my $ann ($ac->get_Annotations('dblink')) {
>            if ($ann->database eq "Evidence Viewer") {
>                # get the sequence identifier, the start, and the stop
>        my ($contig,$from,$to) = $ann->url =~
>          /contig=([^&]+).+from=(\d+)&to=(\d+)/;
>                 my $chr_start = $from-700;
>                 my $chr_stop = $from;
>                 my $strand = 1;
>
>                 my $gb  = Bio::DB::GenBank->new(-format     => 'fasta',
>                                -seq_start  => $chr_start,
>                                -seq_stop   => $chr_stop,
>                               -strand     => $strand
>                                );
>               print "$contig\t$from\t$to\n\t$chr_start\t$chr_stop\n";  #
> For control: Printing coordinates
>                my $obj = $gb->get_Seq_by_id($contig);
>
>
>
>                $seqio_obj->write_seq($obj);
>
>    }
> }
> }
>
> *test4.fasta*
>
> >gi|28485536:10326633-10326632 Mus musculus chromosome 2 genomic contig,
> strain C57BL/6J
>
> >gi|28526280:23304000-23303999 Mus musculus chromosome 6 genomic contig,
> strain C57BL/6J
>
> >gi|372099010:6466323-6467023 Mus musculus strain C57BL/6J chromosome 7
> genomic contig, GRCm38 C57BL/6J MMCHR7_CTG5_2
> CCCCTCCCCGCAGCTTGATTCCTATAAAAACCTGCCATTTTGGATGAATGTGCTGTTCGC
> CCTTGGCTCCTTTCTTGGTCCACTTGCCCTCTCTCTTCTCTCTCTCTCCTTTCACTTTCT...
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>



More information about the Bioperl-l mailing list