[Bioperl-l] Are there arguments for REGION of ACCESSION in Bio::DB
Roy Chaudhuri
roy.chaudhuri at gmail.com
Tue Mar 13 12:41:36 EDT 2012
Hi,
I get the same error as you, although I should also note that I'm not
familiar with this module, so I may be missing a problem with the HowTo
code. Also (like you) I have an old version of BioPerl installed, so
perhaps you could try upgrading your BioPerl to the most recent version
(1.6.901) from CPAN or bioperl-live from GitHub? There have probably
been modifications to Bio::DB::GenBank since 1.6.1.
One thing I noticed - the accession numbers you quote are from RefSeq,
not GenBank (the NCBI make the two difficult to distinguish in Entrez,
but RefSeq accessions contain an underscore). I tried replacing
Bio::DB::GenBank with Bio::DB::RefSeq and that seemed to work -
according to the docs the RefSeq module downloads from the EBI rather
than the NCBI.
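
For what it's worth, here is a minimal sketch of that swap. The helper names db_class_for and fetch_by_acc are my own, made up for illustration, and the underscore test is only the rule of thumb described above, not an official accession parser. It assumes BioPerl is installed and the remote server is reachable; I have not tested it beyond the accessions in this thread.

```perl
use strict;
use warnings;

# Rule of thumb from this thread: RefSeq accessions start with a
# two-letter prefix and an underscore (NM_000344, NC_000005), while
# plain GenBank accessions contain no underscore.
sub db_class_for {
    my ($acc) = @_;
    return $acc =~ /^[A-Z]{2}_/ ? 'Bio::DB::RefSeq' : 'Bio::DB::GenBank';
}

# Load the chosen BioPerl class only when needed, then fetch the record.
# This performs a network request.
sub fetch_by_acc {
    my ($acc) = @_;
    my $class = db_class_for($acc);
    eval "require $class; 1" or die "Cannot load $class: $@";
    my $db = $class->new();
    return $db->get_Seq_by_acc($acc);
}

# Example (needs BioPerl and network access):
# my $seq = fetch_by_acc('NM_000344');
# print $seq->display_id, "\t", $seq->length, "\n";
```

Note that this only switches the database class; it does not attempt the -seq_start/-seq_stop sub-region fetch from the HowTo.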
Cheers,
Roy.
On 13/03/2012 08:30, yun YAN wrote:
> Dear Roy,
> Many thanks for your reply. I tried it as soon as I received your
> mail. However, it reports an error:
>
> MSG: acc NM_000344 does not exist
> STACK: Error::throw
> STACK: Bio::Root::Root::throw
> /usr/local/share/perl/5.10.1/Bio/Root/Root.pm:368
> STACK: Bio::DB::WebDBSeqI::get_Seq_by_acc
> /usr/local/share/perl/5.10.1/Bio/DB/WebDBSeqI.pm:195
> STACK: test_gene_bank_with_sublocation.pl:26
>
> I've repeatedly checked my code and still cannot figure out where the
> bug is. At first I thought that maybe it does not support genome
> assemblies (NC_000005), so I tried the SMN1 gene directly (NM_000344).
> Neither of them works. Even the simplest code still reports the error
> "acc NM_000344 does not exist", although the accession number does exist:
> http://www.ncbi.nlm.nih.gov/nuccore/NM_000344.3.
> My test code (copied almost exactly from the HOWTO tutorial) is:
>
> use strict;
> use warnings;
> use Bio::DB::GenBank;
> my $gb = Bio::DB::GenBank->new (-format => 'genbank', -seq_start =>
> 1, -seq_stop => 2000, -strand =>1,);
> my $seq_obj = $gb->get_Seq_by_acc('NM_000344');
> print $seq_obj; #just for test
>
> Currently my Perl is 5.10.1 and my BioPerl is 1.6.1. All code runs
> on Ubuntu 10.04 LTS. I've checked the Bio::DB::GenBank module in version
> 1.6.1, and it supports the -seq_start and -seq_stop options.
> Any ideas? I hope I'm not making some basic mistake. I look forward to
> your reply.
> Thanks.
>
> On Mon, Mar 12, 2012 at 8:38 PM, Roy Chaudhuri <roy.chaudhuri at gmail.com> wrote:
>
> I think this is what you want:
> http://www.bioperl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::GenBank_when_you_have_genomic_coordinates_to_get_a_Seq_object
>
>
> On 12/03/2012 05:33, yun YAN wrote:
>
> My goal is to get both the exon and intron regions of a gene of interest
> from a remote database (NCBI) with the help of Bio::DB::GenBank.
> "get_Seq_by_acc" works for most cases, but it seems that it cannot be
> used for exon/intron parsing.
>
> Take the gene SMN1, for example:
> http://www.ncbi.nlm.nih.gov/nuccore/NC_000005.9?report=genbank&from=70220768&to=70248839
> The exon/intron information is only available in the genome assembly
> record, and the accession number (NC_000005,
> http://www.ncbi.nlm.nih.gov/nuccore/NC_000005) is actually the genome
> contig, not the gene. To define my gene SMN1, an additional argument
> "REGION" is needed (REGION: 70220768..70248839). If I simply use
> "get_Seq_by_acc", it will not return the gene but the whole genome
> assembly record.
>
> So, any ideas about how to retrieve the gene (not the mRNA) containing
> both exons and introns? Are there any additional arguments, perhaps
> something like get_by_acc('XXXX') REGION(1234..6789)?
>
> I want to use the command line as much as possible. Until now I have
> copied the page out (it is laid out in strict GenBank format), pasted it
> into a GenBank file, and then used Bio::DB::GenBank LOCALLY. The first
> step is done by hand through the graphical interface, which is not
> convenient.
>
> Thanks