[Bioperl-l] Are there arguments for REGION of ACCESSION in Bio::DB

Roy Chaudhuri roy.chaudhuri at gmail.com
Mon Mar 12 08:38:08 EDT 2012


I think this is what you want:
http://www.bioperl.org/wiki/HOWTO:Getting_Genomic_Sequences#Using_Bio::DB::GenBank_when_you_have_genomic_coordinates_to_get_a_Seq_object
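
A minimal sketch of the approach described there, assuming the -seq_start,
-seq_stop and -strand constructor options documented for Bio::DB::GenBank.
It is untested here; the output filename is a placeholder, and whether exons
appear as separate features depends on the record NCBI returns.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Bio::DB::GenBank;
use Bio::SeqIO;

# Coordinates come from the NCBI URL in the question and are tied to
# version 9 of the accession, so the versioned lookup is used below.
my $db = Bio::DB::GenBank->new(
    -seq_start => 70220768,
    -seq_stop  => 70248839,
    -strand    => 1,    # 1 = plus strand, 2 = minus strand
);

# Fetch only the requested region of the chromosome record.
my $seq = $db->get_Seq_by_version('NC_000005.9');

# Save the sub-record so it can be parsed locally later
# (placeholder filename).
my $out = Bio::SeqIO->new(-file => '>SMN1_region.gb', -format => 'genbank');
$out->write_seq($seq);

# List exon features, if the record carries them as such; positions are
# relative to the fetched region, not to the whole chromosome.
for my $feat ($seq->get_SeqFeatures) {
    next unless $feat->primary_tag eq 'exon';
    printf "exon\t%d\t%d\n", $feat->start, $feat->end;
}
```

Intron coordinates are not necessarily annotated directly; they can be
derived from the gaps between consecutive exons of one transcript.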

On 12/03/2012 05:33, yun YAN wrote:
> My goal is to get both the exon and intron regions of a gene of interest
> from a remote database (NCBI) with the help of Bio::DB::GenBank.
> "get_Seq_by_acc" works for most cases, but it seems that it cannot be used
> for exon/intron parsing.
>
> Take the gene SMN1, for example:
> http://www.ncbi.nlm.nih.gov/nuccore/NC_000005.9?report=genbank&from=70220768&to=70248839
> The exon/intron information is only available in the genome assembly
> record, and the accession number (NC_000005,
> http://www.ncbi.nlm.nih.gov/nuccore/NC_000005) actually identifies the
> genome contig, not the gene. To define my gene SMN1, an additional
> argument "REGION" is needed (REGION: 70220768..70248839). If I simply use
> "get_Seq_by_acc", it does not return the gene but the whole genome
> assembly record.
>
> So, are there any ideas on how to retrieve the gene (not the mRNA),
> containing both exons and introns? Is there perhaps an additional argument
> to get_Seq_by_acc('XXXX'), such as REGION (1234..6789)?
>
> I want to use the command line as much as possible. Until now I have
> copied the page (it is laid out in strict GenBank format), pasted it into
> a GenBank file, and then used Bio::DB::GenBank LOCALLY. The first step is
> done by hand through the graphical interface, which is not convenient.
>
> Thanks


