[Bioperl-l] Bioperl-l Digest, Vol 71, Issue 15
Brian Osborne
bosborne11 at verizon.net
Fri Apr 10 14:05:06 UTC 2009
Dereje,
There's a HOWTO that discusses a similar approach (using local
GenBank and Entrez Gene files):
http://www.bioperl.org/wiki/HOWTO:Getting_Genomic_Sequences
However, the script provided there uses Gene IDs, not chromosome names.
A more general suggestion is to look at the Bio::DB::Fasta module, which
indexes local FASTA files and retrieves subsequences by ID, start and end.
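
A rough sketch of the Bio::DB::Fasta suggestion, not taken from the HOWTO.
It assumes the chrName values in the input match the FASTA header IDs
(e.g. "chr1"), that seqStart/seqEnd are 1-based and inclusive, and that the
header layout "chr:start-end" is acceptable. The sub names parse_region and
fasta_record and the command-line arguments are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse one tab-delimited line into (chrName, seqStart, seqEnd).
# Returns an empty list for blank lines or non-numeric coordinates
# (for example a header row).
sub parse_region {
    my ($line) = @_;
    chomp $line;
    my ($chr, $start, $end) = split /\t/, $line;
    return unless defined $end && $start =~ /^\d+$/ && $end =~ /^\d+$/;
    return ($chr, $start, $end);
}

# Build one FASTA record with the location in the header
# (header layout is an assumption).
sub fasta_record {
    my ($chr, $start, $end, $seq) = @_;
    return sprintf(">%s:%d-%d\n%s\n", $chr, $start, $end, $seq);
}

sub main {
    my ($genome, $regions) = @_;

    # Loaded here so the helper subs above work without BioPerl installed.
    require Bio::DB::Fasta;

    # $genome is a FASTA file or a directory of FASTA files. The index is
    # built on first use and reused on later runs.
    my $db = Bio::DB::Fasta->new($genome);

    open my $in, '<', $regions or die "Cannot open $regions: $!\n";
    while (my $line = <$in>) {
        my ($chr, $start, $end) = parse_region($line) or next;
        my $seq = $db->seq($chr, $start, $end);
        unless (defined $seq) {
            warn "No sequence for $chr:$start-$end\n";
            next;
        }
        print fasta_record($chr, $start, $end, $seq);
    }
    close $in;
}

if (@ARGV == 2) {
    main(@ARGV);
}
else {
    warn "Usage: $0 <genome.fa or directory> <regions.tsv>\n";
}
```

Because the lookups go through the local index rather than the network,
this should scale to a large regions file much better than fetching each
region from Ensembl.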
Brian O.
On Mar 31, 2009, at 6:59 PM, demis001 wrote:
>
> Hi ,
>
> I am new to BioPerl and to this forum, and do not even know how to start
> a new post. I have one question for you.
>
> Is there any BioPerl module that lets me retrieve a sequence based on
> chromosome name, seqStart and seqEnd, given a formatted human genome
> database downloaded to my Linux desktop?
>
> I used to do this with a Perl $URI object, and it is really slow because
> the process depends on the network. To be more specific, I took chrName,
> seqStart and seqEnd and went to the Ensembl database to get the sequences
> one by one using the Perl $URI object.
>
> I thought it might be easier to process this locally against an indexed
> database with a BioPerl module, if there is one designed for this purpose.
>
> Input: a tab-delimited (CSV) file with millions of rows containing
> chrName, seqStart and seqEnd, plus a locally formatted/indexed human
> genome. Output: FASTA records, each containing the sequence, with a
> header containing the chromosome name and location parsed from the input.
>
> Sorry if I posted in the wrong section of the forum. I would be happy to
> get any recommendation.
> Thanks
>
> Govind Chandra wrote:
>>
>> Hi,
>>
>> The code below
>>
>>
>> ====== code begins =======
>> use strict;
>> use Bio::SeqIO;
>>
>> my $infile = 'NC_000913.gbk';
>> my $seqio = Bio::SeqIO->new(-file => $infile);
>> my $seqobj = $seqio->next_seq();
>> my @features = $seqobj->all_SeqFeatures();
>> my $count = 0;
>> my $ac; # declared outside the loop so ref($ac) can be printed afterwards
>> foreach my $feature (@features) {
>>     # Only report CDS features
>>     next unless $feature->primary_tag() eq 'CDS';
>>     print($feature->start(), " ", $feature->end(), " ", $feature->strand(), "\n");
>>     $ac = $feature->annotation();
>>     my $temp1 = $ac->get_Annotations("locus_tag"); # called in scalar context
>>     my @temp2 = $ac->get_Annotations();
>>     print("$temp1 $temp2[0] @temp2\n");
>>     last if $count++ > 5;
>> }
>>
>> print(ref($ac), "\n");
>> exit;
>>
>>
>> ======= code ends ========
>>
>> produces the output
>>
>> ========== output begins ========
>>
>> 190 255 1
>> 0
>> 337 2799 1
>> 0
>> 2801 3733 1
>> 0
>> 3734 5020 1
>> 0
>> 5234 5530 1
>> 0
>> 5683 6459 -1
>> 0
>> 6529 7959 -1
>> 0
>> Bio::Annotation::Collection
>>
>> =========== output ends ==========
>>
>> $ac is-a Bio::Annotation::Collection but does not actually contain any
>> annotation from the feature. Is this how it should be? I cannot figure
>> out what is wrong with the script. Earlier I used to use has_tag(),
>> get_tag_values() etc. but the documentation says these are deprecated.
>>
>> Perl is 5.8.8. BioPerl version is 1.6 (installed today). Output of
>> uname -a is
>>
>> Linux n61347 2.6.18-92.1.6.el5 #1 SMP Fri Jun 20 02:36:06 EDT 2008
>> x86_64 x86_64 x86_64 GNU/Linux
>>
>> Thanks in advance for any help.
>>
>> Govind
>>
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>