[Bioperl-l] Bioperl-l Digest, Vol 71, Issue 15

demis001 dereje1227 at yahoo.com
Thu Apr 2 09:45:08 EDT 2009


Hi,

I am new to BioPerl and to this forum, and I do not even know how to start
a new post. I have one question for you guys.

Is there a BioPerl module that lets me extract a sequence by chromosome
name, seqStart and seqEnd from a formatted human genome database that I
have downloaded to my Linux desktop?

I used to do this with a Perl $URI object, and it is really slow because the
process depends on the network. To be more specific, I took chrName, seqStart
and seqEnd and fetched the sequences from the Ensembl database one by one
using the Perl $URI object.

I thought it might be easier to process the data locally against an indexed
database using a BioPerl module, if there is one designed for this purpose.

Input: a tab-delimited file with millions of rows, each containing chrName,
seqStart and seqEnd, plus a locally formatted/indexed human genome. Output:
FASTA records, each containing the sequence, with a header that includes the
chromosome name and the location parsed from the input.
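
To make the request concrete, here is a rough sketch of the workflow
described above. It assumes Bio::DB::Fasta (part of BioPerl) as the local
index, that the sequence IDs in the genome FASTA file match the chrName
column exactly (e.g. "chr1" versus "1"), and that coordinates are 1-based
and inclusive. The sub name regions_to_fasta, the file names and the
"chr:start-end" header layout are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read tab-delimited regions (chrName, seqStart, seqEnd) from $in and write
# one FASTA record per region to $out. $fetch is a code reference that
# returns the sequence for ($chr, $start, $end).
sub regions_to_fasta {
    my ($in, $out, $fetch) = @_;
    while (my $line = <$in>) {
        chomp $line;
        next if $line =~ /^\s*(?:#|$)/;    # skip comments and blank lines
        my ($chr, $start, $end) = split /\t/, $line;
        my $seq = $fetch->($chr, $start, $end);
        unless (defined $seq && length $seq) {
            warn "No sequence for $chr:$start-$end\n";
            next;
        }
        print {$out} ">$chr:$start-$end\n";
        # wrap the sequence at 60 columns
        print {$out} "$1\n" while $seq =~ /(.{1,60})/g;
    }
    return;
}

# Usage (hypothetical file names):
#   perl extract_regions.pl genome.fa regions.tsv > regions.fa
if (@ARGV == 2) {
    my ($genome, $regions) = @ARGV;
    require Bio::DB::Fasta;                  # from BioPerl
    my $db = Bio::DB::Fasta->new($genome);   # builds the index if it is missing
    open my $in, '<', $regions or die "Cannot open $regions: $!";
    regions_to_fasta($in, \*STDOUT, sub { $db->seq(@_) });
    close $in;
}
```

Passing the lookup as a code reference keeps the file handling separate from
the index, so another local source could be swapped in for Bio::DB::Fasta.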

Sorry if I posted in the wrong section of the forum. I would be happy to get
any recommendation.
Thanks

Govind Chandra wrote:
> 
> Hi,
> 
> The code below
> 
> 
> ====== code begins =======
> #use strict;
> use Bio::SeqIO;
> 
> $infile='NC_000913.gbk';
> my $seqio=Bio::SeqIO->new(-file => $infile);
> my $seqobj=$seqio->next_seq();
> my @features=$seqobj->all_SeqFeatures();
> my $count=0;
> foreach my $feature (@features) {
>   unless($feature->primary_tag() eq 'CDS') {next;}
>   print($feature->start(),"   ", $feature->end(), "   ",$feature->strand(),"\n");
>   $ac=$feature->annotation();
>   $temp1=$ac->get_Annotations("locus_tag");
>   @temp2=$ac->get_Annotations();
>   print("$temp1   $temp2[0] @temp2\n");
>   if($count++ > 5) {last;}
> }
> 
> print(ref($ac),"\n");
> exit;
> 
> ======= code ends ========
> 
> produces the output
> 
> ========== output begins ========
> 
> 190   255   1
> 0    
> 337   2799   1
> 0    
> 2801   3733   1
> 0    
> 3734   5020   1
> 0    
> 5234   5530   1
> 0    
> 5683   6459   -1
> 0    
> 6529   7959   -1
> 0    
> Bio::Annotation::Collection
> 
> =========== output ends ==========
> 
> $ac is-a Bio::Annotation::Collection but does not actually contain any
> annotation from the feature. Is this how it should be? I cannot figure
> out what is wrong with the script. Earlier I used to use has_tag(),
> get_tag_values() etc. but the documentation says these are deprecated.
> 
> Perl is 5.8.8. BioPerl version is 1.6 (installed today). Output of uname
> -a is
> 
> Linux n61347 2.6.18-92.1.6.el5 #1 SMP Fri Jun 20 02:36:06 EDT 2008
> x86_64 x86_64 x86_64 GNU/Linux
> 
> Thanks in advance for any help.
> 
> Govind
> 
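
Regarding the quoted script: it reads the qualifiers through annotation().
For comparison, here is a sketch that reads /locus_tag with the tag methods
mentioned in the quoted message, has_tag() and get_tag_values(), which
features created by Bio::SeqIO provide. Whether those methods count as
deprecated in a given BioPerl release is something I cannot confirm, so
treat this as an assumption to check against the documentation. The sub name
cds_locus_tags is made up, and get_SeqFeatures() returns only the top-level
features of the sequence.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return [start, end, strand, locus_tag] for every CDS feature of a sequence
# object. The locus_tag column is '' when the feature has no such qualifier.
sub cds_locus_tags {
    my ($seqobj) = @_;
    my @rows;
    foreach my $feature ($seqobj->get_SeqFeatures()) {
        next unless $feature->primary_tag() eq 'CDS';
        my ($locus_tag) = $feature->has_tag('locus_tag')
            ? $feature->get_tag_values('locus_tag')
            : ('');
        push @rows,
            [ $feature->start(), $feature->end(), $feature->strand(), $locus_tag ];
    }
    return @rows;
}

# Usage: perl cds_tags.pl NC_000913.gbk
if (@ARGV == 1) {
    require Bio::SeqIO;                      # from BioPerl
    my $seqio  = Bio::SeqIO->new(-file => $ARGV[0], -format => 'genbank');
    my $seqobj = $seqio->next_seq();
    print join("\t", @$_), "\n" for cds_locus_tags($seqobj);
}
```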




