[Bioperl-l] Query Unigene title from input a ACC number

Andrew Macgregor andrew at anatomy.otago.ac.nz
Wed Mar 26 11:10:41 EST 2003


Hi Darson,

When I do this sort of thing I normally use BioPerl to parse the 
Hs.data file once, into a format I can then load into MySQL. Then I 
run the thousands of queries against the database. Even if the 
parsing takes a little while, you only have to do it once. UniGene is 
only updated every few weeks, so I just reload it into the database 
when needed.
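
As a rough sketch of that one-pass step (not my actual script): it
reuses the Bio::ClusterIO calls from your script quoted below, while
the function name dump_accessions, the command-line arguments and the
three-column output layout are illustrative assumptions only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Write one tab-delimited row per member sequence:
#   accession <TAB> UniGene ID <TAB> cluster title
# $stream is any object with next_cluster(), e.g. a Bio::ClusterIO stream.
# Returns the number of rows written.
sub dump_accessions {
    my ($stream, $out) = @_;
    my $rows = 0;
    while (my $cluster = $stream->next_cluster()) {
        my $id    = $cluster->unigene_id();
        my $title = $cluster->title();
        while (my $seq = $cluster->next_seq()) {
            print {$out} join("\t", $seq->accession_number(), $id, $title), "\n";
            $rows++;
        }
    }
    return $rows;
}

# Only parse a real file when run as: perl script.pl Hs.data out.tsv
if (!caller() && @ARGV == 2) {
    require Bio::ClusterIO;    # loaded lazily; needs BioPerl installed
    my ($hs_data, $tsv) = @ARGV;
    my $stream = Bio::ClusterIO->new(-file => $hs_data, -format => 'unigene');
    open my $out, '>', $tsv or die "Cannot write $tsv: $!";
    my $rows = dump_accessions($stream, $out);
    close $out or die "Cannot close $tsv: $!";
    print STDERR "Wrote $rows rows to $tsv\n";
}
```

The resulting file can be bulk-loaded with MySQL's LOAD DATA INFILE
into a table with an index on the accession column, so each lookup
becomes one indexed SELECT instead of a scan of the whole Hs.data file.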

-- Andrew.


On Tuesday, March 25, 2003, at 07:39  PM, darson wrote:

> Hello,
>
> I'm trying to write a script that retrieves the UniGene title from an
> Hs.data file, given an ACC number as input. The following script is a
> preliminary test:
>
> use Bio::Cluster::UniGene;
> use Bio::ClusterIO;
> use Bio::ClusterI;
>
> # Location of the human UniGene file from the NCBI FTP site
> my $stream = Bio::ClusterIO->new('-file'   => "/home/human_unigene/Hs.data",
>                                  '-format' => "unigene");
> while (my $in = $stream->next_cluster()) {
>     while (my $sequence = $in->next_seq()) {
>         # BG618921 is an ACC member of Hs.107, fibrinogen-like 1
>         if ($sequence->accession_number() =~ /BG618921/) {
>             print $in->unigene_id(), "\n";
>             print $in->title(), "\n";
>         }
>     }
> }
>
> It reports the correct cluster, but the script takes over an hour to
> finish, which is extremely inefficient, and I have thousands of
> accessions to look up. I would appreciate any suggestions or other
> methods to solve this problem. Thanks!
>
> Best regards,
> Darson Chung
> 2003/03/25
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at bioperl.org
> http://bioperl.org/mailman/listinfo/bioperl-l
>
>


