[Bioperl-l] Re: load_gff.pl question

Shin Enomoto shin at biosci.cbs.umn.edu
Wed Aug 6 23:52:19 EDT 2003


Thank you.
After a good night's sleep I modified the GFF table one column at a
time and found that (ref, source, method, start, end, gclass, name)
mattered. Where does fbin come from?
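
One way to check which input lines collide on those seven fields is
sketched below in Python. It is only an illustration, not part of
load_gff.pl: the function names feature_key and find_collisions are
hypothetical, and it assumes that gclass and name are taken from the
first 'Class "name"' pair in the ninth GFF column.

```python
import re
from collections import defaultdict


def feature_key(line):
    """Return (ref, source, method, start, end, gclass, name) for one GFF line."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 9:
        raise ValueError("expected 9 tab-separated columns")
    ref, source, method = cols[0], cols[1], cols[2]
    start, end = int(cols[3]), int(cols[4])
    # Assumption: the first 'Class "name"' pair in column 9 defines the group.
    match = re.match(r'\s*(\S+)\s+"([^"]*)"', cols[8])
    gclass, name = (match.group(1), match.group(2)) if match else (None, None)
    return (ref, source, method, start, end, gclass, name)


def find_collisions(lines):
    """Map each key shared by two or more lines to their 1-based line numbers."""
    seen = defaultdict(list)
    for number, line in enumerate(lines, 1):
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        seen[feature_key(line)].append(number)
    return {key: numbers for key, numbers in seen.items() if len(numbers) > 1}
```

Calling find_collisions(open("features.gff")) on a hypothetical file
would list the groups of lines that look identical on those fields.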

I have a different question.
When I load the following:
VI_3	nelson	cdna	404988	405465	.	+	.	EST "gi|2099810|"; Note "CpEST.323  
uniZAPCpIOWAsporoLib3 Cryptosporidium parvum cDNA 5' similar to C.  
elegans ORF M28.5 and H. sapiens nuclear protein-NHP2-like protein.,  
mRNA sequence"

with a
[EST]
glyph	      = generic

in the configuration file, the gbrowse script fails with
"glyph genric new not available".

My workaround was either to change the word EST to something else or
to use another glyph. What do you think the conflict is?


On Wednesday, August 6, 2003, at 02:21  PM, Scott Cain wrote:

> Shin,
>
> The problem you are running into is not really with load_gff.pl, but
> with the database schema.  Assuming you are using MySQL, the table
> create statement for fdata looks like this:
>
>  create table fdata (
>     fid                 int not null  auto_increment,
>     fref                varchar(100) not null,
>     fstart              int unsigned   not null,
>     fstop               int unsigned   not null,
>     fbin                double(20,6)  not null,
>     ftypeid             int not null,
>     fscore              float,
>     fstrand             enum('+','-'),
>     fphase              enum('0','1','2'),
>     gid                 int not null,
>     ftarget_start       int unsigned,
>     ftarget_stop        int unsigned,
>     primary key(fid),
>     unique index(fref,fbin,fstart,fstop,ftypeid,gid),
>     index(ftypeid),
>     index(gid)
>  );
>
> The problem you have is with that unique index on
> (fref,fbin,fstart,fstop,ftypeid,gid).  This index conflicts with your
> data, in that the similar lines are getting assigned the same gid
> (group id), since they look like the same thing.  So, the quick way to
> fix this is to remove the 'unique' from the index declaration.  That
> can be found in Bio/DB/GFF/Adaptor/dbi/mysql.pm.  Then run load_gff.pl
> as usual.  The longer way to fix this is to look at your data and
> figure out why they are all getting assigned the same group id and
> make them sufficiently different so that they don't.
>
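
The effect of that unique index can be reproduced in miniature. The
sketch below is an assumption-laden illustration, not the Bio::DB::GFF
loader: it uses Python's built-in sqlite3 module as a stand-in for
MySQL, a cut-down fdata table, a hypothetical helper named load_rows,
and placeholder values for fbin, ftypeid and gid. How MySQL and
load_gff.pl actually report or handle the duplicate may differ.

```python
import sqlite3


def load_rows(rows, unique=True):
    """Insert rows into a cut-down fdata table; return (loaded, rejected)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table fdata ("
        " fid integer primary key autoincrement,"
        " fref text not null, fstart integer not null,"
        " fstop integer not null, fbin real not null,"
        " ftypeid integer not null, gid integer not null)"
    )
    # The only difference between the two modes is the word 'unique'.
    kind = "unique index" if unique else "index"
    conn.execute(
        f"create {kind} fdata_key"
        " on fdata (fref, fbin, fstart, fstop, ftypeid, gid)"
    )
    loaded = rejected = 0
    for row in rows:
        try:
            conn.execute(
                "insert into fdata (fref, fstart, fstop, fbin, ftypeid, gid)"
                " values (?, ?, ?, ?, ?, ?)",
                row,
            )
            loaded += 1
        except sqlite3.IntegrityError:
            # A second row with the same six key columns violates the index.
            rejected += 1
    conn.close()
    return loaded, rejected


# Two features that differ only in columns outside the index
# (fbin, ftypeid and gid here are placeholder values).
rows = [("VI_3", 404988, 405465, 1000.0, 1, 1)] * 2
```

With these rows, load_rows(rows, unique=True) keeps one row and rejects
the other, while load_rows(rows, unique=False) keeps both, which is the
trade-off between the quick fix and cleaning up the group names.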
> Hope that helps,
> Scott
>
> On Wed, 2003-08-06 at 13:31, bioperl-l-request at portal.open-bio.org
> wrote:
>> Where do I start to customize this script to allow loading of large
>> number of similar entities?
>
> --
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                                   cain at cshl.org
> GMOD Coordinator (http://www.gmod.org/)                  216-392-3087
> Cold Spring Harbor Laboratory
>
>
Shin Enomoto
295 ASLVM
1988 Fitch Ave
St. Paul, MN 55108

612-625-7737


