[Bioperl-l] BioSQL: loading large sequence records, and taxon parsing

Hilmar Lapp hlapp at gnf.org
Fri Jun 20 14:30:34 EDT 2003



> > 
> We will try to make our full BioSQL dumps available soon. Let me know
> if you want to have them.
> 

That would be very useful. Remember that at the hackathon we said we'd
like, at some point, to dump a bioperl-db generated load, reload it into
a biojava-managed instance, and see how things look then.

Although I guess the biojava folks want a Postgres dump for that.

	-hilmar

> Elia
> 
> 
> >
> >>
> >> 3. A problem I encountered may be related to how the taxon_name
> >> table is populated by load_seqdatabase.pl (or the modules it
> >> calls). I loaded the database with two organelle genomes, the
> >> mitochondrion and the chloroplast, using the following two records
> >> in that order. Though both records show up in the bioentry table,
> >> it seems only the info from the first record got populated into
> >> the taxon_name table:
> >>
> >>  taxon_id |                name                |   name_class
> >> ----------+------------------------------------+-----------------
> >>         1 | Eukaryota                          | scientific name
> >>         2 | Viridiplantae                      | scientific name
> >> .......... extra lines removed ...................
> >>        13 | Brassicaceae                       | scientific name
> >>        14 | Arabidopsis                        | scientific name
> >>        15 | Mitochondrion                      | scientific name
> >>        16 | Mitochondrion Arabidopsis          | scientific name
> >>        17 | Mitochondrion Arabidopsis thaliana | scientific name
> >>        17 | thale cress                        | common name
> >> (18 rows)
> >
> > To be honest, I would not worry about it, as long as you can fetch
> > the results correctly. I have actually met such a case before. One
> > way to solve it is to run load_ncbi_taxonomy before loading your
> > sequences. (That may be unnecessary in your case.)
> >
> > A user-to-user talk. :-)
> >
> > Juguang
> >
> > ------------ATGCCGAGCTTNNNNCT--------------
> > Juguang Xiao
> > Temasek Life Sciences Laboratory, National University of Singapore
> > 1 Research Link, Singapore 117604
> > juguang at tll.org.sg
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at portal.open-bio.org 
> > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >
> ---
> Bioinformatics Program Manager
> Temasek Life Sciences Laboratory
> 1, Research Link
> Singapore 117604
> Tel. +65 6874 4945
> Fax. +65 6872 7007
> 
> 
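
One way to check the taxon_name symptom quoted above is to look for
bioentry rows whose taxon has no scientific name. The following is only a
minimal sketch: it uses an in-memory SQLite database with simplified
stand-ins for the bioentry and taxon_name tables (only the columns needed
for the check), and the entry names and the taxon_id 18 are hypothetical.
A real BioSQL instance has more columns and constraints, and the cause of
the original report may well be something else.

```python
import sqlite3

# Simplified stand-ins for the BioSQL tables named in the thread.
# Assumption: only the columns needed for this check are modeled.
SCHEMA = """
CREATE TABLE taxon_name (taxon_id INTEGER, name TEXT, name_class TEXT);
CREATE TABLE bioentry (bioentry_id INTEGER PRIMARY KEY,
                       taxon_id INTEGER, name TEXT);
"""


def entries_missing_scientific_name(conn):
    """Return names of bioentries whose taxon has no 'scientific name' row."""
    rows = conn.execute(
        """
        SELECT b.name
        FROM bioentry b
        LEFT JOIN taxon_name t
          ON t.taxon_id = b.taxon_id
         AND t.name_class = 'scientific name'
        WHERE t.taxon_id IS NULL
        ORDER BY b.bioentry_id
        """
    ).fetchall()
    return [row[0] for row in rows]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Two name rows copied from the quoted taxon_name listing.
    conn.executemany(
        "INSERT INTO taxon_name VALUES (?, ?, ?)",
        [
            (17, "Mitochondrion Arabidopsis thaliana", "scientific name"),
            (17, "thale cress", "common name"),
        ],
    )
    # Hypothetical entries: the second points at a taxon_id with no names.
    conn.executemany(
        "INSERT INTO bioentry VALUES (?, ?, ?)",
        [(1, 17, "mito_genome"), (2, 18, "chloroplast_genome")],
    )
    return entries_missing_scientific_name(conn)


if __name__ == "__main__":
    print(demo())
```

The same LEFT JOIN could be run directly in psql or mysql against a real
instance, adjusting column names to whatever schema version is installed.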
