[Bioperl-l] load_seqdatabase.pl

Roy Chaudhuri roy.chaudhuri at gmail.com
Fri Feb 19 07:38:56 EST 2010


Hi Krzysztof,

Please cc all replies to the mailing list, that way others can 
contribute answers. I don't have experience of dealing with contig 
records so I'm not sure how best to deal with them in BioPerl/BioSQL. 
From a quick search it looks like representation of this type of data 
is still on the BioSQL to-do list:

http://biosql.org/wiki/Enhancement_Requests#Ability_to_fully_represent_contig_assembly
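
As a rough illustration of what such a representation would need to capture, the CONTIG line quoted below can be split into gaps and contig pieces, each placed in chromosome coordinates. This is only a sketch in Python: the `Segment` type, the `parse_contig` helper and the token pattern are made up for this example and are not part of BioPerl or BioSQL, and only the plain `gap(N)` and `ACC.V:start..end` forms seen in this record are handled.

```python
import re
from typing import List, NamedTuple, Optional

# CONTIG line from the NC_000022 record quoted in this thread.
CONTIG = (
    "join(gap(10000),gap(12990000),gap(3000000),gap(50000),"
    "NT_028395.3:1..647850,gap(150000),NT_011519.10:1..3661581,"
    "gap(100000),NT_011520.12:1..29755346,gap(50000),"
    "NT_011526.7:1..829789,gap(50000),gap(10000))"
)

# Hypothetical token pattern: either gap(N) or ACCESSION.VERSION:start..end.
TOKEN = re.compile(
    r"gap\((?P<gap>\d+)\)"
    r"|(?P<acc>[A-Z]+_?\d+\.\d+):(?P<start>\d+)\.\.(?P<end>\d+)"
)


class Segment(NamedTuple):
    accession: Optional[str]  # None means the segment is a gap
    parent_start: int         # 1-based start on the assembled record
    parent_end: int           # 1-based inclusive end on the assembled record


def parse_contig(line: str) -> List[Segment]:
    """Place each gap/contig piece on the parent sequence, in order."""
    segments = []
    offset = 0
    for m in TOKEN.finditer(line):
        if m.group("gap") is not None:
            accession = None
            length = int(m.group("gap"))
        else:
            accession = m.group("acc")
            length = int(m.group("end")) - int(m.group("start")) + 1
        segments.append(Segment(accession, offset + 1, offset + length))
        offset += length
    return segments


segments = parse_contig(CONTIG)
for seg in segments:
    print(seg.accession or "gap", seg.parent_start, seg.parent_end)
```

The end of the last segment equals the length on the LOCUS line (51304566), so the gaps do need to be stored somewhere, together with the contig accessions, if the assembly is to be reconstructed from the database.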

Roy.

On 19/02/2010 10:24, Krzysztof Sarapata wrote:
> Hi Roy
>
> I'm grateful for your answers.
>
> I suppose the loose paradigm of BioSQL allows any feature tag to be
> stored in the appropriate table, even those we don't expect.
>
> I am considering extending the existing structure to represent contig
> assembly. The most difficult part for me is representing the relation
> between a sequence from a contig and the entry record. Specifically,
> where should the contig name (e.g. NT_028395.3) go? Perhaps in the
> "Dbxref" table, since we know the remote location of this contig? And
> should a new column be added to "location" for gaps?
>
>
> LOCUS NC_000022 51304566 bp DNA linear CON 10-JUN-2009
> ...
> CONTIG join(gap(10000),gap(12990000),gap(3000000),gap(50000),
>   NT_028395.3:1..647850,gap(150000),NT_011519.10:1..3661581,
>   gap(100000),NT_011520.12:1..29755346,gap(50000),
>   NT_011526.7:1..829789,gap(50000),gap(10000))
>
> With regards
> Kris
>
> ----- Original Message -----
> From: "Roy Chaudhuri"<roy.chaudhuri at gmail.com>
> To: "Krzysztof Sarapata"<mysarapa at cyf-kr.edu.pl>
> Cc:<bioperl-l at bioperl.org>
> Sent: Wednesday, February 17, 2010 5:29 PM
> Subject: Re: [Bioperl-l] load_seqdatabase.pl
>
>
>> Hi Krzysztof,
>>
>> The reasons for this are discussed in this post on the BioSQL list:
>> http://lists.open-bio.org/pipermail/biosql-l/2005-March/000751.html
>>
>> In short - BioPerl does not perform any magic to insert the db_xref
>> feature tags where you might expect (i.e. the seqfeature_dbxref table). It
>> should be possible to write a SeqProcessor to do so, although it's pretty
>> easy to access them where they are with a query like:
>> SELECT * FROM seqfeature_qualifier_value s JOIN term t USING(term_id)
>> WHERE t.name='db_xref'
>>
>> Roy.
>>
>> On 16/02/2010 17:03, Krzysztof Sarapata wrote:
>>> Hi
>>>
>>> I've loaded a GenBank file into the BioSQL schema with the
>>> "load_seqdatabase.pl" script.
>>>
>>> Is it correct that every qualifier from FEATURES, even "db_xref", e.g.
>>>
>>> FEATURES             Location/Qualifiers
>>>      source          1..257719
>>>                      /organism="Homo sapiens"
>>>                      /mol_type="genomic DNA"
>>>                      /db_xref="taxon:9606"
>>>                      /chromosome="1"
>>> .....
>>>                      /db_xref="GeneID:100287102"
>>>
>>> is put into the "Seqfeature Qualifier Value" table, even though there
>>> is a dedicated "Dbxref" table?
>>>
>>> Why isn't the value of this qualifier (db_xref) put into the "Dbxref"
>>> table?
>>>
>>> With regards
>>>
>>> Krzysztof Sarapata
>
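
The db_xref query from the earlier reply quoted above can be tried against a toy database. The sketch below uses Python's sqlite3 module with a heavily cut-down, assumed version of the two tables involved (the real BioSQL tables have more columns and constraints), loads the qualifiers from the example record, and then splits each value into the database name and accession that a dbxref row would hold. The splitting step is an illustration only, not what load_seqdatabase.pl or any existing SeqProcessor does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Cut-down stand-ins for the BioSQL tables (assumed subset of columns).
conn.executescript("""
CREATE TABLE term (
    term_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE seqfeature_qualifier_value (
    seqfeature_id INTEGER NOT NULL,
    term_id       INTEGER NOT NULL REFERENCES term(term_id),
    "rank"        INTEGER NOT NULL DEFAULT 0,
    value         TEXT NOT NULL
);
INSERT INTO term VALUES (1, 'organism'), (2, 'db_xref'), (3, 'chromosome');
INSERT INTO seqfeature_qualifier_value VALUES
    (10, 1, 1, 'Homo sapiens'),
    (10, 2, 1, 'taxon:9606'),
    (10, 3, 1, '1'),
    (11, 2, 1, 'GeneID:100287102');
""")

# Same join as in the reply, with explicit columns instead of SELECT *.
rows = conn.execute(
    "SELECT s.seqfeature_id, s.value "
    "FROM seqfeature_qualifier_value s JOIN term t USING(term_id) "
    "WHERE t.name='db_xref' "
    "ORDER BY s.seqfeature_id"
).fetchall()

# Split 'dbname:accession' at the first colon only.
xrefs = [(feature_id, *value.split(":", 1)) for feature_id, value in rows]

for feature_id, dbname, accession in xrefs:
    print(feature_id, dbname, accession)
```

Each resulting tuple carries the feature id plus the two pieces that would be needed to link a feature to a cross-reference, if one wanted to move these values out of the qualifier table.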


