[Bioperl-l] bp_genbank2gff3.pl - circular genomes, origin-spanning features, and GFF3
Leighton Pritchard
lpritc at scri.ac.uk
Fri Apr 9 09:30:25 EDT 2010
Sorry - I clicked 'send' by accident while moving windows around during
composition. The full email is coming soon...
Apologies,
L.
On Friday, 9 April 2010, 14:29, "Chris Fields" <cjfields at illinois.edu>
wrote:
> Leighton,
>
> Didn't see the GFF3 in question.
>
> chris
>
> On Apr 9, 2010, at 8:06 AM, Leighton Pritchard wrote:
>
>> Hi,
>>
>> (cc'd to Lincoln due to GFF3 relevance)
>>
>> I've recently been trying to use BioPerl, CHADO and GBROWSE to represent
>> bacterial genome sequences. In doing this, I've been testing with GenBank
>> genome/feature files, converting these to GFF3 with bp_genbank2gff3.pl to
>> get a CHADO-friendly gene model. There appears to be an issue when
>> converting GenBank files that contain features which span the genomic
>> origin.
>>
>> For example, the GenBank file NC_002128.gbk describes a plasmid from E. coli
>> O157:H7. This contains the following feature, which spans the reference
>> sequence origin:
>>
>> gene join(92527..92721,1..2502)
>> /gene="tagA"
>> /locus_tag="pO157p01"
>> /db_xref="GeneID:1789672"
>> CDS join(92527..92721,1..2502)
>> /gene="tagA"
>> /locus_tag="pO157p01"
>> /codon_start=1
>> /transl_table=11
>> /product="ToxR-regulated lipoprotein"
>> /protein_id="NP_052607.1"
>> /db_xref="GI:10955349"
>> /db_xref="GeneID:1789672"
>>
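To make the coordinate problem concrete, here is a minimal Python sketch (not part of bp_genbank2gff3.pl or BioPerl) that parses the join() location above, checks whether it wraps through the origin, and computes the feature length. The function names are hypothetical, the parser handles only simple forward-strand locations, and the sequence length of 92721 is an assumption taken from the end of the first segment. The last helper follows the circular-landmark convention as I understand it from the current GFF3 specification, where an origin-crossing feature keeps start <= end by writing its end as end plus the landmark length.

```python
import re

# Assumption: plasmid length, inferred from the first join() segment ending at 92721.
SEQ_LENGTH = 92721


def parse_join(location):
    """Parse 'join(a..b,c..d)' or 'a..b' into a list of (start, end) tuples.

    Handles simple forward-strand locations only (no complement(), no < or >).
    """
    inner = location.strip()
    match = re.fullmatch(r"join\((.*)\)", inner)
    if match:
        inner = match.group(1)
    segments = []
    for part in inner.split(","):
        start, end = part.strip().split("..")
        segments.append((int(start), int(end)))
    return segments


def spans_origin(segments, seq_length):
    """True if one segment ends on the last base and the next restarts at base 1."""
    return any(
        prev_end == seq_length and next_start == 1
        for (_, prev_end), (next_start, _) in zip(segments, segments[1:])
    )


def feature_length(segments):
    """Total number of bases covered by all segments."""
    return sum(end - start + 1 for start, end in segments)


def unwrapped_coords(segments, seq_length):
    """Single (start, end) pair with end pushed past the origin if needed."""
    start = segments[0][0]
    end = segments[-1][1]
    if spans_origin(segments, seq_length):
        end += seq_length
    return start, end


segments = parse_join("join(92527..92721,1..2502)")
print(spans_origin(segments, SEQ_LENGTH))
print(feature_length(segments))
print(unwrapped_coords(segments, SEQ_LENGTH))
```

Treating the two segments as one wrapped interval keeps the feature length at 2697 bases (899 codons), whereas naively taking the minimum start and maximum end of the segments would make the gene appear to cover almost the whole plasmid.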
>> Using the bp_genbank2gff3.pl script (either from bioperl-live or release
>> 1.6.1) to convert NC_002128.gbk to GFF3 with the command line
>>
>> $ bp_genbank2gff3.pl ./Escherichia_coli_O157H7/NC_002128.gbk -out stdout > test.gff3
>>
>> produces the following GFF, which is not compatible with the Sequence Ontology:
>
>
> ??????
>
>
>> --
>> Dr Leighton Pritchard MRSC
>> D131, Plant Pathology Programme, SCRI
>> Errol Road, Invergowrie, Perth and Kinross, Scotland, DD2 5DA
>> e:lpritc at scri.ac.uk w:http://www.scri.ac.uk/staff/leightonpritchard
>> gpg/pgp: 0xFEFC205C tel:+44(0)1382 562731 x2405