[Biopython-dev] [Bug 2544] Bio.GenBank and SeqFeature improvements

bugzilla-daemon at portal.open-bio.org
Sat Aug 9 18:53:16 UTC 2008


http://bugzilla.open-bio.org/show_bug.cgi?id=2544

------- Comment #3 from biopython-bugzilla at maubp.freeserve.co.uk  2008-08-09 14:53 EST -------
I have updated Bio/SeqFeature.py in CVS to add __repr__ methods to the
SeqFeature object and to the location objects (for example FeatureLocation
and ExactPosition).

Checking in SeqFeature.py;
/home/repository/biopython/biopython/Bio/SeqFeature.py,v  <--  SeqFeature.py
new revision: 1.12; previous revision: 1.11
done

As an example of the sort of output you now get:

>>> from Bio import SeqIO
>>> record = SeqIO.read(open("AE017199.gbk"), "genbank")
>>> record.features[0]
Bio.SeqFeature.SeqFeature(Bio.SeqFeature.FeatureLocation(Bio.SeqFeature.ExactPosition(0),Bio.SeqFeature.ExactPosition(490885)), type='source', strand=1)
>>> print record.features[0]
type: source
location: [0:490885]
ref: None:None
strand: 1
qualifiers:
        Key: db_xref, Value: ['taxon:228908']
        Key: mol_type, Value: ['genomic DNA']
        Key: organism, Value: ['Nanoarchaeum equitans Kin4-M']

>>> print record.features[-1]
type: CDS
location: [486422:486962]
ref: None:None
strand: -1
qualifiers:
        Key: codon_start, Value: ['1']
        Key: db_xref, Value: ['GI:40069056']
        Key: locus_tag, Value: ['NEQ550']
        Key: note, Value: ['hypothetical protein']
        Key: product, Value: ['NEQ550']
        Key: protein_id, Value: ['AAR39391.1']
        Key: transl_table, Value: ['11']
        Key: translation, Value: ['MLELLAGFKQSILYVLAQFKKPEYATSYTIKLVNPFYYISDSLNVITSTKEDKVNYKVSLSDIAFDFPFKFPIVAIVEGKANREFTFIIDRQNKKLSYDLKKGIIYIQDATIIPNGIKITVNGLAELKNIKINPNDPSITVQKVVGEQNTYIIKTSKDSVKITISADFVVKAEKWLFIQ']
>>> record.features[-1]
Bio.SeqFeature.SeqFeature(Bio.SeqFeature.FeatureLocation(Bio.SeqFeature.ExactPosition(486422),Bio.SeqFeature.ExactPosition(486962)), type='CDS', strand=-1)
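
The two kinds of output above follow the usual Python convention: __repr__
gives an unambiguous, constructor-like string, while __str__ gives a readable
summary. Below is a minimal sketch of that pattern. The classes Position,
Location and Feature are hypothetical stand-ins written for this
illustration (in Python 3 syntax); they are not the Bio.SeqFeature code.

```python
class Position:
    """Hypothetical stand-in for an exact position (not Bio.SeqFeature)."""

    def __init__(self, position):
        self.position = position

    def __repr__(self):
        # Constructor-like form, e.g. Position(0)
        return "%s(%r)" % (self.__class__.__name__, self.position)

    def __str__(self):
        return str(self.position)


class Location:
    """Hypothetical stand-in for a start/end location."""

    def __init__(self, start, end):
        self.start = Position(start)
        self.end = Position(end)

    def __repr__(self):
        # Nested reprs of the two positions, e.g. Location(Position(0),Position(5))
        return "%s(%r,%r)" % (self.__class__.__name__, self.start, self.end)

    def __str__(self):
        # Compact slice-style form, e.g. [0:5]
        return "[%s:%s]" % (self.start, self.end)


class Feature:
    """Hypothetical stand-in for a sequence feature."""

    def __init__(self, location, type, strand):
        self.location = location
        self.type = type
        self.strand = strand

    def __repr__(self):
        return "%s(%r, type=%r, strand=%r)" % (
            self.__class__.__name__, self.location, self.type, self.strand)

    def __str__(self):
        # One "key: value" line per attribute, for human readers.
        lines = [
            "type: %s" % self.type,
            "location: %s" % self.location,
            "strand: %s" % self.strand,
        ]
        return "\n".join(lines)


feature = Feature(Location(0, 490885), "source", 1)
print(repr(feature))
print(feature)
```

Typing the object name at the interactive prompt calls __repr__, whereas
print uses __str__, which is why the same feature is shown in two different
ways in the session above.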


I have also updated Bio/SeqRecord.py so that the __str__ method of the
SeqRecord reports the number of features.

/home/repository/biopython/biopython/Bio/SeqRecord.py,v  <--  SeqRecord.py
new revision: 1.19; previous revision: 1.18
done

For example:

>>> from Bio import SeqIO
>>> record = SeqIO.read(open("AE017199.gbk"), "genbank")
>>> print record
ID: AE017199.1
Name: AE017199
Description: Nanoarchaeum equitans Kin4-M, complete genome.
Number of features: 1107
/comment=On Dec 18, 2003 this sequence version replaced gi:37777680.
/sequence_version=1
/source=Nanoarchaeum equitans Kin4-M
/taxonomy=['Archaea', 'Nanoarchaeota', 'Nanoarchaeum']
/keywords=['']
/references=[<Bio.SeqFeature.Reference instance at 0x2aaaad299248>, <Bio.SeqFeature.Reference instance at 0x2aaaad299b48>]
/accessions=['AE017199', 'AACL01000000', 'AACL01000001']
/data_file_division=BCT
/date=22-DEC-2003
/organism=Nanoarchaeum equitans Kin4-M
/gi=40068520
Seq('TCTCGCAGAGTTCTTTTTTGTATTAACAAACCCAAAACCCATAGAATTTAATGA...TTA', IUPACAmbiguousDNA())
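
To illustrate the idea of reporting a feature count rather than listing every
feature, here is a minimal sketch of a __str__ method along those lines. The
Record class and its constructor arguments are hypothetical stand-ins for
this example (Python 3 syntax); this is not the Bio.SeqRecord code.

```python
class Record:
    """Hypothetical stand-in for a sequence record (not Bio.SeqRecord)."""

    def __init__(self, id, name, description, features=None, annotations=None):
        self.id = id
        self.name = name
        self.description = description
        self.features = features if features is not None else []
        self.annotations = annotations if annotations is not None else {}

    def __str__(self):
        lines = [
            "ID: %s" % self.id,
            "Name: %s" % self.name,
            "Description: %s" % self.description,
            # Summarise the features with a count instead of printing them all.
            "Number of features: %i" % len(self.features),
        ]
        # One "/key=value" line per annotation.
        for key, value in self.annotations.items():
            lines.append("/%s=%s" % (key, value))
        return "\n".join(lines)


record = Record(
    "AE017199.1",
    "AE017199",
    "Nanoarchaeum equitans Kin4-M, complete genome.",
    features=["placeholder feature"] * 3,
    annotations={"gi": "40068520"},
)
print(record)
```

With over a thousand features in a genome record, a single count line keeps
the printed summary short; the individual features remain available through
the features list.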

Still to do: define a __repr__ method for the Bio.SeqFeature.Reference object
(and perhaps tweak how the references are displayed by the SeqRecord __str__
method).
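
The "<Bio.SeqFeature.Reference instance at 0x...>" entries in the /references
line appear because a list is displayed using the repr of each element, and a
class without its own __repr__ falls back to the default one. The sketch
below shows the effect of adding one. Both classes and the title attribute
are hypothetical examples (Python 3 syntax), not a proposal for the actual
Reference implementation.

```python
class PlainReference:
    """Hypothetical class with no __repr__; uses Python's default."""

    def __init__(self, title):
        self.title = title


class DescribedReference:
    """Hypothetical class with a constructor-like __repr__."""

    def __init__(self, title):
        self.title = title

    def __repr__(self):
        return "%s(title=%r)" % (self.__class__.__name__, self.title)


# A list is displayed using repr() of each element, even under str().
print(str([PlainReference("A genome paper")]))      # default repr with a memory address
print(str([DescribedReference("A genome paper")]))  # readable, constructor-like repr
```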

