[Biopython-dev] [Bug 2544] New: Bio.SeqIO improvements
bugzilla-daemon at portal.open-bio.org
Wed Jul 16 09:39:01 UTC 2008
http://bugzilla.open-bio.org/show_bug.cgi?id=2544
Summary: Bio.SeqIO improvements
Product: Biopython
Version: 1.47
Platform: PC
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P2
Component: Main Distribution
AssignedTo: biopython-dev at biopython.org
ReportedBy: mmokrejs at ribosome.natur.cuni.cz
$ python
Python 2.5.2 (r252:60911, Jul 2 2008, 22:55:24)
[GCC 4.3.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from Bio import SeqIO
>>> handle = open("genbank-synthetic.gb")
>>> print seq_record
ID: EF452680.2
Name: EF452680
Description: Synthetic construct nitric oxide synthase (NOS) gene, partial cds.
/comment=On Feb 4, 2008 this sequence version replaced gi:145391444.
/sequence_version=2
/source=synthetic construct
/taxonomy=['other sequences', 'artificial sequences']
/keywords=['']
/references=[<Bio.SeqFeature.Reference instance at 0x834cb6c>,
<Bio.SeqFeature.Reference instance at 0x834c16c>, <Bio.SeqFeature.Reference
instance at 0x834ceac>, <Bio.SeqFeature.Reference instance at 0x834c2ec>]
/accessions=['EF452680']
/data_file_division=SYN
/date=11-JUN-2008
/organism=synthetic construct
/gi=166831528
Seq('TAGGCCTCTGCTTGCCGTTTGTTTCGTCAGCGATTTTTATAGTCTCAGCCTCCT...GCC',
IUPACAmbiguousDNA())
>>>
I do not see how I could access the value 'DNA' from the LOCUS line:
LOCUS EF452680 260 bp DNA linear SYN 11-JUN-2008
No, I do not want to read seq_record.features[0].qualifiers['mol_type'][0].
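As a possible stopgap while the parser does not expose this field, the molecule type could be read from the LOCUS line itself. This is only a minimal sketch: it assumes the LOCUS fields are whitespace-separated and that the molecule type is the token immediately after `bp`. The helper name `locus_molecule_type` is hypothetical and is not part of Bio.SeqIO.

```python
def locus_molecule_type(locus_line):
    """Return the molecule type (e.g. 'DNA') from a GenBank LOCUS line.

    Assumption: fields are whitespace-separated and the molecule type is
    the token right after the 'bp' length unit. Returns None if the line
    is not a LOCUS line or no 'bp' token is found.
    """
    tokens = locus_line.split()
    if not tokens or tokens[0] != "LOCUS":
        return None
    # Start at index 2 so that a locus name is never mistaken for 'bp'.
    for i in range(2, len(tokens) - 1):
        if tokens[i] == "bp":
            return tokens[i + 1]
    return None


# The LOCUS line quoted in this report.
line = "LOCUS EF452680 260 bp DNA linear SYN 11-JUN-2008"
mol_type = locus_molecule_type(line)
print(mol_type)
```

This sidesteps the feature table entirely, at the cost of reading the file header a second time.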
Could seq_record.features have a repr() function to give me something useful
instead of this?
>>> print seq_record.features
[<Bio.SeqFeature.SeqFeature instance at 0x837bc2c>, <Bio.SeqFeature.SeqFeature
instance at 0x837b9cc>, <Bio.SeqFeature.SeqFeature instance at 0x837bd4c>]
>>>
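Until the feature objects print something more helpful, a short loop can produce a readable listing. The sketch below assumes each feature has `type` and `qualifiers` attributes, as Bio.SeqFeature.SeqFeature objects do. `FakeFeature` is a hypothetical stand-in so the snippet runs without Biopython, and the feature types 'source' and 'gene' are assumed for this record.

```python
from collections import namedtuple

# Hypothetical stand-in for Bio.SeqFeature.SeqFeature; only the two
# attributes used below (type, qualifiers) are modelled.
FakeFeature = namedtuple("FakeFeature", ["type", "qualifiers"])


def describe_feature(feature):
    """Format one feature as '<type>: key1, key2, ...' (keys sorted)."""
    keys = ", ".join(sorted(feature.qualifiers))
    return "%s: %s" % (feature.type, keys)


features = [
    FakeFeature("source", {"mol_type": ["other DNA"],
                           "organism": ["synthetic construct"]}),
    FakeFeature("gene", {"gene": ["NOS"]}),
]

summary = [describe_feature(f) for f in features]
for line in summary:
    print(line)
```

With real records, the same function should work on the items of seq_record.features unchanged, since it only touches those two attributes.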
I don't see it documented anywhere in the Biopython docs how to access the
features. Pasting something like the following into the docs would give a user
a clue where to look for the values:
>>> print seq_record.features[0].qualifiers
{'db_xref': ['taxon:32630'], 'mol_type': ['other DNA'], 'organism': ['synthetic
construct'], 'chromosome': ['Ib'], 'PCR_primers': ['fwd_seq:
aggcctctgcttgccgtttgtttcg, rev_seq: cgccggcggcacacgctcaactaattac']}
>>> print seq_record.features[1].qualifiers
{'gene': ['NOS']}
>>> print seq_record.features[2].qualifiers
{'product': ['nitric oxide synthase'], 'codon_start': ['2'], 'EC_number':
['1.14.13.39'], 'transl_table': ['11'], 'note': ['derived from Toxoplasma
gondii'], 'db_xref': ['GI:166831529'], 'translation':
['RPLLAVCFVSDFYSLSLLHFASVPFHESDGCVGRSHWLPGKHANYVKPAGARKRPEVGCRSSCLLRSVCCDILSPVRTRGN'],
'gene': ['NOS'], 'protein_id': ['ABP65329.2']}
>>> print seq_record.features[3].qualifiers
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: list index out of range
>>>
I wonder if I could access the above dicts as seq_record.features['source']
or seq_record.features['CDS']. Where have the 'source', 'gene' and 'CDS' keys gone?
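Lookup by feature key can be approximated today by grouping the list on each feature's `type` attribute, which is where Bio.SeqFeature.SeqFeature keeps the feature key. The following is a sketch, not an existing Bio.SeqIO API: `features_by_type` and `FakeFeature` are hypothetical names, the stand-in class exists only so the snippet runs without Biopython, and the assignment of 'source', 'gene' and 'CDS' to the three features is assumed from the qualifiers printed above.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for Bio.SeqFeature.SeqFeature.
FakeFeature = namedtuple("FakeFeature", ["type", "qualifiers"])


def features_by_type(features):
    """Map each feature type to the list of features of that type.

    A list is used per type because a record may contain several
    features with the same key (for example many 'gene' entries).
    """
    grouped = defaultdict(list)
    for feature in features:
        grouped[feature.type].append(feature)
    return dict(grouped)


# Abbreviated qualifiers taken from the session in this report.
features = [
    FakeFeature("source", {"mol_type": ["other DNA"]}),
    FakeFeature("gene", {"gene": ["NOS"]}),
    FakeFeature("CDS", {"product": ["nitric oxide synthase"],
                        "gene": ["NOS"]}),
]

by_type = features_by_type(features)
product = by_type["CDS"][0].qualifiers["product"][0]
print(product)
```

Returning lists rather than single features keeps the lookup unambiguous when a record has more than one feature of a given type.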