[Biopython] How to read certain GEO files with Bio.Geo?

Ilya Flyamer flyamer at gmail.com
Thu Nov 14 20:27:34 UTC 2013


Hello everyone!

I recently posted this question on Stack Overflow (
http://stackoverflow.com/questions/19961582/how-to-read-certain-geo-files-with-bio-geo),
but I have not received any answers there.

I have a problem parsing a particular GEO file (accession number GSE40603).
Following the tutorial, I read it like this:

from Bio import Geo

handle = open('GSE40603_combined_L1_L2.txt')
records = Geo.parse(handle)
# Iterate over the parsed records and print each one
for record in records:
    print(record)

But I get an error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/dist-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 585, in runfile
    execfile(filename, namespace)
  File "/home/ilya/Документы/biology/E coli GCC/GEOanalyzer.py", line 11, in <module>
    for record in records:
  File "/usr/local/lib/python2.7/dist-packages/Bio/Geo/__init__.py", line 60, in parse
    record.table_rows.append(row)
AttributeError: 'NoneType' object has no attribute 'table_rows'

Here is the head of that file:

0   0   63  NC_000913   0   152 NC_000913   0   152 |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL
0   1   81  NC_000913   0   152 NC_000913   153 599 |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL |gene gene= thrL  |CDS(+,190,255) gene= thrL  |gene gene= thrA |CDS(+,337,2799) gene= thrA  note= bifunctional: aspartokinase I (N-terminal);
0   2   1   NC_000913   0   152 NC_000913   600 698 |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL    |gene gene= thrA  |CDS[fcd=-312](+,337,2799) gene= thrA  note= bifunctional: aspartokinase I (N-terminal);
0   3   1   NC_000913   0   152 NC_000913   699 755 |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL    |gene gene= thrA |CDS[fcd=-390](+,337,2799) gene= thrA  note= bifunctional: aspartokinase I (N-terminal);
0   4   1   NC_000913   0   152 NC_000913   756 757 |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL    |gene gene= thrA |CDS[fcd=-419](+,337,2799) gene= thrA  note= bifunctional: aspartokinase I (N-terminal);
0   2620    1   NC_000913   0   152 NC_000913   352429  352483  |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL    |gene gene= prpE |CDS[fcd=-526](+,351930,353816) gene= prpE  note= putative propionyl-CoA synthetase
0   18818   1   NC_000913   0   152 NC_000913   2560323 2560384 |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL    |misc_feature note= cryptic prophage Eut/CPZ-55  |gene gene= yffO |CDS[fcd=-220](+,2560133,2560549) gene= yffO
0   2617    1   NC_000913   0   152 NC_000913   352326  352375  |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL |gene gene= prpE  |CDS[fcd=-420](+,351930,353816) gene= prpE  note= putative propionyl-CoA synthetase
0   18817   1   NC_000913   0   152 NC_000913   2560275 2560322 |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL    |misc_feature note= cryptic prophage Eut/CPZ-55  |gene gene= yffO |CDS[fcd=-165](+,2560133,2560549) gene= yffO
0   912 1   NC_000913   0   152 NC_000913   113055  113082  |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL    |gene gene= coaE |CDS[fcd=151](-,112599,113219) gene= coaE  note= putative DNA repair protein
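
The head above contains no lines starting with ^, ! or #, the characters that
mark entities, attributes and table headers in GEO SOFT-format files. So this
may be a plain tab-delimited supplementary table rather than a SOFT file. A
minimal sketch of reading such a table with the standard csv module, assuming
the columns really are tab-separated (the helper names and the shortened
sample rows are hypothetical, made up for illustration):

```python
import csv
import io


def looks_like_soft(first_line):
    """Heuristic: SOFT-format lines start with '^', '!' or '#'."""
    return first_line[:1] in ("^", "!", "#")


def read_tab_delimited(handle):
    """Yield each non-empty line of `handle` as a list of tab-separated fields."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if row:
            yield row


# Hypothetical, shortened stand-in for the first columns of the real file.
sample = io.StringIO(
    "0\t0\t63\tNC_000913\t0\t152\n"
    "0\t1\t81\tNC_000913\t0\t152\n"
)
rows = list(read_tab_delimited(sample))
for row in rows:
    print(row)
```

With the real file, the same function could be given a handle from
open('GSE40603_combined_L1_L2.txt') in place of the in-memory sample.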

Am I doing something wrong? How do I read such files?

Thank you in advance!
Best,

Ilya Flyamer



