[Biopython] How to read certain GEO files with Bio.Geo?
Sean Davis
sdavis2 at mail.nih.gov
Thu Nov 14 21:06:25 UTC 2013
On Thu, Nov 14, 2013 at 3:27 PM, Ilya Flyamer <flyamer at gmail.com> wrote:
> Hello everyone!
>
> I recently posted a question on Stack Overflow
> (http://stackoverflow.com/questions/19961582/how-to-read-certain-geo-files-with-bio-geo),
> but I am not getting any answers there.
>
> I have a problem parsing a particular GEO file (accession number GSE40603).
> I do it according to the tutorial in this way:
>
> from Bio import Geo
> handle = open('GSE40603_combined_L1_L2.txt')
>
This file is a so-called "supplemental file" from GEO. It was supplied by
the original submitter in the submitter's own layout, so tools that read
GEO formats (Bio.Geo included) will not work with it. In this particular
case (NGS data), your best bet is to simply parse your downloaded file
with standard Python tools.
Sean
> records = Geo.parse(handle)
> for record in records:
>     print record
>
> But I get an error:
>
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/usr/lib/python2.7/dist-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 585, in runfile
>     execfile(filename, namespace)
>   File "/home/ilya/Документы/biology/E coli GCC/GEOanalyzer.py", line 11, in <module>
>     for record in records:
>   File "/usr/local/lib/python2.7/dist-packages/Bio/Geo/__init__.py", line 60, in parse
>     record.table_rows.append(row)
> AttributeError: 'NoneType' object has no attribute 'table_rows'
>
> Here is the head of that file:
>
> 0 0 63 NC_000913 0 152 NC_000913 0 152 |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL
> 0 1 81 NC_000913 0 152 NC_000913 153 599 |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL |gene gene= thrL |CDS(+,190,255) gene= thrL |gene gene= thrA |CDS(+,337,2799) gene= thrA note= bifunctional: aspartokinase I (N-terminal);
> 0 2 1 NC_000913 0 152 NC_000913 600 698 |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL |gene gene= thrA |CDS[fcd=-312](+,337,2799) gene= thrA note= bifunctional: aspartokinase I (N-terminal);
> 0 3 1 NC_000913 0 152 NC_000913 699 755 |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL |gene gene= thrA |CDS[fcd=-390](+,337,2799) gene= thrA note= bifunctional: aspartokinase I (N-terminal);
> 0 4 1 NC_000913 0 152 NC_000913 756 757 |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL |gene gene= thrA |CDS[fcd=-419](+,337,2799) gene= thrA note= bifunctional: aspartokinase I (N-terminal);
> 0 2620 1 NC_000913 0 152 NC_000913 352429 352483 |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL |gene gene= prpE |CDS[fcd=-526](+,351930,353816) gene= prpE note= putative propionyl-CoA synthetase
> 0 18818 1 NC_000913 0 152 NC_000913 2560323 2560384 |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL |misc_feature note= cryptic prophage Eut/CPZ-55 |gene gene= yffO |CDS[fcd=-220](+,2560133,2560549) gene= yffO
> 0 2617 1 NC_000913 0 152 NC_000913 352326 352375 |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL |gene gene= prpE |CDS[fcd=-420](+,351930,353816) gene= prpE note= putative propionyl-CoA synthetase
> 0 18817 1 NC_000913 0 152 NC_000913 2560275 2560322 |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL |misc_feature note= cryptic prophage Eut/CPZ-55 |gene gene= yffO |CDS[fcd=-165](+,2560133,2560549) gene= yffO
> 0 912 1 NC_000913 0 152 NC_000913 113055 113082 |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL |gene gene= coaE |CDS[fcd=151](-,112599,113219) gene= coaE note= putative DNA repair protein
>
> Am I doing something wrong? How do I read such files?
>
> Thank you in advance!
> Best,
>
> Ilya Flyamer
>
> _______________________________________________
> Biopython mailing list - Biopython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython
>