[Biopython] Extracting data genpept files
Peter
biopython at maubp.freeserve.co.uk
Tue Nov 23 03:53:52 EST 2010
On Tue, Nov 23, 2010 at 3:52 AM, Ara Kooser <akooser at unm.edu> wrote:
> Hello all,
>
> I think Peter pointed me to part of this code (shown below) for extracting
> data out of a genpept file. I am trying to get a handle on the formatting end
> of things. My question is this: when the taxonomic data grabbed by
> tax_records = gb_record.annotations["taxonomy"] is missing, instead of leaving
> the space blank the program fills it in with the next piece of data, usually
> the date. This throws off the whole spreadsheet when I import it as a CSV file.
>
If I understand your aim, try the line below. It returns an empty list when
the taxonomy isn't in the annotations dictionary, where a plain lookup would
raise a KeyError:
tax_records = gb_record.annotations.get("taxonomy", [])
Perhaps you could clarify if you want the taxonomy (a list of variable length)
to go in one column of your CSV file?
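If one column is what you want, a minimal sketch might look like the following. It uses plain dictionaries as stand-ins for gb_record.annotations, and the helper name taxonomy_column, the "; " separator and the sample values are all assumptions made for illustration, not part of your script:

```python
# Stand-ins for gb_record.annotations (hypothetical sample data);
# real records would come from parsing the GenPept file.
with_taxonomy = {"taxonomy": ["Bacteria", "Proteobacteria"], "date": "23-NOV-2010"}
without_taxonomy = {"date": "23-NOV-2010"}


def taxonomy_column(annotations, sep="; "):
    """Return the taxonomy list joined into one string, or "" if it is missing."""
    # .get() with a default avoids the KeyError and keeps the column in place.
    return sep.join(annotations.get("taxonomy", []))


# Each row always has the same number of fields, so later columns do not shift.
row_a = [taxonomy_column(with_taxonomy), with_taxonomy["date"]]
row_b = [taxonomy_column(without_taxonomy), without_taxonomy["date"]]
```

Joining with a separator other than the column delimiter keeps the variable-length list inside a single field.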
Peter
P.S. I prefer tab-separated values (TSV) over CSV, as I quite often find commas
in descriptions. This can be dealt with, but it is fiddly.
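As a rough sketch of the tab-separated approach, Python's standard csv module can write tabs if you pass delimiter="\t". The column names and rows here are made-up examples, and the output goes to an in-memory buffer rather than a real file:

```python
import csv
import io

# Hypothetical rows; note the comma inside the second description field.
rows = [
    ["id", "description", "taxonomy"],
    ["example_1", "kinase, putative", "Bacteria; Proteobacteria"],
    ["example_2", "hypothetical protein", ""],
]

# An in-memory buffer keeps the sketch self-contained; for a real file use
# open("output.tsv", "w", newline="") instead.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerows(rows)
tsv_text = buffer.getvalue()
```

With a tab delimiter the comma in the description is just ordinary text, and the empty taxonomy still occupies its own field.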