[Biopython] Extracting data genpept files

Peter biopython at maubp.freeserve.co.uk
Tue Nov 23 08:53:52 UTC 2010


On Tue, Nov 23, 2010 at 3:52 AM, Ara Kooser <akooser at unm.edu> wrote:
> Hello all,
>
>   I think Peter pointed me to part of this code (shown below) for extracting
> data out of a genpept file. I am trying to get a handle on the formatting end
> of things. My question is: when taxonomic data is missing from
> tax_records = gb_record.annotations["taxonomy"], instead of leaving the space
> blank the program fills it in with the next piece of data, usually the date.
> This throws off the whole spreadsheet when I import it as a CSV file.
>

If I understood your aim, try the line below. It returns an empty list when
the taxonomy isn't in the annotations dictionary, where a plain lookup would
raise a KeyError:

tax_records = gb_record.annotations.get("taxonomy", [])
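
As a minimal sketch of the difference, here is the same call on plain
dictionaries standing in for gb_record.annotations. The sample values and the
helper name get_taxonomy are made up for illustration:

```python
# Plain dicts standing in for gb_record.annotations (hypothetical sample data).
with_taxonomy = {"taxonomy": ["Bacteria", "Proteobacteria"], "date": "23-NOV-2010"}
without_taxonomy = {"date": "23-NOV-2010"}


def get_taxonomy(annotations):
    """Return the taxonomy list, or an empty list if the key is missing."""
    return annotations.get("taxonomy", [])


print(get_taxonomy(with_taxonomy))     # the list stored under "taxonomy"
print(get_taxonomy(without_taxonomy))  # an empty list, no KeyError
```

With the empty list as a fallback, the taxonomy slot in each row is always
present, so later fields such as the date should not shift left.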

Perhaps you could clarify if you want the taxonomy (a list of variable length)
to go in one column of your CSV file?
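
If one column is what you want, one option (an assumption on my part about
your layout) is to join the list into a single string, using a separator that
differs from your field delimiter. The helper name taxonomy_to_field is
hypothetical:

```python
def taxonomy_to_field(tax_records, sep="; "):
    """Collapse a variable-length taxonomy list into one spreadsheet field.

    An empty list becomes an empty string, so the column stays in place.
    """
    return sep.join(tax_records)


print(taxonomy_to_field(["Bacteria", "Proteobacteria", "Gammaproteobacteria"]))
print(repr(taxonomy_to_field([])))
```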

Peter

P.S. I prefer tab separated values (TSV) over CSV, as I find commas in
descriptions quite often. This can be dealt with, but it is fiddly.
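
A rough sketch of writing tab separated output with Python's standard csv
module follows. The rows and the helper name write_tsv are invented for
illustration; in practice the handle would be an open file rather than a
StringIO buffer:

```python
import csv
import io


def write_tsv(rows, handle):
    """Write rows to handle as tab separated values, one row per line."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)


# Hypothetical rows; note the comma inside the first description.
rows = [
    ["id", "description", "taxonomy"],
    ["ABC123", "kinase, putative", "Bacteria; Proteobacteria"],
    ["XYZ789", "hypothetical protein", ""],
]

buffer = io.StringIO()
write_tsv(rows, buffer)
print(buffer.getvalue())
```

Because the delimiter is a tab, the comma in the description needs no quoting
and the empty taxonomy field still occupies its own column.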



