[Biopython-dev] Strange Genbank feature description: how should biopython handle this?
Brad Chapman
chapmanb at arches.uga.edu
Mon Aug 12 11:56:45 EDT 2002
Hi Danny;
Sorry to be so slow in getting back to you. Evil post-conference
mounds of work piled up on me.
> Ok, I'm fiddling around with the Genbank parser. In one of my test cases,
> there's one particular entry that's very evil. It comes from AP000423
> (GI:5881673), as gene RPS12:
[...]
> Having a strand of 'None' doesn't appear to be right.
Yes, actually I ran into this problem right before the conference with
Jeremy and thought I had committed the fix (ugh, forgot. Bad Brad!
Bad!). The problem is that the following code:
> -    if self._seq_type == "DNA":
> -        self._cur_feature.strand = 1
will only set the strand when _seq_type is exactly "DNA". The problem
is that your _seq_type looks like:
    DNA circular
so the comparison fails and the strand is left as None. I've changed
the code so it looks like:
    if self._seq_type.find("DNA") >= 0:
so that we only require "DNA" to appear somewhere in the type. I think
this fixes the problem, and the change is in CVS. Please let me know if
it doesn't help.
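
To make the difference concrete, here is a minimal standalone sketch of
the two checks. The function names are hypothetical and are not part of
the actual parser; only the two comparisons come from the code above.

```python
def default_strand_strict(seq_type):
    """Old behaviour: only an exact "DNA" type gets a default strand."""
    if seq_type == "DNA":
        return 1
    return None


def default_strand_relaxed(seq_type):
    """New behaviour: any type containing "DNA" gets a default strand."""
    if seq_type.find("DNA") >= 0:
        return 1
    return None


for seq_type in ("DNA", "DNA circular", "RNA"):
    print(repr(seq_type),
          default_strand_strict(seq_type),
          default_strand_relaxed(seq_type))
```

With "DNA circular" the strict version returns None, which is the
strand you were seeing, while the relaxed version returns 1.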
> + assert(new_sub_feature.strand in (1, -1)) ## debug
Things aren't actually quite as easy to debug as this. The strand in
Biopython can take on 4 values:
    None --> protein and RNA, which don't have any strand information
    1    --> DNA on the plus strand
    -1   --> DNA on the minus strand
    0    --> DNA on both strands
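
If you still want a debugging check, something along these lines would
respect all four values. This is only a sketch: check_strand and its
is_dna argument are made-up names for illustration, not anything in
Biopython.

```python
# The four strand values described above.
VALID_STRANDS = (None, 1, -1, 0)


def check_strand(strand, is_dna):
    """Return strand unchanged if it is plausible, else raise ValueError."""
    if strand not in VALID_STRANDS:
        raise ValueError("unexpected strand value: %r" % (strand,))
    # A DNA feature should carry strand information (1, -1 or 0);
    # None is only expected for protein and RNA.
    if is_dna and strand is None:
        raise ValueError("DNA feature has no strand information")
    return strand
```

Called with is_dna set to true, this rejects the None strand from your
test case but still accepts 0 for features on both strands.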
Hopefully this explains things and fixes your problem. If not, feel free
to drop another e-mail!
Brad