[Biopython-dev] incomplete patch to parse Genbank-TPA record BK000008
Andreas Kuntzagk
andreas.kuntzagk at mdc-berlin.de
Mon Sep 15 07:06:39 EDT 2003
Since I'm stuck on this, I'm posting the incomplete patch in the hope that
somebody will have time to look at it. At the moment it parses the
BK000008 and BK000018 entries in GenBank, and hopefully others as well.
BUT: the information is put into the Record only in the form of one long
string with the newlines deleted. If anybody really needs this info, it
should be stored in a better way.
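As a rough idea of what "a better way" could look like, here is a minimal sketch that splits one row of the PRIMARY block into typed fields instead of keeping the raw text. It uses only the standard `re` module rather than Martel, and the names `PrimaryRef` and `parse_primary_row` are hypothetical, not part of Biopython. The regular expression mirrors the `primary_ref_line` pattern in the patch (span, identifier, span, optional "c"); the sample rows in the usage note are made up for illustration.

```python
import re
from typing import NamedTuple, Optional, Tuple


class PrimaryRef(NamedTuple):
    """One row of a PRIMARY block (hypothetical container, not a Biopython class)."""
    tpa_span: Tuple[int, int]      # start and end within the TPA record
    identifier: str                # accession.version of the primary sequence
    primary_span: Tuple[int, int]  # start and end within the primary sequence
    complement: bool               # True when the row ends with the "c" flag


# Same shape as primary_ref_line in the patch:
# <digits>-<digits>  <identifier>  <digits>-<digits>  [c]
_PRIMARY_ROW = re.compile(
    r"^\s*(\d+)-(\d+)\s+(\S+)\s+(\d+)-(\d+)(?:\s+(c))?\s*$"
)


def parse_primary_row(line: str) -> Optional[PrimaryRef]:
    """Return a PrimaryRef for a data row, or None if the line does not match."""
    match = _PRIMARY_ROW.match(line)
    if match is None:
        return None
    tpa_start, tpa_end, identifier, prim_start, prim_end, comp = match.groups()
    return PrimaryRef(
        tpa_span=(int(tpa_start), int(tpa_end)),
        identifier=identifier,
        primary_span=(int(prim_start), int(prim_end)),
        complement=comp is not None,
    )
```

With something like this, `primary_ref_line` in the consumer could append `parse_primary_row(content)` rather than the raw string. For example, `parse_primary_row("  1-100  XX000000.1  501-600")` gives a `PrimaryRef` with `tpa_span == (1, 100)` and `complement == False`, while the header line starting with `PRIMARY` gives `None`.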
-------------- next part --------------
Index: Bio/GenBank/Record.py
===================================================================
RCS file: /home/repository/biopython/biopython/Bio/GenBank/Record.py,v
retrieving revision 1.8
diff -r1.8 Record.py
9,11d8
< # standard modules
< import string
<
175a173
> self.primary=[]
Index: Bio/GenBank/__init__.py
===================================================================
RCS file: /home/repository/biopython/biopython/Bio/GenBank/__init__.py,v
retrieving revision 1.42
diff -r1.42 __init__.py
988c988
<
---
>
1068c1068
<
---
>
1074c1074
<
---
>
1077a1078,1084
> def primary_ref_line(self,content):
> """Data for the PRIMARY line"""
> self.data.primary.append(content)
>
> def primary(self,content):
> pass
>
1217c1224
< "sequence", "contig_location", "record_end"]
---
> "sequence", "contig_location", "record_end","primary_ref_line"]
Index: Bio/GenBank/genbank_format.py
===================================================================
RCS file: /home/repository/biopython/biopython/Bio/GenBank/genbank_format.py,v
retrieving revision 1.28
diff -r1.28 genbank_format.py
22,24c22
< # standard library
< import string
<
---
>
303a302,313
> # PRIMARY
> primary_line = Martel.Group("primary_line",
> Martel.Str("PRIMARY") +
> blank_space +
> Martel.Str("TPA_SPAN") +
> blank_space +
> Martel.Str("PRIMARY_IDENTIFIER") +
> blank_space +
> Martel.Str("PRIMARY_SPAN") +
> blank_space +
> Martel.Str("COMP") +
> Martel.ToEol())
304a315,327
> primary_ref_line =Martel.Group("primary_ref_line",
> blank_space +
> Martel.Re(r"\d+\-\d+") +
> blank_space +
> Martel.Re(r"[\S]+") +
> blank_space +
> Martel.Re(r"\d+\-\d+") +
> Martel.Opt(blank_space + Martel.Str("c"))+
> Martel.ToEol())
>
> primary = Martel.Group("primary",primary_line +
> Martel.Rep1(primary_ref_line))
>
735a759
> Martel.Opt(primary) +\