[Biopython-dev] Working on Sequence deprecation
Brad Chapman
chapmanb at arches.uga.edu
Sat Jan 27 12:49:59 EST 2001
Hello all;
I spent some time this morning working on deprecating Sequence.py (in
favor of Andrew's Seq.py), which I think is on our to-do list for the
next release.
I'd done a little bit of work on this earlier on Fasta.py, and I
completed the job this morning and checked it in along with tests. I
then grepped for other stuff that uses Sequence.py, and came up with:
o Rebase and Gobase -- These contain SequenceParser classes, but
either these are left over from a copy and paste or the
_SequenceConsumer classes haven't been written yet, I guess. What
is the plan for these? It doesn't seem like the data really fits into
a sequence class, but I'm not sure.
o SwissProt -- I changed the SequenceParser to a simple
implementation that uses the SeqRecord and Seq classes. I didn't
really go into anything complicated like SeqFeatures yet.
The context diff for this is attached. It also has a fix for OX
lines, which I think actually fixes my previous patch. I didn't
realize there wasn't a test for SProt in the regression tests, so my
previous patch didn't handle OX lines correctly on older files (i.e.,
it bombs out if there isn't an OX line; the new one scans OX lines
with any_number instead of one_or_more, which I think does it right).
Sorry about that. I think this might have been the problem Andrew was
talking about in his Martel tests.
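To make the OX change concrete, here is a minimal standalone sketch of the difference between the two scanning modes. It is not the actual _scan_line code from SProt.py: the helper scan_lines and the sample lines are made up for illustration, and only the one_or_more / any_number keyword names come from the patch.

```python
def scan_lines(line_type, lines, callback, one_or_more=False, any_number=False):
    """Consume leading lines that start with line_type (hypothetical helper).

    Exactly one mode must be chosen:
      one_or_more - at least one matching line is required.
      any_number  - zero or more matching lines are accepted.
    Returns the number of lines consumed.
    """
    if one_or_more == any_number:
        raise TypeError("choose exactly one of one_or_more or any_number")
    count = 0
    for line in lines:
        if not line.startswith(line_type):
            break
        callback(line)
        count += 1
    if one_or_more and count == 0:
        # This is the failure mode on older files that have no OX line.
        raise ValueError("expected at least one %s line" % line_type)
    return count


# Invented input, positioned at the point where an OX line may appear.
old_record = ["RN   [1]"]                           # older file, no OX line
new_record = ["OX   NCBI_TaxID=9606;", "RN   [1]"]  # newer file, one OX line

taxonomy_ids = []
scan_lines("OX", new_record, taxonomy_ids.append, any_number=True)
scan_lines("OX", old_record, taxonomy_ids.append, any_number=True)
```

With one_or_more the second call would raise instead of quietly consuming nothing, which is why the old patch failed on records that predate OX lines.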
I think this is it, and then nothing will use Sequence.py. Pretty
exciting! What do people think? Ready for Sequence.py to go so we only
have one sequence class?
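For anyone who hasn't looked at the patch, the SwissProt consumer change boils down to roughly the following. This is a simplified sketch in plain Python, not the checked-in code: SeqRecordStandIn is a made-up stand-in for SeqRecord, a plain string stands in for Seq (so there is no alphabet), and the record lines are invented.

```python
class SeqRecordStandIn:
    """Minimal stand-in for a sequence record (not the Biopython class)."""

    def __init__(self, seq=""):
        self.seq = seq
        self.id = ""
        self.name = ""
        self.description = ""


class SequenceConsumerSketch:
    """Collects a few SwissProt line types into one record object."""

    def __init__(self):
        self.data = None

    def start_record(self):
        self.data = SeqRecordStandIn()

    def identification(self, line):
        # ID line: the second whitespace-separated column is the entry name.
        self.data.name = line.split()[1]

    def accession(self, line):
        # AC line: keep only the first accession number as the id.
        self.data.id = line[5:].rstrip().split(";")[0]

    def description(self, line):
        # DE lines may repeat, so append each one on its own line.
        self.data.description += line[5:].strip() + "\n"

    def sequence_data(self, line):
        # Sequence lines are space-separated blocks of residues.
        self.data.seq += line.replace(" ", "").rstrip()


# Invented record lines, just enough to exercise each handler.
lines = [
    "ID   EXAMPLE_PROT   STANDARD;      PRT;    20 AA.",
    "AC   P00000; Q00000;",
    "DE   EXAMPLE PROTEIN.",
    "     MKTAYIAKQR QISFVKSHFS",
]

consumer = SequenceConsumerSketch()
consumer.start_record()
handlers = {
    "ID": consumer.identification,
    "AC": consumer.accession,
    "DE": consumer.description,
    "  ": consumer.sequence_data,
}
for line in lines:
    handlers[line[:2]](line)
record = consumer.data
```

The point is that name, id, description and seq all end up on a single record object, so nothing in the parser needs the old Sequence classes.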
Additionally, have we thought about getting rid of the SeqIO
directory? I think the current Fasta.py does everything SeqIO does
right now, so we might not need it any more. What do people think?
Brad
-------------- next part --------------
*** SProt.py.orig Wed Nov 29 19:37:27 2000
--- SProt.py Sat Jan 27 12:35:14 2001
***************
*** 20,30 ****
Dictionary Accesses a SwissProt file using a dictionary interface.
ExPASyDictionary Accesses SwissProt records from ExPASy.
RecordParser Parses a SwissProt record into a Record object.
! SequenceParser Parses a SwissProt record into a Sequence object.
_Scanner Scans SwissProt-formatted data.
_RecordConsumer Consumes SwissProt data to a Record object.
! _SequenceConsumer Consumes SwissProt data to a Sequence object.
Functions:
--- 20,30 ----
Dictionary Accesses a SwissProt file using a dictionary interface.
ExPASyDictionary Accesses SwissProt records from ExPASy.
RecordParser Parses a SwissProt record into a Record object.
! SequenceParser Parses a SwissProt record into a Seq object.
_Scanner Scans SwissProt-formatted data.
_RecordConsumer Consumes SwissProt data to a Record object.
! _SequenceConsumer Consumes SwissProt data to a Seq object.
Functions:
***************
*** 36,42 ****
import string
from Bio import File
from Bio import Index
! from Bio import Sequence
from Bio.ParserSupport import *
from Bio.WWW import ExPASy
from Bio.WWW import RequestLimiter
--- 36,44 ----
import string
from Bio import File
from Bio import Index
! from Bio import Alphabet
! from Bio import Seq
! from Bio import SeqRecord
from Bio.ParserSupport import *
from Bio.WWW import ExPASy
from Bio.WWW import RequestLimiter
***************
*** 288,299 ****
return self._consumer.data
class SequenceParser:
! """Parses SwissProt data into a Sequence object.
"""
! def __init__(self):
self._scanner = _Scanner()
! self._consumer = _SequenceConsumer()
def parse(self, handle):
self._scanner.feed(handle, self._consumer)
--- 290,307 ----
return self._consumer.data
class SequenceParser:
! """Parses SwissProt data into a Seq object.
"""
! def __init__(self, alphabet = Alphabet.generic_protein):
! """Initialize a SequenceParser.
!
! Arguments:
! o alphabet - The alphabet to use for the generated Seq objects. If
! not supplied this will default to the generic protein alphabet.
! """
self._scanner = _Scanner()
! self._consumer = _SequenceConsumer(alphabet)
def parse(self, handle):
self._scanner.feed(handle, self._consumer)
***************
*** 390,396 ****
def _scan_ox(self, uhandle, consumer):
self._scan_line('OX', uhandle, consumer.taxonomy_id,
! one_or_more=1)
def _scan_reference(self, uhandle, consumer):
while 1:
--- 398,404 ----
def _scan_ox(self, uhandle, consumer):
self._scan_line('OX', uhandle, consumer.taxonomy_id,
! any_number=1)
def _scan_reference(self, uhandle, consumer):
while 1:
***************
*** 712,728 ****
setattr(ref, m, string.rstrip(attr))
class _SequenceConsumer(AbstractConsumer):
! """Consumer that converts a SwissProt record to a Sequence object.
Members:
! data Record with SwissProt data.
"""
! def __init__(self):
self.data = None
def start_record(self):
! self.data = Sequence.NamedSequence(Sequence.Sequence())
def end_record(self):
pass
--- 720,746 ----
setattr(ref, m, string.rstrip(attr))
class _SequenceConsumer(AbstractConsumer):
! """Consumer that converts a SwissProt record to a Seq object.
Members:
! data Record with SwissProt data.
! alphabet The alphabet the generated Seq objects will have.
"""
! def __init__(self, alphabet = Alphabet.generic_protein):
! """Initialize a Sequence Consumer
!
! Arguments:
! o alphabet - The alphabet to use for the generated Seq objects. If
! not supplied this will default to the generic protein alphabet.
! """
self.data = None
+ self.alphabet = alphabet
def start_record(self):
! seq = Seq.Seq("", self.alphabet)
! self.data = SeqRecord.SeqRecord(seq)
! self.data.description = ""
def end_record(self):
pass
***************
*** 730,738 ****
def identification(self, line):
cols = string.split(line)
self.data.name = cols[1]
def sequence_data(self, line):
! seq = string.rstrip(string.replace(line, " ", ""))
self.data.seq = self.data.seq + seq
def index_file(filename, indexname, rec2key=None):
--- 748,765 ----
def identification(self, line):
cols = string.split(line)
self.data.name = cols[1]
+
+ def accession(self, line):
+ ids = string.split(string.rstrip(line[5:]), ';')
+ self.data.id = ids[0]
+
+ def description(self, line):
+ self.data.description = self.data.description + \
+ string.strip(line[5:]) + "\n"
def sequence_data(self, line):
! seq = Seq.Seq(string.rstrip(string.replace(line, " ", "")),
! self.alphabet)
self.data.seq = self.data.seq + seq
def index_file(filename, indexname, rec2key=None):