[BioPython] DAS client

Andrew Dalke dalke@dalkescientific.com
Fri, 23 Aug 2002 11:40:37 -0600


I've got a first draft, as it were, of a DAS client.  You can download
and try it out from

  http://www.biopython.org/~dalke/das.tar.gz

Here's some documentation for it.

  - ungzip / untar it
  - cd das
  - start python

  >>> import das
  >>> server = das.Server("http://servlet.sanger.ac.uk:8080/das")
  
  The 'server' object acts like a Python dictionary of the available
  data sources.

  >>> print len(server)
  >>> print server.keys()
  
  >>> dsn = server["mouse73"]
  >>> sheet = dsn.stylesheet()
  
  This class is built from the DTD.  Here are some examples of how
  you can manipulate it.

  Dump the whole data structure as XML
  >>> print sheet

  See what it contains
  >>> sheet.get_children()

  Work with subelements
  >>> print len(sheet["STYLESHEET"])
  >>> category = sheet["STYLESHEET"][0]

  Get attributes of a node
  >>> print category["TYPE"].id

  >>> glyph = category["TYPE"]["GLYPH"]

  Any subelement can also be dumped to XML
  >>> print glyph
  
  Get the text inside a node  (.text() returns a list of unicode strings)
  >>> print "".join(map(str, glyph[0]["COLOR"].text()))

  Besides 'stylesheet', the following methods are also supported on a DSN:

  
  >>> [s for s in dir(dsn) if not s.startswith("_")]
  ['dna', 'dsn', 'entry_points', 'features', 'link', 'sequence', 'server', 'stylesheet', 'types']

(Actually, 'dsn' and 'server' are attributes.)

>>> eps = dsn.entry_points()
>>> for segment in eps["ENTRY_POINTS"]:
...     print segment.id, segment.size
...
X 147161770
19 61356199
18 91189200
17 94132929
16 99184200
15 104633288
14 116006794
13 117115093
12 114251360
11 122883361
10 131187037
9 125583845
8 129321983
7 135793178
6 150316567
5 151006098
4 151730910
3 160564582
2 180335396
1 196842934
>>> print dsn.dna( segments = [ ("X", 147161760, 147161770), ("1", 100, 20) ] )
Calling url 'http://servlet.sanger.ac.uk:8080/das/mouse73/dna' with query
'segment=X:147161760,147161770;segment=1:100,20'
<DASDNA>
        <SEQUENCE start="147161760" version="7.3" stop="147161770" id="X">
                <DNA length="11">
agagagagagt
                </DNA>
        </SEQUENCE>
        <SEQUENCE start="100" version="7.3" stop="20" id="1">
                <DNA length="81">
nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
nnnnnnnnnnnnnnnnnnnnn
                </DNA>
        </SEQUENCE>
</DASDNA>
>>> print dsn.types( segments = [("6", None, None)] )
<DASTYPES>
        <GFF href="http://servlet.sanger.ac.uk:8080/das/mouse73/types" version="1.0">
                <SEGMENT start="1" version="7.3" stop="150316567" id="6">
                        <TYPE id="static_golden_path">
151
                        </TYPE>
                        <TYPE id="transcript">
23
                        </TYPE>
                        <TYPE id="translation">
23
                        </TYPE>
                        <TYPE id="exon">
304
                        </TYPE>
                </SEGMENT>
        </GFF>
</DASTYPES>
>>>


It also converts DAS errors into Python exceptions:
>>> print dsn.sequence( [ ("X", 147161760, 147161770), ("1", 100, 20) ] )
Calling url 'http://servlet.sanger.ac.uk:8080/das/mouse73/sequence' with query
'segment=X:147161760,147161770;segment=1:100,20'
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "das.py", line 255, in sequence
    return self._call("sequence", s, "dassequence")
  File "das.py", line 243, in _call
    return self.server._call(self.dsn + "/" + command, query, dtd_name)
  File "das.py", line 180, in _call
    raise DASError(status_code)
das.DASError: 400: Bad command (command not recognized)
>>>
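
For anyone curious how a status code might become an exception like the
one in that traceback, here is a minimal sketch.  It is not the code in
das.py.  The 'check_status' helper and the 'descriptions' table are
hypothetical names, only the text for 400 is taken from the traceback
above, and treating 200 as success is an assumption based on the DAS spec.

```python
# Hypothetical sketch; not the implementation inside das.py.
class DASError(Exception):
    """Raised when a DAS server reports a status other than 200."""

    # Only the 400 text comes from the traceback shown above.  Any other
    # code falls back to a generic label until the table is filled in
    # from the DAS specification.
    descriptions = {400: "Bad command (command not recognized)"}

    def __init__(self, status_code):
        self.status_code = status_code
        text = self.descriptions.get(status_code, "Unknown DAS status")
        Exception.__init__(self, "%d: %s" % (status_code, text))


def check_status(status_code):
    """Return quietly on 200 (assumed to mean success), else raise."""
    if status_code != 200:
        raise DASError(status_code)
```

Keeping the numeric code on the exception lets calling code branch on
'status_code' instead of parsing the message text.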

Note that debugging is turned on in that it shows the URL called and
the query string, which gives you the chance to see what it's doing
and (if needed) reproduce a problem by hand.
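
To reproduce a call by hand you need the same query string.  The sketch
below rebuilds it from a list of (id, start, stop) tuples.  The function
name 'segment_query' is hypothetical and this is not the code in das.py.
Sending a bare 'segment=ID' when start and stop are None is an
assumption, since the debug output above only shows the ranged form.

```python
def segment_query(segments):
    """Build a DAS query string from (id, start, stop) tuples.

    Assumption: a start and stop of None asks for the whole segment and
    is sent as a bare 'segment=ID' term.
    """
    terms = []
    for seg_id, start, stop in segments:
        if start is None and stop is None:
            terms.append("segment=%s" % seg_id)
        else:
            terms.append("segment=%s:%d,%d" % (seg_id, start, stop))
    # Terms are joined with ';' as in the debug output shown earlier.
    return ";".join(terms)


query = segment_query([("X", 147161760, 147161770), ("1", 100, 20)])
print(query)
```

For the two segments used earlier this prints the same query string as
the debug line for the 'dna' call.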

Speaking of which, many of the publicly available servers listed at
  http://www.tigr.org/tdb/DAS/das_server_list.html
have problems.  I've tweaked the DTDs to support things like 'COLOR'
and 'OUTLINECOLOR' which are pre-1.0 spec, and added a few other things
as I found them.  Another server sends an 'X-DAS-Status' line with descriptive
text after the 3-digit status number.  Still others return syntactically
invalid XML or simply implement the wrong function (a couple return the
DASDSN for 'entry_points'!).
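
One way to cope with the extra text on the 'X-DAS-Status' line is to read
only the leading three digits and ignore the rest.  This is a sketch of
that idea, not what das.py does.  'parse_das_status' is a hypothetical
helper that takes the header value as a string.

```python
import re

# Three digits at the start (leading whitespace allowed), not followed
# by another digit or letter.
_status_re = re.compile(r"\s*(\d{3})\b")


def parse_das_status(header_value):
    """Return the 3-digit status as an int, ignoring any trailing text."""
    m = _status_re.match(header_value)
    if m is None:
        raise ValueError("unrecognized X-DAS-Status value: %r" % (header_value,))
    return int(m.group(1))
```

A value with no leading 3-digit number raises ValueError, so a broken
server is reported rather than silently treated as a success.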

So enjoy, but be wary!  :)

					Andrew
					dalke@dalkescientific.com