[Biopython-dev] large updates in CVS
Jeffrey Chang
jchang at smi.stanford.edu
Tue Sep 10 02:39:16 EDT 2002
Hello everybody,
I have just committed into CVS a reworking of the registry framework.
That's the stuff that gets loaded into the Bio namespace. As before,
you can do:
krusty:~] jchang% python
Python 2.2.1 (#1, 09/06/02, 17:02:21)
[GCC Apple cpp-precomp 6.14] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import Bio
>>> Bio.formats
Bio.dformats, exporting 'blast', 'blastn', 'blastp', 'blastx', 'embl', 'embl/65', 'empty', 'fasta', 'genbank', 'genbank-records', 'genbank-release', 'ncbi-blastn', 'ncbi-blastp', 'ncbi-blastx', 'ncbi-tblastn', 'ncbi-tblastx', 'search', 'sequence', 'swissprot', 'swissprot/38', 'swissprot/40', 'tblastn', 'tblastx', 'wu-blastn', 'wu-blastp', 'wu-blastx'
>>> Bio.db
db, exporting 'embl', 'embl-dbfetch-cgi', 'embl-ebi-cgi', 'embl-fast', 'embl-xembl-cgi', 'interpro-ebi-cgi', 'nucleotide-dbfetch-cgi', 'nucleotide-genbank-cgi', 'pdb', 'pdb-ebi-cgi', 'pdb-rcsb-cgi', 'prodoc-expasy-cgi', 'prosite-expasy-cgi', 'protein-genbank-cgi', 'swissprot', 'swissprot-expasy-cgi'
>>> print Bio.db['swissprot']["P50105"].read()[:200]
ID MPT1_YEAST STANDARD; PRT; 388 AA.
AC P50105;
DT 01-OCT-1996 (Rel. 34, Created)
DT 01-OCT-1996 (Rel. 34, Last sequence update)
DT 01-OCT-1996 (Rel. 34, Last annotation update)
D
>>>
The differences are mostly internal. The Registry code has now been
pulled out to make it easier to add new registries and new entries to
registries.
The DBRegistry has been slightly reworked, though. As before, the
DBRegistry implements a dictionary-like interface to different kinds
of databases. Now, the __getitem__ function returns a type native to
the kind of database being queried. For example, CGI databases return
handles to the data, and BioSQL returns a SeqRecord object.
However, the DBRegistry objects now export a method:
get_as(key, to_io)
that will automatically convert the data to a specific type.
>>> seq = Bio.db['swissprot'].get_as("P50105", SeqRecord.io)
>>> print seq.seq
Seq('MANSPKKPSDGTGVSASDTPKYQHTVPETKPAFNLSPGKASELSHSLPSPSQIKSTAHVS ...', SingleLetterAlphabet())
>>>
This standardizes the data access across different kinds of databases.
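To illustrate the idea, here is a minimal, self-contained sketch of the pattern: __getitem__ returns whatever is native to the backend, while get_as() funnels every backend through one converter. This is not the Biopython code. The names PlainTextIO, HandleDB, RecordDB, the convert() method, and the "toy-cgi"/"toy-sql" keys are all hypothetical, and the sketch is written for Python 3 rather than the Python 2.2 shown in the session above.

```python
import io


class PlainTextIO:
    """Hypothetical conversion target: normalizes native results to str."""

    @staticmethod
    def convert(native):
        if hasattr(native, "read"):      # file-like handle
            return native.read()
        if isinstance(native, dict):     # already-parsed record
            return native["text"]
        raise TypeError("cannot convert %r" % type(native).__name__)


class BaseDB:
    """Shared get_as(): fetch the native object, then convert it."""

    def __init__(self, records):
        self._records = dict(records)

    def get_as(self, key, to_io):
        return to_io.convert(self[key])


class HandleDB(BaseDB):
    """Toy backend whose native type is a file-like handle."""

    def __getitem__(self, key):
        return io.StringIO(self._records[key])


class RecordDB(BaseDB):
    """Toy backend whose native type is a parsed record (a dict here)."""

    def __getitem__(self, key):
        return {"id": key, "text": self._records[key]}


# Toy registry: a plain dict mapping names to backends (made-up data).
registry = {
    "toy-cgi": HandleDB({"P50105": "MANSPKKPSD"}),
    "toy-sql": RecordDB({"P50105": "MANSPKKPSD"}),
}

# Native access differs per backend; get_as() gives the same type for both.
for name in ("toy-cgi", "toy-sql"):
    native = registry[name]["P50105"]
    text = registry[name].get_as("P50105", PlainTextIO)
    print(name, type(native).__name__, text)
```

In this sketch the caller only needs to know the target type, not which kind of backend sits behind the registry key.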
Please update your repositories with
cvs update -P -d
Please play around with this and let me know if anything has broken.
One known issue is that some of the dbdefs need to be updated, due to
changes in external web servers.
Jeff