[Bioperl-l] xml sequence download from ncbi

Paul Gordon gordonp@niji.imb.nrc.ca
Thu, 24 Aug 2000 10:58:25 -0300 (ADT)


> Sequence download using an xml format derived from our asn.1 standard format
> is now available from Entrez.  For an example, try
> http://www.ncbi.nlm.nih.gov/entrez/viewer.cgi?cmd&save=on&view=xml&val=1827915
> where val is the sequence gi number.  Note that this xml output is based

That's great!  A few minor points though... 

<!DOCTYPE Seq---entry PUBLIC "-//NCBI//NCBI Seqset/EN" "NCBI_Seqset.dtd"> 
<Seq-entry>

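According to the XML spec, the name in the doctype declaration must match
the name of the root element, and here the two differ.  This is a validity
constraint, so a validating parser will reject the document (see
http://maggie.cbr.nrc.ca/~gordonp/xml/W3C/xml.html#vc-roottype).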
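
For anyone who wants to check a downloaded record for this kind of
mismatch, here is a minimal sketch.  It assumes Python with the expat
bindings from its standard library; the function name
doctype_matches_root is made up for illustration.  expat is a
non-validating parser, so it will not complain on its own; the sketch
simply records the doctype name and the first element name and compares
them.

```python
import xml.parsers.expat


def doctype_matches_root(xml_text):
    """Return (doctype_name, root_name, names_match) for an XML string.

    Hypothetical helper: it only compares the two names and does not
    fetch or apply the DTD itself.
    """
    seen = {"doctype": None, "root": None}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Called once when the <!DOCTYPE ...> declaration is read.
        seen["doctype"] = name

    def on_start_element(name, attrs):
        # The first start tag encountered is the root element.
        if seen["root"] is None:
            seen["root"] = name

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = on_start_element
    parser.Parse(xml_text, True)  # True marks the end of the input
    return seen["doctype"], seen["root"], seen["doctype"] == seen["root"]


if __name__ == "__main__":
    record = (
        '<!DOCTYPE Seq---entry PUBLIC "-//NCBI//NCBI Seqset/EN" '
        '"NCBI_Seqset.dtd">\n'
        "<Seq-entry></Seq-entry>"
    )
    print(doctype_matches_root(record))
```

The third value in the returned tuple is False whenever the two names
differ, which is the case for the header quoted above.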

Also, can we get the DTD itself (NCBI_Seqset.dtd)?  I was not able to find
it at http://www.ncbi.nlm.nih.gov/entrez/NCBI_Seqset.dtd or anywhere
nearby...

> on our asn.1 records which are both complete and complex -- we may end up
> making a genbank flatfile-like version, especially since there are small
> mismatches between the asn.1 and xml languages that make the xml a bit more
> complex than if xml was our native format.

Regards,
	Paul

________________________________________________________________________
Paul Gordon                                     Paul.Gordon@nrc.ca
Genomic Technologies				http://maggie.cbr.nrc.ca
Institute for Marine Biosciences
National Research Council Canada