[BioRuby] A Rails application with BioRuby

Toshiaki Katayama ktym at hgc.jp
Thu Dec 20 07:41:12 UTC 2007

Hi Yen-Ju,

On 2007/12/19, at 6:54, Yen-Ju Chen wrote:

> Hi,
>  I am working on a rails application using BioRuby to collect references
> and database entries.
>  You can find the application (not source code yet) at
> journalclub.reciprocallattice.com


>  It is still at early stage. I use it personally and figure it would be
> interesting to have more users.
>  If you want to join, please write to me in private so that it will not
> pollute the BioRuby mailing list.
>  I don't know how many users the application can take. Please see the
> website for more details.
>  These are things related to BioRuby,
>  * The output from Reference to BibTex format lacks abstract.
>  * It would be nice to be able to output to RIS format for EndNote and
> ReferenceManager.

If you could provide a patch for them, I'll include it in BioRuby.
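
For the RIS part, a patch could start from something like the rough sketch below. It is not BioRuby code: the `to_ris` helper, the `RIS_TAGS` table, and the hash keys are all hypothetical names chosen for illustration, and the reference data is a placeholder. It only shows the general shape of an RIS record (a `TY` line first, one `AU` line per author, an `ER` line last).

```ruby
# Hypothetical mapping from reference fields to RIS tags (illustration only).
RIS_TAGS = {
  title:    'TI',
  journal:  'JO',
  volume:   'VL',
  year:     'PY',
  abstract: 'AB'
}.freeze

# Build an RIS record from a plain Hash; empty or missing fields are skipped.
def to_ris(ref)
  lines = ['TY  - JOUR']
  Array(ref[:authors]).each { |author| lines << "AU  - #{author}" }
  RIS_TAGS.each do |field, tag|
    value = ref[field]
    lines << "#{tag}  - #{value}" unless value.nil? || value.to_s.empty?
  end
  lines << 'ER  - '
  lines.join("\n")
end

# Placeholder data, not a real citation.
sample = {
  authors:  ['Author A', 'Author B'],
  title:    'Placeholder title',
  journal:  'Placeholder Journal',
  year:     2007,
  abstract: 'Placeholder abstract.'
}
puts to_ris(sample)
```

A real patch would read the same fields from a Bio::Reference object instead of a Hash.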

>  * Is it possible to get DOI from PubMed ?

  entry = Bio::PubMed.query(16946072)
  doi = entry[/AID - (\S+) \[doi\]/, 1]

or you can extend the Bio::MEDLINE class to add the doi method

  class Bio::MEDLINE
    attr_reader :pubmed

    # Pick the DOI out of the AID field(s) of the entry.
    def doi
      @pubmed['AID'][/(\S+) \[doi\]/, 1]
    end
  end

  entry = Bio::PubMed.query(16946072)
  medline = Bio::MEDLINE.new(entry)
  doi = medline.doi
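
To try the regular expression without network access, here is a minimal sketch that runs on a hand-written MEDLINE-style string. The sample text is abbreviated and the `extract_doi` helper is a hypothetical name, not part of BioRuby; the IDs are the ones from the entry used in this message.

```ruby
# Abbreviated MEDLINE-style sample, written by hand for illustration.
SAMPLE_MEDLINE = <<~TEXT
  PMID- 16946072
  AID - 313/5791/1295 [pii]
  AID - 10.1126/science.1131542 [doi]
TEXT

# Return the DOI from a MEDLINE-format string, or nil if there is none.
def extract_doi(medline_text)
  medline_text[/^AID\s*- (\S+) \[doi\]/, 1]
end

puts extract_doi(SAMPLE_MEDLINE)
# ==> 10.1126/science.1131542
```

Entries without a DOI simply give nil, so the caller should check for that.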

or utilize the XML format of the PubMed output

  entry_xml = Bio::PubMed.efetch(16946072, {"retmode" => "xml"})

            <ArticleId IdType="pii">313/5791/1295</ArticleId>
            <ArticleId IdType="doi">10.1126/science.1131542</ArticleId>
            <ArticleId IdType="pubmed">16946072</ArticleId>

then extract the DOI

  require 'rexml/document'
  pubmed = REXML::Document.new(entry_xml)
  doi = pubmed.elements['//ArticleId[@IdType="doi"]'].get_text
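
The same XPath lookup can be tested offline on a small XML fragment. This is only a sketch: the fragment is trimmed down to the three ArticleId elements shown above, and `doi_from_xml` is a hypothetical helper name. Note that `get_text` returns a REXML::Text object, so the sketch uses `text` to get a plain String and guards against a missing element.

```ruby
require 'rexml/document'

# Trimmed sample fragment; a real efetch response contains much more.
SAMPLE_XML = <<~XML
  <ArticleIdList>
    <ArticleId IdType="pii">313/5791/1295</ArticleId>
    <ArticleId IdType="doi">10.1126/science.1131542</ArticleId>
    <ArticleId IdType="pubmed">16946072</ArticleId>
  </ArticleIdList>
XML

# Return the DOI as a String, or nil when no doi ArticleId is present.
def doi_from_xml(xml)
  doc = REXML::Document.new(xml)
  node = doc.elements['//ArticleId[@IdType="doi"]']
  node && node.text
end

puts doi_from_xml(SAMPLE_XML)
# ==> 10.1126/science.1131542
```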

>  * BioRuby can get information from many databases through biofetch,
>    but not processing them, like Pfam, Prosite, etc.

You can process them with the corresponding classes. For example,

  cyclins = Bio::Fetch.query('prosite', 'PS00292')
  prosite = Bio::PROSITE.new(cyclins)

  prosite.entry_id
  # ==> "PS00292"

  prosite.definition
  # ==> "Cyclins signature."

  prosite.pattern
  # ==> "R-x(2)-[LIVMSA]-x(2)-[FYWS]-[LIVM]-x(8)-[LIVMFC]-x(4)-[LIVMFYA]-x(2)-[STAGC]-[LIVMFYQ]-x-[LIVMFYC]-[LIVMFY]-D-[RKH]-[LIVMFYW]."
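
Once you have the pattern string, you may want to search sequences with it. The following is a rough sketch of turning a PROSITE pattern into a Ruby Regexp. It is independent of BioRuby and is not its implementation; `prosite_to_regexp` is a hypothetical helper, and it assumes only the basic syntax (x, [..], {..}, (n), (n,m), and the < and > anchors).

```ruby
# Convert a PROSITE pattern string into a Ruby Regexp (simplified sketch).
def prosite_to_regexp(pattern)
  body = pattern.strip.chomp('.')                # drop the trailing period
  parts = body.split('-').map do |element|
    element
      .gsub('x', '.')                            # x      -> any residue
      .gsub(/\{([A-Z]+)\}/) { "[^#{$1}]" }       # {ABC}  -> none of A, B, C
      .gsub(/\((\d+)(?:,(\d+))?\)/) { $2 ? "{#{$1},#{$2}}" : "{#{$1}}" }
      .sub(/\A</, '^')                           # <      -> N-terminal anchor
      .sub(/>\z/, '$')                           # >      -> C-terminal anchor
  end
  Regexp.new(parts.join)
end

cyclin_pattern = 'R-x(2)-[LIVMSA]-x(2)-[FYWS]-[LIVM]-x(8)-[LIVMFC]-x(4)-' \
                 '[LIVMFYA]-x(2)-[STAGC]-[LIVMFYQ]-x-[LIVMFYC]-[LIVMFY]-D-[RKH]-[LIVMFYW].'
cyclin_re = prosite_to_regexp(cyclin_pattern)
puts cyclin_re.source
```

The resulting Regexp can then be used as usual, e.g. `sequence =~ cyclin_re`, where `sequence` is any amino acid string of yours.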



>  * it is not clear what the databases from biofetch are, for example: rn, rp,
> str, pr.
>    I am in structural biology. Many of these abbreviations are not obvious.

In BioRuby, the default BioFetch server is implemented as a proxy for the DBGET system through the KEGG API.
So, please refer to the abbreviation field in the DBGET manual at


and also note that the DBGET service for the GenBank (gb) database is no longer available.

Toshiaki Katayama
