[BioRuby] Re: KEGG track in gbrowse
Toshiaki Katayama
ktym at hgc.jp
Tue Jan 25 08:36:41 EST 2005
Hi Venky,
In KEGG DAS (das.hgc.jp), I'm using the following conf for GBrowse:
--------------------------------------------------
link = sub {
    my $feature = shift;
    my $name = $feature->display_name;
    my $gene = $feature->attributes("Gene");
    my $cgi = "http://www.genome.jp/dbget-bin/show_pathway";
    my $url = "$cgi?$name+$gene";
    return $url;
  }
--------------------------------------------------
and the GFF lines corresponding to this feature look something like
this (the sample is taken from yeast):
--------------------------------------------------
I	KEGG	pathway	31568	32941	.	+	.	path "sce00251" ; Gene "YAL062W"
I	KEGG	pathway	42881	45022	.	-	.	path "sce00010" ; Gene "YAL054C"
I	KEGG	pathway	42881	45022	.	-	.	path "sce00620" ; Gene "YAL054C"
:
--------------------------------------------------
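To make the relation between the GFF attributes and the link concrete, here is a minimal Ruby sketch that builds the same URL from one GFF line. It assumes that the columns are tab-separated and that display_name corresponds to the value of the "path" attribute; the helper names parse_attributes and pathway_url are made up for this illustration and are not part of GBrowse or BioRuby.

```ruby
# Hypothetical helpers for illustration; only the URL pattern and the
# GFF layout are taken from the message above.
SHOW_PATHWAY = "http://www.genome.jp/dbget-bin/show_pathway"

# Parse a GFF2 attribute column such as: path "sce00251" ; Gene "YAL062W"
def parse_attributes(column)
  column.split(";").each_with_object({}) do |pair, attrs|
    key, value = pair.strip.split(/\s+/, 2)
    next if key.nil?
    attrs[key] = value.to_s.delete('"')
  end
end

# Build show_pathway?<pathway_id>+<gene_id>, as the Perl callback does.
# Assumption: the 9 GFF columns are separated by tabs.
def pathway_url(gff_line)
  fields = gff_line.chomp.split("\t", 9)
  attrs = parse_attributes(fields[8])
  "#{SHOW_PATHWAY}?#{attrs['path']}+#{attrs['Gene']}"
end

line = ["I", "KEGG", "pathway", "31568", "32941", ".", "+", ".",
        'path "sce00251" ; Gene "YAL062W"'].join("\t")
puts pathway_url(line)
```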
Unfortunately, KEGG DAS doesn't support human because
the KEGG GENES entries for human don't contain gene
coordinates on the chromosome for now.
In your case, you just need gene_id to pathway_id mappings.
These can be obtained from raw KEGG GENES entries (strategy 1)
or by using the KEGG API (strategy 2).
[Strategy 1]
The KEGG GENES flat file for human is available at
ftp://ftp.genome.jp/pub/kegg/genomes/genes/H.sapiens.ent
and an entry looks like
--------------------------------------------------
ENTRY       2                 CDS       H.sapiens
NAME        A2M
DEFINITION  alpha-2-macroglobulin
ORTHOLOG    KO: K03910  alpha-2-macroglobulin
CLASS       Environmental Information Processing; Immune System; Complement and
            coagulation cascades [PATH:hsa04610]
            Human Diseases; Neurodegenerative Disorders; Alzheimer's disease
            [PATH:hsa05010]
POSITION    12p13.3-p12.3
DBLINKS     LocusLink: 2
            GDB: 119639
            OMIM: 103950
:
--------------------------------------------------
You can extract the entry_id from the ENTRY field (consistent
with the LocusLink ID for human) and a list of pathway_ids
from the CLASS field.
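As a rough illustration of what this extraction involves, the following sketch does it in plain Ruby without any library. It assumes the flat-file layout shown above and that [PATH:...] tags occur only in the CLASS field; the method name gene_to_pathways is hypothetical.

```ruby
# A minimal sketch, no BioRuby required.
# Assumption: [PATH:...] tags appear only in the CLASS field of an entry.
def gene_to_pathways(entry_text)
  # The first token after the ENTRY tag is the entry_id.
  entry_id = entry_text[/^ENTRY\s+(\S+)/, 1]
  # Each pathway is tagged as [PATH:<pathway_id>].
  pathway_ids = entry_text.scan(/\[PATH:([^\]\s]+)\]/).flatten.uniq
  [entry_id, pathway_ids]
end

entry = <<~TEXT
  ENTRY       2                 CDS       H.sapiens
  NAME        A2M
  CLASS       Environmental Information Processing; Immune System; Complement and
              coagulation cascades [PATH:hsa04610]
              Human Diseases; Neurodegenerative Disorders; Alzheimer's disease
              [PATH:hsa05010]
  POSITION    12p13.3-p12.3
TEXT

entry_id, pathway_ids = gene_to_pathways(entry)
pathway_ids.each { |pathway_id| puts "#{entry_id}\t#{pathway_id}" }
```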
With BioRuby, you can do this with the following code.
gene2path.rb:
--------------------------------------------------
#!/usr/bin/env ruby

require 'bio'

Bio::FlatFile.auto(ARGF) do |flatfile|
  flatfile.each do |entry|
    pathways = entry.pathways
    pathways.each do |pathway_id|
      puts "#{entry.entry_id}\t#{pathway_id}"
    end
  end
end
--------------------------------------------------
You can run this script as
--------------------------------------------------
% ruby gene2path.rb H.sapiens.ent
2 hsa04610
2 hsa05010
13 hsa00623
13 hsa00650
13 hsa00960
15 hsa00380
:
--------------------------------------------------
and then integrate the result with your GFF.
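One possible way to do that integration is sketched below. It assumes you already have chromosome coordinates for each gene_id from your own annotation (KEGG GENES does not provide them for human), and it writes lines in the same layout as the yeast GFF sample. The GeneCoord struct, the pathway_gff_lines method and the coordinates are placeholders for illustration only, not real annotation data.

```ruby
# Hypothetical container for coordinates that come from your own annotation.
GeneCoord = Struct.new(:seqid, :start, :stop, :strand)

# Turn "gene_id<TAB>pathway_id" rows into GFF lines for a pathway track.
# Genes without known coordinates are skipped.
def pathway_gff_lines(mapping_tsv, coords)
  lines = mapping_tsv.each_line.map do |row|
    gene_id, pathway_id = row.chomp.split("\t")
    coord = coords[gene_id]
    next if coord.nil? || pathway_id.nil?
    [coord.seqid, "KEGG", "pathway", coord.start, coord.stop, ".",
     coord.strand, ".", %(path "#{pathway_id}" ; Gene "#{gene_id}")].join("\t")
  end
  lines.compact
end

# Placeholder coordinates (NOT real positions), keyed by gene_id.
coords = { "2" => GeneCoord.new("chr12", 1000, 2000, "-") }
mapping = "2\thsa04610\n2\thsa05010\n13\thsa00623\n"
puts pathway_gff_lines(mapping, coords)
```

Here the row for gene 13 is dropped because no coordinates are known for it in the placeholder table.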
[Strategy 2]
You can obtain the genes on each KEGG PATHWAY using the KEGG API,
which is a SOAP/WSDL-based web service.
The following code will do the job.
human_genes_on_pathways.rb:
--------------------------------------------------
#!/usr/bin/env ruby

require 'bio'

serv = Bio::KEGG::API.new

# obtain a list of pathways for human
list = serv.list_pathways("hsa")

list.each do |pathway|
  pathway_id = pathway.entry_id

  # display current status on standard error
  STDERR.puts "Now processing... #{pathway_id} : #{pathway.definition}"

  # obtain a list of gene_ids on the pathway_id
  genes = serv.get_genes_by_pathway(pathway_id)
  genes.each do |gene|
    puts "#{gene}\t#{pathway_id}"
  end
end
--------------------------------------------------
Run it with
--------------------------------------------------
% ruby human_genes_on_pathways.rb > result.txt
Now processing... path:hsa00010 : Glycolysis / Gluconeogenesis - Homo sapiens
Now processing... path:hsa00020 : Citrate cycle (TCA cycle) - Homo sapiens
Now processing... path:hsa00030 : Pentose phosphate pathway - Homo sapiens
:
--------------------------------------------------
The contents of result.txt will be
--------------------------------------------------
hsa:10327 path:hsa00010
hsa:124 path:hsa00010
hsa:125 path:hsa00010
hsa:126 path:hsa00010
:
--------------------------------------------------
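If you want these rows in the same form as the output of strategy 1, the "hsa:" and "path:" prefixes can be removed afterwards. A small sketch, assuming the tab-separated rows written by the script above; strip_prefixes is a hypothetical helper name.

```ruby
# Convert "hsa:10327<TAB>path:hsa00010" into "10327<TAB>hsa00010".
def strip_prefixes(row)
  gene, pathway = row.chomp.split("\t")
  [gene.sub(/\A[^:]+:/, ""), pathway.sub(/\Apath:/, "")].join("\t")
end

puts strip_prefixes("hsa:10327\tpath:hsa00010")
```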
Hope this helps.
Regards,
Toshiaki Katayama
--
Human Genome Center, Institute of Medical Science, University of Tokyo
4-6-1 Shirokanedai, Minato-ku, Tokyo 108-0071, Japan
tel://+81-3-5449-5614, fax://+81-3-5449-5434
BioRuby project http://bioruby.org/~k/
GenomeNet/KEGG http://www.genome.jp/
Human Genome Center http://www.hgc.jp/
On 2005/01/24, at 22:36, B R Venkatesh wrote:
> Hello Folks,
>
> I am using *gbrowse* from GMOD to view human gene
> info but I need to connect genes
> to their pathways like KEGG. Is there a plugin of some
> sort to achieve this?
>
> Apparently somebody has added pathway as a TRACK in
> gbrowse:
> http://das.hgc.jp/cgi-bin/gbrowse/cpv
>
>
> Hope to hear from you.
>
>
> Thanks in advance.
> Venky.
>