From jan.aerts at bbsrc.ac.uk Mon Jun 5 06:14:22 2006
From: jan.aerts at bbsrc.ac.uk (jan aerts (RI))
Date: Mon, 5 Jun 2006 11:14:22 +0100
Subject: [BioRuby] classes for chromosomes, genes, exons, polymorphisms, ...
Message-ID: <84DA9D8AC9B05F4B889E7C70238CB451030DADA5@rie2ksrv1.ri.bbsrc.ac.uk>

All,

As far as I know, there are no classes that describe biological concepts
such as 'a chromosome', 'a gene', or 'an exon'. I'm working out whether I
need these things for my own purposes, but wonder if it would be
worthwhile to include them in bioruby?

I was thinking of something like:

  Bio::Entity::Chromosome
  Bio::Entity::Gene
  Bio::Entity::Exon
  Bio::Entity::Repeat
  Bio::Entity::Polymorphism
  Bio::Entity::SNP
  Bio::Entity::Microsatellite

I don't know if Bio::Entity::Chromosome would be necessary, because its
principal use would be to hold annotations (i.e. act as a map), so I
suppose that Bio::Map::SimpleMap could be used instead...

Bio::Entity::Gene would include Bio::Map::ActsAsMarker and
Bio::Map::ActsLikeMap, so that a gene can be mapped to something like a
chromosome, but it can also have exons or polymorphisms mapped to it.

Is this worth pursuing? If so, any ideas/comments are welcome...

Jan Aerts, PhD
Bioinformatics Group
Roslin Institute
Roslin, Scotland, UK
+44 131 527 4200

---------The obligatory disclaimer--------
The information contained in this e-mail (including any attachments) is
confidential and is intended for the use of the addressee only. The
opinions expressed within this e-mail (including any attachments) are the
opinions of the sender and do not necessarily constitute those of Roslin
Institute (Edinburgh) ("the Institute") unless specifically stated by a
sender who is duly authorised to do so on behalf of the Institute.

From ahr6y at virginia.edu Mon Jun 5 14:00:56 2006
From: ahr6y at virginia.edu (Anoop Ranganath)
Date: Mon, 5 Jun 2006 14:00:56 -0400
Subject: [BioRuby] adding a flatfile format
Message-ID:

I'd like to add a simple flatfile format to use with bioruby. In
particular, I'm looking at BED files, which have a very simple
tab-delimited format. I'm poking around in the code, and haven't been
able to find a good example of how to incorporate a new parser. I'm
not so much interested in adding support for autodetection, so I
can't imagine that it would be too difficult.

Does anyone have a starting point I can use?

Thanks,
Anoop

From ngoto at gen-info.osaka-u.ac.jp Mon Jun 12 05:56:58 2006
From: ngoto at gen-info.osaka-u.ac.jp (GOTO Naohisa)
Date: Mon, 12 Jun 2006 18:56:58 +0900
Subject: [BioRuby] adding a flatfile format
In-Reply-To:
References:
Message-ID: <200606120957.k5C9v3NE005121@idns103.gen-info.osaka-u.ac.jp>

Hi,

On Mon, 5 Jun 2006 14:00:56 -0400
Anoop Ranganath wrote:

> I'd like to add a simple flatfile format to use with bioruby. In
> particular, I'm looking at BED files, which have a very simple tab
> delimited format. I'm poking around in the code, and haven't been
> able to find a good example of how to incorporate a new parser. I'm
> not so much interested in adding support for autodetection, so I
> can't imagine that it would be too difficult.
>
> Does anyone have a starting point I can use?
>
> Thanks,
> Anoop

If you are using bioruby-1.0.0, some critical bugs have been found in
flatfile.rb. Please apply the attached patch.

Here is a very simple parser for tab-separated values. It reads the
entire file at once when the parser class is initialized.
######################################################################
require 'bio'

# very simple parser for tab-separated data
class SimpleFormat
  # delimiter needed for flatfile
  DELIMITER = RS = nil # nil means no delimiter and reading entire file
  def initialize(str)
    @data = str.split(/\n/).collect { |x| x.to_s.split(/\t/) }
  end
  attr_reader :data
end

# example code to read a file 'test.dat' and show data
Bio::FlatFile.open(SimpleFormat, 'test.dat') do |ff|
  ff.each do |entry|
    p entry.data
  end
end
######################################################################

Here is a simple example that parses a file with multiple entries,
where each entry ends with '//'.

######################################################################
require 'bio'

# very simple parser for "//"-separated entries
class SimpleFormat2
  # delimiter needed for flatfile
  DELIMITER = RS = '//' # the end of each entry is '//'
  def initialize(str)
    # very simple parser that only stores the text data
    @data = str
  end
  attr_reader :data
end

# example code to read a file 'sample.gbk' and show each entry
Bio::FlatFile.open(SimpleFormat2, 'sample.gbk') do |ff|
  ff.each do |entry|
    p entry.data
  end
end
######################################################################

If you want to parse data with variable delimiters or no explicit
delimiters, you need to write a splitter class that reads an entry from
an IO wrapper object created by the FlatFile class. However, the
specification of the splitter class hasn't been clearly determined yet,
and will change in the near future.

In addition, flatfile.rb is now being reconstructed, so the
descriptions above might change.

--
Naohisa GOTO
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org
Department of Genome Informatics, Genome Information Research Center,
Research Institute for Microbial Diseases, Osaka University, Japan
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: flatfile.patch
Url: http://lists.open-bio.org/pipermail/bioruby/attachments/20060612/2bc61698/attachment.pl
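
[Editorial sketch, not part of the original thread.] Following the
SimpleFormat pattern in GOTO Naohisa's reply on adding a flatfile format,
a BED parser for the original question might look roughly like the code
below. The class name SimpleBed, the BedRecord struct and its field names
are hypothetical and chosen only for illustration; only the first six BED
columns are handled, and header lines (track, browser, #) are simply
skipped. The block runs as plain Ruby and deliberately does not load
BioRuby.

```ruby
# Hypothetical BED parser modelled on the SimpleFormat example.
# SimpleBed and BedRecord are illustrative names, not BioRuby classes.
class SimpleBed
  # nil delimiter: the whole file arrives in initialize as one string
  DELIMITER = RS = nil

  # First six BED columns; the format only requires the first three.
  BedRecord = Struct.new(:chrom, :chrom_start, :chrom_end,
                         :name, :score, :strand)

  attr_reader :records

  def initialize(str)
    @records = []
    str.each_line do |raw|
      line = raw.chomp
      # skip blank lines, comments, and track/browser header lines
      next if line.empty? || line =~ /\A(?:#|track\s|browser\s)/
      chrom, s, e, name, score, strand = line.split(/\t/)
      # coordinates are converted to integers; other columns stay strings
      @records << BedRecord.new(chrom, Integer(s), Integer(e),
                                name, score, strand)
    end
  end
end

# small in-memory example instead of a real file
bed = SimpleBed.new("track name=example\n" \
                    "chr1\t100\t200\tfeatA\t0\t+\n" \
                    "chr2\t5\t50\n")
bed.records.each do |r|
  puts [r.chrom, r.chrom_start, r.chrom_end, r.name].inspect
end
```

Because it defines DELIMITER and takes the entry text in initialize, such
a class could presumably be passed to Bio::FlatFile.open in the same way
as SimpleFormat in the reply, but that has not been checked against any
particular BioRuby release.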