[BioRuby] Biosprint bepipred implementation
George Githinji
georgkam at gmail.com
Fri Sep 17 07:25:35 UTC 2010
Hi,
The Biosprint started here in Nairobi yesterday. Most of the
participants are new to Ruby, and some are new to programming in general.
About four are CS people who are experienced in other
languages (PHP, Java, Perl and Python).
We are using git for version control and have forked GeorgeG's BioRuby
fork on GitHub.
Task 1 Description
BepiPred predicts the location of linear B-cell epitopes in proteins
using a combination of a hidden Markov model and a propensity scale
method. The method is described in the following article:
# Improved method for predicting linear B-cell epitopes.
# Jens Erik Pontoppidan Larsen, Ole Lund and Morten Nielsen
# Immunome Research 2:2, 2006.
We are implementing a wrapper class for the BepiPred linear B-cell epitope
prediction tool. Specifically, we want to:
1) Be able to call it from within BioRuby as follows:
# === Examples
#
# require 'bio'
# seq_file = 'test.fasta'
#
# factory = Bio::Bepipred.new(seq_file)
# report = factory.query
# report.class # => Bio::Bepipred::Report
2) Have the report class take the BepiPred predictions and format them as GFF3.
3) Document the tasks.
4) Write unit tests for the methods.
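To make item 2 concrete, here is a minimal sketch of what the GFF3 formatting could look like. It is not the planned Bio::Bepipred::Report class: the Epitope struct, the method names to_gff3_line and to_gff3, the 'bepipred' source label, the 'epitope' feature type and the sample region are all assumptions made for illustration. The only thing taken as given is the GFF3 layout of nine tab-separated columns (seqid, source, type, start, end, score, strand, phase, attributes).

```ruby
# Hypothetical container for one predicted epitope region.
Epitope = Struct.new(:seqid, :start, :stop, :score)

# Format one region as a GFF3 feature line. Strand, phase and attributes
# are left as "." because they are not used in this sketch.
def to_gff3_line(epitope, source = 'bepipred', type = 'epitope')
  [epitope.seqid, source, type, epitope.start, epitope.stop,
   format('%.3f', epitope.score), '.', '.', '.'].join("\t")
end

# Format a list of regions as a GFF3 document with its version header.
def to_gff3(epitopes)
  lines = ['##gff-version 3'] + epitopes.map { |e| to_gff3_line(e) }
  lines.join("\n") + "\n"
end

# Made-up region, only to show the call.
regions = [Epitope.new('seq1', 12, 25, 0.8123)]
puts to_gff3(regions)
```

Note that GFF3 does not allow unescaped whitespace in the seqid column, so FASTA definition lines would probably need to be trimmed or escaped before being used there.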
We have divided ourselves into 4 groups to accomplish this task.
A couple of questions:
1) What is the best development workflow, particularly
when testing the development version?
2) What is the best way to call a command-line program from within
BioRuby? For example, I have this:
require 'bio/command'
require 'shellwords'

module Bio
  # == Description
  #
  # A wrapper for the Bepipred linear B-cell epitope prediction program.
  #
  # === Examples
  #
  #   require 'bio'
  #   seq_file = 'test.fasta'
  #
  #   factory = Bio::Bepipred.new(seq_file)
  #   report = factory.query
  #   report.class # => Bio::Bepipred::Report
  #
  class Bepipred
    autoload :Report, 'bio/appl/bepipred/report'

    # Creates a new Bepipred execution wrapper object
    def initialize(program='bepipred', score_threshold=0.35, file_name='')
      @program = program
      @score_threshold = score_threshold
      @file_name = file_name
    end

    # name of the program ('bepipred' in UNIX/Linux)
    attr_accessor :program

    # options
    attr_accessor :score_threshold

    # return the names of the input sequences
    def sequence_names(file)
      sequence_names = []
      Bio::FlatFile.auto(file) do |f|
        f.each do |entry|
          sequence_names << entry.definition
        end
      end
      sequence_names
    end

    # TODO create a list of query sequences
    # TODO create a command line as an array cmd
    def make_command
      cmd = [@program, "-t #{@score_threshold}", @file_name]
    end

    # query the file
    def query(file_name)
      cmd = make_command
      exec_local(cmd)
    end

    # TODO create a parser class for the output
    # parse_results

    private

    # Executes bepipred when called locally.
    # The input is a file name or a path to the file containing protein
    # sequences in FASTA format.
    # This method does not work.
    # There could be a bug in the way the cmd argument is created.
    def exec_local(cmd)
      Bio::Command.query_command(cmd)
    end
  end
end
This does not seem to work.
Please assist. Thanks.
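For comparison, here is a minimal sketch of the same call using only Ruby's standard library (Open3) instead of Bio::Command. It is not a confirmed fix. The helper names build_command and run_command are hypothetical, and passing -t and its value as two separate array elements is an assumption about how bepipred expects its arguments. When a command is given as an array, each element reaches the program as exactly one argument and no shell is involved.

```ruby
require 'open3'

# Build the argument vector. Each element becomes exactly one argument of
# the child process, so the flag and its value are kept separate here
# (an assumption about how bepipred parses its command line).
def build_command(program, score_threshold, file_name)
  [program, '-t', score_threshold.to_s, file_name]
end

# Run the command without a shell and return its standard output.
# Raises if the program exits with a non-zero status.
def run_command(cmd)
  stdout, stderr, status = Open3.capture3(*cmd)
  raise "#{cmd.first} failed: #{stderr}" unless status.success?
  stdout
end

cmd = build_command('bepipred', 0.35, 'test.fasta')
p cmd

# Running it requires bepipred on the PATH and an existing input file:
# output = run_command(cmd)
```

Printing the array before running it makes it easy to check whether the threshold arrives as one argument or two.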
--
---------------
Sincerely
George
KEMRI/Wellcome-Trust Research Program
Skype: george_g2
Blog: http://biorelated.wordpress.com/