[BioRuby] RegEx search example fasta file
pjotr at pckassa.com
Wed Mar 24 01:24:16 EST 2004
Thanks! I'll have a look and will improve the Wiki to cover that. Pays
off immediately ;-).
Pj.
On Wed, Mar 24, 2004 at 11:58:20AM +0900, Toshiaki Katayama wrote:
> On 2004/03/21, at 22:33, pjotr at pckassa.com wrote:
> >Can this go in the sample directory of bioruby - I have added it to
> >the Wiki. Comments welcome.
>
> As for the wiki page: compared with the original BJIA
> (http://www.biojava.org/docs/bj_in_anger/FastaParser.htm),
> this section is meant to answer how to parse fasta results.
>
> Because Bio::FlatFile.auto in BioRuby is very powerful and
> entry.definition is implemented in various DB classes,
> your approach of finding entries by regexp
> is not limited to FastaFormat, as the following shows:
>
> % re_grep_def.rb 'serine.* kinase' genbank/gb*.seq
> % re_grep_def.rb 'serine.* kinase' kegg/genes/*.ent
> % re_grep_def.rb 'serine.* kinase' kegg/sequences/*.pep
>
> ----------------------------------------------
> #!/usr/bin/env ruby
>
> require 'bio'
>
> re = /#{ARGV.shift}/i
>
> Bio::FlatFile.auto(ARGF) do |ff|
>   ff.each do |entry|
>     if re.match(entry.definition)
>       puts ff.entry_raw
>     end
>   end
> end
> ----------------------------------------------
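
The point made above, that the regexp filter only needs each entry to respond to `definition`, can be illustrated without BioRuby installed. This is a minimal sketch in plain Ruby; the `Entry` struct and the `grep_definitions` helper are hypothetical stand-ins invented for illustration and are not part of BioRuby.

```ruby
# Hypothetical stand-in for a parsed database entry: the filter only
# needs a definition line and the raw text of the entry.
Entry = Struct.new(:definition, :raw)

# Return the raw text of every entry whose definition matches the pattern.
# The pattern string is compiled once, case-insensitively, mirroring the
# /#{ARGV.shift}/i in the script above.
def grep_definitions(entries, pattern)
  re = Regexp.new(pattern, Regexp::IGNORECASE)
  entries.select { |e| re.match(e.definition) }.map(&:raw)
end

entries = [
  Entry.new("Serine/threonine protein kinase",
            ">sp1 Serine/threonine protein kinase\nMKV\n"),
  Entry.new("Hypothetical protein",
            ">sp2 Hypothetical protein\nMAA\n")
]

hits = grep_definitions(entries, 'serine.* kinase')
puts hits
```

Any entry class that offers `definition` could be passed through the same filter, which is why the approach carries over from FASTA to other flat-file formats.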
>
>
> -k
>
>
>
> >
> >Pj.
> >
> >
> >#! /usr/bin/ruby
> >#
> ># $Id: fastasearch,v 1.1 2004/03/21 13:18:41 wrk Exp $
> ># $Source: /home/cvs/home/pjotr/lwrk/luw/fasta/fastasearch,v $
> >#
> >
> ># require 'profile'
> >
> >COPYRIGHT = "GPL (c) 2003-2004"
> >
> >usage = <<USAGE
> >
> > Search fasta file(s) tags using a regular expression (regex)
> >
> > Usage: fastasearch [-q query] filename(s)
> >
> > Example:
> >
> > ruby fastasearch -q '([Hh]uman|[Hh]omo sapiens)' nr.fa
> >
> > For more information see
> >
> > http://thebird.nl/bioinformatics/
> >
> > Pjotr Prins
> > Wageningen University and Research Centre
> > http://www.wur.nl/
> > http://www.dpw.wageningen-ur.nl/nema/
> >
> >USAGE
> >
> ># --------------------------------------------------------------------
> >
> >srcpath=File.dirname($0)
> >libpath=File.dirname(srcpath)+'/lib'
> >$: << srcpath # ---- Add start path to search libraries
> >$: << libpath
> >
> >require 'getoptlong'
> >require 'bio'
> >
> ># ---- Parse command line
> >opts = GetoptLong.new(
> > [ "--help", "-h", GetoptLong::NO_ARGUMENT ],
> > [ "--query", "-q", GetoptLong::REQUIRED_ARGUMENT ]
> >)
> >
> >do_help = false
> >query=nil
> >
> >opts.each do | opt, arg |
> > do_help |= (opt == '--help')
> > query = arg if (opt == '--query')
> >end
> >
> ># ---- Print usage
> >if (do_help || ARGV.size==0)
> > print usage
> > exit 1
> >end
> >
> >if !query
> > print "Give query: "
> > query = $stdin.gets.chomp
> >end
> >
> >ARGV.each do | fn |
> >  $stderr.print "Loading #{fn}..."
> >  f = Bio::FlatFile.auto(fn)
> >  $stderr.print " detected: #{f.dbclass}\n"
> >  f.each_entry do | e |
> >    if e.definition =~ /#{query}/
> >      print '>',e.definition,e.data
> >    end
> >  end
> >end
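
The query string in the script above is interpolated verbatim into a regular expression, so any surrounding `/` characters in the `-q` argument would be matched literally, and an invalid pattern raises an error only once the first entry is tested. One possible way to handle this is to compile the pattern once, up front. The sketch below assumes a hypothetical helper named `compile_query`, which is not part of the original script or of BioRuby.

```ruby
# Hypothetical helper: turn the -q argument into a Regexp once, before
# any file is read. Returns nil when the pattern is not a valid regex.
def compile_query(query)
  Regexp.new(query)
rescue RegexpError
  nil
end

# A valid pattern, given without surrounding slashes.
re = compile_query('([Hh]uman|[Hh]omo sapiens)')
puts re.match?('Homo sapiens hemoglobin')

# An unbalanced parenthesis is reported as nil instead of raising later.
bad = compile_query('([Hh]uman')
puts bad.nil?
```

With such a helper the main loop could test `re.match?(e.definition)` and the script could print the usage text and exit when the helper returns nil.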
> >
> >_______________________________________________
> >BioRuby mailing list
> >BioRuby at open-bio.org
> >http://portal.open-bio.org/mailman/listinfo/bioruby