[BioRuby] Bio::Blat::Report

Tomoaki NISHIYAMA tomoakin at kenroku.kanazawa-u.ac.jp
Fri Sep 5 09:25:30 UTC 2008


Hi,

> ehm... any good translator from japanese to english (or better  
> italian!) ?  :P

Here is a translation by the original sender:

-- start of translation
I am Nishiyama at Kanazawa.

When a multi-FASTA file is used as the query, blat, unlike blast,
does not output a header for each query; instead it
writes the query and target IDs on every line.

Bio::Blat::Report, in keeping with that
behavior, seems to return a single entry containing many
hits.  As a user, however, I do not want to split the file and
search with each query separately; I want the results aggregated
per query, for example when I want the best hit location for
each query.

Although there is no separator in the output of blat, the results
for the same query appear on consecutive lines.
Since it would be useful, when processing the output as a FlatFile,
to get each block of lines with the same query name as one "entry",
I made a "flatfile_splitter".
Because each line is already parsed to determine the split position,
the return value is an Array of Hit objects, so that Hit.new
need not be called again.  (This makes a speed difference of
about 20%.)
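
As a rough illustration of the grouping idea described above (this is
not the flatfile_splitter code itself), here is a minimal sketch in
plain Ruby.  The method name each_query_block is hypothetical, and the
sketch assumes the standard 21-column tab-separated psl layout with the
query name in the 10th column.  It returns arrays of split fields
rather than Hit objects.

```ruby
# Hypothetical helper, not part of BioRuby: yields one Array of parsed
# psl lines for each run of consecutive lines with the same query name.
def each_query_block(io)
  return enum_for(:each_query_block, io) unless block_given?

  current_name = nil
  block = []
  io.each_line do |line|
    fields = line.chomp.split("\t")
    # Skip the optional psl header and blank lines: data lines are
    # assumed to have 21 columns and to start with an integer.
    next unless fields.size >= 21 && fields[0] =~ /\A\d+\z/

    qname = fields[9] # 10th column: query name (assumed layout)
    if qname != current_name && !block.empty?
      yield block     # the previous query is complete
      block = []
    end
    current_name = qname
    block << fields
  end
  yield block unless block.empty?
end
```

Calling each_query_block(File.open("result.psl")) { |hits| ... } would
then hand over one query's hits at a time (the file name is only an
example).  Because the fields are split once to find the query name,
they can be reused later instead of parsing the line a second time.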

When processing a psl file of 100-200 Mbytes, an approach that reads
the whole data into a Hash and then processes the hits for each query
required more than several Gbytes of memory,
but with this approach much less memory is sufficient.
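
To show why the memory stays small, here is a sketch of the "best hit
per query" case that holds only one query's hits at a time.  It is an
assumption-laden example, not the code in the diff: best_hits is a
hypothetical name, the column positions follow the standard psl
layout, and "best" is simply taken to mean the largest match count in
the first column, which may not be the score you actually want.

```ruby
# Assumed 0-based column positions in a psl data line.
MATCHES, Q_NAME, T_NAME, T_START, T_END = 0, 9, 13, 15, 16

# Hypothetical helper: yields query name, target name, target start
# and target end of the hit with the most matches for each query.
def best_hits(io)
  return enum_for(:best_hits, io) unless block_given?

  # Stream the data lines only, skipping header and blank lines.
  data = Enumerator.new do |y|
    io.each_line do |line|
      f = line.chomp.split("\t")
      y << f if f.size >= 21 && f[MATCHES] =~ /\A\d+\z/
    end
  end

  # Cut the stream wherever the query name changes, so only the hits
  # of the current query are held in memory.
  data.slice_when { |a, b| a[Q_NAME] != b[Q_NAME] }.each do |hits|
    best = hits.max_by { |f| f[MATCHES].to_i }
    yield best[Q_NAME], best[T_NAME], best[T_START].to_i, best[T_END].to_i
  end
end
```

This relies on the hits for one query being contiguous in the file; if
the same query name shows up again later, it is reported as a separate
block.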

What do you think?

-- end of translation

The remainder is the diff of the source code.
Note that the class and file names are changed to avoid collision,
and the behavior of the original class is not changed.

On 2008/09/04, at 18:11, Davide Rambaldi wrote:

>
> On Sep 4, 2008, at 5:52 AM, Naohisa GOTO wrote:
>
>> This is somewhat incompatible, but good for speed and memory usage.
>> In addition, some people have requested it.
>> http://lists.open-bio.org/pipermail/bioruby-ja/2007-August/000137.html
>> (Mailing list written in Japanese)
>
>
> ehm... any good translator from japanese to english (or better  
> italian!) ?  :P
>
> Anyway, I agree that the strange case of mixed hits can be ignored.
>
> Will these commits be available in the next version of bioruby?
>
> I have bioruby on the edge in my laptop but not on the cluster...
>
> Last question (sorry for asking everything): is there a way to
> generate docs for bioruby that can be queried with the ri command?
>
> ri Bio::Blat::Report
> Nothing known about Bio::Blat::Report
>
>
> Thanks!
>
> Davide Rambaldi,
> Bioinformatics PhD student.
> -----------------------------------------------------
> Bioinformatic Group IFOM-IEO Campus
> Via Adamello 16, Milano
> I-20139 Italy
>
> [t] +39 02574303 066
> [e] davide.rambaldi at ifom-ieo-campus.it
> [i] http://ciccarelli.group.ifom-ieo-campus.it/fcwiki/DavideRambaldi (homepage)
> [i] http://www.semm.it             (PhD school)
> [i] http://www.btbs.unimib.it/     (Master)
>
> -----------------------------------------------------



-- 
Tomoaki NISHIYAMA

Advanced Science Research Center,
Kanazawa University,
13-1 Takara-machi,
Kanazawa, 920-0934, Japan




