[Biojava-l] RichSequence.IOTools performance
Andy Yates
ayates at ebi.ac.uk
Thu Mar 31 07:57:33 UTC 2011
Makes a lot of sense. There's no way of knowing if a stream is buffered unless the top-level object given is an instance of BufferedOutputStream. Does this mean that by some fluke we could buffer a buffered stream?
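
To make the question concrete, here is a minimal sketch of that kind of check. The class BufferCheck and the helper ensureBuffered are hypothetical names for illustration, not the actual FastaWriter code. It assumes the check is a plain instanceof test, so a BufferedOutputStream hidden behind another wrapper (a FilterOutputStream here) is not detected and gets buffered a second time. The output is still correct in that case; the cost is only an extra buffer and copy.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class BufferCheck {

    // Hypothetical helper (not BioJava API): wrap the stream only when the
    // top-level object is not already a BufferedOutputStream.
    static OutputStream ensureBuffered(OutputStream out) {
        return (out instanceof BufferedOutputStream) ? out : new BufferedOutputStream(out);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        OutputStream buffered = new BufferedOutputStream(sink);

        // Buffered at the top level: the same object comes back, no second buffer.
        System.out.println("reused as-is: " + (ensureBuffered(buffered) == buffered));

        // The buffered stream is hidden behind another wrapper, so instanceof
        // cannot see it and a second buffer is layered on top.
        OutputStream hidden = new FilterOutputStream(buffered);
        OutputStream wrapped = ensureBuffered(hidden);
        System.out.println("wrapped again: " + (wrapped != hidden));

        // Double buffering does not change what reaches the sink.
        wrapped.write(">seq1\nACGT\n".getBytes(StandardCharsets.US_ASCII));
        wrapped.flush();
        System.out.print(new String(sink.toByteArray(), StandardCharsets.US_ASCII));
    }
}
```

So the fluke can happen, but only when the caller passes a buffered stream wrapped in something else.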
TBH I'm more glad that we've got the speed back :).
Andy
On 30 Mar 2011, at 20:38, Scooter Willis wrote:
> Khalil
>
> For BioJava3 FastaWriter was simply using an OutputStream where its
> use was wrapped by FastaWriterHelper which was not using a
> BufferedOutputStream. I made changes to FastaWriter to check if the
> OutputStream is an instance of BufferedOutputStream and, if not, create
> one locally and then close it when returning. The writing of 10,000
> sequences or 4.5MB of data went from 15 seconds to 0.6 seconds. I
> checked in the code change if you want to test using your code.
>
> Thanks
>
> Scooter
>
> On Tue, Mar 29, 2011 at 5:47 PM, Khalil El Mazouari
> <khalil.elmazouari at gmail.com> wrote:
>> Hi
>> I am using netbeans profiler.
>> The total exec time was ± 20s (MacBook Pro i7, 4GB, SSD) for ± 10,000 seq.
>> By writing the RichSequence object to ByteArrayOutputStream -> FileChannel,
>> where appropriate, the total exec time dropped to 7s. Huge improvement for
>> the app I am developing. The app will be used to analyze ± 100,000 sequences
>> per run.
>> Regards,
>> khalil
>>
>> On 29 Mar 2011, at 22:13, Scooter Willis wrote:
>>
>> Instead of percentage metrics, can you get the time before and after the
>> write execution for comparison without profiling? What profiler are you
>> using?
>>
>> On Mar 28, 2011 5:39 PM, "Andy Yates" <ayates at ebi.ac.uk> wrote:
>>
>> Dang Rich :).
>>
>> At the moment we've not done anything WRT Genbank outputting but would
>> accept anything to help us out with this.
>>
>> As for the performance difference between BJ3 & BJ, what happens if you use
>> the writer objects directly with a BufferedOutputStream? Have you got
>> any profiling results? It would be very interesting to see where we've lost
>> the performance ...
>>
>> Andy
>>
>> On 28 Mar 2011, at 18:23, Richard Holland wrote:
>>
>>> In which case you've got little option but to r...
>>
>> --
>> Andrew Yates Ensembl Genomes Engineer
>> EMBL-EBI Tel: +44-(0)1223-492538
>> Wellcome Trust Genome Campus Fax: +44-(0)1223-494468
>> Cambridge CB10 1SD, UK http://www.ensemblgenomes.org/