Part 1: Optimizing on physical hardware
Introduction
For those who don't know it, Discovar is a variant caller and
small-genome assembler used in the life sciences. Given a reference
sequence, it turns the output from sequencers into entire genomes.
This is computationally very expensive, so I decided to take a look at
it under MAP, our OpenMP and MPI profiler.
Compiling and Running Discovar
As with many life sciences codes, downloading, compiling and running the Discovar benchmark was refreshingly straightforward:
# Build Discovar
$ wget ftp://ftp.broadinstitute.org/pub/crd/Discovar/latest_source_code/LATEST_VERSION.tar.gz
$ tar zxf LATEST_VERSION.tar.gz
$ cd discovar-*
$ ./configure
$ make -j32
# Download benchmark code
$ wget ftp://ftp.broadinstitute.org/pub/crd/Benchmark/data_only.tar.gz
$ tar zxf data_only.tar.gz
$ sed -i 's:Discovar:src/Discovar:' runme.sh
# Run benchmark
$ time ./runme.sh
The results seemed reasonable enough: the benchmark finished in 7.97 minutes with a peak memory usage of 5.6 GB. That would put our internal 24-core (with hyperthreading) server in the top 4 on the Broad Institute's benchmarking results page.
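The post does not say how the peak memory figure was collected. As a rough sketch, one way to capture both wall-clock time and peak memory for a run is GNU time's verbose report, since the shell's built-in `time` only reports timings. The helper names (`measure`, `peak_rss_kib`, `kib_to_gib`) and the `time.log` file are made up for illustration, and the sketch assumes GNU time is installed at /usr/bin/time.

```shell
#!/bin/sh
# Sketch only: assumes GNU time at /usr/bin/time; helper names are hypothetical.

# measure CMD [ARGS...]: run CMD under GNU time and write the verbose
# resource report (elapsed time, peak memory, ...) to time.log.
measure() {
    /usr/bin/time -v -o time.log "$@"
}

# peak_rss_kib FILE: print the "Maximum resident set size" value from a
# GNU time verbose report. GNU time reports this figure in kbytes.
peak_rss_kib() {
    awk -F': ' '/Maximum resident set size/ { print $2 }' "$1"
}

# kib_to_gib N: convert a KiB count to GiB, rounded to one decimal place.
kib_to_gib() {
    awk -v kib="$1" 'BEGIN { printf "%.1f\n", kib / 1048576 }'
}
```

A run would then look something like `measure ./runme.sh`, followed by `kib_to_gib "$(peak_rss_kib time.log)"` to print the peak memory in GiB. The elapsed time appears in the same log on the "Elapsed (wall clock) time" line.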