HPCC Systems Internship: Week 6 and 7

What was done?

1) Improved timings by identifying sources of runtime bottlenecks (matrix multiplication in Range Finder).

Current timings (100 components, 100 partitions, 100-node cluster):

10000 x 2000 - 2 min 51 sec

50000 x 10000 - 3 min 21 sec

100000 x 10000 - 6 min
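
To put these numbers in perspective, here is a small sketch that compares how the run time grows against how the matrix size grows, using only the three timings listed above. The function name `scaling` is just an illustrative helper, not part of the project code.

```python
# Timings reported above: (rows, cols, seconds)
timings = [
    (10_000, 2_000, 2 * 60 + 51),   # 2 min 51 sec
    (50_000, 10_000, 3 * 60 + 21),  # 3 min 21 sec
    (100_000, 10_000, 6 * 60),      # 6 min
]


def scaling(timings):
    """Return (size ratio, time ratio) of each run relative to the first run."""
    base_rows, base_cols, base_secs = timings[0]
    base_elems = base_rows * base_cols
    return [
        (rows * cols / base_elems, secs / base_secs)
        for rows, cols, secs in timings
    ]


for size_ratio, time_ratio in scaling(timings):
    print(f"{size_ratio:5.1f}x elements -> {time_ratio:4.2f}x time")
```

The run time grows much more slowly than the number of matrix elements, which may suggest that fixed overheads still account for a large share of the smaller runs. That is only a reading of three data points, not a measured breakdown.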

2) Experimented with sparse matrix multiplication in blocked format using the Eigen library.

Observations:

i) Using CSC format for storage reduces the cost of distributing blocks.

ii) After local multiplication, blocks are no longer sparse. This becomes a bottleneck: we get no benefit from sparse addition (axpy), and bringing sparse blocks in and out of C++ is time consuming.

Thus, this approach is not viable in its current state.
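
The two observations above can be illustrated with a minimal, self-contained sketch. It is plain Python rather than the Eigen/C++ code used in the experiments, and the block size (100 x 100), the 10% density, and all helper names are assumptions chosen only for illustration. It stores two random sparse blocks in CSC form, multiplies them, and compares how many numbers each representation has to hold before and after the multiplication.

```python
import random


def random_sparse(n_rows, n_cols, density, rng):
    """Dense list-of-lists block where each entry is nonzero with probability `density`."""
    return [[rng.uniform(1.0, 2.0) if rng.random() < density else 0.0
             for _ in range(n_cols)] for _ in range(n_rows)]


def to_csc(dense):
    """Convert a dense block to CSC arrays: (values, row indices, column pointers)."""
    values, row_idx, col_ptr = [], [], [0]
    for j in range(len(dense[0])):
        for i in range(len(dense)):
            if dense[i][j] != 0:
                values.append(dense[i][j])
                row_idx.append(i)
        col_ptr.append(len(values))
    return values, row_idx, col_ptr


def csc_matmul(a, b, n_rows):
    """Multiply two CSC blocks A and B; return the product as a dense block."""
    a_vals, a_rows, a_ptr = a
    b_vals, b_rows, b_ptr = b
    n_cols = len(b_ptr) - 1
    c = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        # Column j of C is a combination of the columns of A selected by column j of B.
        for p in range(b_ptr[j], b_ptr[j + 1]):
            k, b_kj = b_rows[p], b_vals[p]
            for q in range(a_ptr[k], a_ptr[k + 1]):
                c[a_rows[q]][j] += a_vals[q] * b_kj
    return c


def density(dense):
    """Fraction of entries that are nonzero."""
    total = len(dense) * len(dense[0])
    return sum(1 for row in dense for x in row if x != 0) / total


def csc_storage(csc):
    """Count of numbers held by a CSC block: values + row indices + column pointers."""
    values, row_idx, col_ptr = csc
    return len(values) + len(row_idx) + len(col_ptr)


n = 100                      # assumed block size
rng = random.Random(0)       # fixed seed so the run is repeatable
a = random_sparse(n, n, 0.1, rng)
b = random_sparse(n, n, 0.1, rng)
a_csc, b_csc = to_csc(a), to_csc(b)
c = csc_matmul(a_csc, b_csc, n)
c_csc = to_csc(c)

print("density of A:", density(a), " of A*B:", density(c))
print("numbers stored, dense block:", n * n)
print("numbers stored, CSC of A:", csc_storage(a_csc))
print("numbers stored, CSC of A*B:", csc_storage(c_csc))
```

For the input blocks, CSC holds far fewer numbers than the dense layout, which is where the saving in distribution cost comes from. The product block fills in, so its CSC form can end up holding more numbers than a dense block would, and the later additions no longer gain anything from being sparse.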

What needs to be done?

1) Continue working on sparse matrix multiplication and experiment with other approaches.

Monday, July 25, 2016
