I'm a student working on an older laptop without a dedicated GPU. I wanted to run some neural network experiments that required large matrix multiplications (2000x2000+), but standard numpy.dot (Intel MKL) was becoming a bottleneck for my iteration time.
I realized that for my specific inference tasks (and fuzzy clustering), I didn't need perfect FP32 precision.
I wrote a C99 kernel (wrapped in Python via ctypes) that uses Monte Carlo outer-product sampling. Instead of performing the full O(N^3) computation, it draws column/row index pairs uniformly at random and accumulates their scaled outer products. It uses OpenMP for parallelism and is laid out for L1/L2 cache locality.
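The sampling scheme is the uniform-probability variant of randomized approximate matrix multiplication. A minimal NumPy sketch of the idea (the actual C99 kernel is not reproduced here; `approx_matmul` and its signature are illustrative):

```python
import numpy as np

def approx_matmul(A, B, s, rng=None):
    """Monte Carlo approximation of A @ B.

    Draws s column/row index pairs uniformly (with replacement) and
    accumulates the scaled outer products A[:, k] * B[k, :].  Hypothetical
    pure-NumPy sketch of the sampling idea described above.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = A.shape[1]
    idx = rng.integers(0, n, size=s)  # uniform column sampling
    # Scale by n/s so the estimator is unbiased: E[result] = A @ B.
    return (n / s) * (A[:, idx] @ B[idx, :])
```

The error shrinks like 1/sqrt(s); sampling indices with probability proportional to `||A[:, k]|| * ||B[k, :]||` instead of uniformly reduces the variance at the cost of a norm-computation pass.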
The result: on my i5 machine it gets a ~4.1x speedup over NumPy at roughly 5-10% relative error (tunable via the sampling rate).
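To see how the sampling-rate knob behaves, here is a self-contained sweep (pure NumPy, no timing). One caveat: unstructured Gaussian inputs are close to a worst case for uniform sampling, so the absolute errors printed here will be far larger than the 5-10% band, which presumably reflects the structure of the author's actual data:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
exact = A @ B  # reference product

errors = {}
for fraction in (0.1, 0.25, 0.5, 1.0):
    s = int(fraction * n)                      # number of sampled outer products
    idx = rng.integers(0, n, size=s)           # uniform sampling with replacement
    approx = (n / s) * (A[:, idx] @ B[idx, :]) # unbiased scaled estimate
    errors[fraction] = (np.linalg.norm(approx - exact)
                        / np.linalg.norm(exact))
    print(f"fraction={fraction:.2f}  relative error={errors[fraction]:.3f}")
```

The relative error should fall roughly like 1/sqrt(fraction) as the sampling rate rises, which is the accuracy/speed trade the post describes.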
It's obviously not for scientific simulation or finance, but for stochastic ML approaches, it feels like a free hardware upgrade.
The binary is in the repo if you want to test the speedup. I'm curious if this approach (probabilistic BLAS) is used in production anywhere else.