Benchmarks

I was inspired by this benchmark. I am not sure how to do a 100% fair comparison across languages[1]. There is a small overhead for using PyCall and RCall; I checked in my experiments that it stays small (roughly a few milliseconds).
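As a rough way to gauge this overhead (a sketch, not the actual benchmark script), one can time a trivial call through each bridge:

```julia
using BenchmarkTools, PyCall, RCall

# Time trivial calls through RCall and PyCall: the results approximate the
# round-trip overhead added to any R/Python timing measured from Julia.
# (Illustrative check only; the original experiment may have differed.)
@btime R"invisible(1 + 1)";

math = pyimport("math")
@btime $math.sqrt(2.0);
```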

I only benchmark the Gaussian mixture case, since it is the most common type of mixture (remember that this package supports plenty of other mixtures).

In the code, I did not use (too many) fancy programming tricks; the speed comes from Julia itself, from the logsumexp! function of LogExpFunctions.jl, and from the fit_mle methods for each distribution provided by the Distributions.jl package.
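To make this concrete, here is a minimal sketch of one EM iteration for a univariate Gaussian mixture built from exactly those two ingredients (variable names are illustrative, and this is not the package's internal code; the out-of-place logsumexp is used instead of logsumexp! for brevity):

```julia
using Distributions, LogExpFunctions, Statistics

# One EM iteration for a univariate Gaussian mixture: `α` are the mixing
# weights, `dists` the current component distributions, `y` the data.
function em_step(y, α, dists)
    n, K = length(y), length(dists)
    # E-step: joint log-densities, then a numerically stable normalization
    LL = [log(α[k]) + logpdf(dists[k], y[i]) for i in 1:n, k in 1:K]
    c = logsumexp(LL; dims = 2)          # stable log-normalizing constants
    γ = exp.(LL .- c)                    # responsibilities, n × K
    # M-step: closed-form weighted MLE for each component via Distributions.jl
    α_new = vec(mean(γ; dims = 1))
    dists_new = [fit_mle(Normal, y, γ[:, k]) for k in 1:K]
    return α_new, dists_new
end
```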

Univariate Gaussian mixture with 2 components

I compare with Sklearn.py[2], mixtool.R and mixem.py[3]. I wanted to try mclust, but I did not manage to specify the initial conditions.
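For reference, here is a sketch of the benchmarked setting with ExpectationMaximization.jl, where the initial conditions are simply encoded in the starting MixtureModel (keyword options omitted; see the package documentation for the exact signature):

```julia
using Distributions, ExpectationMaximization, Random

Random.seed!(42)
# True model: univariate Gaussian mixture with 2 components
mix_true = MixtureModel([Normal(-1.0, 0.5), Normal(2.0, 1.0)], [0.3, 0.7])
y = rand(mix_true, 10_000)

# The initial conditions are just the parameters of the starting mixture
mix_guess = MixtureModel([Normal(-0.5, 1.0), Normal(1.0, 1.0)], [0.5, 0.5])
mix_mle = fit_mle(mix_guess, y)
```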

Overall, mixtool.R and mixem.py are constructed in a similar spirit to this package, which made them easy for me to use. Sklearn.py is built to match the Sklearn all-in-one format. GaussianMixturesModel.jl follows a similar all-in-one design.

If you have comments to improve these benchmarks, they are welcome.

You can find the benchmark code here.

Conclusion: for Gaussian mixtures, ExpectationMaximization.jl is about 2 to 10 times faster than the Python and R implementations tested here, and about as fast as the specialized Julia package GaussianMixturesModel.jl.

(Figure: timings for the univariate Gaussian mixture with K = 2 components.)

I also compared with R's microbenchmark and Python's timeit; they produced very similar timings, but in my experience BenchmarkTools.jl is smarter and simpler to use, i.e. it figures out on its own how many samples and evaluations to run.
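For instance, a single macro call replaces the hand-tuned repetition loops of timeit/microbenchmark (a sketch, reusing mix_guess and y from the example above):

```julia
using BenchmarkTools

# BenchmarkTools chooses the number of samples and evaluations on its own;
# `$` interpolation avoids measuring global-variable access.
@benchmark fit_mle($mix_guess, $y)
```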

Last, the step-by-step likelihood reported by Sklearn is not the same as the one output by ExpectationMaximization.jl and mixtool.R (which agree with each other), so I am a bit suspicious.
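One simple cross-check (a sketch, assuming the fit from the earlier example) is to compare the final log-likelihood each implementation reaches on the same data:

```julia
using Distributions

# Final log-likelihood of the fitted mixture on the data; if implementations
# converge to the same optimum, these numbers should closely agree.
loglikelihood(mix_mle, y)
```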

  • [1] Note that @btime with RCall and PyCall might add a small timing overhead compared to the pure R/Python timings; see here for an example.
  • [2] A suspicious warning regarding K-means is triggered, even though I do not want to use K-means here. I asked a question here; it led to this issue and that PR. It turns out that even when initial conditions were provided, K-means was still computed. However, to this day (2023-11-29), with scikit-learn 1.3.2, the warning still appears. Maybe the fix will be in the next release? I also noted this recent PR.
  • [3] It overflows very quickly for $n > 500$ or so; I think this is because of its implementation of logsumexp (see the sketch below). So I eventually did not include its results in the benchmark.
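Regarding footnote 3, the overflow is easy to reproduce: a naive logsumexp evaluates exp of large inputs directly, while the standard max-shifted version used by LogExpFunctions.jl stays finite.

```julia
using LogExpFunctions

x = fill(1000.0, 3)
log(sum(exp.(x)))  # Inf: exp(1000) overflows Float64
logsumexp(x)       # ≈ 1001.0986 (= 1000 + log(3)), computed stably
```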