Comparison with other packages
This benchmark was inspired by this post. The full benchmark code is available as a Jupyter notebook and here.
Scope and limitations of competing packages
A key distinction of ExpectationMaximization.jl is its genericity: it works with any mixture of distributions supported by Distributions.jl (univariate, multivariate, continuous, discrete, or custom), without any modification to the core algorithm. The competing packages benchmarked here are, in contrast, largely restricted to Gaussian mixtures:
| Package | Language | Gaussian only? | Notes |
|---|---|---|---|
Sklearn | Python | Yes | Hardcoded Gaussian[2] |
mixtools | R | Mostly | Supports some other families but not extensible |
mixem | Python | Mostly | Numerically fragile[3] |
GaussianMixtures.jl | Julia | Yes | Hardcoded Gaussian |
ExpectationMaximization.jl | Julia | No | Any Distributions.jl distribution |
The benchmark below only tests the Gaussian mixture case (the most common), which is deliberately the strongest case for the specialized packages. Despite this, ExpectationMaximization.jl remains highly competitive.
Why is ExpectationMaximization.jl fast?
No heavy programming tricks are used. The performance comes from standard Julia best practices:
- E-step: pre-allocated memory,
@views, type-stable code, andlogsumexp!fromLogExpFunctions.jl. - M-step: delegates to
fit_mlefromDistributions.jl, which is well-optimized for each distribution (e.g., see the Multivariate Normal implementation).
Many more optimizations are possible, however, I'd like to keep the code as simple and readable as possible.
Results
Or the ratio view:
Conclusion: for Gaussian mixtures, ExpectationMaximization.jl is about 4× faster than Sklearn (Python) and 7× faster than mixtools (R), while being only slightly slower than the Gaussian-specialized GaussianMixtures.jl. Crucially, unlike all competing packages, ExpectationMaximization.jl handles arbitrary mixture distributions out of the box.
If you have comments to improve these benchmarks, they are welcome.
Cross-language comparisons are inherently imperfect[1]. PythonCall.jl and RCall.jl introduce a small overhead (~few milliseconds), which was verified to be negligible here.
- 1
@btimewithRCall.jlandPythonCall.jlmay add a small overhead; see this discussion. Timings were cross-checked againstRmicrobenchmarkand Pythontimeit, which gave consistent results.BenchmarkTools.jlautomatically determines the number of repetitions needed for a reliable estimate. - 2
Sklearn'sGaussianMixtureused to run K-means initialization even when initial conditions are explicitly provided — see this discussion, issue, and PR. I should be fix by now. - 3
mixemoverflows for $n \gtrsim 500$ due to a fragilelogsumexpimplementation and was excluded from the benchmark.