Bibliography

Theory

The EM algorithm was introduced by A. P. Dempster, N. M. Laird and D. B. Rubin in 1977 in the reference paper Maximum Likelihood from Incomplete Data via the EM Algorithm. It is a very generic algorithm, working for almost any distribution. I also added the stochastic version introduced by G. Celeux and J. Diebolt in 1985 in The SEM Algorithm: A probabilistic teacher algorithm derived from the EM algorithm for the mixture problem. Other versions can be added; PRs are welcome.

Implementations

Despite the algorithm being generic, to my knowledge, almost all implementations are specific to some mixture class (mostly Gaussian mixtures, sometimes double exponential or Bernoulli mixtures).

In this package, thanks to Julia's generic code philosophy, one can code the algorithm once, and it works for all distributions.
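For instance, here is a minimal sketch of the idea (the parameter values are illustrative, and keyword options of fit_mle are omitted): the same call fits a mixture whatever the component distributions are, as long as they come from Distributions.jl.

```julia
using Distributions, ExpectationMaximization

# A mixture of two different component families from Distributions.jl
mix_true = MixtureModel([Exponential(2.0), Normal(10.0, 1.0)], [0.3, 0.7])
y = rand(mix_true, 10_000)

# Initial guess: only the component types are fixed, parameters are free
mix_guess = MixtureModel([Exponential(1.0), Normal(5.0, 2.0)], [0.5, 0.5])

# The same generic EM call works regardless of the component distributions
mix_fit = fit_mle(mix_guess, y)
```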

I know of the Python mixem package, which also uses a generic implementation of the algorithm. However, the available distribution choice is very limited, as the authors have to define each distribution themselves (a top-down approach). This package does not define distributions[1]; it simply uses the Distribution type and what is in Distributions.jl.

In Julia, there is the GaussianMixtures.jl package that also does EM. It seems a little faster than my implementation when used with Gaussian mixtures (I'd like to understand what causes this difference, though; maybe its in-place operations, while fit_mle creates copies). However, I am not sure it is still maintained.

Have a look at the benchmark section for some comparisons.

I was inspired by Florian Oswald's page and Maxime Mouchet's HMMBase.jl package.

[1] I added fit_mle methods for Product distributions and for weighted Laplace and Dirac distributions (see the sketch below). I am doing PRs to merge these directly into the Distributions.jl package.
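To illustrate those additions, here is a hedged sketch, assuming the weighted method follows the usual Distributions.jl convention fit_mle(D, x, w):

```julia
using Distributions, ExpectationMaximization

x = rand(Laplace(0.0, 1.5), 1_000)   # samples from a Laplace distribution
w = rand(1_000)                      # arbitrary nonnegative weights

# Weighted MLE for Laplace, one of the methods added by this package
d = fit_mle(Laplace, x, w)
```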