ExpectationMaximization.jl

This package provides a simple implementation of the Expectation Maximization (EM) algorithm used to fit mixture models. Thanks to Julia's dispatch system, its culture of generic and reusable code, and the Distributions.jl package, the code is very generic while remaining expressive and fast[1]!

What type of mixtures?

In particular, it works with a wide range of mixtures:

  • Mixture of Univariate continuous distributions
  • Mixture of Univariate discrete distributions
  • Mixture of Multivariate distributions (continuous or discrete)
  • Mixture of mixtures (univariate or multivariate and continuous or discrete)
  • User defined mixtures (e.g. custom distributions)
  • More?

How?

Just define a mix::MixtureModel and do fit_mle(mix, y) where y is your observation array (vector or matrix). That's it! For Stochastic EM, just do fit_mle(mix, y, method = StochasticEM()). Take a look at the Examples section.
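
For instance, here is a minimal sketch; the component families, parameter values, and sample size are illustrative choices, not package defaults:

```julia
using Distributions, ExpectationMaximization

# Sample from a "true" mixture we pretend not to know.
y = rand(MixtureModel([Normal(-2, 1), Normal(3, 2)], [0.4, 0.6]), 10_000)

# Initial guess: the EM algorithm starts from this mixture.
mix_guess = MixtureModel([Normal(-1, 1), Normal(1, 1)], [0.5, 0.5])

mix_fitted = fit_mle(mix_guess, y)                           # classic EM
mix_sem    = fit_mle(mix_guess, y, method = StochasticEM())  # stochastic EM
```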

To work, the only requirements are that each component dist ∈ dists = components(mix) of the mixture (custom or coming from an existing package):

  1. Is a subtype of Distribution, i.e., dist <: Distribution.
  2. Has logpdf(dist, y) defined (it is used in the E-step).
  3. Has fit_mle(dist, y, weights) returning the distribution with the updated parameters maximizing the likelihood. This is used in the M-step of the ClassicEM algorithm. For the StochasticEM version, only fit_mle(dist, y) is needed. Both the Type and instance versions of fit_mle for your dist are accepted thanks to this conversion line.

In general, step 2 is easy, while step 3 is only known explicitly for a few common distributions. If fit_mle is not known explicitly, you can always implement a numerical scheme, if one exists, for fit_mle(dist, y) (see the Gamma distribution example) or use tools like Optimization.jl. Alternatively, when possible, represent your “difficult” distribution as a mixture of simpler terms. (I had this in mind, but it is not directly a mixture model.)
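
To make requirements 1–3 concrete, here is a hedged sketch of a custom one-parameter distribution. MyExponential is a made-up name for illustration, and the closed-form weighted MLE below is specific to this density:

```julia
using Distributions
import Distributions: logpdf, fit_mle

# 1. A subtype of Distribution.
struct MyExponential{T<:Real} <: ContinuousUnivariateDistribution
    θ::T
end

# 2. logpdf, used in the E-step.
logpdf(d::MyExponential, y::Real) = y < 0 ? -Inf : -log(d.θ) - y / d.θ

# 3. Weighted fit_mle, used in the M-step (closed form for this density).
function fit_mle(::Type{<:MyExponential}, y::AbstractVector{<:Real}, w::AbstractVector{<:Real})
    return MyExponential(sum(w .* y) / sum(w))
end
```

Components of this type can then be placed inside a MixtureModel and fitted as above.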

Note

Distributions.jl currently does not allow MixtureModel to have both discrete and continuous components[2].

Algorithms

So far, the classic EM algorithm (ClassicEM()) and the stochastic EM (StochasticEM()) are implemented. Look at the Bibliography section for references.

Main function

Warning

To fit the mixture, use the “instance” version of fit_mle(mix::MixtureModel, ...) as described below and NOT the “Type” version, i.e., fit_mle(Type{MixtureModel}, ...). The provided mix is used as the starting point of the EM algorithm. See Instance vs Type version section for more context.

Distributions.fit_mle — Method
fit_mle(mix::MixtureModel, y::AbstractVecOrMat, weights...; method = ClassicEM(), display=:none, maxiter=1000, atol=1e-3, rtol=nothing, robust=false, infos=false)

Use an Expectation Maximization (EM) algorithm to maximize the loglikelihood of (i.e., fit) the mixture given an i.i.d. sample y. The mix input is a mixture that is used to initialize the EM algorithm.

  • weights: when provided, a weighted version of the EM is computed (useful for fitting mixtures of mixtures).
  • method determines the algorithm used.
  • infos = true returns a Dict with information on the algorithm (converged, iteration number, loglikelihood).
  • robust = true will prevent the (log)likelihood from overflowing to -∞ or +∞.
  • atol: criterion determining the convergence of the algorithm. If the loglikelihood difference between two iterations i and i+1 is smaller than atol, i.e., |ℓ⁽ⁱ⁺¹⁾ - ℓ⁽ⁱ⁾| < atol, the algorithm stops.
  • rtol: relative tolerance for convergence, |ℓ⁽ⁱ⁺¹⁾ - ℓ⁽ⁱ⁾| < rtol*(|ℓ⁽ⁱ⁺¹⁾| + |ℓ⁽ⁱ⁾|)/2 (not checked if rtol is nothing).
  • display can be :none, :iter, or :final, to display the loglikelihood evolution at each iteration (:iter) or just the final one (:final).
source
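
Here is a sketch of the keyword options in use, with mix_guess and y as in the earlier sketch; it assumes that infos = true makes fit_mle return the fitted mixture together with the information Dict:

```julia
fitted, info = fit_mle(mix_guess, y;
                       method  = ClassicEM(),
                       display = :iter,  # print the loglikelihood at each iteration
                       maxiter = 500,
                       atol    = 1e-5,
                       robust  = true,   # keep the loglikelihood away from ±∞
                       infos   = true)   # assumed to also return the Dict described above
```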
Distributions.fit_mle — Method
fit_mle(mix::AbstractArray{<:MixtureModel}, y::AbstractVecOrMat, weights...; method = ClassicEM(), display=:none, maxiter=1000, atol=1e-3, rtol=nothing, robust=false, infos=false)

Performs the same as fit_mle for each of the (initial) mixtures in the mix array, then selects the fit with the largest loglikelihood. Warning: it uses try/catch to avoid error messages in case the EM converges toward a singular solution (using robust should probably be enough in most cases to avoid errors).

source
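
A sketch of trying several starting points and keeping the best fit, reusing the illustrative y from above:

```julia
mix_guesses = [
    MixtureModel([Normal(-1, 1), Normal(1, 1)], [0.5, 0.5]),
    MixtureModel([Normal(-5, 2), Normal(5, 2)], [0.3, 0.7]),
]
best_fit = fit_mle(mix_guesses, y, robust = true)  # keeps the largest-loglikelihood fit
```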

Utilities

ExpectationMaximization.predict — Function
predict(mix::MixtureModel, y::AbstractVector; robust=false)

Evaluate the most likely category for each observation given a MixtureModel.

  • robust = true will prevent the (log)likelihood from overflowing to -∞ or +∞.
source
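
For instance, with mix_fitted and y from the earlier sketch:

```julia
labels = predict(mix_fitted, y)  # component index (here 1 or 2) for each observation
```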
ExpectationMaximization.predict_proba — Function
predict_proba(mix::MixtureModel, y::AbstractVecOrMat; robust=false)

Evaluate the probability of each observation belonging to each category given a MixtureModel.

  • robust = true will prevent the (log)likelihood from under/overflowing to -∞ or +∞.
source
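
For instance, with mix_fitted and y as before (the exact orientation of the output array is an assumption here):

```julia
probs = predict_proba(mix_fitted, y)  # probability of each observation belonging to each component
```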

fit_mle methods that should be in Distributions.jl

I opened two PRs, PR#1670 and PR#1676, to add these methods.

Distributions.fit_mle — Method
fit_mle(g::Product, x::AbstractMatrix)
fit_mle(g::Product, x::AbstractMatrix, γ::AbstractVector)

The fit_mle for a multivariate Product distribution g is the product_distribution of the fit_mle of each component of g. Product is meant to be deprecated in future versions of Distributions.jl. Use the analog VectorOfUnivariateDistribution type instead.

source
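
A sketch, assuming a Distributions.jl version where product_distribution of univariate components still returns a Product:

```julia
using Distributions

g = product_distribution([Normal(0, 1), Exponential(1)])
x = rand(g, 1000)      # 2 × 1000 matrix, one observation per column
g_fit = fit_mle(g, x)  # fits each marginal independently
```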
Distributions.fit_mle — Method
fit_mle(::Type{<:Dirac}, x::AbstractArray{<:Real}[, w::AbstractArray{<:Real}])

fit_mle for the Dirac distribution on (weighted or unweighted) data sets.

source
Distributions.fit_mle — Method
fit_mle(::Type{<:Laplace}, x::AbstractArray{<:Real}, w::AbstractArray{<:Real})

fit_mle for the Laplace distribution on weighted data sets.

source
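
For instance, with illustrative data and weights:

```julia
using Distributions

# Weighted Laplace MLE (location is the weighted median, scale the weighted mean absolute deviation).
x = randn(100)
w = rand(100)
fit_mle(Laplace, x, w)

# Dirac MLE on a constant sample.
fit_mle(Dirac, fill(2.0, 10))
```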

Footnotes

  • [1] Have a look at the Benchmark section.
  • [2] Rain is a good example of a mixture having both a discrete (a Delta distribution at 0) and a continuous (Exponential, Gamma, ...) component.