ExpectationMaximization.jl

This package provides a simple implementation of the Expectation Maximization (EM) algorithm used to fit mixture models. Due to Julia amazing dispatch systems, generic and reusable code spirit, and the Distributions.jl package, the code while being very generic is both very expressive and fast[1]!

What type of mixtures?

In particular, it works on a lot of mixtures:

  • Mixture of Univariate continuous distributions
  • Mixture of Univariate discrete distributions
  • Mixture of Multivariate distributions (continuous or discrete)
  • Mixture of mixtures (univariate or multivariate and continuous or discrete)
  • More?

How?

Just define a mix::MixtureModel and do fit_mle(mix, y) with your data y and that's it! Have a look at the examples section.

To work, the only requirements are that the components of the mixture dist ∈ dists = components(mix) considered (custom or coming from an existing package)

  1. Are a subtype of Distribution i.e. dist<:Distribution.
  2. The logpdf(dist, y) is defined (it is used in the E-step)
  3. The fit_mle(dist, y, weigths) returns the distribution with parameters equals to MLE. This is used in the M-step of the ClassicalEM algorithm. For the StocasticEM version, only fit_mle(dist, y) is needed. Type or instance version of fit_mle for your dist are accepted thanks to this conversion line.

In general, 2. is easy, while 3. is only known explicitly for a few common distributions. In case 2. is not explicit known, you can always implement a numerical scheme, if it exists, for fit_mle(dist, y) see Gamma distribution example. Or, when possible, represent your “difficult” distribution as a mixture of simple terms. (I had this in mind, but it is not directly a mixture model.)

Note

Distributions.jl currently does not allow MixtureModel to both have discrete and continuous components[2].

Algorithms

Main function

Warning

To fit the mixture, use the “instance” version of fit_mle(mix::MixtureModel, ...) as described below and NOT the “Type” version, i.e., fit_mle(Type{MixtureModel}, ...). The provided mix is used as the starting point of the EM algorithm. See Instance vs Type version section for more context.

Distributions.fit_mleMethod
fit_mle(mix::MixtureModel, y::AbstractVecOrMat, weights...; method = ClassicEM(), display=:none, maxiter=1000, atol=1e-3, rtol=nothing, robust=false, infos=false)

Use the an Expectation Maximization (EM) algorithm to maximize the Loglikelihood (fit) the mixture with an i.i.d sample y. The mix input is a mixture that is used to initilize the EM algorithm.

  • weights when provided, it will compute a weighted version of the EM. (Useful for fitting mixture of mixtures)
  • method determines the algorithm used.
  • infos = true returns a Dict with informations on the algorithm (converged, iteration number, loglikelihood).
  • robust = true will prevent the (log)likelihood to overflow to -∞ or .
  • atol criteria determining the convergence of the algorithm. If the Loglikelihood difference between two iteration i and i+1 is smaller than atol i.e. |ℓ⁽ⁱ⁺¹⁾ - ℓ⁽ⁱ⁾|<atol, the algorithm stops.
  • rtol relative tolerance for convergence, |ℓ⁽ⁱ⁺¹⁾ - ℓ⁽ⁱ⁾|<rtol*(|ℓ⁽ⁱ⁺¹⁾| + |ℓ⁽ⁱ⁾|)/2 (does not check if rtol is nothing)
  • display value can be :none, :iter, :final to display Loglikelihood evolution at each iterations :iter or just the final one :final
source
Distributions.fit_mleMethod
fit_mle(mix::AbstractArray{<:MixtureModel}, y::AbstractVecOrMat, weights...; method = ClassicEM(), display=:none, maxiter=1000, atol=1e-3, rtol=nothing, robust=false, infos=false)

Do the same as fit_mle for each (initial) mixtures in the mix array. Then it selects the one with the largest loglikelihood. Warning: It uses try and catch to avoid errors messages in case EM converges toward a singular solution (probably using robust should be enough in most case to avoid errors).

source

Utilities

ExpectationMaximization.predictFunction
predict(mix::MixtureModel, y::AbstractVector; robust=false)

Evaluate the most likely category for each observations given a MixtureModel.

  • robust = true will prevent the (log)likelihood to overflow to -∞ or .
source
ExpectationMaximization.predict_probaFunction
predict_proba(mix::MixtureModel, y::AbstractVecOrMat; robust=false)

Evaluate the probability for each observations to belong to a category given a MixtureModel..

  • robust = true will prevent the (log)likelihood to under(overflow)flow to -∞ (or ).
source

fit_mle methods that should be in Distribution.jl

I opened two PRs, PR#1670 and PR#1676 to add these methods.

Distributions.fit_mleMethod
fit_mle(g::Product, x::AbstractMatrix)
fit_mle(g::Product, x::AbstractMatrix, γ::AbstractVector)

The fit_mle for multivariate Product distributions g is the product_distribution of fit_mle of each components of g. Product is meant to be depreacated in next version of Distribution.jl. Use the analog VectorOfUnivariateDistribution type instead.

source
Distributions.fit_mleMethod
fit_mle(::Type{<:Dirac}, x::AbstractArray{<:Real}[, w::AbstractArray{<:Real}])

fit_mle for Dirac distribution (weighted or not) data sets.

source
Distributions.fit_mleMethod
fit_mle(::Type{<:Laplace}, x::AbstractArray{<:Real}, w::AbstractArray{<:Real})

fit_mle for Laplace distribution weighted data sets.

source

Index

  • 1Have a look at the Benchmark section.
  • 2Rain is a good example of mixture having both a discrete (Delta distribution in 0) and continuous (Exponential, Gamma, ...) component.