ARSampling

Documentation for ARSampling.

This package implements univariate adaptive rejection sampling as specified by Gilks & Wild [1].

Showcase

ARSampling.jl allows you to sample from unnormalized logarithmically concave distributions. In the following example we will sample from an unnormalized normal distribution.

using ARSampling: ARSampler, Objective, sample!
using CairoMakie, Distributions, Random
set_theme!(theme_minimal())
update_theme!(linewidth = 6)
update_theme!(fonts = (; regular = "DejaVu Sans"))
Random.seed!(1)

Let's use a normal distribution centered at $\pi$ with a standard deviation of 2.

const mu::Float64 = π
const sigma::Float64 = 2.0

# Define a normal distribution without the normalizing terms.
f(x) = exp(-((x - mu)^2) / (2 * sigma^2))

# Use the normal distribution from Distributions.jl for verification.
normal(x) = pdf(Normal(mu, sigma), x)

l, u = mu - sigma * 4, mu + sigma * 4
fig, ax, p = lines(l..u, f, label = "Unnormalized density")
lines!(ax, l..u, normal, label = "Actual normal distribution")
axislegend(ax)
Example block output

Let's imagine we did not know the actual normalizing constant for the normal distribution above. In order to sample from it using ARSampling.jl, we first define an objective function using ARSampling.Objective. By default, automatic differentiation through ForwardDiff.jl is used to calculate the derivative of the function to be sampled from.

First, we need to supply the function in its log form. For illustration purposes, we simply take log(f(x)).

f_log(x) = log(f(x))
obj = Objective(f_log)
ARSampling.Objective{typeof(Main.f_log), ARSampling.var"#3#4"{ADTypes.AutoForwardDiff{nothing, Nothing}, typeof(Main.f_log), …}, Float64}(Main.f_log, …)
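Adaptive rejection sampling requires the target log-density to be concave. Before trusting the sampler, we can sanity-check this numerically; the sketch below is plain Julia, independent of ARSampling's API, and redefines f_log inline so it is self-contained.

```julia
# Sanity check (a sketch, not part of ARSampling): ARS requires a concave
# log-density, i.e. a non-increasing derivative. We check the slope with a
# central finite difference over the region of interest.
mu, sigma = float(pi), 2.0
f_log(x) = -((x - mu)^2) / (2 * sigma^2)   # log of the unnormalized normal

h = 1e-5
df(x) = (f_log(x + h) - f_log(x - h)) / (2h)   # numerical first derivative

xs = range(mu - 4sigma, mu + 4sigma; length = 201)
slopes = df.(xs)
@assert all(diff(slopes) .<= 1e-8)   # concave: slopes never increase
```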

We then define the sampler itself, providing the initial points for the hull as well as the desired domain. The initial points need to lie on opposite sides of the function's maximum.

sam = ARSampler(obj, (-Inf, Inf), [-2., 4.])
ARSampling.ARSampler{Float64, typeof(Main.f_log), …}(ARSampling.Objective{typeof(Main.f_log), …, Float64}(Main.f_log, …), ARSampling.UpperHull{Float64}([-0.7337005501361697, 0.7662994498638303], [1.2853981633974483, -0.21460183660255172], [-Inf, 1.0, Inf], [-2.0, 4.0], [1.350708254883096, 8.09032176797251], (-Inf, Inf)), ARSampling.LowerHull{Float64}([-2.2337005501361697], [0.5353981633974483], [-2.0, 4.0]))
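The UpperHull printed above can be reproduced by hand. In Gilks & Wild's construction, tangent lines to the concave log-density at the abscissae form a piecewise-linear upper envelope, and neighbouring tangents intersect at the segment boundaries. The sketch below is self-contained plain Julia (not ARSampling's API), using the analytic derivative of our quadratic log-density.

```julia
# Reconstructing the upper hull by hand (a sketch; not ARSampling's API).
mu, sigma = float(pi), 2.0
h(x)  = -((x - mu)^2) / (2 * sigma^2)   # concave log-density
dh(x) = -(x - mu) / sigma^2             # analytic derivative

x1, x2 = -2.0, 4.0                       # initial abscissae from the example
tangent(x0) = x -> h(x0) + dh(x0) * (x - x0)

@assert isapprox(dh(x1), 1.2853981633974483)   # slope printed in the hull

# Neighbouring tangents intersect where their values agree:
z = (h(x2) - h(x1) + x1 * dh(x1) - x2 * dh(x2)) / (dh(x1) - dh(x2))
@assert isapprox(z, 1.0; atol = 1e-9)    # segment boundary printed in the hull

# Each tangent lies above the concave log-density, so their pointwise minimum
# is a valid piecewise-linear upper envelope:
env(x) = min(tangent(x1)(x), tangent(x2)(x))
@assert all(env(x) >= h(x) - 1e-9 for x in range(-10, 10; length = 401))
```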

To retrieve samples, we use the ARSampling.sample! function. As indicated by the !, this function mutates the sampler: whenever a proposal is rejected, the hulls are refined so that future sampling improves.

samples = sample!(sam, 1000);
1000-element Vector{Float64}:
  1.4397132993142485
 -0.8943148644045693
  2.5520758369708516
  2.355764543594075
  3.635438914948396
  3.0166969586443373
  7.5272435031658045
  2.1511992854849606
  3.773502023868765
  2.1940148057903173
  ⋮
  5.222436492142635
  1.7849065876608339
  2.062106934987589
  2.0475243471820423
  1.746592544515179
  3.768018078244203
  7.974371834490299
  2.5593187163703086
  0.5746858116786866

Let's compare the samples drawn from the sampler to the actual target distribution.

fig, ax, p = hist(samples, bins=100, normalization = :pdf, label = "Samples", axis = (; title="1000 samples"))
lines!(l..u, normal, label = "Target distribution", color = :orange, linewidth = 3, alpha = 0.8)
axislegend(ax)
Example block output

Let's draw more samples to verify that the histogram actually approaches the target.

Example block output

Truncated density

Let's try a truncated density, such as the beta distribution.

As before, we define the distribution without its normalizing terms as well as a function from Distributions.jl to compare with.

const alpha::Float64 = 2.5
const beta::Float64 = 5.0

f(x) = x^(alpha - 1) * (1 - x)^(beta - 1)

beta_proper(x) = pdf(Beta(alpha, beta), x)

l, u = 0, 1
fig, ax, p = lines(l..u, f, label="Unnormalized density")
lines!(ax, l..u, beta_proper, label = "Actual beta distribution")
axislegend(ax)
Example block output

Let's define the objective and sampler. As the beta distribution is bounded on $x \in [0, 1]$, we specify this in the domain argument when creating the sampler.

f_log(x) = log(f(x))
obj = Objective(f_log)

sam = ARSampler(obj, (0.0, 1.0), [0.2, 0.8])
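As with the normal example, ARS relies on the log-density being concave. For the beta family this holds whenever both shape parameters are at least 1, which we can confirm for our parameters with a short check (plain Julia, not part of ARSampling's API):

```julia
# log f(x) = (alpha - 1) * log(x) + (beta - 1) * log(1 - x) on (0, 1); its
# second derivative is -(alpha - 1)/x^2 - (beta - 1)/(1 - x)^2, which is
# nonpositive whenever alpha >= 1 and beta >= 1.
alpha, beta = 2.5, 5.0
d2(x) = -(alpha - 1) / x^2 - (beta - 1) / (1 - x)^2
@assert all(d2(x) <= 0 for x in range(0.01, 0.99; length = 99))
```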

Let's visualize the samples as before.

samples = sample!(sam, 100000)

fig, ax, p = hist(samples, bins=100, label = "Samples", normalization = :pdf)
lines!(ax, 0..1, beta_proper, label = "Target", color = Cycled(2))
axislegend(ax)
Example block output

Of course, we can specify arbitrary bounds. Let's define the same beta distribution as above but truncated to $[0.5, 1.0]$, and compare it with samples from a truncated Distributions.jl distribution (plotting code hidden for brevity). Since the domain lies entirely to the right of the density's maximum, we only need to specify one initial point.

sam_bounded = ARSampler(obj, (0.5, 1.0), [0.8])

samples_bounded = sample!(sam_bounded, 100000)

dist_bounded = truncated(Beta(alpha, beta), lower=0.5)

samples_true = rand(dist_bounded, 100000)
Example block output
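For reference, the normalizing constant of the truncated density can also be obtained by simple numerical integration. The sketch below is plain Julia with a small trapezoidal-rule helper (trapz is a hypothetical helper defined inline for illustration, not part of ARSampling or Distributions.jl):

```julia
# Normalize the unnormalized Beta(2.5, 5) density over the truncated domain
# [0.5, 1] with a simple trapezoidal rule.
alpha, beta = 2.5, 5.0
f(x) = x^(alpha - 1) * (1 - x)^(beta - 1)

function trapz(g, a, b; n = 10_000)
    xs = range(a, b; length = n + 1)
    step(xs) * (sum(g, xs) - (g(a) + g(b)) / 2)
end

Z = trapz(f, 0.5, 1.0)            # normalizing constant on [0.5, 1]
density(x) = f(x) / Z
@assert isapprox(trapz(density, 0.5, 1.0), 1.0; atol = 1e-8)
```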

BigFloat

Sometimes unnormalized distributions may run the risk of underflowing even with the range provided by Float64. One example arises in an infinite mixture model [2], in the conditional distribution of the parameter $\alpha$. This conditional distribution depends on the number of clusters, $k$, and the number of data points, $n$. For low $k$ and $n$, sampling works fine, as shown below.

using ARSampling: ARSampler, Objective, sample!
using CairoMakie, Distributions, Random, SpecialFunctions, DifferentiationInterface
# Conditional log-density for alpha, parameterized by x = log(alpha).
function log_alpha(x, k, n)
    alpha = exp(x)
    x +
        x * (k - 3 / 2) +
        (-1 / (2 * alpha)) +
        loggamma(alpha) -
        loggamma(n + alpha)
end

obj = Objective(x -> log_alpha(x, 3, 15))

sam = ARSampler(obj, (-Inf, Inf), [-0.2, 0.2])
samples = sample!(sam, 10000)
Example block output

However, for high numbers of clusters or data points, we run into issues.

x = log_alpha(0.3, 9, 400)
println("x = ", x)
println("exp(x) = ", exp(x))
x = -2000.5330884437142
exp(x) = 0.0
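The failure above is plain Float64 underflow: the smallest positive Float64 is on the order of 5e-324, so exp of a sufficiently negative argument rounds to exactly 0.0. As a general-numerics aside (not a description of ARSampling's internals), quantities like this stay well-scaled if we only ever exponentiate differences of log-values, as in the log-sum-exp trick:

```julia
# exp underflows for very negative arguments ...
@assert exp(-2000.5) == 0.0
@assert exp(-700.0) > 0.0        # still representable in Float64

# ... but differences of log-values remain well-scaled:
logs = [-2000.5, -2003.1, -1999.8]
m = maximum(logs)
weights = exp.(logs .- m)        # largest entry is exp(0) == 1.0
@assert weights[3] == 1.0
@assert all(weights .> 0)
```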

Using BigFloat "solves" the issue:

x = log_alpha(big(0.3), big(9), big(400))
println("x = ", x)
println("exp(x) = ", exp(x))
x = -2000.533088443713818557964408059791509580231066910879044224059543663174914610449
exp(x) = 1.511885228972232997754798713799356221249465829564221431136166725121503068207777e-869
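A couple of BigFloat details worth knowing (general Julia behaviour, not specific to ARSampling): the default precision is 256 bits, roughly 77 decimal digits, and can be changed with setprecision; and big(0.3) converts the already-rounded Float64 value 0.3, while the string macro big"0.3" parses the literal at full BigFloat precision.

```julia
@assert precision(BigFloat) == 256   # default precision in bits

# big(0.3) inherits Float64's binary rounding error; big"0.3" does not.
@assert big(0.3) != big"0.3"

# The quantity that underflowed in Float64 is representable as a BigFloat:
@assert exp(big(-2000.5)) > 0
```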

To sample using BigFloat, we pass the element type to the Objective constructor (via one(BigFloat)) and supply BigFloat bounds and initial points to the sampler.

obj = Objective(x -> log_alpha(x, 9, 400), one(BigFloat))

sam = ARSampler(obj, big.((-Inf, Inf)), big.([-0.2, 0.5]))
samples = sample!(sam, 10000)

...And then plot the result. Note that I had to scale the values of the unnormalized distribution by m below to get Makie to plot them. The fit in this plot (and the one above) should therefore be taken with a grain of salt, but at least the shapes seem to match.

fig, ax, p = hist(samples, bins=100, normalization = :pdf, axis = (; title="k = 9, n = 400"))
ax2 = Axis(fig[1,1], yaxisposition = :right)
hidespines!(ax2)
hidexdecorations!(ax2)
linkxaxes!(ax, ax2)

m = BigFloat("6.426512129038265565810544883751205877443549713389270459189622783861198714057871e+868")
ex = [exp(log_alpha(big(x), big(9), big(400))) * m for x in -3:0.01:3]

lines!(ax2, -3:0.01:3, ex, color = :orange, alpha = 0.8)
Example block output