Optimisation interface reworking #2634

@penelopeysm

There are a number of improvements that can/should be made to Turing's optimisation interface. This issue is mainly to collect sub-issues on this topic, and to introduce what is going on.

Summary and where we are now (as of v0.42)

Broadly speaking, the entire optimisation interface needs a complete overhaul. The current interface is deeply unsatisfying, which stems from the difference between the representation of a Turing model (named parameters, unlinked space) vs the representation of a function passed to Optimization.jl (a single parameter vector, which may be in linked or unlinked space). The current interface tries too hard to provide what Optimization.jl wants, and in the process makes it very difficult for the user to specify meaningful inputs such as initial values or constraints.

For example, initial values must be specified for all parameters at once in the form of a vector. The order of the parameters is unclear, and sometimes there are inconsistencies (#2734). It is also unclear whether the values correspond to linked or unlinked space.

Constraints are a complete mess right now, and don't always give the right result. The user's supplied constraints may refer either to unlinked space or linked space, depending on whether Turing decides to link the model, which should be an internal detail that is hidden from the user.

The overall strategy (@penelopeysm and @mhauru 2026/01/06)

In general, the refactoring will mean substantially paring down the interface that Turing provides "natively". This is, again, because there is not always a clean mapping between the interface that a Turing user specifies, and the interface that Optimization.jl expects. We elect to forbid all the cases where the mapping is ill-defined. In such cases, the user will have to construct their own LogDensityFunction and pass that to Optimization.jl themselves. We will write documentation to tell users how to do this.

The native interface will be:

optimize(
    model::Model,
    solver=DEFAULT_SOLVER;
    link::Bool=false,
    initial_params::Union{Nothing,AbstractInitStrategy},
    constraints::Union{NamedTuple,OrderedDict{VarName},VarNamedTuple},
    warn_on_constraint_violation::Bool=true,
    kwargs...
)

The default solver should be some flavour of LBFGS that incurs as small a dependency as needed (see #2711). I think OptimizationLBFGSB.jl is the correct option here https://github.com/SciML/Optimization.jl/blob/master/lib/OptimizationLBFGSB/src/OptimizationLBFGSB.jl.

Note that:

  • The user can control whether linking happens. However, in all cases, the only difference between unlinked and linked should be performance.
  • Initial parameters are always provided in unlinked space (these semantics are provided by DynamicPPL's initialisation strategies). The default initialisation strategy will be InitFromPrior(), but in practice the initialisation strategy will need to take into account the constraints supplied, as we don't want to automatically generate initial values that are out of bounds.
  • Constraints are always provided in unlinked space and are provided on a per-variable basis. However, we will not support arbitrary constraints: what is supported depends on the distributions involved. We discuss constraints in more detail in the following sections.

In general, we will only support user-provided constraints in the cases where it is meaningful to translate them into box constraints for Optimization.jl.

We will refuse to support any constraints that are not box constraints.

Univariate distributions

The case where constraints can obviously be fully supported is that of univariate distributions. Say:

u ~ Beta(2, 2)

In this case, the bijector for the distribution is constant. Thus, if the user specifies constraints of [0.1, 0.5], we can easily translate this to either [0.1, 0.5] if an unlinked optimisation is run, or [b(0.1), b(0.5)] if a linked optimisation is run.
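
As a minimal sketch of this translation, assume the bijector b for a variable supported on (0, 1) is the logit function. The helper names logit and translate_bounds below are illustrative only and are not part of Turing or Bijectors.jl:

```julia
# Illustrative only: logit stands in for the bijector b of a distribution
# supported on (0, 1), such as Beta(2, 2). This is an assumption.
logit(p) = log(p / (1 - p))

# Map unlinked box constraints to the space the optimiser works in.
# This is valid because b is monotonically increasing, so the lower bound
# stays below the upper bound after the transformation.
function translate_bounds(lb, ub; linked::Bool, b=logit)
    return linked ? (b(lb), b(ub)) : (lb, ub)
end

unlinked = translate_bounds(0.1, 0.5; linked=false)  # bounds passed through unchanged
linked = translate_bounds(0.1, 0.5; linked=true)     # bounds mapped through b
```

Because the bijector does not depend on any other variable, this translation only has to happen once, before the optimisation starts.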

In order to call Optimization.jl correctly, we have to determine the unlinked/linked constraints ahead of time. This can be achieved quite straightforwardly using an accumulator:

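# Collects per-variable bounds during a single evaluation of the model.
struct OptimInitialValuesAcc
    linked::Bool         # whether bounds should be mapped into linked space
    lb::Dict{VarName}    # lower bound for each variable
    ub::Dict{VarName}    # upper bound for each variable
end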

Inside accumulate_assume!!, this accumulator will use the boolean linked value to generate lower and upper bounds which are stored inside the lb/ub. After the model is evaluated once, we can piece together the lb and ub into vectors before passing them to Optimization.jl.
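
The final step, flattening the per-variable bounds into the vectors that Optimization.jl expects, might look roughly like the sketch below. It uses plain Symbol keys in place of VarNames and a hypothetical flatten_bounds helper. The ordering is assumed to be the order in which variables were encountered during model evaluation.

```julia
# Hypothetical helper: concatenate per-variable bounds in evaluation order.
# Each entry may be a scalar or a vector; iterating over it handles both.
function flatten_bounds(order, lb, ub)
    lbs, ubs = Float64[], Float64[]
    for vn in order
        for x in lb[vn]
            push!(lbs, x)
        end
        for x in ub[vn]
            push!(ubs, x)
        end
    end
    return lbs, ubs
end

# Made-up bounds for a scalar variable u and a length-2 variable m.
order = [:u, :m]
lb = Dict{Symbol,Any}(:u => 0.1, :m => [0.0, 0.0])
ub = Dict{Symbol,Any}(:u => 0.5, :m => [0.4, 0.4])
lbs, ubs = flatten_bounds(order, lb, ub)
```

Keeping an explicit order alongside the dictionaries avoids relying on Dict iteration order, which is not guaranteed to match the parameter vector layout.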

Univariate distributions with non-constant bijectors

For cases like:

x ~ Normal()
y ~ truncated(Normal(); lower=x)

the bijector for y is non-constant. Therefore, if the user specifies constraints for y such as [0.1, 0.5], we can use a bijector to map this into linked space, but note that these constraints are not necessarily an accurate representation of what the user intended because the value of x will change during the optimisation.
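
To see why a fixed box in linked space can misrepresent the user's intent, consider the following sketch. It assumes the bijector for y is y ↦ log(y - x), a common choice for variables bounded below by x. The names link and invlink are defined locally for illustration and do not refer to any library function.

```julia
# Assumed bijector for a variable bounded below by x, and its inverse.
link(y, x) = log(y - x)
invlink(z, x) = exp(z) + x

# Translate the unlinked constraints [0.1, 0.5] using the value of x
# that happens to be current when the optimisation problem is set up.
x_setup = -1.0
z_lb, z_ub = link(0.1, x_setup), link(0.5, x_setup)

# Later in the optimisation x has moved, but the linked box is unchanged.
# Map the same linked bounds back to unlinked space with the new x.
x_later = 0.0
y_lb, y_ub = invlink(z_lb, x_later), invlink(z_ub, x_later)
```

With these numbers, the unchanged linked box corresponds to roughly [1.1, 1.5] in unlinked space once x has moved to 0, which lies entirely outside the requested [0.1, 0.5].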

We will still allow this, but we will insert an extra runtime check in the optimisation LDF itself in the form of a debugging accumulator. This accumulator will, essentially, store the constraints and check during the model evaluation whether the generated value obeys the constraints.

This extra accumulator is only needed when optimising in linked space.

struct OptimisationCheckAccumulator
    lb::Dict{VarName}
    ub::Dict{VarName}
end

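function accumulate_assume!!(acc::OptimisationCheckAccumulator, val, _, vn, _)
    # Warn if the (unlinked) value lies outside the user-supplied bounds.
    if val < acc.lb[vn] || val > acc.ub[vn]
        @warn "value of $vn violates the supplied constraints" val
    end
    # Return the accumulator, which is otherwise left unchanged.
    return acc
end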

This provides a foolproof way to detect whether the constraints, as specified in unlinked space, are being correctly obeyed. It comes at an almost negligible performance cost, but we will allow the user to disable it if they really want via the warn_on_constraint_violation keyword argument.

We allow this because it is quite possible that, when running a linked optimisation, the constraints just happen to be obeyed throughout (by chance). In such cases, it doesn't matter that the constraints don't map perfectly to linked space.

Multivariate distributions, part 1

Consider:

m ~ Dirichlet(ones(2))

which returns a length-2 vector such that m[1] + m[2] == 1.

If we run in unlinked space, it is quite easy to specify constraints that cannot be satisfied (for example, consider the constraint lb=[0.0, 0.0], ub=[0.4, 0.4]). We consider this to be the user's fault; they should change their constraints. However, because there is a clean mapping into Optimization.jl's API, we will not forbid this at the interface level.
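
A quick feasibility check for box constraints on a simplex-valued variable could look like the sketch below. simplex_box_feasible is a hypothetical helper, not something Turing provides. It relies only on the facts that the elements lie in [0, 1] and must sum to 1.

```julia
# A box [lb, ub] contains a point of the simplex {m : sum(m) == 1, m .>= 0}
# only if some point in the box can sum to exactly 1.
function simplex_box_feasible(lb, ub)
    lo = max.(lb, 0.0)   # elements cannot go below 0
    hi = min.(ub, 1.0)   # elements cannot exceed 1
    return all(lo .<= hi) && sum(lo) <= 1 <= sum(hi)
end

infeasible = simplex_box_feasible([0.0, 0.0], [0.4, 0.4])  # elements sum to at most 0.8
feasible = simplex_box_feasible([0.0, 0.0], [0.6, 0.6])    # e.g. [0.5, 0.5] fits
```

Such a check would only be a convenience for the user. As noted above, unsatisfiable constraints of this kind are not forbidden at the interface level.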

If we run in linked space, though, there is no easy way to translate unlinked constraints into linked space. For the length-2 vector this is actually possible; but for higher-dimensional Dirichlet, this is not possible. So in general, for multivariate distributions we will forbid the use of constraints AND linked optimisation: you have to choose one.

In general, such cases can be caught inside the initial values accumulator shown above and an error thrown before optimisation begins.

Distributions such as Wishart (which returns a positive definite matrix) will also fall under this category. In general, the same restriction applies to any distribution whose samples have some special property that means that not all of their constituent elements are independent.

TLDR: in general for multivariate distributions you will be allowed to provide constraints or specify linked space, but not both.

Multivariate distributions, part 2

There are, of course, multivariate distributions which don't need such special treatment. Consider MvNormal: in this case the bijector is the identity, and thus constraints in linked space are the same as in unlinked space. We can special-case such multivariate distributions to allow constraints to be used together with linked optimisation.

The same is true for product distributions of supported distributions (i.e., products of supported multivariate distributions and univariate distributions), although this will likely require custom code.

LKJCholesky

We consider it to be completely implausible to support constraints on LKJCholesky. These will simply be forbidden.
