Skip to content

Create an Operator class for better organizing logic for specific operators in compound models #3861

@embray

Description

@embray

Bear with me while I explain some background to this issue (the background itself could be split off into an issue of its own, even):

Background

A few weeks ago we had a telecon discussion the model discretization API ideas in astropy/astropy-api#13 by @patti.

As part of that discussion we considered several possible approaches to implementing convolution of one model with another. Right now one of the most straightforward approaches to this is a kind of "ConvolutionModel" that uses astropy.convolution.convolve in its evaluate() method, and would have a specific kernel and other convolution options attached to it for performing the evaluation.

However, as @cdeil brought up, the existence of separate Kernel classes (i.e. separate from the Model they're based on) is due to the current lack of a discretization interface baked directly into the existing Model interface). One hope is that this will be resolved, perhaps as part of the GSoC project, obviating the need for a separate "Kernel" class, so that Model objects can be used directly with convolve().

If that is resolved, then it opens the door to another possible implementation of model convolution that I suggested, where the convolution itself would be treated as an operator between two models, resulting in a compound model, much in the same way we treat model composition.

For example, currently we implement model composition like:

source = Box2D(1, 0, 0, 1, 1)
sink = Gaussian2D(1, 0, 0, 1, 1)
model = source | sink

the resulting model object is a compound model that takes input to the Box2D and outputs the result through the Gaussian2D.

A convolve operator could look something like:

source = Box2D(1, 0, 0, 1, 1)
kernel = Gaussian2D(1, 0, 0, 1, 1)
model = convolve(source, kernel)

Here we write the convolve(a, b) operator with function syntax rather than infix operator for a few practical reasons:

  1. Python only has so many infix operators we can use, and there isn't a good one that obviously means convolution ('*' is already taken for multiplication)
  2. We do want to be able to pass additional arguments to convolve, such as sampling options.

However, this would still be treated as a binary operator just as addition or multiplication, in that the result would be a compound model that takes input through the source and convolves it with the kernel. This combined object can be further combined with other models in a transformation pipeline, just as we currently do with compound models.

Challenges

I personally really like the convolve operator idea, and in the telecon there seemed to be some support for it as well. This of course does require the discretization support on models. If a model can't be discretized then I guess we don't allow convolving with it.

The other challenge is more of a refactoring issue. This is why I marked this issue "Affects-dev". The Operator class I'm going to propose would only be for internal use for now (though if well designed it could enable further generalization of the compound model framework to enable custom operators as well).

The problem is that currently arithmetic operators (like +, *, etc.) as well as the special operators | and & are passed around the code and stored in the expression tree as just single character strings, like '+' for addition. This clearly won't do for something like "convolve". Nevermind that it requires more than a single character to specify--the "convolve" operator actually represents a whole class of operators with different possible options for boundary handling, sampling modes for the kernel, etc.

Proposal

We need an object that can be passed around and used in the ExpressionTree to represent the convolve operator along with any arguments passed to it. For this purpose I propose an Operator class that can be used to represent an operator (I'm currently undecided whether each operator should get its own (in most cases singleton) class, or if each operator should be an instance of a single Operator class). The Operator class would store a name for the operator (like '+' or 'convolve') and may also store zero or more options to that operator. In most cases, like the existing arithmetic operators, there would be no options; convolve on the other hand could have many options--each "convolve" variant would be a unique, but related operator.

Besides being able to represent a convolve operator this proposal has numerous possible advantages both for code refactoring and enhancing the usability of the framework:

  • We could store information about how to evaluate an operator directly in the Operator object. For example we could do away with the BINARY_OPERATORS table. Instead Operator('+', operator.add) automatically ties the name of the operator to its implementation.
  • If you look at how the operators in BINARY_OPERATORS are implemented, you'll note however
    that it doesn't just directly call (in the case of '+'), operator.add. Actually it wraps that in a more complicated function that takes a tuple consisting of the left operand's model evaluate, as well as its number of inputs and outputs, and a tuple consisting of the same for the right operand. This is an implementation detail related to implementation of more complicated operators like composition and join. A lot of this implementation detail could be baked directly into the Operator class though, so that it's easy to change in one place for all operators. This aspect requires a little more thinking though.
  • The Operator objects could also directly store the order of operations--precedence could be determined entirely by comparison of the Operator objects with each other.
  • When creating a compound model in _CompoundModelMeta._from_operator a good chunk of the code is validation code that checks whether the left and right operands' inputs and outputs are compatible, and this depends largely on which operator is in use. This validation could live with the individual Operator objects.
    • Similarly, in WIP: Prototype for Quantity support in Models #3852 I'm working on validation of compatible units when operating on models, which is highly dependent on the operator (for example the right-hand operand to ** had better have dimensionless output).
  • In addition to forward evaluation Operator objects could store how to evaluate the inverse of that operator (if it all)--almost all the code in _CompoundModel.inverse can live in the appropriate Operator objects so that each inverse doesn't have to be hard-coded in the inverse implementation.
  • As I already mentioned, if done right this could also make it easier to add new specialized operators. Of course this is super deep in the generalization weeds, but I think the resulting code could be easier to understand than the current admittedly dense compound model code.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions