Skip to content

Adjoint matmul in QobjEvo#2802

Merged
Ericgig merged 24 commits into
qutip:masterfrom
amilsted:matrix_mesolve_pr
Jan 28, 2026
Merged

Adjoint matmul in QobjEvo#2802
Ericgig merged 24 commits into
qutip:masterfrom
amilsted:matrix_mesolve_pr

Conversation

@amilsted
Copy link
Copy Markdown
Contributor

@amilsted amilsted commented Jan 14, 2026

This PR introduces a matrix-form master-equation solver variant, which has dramatically better performance when operators are dense, helping to close the gap to other packages such as QuantumOptics.jl and dynamiqs. It also adds some new "cast-complex-to-double" C++ low-level matmul routines, inspired by existing qutip matvec code, that vectorize better than the existing cython code (at least they are significantly faster on my arm Macbook Pro and, for CSR matrices, help to bring performance closer to the CSC routines used by QuantumOptics.jl).

The matrix-form master-equation solver introduces a new "rhs" class LindbladMatrixForm that uses matmul operations to implement the action of the Liouvillian on a density matrix without constructing explicit superoperator matrices. For dense operators of dimension n, this action scales as O(n^3) versus O(n^4) for the current "explicit superoperator matrix" approach. I have also found cases where using CSR representation with the matrix-form solver is faster than both CSR explicit superoperators and fully-dense matrix-form solves.

The matrix-form solver is currently opt-in via the mesolve API and is also usable via the new MESolverMatrixForm Solver class.

I am happy to provide benchmark code for both claimed improvements.

I worked with gen AI tools to produce this PR. I reviewed AI-generated changes in detail myself and iterated on the changes to ensure a good fit with existing paradigms in the qutip codebase and its tests.

I realize this is a fairly significant addition and fully expect to make possibly major adjustments during the review process! For example, should it be "LindbladMatrixForm" or "LiouvillianMatrixForm"? : )

Update: I split the original PR into two. This PR now just introduces underlying matmul routine changes. A second PR will bring in the master-equation solver.

Ashley Milsted added 3 commits January 13, 2026 15:39
- Add C++ implementations: matmul_csr_dense, matmul_diag_vector
- Update Cython interface with new function declarations
- Add matmul_dag_dense_dia_dense for adjoint diagonal operations
- Update test_mathematics.py with test refactoring
- Add matmul_data and adjoint_rmatmul_data methods
- Support scale parameters for efficient accumulation
- Update _brtensor.pyx for signature compatibility
- Implement matmul_data for efficient matrix multiplication
- Add adjoint_rmatmul_data for right multiplication with adjoint
- Support scale parameters for accumulation operations
- Update test_qobjevo.py with test refactoring
Copy link
Copy Markdown
Member

@pmenczel pmenczel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @amilsted, thank you very much for your contribution. It will take time to look at this in detail, but it looks like very good and thorough work. Benchmark code would be nice to have (and could potentially be repackaged into a tutorial notebook at some point?)

Storing superoperators not as a big Liouvillian matrix but in a more "analytic" form has been discussed internally for a while, and is a big goal for one of the next major releases. That would be a deeper change, applying not only to Lindblad-type superoperators... however, it will take some time. I imagine that it would make sense to add your feature to QuTiP now, even if it might become obsolete at some later time. We need to have some discussion within the admin team.

The following are just a few things I noticed scrolling through the code for the first time. (Also, please have a look at the warnings generated in the test runs; we treat warnings as errors.)

Comment thread qutip/solver/mesolve.py Outdated
Comment thread doc/guide/dynamics/dynamics-options.rst Outdated
Comment thread qutip/core/cy/_element.pyx
Comment thread qutip/core/cy/lindblad_matrix_form.pyx Outdated
Comment thread qutip/solver/mesolve.py Outdated
@amilsted
Copy link
Copy Markdown
Contributor Author

amilsted commented Jan 15, 2026

Looks like tests are failing because of warnings. I added coverage for the in-place behavior of the matmul routines, including cases where input and output matrices have different C/Fortran ordering choices (some of the existing code was actually broken for this case). These paths are slower and currently warn. Maybe these should actually be error paths? In any case, at the moment it looks like I just need to teach the tests to expect warnings here.

@amilsted
Copy link
Copy Markdown
Contributor Author

Storing superoperators not as a big Liouvillian matrix but in a more "analytic" form has been discussed internally for a while, and is a big goal for one of the next major releases. That would be a deeper change, applying not only to Lindblad-type superoperators... however, it will take some time. I imagine that it would make sense to add your feature to QuTiP now, even if it might become obsolete at some later time. We need to have some discussion within the admin team.

Thank you for taking a look! It did occur to me that it might make sense to have a general matrix-form Liouvillian class. I hope that the matmul op work here is useful for future changes along these lines. It wouldn't be hard to generalize LindbladMatrixForm to be a general superoperator class. Could keep the same matmul efficiency if we e.g. handle [H_nh, None] and [None, H_nh] by treating None as a noop identity.

@amilsted
Copy link
Copy Markdown
Contributor Author

amilsted commented Jan 15, 2026

Maybe I should also add: the explicit superoperator form for the Liouvillian is still often the fastest for sparse martrices, a according to my benchmarks, so you may not want to drop it entirely!

@Ericgig
Copy link
Copy Markdown
Member

Ericgig commented Jan 15, 2026

@amilsted
With the size of the PR, would you accept to split it into chunk for easier review?
As a first step, only include the core changes with tests, (except the new lindblad_matrix_form.pyx.)
This should cut it in about half.

Comment thread qutip/core/cy/_element.pyx Outdated
Comment thread qutip/core/cy/_element.pyx Outdated
Comment thread qutip/core/cy/_element.pyx
Comment thread qutip/core/data/matmul.pyx
Comment thread qutip/core/data/matmul.pyx Outdated
@Ericgig
Copy link
Copy Markdown
Member

Ericgig commented Jan 15, 2026

Thank you for adding mixed C, F order in the tests.

It would be good to have tests that check the warnings are properly raised:
with pytest.warns(OrderEfficiencyWarning) as warning:

I don't think they should be error.
The design of the low level function is that it should just work, even if not optimal.

@amilsted amilsted changed the title Matrix-form master-equation solver Adjoint matmul in QobjEvo Jan 16, 2026
@amilsted
Copy link
Copy Markdown
Contributor Author

Thank you for adding mixed C, F order in the tests.

It would be good to have tests that check the warnings are properly raised: with pytest.warns(OrderEfficiencyWarning) as warning:

I don't think they should be error. The design of the low level function is that it should just work, even if not optimal.

I added some code to just swallow these warnings in all cases. It's currently difficult to detect when they should or shouldn't occur, but I can work on doing that if you prefer.

I also did some refactoring as I wasn't quite happy with how the tests were structured. Now there is a separate test variant with its own id for out-of-place as well as each in-place ordering.

@amilsted
Copy link
Copy Markdown
Contributor Author

amilsted commented Jan 16, 2026

@amilsted With the size of the PR, would you accept to split it into chunk for easier review? As a first step, only include the core changes with tests, (except the new lindblad_matrix_form.pyx.) This should cut it in about half.

I have now done this. This PR now only includes the matmul stuff. I will make a second for the mesolve stuff. Would you like me to do that now or wait until we are done with this?

For now, I have put the remaining changes here: amilsted#1

@amilsted
Copy link
Copy Markdown
Contributor Author

amilsted commented Jan 21, 2026

Added some benchmark results in a branch here: https://github.com/amilsted/qutip/tree/matmul_bench

See the full results here: https://github.com/amilsted/qutip/blob/matmul_bench/BENCHMARKING.md

In matmul microbenchmarks:

  • CSR matrix-matrix operations get significantly faster on Linux x86_64 and Apple Silicon (15-30%). This is important for the matrix-form solver in the upcoming second PR!
  • CSR matrix-vector operations get significantly faster on Apple Silicon (15-20%) and are about the same on x86_64 (and shouldn't change as the SSE specialized routine is still used unchanged here)
  • DIA matrix-matrix and matrix-vector gets significantly faster on Apple Silicon (15-30%). Matrix-vector is somewhat slower on x86_64 (~5%) and matrix-matrix is about the same. The slowdown seems to be due to use of intermediate variables to avoid writing to vectors all the time in the inner loop. This significantly helps Apple Silicon (more registers?), but slightly hurts x86_64.

In JC model dynamics (sesolve and vectorized mesolve, so only exercising mat-vec routines):

  • ~5% performance improvement for CSR on Apple Silicon, up to ~15% with DIA.
  • Essentially no change on x86_64 for most cases. Perhaps a small slowdown for very small Hilbert spaces, but could be noise.

Copy link
Copy Markdown
Member

@Ericgig Ericgig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the benchmarks. I will try running them tomorrow.

Comment thread qutip/core/cy/_element.pyx Outdated
@nwlambert
Copy link
Copy Markdown
Member

Just wanted to add, thanks for this, looks cool! I had a quick question, I remember looking at what dynamiqs were doing a while back, and they had a nice trick in the Lindblad construction to slightly reduce the number of matrix-matrix multiplications, at the the cost of some small numerical error (https://github.com/dynamiqs/dynamiqs/blob/fdc6c8913bba2b0a03eb0ad314ca01b5a519030b/dynamiqs/integrators/core/diffrax_integrator.py#L297 ). Just curious if you tried this in your benchmarks, and if it was worthwhile.

@amilsted
Copy link
Copy Markdown
Contributor Author

amilsted commented Jan 22, 2026

Just wanted to add, thanks for this, looks cool! I had a quick question, I remember looking at what dynamiqs were doing a while back, and they had a nice trick in the Lindblad construction to slightly reduce the number of matrix-matrix multiplications, at the the cost of some small numerical error (https://github.com/dynamiqs/dynamiqs/blob/fdc6c8913bba2b0a03eb0ad314ca01b5a519030b/dynamiqs/integrators/core/diffrax_integrator.py#L297 ). Just curious if you tried this in your benchmarks, and if it was worthwhile.

Oh man! Yes, I had come across this, but only vaguely remembered it existed. I think it's worth doing for the second PR :)

Slight changes to the number of matmul ops can indeed determine the answer to "which solver variant is best". There are many cases where it is a close call!

Copy link
Copy Markdown
Member

@Ericgig Ericgig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still running benchmarks, but I am happy with everything except the tests for cases with inputs. Auto detection from signature and adding an argument that you must check that it's not used anywhere except one specific case is not a clean solution.

Comment on lines +273 to +288
# Check if op supports out parameter and out_type is Dense.
# If so, we will add in-place test variants for both C and F ordered
# output arrays.
import inspect
supports_out = False
if out_type is Dense:
try:
sig = inspect.signature(op)
supports_out = 'out' in sig.parameters
except (ValueError, TypeError):
# Can't introspect (e.g., built-in function), assume no out support
pass

# Determine out_prealloc values
out_prealloc_values = [None, "C", "F"] if supports_out else [None]

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the way...
This will cause problem if the out argument was ever used in another way or named something else.
It would also skip test were the output is not Dense ,cupy , jax, cuQuantum backends uses non-Dense states in our solvers and we should test them too.

Why not use TernaryOpMixin (or ScaledTernaryOpMixin) with out as simply the third input:
the operation is really A @ B * scale + out. With the output being the same instance as out a bonus.
The function tested will have to be wrapped since scale comes before out (or we could inverse them in the functions).
It would allow to add tests form other specialization easily (says we want to test with a CSR out) and keep test_mathematically_correct and cases_type_shape_product simpler.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Point taken. That sounds like a better approach!

Copy link
Copy Markdown
Contributor Author

@amilsted amilsted Jan 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have done this now with a InPlaceMatmulMixin, since the semantics of TernaryOpMixin were a bit different. It does seem a lot cleaner. What do you think?

This time I have verified that the rest of the tests also pass :)

Comment thread qutip/tests/core/data/test_mathematics.py Outdated
Comment thread qutip/core/data/src/matmul_diag_vector.cpp Outdated
Comment thread qutip/core/data/matmul.pyx
Comment thread qutip/core/data/matmul.pyx
Copy link
Copy Markdown
Member

@Ericgig Ericgig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Benchmarks look good on my side.
If you could just remove the unused op_matmul, it's ready.

I saw you removed the towncrier entry. Knowing that there is a follow up PR, it's fine, but this alone is lot of work and deserve a mention.

Comment thread qutip/core/cy/_element.pyx Outdated
Comment thread qutip/tests/core/data/test_mathematics.py Outdated
@coveralls
Copy link
Copy Markdown

coveralls commented Jan 27, 2026

Coverage Status

coverage: 86.98%. first build
when pulling 428e9b6 on amilsted:matrix_mesolve_pr
into 7589bf3 on qutip:master.

@amilsted
Copy link
Copy Markdown
Contributor Author

Benchmarks look good on my side. If you could just remove the unused op_matmul, it's ready.

I saw you removed the towncrier entry. Knowing that there is a follow up PR, it's fine, but this alone is lot of work and deserve a mention.

Removed the op and added the towncrier entry.

@Ericgig Ericgig merged commit 860bc98 into qutip:master Jan 28, 2026
15 of 16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants