matmul doesn't fail fast for shape mismatches in some cases #8948

@tomwhite

Description

What happened: calling matmul with incompatible array shapes doesn't always fail immediately; in some cases the error is raised only when compute is called.

What you expected to happen: if the array sizes are incompatible, then the operation should (consistently) fail immediately, and not wait until compute is called in some cases.

Minimal Complete Verifiable Example:

>>> import dask.array as da
>>> da.matmul(da.ones(1), da.ones(3))
dask.array<getitem, shape=(), dtype=float64, chunksize=(), chunktype=numpy.ndarray>
>>> da.matmul(da.ones(1), da.ones(3)).compute() # fails on compute
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/tom/projects-workspace/dask/dask/base.py", line 292, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/Users/tom/projects-workspace/dask/dask/base.py", line 575, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/Users/tom/projects-workspace/dask/dask/threaded.py", line 81, in get
    results = get_async(
  File "/Users/tom/projects-workspace/dask/dask/local.py", line 508, in get_async
    raise_exception(exc, tb)
  File "/Users/tom/projects-workspace/dask/dask/local.py", line 316, in reraise
    raise exc
  File "/Users/tom/projects-workspace/dask/dask/local.py", line 221, in execute_task
    result = _execute_task(task, data)
  File "/Users/tom/projects-workspace/dask/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/Users/tom/projects-workspace/dask/dask/core.py", line 119, in <genexpr>
    return func(*(_execute_task(a, cache) for a in args))
  File "/Users/tom/projects-workspace/dask/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/Users/tom/projects-workspace/dask/dask/optimization.py", line 990, in __call__
    return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
  File "/Users/tom/projects-workspace/dask/dask/core.py", line 149, in get
    result = _execute_task(task, cache)
  File "/Users/tom/projects-workspace/dask/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/Users/tom/projects-workspace/dask/dask/array/routines.py", line 402, in _matmul
    chunk = xp.matmul(a, b)
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 3 is different from 1)
>>> da.matmul(da.ones(2), da.ones(3)) # fails without compute
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/tom/projects-workspace/dask/dask/array/routines.py", line 446, in matmul
    out = blockwise(
  File "/Users/tom/projects-workspace/dask/dask/array/blockwise.py", line 174, in blockwise
    chunkss, arrays = unify_chunks(*args)
  File "/Users/tom/projects-workspace/dask/dask/array/core.py", line 3812, in unify_chunks
    arrays.append(a.rechunk(chunks))
  File "/Users/tom/projects-workspace/dask/dask/array/core.py", line 2647, in rechunk
    return rechunk(self, chunks, threshold, block_size_limit, balance)
  File "/Users/tom/projects-workspace/dask/dask/array/rechunk.py", line 297, in rechunk
    chunks = normalize_chunks(
  File "/Users/tom/projects-workspace/dask/dask/array/core.py", line 2958, in normalize_chunks
    raise ValueError(
ValueError: Chunks do not add up to shape. Got chunks=((1,), (3,)), shape=(1, 2)
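
To make the expected behaviour concrete, here is a minimal sketch of an eager shape check that works on shapes alone, so it could run before any graph is built. The function name matmul_output_shape is hypothetical and is not part of dask. It assumes the usual NumPy matmul rules: 1-D operands are promoted to 2-D, the contracted dimensions must match exactly (a length-1 contracted dimension is not broadcast), and only the leading batch dimensions broadcast.

```python
def matmul_output_shape(a_shape, b_shape):
    """Return the result shape of a matmul, or raise ValueError right away.

    Hypothetical helper (not a dask API); follows NumPy matmul shape rules.
    """
    a_shape, b_shape = tuple(a_shape), tuple(b_shape)
    if len(a_shape) == 0 or len(b_shape) == 0:
        raise ValueError("matmul: 0-d operands are not allowed")

    # Promote 1-D operands: a becomes a row vector, b a column vector.
    a_is_1d = len(a_shape) == 1
    b_is_1d = len(b_shape) == 1
    a2 = (1,) + a_shape if a_is_1d else a_shape
    b2 = b_shape + (1,) if b_is_1d else b_shape

    # The contracted (core) dimensions must match exactly, even if one is 1.
    if a2[-1] != b2[-2]:
        raise ValueError(
            f"matmul: core dimension mismatch for shapes {a_shape} and "
            f"{b_shape} (size {b2[-2]} is different from {a2[-1]})"
        )

    # Leading batch dimensions follow ordinary broadcasting rules.
    batch_a, batch_b = a2[:-2], b2[:-2]
    ndim = max(len(batch_a), len(batch_b))
    batch_a = (1,) * (ndim - len(batch_a)) + batch_a
    batch_b = (1,) * (ndim - len(batch_b)) + batch_b
    batch = []
    for x, y in zip(batch_a, batch_b):
        if x != y and x != 1 and y != 1:
            raise ValueError(
                f"matmul: batch dimensions of {a_shape} and {b_shape} "
                "cannot be broadcast together"
            )
        batch.append(y if x == 1 else x)

    # Drop the dimensions that were added by 1-D promotion.
    core = ()
    if not a_is_1d:
        core += (a2[-2],)
    if not b_is_1d:
        core += (b2[-1],)
    return tuple(batch) + core
```

With a check like this, both matmul_output_shape((1,), (3,)) and matmul_output_shape((2,), (3,)) raise ValueError immediately, which is the consistent behaviour asked for above. Whether a check of this kind belongs in matmul itself or further down in blockwise is left open here.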

Anything else we need to know?: there may be similar problems with dot and tensordot (needs investigation)

Environment:

  • Dask version: main
  • Python version: 3.8
  • Operating System: mac
  • Install method (conda, pip, source): source
