Implement reference semantics for all Tensors #160

@mratsim

Description

The current value semantics (copy-on-assignment) are not good enough performance-wise. They require `unsafe` procs all over the place, which is not ergonomic at all.

I tried copy-on-write as well; see the in-depth monologue/discussion in issue #157. COW can be implemented with atomic reference counting or a shared/non-shared boolean, but it has a few problems, detailed here:

  • Refcounting/`isShared` troubles when the Tensor is wrapped in a container
  • Performance predictability: "when will it copy or share?" is much harder to answer than with always-copy or always-share semantics
  • Workaroundability: since `=` is overloaded, it is non-trivial to opt out of COW; `let a = b.unsafeView` won't work
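To make the predictability point concrete, here is a minimal sketch of COW with a shared/non-shared boolean. It is in Python purely as a language-neutral illustration; `CowTensor`, `assign`, and every name in it are hypothetical, not Arraymancer APIs. Note how the cost of a write depends on aliasing state that is invisible at the call site, and how the boolean flag (unlike a refcount) goes stale:

```python
class CowTensor:
    """Hypothetical copy-on-write buffer: shared on assignment,
    copied only when a *shared* buffer is written to."""

    def __init__(self, data):
        self._data = list(data)
        self._shared = False  # becomes True once the buffer is aliased

    def assign(self):
        """Models `a = b` under COW: share the buffer, mark both sides shared."""
        other = CowTensor.__new__(CowTensor)
        other._data = self._data
        other._shared = self._shared = True
        return other

    def __setitem__(self, i, value):
        # The unpredictability: this write is O(n) or O(1) depending on
        # whether some *other* variable still aliases the buffer.
        if self._shared:
            self._data = list(self._data)  # copy on first write
            self._shared = False
        self._data[i] = value

    def __getitem__(self, i):
        return self._data[i]


a = CowTensor([1, 2, 3])
b = a.assign()     # shares a's buffer
b[0] = 99          # triggers a hidden copy; a is untouched
print(a[0], b[0])  # -> 1 99

# Stale-flag problem: b's copy cannot clear a's flag, so a's next
# write will also copy, even though nothing aliases it anymore.
print(a._shared)   # -> True
```

A refcount instead of a boolean avoids the stale flag, but the write-cost unpredictability remains either way.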

So Arraymancer will move to reference semantics: Tensor data is shared by default, and copies must be explicit.

Benefits:

  • CudaTensor already has these semantics
  • No need to sprinkle `unsafeSlice` all over the place for performance
    • All the unsafe procs can be removed
    • Much less code to maintain
  • Numpy and Julia already work like this
  • Most copies are explicit (except `asContiguous` and `reshape`)
    • Debugging copy issues will be as simple as `grep clone *.nim`

Disadvantages:

  • Sharing is implicit: users might forget to use `clone` and share data by mistake.
    • Debugging sharing issues will be harder than `grep unsafe *.nim`.

In the wild

Numpy and Julia have reference semantics, Matlab and R have copy-on-write.
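For reference, this is what NumPy's sharing-by-default looks like in practice (a short sketch of standard NumPy behavior, with `a.copy()` playing the role Arraymancer's `clone` would):

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

view = a[0]            # slicing shares the underlying buffer
view[0] = 42
print(a[0, 0])         # -> 42: the write is visible through `a`

explicit = a.copy()    # copies are explicit, like a `clone`
explicit[0, 0] = -1
print(a[0, 0])         # -> 42: `a` is unaffected

# reshape also returns a view when possible (cf. the reshape
# exception noted in the benefits list above)
r = a.reshape(3, 2)
print(np.shares_memory(a, r))  # -> True
```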
