Vectorize TensorPrimitives.CosineSimilarity<Half> #116898

stephentoub · 2025-06-22T04:43:26Z

Vectorize for Half by processing it as shorts, using the existing widening routine to produce two vectors of floats, and operating on those floats. Even on the non-vectorized path, this improves throughput because each intermediate operation works on floats rather than constantly converting back to Half.

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System.Numerics.Tensors;

BenchmarkSwitcher.FromAssembly(typeof(Bench).Assembly).Run(args);

public class Bench
{
    private Half[] _x, _y;

    [Params(1, 10, 100, 1000)]
    public int Length { get; set; }

    [GlobalSetup]
    public void Setup()
    {
        _x = new Half[Length];
        _y = new Half[Length];
        var random = new Random(42);
        for (int i = 0; i < Length; i++)
        {
            _x[i] = (Half)random.NextSingle();
            _y[i] = (Half)random.NextSingle();
        }
    }

    [Benchmark]
    public Half CosineSimilarity() => TensorPrimitives.CosineSimilarity(_x, _y);
}

Before:

Method	Length	Mean
CosineSimilarity	1	64.24 ns
CosineSimilarity	10	241.02 ns
CosineSimilarity	100	2,077.22 ns
CosineSimilarity	1000	20,033.55 ns

After:

Method	Length	Mean
CosineSimilarity	1	14.59 ns
CosineSimilarity	10	29.79 ns
CosineSimilarity	100	69.57 ns
CosineSimilarity	1000	465.07 ns
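
As a rough illustration of the scalar half of the description (widen each Half to float, accumulate in float, convert back once at the end), here is a minimal sketch. It is not the PR's `CosineSimilarityHalfCore`: the class and method names are hypothetical, the vectorized short-to-float widening is not shown, and unlike `TensorPrimitives.CosineSimilarity` this sketch does not reject empty inputs.

```csharp
using System;

public static class CosineSimilaritySketch
{
    // Hypothetical helper for illustration only; not the implementation in the PR.
    public static Half CosineSimilarityViaFloat(ReadOnlySpan<Half> x, ReadOnlySpan<Half> y)
    {
        if (x.Length != y.Length)
        {
            throw new ArgumentException("Spans must have the same length.");
        }

        // All three accumulators stay in float for the whole loop.
        float dot = 0f, xSumOfSquares = 0f, ySumOfSquares = 0f;

        for (int i = 0; i < x.Length; i++)
        {
            // Widen once per element; no conversion back to Half inside the loop.
            float xf = (float)x[i];
            float yf = (float)y[i];

            dot += xf * yf;
            xSumOfSquares += xf * xf;
            ySumOfSquares += yf * yf;
        }

        // Single narrowing conversion at the very end.
        return (Half)(dot / (MathF.Sqrt(xSumOfSquares) * MathF.Sqrt(ySumOfSquares)));
    }
}
```

Calling `CosineSimilaritySketch.CosineSimilarityViaFloat(x, y)` with two identical non-zero inputs should give a value that rounds to 1 in Half, and two orthogonal inputs should give 0.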


dotnet-policy-service · 2025-06-22T04:44:18Z

Tagging subscribers to this area: @dotnet/area-system-numerics-tensors
See info in area-owners.md if you want to be subscribed.

Copilot

Pull Request Overview

This PR adds explicit vectorization support for Half inputs in TensorPrimitives.CosineSimilarity, refactors the core implementation to use common Update/Finalize helpers, and introduces a specialized CosineSimilarityHalfCore that processes Half as widened floats.

- Adds a generic wrapper for CosineSimilarity<T> that dispatches to a new Half-specific path
- Refactors existing vector-and-scalar loops into shared Update and Finalize methods
- Implements CosineSimilarityHalfCore with 128/256/512-bit vector and scalar fallbacks for Half

Comments suppressed due to low confidence (2)

src/libraries/System.Numerics.Tensors/src/System/Numerics/Tensors/netcore/TensorPrimitives.CosineSimilarity.cs:184

A new specialized path for Half has been added but no tests for TensorPrimitives.CosineSimilarity on Half arrays appear in this PR. Please add unit tests covering both vectorized and scalar code paths to validate correctness.

        private static Half CosineSimilarityHalfCore(ReadOnlySpan<Half> x, ReadOnlySpan<Half> y)

src/libraries/System.Numerics.Tensors/src/System/Numerics/Tensors/netcore/TensorPrimitives.CosineSimilarity.cs:31

The XML doc for CosineSimilarity<T> does not mention the new Half-specialized path. Please update the summary to note that Half inputs are now vectorized via Half⇒short⇒float widening.

        public static T CosineSimilarity<T>(ReadOnlySpan<T> x, ReadOnlySpan<T> y)

...em.Numerics.Tensors/src/System/Numerics/Tensors/netcore/TensorPrimitives.CosineSimilarity.cs

tannergooding

LGTM.

It's a bit unfortunate we need to duplicate the CosineSimilarityCore function here. I expect we could have a general operate with m-to-n intermediate helper, but that would be a larger refactoring (and I don't think it's worth blocking this on making that happen).

stephentoub · 2025-06-23T21:13:49Z

> It's a bit unfortunate we need to duplicate the CosineSimilarityCore function here. I expect we could have a general operate with m-to-n intermediate helper, but that would be a larger refactoring (and I don't think it's worth blocking this on making that happen).

I have such a helper in another PR I'll put up for other methods, but applying it to CosineSimilarity (which doesn't use any of the shared helpers or operators) results in roundtripping between Half and float for each operation, which is measurably worse than staying with float as the accumulator. We can subsequently look at a larger refactoring around our aggregations to enable a) making the accumulation configurable and b) getting CosineSimilarity onto the same helpers (which is desirable, anyway, as it's not currently as robust in its optimizations as the shared helpers are).
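
To make the roundtripping point concrete, the sketch below contrasts the two accumulation styles on a plain dot product. The names are hypothetical and neither method is the helper mentioned above. It assumes, as is the case to my knowledge, that the arithmetic operators on `System.Half` are carried out by converting to float and back, so the first loop pays a narrowing conversion on every `*` and `+=`, while the second narrows only once.

```csharp
using System;

public static class AccumulatorSketch
{
    // Hypothetical: accumulates in Half, so every multiply and add
    // goes through a Half -> float -> Half roundtrip.
    public static Half DotAccumulateInHalf(ReadOnlySpan<Half> x, ReadOnlySpan<Half> y)
    {
        Half sum = Half.Zero;
        for (int i = 0; i < x.Length && i < y.Length; i++)
        {
            sum += x[i] * y[i];
        }
        return sum;
    }

    // Hypothetical: accumulates in float and converts back to Half once.
    public static Half DotAccumulateInFloat(ReadOnlySpan<Half> x, ReadOnlySpan<Half> y)
    {
        float sum = 0f;
        for (int i = 0; i < x.Length && i < y.Length; i++)
        {
            sum += (float)x[i] * (float)y[i];
        }
        return (Half)sum;
    }
}
```

For small inputs whose intermediate values are exactly representable in Half, both methods return the same result; the difference under discussion in this thread is the per-operation conversion cost.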

stephentoub requested review from Copilot and tannergooding June 22, 2025 04:43

github-actions bot added the area-System.Numerics label Jun 22, 2025

dotnet-policy-service bot assigned stephentoub Jun 22, 2025

stephentoub added area-System.Numerics.Tensors and removed area-System.Numerics labels Jun 22, 2025

Copilot AI reviewed Jun 22, 2025

This was referenced Jun 22, 2025

System.Net.Http.Functional.Tests timeouts #115683

Closed

browser-wasm Windows build error #116746

Closed

[iOS/tvOS] System.Runtime.Tests crash with signal 4 #116815

Closed

tannergooding approved these changes Jun 22, 2025

stephentoub merged commit 594f85c into dotnet:main Jun 23, 2025
82 of 89 checks passed

stephentoub deleted the vectorizecshalf branch June 23, 2025 21:13

github-actions bot locked and limited conversation to collaborators Jul 24, 2025
