
Conversation


@math-hiyoko math-hiyoko commented Aug 11, 2025

Description

This PR introduces hash-based uniqueness extraction support for complex types in NumPy's np.unique function.
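
For context, a minimal usage illustration of the API this targets (the array values here are arbitrary; note that per the release note in this PR, output order for these dtypes is no longer guaranteed to be sorted):

```python
import numpy as np

# np.unique on a complex128 array; with this PR the unique values are
# found via a hash table rather than by sorting.
arr = np.array([1 + 1j, 2 + 0j, 1 + 1j, 2 + 0j, 3 - 1j], dtype=np.complex128)
uniques = np.unique(arr)
# Three distinct values remain: 1+1j, 2+0j, 3-1j (order may vary).
```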

Benchmark Results

The benchmark demonstrates significant performance improvement from the new implementation.

$ spin bench -t Unique.time_unique_value -c

| Change | Before [989e9a6] | After [4ff52ca] &lt;feature/#28363&gt; | Ratio | Benchmark (Parameter) |
|--------|------------------|----------------------------------|-------|-----------------------|
| - | 5.92±0.08μs | 4.75±0.06μs | 0.8 | bench_lib.Unique.time_unique_values(200, 10.0, 20.0, &lt;class 'numpy.complex128'&gt;) |
| - | 12.2±0.1ms | 8.63±0.1ms | 0.7 | bench_lib.Unique.time_unique_values(200000, 10.0, 20.0, &lt;class 'numpy.complex128'&gt;) |
| - | 5.31±0.3μs | 2.95±0.04μs | 0.56 | bench_lib.Unique.time_unique_values(200, 10.0, 0.2, &lt;class 'numpy.complex128'&gt;) |
| - | 5.03±0.02ms | 2.71±0.03ms | 0.54 | bench_lib.Unique.time_unique_values(200000, 90.0, 20.0, &lt;class 'numpy.complex128'&gt;) |
| - | 5.57±0.04μs | 2.62±0.05μs | 0.47 | bench_lib.Unique.time_unique_values(200, 90.0, 20.0, &lt;class 'numpy.complex128'&gt;) |
| - | 5.42±0.07μs | 2.04±0.03μs | 0.38 | bench_lib.Unique.time_unique_values(200, 90.0, 0.2, &lt;class 'numpy.complex128'&gt;) |
| - | 9.91±0.1ms | 2.33±0.02ms | 0.23 | bench_lib.Unique.time_unique_values(200000, 10.0, 0.2, &lt;class 'numpy.complex128'&gt;) |
| - | 5.02±0.08ms | 882±4μs | 0.18 | bench_lib.Unique.time_unique_values(200000, 90.0, 0.2, &lt;class 'numpy.complex128'&gt;) |

Closes #28363

@math-hiyoko math-hiyoko marked this pull request as draft August 11, 2025 04:25
@math-hiyoko math-hiyoko changed the title from Feature/#28363 to "ENH: np.unique: support hash based unique for float and complex dtype" Aug 11, 2025
@ngoldbaum (Member)

Very cool! When you finish this up for review it'd be nice to see some before/after comparisons on some benchmarks.

@math-hiyoko math-hiyoko marked this pull request as ready for review August 15, 2025 13:43
@math-hiyoko (Contributor, Author)

It's now ready for review.

@ngoldbaum ngoldbaum added the `01 - Enhancement` and `triage review` (Issue/PR to be discussed at the next triage meeting) labels Aug 18, 2025
@ngoldbaum
Copy link
Member

I added the triage review label, so hopefully someone at the next triage meeting on Wednesday will volunteer to look closer at this. Unfortunately I can't take that on this month - too many things going on!

@jorenham jorenham requested a review from Copilot August 20, 2025 18:05
Copilot AI (Contributor) left a comment


Pull Request Overview

This PR extends NumPy's np.unique function to support hash-based unique value extraction for float and complex dtypes, providing significant performance improvements over the existing sort-based approach. The implementation leverages hash tables to efficiently identify unique values without requiring sorting.

Key changes include:

  • Implementation of hash-based unique extraction for float and complex numeric types
  • Addition of specialized hash and equality functions that handle NaN values appropriately
  • Updates to test cases to accommodate unsorted output from hash-based approach
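
The core idea is easiest to see in a rough Python analogue (the PR itself implements this in C++ with hash tables; the set-based sketch here is illustrative, not the actual implementation):

```python
def hash_unique(values):
    """First-seen-order unique extraction via a hash set.

    Expected O(n) membership checks replace the O(n log n) sort
    of the sort-based approach; output order is insertion order,
    not sorted order.
    """
    seen = set()
    out = []
    for v in values:
        if v not in seen:
            seen.add(v)
            out.append(v)
    return out
```

Because the hash set imposes no ordering, the output is unsorted, which is why the test assertions had to be updated.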

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
|------|-------------|
| numpy/_core/src/multiarray/unique.cpp | Core implementation adding hash-based unique support for float/complex types with NaN handling |
| numpy/lib/tests/test_arraysetops.py | Updated test assertions to handle unsorted output from hash-based unique operations |
| numpy/_core/meson.build | Added npymath include directory for math function access |
| doc/release/upcoming_changes/29537.performance.rst | Documentation of performance improvements |
| doc/release/upcoming_changes/29537.change.rst | Documentation of behavior change regarding sorted output |


@ngoldbaum (Member)

I'm not sure what the current state of the unique benchmarks is, but I think it would help to add new benchmarks for this case to our benchmark suite.

It's probably worth benchmarking several sizes of inputs, not just really huge arrays.
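
A quick ad-hoc sketch of such a size sweep (this is not the asv benchmark suite the comment refers to, just an illustrative timeit-based version; the sizes and seed are arbitrary):

```python
import timeit

import numpy as np

def sweep(sizes=(200, 20_000, 200_000), repeats=5):
    """Time np.unique on random complex128 arrays of several sizes.

    Returns a dict mapping array size to the best wall-clock time
    (in seconds) over `repeats` runs of 3 calls each.
    """
    rng = np.random.default_rng(0)
    results = {}
    for n in sizes:
        arr = (rng.standard_normal(n) + 1j * rng.standard_normal(n)).astype(np.complex128)
        results[n] = min(timeit.repeat(lambda: np.unique(arr), number=3, repeat=repeats))
    return results
```

Calling `sweep()` returns the per-size timings; running it before and after the change gives a rough before/after comparison across small and large inputs.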

@ngoldbaum (Member)

@charris is curious how you're handling NaN. What happens if there's more than one distinct NaN value in the array?
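
For illustration, "distinct NaN values" means NaNs whose bit patterns differ (e.g. in the payload bits); `np.unique` treats all NaNs as equal by default (`equal_nan=True`, since NumPy 1.24), so the hash path has to preserve that collapsing behavior. The payload constants below are arbitrary:

```python
import struct

import numpy as np

# Two NaNs with distinct bit patterns (different payload bits).
nan_a = struct.unpack('<d', struct.pack('<Q', 0x7FF8000000000001))[0]
nan_b = struct.unpack('<d', struct.pack('<Q', 0x7FF8000000000002))[0]
assert np.isnan(nan_a) and np.isnan(nan_b)

arr = np.array([nan_a, 1.0, nan_b, nan_a])
u = np.unique(arr)  # default equal_nan=True: all NaNs collapse to one
```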

@ngoldbaum (Member)

I see you're debugging in CI here. That's a pretty expensive (in terms of energy usage) way to debug. Maybe you'll find the docs I recently wrote about using native debuggers with NumPy useful? https://numpy.org/devdocs/dev/development_advanced_debugging.html#c-debuggers

@math-hiyoko (Contributor, Author)

Thank you for the suggestion, but the bug did not reproduce in my local environment, even in a debug build.

@math-hiyoko (Contributor, Author)

@ngoldbaum
As described in npy_math_private.h, some floating-point formats include junk bits inside their binary representation, which means that simply hashing all sizeof(long double) bytes of the representation cannot work reliably.
To address this, I used the macros defined in that file to extract and pack the mantissa, exponent, and sign components separately into a buffer, and then computed the hash from that buffer.
This approach successfully avoids the undefined-bit issue.

```c
/*
 * Motorola extended 80 bits precision. Bit representation is
 *   |   s   |eeeeeeeeeeeeeee|  junk  |mmmmmmmm................mmmmmmm|
 *   | 1 bit |    15 bits    | 16 bits|            64 bits            |
 *   |              a[0]              |      a[1]      |     a[2]     |
 *
 * 16 low bits of a[0] are junk
 */
```
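
The junk-bits issue is specific to extended-precision long double at the C level, but the general pitfall is easy to demonstrate even at double precision: values that compare equal can have different byte representations, so a byte-level hash must canonicalize before hashing (as Python's own `hash()` does):

```python
import struct

a, b = 0.0, -0.0
assert a == b                                        # equal as values
assert struct.pack('<d', a) != struct.pack('<d', b)  # different bit patterns

# Hashing the raw bytes would put equal values in different hash buckets;
# canonicalizing first (as Python's hash() does) keeps them together.
assert hash(a) == hash(b)
```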

@ngoldbaum (Member)

ngoldbaum commented Oct 17, 2025

Awesome! I'll take a close look at this soon and will make sure we ship this in 2.4.0.

In the meantime, would you mind re-running the benchmarks to make sure there are no big regressions with the new implementation? Or maybe it's faster? Also to make sure the numbers in the release note are still accurate.

@math-hiyoko (Contributor, Author)

math-hiyoko commented Oct 17, 2025

I already ran the benchmarks and confirmed there are no big regressions.
The result table in the PR comment #29537 (comment) has been updated for the latest commit.


@ngoldbaum ngoldbaum left a comment


ping @seberg and @inakleinbottle in case either of you would like to give this a once-over and hit the merge button if you don't spot any issues

```cpp
#if defined(HAVE_LDOUBLE_IBM_DOUBLE_DOUBLE_BE) || defined(HAVE_LDOUBLE_IBM_DOUBLE_DOUBLE_LE)
    constexpr size_t SIZEOF_BUFFER = NPY_SIZEOF_CLONGDOUBLE;
    const unsigned char* buffer = reinterpret_cast<const unsigned char*>(value);
#else
```


The approach below looks correct to me. My only comment is that you might be able to add an `#elif` branch for platforms where long double and double are identical, avoiding the possibly slightly more expensive but standards-compliant version below for 80-bit long double on Linux. I don't know offhand if we already have preprocessor macros defined for platforms where long double is the same as double. If not, then this is fine as-is and no need for you to figure that out to merge this PR.

@math-hiyoko (Contributor, Author)


In practice, long double representations that contain junk bits only appear in the following cases:
• HAVE_LDOUBLE_INTEL_EXTENDED_12_BYTES_LE
• HAVE_LDOUBLE_INTEL_EXTENDED_16_BYTES_LE
• HAVE_LDOUBLE_MOTOROLA_EXTENDED_12_BYTES_BE

Therefore, the high-cost special handling is actually required only for these cases.
For all other platforms, `const unsigned char* buffer = reinterpret_cast<const unsigned char*>(value);` should be sufficient.

Actually, macros such as HAVE_LDOUBLE_IEEE_DOUBLE_BE indicate that long double has the same representation as double in big-endian format.

```c
#if defined(HAVE_LDOUBLE_INTEL_EXTENDED_12_BYTES_LE)
/*
 * Intel extended 80 bits precision. Bit representation is
 *   |  junk  |   s   |eeeeeeeeeeeeeee|mmmmmmmm................mmmmmmm|
 *   | 16 bits| 1 bit |    15 bits    |            64 bits            |
 *   |  a[2]  |           a[1]        |             a[0]              |
 *
 * 16 stronger bits of a[2] are junk
 */
typedef npy_uint32 IEEEl2bitsrep_part;

union IEEEl2bitsrep {
    npy_longdouble     e;
    IEEEl2bitsrep_part a[3];
};

#define LDBL_MANL_INDEX 0
#define LDBL_MANL_MASK  0xFFFFFFFF
#define LDBL_MANL_SHIFT 0

#define LDBL_MANH_INDEX 1
#define LDBL_MANH_MASK  0xFFFFFFFF
#define LDBL_MANH_SHIFT 0

#define LDBL_EXP_INDEX  2
#define LDBL_EXP_MASK   0x7FFF
#define LDBL_EXP_SHIFT  0

#define LDBL_SIGN_INDEX 2
#define LDBL_SIGN_MASK  0x8000
#define LDBL_SIGN_SHIFT 15

#define LDBL_NBIT       0x80000000

typedef npy_uint32 ldouble_man_t;
typedef npy_uint32 ldouble_exp_t;
typedef npy_uint32 ldouble_sign_t;
#elif defined(HAVE_LDOUBLE_INTEL_EXTENDED_16_BYTES_LE)
/*
 * Intel extended 80 bits precision, 16 bytes alignment. Bit representation is
 *   |  junk  |   s   |eeeeeeeeeeeeeee|mmmmmmmm................mmmmmmm|
 *   | 16 bits| 1 bit |    15 bits    |            64 bits            |
 *   |  a[2]  |           a[1]        |             a[0]              |
 *
 * a[3] and 16 stronger bits of a[2] are junk
 */
typedef npy_uint32 IEEEl2bitsrep_part;

union IEEEl2bitsrep {
    npy_longdouble     e;
    IEEEl2bitsrep_part a[4];
};

#define LDBL_MANL_INDEX 0
#define LDBL_MANL_MASK  0xFFFFFFFF
#define LDBL_MANL_SHIFT 0

#define LDBL_MANH_INDEX 1
#define LDBL_MANH_MASK  0xFFFFFFFF
#define LDBL_MANH_SHIFT 0

#define LDBL_EXP_INDEX  2
#define LDBL_EXP_MASK   0x7FFF
#define LDBL_EXP_SHIFT  0

#define LDBL_SIGN_INDEX 2
#define LDBL_SIGN_MASK  0x8000
#define LDBL_SIGN_SHIFT 15

#define LDBL_NBIT       0x80000000

typedef npy_uint32 ldouble_man_t;
typedef npy_uint32 ldouble_exp_t;
typedef npy_uint32 ldouble_sign_t;
#elif defined(HAVE_LDOUBLE_MOTOROLA_EXTENDED_12_BYTES_BE)
/*
 * Motorola extended 80 bits precision. Bit representation is
 *   |   s   |eeeeeeeeeeeeeee|  junk  |mmmmmmmm................mmmmmmm|
 *   | 1 bit |    15 bits    | 16 bits|            64 bits            |
 *   |              a[0]              |      a[1]      |     a[2]     |
 *
 * 16 low bits of a[0] are junk
 */
typedef npy_uint32 IEEEl2bitsrep_part;

union IEEEl2bitsrep {
    npy_longdouble     e;
    IEEEl2bitsrep_part a[3];
};

#define LDBL_MANL_INDEX 2
#define LDBL_MANL_MASK  0xFFFFFFFF
#define LDBL_MANL_SHIFT 0

#define LDBL_MANH_INDEX 1
#define LDBL_MANH_MASK  0xFFFFFFFF
#define LDBL_MANH_SHIFT 0

#define LDBL_EXP_INDEX  0
#define LDBL_EXP_MASK   0x7FFF0000
#define LDBL_EXP_SHIFT  16

#define LDBL_SIGN_INDEX 0
#define LDBL_SIGN_MASK  0x80000000
#define LDBL_SIGN_SHIFT 31

#define LDBL_NBIT       0x80000000

typedef npy_uint32 ldouble_man_t;
typedef npy_uint32 ldouble_exp_t;
typedef npy_uint32 ldouble_sign_t;
```
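
The macros above slice a long double into sign, exponent, and mantissa fields while skipping the junk bits. The analogous decomposition for an ordinary 64-bit IEEE 754 double (1 sign bit, 11 exponent bits, 52 mantissa bits, no junk bits) can be sketched like this, purely to illustrate the field extraction the C code performs:

```python
import struct

def decompose(x: float):
    """Split an IEEE 754 binary64 value into (sign, exponent, mantissa) bit fields."""
    bits = struct.unpack('<Q', struct.pack('<d', x))[0]
    sign = bits >> 63                    # top bit
    exponent = (bits >> 52) & 0x7FF      # next 11 bits
    mantissa = bits & ((1 << 52) - 1)    # low 52 bits
    return sign, exponent, mantissa
```

Packing these fields into a buffer and hashing the buffer, rather than hashing the raw bytes, is what sidesteps the undefined-bit problem for extended-precision formats.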

@ngoldbaum ngoldbaum added this to the 2.4.0 release milestone Oct 20, 2025
@ngoldbaum (Member)

Alright, I’d like to have this merged with a little time to spare before 2.4.0 comes out. I doubt anyone else will look closely before the release, and I've already held this up long enough.

Thanks for your diligence here @math-hiyoko.

@ngoldbaum ngoldbaum merged commit fa84b04 into numpy:main Oct 30, 2025
76 checks passed
@math-hiyoko math-hiyoko deleted the feature/#28363 branch October 30, 2025 02:32
cakedev0 pushed a commit to cakedev0/numpy that referenced this pull request Dec 5, 2025

* save: editting

* enh: implement unique_complex

* fix: spin build

* enh: impl unique_numeric for integer, float and complex

* fix: case when complex is nan

* fix: case when order is not guaranteed

* fix: lint

* fix: type of data

* fix: error

* enh: change template type for unordered_set

* enh: set function

* enh: refactoring

* enh: refactoring

* enh: refactor test

* enh: apply inline

* doc: performance and change

* enh: test check -0.0 and -np.nan

* enh: use fnv-hash for complex

* fix: add const

* fix: use memcpy

* fix: use static_cast

* fix: remove support for float

* fix: change variable name

* fix: inline

* fix: template type

* fix: padding buffer

* fix: padding buffer

* fix: use buffer

* fix: add constexpr

* fix: copy of long double

* fix: valie_real -> value_imag

* fix: reinterpret value pointer

* fix: use NPY_BITSOF_*

* fix: use fnv magic prime

* enh: refactoring

* doc: fix docs

* doc: fix docs

* enh: add comment

* fix: use sizeof to hash complex

* fix: use reinterpret_cast

* fix: impl hash_complex_large

* fix: use constexpr

* fix: define hash_complex_clongdouble

* fix: use NPY_SIZEOF_LONGDOUBLE

* fix: use IEEEl2bitsrep

* fix: remove designated initializers

* fix: comment

* fix: small fix

* fix: change case for handling bit expression

* fix: refactoring
IndifferentArea pushed a commit to IndifferentArea/numpy that referenced this pull request Dec 7, 2025

Labels

`01 - Enhancement`, `07 - Deprecation`, `triaged` (Issue/PR that was discussed in a triage meeting)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

np.unique: support float dtypes

4 participants