Use f64::total_cmp instead of OrderedFloat#4133
Conversation
|
@tustvold I have replaced almost all entries OrderedFloat to f64. Still thinking how to use you hasher to remove OrderedFloat from Hash. |
|
@comphead I would recommend creating a newtype wrapper around a float that implements Hash using hash_utils, eq using total_cmp, etc... |
Hi @tustvold I have implemented hasher through |
datafusion/common/src/scalar.rs
Outdated
| let v2 = v2.map(OrderedFloat); | ||
| v1.partial_cmp(&v2) | ||
| } | ||
| (Float32(v1), Float32(v2)) => v1.partial_cmp(v2), |
There was a problem hiding this comment.
| (Float32(v1), Float32(v2)) => v1.partial_cmp(v2), | |
| (Float32(v1), Float32(v2)) => v1.total_cmp(v2), |
There was a problem hiding this comment.
v1 is an Option<f32>, it supports partial_cmp, not total_cmp. let me know if you ok if I unwrap it to floats, the same way as done for Decimals.
There was a problem hiding this comment.
Yes, we will need to match the option, I keep forgetting that ScalarValue has typed nulls for some reason 😆
partial_cmp on Option, will call through to partial_cmp on f32, which is not the same as total_cmp
There was a problem hiding this comment.
Done. Yeah, I checked that Float64(NULL) == Float64(NULL) now.
| let v = v.map(OrderedFloat); | ||
| v.hash(state) | ||
| } | ||
| Float32(v) => v.map(Fl).hash(state), |
There was a problem hiding this comment.
I think this can just call HashValue on v?
There was a problem hiding this comment.
v is Option<f32> is supports Hash, but we have to wrap f32 into some wrapper supporting hash. Fl in this case, I didn't find how to implement Hash directly on f32/f64
There was a problem hiding this comment.
Fair, I think there is a way to clean this up, but we can do that in a follow on PR
datafusion/common/src/scalar.rs
Outdated
| let v1 = v1.map(OrderedFloat); | ||
| let v2 = v2.map(OrderedFloat); | ||
| v1.eq(&v2) | ||
| // Handle NaN == NaN as true manually like in OrderedFloat |
There was a problem hiding this comment.
To be consistent with the hash implementation, this should also use total_cmp. Otherwise two "equal" values according to PartialEq, e.g. +0 and -0, will have different hashes
There was a problem hiding this comment.
What if
match (v1, v2) {
(Some(f1), Some(f2)) => f1.total_cmp(f2).is_eq(),
_ => v1.eq(v2),
}
| .map(f64::from) | ||
| .map(|v| OrderedFloat::from(v as f64)) | ||
| .collect(); | ||
| let values: Vec<_> = (1..=1_000).map(f64::from).map(|v| v as f64).collect(); |
There was a problem hiding this comment.
| let values: Vec<_> = (1..=1_000).map(f64::from).map(|v| v as f64).collect(); | |
| let values: Vec<_> = (1..=1_000).map(f64::from).collect(); |
| for _ in 0..400_000 { | ||
| values.push(OrderedFloat::from(1_000_000_f64)); | ||
| } | ||
| let mut values: Vec<_> = (1..=600_000).map(f64::from).map(|v| v as f64).collect(); |
There was a problem hiding this comment.
| let mut values: Vec<_> = (1..=600_000).map(f64::from).map(|v| v as f64).collect(); | |
| let mut values: Vec<_> = (1..=600_000).map(f64::from).collect(); |
| .map(f64::from) | ||
| .map(|v| OrderedFloat::from(v as f64)) | ||
| .collect(); | ||
| let values: Vec<_> = (1..=1_000_000).map(f64::from).map(|v| v as f64).collect(); |
There was a problem hiding this comment.
| let values: Vec<_> = (1..=1_000_000).map(f64::from).map(|v| v as f64).collect(); | |
| let values: Vec<_> = (1..=1_000_000).map(f64::from).collect(); |
datafusion/common/src/scalar.rs
Outdated
| let v2 = v2.map(OrderedFloat); | ||
| v1.partial_cmp(&v2) | ||
| } | ||
| (Float32(Some(f1)), Float32(Some(f2))) => Some(f1.total_cmp(f2)), |
There was a problem hiding this comment.
I think this will now return None when comparing nulls, which isn't consistent with the other types
|
Benchmark runs are scheduled for baseline = 509c82c and contender = 5883e43. 5883e43 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
Which issue does this PR close?
Closes #4051 .
Rationale for this change
See #4051
What changes are included in this PR?
Are there any user-facing changes?