Skip to content

Change number type from Float64 to Float32#1604

Closed
huiyuxie wants to merge 0 commit intotrixi-framework:mainfrom
huiyuxie:main
Closed

Change number type from Float64 to Float32#1604
huiyuxie wants to merge 0 commit intotrixi-framework:mainfrom
huiyuxie:main

Conversation

@huiyuxie
Copy link
Copy Markdown
Member

This is just the start of my work, and I have something to inform you @ranocha about, so please don't merge them directly.

In conclusion, there are three cases I'd like to talk about,
Case 1: something like 0.5, 0.25, and etc., we can directly convert them to the corresponding Float32 format (i.e., 0.5f0 and 0.25f0).
Case 2: something like 1 / 2, 3 / 2 and etc., we can also convert them to Float32 (i.e., 0.5f0 and 1.5f0) without losing information.
Case 3: when numbers like 0.5 or 1 / 2 combined with comparison, e.g., if a > 0.5, I choose to keep it as original since 1) it does not participate in the process of actual computation and 2) it is better to keep the new code consistent with original as much as possible.

I think what you mainly want to achieve in this PR is focused on the part of PDE/ODE calculation (i.e., anything before the callbacks) and does not include the callback steps. So now I choose to skip those callback directories (e.g. callbacks_stage) under src directory. However, if you also want to apply this PR to the callback part (like making the result of the runtime counter Float32), please inform me :)

But as I said before, this way cannot guarantee that the repository works for both input of type Float32 and Float64 (to get the corresponding output of the same type).
(1) It cannot guarantee that the output is of type Float64 when the input is also Float64. For example,

function foo1(input1, input2, input3)
    a = input1 + input2 + input3 # get a as `Float64`
    b = a > 0.0 ? 1.0f0 : input1 * input2 * input3 # get b as `Float32` or `Float64`
    return b
end

and if, in the process of calculation, functions like this are invoked too much, then the result may probably be of type Float32, even though the input is Float64.
(2) Similarly, it cannot guarantee that the output is of type Float32 when the input is also Float32. For example,

function foo2(input1, input2, input3)
    a = input1 + input2 + input3 # get a as `Float32`
    b = 0.8 * a # get b as `Float64`
    return b
end

this time the type promotion probably happens in the process of calculation because some numbers are not required to convert to Float32 (like 0.8 in the above example). Also, a more typical example could be

function foo3(input1, input2, input3)
    a1 = 0.5f0 * input1 # be of type `Float32`
    a2 = 0.5f0 * input2 # be of type `Float32`
    a3 = 0.1 * input3 # be of type `Float64`
    return SVector(a1, a2, a3) # get all types of `Float64` due to type promotion
end

here the SVector appears frequently in code and the sample like it actually exists.

Please tell me how you think about @ranocha and if you would like to keep going then I would like to continue because first I promised you before and second if I refuse to do this then you probably be the one who have to do such work and I think your time should not waste on something like this. But this work is much more time-consuming and boring than I expected, and I probably choose to complete a small part of them each day (sorry I cannot complete in one day) and please tell me when you would like to make it formally merge into the main branch.

Also, I hope you choose me to work on this not because you think I am only suitable for doing such things. And I still have to say that, from my personal perspective, this way is not good.

@ranocha
Copy link
Copy Markdown
Member

ranocha commented Aug 15, 2023

You mentioned that this PR is still a draft and not ready to be merged. Thus, I converted it to draft mode.

@ranocha ranocha marked this pull request as draft August 15, 2023 06:09
@ranocha
Copy link
Copy Markdown
Member

ranocha commented Aug 15, 2023

In conclusion, there are three cases I'd like to talk about,
Case 1: something like 0.5, 0.25, and etc., we can directly convert them to the corresponding Float32 format (i.e., 0.5f0 and 0.25f0).
Case 2: something like 1 / 2, 3 / 2 and etc., we can also convert them to Float32 (i.e., 0.5f0 and 1.5f0) without losing information.
Case 3: when numbers like 0.5 or 1 / 2 combined with comparison, e.g., if a > 0.5, I choose to keep it as original since 1) it does not participate in the process of actual computation and 2) it is better to keep the new code consistent with original as much as possible.

I definitely agree with cases 1 and 2. I am not so sure about case 3, though. If such a comparison happens in a kernel, we also would like to ensure that it uses the appropriate floating point precision. We would hit

<( x::Real, y::Real)     = (< )(promote(x,y)...)

in https://github.com/JuliaLang/julia/blob/fd38d50cefd17c2e31df34c6b6c4a5b7c721c57d/base/promotion.jl#L462 with mixed types, so we would get the same promotion system as with addition/multiplication etc.

@ranocha
Copy link
Copy Markdown
Member

ranocha commented Aug 15, 2023

(1) It cannot guarantee that the output is of type Float64 when the input is also Float64. For example,

function foo1(input1, input2, input3)
    a = input1 + input2 + input3 # get a as `Float64`
    b = a > 0.0 ? 1.0f0 : input1 * input2 * input3 # get b as `Float32` or `Float64`
    return b
end

and if, in the process of calculation, functions like this are invoked too much, then the result may probably be of type Float32, even though the input is Float64.

That would be a type instability that we definitely have to avoid by using type promotion and things like zero(T), one(T) etc. If you point me to some concrete examples of constructs like this in our code, I could make a suggestion how to handle them.

@ranocha
Copy link
Copy Markdown
Member

ranocha commented Aug 15, 2023

(2) Similarly, it cannot guarantee that the output is of type Float32 when the input is also Float32. For example,

function foo2(input1, input2, input3)
    a = input1 + input2 + input3 # get a as `Float32`
    b = 0.8 * a # get b as `Float64`
    return b
end

this time the type promotion probably happens in the process of calculation because some numbers are not required to convert to Float32 (like 0.8 in the above example). Also, a more typical example could be

function foo3(input1, input2, input3)
    a1 = 0.5f0 * input1 # be of type `Float32`
    a2 = 0.5f0 * input2 # be of type `Float32`
    a3 = 0.1 * input3 # be of type `Float64`
    return SVector(a1, a2, a3) # get all types of `Float64` due to type promotion
end

here the SVector appears frequently in code and the sample like it actually exists.

That's exactly the kind of situation where we would also need to use generic constructs as I mentioned in my reply to (1) above. However, this may make the code a bit harder to read and may require more discussion. Thus, it may be a good approach to convert the unambiguous and easy cases such as 0.5 and 0.25 first and consider the other cases in another PR.
Having an explicit list of these more difficult cases would be great. For example, a draft PR could just add # TODO: Float32 notes to them at first to get an overview of the amount and type of change required.

@ranocha
Copy link
Copy Markdown
Member

ranocha commented Aug 15, 2023

Please tell me how you think about @ranocha and if you would like to keep going then I would like to continue because first I promised you before and second if I refuse to do this then you probably be the one who have to do such work and I think your time should not waste on something like this. But this work is much more time-consuming and boring than I expected, and I probably choose to complete a small part of them each day (sorry I cannot complete in one day) and please tell me when you would like to make it formally merge into the main branch.

Also, I hope you choose me to work on this not because you think I am only suitable for doing such things. And I still have to say that, from my personal perspective, this way is not good.

I do agree that this is a kind of boring work - and that you are definitely capable of much more interesting and demanding work. But it is something we need to consider carefully to make Trixi.jl work nicely with different floating point types etc.

Copy link
Copy Markdown
Member

@ranocha ranocha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a first set of comments that should be representative. It would be great if you could adapt the PR accordingly

@huiyuxie
Copy link
Copy Markdown
Member Author

huiyuxie commented Aug 15, 2023

We would hit <( x::Real, y::Real) = (< )(promote(x,y)...) in https://github.com/JuliaLang/julia/blob/fd38d50cefd17c2e31df34c6b6c4a5b7c721c57d/base/promotion.jl#L462 with mixed types, so we would get the same promotion system as with addition/multiplication etc.

Yes, and I think this promotion is temporary (not permanent) for comparison. The type of x and y will not change after the comparison so we don't have to care about using the appropriate floating point precision in this case. To be more clear,

# Calculate primitive variables
rho = r > 0.5 ? 1.0 : 1.1691
v1 = r > 0.5 ? 0.0 : 0.1882 * cos_phi
p = r > 0.5 ? 1.0 : 1.245
whether we change 0.5 to 0.5f0 or not, 1) the comparison result will not change and 2) the type of r will not change, so there is no need to change type in this case.

If you point me to some concrete examples of constructs like this in our code, I could make a suggestion how to handle them.

A good example that covers (1) should be

function initial_condition_weak_blast_wave(x, t,
equations::CompressibleEulerEquations1D)
# From Hennemann & Gassner JCP paper 2020 (Sec. 6.3)
# Set up polar coordinates
inicenter = SVector(0.0)
x_norm = x[1] - inicenter[1]
r = abs(x_norm)
# The following code is equivalent to
# phi = atan(0.0, x_norm)
# cos_phi = cos(phi)
# in 1D but faster
cos_phi = x_norm > 0 ? one(x_norm) : -one(x_norm)
# Calculate primitive variables
rho = r > 0.5 ? 1.0 : 1.1691
v1 = r > 0.5 ? 0.0 : 0.1882 * cos_phi
p = r > 0.5 ? 1.0 : 1.245
return prim2cons(SVector(rho, v1, p), equations)
end
Here, the types of rho, v1, and p are not decided by the function inputs but are more random (unstable).
Also, a good example that covers (2) should be
@inline function source_terms_convergence_test(u, x, t,
equations::CompressibleEulerEquations1D)
# Same settings as in `initial_condition`
c = 2
A = 0.1
L = 2
f = 1 / L
ω = 2 * pi * f
γ = equations.gamma
x1, = x
si, co = sincos* (x1 - t))
rho = c + A * si
rho_x = ω * A * co
# Note that d/dt rho = -d/dx rho.
# This yields du2 = du3 = d/dx p (derivative of pressure).
# Other terms vanish because of v = 1.
du1 = zero(eltype(u))
du2 = rho_x * (2 * rho - 0.5) *- 1)
du3 = du2
return SVector(du1, du2, du3)
end
Since the A cannot be change to Float32, A and thus rho are definitely be of type Float64, which leads to the result that du1 depends on user input but du2 and du3 are restricted to Float64, and the SVector will further make type promotion and thus (2) happens.

Thus, it may be a good approach to convert the unambiguous and easy cases such as 0.5 and 0.25 first and consider the other cases in another PR.

Do you mean the cases where all floating point numbers can be substantially converted to Float32 (i.e., there is no numbers like 0.1 and 0.2, etc.)?

Having an explicit list of these more difficult cases would be great. For example, a draft PR could just add # TODO: Float32 notes to them at first to get an overview of the amount and type of change required.

Will do!

Also, I forgot to mention another possible case that could be applied to (2) before. For example,

mutable struct MyStruct
    a::Float64
    b::Float64
    c::Float64
end

# Call the function that modifies `MyStruct.a` (or `MyStruct.b`, `MyStruct.c`)

Even though the function output is of type Float32 but it will be promoted to Float64 due to the field type declaration in the struct. This case could probably happens anywhere exists something like ::Float64 (i.e., not only for struct) so I think it is better to convert all ::Float64 to ::Union{Float32,Float64} or ::Real before this PR is merged.

@ranocha
Copy link
Copy Markdown
Member

ranocha commented Aug 15, 2023

A good example that covers (1) should be
Also, a good example that covers (2) should be

Thanks! These are part of the initial conditions and source terms, so I would like to postpone them to another PR for now.

Thus, it may be a good approach to convert the unambiguous and easy cases such as 0.5 and 0.25 first and consider the other cases in another PR.

Do you mean the cases where all floating point numbers can be substantially converted to Float32 (i.e., there is no numbers like 0.1 and 0.2, etc.)?

Yes, exactly 👍

@ranocha
Copy link
Copy Markdown
Member

ranocha commented Aug 15, 2023

Also, I forgot to mention another possible case that could be applied to (2) before. For example,

mutable struct MyStruct
    a::Float64
    b::Float64
    c::Float64
end

# Call the function that modifies `MyStruct.a` (or `MyStruct.b`, `MyStruct.c`)

Even though the function output is of type Float32 but it will be promoted to Float64 due to the field type declaration in the struct. This case could probably happens anywhere exists something like ::Float64 (i.e., not only for struct) so I think it is better to convert all ::Float64 to ::Union{Float32,Float64} or ::Real before this PR is merged.

That's also a good point. Most structs should be able to handle generic types already, e.g.,

struct CompressibleEulerEquations2D{RealT <: Real} <:
AbstractCompressibleEulerEquations{2, 4}
gamma::RealT # ratio of specific heats
inv_gamma_minus_one::RealT # = inv(gamma - 1); can be used to write slow divisions as fast multiplications
function CompressibleEulerEquations2D(gamma)
γ, inv_gamma_minus_one = promote(gamma, inv(gamma - 1))
new{typeof(γ)}(γ, inv_gamma_minus_one)
end
end

However, it would be great to get a list of the cases that need to be adapted, e.g.,
mutable struct SerialTree{NDIMS} <: AbstractTree{NDIMS}
parent_ids::Vector{Int}
child_ids::Matrix{Int}
neighbor_ids::Matrix{Int}
levels::Vector{Int}
coordinates::Matrix{Float64}
original_cell_ids::Vector{Int}
capacity::Int
length::Int
dummy::Int
center_level_0::SVector{NDIMS, Float64}
length_level_0::Float64
periodicity::NTuple{NDIMS, Bool}

@huiyuxie
Copy link
Copy Markdown
Member Author

I have completed the first review of the previous work based on your comments, and currently, there are no complex cases that need to be commented on (such cases exist in initial condition and source terms, but I was suggested not to change any of them now). Also, do you know why the format check always failed @ranocha? Thanks!

@codecov
Copy link
Copy Markdown

codecov bot commented Aug 16, 2023

Codecov Report

Merging #1604 (9c201d5) into main (0b8405a) will increase coverage by 6.54%.
Report is 3 commits behind head on main.
The diff coverage is 99.38%.

@@            Coverage Diff             @@
##             main    #1604      +/-   ##
==========================================
+ Coverage   89.63%   96.17%   +6.54%     
==========================================
  Files         406      406              
  Lines       33521    33532      +11     
==========================================
+ Hits        30045    32248    +2203     
+ Misses       3476     1284    -2192     
Flag Coverage Δ
unittests 96.17% <99.38%> (+6.54%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
src/equations/compressible_euler_3d.jl 97.05% <97.33%> (ø)
src/equations/acoustic_perturbation_2d.jl 100.00% <100.00%> (ø)
src/equations/compressible_euler_1d.jl 97.36% <100.00%> (ø)
src/equations/compressible_euler_2d.jl 99.12% <100.00%> (ø)
.../equations/compressible_euler_multicomponent_1d.jl 100.00% <100.00%> (ø)
.../equations/compressible_euler_multicomponent_2d.jl 100.00% <100.00%> (ø)
src/equations/compressible_navier_stokes_1d.jl 97.59% <100.00%> (ø)
src/equations/compressible_navier_stokes_2d.jl 98.06% <100.00%> (ø)
src/equations/compressible_navier_stokes_3d.jl 98.33% <100.00%> (ø)
src/equations/hyperbolic_diffusion_1d.jl 100.00% <100.00%> (ø)
... and 2 more

... and 36 files with indirect coverage changes

@JoshuaLampert
Copy link
Copy Markdown
Member

For the format-check to pass you can run the formatter as explained here.

Copy link
Copy Markdown
Member Author

@huiyuxie huiyuxie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will leave comments on cases that are not simple and clear. Please ignore something I change in the initial condition and source terms as I never change number types (only change the number format to make it consistent throughout the whole repo).

Comment on lines +155 to +161
# (4/3 * (v1)_x - 2/3 * (v2)_y)
tau_11 = 4.0 / 3.0 * dv1dx - 2.0 / 3.0 * dv2dy
tau_11 = 4 / 3 * dv1dx - 2 / 3 * dv2dy
# ((v1)_y + (v2)_x)
# stress tensor is symmetric
tau_12 = dv1dy + dv2dx # = tau_21
# (4/3 * (v2)_y - 2/3 * (v1)_x)
tau_22 = 4.0 / 3.0 * dv2dy - 2.0 / 3.0 * dv1dx
tau_22 = 4 / 3 * dv2dy - 2 / 3 * dv1dx
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are cases that we should keep on a list of more involved changes (also in 3D) since 4 / 3 will be a Float64 (but the current changes here are fine for now).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants