Change number type from Float64 to Float32#1604
Change number type from Float64 to Float32#1604huiyuxie wants to merge 0 commit intotrixi-framework:mainfrom huiyuxie:main
Float64 to Float32#1604Conversation
|
You mentioned that this PR is still a draft and not ready to be merged. Thus, I converted it to draft mode. |
I definitely agree with cases 1 and 2. I am not so sure about case 3, though. If such a comparison happens in a kernel, we also would like to ensure that it uses the appropriate floating point precision. We would hit <( x::Real, y::Real) = (< )(promote(x,y)...)in https://github.com/JuliaLang/julia/blob/fd38d50cefd17c2e31df34c6b6c4a5b7c721c57d/base/promotion.jl#L462 with mixed types, so we would get the same promotion system as with addition/multiplication etc. |
That would be a type instability that we definitely have to avoid by using type promotion and things like |
That's exactly the kind of situation where we would also need to use generic constructs as I mentioned in my reply to (1) above. However, this may make the code a bit harder to read and may require more discussion. Thus, it may be a good approach to convert the unambiguous and easy cases such as |
I do agree that this is a kind of boring work - and that you are definitely capable of much more interesting and demanding work. But it is something we need to consider carefully to make Trixi.jl work nicely with different floating point types etc. |
ranocha
left a comment
There was a problem hiding this comment.
Here is a first set of comments that should be representative. It would be great if you could adapt the PR accordingly
Yes, and I think this promotion is temporary (not permanent) for comparison. The type of Trixi.jl/src/equations/compressible_euler_1d.jl Lines 162 to 165 in add2542 0.5 to 0.5f0 or not, 1) the comparison result will not change and 2) the type of r will not change, so there is no need to change type in this case.
A good example that covers (1) should be Trixi.jl/src/equations/compressible_euler_1d.jl Lines 149 to 168 in add2542 rho, v1, and p are not decided by the function inputs but are more random (unstable).Also, a good example that covers (2) should be Trixi.jl/src/equations/compressible_euler_1d.jl Lines 92 to 116 in add2542 A cannot be change to Float32, A and thus rho are definitely be of type Float64, which leads to the result that du1 depends on user input but du2 and du3 are restricted to Float64, and the SVector will further make type promotion and thus (2) happens.
Do you mean the cases where all floating point numbers can be substantially converted to
Will do! Also, I forgot to mention another possible case that could be applied to (2) before. For example, mutable struct MyStruct
a::Float64
b::Float64
c::Float64
end
# Call the function that modifies `MyStruct.a` (or `MyStruct.b`, `MyStruct.c`)Even though the function output is of type |
Thanks! These are part of the initial conditions and source terms, so I would like to postpone them to another PR for now.
Yes, exactly 👍 |
That's also a good point. Most Trixi.jl/src/equations/compressible_euler_2d.jl Lines 40 to 49 in 7f83a1a However, it would be great to get a list of the cases that need to be adapted, e.g., Trixi.jl/src/meshes/serial_tree.jl Lines 28 to 42 in a4283e1 |
|
I have completed the first review of the previous work based on your comments, and currently, there are no complex cases that need to be commented on (such cases exist in initial condition and source terms, but I was suggested not to change any of them now). Also, do you know why the format check always failed @ranocha? Thanks! |
Codecov Report
@@ Coverage Diff @@
## main #1604 +/- ##
==========================================
+ Coverage 89.63% 96.17% +6.54%
==========================================
Files 406 406
Lines 33521 33532 +11
==========================================
+ Hits 30045 32248 +2203
+ Misses 3476 1284 -2192
Flags with carried forward coverage won't be shown. Click here to find out more.
|
|
For the format-check to pass you can run the formatter as explained here. |
huiyuxie
left a comment
There was a problem hiding this comment.
I will leave comments on cases that are not simple and clear. Please ignore something I change in the initial condition and source terms as I never change number types (only change the number format to make it consistent throughout the whole repo).
| # (4/3 * (v1)_x - 2/3 * (v2)_y) | ||
| tau_11 = 4.0 / 3.0 * dv1dx - 2.0 / 3.0 * dv2dy | ||
| tau_11 = 4 / 3 * dv1dx - 2 / 3 * dv2dy | ||
| # ((v1)_y + (v2)_x) | ||
| # stress tensor is symmetric | ||
| tau_12 = dv1dy + dv2dx # = tau_21 | ||
| # (4/3 * (v2)_y - 2/3 * (v1)_x) | ||
| tau_22 = 4.0 / 3.0 * dv2dy - 2.0 / 3.0 * dv1dx | ||
| tau_22 = 4 / 3 * dv2dy - 2 / 3 * dv1dx |
There was a problem hiding this comment.
These are cases that we should keep on a list of more involved changes (also in 3D) since 4 / 3 will be a Float64 (but the current changes here are fine for now).
This is just the start of my work, and I have something to inform you @ranocha about, so please don't merge them directly.
In conclusion, there are three cases I'd like to talk about,
Case 1: something like
0.5,0.25, and etc., we can directly convert them to the correspondingFloat32format (i.e.,0.5f0and0.25f0).Case 2: something like
1 / 2,3 / 2and etc., we can also convert them toFloat32(i.e.,0.5f0and1.5f0) without losing information.Case 3: when numbers like
0.5or1 / 2combined with comparison, e.g.,if a > 0.5, I choose to keep it as original since 1) it does not participate in the process of actual computation and 2) it is better to keep the new code consistent with original as much as possible.I think what you mainly want to achieve in this PR is focused on the part of PDE/ODE calculation (i.e., anything before the callbacks) and does not include the callback steps. So now I choose to skip those callback directories (e.g.
callbacks_stage) undersrcdirectory. However, if you also want to apply this PR to the callback part (like making the result of the runtime counterFloat32), please inform me :)But as I said before, this way cannot guarantee that the repository works for both input of type
Float32andFloat64(to get the corresponding output of the same type).(1) It cannot guarantee that the output is of type
Float64when the input is alsoFloat64. For example,and if, in the process of calculation, functions like this are invoked too much, then the result may probably be of type
Float32, even though the input isFloat64.(2) Similarly, it cannot guarantee that the output is of type
Float32when the input is alsoFloat32. For example,this time the type promotion probably happens in the process of calculation because some numbers are not required to convert to
Float32(like0.8in the above example). Also, a more typical example could behere the
SVectorappears frequently in code and the sample like it actually exists.Please tell me how you think about @ranocha and if you would like to keep going then I would like to continue because first I promised you before and second if I refuse to do this then you probably be the one who have to do such work and I think your time should not waste on something like this. But this work is much more time-consuming and boring than I expected, and I probably choose to complete a small part of them each day (sorry I cannot complete in one day) and please tell me when you would like to make it formally merge into the
mainbranch.Also, I hope you choose me to work on this not because you think I am only suitable for doing such things. And I still have to say that, from my personal perspective, this way is not good.