mrconvert: Operate using native data type#200
Conversation
|
You masochist... Couldn't resist either... On $ touch cmd/mrconvert.cpp
$ time ./build
[CC] cmd/mrconvert.o
[LB] bin/mrconvert
real 0m7.273s
user 0m7.153s
sys 0m0.223sOn $ touch cmd/mrconvert.cpp
$ time ./build
[CC] cmd/mrconvert.o
[LB] bin/mrconvert
real 0m17.847s
user 0m17.827s
sys 0m0.330sLooks like the compiler's getting a good workout then... ;) Kidding aside, just one question: I notice you no longer branch with a direct copy when |
|
Yeah, I tend to do these things by changing whatever's necessary to remove the code branching, then adding the templating; guess I forgot to look again to see if the extract could be bypassed. I'll have another play tomorrow. |
|
Hmm, compile time seems to have further doubled. Wonder why... |
|
Hehe... That's why I wasn't all that keen on the idea initially, having witnessed the compile times for Maybe we can reduce this by lumping all the integer operations into a |
|
But, but, ... |
|
Hehe... yes, sorry about your bubble... By the way, the rationale for the extensive template action in Actually, about that last point, we should verify that operating on a real input image to generate a complex output still works... |
|
Yeah I know. Do anything for a giggle. bool, 8 and 16 bits can go. I'd be tempted to keep both (u)int32 and (u)int64 though, just so that most integer ops work in 32 bit but the 64-bit version is there just in case someone's testing the limits. I'd imagine float<->double conversions would cost bugger all. But I'd rather test to see if there's any speed penalty first. Type selection is based on the output image. Though I would have thought that kind of operation would be handled using |
|
Yes, sorry, no need for the 32 bit versions - it's all going to be handled as 64 quantities anyway on modern systems. No need to optimise for 32-bit machines, they'll be a thing of the past soon enough, and anyone serious about performance will be running the 64-bit version anyway. Just stick to
Yes, scratch that, getting myself in a muddle... |
|
@Lestropie - how are you getting on with this one? You want to leave it as-is, or try to reduce the compile times as we discussed...? |
|
Was just talking about it with Thijs before I left work. Will cut some templates out. My understanding though was that even 64-bit processors tend to prefer operating on 32-bit integers rather than 64? I'll speed test everything anyways. |
Not sure - it sounds unnecessary for the processors to duplicate their integer pipelines for this. Where it might make a difference is purely in terms of the amount of RAM used to store each row of values. Doubling the size needed might mean it exceeds the CPU L1 cache, which could potentially slow it down. But frankly, I doubt you'd notice the difference...
Good idea. |
|
@Lestropie - I'm thinking we should merge this now, and leave cutting back on the number of template types for later. I'm keen to make sure this can be merged cleanly, which means not leaving it to linger too long. And also there's #211 looming... |
|
OK, finally got around to testing this.
Compile time back down to ~20s on my system, which I think is acceptable. |
Not sure I'd be that bothered about 15%... But I guess removing the 32 bit versions isn't going to make all that much difference to the compile times now. Let's merge. |
mrconvert now performs copy operation using the data type of the output image.
On a straight copy of a floating-point image, this provides a speedup of about 30%.
It also means that image values will not lose precision due to being cast to a complex float during copy.
Discussed in #196.
Closes #3.