David Nadlinger via Digitalmars-d

2014-06-27 01:31:14 UTC

Hi all,

right now, the use of std.math over core.stdc.math can cause a
huge performance problem in typical floating point graphics code.
An instance of this has recently been discussed here in the
"Perlin noise benchmark speed" thread [1], where even LDC, which
already beat DMD by a factor of two, generated code more than
twice as slow as that by Clang and GCC. Here, the use of floor()
causes trouble. [2]

Besides the somewhat slow pure D implementations in std.math, the
biggest problem is the fact that std.math almost exclusively uses
reals in its API. When working with single- or double-precision
floating point numbers, this is not only more data to shuffle
around than necessary, but on x86_64 requires the caller to
transfer the arguments from the SSE registers onto the x87 stack
and then convert the result back again. Needless to say, this is
a serious performance hazard. In fact, this accounts for a 1.9x
slowdown in the above benchmark with LDC.
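
To make the two call paths concrete, here is a minimal sketch. The wrapper names floorViaReal and floorViaFloat are made up for illustration; the first forces the real-typed std.math.floor, the second calls C's floorf from core.stdc.math. It only shows where the conversions happen and makes no claim about the code any particular compiler emits.

```d
static import std.math;
import core.stdc.math : floorf;

// Hypothetical wrapper: declaring the parameter as real forces the real
// version of std.math.floor, whichever overloads the installed Phobos has.
real floorViaReal(real x)
{
    return std.math.floor(x);
}

// Hypothetical wrapper: argument and result stay single precision.
float floorViaFloat(float x)
{
    return floorf(x);
}

void main()
{
    float x = 2.75f;

    // float -> real widening at the call, real -> float narrowing afterwards.
    float a = cast(float) floorViaReal(x);

    // No widening: float in, float out.
    float b = floorViaFloat(x);

    assert(a == 2.0f);
    assert(b == 2.0f);
}
```

Both paths give the same value here; the difference is only in the conversions around the call.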

Because of this, I propose to add float and double overloads (at
the very least the double ones) for all of the commonly used
functions in std.math. This is unlikely to break much code, but:

a) Somebody could rely on the fact that the calls effectively
widen the calculation to 80 bits on x86 when using type deduction.

b) Additional overloads make e.g. "&floor" ambiguous without
context, of course.
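
As a rough sketch of what such an overload set could look like, and of points a) and b), consider the following. The name myFloor is hypothetical and chosen to avoid clashing with std.math; the bodies simply forward to the C functions in core.stdc.math and are not meant as the actual implementation.

```d
import core.stdc.math : cfloor = floor, floorf, floorl;

// Hypothetical overload set; each overload keeps its argument's precision.
float  myFloor(float x)  { return floorf(x); }
double myFloor(double x) { return cfloor(x); }
real   myFloor(real x)   { return floorl(x); }

void main()
{
    float f = -1.5f;
    double d = -1.5;

    // Overload resolution picks the exact match for each argument type.
    assert(myFloor(f) == -2.0f);
    assert(myFloor(d) == -2.0);

    // Point a): with type deduction the result is now float, where a
    // real-only API would have produced a real.
    auto r = myFloor(f);
    static assert(is(typeof(r) == float));

    // Point b): a bare &myFloor does not say which overload is meant;
    // an explicit target type selects one.
    double function(double) fp = &myFloor;
    assert(fp(2.5) == 2.0);
}
```

Code that wants the old widening behaviour could still ask for it explicitly, e.g. by casting the argument to real at the call site.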

What do you think?

Cheers,
David

[1] http://forum.dlang.org/thread/lo19l7$n2a$1@digitalmars.com

[2] Fun fact: As the program happens to only deal with positive
numbers, the author could have just inserted an int-to-float
cast, sidestepping the issue altogether. All the other language
implementations have the floor() call too, though, so it doesn't
matter for this discussion.
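
For illustration, the cast-based shortcut from [2] might look like the sketch below. The helper name floorViaCast is made up here, and the round trip through int is an assumption about what such a cast would look like in practice. It agrees with floor() only for non-negative values that fit into an int.

```d
import core.stdc.math : floorf;

// Hypothetical helper: truncate to int, then convert back to float.
// Truncation rounds toward zero, so this matches floor() only for x >= 0.
float floorViaCast(float x)
{
    return cast(float) cast(int) x;
}

void main()
{
    // Non-negative input: same result as floorf.
    assert(floorViaCast(3.7f) == 3.0f);
    assert(floorViaCast(3.7f) == floorf(3.7f));

    // Negative input: truncation and floor disagree.
    assert(floorViaCast(-3.7f) == -3.0f);
    assert(floorf(-3.7f) == -4.0f);
}
```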
