![]() |
Home | Libraries | People | FAQ | More |
#include <boost/math/special_functions/rsqrt.hpp> namespace boost::math { template<class Real> Real rsqrt(Real const & x); } // namespaces
The function rsqrt computes
the reciprocal square root 1/√x. Those in the game programming
community might suspect this is a fast, low precision wrapper around the
rsqrtss instruction.
This is not correct: We tried this instruction, but
found no performance benefit to using it. However, the trick
of computing a low precision reciprocal square root and then bootstrapping
to higher precision via Newton's method does work, but
it only yields a performance benefit for quad and higher precision. We do
of course allow you to use rsqrt
for float, double,
and long double,
but be aware there is no performance benefit to doing so. However, the savings
for quad precision and higher are very significant.
The use is
using boost::multiprecision::float128; float128 x = 0.1Q; float128 y = boost::math::rsqrt(x);
The reciprocal square root of +∞ is zero, and the reciprocal square root of a NaN is a NaN.
Performance:
Running ./reporting/performance/rsqrt_performance.x Run on (16 X 4300 MHz CPU s) CPU Caches: L1 Data 32 KiB (x8) L1 Instruction 32 KiB (x8) L2 Unified 1024 KiB (x8) L3 Unified 11264 KiB (x1) Load Average: 0.43, 0.49, 0.46 ---------------------------------------------------------------------------------- Benchmark Time CPU Iterations ---------------------------------------------------------------------------------- Rsqrt<float> 1.35 ns 1.35 ns 503364351 Rsqrt<double> 2.25 ns 2.25 ns 309753242 Rsqrt<long double> 2.68 ns 2.68 ns 261382652 Rsqrt<float128> 182 ns 182 ns 3756956 Rsqrt<number<mpfr_float_backend<100>>> 299 ns 299 ns 2494027 Rsqrt<number<mpfr_float_backend<200>>> 412 ns 412 ns 1589284 Rsqrt<number<mpfr_float_backend<300>>> 617 ns 617 ns 1067473 Rsqrt<number<mpfr_float_backend<400>>> 812 ns 812 ns 830564 Rsqrt<number<mpfr_float_backend<1000>>> 3183 ns 3183 ns 221079 Rsqrt<cpp_bin_float_50> 4321 ns 4321 ns 163243 Rsqrt<cpp_bin_float_100> 9393 ns 9393 ns 72967