The most effective way of computing distance to sphere in CUDA?

Question

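I'm ray marching signed distance fields in CUDA, and the scene I'm rendering contains thousands of spheres. The sphere positions are stored in a device buffer, so my SDF function iterates through all of the spheres for each pixel.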

Currently, I compute the distance to the sphere surface as:

sqrtf( dot( pos - sphere_center, pos - sphere_center ) ) - sphere_radius

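With the sqrt() call, rendering my scene takes about 250 ms. However, when I remove the call to sqrt() and leave just dot( pos - sphere_center, pos - sphere_center ) - sphere_radius, the rendering time drops to 17 ms (and the result is a black image).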

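The sqrt() function seems to be the bottleneck, so I want to ask whether there is a way to improve my rendering time, either with a different formula that doesn't use a square root or with a different rendering approach.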

I'm already compiling with -use_fast_math.

Edit: I tried the formula suggested by Nico Schertler, but it didn't work in my renderer. Link to M(n)WE on Shadertoy.

Answer 1

(Making my comment into an answer, since it seems to have worked for the OP.)

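You're feeling the pain of having to compute sqrt(). I sympathize... if only you could just, umm, not do that. Well, what's stopping you? After all, the squared distance is a monotone function of the distance, from $R^+$ to $R^+$; hell, it's actually a convex bijection! The problem is that you have non-squared distances coming from elsewhere, and you compute: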

min(sqrt(square_distance_to_the_closest_sphere), 
    distance_to_the_closest_object_in_the_rest_of_the_scene)

So let's do things the other way around: instead of taking the square root of the squared distance to the sphere, square the other distance:

min(square_distance_to_the_closest_sphere,
    distance_to_the_closest_object_in_the_rest_of_the_scene^2)

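Because squaring is monotone on non-negative values, this makes the same choice as the non-squared min() computation. From here, try to propagate the use of squared distances further through your program, avoiding taking the root as much as possible, perhaps even all the way.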
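To make that concrete, here is a minimal sketch of one way to push the root out of the per-sphere loop. It rests on an assumption that is not in the question: every sphere has the same radius, so the nearest sphere is the one with the nearest center, and squared center distances can be compared directly. With per-sphere radii this shortcut does not apply as written. The names Vec3, min_center_dist_sq, scene_sdf and rest_dist are made up for illustration, and the HOST_DEVICE macro is only there so the same code builds under nvcc and a plain C++ compiler.

```cpp
#include <float.h>
#include <math.h>

// Lets the same functions compile as CUDA device code or as plain C++.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical stand-in for float3, so the sketch has no CUDA header dependency.
struct Vec3 { float x, y, z; };

// Squared distance from pos to the nearest sphere center. No sqrt in the loop.
HOST_DEVICE inline float min_center_dist_sq(Vec3 pos, const Vec3* centers, int n)
{
    float best = FLT_MAX;
    for (int i = 0; i < n; ++i) {
        float dx = pos.x - centers[i].x;
        float dy = pos.y - centers[i].y;
        float dz = pos.z - centers[i].z;
        best = fminf(best, dx * dx + dy * dy + dz * dz);
    }
    return best;
}

// Scene distance, ASSUMING all spheres share one radius.
// rest_dist is the ordinary (non-squared) distance to the rest of the scene.
HOST_DEVICE inline float scene_sdf(Vec3 pos, const Vec3* centers, int n,
                                   float radius, float rest_dist)
{
    // The only square root: taken once, after the loop has picked the winner.
    float sphere_dist = sqrtf(min_center_dist_sq(pos, centers, n)) - radius;
    return fminf(sphere_dist, rest_dist);
}
```

This costs one sqrtf per SDF evaluation instead of one per sphere. Whether that recovers most of the gap between 250 ms and 17 ms is something only profiling the real kernel can tell.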
