How can I fill this buffer faster in Clojure?

Question

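I'm trying to write a function that takes a BufferedImage and returns a ByteBuffer, which I can then use as an OpenGL texture. To do that, I learned that I have to do some byte shifting, which isn't really relevant to my question. It has to do with BufferedImage values being ARGB while OpenGL wants RGBA.

The function I'm trying to implement (from Java) is this: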

public static ByteBuffer toByteBuffer(BufferedImage img){
    byte[] byteArray = new byte[img.getWidth()*img.getHeight()*4];
    for(int i = 0; i < img.getWidth()*img.getHeight(); i++){
        int value = img.getRGB(i%img.getWidth(), (i-(i%img.getWidth()))/img.getWidth() );
        byteArray[i*4] = (byte) ((value<<8)>>24); 
        byteArray[i*4+1] = (byte) ((value<<16)>>24);
        byteArray[i*4+2] = (byte) ((value<<24)>>24);
        byteArray[i*4+3] = (byte) (value>>24); 
    }
    return (ByteBuffer) ByteBuffer.allocateDirect(byteArray.length).put(byteArray).flip();
}
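
The channel reordering can be checked on a single pixel. The following is a minimal sketch in Clojure, not part of the original code: the name argb->rgba is hypothetical, and it uses plain right shifts followed by truncation instead of the shift-left/shift-right pairs above, which should select the same bytes.

```clojure
;; argb->rgba is a hypothetical helper name used only for this illustration.
(defn argb->rgba
  "Splits a packed 32-bit ARGB pixel into four signed bytes in RGBA order."
  [^long argb]
  [(unchecked-byte (bit-shift-right argb 16))   ; red:   bits 16-23
   (unchecked-byte (bit-shift-right argb 8))    ; green: bits 8-15
   (unchecked-byte argb)                        ; blue:  bits 0-7
   (unchecked-byte (bit-shift-right argb 24))]) ; alpha: bits 24-31

;; A = 0x80, R = 0xFF, G = 0x40, B = 0x20
(argb->rgba 0x80FF4020)
;; => [-1 64 32 -128]
```

The values are negative where the top bit of the channel is set, because JVM bytes are signed; the bit patterns are still 0xFF and 0x80.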

Here is my attempt in Clojure:

(defn sub-byte [^long b ^long x]
  (unchecked-byte (-> x
    (bit-shift-left (* 8 b))
    (bit-shift-right 24))))


(defn bufferedimage->bytebuffer [^BufferedImage img]
  (binding [*unchecked-math* true] 
    (let [w (.getWidth img)
          h (.getHeight img)
          ^bytes arr (make-array Byte/TYPE (* 4 w h))]
      (loop [i 0]
          (let [img-i (mod i w)
                img-j (quot i w)
                value (.getRGB img img-i img-j)]
            (aset arr (* i 4)       (sub-byte 1 value))
            (aset arr (+ 1 (* i 4)) (sub-byte 2 value))
            (aset arr (+ 2 (* i 4)) (sub-byte 3 value))
            (aset arr (+ 3 (* i 4)) (sub-byte 0 value))
            (when (< (+ i 1) (* w h)) (recur (+ i 1)))
            ))
      (cast ByteBuffer (-> (ByteBuffer/allocateDirect (count arr))
                           (.put arr)
                           (.flip))))))

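Loading a 512*512 tileset takes 10 seconds, which is completely unacceptable. I'm trying to get this to run in under a second.

Note that the part that takes all the time is the loop.

I might as well mention that these timings were taken at the REPL.

Also, note that I'm well aware that I can use Java for the performance-critical parts of my code, so this is more of a theoretical question, asked so that I can learn how to optimize my Clojure code.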

Answer 1

I got the time down from 10 seconds to 173 milliseconds by turning sub-byte into a macro:

(defmacro sub-byte [b x]
  `(unchecked-byte (-> ~x
    (bit-shift-left (* 8 ~b))
    (bit-shift-right 24))))

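It seems the performance problem was related to all the function calls.

I find this interesting, though: I didn't think function calls would be this inefficient in Clojure. I also assumed the compiler was doing inlining optimizations for me.

Although I found the "what", I don't know the "why", so I will accept an answer that explains what is going on rather than my own.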

Answer 2

The problem with the function-based solution shows up when you set *warn-on-reflection* to true:

(set! *warn-on-reflection* true)

When you load the code, the compiler tells you that your sub-byte function returns Object, so it cannot statically resolve the matching aset method:

Reflection warning, web_app/so.clj:26:11 - call to static method aset on clojure.lang.RT can't be resolved (argument types: [B, int, java.lang.Object).
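
One way to see the Object return for yourself is to look at the interfaces the compiled functions implement. This is a sketch for illustration: the names sub-byte-boxed and sub-byte-long are hypothetical, and the bodies follow the question's sub-byte. Clojure names its primitive-function interfaces after the signature (L for long, D for double, O for Object), so a function with two long arguments and no primitive return hint should implement IFn$LLO.

```clojure
;; Hypothetical names; the first body matches the question's sub-byte.
(defn sub-byte-boxed [^long b ^long x]
  (unchecked-byte (-> x
                      (bit-shift-left (* 8 b))
                      (bit-shift-right 24))))

;; Same shifts, but hinted to return a primitive long and without the
;; byte conversion, so the caller has to truncate the result.
(defn sub-byte-long ^long [^long b ^long x]
  (-> x
      (bit-shift-left (* 8 b))
      (bit-shift-right 24)))

;; Two long arguments, Object return: the byte is boxed on the way out.
(instance? clojure.lang.IFn$LLO sub-byte-boxed) ; true
;; Two long arguments, long return: nothing is boxed.
(instance? clojure.lang.IFn$LLL sub-byte-long)  ; true
;; The boxed result is a java.lang.Byte object.
(class (sub-byte-boxed 1 0x80FF4020))           ; java.lang.Byte
```

Because the call site only sees Object, the compiler cannot pick the aset overload that takes a primitive byte, which is what the warning reports.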

Unfortunately, you can't put a byte return type hint on your function, because only the long and double primitive types are supported as return types:

(defn sub-byte ^byte [^long b ^long x]
  (unchecked-byte (-> x
                      (bit-shift-left (* 8 b))
                      (bit-shift-right 24))))

CompilerException java.lang.IllegalArgumentException: Only long and double primitives are supported, compiling:(web_app/so.clj:7:1)

You might try hinting ^long as the return type, but then the hinted type doesn't match what your function body actually returns (byte):

(defn sub-byte ^long [^long b ^long x]
  (unchecked-byte (-> x
                      (bit-shift-left (* 8 b))
                      (bit-shift-right 24))))

CompilerException java.lang.IllegalArgumentException: Mismatched primitive return, expected: long, had: byte, compiling:(web_app/so.clj:7:1)

However, you can make your function return long, and then wrap every call to it in unchecked-byte. That way you eliminate all the reflection warnings:

(defn sub-byte ^long [^long b ^long x]
  (-> x
      (bit-shift-left (* 8 b))
      (bit-shift-right 24)))

(unchecked-byte (sub-byte ...))

Another solution is to use a macro, as you have already discovered, which avoids the function call and any problems with its return type.
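
Putting the long-returning variant together with the loop from the question might look like the sketch below. It is an illustration under a few assumptions rather than a drop-in replacement: it uses dotimes instead of the hand-written loop (which also covers an empty image), returns the buffer explicitly instead of using cast, and leaves out the binding of *unchecked-math*, since that var is read by the compiler and binding it inside the function body at run time should have no effect on the arithmetic.

```clojure
(import '(java.awt.image BufferedImage)
        '(java.nio ByteBuffer))

(defn sub-byte
  "Shifts byte number b of x (0 = alpha ... 3 = blue) into the low 8 bits.
  Returns a primitive long; callers truncate it with unchecked-byte."
  ^long [^long b ^long x]
  (-> x
      (bit-shift-left (* 8 b))
      (bit-shift-right 24)))

(defn bufferedimage->bytebuffer
  "Copies the pixels of img into a direct ByteBuffer in RGBA byte order."
  ^ByteBuffer [^BufferedImage img]
  (let [w (.getWidth img)
        h (.getHeight img)
        n (* w h)
        ^bytes arr (byte-array (* 4 n))]
    (dotimes [i n]
      (let [value (long (.getRGB img (rem i w) (quot i w))) ; packed ARGB
            o     (* 4 i)]                                  ; offset of this pixel
        (aset arr o       (unchecked-byte (sub-byte 1 value)))   ; red
        (aset arr (+ o 1) (unchecked-byte (sub-byte 2 value)))   ; green
        (aset arr (+ o 2) (unchecked-byte (sub-byte 3 value)))   ; blue
        (aset arr (+ o 3) (unchecked-byte (sub-byte 0 value))))) ; alpha
    (let [buf (ByteBuffer/allocateDirect (alength arr))]
      (.put buf arr)
      (.flip buf) ; make the written bytes readable from position 0
      buf)))
```

Loading this with *warn-on-reflection* enabled is a quick way to check that no reflective aset calls remain; actual timings will depend on the machine and on JIT warm-up.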
