Java 中的快速整数除法
Fast integer division in Java
众所周知,整数除法运算速度很慢(通常比整数乘法慢好几倍)。但是,如果需要用一个固定的除数执行许多除法运算,则可以对除数进行一些预处理并将“/”替换为乘法和位运算(Hacker's Delight 中的第 10 章)。
正如我所测试的那样,如果除数是一个编译时常量(例如 static final long DIVISOR = 12345L;
),JVM 将执行此操作并使用乘法和位运算将所有除法替换为 DIVISOR
。我对同一种技巧很感兴趣,但是当除数仅在运行时已知时。
例如下面的(慢)方法:
void reduceArraySlow(long[] data, long denominator){
for(int i = 0; i < data.length; ++i)
data[i] = data[i] / denominator;
}
可以换成:
void reduceArrayFast(long[] data, long denominator){
SomeMagicStructure magic = computeMagic(denominator);
for(int i = 0; i < data.length; ++i)
// computes data[i] / denominator
data[i] = doFastDivision(data[i], magic);
}
这必须更快地完成工作,因为所有 /
操作都被更快的操作所取代(而且因为除法在 CPU 中没有流水线化)。
众所周知C/C++ libdivide library for fast integer division, and there is my adaptation of this library for Java libdivide4j.
使用 libdivide4j 的快速除法如下所示:
void reduceArrayFast(long[] data, long denominator){
FastDivision.Magic magic = FastDivision.magicSigned(denominator);
for(int i = 0; i < data.length; ++i)
// computes data[i] / denominator
data[i] = FastDivision.divideSignedFast(data[i], magic);
}
一个简单的基准测试
public void benchmark() throws Exception {
Random rnd = new Random();
int nIterations = 10000;
//let the JIT to optimize something
for (int att = 0; att < nIterations; att++) {
long[] data = new long[1000];
for (int i = 0; i < data.length; i++)
data[i] = rnd.nextLong();
long denominator = rnd.nextLong();
long[] slow = data.clone();
long start = System.nanoTime();
reduceArraySlow(slow, denominator);
long slowTime = System.nanoTime() - start;
long[] fast = data.clone();
start = System.nanoTime();
reduceArrayFast(fast, denominator);
long fastTime = System.nanoTime() - start;
Assert.assertArrayEquals(slow, fast);
// print last 100 timings (JVM already warmed up)
if (att > nIterations - 100) {
System.out.println("\"/\" operation: " + slowTime);
System.out.println("Fast division: " + fastTime);
System.out.println("");
}
}
}
显示普通 /
和快速除法(Core i7,jdk8 64 位)的以下计时(纳秒):
"/" operation: 13233
Fast division: 5957
"/" operation: 13148
Fast division: 5103
"/" operation: 13587
Fast division: 6188
"/" operation: 14173
Fast division: 6773
...
众所周知,整数除法运算速度很慢(通常比整数乘法慢好几倍)。但是,如果需要用一个固定的除数执行许多除法运算,则可以对除数进行一些预处理并将“/”替换为乘法和位运算(Hacker's Delight 中的第 10 章)。
正如我所测试的那样,如果除数是一个编译时常量(例如 static final long DIVISOR = 12345L;
),JVM 将执行此操作并使用乘法和位运算将所有除法替换为 DIVISOR
。我对同一种技巧很感兴趣,但是当除数仅在运行时已知时。
例如下面的(慢)方法:
void reduceArraySlow(long[] data, long denominator){
for(int i = 0; i < data.length; ++i)
data[i] = data[i] / denominator;
}
可以换成:
void reduceArrayFast(long[] data, long denominator){
SomeMagicStructure magic = computeMagic(denominator);
for(int i = 0; i < data.length; ++i)
// computes data[i] / denominator
data[i] = doFastDivision(data[i], magic);
}
这必须更快地完成工作,因为所有 /
操作都被更快的操作所取代(而且因为除法在 CPU 中没有流水线化)。
众所周知C/C++ libdivide library for fast integer division, and there is my adaptation of this library for Java libdivide4j.
使用 libdivide4j 的快速除法如下所示:
void reduceArrayFast(long[] data, long denominator){
FastDivision.Magic magic = FastDivision.magicSigned(denominator);
for(int i = 0; i < data.length; ++i)
// computes data[i] / denominator
data[i] = FastDivision.divideSignedFast(data[i], magic);
}
一个简单的基准测试
public void benchmark() throws Exception {
Random rnd = new Random();
int nIterations = 10000;
//let the JIT to optimize something
for (int att = 0; att < nIterations; att++) {
long[] data = new long[1000];
for (int i = 0; i < data.length; i++)
data[i] = rnd.nextLong();
long denominator = rnd.nextLong();
long[] slow = data.clone();
long start = System.nanoTime();
reduceArraySlow(slow, denominator);
long slowTime = System.nanoTime() - start;
long[] fast = data.clone();
start = System.nanoTime();
reduceArrayFast(fast, denominator);
long fastTime = System.nanoTime() - start;
Assert.assertArrayEquals(slow, fast);
// print last 100 timings (JVM already warmed up)
if (att > nIterations - 100) {
System.out.println("\"/\" operation: " + slowTime);
System.out.println("Fast division: " + fastTime);
System.out.println("");
}
}
}
显示普通 /
和快速除法(Core i7,jdk8 64 位)的以下计时(纳秒):
"/" operation: 13233
Fast division: 5957
"/" operation: 13148
Fast division: 5103
"/" operation: 13587
Fast division: 6188
"/" operation: 14173
Fast division: 6773
...