Pooling Maps in Golang
I'm curious whether anyone has tried to pool maps in Go before. I've read about pooling buffers, and I was wondering whether, by similar reasoning, it could make sense to pool maps if one has to create and destroy them frequently, or whether there is any reason why, a priori, it might not be efficient. When a map is returned to the pool, one would have to iterate through it and delete all elements, but a popular recommendation seems to be to create a new map rather than delete the entries of an already allocated map and reuse it, which makes me think that pooling maps may not be that beneficial.
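To make the idea concrete, what I have in mind is roughly the following sketch built on sync.Pool (the map[string]int element type and the GetMap/PutMap helpers are placeholders I made up, not an established API):

package mappool

import "sync"

// mapPool hands out map[string]int values; New is only called when the pool is empty.
var mapPool = sync.Pool{
    New: func() interface{} {
        return make(map[string]int)
    },
}

// GetMap returns an empty map from the pool.
func GetMap() map[string]int {
    return mapPool.Get().(map[string]int)
}

// PutMap clears the map and puts it back, keeping its allocated buckets
// so the next caller avoids a fresh allocation.
func PutMap(m map[string]int) {
    for k := range m {
        delete(m, k)
    }
    mapPool.Put(m)
}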
If your maps change size (a lot) by deleting or adding entries, that will lead to new allocations, and there will be no benefit from pooling them.
If your maps do not change in size and only the values of the keys change, then pooling will be a successful optimization.
This works well when you read table-like structures, for example CSV files or database tables. Every row contains exactly the same columns, so you do not need to clear any entry.
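A rough sketch of that case (the CSV data here is made up for illustration): a single map is reused for every row, and because each row has exactly the same columns, the values are simply overwritten and nothing ever needs to be cleared.

package main

import (
    "encoding/csv"
    "fmt"
    "io"
    "log"
    "strings"
)

func main() {
    const data = "name,age\nalice,30\nbob,25\n"
    r := csv.NewReader(strings.NewReader(data))

    // The header defines the fixed set of keys used for every row.
    header, err := r.Read()
    if err != nil {
        log.Fatal(err)
    }

    // One map, reused for each row; its size never changes after the first row.
    row := make(map[string]string, len(header))

    for {
        record, err := r.Read()
        if err == io.EOF {
            break
        }
        if err != nil {
            log.Fatal(err)
        }
        for i, col := range header {
            row[col] = record[i] // overwrite the value for an existing key
        }
        fmt.Println(row)
    }
}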
When run with go test -benchmem -bench ., the benchmark below shows no allocations:
package mappool

import "testing"

const SIZE = 1000000

func BenchmarkMap(b *testing.B) {
    // Fill the map once, outside the timed section.
    m := make(map[int]int)
    for i := 0; i < SIZE; i++ {
        m[i] = i
    }
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        // Only overwrite existing keys, so no new allocations happen.
        for j := 0; j < SIZE; j++ {
            m[j] = m[j] + 1
        }
    }
}
As @Grzegorz Żur says, if your maps do not change in size much, then pooling will be helpful. To test this, I wrote a benchmark in which pooling wins. The output on my machine is:
Pool time: 115.977µs
No-pool time: 160.828µs
The benchmark code:
package main

import (
    "fmt"
    "math/rand"
    "time"
)

const BenchIters = 1000

func main() {
    pool := map[int]int{}
    poolTime := benchmark(func() {
        useMapForSomething(pool)
        // Return to pool by clearing the map.
        for key := range pool {
            delete(pool, key)
        }
    })

    nopoolTime := benchmark(func() {
        useMapForSomething(map[int]int{})
    })

    fmt.Println("Pool time:", poolTime)
    fmt.Println("No-pool time:", nopoolTime)
}

func useMapForSomething(m map[int]int) {
    for i := 0; i < 1000; i++ {
        m[rand.Intn(300)] += 5
    }
}

// benchmark measures how long f takes, on average.
func benchmark(f func()) time.Duration {
    start := time.Now().UnixNano()
    for i := 0; i < BenchIters; i++ {
        f()
    }
    return time.Nanosecond * time.Duration((time.Now().UnixNano()-start)/BenchIters)
}