Occasional 'slice bounds out of range' panic

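I am running a script that forwards webhooks to a WebSocket. The part that sends a webhook to the WebSocket checks for inactive connections and tries to remove them while forwarding the webhook. It sometimes fails with this error: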

http: panic serving 10.244.38.169:40958: runtime error: slice bounds out of range

(The IP/port is always different; this is just an example.)

Relevant code:

// Map holding all Websocket clients and the endpoints they are subscribed to
var clients = make(map[string][]*websocket.Conn)
var upgrader = websocket.Upgrader{}

// function to execute when a new client connects to the websocket
func handleClient(w http.ResponseWriter, r *http.Request, endpoint string) {
    conn, err := upgrader.Upgrade(w, r, nil)
    // ...
    // Add client to endpoint slice
    clients[endpoint] = append(clients[endpoint], conn)
}

// function to send a webhook to a websocket endpoint
func handleHook(w http.ResponseWriter, r *http.Request, endpoint string) {
    msg := Message{}
    // ...   
    // Get all clients listening to the current endpoint
    conns := clients[endpoint]

    if conns != nil {
        for i, conn := range conns {
            if conn.WriteJSON(msg) != nil {
                // Remove client and close connection if sending failed
                conns = append(conns[:i], conns[i+1:]...)   // this is the line that sometimes triggers the panic
                conn.Close()
            }
        }
    }

    clients[endpoint] = conns
}

I don't understand why iterating over the connections and appending to them sometimes triggers the panic.

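A few points I would like to make:

  1. Make sure your program has no race conditions (for example, clients is globally accessible and should be protected if read/write or write/write access can happen concurrently).

  2. When ranging over a slice with for [...] range [...], you don't need to check that the slice is non-nil first, because range already handles a nil slice (see the code I shared).

  3. It only happens to you sometimes because conn.WriteJSON only sometimes fails and returns an error, and when it does, the faulty logic of removing elements while iterating crashes your program. (See the code I shared.)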

package main

import "fmt"

func main() {
    var conns []string = nil

    // "if conns != nil" check is not required as "for [...] range [...]"
    // can handle that. It is safe to use for "range" directly.
    for i, conn := range conns {
        fmt.Println(i, conn)
    }

    conns = []string{"1", "2", "3"}
    
    // Will panic
    for i := range conns {
        fmt.Printf("access: %d, length: %d\n", i, len(conns))
        conns = append(conns[:i], conns[i+1:]...)
    }
}

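In the example you can see that the index you try to access is greater than or equal to the length of the slice, which triggers the panic. I think this answer should help you correct your logic. Alternatively, you could use a map to store the connections, but that has its own caveats, such as no ordering guarantee, i.e. the order in which entries are read from the map is not defined.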
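One way to correct the removal logic is to filter the slice in place instead of deleting by index. The range expression is evaluated once, so the loop keeps using the original length while conns shrinks; after one removal a later index can point past the end, which would also explain why the panic only appears when certain writes fail. The sketch below avoids indices altogether. It is an illustration under assumptions: client, send, close and broadcast are hypothetical names, and send merely mimics conn.WriteJSON failing.

```go
package main

import (
	"errors"
	"fmt"
)

// client is a hypothetical stand-in for *websocket.Conn.
type client struct {
	name   string
	broken bool
	closed bool
}

// send mimics conn.WriteJSON: it returns an error for a dead connection.
func (c *client) send(msg string) error {
	if c.broken {
		return errors.New("write failed")
	}
	return nil
}

func (c *client) close() { c.closed = true }

// broadcast sends msg to every client and returns only those that
// accepted it. It reuses the backing array of conns and never indexes
// with a value derived from a stale length.
func broadcast(conns []*client, msg string) []*client {
	alive := conns[:0]
	for _, c := range conns {
		if err := c.send(msg); err != nil {
			c.close()
			continue
		}
		alive = append(alive, c)
	}
	// Clear the unused tail so dropped clients can be garbage collected.
	for i := len(alive); i < len(conns); i++ {
		conns[i] = nil
	}
	return alive
}

func main() {
	conns := []*client{
		{name: "a"},
		{name: "b", broken: true},
		{name: "c", broken: true},
		{name: "d"},
	}
	conns = broadcast(conns, "hello")
	for _, c := range conns {
		fmt.Println(c.name) // prints a, then d
	}
}
```

In handleHook the same pattern would amount to assigning the filtered slice back to clients[endpoint], ideally while holding whatever lock protects the map.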