How do I optimise a for loop which makes requests to an API?

I have a for loop in my Go code. Each iteration makes a request to an API and then saves the result in a map. How can I make the iterations run asynchronously to improve performance?

I'm currently reading up on goroutines, channels, and so on, but I'm still struggling to apply them in the wild :)

results := make(map[string]Result)

for ID, person := range people {
    result := someApiCall(person)
    results[ID] = result
}

// And do something with all the results once completed

There are many ways to make each iteration run asynchronously. One of them is to use goroutines and channels (as you intended).

Take a look at the example below. I think it'll be easier if I put the explanation as comments in each part of the code.

// prepare the channel to transport data from the goroutines back to the main routine
resChan := make(chan []interface{})

for ID, person := range people {

    // dispatch an IIFE as a goroutine, so there is no need to change `someApiCall()`
    go func(id string, person Person) {
        result := someApiCall(person)

        // send both id and result to the channel.
        // it would be cleaner to define a new struct type holding the id and the result,
        // but in this example I'll use a channel of []interface{}
        resChan <- []interface{}{id, result}
    }(ID, person)
}

// prepare a variable to hold all results
results := make(map[string]Result)

// receive exactly one item per dispatched goroutine.
// note: do NOT close the channel right after the loop above — the goroutines
// are still sending on it, and sending on a closed channel panics
for i := 0; i < len(people); i++ {
    res := <-resChan
    id := res[0].(string)
    result := res[1].(Result)

    // store it in the map
    results[id] = result
}

// And do something with all the results once completed

Another way is to use the sync APIs, like sync.Mutex and sync.WaitGroup, to achieve the same goal.

// prepare a variable to hold all results
results := make(map[string]Result)

// prepare a mutex to lock and unlock operations on the `results` map, to avoid a data race.
mtx := new(sync.Mutex)

// prepare a waitgroup so we can easily wait for all goroutines to finish
wg := new(sync.WaitGroup)

// tell the waitgroup how many goroutines need to finish
wg.Add(len(people))

for ID, person := range people {

    // dispatch an IIFE as a goroutine, so there is no need to change `someApiCall()`
    go func(id string, person Person) {
        result := someApiCall(person)

        // lock the writes to the `results` map to avoid a data race
        mtx.Lock()
        results[id] = result
        mtx.Unlock()

        // tell the waitgroup that one goroutine has just finished
        wg.Done()
    }(ID, person)
}

// block here until every goroutine has finished,
// then continue with whatever comes next
wg.Wait()

// And do something with all the results once completed

A warning: both approaches above are fine when the data you iterate over is small. If there is a lot of it, they are not so good — tons of goroutines get dispatched almost simultaneously, which can drive the machine's memory usage very high. I suggest taking a look at the worker pool technique to improve the code.
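As a minimal sketch of that worker pool technique: a fixed number of goroutines consume jobs from a channel, instead of one goroutine per item. The `Person`/`Result` types, the `fetchAll` helper, and the stub `someApiCall` below are illustrative assumptions, not part of the original code:

```go
package main

import (
	"fmt"
	"sync"
)

type Person struct{ Name string }
type Result struct{ Data string }

// someApiCall is a stand-in for the real API call.
func someApiCall(p Person) Result {
	return Result{Data: "data for " + p.Name}
}

// fetchAll runs at most maxWorkers concurrent API calls.
func fetchAll(people map[string]Person, maxWorkers int) map[string]Result {
	type job struct {
		id     string
		person Person
	}
	type item struct {
		id  string
		res Result
	}

	jobs := make(chan job)
	out := make(chan item)

	// start a fixed pool of workers that all read from the same jobs channel
	var wg sync.WaitGroup
	wg.Add(maxWorkers)
	for i := 0; i < maxWorkers; i++ {
		go func() {
			defer wg.Done()
			for j := range jobs {
				out <- item{j.id, someApiCall(j.person)}
			}
		}()
	}

	// feed the jobs, then close the channels in the right order:
	// jobs first (stops the workers), out last (stops the collector below)
	go func() {
		for id, p := range people {
			jobs <- job{id, p}
		}
		close(jobs)
		wg.Wait()
		close(out)
	}()

	// collect every result on the main routine
	results := make(map[string]Result)
	for it := range out {
		results[it.id] = it.res
	}
	return results
}

func main() {
	people := map[string]Person{"1": {"alice"}, "2": {"bob"}}
	results := fetchAll(people, 2)
	fmt.Println(len(results)) // prints 2
}
```

With this shape, memory stays bounded by `maxWorkers` no matter how many people there are.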

You can use goroutines to call the API in parallel:

type Item struct {
    id string
    res Result
}

func callApi(id string, person Person, resultChannel chan Item) {
    res := someApiCall(person)
    resultChannel <- Item{id, res}
}

resultChannel := make(chan Item)
for id, person := range people {
    go callApi(id, person, resultChannel)
}

result := make(map[string]Result)
for range people {
    item := <- resultChannel
    result[item.id] = item.res
}

However, the code above ignores error handling — `someApiCall` may fail or panic — and if there are too many people it will trigger too many concurrent API calls. In general you should limit the number of concurrent calls. I'll leave those issues to you as an exercise.
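For reference, one common way to address both points is a buffered channel used as a semaphore to cap concurrency, plus an error return. The sketch below assumes `someApiCall` returns an error and uses illustrative names (`fetchAllLimited`, the types); none of this is from the original answer:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

type Person struct{ Name string }
type Result struct{ Data string }

// someApiCall here returns an error, as a real API call would.
func someApiCall(p Person) (Result, error) {
	if p.Name == "" {
		return Result{}, errors.New("empty name")
	}
	return Result{Data: "data for " + p.Name}, nil
}

// fetchAllLimited runs the calls concurrently, but never more than
// `limit` at a time, and reports the first error it encounters.
func fetchAllLimited(people map[string]Person, limit int) (map[string]Result, error) {
	var (
		mu       sync.Mutex
		wg       sync.WaitGroup
		firstErr error
	)
	results := make(map[string]Result)

	// a buffered channel acts as a counting semaphore:
	// a send acquires one of `limit` slots, a receive releases it
	sem := make(chan struct{}, limit)

	for id, person := range people {
		wg.Add(1)
		go func(id string, person Person) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release it when done

			res, err := someApiCall(person)

			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				if firstErr == nil {
					firstErr = err
				}
				return
			}
			results[id] = res
		}(id, person)
	}

	wg.Wait()
	return results, firstErr
}

func main() {
	people := map[string]Person{"1": {"alice"}, "2": {"bob"}}
	res, err := fetchAllLimited(people, 2)
	fmt.Println(len(res), err)
}
```

In production code, `golang.org/x/sync/errgroup` gives you the same pattern (concurrency limit plus error propagation) with less boilerplate.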