Error While Fetching Data From Google Analytics API from R

I am able to retrieve data from the MCF API using the RGA library. Here is the query:

temp_data <- get_mcf(profileId = "xxxxxxxxx", start.date = "2017-01-09",
     end.date = "2017-01-31", metrics = "mcf:totalConversions",
     dimensions = "mcf:sourceMediumPath", sort = NULL,
     filters = "mcf:conversionType==Transaction",
     samplingLevel = NULL,start.index=1,max.results = 100000)

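The query above returned 14,836 rows. When I try to widen the date range, I get this error:

Error: Server error: (500) Internal Server Error. Response too large: internal error

Is there any workaround?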

If you check the MCF API documentation, you will find that max-results defaults to 1000 and is capped at 10,000 rows per request:

max-results

max-results=100 Optional. Maximum number of rows to include in this response. You can use this in combination with start-index to retrieve a subset of elements, or use it alone to restrict the number of returned elements, starting with the first. If max-results is not supplied, the query returns the default maximum of 1000 rows.

The Multi-Channel Funnels Reporting API returns a maximum of 10,000 rows per request, no matter how many you ask for. It can also return fewer rows than requested, if there aren't as many dimension segments as you expect. For instance, there are fewer than 300 possible values for mcf:medium, so when segmenting only by medium, you can't get more than 300 rows, even if you set max-results to a higher value.

If your result set has more than 10,000 rows, you should use nextLink to retrieve the next set of data.
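As a rough illustration of how paging with start-index and max-results works, here is a minimal sketch in R. The names `fetch_all_pages` and `fetch_page` are hypothetical and not part of RGA; `fetch_page` stands in for whatever function issues one API request and returns a data frame. RGA may already paginate for you, so treat this only as a picture of the mechanism.

```r
# Hypothetical helper (not part of RGA): pull pages until a short or empty
# page signals that there is nothing left.
# `fetch_page` is an assumed callback taking start.index and max.results
# and returning a data frame for that slice of the result set.
fetch_all_pages <- function(fetch_page, page_size = 10000L) {
  pages <- list()
  start <- 1L
  repeat {
    page <- fetch_page(start.index = start, max.results = page_size)
    if (is.null(page) || nrow(page) == 0L) break
    pages[[length(pages) + 1L]] <- page
    # Fewer rows than requested means this was the last page.
    if (nrow(page) < page_size) break
    start <- start + page_size
  }
  # Returns NULL when no page contained any rows.
  do.call(rbind, pages)
}
```

With RGA, `fetch_page` could be a thin wrapper that forwards `start.index` and `max.results` to `get_mcf` together with your own profile ID, dates, metrics and dimensions.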

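Update: Out of curiosity, I contacted the Google Analytics API team. I found it odd that you got more rows than the documentation says you should. This is the reply I received: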

To me it sounds like the developer needs to just shorten the date range to not get 500 server timeout. I don't know how he knows how many row's a query will return when he is getting a 500 response so I think there is a bit of confusion in his question still. As far as I know we have not changed the number of rows allowed in the response, but we still need to construct the full response on our side and sort, so if the number of rows is large and the CPU usage on the server is heavy during his request he will easily get a 500 timeout error.

That being said I have asked the Backend team if anything has changed about the 10k limit recently..

- google dev who shall not be named -

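If you hit this error, you can split your request into one query per day instead of fetching all the data in a single request. Something like this: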

start_date <- "some_date"
end_date <- "some_date"

dates <- seq(as.Date(start_date), as.Date(end_date), by = 'day') #making vector of dates

mcf_data <- lapply(seq_along(dates), function(x){
  get_mcf(profileId = ,
          start.date = dates[x], end.date = dates[x], 
          metrics = "", 
          dimensions = "",
          samplingLevel = "")
})

mcf_data <- data.table::rbindlist(mcf_data) #binding to a dataframe
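If one request per day turns out to be too many calls, a coarser split may be enough. The sketch below is one possible way to cut a date range into chunks of a fixed number of days; `split_date_range` and `chunk_days` are hypothetical names introduced here, and the `get_mcf` call is left as a comment with placeholders because its arguments depend on your own profile and query.

```r
# Hypothetical helper: split [start_date, end_date] into consecutive,
# non-overlapping chunks of at most `chunk_days` days each.
split_date_range <- function(start_date, end_date, chunk_days = 7L) {
  start_date <- as.Date(start_date)
  end_date <- as.Date(end_date)
  starts <- seq(start_date, end_date, by = paste(chunk_days, "days"))
  # The last chunk is clipped so it never runs past end_date.
  ends <- pmin(starts + chunk_days - 1L, end_date)
  data.frame(start = starts, end = ends)
}

chunks <- split_date_range("2017-01-09", "2017-01-31", chunk_days = 7L)

# Assumed usage with RGA (placeholders must be filled in by you):
# mcf_data <- lapply(seq_len(nrow(chunks)), function(i) {
#   get_mcf(profileId = "xxxxxxxxx",
#           start.date = as.character(chunks$start[i]),
#           end.date = as.character(chunks$end[i]),
#           metrics = "mcf:totalConversions",
#           dimensions = "mcf:sourceMediumPath")
# })
```

Because the same mcf:sourceMediumPath can show up in several chunks, you would probably need to sum the metric per path after binding the pieces together.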