Google Analytics Sampling despite low sessions
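I'm using the Google Analytics Reporting API, and I'm getting sampled results even though the number of sessions in the specified date range is far below the 500K limit. I only have about 4K sessions a month.
I have also set "samplingLevel" to "LARGE".
Here is the Python query: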
response = analytics.reports().batchGet(
    body={
        "reportRequests": [
            {
                "viewId": myViewID,
                "dateRanges": [
                    {
                        "startDate": "2017-05-01",
                        "endDate": "2017-05-30"
                    }
                ],
                "samplingLevel": "LARGE",
                "metrics": [
                    {"expression": "ga:sessions"}
                ],
                "dimensions": [
                    {"name": "ga:browser"},
                    {"name": "ga:city"}
                ]
            }
        ]
    }
).execute()
As you can see below, the sampling space is 4365 sessions, far below the 500K limit:
response.get('reports', [])[0].get('data',[]).get('samplesReadCounts',[])
Out[31]: [u'2051']
response.get('reports', [])[0].get('data',[]).get('samplingSpaceSizes',[])
Out[32]: [u'4365']
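The two fields above are enough to work out how heavily the report was sampled. A minimal sketch, assuming the response has the Reporting API v4 shape shown in the question; the `sampling_rates` helper is a hypothetical name, not part of any Google client library:

```python
def sampling_rates(response):
    """Return, per report, the fraction of sessions read for each date range.

    An empty inner list means the report carried no sampling fields,
    which is what an unsampled report looks like.
    """
    rates = []
    for report in response.get('reports', []):
        data = report.get('data', {})
        # Both fields are lists of strings, one entry per date range.
        read = data.get('samplesReadCounts', [])
        space = data.get('samplingSpaceSizes', [])
        # float() keeps the division exact on Python 2 as well as Python 3.
        rates.append([float(r) / float(s) for r, s in zip(read, space)])
    return rates


# Values copied from the output above.
example = {'reports': [{'data': {'samplesReadCounts': ['2051'],
                                 'samplingSpaceSizes': ['4365']}}]}
print(['{:.1%}'.format(x) for x in sampling_rates(example)[0]])
# prints ['47.0%']
```

So roughly 47% of the 4365 sessions were read, which is consistent with the percentage googleAnalyticsR reports for the same date range.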
Splitting the request into smaller date ranges doesn't help either. I tried this with the googleAnalyticsR library in R, with anti_sample = TRUE:
> web_data <- google_analytics_4(view_id,
+ date_range = c("2017-05-01", "2017-05-30"),
+ dimensions = c("city","browser"),
+ metrics = c("hits"),
+ samplingLevel="LARGE",
+ anti_sample = TRUE)
2017-06-04 11:54:51> anti_sample set to TRUE. Mitigating sampling via multiple API calls.
2017-06-04 11:54:51> Finding how much sampling in data request...
2017-06-04 11:54:52> Downloaded [10] rows from a total of [15].
2017-06-04 11:54:52> Data is sampled, based on 47% of sessions.
2017-06-04 11:54:52> Finding number of sessions for anti-sample calculations...
2017-06-04 11:54:53> Downloaded [30] rows from a total of [30].
2017-06-04 11:54:53> Calculated [3] batches are needed to download approx. [18] rows unsampled.
2017-06-04 11:54:53> Anti-sample call covering 14 days: 2017-05-01, 2017-05-14
2017-06-04 11:54:54> Downloaded [7] rows from a total of [7].
2017-06-04 11:54:54> Data is sampled, based on 53.2% of sessions.
2017-06-04 11:54:54> Anti-sampling failed
2017-06-04 11:54:54> Anti-sample call covering 9 days: 2017-05-15, 2017-05-23
2017-06-04 11:54:54> Downloaded [4] rows from a total of [4].
2017-06-04 11:54:54> Data is sampled, based on 55.7% of sessions.
2017-06-04 11:54:54> Anti-sampling failed
2017-06-04 11:54:54> Anti-sample call covering 7 days: 2017-05-24, 2017-05-30
2017-06-04 11:54:55> Downloaded [10] rows from a total of [10].
2017-06-04 11:54:55> Data is sampled, based on 52.3% of sessions.
2017-06-04 11:54:55> Anti-sampling failed
Joining, by = c("city", "browser")
Joining, by = c("city", "browser")
2017-06-04 11:54:55> Finished unsampled data request, total rows [13]
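When I check the same data in a custom request, I see similar sampling.
Any idea why I'm getting sampled results even though the session count is far below the limit?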
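The 500K threshold applies to default reports.
Edit:
For ad hoc queries, the threshold is 500K sessions at the property level for the date range you are using.
Default reports, explained: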
Analytics has a set of preconfigured, default reports listed in the left pane under Audience, Acquisition, Behavior, and Conversions.
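You appear to be running an ad hoc report with a secondary dimension, so the 500K threshold may no longer apply, and the effective threshold may be much lower. There is more information about this on the page you originally linked to.
You only have 4K sessions in that view, but that view may be using filters. Check how much traffic the property receives by looking at a view with no filters. The 500K sessions threshold is counted at the property level, not the view level.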
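One way to run that check is to request the total session count, with no dimensions, for an unfiltered view on the same property and compare it with the 500K threshold. This is a rough sketch, assuming the Reporting API v4 request and response shapes used in the question. `UNFILTERED_VIEW_ID`, `sessions_request`, and `total_sessions` are hypothetical names, and the view ID is a placeholder:

```python
UNFILTERED_VIEW_ID = '12345678'  # placeholder: a view on the same property with no filters
THRESHOLD = 500000               # property-level session threshold discussed above


def sessions_request(view_id, start_date, end_date):
    """Build a batchGet body that asks only for total sessions."""
    return {
        'reportRequests': [{
            'viewId': view_id,
            'dateRanges': [{'startDate': start_date, 'endDate': end_date}],
            'metrics': [{'expression': 'ga:sessions'}],
        }]
    }


def total_sessions(response):
    """Read the ga:sessions total for the first date range of the first report."""
    totals = response['reports'][0]['data']['totals']
    return int(totals[0]['values'][0])


body = sessions_request(UNFILTERED_VIEW_ID, '2017-05-01', '2017-05-30')
# With an authorized client such as the `analytics` object in the question:
#   response = analytics.reports().batchGet(body=body).execute()
#   print(total_sessions(response) > THRESHOLD)
```

If the unfiltered total for the date range is above the threshold, sampling on the filtered 4K-session view would be expected.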