How do you convert multiple dimensions and metrics into a pandas DataFrame using the Google Analytics Reporting API?

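I can't get the data from the Google Analytics Reporting API into a pandas DataFrame. The request and the data retrieval both work fine, but whenever I request multiple dimensions and metrics, I can't get the result into a DataFrame.

The output is one very long list of all our products. It starts with the dimensions (product name and SKU), followed by the metrics (revenue and quantity). Example: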

['PRODUCT1', '1234', 'PRODUCT2', '5678'..... 13.0, 324.0, 3.0, 322.0]

The errors I get when converting to a DataFrame:

ValueError: Length of values does not match length of index

"None of [Index(['Product', 'SKU', 'Revenue', 'Quantity'], dtype='object')] are in the [columns]"

Any ideas on how to get this into a proper DataFrame? I started from this article, but it only explains how to export 1 dimension and 1 metric: https://www.jcchouinard.com/google-analytics-api-using-python/

My code:


from apiclient.discovery import build
from oauth2client.service_account import ServiceAccountCredentials
import pandas as pd

SCOPES = ['https://www.googleapis.com/auth/analytics.readonly']
KEY_FILE_LOCATION = 'client_secrets.json'
VIEW_ID = '123456789' #Random number

credentials = ServiceAccountCredentials.from_json_keyfile_name(KEY_FILE_LOCATION, SCOPES)

analytics = build('analyticsreporting', 'v4', credentials=credentials)

response = analytics.reports().batchGet(
    body={
        'reportRequests': [
            {
                'viewId': VIEW_ID, #Add View ID from GA
                'dateRanges': [
                    {'startDate': '30daysAgo', 'endDate': 'today'},
                    ],
                'metrics': [
                    {'expression': 'ga:itemRevenue'},
                    {'expression': 'ga:itemQuantity'}
                ],
                'dimensions': [
                    {
                        "name": "ga:productName"
                    },{
                        "name": "ga:productSku"
                    }
                    ],
                #"filtersExpression":"ga:pagePath=~products;ga:pagePath!@/translate", #Filter by condition "containing products"
                'orderBys': [{"fieldName": "ga:itemRevenue", "sortOrder": "DESCENDING"}],
                'pageSize': 1000
            }]
    }
).execute()


dim = []
val = []

#Extract Data
for report in response.get('reports', []):

    columnHeader = report.get('columnHeader', {})
    dimensionHeaders = columnHeader.get('dimensions', [])
    metricHeaders = columnHeader.get('metricHeader', {}).get('metricHeaderEntries', [])
    rows = report.get('data', {}).get('rows', [])

    for row in rows:

        dimensions = row.get('dimensions', [])
        dateRangeValues = row.get('metrics', [])

        for header, dimension in zip(dimensionHeaders, dimensions):
            dim.append(dimension)


        for i, values in enumerate(dateRangeValues):
            for metricHeader, value in zip(metricHeaders, values.get('values')):
                val.append(float(value))

df = pd.DataFrame()
df["Revenue", "Quantity"]=val
df["Product", "SKU" ]=dim
df=df[["Product", "SKU","Revenue", "Quantity"]]
print(df)


I solved this by using the gapandas package.

The Google Analytics API doesn't work very well for this kind of operation. Use this instead:

https://github.com/flyandlure/gapandas
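
If adding a dependency is not an option, the response can also be flattened by hand. The code in the question appends every dimension and every metric to two flat lists, so the row structure is lost, and `df["Revenue", "Quantity"]` appears to address a single column keyed by a tuple rather than two columns. The sketch below instead builds one record per row. The helper name `report_to_dataframe` and the `sample_response` dictionary are hypothetical: the sample is hand-written to mimic the shape of a v4 `batchGet` response, reusing the values from the example list in the question, and only the first date range of each row is read.

```python
import pandas as pd


def report_to_dataframe(response):
    """Flatten a Reporting API v4 batchGet response into one DataFrame.

    Hypothetical helper: assumes a single date range per request.
    """
    records = []
    for report in response.get('reports', []):
        header = report.get('columnHeader', {})
        dim_names = header.get('dimensions', [])
        metric_entries = header.get('metricHeader', {}).get('metricHeaderEntries', [])
        metric_names = [entry['name'] for entry in metric_entries]

        for row in report.get('data', {}).get('rows', []):
            # One dict per row keeps each product's values together.
            record = dict(zip(dim_names, row.get('dimensions', [])))
            # 'metrics' holds one entry per date range; read the first only.
            metrics = row.get('metrics') or [{}]
            values = metrics[0].get('values', [])
            record.update(zip(metric_names, (float(v) for v in values)))
            records.append(record)
    return pd.DataFrame(records)


# Hand-written sample in the shape of a v4 response (not real API output).
sample_response = {
    'reports': [{
        'columnHeader': {
            'dimensions': ['ga:productName', 'ga:productSku'],
            'metricHeader': {'metricHeaderEntries': [
                {'name': 'ga:itemRevenue', 'type': 'CURRENCY'},
                {'name': 'ga:itemQuantity', 'type': 'INTEGER'},
            ]},
        },
        'data': {'rows': [
            {'dimensions': ['PRODUCT1', '1234'], 'metrics': [{'values': ['13.0', '324.0']}]},
            {'dimensions': ['PRODUCT2', '5678'], 'metrics': [{'values': ['3.0', '322.0']}]},
        ]},
    }]
}

df = report_to_dataframe(sample_response).rename(columns={
    'ga:productName': 'Product',
    'ga:productSku': 'SKU',
    'ga:itemRevenue': 'Revenue',
    'ga:itemQuantity': 'Quantity',
})
print(df)
```

With the real API, `response` from the `batchGet(...).execute()` call in the question would be passed in place of `sample_response`. Because the column names are taken from `columnHeader`, the same function should work for any number of dimensions and metrics.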