Importing SQL query into Pandas results in only 1 column

Question

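I'm trying to import the results of a complex SQL query into a pandas DataFrame. My query requires me to create several temporary tables, because the final result table I want includes some aggregates. My code looks like this: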

    cnxn = pyodbc.connect(r'DRIVER=foo;SERVER=bar;etc')
    cursor = cnxn.cursor()
    cursor.execute('SQL QUERY HERE')
    cursor.execute('SECONDARY SQL QUERY HERE')
    ...
    df = pd.DataFrame(cursor.fetchall(),columns = [desc[0] for desc in cursor.description])

I get an error message telling me the shapes don't match:

    ValueError: Shape of passed values is (1,900000),indices imply (5,900000)

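In fact, the result of all the SQL queries should be a table with 5 columns rather than 1. I ran the same queries in Microsoft SQL Server Management Studio and they work fine, returning the 5-column table I want. I tried not passing any column names to the DataFrame and printing its head, and found that pandas had put the information from all 5 columns into 1. The value in each row is a list of 5 comma-separated values, but pandas treats the whole list as 1 column. Why does pandas do this? I also tried the pd.read_sql route, but I still get the same error.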

Edit:

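Taking the comments into account, I did some more debugging. The problem does not seem to stem from my queries being nested. I tried a simple (one-line) query that returns a 3-column table, and I still got the same error. Printing out fetchall() gives something like this: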

    [(str1,str2,str3,datetime.date(stuff),datetime.date(stuff)), 
    (str1,str2,str3,datetime.date(stuff),datetime.date(stuff)),...]

Answer 1

Use pd.DataFrame.from_records instead:

    df = pd.DataFrame.from_records(cursor.fetchall(),
                                   columns=[desc[0] for desc in cursor.description])
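
For anyone who wants to try the from_records pattern without a SQL Server instance, here is a minimal sketch. It assumes an in-memory SQLite database in place of the pyodbc connection, and the `orders` table and its contents are made up for illustration. Note that sqlite3 returns plain tuples, so this shows the call pattern only; it does not reproduce the original error.

```python
import sqlite3

import pandas as pd

# Assumption: in-memory SQLite stands in for the SQL Server connection.
cnxn = sqlite3.connect(":memory:")
cursor = cnxn.cursor()

# Hypothetical table and data, for illustration only.
cursor.execute("CREATE TABLE orders (customer TEXT, item TEXT, qty INTEGER)")
cursor.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("ann", "pen", 2), ("bob", "ink", 5)],
)
cursor.execute("SELECT customer, item, qty FROM orders ORDER BY customer")

# Column names come from the DB-API cursor description, as in the question.
columns = [desc[0] for desc in cursor.description]
df = pd.DataFrame.from_records(cursor.fetchall(), columns=columns)
cnxn.close()

print(df)
```

Each fetched row becomes one record, and each element of the row lands in its own column.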

Answer 2

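Simply adjust the pd.DataFrame() call, because right now cursor.fetchall() returns a list of row objects that pandas treats as one-length entries. Use tuple() or list() to map the child elements into their own columns: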

    df = pd.DataFrame([tuple(row) for row in cursor.fetchall()],
                      columns=[desc[0] for desc in cursor.description])
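
A rough way to see what the conversion does, without a database: pyodbc returns its own Row objects rather than plain tuples, and the sketch below imitates that with a made-up `FakeRow` class. `FakeRow`, the sample values, and the column names are all assumptions for illustration, not part of pyodbc or pandas.

```python
import pandas as pd


class FakeRow:
    """Hypothetical stand-in for a driver row object that is not a tuple."""

    def __init__(self, *values):
        self._values = values

    def __len__(self):
        return len(self._values)

    def __iter__(self):
        return iter(self._values)


# Pretend this list came back from cursor.fetchall().
rows = [FakeRow("a", "b", 1), FakeRow("c", "d", 2)]
columns = ["col1", "col2", "col3"]  # hypothetical column names

# Converting each row to a plain tuple gives pandas one value per column.
records = [tuple(row) for row in rows]
df = pd.DataFrame(records, columns=columns)

print(df)
```

After the conversion pandas only ever sees ordinary tuples, so the number of columns matches the number of names passed in `columns`.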
