Unicode issue with PyODBC query
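I am having trouble querying my MSSQL server with PyODBC.
I believe the cause is that I have columns whose names contain Unicode characters. These columns come from a single column in my main data.
The problematic column is "afkastningsgrad_primær_drift".
Any idea how I can run this query? (I do not own the server, so I cannot build a view on the server and rename the column.)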
SQL:
WITH dataTable AS (
    SELECT
        KredsEjdNr, Navn, Vaerdi
    FROM qryEjendomsData
    WHERE
        RegnskabsAar = 2016
        AND Projekt = 1710
        AND Navn IN (
            'ekm_ko', 'afkastningsgrad_primær_drift', 'fremst_pris_maelk'
        )
    GROUP BY KredsEjdNr, Navn, Vaerdi
),
pivotData AS (
    SELECT *
    FROM dataTable
    PIVOT
    (
        SUM(Vaerdi)
        FOR [Navn] IN (
            [ekm_ko], [afkastningsgrad_primær_drift], [fremst_pris_maelk]
        )
    )
    AS pivotTable
)
SELECT
    CAST([KredsEjdNr] AS NVARCHAR) AS [kredsEjdNr],
    CAST(ekm_ko AS int) AS [EKM pr ko],
    [afkastningsgrad_primær_drift] as [Afkastningsgrad],
    [fremst_pris_maelk] AS [Fremstillingspris pr. kg EKM]
from pivotData
where [ekm_ko] IS NOT NULL and [fremst_pris_maelk] IS NOT NULL
order by kredsEjdNr
Python code:
#!/usr/local/bin/python
# -*- coding: utf-8 -*-
connectionstring = 'DRIVER={SQL Server Native Client 11.0};SERVER=server;DATABASE=database;UID=%s;PWD=%s' %(usr,pswd)
conn = pyodbc.connect(connectionstring)
cursor = conn.cursor()
dataList = cursor.execute(unicode(sql)).fetchall()
Error:
Traceback (most recent call last):
  File "data.py", line 84, in <module>
    dataList = cursor.execute(unicode(sql)).fetchall()
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 183: ordinal not in range(128)
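The problem is not the Unicode characters in the column name, but the non-ASCII bytes in a str variable under Python 2. When the pyodbc .execute call receives the command text as a str, it tries to decode it using the default encoding, which is 'ascii' for Python 2.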
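The decoding step described above can be reproduced without a database. The following is a minimal sketch, not pyodbc's actual internals: the helper name decode_sql is hypothetical, and the byte string is built with .encode so that the snippet behaves the same on Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# The UTF-8 bytes that a Python 2 str literal would hold in a utf-8 source file.
sql_bytes = u"SELECT afkastningsgrad_prim\u00e6r_drift FROM #tmp".encode("utf-8")


def decode_sql(raw, encoding):
    """Hypothetical helper: return (text, None) on success, (None, error) on failure."""
    try:
        return raw.decode(encoding), None
    except UnicodeDecodeError as err:
        return None, err


# Decoding as ASCII fails on the first byte above 0x7F (the 0xc3 that starts 'æ').
text, err = decode_sql(sql_bytes, "ascii")
print("ascii:", err)

# Decoding with the encoding the bytes were actually written in succeeds.
text, err = decode_sql(sql_bytes, "utf-8")
print("utf-8:", repr(text))
```

The ASCII attempt fails at the same byte (0xc3) that the traceback in the question reports.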
The following test code
# -*- coding: utf-8 -*-
import sys
print("sys.getdefaultencoding() is '{0}'".format(sys.getdefaultencoding()))
import pyodbc
cnxn = pyodbc.connect("DSN=SQLmyDb", autocommit=True)
crsr = cnxn.cursor()
# setup test environment
crsr.execute(u"CREATE TABLE #tmp (afkastningsgrad_primær_drift INT)")
crsr.execute(u"INSERT INTO #tmp VALUES (1)")
print('')
print('Test_1: "SELECT * ..." as str')
sql = "SELECT * FROM #tmp"
print(" sql: " + repr(sql))
crsr.execute(sql)
print(" column name from result set: " + repr(crsr.description[0][0]))
print('')
print('Test_2: "SELECT colname ..." as str')
sql = "SELECT afkastningsgrad_primær_drift FROM #tmp"
print(" sql: " + repr(sql))
try:
    crsr.execute(sql)
    print(" OK")
except UnicodeDecodeError as ude:
    print(" UnicodeDecodeError: " + str(ude))
print('')
print('Test_3: "SELECT colname ..." as unicode')
sql = sql.decode('utf-8')
print(" sql: " + repr(sql))
try:
    crsr.execute(sql)
    print(" OK")
except Exception as ex:
    print(" Exception: " + str(ex))
cnxn.close()
produces
sys.getdefaultencoding() is 'ascii'
Test_1: "SELECT * ..." as str
sql: 'SELECT * FROM #tmp'
column name from result set: u'afkastningsgrad_prim\xe6r_drift'
Test_2: "SELECT colname ..." as str
sql: 'SELECT afkastningsgrad_prim\xc3\xa6r_drift FROM #tmp'
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 27: ordinal not in range(128)
Test_3: "SELECT colname ..." as unicode
sql: u'SELECT afkastningsgrad_prim\xe6r_drift FROM #tmp'
OK
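Test_1 shows that a column name containing Unicode characters is returned correctly as a unicode object.
Test_2 shows that the default Python 2 encoding ('ascii') chokes on a str containing UTF-8 bytes greater than 0x7F.
Test_3 shows that no error occurs if we use .decode to convert the str (containing UTF-8 bytes) into a proper unicode object before passing it to the .execute method.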
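Applied to the code in the question, one possible fix is to replace unicode(sql), which decodes with the default 'ascii' codec, with an explicit UTF-8 decode. The sketch below assumes the source file is saved as UTF-8, as its coding declaration states. The helper name as_sql_text is hypothetical and not part of pyodbc.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function


def as_sql_text(sql, encoding="utf-8"):
    """Hypothetical helper: return the SQL command as a text (unicode) object.

    Byte strings are decoded with the given encoding, which is assumed to
    match the source file's coding declaration. Text objects pass through.
    """
    if isinstance(sql, bytes):  # on Python 2, bytes is an alias of str
        return sql.decode(encoding)
    return sql


# Stand-in for a str literal holding UTF-8 bytes in a Python 2 source file.
raw = u"SELECT [afkastningsgrad_prim\u00e6r_drift] FROM pivotData".encode("utf-8")
query = as_sql_text(raw)
print(repr(query))

# With a real connection, the call from the question would then become:
# dataList = cursor.execute(as_sql_text(sql)).fetchall()
```

Writing the SQL as a unicode literal (u"...") in the first place should have the same effect and avoids the extra decode.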