pyodbc - parameters in group/order clause - Limitation?
So I am trying to write a SQL query with a parameter in the GROUP BY clause and have pyodbc execute it. I have the following schema:
http://sqlfiddle.com/#!18/4e7bb7/2
In case sqlfiddle fails:
CREATE TABLE Persons (
Personid int IDENTITY(1,1) PRIMARY KEY,
Name varchar(255) NOT NULL,
Birthday datetime
);
INSERT INTO PERSONS (NAME, BIRTHDAY)
VALUES
('a','20220101'),
('b','20220102'),
('c','20220103'),
('d','20220104'),
('e','20220105'),
('f','20220106'),
('g','20220108'),
('h','20220110'),
('i','20220111'),
('j','20220112'),
('k','20220113'),
('l','20220114'),
('m','20220115')
I use the following query as an example; it is valid SQL:
select
COUNT(*)
,dateadd(week, datediff(week,0, birthday), 0)
from Persons
group by
dateadd(week, datediff(week,0, birthday), 0)
order by
dateadd(week, datediff(week,0, birthday), 0)
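This query groups people's birthdays by week. It is only a contrived example; my real data is similar.
I am trying to write a Python function that fetches the data and groups it by week. I want the function to be able to decide which day of the week counts as the start of the week. I have the following function: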
import pyodbc

def TestSQLServerDB2(dayOfWeekStart=0):
    """Tests the query.

    :param dayOfWeekStart: Int. 0 = Monday, 1 = Tuesday, ... 6 = Sunday
    """
    hostname = 'DESKTOPHOST'
    database_instance = "test"
    db_conn = pyodbc.connect('Trusted_Connection=yes;' + r"DRIVER=" + "{SQL SERVER}" +
                             ";SERVER=" + hostname + ";DATABASE=" +
                             database_instance + ";")
    targetWeekday = dayOfWeekStart % 7
    # The same value is bound three times, once per "?" marker.
    sql = '''
    select
        COUNT(*)
        ,dateadd(week, datediff(week,0, birthday), ?)
    from Persons
    group by
        dateadd(week, datediff(week,0, birthday), ?)
    order by
        dateadd(week, datediff(week,0, birthday), ?)
    '''
    params = (targetWeekday, targetWeekday, targetWeekday)
    cur = db_conn.cursor()
    cur.execute(sql, params)
    print(cur.fetchall())
Running this function produces the following error:
pyodbc.ProgrammingError: ('42000', "[42000] [Microsoft][ODBC SQL Server Driver][SQL Server]Column 'Persons.Birthday' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. (8120) (SQLExecDirectW); [42000] [Microsoft][ODBC SQL Server Driver][SQL Server]Statement(s) could not be prepared. (8180)")
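It is simple enough for me to do string substitution on the SQL query before executing it, and I would not run into any problems doing so. What I want to know is whether parameterized SQL statements are essentially limited to value comparisons in the WHERE clause. Is this by design?
Thanks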
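The problem is that you are essentially passing the parameter into the query three times, so the GROUP BY and the SELECT use different parameters. Although you know they are the same, the server does not. In effect, your query is: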
select
COUNT(*)
,dateadd(week, datediff(week,0, birthday), @p0)
from Persons
group by
dateadd(week, datediff(week,0, birthday), @p1)
order by
dateadd(week, datediff(week,0, birthday), @p2)
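It is obvious why that doesn't work: the expression in the SELECT list no longer matches the expression in the GROUP BY.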
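To make the point concrete, here is a small sketch that numbers each `?` marker in the same way as the rewritten query above. The helper name `number_placeholders` and the `@pN` naming are illustrative assumptions. It is a naive text substitution for illustration only (it would also rewrite a `?` inside a string literal) and is not a description of how pyodbc or the driver actually processes the statement.

```python
import itertools
import re


def number_placeholders(sql):
    """Replace each '?' marker with @p0, @p1, ... in order of appearance."""
    counter = itertools.count()
    return re.sub(r"\?", lambda match: f"@p{next(counter)}", sql)


query = """
select count(*), dateadd(week, datediff(week, 0, birthday), ?)
from Persons
group by dateadd(week, datediff(week, 0, birthday), ?)
"""

numbered = number_placeholders(query)
print(numbered)
# The select expression now refers to @p0 and the group-by expression to @p1,
# so the two expressions are no longer textually identical.
```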
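The solution is to use the same parameter in each place.
- Unfortunately, you can't use named parameters in pyodbc, so that is not possible.
- You could use a variable, but that disables parameter sniffing, which may be undesirable.
- You could put the whole thing in a nested subquery/derived table. But nesting is messy and over-complicates the query.
- Instead, put the value in a CROSS APPLY (VALUES ...) clause, which means you can reuse it anywhere in the query: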
select
COUNT(*)
,v.birthweek
from Persons
cross apply (values (
dateadd(week, datediff(week,0, birthday), ?)
) ) v(birthweek)
group by
v.birthweek
order by
v.birthweek;
Either way, this is a sensible solution that avoids repeating the calculation.
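On the Python side, the CROSS APPLY version needs only a one-element parameter tuple. The following is a minimal sketch, not a tested implementation: the names `WEEKLY_COUNT_SQL` and `count_by_week` are hypothetical, and the function assumes it is handed an already open DB-API cursor (such as one from the `db_conn` in the question) rather than creating the connection itself.

```python
# CROSS APPLY query from the answer: the "?" marker appears exactly once.
WEEKLY_COUNT_SQL = """
select
    COUNT(*)
    ,v.birthweek
from Persons
cross apply (values (
    dateadd(week, datediff(week, 0, birthday), ?)
) ) v(birthweek)
group by
    v.birthweek
order by
    v.birthweek;
"""


def count_by_week(cursor, day_of_week_start=0):
    """Return (count, week_start) rows.

    day_of_week_start: 0 = Monday, 1 = Tuesday, ... 6 = Sunday.
    `cursor` is assumed to be an open DB-API cursor, e.g. from pyodbc.
    """
    target_weekday = day_of_week_start % 7
    # One marker in the SQL, so exactly one value is bound.
    cursor.execute(WEEKLY_COUNT_SQL, (target_weekday,))
    return cursor.fetchall()
```

With the connection from the question, this would be called as `count_by_week(db_conn.cursor(), 1)` to start weeks on Tuesday.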