Parametrize columns in SELECT statement
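I'm looking to write a macro containing a query that compares two tables with the same column names and outputs the percentage of matches for each field. I'd like the macro to take the table names as input.
For example, for two static tables: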
SELECT
SUM(CASE WHEN table1.field1 = table2.field1 THEN 1 ELSE 0 END)/SUM(1)
,SUM(CASE WHEN table1.field2 = table2.field2 THEN 1 ELSE 0 END)/SUM(1)
,SUM(CASE WHEN table1.field3 = table2.field3 THEN 1 ELSE 0 END)/SUM(1)
....
....
,SUM(CASE WHEN table1.fieldN = table2.fieldN THEN 1 ELSE 0 END)/SUM(1)
FROM table1
INNER JOIN table2
ON table1.keyField = table2.keyField
Is it possible to write a macro that generalizes this?
For example, a pseudo-query might look like this:
CREATE MACRO compareTables (table1 varChar(50),table2 varChar(50),keyField AS varchar(50)) AS (
WITH sharedColumns (columnName) AS (
SELECT columnname
FROM dbc.columns
WHERE
tableName = :table1
INTERSECT
SELECT columnname
FROM dbc.columns
WHERE
tableName = :table2)
SELECT
SUM (
CASE
WHEN :table1.<sharedColumn[1]> = :table2.<sharedColumn[1]>
THEN 1
ELSE 0
END)/SUM(1)
....
SUM (
CASE
WHEN :table1.<sharedColumn[N]> = :table2.<sharedColumn[N]>
THEN 1
ELSE 0
END)/SUM(1)
FROM :table1
INNER JOIN :table2
ON :table1.:keyField = :table2.:keyField;);
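Is there any way to do this in Teradata without a UDF? (I don't have CREATE FUNCTION privileges.) If that is the only way, I can put in a request, but I'd rather avoid it if possible.
I don't normally write out all the code on SO, but when I thought about this one, it sounded like fun.
As I noted in my comment on your question, you can't do this in a macro, because macro parameters can't be used as database objects. They only stand in for values in your data. So: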
Select * From Table Where F1= :myparam;
is fine in a macro, but:
Select * From Table Where :myparam = 'somevalue';
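is not allowed.
You can, however, do this in a stored procedure or in whatever scripting language you like.
The catch is that you have two problems to solve:
- You need a list of columns for your table, which must then be used when building the comparison query.
- You have to build the comparison query dynamically from that column list, the two parameters holding the table names, and the one parameter holding the key field.
Neither of these requirements is trivial, but something like the following should do the job for you. It may need some tweaking, but I think it's close: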
CREATE PROCEDURE compareTables
(
IN table1 varChar(50),
IN table2 varChar(50),
IN keyField varchar(50),
OUT dynamicallyCreatedSQL VARCHAR(10000)
)
DYNAMIC RESULT SETS 1
BEGIN
DECLARE outputSQLStatement VARCHAR(10000); --variable to hold your dynamically created sql statement that will produce the record set that we are outputting from this SP
DECLARE columnSQLStatement VARCHAR(500); --variable to hold your dynamically created sql statement that will fetch the columns in Table1
DECLARE columnName VARCHAR(30); --Variable to stick the column name that we get from the column_cursor
DECLARE output_cursor CURSOR WITH RETURN ONLY FOR output_statement; --The dynamically created cursor that will hold your record set produced by outputSQLStatement
DECLARE column_cursor CURSOR FOR column_statement; --The dynamically created cursor that will hold your record set produced by columnSQLStatement
--The start of your dynamic output sql statement:
SET outputSQLStatement = '
SELECT ';
--SQL Statement for your dynamically created cursor to get the columns for your table
--Filter on the DatabaseName column; DATABASE on its own is the built-in that returns the session's default database.
--TODO: Change "YourDatabaseHere" to your database...
SET columnSQLStatement = 'SELECT TRIM(ColumnName) FROM "DBC".Columns WHERE DatabaseName=''YourDatabaseHere'' AND TableName=''' || table1 || ''';';
--Prepare the dynamically generated column SQL statement for cursor.
Prepare column_statement FROM columnSQLStatement;
--Open the cursor and Loop through each record
OPEN column_cursor;
LABEL1:
LOOP
--Grab the column name from the record into the variable columnName
FETCH column_cursor INTO columnName;
--No more rows (SQLSTATE 02000): leave the loop. The check has to come right after the FETCH,
-- because every later statement resets SQLSTATE and the loop would otherwise never end.
-- If the very first FETCH finds nothing, your table probably isn't a table. You should check your parameters.
IF (SQLSTATE ='02000') THEN
LEAVE label1;
END IF;
--Now we can build the meat of that sql statement
SET outputSQLStatement = outputSQLStatement || '
SUM (
CASE
WHEN ' || table1 || '."' || columnName || '" = ' || table2 || '."' || columnName || '"
THEN 1
ELSE 0
END)/SUM(1) as "' || columnName || '",';
--End the loop and close the cursor
END LOOP LABEL1;
CLOSE column_cursor;
--There's going to be an extra comma in there that we have to remove before the FROM part of the SQL statement, so let's get rid of that:
SET outputSQLStatement = Substring(OutputSQLStatement FROM 1 FOR Length(OutputSQLStatement) - 1);
--Now complete the sql statement
Set outputSQLStatement = outputSQLStatement || '
FROM ' || table1 || '
INNER JOIN ' || table2 || '
ON ' || table1 || '.' || keyfield || ' = ' || table2 || '.' || keyfield || ';';
--Set the output variable to the dynamically generated sql statement for debug fun.
Set dynamicallyCreatedSQL = outputSQLStatement;
--And finally... execute the statement by prepping it and opening the cursor.
-- we don't close the cursor so that the "Dynamic Result Sets 1" catches it and returns it to whatever calls this procedure.
PREPARE output_statement FROM outputSQLStatement;
OPEN output_cursor;
END;
You can call it like this:
CALL compareTables('table1', 'table2', 'yourkeyfield', output);
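That will return two record sets. The first will contain the dynamically created SQL statement, which you can use for debugging. The second will be the record set you're after.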
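If you don't have CREATE PROCEDURE access, then this is a wash. Either way, though, this is the approach you'll have to use, whether it's in a Teradata SP, bash with BTEQ, or some other scripting language such as VBScript through ADO/ODBC or anything else.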
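For the scripting-language route, here is a minimal sketch in Python of the query-building half of the job. It assumes you have already fetched the two column lists from dbc.columns by whatever driver you use; the function names `quote_ident` and `build_compare_sql` are made up for illustration and are not part of any library. It only builds the SQL text and does not run anything against Teradata.

```python
def quote_ident(name):
    """Double-quote an identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_compare_sql(table1, table2, key_field, columns1, columns2):
    """Return a SELECT with one match-ratio expression per shared column.

    columns1 / columns2 are assumed to be the column-name lists of the two
    tables, fetched beforehand (for example from dbc.columns).
    """
    in_table2 = set(columns2)
    # Keep only the columns both tables have, in table1's order.
    shared = [c for c in columns1 if c in in_table2]
    if not shared:
        raise ValueError("the two tables have no column names in common")

    exprs = []
    for col in shared:
        q = quote_ident(col)
        exprs.append(
            f"SUM(CASE WHEN {table1}.{q} = {table2}.{q} THEN 1 ELSE 0 END)/SUM(1) AS {q}"
        )

    # Joining with a separator means there is no trailing comma to strip.
    select_list = ",\n  ".join(exprs)
    return (
        f"SELECT\n  {select_list}\n"
        f"FROM {table1}\n"
        f"INNER JOIN {table2}\n"
        f"  ON {table1}.{key_field} = {table2}.{key_field};"
    )


if __name__ == "__main__":
    # Hypothetical column lists, standing in for the dbc.columns lookups.
    print(build_compare_sql(
        "table1", "table2", "keyField",
        ["keyField", "field1", "field2"],
        ["keyField", "field1", "field3"],
    ))
```

The table names and key field are spliced in as-is, just like in the procedure, so only pass names you trust. The resulting string is what you would hand to BTEQ or your driver of choice.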
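I tried to comment it well so that each part is explained, but there is some complicated stuff going on between using cursors for two different purposes (looping through a result set, and opening a result set to output from the procedure) and the dynamic SQL generated from the inputs and the columns found in dbc.columns.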