Parametrize columns in SELECT statement

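I am looking for a macro that runs a query comparing two tables with the same column names and outputs the match percentage for each field. I would like the macro to take the table names as inputs.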

For example, for two static tables:

SELECT 
 SUM(CASE WHEN table1.field1 = table2.field1 THEN 1 ELSE 0 END)/SUM(1)
,SUM(CASE WHEN table1.field2 = table2.field2 THEN 1 ELSE 0 END)/SUM(1)
,SUM(CASE WHEN table1.field3 = table2.field3 THEN 1 ELSE 0 END)/SUM(1)
....
....
,SUM(CASE WHEN table1.fieldN = table2.fieldN THEN 1 ELSE 0 END)/SUM(1)
FROM table1
INNER JOIN table2
    ON table1.keyField = table2.keyField

Is it possible to write a macro that generalizes this?

For example, a pseudo-query might look like this:

CREATE MACRO compareTables (table1 varChar(50),table2 varChar(50),keyField AS varchar(50)) AS (
    WITH sharedColumns (columnName) AS (
        SELECT columnname
        FROM dbc.columns
        WHERE
            tableName = :table1
        INTERSECT
        SELECT columnname
        FROM dbc.columns
        WHERE
            tableName = :table2)
  SELECT 
   SUM (
       CASE 
           WHEN :table1.<sharedColumn[1]> = :table2.<sharedColumn[1]>
           THEN 1
           ELSE 0
       END)/SUM(1)
  ....
   SUM (
       CASE 
           WHEN :table1.<sharedColumn[N]> = :table2.<sharedColumn[N]>
           THEN 1
           ELSE 0
       END)/SUM(1)
  FROM :table1
  INNER JOIN :table2
      ON :table1.:keyField = :table2.:keyField;);

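Is there any way to do this in Teradata without a UDF? (I don't have CREATE FUNCTION permission.) If that is the only way, I can put in a request, but I would rather not if it can be avoided.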

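I don't usually write out all the code on SO, but when I thought about this one, it sounded like fun.

As I noted in my comment on your question, you can't do this in a macro, because macro parameters can't be used as database objects. They only work for values in your data. So: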

 Select * From Table Where F1= :myparam;

is fine in a macro, but:

 Select * From Table Where :myparam = 'somevalue';

is not allowed.

However, you can do this in a stored procedure or in whatever scripting language you like.

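The catch is that you have two problems to solve.

  1. You need a list of the columns in your table, which then has to be used when building the comparison query.
  2. You have to build the comparison query dynamically from that column list, the two parameters holding the table names, and the one parameter holding the key field.

Neither of these requirements is trivial, but something like the following should do the job for you. It may need some tweaking, but I think it's close: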

CREATE PROCEDURE compareTables 
(
    IN table1 varChar(50),
    IN table2 varChar(50),
    IN keyField varchar(50),
    OUT dynamicallyCreatedSQL VARCHAR(10000)
) 
DYNAMIC RESULT SETS 1

BEGIN

    DECLARE outputSQLStatement VARCHAR(10000); --variable to hold your dynamically created sql statement that will produce the record set that we are outputting from this SP
    DECLARE columnSQLStatement VARCHAR(500); --variable to hold your dynamically created sql statement that will hold the columns in Table1
    DECLARE columnName VARCHAR(30); --Variable to stick the column name that we get from the column_cursor  
    DECLARE output_cursor CURSOR WITH RETURN ONLY FOR output_statement; --The dynamically created cursor that will hold your record set produced by outputSQLStatement
    DECLARE column_cursor CURSOR FOR column_statement; --The dynamically created cursor that will hold your record set produced by columnSQLStatement

    --The start of your dynamic output sql statement:
    SET outputSQLStatement = '
                SELECT ';   

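    --SQL Statement for your dynamically created cursor to get the columns for your table
    --TODO: Change "YourDatabaseHere" to your database...
    --Filter on the DatabaseName column; DATABASE by itself is the built-in that returns the session's default database.
    --TRIM guards against blank-padded names coming back from the dictionary view.
    SET columnSQLStatement = 'SELECT TRIM(ColumnName) FROM "DBC".Columns WHERE DatabaseName=''YourDatabaseHere'' AND TableName=''' || table1 || ''';';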

    --Prepare the dynamically generated column SQL statement for cursor.
    Prepare column_statement FROM columnSQLStatement;

    --Open the cursor and Loop through each record
    OPEN column_cursor;
    LABEL1: 
    LOOP

        --Grab the column name from the record into the variable columnName
        FETCH column_cursor INTO columnName;

        --WOAH THERE! No data was returned. Much sorries.
        -- SQLSTATE '02000' means the fetch found no (more) rows, so leave the loop.
        -- The check sits right after the FETCH so that a failed fetch never reaches the SET below.
        -- If the very first fetch finds nothing, your table probably isn't a table. You should check your parameters.
        IF (SQLSTATE ='02000') THEN
            LEAVE label1;
        END IF;

        --Now we can build the meat of that sql statement       
        SET outputSQLStatement = outputSQLStatement || '
                SUM (
                   CASE 
                       WHEN ' || table1 || '."' || columnName || '" = ' || table2 || '."' || columnName || '"
                       THEN 1
                       ELSE 0
                   END)/SUM(1) as "' || columnName || '",';

    --End the loop and close the cursor
    END LOOP LABEL1;
    CLOSE column_cursor;        

    --There's going to be an extra comma in there that we have to remove before the FROM part of the SQL statement, let's get rid of that:
    SET outputSQLStatement = Substring(OutputSQLStatement FROM 1 FOR Length(OutputSQLStatement) - 1);

    --Now complete the sql statement        
    Set outputSQLStatement = outputSQLStatement || '
                FROM ' || table1 || '
                INNER JOIN ' || table2 || '
                  ON ' || table1 || '.' || keyfield || ' = ' || table2 || '.' || keyfield || ';';

    --Set the output variable to the dynamically generated sql statement for debug fun.
    Set dynamicallyCreatedSQL = outputSQLStatement;

    --And finally... execute the statement by prepping it and opening the cursor. 
    --  we don't close the cursor so that the "Dynamic Result Sets 1" catches it and returns it to whatever calls this procedure.
    PREPARE output_statement FROM outputSQLStatement;
    OPEN output_cursor;

END;

You can call it like this:

CALL compareTables('table1', 'table2', 'yourkeyfield', output);

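That will return two record sets. The first will have the dynamically created SQL statement, which you can use for debugging. The second will be the record set you want.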

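If you don't have CREATE PROCEDURE access, then this is a wash. Either way, though, this is the approach you will have to use, whether it's in a Teradata SP, bash with BTEQ, or some other scripting language such as VBScript over ADO/ODBC.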
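
For the scripting-language route, the same two steps (get the column list, then build the query text) can be sketched in Python. This is only a rough illustration, not a drop-in tool: the function names build_comparison_sql and quote_ident are made up for this example, the two column lists are assumed to have been fetched from dbc.columns already by whatever driver you use, and column names are assumed to match case-insensitively. The table and key field names are pasted into the SQL as-is, so only pass values you trust.

```python
def quote_ident(name: str) -> str:
    """Wrap a column name in double quotes, doubling any embedded quote."""
    return '"' + name.strip().replace('"', '""') + '"'


def build_comparison_sql(table1, table2, key_field, columns1, columns2):
    """Return the text of the per-column match-rate query (hypothetical helper).

    columns1 / columns2 are the column names of each table, assumed to have
    been read from dbc.columns beforehand.
    """
    # Step 1: keep only the columns both tables share, in table1's order.
    in_table2 = {c.strip().upper() for c in columns2}
    shared = [c.strip() for c in columns1 if c.strip().upper() in in_table2]
    if not shared:
        raise ValueError("the two tables have no column names in common")

    # Step 2: one SUM(CASE ...)/SUM(1) expression per shared column.
    select_items = []
    for col in shared:
        q = quote_ident(col)
        select_items.append(
            f"SUM(CASE WHEN {table1}.{q} = {table2}.{q} "
            f"THEN 1 ELSE 0 END)/SUM(1) AS {q}"
        )

    # Joining with commas means there is no trailing comma to strip later.
    select_list = ",\n  ".join(select_items)
    return (
        f"SELECT\n  {select_list}\n"
        f"FROM {table1}\n"
        f"INNER JOIN {table2}\n"
        f"  ON {table1}.{key_field} = {table2}.{key_field};"
    )


if __name__ == "__main__":
    print(build_comparison_sql(
        "table1", "table2", "keyField",
        ["keyField", "field1", "field2"],
        ["keyField", "field1", "field2", "extraField"],
    ))
```

The generated text can then be handed to BTEQ or to your database driver, the same way the stored procedure hands it to PREPARE.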

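I tried to comment it well so that each part is explained, but there is some complicated stuff going on between using cursors for two different purposes (looping through a result set, and opening a result set to output from the procedure) and the dynamic SQL that is generated from the inputs and the columns found in dbc.columns.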