Difference Between Source and Target DB using Python

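I'm trying to use Python 3 to get row counts from two databases, Oracle (source) and Snowflake (target), and compare them to check for discrepancies in the ETL. I need to write both counts and the difference between them to a file.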

Here is what I have so far:

import cx_Oracle
import snowflake.connector
import sys
import csv
import os

exp_dir = os.path.normpath('C:/Users/user/Documents/')
exp_file_name = 'Count_Dff.csv'
exp_path = os.path.join(exp_dir, exp_file_name)

def runSQL(table):
    statement = "select '{0}', count(0) from  {0}".format(table.replace(' ',''))
    return statement


if __name__ == '__main__':
    """
    This function calls above functions and connects to Snowflake
    """
    tables = ['Table_1','Table_2']
    my_list = []
    try:
        conn_str = u'user/password@host/service'
        curcon = cx_Oracle.connect(conn_str) 
        cursor = curcon.cursor()

        ctx = snowflake.connector.connect(user='****', password='****', account='****', role='***')
        cursor2 = ctx.cursor()
        cursor2.execute("USE WAREHOUSE ****")
        cursor2.execute("USE DATABASE ****")
        cursor2.execute("USE SCHEMA ****")

        for table in tables:
            my_dict = {}
            sql = runSQL(table)
            cursor.execute(sql)
            my_list.append(cursor)

        outputFile = open(exp_path,'w') # 'wb'
        output = csv.writer(outputFile)
        for data in my_list:
            output.writerow(data)


    finally:
        cursor.close()
        cursor2.close()


Obviously this is not a complete solution. I'm a bit lost on what to do next. Any input?

Expected output:

| Table_Name | Source_Count | Target_Count | Difference |
| Table1     | 14           | 12           | 2          |

You are not running the SQL command against the Snowflake database. You should probably do something like this:

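for table in tables:
    sql = runSQL(table)
    # Row count from Oracle (source); index 1 of the result row is the count
    cursor.execute(sql)
    o_count = cursor.fetchone()[1]
    # Same query against Snowflake (target)
    cursor2.execute(sql)
    s_count = cursor2.fetchone()[1]
    my_list.append([table, o_count, s_count, o_count - s_count])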

Edit: added the difference in response to a comment.
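
For completeness, here is a rough sketch of how the loop above could be combined with the CSV output the question asks for. The function names (count_sql, compare_counts, write_report, make_demo_cursor) are made up for illustration and are not part of the original code. sqlite3 is used only as a stand-in so the example runs without Oracle or Snowflake; with real connections you would presumably pass cursor and cursor2 from the question instead of the demo cursors.

```python
import csv
import sqlite3  # stand-in for the demo only; not part of the original setup


def count_sql(table):
    # Same query shape as runSQL in the question: table name plus row count.
    name = table.replace(' ', '')
    return "select '{0}', count(*) from {0}".format(name)


def compare_counts(source_cursor, target_cursor, tables):
    """Return one [table, source_count, target_count, difference] row per table."""
    rows = []
    for table in tables:
        sql = count_sql(table)
        source_cursor.execute(sql)
        source_count = source_cursor.fetchone()[1]
        target_cursor.execute(sql)
        target_count = target_cursor.fetchone()[1]
        rows.append([table, source_count, target_count,
                     source_count - target_count])
    return rows


def write_report(rows, path):
    # newline='' keeps the csv module from writing blank lines on Windows.
    with open(path, 'w', newline='') as handle:
        writer = csv.writer(handle)
        writer.writerow(['Table_Name', 'Source_Count',
                         'Target_Count', 'Difference'])
        writer.writerows(rows)


def make_demo_cursor(row_count):
    # Hypothetical helper: an in-memory SQLite table standing in for a real DB.
    conn = sqlite3.connect(':memory:')
    conn.execute('create table Table_1 (id integer)')
    conn.executemany('insert into Table_1 values (?)',
                     [(i,) for i in range(row_count)])
    return conn.cursor()


# Demo: 14 rows in the "source", 12 in the "target".
report = compare_counts(make_demo_cursor(14), make_demo_cursor(12), ['Table_1'])
print(report)  # [['Table_1', 14, 12, 2]]
```

Calling write_report(report, exp_path) would then produce the header row followed by one line per table. Opening the file in a with block also closes it, which the code in the question never does.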