使用 Python 的源数据库和目标数据库之间的区别
Difference Between Source and Target DB using Python
我正在尝试使用 Python 3、Oracle(源)从两个数据库中获取行数并将其与 Snowflake(目标)进行比较以检查 ETL 中是否存在任何差异。我需要将两者的结果和它们之间的差异写入一个文件中。
这是我到目前为止的情况:
import cx_Oracle
import snowflake.connector
import sys
import csv
import os
exp_dir = os.path.normpath('C:/Users/user/Documents/')
exp_file_name = 'Count_Dff.csv'
exp_path = os.path.join(exp_dir, exp_file_name)
def runSQL(table):
statement = "select '{0}', count(0) from {0}".format(table.replace(' ',''))
return statement
if __name__ == '__main__':
"""
This function calls above functions and connects to Snowflake
"""
tables = ['Table_1','Table_2']
my_list = []
try:
conn_str = u'user/paswword@host/service'
curcon = cx_Oracle.connect(conn_str)
cursor = curcon.cursor()
ctx = snowflake.connector.connect(user='****', password='****', account='****', role='***')
cursor2 = ctx.cursor()
cursor2.execute("USE WAREHOUSE ****")
cursor2.execute("USE DATABASE ****")
cursor2.execute("USE SCHEMA ****")
for table in tables:
my_dict = {}
sql = runSQL(table)
cursor.execute(sql)
my_list.append(cursor)
outputFile = open(exp_path,'w') # 'wb'
output = csv.writer(outputFile)
for data in my_list:
output.writerow(data)
finally:
cursor.close()
cursor2.close()
显然这不是一个完整的解决方案。我对下一步有点迷茫。有任何输入吗?
预期输出:
| Table_Name | Source_Count | Target_Count | Difference |
| Table1 | 14 | 12 | 2 |
您不是针对 Snowflake 数据库的 运行 SQL 命令。你可能应该这样做:
for table in tables:
sql = runSQL(table)
cursor.execute(sql)
o_count = cursor.fetchone()[1]
cursor2.execute(sql)
s_count = cursor2.fetchone()[1]
my_list.append([table, o_count, s_count, o_count - s_count])
编辑
添加了对评论的回应差异。
我正在尝试使用 Python 3、Oracle(源)从两个数据库中获取行数并将其与 Snowflake(目标)进行比较以检查 ETL 中是否存在任何差异。我需要将两者的结果和它们之间的差异写入一个文件中。
这是我到目前为止的情况:
import cx_Oracle
import snowflake.connector
import sys
import csv
import os
exp_dir = os.path.normpath('C:/Users/user/Documents/')
exp_file_name = 'Count_Dff.csv'
exp_path = os.path.join(exp_dir, exp_file_name)
def runSQL(table):
statement = "select '{0}', count(0) from {0}".format(table.replace(' ',''))
return statement
if __name__ == '__main__':
"""
This function calls above functions and connects to Snowflake
"""
tables = ['Table_1','Table_2']
my_list = []
try:
conn_str = u'user/paswword@host/service'
curcon = cx_Oracle.connect(conn_str)
cursor = curcon.cursor()
ctx = snowflake.connector.connect(user='****', password='****', account='****', role='***')
cursor2 = ctx.cursor()
cursor2.execute("USE WAREHOUSE ****")
cursor2.execute("USE DATABASE ****")
cursor2.execute("USE SCHEMA ****")
for table in tables:
my_dict = {}
sql = runSQL(table)
cursor.execute(sql)
my_list.append(cursor)
outputFile = open(exp_path,'w') # 'wb'
output = csv.writer(outputFile)
for data in my_list:
output.writerow(data)
finally:
cursor.close()
cursor2.close()
显然这不是一个完整的解决方案。我对下一步有点迷茫。有任何输入吗?
预期输出:
| Table_Name | Source_Count | Target_Count | Difference | | Table1 | 14 | 12 | 2 |
您不是针对 Snowflake 数据库的 运行 SQL 命令。你可能应该这样做:
for table in tables:
sql = runSQL(table)
cursor.execute(sql)
o_count = cursor.fetchone()[1]
cursor2.execute(sql)
s_count = cursor2.fetchone()[1]
my_list.append([table, o_count, s_count, o_count - s_count])
编辑 添加了对评论的回应差异。