Python SQL query execution time

Question

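I have almost no experience with Python or SQL. I have been teaching myself both in order to complete my master's thesis.

I just wrote a small script to benchmark about 50 identically structured databases, as shown below: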

import thesis,pyodbc

# SQL Server settings
drvr = '{SQL Server Native Client 10.0}'
host = 'host_directory'
user = 'username'
pswd = 'password'
table = 'tBufferAux' # Found (by inspection) to be the table containing relevant data
column = 'Data'

# Establish a connection to SQL Server
cnxn = pyodbc.connect(driver=drvr, server=host, uid=user, pwd=pswd) # Setup connection

endRow = 'SELECT TOP 1 ' + column + ' FROM [' # Query template for ending row
with open(thesis.db_metadata_path(),'w') as file:
    for db in thesis.db_list():
        # Prepare queries
        countRows_query = 'SELECT COUNT(*) FROM [' + db + '].dbo.' + table
        firstRow_query = endRow + db + '].dbo.' + table + ' ORDER BY ' + column + ' ASC'
        lastRow_query = endRow + db + '].dbo.' + table + ' ORDER BY ' + column + ' DESC'
        # Execute queries
        N_rows = cnxn.cursor().execute(countRows_query).fetchone()[0]
        first_row = cnxn.cursor().execute(firstRow_query).fetchone()
        last_row = cnxn.cursor().execute(lastRow_query).fetchone()
        # Save output to text file
        file.write(db + ' ' + str(N_rows) + ' ' + str(first_row.Data) + ' ' + str(last_row.Data) + '\n')

# Close session
cnxn.cursor().close()
cnxn.close()

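I was surprised to find that this simple program takes almost 10 seconds to run, so I am wondering whether this is normal or whether some part of my code might be slowing execution down. (Keep in mind that the for loop runs only 56 times.)

Note that none of the functions in the (custom) thesis module has much impact, since they are all just variable assignments (except thesis.db_list(), which is a quick .txt file read).

Edit: This is the output .txt file generated by the program. The second column is the number of records in each database's table.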

Answer 1

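timeit is good for measuring and comparing the performance of single statements and code chunks (note that IPython has a built-in command that makes this easier).
Profilers break the measurement down by each function called (so they are more useful for larger amounts of code).
Note that a standalone program (more precisely, a program in an interpreted language) has startup (and shutdown) overhead.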
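To illustrate the timeit point, here is a minimal sketch that times one small piece of the question's loop, the query-string construction. The helper build_query and the database name "db_01" are hypothetical and exist only for this example; no database is contacted.

```python
import timeit


def build_query(db, table="tBufferAux"):
    # Same string concatenation as in the question's loop.
    # 'db' is a hypothetical database name used only for illustration.
    return 'SELECT COUNT(*) FROM [' + db + '].dbo.' + table


NUMBER = 100000  # calls per timing run

# Repeat the measurement 5 times and keep the fastest run, which is the
# run least disturbed by other activity on the machine.
runs = timeit.repeat(lambda: build_query("db_01"), number=NUMBER, repeat=5)
best = min(runs)

print("best of 5: %.3f us per call" % (best / NUMBER * 1e6))
```

Taking the minimum of several repeats is the usual way to read timeit results, since slower runs mostly reflect interference from other processes rather than the code itself.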

Overall, 10 seconds doesn't look like much for a program that accesses databases.

As a test, I wrapped your program in a profiler like this:

def main():
    <your program>

if __name__=='__main__':
    import cProfile
    cProfile.run('main()')

and ran it from Cygwin's bash like this:

T1=`date +%T,%N`; /c/Python27/python.exe ./t.py; echo $T1; date +%T,%N

The resulting table listed connect as the single largest time consumer (my machine is a very fast i7 3.9GHz/8GB with a local MSSQL and an SSD as the system disk):

     7200 function calls (7012 primitive calls) in 0.058 seconds

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
<...>
     1    0.003    0.003    0.058    0.058 t.py:1(main)
<...>
     1    0.043    0.043    0.043    0.043 {pyodbc.connect}
<...>
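A table like the one above is easier to read when sorted by cumulative time. The following is a sketch under stated assumptions: it targets Python 3 (the run above used Python 2.7), and fake_connect and fake_query are hypothetical stand-ins that merely sleep, so it runs without pyodbc or a server.

```python
import cProfile
import io
import pstats
import time


def fake_connect():
    # Hypothetical stand-in for pyodbc.connect(); sleeps to mimic setup cost.
    time.sleep(0.05)


def fake_query():
    # Hypothetical stand-in for one cursor.execute(...).fetchone() round trip.
    time.sleep(0.001)


def main():
    fake_connect()
    for _ in range(10):
        fake_query()


# Profile only main(), not the interpreter start-up.
profiler = cProfile.Profile()
profiler.enable()
main()
profiler.disable()

# Sort by cumulative time and print the five most expensive entries.
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(5)
report = buf.getvalue()
print(report)
```

Replacing the two stand-ins with the real connection and query calls should give the same kind of breakdown for the actual script.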

The date commands show that the program itself runs for about 300 ms, for a total overhead of about 250 ms:

<...>:39,782700900
<...>:40,072717400

(By excluding python from the command line, I confirmed that the overhead of the other commands is negligible, about 7 µs.)
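The same kind of wall-clock comparison can also be made from inside the script, which separates connection time from query time without relying on the shell. This is only a sketch: the timed helper is hypothetical, and the sleeps are stand-ins for pyodbc.connect(...) and the SELECT statements.

```python
import time
from contextlib import contextmanager

timings = {}  # label -> accumulated wall-clock seconds


@contextmanager
def timed(label):
    # Hypothetical helper: adds the elapsed wall-clock time of the
    # enclosed block to timings[label].
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[label] = timings.get(label, 0.0) + elapsed


with timed("connect"):
    time.sleep(0.05)  # stand-in for pyodbc.connect(...)

for _ in range(3):
    with timed("queries"):
        time.sleep(0.01)  # stand-in for the SELECTs run per database

# Print the phases, slowest first.
for label, seconds in sorted(timings.items(), key=lambda kv: -kv[1]):
    print("%-8s %.3f s" % (label, seconds))
```

If the "queries" total dominates on the real server, the time is going into the ORDER BY and COUNT(*) statements rather than into Python or the connection setup.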
