Python script to run and generate a CSV file every 10 minutes

Question

I have this Python code:

import csv
import pandas as pd
import numpy as np
import time
from pandas import Series,DataFrame


df = pd.read_csv('C:/Users/Desktop/case_study_1.csv',low_memory=False)

df.head()

#convert interaction_time to date time format
df.interaction_time = pd.to_datetime(df.interaction_time)

#remove null on merchant column
df_remove_null = df.dropna(subset=['merchant'])
#count added, confirmed txn
df_cnt = df_remove_null.groupby([pd.Grouper(key='interaction_time',freq='H'),df_remove_null.fullVisitorid,df_remove_null.action_type]).size().reset_index(name='count')
df_final_cnt = df_cnt.groupby(['interaction_time','action_type'])['fullVisitorid'].size().reset_index(name='count')

#export csv file
df_final_cnt.to_csv(r'C:\Users\Desktop\filename12.csv',index = False, columns = ["interaction_time","action_type","count"])

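As you can see, the code outputs a CSV file, which I save to a local directory. All I want is to run the code automatically every 10 minutes and generate a new CSV file, so that every 10 minutes the new CSV file overwrites the old one.

I don't know much about automation, so any kind of help is greatly appreciated.

I tried looping with range(100), but I got the error: IndentationError: expected an indented block

Thanks.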

Answer 1

If the script can keep running continuously, wrapping your code in the following will do the job every ten minutes:

import time

while True:

    # ... your code here ...

    time.sleep(600)

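The IndentationError means your code is indented incorrectly somewhere, and you need to find that spot. I suggest looking into formatting/linting tools for this.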
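To make the indentation point concrete, here is a minimal sketch of the range-based loop the question mentions. The names `generate_report` and `run_repeatedly` are hypothetical, and the function body is only a placeholder for the original script. Every statement that belongs to the loop has to be indented one level under the `for` line; a `for` line with no indented body is what raises "expected an indented block".

```python
import time


def generate_report():
    # Placeholder (hypothetical name) for the original script body:
    # read the input CSV, aggregate, and write the output CSV.
    print("report generated")


def run_repeatedly(job, runs, interval_seconds, sleep=time.sleep):
    """Call job() a fixed number of times, pausing after each call."""
    for _ in range(runs):
        # Both lines below are indented under the for statement,
        # so both are part of the loop body.
        job()
        sleep(interval_seconds)


# Example: run_repeatedly(generate_report, runs=100, interval_seconds=600)
```

The `sleep` parameter exists only so the pause can be swapped out when trying the loop without waiting ten minutes.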

Answer 2

You can put all the work in a function and use a module such as sched to call that function every 10 minutes:

import sched, time
import pandas as pd

sd = sched.scheduler(time.time, time.sleep)
def your_func(sc):
    df = pd.read_csv('C:/Users/Desktop/case_study_1.csv',low_memory=False)

    df.head()

    #convert interaction_time to date time format
    df.interaction_time = pd.to_datetime(df.interaction_time)

    #remove null on merchant column
    df_remove_null = df.dropna(subset=['merchant'])
    #count added, confirmed txn
    df_cnt =  df_remove_null.groupby([pd.Grouper(key='interaction_time',freq='H'),df_remove_null.fullVisitorid,df_remove_null.action_type]).size().reset_index(name='count')
    df_final_cnt = df_cnt.groupby(['interaction_time','action_type'])['fullVisitorid'].size().reset_index(name='count')

    #export csv file
    df_final_cnt.to_csv(r'C:\Users\Desktop\filename12.csv',index = False, columns = ["interaction_time","action_type","count"])
    sd.enter(600, 1, your_func, (sc,))

sd.enter(600, 1, your_func, (sd,))
sd.run()

This keeps a 10-minute gap between the end of one execution and the start of the next. (If your code takes 2 minutes to run, it will execute every 12 minutes.)
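If the runs should start every 10 minutes regardless of how long each one takes, one option is to compute each start time from the previous target rather than from when the job finished. The following is a rough sketch of that idea, not part of the answer above; `run_at_fixed_rate` and its parameters are hypothetical names, and it assumes the job finishes in less time than the interval.

```python
import time


def run_at_fixed_rate(job, interval_seconds, runs,
                      clock=time.monotonic, sleep=time.sleep):
    """Start job() every interval_seconds, measured start to start."""
    next_run = clock()
    for _ in range(runs):
        job()
        # Advance the target by a fixed step so the job's own
        # running time does not accumulate as drift.
        next_run += interval_seconds
        delay = next_run - clock()
        if delay > 0:
            sleep(delay)
        # If delay <= 0 the job overran the interval; the next run
        # starts immediately.


# Example: run_at_fixed_rate(your_job, interval_seconds=600, runs=100)
```

With a job that takes 2 minutes and a 600-second interval, each pause works out to 480 seconds, so runs begin 10 minutes apart instead of 12.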

Answer 3

I think the simplest solution is:

import time
while True:
    # your script
    time.sleep(600)  # 600 seconds = 10 minutes


This is an infinite loop; you can add a condition to break out of it.
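As one possible break condition, the loop could check for a "stop file" before each run and exit once that file exists. This is only a sketch under that assumption; `run_until_stopped` and the stop-file path are hypothetical, not something the answer specifies.

```python
import os
import time


def run_until_stopped(job, interval_seconds, stop_file, sleep=time.sleep):
    """Run job() repeatedly until stop_file exists; return the run count."""
    runs = 0
    while True:
        # Break condition: someone created the stop file.
        if os.path.exists(stop_file):
            break
        job()
        runs += 1
        sleep(interval_seconds)
    return runs


# Example (hypothetical path):
# run_until_stopped(your_job, 600, r"C:\Users\Desktop\stop.txt")
```

Creating the file at the given path ends the loop at the next check, without having to kill the Python process.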

Answer 4

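If you are not restricted to a Python-only implementation, a simple solution is to use Windows Task Scheduler to run the script every 10 minutes.

See the following thread: Run a task every x-minutes with Windows Task Scheduler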
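Besides the Task Scheduler GUI, the task can also be registered with the `schtasks` command-line tool. The sketch below only builds and prints such a command; the task name `GenerateCsvReport` and the script path are placeholders, and it assumes `python` is on the PATH of the account that runs the task.

```python
def build_schtasks_command(task_name, script_path,
                           every_minutes=10, python_exe="python"):
    """Build a schtasks command that runs a script every N minutes."""
    return [
        "schtasks", "/Create",
        "/SC", "MINUTE",            # schedule type: minutes
        "/MO", str(every_minutes),  # modifier: repeat every N minutes
        "/TN", task_name,           # task name shown in Task Scheduler
        "/TR", f'{python_exe} "{script_path}"',  # command to run
        "/F",                       # overwrite an existing task of that name
    ]


# Placeholder task name and script path; replace with your own.
command = build_schtasks_command("GenerateCsvReport",
                                 r"C:\path\to\your_script.py")
print(" ".join(command))
# To actually register the task on Windows, the list could be passed to
# subprocess.run(command, check=True).
```

With this approach the script itself runs once and exits, so it needs no loop or sleep call; the scheduler handles the repetition.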

python

csv

automation