Python 从文件中读取,只有在找不到字符串时才工作
Python read from a file, and only do work if a string isn't found
所以我正在尝试制作一个 reddit 机器人,它将执行提交的代码。我有自己的子程序来控制这些客户端。
while __name__ == '__main__':
string = open('config.txt').read()
for submission in subreddit.get_new(limit = 1):
if submission.url not in string:
f.write(submission.url + "\n")
f.close()
f = open('config.txt', "a")
string = open('config.txt').read()
所以这应该做的是从配置文件中读取,然后只有在提交 url 不在 config.txt 中时才工作。但是,它始终会看到最新的 post 并且可以正常工作。 F的打开方式是这样的
if not os.path.exists('file'):
open('config.txt', 'w').close()
f = open('config.txt', "a")
首先对您现有代码的批评(在评论中):
# the next two lines are not needed; open('config.txt', "a")
# will create the file if it doesn't exist.
if not os.path.exists('file'):
open('config.txt', 'w').close()
f = open('config.txt', "a")
# this is an unusual condition which will confuse readers
while __name__ == '__main__':
# the next line will open a file handle and never explicitly close it
# (it will probably get closed automatically when it goes out of scope,
# but it's not good form)
string = open('config.txt').read()
for submission in subreddit.get_new(limit = 1):
# the next line should check for a full-line match; as written, it
# will match "http://www.test.com" if "http://www.test.com/level2"
# is in config.txt
if submission.url not in string:
f.write(submission.url + "\n")
# the next two lines could be replaced with f.flush()
f.close()
f = open('config.txt', "a")
# this is a cumbersome way to keep your string synced with the file,
# and it never explicitly releases the new file handle
string = open('config.txt').read()
# If subreddit.get_new() doesn't return any results, this will act as
# a busy loop, repeatedly requesting new results as fast as possible.
# If that is undesirable, you might want to sleep here.
# file handle f should get closed after the loop
None 上面指出的问题应该会使您的代码无法正常工作(可能不精确的匹配除外)。但是更简单的代码可能更容易调试。这是一些做同样事情的代码。注意:我假设任何其他进程都不可能同时写入 config.txt。您可以使用 pdb 逐行尝试这段代码(或您的代码),看看它是否按预期工作。
import time
import praw
r = praw.Reddit(...)
subreddit = r.get_subreddit(...)
if __name__ == '__main__':
# open config.txt for reading and writing without truncating.
# moves pointer to end of file; closes file at end of block
with open('config.txt', "a+") as f:
# move pointer to start of file
f.seek(0)
# make a list of existing lines; also move pointer to end of file
lines = set(f.read().splitlines())
while True:
got_one = False
for submission in subreddit.get_new(limit=1):
got_one = True
if submission.url not in lines:
lines.add(submission.url)
f.write(submission.url + "\n")
# write data to disk immediately
f.flush()
...
if not got_one:
# wait a little while before trying again
time.sleep(10)
所以我正在尝试制作一个 reddit 机器人,它将执行提交的代码。我有自己的子程序来控制这些客户端。
while __name__ == '__main__':
string = open('config.txt').read()
for submission in subreddit.get_new(limit = 1):
if submission.url not in string:
f.write(submission.url + "\n")
f.close()
f = open('config.txt', "a")
string = open('config.txt').read()
所以这应该做的是从配置文件中读取,然后只有在提交 url 不在 config.txt 中时才工作。但是,它始终会看到最新的 post 并且可以正常工作。 F的打开方式是这样的
if not os.path.exists('file'):
open('config.txt', 'w').close()
f = open('config.txt', "a")
首先对您现有代码的批评(在评论中):
# the next two lines are not needed; open('config.txt', "a")
# will create the file if it doesn't exist.
if not os.path.exists('file'):
open('config.txt', 'w').close()
f = open('config.txt', "a")
# this is an unusual condition which will confuse readers
while __name__ == '__main__':
# the next line will open a file handle and never explicitly close it
# (it will probably get closed automatically when it goes out of scope,
# but it's not good form)
string = open('config.txt').read()
for submission in subreddit.get_new(limit = 1):
# the next line should check for a full-line match; as written, it
# will match "http://www.test.com" if "http://www.test.com/level2"
# is in config.txt
if submission.url not in string:
f.write(submission.url + "\n")
# the next two lines could be replaced with f.flush()
f.close()
f = open('config.txt', "a")
# this is a cumbersome way to keep your string synced with the file,
# and it never explicitly releases the new file handle
string = open('config.txt').read()
# If subreddit.get_new() doesn't return any results, this will act as
# a busy loop, repeatedly requesting new results as fast as possible.
# If that is undesirable, you might want to sleep here.
# file handle f should get closed after the loop
None 上面指出的问题应该会使您的代码无法正常工作(可能不精确的匹配除外)。但是更简单的代码可能更容易调试。这是一些做同样事情的代码。注意:我假设任何其他进程都不可能同时写入 config.txt。您可以使用 pdb 逐行尝试这段代码(或您的代码),看看它是否按预期工作。
import time
import praw
r = praw.Reddit(...)
subreddit = r.get_subreddit(...)
if __name__ == '__main__':
# open config.txt for reading and writing without truncating.
# moves pointer to end of file; closes file at end of block
with open('config.txt', "a+") as f:
# move pointer to start of file
f.seek(0)
# make a list of existing lines; also move pointer to end of file
lines = set(f.read().splitlines())
while True:
got_one = False
for submission in subreddit.get_new(limit=1):
got_one = True
if submission.url not in lines:
lines.add(submission.url)
f.write(submission.url + "\n")
# write data to disk immediately
f.flush()
...
if not got_one:
# wait a little while before trying again
time.sleep(10)