Finding a word on the following line (Python)

Question

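I am searching a text file for a certain string and then looking for another string after it, which could be on the next line or further down the document. What I currently have is shown further down.

The example text looks like this: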

there is a word1. then there is some more text. 
then we are looking for word2 = apple.

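I want to return the word 'apple' + word1. However, word2= can be on the next line or further down the document. I have managed the following, but it only works when word2 is on the very next line. If it is on line 3, 4, 5 and so on, it does not work. Can anyone help?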

if 'word1' in line and 'word2' not in line:        
    nextLine = next(f)
    pattern = re.match('(?:word2=|word2 =)([a-z0-9_])+',nextLine) 
    if pattern:    
        print('word1', pattern)

Answer 1

If I understood you correctly, here is an example I put together for you:

string = """

there is a word1. then there is some more text. 
then we are looking for word2 = apple. 


there is a word1. then there is some more text. 
then we are looking for word2 = orange. 



there is a word1. then there is some more text. 
then there is some more text. 
then there is some more text. 
then we are looking for word2= peer. 
"""


import re
# raw string, so that \s and \S reach the regex engine as written
result = re.findall(r".*?(word1)[\s\S]*?word2 *=.*?([a-z0-9_]+)", string)
print(result)
# should be [('word1', 'apple'), ('word1', 'orange'), ('word1', 'peer')]

Note: since I match against the whole string at once, my example may not be suitable for large files.
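For large files, a line-by-line variant may be a better fit. This is only a minimal sketch, assuming each word1 should be paired with the first word2 value that follows it. The names `find_pairs` and `WORD2`, and the use of `io.StringIO` as a stand-in for an open file, are my own choices for illustration:

```python
import io
import re

# Matches "word2=value" or "word2 = value" anywhere in a line.
WORD2 = re.compile(r"word2 *= *([a-z0-9_]+)")

def find_pairs(lines):
    """Yield ('word1', value) for each word1 that is later followed by a word2 value."""
    waiting = False  # True once word1 has been seen and its word2 has not yet
    for line in lines:
        if not waiting:
            pos = line.find("word1")
            if pos == -1:
                continue
            waiting = True
            # word2 may follow word1 on the same line, so keep the rest of it
            line = line[pos + len("word1"):]
        match = WORD2.search(line)
        if match:
            yield ("word1", match.group(1))
            waiting = False

# Stand-in for an open text file; any iterable of lines works.
sample = io.StringIO(
    "there is a word1. then there is some more text.\n"
    "then there is some more text.\n"
    "then we are looking for word2= peer.\n"
)
print(list(find_pairs(sample)))  # [('word1', 'peer')]
```

Only one line is held in memory at a time, at the cost of reporting at most one pair per line.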

Answer 2

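# fragment: assumes `import re` and that `line` comes from iterating over the open file `f`
if 'word1' in line and 'word2' not in line:
    while True:
        nextLine = next(f, None)  # None at end of file instead of StopIteration
        if nextLine is None:
            break
        # re.search rather than re.match: word2 is not at the start of the line
        pattern = re.search(r'word2 ?= ?([a-z0-9_]+)', nextLine)
        if pattern:
            print('word1', pattern.group(1))
            break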

I'm not sure whether this works, since I don't have access to a PC right now. Let me know; if it doesn't, I'll delete it.

Beware, though:

Are all infinite loops bad?

Is while (true) with break bad programming practice?
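Two details of the pattern from the question are easy to trip over in a loop like this, as far as I can tell. `re.match` only matches at the start of the string, so it will not find word2 in the middle of a line. And a quantifier placed outside the group, as in `([a-z0-9_])+`, keeps only the last repetition. A small sketch using the sample line from the question (the variable names are just for illustration):

```python
import re

line = "then we are looking for word2 = apple."

# re.match anchors at the start of the line; re.search scans the whole line.
at_start = re.match(r"word2 ?= ?([a-z0-9_]+)", line)
anywhere = re.search(r"word2 ?= ?([a-z0-9_]+)", line)
print(at_start)           # None
print(anywhere.group(1))  # apple

# With the '+' outside the parentheses, the group holds only the last character.
last_only = re.search(r"word2 ?= ?([a-z0-9_])+", line)
print(last_only.group(1))  # e
```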

Answer 3

You should read your complete file into a single string and then try this. It captures word1, as well as whatever word2 is equal to, using capturing groups:

(word1)(?:.*[\n\r]?)+word2 ?= ?(\w+)

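It is not clear from your question whether we should match word2 = apple or word2=apple (maybe your last mention of word2= was a typo?), so I included the ? characters, which make the spaces optional.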

If you want your answer formatted as apple + word1, you can do this:

# group(2) is the value after word2, group(1) is word1
print(pattern.group(2) + " + " + pattern.group(1))
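Putting the pieces together, here is a rough sketch of the whole-file approach. It assumes the text is already in one string (in practice it would come from reading the file). Note that I swapped in a lazy `.*?` with `re.DOTALL` in place of the `(?:.*[\n\r]?)+` group. That is my own substitution, not the pattern above, chosen because it stops at the first word2 after word1 and avoids a nested quantifier, which can backtrack heavily when word2 is missing. The names `PAIR` and `format_pair` are hypothetical:

```python
import re

# Stand-in for the file contents; in practice this would come from reading the file.
text = (
    "there is a word1. then there is some more text.\n"
    "then there is some more text.\n"
    "then we are looking for word2 = apple.\n"
)

# DOTALL lets '.' cross line breaks; the lazy '.*?' stops at the first word2.
PAIR = re.compile(r"(word1).*?word2 ?= ?(\w+)", re.DOTALL)

def format_pair(text):
    """Return 'value + word1' for the first pair found, or None if there is none."""
    pattern = PAIR.search(text)
    if pattern is None:
        return None
    return pattern.group(2) + " + " + pattern.group(1)

print(format_pair(text))  # apple + word1
```

Checking for `None` before calling `group` matters here, since `re.search` returns `None` when word2 never shows up after word1.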


Tags: python, regex, io