Finding a word on a following line in Python
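I am searching a text file for a certain string and then, after that string, looking for another one, which may be on the next line or further down the document.
The sample text looks something like this: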
there is a word1. then there is some more text.
then we are looking for word2 = apple.
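I want to return the word 'apple' together with word1. However, word2= may be on the next line or further down the document. I have managed the following, but it only works when word2 is on the very next line; if it is on line 3, 4, 5, etc., it does not. Can anyone help?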
if 'word1' in line and 'word2' not in line:
    nextLine = next(f)
    pattern = re.match('(?:word2=|word2 =)([a-z0-9_])+',nextLine)
    if pattern:
        print('word1', pattern)
If I understood you correctly, here is an example I put together:
string = """
there is a word1. then there is some more text.
then we are looking for word2 = apple.
there is a word1. then there is some more text.
then we are looking for word2 = orange.
there is a word1. then there is some more text.
then there is some more text.
then there is some more text.
then we are looking for word2= peer.
"""
import re
result = re.findall(r".*?(word1)[\s\S]*?word2 *=.*?([a-z0-9_]+)", string)
print(result)
# should be [('word1', 'apple'), ('word1', 'orange'), ('word1', 'peer')]
Note: because I match against the whole string at once, my example may not be suitable for large files.
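For large files, one alternative to whole-string matching is to read line by line and remember whether word1 has been seen. The following is only a minimal sketch of that idea: the names `WORD2` and `pair_words` are made up for illustration, and it assumes that each word1 should be paired with the nearest following word2.

```python
import io
import re

# Hypothetical pattern: "word2", optional spaces around "=", then the value.
WORD2 = re.compile(r"word2 *= *([a-z0-9_]+)")


def pair_words(lines):
    """Yield ('word1', value) for each word1 that is later followed by word2."""
    waiting = False  # True once word1 has been seen and word2 has not yet
    for line in lines:
        start = 0
        if not waiting:
            start = line.find("word1")
            if start == -1:
                continue  # nothing to do until word1 shows up
            waiting = True
        # Only look for word2 at or after word1 on the line where word1 occurs.
        match = WORD2.search(line, start)
        if match:
            yield ("word1", match.group(1))
            waiting = False


# io.StringIO stands in for an open file; any iterable of lines works.
sample = io.StringIO(
    "there is a word1. then there is some more text.\n"
    "then there is some more text.\n"
    "then we are looking for word2= peer.\n"
)
print(list(pair_words(sample)))
```

Because the function only iterates over its argument, an open file object can be passed directly, and only one line is held in memory at a time.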
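if 'word1' in line and 'word2' not in line:
    while True:
        nextLine = next(f, None)  # None once the file is exhausted
        if nextLine is None:
            break
        # re.search, because word2 is not at the start of the line
        pattern = re.search(r'word2 ?= ?([a-z0-9_]+)', nextLine)
        if pattern:
            print('word1', pattern.group(1))
            break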
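I am not sure whether this works, as I don't have access to a PC right now. Let me know, and if it doesn't work I will delete it.
A word of caution, though: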
Are all infinite loops bad?
Is while (true) with break bad programming practice?
You should read your complete file into a single string and then try this. It captures word1, and whatever word2 is equal to, using capturing groups:
(word1)(?:.*[\n\r]?)+word2 ?= ?(\w+)
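From your question it is not clear whether we should match word2 = apple or word2=apple (perhaps your last mention of word2= was a typo?), so I included the ? characters, which make the spaces optional.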
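To check the optional-space behaviour, here is a small sketch. Note that it swaps the middle part of the pattern above for a lazy `.*?` with `re.DOTALL`, so that the match ends at the first word2 after word1. That substitution, and the names `PATTERN` and `find_pair`, are assumptions made for this example.

```python
import re

# Lazy ".*?" with re.DOTALL spans line breaks and stops at the first word2.
# " ?" on either side of "=" makes each space optional.
PATTERN = re.compile(r"(word1).*?word2 ?= ?(\w+)", re.DOTALL)


def find_pair(text):
    """Return (word1, value) for the first word1 ... word2 pair, or None."""
    match = PATTERN.search(text)
    return match.groups() if match else None


with_spaces = "there is a word1.\nsome more text.\nword2 = apple."
without_spaces = "there is a word1.\nword2=apple."

print(find_pair(with_spaces))
print(find_pair(without_spaces))
print(find_pair("no keywords here"))
```

Both spellings give the same pair, and the function returns None when either keyword is missing.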
If you want your answer formatted as apple + word1, you can do this:
print(pattern.group(2) + " + " + pattern.group(1))