Using regular expressions in Python to extract location mentions in a sentence

Question

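I am writing Python code to extract the names of roads, streets, and highways. For example, given a sentence like "There is an accident along Uhuru Highway", I want my code to extract the name of the highway that is mentioned. I wrote the code below.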

sentence = "there is an accident along uhuru highway"
listw = sentence.lower().split()
for i in range(1, len(listw)):  # start at 1 so listw[i-1] is always the preceding word
    if listw[i] == "highway":
        print(listw[i-1] + " " + listw[i])

This works, but my code is not optimized. I am thinking of using regular expressions instead. Please help.

Answer 1

If the location you want to extract is always followed by "highway", you can use:

>>> import re
>>> sentence = "there is an accident along uhuru highway"
>>> a = re.search(r'.* ([\w\s\d\-\_]+) highway', sentence)
>>> print(a.group(1))
uhuru
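
One caveat: `re.search` returns `None` when the sentence contains no match, so calling `.group(1)` on the result directly would raise `AttributeError`. A minimal sketch of a guarded version follows. The helper name `name_before_highway` is hypothetical and only for illustration; the pattern is the one from this answer.

```python
import re

def name_before_highway(sentence):
    """Return the word right before ' highway', or None if nothing matches."""
    # The greedy '.* ' consumes as much as it can, so the capture group
    # ends up holding only the last word before ' highway'.
    match = re.search(r'.* ([\w\s\d\-\_]+) highway', sentence)
    return match.group(1) if match else None

print(name_before_highway("there is an accident along uhuru highway"))
print(name_before_highway("there is an accident in town"))
```

The first call prints the captured name and the second prints `None` instead of failing.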

Answer 2

'uhuru highway' can be found as follows:

import re

sentence = "there is an accident along uhuru highway"
m = re.search(r'\S+ highway', sentence)  # non-whitespace run followed by ' highway'
print(m.group())
# uhuru highway
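
Since the question also mentions roads and streets, the same idea might be extended with an alternation and `re.findall`. This is only a sketch under assumptions: the keyword list, the helper name `find_road_mentions`, and the second road name in the sample sentence are illustrative, not from the question.

```python
import re

# Assumed keyword list; extend it with whatever road types you need.
ROAD_PATTERN = re.compile(r'\S+ (?:road|street|highway)\b', re.IGNORECASE)

def find_road_mentions(sentence):
    """Return every 'name keyword' pair found in the sentence, in order."""
    # The group is non-capturing, so findall returns the whole match.
    return ROAD_PATTERN.findall(sentence)

print(find_road_mentions("Accident along Uhuru Highway, traffic diverted to Moi Road"))
```

`re.IGNORECASE` avoids having to lowercase the sentence first, so the original capitalization of the name is preserved. Like the pattern above, this only picks up the single word before the keyword.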

Answer 3

You can do this without regular expressions, as follows:

sentence.split("highway")[0].strip().split(' ')[-1]

First, split on "highway". You get:

['there is an accident along uhuru ', '']

After strip() removes the trailing space, you can easily extract the last word of the first part.
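
Note that if "highway" does not occur in the sentence, `split` returns the whole sentence as its only element, so the one-liner would silently return the sentence's last word. A minimal sketch that guards against this using `str.partition` follows; the helper name `word_before` is hypothetical.

```python
def word_before(sentence, keyword="highway"):
    """Return the word preceding the first occurrence of keyword, or None."""
    # partition returns (head, separator, tail); separator is '' if keyword is absent.
    head, sep, _ = sentence.partition(keyword)
    words = head.split()
    if not sep or not words:
        return None
    return words[-1]

print(word_before("there is an accident along uhuru highway"))
print(word_before("there is an accident in town"))
```

The second call prints `None` rather than "town", which is what the plain `split` version would have returned.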
