Search inside one string and eliminate the found positions in an accompanying list

Question

I have a string of text and an accompanying list that holds information about each character of the string. For example:

text="this, and this are test elems"
textInfo=[1, 4, 6, 7, 8, 3, 6, 2, 4, ... 7, 0]

Each position in the list refers to one character of the text, i.e. len(text) == len(textInfo), where textInfo[i] holds the information for the i-th character of the text.

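I want to remove the instances of "this" from the text and, at the same time, remove the list positions that refer to those characters (i.e. 4 positions, corresponding to the information for "t", "h", "i" and "s").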

My brute-force approach is something like:

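import re

tmpText = text
tmpTextInfo = textInfo
m = re.search("this", tmpText)
while m:
  # Cut the match out of the text and the same span out of the info list.
  tmpText = tmpText[0:m.start()] + tmpText[m.end():]
  tmpTextInfo = tmpTextInfo[0:m.start()] + tmpTextInfo[m.end():]
  # Search again from the beginning of the shortened text.
  m = re.search("this", tmpText)
text = tmpText
textInfo = tmpTextInfo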

This works and does what I expect. For example, if the input is

text = "this test this is"
textInfo = [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]

then the resulting text and textInfo will be

text=" test  is"
textInfo=[4, 5, 6, 7, 8, 9, 14, 15, 16]

But it does not look Pythonic to me at all, and I am sure there is a more compact and more efficient way to do this. Is there?

Answer 1

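Well, if you really need a regular expression, I don't think I would do anything differently. At most you could store a list of the string pieces you want to keep and join them at the end, but that would not improve readability.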
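
For illustration, the "store the pieces and join them at the end" variant could look roughly like the sketch below. The function name remove_matches is hypothetical, and it assumes a single left-to-right pass with re.finditer is acceptable. Unlike the loop in the question, it does not rescan the shortened text, so a new "this" formed by joining the leftovers would not be removed.

```python
import re

def remove_matches(pattern, text, text_info):
    """Hypothetical helper: drop every match of pattern from text
    and the corresponding slots from text_info (one pass, no rescan)."""
    kept_text = []   # pieces of text between matches
    kept_info = []   # info entries for the kept characters
    pos = 0          # start of the next span to keep
    for m in re.finditer(pattern, text):
        kept_text.append(text[pos:m.start()])
        kept_info.extend(text_info[pos:m.start()])
        pos = m.end()
    # Keep whatever follows the last match.
    kept_text.append(text[pos:])
    kept_info.extend(text_info[pos:])
    return "".join(kept_text), kept_info

new_text, new_info = remove_matches("this", "this test this is", list(range(17)))
print(repr(new_text))  # ' test  is'
print(new_info)        # [4, 5, 6, 7, 8, 9, 14, 15, 16]
```

This avoids rebuilding the whole string and list on every match, at the cost of a few more lines.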

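If the problem is about removing tokens delimited by spaces or punctuation, you could use a generator that yields each token paired with its corresponding slice of textInfo. You could then filter on the token (and on the attached information too, if needed) and reassemble the two lists. But I'm not sure what that would gain you, really.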
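
A minimal sketch of that token-based idea, under the assumption that a token is either a run of word characters or a run of non-word characters (so the tokens together cover the whole text). The names tokens_with_info and drop_tokens are made up for this example. Note that it only removes "this" as a whole token, so "this" inside a longer word would be left alone, which is different from the regex search in the question.

```python
import re

def tokens_with_info(text, text_info):
    """Hypothetical generator: yield (token, info_slice) pairs.
    Tokens alternate between word runs and separator runs."""
    for m in re.finditer(r"\w+|\W+", text):
        yield m.group(), text_info[m.start():m.end()]

def drop_tokens(text, text_info, unwanted):
    """Filter out unwanted tokens, then rebuild the text and the info list."""
    kept = [(tok, info)
            for tok, info in tokens_with_info(text, text_info)
            if tok not in unwanted]
    new_text = "".join(tok for tok, _ in kept)
    new_info = [x for _, info in kept for x in info]
    return new_text, new_info

filtered_text, filtered_info = drop_tokens(
    "this, and this are test elems", list(range(29)), {"this"})
print(repr(filtered_text))  # ', and  are test elems'
```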
