Only sub out non-alphanumeric characters in regex

Question

I am trying to use re.sub to remove symbols from a string in Python:

re.sub(r"(?![a-z0-9])", "_", "some:long:str-:that:can't+have+symbols".lower())

The result I am looking for is:

some_long_str__that_can_t_have_symbols

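But it doesn't work. I could certainly use findall() and then join() the alphanumeric matches to build a new string, but that drops the symbols entirely instead of replacing them, so I ended up writing some inefficient for loops.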

I think the problem is in how I am negating my expression. Any ideas?

Answer 1

Use a negated character class instead:

import re

# [^a-z0-9] matches (and consumes) one character that is not a lowercase letter or digit.
result = re.sub(r"[^a-z0-9]", "_", "some:long:str-:that:can't+have+symbols".lower())
print(result)

Output:

some_long_str__that_can_t_have_symbols
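
As for why the original pattern fails: (?![a-z0-9]) is a negative lookahead, which is a zero-width assertion. It matches the empty position in front of each symbol (and at the end of the string) without consuming the symbol itself, so re.sub inserts underscores rather than replacing anything. The short comparison below is only an illustrative sketch; the variable names are made up for this example.

```python
import re

text = "some:long:str-:that:can't+have+symbols".lower()

# Negative lookahead: zero-width, so "_" is inserted before each symbol
# (and at the end of the string) while the symbols themselves remain.
lookahead_result = re.sub(r"(?![a-z0-9])", "_", text)

# Negated character class: consumes exactly one non-alphanumeric
# character per match, so each symbol is replaced by "_".
class_result = re.sub(r"[^a-z0-9]", "_", text)

print(lookahead_result)  # symbols such as ":" and "+" are still present
print(class_result)      # some_long_str__that_can_t_have_symbols
```

If the lookahead form is preferred for some reason, following it with a dot, as in (?![a-z0-9]). , makes it consume the offending character, but the character class is simpler to read.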
