How do I use regex to get a string between two characters and remove certain characters inside that string?
I have a long string that I want to filter using regex:
<@961483653468439706> Text to remove, this text is useless, that's why i want it gone!
i want this: `keep the letters and spaces`
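I want to keep the text between the ` characters.
The only problem is that there is an invisible character between every character of the part of the string I want.
You can see the invisible characters on regex101: https://regex101.com/r/rAYrMT/1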
`([\'^\w]*)`
So in short: keep everything between the ` characters, except the invisible character described here: https://apps.timwhitlock.info/unicode/inspect?s=%EF%BB%BF
You can filter out the non-printable characters:
import re
from string import printable
# your invisibles are in the string...
s='''<@961483653468439706> Text to remove, this text is useless, that's why i want it gone!
Type `keep the letters and spaces` and `this too`'''
for m in re.findall(r'`([^`]*)`', s):
    print(repr(m))
    print(''.join([c for c in m if c in printable]))
    print()
Prints:
'k\ufeffe\ufeffe\ufeffp\ufeff \ufefft\ufeffh\ufeffe\ufeff \ufeffl\ufeffe\ufefft\ufefft\ufeffe\ufeffr\ufeffs a\ufeffn\ufeffd s\ufeffp\ufeffa\ufeffc\ufeffe\ufeffs'
keep the letters and spaces
'this too'
this too
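One caveat: `string.printable` only contains ASCII characters, so the filter above would also drop accented letters and any other non-ASCII text. If the only unwanted character is U+FEFF (which is what the `repr` output above shows), a narrower approach might look like the sketch below. The helper name `extract_backticked` and the sample string are made up for illustration.

```python
import re

# U+FEFF (zero width no-break space / BOM), the invisible character
# visible in the repr() output above.
ZWNBSP = '\ufeff'

def extract_backticked(text):
    """Return the text between each pair of backticks, with U+FEFF removed."""
    return [m.replace(ZWNBSP, '') for m in re.findall(r'`([^`]*)`', text)]

# Hypothetical sample: invisible characters inside the first passage,
# plus a non-ASCII letter that should survive the cleanup.
sample = 'junk `k\ufeffe\ufeffe\ufeffp caf\u00e9` more junk `this too`'
print(extract_backticked(sample))
```

This keeps everything except the one character being targeted, at the cost of not catching other invisible characters that might appear in the input.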
You don't need regex for this:
text = "<@961483653468439706> Text to remove, this text is useless, that's " \
"why i want it gone!Type `keep the letters and spaces`"
# Remove the invisible character. It is U+FEFF (see the inspector link in the
# question), written here as an escape so it stays visible in the source.
filtered = text.replace('\ufeff', '')
# Because the passage you want is always between backticks, you can split on them:
# the items at odd indexes (1, 3, ...) of the resulting list are the passages
# you are looking for.
passage = filtered.split('`')[1::2]
print(passage)
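If the input might contain other invisible characters besides U+FEFF, one option is to drop every character in the Unicode "format" category (`Cf`), which includes U+FEFF and the zero width space U+200B. This is only a sketch under that assumption; the function names and the sample string are hypothetical. Note that `Cf` also covers the zero width joiner used in some emoji and scripts, so this may be too aggressive for such text.

```python
import unicodedata

def strip_format_chars(text):
    """Drop every character whose Unicode general category is Cf (format)."""
    return ''.join(c for c in text if unicodedata.category(c) != 'Cf')

def between_backticks(text):
    """Clean the text, then return the pieces found between backtick pairs."""
    # After splitting on backticks, the odd-indexed items are the enclosed parts.
    return strip_format_chars(text).split('`')[1::2]

# Hypothetical sample with U+FEFF and U+200B mixed into the passages.
sample = 'noise `a\ufeffb\u200bc` noise `d\ufeffe`'
print(between_backticks(sample))
```

Like the split-based answer above, this assumes the backticks are balanced; an unmatched backtick would make the trailing text show up as a passage.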