Get content of string before '+' character in Python
Input:
s = 'Coated tablet + ALFUZOSIN HYDROCHLORIDE, Film-coated tablet + ALFUZOSIN HYDROCHLORIDE, Modified-release tablet + ALFUZOSIN HYDROCHLORIDE, Prolonged-release tablet + ALFUZOSIN HYDROCHLORIDE'
Expected output:
s = 'Coated tablet, Film-coated tablet, Modified-release tablet, Prolonged-release tablet'
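For every string like this, how can I get the required output in Python, so that everything after each + is dropped?
Split on ',', then split each piece on '+' and take the item at index 0: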
', '.join([i.split("+")[0].strip() for i in s.split(",")])
Output:
'Coated tablet, Film-coated tablet, Modified-release tablet, Prolonged-release tablet'
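The split-and-take-index-0 idea can be wrapped in a small helper so it can be reused on other strings. This is only a sketch: the function name strip_after_plus is hypothetical, and it uses str.partition instead of split("+")[0], which behaves the same here and leaves items without a '+' unchanged.

```python
def strip_after_plus(s: str) -> str:
    """Keep only the text before the first '+' in each comma-separated item."""
    # partition returns (head, sep, tail); head is the whole item if '+' is absent
    return ', '.join(item.partition('+')[0].strip() for item in s.split(','))


s = 'Coated tablet + ALFUZOSIN HYDROCHLORIDE, Film-coated tablet + ALFUZOSIN HYDROCHLORIDE'
print(strip_after_plus(s))          # Coated tablet, Film-coated tablet
print(strip_after_plus('Capsule'))  # Capsule
```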
Using a regular expression:
import re
old_s = 'Coated tablet + ALFUZOSIN HYDROCHLORIDE, Film-coated tablet + ALFUZOSIN HYDROCHLORIDE, Modified-release tablet + ALFUZOSIN HYDROCHLORIDE, Prolonged-release tablet + ALFUZOSIN HYDROCHLORIDE'
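# Replace each ' + ...' run (up to the next comma or the end) with a comma, then drop the trailing comma.
new_s = re.sub(r'\s\+.*?,|\s\+.*?$', ',', old_s)[:-1]
print(new_s)
# Coated tablet, Film-coated tablet, Modified-release tablet, Prolonged-release tablet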
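On the left of the pipe, \s matches a whitespace character and \+.*?, matches everything from the + up to the next comma. On the right, $ takes the place of the comma for the final item, which has no comma after it.
The [:-1] is needed because every match is replaced with a comma, and you do not want a comma at the end of the string.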
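If the trailing-comma cleanup feels fragile, a lookahead can stand in for both the comma/$ alternation and the [:-1] slice. The following is a sketch of that alternative, not the pattern from the answer above; the name strip_plus_tail is hypothetical.

```python
import re


def strip_plus_tail(s: str) -> str:
    # Remove optional whitespace, a '+', and everything up to (but not
    # including) the next comma or the end of the string.
    return re.sub(r'\s*\+.*?(?=,|$)', '', s)


old_s = 'Coated tablet + ALFUZOSIN HYDROCHLORIDE, Film-coated tablet + ALFUZOSIN HYDROCHLORIDE'
print(strip_plus_tail(old_s))  # Coated tablet, Film-coated tablet
```

Because the lookahead does not consume the comma, the separators stay in place and nothing has to be re-inserted or sliced off.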
Using a regular expression that deletes from the ' + ' onward, for as long as it keeps finding characters that are not a comma:
import re
s = 'Coated tablet + ALFUZOSIN HYDROCHLORIDE, Film-coated tablet + ALFUZOSIN HYDROCHLORIDE, Modified-release tablet + ALFUZOSIN HYDROCHLORIDE, Prolonged-release tablet + ALFUZOSIN HYDROCHLORIDE'
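re.sub(r" [+] [^,]+", "", s)
# returns 'Coated tablet, Film-coated tablet, Modified-release tablet, Prolonged-release tablet'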
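Since the question asks about every string of this kind, the same pattern can be compiled once and applied to a whole list. This is a minimal sketch; the names PLUS_TAIL, clean_all and rows are hypothetical, and the sample rows are shortened versions of the input above.

```python
import re

# Same pattern as in the answer above, compiled once for reuse.
PLUS_TAIL = re.compile(r" [+] [^,]+")


def clean_all(strings):
    """Strip the ' + ...' part from every comma-separated item of each string."""
    return [PLUS_TAIL.sub("", s) for s in strings]


rows = [
    'Coated tablet + ALFUZOSIN HYDROCHLORIDE, Film-coated tablet + ALFUZOSIN HYDROCHLORIDE',
    'Modified-release tablet + ALFUZOSIN HYDROCHLORIDE',
]
print(clean_all(rows))
# ['Coated tablet, Film-coated tablet', 'Modified-release tablet']
```

Note that this pattern requires a space on both sides of the '+', so an item written as 'A+B' would be left untouched.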