Get content of string before '+' character in Python

Input:

s = 'Coated tablet + ALFUZOSIN HYDROCHLORIDE, Film-coated tablet + ALFUZOSIN HYDROCHLORIDE, Modified-release tablet + ALFUZOSIN HYDROCHLORIDE, Prolonged-release tablet + ALFUZOSIN HYDROCHLORIDE'

Expected output:

s = 'Coated tablet, Film-coated tablet, Modified-release tablet, Prolonged-release tablet'

For each string like this, how can I get the required output in Python, so that everything after the + in each element is dropped?

Split on ',', then split each piece on '+' and take the item at index 0:
', '.join([i.split("+")[0].strip() for i in s.split(",")])

Output:

'Coated tablet, Film-coated tablet, Modified-release tablet, Prolonged-release tablet'
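If this needs to run over many strings, the one-liner can be wrapped in a small helper. The sketch below is only an illustration: the name strip_after_plus and its parameters are made up here, and it uses str.partition instead of split("+")[0], which gives the same result for this input.

```python
def strip_after_plus(text, sep=",", marker="+"):
    """Keep only the part of each sep-separated item that precedes marker."""
    parts = []
    for item in text.split(sep):
        # partition returns (head, marker, tail); head is the whole item
        # when marker is absent, so items without '+' pass through unchanged.
        head, _, _ = item.partition(marker)
        parts.append(head.strip())
    return (sep + " ").join(parts)


s = 'Coated tablet + ALFUZOSIN HYDROCHLORIDE, Film-coated tablet + ALFUZOSIN HYDROCHLORIDE'
print(strip_after_plus(s))
```

Because each piece is stripped before joining, irregular spacing around the commas in the input is normalized to a single ', '.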

Using a regular expression:

import re

s = 'Coated tablet + ALFUZOSIN HYDROCHLORIDE, Film-coated tablet + ALFUZOSIN HYDROCHLORIDE, Modified-release tablet + ALFUZOSIN HYDROCHLORIDE, Prolonged-release tablet + ALFUZOSIN HYDROCHLORIDE'
# No spaces around '|': inside a pattern they are literal characters and
# would also consume the space that follows each comma.
new_s = re.sub(r'\s\+.*?,|\s\+.*?$', ',', s)[:-1]

print(new_s)
# Coated tablet, Film-coated tablet, Modified-release tablet, Prolonged-release tablet

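To the left of the pipe, \s matches a whitespace character and \+.*?, lazily matches everything from the + up to the next comma. To the right of the pipe, $ takes the place of the comma to handle the final case, which has no comma after it.

The [:-1] is there because every match is replaced by a comma, including the last one, and you don't want a comma at the end of the string.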
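As a variation on this answer (not part of it), the trailing slice can be avoided by not consuming the comma at all. A minimal sketch, assuming a shortened sample string: the lookahead (?=,|$) only checks that a comma or the end of the string follows, so each match can be replaced with an empty string.

```python
import re

# Shortened sample input, used here only for illustration.
s = 'Coated tablet + ALFUZOSIN HYDROCHLORIDE, Film-coated tablet + ALFUZOSIN HYDROCHLORIDE'

# \s\+   one whitespace character followed by a literal '+'
# .*?    as few characters as possible...
# (?=,|$) ...up to, but not including, the next comma or the end of the string
new_s = re.sub(r'\s\+.*?(?=,|$)', '', s)

print(new_s)
# Coated tablet, Film-coated tablet
```

Since the commas are never part of a match, nothing has to be put back and nothing has to be trimmed afterwards.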

Using a regular expression that removes everything from the + onward, for as long as the characters are not commas:

import re
s = 'Coated tablet + ALFUZOSIN HYDROCHLORIDE, Film-coated tablet + ALFUZOSIN HYDROCHLORIDE, Modified-release tablet + ALFUZOSIN HYDROCHLORIDE, Prolonged-release tablet + ALFUZOSIN HYDROCHLORIDE'

# Raw string for the pattern; the result is assigned so it can be used in a script.
new_s = re.sub(r" [+] [^,]+", "", s)
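Note that the pattern above expects exactly one space on each side of the +. If the real data is less regular, a more tolerant pattern may be safer. The sketch below is an assumption about what such input could look like, and the helper name drop_plus_suffix is hypothetical.

```python
import re

# \s*    any amount of whitespace before the '+', including none
# \+     a literal '+'
# [^,]*  everything up to, but not including, the next comma
PLUS_SUFFIX = re.compile(r"\s*\+[^,]*")


def drop_plus_suffix(text):
    """Remove the '+ ...' suffix from every comma-separated item."""
    return PLUS_SUFFIX.sub("", text)


print(drop_plus_suffix('A+X, B  +  Y, C'))
# A, B, C
```

Items that contain no + at all, such as the last one in the example, are left unchanged.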