Extract only the first match using a Python regular expression
I have a string as follows:
course_name = "Post Graduate Certificate Programme in Retail Management (PGCPRM) (Online)"
I want to extract only 'PGCPRM', or whatever is inside the first pair of parentheses, and end up with a new course name as follows:
course_name_new = "Post Graduate Certificate Programme in Retail Management (Online)"
You can use str.replace():
>>> course_name = "Post Graduate Certificate Programme in Retail Management (PGCPRM) (Online)"
>>> course_name.replace('(PGCPRM) ','')
'Post Graduate Certificate Programme in Retail Management (Online)'
Edit: if you want to remove the word that comes before (Online), you need a regular expression with a positive lookahead:
>>> import re
>>> re.sub(r'\(\w+\) (?=\(Online\))', '', course_name)
'Post Graduate Certificate Programme in Retail Management (Online)'
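Or, if you want to remove only the first parenthesized group, pass count=1 so that re.sub stops after the first match:
>>> re.sub(r'\(\w+\) ', '', course_name, count=1)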
'Post Graduate Certificate Programme in Retail Management (Online)'
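To illustrate why limiting the substitution matters, here is a minimal sketch. By default re.sub replaces every match; the count argument caps the number of replacements. The sample string below is hypothetical (it is not from the question) and has three parenthesized groups so the difference is visible.

```python
import re

# Hypothetical input with three parenthesized groups.
sample = "Retail Management (PGCPRM) (Batch2) (Online)"

# Without count, every "(word) " followed by a space is removed.
all_removed = re.sub(r'\(\w+\) ', '', sample)

# With count=1, only the first parenthesized group is removed.
first_removed = re.sub(r'\(\w+\) ', '', sample, count=1)

print(all_removed)
print(first_removed)
```

Note that "(Online)" survives in both cases only because it has no trailing space, so the pattern never matches it.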
And to extract it, use re.search:
>>> re.search(r'(\(.*?\))',course_name).group(0)
'(PGCPRM)'
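If you want the text without the surrounding parentheses, and you cannot be sure the string contains any, a capturing group plus a None check may help. This is only a sketch; the function name first_parenthesized is made up for illustration.

```python
import re

def first_parenthesized(text):
    """Return the text inside the first (...) pair, or None if there is none."""
    # [^()]* matches anything except parentheses, so the match stops
    # at the first closing parenthesis.
    match = re.search(r'\(([^()]*)\)', text)
    # re.search returns None when nothing matches, so guard before .group().
    return match.group(1) if match else None

course_name = "Post Graduate Certificate Programme in Retail Management (PGCPRM) (Online)"
print(first_parenthesized(course_name))
print(first_parenthesized("No brackets here"))
```

Calling .group() directly on the result of re.search raises AttributeError when there is no match, which is why the check is there.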
It's simple:
In [8]: course_name
Out[8]: 'Post Graduate Certificate Programme in Retail Management (PGCPRM) (Online)'
In [9]: print(re.sub(r'\([A-Z]+\)\s*', '', course_name))
Post Graduate Certificate Programme in Retail Management (Online)
In [17]: print(re.search(r'\(([A-Z]+)\)\s*', course_name).group(1))
PGCPRM
Extract the value inside the first pair of parentheses:
>>> course_name = "Post Graduate Certificate Programme in Retail Management (PGCPRM) (Online)"
>>> x = re.search(r'\(.*?\)',course_name).group()
>>> x
'(PGCPRM)'
Then do the replacement, including the trailing space so that no double space is left behind:
>>> course_name.replace(x + ' ', '', 1)
'Post Graduate Certificate Programme in Retail Management (Online)'
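The two steps can also be combined so that a single search yields both the extracted value and the new course name. The following is a rough sketch under the assumption that whitespace before the group should go with it; the helper name split_first_parenthesized is hypothetical.

```python
import re

def split_first_parenthesized(text):
    """Return (inner_text, text_without_first_group).

    If there is no parenthesized group, return (None, text) unchanged.
    """
    # Include any whitespace before the group in the match so that
    # cutting the match out does not leave a double space.
    match = re.search(r'\s*\(([^()]*)\)', text)
    if match is None:
        return None, text
    # Slice around the match instead of running a second search.
    cleaned = text[:match.start()] + text[match.end():]
    return match.group(1), cleaned.strip()

course_name = "Post Graduate Certificate Programme in Retail Management (PGCPRM) (Online)"
code, course_name_new = split_first_parenthesized(course_name)
print(code)
print(course_name_new)
```

Slicing with match.start() and match.end() removes exactly the first occurrence, even if the same parenthesized text appears again later in the string.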