How can I consume input only once in a pattern?

Regex = ^(movie|tv) (.*) (?<=season )([0-9 ]+)

Input = tv game of thrones season 1 2 3

Output

tv
game of thrones season
1 2 3

Expected output

tv
game of thrones
1 2 3

https://regex101.com/r/wG3aM3/954

(.*) captures the entire string before ([0-9 ]+). How can I prevent it from capturing the text that (?<=season ) checks for before ([0-9 ]+)?

P.S. I can't simply exclude "season" from (.*). For example, tv game of season season 1 2 3 should capture tv, game of season, and 1 2 3.
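For reference, the behavior can be reproduced outside regex101 with a minimal sketch. Python's re module is an assumption here, since the question does not name a language, and the variable names are purely illustrative.

```python
import re

# Pattern from the question. The lookbehind (?<=season ) only checks that
# "season " precedes the digits; it consumes nothing, so (.*) has to match it.
pattern = re.compile(r"^(movie|tv) (.*) (?<=season )([0-9 ]+)")

match = pattern.match("tv game of thrones season 1 2 3")
groups = match.groups()
print(groups)  # ('tv', 'game of thrones season', '1 2 3')
```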

You can use

^(movie|tv)\s*(.*?)(?:\s+season)?(?:\s+([0-9]+(?:\s+[0-9]+)*))?$

See the regex demo.

Details

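  • ^ - start of string
  • (movie|tv) - Group 1: movie or tv
  • \s* - zero or more whitespace characters
  • (.*?) - Group 2: any zero or more characters other than line break characters, as few as possible
  • (?:\s+season)? - an optional non-capturing group matching one or more whitespace characters followed by the string season
  • (?:\s+([0-9]+(?:\s+[0-9]+)*))? - an optional non-capturing group matching
    • \s+ - one or more whitespace characters
    • ([0-9]+(?:\s+[0-9]+)*) - Group 3: one or more digits, followed by zero or more repetitions of one or more whitespace characters and one or more digits
  • $ - end of string.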
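The pattern described above can be tried out with a short sketch. Python is assumed here (the answer itself is language-neutral), and the parse helper and the sample inputs other than the ones from the question are hypothetical.

```python
import re

# The pattern from this answer, compiled once.
PATTERN = re.compile(
    r"^(movie|tv)\s*(.*?)(?:\s+season)?(?:\s+([0-9]+(?:\s+[0-9]+)*))?$"
)

def parse(text):
    """Return (type, title, numbers) or None if the text does not match."""
    m = PATTERN.match(text)
    return m.groups() if m else None

print(parse("tv game of thrones season 1 2 3"))
# ('tv', 'game of thrones', '1 2 3')
print(parse("tv game of season season 1 2 3"))
# ('tv', 'game of season', '1 2 3')
print(parse("movie inception"))
# ('movie', 'inception', None)
```

Because both trailing groups are optional, an input without a season part still matches, and Group 3 is then None.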

How about this?

^(movie|tv) (.*) (?:season )([0-9 ]+)

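This simply changes the lookbehind (?<=season ) into the non-capturing group (?:season ), so "season " is consumed by that group instead of by (.*). In Python it would be: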

import re
text = "tv game of thrones season 1 2 3"
# findall returns one tuple of captured groups per match
output = re.findall(r"^(movie|tv) (.*) (?:season )([0-9 ]+)", text)
print(output)
# Output: [('tv', 'game of thrones', '1 2 3')]

A non-capturing version of regular parentheses. Matches whatever regular expression is inside the parentheses, but the substring matched by the group cannot be retrieved after performing a match or referenced later in the pattern.

https://docs.python.org/3/library/re.html
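As a quick check against the edge case from the question's P.S., here is a minimal sketch using the same pattern; the extract helper and the second sample input are hypothetical.

```python
import re

# Same pattern as above: "season " is matched by a non-capturing group.
PATTERN = re.compile(r"^(movie|tv) (.*) (?:season )([0-9 ]+)")

def extract(text):
    """Return the three captured groups, or None when there is no match."""
    m = PATTERN.match(text)
    return m.groups() if m else None

print(extract("tv game of season season 1 2 3"))
# ('tv', 'game of season', '1 2 3')
print(extract("movie inception"))
# None
```

The greedy (.*) backtracks to the last " season " that is followed by digits, so an earlier "season" stays in the title. Unlike a pattern with optional trailing groups, this one requires "season " and at least one digit or space after it, so inputs without a season part do not match.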