Capturing Date Regex

Question

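I have two different file names:

"Profile sep 3 2015.txt"

"Profile mar 5 2014 inactive.txt"

What I need is a regular expression that captures the date part (MMM dd yyyy) of the file name.

Previously, I had a regular expression that captured it like this:

"^Profile (.*).txt$"

But this does not account for the inactive files, because it captures "inactive" along with the date. How should I handle this?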
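To make the problem concrete, here is a minimal Ruby sketch (Ruby is an assumption, since the question names no language) that runs the original pattern against both file names. The variable names are illustrative only.

```ruby
# The original pattern from the question, written as a Ruby regexp literal.
old_pattern = /^Profile (.*).txt$/

names = ["Profile sep 3 2015.txt", "Profile mar 5 2014 inactive.txt"]

# String#[] with a regexp and a group index returns that capture group.
old_captures = names.map { |name| name[old_pattern, 1] }

puts old_captures.inspect
# prints ["sep 3 2015", "mar 5 2014 inactive"]
```

The greedy `(.*)` takes everything up to the extension, so the second capture includes the trailing word.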

Answer 1

Use

\b(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s+(?:0?[1-9]|[12][0-9]|3[01])\s+\d{4}\b

Use a case-insensitive flag (i.e. /PATTERN_ABOVE/i, or add (?i) before the first \b). The pattern matches a 3-letter month, a 1- or 2-digit day, and a 4-digit year, separated by whitespace.

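Details:

\b - leading word boundary
(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) - a month abbreviation
\s+ - one or more whitespace characters
(?:0?[1-9]|[12][0-9]|3[01]) - a 1- or 2-digit day
- 0?[1-9] - an optional zero and a digit from 1 to 9
- | - or
- [12][0-9] - from 10 to 29
- | - or
- 3[01] - 30 or 31
\s+ - see above
\d{4} - 4 digits
\b - trailing word boundary.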
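As a rough illustration, the pattern can be applied to both file names from the question. This sketch assumes Ruby; `DATE_RE` and `extract_date` are hypothetical names introduced here, not part of the answer.

```ruby
# Answer 1's pattern with the case-insensitive flag (/i).
DATE_RE = /\b(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s+(?:0?[1-9]|[12][0-9]|3[01])\s+\d{4}\b/i

# Hypothetical helper: returns the matched date text, or nil when no date is found.
def extract_date(filename)
  filename[DATE_RE]
end

puts extract_date("Profile sep 3 2015.txt")
# prints sep 3 2015
puts extract_date("Profile mar 5 2014 inactive.txt")
# prints mar 5 2014
```

Because the pattern is not anchored to the rest of the file name, the trailing " inactive" is simply ignored.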

Answer 2

The pattern below works as a quick fix, and it can be enhanced to cover additional validation.

/\s+((?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\s*[0-3]?[0-9]\s*\d{4})/ig

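This pattern covers:

The month (mmm) must be exactly 3 letters long and is case-insensitive
The day must be 1 or 2 digits (N or NN)
The year must be in NNNN format
Spaces between the month, day, and year are optional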
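One detail worth checking when writing the month part: a bracketed set such as [jan|feb|...]{3} is a character class, so it accepts any three characters drawn from those letters (and the pipe), whereas a non-capturing group (?:jan|feb|...) accepts only the listed abbreviations. The Ruby sketch below compares the two; the variable names are illustrative.

```ruby
# A bracketed set is a character class: it matches single characters, not words.
char_class  = /\A[jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec]{3}\z/i
# A non-capturing group with alternation matches only the listed abbreviations.
alternation = /\A(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\z/i

samples = %w[sep aaa]
class_results = samples.map { |s| s.match?(char_class) }
group_results = samples.map { |s| s.match?(alternation) }

puts class_results.inspect
# prints [true, true]
puts group_results.inspect
# prints [true, false]
```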

You can test more examples at http://regexr.com/

Hope this helps!

Answer 3

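POSIX Character Classes with Range Modifiers

You did not specify a language, so while there may be other ways to do this, a fairly portable approach is to use POSIX character classes with range modifiers. For example: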

^Profile[[:space:]]+([[:alpha:]]{3}[[:space:]]+[[:digit:]]{1,2}[[:space:]]+[[:digit:]]{4})

For ease of explanation, here is an example using the extended syntax in Ruby:

str     = "Profile mar 5 2014 inactive.txt"
pattern =
  /                    # start regular expression literal
    ^Profile           # anchor to "Profile" at start of line
    [[:space:]]+       # one or more space\/tab characters
    (                  # start capture
      [[:alpha:]]{3}   # three alphabetical characters
      [[:space:]]+     # one or more space\/tab characters
      [[:digit:]]{1,2} # one or two digits
      [[:space:]]+     # one or more space\/tab characters
      [[:digit:]]{4}   # exactly four digits
    )                  # end capture
  /x                   # close literal; set the Regexp::EXTENDED flag
str.match(pattern)[1]  # [1] selects the first capture group
#=> "mar 5 2014"
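The same pattern can be checked against both file names from the question. The sketch below also includes a made-up third file name ("Profile xyz 99 2014.txt", an assumption for illustration) to show that the POSIX classes check only the shape of the text, not whether the month or day is valid.

```ruby
# Same pattern as above, written on one line. It checks shape only, not calendar validity.
posix_pattern = /^Profile[[:space:]]+([[:alpha:]]{3}[[:space:]]+[[:digit:]]{1,2}[[:space:]]+[[:digit:]]{4})/

posix_captures = [
  "Profile sep 3 2015.txt",
  "Profile mar 5 2014 inactive.txt",
  "Profile xyz 99 2014.txt" # hypothetical name with an invalid month and day
].map { |name| name[posix_pattern, 1] }

puts posix_captures.inspect
# prints ["sep 3 2015", "mar 5 2014", "xyz 99 2014"]
```

If stricter validation matters, the month and day parts could be replaced with explicit alternations, at the cost of some portability.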
