str.startswith using Regex

Question

Can someone help me understand why str.startswith() does not handle regular expressions? Given this DataFrame:

   col1
0  country
1  Country

i.e.: df.col1.str.startswith('(C|c)ountry')

It returns False for every value:

   col1
0  False
1  False

Answer 1

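Series.str.startswith does not accept regular expressions. Use Series.str.match instead, which applies the pattern anchored at the start of each string. Older pandas releases needed as_indexer=True here to get a boolean result; current releases return booleans by default and no longer accept that keyword:

df.col1.str.match(r'(C|c)ountry')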

Output:

0    True
1    True
Name: col1, dtype: bool
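
To check this end to end, here is a minimal sketch that rebuilds the two-row frame from the question and compares both methods. The way df is constructed is an assumption (the question only shows its printed form), and the sketch assumes a reasonably recent pandas release.

```python
import pandas as pd

# Assumed reconstruction of the frame printed in the question.
df = pd.DataFrame({"col1": ["country", "Country"]})

# startswith compares against the literal text "(C|c)ountry".
literal = df.col1.str.startswith("(C|c)ountry")

# match applies the pattern as a regex anchored at the start of each value.
regex = df.col1.str.match(r"(C|c)ountry")

print(literal.tolist())  # [False, False]
print(regex.tolist())    # [True, True]
```

Neither value begins with an opening parenthesis, so the literal comparison fails on both rows while the regex succeeds.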

Answer 2

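Series.str.startswith does not accept regular expressions because it is meant to behave like str.startswith in vanilla Python, which does not accept them either. An alternative is to use a regex with Series.str.contains (as described in the docs), anchoring the pattern with ^ so that it only matches at the start of the string: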

df.col1.str.contains('^[Cc]ountry')

The character class [Cc] is probably a better way to match C or c than (C|c), unless of course you need to capture which letter was used. In that case you can use ([Cc]).
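
The difference between these forms can be seen in plain Python with the standard re module. This is only an illustrative sketch: the values list (including the non-matching "county") and the first_letter helper are hypothetical names made up for the example, not part of the question.

```python
import re

# Hypothetical sample values; "county" is included as a non-matching case.
values = ["country", "Country", "county"]

# Plain str.startswith treats its argument as literal text, not a pattern.
literal_hits = [v.startswith("(C|c)ountry") for v in values]

# Character class: matches either letter and captures nothing.
class_hits = [bool(re.match(r"[Cc]ountry", v)) for v in values]


def first_letter(value):
    """Hypothetical helper: return the captured first letter, or None."""
    m = re.match(r"([Cc])ountry", value)
    return m.group(1) if m else None


# Capturing class: same matches, plus a record of which letter was used.
captured = [first_letter(v) for v in values]

print(literal_hits)  # [False, False, False]
print(class_hits)    # [True, True, False]
print(captured)      # ['c', 'C', None]
```

In pandas itself, passing a pattern with a capture group to Series.str.contains emits a UserWarning that points to str.extract, which is one more reason to prefer the plain character class when only a boolean mask is needed.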
