pandas str extract as integer

Question

Consider the pd.Series s

import pandas as pd
s = pd.Series(['A1', 'B2', '3C'])

I want to extract the numeric part of each element.
I know I can use extract like this:

s.str.extract(r'(\d)', expand=False)

0    1
1    2
2    3
dtype: object

Note the dtype: object.
If I check the type of each element:

s.str.extract(r'(\d)', expand=False).apply(type)

0    <class 'str'>
1    <class 'str'>
2    <class 'str'>
dtype: object

Question
How can I extract the values directly as integers?

0    1
1    2
2    3
dtype: int64

Answer 1

I don't think that is possible.

See the documentation for str.extract:

Returns:

DataFrame with one row for each subject string, and one column for each group. Any capture group names in regular expression pat will be used for column names; otherwise capture group numbers will be used. The dtype of each result column is always object, even when no match is found. If expand=False and pat has only one capture group, then return a Series (if subject is a Series) or Index (if subject is an Index).

So you need astype(int) or, if the output contains NaN, to_numeric:

pd.to_numeric(s.str.extract(r'(\d)', expand=False))
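
As a minimal sketch of the two conversions mentioned above: the first case casts a fully matched result with astype(int), the second uses pd.to_numeric when some element has no digit. The series s_missing and the final cast to the nullable 'Int64' dtype are my own additions for illustration and are not part of the original answer.

```python
import pandas as pd

s = pd.Series(['A1', 'B2', '3C'])

# str.extract returns strings, so the integer cast is a separate step.
digits = s.str.extract(r'(\d)', expand=False)
as_int = digits.astype(int)

# Hypothetical input where the last element contains no digit:
# extract yields a missing value there, and astype(int) would raise.
s_missing = pd.Series(['A1', 'B2', 'C'])
digits_missing = s_missing.str.extract(r'(\d)', expand=False)

# to_numeric tolerates the missing value; the result is floating point.
as_float = pd.to_numeric(digits_missing)

# Optional extra step (an assumption, not from the answer): pandas'
# nullable integer dtype keeps integers alongside a missing value.
as_nullable = as_float.astype('Int64')

print(as_int)
print(as_float)
print(as_nullable)
```

If every element is guaranteed to contain a digit, astype(int) alone is enough. The to_numeric route is only needed once missing values can appear.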

