python pandas: extracting numbers within text to a new column

Question

I have the following text in column A:

A   
hellothere_3.43  
hellothere_3.9

I want to extract only the numbers into a new column B (next to A), like this:

B                      
3.43   
3.9

I am using str.extract('(\d.\d\d)', expand=True), but this only copies 3.43 (i.e. it matches that exact number of digits). Is there a way to make it more general?

Many thanks!

Answer 1

Use a regular expression that allows a variable number of digits.

For example:

import pandas as pd

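df = pd.DataFrame({"A": ["hellothere_3.43", "hellothere_3.9"]})
# Raw string so the backslashes reach the regex engine unchanged.
# expand=False returns a Series when there is a single capture group.
df["B"] = df["A"].str.extract(r"(\d*\.?\d+)", expand=False)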
print(df)

Output:

                 A     B
0  hellothere_3.43  3.43
1   hellothere_3.9   3.9
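
Note that column B above still holds strings. If numeric values are needed, something like the following sketch might work. The two extra rows ("hellothere_12" and "nonumber") and the slightly stricter pattern are assumptions added for illustration; they are not part of the original question.

```python
import pandas as pd

# The last two rows are hypothetical, added to show integers and non-matches.
df = pd.DataFrame(
    {"A": ["hellothere_3.43", "hellothere_3.9", "hellothere_12", "nonumber"]}
)

# One or more digits, optionally followed by a decimal point and more digits.
pattern = r"(\d+(?:\.\d+)?)"

# expand=False returns a Series when the pattern has a single capture group.
extracted = df["A"].str.extract(pattern, expand=False)

# Rows without a match come back as NaN; the rest are converted to float.
df["B"] = pd.to_numeric(extracted)
print(df)
```

With this approach, rows that contain no number end up as NaN in B instead of raising an error.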

Answer 2

I think splitting the string and applying a lambda is very clean.

import pandas as pd

df = pd.DataFrame({"A": ["hellothere_3.43", "hellothere_3.9"]})
# Split on the underscore and convert the second piece to float.
df["B"] = df['A'].str.split('_').apply(lambda x: float(x[1]))

I haven't done any proper comparison, but in small tests it seems to be faster than the regex solution.
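
One rough way to check the speed claim is a quick timeit run like the sketch below. The data size (the two sample rows repeated 5000 times), the repeat count, and the helper names with_regex and with_split are all arbitrary assumptions, and results will vary by machine and pandas version. Keep in mind that the split approach relies on every value containing an underscore directly before the number.

```python
import timeit

import pandas as pd

# Hypothetical test data: the two sample rows repeated many times.
df = pd.DataFrame({"A": ["hellothere_3.43", "hellothere_3.9"] * 5000})


def with_regex():
    # Regex extraction, then conversion to float for a fair comparison.
    return df["A"].str.extract(r"(\d*\.?\d+)", expand=False).astype(float)


def with_split():
    # Split on the underscore and convert the second piece to float.
    return df["A"].str.split("_").apply(lambda x: float(x[1]))


# Both approaches should agree before they are timed.
assert with_regex().tolist() == with_split().tolist()

regex_time = timeit.timeit(with_regex, number=10)
split_time = timeit.timeit(with_split, number=10)
print(f"regex: {regex_time:.3f}s, split: {split_time:.3f}s")
```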
