I want to remove * from values containing * at the end, say 20*, 30*, from a column

Question

I have a DataFrame with the following values in its columns:

import pandas as pd

df = pd.DataFrame({
    'A': ['20*', 40, '30*'],
    'B': ['abc', 'bar', 'xyz'],
})

I want to remove the * from column A, so the result should be: ['20', 40, '30']

How can I do this?

Answer 1

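Use str.rstrip together with fillna. Note that your column A has object dtype and holds both strings and ints, which is why str.rstrip returns NaN for the int cell. We then just need fillna to put the original value back.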

df['A'] = df['A'].str.rstrip('*').fillna(df['A'])
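
To make the NaN step visible, here is a minimal sketch that splits the one-liner in two, assuming the sample frame from the question. The names stripped and result are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['20*', 40, '30*'],
    'B': ['abc', 'bar', 'xyz'],
})

# .str methods only act on string cells; the int 40 comes back as NaN.
stripped = df['A'].str.rstrip('*')

# Fill the NaN gaps from the original column, so 40 is restored unchanged.
df['A'] = stripped.fillna(df['A'])

result = df['A'].tolist()
print(result)
```

Note that rstrip('*') removes every trailing *, so a value such as '20**' would also become '20'.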

Answer 2

Using a regular expression, this works:

import pandas as pd

df = pd.DataFrame({'A': ['20*', 40, '30*' ], 'B': ['abc', 'bar', 'xyz']})
# Keep the captured digits (\1) and drop the trailing *
df.replace({'A': {r'(\d+)\*': r'\1'}}, regex=True, inplace=True)

print(df)

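The parentheses in (\d+) form a capturing group; the \d+ inside matches strings made up of one or more digits. \1 is a backreference (https://www.regular-expressions.info/backref.html) that refers to the first capturing group, the one defined by the preceding parentheses.

The first regular expression basically reads: find all strings that consist of at least one digit followed by a trailing * (escaped as \* in the regex, because a bare * means "zero or more of the preceding character").

The second one means: take the digits captured earlier and paste them back in. You can change the second regular expression to something like r'A\1B' to get a better feel for what this means.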
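
The effect of the backreference can be tried outside pandas. The following is a small sketch using Python's standard re module instead of DataFrame.replace; the sample strings and variable names are assumptions made for illustration.

```python
import re

# One or more digits (capturing group 1) followed by a literal *
pattern = r'(\d+)\*'

values = ['20*', '30*', 'abc']

# r'\1' pastes back only the captured digits, so the * disappears.
cleaned = [re.sub(pattern, r'\1', v) for v in values]

# r'A\1B' wraps the captured digits, which makes the group's role visible.
wrapped = [re.sub(pattern, r'A\1B', v) for v in values]

print(cleaned)  # ['20', '30', 'abc']
print(wrapped)  # ['A20B', 'A30B', 'abc']
```

Strings that do not match the pattern, such as 'abc', pass through unchanged.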

python

pandas

data-cleaning
