errors='coerce' not changing unwanted strings to NaN in pandas

Question

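I have a dataset that contains unwanted strings (they indicate that a measurement could not be taken). When pandas reads the text file of data, I want to change these unwanted strings to NaN, because the presence of the strings turns otherwise-integer columns into string columns. If there is a better process, please let me know. When I try the methods I found in another question (Change data type of columns in Pandas), they either break when they hit the unwanted string or appear to skip the string entirely.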

Code

import pandas as pd 
data = {
    'ID': [1,2,3,4],
    'V': [6.6,2.01,'tND - 7777',7.01],
    'A': [33,31,'tND - 88881',35]    
    } 
df = pd.DataFrame(data, columns = ['ID','V','A'])

print(df)
df.astype({"V": int})
print(df)
# raises ValueError: invalid literal for int() with base 10: 'tND - 7777'

pd.to_numeric(df['V'], errors = 'coerce')
pd.to_numeric(df['A'], errors = 'coerce')
print(df)
# prints the original DataFrame, unwanted strings still in place

Unwanted strings

'tND - 7777','tND - 88881'

Desired result

The data in the DataFrame columns is integer (I assume NaN counts as an integer; I only need to plot the data once the strings are gone).

Answer 1

pd.to_numeric returns a new Series and does not modify the DataFrame in place, so assign the output back:

df['V'] = pd.to_numeric(df['V'], errors = 'coerce')
df['A'] = pd.to_numeric(df['A'], errors = 'coerce')
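
To see why the assignment matters, here is a minimal sketch that reuses the DataFrame from the question. It also shows that NaN is a float, not an integer, so the converted columns come out as float64. That is normally fine for plotting. The final step, casting to the nullable 'Int64' dtype, is an optional extra that is not part of the original answer, and it only applies to a column whose remaining values are whole numbers (here 'A', not 'V').

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 3, 4],
    'V': [6.6, 2.01, 'tND - 7777', 7.01],
    'A': [33, 31, 'tND - 88881', 35],
})

# Calling to_numeric without assigning leaves df untouched.
pd.to_numeric(df['V'], errors='coerce')
dtype_before = df['V'].dtype  # still object

# Assigning the result back replaces the column with the coerced values.
df['V'] = pd.to_numeric(df['V'], errors='coerce')
df['A'] = pd.to_numeric(df['A'], errors='coerce')
print(df.dtypes)  # V and A are now float64, because NaN is a float

# Optional (an assumption about what the asker wants): keep 'A' as integers
# by using the nullable Int64 dtype, which stores missing values as <NA>.
a_as_int = df['A'].astype('Int64')
print(a_as_int)
```

If the goal is only to plot the data, the float64 columns can be used directly and the Int64 cast can be skipped.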

Another idea is to convert both columns at once:

df[['V','A']] = df[['V','A']].apply(pd.to_numeric, errors = 'coerce')
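
Since the question mentions that the data comes from a text file, another possible approach is to mark the unwanted strings as missing while reading. The sketch below assumes a comma-separated file with the same three columns. The inline text stands in for the real file and its layout is a guess, not something stated in the question. The na_values parameter of pd.read_csv matches whole field values, so this works only when the exact marker strings are known in advance.

```python
import io
import pandas as pd

# Hypothetical stand-in for the real text file (layout is an assumption).
raw = io.StringIO(
    "ID,V,A\n"
    "1,6.6,33\n"
    "2,2.01,31\n"
    "3,tND - 7777,tND - 88881\n"
    "4,7.01,35\n"
)

# Fields equal to one of these strings are read as NaN.
df_from_file = pd.read_csv(raw, na_values=['tND - 7777', 'tND - 88881'])

print(df_from_file)
print(df_from_file.dtypes)
```

If the marker strings vary from row to row, pd.to_numeric with errors='coerce' as shown above is the more general choice, because it coerces anything that cannot be parsed as a number.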
