Work with a row in a pandas DataFrame without incurring chained indexing (not copying, just indexing)

Question

My data is organized in a DataFrame:

import pandas as pd
import numpy as np

data = {'Col1' : [4,5,6,7], 'Col2' : [10,20,30,40], 'Col3' : [100,50,-30,-50], 'Col4' : ['AAA', 'BBB', 'AAA', 'CCC']}

df = pd.DataFrame(data=data, index = ['R1','R2','R3','R4'])

It looks like this (only much larger):

    Col1  Col2  Col3 Col4
R1     4    10   100  AAA
R2     5    20    50  BBB
R3     6    30   -30  AAA
R4     7    40   -50  CCC

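My algorithm loops over the table rows and performs a set of operations.

For the sake of cleanliness/laziness, I would like to work on a single row at each iteration without typing df.loc['row index', 'column name'] to get each cell value.

I have tried to follow the right style, using for example: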

row_of_interest = df.loc['R2', :]

However, I still get the warning:

row_of_interest['Col2'] = row_of_interest['Col2'] + 1000

SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame

And it is not working (as I intend): it is making a copy.

print(df)

    Col1  Col2  Col3 Col4
R1     4    10   100  AAA
R2     5    20    50  BBB
R3     6    30   -30  AAA
R4     7    40   -50  CCC

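Any suggestions on the proper way to do it? Or should I just stick to working with the DataFrame directly?

Edit 1:

Using the replies, the warning is removed from the code, but the original DataFrame is not modified: the "row of interest" Series is a copy, not part of the original DataFrame. For example: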

import pandas as pd
import numpy as np

data = {'Col1' : [4,5,6,7], 'Col2' : [10,20,30,40], 'Col3' : [100,50,-30,-50], 'Col4' : ['AAA', 'BBB', 'AAA', 'CCC']}

df = pd.DataFrame(data=data, index = ['R1','R2','R3','R4'])

row_of_interest         = df.loc['R2']
row_of_interest.is_copy = False
new_cell_value          = row_of_interest['Col2'] + 1000
row_of_interest['Col2'] = new_cell_value

print(row_of_interest)

Col1       5
Col2    1020
Col3      50
Col4     BBB
Name: R2, dtype: object

print(df)

    Col1  Col2  Col3 Col4
R1     4    10   100  AAA
R2     5    20    50  BBB
R3     6    30   -30  AAA
R4     7    40   -50  CCC

Edit 2:

This is an example of the functionality I would like to replicate. In Python, a list of lists looks like:

a = [[1,2,3],[4,5,6]]

Now I can create a "label":

b = a[0]

If I change an entry in b:

b[0] = 7

Both a and b change.

print(a, b)

[[7, 2, 3], [4, 5, 6]] [7, 2, 3]

Can this behaviour be replicated in pandas, with one row of a DataFrame labelled as a pandas Series?

Answer 1

This should work:

row_of_interest = df.loc['R2', :]
row_of_interest.is_copy = False
row_of_interest['Col2'] = row_of_interest['Col2'] + 1000

Setting .is_copy = False is the trick.

Edit 2:

import pandas as pd
import numpy as np

data = {'Col1' : [4,5,6,7], 'Col2' : [10,20,30,40], 'Col3' : [100,50,-30,-50], 'Col4' : ['AAA', 'BBB', 'AAA', 'CCC']}

df = pd.DataFrame(data=data, index = ['R1','R2','R3','R4'])

row_of_interest         = df.loc['R2']
row_of_interest.is_copy = False
new_cell_value          = row_of_interest['Col2'] + 1000
row_of_interest['Col2'] = new_cell_value

print(row_of_interest)

# Write the modified row back into the original DataFrame
df.loc['R2'] = row_of_interest

print(df)

df:

    Col1  Col2  Col3 Col4
R1     4    10   100  AAA
R2     5  1020    50  BBB
R3     6    30   -30  AAA
R4     7    40   -50  CCC
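The copy, modify, write-back pattern above can also be sketched without is_copy. As far as I know, that attribute was deprecated in later pandas releases and is no longer available in current ones, so this version takes an explicit .copy() instead. The variable names col2_before_write_back and col2_after_write_back are only there for illustration.

```python
import pandas as pd

data = {'Col1': [4, 5, 6, 7], 'Col2': [10, 20, 30, 40],
        'Col3': [100, 50, -30, -50], 'Col4': ['AAA', 'BBB', 'AAA', 'CCC']}
df = pd.DataFrame(data=data, index=['R1', 'R2', 'R3', 'R4'])

# Take an explicit copy of the row: edits to it are local to the Series.
row_of_interest = df.loc['R2'].copy()
row_of_interest['Col2'] = row_of_interest['Col2'] + 1000

# The DataFrame has not been touched yet.
col2_before_write_back = df.loc['R2', 'Col2']

# Assign the whole row back in one step to update the DataFrame.
df.loc['R2'] = row_of_interest
col2_after_write_back = df.loc['R2', 'Col2']

print(df)
```

The write-back line is what actually changes df; everything before it only touches the detached Series.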

Answer 2

The most straightforward way:

df.loc['R2', 'Col2'] += 1000
df
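If typing df.loc[...] for every cell is the main annoyance, one possible workaround is a thin wrapper that remembers the row label and forwards each read and write to the DataFrame. This is only a sketch: RowProxy is a hypothetical helper, not part of pandas, and the Col4 == 'AAA' condition is an invented stand-in for the real per-row logic.

```python
import pandas as pd


class RowProxy:
    """Hypothetical helper: forwards cell access for one row label to the parent DataFrame."""

    def __init__(self, frame, label):
        self._frame = frame
        self._label = label

    def __getitem__(self, column):
        # Read a single cell straight from the DataFrame.
        return self._frame.at[self._label, column]

    def __setitem__(self, column, value):
        # Write a single cell straight into the DataFrame (no intermediate copy).
        self._frame.at[self._label, column] = value


data = {'Col1': [4, 5, 6, 7], 'Col2': [10, 20, 30, 40],
        'Col3': [100, 50, -30, -50], 'Col4': ['AAA', 'BBB', 'AAA', 'CCC']}
df = pd.DataFrame(data=data, index=['R1', 'R2', 'R3', 'R4'])

for label in df.index:
    row = RowProxy(df, label)
    # Illustrative rule only: bump Col2 for rows whose Col4 is 'AAA'.
    if row['Col4'] == 'AAA':
        row['Col2'] = row['Col2'] + 1000

print(df)
```

Because every access goes through df.at, there is no chained indexing and no detached Series to write back. The cost is one scalar lookup per cell, which may be slow on large frames compared with a vectorized operation.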

Answer 3

You can remove the warning by creating a Series from the slice you want to work on:

from pandas import Series
row_of_interest = Series(data=df.loc['R2', :])
row_of_interest.loc['Col2'] += 1000
print(row_of_interest)

Result:

Col1       5
Col2    1020
Col3      50
Col4     BBB
Name: R2, dtype: object
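Note that the Series built this way is detached from the DataFrame, at least for a frame with mixed column dtypes like the one in the question. The short check below is a sketch under that assumption; row_value and frame_value are illustrative names.

```python
import pandas as pd

data = {'Col1': [4, 5, 6, 7], 'Col2': [10, 20, 30, 40],
        'Col3': [100, 50, -30, -50], 'Col4': ['AAA', 'BBB', 'AAA', 'CCC']}
df = pd.DataFrame(data=data, index=['R1', 'R2', 'R3', 'R4'])

# Selecting a row that spans int and str columns builds a new object-dtype Series.
row_of_interest = pd.Series(data=df.loc['R2', :])
row_of_interest.loc['Col2'] += 1000

row_value = row_of_interest['Col2']   # value held by the detached Series
frame_value = df.loc['R2', 'Col2']    # value still stored in the DataFrame

print(row_value, frame_value)
```

To make the change visible in df, the modified value still has to be assigned back, for example with df.loc['R2', 'Col2'] = row_of_interest['Col2'].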

Tags: python, indexing, series, dataframe, pandas