Fill NaN with another lookup table
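Is there a way to fill the NaN values with the values from test=default by matching name, reticle and cell_rev?
With several variables in the "test" column:
Is there a way to update the value from another row? The datatype "do" has higher priority than int, so the "do" value should replace the int value and the "do" row should then be removed.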
Data:
test datatype name value reticle cell_rev
default int s 0x45 CR1
default int s 0xCB CR3
default do s 0.68 CR1
I want to get:
test datatype name value reticle cell_rev
default int s 0.68 CR1
default int s 0xCB CR3
You can use set_index with unstack for reshaping, then ffill to add the missing values, and finally stack to reshape back to the original layout:
# the parentheses let the method chain continue across lines
df = (df.set_index(['name','value_old','reticle','test','cell_rev'])
        .unstack()
        .ffill()
        .stack()
        .reset_index())
print(df)
name value_old reticle test cell_rev value_new
0 s 0x8E A28 default CR1 0x8C
1 s 0x8E A28 default CR3 0x8E
2 s 0x8E A28 etlc CR1 0x8C
3 s 0x8E A28 etlc CR3 0x8E
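The reshape, fill and reshape-back steps above can be sketched end to end. The input frame here is an assumption: it is reconstructed so that it yields the printed output, with the etlc rows assumed to start as NaN. Forward fill works in this case because "default" sorts before "etlc" in the unstacked index.

```python
import numpy as np
import pandas as pd

# Assumed input: the original frame is not shown in the thread, so the
# etlc rows are taken to be the ones with missing value_new.
df = pd.DataFrame({
    'test':      ['default', 'default', 'etlc', 'etlc'],
    'name':      ['s', 's', 's', 's'],
    'value_old': ['0x8E', '0x8E', '0x8E', '0x8E'],
    'reticle':   ['A28', 'A28', 'A28', 'A28'],
    'cell_rev':  ['CR1', 'CR3', 'CR1', 'CR3'],
    'value_new': ['0x8C', '0x8E', np.nan, np.nan],
})

out = (df.set_index(['name', 'value_old', 'reticle', 'test', 'cell_rev'])
         .unstack()      # cell_rev becomes a column level, one row per test
         .ffill()        # copy the default row down into the empty rows
         .stack()        # move cell_rev back into the row index
         .reset_index())
print(out)
```

Note that value_old is part of the index here, so this only lines up rows that share the same value_old; the merge-based approach does not have that restriction.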
Edit (following the comments):
Use merge with the subset df1, created by boolean indexing, and then fill the NaN values with combine_first or fillna:
# .ix was removed in pandas 1.0; .loc does the same boolean selection
df1 = df.loc[df.test == 'default']
print (df1)
test name value_old reticle cell_rev value_new
0 default s 0x8E A28 CR1 0x8E
1 default s 0x8E A28 CR3 0x8C
df2 = pd.merge(df, df1, how='left', on=['name','reticle','cell_rev'], suffixes=('','1'))
print (df2)
test name value_old reticle cell_rev value_new test1 value_old1 \
0 default s 0x8E A28 CR1 0x8E default 0x8E
1 default s 0x8E A28 CR3 0x8C default 0x8E
2 etlc s 0x8E A28 CR1 0x44 default 0x8E
3 etlc s 0x8E A28 CR3 0x44 default 0x8E
4 mlc s 0x1E A28 CR1 NaN default 0x8E
5 mlc s 0x1E A28 CR3 NaN default 0x8E
6 slc s 0x2E A28 CR1 NaN default 0x8E
7 slc s 0x2E A28 CR3 NaN default 0x8E
value_new1
0 0x8E
1 0x8C
2 0x8E
3 0x8C
4 0x8E
5 0x8C
6 0x8E
7 0x8C
df['value_new'] = df2['value_new'].combine_first(df2['value_new1'])
#df['value_new'] = df2['value_new'].fillna(df2['value_new1'])
print (df)
test name value_old reticle cell_rev value_new
0 default s 0x8E A28 CR1 0x8E
1 default s 0x8E A28 CR3 0x8C
2 etlc s 0x8E A28 CR1 0x44
3 etlc s 0x8E A28 CR3 0x44
4 mlc s 0x1E A28 CR1 0x8E
5 mlc s 0x1E A28 CR3 0x8C
6 slc s 0x2E A28 CR1 0x8E
7 slc s 0x2E A28 CR3 0x8C
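For reference, the merge approach can be put together as one self-contained sketch. The sample frame is rebuilt from the printed tables above; the `keys` and `defaults` names and the `_default` suffix are illustrative choices, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the printed output in this thread.
df = pd.DataFrame({
    'test':      ['default', 'default', 'etlc', 'etlc', 'mlc', 'mlc', 'slc', 'slc'],
    'name':      ['s'] * 8,
    'value_old': ['0x8E', '0x8E', '0x8E', '0x8E', '0x1E', '0x1E', '0x2E', '0x2E'],
    'reticle':   ['A28'] * 8,
    'cell_rev':  ['CR1', 'CR3'] * 4,
    'value_new': ['0x8E', '0x8C', '0x44', '0x44', np.nan, np.nan, np.nan, np.nan],
})

keys = ['name', 'reticle', 'cell_rev']

# Lookup table: only the default rows, only the columns that are needed.
defaults = df.loc[df['test'] == 'default', keys + ['value_new']]

# A left merge keeps every row of df, in order, and adds the default value.
merged = df.merge(defaults, how='left', on=keys, suffixes=('', '_default'))

# Existing values win; NaN is replaced by the matching default value.
df['value_new'] = merged['value_new'].fillna(merged['value_new_default'])
print(df)
```

This relies on the default rows being unique per (name, reticle, cell_rev). If they are not, the left merge returns more rows than df has and the assignment back to df no longer lines up.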
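for i in range(len(df)):
    # only rows whose value_new is missing need a lookup
    if pd.isna(df.loc[i, 'value_new']):
        match = df.loc[(df['test'] == 'default') &
                       (df['name'] == df.loc[i, 'name']) &
                       (df['reticle'] == df.loc[i, 'reticle']) &
                       (df['cell_rev'] == df.loc[i, 'cell_rev']),
                       'value_new']
        # assign the scalar from the matching default row, not the whole Series
        if not match.empty:
            df.loc[i, 'value_new'] = match.iloc[0]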
I think there are more efficient solutions, but this should work.
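One possible way to avoid the row-by-row loop is to build a lookup Series from the default rows and reindex it by each row's key. This is only a sketch: `fill_from_default` is a hypothetical helper name, the small frame is illustrative, and the approach assumes the default rows are unique per (name, reticle, cell_rev).

```python
import numpy as np
import pandas as pd

def fill_from_default(df, keys=('name', 'reticle', 'cell_rev')):
    """Fill NaN in value_new from the test == 'default' row with the same keys."""
    keys = list(keys)
    # Lookup table indexed by the key columns (must be unique per key).
    lookup = df.loc[df['test'] == 'default'].set_index(keys)['value_new']
    # One lookup per row of df; keys with no default row give NaN.
    target = pd.MultiIndex.from_frame(df[keys])
    fallback = pd.Series(lookup.reindex(target).to_numpy(), index=df.index)
    out = df.copy()
    out['value_new'] = out['value_new'].fillna(fallback)
    return out

# Illustrative data in the same shape as the tables in this thread.
sample = pd.DataFrame({
    'test':      ['default', 'default', 'mlc', 'mlc'],
    'name':      ['s', 's', 's', 's'],
    'reticle':   ['A28', 'A28', 'A28', 'A28'],
    'cell_rev':  ['CR1', 'CR3', 'CR1', 'CR3'],
    'value_new': ['0x8E', '0x8C', np.nan, np.nan],
})
filled = fill_from_default(sample)
print(filled)
```

Unlike the loop, this does not depend on df having a default 0..n-1 index, because the fallback values are aligned to whatever index df already has.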