Trying to look up a value from a pandas DataFrame within a range of two rows in the index DataFrame

Question

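I have two data frames in a Python notebook, `grower_moo` and `pricing`, that I use to analyze harvested crops and the prices paid to growers.

`pricing` is the index (lookup) data frame, and `grower_moo` holds unique load tickets with information about each load.

I need to pull the price per ton from the `pricing` index into a new column in the load data, as long as the load's Fat is not greater than the next Wet_Fat.

Below are a .head() sample of each data frame and the code I tried. I get the error ValueError: Can only compare identically-labeled Series objects.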


pricing
    Price_Per_Ton   Wet_Fat
0       306            10
1       339            11
2       382            12
3       430            13
4       481            14
5       532            15
6       580            16
7       625            17
8       665            18
9       700            19
10      728            20
11      750            21
12      766            22
13      778            23
14      788            24
15      797            25

grower_moo
    Load Ticket     Net Fruit Weight  Net MOO  Percent_MOO  Fat
0   L2019000011817     56660           833     1.448872    21.92
1   L2019000011816     53680           1409    2.557679    21.12
2   L2019000011815     53560           1001    1.834644    21.36
3   L2019000011161     62320           2737    4.207080    21.41
4   L2019000011160     57940           1129    1.911324    20.06

grower_moo['price_per_ton'] = max(pricing[pricing['Wet_Fat'] < grower_moo['Fat']]['Price_Per_Ton'])

Sample output: a grower_moo['Fat'] of 13.60 is less than 14 Fat, so it gets a price of $430 per ton.

grower_moo_with_price
    Load Ticket     Net Fruit Weight  Net MOO  Percent_MOO  Fat    price_per_ton
0   L2019000011817     56660           833     1.448872    21.92      750
1   L2019000011816     53680           1409    2.557679    21.12      750
2   L2019000011815     53560           1001    1.834644    21.36      750
3   L2019000011161     62320           2737    4.207080    21.41      750
4   L2019000011160     57940           1129    1.911324    20.06      728

Answer 1

import pandas as pd

# Place the two frames side by side (rows are aligned on index label)
concat_df = pd.concat([grower_moo, pricing], axis=1)
concat_df = concat_df[concat_df['Wet_Fat'] < concat_df['Fat']]
del concat_df['Wet_Fat']

Answer 2

This looks like a job for an "as of" merge, pd.merge_asof (documentation):

This is similar to a left-join except that we match on nearest key rather than equal keys. Both DataFrames must be sorted by the key.

For each row in the left DataFrame:

A "backward" search [the default] selects the last row in the right DataFrame whose ‘on’ key is less than or equal to the left’s key.
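To make the "backward" rule concrete, here is a minimal sketch on a tiny, made-up subset of the data (the `loads` and `prices` frames are hypothetical stand-ins, not the asker's full tables). It also shows `allow_exact_matches=False`, which switches the match from "less than or equal to" to strictly "less than", in case a Fat value that lands exactly on a Wet_Fat boundary should fall into the lower bracket.

```python
import pandas as pd

# Hypothetical mini versions of the two frames; both are already sorted by key
loads = pd.DataFrame({"Fat": [13.60, 14.00, 14.25]})
prices = pd.DataFrame({
    "Wet_Fat": [13.0, 14.0, 15.0],
    "Price_Per_Ton": [430, 481, 532],
})

# Default direction="backward": take the last row whose Wet_Fat <= Fat
backward = pd.merge_asof(loads, prices, left_on="Fat", right_on="Wet_Fat")
print(backward["Price_Per_Ton"].tolist())  # [430, 481, 481]

# allow_exact_matches=False: take the last row whose Wet_Fat < Fat (strict)
strict = pd.merge_asof(
    loads, prices, left_on="Fat", right_on="Wet_Fat", allow_exact_matches=False
)
print(strict["Price_Per_Ton"].tolist())  # [430, 430, 481]
```

The 13.60 row matches the question's sample: it falls between Wet_Fat 13 and 14, so it picks up the 430 price.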

In the code below, I used your sample input, but with underscores `_` instead of spaces in the column names.

# Required by merge_asof: sort keys in left DataFrame
grower_moo = grower_moo.sort_values('Fat')

# Required by merge_asof: key column data types must match
pricing['Wet_Fat'] = pricing['Wet_Fat'].astype('float')

# Perform the asof merge
res = pd.merge_asof(grower_moo, pricing, left_on='Fat', right_on='Wet_Fat')

# Print result
res

      Load_Ticket  Net_Fruit_Weight  Net_MOO  Percent_MOO    Fat  Price_Per_Ton  Wet_Fat
0  L2019000011160             57940     1129     1.911324  20.06            728     20.0
1  L2019000011816             53680     1409     2.557679  21.12            750     21.0
2  L2019000011815             53560     1001     1.834644  21.36            750     21.0
3  L2019000011161             62320     2737     4.207080  21.41            750     21.0
4  L2019000011817             56660      833     1.448872  21.92            750     21.0

# Optional: drop the key column from the right DataFrame
res.drop(columns='Wet_Fat')

      Load_Ticket  Net_Fruit_Weight  Net_MOO  Percent_MOO    Fat  Price_Per_Ton
0  L2019000011160             57940     1129     1.911324  20.06            728
1  L2019000011816             53680     1409     2.557679  21.12            750
2  L2019000011815             53560     1001     1.834644  21.36            750
3  L2019000011161             62320     2737     4.207080  21.41            750
4  L2019000011817             56660      833     1.448872  21.92            750
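Because `merge_asof` needs the left frame sorted by Fat, the result comes back in Fat order rather than the original ticket order. If the original order matters, one possible approach (a sketch, not the only way) is to carry the original index through the merge and sort on it afterwards. The frames below are rebuilt from the question's sample with only the columns needed for the lookup; `left`, `merged`, and the temporary `index` column are names introduced here for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question (only the columns needed for the lookup)
pricing = pd.DataFrame({
    "Price_Per_Ton": [306, 339, 382, 430, 481, 532, 580, 625,
                      665, 700, 728, 750, 766, 778, 788, 797],
    "Wet_Fat": range(10, 26),
})
grower_moo = pd.DataFrame({
    "Load_Ticket": ["L2019000011817", "L2019000011816", "L2019000011815",
                    "L2019000011161", "L2019000011160"],
    "Fat": [21.92, 21.12, 21.36, 21.41, 20.06],
})

# Key dtypes must match, so cast the integer Wet_Fat column to float
pricing["Wet_Fat"] = pricing["Wet_Fat"].astype("float")

# reset_index() stores the original row position in a column named "index"
left = grower_moo.reset_index().sort_values("Fat")
merged = pd.merge_asof(left, pricing, left_on="Fat", right_on="Wet_Fat")

# Restore the original order and tidy up the helper and key columns
grower_moo_with_price = (
    merged.sort_values("index")
    .set_index("index")
    .rename_axis(None)
    .drop(columns="Wet_Fat")
    .rename(columns={"Price_Per_Ton": "price_per_ton"})
)
print(grower_moo_with_price)
```

The resulting `price_per_ton` column lines up with the expected output in the question: 750 for the four loads with Fat between 21 and 22, and 728 for the load with Fat 20.06.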
