How can I change null values in two columns for two random numbers between the min and the max of those columns?

Question

I have the following columns in a pandas dataset:

Index(['Country', 'Region', 'Happiness Rank', 'Happiness Score',
       'Lower Confidence Interval', 'Upper Confidence Interval',
       'Economy (GDP per Capita)', 'Family', 'Health (Life Expectancy)',
       'Freedom', 'Trust (Government Corruption)', 'Generosity',
       'Dystopia Residual'],
      dtype='object')

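I need to change the null values in the "Lower Confidence Interval" and "Upper Confidence Interval" columns to a random number between the min and the max of each column. The values in both columns are numbers with decimals.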

This is what I've tried:

import random

print(random.randint((df.max(axis=0)["Lower Confidence Interval"]),(df.mmin(axis=0)["Lower Confidence Interval"])(df.max(axis=0)["Upper Confidence Interval"]),(df.mmin(axis=0)["Upper Confidence Interval"])

df.loc[:, ["Lower Confidence Interval", "Upper Confidence Interval"]].fillna(5, inplace=True)

This is the error message I get:

 File "<ipython-input-100-e3190b8f67a4>", line 1
    print(random.randint((df.max(axis=0)["Lower Confidence Interval"]),(df.mmin(axis=0)["Lower Confidence Interval"])(df.max(axis=0)["Upper Confidence Interval"]),(df.mmin(axis=0)["Upper Confidence Interval"])
                                                                                                                                                                                                               
SyntaxError: unexpected EOF while parsing

I've been stuck here for a while and can't get past this error. Does anyone have an idea?

Thanks in advance! :)

Answer 1

Let's use this hypothetical dataset, df:

   sample_col
0         nan
1         nan
2        4.41
3        9.79
4        8.24
5        7.04
6        4.41
7        4.09
8        5.58
9        6.34

You can create two int objects named use_min and use_max, derived from the min() and max() values of the column at hand.

# int() truncates the float min/max to whole numbers (4.09 -> 4, 9.79 -> 9)
use_min, use_max = int(df['sample_col'].min()), int(df['sample_col'].max())

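Then you can call fillna with random.randint, which generates a random integer and takes a minimum and a maximum (both inclusive) as arguments; here those are your use_min and use_max: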

import random
df['sample_col'].fillna(random.randint(use_min,use_max))

Out[342]: 
0   6.00
1   6.00
2   4.41
3   9.79
4   8.24
5   7.04
6   4.41
7   4.09
8   5.58
9   6.34
Name: sample_col, dtype: float64
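
Note that the snippet above calls random.randint once, so every null gets the same whole number (6.00 in the output). If you would rather have a separate decimal value for each missing row, and do it for both columns from the question, something along these lines might work. This is only a sketch: the sample values in df are made up, and the use of NumPy's default_rng with a fixed seed is my own assumption, there only to make the run repeatable.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with the two columns from the question; the values are made up.
df = pd.DataFrame({
    "Lower Confidence Interval": [7.46, np.nan, 7.43, 7.42, np.nan],
    "Upper Confidence Interval": [7.59, 7.59, np.nan, 7.58, 7.47],
})

rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable

for col in ["Lower Confidence Interval", "Upper Confidence Interval"]:
    mask = df[col].isna()                      # rows that need a value
    low, high = df[col].min(), df[col].max()   # min()/max() skip NaN by default
    # Draw one float per missing row, uniformly from [low, high).
    df.loc[mask, col] = rng.uniform(low, high, size=mask.sum())

print(df)
```

As for the SyntaxError in the question: the print(random.randint(... line never closes its last two parentheses, so Python reaches the end of the input while still expecting them. The line is also missing a comma between the second and third arguments, and random.randint takes exactly two integers, so that call would need restructuring even once the parentheses are balanced.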

python

python-3.x

pandas

anaconda