pandas rounding, is this a bug?

Question

Is this a bug? When I round to 4 decimal places, two nearly identical values come back with different results.

import pandas as pd
pd.set_option('display.precision', 10)  # full option name; the bare 'precision' shorthand can be ambiguous in newer pandas

pd.DataFrame([[1.446450001],[1.44645]]).round(4)

Result

    0
0   1.4465
1   1.4464

Answer 1

This is not a bug. Rather, it is an undocumented quirk.

DataFrame.round uses numpy.around under the hood, and the numpy documentation says:

For values exactly halfway between rounded decimal values, Numpy rounds to the nearest even value. Thus 1.5 and 2.5 round to 2.0, -0.5 and 0.5 round to 0.0, etc.

http://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.around.html

Further reading on Wikipedia: https://en.wikipedia.org/wiki/Rounding#Round_half_to_even
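
A minimal sketch of the behaviour the quoted documentation describes. It uses only values that are exactly representable in binary floating point, so each one really is a tie:

```python
import numpy as np

# 0.5, 1.5, 2.5 and -0.5 are exact in binary floating point, so each is a true tie.
ties = np.array([0.5, 1.5, 2.5, -0.5])

# numpy.around sends every tie to the nearest even integer.
rounded = np.around(ties)
print(rounded)  # [ 0.  2.  2. -0.]
```

Most decimal fractions, 1.44645 included, have no exact binary representation, so whether a particular input is treated as a tie also depends on how it is stored.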

Answer 2

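There are two different rounding strategies:

The first rounds the way you learned in school: values exactly halfway between two candidates (ending in 5) are rounded up.
The second rounds those halfway values to the nearest even number.

The first strategy has a side effect: averages pick up a positive bias, because the midpoint is always pushed upward. The second strategy fixes this by making the arbitrary choice to round to the nearest even number, so halfway values go up about as often as they go down.

Pandas chose to use numpy.around, which implements the second strategy.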
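
The bias argument can be checked with a small sketch. The ten sample values below are an illustrative choice (all exact ties), not data from the question:

```python
import numpy as np

# Ten exact ties: 0.5, 1.5, ..., 9.5. Their true mean is 5.0.
ties = np.arange(10) + 0.5

half_up = np.floor(ties + 0.5)  # school rounding: ties always go up
half_even = np.rint(ties)       # numpy's strategy: ties go to the even neighbour

print(ties.mean())       # 5.0
print(half_up.mean())    # 5.5, shifted upward
print(half_even.mean())  # 5.0, matches the true mean
```

Real data is rarely made up entirely of ties, so the bias is usually much smaller than in this deliberately extreme case.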

Answer 3

You can use the following function to do normal (round half up) rounding in pandas or Python:

import numpy

def round_normal(n, decimals):
    # multiply the decimal by 10 to the number of decimals you want to round to
    multiplicand = 10 ** decimals

    # add 0.5 so that taking the floor will get you the right number when you
    # divide by the multiplicand

    # e.g. 3.0449 to 2.d.p -> 304.49 + 0.5 = 304.99 -> floor(304.99) / 100 = 3.04
    # e.g. 3.045 to 2.d.p -> 304.5 + 0.5 = 305 -> floor(305) / 100 = 3.05
    # (arithmetic shown in exact decimals; binary floats may land slightly off a tie)
    rounded_n = numpy.floor(n * multiplicand + 0.5) / multiplicand
    return rounded_n
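
A usage sketch comparing the helper with DataFrame.round. The function is repeated so the snippet runs on its own, and the column name "x" and the sample values are illustrative assumptions, chosen because they are exact in binary and therefore true ties:

```python
import numpy as np
import pandas as pd

def round_normal(n, decimals):
    # Same half-up helper as above, repeated so this snippet is self-contained.
    multiplicand = 10 ** decimals
    return np.floor(n * multiplicand + 0.5) / multiplicand

# 0.125, 0.375 and 0.625 scale to exactly 12.5, 37.5 and 62.5.
df = pd.DataFrame({"x": [0.125, 0.375, 0.625]})

half_even = df.round(2)        # ties go to the even neighbour
half_up = round_normal(df, 2)  # ties always go up

print(half_even["x"].tolist())  # [0.12, 0.38, 0.62]
print(half_up["x"].tolist())    # [0.13, 0.38, 0.63]
```

Because numpy.floor is applied elementwise, the same call works on scalars, arrays, Series and DataFrames. For negative inputs the helper sends ties toward positive infinity (for example, -2.5 becomes -2.0), which may or may not be the behaviour you want.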
