Identical DataFrames Asserting Not Equal - Python Pandas

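I am trying to unit test my code. I have a method that, given a MySQL query, returns the result as a pandas DataFrame. Note that in the database, all returned values in the created and external_id columns are NULL. Here is the test: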

def test_get_data(self):

    ### SET UP

    self.report._query = "SELECT * FROM floor LIMIT 3"
    self.report._columns = ['id', 'facility_id', 'name', 'created', 'modified', 'external_id']
    self.d = {'id': p.Series([1, 2, 3]),
              'facility_id': p.Series([1, 1, 1]),
              'name': p.Series(['1st Floor', '2nd Floor', '3rd Floor']),
              'created': p.Series(['None', 'None', 'None']),
              'modified': p.Series([datetime.strptime('2012-10-06 01:08:27', '%Y-%m-%d %H:%M:%S'),
                                    datetime.strptime('2012-10-06 01:08:27', '%Y-%m-%d %H:%M:%S'),
                                    datetime.strptime('2012-10-06 01:08:27', '%Y-%m-%d %H:%M:%S')]),
              'external_id': p.Series(['None', 'None', 'None'])
              }
    self.df = p.DataFrame(data=self.d, columns=['id', 'facility_id', 'name', 'created', 'modified', 'external_id'])
    self.df.fillna('None')
    print(self.df)
    ### CODE UNDER TEST

    result = self.report.get_data(self.report._cursor_web)
    print(result)
    ### ASSERTIONS

    assert_frame_equal(result, self.df)

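Here is the console output (note the print statements in the test code; the manually constructed DataFrame is printed first, and the DataFrame returned by the function under test is printed second):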

.   id  facility_id       name created            modified external_id
0   1            1  1st Floor    None 2012-10-06 01:08:27        None
1   2            1  2nd Floor    None 2012-10-06 01:08:27        None
2   3            1  3rd Floor    None 2012-10-06 01:08:27        None
   id  facility_id       name created            modified external_id
0   1            1  1st Floor    None 2012-10-06 01:08:27        None
1   2            1  2nd Floor    None 2012-10-06 01:08:27        None
2   3            1  3rd Floor    None 2012-10-06 01:08:27        None
F
======================================================================
FAIL: test_get_data (__main__.ReportTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/path/to/file/ReportsTestCase.py", line 46, in test_get_data
    assert_frame_equal(result, self.df)
  File "/usr/local/lib/python2.7/site-packages/pandas/util/testing.py", line 1313, in assert_frame_equal
    obj='DataFrame.iloc[:, {0}]'.format(i))
  File "/usr/local/lib/python2.7/site-packages/pandas/util/testing.py", line 1181, in assert_series_equal
    obj='{0}'.format(obj))
  File "pandas/src/testing.pyx", line 59, in pandas._testing.assert_almost_equal (pandas/src/testing.c:4156)
  File "pandas/src/testing.pyx", line 173, in pandas._testing.assert_almost_equal (pandas/src/testing.c:3274)
  File "/usr/local/lib/python2.7/site-packages/pandas/util/testing.py", line 1018, in raise_assert_detail
    raise AssertionError(msg)

AssertionError: DataFrame.iloc[:, 3] are different

DataFrame.iloc[:, 3] values are different (100.0 %)
[left]:  [None, None, None]
[right]: [None, None, None]

----------------------------------------------------------------------
Ran 1 test in 0.354s

FAILED (failures=1)

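As far as I can tell, the 'created' column contains the three string values 'None' in both the left and right DataFrames. Why does the assertion report them as not equal?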

Python also has a built-in constant None, which is not the same thing as the string 'None'. From the docs:

None

The sole value of the type NoneType. None is frequently used to represent the absence of a value, as when default arguments are not passed to a function. Assignments to None are illegal and raise a SyntaxError.

Comparing None with 'None' (None == 'None') evaluates to False. So if one of the DataFrames contains None while the other contains 'None', assert_frame_equal will raise an AssertionError, even though both values print as None.
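
The mismatch can be reproduced without a database. The following is a minimal sketch, not the asker's actual code: the two small frames are hypothetical stand-ins for self.df and the query result, and it assumes a recent pandas where assert_frame_equal is imported from pandas.testing (the traceback above uses the older pandas.util.testing location). It also shows one way to tell the two cases apart, since isna() treats the constant None as missing but not the string 'None'.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

# Hypothetical stand-in for the hand-built frame: the *string* 'None'.
expected = pd.DataFrame({
    'id': [1, 2, 3],
    'created': pd.Series(['None', 'None', 'None'], dtype=object),
})

# Hypothetical stand-in for the query result: SQL NULL arrives as the
# Python constant None, not as a string.
result = pd.DataFrame({
    'id': [1, 2, 3],
    'created': pd.Series([None, None, None], dtype=object),
})

# The comparison fails because None != 'None'.
try:
    assert_frame_equal(result, expected)
    mismatch_detected = False
except AssertionError:
    mismatch_detected = True

# isna() distinguishes the two cases.
result_is_missing = result['created'].isna().all()      # None counts as missing
expected_is_missing = expected['created'].isna().any()  # 'None' is an ordinary string

# One possible fix: build the expected frame with the real None constant.
expected_fixed = pd.DataFrame({
    'id': [1, 2, 3],
    'created': pd.Series([None, None, None], dtype=object),
})
assert_frame_equal(result, expected_fixed)  # passes, raises nothing
```

Alternatively, the result could be converted instead of the expected frame. In that case note that fillna returns a new DataFrame rather than modifying the original in place, so its return value has to be assigned back (the bare self.df.fillna('None') call in the test discards its result).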