Identical DataFrames Asserting Not Equal - Python Pandas
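I am trying to unit test my code. I have a method that, given a MySQL query, returns the result as a pandas DataFrame. Note that in the database, every value returned in the created and external_id columns is NULL. Here is the test: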
# Imports inferred from the names used below (p is the pandas alias)
import pandas as p
from datetime import datetime
from pandas.util.testing import assert_frame_equal

def test_get_data(self):
    ### SET UP
    self.report._query = "SELECT * FROM floor LIMIT 3"
    self.report._columns = ['id', 'facility_id', 'name', 'created', 'modified', 'external_id']
    self.d = {'id': p.Series([1, 2, 3]),
              'facility_id': p.Series([1, 1, 1]),
              'name': p.Series(['1st Floor', '2nd Floor', '3rd Floor']),
              'created': p.Series(['None', 'None', 'None']),
              'modified': p.Series([datetime.strptime('2012-10-06 01:08:27', '%Y-%m-%d %H:%M:%S'),
                                    datetime.strptime('2012-10-06 01:08:27', '%Y-%m-%d %H:%M:%S'),
                                    datetime.strptime('2012-10-06 01:08:27', '%Y-%m-%d %H:%M:%S')]),
              'external_id': p.Series(['None', 'None', 'None'])
              }
    self.df = p.DataFrame(data=self.d, columns=['id', 'facility_id', 'name', 'created', 'modified', 'external_id'])
    # fillna returns a new DataFrame; the result is not assigned here
    self.df.fillna('None')
    print(self.df)
    ### CODE UNDER TEST
    result = self.report.get_data(self.report._cursor_web)
    print(result)
    ### ASSERTIONS
    assert_frame_equal(result, self.df)
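Here is the console output (note the print statements in the test code; the manually constructed DataFrame is printed first, and the one returned by the method under test is printed second):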
.   id  facility_id       name created            modified external_id
0   1            1  1st Floor    None 2012-10-06 01:08:27        None
1   2            1  2nd Floor    None 2012-10-06 01:08:27        None
2   3            1  3rd Floor    None 2012-10-06 01:08:27        None
   id  facility_id       name created            modified external_id
0   1            1  1st Floor    None 2012-10-06 01:08:27        None
1   2            1  2nd Floor    None 2012-10-06 01:08:27        None
2   3            1  3rd Floor    None 2012-10-06 01:08:27        None
F
======================================================================
FAIL: test_get_data (__main__.ReportTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/path/to/file/ReportsTestCase.py", line 46, in test_get_data
    assert_frame_equal(result, self.df)
  File "/usr/local/lib/python2.7/site-packages/pandas/util/testing.py", line 1313, in assert_frame_equal
    obj='DataFrame.iloc[:, {0}]'.format(i))
  File "/usr/local/lib/python2.7/site-packages/pandas/util/testing.py", line 1181, in assert_series_equal
    obj='{0}'.format(obj))
  File "pandas/src/testing.pyx", line 59, in pandas._testing.assert_almost_equal (pandas/src/testing.c:4156)
  File "pandas/src/testing.pyx", line 173, in pandas._testing.assert_almost_equal (pandas/src/testing.c:3274)
  File "/usr/local/lib/python2.7/site-packages/pandas/util/testing.py", line 1018, in raise_assert_detail
    raise AssertionError(msg)
AssertionError: DataFrame.iloc[:, 3] are different
DataFrame.iloc[:, 3] values are different (100.0 %)
[left]: [None, None, None]
[right]: [None, None, None]
----------------------------------------------------------------------
Ran 1 test in 0.354s
FAILED (failures=1)
By my reckoning, the 'created' column contains three string values 'None' in both the left and right DataFrames. Why does the assertion fail?
Python also has a built-in constant None, which is different from the string 'None'. From the docs:
None
The sole value of the type NoneType. None is frequently used to
represent the absence of a value, as when default arguments are not
passed to a function. Assignments to None are illegal and raise a
SyntaxError.
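A quick way to see the distinction in plain Python is to compare how the two values print with how they actually compare. This is only a small sketch; the variable names are made up for illustration.

```python
# None is the single instance of NoneType; 'None' is an ordinary 4-character string.
none_value = None
none_text = 'None'

# str() renders both the same way, which is why printed output can mislead.
same_when_printed = str(none_value) == str(none_text)

# repr() and type() expose the difference.
reprs = (repr(none_value), repr(none_text))
types = (type(none_value).__name__, type(none_text).__name__)

print(same_when_printed)        # True
print(reprs)
print(types)
print(none_value == none_text)  # False
```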
Comparing None with 'None' (None == 'None') evaluates to False. So assert_frame_equal will raise an AssertionError if one of the DataFrames contains None while the other contains the string 'None'.
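The situation can be reproduced with a minimal sketch. It assumes the query returns real None objects for NULL columns, which the question does not confirm. The frames and variable names below are hypothetical stand-ins for the test data, and the import uses pandas.testing (the public module in newer pandas releases) rather than the pandas.util.testing path shown in the traceback.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

# Hypothetical stand-in for the query result: SQL NULLs arrive as real None objects.
result = pd.DataFrame({
    'id': [1, 2, 3],
    'created': pd.Series([None, None, None], dtype=object),
})

# Expected frame built as in the test above: the *string* 'None'.
expected_with_strings = pd.DataFrame({
    'id': [1, 2, 3],
    'created': pd.Series(['None', 'None', 'None'], dtype=object),
})

# Expected frame built with real None values instead.
expected_with_none = pd.DataFrame({
    'id': [1, 2, 3],
    'created': pd.Series([None, None, None], dtype=object),
})

# Inspect the element types rather than trusting the printed frame.
result_types = [type(v).__name__ for v in result['created']]
expected_types = [type(v).__name__ for v in expected_with_strings['created']]
print(result_types)
print(expected_types)

# The string version fails the comparison...
try:
    assert_frame_equal(result, expected_with_strings)
    strings_match = True
except AssertionError:
    strings_match = False
print(strings_match)

# ...while the version holding real None values passes without raising.
assert_frame_equal(result, expected_with_none)
```

If the element types turn out to differ like this in the real test, building the expected columns with None rather than 'None' is one way to make the two frames compare equal.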