ValueError: Length mismatch: Expected axis has 0 elements while creating hierarchical columns in pandas dataframe
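I am working through the documentation on hierarchical indexing in pandas. I tried to adapt one of its examples to create an empty DataFrame with hierarchical columns: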
In [5]: df = pd.DataFrame()
In [6]: df.columns = pd.MultiIndex(levels = [['first', 'second'], ['a', 'b']], labels = [[0, 0, 1, 1], [0, 1, 0, 1]])
However, it raises an error:
ValueError Traceback (most recent call last)
<ipython-input-6-dd823f9b8d22> in <module>()
----> 1 df.columns = pd.MultiIndex(levels = [['first', 'second'], ['a', 'b']], labels = [[0, 0, 1, 1], [0, 1, 0, 1]])
/usr/local/lib/python3.4/dist-packages/pandas/core/generic.py in __setattr__(self, name, value)
2755 try:
2756 object.__getattribute__(self, name)
-> 2757 return object.__setattr__(self, name, value)
2758 except AttributeError:
2759 pass
pandas/src/properties.pyx in pandas.lib.AxisProperty.__set__ (pandas/lib.c:44873)()
/usr/local/lib/python3.4/dist-packages/pandas/core/generic.py in _set_axis(self, axis, labels)
446
447 def _set_axis(self, axis, labels):
--> 448 self._data.set_axis(axis, labels)
449 self._clear_item_cache()
450
/usr/local/lib/python3.4/dist-packages/pandas/core/internals.py in set_axis(self, axis, new_labels)
2800 raise ValueError('Length mismatch: Expected axis has %d elements, '
2801 'new values have %d elements' %
-> 2802 (old_len, new_len))
2803
2804 self.axes[axis] = new_labels
ValueError: Length mismatch: Expected axis has 0 elements, new values have 4 elements
I can't see anything wrong with my code. Any idea what is going on?
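The problem is that you have an empty DataFrame with zero columns, and you are trying to assign a four-column MultiIndex to it. If you create an empty DataFrame with four columns to begin with, the error goes away: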
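import numpy as np  # pd.np is no longer available in newer pandas releases
# zero rows, four columns; on pandas >= 1.0 pass codes= instead of labels=
df = pd.DataFrame(np.empty((0, 4)))
df.columns = pd.MultiIndex(levels = [['first', 'second'], ['a', 'b']], labels = [[0, 0, 1, 1], [0, 1, 0, 1]])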
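To see the length check in isolation, here is a minimal sketch that reproduces the error and then applies the fix. It assumes pandas 1.0 or later (so it uses `codes` rather than `labels`), and the names `empty`, `four_wide`, and `error_message` are just illustrative.

```python
import numpy as np
import pandas as pd

# Four column entries: (first, a), (first, b), (second, a), (second, b)
columns = pd.MultiIndex(
    levels=[["first", "second"], ["a", "b"]],
    codes=[[0, 0, 1, 1], [0, 1, 0, 1]],
)

# A DataFrame with zero columns cannot take a four-entry column index
empty = pd.DataFrame()
try:
    empty.columns = columns
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# A DataFrame that already has four columns (and zero rows) accepts it
four_wide = pd.DataFrame(np.empty((0, 4)))
four_wide.columns = columns

print(error_message)
print(four_wide.shape)
```

The assignment only relabels an existing axis, so the new index has to have exactly as many entries as the frame has columns.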
Alternatively, you can create the empty DataFrame with the MultiIndex directly:
multi_index = pd.MultiIndex(levels = [['first', 'second'], ['a', 'b']], labels = [[0, 0, 1, 1], [0, 1, 0, 1]])
df = pd.DataFrame(columns=multi_index)
df
# first second
# a b a b
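As a variation on the approach above, the same column index can be built without writing out the integer positions by hand. This sketch uses `pd.MultiIndex.from_product`, which is not what the answer used but avoids the `labels`/`codes` keyword altogether.

```python
import pandas as pd

# Cartesian product of the two levels gives the same four column entries
multi_index = pd.MultiIndex.from_product([["first", "second"], ["a", "b"]])

# Passing the index at construction time sidesteps the length check entirely
df = pd.DataFrame(columns=multi_index)

print(df.columns.tolist())
```

Because the columns are supplied when the frame is created, there is no existing axis whose length has to match.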
This solution does not require numpy:
# create empty DataFrame with 4 columns
df = pd.DataFrame(columns = range(4))
df.columns = pd.MultiIndex(
    levels = [['first', 'second'], ['a', 'b']],
    codes = [[0, 0, 1, 1], [0, 1, 0, 1]]
)
(Note: I changed labels to codes because this changed in pandas v1.0.0.)
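This error can also occur if you used a df.loc[ ] = value assignment and did not enclose the conditions in parentheses (). Make sure every condition inside the loc statement is wrapped in its own pair of parentheses.
It should look something like this:
df.loc[<(condition 1) & (condition 2)>, ] = value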
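For a concrete version of that pattern, here is a small sketch. The frame and its column names (`x`, `y`, `flag`) are hypothetical and chosen only for illustration. The parentheses matter because `&` binds more tightly than comparison operators in Python.

```python
import pandas as pd

# Hypothetical data, only for illustration
df = pd.DataFrame({
    "x": [1, 2, 3, 4],
    "y": [10, 20, 30, 40],
    "flag": [0, 0, 0, 0],
})

# Each comparison is parenthesized before being combined with &
mask = (df["x"] > 1) & (df["y"] < 40)

# Assign only to the rows where both conditions hold
df.loc[mask, "flag"] = 1

print(df)
```

Building the mask on its own line first makes it easy to inspect which rows will be changed before the assignment runs.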