How can I get the median in a csv using pandas?
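I'm just learning the pandas library. I figured this shouldn't be that difficult, and I don't know why it throws a KeyError. I may be doing something wrong, since I'm a beginner in Python, but I'd like to know why I can't find the median of the csv with the following code. I'm also using repl.it to run this code.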
import pandas
census = pandas.read_csv("census(1).csv")
median = census[0].median()
print(median)
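I can post the csv if that would help with debugging. I'm trying to get the median of the first column. I would refer to it by name, but the column has no header. I looked through my book for how to code this when a column has no header. The KeyError thrown is: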
Traceback (most recent call last):
File "/tmp/.site-packages/pandas/core/indexes/base.py", line 2525, in get_loc
return self._engine.get_loc(key)
File "pandas/_libs/index.pyx", line 117, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 139, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 1265, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 1273, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 0
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "main.py", line 5, in <module>
median = census[0].median()
File "/tmp/.site-packages/pandas/core/frame.py", line 2139, in __getitem__
return self._getitem_column(key)
File "/tmp/.site-packages/pandas/core/frame.py", line 2146, in _getitem_column
return self._get_item_cache(key)
File "/tmp/.site-packages/pandas/core/generic.py", line 1842, in _get_item_cache
values = self._data.get(item)
File "/tmp/.site-packages/pandas/core/internals.py", line 3843, in get
loc = self.items.get_loc(item)
File "/tmp/.site-packages/pandas/core/indexes/base.py", line 2527, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas/_libs/index.pyx", line 117, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 139, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 1265, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 1273, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 0
The first two rows of the csv:
39 State-gov 77516 Bachelors 13 Never-married Adm-clerical Not-in-family White Male 2174 0 40 United-States <=50K
50 Self-emp-not-inc 83311 Bachelors 13 Married-civ-spouse Exec-managerial Husband White Male 0 0 13 United-States <=50K
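pandas is looking for a column here: census[0] asks for the column labelled 0, not the first row and not the first column by position. Because read_csv uses the first line of the file as the column names by default, no column carries the label 0, hence KeyError: 0. If you know you want the first column, you can use census.columns[0] to get its label and tell pandas you want that one. That would be:
census[census.columns[0]].median()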
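One caveat with the line above: since the file has no header, the default read_csv call turns the first data row into column names, so that row is left out of the median. A minimal sketch of one way around this, assuming the file is comma-separated, is to pass header=None so the columns get the integer labels 0, 1, 2, and so on. The in-memory string raw is a hypothetical, shortened stand-in for census(1).csv, not the real file.

```python
import io
import pandas

# Hypothetical stand-in for "census(1).csv": two header-less rows,
# assumed comma-separated and trimmed to four columns for brevity.
raw = (
    "39,State-gov,77516,Bachelors\n"
    "50,Self-emp-not-inc,83311,Bachelors\n"
)

# Default behaviour: the first line is consumed as the column names,
# so only one data row remains and there is no column labelled 0.
with_default = pandas.read_csv(io.StringIO(raw))
print(list(with_default.columns))

# header=None keeps every line as data and labels the columns 0, 1, 2, ...
census = pandas.read_csv(io.StringIO(raw), header=None)

median_by_label = census[0].median()              # column labelled 0
median_by_position = census.iloc[:, 0].median()   # first column by position
print(median_by_label, median_by_position)
```

With header=None the original census[0].median() works unchanged. census.iloc[:, 0] selects the first column by position whatever the labels are, which is handy when the labels are unknown.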