Unable to use Pandas to read from Excel
I am trying to read the full Titanic dataset, which can be found here:
biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/titanic3.xls
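Kaggle provides the data as two CSV files (which load fine), but they deliberately left out the survival data for the test set.
The file in question is titanic3.xls, which is included in the tarball at the bottom of the reference page above.
Here is my code: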
import pandas as pd
ship = pd.read_excel('titanic3.xls')
And here is the error output:
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-2-be0879be6ad0> in <module>()
----> 1 ship = pd.read_excel('titanic3.xls')
/usr/local/lib/python3.6/site-packages/pandas/io/excel.py in read_excel(io, sheetname, header, skiprows, skip_footer, index_col, names, parse_cols, parse_dates, date_parser, na_values, thousands, convert_float, has_index_names, converters, dtype, true_values, false_values, engine, squeeze, **kwds)
198
199 if not isinstance(io, ExcelFile):
--> 200 io = ExcelFile(io, engine=engine)
201
202 return io._parse_excel(
/usr/local/lib/python3.6/site-packages/pandas/io/excel.py in __init__(self, io, **kwds)
227 def __init__(self, io, **kwds):
228
--> 229 import xlrd # throw an ImportError if we need to
230
231 ver = tuple(map(int, xlrd.__VERSION__.split(".")[:2]))
ModuleNotFoundError: No module named 'xlrd'
I am using Python 2.7.
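The error log is telling you that Python cannot find the module (package) xlrd, so you need to install xlrd before you can use read_excel(). Running
pip install xlrd
should solve the problem.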
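One way to confirm the fix took effect is to check that xlrd is importable from the same interpreter that runs pandas. The sketch below is only an illustration: the helper names module_available and read_xls are hypothetical, not part of pandas, and it assumes titanic3.xls sits in the working directory when read_xls is actually called.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if the top-level module `name` can be imported here."""
    return importlib.util.find_spec(name) is not None


def read_xls(path):
    """Read a legacy .xls workbook, failing early with a hint if xlrd is absent.

    Hypothetical helper for illustration only.
    """
    if not module_available("xlrd"):
        raise ImportError(
            "xlrd is not installed for %s; try: python -m pip install xlrd"
            % sys.executable
        )
    import pandas as pd  # imported lazily so the xlrd check runs first
    return pd.read_excel(path, engine="xlrd")


# Show which interpreter is running and whether it can see xlrd.
print(sys.executable)
print("xlrd available:", module_available("xlrd"))
```

The traceback paths in the question point at python3.6 while the question mentions Python 2.7, so it may be worth running python -m pip install xlrd with the exact interpreter printed above, so that the package lands in the environment pandas is actually using. After that, ship = read_xls('titanic3.xls') should behave like the original pd.read_excel call.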
Works for me, my man:
import pandas as pd
data = pd.read_excel('D:Downloads/titanic3.xls')
data.head()
Out[7]:
pclass survived name sex \
0 1 1 Allen, Miss. Elisabeth Walton female
1 1 1 Allison, Master. Hudson Trevor male
2 1 0 Allison, Miss. Helen Loraine female
3 1 0 Allison, Mr. Hudson Joshua Creighton male
4 1 0 Allison, Mrs. Hudson J C (Bessie Waldo Daniels) female
age sibsp parch ticket fare cabin embarked boat body \
0 29.0000 0 0 24160 211.3375 B5 S 2 NaN
1 0.9167 1 2 113781 151.5500 C22 C26 S 11 NaN
2 2.0000 1 2 113781 151.5500 C22 C26 S NaN NaN
3 30.0000 1 2 113781 151.5500 C22 C26 S NaN 135.0
4 25.0000 1 2 113781 151.5500 C22 C26 S NaN NaN
home.dest
0 St Louis, MO
1 Montreal, PQ / Chesterville, ON
2 Montreal, PQ / Chesterville, ON
3 Montreal, PQ / Chesterville, ON
4 Montreal, PQ / Chesterville, ON
Update your pandas package. The latest version (at the time of writing) is 0.20.2.
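Before upgrading, it can help to see which versions are currently installed. This is a minimal sketch that assumes Python 3.8 or newer (for importlib.metadata); installed_version is a hypothetical helper name used only for this example.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the versions of the two packages involved in this question.
for name in ("pandas", "xlrd"):
    print(name, installed_version(name) or "not installed")
```

If pandas turns out to be old, pip install --upgrade pandas is the usual way to update it.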