Unable to use Pandas to read from Excel

I am trying to read the full Titanic dataset, which can be found here:

biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/titanic3.xls

Kaggle provides the data as two CSV files (which load fine), but they deliberately leave out the survival data for the test set.

The file in question is titanic3.xls, which is included in the tarball at the bottom of the page referenced above.

Here is my code:

import pandas as pd
ship = pd.read_excel('titanic3.xls')

And here is the error output:

ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-2-be0879be6ad0> in <module>()
----> 1 ship = pd.read_excel('titanic3.xls')

/usr/local/lib/python3.6/site-packages/pandas/io/excel.py in read_excel(io, sheetname, header, skiprows, skip_footer, index_col, names, parse_cols, parse_dates, date_parser, na_values, thousands, convert_float, has_index_names, converters, dtype, true_values, false_values, engine, squeeze, **kwds)
    198 
    199     if not isinstance(io, ExcelFile):
--> 200         io = ExcelFile(io, engine=engine)
    201 
    202     return io._parse_excel(

/usr/local/lib/python3.6/site-packages/pandas/io/excel.py in __init__(self, io, **kwds)
    227     def __init__(self, io, **kwds):
    228 
--> 229         import xlrd  # throw an ImportError if we need to
    230 
    231         ver = tuple(map(int, xlrd.__VERSION__.split(".")[:2]))

ModuleNotFoundError: No module named 'xlrd'

I am using Python 2.7.

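The traceback tells you that Python cannot find the module (package) xlrd. You therefore need to install xlrd before you can use read_excel().

Running pip install xlrd should solve the problem.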
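
As a rough sketch of how one might confirm that xlrd is visible to the interpreter that runs pandas before calling read_excel(): the helper names has_module and load_titanic below are hypothetical (they are not part of pandas or xlrd), and the default path assumes titanic3.xls sits in the working directory as in the question.

```python
import importlib.util


def has_module(name):
    """Return True if this interpreter can find the top-level module `name`."""
    return importlib.util.find_spec(name) is not None


def load_titanic(path="titanic3.xls"):
    """Hypothetical helper: read the workbook only if xlrd is importable."""
    if not has_module("xlrd"):
        raise ImportError("xlrd is missing; install it with: pip install xlrd")
    import pandas as pd  # imported lazily so the xlrd check runs first
    return pd.read_excel(path)


if __name__ == "__main__":
    print("xlrd importable:", has_module("xlrd"))
```

If this still reports False after installing, pip may have installed xlrd into a different interpreter than the one running the notebook (the traceback paths point at python3.6, while the question mentions Python 2.7). Running python -m pip install xlrd with the same python executable is one way to rule that out.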

It works for me:

import pandas as pd
data = pd.read_excel('D:Downloads/titanic3.xls')

data.head()
Out[7]: 
   pclass  survived                                             name     sex  \
0       1         1                    Allen, Miss. Elisabeth Walton  female   
1       1         1                   Allison, Master. Hudson Trevor    male   
2       1         0                     Allison, Miss. Helen Loraine  female   
3       1         0             Allison, Mr. Hudson Joshua Creighton    male   
4       1         0  Allison, Mrs. Hudson J C (Bessie Waldo Daniels)  female   

       age  sibsp  parch  ticket      fare    cabin embarked boat   body  \
0  29.0000      0      0   24160  211.3375       B5        S    2    NaN   
1   0.9167      1      2  113781  151.5500  C22 C26        S   11    NaN   
2   2.0000      1      2  113781  151.5500  C22 C26        S  NaN    NaN   
3  30.0000      1      2  113781  151.5500  C22 C26        S  NaN  135.0   
4  25.0000      1      2  113781  151.5500  C22 C26        S  NaN    NaN   

                         home.dest  
0                     St Louis, MO  
1  Montreal, PQ / Chesterville, ON  
2  Montreal, PQ / Chesterville, ON  
3  Montreal, PQ / Chesterville, ON  
4  Montreal, PQ / Chesterville, ON 

Update your pandas package. The latest version is 0.20.2.
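
To see which versions are actually installed before upgrading, something like the following could be used. This is only a sketch: it assumes Python 3.8 or newer (for importlib.metadata), and installed_version is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Report both packages involved in reading .xls files.
    for pkg in ("pandas", "xlrd"):
        print(pkg, installed_version(pkg))
```

If pandas turns out to be old, pip install --upgrade pandas performs the upgrade.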