Cannot read an Excel 97-2004 Workbook

I have been trying to read an Excel file that I scraped from a web page. Its type is Microsoft Excel 97-2004 Workbook (I checked this in MS Excel). This is what I am trying with PHPExcel:

$destination = APPPATH . "docs/app.xls";
$inputFileType = PHPExcel_IOFactory::identify($destination);
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
$objPHPExcel = $objReader->load($destination);

I get the following errors:

A PHP Error was encountered

Severity: Warning

Message: simplexml_load_file(): /var/www/application/cookies/app.xls:1: parser error : Start tag expected, '<' not found

Filename: Reader/Excel2003XML.php

Line Number: 333

....
A PHP Error was encountered

Severity: Warning

Message: simplexml_load_file(): HTTP/1.1 200 OK

Filename: Reader/Excel2003XML.php

Line Number: 333
...

A PHP Error was encountered

Severity: Warning

Message: simplexml_load_file(): ^

Filename: Reader/Excel2003XML.php

Line Number: 333
...

Fatal error: Call to a member function getNamespaces() on boolean in /var/www/application/third_party/PHPExcel/Reader/Excel2003XML.php on line 334
A PHP Error was encountered

Severity: Error

Message: Call to a member function getNamespaces() on boolean

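Can anyone help me resolve this?

Your file has problems: it is not simply that it is in SpreadsheetML format, it is corrupt.

Opening the file in a text editor, I can see that the HTTP response headers are included in the file as well: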

HTTP/1.1 200 OK
Date: Fri, 13 Nov 2015 09:55:31 GMT
Server: Apache-Coyote/1.1
Content-Disposition: inline; filename="sdp_daily_app_revenue_report.xls"
Content-Type: application/vnd.ms-excel
Transfer-Encoding: chunked

<?xml version="1.0" encoding="UTF-8"?>
<?mso-application progid="Excel.Sheet"?>

<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:html="http://www.w3.org/TR/REC-html40">
....
</Workbook>

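This makes it unreadable. The file should contain only the actual XML content.

I don't know how you are fetching it, but you need to make sure the HTTP response headers are not echoed into the file. If everything before <?xml version="1.0" encoding="UTF-8"?> is removed, PHPExcel should be able to read it without problems.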
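As a rough sketch of both options: the question does not say how the file is downloaded, so the cURL-based `downloadBodyOnly()` below is purely an assumption, and both helper names are hypothetical. With cURL, leaving `CURLOPT_HEADER` disabled keeps the response headers out of the returned data. For a file that has already been saved with headers in it, `stripLeadingHeaders()` drops everything before the XML declaration.

```php
<?php
// Assumption: the file is fetched with cURL; the question does not say so.
// downloadBodyOnly() and stripLeadingHeaders() are hypothetical helpers.

function downloadBodyOnly(string $url, string $destination): void
{
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true,  // return the response instead of printing it
        CURLOPT_HEADER         => false, // keep response headers out of the returned data
    ));
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('Download failed: ' . curl_error($ch));
    }
    file_put_contents($destination, $body);
}

function stripLeadingHeaders(string $raw): string
{
    // Everything before the XML declaration is treated as stray header text.
    $pos = strpos($raw, '<?xml');
    if ($pos === false) {
        throw new RuntimeException('No XML declaration found');
    }
    return substr($raw, $pos);
}

// Demonstration on an in-memory sample rather than a real download.
$sample = "HTTP/1.1 200 OK\r\nContent-Type: application/vnd.ms-excel\r\n\r\n"
        . '<?xml version="1.0" encoding="UTF-8"?><Workbook></Workbook>';
echo stripLeadingHeaders($sample), PHP_EOL;
```

To repair an existing file, the same function could be applied to the result of `file_get_contents($destination)` and written back with `file_put_contents()` before calling the reader.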

EDIT

It also has no Default style defined, which should be mandatory for the SpreadsheetML format. If you want to hack the code of the Excel2003XML Reader, at around lines 413-417, change

if ($styleID == 'Default') {
    $this->styles['Default'] = array();
} else {
    $this->styles[$styleID] = $this->styles['Default'];
}

to

$this->styles[$styleID] = (isset($this->styles['Default'])) ? $this->styles['Default'] : array();
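The effect of that change can be seen in isolation with a small standalone sketch. This is not PHPExcel code: `initialStyle()` and the plain `$styles` array are hypothetical stand-ins for the reader's internal style table. The point is that a style starts from the Default style when one has been defined, and from an empty array when the workbook never declares one, instead of reading a key that does not exist.

```php
<?php
// Hypothetical stand-in for the reader's style table; not PHPExcel code.
function initialStyle(array $styles): array
{
    // Inherit from 'Default' when it is defined, otherwise start empty.
    return isset($styles['Default']) ? $styles['Default'] : array();
}

// Workbook without a Default style: falls back to an empty array.
var_dump(initialStyle(array()));

// Workbook with a Default style: the new style starts as a copy of it.
var_dump(initialStyle(array('Default' => array('bold' => true))));
```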