What is causing the preg_match error with XPath? Undefined offset: 1
I am trying to extract the ID that follows "Property ID:" using the following code:
<?php
$getURL = file_get_contents('http://realestate.com.kh/residential-for-rent-in-phnom-penh-daun-penh-phsar-chas-2-beds-apartment-1001192296/');
$dom = new DOMDocument();
@$dom->loadHTML($getURL);
$xpath = new DOMXPath($dom);
/*echo $xpath->evaluate("normalize-space(substring-before(substring-after(//p[contains(text(),'Property ID:')][1], 'Property ID:'), '–'))");*/
$id = $xpath->evaluate('//div[contains(@class,"property-table")]')->item(0)->nodeValue;
preg_match("/Property ID :(.*)/", $id, $matches);
echo $matches[1];
But it doesn't work:
Notice: Undefined offset: 1 in W:\Xampp\htdocs\X\index.php on line 12
What is wrong? If I build the string by hand like this
$id ="Property Details Property Type : Apartment Price $ 350 pm Building Size 72 Sqms Property ID : 1001192296";
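and substitute it into my code, it works. So what is the difference between the data I created myself and the data fetched via XPath?
Thanks in advance for your help.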
You need to check whether preg_match() actually found anything. Without a match there is no $matches[1]. You should use if (count($matches) > 1) { ... } to avoid the notice you are seeing.
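A minimal sketch of that guard follows. The helper name `extractPropertyId` is hypothetical and not part of the original code. Since preg_match() returns 1 on a match, 0 on no match, and false on error, testing its return value serves the same purpose here as checking count($matches) > 1.

```php
<?php
// Hypothetical helper: returns the text after "Property ID :" or null if the
// pattern is not found.
function extractPropertyId(string $text): ?string
{
    // preg_match() returns 1 only when the pattern matched.
    if (preg_match('/Property ID :(.*)/', $text, $matches) === 1) {
        return trim($matches[1]);
    }
    return null; // no match, so $matches[1] would not exist
}

var_dump(extractPropertyId('Property ID : 1001192296'));    // string(10) "1001192296"
var_dump(extractPropertyId("Property ID\n:\n1001192296"));  // NULL
```

The second call shows the failing case from the question: a line break instead of a space after "Property ID" means the literal pattern never matches.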
Your preg_match() does not match because the nodeValue you get back from the XPath query looks exactly like this:
Property Details
Property Type :
Apartment
Price
$ 350 pm
Building Size
72 Sqms
Property ID
:
1001192296
So you have to try it like this:
$getURL = file_get_contents('http://realestate.com.kh/residential-for-rent-in-phnom-penh-daun-penh-phsar-chas-2-beds-apartment-1001192296/');
$dom = new DOMDocument();
@$dom->loadHTML($getURL);
$xpath = new DOMXPath($dom);
/*echo $xpath->evaluate("normalize-space(substring-before(substring-after(//p[contains(text(),'Property ID:')][1], 'Property ID:'), '–'))");*/
$id = $xpath->evaluate('//div[contains(@class,"property-table")]')->item(0)->nodeValue;
$id = preg_replace('!\s+!', ' ', $id);
preg_match("/Property ID :(.*)/", $id, $matches);
echo $matches[1];
This line ( $id = preg_replace('!\s+!', ' ', $id); ) collapses every run of whitespace between words (tabs, line breaks, repeated spaces) into a single space.
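The effect of the whitespace collapse can be reproduced without fetching the page. This is only a sketch: the sample string `$raw` is an assumption shaped like the nodeValue shown above, and the exact whitespace on the real page may differ.

```php
<?php
// Assumed sample: the label, colon and value separated by line breaks and tabs.
$raw = "Property ID\n\t:\n\t1001192296\n";

// Without normalization the literal text "Property ID :" never occurs.
var_dump(preg_match('/Property ID :(.*)/', $raw));   // int(0)

// Collapse every run of whitespace (spaces, tabs, newlines) into one space.
$clean = trim(preg_replace('/\s+/', ' ', $raw));

// Now the pattern matches; \S+ captures just the ID token.
if (preg_match('/Property ID :\s*(\S+)/', $clean, $m) === 1) {
    echo $m[1], PHP_EOL;                             // 1001192296
}
```

An alternative is to leave the text untouched and make the pattern itself whitespace-tolerant, for example /Property ID\s*:\s*(\S+)/.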
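Update:
Following the comment below, I now take the full text of the HTML page with $xpath->evaluate() and try to match every property ID (for example digits only, or P-digits).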
$getURL = file_get_contents('http://realestate.com.kh/residential-for-rent-in-phnom-penh-daun-penh-phsar-chas-2-beds-apartment-1001192296/');
$dom = new DOMDocument();
@$dom->loadHTML($getURL);
$xpath = new DOMXPath($dom);
// this only returns the text of the whole page without html tags
$id = $xpath->evaluate( "//html" )->item(0)->nodeValue;
$id = preg_replace('!\s+!', ' ', $id);
// not a good regex, but matches the property IDs
preg_match_all("/Property ID( |):[ |]((\w{0,1}[-]|)\d*)/", $id, $matches);
// with the extra groups in this pattern, the IDs are in capture group 2: $matches[2]
foreach( $matches[2] as $property_id ) {
echo $property_id."<br>";
}
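As a possible variation, the whitespace cleanup can be done inside the XPath expression with normalize-space(), the function already used in the commented-out line of the question. The sketch below runs on hypothetical inline markup (`$html`), since the real page structure is not shown here, and uses libxml_use_internal_errors() in place of the @ operator.

```php
<?php
// Hypothetical markup standing in for the real page.
$html = '<div class="property-table"><table>'
      . '<tr><td>Property ID</td><td>:</td><td> 1001192296 </td></tr>'
      . '</table></div>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of using @
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
// normalize-space() trims and collapses whitespace, so evaluate() returns a
// clean string for the first matching div.
$text = $xpath->evaluate('normalize-space(//div[contains(@class,"property-table")])');

// Whitespace-tolerant pattern, guarded so a missing ID raises no notice.
if (preg_match('/Property ID\s*:\s*(\S+)/', $text, $m) === 1) {
    echo $m[1], PHP_EOL;
}
```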