Parse URLs, loop file_get_html(urls) and then get elements

Question

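I have a website that I need to parse.

First, I have to parse the URLs of all the catalogs on a page. Then I need to visit each of those URLs and parse the URLs found on each page. Finally, I loop over those URLs and get the elements matching ('.description div').

I am using Simple HTML DOM.

But when I try to loop over the URLs I parsed in the first step, I run into a problem: I get a blank page.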

include 'simple_html_dom.php';
$catalogs = file_get_html('http://optnow.ru/catalog');
$catalogLink = [];
if(!empty($catalogs)) {
    foreach( $catalogs->find('div.cat-name a') as $catalog) {
         $catalogUrl = 'http://optnow.ru/' . $catalog->href . '?page=0';
         $catalogLink[] = $catalogUrl;
         $catalogHtml = file_get_html($catalogUrl);
         $productsLink = $catalogHtml->find('.link-pv-name');
         print_r($productsLink->href);
    }
}

Where is my mistake?

Thanks.

Answer 1

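Collect the catalog URLs into an array first, then loop over that array in a second foreach. Also note that find() called without an index returns an array of elements, not a single element, so read href from each element in turn instead of from the result itself:

include 'simple_html_dom.php';

$catalog = file_get_html('http://optnow.ru/catalog');
$catalogLink = [];
if (!empty($catalog)) {
    // First pass: collect the URL of every catalog page.
    foreach ($catalog->find('div.cat-name a') as $catalogHref) {
        $myLink = 'http://optnow.ru/' . $catalogHref->href . '?page=0';
        $catalogLink[] = $myLink;
        echo '<pre>';
        print_r($myLink);
        echo '</pre>';
    }
    // Second pass: load each catalog page and print its product links.
    foreach ($catalogLink as $catalogSingleLink) {
        $catalogHtml = file_get_html($catalogSingleLink);
        if (empty($catalogHtml)) {
            continue; // file_get_html() returns false if the page could not be loaded
        }
        // find() without an index returns an array, so iterate over it.
        foreach ($catalogHtml->find('.link-pv-name') as $catalogProduct) {
            echo $catalogProduct->href . '<br>';
        }
    }
}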
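
The same two-pass idea can be tried offline. The following is only a minimal sketch: it uses PHP's bundled DOMDocument and DOMXPath instead of Simple HTML DOM, and the $pages array, its markup, and the findHrefs() helper are hypothetical stand-ins for the downloaded pages, not anything taken from the real site. It illustrates that a query yields a list of nodes that has to be iterated before an attribute such as href can be read.

```php
<?php
// Hypothetical stand-ins for the downloaded pages, so the sketch needs no network access.
$pages = [
    'catalog' =>
        '<div class="cat-name"><a href="catalog/toys">Toys</a></div>' .
        '<div class="cat-name"><a href="catalog/books">Books</a></div>',
    'http://optnow.ru/catalog/toys?page=0' =>
        '<a class="link-pv-name" href="/product/1">Car</a>' .
        '<a class="link-pv-name" href="/product/2">Doll</a>',
    'http://optnow.ru/catalog/books?page=0' =>
        '<a class="link-pv-name" href="/product/3">Atlas</a>',
];

// Hypothetical helper: return the href of every node matched by an XPath query.
function findHrefs(string $html, string $xpath): array
{
    // Collect markup warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $hrefs = [];
    // query() returns a list of nodes, so each node is visited in turn.
    foreach ((new DOMXPath($doc))->query($xpath) as $node) {
        $hrefs[] = $node->getAttribute('href');
    }
    return $hrefs;
}

$productLinks = [];
// First pass: every catalog link on the catalog page.
foreach (findHrefs($pages['catalog'], '//div[@class="cat-name"]//a') as $href) {
    $catalogUrl = 'http://optnow.ru/' . $href . '?page=0';
    // Second pass: every product link on that catalog page.
    // Matching @class exactly is a simplification that only works for single-class elements.
    foreach (findHrefs($pages[$catalogUrl], '//*[@class="link-pv-name"]') as $productHref) {
        $productLinks[] = $productHref;
    }
}

print_r($productLinks);
```

In a real run the lookup in $pages would be replaced by an actual download, and a failed download would need to be skipped, as in the answer's code above.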

Tags: php, arrays, foreach, parsing, simple-html-dom