How to loop over DOM elements and store them as an array?
I am getting data by scraping.
The data source is a table, and I need the data from each row (tr).
Each row has 3 cells (td):
- title
- date
- link
Here is the code I am using:
$data = array();
$counter = 1;
$index = 0;
foreach ($html->find('#middle table tr td') as $source) {
    $dont_include = array(
        '<td>CONTAIN TEXT THAT I DONT WANT TO INCLUDE IN HERE</td>'
    );
    if (!in_array($source->outertext, $dont_include)) {
        // If the cell contains a link, get its href.
        // The source markup for the link looks something like
        // <td><a href="">xx</a></td>
        if (strstr($source->innertext, 'http://')) {
            $a = new SimpleXMLElement($source->innertext);
            $the_link = (string) $a['href'][0];
            $data[$index] = array('link' => $the_link);
        } else {
            if ($counter == 2) {
                $data[$index] = array('title' => $source->innertext);
            } else {
                $data[$index] = array('date' => $source->innertext);
                $counter = 0;
                $index++;
            }
        }
    }
    $counter++;
}
print_r($data);
Question:
How can I store these values in an array with this structure:
Array (
    [0] => Array (
        [title] => ""
        [date] => ""
        [link] => ""
    )
    [1] => Array (
        [title] => ""
        [date] => ""
        [link] => ""
    )
    ...
)
Update: here is the source structure:
<!-- THIS IS THE SOURCE , AT THE TOP HERE CONTAIN TD THAT I DONT WANT -->
<td>title</td>
<td class="ac">date</td>
<td width="190"><a href="i need this link" target="_blank">filename , i dont need the file name</a>
</td>
<td>title</td>
<td class="ac">date</td>
<td width="190"><a href="i need this link" target="_blank">filename , i dont need the file name</a>
</td>
<td>title</td>
<td class="ac">date</td>
<td width="190"><a href="i need this link" target="_blank">filename , i dont need the file name</a>
</td>
<td>title</td>
<td class="ac">date</td>
<td width="190"><a href="i need this link" target="_blank">filename , i dont need the file name</a>
</td>
I suggest looping over the tr elements instead of the td elements, so that you can build one array per row. Try this:
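$rowData = array();
foreach ($html->find('#middle table tr') as $row) {
    // Skip rows (such as the unwanted one at the top) that lack the three expected cells.
    if (count($row->children()) < 3) {
        continue;
    }
    $anchor = $row->children(2)->find('a', 0);
    $cellData = array();
    $cellData['title'] = $row->children(0)->innertext;
    $cellData['date'] = $row->children(1)->innertext;
    // Take the href attribute rather than the cell's inner HTML.
    $cellData['link'] = $anchor ? $anchor->href : '';
    $rowData[] = $cellData;
}
print_r($rowData);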
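For anyone who wants to try the row-by-row idea without the scraping library, here is a minimal, self-contained sketch using PHP's built-in DOMDocument and DOMXPath instead of Simple HTML DOM. The sample markup, the example.com URLs, and the extractRows() helper are all made up for illustration. It assumes every wanted row has exactly the three cells from the question and that the unwanted row at the top has fewer.

```php
<?php
// Hypothetical sample markup modeled on the structure shown in the question.
$sample = <<<HTML
<div id="middle">
  <table>
    <tr><td colspan="3">header row to skip</td></tr>
    <tr>
      <td>First title</td>
      <td class="ac">2012-01-01</td>
      <td width="190"><a href="http://example.com/a.pdf" target="_blank">a.pdf</a></td>
    </tr>
    <tr>
      <td>Second title</td>
      <td class="ac">2012-01-02</td>
      <td width="190"><a href="http://example.com/b.pdf" target="_blank">b.pdf</a></td>
    </tr>
  </table>
</div>
HTML;

// Hypothetical helper: returns one array per table row with title, date and link.
function extractRows(string $markup): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($markup);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $rows = array();
    foreach ($xpath->query('//*[@id="middle"]//table//tr') as $tr) {
        $cells = $xpath->query('./td', $tr);
        if ($cells->length < 3) {
            continue; // skip rows without title, date and link cells
        }
        // First <a> inside the third cell, or null if there is none.
        $anchor = $xpath->query('.//a', $cells->item(2))->item(0);
        $rows[] = array(
            'title' => trim($cells->item(0)->textContent),
            'date'  => trim($cells->item(1)->textContent),
            'link'  => $anchor ? $anchor->getAttribute('href') : '',
        );
    }
    return $rows;
}

print_r(extractRows($sample));
```

Each iteration handles one complete row, so there is no need for the $counter and $index bookkeeping from the question, and every entry always gets all three keys.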