Scrapy response.css - two tags without distinct identifiers

Question

I am a beginner with Scrapy and have run into a problem with the following markup:

<tr>
<td rowspan="2" style="vertical-align: top; width: 20%;">
1.&nbsp;c4<br>

<script type="text/javascript">
...
</script>

</td>
<td style="vertical-align: top;">The English Defense, here I give up the centre to Black as a target for attack.</td>
</tr>

If I want both the "c4" text and "The English Defense, here I give up the centre to Black as a target for attack.", I can use response.css('tr td::text').extract().

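But what should I do if I only want the text of the second <td> tag, given that the <td> tags have no id, class, or anything else to tell them apart? In this link, I did not find a solution that uses style or rowspan...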

Answer 1

You can use the nth-child selector. In your specific case, that would be:
response.css("td:nth-child(2)::text").extract()


css

web-crawler

scrapy

python-3.x