How to select data in Scrapy from an HTML file when the element has both a class and an id?

Question

<div class="section-body" id="section-2"><p>Most people with aortic stenosis do not develop symptoms until the disease is advanced. The diagnosis may have been made when the health care provider heard a heart murmur and performed tests.</p><p>Symptoms of aortic stenosis include:</p><ul><li>Chest discomfort: The chest pain may get worse with activity and reach into the arm, neck, or jaw. The chest may also feel tight or squeezed.</li><li>Cough, possibly bloody.</li><li>Breathing problems when exercising.</li><li>Becoming easily tired.</li><li>Feeling the heartbeat (palpitations).</li><li>Fainting, weakness, or dizziness with activity.</li></ul><p>In infants and children, symptoms include:</p><ul><li>Becoming easily tired with exertion (in mild cases)</li><li>Failure to gain weight</li><li>Poor feeding</li><li>Serious breathing problems that develop within days or weeks of birth (in severe cases)</li></ul><p>Children with mild or moderate aortic stenosis may get worse as they get older. They are also at risk for a heart infection called bacterial endocarditis.</p></div></div></section>

I have the HTML above and I want to scrape the data in the lists. I tried the commands below in Scrapy, but they did not work; I get "[]" as output.

 response.css("article div.section-body p").extract() <-- this is giving all info under section body but I want only under section-2
  response.css("article div.section-body.section-2 p::text").extract()
 response.xpath("//article/*[contains(@id, 'setion-2')]").extract()

Please help me extract this data. Thanks.

Answer 1

Try:

response.css("article div.section-body#section-2 p::text").extract()

div.section-body#section-2 selects the div that has both the class section-body and the ID section-2.

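Note that an ID is selected with # and a class with ., so the CSS selector posted in your question is wrong: .section-2 looks for a class named section-2, but section-2 is the ID. Your XPath attempt also misspells the ID as 'setion-2', which is why it matches nothing.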


xpath

scrapy

web-scraping

scrapy-spider