How to get the data between two strings in an HTML page

I have several HTML pages that look like this:

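Title

Introduction

Table of contents:

Option 1. Art

Option 2. Sports

Option 3. Dance

Description of options

Option 1. Art

a. Watercolor description b. Oil painting description c. Acrylic painting description

Option 2. Sports

a. Basketball description b. Cricket description c. Football description

Option 3. Dance

a. Breakdancing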

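All of this content is stored in a different HTML format on each page. I want to collect the entire text under the Sports option from every page. (Is there any way to do this other than working out XPaths, since each HTML page has a different structure?)

Please help. Thank you.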

Sample HTML:

<Document>
<TYPE>
<SEQUENCE>
<FILENAME>
<DESCRIPTION>
<TEXT>
<HTML>
<HEAD>
</HEAD>

<P style="font-family:times;;margin-left:10.0pt;text-indent:-10.0pt;"><FONT SIZE=2><B>

<!-- COMMAND=STYLE_ADDED,"margin-left:10.0pt;text-indent:-10.0pt;" -->

option 1. Art history: </B></FONT></P>

<P style="font-family:times;"><FONT SIZE=2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;The nature of art has been described by philosopher Richard Wollheim as "one of the most elusive of the traditional problems of human culture".[19] Art has been defined as a vehicle for the expression or communication of emotions and ideas </FONT></P>

<P style="font-family:times;;margin-left:10.0pt;text-indent:-10.0pt;"><FONT SIZE=2><B>

<!-- COMMAND=STYLE_ADDED,"margin-left:10.0pt;text-indent:-10.0pt;" -->

option 2. Sports division : </B></FONT></P>

<P style="font-family:times;"><FONT SIZE=2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Hundreds of sports exist, from those between single contestants, through to those with hundreds of simultaneous participants, either in teams or competing as individuals. In certain sports such as racing, many contestants may compete, simultaneously or consecutively, with one winner; in others, the contest (a match) is between two sides, each attempting to exceed the other.</FONT></P>

<P style="font-family:times;;margin-left:10.0pt;text-indent:-10.0pt;"><FONT SIZE=2><B>

<!-- COMMAND=STYLE_ADDED,"margin-left:10.0pt;text-indent:-10.0pt;" -->

option 3. Dance group: </B></FONT></P>

<P style="font-family:times;"><FONT SIZE=2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;An important distinction is to be drawn between the contexts of theatrical and participatory dance,[4] although these two categories are not always completely separate; both may have special functions, </FONT></P>

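I would use jQuery's .text() here, since it seems to do exactly what you want: it returns only the text content of all the selected nodes.

Beyond that, it is just a matter of selecting the right elements, so we really need to know which "section" to select. Keep in mind that :contains() is case-sensitive and that .nextUntil() only walks sibling elements, so the heading paragraphs have to share a parent. Probably something like this: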

$("P:contains('option 2')").nextUntil("p:contains('option 3')").text()

console.log($("P:contains('option 2')").nextUntil("p:contains('option 3')").text())
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<div id="scott">
<P style="font-family:times;;margin-left:10.0pt;text-indent:-10.0pt;"><FONT SIZE=2><B>

<!-- COMMAND=STYLE_ADDED,"margin-left:10.0pt;text-indent:-10.0pt;" -->

option 1. Art history: </B></FONT></P>

<P style="font-family:times;"><FONT SIZE=2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;The nature of art has been described by philosopher Richard Wollheim as "one of the most elusive of the traditional problems of human culture".[19] Art has been defined as a vehicle for the expression or communication of emotions and ideas </FONT></P>



<P style="font-family:times;;margin-left:10.0pt;text-indent:-10.0pt;"><FONT SIZE=2><B>

<!-- COMMAND=STYLE_ADDED,"margin-left:10.0pt;text-indent:-10.0pt;" -->

option 2. Sports division : </B></FONT></P>

<P style="font-family:times;"><FONT SIZE=2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Hundreds of sports exist, from those between single contestants, through to those with hundreds of simultaneous participants, either in teams or competing as individuals. In certain sports such as racing, many contestants may compete, simultaneously or consecutively, with one winner; in others, the contest (a match) is between two sides, each attempting to exceed the other.</FONT></P>

<div>some other random stuff here.</div>

<P style="font-family:times;;margin-left:10.0pt;text-indent:-10.0pt;"><FONT SIZE=2><B>

<!-- COMMAND=STYLE_ADDED,"margin-left:10.0pt;text-indent:-10.0pt;" -->

option 3. Dance group: </B></FONT></P>

<P style="font-family:times;"><FONT SIZE=2>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;An important distinction is to be drawn between the contexts of theatrical and participatory dance,[4] although these two categories are not always completely separate; both may have special functions, </FONT></P>

</div>
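Since the question says every page is structured differently, a selector-free fallback may also be worth trying: flatten the markup to plain text and slice between the two heading strings. The sketch below is only an illustration under a few assumptions. `textBetween` is a hypothetical helper (it is not part of jQuery), the match is case-sensitive, `&nbsp;` is the only entity it decodes because it is the only one in the sample, and it starts from the last occurrence of the start marker so that a table-of-contents entry earlier on the page is skipped.

```javascript
// Hypothetical helper, not a jQuery API: returns the plain text found between
// two marker strings, ignoring how the HTML around them is structured.
function textBetween(html, startMarker, endMarker) {
  const text = html
    .replace(/<!--[\s\S]*?-->/g, " ") // drop HTML comments
    .replace(/<[^>]*>/g, " ")         // drop tags (regex-based, so only suitable for simple markup)
    .replace(/&nbsp;/gi, " ")         // decode the one entity used in the sample
    .replace(/\s+/g, " ")             // collapse runs of whitespace
    .trim();

  // Assumption: the last occurrence of the start marker is the real section
  // heading; earlier ones (e.g. a table of contents) are skipped.
  const start = text.lastIndexOf(startMarker);
  if (start === -1) return null;

  const from = start + startMarker.length;
  const end = text.indexOf(endMarker, from);

  // If the end marker is missing, take everything up to the end of the page.
  return text.slice(from, end === -1 ? text.length : end).trim();
}

// Usage with a shortened version of the sample markup.
const sample = `
<P><FONT SIZE=2><B>option 2. Sports division : </B></FONT></P>
<P><FONT SIZE=2>&nbsp;&nbsp;Hundreds of sports exist.</FONT></P>
<div>some other random stuff here.</div>
<P><FONT SIZE=2><B>option 3. Dance group: </B></FONT></P>
<P><FONT SIZE=2>Dance text.</FONT></P>`;

console.log(textBetween(sample, "option 2. Sports division :", "option 3"));
```

Passing the full heading text as the start marker keeps the heading itself out of the result. Stripping tags with a regular expression is fragile on pages that contain scripts or a ">" inside attribute values, so a real HTML parser would be the safer choice for those.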