JSoup: getting several values using a loop
I'm using JSoup to try to get several values from a website that, luckily, has only one tbody tag and is built like this:
<tbody>
<tr>
<td>2015</td>
<td>November</td>
<td class="no-border-left"></td>
<td class="no-border-left">€ 15,90</td>
<td>
<a href="/Invoice/Download?invoiceNo=2632992" target="_blank"><img alt="" src="/Content/Images/pdf_icon.png" /></a> </td>
</tr>
<tr>
<td>2015</td>
<td>Oktober</td>
<td class="no-border-left"></td>
<td class="no-border-left">€ 16,20</td>
<td>
<a href="/Invoice/Download?invoiceNo=2445473" target="_blank"><img alt="" src="/Content/Images/pdf_icon.png" /></a>
</td>
</tr>
....
</tbody>
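I need to get every year (2015), month (November), amount (€ 15,90) and link (the a href=) value, and add them to a list view in a loop.
I already have some code, but for some reason I can't get the amount value.
I'd also like to use the "link" value later to download more content.
Could someone take a look and point me in the right direction?
Thanks.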
....
Elements Tbody = doc.select("TBODY");
for (Element p : Tbody) {
    Iterator<Element> postIt = p.select("td").iterator();
    String YeaR = postIt.next().text();
    String MontH = postIt.next().text();
    postIt.next();
    postIt.next();
    Element amount = doc.select("td.no-border-left").first();
    String amounT = amount.text();
    Element hrefs = doc.select("a[href]").first();
    String linK = hrefs.text();
}
....
OK, I managed to solve it.
In case anyone has the same problem, here is the working code:
try {
    CharSequence cs1 = "€";
    is = getActivity().getAssets().open("test.htm");
    Document doc = Jsoup.parse(is, "UTF-8", "http://example.com/");
    Elements rows = doc.select("tr");
    for (int i = 1; i < rows.size(); i++) {
        Element row = rows.get(i);
        Elements cols = row.select("td");
        Elements links = row.getElementsByTag("a");
        String YeaR = cols.get(0).text();
        //Log.e("JSOUP: ", YeaR);
        String MontH = cols.get(1).text();
        //Log.e("JSOUP: ", MontH);
        for (Element tes : cols)
            if (tes.text().contains(cs1)) {
                String amounT = tes.text();
                //Log.e("JSOUP: ", amounT);
            }
        for (Element link : links) {
            String url = link.attr("href");
            //Log.e("JSOUP: ", url);
        }
    }
    if (is != null)
        is.close();
} catch (IOException e) {
    e.printStackTrace();
}
It gives this output:
E/JSOUP:: 2015
E/JSOUP:: November
E/JSOUP:: € 15,90
E/JSOUP:: /Invoice/Download?invoiceNo=2632992
E/JSOUP:: 2015
E/JSOUP:: Oktober
E/JSOUP:: € 16,20
E/JSOUP:: /Invoice/Download?invoiceNo=2445473
A bit late to answer now, but I would do it like this:
Document doc = Jsoup.parse(myHtml);
Elements tableRows = doc.select("tbody > tr"); // get all the tr elements in the table
for (Element tableRow : tableRows) { // iterate over all the table rows (tr elements)
    String year = tableRow.child(0).text(); // get the first td in the row, with the year, and get the text
    String month = tableRow.child(1).text();
    String price = tableRow.child(3).text(); // the fourth td holds the amount; the third one is empty
    // get the link (a), and then get the href attribute. You can also use abs:href to get an absolute url
    String link = tableRow.children().select("a").first().attr("href");
}
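Since the question mentions putting the values into a list view and using the link later for downloads, here is a rough sketch of how the per-row values could be collected and the relative href turned into an absolute URL. It only uses the standard library, so the row values are hard-coded where the JSoup calls would normally go. InvoiceRow, InvoiceRows and absoluteLink are names I made up for illustration, and the base URL is just the placeholder from the Jsoup.parse call earlier in the thread, so treat it as an assumption rather than the real site.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder for the values of one table row (not a JSoup class).
final class InvoiceRow {
    final String year;
    final String month;
    final String amount;
    final URI link;

    InvoiceRow(String year, String month, String amount, URI link) {
        this.year = year;
        this.month = month;
        this.amount = amount;
        this.link = link;
    }

    @Override
    public String toString() {
        return year + " " + month + " " + amount + " " + link;
    }
}

class InvoiceRows {
    // Resolves a relative href against the page's base URL using java.net.URI.
    static URI absoluteLink(String baseUrl, String href) {
        return URI.create(baseUrl).resolve(href);
    }

    public static void main(String[] args) {
        // Placeholder base URL (an assumption, not the real site).
        String base = "http://example.com/";
        List<InvoiceRow> rows = new ArrayList<>();

        // Inside the real loop these strings would come from
        // tableRow.child(n).text() and attr("href"); they are hard-coded here.
        // The escape in the amount strings is the euro sign.
        rows.add(new InvoiceRow("2015", "November", "\u20AC 15,90",
                absoluteLink(base, "/Invoice/Download?invoiceNo=2632992")));
        rows.add(new InvoiceRow("2015", "Oktober", "\u20AC 16,20",
                absoluteLink(base, "/Invoice/Download?invoiceNo=2445473")));

        // The finished list is what an adapter for a list view could be fed with.
        for (InvoiceRow row : rows) {
            System.out.println(row);
        }
    }
}
```

Keeping the values in one object per row, instead of in variables that go out of scope inside the loop, makes it easier to hand the whole list over to whatever fills the list view.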