beautifulsoup: how can I remove specific lines to get a result set with texts
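Below is my HTML. I would like to get only the data from this table.
I want just the necessary data, without the data that comes from the lines containing class="tltle".
I don't know how to take out the HTML lines that have class="tltle".
I would like to get: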
[['1501','9,445', '50', '+0.53%', '0', '1','1,000', '94', 'N/A', 'N/A', 'N/A'],
['1502','18,875', '195', '-0.12%', '0', '7','500', '94', 'N/A', 'N/A', 'N/A'],
...............................................,
['1550','8,350', '95', '+1.15%', '0', '2,601','1,000', '84', 'N/A', 'N/A', 'N/A']]
My Python code is as follows:
stock_list = soup.find("table", attrs={"class": "type_2"}).find("tbody").find_all("tr")
for stock in stock_list:
    if len(stock) > 1:
        stock.get_text().split()
But I only get:
[['1501','메리츠', '인버스','2X', '국채10년ETN' ,'9,445', '50', '+0.53%', '0', '1','1,000', '94', 'N/A', 'N/A', 'N/A'],
['1502','KB', '레버리지','구리', '선물ETN(H)' ,'18,875', '195', '-0.12%', '0', '7','500', '94', 'N/A', 'N/A', 'N/A'],
...............................................,
['1550','TRUE', '인버스','2X', 'HSCEI','ETN(H)' ,'8,350', '95', '+1.15%', '0', '2,601','1,000', '84', 'N/A', 'N/A', 'N/A']]
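The split-up names in the list above seem to come from calling split() on the text of the whole row: it splits on every run of whitespace, including the spaces inside a name. The following is a minimal sketch on a single cut-down row (the ROW sample and the variable names are illustrative, and html.parser is used only so it runs without extra dependencies). It compares that approach with reading the row cell by cell:

```python
from bs4 import BeautifulSoup

# Cut-down, illustrative copy of one data row from the table in the question.
ROW = """
<table><tr>
  <td class="no">1501</td>
  <td><a class="tltle">메리츠 인버스 2X 국채10년 ETN</a></td>
  <td class="number">9,445</td>
</tr></table>
"""

row = BeautifulSoup(ROW, "html.parser").tr

# Splitting the text of the whole row on whitespace also splits the multi-word name.
flat = row.get_text().split()

# Reading the row cell by cell keeps the text of each <td> as one value.
cells = [td.get_text(strip=True) for td in row.find_all("td")]

print(flat)
print(cells)
```

With the per-cell list, a whole column such as the name can be dropped by skipping its td instead of filtering individual words.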
The HTML is as follows:
<table summary="코스피 시세정보를 선택한 항목에 따라 정보를 제공합니다." cellpadding="0" cellspacing="0" class="type_2">
<caption>코스피</caption>
<colgroup>
<col width="2%">
<col width="*">
<col width="7%">
<col width="9%">
<col width="7%">
<col width="8%">
<col width="8%">
<col width="8%">
<col width="8%">
<col width="8%">
<col width="8%">
<col width="8%">
<col width="6%">
</colgroup>
<thead>
<tr>
<th scope="col">N</th>
<th scope="col">종목명</th>
<th scope="col">현재가</th>
<th scope="col" class="tr" style="padding-right:8px">전일비</th>
<th scope="col">등락률</th>
<th scope="col">액면가</th>
<th scope="col">거래량</th>
<th scope="col">상장주식수</th>
<th scope="col">시가총액</th>
<th scope="col">PER</th>
<th scope="col">ROE</th>
<th scope="col">PBR</th>
<th scope="col">토론실</th>
</tr>
</thead>
<tbody>
<tr><td colspan="10" class="blank_08"></td></tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)" style="background-color: rgb(255, 255, 255);">
<td class="no">1501</td>
<td><a href="/item/main.naver?code=610021" class="tltle">메리츠 인버스 2X 국채10년 ETN</a></td>
<td class="number">9,445</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_up.gif" width="7" height="6" style="margin-right:4px;" alt="상승"><span class="tah p11 red02">
50
</span>
</td>
<td class="number">
<span class="tah p11 red01">
+0.53%
</span>
</td>
<td class="number">0</td>
<td class="number">1</td>
<td class="number">1,000</td>
<td class="number">94</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=610021"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)" style="background-color: rgb(255, 255, 255);">
<td class="no">1502</td>
<td><a href="/item/main.naver?code=580032" class="tltle">KB 레버리지 구리 선물 ETN(H)</a></td>
<td class="number">18,875</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_down.gif" width="7" height="6" style="margin-right:4px;" alt="하락"><span class="tah p11 nv01">
195
</span>
</td>
<td class="number">
<span class="tah p11 nv01">
-1.02%
</span>
</td>
<td class="number">0</td>
<td class="number">7</td>
<td class="number">500</td>
<td class="number">94</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=580032"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)" style="background-color: rgb(255, 255, 255);">
<td class="no">1503</td>
<td><a href="/item/main.naver?code=570064" class="tltle">TRUE 인버스 베트남 VN30 선물 ETN(H)</a></td>
<td class="number">9,415</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_down.gif" width="7" height="6" style="margin-right:4px;" alt="하락"><span class="tah p11 nv01">
55
</span>
</td>
<td class="number">
<span class="tah p11 nv01">
-0.58%
</span>
</td>
<td class="number">0</td>
<td class="number">260</td>
<td class="number">1,000</td>
<td class="number">94</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=570064"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)" style="background-color: rgb(255, 255, 255);">
<td class="no">1504</td>
<td><a href="/item/main.naver?code=256450" class="tltle">ARIRANG 심천차이넥스트(합성)</a></td>
<td class="number">15,680</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_up.gif" width="7" height="6" style="margin-right:4px;" alt="상승"><span class="tah p11 red02">
5
</span>
</td>
<td class="number">
<span class="tah p11 red01">
+0.03%
</span>
</td>
<td class="number">0</td>
<td class="number">538</td>
<td class="number">600</td>
<td class="number">94</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=256450"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)" style="background-color: rgb(255, 255, 255);">
<td class="no">1505</td>
<td><a href="/item/main.naver?code=380340" class="tltle">KINDEX Fn5G플러스</a></td>
<td class="number">9,405</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_up.gif" width="7" height="6" style="margin-right:4px;" alt="상승"><span class="tah p11 red02">
100
</span>
</td>
<td class="number">
<span class="tah p11 red01">
+1.07%
</span>
</td>
<td class="number">0</td>
<td class="number">6,537</td>
<td class="number">1,000</td>
<td class="number">94</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=380340"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr><td colspan="13" class="blank_06"></td></tr>
<tr><td colspan="13" class="division_line"></td></tr>
<tr><td colspan="13" class="blank_08"></td></tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1506</td>
<td><a href="/item/main.naver?code=530087" class="tltle">삼성 KRX 2차전지 K-뉴딜 ETN</a></td>
<td class="number">9,335</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_down.gif" width="7" height="6" style="margin-right:4px;" alt="하락"><span class="tah p11 nv01">
60
</span>
</td>
<td class="number">
<span class="tah p11 nv01">
-0.64%
</span>
</td>
<td class="number">0</td>
<td class="number">15</td>
<td class="number">1,000</td>
<td class="number">93</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=530087"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1507</td>
<td><a href="/item/main.naver?code=152500" class="tltle">KINDEX 레버리지</a></td>
<td class="number">9,320</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_down.gif" width="7" height="6" style="margin-right:4px;" alt="하락"><span class="tah p11 nv01">
5
</span>
</td>
<td class="number">
<span class="tah p11 nv01">
-0.05%
</span>
</td>
<td class="number">0</td>
<td class="number">4,547</td>
<td class="number">1,000</td>
<td class="number">93</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=152500"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1508</td>
<td><a href="/item/main.naver?code=407300" class="tltle">HANARO Fn골프테마</a></td>
<td class="number">9,270</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_down.gif" width="7" height="6" style="margin-right:4px;" alt="하락"><span class="tah p11 nv01">
40
</span>
</td>
<td class="number">
<span class="tah p11 nv01">
-0.43%
</span>
</td>
<td class="number">0</td>
<td class="number">7,021</td>
<td class="number">1,000</td>
<td class="number">93</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=407300"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1509</td>
<td><a href="/item/main.naver?code=500012" class="tltle">신한 인버스 달러인덱스 선물 ETN(H)</a></td>
<td class="number">9,225</td>
<td class="number">
<span class="tah p11">0</span>
</td>
<td class="number">
<span class="tah p11">0.00%</span>
</td>
<td class="number">0</td>
<td class="number">0</td>
<td class="number">1,000</td>
<td class="number">92</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=500012"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1510</td>
<td><a href="/item/main.naver?code=227830" class="tltle">ARIRANG 코스피</a></td>
<td class="number">30,720</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_down.gif" width="7" height="6" style="margin-right:4px;" alt="하락"><span class="tah p11 nv01">
15
</span>
</td>
<td class="number">
<span class="tah p11 nv01">
-0.05%
</span>
</td>
<td class="number">0</td>
<td class="number">44</td>
<td class="number">300</td>
<td class="number">92</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=227830"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr><td colspan="13" class="blank_06"></td></tr>
<tr><td colspan="13" class="division_line"></td></tr>
<tr><td colspan="13" class="blank_08"></td></tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1511</td>
<td><a href="/item/main.naver?code=364690" class="tltle">KODEX 혁신기술테마액티브</a></td>
<td class="number">13,060</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_up.gif" width="7" height="6" style="margin-right:4px;" alt="상승"><span class="tah p11 red02">
30
</span>
</td>
<td class="number">
<span class="tah p11 red01">
+0.23%
</span>
</td>
<td class="number">0</td>
<td class="number">977</td>
<td class="number">700</td>
<td class="number">91</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=364690"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1512</td>
<td><a href="/item/main.naver?code=189400" class="tltle">ARIRANG 글로벌MSCI(합성 H)</a></td>
<td class="number">17,870</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_down.gif" width="7" height="6" style="margin-right:4px;" alt="하락"><span class="tah p11 nv01">
115
</span>
</td>
<td class="number">
<span class="tah p11 nv01">
-0.64%
</span>
</td>
<td class="number">0</td>
<td class="number">272</td>
<td class="number">510</td>
<td class="number">91</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=189400"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1513</td>
<td><a href="/item/main.naver?code=272230" class="tltle">KINDEX 스마트밸류</a></td>
<td class="number">15,055</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_down.gif" width="7" height="6" style="margin-right:4px;" alt="하락"><span class="tah p11 nv01">
20
</span>
</td>
<td class="number">
<span class="tah p11 nv01">
-0.13%
</span>
</td>
<td class="number">0</td>
<td class="number">20</td>
<td class="number">600</td>
<td class="number">90</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=272230"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1514</td>
<td><a href="/item/main.naver?code=570023" class="tltle">TRUE 인버스 2X S&P500 선물 ETN(H)</a></td>
<td class="number">1,800</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_up.gif" width="7" height="6" style="margin-right:4px;" alt="상승"><span class="tah p11 red02">
20
</span>
</td>
<td class="number">
<span class="tah p11 red01">
+1.12%
</span>
</td>
<td class="number">0</td>
<td class="number">27,146</td>
<td class="number">5,000</td>
<td class="number">90</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=570023"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1515</td>
<td><a href="/item/main.naver?code=167860" class="tltle">KOSEF 국고채10년레버리지</a></td>
<td class="number">127,925</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_down.gif" width="7" height="6" style="margin-right:4px;" alt="하락"><span class="tah p11 nv01">
200
</span>
</td>
<td class="number">
<span class="tah p11 nv01">
-0.16%
</span>
</td>
<td class="number">0</td>
<td class="number">751</td>
<td class="number">70</td>
<td class="number">90</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=167860"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr><td colspan="13" class="blank_06"></td></tr>
<tr><td colspan="13" class="division_line"></td></tr>
<tr><td colspan="13" class="blank_08"></td></tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1516</td>
<td><a href="/item/main.naver?code=610008" class="tltle">메리츠 레버리지 국채30년 ETN</a></td>
<td class="number">8,915</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_down.gif" width="7" height="6" style="margin-right:4px;" alt="하락"><span class="tah p11 nv01">
165
</span>
</td>
<td class="number">
<span class="tah p11 nv01">
-1.82%
</span>
</td>
<td class="number">0</td>
<td class="number">1,098</td>
<td class="number">1,000</td>
<td class="number">89</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=610008"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1517</td>
<td><a href="/item/main.naver?code=005965" class="tltle">동부건설우</a></td>
<td class="number">39,400</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_up.gif" width="7" height="6" style="margin-right:4px;" alt="상승"><span class="tah p11 red02">
2,300
</span>
</td>
<td class="number">
<span class="tah p11 red01">
+6.20%
</span>
</td>
<td class="number">5,000</td>
<td class="number">5,382</td>
<td class="number">226</td>
<td class="number">89</td>
<td class="number">8.55</td>
<td class="number">N/A</td>
<td class="number">1.67</td>
<td class="center"><a href="/item/board.naver?code=005965"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1518</td>
<td><a href="/item/main.naver?code=700003" class="tltle">하나 KRX BBIG K-뉴딜 ETN</a></td>
<td class="number">8,890</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_down.gif" width="7" height="6" style="margin-right:4px;" alt="하락"><span class="tah p11 nv01">
65
</span>
</td>
<td class="number">
<span class="tah p11 nv01">
-0.73%
</span>
</td>
<td class="number">0</td>
<td class="number">5</td>
<td class="number">1,000</td>
<td class="number">89</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=700003"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1519</td>
<td><a href="/item/main.naver?code=014825" class="tltle">동원시스템즈우</a></td>
<td class="number">33,500</td>
<td class="number">
<span class="tah p11">0</span>
</td>
<td class="number">
<span class="tah p11">0.00%</span>
</td>
<td class="number">5,000</td>
<td class="number">73</td>
<td class="number">265</td>
<td class="number">89</td>
<td class="number">15.17</td>
<td class="number">N/A</td>
<td class="number">1.62</td>
<td class="center"><a href="/item/board.naver?code=014825"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1520</td>
<td><a href="/item/main.naver?code=307510" class="tltle">TIGER 의료기기</a></td>
<td class="number">19,665</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_up.gif" width="7" height="6" style="margin-right:4px;" alt="상승"><span class="tah p11 red02">
310
</span>
</td>
<td class="number">
<span class="tah p11 red01">
+1.60%
</span>
</td>
<td class="number">0</td>
<td class="number">7,758</td>
<td class="number">450</td>
<td class="number">88</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=307510"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1550</td>
<td><a href="/item/main.naver?code=570032" class="tltle">TRUE 인버스 2X HSCEI ETN(H)</a></td>
<td class="number">8,350</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_up.gif" width="7" height="6" style="margin-right:4px;" alt="상승"><span class="tah p11 red02">
95
</span>
</td>
<td class="number">
<span class="tah p11 red01">
+1.15%
</span>
</td>
<td class="number">0</td>
<td class="number">2,601</td>
<td class="number">1,000</td>
<td class="number">84</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=570032"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr><td colspan="13" class="blank_09"></td></tr>
<tr><td colspan="13" class="division_line_1"></td></tr>
<tr><td colspan="13" class="blank_09"></td></tr>
</tbody>
</table>
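Note: the question needs some improvement, and a URL would be great to give more context and allow a more specific solution. The example is based on what was actually provided.
How to fix?
Select the elements you need more specifically; CSS selectors can be used for that. The following line selects all rows of the table whose caption contains "코스피" and that do not contain any th or any element with a colspan attribute:
soup.select('table:has(caption:-soup-contains("코스피")) tr:not(:has(th, [colspan]))')
This creates a result set data with the texts from the rows: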
data = []
for row in soup.select('table:has(caption:-soup-contains("코스피")) tr:not(:has(th, [colspan]))'):
    # get_text(strip=True) removes the line breaks and indentation around each value
    data.append([x.get_text(strip=True) for x in row.select('td:not(.title)')])
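In the HTML shown in the question, the class tltle sits on the a element inside the name cell, not on the td itself, so td:not(.title) does not appear to exclude anything there. Assuming the goal is to drop the name cell and the trailing discussion-link cell, one option is to filter on what the cell contains. The following is a minimal sketch on a cut-down copy of the table (the HTML sample and the variable names are illustrative, and html.parser is used only so it runs offline):

```python
from bs4 import BeautifulSoup

# Cut-down, illustrative copy of the table from the question.
HTML = """
<table class="type_2">
  <caption>코스피</caption>
  <thead><tr><th>N</th><th>종목명</th><th>현재가</th><th>전일비</th><th>토론실</th></tr></thead>
  <tbody>
    <tr><td colspan="5" class="blank_08"></td></tr>
    <tr>
      <td class="no">1501</td>
      <td><a href="/item/main.naver?code=610021" class="tltle">메리츠 인버스 2X 국채10년 ETN</a></td>
      <td class="number">9,445</td>
      <td class="number"><span class="tah p11 red02">
        50
      </span></td>
      <td class="center"><a href="/item/board.naver?code=610021"><img alt="토론실"></a></td>
    </tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(HTML, "html.parser")

rows = []
# Skip the header row (contains th) and the spacer rows (contain a colspan cell).
for tr in soup.select("table.type_2 tr:not(:has(th, [colspan]))"):
    # Drop the name cell (it contains a.tltle) and the discussion-link cell (class center).
    cells = tr.select("td:not(:has(a.tltle)):not(.center)")
    rows.append([td.get_text(strip=True) for td in cells])

print(rows)
```

Each remaining cell becomes a single string, so values such as 9,445 keep their thousands separators and can be converted to numbers later if needed.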
EDIT
Based on the additional context from the URL (finance.naver.com/sise/sise_market_sum.nhn?page=31), change the CSS selectors like this.
Get the data (the table is the only one with the class name type_2):
for row in soup.select('table.type_2 tr:not(:has(th, [colspan]))'):
    data.append([x.get_text(strip=True) for x in row.select('td:not(.title)')])
Get the headers:
list(soup.select_one('table.type_2 tr').stripped_strings)
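To illustrate what the header line returns, here is a minimal sketch on a cut-down header row (the HTML sample and the variable names are illustrative, and html.parser is used only so it runs offline):

```python
from bs4 import BeautifulSoup

# Cut-down, illustrative copy of the table header from the question.
HTML = """
<table class="type_2">
  <thead>
    <tr>
      <th scope="col">N</th>
      <th scope="col">종목명</th>
      <th scope="col" class="tr" style="padding-right:8px">전일비</th>
    </tr>
  </thead>
  <tbody>
    <tr><td class="no">1501</td><td>name</td><td class="number">50</td></tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(HTML, "html.parser")

# select_one returns the first matching tr, which here is the header row in thead.
header_row = soup.select_one("table.type_2 tr")

# stripped_strings yields every text node with surrounding whitespace removed
# and skips the nodes that contain only whitespace.
headers = list(header_row.stripped_strings)

print(headers)
```

This works as long as every th holds a single piece of text. A header cell with nested markup could yield more than one string, in which case reading each th with get_text(strip=True) may be the safer choice.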
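Example (plus creating a DataFrame from the data)
import requests
import pandas as pd
from bs4 import BeautifulSoup

html = requests.get('https://finance.naver.com/sise/sise_market_sum.nhn?page=31').text
soup = BeautifulSoup(html, 'lxml')

# Collect the cell texts of every data row (rows with th or colspan cells are skipped).
data = []
for row in soup.select('table.type_2 tr:not(:has(th, [colspan]))'):
    data.append([x.get_text(strip=True) for x in row.select('td:not(.title)')])

# Use the texts of the first table row (the header row) as column names.
df = pd.DataFrame(data, columns=list(soup.select_one('table.type_2 tr').stripped_strings))
print(df)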
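Output
| N | 종목명 | 현재가 | 전일비 | 등락률 | 액면가 | 거래량 | 상장주식수 | 시가총액 | PER | ROE | PBR | 토론실 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1501 | 메리츠 인버스 2X 국채10년 ETN | 9,445 | 50 | +0.53% | 0 | 1 | 1,000 | 94 | N/A | N/A | N/A | |
| 1502 | KB 레버리지 구리 선물 ETN(H) | 18,875 | 195 | -1.02% | 0 | 7 | 500 | 94 | N/A | N/A | N/A | |
| 1503 | TRUE 인버스 베트남 VN30 선물 ETN(H) | 9,415 | 55 | -0.58% | 0 | 260 | 1,000 | 94 | N/A | N/A | N/A | |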
The HTML is as follows:
<table summary="코스피 시세정보를 선택한 항목에 따라 정보를 제공합니다." cellpadding="0" cellspacing="0" class="type_2">
<caption>코스피</caption>
<colgroup>
<col width="2%">
<col width="*">
<col width="7%">
<col width="9%">
<col width="7%">
<col width="8%">
<col width="8%">
<col width="8%">
<col width="8%">
<col width="8%">
<col width="8%">
<col width="8%">
<col width="6%">
</colgroup>
<thead>
<tr>
<th scope="col">N</th>
<th scope="col">종목명</th>
<th scope="col">현재가</th>
<th scope="col" class="tr" style="padding-right:8px">전일비</th>
<th scope="col">등락률</th>
<th scope="col">액면가</th>
<th scope="col">거래량</th>
<th scope="col">상장주식수</th>
<th scope="col">시가총액</th>
<th scope="col">PER</th>
<th scope="col">ROE</th>
<th scope="col">PBR</th>
<th scope="col">토론실</th>
</tr>
</thead>
<tbody>
<tr><td colspan="10" class="blank_08"></td></tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)" style="background-color: rgb(255, 255, 255);">
<td class="no">1501</td>
<td><a href="/item/main.naver?code=610021" class="tltle">메리츠 인버스 2X 국채10년 ETN</a></td>
<td class="number">9,445</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_up.gif" width="7" height="6" style="margin-right:4px;" alt="상승"><span class="tah p11 red02">
50
</span>
</td>
<td class="number">
<span class="tah p11 red01">
+0.53%
</span>
</td>
<td class="number">0</td>
<td class="number">1</td>
<td class="number">1,000</td>
<td class="number">94</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=610021"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)" style="background-color: rgb(255, 255, 255);">
<td class="no">1502</td>
<td><a href="/item/main.naver?code=580032" class="tltle">KB 레버리지 구리 선물 ETN(H)</a></td>
<td class="number">18,875</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_down.gif" width="7" height="6" style="margin-right:4px;" alt="하락"><span class="tah p11 nv01">
195
</span>
</td>
<td class="number">
<span class="tah p11 nv01">
-1.02%
</span>
</td>
<td class="number">0</td>
<td class="number">7</td>
<td class="number">500</td>
<td class="number">94</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=580032"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)" style="background-color: rgb(255, 255, 255);">
<td class="no">1503</td>
<td><a href="/item/main.naver?code=570064" class="tltle">TRUE 인버스 베트남 VN30 선물 ETN(H)</a></td>
<td class="number">9,415</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_down.gif" width="7" height="6" style="margin-right:4px;" alt="하락"><span class="tah p11 nv01">
55
</span>
</td>
<td class="number">
<span class="tah p11 nv01">
-0.58%
</span>
</td>
<td class="number">0</td>
<td class="number">260</td>
<td class="number">1,000</td>
<td class="number">94</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=570064"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)" style="background-color: rgb(255, 255, 255);">
<td class="no">1504</td>
<td><a href="/item/main.naver?code=256450" class="tltle">ARIRANG 심천차이넥스트(합성)</a></td>
<td class="number">15,680</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_up.gif" width="7" height="6" style="margin-right:4px;" alt="상승"><span class="tah p11 red02">
5
</span>
</td>
<td class="number">
<span class="tah p11 red01">
+0.03%
</span>
</td>
<td class="number">0</td>
<td class="number">538</td>
<td class="number">600</td>
<td class="number">94</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=256450"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)" style="background-color: rgb(255, 255, 255);">
<td class="no">1505</td>
<td><a href="/item/main.naver?code=380340" class="tltle">KINDEX Fn5G플러스</a></td>
<td class="number">9,405</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_up.gif" width="7" height="6" style="margin-right:4px;" alt="상승"><span class="tah p11 red02">
100
</span>
</td>
<td class="number">
<span class="tah p11 red01">
+1.07%
</span>
</td>
<td class="number">0</td>
<td class="number">6,537</td>
<td class="number">1,000</td>
<td class="number">94</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=380340"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr><td colspan="13" class="blank_06"></td></tr>
<tr><td colspan="13" class="division_line"></td></tr>
<tr><td colspan="13" class="blank_08"></td></tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1506</td>
<td><a href="/item/main.naver?code=530087" class="tltle">삼성 KRX 2차전지 K-뉴딜 ETN</a></td>
<td class="number">9,335</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_down.gif" width="7" height="6" style="margin-right:4px;" alt="하락"><span class="tah p11 nv01">
60
</span>
</td>
<td class="number">
<span class="tah p11 nv01">
-0.64%
</span>
</td>
<td class="number">0</td>
<td class="number">15</td>
<td class="number">1,000</td>
<td class="number">93</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=530087"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1507</td>
<td><a href="/item/main.naver?code=152500" class="tltle">KINDEX 레버리지</a></td>
<td class="number">9,320</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_down.gif" width="7" height="6" style="margin-right:4px;" alt="하락"><span class="tah p11 nv01">
5
</span>
</td>
<td class="number">
<span class="tah p11 nv01">
-0.05%
</span>
</td>
<td class="number">0</td>
<td class="number">4,547</td>
<td class="number">1,000</td>
<td class="number">93</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=152500"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1508</td>
<td><a href="/item/main.naver?code=407300" class="tltle">HANARO Fn골프테마</a></td>
<td class="number">9,270</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_down.gif" width="7" height="6" style="margin-right:4px;" alt="하락"><span class="tah p11 nv01">
40
</span>
</td>
<td class="number">
<span class="tah p11 nv01">
-0.43%
</span>
</td>
<td class="number">0</td>
<td class="number">7,021</td>
<td class="number">1,000</td>
<td class="number">93</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=407300"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1509</td>
<td><a href="/item/main.naver?code=500012" class="tltle">신한 인버스 달러인덱스 선물 ETN(H)</a></td>
<td class="number">9,225</td>
<td class="number">
<span class="tah p11">0</span>
</td>
<td class="number">
<span class="tah p11">0.00%</span>
</td>
<td class="number">0</td>
<td class="number">0</td>
<td class="number">1,000</td>
<td class="number">92</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=500012"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1510</td>
<td><a href="/item/main.naver?code=227830" class="tltle">ARIRANG 코스피</a></td>
<td class="number">30,720</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_down.gif" width="7" height="6" style="margin-right:4px;" alt="하락"><span class="tah p11 nv01">
15
</span>
</td>
<td class="number">
<span class="tah p11 nv01">
-0.05%
</span>
</td>
<td class="number">0</td>
<td class="number">44</td>
<td class="number">300</td>
<td class="number">92</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=227830"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr><td colspan="13" class="blank_06"></td></tr>
<tr><td colspan="13" class="division_line"></td></tr>
<tr><td colspan="13" class="blank_08"></td></tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1511</td>
<td><a href="/item/main.naver?code=364690" class="tltle">KODEX 혁신기술테마액티브</a></td>
<td class="number">13,060</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_up.gif" width="7" height="6" style="margin-right:4px;" alt="상승"><span class="tah p11 red02">
30
</span>
</td>
<td class="number">
<span class="tah p11 red01">
+0.23%
</span>
</td>
<td class="number">0</td>
<td class="number">977</td>
<td class="number">700</td>
<td class="number">91</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=364690"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1512</td>
<td><a href="/item/main.naver?code=189400" class="tltle">ARIRANG 글로벌MSCI(합성 H)</a></td>
<td class="number">17,870</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_down.gif" width="7" height="6" style="margin-right:4px;" alt="하락"><span class="tah p11 nv01">
115
</span>
</td>
<td class="number">
<span class="tah p11 nv01">
-0.64%
</span>
</td>
<td class="number">0</td>
<td class="number">272</td>
<td class="number">510</td>
<td class="number">91</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=189400"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1513</td>
<td><a href="/item/main.naver?code=272230" class="tltle">KINDEX 스마트밸류</a></td>
<td class="number">15,055</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_down.gif" width="7" height="6" style="margin-right:4px;" alt="하락"><span class="tah p11 nv01">
20
</span>
</td>
<td class="number">
<span class="tah p11 nv01">
-0.13%
</span>
</td>
<td class="number">0</td>
<td class="number">20</td>
<td class="number">600</td>
<td class="number">90</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=272230"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1514</td>
<td><a href="/item/main.naver?code=570023" class="tltle">TRUE 인버스 2X S&P500 선물 ETN(H)</a></td>
<td class="number">1,800</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_up.gif" width="7" height="6" style="margin-right:4px;" alt="상승"><span class="tah p11 red02">
20
</span>
</td>
<td class="number">
<span class="tah p11 red01">
+1.12%
</span>
</td>
<td class="number">0</td>
<td class="number">27,146</td>
<td class="number">5,000</td>
<td class="number">90</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=570023"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1515</td>
<td><a href="/item/main.naver?code=167860" class="tltle">KOSEF 국고채10년레버리지</a></td>
<td class="number">127,925</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_down.gif" width="7" height="6" style="margin-right:4px;" alt="하락"><span class="tah p11 nv01">
200
</span>
</td>
<td class="number">
<span class="tah p11 nv01">
-0.16%
</span>
</td>
<td class="number">0</td>
<td class="number">751</td>
<td class="number">70</td>
<td class="number">90</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=167860"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr><td colspan="13" class="blank_06"></td></tr>
<tr><td colspan="13" class="division_line"></td></tr>
<tr><td colspan="13" class="blank_08"></td></tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1516</td>
<td><a href="/item/main.naver?code=610008" class="tltle">메리츠 레버리지 국채30년 ETN</a></td>
<td class="number">8,915</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_down.gif" width="7" height="6" style="margin-right:4px;" alt="하락"><span class="tah p11 nv01">
165
</span>
</td>
<td class="number">
<span class="tah p11 nv01">
-1.82%
</span>
</td>
<td class="number">0</td>
<td class="number">1,098</td>
<td class="number">1,000</td>
<td class="number">89</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=610008"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1517</td>
<td><a href="/item/main.naver?code=005965" class="tltle">동부건설우</a></td>
<td class="number">39,400</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_up.gif" width="7" height="6" style="margin-right:4px;" alt="상승"><span class="tah p11 red02">
2,300
</span>
</td>
<td class="number">
<span class="tah p11 red01">
+6.20%
</span>
</td>
<td class="number">5,000</td>
<td class="number">5,382</td>
<td class="number">226</td>
<td class="number">89</td>
<td class="number">8.55</td>
<td class="number">N/A</td>
<td class="number">1.67</td>
<td class="center"><a href="/item/board.naver?code=005965"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1518</td>
<td><a href="/item/main.naver?code=700003" class="tltle">하나 KRX BBIG K-뉴딜 ETN</a></td>
<td class="number">8,890</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_down.gif" width="7" height="6" style="margin-right:4px;" alt="하락"><span class="tah p11 nv01">
65
</span>
</td>
<td class="number">
<span class="tah p11 nv01">
-0.73%
</span>
</td>
<td class="number">0</td>
<td class="number">5</td>
<td class="number">1,000</td>
<td class="number">89</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=700003"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1519</td>
<td><a href="/item/main.naver?code=014825" class="tltle">동원시스템즈우</a></td>
<td class="number">33,500</td>
<td class="number">
<span class="tah p11">0</span>
</td>
<td class="number">
<span class="tah p11">0.00%</span>
</td>
<td class="number">5,000</td>
<td class="number">73</td>
<td class="number">265</td>
<td class="number">89</td>
<td class="number">15.17</td>
<td class="number">N/A</td>
<td class="number">1.62</td>
<td class="center"><a href="/item/board.naver?code=014825"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1520</td>
<td><a href="/item/main.naver?code=307510" class="tltle">TIGER 의료기기</a></td>
<td class="number">19,665</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_up.gif" width="7" height="6" style="margin-right:4px;" alt="상승"><span class="tah p11 red02">
310
</span>
</td>
<td class="number">
<span class="tah p11 red01">
+1.60%
</span>
</td>
<td class="number">0</td>
<td class="number">7,758</td>
<td class="number">450</td>
<td class="number">88</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=307510"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr onmouseover="mouseOver(this)" onmouseout="mouseOut(this)">
<td class="no">1550</td>
<td><a href="/item/main.naver?code=570032" class="tltle">TRUE 인버스 2X HSCEI ETN(H)</a></td>
<td class="number">8,350</td>
<td class="number">
<img src="https://ssl.pstatic.net/imgstock/images/images4/ico_up.gif" width="7" height="6" style="margin-right:4px;" alt="상승"><span class="tah p11 red02">
95
</span>
</td>
<td class="number">
<span class="tah p11 red01">
+1.15%
</span>
</td>
<td class="number">0</td>
<td class="number">2,601</td>
<td class="number">1,000</td>
<td class="number">84</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="number">N/A</td>
<td class="center"><a href="/item/board.naver?code=570032"><img src="https://ssl.pstatic.net/imgstock/images5/ico_debatebl2.gif" width="15" height="13" alt="토론실"></a></td>
</tr>
<tr><td colspan="13" class="blank_09"></td></tr>
<tr><td colspan="13" class="division_line_1"></td></tr>
<tr><td colspan="13" class="blank_09"></td></tr>
</tbody>
</table>
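Note: The question needs some improvement, and a URL would be great for more context and a more specific solution. The example below is based on what was actually provided.
How to fix?
Select the elements you need more specifically. CSS selectors can be used for this. The following line selects all rows of the table whose caption contains "코스피" and that do not contain any th or any element with a colspan attribute: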
soup.select('table:has(caption:-soup-contains("코스피")) tr:not(:has(th, [colspan]))')
This builds the result set data from the text of each row:
data = []
for row in soup.select('table:has(caption:-soup-contains("코스피")) tr:not(:has(th, [colspan]))'):
    data.append([x.text for x in row.select('td:not(.title)')])
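If the goal is the result from the question, with the name column removed, one possible variation is sketched below. In the HTML shown, the class `tltle` sits on the `<a>` inside the name cell rather than on the `<td>`, so a selector such as `td:not(.title)` appears to leave the name column in place. The sketch skips any cell that contains `a.tltle` instead. The inline HTML is a shortened, hypothetical stand-in for the real table, and `get_text(strip=True)` is used here (an assumption, not part of the original answer) to drop the whitespace around nested `<span>` values.

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the table in the question (hypothetical sample data).
html = """
<table class="type_2">
  <caption>코스피</caption>
  <thead><tr><th>N</th><th>종목명</th><th>현재가</th><th>전일비</th></tr></thead>
  <tbody>
    <tr><td colspan="4" class="blank_08"></td></tr>
    <tr>
      <td class="no">1510</td>
      <td><a href="/item/main.naver?code=227830" class="tltle">ARIRANG 코스피</a></td>
      <td class="number">30,720</td>
      <td class="number">
        <span class="tah p11 nv01">
          15
        </span>
      </td>
    </tr>
    <tr>
      <td class="no">1511</td>
      <td><a href="/item/main.naver?code=364690" class="tltle">KODEX 혁신기술테마액티브</a></td>
      <td class="number">13,060</td>
      <td class="number">30</td>
    </tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# Keep only data rows: skip the header row (th) and spacer rows (colspan).
rows = soup.select('table.type_2 tr:not(:has(th, [colspan]))')

# Within each row, skip the cell that wraps the <a class="tltle"> name link.
data = [
    [td.get_text(strip=True) for td in row.select('td:not(:has(a.tltle))')]
    for row in rows
]
print(data)
```

Dropping the `:not(:has(a.tltle))` part keeps the name as a single cell, which avoids the names being split into several tokens the way `get_text().split()` does.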
EDIT
Based on the additional context from the URL (finance.naver.com/sise/sise_market_sum.nhn?page=31), the CSS selectors change as follows.
Get the data (the table is the only one with the class name type_2):
for row in soup.select('table.type_2 tr:not(:has(th, [colspan]))'):
    data.append([x.text for x in row.select('td:not(.title)')])
Get the headers:
list(soup.select_one('table.type_2 tr').stripped_strings)
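As a rough illustration of how the headers and the rows fit together without pandas, the sketch below pairs each header with the matching cell of every data row. The inline HTML is again a shortened, hypothetical stand-in for the real table, and the `records` name is made up for this example.

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the table in the question (hypothetical sample data).
html = """
<table class="type_2">
  <caption>코스피</caption>
  <thead><tr><th>N</th><th>종목명</th><th>현재가</th></tr></thead>
  <tbody>
    <tr><td colspan="3" class="blank_08"></td></tr>
    <tr>
      <td class="no">1517</td>
      <td><a href="/item/main.naver?code=005965" class="tltle">동부건설우</a></td>
      <td class="number">39,400</td>
    </tr>
    <tr><td colspan="3" class="division_line"></td></tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# The first <tr> of the table is the header row; collect its text pieces.
headers = list(soup.select_one('table.type_2 tr').stripped_strings)

# One dict per data row, keyed by header text.
records = [
    dict(zip(headers, (td.get_text(strip=True) for td in row.select('td'))))
    for row in soup.select('table.type_2 tr:not(:has(th, [colspan]))')
]
print(records)
```

Because `zip` stops at the shorter sequence, a row with fewer cells than headers simply yields fewer keys instead of raising an error.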
Example (+ creating a DataFrame from the data)
import requests
import pandas as pd
from bs4 import BeautifulSoup
html = requests.get('https://finance.naver.com/sise/sise_market_sum.nhn?page=31').text
soup = BeautifulSoup(html, 'lxml')
data = []
for row in soup.select('table.type_2 tr:not(:has(th, [colspan]))'):
    data.append([x.text for x in row.select('td:not(.title)')])
pd.DataFrame(data, columns=list(soup.select_one('table.type_2 tr').stripped_strings))
Output
N | 종목명 | 현재가 | 전일비 | 등락률 | 액면가 | 거래량 | 상장주식수 | 시가총액 | PER | ROE | PBR | 토론실 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1501 | 메리츠 인버스 2X 국채10년 ETN | 9,445 | 50 | +0.53% | 0 | 1 | 1,000 | 94 | N/A | N/A | N/A | |
1502 | KB 레버리지 구리 선물 ETN(H) | 18,875 | 195 | -1.02% | 0 | 7 | 500 | 94 | N/A | N/A | N/A | |
1503 | TRUE 인버스 베트남 VN30 선물 ETN(H) | 9,415 | 55 | -0.58% | 0 | 260 | 1,000 | 94 | N/A | N/A | N/A | |