Convert string URLs to HTML in R
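I have a simple HTML table that contains URLs as plain strings.
I want to find each URL and replace it with an HTML anchor tag pointing to the same URL.
For example, https://waterdata.usgs.gov/ct/nwis/uv?site_no=01127500&emsp
would be replaced with "<a href='https://waterdata.usgs.gov/ct/nwis/uv?site_no=01127500&emsp' target='_blank'>https://waterdata.usgs.gov/ct/nwis/uv?site_no=01127500&emsp</a>"
How can I do this across the whole document? I have looked at stringr and regular expressions, and tried using Python, but could not figure it out.
Here is the sample table: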
"<div class='scrollableContainer'>
<table class= id='popup'>
<tr><td></td><th>STAID </th><td>01127500 </td></tr><tr><td></td><th>STANAME </th><td>YANTIC RIVER AT YANTIC, CT </td></tr><tr><td></td><th>ST </th><td>CT </td></tr><tr><td></td><th>HUC </th><td>01100003 </td></tr><tr><td></td><th>CLASS </th><td>7 </td></tr><tr><td></td><th>FLOW </th><td> 857.0 </td></tr><tr><td></td><th>STAGE </th><td> 4.86 </td></tr><tr><td></td><th>TIME </th><td>2020-12-01 10:45:00 </td></tr><tr><td></td><th>TIME_UTC </th><td>2020-12-01 15:45:00 </td></tr><tr><td></td><th>URL </th><td>https://waterdata.usgs.gov/ct/nwis/uv?site_no=01127500 </td></tr><tr><td></td><th>DATUM </th><td>NAD83 </td></tr><tr><td></td><th>COUNT </th><td> 8vdssDcvSDCs9 </td></tr><tr><td></td><th>PERCENTILE </th><td>97.19 </td></tr><tr><td></td><th>FLOODSTAGE </th><td>0 </td></tr></table></div>"
[2] "<div class='scrollableContainer'><table class= id='popup'><tr><td></td><th>STAID </th><td>01578310 </td></tr><tr><td></td><th>STANAME </th><td>SUSQUEHANNA RIVER AT CONOWINGO, MD </td></tr><tr><td></td><th>ST </th><td>MD </td></tr><tr><td></td><th>HUC </th><td>02050306 </td></tr><tr><td></td><th>CLASS </th><td>4 </td></tr><tr><td></td><th>FLOW </th><td>20100.0 </td></tr><tr><td></td><th>STAGE </th><td>11.63 </td></tr><tr><td></td><th>TIME </th><td>2020-12-01 10:30:00 </td></tr><tr><td></td><th>TIME_UTC </th><td>2020-12-01 15:30:00 </td></tr><tr><td></td><th>URL </th><td>https://waterdata.usgs.gov/md/nwis/uv?site_no=01578310 </td></tr><tr><td></td><th>DATUM </th><td>NAD83 </td></tr><tr><td></td><th>COUNT </th><td> 5vdssDcvSDCs2 </td></tr><tr><td></td><th>PERCENTILE </th><td>12.43 </td></tr><tr><td></td><th>FLOODSTAGE </th><td>0 </td></tr></table></div>"
[3] "<div class='scrollableContainer'><table class= id='popup'><tr><td></td><th>STAID </th><td>02035000 </td></tr><tr><td></td><th>STANAME </th><td>JAMES RIVER AT CARTERSVILLE, VA </td></tr><tr><td></td><th>ST </th><td>VA </td></tr><tr><td></td><th>HUC </th><td>02080205 </td></tr><tr><td></td><th>CLASS </th><td>7 </td></tr><tr><td></td><th>FLOW </th><td>46500.0 </td></tr><tr><td></td><th>STAGE </th><td>15.70 </td></tr><tr><td></td><th>TIME </th><td>2020-12-01 10:45:00 </td></tr><tr><td></td><th>TIME_UTC </th><td>2020-12-01 15:45:00 </td></tr><tr><td></td><th>URL </th><td>https://waterdata.usgs.gov/va/nwis/uv?site_no=02035000 </td></tr><tr><td></td><th>DATUM </th><td>NAD83 </td></tr><tr><td></td><th>COUNT </th><td>121 </td></tr><tr><td></td><th>PERCENTILE </th><td>97.02 </td></tr><tr><td></td><th>FLOODSTAGE </th><td>0 </td></tr></table></div>"
[4] "<div class='scrollableContainer'><table class= id='popup'><tr><td></td><th>STAID </th><td>02198690 </td></tr><tr><td></td><th>STANAME </th><td>EBENEZER CREEK AT SPRINGFIELD, GA </td></tr><tr><td></td><th>ST </th><td>GA </td></tr><tr><td></td><th>HUC </th><td>03060109 </td></tr><tr><td></td><th>CLASS </th><td>5 </td></tr><tr><td></td><th>FLOW </th><td> 26.3 </td></tr><tr><td></td><th>STAGE </th><td> 4.95 </td></tr><tr><td></td><th>TIME </th><td>2020-12-01 10:00:00 </td></tr><tr><td></td><th>TIME_UTC </th><td>2020-12-01 15:00:00 </td></tr><tr><td></td><th>URL </th><td>https://waterdata.usgs.gov/ga/nwis/uv?site_no=02198690 </td></tr><tr><td></td><th>DATUM </th><td>NAD83 </td></tr><tr><td></td><th>COUNT </th><td> 3vdssDcvSDCs0 </td></tr><tr><td></td><th>PERCENTILE </th><td>47.56 </td></tr><tr><td></td><th>FLOODSTAGE </th><td>0 </td></tr></table></div>"
[5] "<div class='scrollableContainer'><table class= id='popup'><tr><td></td><th>STAID </th><td>02323000 </td></tr><tr><td></td><th>STANAME </th><td>SUWANNEE RIVER NEAR BELL, FLORIDA </td></tr><tr><td></td><th>ST </th><td>FL </td></tr><tr><td></td><th>HUC </th><td>03110205 </td></tr><tr><td></td><th>CLASS </th><td>5 </td></tr><tr><td></td><th>FLOW </th><td> 5020.0 </td></tr><tr><td></td><th>STAGE </th><td> 7.04 </td></tr><tr><td></td><th>TIME </th><td>2020-12-01 10:00:00 </td></tr><tr><td></td><th>TIME_UTC </th><td>2020-12-01 15:00:00 </td></tr><tr><td></td><th>URL </th><td>https://waterdata.usgs.gov/fl/nwis/uv?site_no=02323000 </td></tr><tr><td></td><th>DATUM </th><td>NAD83 </td></tr><tr><td></td><th>COUNT </th><td> 4vdssDcvSDCs5 </td></tr><tr><td></td><th>PERCENTILE </th><td>52.62 </td></tr><tr><td></td><th>FLOODSTAGE </th><td>0 </td></tr></table></div>"
[6] "<div class='scrollableContainer'><table class= id='popup'><tr><td></td><th>STAID </th><td>01638500 </td></tr><tr><td></td><th>STANAME </th><td>POTOMAC RIVER AT POINT OF ROCKS, MD </td></tr><tr><td></td><th>ST </th><td>MD </td></tr><tr><td></td><th>HUC </th><td>02070008 </td></tr><tr><td></td><th>CLASS </th><td>5 </td></tr><tr><td></td><th>FLOW </th><td> 4940.0 </td></tr><tr><td></td><th>STAGE </th><td> 2.18 </td></tr><tr><td></td><th>TIME </th><td>2020-12-01 10:45:00 </td></tr><tr><td></td><th>TIME_UTC </th><td>2020-12-01 15:45:00 </td></tr><tr><td></td><th>URL </th><td>https://waterdata.usgs.gov/md/nwis/uv?site_no=01638500 </td></tr><tr><td></td><th>DATUM </th><td>NAD83 </td></tr><tr><td></td><th>COUNT </th><td>125 </td></tr><tr><td></td><th>PERCENTILE </th><td>50.94 </td></tr><tr><td></td><th>FLOODSTAGE </th><td>0 </td></tr></table></div>"
You can do this with a simple regular expression replacement, or by using the htmltools::tags$a function, which creates the tag for you:
library(stringr)  # provides str_replace_all()
str_replace_all(tables, "(https?://.+?)(?=<)", "<a href=\"\\1\">\\1</a>")
str_replace_all(tables, "https?://.+?(?=<)", function(.x) as.character(htmltools::tags$a(.x, href = .x)))
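The output requested in the question also carries target='_blank' and single-quoted attributes, and the sample cells have a space between the URL and the closing </td>. The following is a minimal sketch of one way to cover both points. It swaps in base R's gsub() with perl = TRUE in place of stringr so that it runs without extra packages. The shortened `tables` vector and the helper name `linkify_urls` are assumptions made for illustration only.

```r
# Hypothetical sample input: two fragments shaped like the URL rows in the question.
tables <- c(
  "<tr><th>URL </th><td>https://waterdata.usgs.gov/ct/nwis/uv?site_no=01127500 </td></tr>",
  "<tr><th>URL </th><td>https://waterdata.usgs.gov/md/nwis/uv?site_no=01578310 </td></tr>"
)

# linkify_urls() is an illustrative helper name, not part of any package.
# The character class stops the match at whitespace, "<", ">", or a quote,
# so the space before </td> stays outside the link.
linkify_urls <- function(x) {
  gsub(
    "(https?://[^\\s<>\"']+)",
    "<a href='\\1' target='_blank'>\\1</a>",
    x,
    perl = TRUE
  )
}

linked <- linkify_urls(tables)
cat(linked, sep = "\n")
```

This sketch is meant to be applied once to the raw text. Running it again on its own output would also match the URLs that are already inside href attributes and wrap them a second time.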