Convert string URLs to HTML in R

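I have a simple HTML table that contains URLs as plain strings.

I want to find each URL and replace it with an HTML anchor tag that points to the same URL.

For example, https://waterdata.usgs.gov/ct/nwis/uv?site_no=01127500&emsp

would be replaced with "<a href='https://waterdata.usgs.gov/ct/nwis/uv?site_no=01127500&emsp' target='_blank'>https://waterdata.usgs.gov/ct/nwis/uv?site_no=01127500&emsp</a>"

How can I do this across the whole document? I have looked at stringr and regular expressions, and I also tried Python, but I could not figure it out.

Here is the sample table: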

"<div class='scrollableContainer'>
  <table class= id='popup'>
    <tr><td></td><th>STAID&emsp;</th><td>01127500&emsp;</td></tr><tr><td></td><th>STANAME&emsp;</th><td>YANTIC RIVER AT YANTIC, CT&emsp;</td></tr><tr><td></td><th>ST&emsp;</th><td>CT&emsp;</td></tr><tr><td></td><th>HUC&emsp;</th><td>01100003&emsp;</td></tr><tr><td></td><th>CLASS&emsp;</th><td>7&emsp;</td></tr><tr><td></td><th>FLOW&emsp;</th><td>  857.0&emsp;</td></tr><tr><td></td><th>STAGE&emsp;</th><td> 4.86&emsp;</td></tr><tr><td></td><th>TIME&emsp;</th><td>2020-12-01 10:45:00&emsp;</td></tr><tr><td></td><th>TIME_UTC&emsp;</th><td>2020-12-01 15:45:00&emsp;</td></tr><tr><td></td><th>URL&emsp;</th><td>https://waterdata.usgs.gov/ct/nwis/uv?site_no=01127500&emsp;</td></tr><tr><td></td><th>DATUM&emsp;</th><td>NAD83&emsp;</td></tr><tr><td></td><th>COUNT&emsp;</th><td> 89&emsp;</td></tr><tr><td></td><th>PERCENTILE&emsp;</th><td>97.19&emsp;</td></tr><tr><td></td><th>FLOODSTAGE&emsp;</th><td>0&emsp;</td></tr></table></div>"
[2] "<div class='scrollableContainer'><table class= id='popup'><tr><td></td><th>STAID&emsp;</th><td>01578310&emsp;</td></tr><tr><td></td><th>STANAME&emsp;</th><td>SUSQUEHANNA RIVER AT CONOWINGO, MD&emsp;</td></tr><tr><td></td><th>ST&emsp;</th><td>MD&emsp;</td></tr><tr><td></td><th>HUC&emsp;</th><td>02050306&emsp;</td></tr><tr><td></td><th>CLASS&emsp;</th><td>4&emsp;</td></tr><tr><td></td><th>FLOW&emsp;</th><td>20100.0&emsp;</td></tr><tr><td></td><th>STAGE&emsp;</th><td>11.63&emsp;</td></tr><tr><td></td><th>TIME&emsp;</th><td>2020-12-01 10:30:00&emsp;</td></tr><tr><td></td><th>TIME_UTC&emsp;</th><td>2020-12-01 15:30:00&emsp;</td></tr><tr><td></td><th>URL&emsp;</th><td>https://waterdata.usgs.gov/md/nwis/uv?site_no=01578310&emsp;</td></tr><tr><td></td><th>DATUM&emsp;</th><td>NAD83&emsp;</td></tr><tr><td></td><th>COUNT&emsp;</th><td> 52&emsp;</td></tr><tr><td></td><th>PERCENTILE&emsp;</th><td>12.43&emsp;</td></tr><tr><td></td><th>FLOODSTAGE&emsp;</th><td>0&emsp;</td></tr></table></div>"
[3] "<div class='scrollableContainer'><table class= id='popup'><tr><td></td><th>STAID&emsp;</th><td>02035000&emsp;</td></tr><tr><td></td><th>STANAME&emsp;</th><td>JAMES RIVER AT CARTERSVILLE, VA&emsp;</td></tr><tr><td></td><th>ST&emsp;</th><td>VA&emsp;</td></tr><tr><td></td><th>HUC&emsp;</th><td>02080205&emsp;</td></tr><tr><td></td><th>CLASS&emsp;</th><td>7&emsp;</td></tr><tr><td></td><th>FLOW&emsp;</th><td>46500.0&emsp;</td></tr><tr><td></td><th>STAGE&emsp;</th><td>15.70&emsp;</td></tr><tr><td></td><th>TIME&emsp;</th><td>2020-12-01 10:45:00&emsp;</td></tr><tr><td></td><th>TIME_UTC&emsp;</th><td>2020-12-01 15:45:00&emsp;</td></tr><tr><td></td><th>URL&emsp;</th><td>https://waterdata.usgs.gov/va/nwis/uv?site_no=02035000&emsp;</td></tr><tr><td></td><th>DATUM&emsp;</th><td>NAD83&emsp;</td></tr><tr><td></td><th>COUNT&emsp;</th><td>121&emsp;</td></tr><tr><td></td><th>PERCENTILE&emsp;</th><td>97.02&emsp;</td></tr><tr><td></td><th>FLOODSTAGE&emsp;</th><td>0&emsp;</td></tr></table></div>"
[4] "<div class='scrollableContainer'><table class= id='popup'><tr><td></td><th>STAID&emsp;</th><td>02198690&emsp;</td></tr><tr><td></td><th>STANAME&emsp;</th><td>EBENEZER CREEK AT SPRINGFIELD, GA&emsp;</td></tr><tr><td></td><th>ST&emsp;</th><td>GA&emsp;</td></tr><tr><td></td><th>HUC&emsp;</th><td>03060109&emsp;</td></tr><tr><td></td><th>CLASS&emsp;</th><td>5&emsp;</td></tr><tr><td></td><th>FLOW&emsp;</th><td>   26.3&emsp;</td></tr><tr><td></td><th>STAGE&emsp;</th><td> 4.95&emsp;</td></tr><tr><td></td><th>TIME&emsp;</th><td>2020-12-01 10:00:00&emsp;</td></tr><tr><td></td><th>TIME_UTC&emsp;</th><td>2020-12-01 15:00:00&emsp;</td></tr><tr><td></td><th>URL&emsp;</th><td>https://waterdata.usgs.gov/ga/nwis/uv?site_no=02198690&emsp;</td></tr><tr><td></td><th>DATUM&emsp;</th><td>NAD83&emsp;</td></tr><tr><td></td><th>COUNT&emsp;</th><td> 30&emsp;</td></tr><tr><td></td><th>PERCENTILE&emsp;</th><td>47.56&emsp;</td></tr><tr><td></td><th>FLOODSTAGE&emsp;</th><td>0&emsp;</td></tr></table></div>"
[5] "<div class='scrollableContainer'><table class= id='popup'><tr><td></td><th>STAID&emsp;</th><td>02323000&emsp;</td></tr><tr><td></td><th>STANAME&emsp;</th><td>SUWANNEE RIVER NEAR BELL, FLORIDA&emsp;</td></tr><tr><td></td><th>ST&emsp;</th><td>FL&emsp;</td></tr><tr><td></td><th>HUC&emsp;</th><td>03110205&emsp;</td></tr><tr><td></td><th>CLASS&emsp;</th><td>5&emsp;</td></tr><tr><td></td><th>FLOW&emsp;</th><td> 5020.0&emsp;</td></tr><tr><td></td><th>STAGE&emsp;</th><td> 7.04&emsp;</td></tr><tr><td></td><th>TIME&emsp;</th><td>2020-12-01 10:00:00&emsp;</td></tr><tr><td></td><th>TIME_UTC&emsp;</th><td>2020-12-01 15:00:00&emsp;</td></tr><tr><td></td><th>URL&emsp;</th><td>https://waterdata.usgs.gov/fl/nwis/uv?site_no=02323000&emsp;</td></tr><tr><td></td><th>DATUM&emsp;</th><td>NAD83&emsp;</td></tr><tr><td></td><th>COUNT&emsp;</th><td> 45&emsp;</td></tr><tr><td></td><th>PERCENTILE&emsp;</th><td>52.62&emsp;</td></tr><tr><td></td><th>FLOODSTAGE&emsp;</th><td>0&emsp;</td></tr></table></div>"
[6] "<div class='scrollableContainer'><table class= id='popup'><tr><td></td><th>STAID&emsp;</th><td>01638500&emsp;</td></tr><tr><td></td><th>STANAME&emsp;</th><td>POTOMAC RIVER AT POINT OF ROCKS, MD&emsp;</td></tr><tr><td></td><th>ST&emsp;</th><td>MD&emsp;</td></tr><tr><td></td><th>HUC&emsp;</th><td>02070008&emsp;</td></tr><tr><td></td><th>CLASS&emsp;</th><td>5&emsp;</td></tr><tr><td></td><th>FLOW&emsp;</th><td> 4940.0&emsp;</td></tr><tr><td></td><th>STAGE&emsp;</th><td> 2.18&emsp;</td></tr><tr><td></td><th>TIME&emsp;</th><td>2020-12-01 10:45:00&emsp;</td></tr><tr><td></td><th>TIME_UTC&emsp;</th><td>2020-12-01 15:45:00&emsp;</td></tr><tr><td></td><th>URL&emsp;</th><td>https://waterdata.usgs.gov/md/nwis/uv?site_no=01638500&emsp;</td></tr><tr><td></td><th>DATUM&emsp;</th><td>NAD83&emsp;</td></tr><tr><td></td><th>COUNT&emsp;</th><td>125&emsp;</td></tr><tr><td></td><th>PERCENTILE&emsp;</th><td>50.94&emsp;</td></tr><tr><td></td><th>FLOODSTAGE&emsp;</th><td>0&emsp;</td></tr></table></div>"

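You can do this either with a plain regular-expression replacement or with the htmltools::tags$a function, which builds the tag for you. In an R string the backreference is written "\\1" and an embedded double quote is written \":

library(stringr)
str_replace_all(tables, "(https?://.+?)(?=<)", "<a href=\"\\1\">\\1</a>")

# vapply keeps this working whether the function receives one match or several at once
str_replace_all(tables, "https?://.+?(?=<)", function(x) vapply(x, function(u) as.character(htmltools::tags$a(u, href = u)), character(1), USE.NAMES = FALSE))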
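
For a quick check without any packages, here is a minimal base R sketch of the same regex idea. It adds the target='_blank' attribute asked for in the question. The two-element `tables` vector is a shortened, hypothetical stand-in for the real data, and the second pattern is only an assumption about what you might want if the trailing &emsp; should stay outside the link:

```r
# Hypothetical, shortened stand-in for the `tables` vector in the question.
tables <- c(
  "<table id='popup'><tr><th>URL&emsp;</th><td>https://waterdata.usgs.gov/ct/nwis/uv?site_no=01127500&emsp;</td></tr></table>",
  "<table id='popup'><tr><th>ST&emsp;</th><td>CT&emsp;</td></tr></table>"
)

# "\\1" refers to the captured URL; single quotes avoid escaping inside the R string.
replacement <- "<a href='\\1' target='_blank'>\\1</a>"

# Lazy match up to (not including) the next "<". The trailing "&emsp;" ends up
# inside the link, as in the question's example. perl = TRUE enables the lookahead.
linked <- gsub("(https?://.+?)(?=<)", replacement, tables, perl = TRUE)

# Variant (an assumption, not from the question): stop before "&emsp;" as well,
# so the entity stays outside the anchor tag.
linked_trimmed <- gsub("(https?://.+?)(?=&emsp;|<)", replacement, tables, perl = TRUE)

cat(linked, sep = "\n")
cat(linked_trimmed, sep = "\n")
```

Elements without a URL, such as the second one here, are returned unchanged, so the same call can be applied to the whole vector.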