What are reliable approaches for finding words/text within an HTML-markup's text-content and replacing the matches with highlighting markup?

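I have some text. I have a function that receives a word or phrase, and I have to return the same text, but with a span carrying a class wrapped around that keyword or phrase.

Example:

If I have this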

text = <a href="/redirect?uri=https%3A%2F%2Fwww.website.com&context=post" target="_blank" rel="noopener noreferrer">https://www.website.com</a>

I want

text = <a href="/redirect?uri=https%3A%2F%2Fwww.website.com&context=post" target="_blank" rel="noopener noreferrer">https://www.<span class="bold">website</span>.com</a>

But what I get is

text = <a href="/redirect?uri=https%3A%2F%2Fwww.<span class="bold"> website </span>.com&amp;context=post" target="_blank" rel="noopener noreferrer">https://www.<span class="bold"> website </span>.com</a>

What I'm doing is

        ...
        const escapedPhrases = ["\\bwebsite\\b"]
        const regex = new RegExp(`(${escapedPhrases.join('|')})`, 'gi');
        text = text.replace(
          regex,
          '<span class="bold"> $1 </span>'
        );

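How can I improve my regex?

In addition, I tried to "clean up" the text after the replacement by removing the <span class="bold"> </span> wherever it ended up inside the href, but without success.

Update:

I have this text: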

text = `Follow me on 
<a href="/redirect?uri=https%3A%2F%2Fwww.twitter.com&context=post" target="_blank" rel="noopener noreferrer">https://www.twitter.com</a>

Thanks!`

Example 1: I want to highlight the word twitter:

For this, I want to add a span with the class bold around twitter:

text = `Follow me on 
<a href="/redirect?uri=https%3A%2F%2Fwww.twitter.com&context=post" target="_blank" rel="noopener noreferrer">https://www.<span class="bold">twitter</span>.com</a>

Thanks!`

Example 2: I want to highlight twitter.com:

For this, I want to add a span with the class bold around twitter.com:

text = `Follow me on 
<a href="/redirect?uri=https%3A%2F%2Fwww.twitter.com&context=post" target="_blank" rel="noopener noreferrer">https://www.<span class="bold">twitter.com</span></a>

Thanks!`

Example 3: I want to highlight https://twitter.com/:

For this, I want to add a span with the class bold around https://twitter.com/:

text = `Follow me on 
<a href="/redirect?uri=https%3A%2F%2Fwww.twitter.com&context=post" target="_blank" rel="noopener noreferrer"><span class="bold">https://www.twitter.com</span></a>

Thanks!`

Example 4:

I have this text and want to highlight twitter:

text = `Follow me on 
<a href="/redirect?uri=https%3A%2F%2Fwww.twitter.com&context=post" target="_blank" rel="noopener noreferrer">https://www.twitter.com</a>

Thanks for follow my twitter!`

Then I have to return

text = `Follow me on 
<a href="/redirect?uri=https%3A%2F%2Fwww.twitter.com&context=post" target="_blank" rel="noopener noreferrer">https://www.<span class="bold">twitter</span>.com</a>

Thanks for follow my <span class="bold">twitter</span>!`

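Regular expressions are not the answer to every problem. In this case you want to modify only the textContent and leave the attributes alone. The following code may do what you need: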

let text = `Follow me on 
<a href="/redirect?uri=https%3A%2F%2Fwww.twitter.com&context=post" target="_blank" rel="noopener noreferrer">https://www.twitter.com</a>

Thanks for follow my twitter!`;

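const replaceKeyword = (keyword, text) => {
  // parse the markup into an inert fragment; nothing is rendered yet.
  let template = document.createElement('template');
  template.innerHTML = text;
  let children = template.content.childNodes;
  
  let str = '';
  let substitute = `<span style='color:red;font-weight:bold;'>${keyword}</span>`;
  // only top-level nodes are visited, and only the first occurrence
  // of the keyword within each node gets replaced.
  for (let child of children){
    if (child.nodeType === 3){
      // #text
      str += child.textContent.replace(keyword, substitute);
    } else if (child.nodeType === 1) {
      // element: attributes stay untouched, but any markup
      // nested inside the element is flattened to plain text.
      let nodeStr = child.textContent.replace(keyword, substitute);
      child.innerHTML = nodeStr;
      str += child.outerHTML;
    }
  }
  return str;
}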

let result = replaceKeyword('twitter', text);
console.log(result);
document.body.innerHTML = result;

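By adding the latest feature to the requirements, the OP changed the game entirely. We are now talking about a full-text search within the text content of HTML markup.

Similar to ...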

  • How to highlight the search-result of a text-query within an html document ignoring the html tags?
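  • ... or ... How to query text nodes from the DOM, find markdown patterns, replace the matches with HTML markup, and replace the original text node with the new content?

... the last two provide different but generic approaches based on DOM nodes / text nodes.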

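As for the OP's problem: with a requirement like finding a text query within the text content of HTML code, one cannot stick with a simple solution. Nested markup now has to be taken into account.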
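
To illustrate why a plain string replacement falls short, here is a minimal sketch. The shortened `html` sample is an assumption made for brevity, not the OP's exact markup. It treats the markup as one flat string, so the attribute value gets matched as well:

```javascript
// Naive approach: the whole markup is handled as one flat string.
const html =
  '<a href="https://www.twitter.com">https://www.twitter.com</a>';

// `$&` inserts the matched text into the replacement.
const naive = html.replace(/twitter/g, '<span class="bold">$&</span>');

// Both the href value and the link text were rewritten; the href is now broken.
console.log(naive);
```

A regex has no notion of where an attribute ends and text content begins, which is why the approach below works on text nodes instead.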

Wrapping each search result in special markup has to start with collecting every text node from the DOM fragment, which in turn has to be parsed from the passed HTML code first.

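On that basis one can no longer just fire off a regex-based String.replace. Each text node that partially matches the search query now has to be replaced/reassembled from its non-matching text content and from the matching parts, which become element nodes because of the additional markup.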
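
The reassembly step can be sketched without any DOM at all. The helper name `splitIntoPieces` is hypothetical, and the sketch assumes the query contains no regex-specific characters. It relies on the fact that `String.prototype.split` keeps the matches in its result when the regex has a capturing group:

```javascript
// Hypothetical helper: split a text into plain and matching pieces.
// Assumption: `query` contains no regex-specific characters.
function splitIntoPieces(text, query) {
  // No `g` flag: `split` splits on every match anyway, and a
  // non-global regex keeps `test` free of `lastIndex` state.
  const regX = new RegExp(`(${query})`, 'i');

  return text
    .split(regX)                    // the capture group keeps the matches
    .filter(piece => piece !== '')  // drop empty leading/trailing pieces
    .map(piece => ({ text: piece, isMatch: regX.test(piece) }));
}

const pieces = splitIntoPieces('https://www.twitter.com/', 'twitter');
console.log(pieces);
```

Each piece flagged with `isMatch` would become an element node, every other piece a new text node.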

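Thus, starting from the OP's last requirement change alone, one has to provide a generic full-text search and highlight approach, which additionally has to sanitize/handle white-space sequences and regex-specific characters in the provided search query ...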
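
The query sanitizing can be looked at in isolation. This is a minimal sketch; the helper name `toSearchPattern` and the sample strings are assumptions for illustration only:

```javascript
// Hypothetical helper: turn a raw search query into a regex source string.
function toSearchPattern(query) {
  return query
    .trim()
    // escape every character that has a special meaning inside a regex.
    .replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
    // let any white-space run of the query match any white-space run of the text.
    .replace(/\s+/g, '\\s+');
}

const pattern = toSearchPattern('follow my  twitter.com');
const regX = new RegExp(pattern, 'i');

// line breaks and repeated blanks in the text do not prevent a match.
const matchesAcrossLines = regX.test('Thanks\nfor follow \n  my   Twitter.com!');
// the escaped dot matches a literal dot only.
const matchesWildcard = regX.test('follow my twitterXcom');

console.log(pattern, matchesAcrossLines, matchesWildcard);
```

Without the escaping step, a query such as twitter.com would also match text like twitterXcom, since an unescaped dot matches any character.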

// node detection helpers.
function isElementNode(node) {
  return (node && (node.nodeType === 1));
}
function isNonEmptyTextNode(node) {
  return (
        node
    && (node.nodeType === 3)
    && (node.nodeValue.trim() !== '')
    && (node.parentNode.tagName.toLowerCase() !== 'script')
  );
}

// dom node render helper.
function insertNodeAfter(node, referenceNode) {
  const { parentNode, nextSibling } = referenceNode;
  if (nextSibling !== null) {
    node = parentNode.insertBefore(node, nextSibling);
  } else {
    node = parentNode.appendChild(node);
  }
  return node;
}

// text node reducer functionality.
function collectNonEmptyTextNode(list, node) {
  if (isNonEmptyTextNode(node)) {
    list.push(node);
  }
  return list;
}
function collectTextNodeList(list, elmNode) {
  return Array.from(
    elmNode.childNodes
  ).reduce(
    collectNonEmptyTextNode,
    list
  );
}
function getTextNodeList(rootNode) {
  rootNode = (isElementNode(rootNode) && rootNode) || document.body;

  const elementNodeList = Array.from(
    rootNode.getElementsByTagName('*')
  );
  elementNodeList.unshift(rootNode);

  return elementNodeList.reduce(collectTextNodeList, []);
}


// search result emphasizing functionality.

function createSearchMatch(text) {
  const elmMatch = document.createElement('strong');

  // elmMatch.classList.add("bold");
  elmMatch.textContent = text;

  return elmMatch;
}
function aggregateSearchResult(collector, text, idx) {
  const { previousNode, regXSearch } = collector;

  const currentNode = regXSearch.test(text)
    ? createSearchMatch(text)
    : document.createTextNode(text);

  if (idx === 0) {
    previousNode.parentNode.replaceChild(currentNode, previousNode);
  } else {
    insertNodeAfter(currentNode, previousNode);
  }
  collector.previousNode = currentNode;

  return collector;
}
function emphasizeTextContentMatch(textNode, regXSearch) {
  // console.log(regXSearch);
  textNode.textContent
    .split(regXSearch)
    .filter(text => text !== '')
    .reduce(aggregateSearchResult, {
      previousNode: textNode,
      regXSearch,
    });
}


function emphasizeEveryTextContentMatch(htmlCode, searchValue, isIgnoreCase) {
  searchValue = searchValue.trim();
  if (searchValue !== '') {

    const replacementNode = document.createElement('div');
    replacementNode.innerHTML = htmlCode;

    const regXSearchString = searchValue
      // escaping of regex specific characters.
      .replace((/[.*+?^${}()|[\]\\]/g), '\\$&')
      // additional escaping of whitespace (sequences).
      .replace((/\s+/g), '\\s+');

    // no `g` flag: `split` splits on every match regardless, and a
    // non-global regex keeps the later `test` calls free of `lastIndex` state.
    const regXFlags = isIgnoreCase ? 'i' : '';
    const regXSearch = RegExp(`(${ regXSearchString })`, regXFlags);

    getTextNodeList(replacementNode).forEach(textNode =>
      emphasizeTextContentMatch(textNode, regXSearch)
    );
    htmlCode = replacementNode.innerHTML;
  }
  return htmlCode;
}


const htmlLinkList = [
  emphasizeEveryTextContentMatch(
    'Follow me on <a href="/redirect?uri=https%3A%2F%2Fwww.twitter.com&context=post" target="_blank" rel="noopener noreferrer">https://www.twitter.com/</a> Thanks!',
    'twitter'
  ),
  emphasizeEveryTextContentMatch(
    'Follow me on <a href="/redirect?uri=https%3A%2F%2Fwww.twitter.com&context=post" target="_blank" rel="noopener noreferrer">https://www.twitter.com/</a> Thanks!',
    'twitter.com'
  ),
  emphasizeEveryTextContentMatch(
    'Follow me on <a href="/redirect?uri=https%3A%2F%2Fwww.twitter.com&context=post" target="_blank" rel="noopener noreferrer">https://www.twitter.com/</a> Thanks!',
    'https://www.twitter.com/'
  ),
  emphasizeEveryTextContentMatch(
    'Follow me on <a href="/redirect?uri=https%3A%2F%2Fwww.twitter.com&context=post" target="_blank" rel="noopener noreferrer">https://www.twitter.com/</a> Thanks for follow my Twitter!',
    'TWITTER',
    true
  ),
  emphasizeEveryTextContentMatch(
    `Follow me on <a href="/redirect?uri=https%3A%2F%2Fwww.twitter.com&context=post" target="_blank" rel="noopener noreferrer">https://www.twitter.com/</a>
    Thanks
    for follow 
    my   Twitter!`,
    'follow my twitter',
    true
  ),
];
document.body.innerHTML = htmlLinkList.join('<br/>');

const container = document.createElement('code');

container.textContent = emphasizeEveryTextContentMatch(
  'Follow me on <a href="/redirect?uri=https%3A%2F%2Fwww.twitter.com&context=post" target="_blank" rel="noopener noreferrer">https://www.twitter.com/</a> Thanks for follow my Twitter!',
  'TWITTER',
  true
);
document.body.appendChild(container.cloneNode(true));

container.textContent = emphasizeEveryTextContentMatch(
  `Follow me on <a href="/redirect?uri=https%3A%2F%2Fwww.twitter.com&context=post" target="_blank" rel="noopener noreferrer">https://www.twitter.com/</a>
  Thanks
  for follow 
  my   Twitter!`,
  'follow my twitter',
  true
);
document.body.appendChild(container.cloneNode(true));
/* CSS accompanying the snippet above */
code {
  display: block;
  margin: 10px 0;
  padding: 0;
}
a strong {
  font-weight: bold;
}