Have you seen Golang Regexp not find all occurrences?

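I have a sample where I am trying to remove every empty line in a file (a line holding only a newline, or only whitespace followed by a newline). I thought I could do this easily with a simple ^\s*$. The sample file is 724 lines long, and grep -e '^\s*$' samplefile | wc -l reports 182 occurrences of this pattern. With grep I can simply add the -v flag and redirect the output to get the content with the extra lines removed.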

In Go 1.12.4 I tried:

dat, _ := ioutil.ReadFile("./samplefile")
ioutil.WriteFile("eventbody", []byte(strings.Split(string(dat), "</head>")[1]), 0555)

regNewline := regexp.MustCompile(`(?ms)^\s*$`)
d := regNewline.ReplaceAll(dat, []byte(""))
ioutil.WriteFile("./emptyremoved", d, 0555)

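The resulting file still has 143 occurrences, so this does not actually do what I want. The sample file is just an HTML page. The whole reason I am doing this is that I could not get the golang.org/x/net/html package to parse the HTML and step through the tokens to get the data I want (the table rows). I worked through several issues trying to solve that, but I am still hitting a dead end.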

Sample file:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>SourceManagementboard</title>
    <style>
      @font-face {
        font-family: 'Open Sans';
        font-style: normal;
        font-weight: 400;
        src: local('Open Sans'), local('OpenSans'), url(/static/fonts/Open_Sans.woff) format('woff');
      }
      input[type=text] {
        width: 130px;
        border: 2px solid #f96302;
        -webkit-transition: width 0.4s ease-in-out;
        transition: width 0.4s ease-in-out;
      }
      input[type=text]:focus {
        width: 270px;
      }
    </style>
    
    <link href='//fonts.googleapis.com/css?family=Open+Sans' rel='stylesheet' type='text/css' />
    <link href="/static/Semantic-UI-2.1.8/semantic.min.css" rel="stylesheet" />
    <link href='//cdnjs.cloudflare.com/ajax/libs/datatables/1.10.13/css/dataTables.semanticui.min.css' rel='stylesheet' type='text/css'>
    
    <link href="/static/css/c3.min.css" rel="stylesheet" />
    <link href="/static/css/SourceManagementboard.css" rel="stylesheet" />

    
      <script src="//cdnjs.cloudflare.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
      <script src="//cdnjs.cloudflare.com/ajax/libs/datatables/1.10.13/js/jquery.dataTables.min.js"></script>
      <script src="//cdnjs.cloudflare.com/ajax/libs/datatables/1.10.13/js/dataTables.semanticui.min.js"></script>
      
        <script src="//cdnjs.cloudflare.com/ajax/libs/moment.js/2.7.0/moment.min.js"></script>
      
    
    
      <script src="/static/js/timestamps.js"></script>
    
    <script src="/static/Semantic-UI-2.1.8/semantic.min.js"></script>
    <script src="/static/js/lists.js"></script>
    <script src="/static/js/scroll.top.js"></script>
    <script src="/static/js/d3.min.js"></script>
    <script src="/static/js/c3.min.js"></script>
    <script src="/static/jquery-tablesort-v.0.0.11/jquery.tablesort.min.js"></script>
    <script src="/static/js/Chart.min.js"></script>
    <script type="text/javascript">
     
        $(document).ready(function(){
            $(".ui.dropdown").dropdown();
            $.getScript('/static/js/lists.js')
            $.getScript('/static/js/tables.js')
             
          })
    </script>
    <script type="text/javascript">
        function jumpToServerPage() {
            var myserver = document.getElementById("serverJump").value;
            if (myserver == "*.foo.com"){
                var jumpto = "/node/" + myserver;
                window.location = jumpto;
            else {
                var jumpto = "/node/" + myserver + ".foo.com";
                window.location = jumpto;
            }
        }
    </script>
     

<!-- begin usabilla live embed code -->
<script type="text/javascript">/*{literal}<![CDATA[*/window.lightningjs||function(c){function g(b,d){d&&(d+=(/\?/.test(d)?"&":"?")+"lv=1");c[b]||function(){var i=window,h=document,j=b,g=h.location.protocol,l="load",k=0;(function(){function b(){a.P(l);a.w=1;c[j]("_load")}c[j]=function(){function m(){m.id=e;return c[j].apply(m,arguments)}var b,e=++k;b=this&&this!=i?this.id||0:0;(a.s=a.s||[]).push([e,b,arguments]);m.then=function(b,c,h){var d=a.fh[e]=a.fh[e]||[],j=a.eh[e]=a.eh[e]||[],f=a.ph[e]=a.ph[e]||[];b&&d.push(b);c&&j.push(c);h&&f.push(h);return m};return m};var a=c[j]._={};a.fh={};a.eh={};a.ph={};a.l=d?d.replace(/^\/\//,(g=="https:"?g:"http:")+"//"):d;a.p={0:+new Date};a.P=function(b){a.p[b]=new Date-a.p[0]};a.w&&b();i.addEventListener?i.addEventListener(l,b,!1):i.attachEvent("on"+l,b);var q=function(){function b(){return["<head></head><",c,' onload="var d=',n,";d.getElementsByTagName('head')[0].",d,"(d.",g,"('script')).",i,"='",a.l,"'\"></",c,">"].join("")}var c="body",e=h[c];if(!e)return setTimeout(q,100);a.P(1);var d="appendChild",g="createElement",i="src",k=h[g]("div"),l=k[d](h[g]("div")),f=h[g]("iframe"),n="document",p;k.style.display="none";e.insertBefore(k,e.firstChild).id=o+"-"+j;f.frameBorder="0";f.id=o+"-frame-"+j;/MSIE[ ]+6/.test(navigator.userAgent)&&(f[i]="javascript:false");f.allowTransparency="true";l[d](f);try{f.contentWindow[n].open()}catch(s){a.domain=h.domain,p="javascript:var d="+n+".open();d.domain='"+h.domain+"';",f[i]=p+"void(0);"}try{var r=f.contentWindow[n];r.write(b());r.close()}catch(t){f[i]=p+'d.write("'+b().replace(/"/g,String.fromCharCode(92)+'"')+'");d.close();'}a.P(2)};a.l&&setTimeout(q,0)})()}();c[b].lv="1";return c[b]}var o="lightningjs",k=window[o]=g(o);k.require=g;k.modules=c}({});
window.usabilla_live = lightningjs.require("usabilla_live", "//w.usabilla.com/521b4f1b8dd9.js");
/*]]>{/literal}*/</script>
<!-- end usabilla live embed code -->

     
  </head>


  <body>
    <div class="ui doubling stackable inverted large menu" style="margin-top: 0px; margin-bottom: 20px;">
      <div class="title item" style="margin-top: 0px; margin-bottom: 0px;">
        <a href="/">
          <img src="/static/icons/fldlogo.svg">
        </a>
      </div>
      <a class="item" href="/">Report Board</a>
      <a class="item" href="/search">StackRepo</a>
      <a class="item" href="/validate">Validate Info</a>
      <a class="item" href="/patching">Patching Status</a>
      <div class="ui item dropdown">More<i class="dropdown icon"></i>
          <div class="menu">
              <a class="item" href="/nodes">Nodes</a>
              <a class="item" href="/"></a>
              <a class="item" href="/reports">Reports</a>
              <a class="item" href="/metrics">Metrics</a>
              <a class="item" href="/inventory">Inventory</a>
              <a class="item" href="/catalogerrors">Catalog Errors</a>
              <a class="item" href="/radiator">Radiator</a>
              <a class="item" href="/query">Query</a>
              <a class="item" href="/module">Module Reporting</a>
              <a class="item" href="/grid_versions">Grid Versions</a>
              <a class="item" href="/summary">Core State</a>
          </div>
      </div>
      <form method="POST" class="item" action="/serverjump" name="serverjump">
          <input class="item" type="submit" value="Go" style="background-color: #f96302; color: black;">
          <input class="item" style="color: orange;" id="server_name" name="server_name"
              type="text" placeholder="Jump to a Server info page" value="">
      </form>
      <div class="item right">
          <div class="ui item dropdown">
              <b>S</b>
              
              <i class="dropdown icon"></i>
              <div class="menu">
                  <a class="item " href="/%2A/report/myserv.foo.com/58964efb70ab87610ff4ce3fdbf46ce7dea54dac">All servers</a>
                  
                      <a class="item "
                          href="/production/report/myserv.foo.com/58964efb70ab87610ff4ce3fdbf46ce7dea54dac">Datacenter</a>
                      <a class="item active"
                          href="/report/myserv.foo.com/58964efb70ab87610ff4ce3fdbf46ce7dea54dac">S</a>
                  
              </div>
          </div>
      </div>
    </div>
    <div class="ui grid padding-bottom">
        <div class="one wide column"></div>
        <div class="fourteen wide column">
            
<h1>Summary</h1>
<table class='ui basic table'>
  <thead>
    <tr>
      <th>Certname</th>
      <th>Configuration version</th>
      <th>Start time</th>
      <th>End time</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><a href="/node/myserv.foo.com">myserv.foo.com</a></td>
      <td>
        <p>1581366405</p>

      </td>
      <td rel="utctimestamp">
        2020-02-10 20:26:02.286000+00:00
      </td>
      <td rel="utctimestamp">
        2020-02-10 20:27:33.971000+00:00
      </td>
    </tr>
  </tbody>
</table>

<style>
    .button {
      display: inline-block;
      border-radius: 4px;
      background-color: #f4511e;
      border: none;
      color: #FFFFFF;
      transition: all 0.5s;
      cursor: pointer;
    }
    .button span {
      cursor: pointer;
      display: inline-block;
      position: relative;
      transition: 0.5s;
    }
    .button span:after {
      content: '»';
      position: absolute;
      opacity: 0;
      top: 0;
      right: -20px;
      transition: 0.5s;
    }
    .button:hover span {
      padding-right: 25px;
    }
    .button:hover span:after {
      opacity: 1;
      right: 0;
    }
</style>

<h1>Events <a href="/download_report/myserv.foo.com/58964efb70ab87610ff4ce3fdbf46ce7dea54dac/events"><button style="float: right;" class="ui grey button"><span>Download</span></button></a></h1>
<table class='ui basic compact fixed wrapped table'>
  <thead>
    <tr>
      <th class="eight wide">Resource</th>
      <th class="two wide">Status</th>
      <th class="two wide">Changed From</th>
      <th class="four wide">Changed To</th>
    </tr>
  </thead>
  <tbody>
    
    
      <tr id='event-1' class='ui line changed'>
    
      <td>Exec[/bin/ksh -c &#39;source /usr/local/bin/src.host; /opt/sr/su/bin/sr-db-sc 1249 N COM&#39;]</td>
      <td>success</td>
      <td>notrun</td>
      <td>[u&#39;0&#39;]</td>
    </tr>
    
    
      <tr id='event-2' class='ui line failed'>
    
      <td>Package[sr-kd-if]</td>
      <td>failure</td>
      <td>1.22-1</td>
      <td>2.0-2</td>
    </tr>
    
    
      <tr id='event-3' class='ui line changed'>
    
      <td>Exec[yum_trex reconfig]</td>
      <td>success</td>
      <td>notrun</td>
      <td>[u&#39;0&#39;]</td>
    </tr>
    
    
      <tr id='event-4' class='ui line changed'>
    
      <td>Exec[yum_trex config]</td>
      <td>success</td>
      <td>notrun</td>
      <td>[u&#39;0&#39;]</td>
    </tr>
    
    
      <tr id='event-5' class='ui line changed'>
    
      <td>Exec[rpm_import_stflag]</td>
      <td>success</td>
      <td>notrun</td>
      <td>[u&#39;0&#39;]</td>
    </tr>
    
    
      <tr id='event-6' class='ui line changed'>
    
      <td>Exec[RK sed on agent_flag]</td>
      <td>success</td>
      <td>notrun</td>
      <td>[u&#39;0&#39;]</td>
    </tr>
    
  </tbody>
</table>

<h1>Logs <a href="/download_report/myserv.foo.com/58964efb70ab87610ff4ce3fdbf46ce7dea54dac/logs"><button style="float: right;" class="ui grey button"><span>Download</span></button></a></h1>
<table class='ui basic compact fixed wrapped table'>
  <thead>
    <tr>
      <th>Timestamp</th>
      <th>Source</th>
      <th>Tags</th>
      <th>Message</th>
      <th>Location</th>
    <tr>
  </thead>
  <tbody>
    
      
        <tr class='warning'>
      
        <td rel="utctimestamp">2020-02-10T15:26:07.109-05:00</td>
        <td>SourceManagement</td>
        <td>warning</td>
        <td>Unable to fetch my node definition, but the agent run will continue:</td>
        
          <td></td>
        
      </tr>
    
      
        <tr class='warning'>
      
        <td rel="utctimestamp">2020-02-10T15:26:07.109-05:00</td>
        <td>SourceManagement</td>
        <td>warning</td>
        <td>Could not intern from application/json: Could not find a directory environment named &#39;env&#39; anywhere in the path: /etc/SourceManagement/code/environments. Does the directory exist?</td>
        
          <td></td>
        
      </tr>
    
      
        <tr class='positive'>
      
        <td rel="utctimestamp">2020-02-10T15:26:07.110-05:00</td>
        <td>SourceManagement</td>
        <td>info</td>
        <td>Retrieving plugin</td>
        
          <td></td>
        
      </tr>
    
      
        <tr class='positive'>
      
        <td rel="utctimestamp">2020-02-10T15:26:07.326-05:00</td>
        <td>SourceManagement</td>
        <td>info</td>
        <td>Retrieving plugin</td>
        
          <td></td>
        
      </tr>
    
      
        <tr class='positive'>
      
        <td rel="utctimestamp">2020-02-10T15:26:08.349-05:00</td>
        <td>SourceManagement</td>
        <td>info</td>
        <td>Retrieving locales</td>
        
          <td></td>
        
      </tr>
    
      
        <tr class='positive'>
      
        <td rel="utctimestamp">2020-02-10T15:26:08.572-05:00</td>
        <td>SourceManagement</td>
        <td>info</td>
        <td>Loading </td>
        
          <td></td>
        
      </tr>
    
      
        <tr class='positive'>
      
        <td rel="utctimestamp">2020-02-10T15:27:19.334-05:00</td>
        <td>SourceManagement</td>
        <td>info</td>
        <td>Caching catalog for myserv.foo.com</td>
        
          <td></td>
        
      </tr>
    
      
        <tr class='positive'>
      
        <td rel="utctimestamp">2020-02-10T15:27:20.824-05:00</td>
        <td>SourceManagement</td>
        <td>info</td>
        <td>Applying configuration version &#39;1581366405&#39;</td>
        
          <td></td>
        
      </tr>
    
      
        <tr class='positive'>
      
        <td rel="utctimestamp">2020-02-10T15:27:26.979-05:00</td>
        <td>/Stage[main]/RK::V1_0_1/Exec[RK sed on agent_flags.conf]/returns</td>
        <td>notice, exec, class, rk::v1_0_1, rk, v1_0_1</td>
        <td>executed successfully</td>
        
          <td>/etc/SourceManagement/code/environments/env/modules/rk/manifests/v1_0_1.pp:25</td>
        
      </tr>
    
      
        <tr class='positive'>
      
        <td rel="utctimestamp">2020-02-10T15:27:26.980-05:00</td>
        <td>/Stage[main]/RK::V1_0_1/Exec[RK sed on agent_flags.conf]</td>
        <td>info, exec, class, rk::v1_0_1, rk, v1_0_1</td>
        <td>Scheduling refresh of Service[rkagents]</td>
        
          <td>/etc/SourceManagement/code/environments/env/modules/rk/manifests/v1_0_1.pp:25</td>
        
      </tr>
    
      
        <tr class='positive'>
      
        <td rel="utctimestamp">2020-02-10T15:27:27.492-05:00</td>
        <td>/Stage[main]/RK::V1_0_1/Service[rkagents]</td>
        <td>notice, service, rkagents, class, rk::v1_0_1, rk, v1_0_1</td>
        <td>Triggered &#39;refresh&#39; from 1 event</td>
        
          <td>/etc/SourceManagement/code/environments/env/modules/rk/manifests/v1_0_1.pp:33</td>
        
      </tr>
    
      
        <tr class='positive'>
      
        <td rel="utctimestamp">2020-02-10T15:27:29.030-05:00</td>
        <td>/Stage[main]/tr::V1_0_5::Config/Exec[rpm_import_tr]/returns</td>
        <td>notice, exec, rpm_import_tr, class, tr::v1_0_5::config, tr, v1_0_5, config, tr::v1_0_5</td>
        <td>executed successfully</td>
        
          <td>/etc/SourceManagement/code/environments/env/modules/tanium/manifests/v1_0_5/config.pp:40</td>
        
      </tr>
    
      
        <tr class='positive'>
      
        <td rel="utctimestamp">2020-02-10T15:27:35.083-05:00</td>
        <td>/Stage[main]/Toolrental_trex_svc/Exec[yum_trex config]/returns</td>
        <td>notice, exec, class, toolrental_trex_svc</td>
        <td>executed successfully</td>
        
          <td>/etc/SourceManagement/code/environments/env/modules/toolrental_trex_svc/manifests/init.pp:7</td>
        
      </tr>
    
      
        <tr class='positive'>
      
        <td rel="utctimestamp">2020-02-10T15:27:35.101-05:00</td>
        <td>/Stage[main]/Toolrental_trex_svc/Exec[yum_trex reconfig]/returns</td>
        <td>notice, exec, class, toolrental_trex_svc</td>
        <td>executed successfully</td>
        
          <td>/etc/SourceManagement/code/environments/env/modules/toolrental_trex_svc/manifests/init.pp:17</td>
        
      </tr>
    
      
        <tr class='error'>
      
        <td rel="utctimestamp">2020-02-10T15:27:40.113-05:00</td>
        <td>SourceManagement</td>
        <td>err</td>
        <td>Could not update: Execution of &#39;/usr/bin/yum -d 0 -e 0 -y update sr-kd-if-2.0-2&#39; returned 1: Error Downloading Packages:
  jq-1.3-2.el6.x86_64: failed to retrieve getPackage/jq-1.3-2.el6.x86_64.rpm from rhel-x86_64-server-6-s-epel
error was [Errno 14] PYCURL ERROR 22 - &#34;The requested URL returned error: 500 Internal Server Error&#34;
  sr-kd-if-2.0-2.x86_64: failed to retrieve getPackage/sr-kd-if-2.0-2.x86_64.rpm from rhel-x86_64-server-6-s-deploy
error was [Errno 14] PYCURL ERROR 22 - &#34;The requested URL returned error: 500 Internal Server Error&#34;</td>
        
          <td></td>
        
      </tr>
    
      
        <tr class='error'>
      
        <td rel="utctimestamp">2020-02-10T15:27:40.117-05:00</td>
        <td>/Stage[main]/sr-kd-if::V2_0/Package[sr-kd-if]/ensure</td>
        <td>err, package, sr-kd-if, class, sr-kd-if::v2_0, sr-kd-if, v2_0</td>
        <td>change from &#39;1.22-1&#39; to &#39;2.0-2&#39; failed: Could not update: Execution of &#39;/usr/bin/yum -d 0 -e 0 -y update sr-kd-if-2.0-2&#39; returned 1: Error Downloading Packages:
  jq-1.3-2.el6.x86_64: failed to retrieve getPackage/jq-1.3-2.el6.x86_64.rpm from rhel-x86_64-server-6-s-epel
error was [Errno 14] PYCURL ERROR 22 - &#34;The requested URL returned error: 500 Internal Server Error&#34;
  sr-kd-if-2.0-2.x86_64: failed to retrieve getPackage/sr-kd-if-2.0-2.x86_64.rpm from rhel-x86_64-server-6-s-deploy
error was [Errno 14] PYCURL ERROR 22 - &#34;The requested URL returned error: 500 Internal Server Error&#34;</td>
        
          <td>/etc/SourceManagement/code/environments/env/modules/sr-kd-if/manifests/v2_0.pp:2</td>
        
      </tr>
    
      
        <tr class='positive'>
      
        <td rel="utctimestamp">2020-02-10T15:27:55.695-05:00</td>
        <td>/Stage[main]/Com::Ismt/fld::Ismt[1249 COM]/Exec[/bin/ksh -c &#39;source /usr/local/bin/src.host; /opt/hd/su/bin/sr-db-sc 1249 N COM&#39;]/returns</td>
        <td>notice, exec, fld::ismt, fld, ismt, class, com::ismt, com, fld::catalog, catalog, com::pr::v19_11_2, pr, v19_11_2</td>
        <td>executed successfully</td>
        
          <td>/etc/SourceManagement/code/environments/env/modules/fld/manifests/ismt.pp:7</td>
        
      </tr>
    
      
        <tr class='positive'>
      
        <td rel="utctimestamp">2020-02-10T15:27:55.696-05:00</td>
        <td>/Stage[main]/Com::Ismt/fld::Ismt[1249 COM]/Exec[/bin/ksh -c &#39;source /usr/local/bin/src.host; /opt/hd/su/bin/sr-db-sc 1249 N COM&#39;]</td>
        <td>info, exec, fld::ismt, fld, ismt, class, com::ismt, com, fld::catalog, catalog, com::pr::v19_11_2, pr, v19_11_2</td>
        <td>Scheduling refresh of Exec[/bin/chown pris01:dbaccgrp /opt/hd/su/tmp/*]</td>
        
          <td>/etc/SourceManagement/code/environments/env/modules/fld/manifests/ismt.pp:7</td>
        
      </tr>
    
      
        <tr class='positive'>
      
        <td rel="utctimestamp">2020-02-10T15:27:55.905-05:00</td>
        <td>/Stage[main]/Ismt_2615_pp::V1_0_0/fld::Ismt[ismt_2615]/Exec[/bin/chown pris01:dbaccgrp /opt/hd/su/tmp/*]</td>
        <td>notice, exec, fld::ismt, fld, ismt, ismt_2615, class, ismt_2615_pp::v1_0_0, ismt_2615_pp, v1_0_0</td>
        <td>Triggered &#39;refresh&#39; from 1 event</td>
        
          <td>/etc/SourceManagement/code/environments/env/modules/fld/manifests/ismt.pp:22</td>
        
      </tr>
    
      
        <tr class='positive'>
      
        <td rel="utctimestamp">2020-02-10T15:27:56.069-05:00</td>
        <td>Stage[main]</td>
        <td>info, stage</td>
        <td>Unscheduling all events on Stage[main]</td>
        
          <td></td>
        
      </tr>
    
      
        <tr class='positive'>
      
        <td rel="utctimestamp">2020-02-10T15:27:56.404-05:00</td>
        <td>SourceManagement</td>
        <td>notice</td>
        <td>Applied catalog in 35.89 seconds</td>
        
          <td></td>
        
      </tr>
    
  </tbody>
</table>

<h1>Metrics <a href="/download_report/myserv.foo.com/58964efb70ab87610ff4ce3fdbf46ce7dea54dac/metrics"><button style="float: right;" class="ui grey button"><span>Download</span></button></a></h1>
<table class="ui basic table compact">
  <thead>
    <tr>
      <th>Category</th>
      <th>Name</th>
      <th>Value</th>
    </tr>
  </thead>
  <tbody>
    
      <tr>
        <td>resources</td>
        <td>changed</td>
        <td>5.0</td>
      </tr>
    
      <tr>
        <td>resources</td>
        <td>corrective_change</td>
        <td>5.0</td>
      </tr>
    
      <tr>
        <td>resources</td>
        <td>failed</td>
        <td>1.0</td>
      </tr>
    
      <tr>
        <td>resources</td>
        <td>failed_to_restart</td>
        <td>0.0</td>
      </tr>
    
      <tr>
        <td>resources</td>
        <td>out_of_sync</td>
        <td>6.0</td>
      </tr>
    
      <tr>
        <td>resources</td>
        <td>restarted</td>
        <td>2.0</td>
      </tr>
    
      <tr>
        <td>resources</td>
        <td>scheduled</td>
        <td>0.0</td>
      </tr>
    
      <tr>
        <td>resources</td>
        <td>skipped</td>
        <td>0.0</td>
      </tr>
    
      <tr>
        <td>resources</td>
        <td>total</td>
        <td>754.0</td>
      </tr>
    
      <tr>
        <td>time</td>
        <td>config_retrieval</td>
        <td>60.69</td>
      </tr>
    
      <tr>
        <td>time</td>
        <td>cron</td>
        <td>0.01</td>
      </tr>
    
      <tr>
        <td>time</td>
        <td>exec</td>
        <td>7.89</td>
      </tr>
    
      <tr>
        <td>time</td>
        <td>file</td>
        <td>6.95</td>
      </tr>
    
      <tr>
        <td>time</td>
        <td>file_line</td>
        <td>0.01</td>
      </tr>
    
      <tr>
        <td>time</td>
        <td>filebucket</td>
        <td>0.0</td>
      </tr>
    
      <tr>
        <td>time</td>
        <td>filesystem</td>
        <td>0.01</td>
      </tr>
    
      <tr>
        <td>time</td>
        <td>logical_volume</td>
        <td>2.35</td>
      </tr>
    
      <tr>
        <td>time</td>
        <td>mount</td>
        <td>0.0</td>
      </tr>
    
      <tr>
        <td>time</td>
        <td>package</td>
        <td>13.01</td>
      </tr>
    
      <tr>
        <td>time</td>
        <td>pe_anchor</td>
        <td>0.0</td>
      </tr>
    
      <tr>
        <td>time</td>
        <td>service</td>
        <td>0.76</td>
      </tr>
    
      <tr>
        <td>time</td>
        <td>total</td>
        <td>91.69</td>
      </tr>
    
      <tr>
        <td>time</td>
        <td>user</td>
        <td>0.0</td>
      </tr>
    
      <tr>
        <td>changes</td>
        <td>total</td>
        <td>5.0</td>
      </tr>
    
      <tr>
        <td>events</td>
        <td>failure</td>
        <td>1.0</td>
      </tr>
    
      <tr>
        <td>events</td>
        <td>success</td>
        <td>5.0</td>
      </tr>
    
      <tr>
        <td>events</td>
        <td>total</td>
        <td>6.0</td>
      </tr>
    
  </tbody>
</table>


        </div>
        <div class="one wide column"></div>
    </div>

    <div id="scroll-btn-top">
      <i class="large arrow up icon"></i>
    </div>
  </body>
</html>

Edit:

Last 25 lines of the file after running grep -v -e '^\s*$' samplefile:

      <tr>
        <td>changes</td>
        <td>total</td>
        <td>5.0</td>
      </tr>
      <tr>
        <td>events</td>
        <td>failure</td>
        <td>1.0</td>
      </tr>
      <tr>
        <td>events</td>
        <td>success</td>
        <td>5.0</td>
      </tr>
      <tr>
        <td>events</td>
        <td>total</td>
        <td>6.0</td>
      </tr>
  </tbody>
</table>
        </div>
        <div class="one wide column"></div>
    </div>
    <div id="scroll-btn-top">
      <i class="large arrow up icon"></i>
    </div>
  </body>
</html>

Last 25 lines of the file after running the Go code sample:

      <tr>
        <td>events</td>
        <td>success</td>
        <td>5.0</td>
      </tr>

      <tr>
        <td>events</td>
        <td>total</td>
        <td>6.0</td>
      </tr>

  </tbody>
</table>

        </div>
        <div class="one wide column"></div>
    </div>

    <div id="scroll-btn-top">
      <i class="large arrow up icon"></i>
    </div>
  </body>
</html>

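The ^ and $ anchors match at the beginning or end of a line, but they do not include the adjacent newline characters. So all your regex does is remove the whitespace on whitespace-only lines (and merge adjacent whitespace-only lines into one; see below). It does not remove the newline before or after them.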
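A minimal sketch of that behavior, running the pattern from the question on two small inputs. The helper name stripWithAnchors and the sample strings are made up for illustration:

```go
package main

import (
	"fmt"
	"regexp"
)

// questionRe is the pattern from the question.
var questionRe = regexp.MustCompile(`(?ms)^\s*$`)

// stripWithAnchors replaces every match with the empty string,
// as the ReplaceAll call in the question does.
func stripWithAnchors(s string) string {
	return questionRe.ReplaceAllString(s, "")
}

func main() {
	// One whitespace-only line: its spaces go, its newline stays.
	fmt.Printf("%q\n", stripWithAnchors("a\n  \nb\n")) // "a\n\nb\n"

	// Two adjacent whitespace-only lines collapse into one empty line.
	fmt.Printf("%q\n", stripWithAnchors("a\n \n \nb\n")) // "a\n\nb\n"
}
```

In both cases an empty line is still left between a and b, which is why a later grep still finds occurrences.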

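You need to drop multiline mode and use (^|\n)(\n|$) to match the actual newline characters, so that you can replace them. Note that you only want to replace one of the two newlines; otherwise the lines on either side of the whitespace-only line will be joined together. Also note that, depending on which one you choose to replace, you may end up with an extra leading or trailing newline, so you may want to handle the leading and trailing runs of whitespace-only lines (which may or may not exist) separately.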
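One possible way to apply that advice is sketched below. The three patterns and the function name removeBlankLines are assumptions made for this illustration, not the exact expression given above: the idea is only that each pattern matches real newline characters (no multiline flag), keeps exactly one of them, and treats the start and end of the text separately.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// A newline, any run of whitespace-only lines, and the newline that ends it.
	interior = regexp.MustCompile(`\n\s*\n`)
	// Whitespace-only lines at the very start of the text.
	leading = regexp.MustCompile(`^\s*\n`)
	// Whitespace-only lines at the very end of the text.
	trailing = regexp.MustCompile(`\n\s*$`)
)

// removeBlankLines drops whitespace-only lines while keeping one
// newline between the remaining lines and the indentation of each line.
func removeBlankLines(s string) string {
	s = leading.ReplaceAllString(s, "")
	s = interior.ReplaceAllString(s, "\n")
	s = trailing.ReplaceAllString(s, "\n")
	return s
}

func main() {
	fmt.Printf("%q\n", removeBlankLines("a\n  \n\nb\n\n")) // "a\nb\n"
	fmt.Printf("%q\n", removeBlankLines("\n \nx\n"))       // "x\n"
}
```

Without the m flag, ^ and $ only match at the start and end of the whole text, so the leading and trailing patterns each fire at most once.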


(Old answer below; it may be useful to others.)

\s also matches newlines, and * finds the longest possible match. So adjacent whitespace-only lines end up as a single match.

If you need to count the lines individually, try *?, which gives a non-greedy match (so it stops as soon as it reaches a $). Or use [^\n\S] instead of \s, i.e. "match anything except newlines or non-whitespace".

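I think I figured it out. The matching problem seems to be related to the use of the start and end anchors. With (?ms), using ^ and $ together changes the matching behavior from per string to per line plus per string. Simply using (?ms)\s+$ matches only runs of whitespace that end a line, which can be any combination of space, \t, \n, \r and \f (Go's \s is [\t\n\f\r ], so \v is not included). This gives the behavior I was looking for (matching only the blank lines).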
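A quick sketch of what that pattern does on small inputs, in case the side effects matter. The function name stripTrailing and the sample strings are made up for illustration:

```go
package main

import (
	"fmt"
	"regexp"
)

// trailingWS is the pattern from the paragraph above.
var trailingWS = regexp.MustCompile(`(?ms)\s+$`)

// stripTrailing deletes every run of whitespace that ends a line.
func stripTrailing(s string) string {
	return trailingWS.ReplaceAllString(s, "")
}

func main() {
	// The blank lines are removed, and so is the final newline.
	fmt.Printf("%q\n", stripTrailing("a\n\n  \nb\n")) // "a\nb"

	// Trailing spaces on a non-blank line are removed as well.
	fmt.Printf("%q\n", stripTrailing("a  \nb")) // "a\nb"
}
```

So the blank lines do disappear, but the output also loses trailing whitespace on content lines and the newline at the very end of the file, which may or may not be acceptable.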