Using sed to read byte count of a website from wget

Question

I am trying to print only a small part of the output of a wget command. If I type

wget http://google.com --spider --server-response

I get a long stream of terminal output that I have to search through. One of the lines is

Content-Length: 219

I only want to read the number 219 and print it to standard output. I found an answer in another Stack Overflow thread (get file size of a file to wget before wget-ing it?):

wget http://google.com --spider --server-response -O - 2>&1 | sed -ne '/Content-Length/{s/.*: //;p}'

There are two main things I don't understand about this command. I am hoping someone can explain both in detail.

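sed usually requires an input file, right? Piping the output of the wget command into it doesn't make that output a file. How come it works without one?
I don't understand what -e means. I've looked up the Linux man pages and they say it is for a "script"? This flag seems important, because nothing works without it. What does that mean? Also, what is happening in the rest of the command, and how does it print only the number?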

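Sorry for asking a question that has been answered before, but I haven't found an explanation online that makes sense to me, and I would like to try doing this with alternative solutions!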

Answer 1

sed usually requires an input file right? Piping the output from the wget command doesn't make it a file. How come it works without this?

Like most Unix utilities, sed processes the files given as arguments; if none are given, it processes its standard input.
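A minimal sketch of the two input modes, assuming a throwaway temporary file and a made-up header line (neither comes from a real wget run):

```shell
# Hypothetical sample file, created only for this demonstration.
tmpfile=$(mktemp)
printf 'Content-Length: 219\n' > "$tmpfile"

# 1) A file name is given as an argument: sed reads that file.
from_file=$(sed -n 's/.*: //p' "$tmpfile")

# 2) No file argument: sed reads its standard input (here, a pipe).
from_pipe=$(printf 'Content-Length: 219\n' | sed -n 's/.*: //p')

echo "$from_file"
echo "$from_pipe"

rm -f "$tmpfile"
```

Both variables should end up holding the same value, since sed runs the same script on the same text either way.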

I don't understand what -e means. I've looked up the linux man pages and it mentions it is for "script" ? What does that means? Also, what is happening in the line with the quotes?

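-e indicates that the next argument is a string of sed operations to execute (the documentation calls this a "script"). This is already the default for the first non-option argument to sed, but the command you found happens to use it explicitly. It is most useful when you give several commands, because any further command that is not preceded by -e would be treated as a file name. See also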

what does dash e(-e) mean in sed commands?
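To illustrate the point about multiple commands, here is a small sketch using a made-up one-line input. The second invocation is expected to fail because sed takes the second argument as a file name rather than a script:

```shell
# Two scripts, each introduced by -e: both are applied in order (a -> b -> c).
with_e=$(printf 'a\n' | sed -e 's/a/b/' -e 's/b/c/')
echo "$with_e"

# Without -e, only the first argument is the script; the second one is
# taken as a file name, which (presumably) does not exist, so sed fails.
if printf 'a\n' | sed 's/a/b/' 's/b/c/' 2>/dev/null; then
  without_e_status=ok
else
  without_e_status=failed
fi
echo "$without_e_status"
```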

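In your command, the -n option means sed should not print its input lines by default; instead you print selected lines explicitly with the p operation. /Content-Length/ matches lines containing that string, and it is followed by a group of operations, inside {}, to perform on the matching lines. The first operation is s/.*: //, which replaces everything up to and including the : and the space with nothing. The second operation is p, which prints the modified line. So it prints the number after Content-Length:.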
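The effect of -n and p can be tried without touching the network. The sketch below feeds sed a made-up sample shaped like wget's --server-response output; the sample_headers function and its contents are assumptions for illustration, not captured output:

```shell
# Made-up sample resembling wget --server-response output (not real data).
sample_headers() {
  printf '%s\n' \
    '  HTTP/1.1 200 OK' \
    '  Content-Type: text/html' \
    '  Content-Length: 219'
}

# With -n, nothing is printed automatically; only the explicit p prints,
# and only for the line that matched /Content-Length/.
size=$(sample_headers | sed -n '/Content-Length/{s/.*: //;p}')
echo "$size"

# Without -n, every line is printed automatically, and the matching line
# is additionally printed by p, so the output has one extra line.
line_count=$(sample_headers | sed '/Content-Length/{s/.*: //;p}' | wc -l)
echo "$line_count"
```

The script is written as in the question; some sed implementations may want a semicolon before the closing brace.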

Answer 2

You can still reduce the command (the wget -O option is not needed, and neither is sed -e) to:

wget http://google.com --spider --server-response 2>&1 | sed -n '/Content-Length/{s/.*: //;p}'

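Here, STDERR is redirected to STDOUT so that sed can operate on it. The sed command suppresses automatic printing (-n); then, for lines containing Content-Length, it removes everything from the start of the line up to and including the : and the space, and prints the modified line (p in sed).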
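Why the 2>&1 matters can be sketched with a stand-in for wget. The emit_headers function below is hypothetical: it only imitates a program that writes its header lines to STDERR and nothing to STDOUT.

```shell
# Hypothetical stand-in for wget: writes one header line to stderr only.
emit_headers() { echo 'Content-Length: 219' >&2; }

# Without 2>&1 the pipe carries only stdout, so sed receives no input.
# (stderr is discarded here just to keep the terminal quiet; it would
# not reach the pipe either way.)
without=$(emit_headers 2>/dev/null | sed -n '/Content-Length/{s/.*: //;p}')

# With 2>&1, stderr is merged into stdout and flows through the pipe.
with_redirect=$(emit_headers 2>&1 | sed -n '/Content-Length/{s/.*: //;p}')

echo "without: '$without'"
echo "with:    '$with_redirect'"
```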

The same with awk:

wget http://google.com --spider --server-response 2>&1 | awk '/Content-Length/{print $2}'

For lines containing Content-Length, print the second field (which will be the numeric part).
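The field splitting can be checked offline with a made-up header sample (the lines below are illustrative, not real server output). awk splits each line on runs of whitespace by default, so leading indentation does not shift the field numbers:

```shell
# Made-up, indented sample lines resembling wget's header output.
awk_size=$(printf '%s\n' \
  '  HTTP/1.1 200 OK' \
  '  Content-Length: 219' |
  awk '/Content-Length/{print $2}')

# $1 is "Content-Length:" and $2 is the value that follows it.
echo "$awk_size"
```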


linux

bash

sed

wget