Print the first letter of each word in a line

Question

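I have searched other posts but haven't found an answer that fits my need. I have a space-delimited file, and I want to print the first letter of each word in a given line. For example: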

cat test.txt
This is a test sentence.

Using sed, awk, or a combination of the two, I would like the output to be "Tiats". Any suggestions to point me in the right direction?

Answer 1

One possibility:

pax> echo 'This is a test sentence.
  This is another.' | sed -e 's/$/ /' -e 's/\([^ ]\)[^ ]* /\1/g' -e 's/^ *//'
Tiats
Tia

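The first sed command simply ensures there is a space at the end of each line, which simplifies the second command.

The second command removes the remaining letters and the trailing space of each word, keeping only its first character. A word in this sense is any run of non-space characters.

The third was added to make sure leading spaces are removed from each line.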
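
To make the role of each expression easier to see, here is a minimal sketch that runs the stages separately on a line with leading spaces. The function names are hypothetical, chosen only for this illustration, and the second stage assumes the replacement keeps the captured first character via `\1`.

```shell
# Hypothetical helper names; each wraps one of the three sed expressions.
add_trailing_space()   { sed -e 's/$/ /'; }
keep_first_letters()   { sed -e 's/\([^ ]\)[^ ]* /\1/g'; }
strip_leading_spaces() { sed -e 's/^ *//'; }

line='  This is another.'

# After the first two stages the leading spaces are still there.
printf '[%s]\n' "$(printf '%s\n' "$line" | add_trailing_space | keep_first_letters)"
# prints [  Tia]

# The third stage removes them.
printf '[%s]\n' "$(printf '%s\n' "$line" | add_trailing_space | keep_first_letters | strip_leading_spaces)"
# prints [Tia]
```

The brackets are only there to make the leading spaces visible in the output.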

Answer 2

In awk:

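awk '{
  # NF is the number of fields (words) on the current line
  for (i = 1; i <= NF; i++) {
    # Pass the letter as an argument so a "%" in the input is not read as a format
    printf("%s", substr($i, 1, 1));
  }
  printf("\n");
}' input_file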

awk automatically sets NF to the number of fields in the line; the loop visits each field and uses substr to take its first letter.
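
One property of the awk approach that is easy to miss: with the default field separator, awk treats any run of spaces and tabs as a single separator and ignores leading whitespace, so no extra cleanup step is needed. The sketch below, with a made-up function name, prints NF next to the collected letters to make that visible.

```shell
# Hypothetical wrapper: prints "<field count>: <first letters>" for each line.
first_letters_awk() {
  awk '{
    out = ""
    for (i = 1; i <= NF; i++) out = out substr($i, 1, 1)
    print NF ": " out
  }'
}

# Leading spaces, a tab, and repeated spaces all count as plain separators.
printf '  This\tis   another.\n' | first_letters_awk
# prints 3: Tia
```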

Answer 3

Another solution with sed:

sed 's/\(.\)[^ ]* */\1/g' File

Here we look for any character (.), followed by a sequence of non-space characters ([^ ]*), followed by optional spaces ( *). The pattern is replaced with the first character (the one matched by .).

Sample:

$ cat File
This is a test sentence.
Ahggsh Mathsh Dansdjksj
$ sed 's/\(.\)[^ ]* */\1/g' File
Tiats
AMD
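
One edge case worth checking: because `.` can also match a space, a line that starts with spaces keeps one space at the front. The sketch below assumes the replacement is `\1`; the function names and the extra leading-space strip are additions for illustration, not part of the answer.

```shell
# The expression as described: any character, the rest of the word, optional spaces.
first_letters()         { sed 's/\(.\)[^ ]* */\1/g'; }
# Hypothetical variant that strips leading spaces first.
first_letters_trimmed() { sed -e 's/^ *//' -e 's/\(.\)[^ ]* */\1/g'; }

printf '[%s]\n' "$(printf '  This is another.\n' | first_letters)"
# prints [ Tia]
printf '[%s]\n' "$(printf '  This is another.\n' | first_letters_trimmed)"
# prints [Tia]
```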

Answer 4

Using perl:

$ echo This is a test sentence | perl -nE 'print for /^\w|(?<=\W)./g'
Tiats

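Explanation: print every character that is either a word character at the start of the line or is preceded by a non-word character (here, the whitespace between words).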

Answer 5

Another perl command:

$ echo 'This is a test sentence.' | perl -nE 'print for m/(?<!\S)\S/g;print "\n"'
Tiats

Answer 6

Another awk:

awk '{for (i=1;i<=NF;i++) $i=substr($i,1,1)}1' OFS= file

This loops over each word and removes everything except its first letter.

Example:

cat file
This is a test sentence.
Ahggsh Mathsh Dansdjksj

awk '{for (i=1;i<=NF;i++) $i=substr($i,1,1)}1' OFS= file
Tiats
AMD
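
The reason `OFS=` matters: assigning to any field makes awk rebuild the whole record by joining the fields with OFS, and the trailing `1` is an always-true pattern whose default action prints that record. A minimal sketch with a hypothetical wrapper function, using a visible separator first to show the rebuild:

```shell
# Hypothetical wrapper: $1 is the output field separator to join the letters with.
join_initials() {
  awk '{for (i=1;i<=NF;i++) $i=substr($i,1,1)}1' OFS="$1"
}

printf 'This is a test sentence.\n' | join_initials -
# prints T-i-a-t-s
printf 'This is a test sentence.\n' | join_initials ''
# prints Tiats
```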

Answer 7

sed 's/ *\([^ ]\)[^ ]\{1,\} */\1/g' YourFile

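This handles spaces of any length and in any position directly. It assumes the separators are space characters rather than tabs (but it is easy to adapt).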
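
The adaptation to tabs mentioned above might look like the following sketch, which swaps the literal space for the POSIX classes `[[:space:]]` and `[^[:space:]]` and assumes the replacement is `\1`. The function name is made up for this example; the rest of the expression is unchanged, so it otherwise behaves like the space-only version.

```shell
# Hypothetical tab-aware variant: any whitespace separates words.
first_letters_ws() {
  sed 's/[[:space:]]*\([^[:space:]]\)[^[:space:]]\{1,\}[[:space:]]*/\1/g'
}

printf 'Tabs\tand  spaces mixed\n' | first_letters_ws
# prints Tasm
```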

Just for fun:

sed 's/ *\(\([^ ]\)\)\{1,\} */\2/g' YourFile

This takes the last letter of each word instead of the first.

Answer 8

In Haskell, as a one-liner:

main = putStr =<< (unlines . map (map head . words) . lines <$> getContents)

Perhaps more readable:

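import System.IO (isEOF)

main = do
  eof <- isEOF -- Stop cleanly at end of input instead of failing in getLine
  if eof then return () else do
    line <- getLine -- Read a single line from stdin
    let allWords = words line -- Turn the line into a list of words
    let firsts = map head allWords -- Get the first letter of each word
    putStrLn firsts -- Print them out
    main -- Start over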

Answer 9

A fun pure Bash solution:

while read -r line; do
    read -r -d '' -a ary <<< "$line"
    printf '%c' "${ary[@]}" $'\n'
done < text.txt

Answer 10

This might work for you (GNU sed):

sed 's/\B.\|[[:space:][:punct:]]//g' file

This deletes every character that is not at the start of a word, along with all whitespace and punctuation.
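
As a quick check of how the `\B` approach treats hyphens and parentheses, here is a small sketch. It relies on GNU sed, since `\B` and `\|` are GNU extensions to basic regular expressions; the function name and the sample input are only illustrative.

```shell
# Hypothetical wrapper around the GNU sed expression.
initials_by_boundary() {
  sed 's/\B.\|[[:space:][:punct:]]//g'
}

# A hyphen ends one word and starts another, so "fruit-cake" yields two letters.
printf 'Apple banana Carrot fruit-cake (Grapes)\n' | initials_by_boundary
# prints AbCfcG
```

Unlike the space-based expressions, this one splits words at any non-word character, which may or may not be what you want.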

Answer 11

Ah, this was a really difficult task until I found this thread. ... I wanted to extract the first letter of each word in a string. This works:

echo 'Apple banana Carrot fruit-cake (Grapes)' | sed -r 's/.*/\L&/; s/-/ /g; s/[()]//g; s/(.)[^ ]* */\1/g'
abcfcg

That is:

sed -r 's/.*/\L&/; s/-/ /g; s/[()]//g; s/(.)[^ ]* */\1/g'

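\L& lowercases the string (for uppercase, use \U&)
- is replaced with a space
the parentheses () are removed
the last expression comes from the other answers here, in particular @arjun-mathew-dan's:
- look for any character: (.)
- followed by a sequence of non-space characters: [^ ]*
- followed by optional spaces:  *
- replace this pattern with the first character, the one matched by (.): \1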
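
The steps listed above can be sketched one stage at a time to see what each expression contributes. This assumes GNU sed (`-r` and `\L` are GNU features) and that the final replacement is `\1`; the function names are hypothetical.

```shell
# Hypothetical helper names, one per expression in the combined command.
lower()         { sed -r 's/.*/\L&/'; }
dash_to_space() { sed -r 's/-/ /g'; }
drop_parens()   { sed -r 's/[()]//g'; }
initials()      { sed -r 's/(.)[^ ]* */\1/g'; }

input='Apple banana Carrot fruit-cake (Grapes)'

printf '%s\n' "$input" | lower
# prints apple banana carrot fruit-cake (grapes)
printf '%s\n' "$input" | lower | dash_to_space
# prints apple banana carrot fruit cake (grapes)
printf '%s\n' "$input" | lower | dash_to_space | drop_parens
# prints apple banana carrot fruit cake grapes
printf '%s\n' "$input" | lower | dash_to_space | drop_parens | initials
# prints abcfcg
```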
