Accept filename as argument and calculate repeated words along with count

Question

I need to find the number of repeated characters in a text file, and the filename has to be passed as an argument.

Example: test.txt contains

Zoom

The output should look like this:

z 1
o 2
m 1

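I need a command that accepts a filename as an argument and then lists the character counts for that file. In my example, I have a test.txt that contains the word zoom, so the output shows how many times each letter is repeated.

My attempt: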

vi test.sh

#!/bin/bash
FILE="$1"  # to pass filename as argument
sort file1.txt | uniq -c  # to count the number of letters

Answer 1

Just a guess?

cat test.txt |
tr '[:upper:]' '[:lower:]' |
fold -w 1 |
sort |
uniq -c |
awk '{print $2, $1}'

m 1
o 2
z 1
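
The pipeline above hard-codes test.txt, while the question asks for the filename as an argument. Below is a minimal sketch of one way to wrap the same pipeline in a script. The function name count_letters is made up for illustration, and blank lines or spaces in the file are not filtered out.

```bash
#!/bin/bash
# Sketch only: count_letters is a hypothetical helper name, not part of the answer.

count_letters() {
    # $1 is the file to read
    if [ ! -f "$1" ]; then
        echo "file not found: $1" >&2
        return 1
    fi
    tr '[:upper:]' '[:lower:]' < "$1" |  # fold case so Z and z count together
        fold -w 1 |                      # one character per line
        sort |                           # group identical characters
        uniq -c |                        # prefix each character with its count
        awk '{print $2, $1}'             # print as "character count"
}

# Run only when the script is given a filename
if [ "$#" -ge 1 ]; then
    count_letters "$1"
fi
```

Saved as test.sh (the name used in the question), this could be run as: bash test.sh test.txt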

Answer 2

#!/bin/bash
#get the argument for further processing
inputfile="$1"

#check if file exists
if [ -f "$inputfile" ]
then
    #convert file to a usable format:
    #  1. convert all characters to lowercase
    #  2. put each character on a new line (\n in the replacement needs GNU sed)
    #  3. output to temporary file
    cat "$inputfile" | tr '[:upper:]' '[:lower:]' | sed -e 's/\(.\)/\1\n/g' > tmp.txt
    #loop over every character from a-z
    for char in {a..z}
    do
        #count how many times a character occurs
        count=$(grep -c "$char" tmp.txt)
        #print if count > 0
        if [ "$count" -gt "0" ]
        then
            echo -e "$char" "$count"
        fi
    done
    rm tmp.txt
else
    echo "file not found!"
    exit 1
fi
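
One thing to watch in the script above is that it writes tmp.txt into the current directory, which would overwrite an existing file of that name. Here is a rough sketch of the same a-to-z loop using mktemp instead, with fold swapped in for the sed step so it does not depend on GNU sed. The function name count_az is hypothetical, and mktemp is assumed to be available on the system.

```bash
#!/bin/bash
# Sketch only: count_az is a hypothetical name; mktemp is assumed to be installed.

count_az() {
    if [ ! -f "$1" ]; then
        echo "file not found!" >&2
        return 1
    fi
    # private temporary file instead of tmp.txt in the current directory
    tmpfile=$(mktemp) || return 1
    # lowercase everything, then put each character on its own line
    tr '[:upper:]' '[:lower:]' < "$1" | fold -w 1 > "$tmpfile"
    # alphabet written out in full so the loop also works without {a..z} brace expansion
    for char in a b c d e f g h i j k l m n o p q r s t u v w x y z; do
        # every line holds one character, so counting matching lines counts occurrences
        count=$(grep -c "$char" "$tmpfile")
        if [ "$count" -gt 0 ]; then
            echo "$char $count"
        fi
    done
    rm -f "$tmpfile"
}

# Run only when the script is given a filename
if [ "$#" -ge 1 ]; then
    count_az "$1"
fi
```

Because the loop walks the alphabet in order, the output is already sorted by letter and only a to z are reported.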

Answer 3

Suggesting an awk script that counts every kind of character:

awk '
BEGIN{FS = ""}  # make each char a field (empty FS is not guaranteed by POSIX awk)
{
  for (i = 1; i <= NF; i++) { # iterate over all fields in line
    ++charsArr[$i]; # count each field occurrence in array
  }
}
END {
  for (char in charsArr) { # iterate over chars array
    printf("%3d %s\n", charsArr[char], char);  # print the occurrence count and the char
  }
}' input.1.txt | sort -n

Or as a one-liner:

awk '{for(i=1;i<=NF;i++)++arr[$i]}END{for(char in arr)printf("%3d %s\n",arr[char],char)}' FS="" input.1.txt|sort -n
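
Splitting a line into characters with an empty FS works in gawk and mawk, but POSIX leaves that behavior unspecified. As a hedged alternative, the sketch below walks each line with substr, takes the filename as an argument, and prints "character count" to match the layout asked for in the question. The function name count_chars_awk is invented for illustration, and the tolower call is an addition so that Z and z are counted together.

```bash
#!/bin/bash
# Sketch only: count_chars_awk is a hypothetical name, not from the answer above.

count_chars_awk() {
    awk '
    {
        line = tolower($0)              # fold case so Z and z count together
        n = length(line)
        for (i = 1; i <= n; i++) {      # walk the line one character at a time
            counts[substr(line, i, 1)]++
        }
    }
    END {
        for (c in counts) {             # array order is arbitrary, so sort afterwards
            printf "%s %d\n", c, counts[c]
        }
    }' "$1" | sort
}

# Run only when the script is given a filename
if [ "$#" -ge 1 ]; then
    count_chars_awk "$1"
fi
```

Like the awk script above, this counts every character on the line, including spaces and punctuation, not just letters.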


bash

shell