How to count the duplicate lines in a file and find the most duplicated line?
Or better yet, show me how many times an element is duplicated in the map. The map is created like this:
fun prirazovac() {
    var lineNumber = 0
    File("src/60.ips.txt").forEachLine {
        lineNumber++
        val ipcode = mutableMapOf(lineNumber to it)
        for (ii in 1..200) {
            for (i in 200 downTo 1) {
                val truth = (ipcode.get(ii) == ipcode.get(i))
                if (truth) {
                    println(ipcode)
                }
            }
        }
    }
}
60.ips.txt:
66.249.64.33
66.249.64.124
66.249.76.13
66.249.76.11
142.54.183.122
142.54.183.122
180.76.15.162
173.234.153.122
173.234.153.122
173.234.153.122
173.234.153.122
180.76.15.154
180.76.15.33
66.249.76.110
66.249.76.109
46.119.118.233
46.119.118.233
46.119.118.233
207.46.13.231
207.46.13.231
40.77.167.29
52.3.127.144
66.249.64.33
66.249.76.109
63.249.66.212
63.249.66.212
207.46.13.237
207.46.13.237
40.77.167.29
40.77.167.29
157.55.39.251
207.46.13.142
66.249.76.9
40.77.167.7
157.55.39.251
157.55.39.251
157.55.39.251
157.55.39.251
157.55.39.251
207.46.13.142
207.46.13.142
198.204.240.219
198.204.240.219
68.180.231.40
68.180.231.40
66.249.64.124
139.167.180.171
139.167.180.171
52.3.127.144
217.69.133.169
66.249.76.13
131.161.8.209
223.16.201.219
223.16.201.219
68.180.231.40
162.210.196.97
162.210.196.97
106.75.74.148
106.75.74.148
106.75.74.148
137.226.158.12
137.226.158.12
106.75.74.148
106.75.74.148
123.125.71.53
178.255.215.84
178.255.215.84
66.249.76.9
63.249.66.212
63.249.66.212
63.249.66.212
198.204.227.58
198.204.227.58
198.204.227.58
198.204.227.58
198.204.227.58
198.204.227.58
198.204.227.58
198.204.227.58
198.204.227.58
198.204.227.58
142.54.183.122
142.54.183.122
66.249.76.109
151.80.31.167
51.255.65.21
202.46.58.80
84.185.64.239
84.185.64.239
178.255.215.84
178.255.215.84
52.3.127.144
180.76.15.21
66.249.64.20
66.249.76.127
80.112.180.113
66.249.76.109
180.76.15.6
223.16.201.219
223.16.201.219
84.121.51.229
84.121.51.229
123.125.71.79
157.55.39.251
217.69.133.253
217.69.133.252
92.204.106.99
188.251.22.226
80.183.10.116
68.180.228.62
68.180.228.62
173.208.211.250
173.208.211.250
66.249.65.158
180.76.15.6
88.198.117.52
88.198.117.52
88.198.117.52
88.198.117.52
88.198.117.52
88.198.117.52
88.198.117.52
88.198.117.52
88.198.117.52
88.198.117.52
88.198.117.52
88.198.117.52
88.198.117.52
88.198.117.52
88.198.117.52
68.180.228.62
180.76.15.6
173.208.211.250
173.208.211.250
5.248.253.78
5.248.253.78
5.248.253.78
123.125.71.95
92.204.106.99
93.95.103.45
52.3.127.144
52.3.127.144
68.180.228.62
163.172.66.14
190.200.185.85
190.200.185.85
157.55.39.251
157.55.39.113
180.76.15.137
180.76.15.25
92.204.106.99
66.249.73.136
46.229.167.149
46.229.167.149
46.229.167.149
92.229.161.46
92.204.106.99
92.204.106.99
92.204.106.99
66.249.65.158
66.249.65.154
207.46.13.141
207.46.13.141
207.46.13.141
173.208.211.250
173.208.211.250
66.249.73.131
66.249.73.131
163.172.14.55
178.255.215.84
91.64.61.78
46.246.39.81
46.246.39.81
46.246.39.81
46.246.39.81
46.246.39.81
46.246.39.81
46.246.39.81
46.246.39.81
46.246.39.81
46.246.39.81
46.246.39.81
46.246.39.81
46.246.39.81
46.246.39.81
46.246.39.81
46.246.39.81
46.246.39.81
87.78.248.247
87.78.248.247
69.64.40.177
223.16.201.219
223.16.201.219
63.249.66.212
63.249.66.212
178.137.95.202
178.137.95.202
178.137.95.202
92.204.106.99
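It prints thousands of results. I need each result only once and, ideally, a count of how many times it is duplicated, for example: IP address - 20 times. I thought HashMap() would help, but it didn't. Any ideas?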
Kotlin has some great functions for this: groupingBy and eachCount do exactly what you need:
import java.io.File

fun main() {
    File("src/60.ips.txt")
        .readLines()
        .groupingBy { it }
        .eachCount()
        .forEach { (ip, count) -> println("$ip -> $count times") }
}
Partial output:
66.249.64.33 -> 2 times
66.249.64.124 -> 2 times
66.249.76.13 -> 2 times
66.249.76.11 -> 1 times
142.54.183.122 -> 4 times
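If only the lines that really repeat are of interest, the counts can be filtered and sorted before printing. The following is a minimal sketch, not part of the original answer: the helper name duplicateCounts and the inlined sample list are assumptions made for illustration, and the sample stands in for the contents of 60.ips.txt so the code runs without the file.

```kotlin
// Hypothetical helper: counts each line, keeps only lines seen more than once,
// and orders them from most to least frequent.
fun duplicateCounts(lines: List<String>): List<Pair<String, Int>> =
    lines
        .groupingBy { it }                        // group identical lines
        .eachCount()                              // Map<String, Int> of line -> occurrences
        .filter { (_, count) -> count > 1 }       // drop lines that appear only once
        .toList()                                 // List<Pair<String, Int>>
        .sortedByDescending { (_, count) -> count }

fun main() {
    // A few lines in the style of 60.ips.txt, inlined for the sketch (assumption).
    val sample = listOf(
        "66.249.76.11",
        "142.54.183.122",
        "142.54.183.122",
        "173.234.153.122",
        "173.234.153.122",
        "173.234.153.122"
    )
    duplicateCounts(sample).forEach { (ip, count) -> println("$ip -> $count times") }
}
```

With the real file, the same helper could presumably be called as duplicateCounts(File("src/60.ips.txt").readLines()).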
To find the most frequent duplicate, you can use maxByOrNull:
File("src/60.ips.txt")
    .readLines()
    .groupingBy { it }
    .eachCount()
    .maxByOrNull { it.value }
    ?.let { (ip, count) -> println("IP $ip appeared the most: $count times") }
Output:
IP 46.246.39.81 appeared the most: 17 times
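readLines() loads the whole file into memory, which is fine for a file of this size. For larger logs, a possible variation is to stream the lines with useLines instead. This is only a sketch under assumptions: the helper name mostFrequentLine is hypothetical, trimming and skipping blank lines is an added choice rather than something the answer above does, and a temporary file stands in for src/60.ips.txt.

```kotlin
import java.io.File

// Hypothetical helper: streams the file line by line instead of loading it whole,
// skips blank lines, and returns the most frequent line with its count,
// or null when the file has no non-blank lines.
fun mostFrequentLine(file: File): Pair<String, Int>? =
    file.useLines { lines ->
        lines
            .map { it.trim() }
            .filter { it.isNotEmpty() }
            .groupingBy { it }
            .eachCount()
            .maxByOrNull { it.value }
            ?.toPair()
    }

fun main() {
    // Temporary file standing in for src/60.ips.txt (assumption for the sketch).
    val tmp = File.createTempFile("ips", ".txt")
    tmp.writeText("1.1.1.1\n2.2.2.2\n\n2.2.2.2\n")
    mostFrequentLine(tmp)?.let { (ip, count) ->
        println("IP $ip appeared the most: $count times")
    }
    tmp.delete()
}
```

Note that when several lines share the highest count, maxByOrNull returns only one of them.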