Using HashSet to store a text file and read from it

Question

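I've looked at a lot of good resources on HashSets, but nothing has helped with this specific problem. I'm taking an algorithms class that covers generics, and this assignment requires reading a txt file into the system with a Scanner (done) and loading it into a HashSet, so that I can read user input and find the number of times a word occurs. I have the method that returns the word, and I've finished most of the HashSet and file reader code. But I'm completely stuck on how to store the whole txt file as a HashSet. I couldn't get it to work with crime.add, and I tried several other things. Am I missing a simpler way to do this? Thanks.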

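EDIT: Assignment description - Program 1 (70 points): Load the words of the novel Crime and Punishment, by Theodore Dostoevsky, into a java.util.HashSet (a text file is available on Blackboard for this assignment). Prompt the user to enter a word and report whether or not that word appears in the novel.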

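EDIT: Okay, I've written all of this and it runs, but it doesn't find words that are definitely in the txt file, so somewhere I'm adding the file to the HashSet incorrectly. Any ideas? I've tried ArrayLists and different String implementations, but I just don't know where to turn. Thanks for any helpful information.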

import java.awt.List;
import java.io.File;
import java.io.FileNotFoundException;
import java.util.HashSet;
import java.util.Scanner;
import java.util.Set;

public class CandPHashSet {
    public static void main(String[] args) throws FileNotFoundException{
        Scanner file = new Scanner(new File("crime_and_punishment.txt")).useDelimiter("[ˆa-zA-Z]+");
        Scanner input = new Scanner(System.in);

        Set<String> crime = new HashSet<String>();

        while(file.hasNext()){
            String line = file.nextLine();
            //String[] words = line.split("[ˆa-zA-Z]+");
            for (String word : line.split("[ˆa-zA-Z]+")){
                crime.add(line);
            }
        }

        String search;
        System.out.println("Enter a word to search for: ");
        search = input.next();

        if(crime.contains(input)){
            System.out.println("Yes");
        }else{
            System.out.println("No");
        }
    }
}

Answer 1

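You can't count occurrences with a HashSet; you would just lose the duplicates. You could count the duplicates as you add them, but then you need somewhere to put the counts.

You need a Map<String, Integer>.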
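A minimal sketch of that idea, assuming the text is already available as a String and that a "word" is a run of ASCII letters. The class name WordCounter and the method countWords are hypothetical names chosen for illustration, not part of the assignment.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class WordCounter {
    // Hypothetical helper: maps each lower-cased word to its number of occurrences.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new HashMap<>();
        // Split on runs of non-letters so punctuation and whitespace are dropped.
        for (String word : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (word.isEmpty()) {
                continue; // split yields an empty first token if the text starts with a non-letter
            }
            // Insert 1 for a new word, otherwise add 1 to the stored count.
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords("The axe, the room; the old woman.");
        System.out.println(counts.getOrDefault("the", 0));   // prints 3
        System.out.println(counts.getOrDefault("crime", 0)); // prints 0
    }
}
```

A Set can only answer "is this word present?", while the map keeps a counter per word, which is what a "number of occurrences" requirement would need.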

Answer 2

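It looks like you don't need to count word occurrences. You only need to split the input file into individual words and store them in a HashSet<String>. Then use the contains() method to check whether the word given by the user exists in the set.

There are several issues in your code to check:

You are using useDelimiter() on the Scanner incorrectly. You probably don't want to specify a delimiter at all, so that the default whitespace delimiter is used.
With whitespace as the Scanner delimiter, the input is already split into words, so there is no need to read the file line by line.
You call crime.contains(input) to look up the user's word, but input is a Scanner, not a String. You want crime.contains(search).

The modified code looks something like this: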

// Read the file using whitespace as a delimiter (default)
// so that the input will be split into words
Scanner file = new Scanner(new File("crime_and_punishment.txt"));

Set<String> crime = new HashSet<>();
// For each word in the input
while (file.hasNext()) {
    // Convert the word to lower case, trim it and insert into the set
    // In this step, you will probably want to remove punctuation marks
    crime.add(file.next().trim().toLowerCase());
}

System.out.println("Enter a word to search for: ");
Scanner input = new Scanner(System.in);
// Also convert the input to lowercase
String search = input.next().toLowerCase();

// Check if the set contains the search string
if (crime.contains(search)) {
    System.out.println("Yes");
} else {
    System.out.println("No");
}
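The comment in the loop above notes that punctuation will probably need removing. One possible way to do that is sketched below, assuming only the ASCII letters a-z matter (so "don't" becomes "dont"). The names WordSetLoader, normalize and loadWords are hypothetical, and a short in-memory string stands in for the text file.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Scanner;
import java.util.Set;

public class WordSetLoader {
    // Hypothetical helper: lower-cases a token and drops every character that is not a-z.
    static String normalize(String token) {
        return token.toLowerCase(Locale.ROOT).replaceAll("[^a-z]", "");
    }

    // Reads whitespace-separated tokens from the scanner and collects the normalized words.
    static Set<String> loadWords(Scanner source) {
        Set<String> words = new HashSet<>();
        while (source.hasNext()) {
            String word = normalize(source.next());
            if (!word.isEmpty()) { // a token such as "--" normalizes to ""
                words.add(word);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // In-memory sample text used here in place of crime_and_punishment.txt.
        Set<String> crime = loadWords(new Scanner("\"Well, well!\" he cried. -- Raskolnikov"));
        // Normalize the search term the same way as the stored words.
        System.out.println(crime.contains(normalize("WELL"))); // prints true
        System.out.println(crime.contains(normalize("axe")));  // prints false
    }
}
```

Running the search term through the same normalize step as the stored words keeps the lookup consistent, whatever normalization rule is chosen.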

Answer 3

The requirements you posted contradict each other.

find the number of occurrences

is not the same as

report whether or not that word appears in the novel.

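A HashSet is suitable for the second, not the first.

Read requirements very carefully. Spending an extra 5 minutes reading them can save you 5 hours of coding.

To follow the instructions, what you need to do is add one word at a time to your hash set. Reading a file word by word already has an answer here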
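Applied to the line-by-line loop from the question, adding one word at a time could look like the sketch below. It assumes a word is a run of ASCII letters and uses a short in-memory string instead of the text file; the class name LineSplitter and the method addWords are made up for illustration.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Scanner;
import java.util.Set;

public class LineSplitter {
    // Hypothetical helper: splits one line into words and adds each word to the set.
    static void addWords(String line, Set<String> target) {
        for (String word : line.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!word.isEmpty()) {
                target.add(word); // add the individual word, not the whole line
            }
        }
    }

    public static void main(String[] args) {
        Set<String> crime = new HashSet<>();
        // In-memory sample; a Scanner over the assignment's text file would be read the same way.
        Scanner file = new Scanner("On an exceptionally hot evening\nearly in July");
        while (file.hasNextLine()) {
            addWords(file.nextLine(), crime);
        }
        System.out.println(crime.contains("july"));   // prints true
        System.out.println(crime.contains("winter")); // prints false
    }
}
```

Pairing hasNextLine() with nextLine() keeps the loop condition consistent with the read call, and the set ends up holding single words rather than whole lines.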

Whenever I'm not sure which container to use, I look at this:
