Check whether a word is contained in a page and store all the results in an ArrayList

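I am using Selenium WebDriver with Java. I want to check whether the word "good" appears in a page and store every match I find in an ArrayList.

For example, suppose the page contains the words "good", "goodmorning" and "goodafternoon". Then I should get ArrayList = [good, goodmorning, goodafternoon].

I think the usual way of checking whether a page contains some text, such as "contains("")", doesn't work in this case, because it only tells me whether the text is present, not which words contain it.

What do you think? Is it possible?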

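// Assumes java.io.* and java.util.regex.* are imported; the path is a placeholder.
File f = new File("C:\\yourpath\\filename"); // backslashes must be escaped in Java strings
Pattern pattern = Pattern.compile("good[a-zA-Z]*"); // regex here
try (BufferedReader br = new BufferedReader(new FileReader(f))) {
    String line;
    while ((line = br.readLine()) != null) {
        // matches() only succeeds when the whole line matches, so scan with find() instead
        Matcher m = pattern.matcher(line);
        while (m.find()) {
            System.out.println(m.group());
        }
    }
}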

You can use a Scanner:

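    // Requires java.util.ArrayList, java.util.List, java.util.Scanner, java.util.regex.Pattern
    // "good" followed by any letters, plus optional trailing blanks (note the doubled backslash)
    Pattern pattern = Pattern.compile("good[a-zA-Z]*\\p{Blank}*");
    List<String> matches = new ArrayList<String>();
    // driver is the Selenium WebDriver; try-with-resources closes the scanner
    try (Scanner scanner = new Scanner(driver.getPageSource())) {
        String match;
        // a horizon of 0 searches the whole remaining input
        while ((match = scanner.findWithinHorizon(pattern, 0)) != null) {
            matches.add(match.trim());
        }
    }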
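
One thing to keep in mind is that getPageSource() returns the raw HTML, so the pattern can also match inside tag names or attribute values. If that matters, the same regex idea can be applied to any plain string, for example the visible text of the page. Below is a minimal sketch of that, not a definitive implementation; the class name PrefixWordFinder and the method findWordsStartingWith are hypothetical names chosen for illustration, and the code uses only java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Selenium: works on any plain string.
class PrefixWordFinder {

    // Returns every word in text that starts with prefix, in order of appearance.
    static List<String> findWordsStartingWith(String text, String prefix) {
        // \b anchors the match at a word boundary; Pattern.quote treats the prefix literally.
        Pattern pattern = Pattern.compile(
                "\\b" + Pattern.quote(prefix) + "[a-zA-Z]*",
                Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(text);
        List<String> matches = new ArrayList<>();
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String sample = "A good day: goodmorning, then goodafternoon.";
        // prints [good, goodmorning, goodafternoon]
        System.out.println(findWordsStartingWith(sample, "good"));
    }
}
```

With Selenium you could pass in the text of the html element instead of the sample string. Like the pattern above, this only finds words that begin with the search word; a word such as "feelgood" is not returned.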

This is what you are looking for; it uses Selenium WebDriver. I tested it on this page with the word "good" and got the expected results.

public List<String> perform(String url, String searchWord) {
    // open the given URL (driver is an already initialized WebDriver field)
    driver.get(url); 
    searchWord = searchWord.toLowerCase();

    // get the top most element on page, it will be html in most cases
    WebElement html = driver.findElement(By.cssSelector("html"));

    // get all the visible text on the page, lower-cased to match searchWord
    String htmlText = html.getText().toLowerCase();

    // split on any run of whitespace (spaces, tabs, newlines) to get all words on the page
    String[] allWords = htmlText.split("\\s+");

    List<String> myWordList = new ArrayList<String>();

    // add all the words that contain your search word
    for (String word : allWords) {
        if (word.contains(searchWord)) {
            myWordList.add(word);
        }
    }

    return myWordList;
}
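
Note that splitting on whitespace keeps punctuation attached to the words (for example "good," or "goodmorning.") and returns a word once for every time it appears. If you want clean, unique words, something like the following could work. This is only a sketch under stated assumptions: the class name ContainingWordFinder and the method wordsContaining are hypothetical, and it expects the page text as a plain string (for example the result of html.getText()).

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Selenium: works on the page text as a plain string.
class ContainingWordFinder {

    // Returns the distinct words that contain searchWord, in order of first appearance.
    static List<String> wordsContaining(String pageText, String searchWord) {
        String needle = searchWord.toLowerCase(Locale.ROOT);
        // LinkedHashSet drops duplicates but keeps insertion order.
        Set<String> unique = new LinkedHashSet<>();
        // Split on runs of anything that is not a letter or digit, which strips punctuation.
        for (String word : pageText.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!word.isEmpty() && word.contains(needle)) {
                unique.add(word);
            }
        }
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        String sample = "Good, good! A goodmorning.\nFeelgood vibes";
        // prints [good, goodmorning, feelgood]
        System.out.println(wordsContaining(sample, "good"));
    }
}
```

Because it uses contains(), this also picks up words where the search word is in the middle or at the end, same as the method above.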