Parsing a text and getting the occurrences of a given word using its root
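I am developing a system that gives the exact meaning of a given word (obviously a polysemous one) according to its context. This research area is called word sense disambiguation. For this, I need the root (stem) of the given word and a text that contains it. I will parse the text and use the root to find all occurrences of the given word.
For example, if the given word is "love", the system parses the text and returns all occurrences of "love", such as "lovely, loved, beloved...".
Here is what I have tried, but unfortunately I did not get what I wanted!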
public class Partenn1 {
    public static void main(String[] args) {
        int c=0;
        String w = "tissue";
        try (BufferedReader br = new BufferedReader(new FileReader("D:/Sc46.txt")))
        {
            String line;
            while ((line = br.readLine()) != null)
            {
                String[] WrdsLine = line.split(" ");
                boolean findwrd = false;
                for( String WrdLine : WrdsLine )
                {
                    for (int a=0; a<WrdsLine.length; a++)
                    {
                        if ( WrdsLine[a].indexOf(w)!=0)
                        {
                            c++; //It's just a counter for verification of the numbre of the occ.
                            findwrd = true;
                        }
                    }
                }
            }
            System.out.println(c);
        }
        catch (IOException e) {}
    }
}
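Here the root of the word is treated as its prefix, so the check can be done by calling the startsWith method on each word with that prefix.
The following code correctly prints "2", because 'tissue2' and 'tissue3' both start with 'tissue'.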
int count = 0;
final String prefix = "tissue";
try (BufferedReader br = new BufferedReader(new StringReader("tissue2 tiss tiss3 tissue3"))) {
    String line;
    while ((line = br.readLine()) != null) {
        // Get all the words on this line
        final String[] wordsInLine = line.split(" ");
        for (final String s : wordsInLine) {
            // Check that the word starts with the prefix.
            if (s.startsWith(prefix)) {
                count++;
            }
        }
    }
    System.out.println(count);
} catch (final IOException ignored) {
}
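The snippet above splits on single spaces and compares case-sensitively, so a token such as "Tissue" or "(tissue)" would be missed. A minimal sketch of one way to make the prefix check more tolerant follows; the class name PrefixCounter and the method countPrefixMatches are hypothetical names chosen for illustration, and splitting on non-letter characters is an assumption about what counts as a word (it also strips digits, so "tissue2" becomes "tissue").

```java
import java.util.Locale;

class PrefixCounter {
    // Hypothetical helper: counts the words in text that start with prefix, ignoring case.
    static int countPrefixMatches(String text, String prefix) {
        String needle = prefix.toLowerCase(Locale.ROOT);
        int count = 0;
        // Split on runs of non-letter characters so punctuation is not glued to the words.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            // split can yield an empty first token when the text starts with a delimiter.
            if (!token.isEmpty() && token.startsWith(needle)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Matches "Tissue", "tissues" and "(tissue)", but not "tiss": prints 3
        System.out.println(countPrefixMatches("Tissue samples: \"tissues\", tiss, (tissue).", "tissue"));
    }
}
```

For the input used in the answer, countPrefixMatches("tissue2 tiss tiss3 tissue3", "tissue") still returns 2.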
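There is no need for the second for loop; the enhanced for loop already visits every word on the line. Note also that indexOf(w) != 0 is true for every word that does not start with w, so the original condition counts the non-matching words. w is the string to look for here: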
while ((line = br.readLine()) != null) {
    String[] WrdsLine = line.split(" "); // split
    for( String WrdLine : WrdsLine ) {
        if ( WrdLine.contains(w)) { // if match - print
            System.out.println(WrdLine);
        }
    }
}
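To see how the contains check behaves on the "love" example from the question, here is a small self-contained sketch. The class name RootMatcher, the method findContaining, and the sample sentence are made up for illustration. Unlike a prefix check, contains does pick up "beloved", but it also picks up unrelated words such as "glove", so plain substring matching is only a rough stand-in for real stemming.

```java
import java.util.ArrayList;
import java.util.List;

class RootMatcher {
    // Hypothetical helper: returns every whitespace-separated word that contains root.
    static List<String> findContaining(String text, String root) {
        List<String> matches = new ArrayList<>();
        for (String word : text.split("\\s+")) {
            if (word.contains(root)) {
                matches.add(word);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Prints [lovely, beloved, glove]; "glove" is a false positive.
        System.out.println(findContaining("a lovely day for my beloved glove", "love"));
    }
}
```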