Searching for words in a RichTextBox (Windows Forms, C#)
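I need help counting identical words in a richTextBox, and I also want to assign numbers/words to them. The box holds 500,000+ words, and each of them is written only 3 or 6 times.
One solution is to write down in advance the words I'm looking for, search the box for them, and assign a number/word to each. But that is far too much work; a faster way would help me a lot.
Thanks in advance!
Here is the code I'm working with right now; it doesn't represent what I'm asking for.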
for (int i = 0; i < Url.Length; i++)
{
    doc = web.Load(Url[i]);
    int x = 0;
    int y = 0;
    //here was a long list of whiles for *fors* down there,
    //removed to be cleaner for you to see
    //one example:
    //while (i == 0)
    //{
    //    x = 18;
    //    y = 19;
    //    break;
    //}
    for (int w = 0; w < x; w++)
    {
        string metascore = doc.DocumentNode.SelectNodes("//*[@class=\"trow8\"]")[w].InnerText;
        richTextBox1.AppendText("-----------");
        richTextBox1.AppendText(metascore);
        richTextBox1.AppendText("\r\n");
    }
    for (int z = 0; z < y; z++)
    {
        string metascore1 = doc.DocumentNode.SelectNodes("//*[@class=\"trow2\"]")[z].InnerText;
        richTextBox1.AppendText("-----------");
        richTextBox1.AppendText(metascore1);
        richTextBox1.AppendText("\r\n");
    }
}
Suppose you have a RichTextBox lying around; let's call it Bob:
// Needs System.Linq and System.Collections.Generic.
// Grab all the text from the RichTextBox.
// Note: TextRange and Document belong to the WPF RichTextBox; in Windows Forms, read Bob.Text instead.
TextRange BobRange = new TextRange(
    // TextPointer to the start of content in the RichTextBox.
    Bob.Document.ContentStart,
    // TextPointer to the end of content in the RichTextBox.
    Bob.Document.ContentEnd
);
// Assume words are separated by spaces, commas or periods; split on those and throw the words in a List
List<string> BobsWords = BobRange.Text.Split(new char[] { ' ', ',', '.' }, StringSplitOptions.RemoveEmptyEntries).ToList();
// Now use LINQ to grab what you want
int CountOfTheWordThe = BobsWords.Count(w => w == "The");
// Now that we have the list of words we can create a dictionary of word -> count
Dictionary<string, int> BobsWordsAndCounts = new Dictionary<string, int>();
foreach (string word in BobsWords)
{
    if (!BobsWordsAndCounts.ContainsKey(word))
        BobsWordsAndCounts.Add(word, BobsWords.Count(w => w == word));
}
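The dictionary loop above rescans the whole word list once per distinct word, which may become slow with 500,000+ words. A minimal sketch of a single-pass alternative follows. The class name `SinglePassWordCounter`, the method `Count`, the extra separators, and the case-insensitive comparison are assumptions made for illustration, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: counts every word in one pass over the text.
public static class SinglePassWordCounter
{
    public static Dictionary<string, int> Count(string text)
    {
        // Assumed separators; extend this list to match your data.
        var separators = new[] { ' ', ',', '.', '\r', '\n', '\t' };

        // Assumption: "The" and "the" should be counted as the same word.
        var counts = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);

        foreach (string word in text.Split(separators, StringSplitOptions.RemoveEmptyEntries))
        {
            // TryGetValue leaves current at 0 when the word has not been seen yet.
            counts.TryGetValue(word, out int current);
            counts[word] = current + 1;
        }
        return counts;
    }
}
```

In a Windows Forms application this could be called as `SinglePassWordCounter.Count(richTextBox1.Text)`, since the Windows Forms RichTextBox exposes its content through the `Text` property.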
Is this what you mean by assigning a number to a word? Do you want to know how many times each word occurs in the RichTextBox?
Maybe add some context?
Suppose you have some text, whether read from a file, downloaded from a web page, or taken from an RTB... LINQ will give you everything you need.
string textToCount = "...";
// split the text by delimiters of your own
List<string> words = textToCount.Split(' ', ',', '!', '?', '.', ';', '\r', '\n') // split by whatever characters
    .Where(s => !string.IsNullOrWhiteSpace(s)) // drop empty and whitespace-only entries
    .Select(w => w.ToLower()) // transform to lower case for easier comparison/count
    .ToList();
// use LINQ to group the words and project them into a dictionary
// with the word as key and its total number of appearances as value
// this is the actual magic. With this in place, it's easy to get statistics.
var groupedWords = words.GroupBy(w => w)
    .ToDictionary(k => k.Key, v => v.Count());
// to get various statistics
// total different words - count your dictionary entries
int distinctWords = groupedWords.Count;
// how many times the most frequent word is mentioned (computed once)
int maxCount = groupedWords.Values.Max();
// the most frequent word - the first entry whose count equals the maximum
string mostFrequentWord = groupedWords.First(kvp => kvp.Value == maxCount).Key;
// similar for min, just replace the call to Max() with a call to Min()
// print out your dictionary to see how often every word is mentioned
foreach (var wordStats in groupedWords)
{
    Console.WriteLine($"{wordStats.Key} - {wordStats.Value}");
}
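This solution is similar to the one in the previous post. The main difference is that it groups by word, which is simple, and puts the result into a dictionary. From there, it's easy to find out a lot of things.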
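The question also asks about assigning a number to each word, without saying what that number should be. The following is only a sketch under one possible reading: each distinct word gets a sequential ID, with the most frequent word numbered 1 and ties broken alphabetically. The class `WordNumbering` and the method `AssignIds` are hypothetical names introduced here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: gives each distinct word a sequential number.
public static class WordNumbering
{
    public static Dictionary<string, int> AssignIds(string text)
    {
        // Same delimiters as in the answer above.
        var separators = new[] { ' ', ',', '!', '?', '.', ';', '\r', '\n' };

        return text.Split(separators, StringSplitOptions.RemoveEmptyEntries)
            .Select(w => w.ToLowerInvariant())          // compare case-insensitively
            .GroupBy(w => w)                            // one group per distinct word
            .OrderByDescending(g => g.Count())          // most frequent word first
            .ThenBy(g => g.Key, StringComparer.Ordinal) // ties in alphabetical order
            .Select((g, index) => new { Word = g.Key, Id = index + 1 })
            .ToDictionary(x => x.Word, x => x.Id);
    }
}
```

If the numbers are meant to be something else, for example the order of first appearance, only the two ordering calls would need to change.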