Comparing strings of text files
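I have three text files, file1, file2, and file3, all of which contain emails. file1 should contain all of the emails, file2 contains all of the A-M emails, and file3 contains the emails from N-Z (this isn't too important, but I figure it helps give some context).
I'm writing a console application in C# that will look at these three files, and if an email is missing from file1 or from the file it is supposed to be in, it will write to a master file stating what needs to be added to what.
For example, say I have the email john@example.com. If it is found in file1 but not in file2, the output to the master file needs to be "this email needs to be added to file2: john@example.com". Now, if it's reversed, and the email is found in file2 but not in file1, the output should be "this email needs to be added to file1: john@example.com".
Below is part of my code. The answer I'm looking for needs to be in some kind of foreach loop with an if statement, but I'm a little lost as to what I need to put in it. If someone could help me figure out what goes in my statement, I would be very grateful. If anyone has any questions about this, feel free to ask!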
//Making a list for file1
List<string> listFile1 = new List<string>();
string line;
StreamReader sr = new StreamReader("file1");
while ((line = sr.ReadLine()) != null)
{
listFile1.Add(line);
}
sr.Close();
//Making a list for file2
List<string> listFile2 = new List<string>();
string line1;
StreamReader sr1 = new StreamReader("file2");
while ((line1 = sr1.ReadLine()) != null)
{
listFile2.Add(line1);
}
sr1.Close();
//Making a list for file3
List<string> listFile3 = new List<string>();
string line2;
StreamReader sr2 = new StreamReader("file3");
while ((line2 = sr2.ReadLine()) != null)
{
listFile3.Add(line2);
}
sr2.Close();
//This will double check that emails are in
foreach (string element in listFile1)
{
System.Console.WriteLine(element);
Debug.WriteLine(element);
if (element == "jimbob@example.com")
{
Debugger.Break();
}
}
//this will compare the file1 list to the file2 list
var firstNotSecond = listFile1.Except(listFile2).ToList();
var secondNotFirst = listFile2.Except(listFile1).ToList();
//this will compare the file1 list to the file3 list
var firstNotThird = listFile1.Except(listFile3).ToList();
var thirdNotFirst = listFile3.Except(listFile1).ToList();
//this will compare the file2 list to the file3 list
var secondNotThird = listFile2.Except(listFile3).ToList();
var thirdNotSecond = listFile3.Except(listFile2).ToList();
foreach (string element in listFile1) // This is where I am lost
{
if (!)
{
}
}
You could try something simple like this:
//Making a list for file1
HashSet<string> listFile1 = new HashSet<string>();
string line;
StreamReader sr = new StreamReader("file1");
while ((line = sr.ReadLine()) != null)
{
listFile1.Add(line);
}
sr.Close();
//Making a list for file2
HashSet<string> listFile2 = new HashSet<string>();
string line1;
StreamReader sr1 = new StreamReader("file2");
while ((line1 = sr1.ReadLine()) != null)
{
listFile2.Add(line1);
}
sr1.Close();
//Making a list for file3
HashSet<string> listFile3 = new HashSet<string>();
string line2;
StreamReader sr2 = new StreamReader("file3");
while ((line2 = sr2.ReadLine()) != null)
{
listFile3.Add(line2);
}
sr2.Close();
IEnumerable<string> allEmails = listFile1.Union(listFile2).Union(listFile3);
// this will double check the emails
foreach (string element in allEmails)
{
if (!listFile1.Contains(element))
System.Console.WriteLine("file 1 is missing " + element);
int firstCharAscii = element.Trim().ToLower()[0];
if (firstCharAscii < 110)
{
// less than "n"
if (!listFile2.Contains(element))
System.Console.WriteLine("file 2 is missing " + element);
if (listFile3.Contains(element))
System.Console.WriteLine("file 3 erroneously contains " + element);
}
else
{
// "n" or greater
if (!listFile3.Contains(element))
System.Console.WriteLine("file 3 is missing " + element);
if (listFile2.Contains(element))
System.Console.WriteLine("file 2 erroneously contains " + element);
}
}
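Keep in mind that as the number of emails grows, the List<string>.Contains() method becomes an inefficient way to determine presence or absence, because each call scans the whole list. You would be better served by the HashSet<string> class. Also, if for some reason you are reading Unicode strings, you will need a more robust way to check the value of the first character.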
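As a rough sketch of both points (set-based lookups and a first-character check that does not rely on a hard-coded ASCII value), something along these lines might work. The names `EmailAudit`, `BelongsInFirstHalf`, `FindProblems`, and `Clean` are made up for illustration and are not part of the original code. It also assumes that emails should be compared case-insensitively, that blank lines should be ignored, and, like the loop above, that anything sorting before "N" belongs in file2.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EmailAudit
{
    // Hypothetical helper: true when the first non-blank character sorts
    // before 'N' after invariant upper-casing (so A-M, same split as above).
    public static bool BelongsInFirstHalf(string email)
    {
        string trimmed = email.Trim();
        if (trimmed.Length == 0)
            throw new ArgumentException("Empty email entry.", nameof(email));
        return char.ToUpperInvariant(trimmed[0]) < 'N';
    }

    // Returns one message per email that is missing from a file it belongs in.
    public static List<string> FindProblems(
        IEnumerable<string> file1, IEnumerable<string> file2, IEnumerable<string> file3)
    {
        // Assumption: "John@x.com" and "john@x.com" count as the same email.
        var comparer = StringComparer.OrdinalIgnoreCase;
        var all = new HashSet<string>(Clean(file1), comparer);
        var firstHalf = new HashSet<string>(Clean(file2), comparer);
        var secondHalf = new HashSet<string>(Clean(file3), comparer);

        var problems = new List<string>();
        foreach (string email in all.Union(firstHalf, comparer).Union(secondHalf, comparer))
        {
            if (!all.Contains(email))
                problems.Add("this email needs to be added to file1: " + email);

            bool inFirstHalf = BelongsInFirstHalf(email);
            if (inFirstHalf && !firstHalf.Contains(email))
                problems.Add("this email needs to be added to file2: " + email);
            if (!inFirstHalf && !secondHalf.Contains(email))
                problems.Add("this email needs to be added to file3: " + email);
        }
        return problems;
    }

    // Trims each line and drops blank ones so stray whitespace does not
    // produce false mismatches.
    private static IEnumerable<string> Clean(IEnumerable<string> lines) =>
        lines.Select(l => l.Trim()).Where(l => l.Length > 0);
}
```

To use it with the files from the question, you could pass `File.ReadLines("file1")`, `File.ReadLines("file2")`, and `File.ReadLines("file3")` to `FindProblems` and hand the result to `File.WriteAllLines` with whatever path your master file has.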
List<string> fullList = File.ReadAllLines("fullist.txt").ToList<string>();
List<string> firstList = File.ReadAllLines("list1.txt").ToList<string>();
List<string> secondList = File.ReadAllLines("list2.txt").ToList<string>();
firstList.ForEach(m => {if (!fullList.Contains(m)/*Or other logic*/) {fullList.Add(m+" from 1 Needs to be in master");}});
secondList.ForEach(m => {if (!fullList.Contains(m)/*Or other logic*/) {fullList.Add(m+ " from 2 Needs to be in master");}});
This should do it for you.
List<string> file1Parsed = new List<string>();
List<string> file2Parsed = new List<string>();
List<string> file3Parsed = new List<string>();
using (StreamReader readerFile1 = new StreamReader(@"c:\file1.txt"))
{
while (!readerFile1.EndOfStream)
{
file1Parsed.Add(readerFile1.ReadLine());
}
}
using (StreamReader readerFile2 = new StreamReader(@"c:\file2.txt"))
{
while (!readerFile2.EndOfStream)
{
file2Parsed.Add(readerFile2.ReadLine());
}
}
using (StreamReader readerFile3 = new StreamReader(@"c:\file3.txt"))
{
while (!readerFile3.EndOfStream)
{
file3Parsed.Add(readerFile3.ReadLine());
}
}
char[] firstSet = { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M' };
char[] secondSet = { 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z' };
var file1InFile2 = file1Parsed.Where(y => y.ToUpper().IndexOfAny(firstSet) == 0).Select(x => x);
var file1InFile3 = file1Parsed.Where(y => y.ToUpper().IndexOfAny(secondSet) == 0).Select(x => x);
using (StreamWriter writer = new StreamWriter(@"C:\notExists.txt"))
{
file1InFile2.Where(x => !file2Parsed.Contains(x.Trim())).ToList().ForEach(y => writer.WriteLine("This email needs to be added to file2: " + y));
file1InFile3.Where(x => !file3Parsed.Contains(x.Trim())).ToList().ForEach(y => writer.WriteLine("This email needs to be added to file3: " + y));
file2Parsed.Where(x => !file1InFile2.Contains(x.Trim())).ToList().ForEach(y => writer.WriteLine("This email needs to be added to file1: " + y));
file3Parsed.Where(x => !file1InFile3.Contains(x.Trim())).ToList().ForEach(y => writer.WriteLine("This email needs to be added to file1: " + y));
}
I modified it to reduce the code.