借助 C# 中的字典(或其他数据结构)降低嵌套循环的复杂性

Reducing nested loop complexity with help of Dictionary (or other data structure) in C#

我有兴趣在字典(或可能是其他数据结构)的帮助下降低下面这段代码的时间复杂度。

据我所知,我的蛮力解决方案的时间复杂度为 O(n^2),但是,可能可以在 O(n) 中完成(在 n 次非嵌套循环中)。

任务是针对每一天和每一地点打印哺乳动物观测的那一天和每一地点的观测百分比。

foreach (var day in EachDay(GetDateTimeForFirstObservation(animalObservations),
GetDateTimeForLastObservation(animalObservations)))
{
    Console.WriteLine("Day: {0}", day.ToString("dd/MM/yyyy"));

    foreach (var location in EachLocation(animalObservations
        .Where(ao => ao.Datetime.Day == day.Day).ToList()))
    {
        Console.WriteLine("Location: {0}", location);

        numOfAllAnimalsInLocationAndDay = animalObservations
            .Where(aob => aob.Location == location &&
                aob.Datetime.Date == day).Count();

        numOfMammalsAnimalsInLocationAndDay = animalObservations
            .Where(aob => aob.Location == location &&
            aob.Datetime.Date == day && aob.Animal.IsMammal).Count();

        Console.WriteLine("Percentage of Mammals in location and day: {0:N2}%",
            numOfMammalsAnimalsInLocationAndDay/numOfAllAnimalsInLocationAndDay * 100);
    }
}

输入看起来像这样:

[
{"DateTime":"2020-02-22 10:10:15", "Location":"Backyard", "Animal": {"Species":"Camel", "IsMammal": "TRUE"}},
{"DateTime":"2020-02-22 11:10:15", "Location":"Backyard", "Animal": {"Species":"Camel", "IsMammal": "TRUE"}},
{"DateTime":"2020-02-22 12:10:15", "Location":"Backyard", "Animal": {"Species":"Ant", "IsMammal": "FALSE"}},
{"DateTime":"2020-02-22 22:10:15", "Location":"Sky", "Animal": {"Species":"Flamingo", "IsMammal": "TRUE"}},
{"DateTime":"2020-02-22 23:10:15", "Location":"Sky", "Animal": {"Species":"Bee", "IsMammal": "FALSE"}},
{"DateTime":"2020-02-23 13:11:15", "Location":"City", "Animal": {"Species":"Racoon", "IsMammal": "TRUE"}},
{"DateTime":"2020-02-24 15:10:00", "Location":"City", "Animal": {"Species":"Dog", "IsMammal": "TRUE"}},
{"DateTime":"2020-02-24 19:10:00", "Location":"City", "Animal": {"Species":"Fly", "IsMammal": "FALSE"}},
{"DateTime":"2020-02-24 19:10:15", "Location":"City", "Animal": {"Species":"Butterfly", "IsMammal": "FALSE"}},
{"DateTime":"2020-02-24 19:10:20", "Location":"City", "Animal": {"Species":"Cat", "IsMammal": "TRUE"}},
{"DateTime":"2020-02-24 19:10:30", "Location":"City", "Animal": {"Species":"Flee", "IsMammal": "FALSE"}},
{"DateTime":"2020-02-25 21:10:15", "Location":"Village", "Animal": {"Species":"Horse", "IsMammal": "TRUE"}},
{"DateTime":"2020-02-25 22:10:15", "Location":"Village", "Animal": {"Species":"Fly", "IsMammal": "FALSE"}},
{"DateTime":"2020-02-25 23:10:15", "Location":"Village", "Animal": {"Species":"Bee", "IsMammal": "FALSE"}},
{"DateTime":"2020-02-25 10:10:15", "Location":"Home", "Animal": {"Species":"Iguana", "IsMammal": "FALSE"}}
]

所需的输出:

Day: 22.02.2020
Location: Backyard
Percentage of Mammals in location and day: 66,67%
Location: Sky
Percentage of Mammals in location and day: 50,00%
Day: 23.02.2020
Location: City
Percentage of Mammals in location and day: 100,00%
Day: 24.02.2020
Location: City
Percentage of Mammals in location and day: 40,00%
Day: 25.02.2020
Location: Village
Percentage of Mammals in location and day: 33,33%
Location: Home
Percentage of Mammals in location and day: 0,00%

有很多真正疯狂的技巧可以降低时间复杂度,但大多数时候处理数据时,您必须依赖于数据初始排序的方式。如果它没有被订购,我们可以通过一些复合键来订购它。在您的情况下,关键是 Tuple<DateTime, Location> ,其中 Item1 是日期, Item2 是位置。这将需要 n*log(n),然后使用线性时间遍历数据,使用线性时间生成结果。

因此查看您的数据,它已经排序。所以我们可以跳过那部分,直接走一遍。基本思想我们初始化一些状态并检查它是否改变我们产生结果。在我们的案例中,状态是当天、当前位置,我们跟踪两个变量的信息,总动物数,总哺乳动物数。

    public static void PrintPopulation(List<AnimalObservations> animalObservations)
    {
        if (animalObservations.Count == 0)
            return;
        var item = animalObservations[0];
        string currentLocation = item.Location;
        DateTime currentDate = item.DateTime.Date;
        int totalAnimals = 1;
        int totalMammals = item.Animal.IsMammal ? 1 : 0;
        for (int i = 1; i < animalObservations.Count; i++)
        {
            item = animalObservations[i];
            if (currentLocation != item.Location ||
                currentDate != item.DateTime.Date)
            {
                PrintResult(currentDate, currentLocation, totalAnimals, totalMammals);
                totalMammals = 0;
                totalAnimals = 0;
                currentLocation = item.Location;
                currentDate = item.DateTime.Date;
            }

            totalAnimals++;
            totalMammals += item.Animal.IsMammal ? 1 : 0;
        }
        PrintResult(currentDate, currentLocation, totalAnimals, totalMammals);
    }

    public static void PrintResult(DateTime date, string location, int totalAnimals, int totalMammals)
    {
        Console.WriteLine($"{date} {location} {(double) totalMammals / totalAnimals}");
    }

我假设

public class AnimalObservations
{
    public DateTime DateTime { get; set; }
    public string Location { get; set; }
    public Animal Animal { get; set; }
}

public class Animal
{
    public bool IsMammal { get; set; }
}

我认为您的解决方案实际上是 O(n^3) 的复杂性,因为您进行了 3 次嵌套迭代:

  1. 每个不同的日子
  2. 当天每个不同的位置
  3. 计算日-地点对的动物和哺乳动物的数量 --- 这是你在 Linq 表达式中给出的,所以不是很明显

鉴于您具有以下 class 结构:

public class AnimalObservation {
    public DateTime DateTime { get; set; }
    public string Location { get; set; }
    public Animal Animal { get; set; }
}

public class Animal {
    public string Species { get; set; }
    public bool IsMammal { get; set; }
}

您可以在 O(n) 中使用两本词典(一本用于动物,一本用于哺乳动物)来执行此操作,它们将日期-位置对作为键,将计数器作为值

    IDictionary<ValueTuple<DateTime, string>, int> animals = new Dictionary<ValueTuple<DateTime, string>, int>(new DayLocationComparer());
    IDictionary<ValueTuple<DateTime, string>, int> mammals = new Dictionary<ValueTuple<DateTime, string>, int>(new DayLocationComparer());
    foreach (AnimalObservation ao in aos) {
        ValueTuple<DateTime, string> dayLocation = new ValueTuple<DateTime, string>(ao.DateTime, ao.Location);

        if (!animals.ContainsKey(dayLocation)) {
            animals.Add(dayLocation, 1);
        } else {
            animals[dayLocation] = animals[dayLocation] + 1;
        }


        if (!mammals.ContainsKey(dayLocation) && ao.Animal.IsMammal) {
            mammals.Add(dayLocation, 1);
        } else if (!mammals.ContainsKey(dayLocation) && !ao.Animal.IsMammal) {
            mammals.Add(dayLocation, 0);
        } else if (mammals.ContainsKey(dayLocation) && ao.Animal.IsMammal) {
            animals[dayLocation] = animals[dayLocation] + 1;
        }
    }

    foreach (ValueTuple<DateTime, string> dayLocation in animals.Keys) {
        int nrOfAnimals = animals[dayLocation];
        int nrOfMammals = mammals[dayLocation];
        Console.WriteLine((double)nrOfMammals / nrOfAnimals * 100);
    }

其中 DayLocationComparer 是忽略 DateTime

的时间部分的比较器
public class DayLocationComparer : EqualityComparer<ValueTuple<DateTime, string>> {
    public override bool Equals(ValueTuple<DateTime, string> x, ValueTuple<DateTime, string> y) => x.Item1.Date == y.Item1.Date && x.Item2 == y.Item2;
    public override int GetHashCode(ValueTuple<DateTime, string> x) => x.Item1.GetHashCode();
}

当然,我建议可能使用 class 作为日期-位置对以获得更易读的代码。

一种解决方案是使用 字典,其键类型包括日期和位置,值类型包含 mammal/non-mammal 计数。

试试下面的代码。您需要将顶部附近的字符串常量指向 JSON 文件的位置。您还需要将 NewtonSoft JSON 库添加到您的项目中。请注意,我已经覆盖了用作字典键类型的 class 中的 Equals 和 GetHashCode 方法。

namespace Mammals
{
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;

    using Newtonsoft.Json.Linq;

    class Program
    {
        static void Main(string[] args)
        {
            const string filePath = "C:\temp\mammals.json";

            // Read input from JSON:
            string jsonInput;
            using (var file = new StreamReader(new FileStream(filePath, FileMode.Open)))
            {
                jsonInput = file.ReadToEnd();
            }

            var json = JArray.Parse(jsonInput);
            var sightings = json.Select(j => j.ToObject<Sighting>());

            // Set up dictionary, using day/location as key:
            var sightingDictionary = new Dictionary<SightingDayAndPlace, SightingCounter>();

            // Loop through sightings in O(n):
            foreach (var sighting in sightings)
            {
                var sightingTimeAndPlace = sighting.GetSightingDayAndPlace;
                if (!sightingDictionary.ContainsKey(sightingTimeAndPlace))
                {
                    sightingDictionary.Add(sightingTimeAndPlace, new SightingCounter());
                }

                if (sighting.IsMammal)
                {
                    sightingDictionary[sightingTimeAndPlace].MammalCount++;
                }
                else
                {
                    sightingDictionary[sightingTimeAndPlace].NonMammalCount++;
                }
            }

            // Print output:
            var currentDay = default(DateTime);
            foreach (var item in sightingDictionary)
            {
                var key = item.Key;
                if (key.Day != currentDay)
                {
                    Console.WriteLine($"Day: {key.Day:dd.MM.yyyy}");
                }

                Console.WriteLine($"Location: {key.Location}");
                Console.WriteLine($"Percentage of Mammals in location and day: {item.Value.MammalPercentage:F}%");
            }

            Console.ReadKey();
        }

        private class SightingDayAndPlace
        {
            public SightingDayAndPlace(DateTime day, string location)
            {
                this.Day = day;
                this.Location = location;
            }

            public DateTime Day { get; }

            public string Location { get; }

            public override bool Equals(object obj)
            {
                if (obj == null || !(obj is SightingDayAndPlace that))
                {
                    return false;
                }

                return this.Day == that.Day
                       && this.Location == that.Location;
            }

            public override int GetHashCode()
            {
                // Consider a different implementation if memory or performance is relevant.
                return new { this.Day, this.Location }.GetHashCode();
            }
        }

        private class Sighting
        {
            public DateTime DateTime { get; set; }
            public string Location { get; set; }
            public Animal Animal { get; set; }
            public string Species => Animal.Species;
            public bool IsMammal => Animal.IsMammal;
            public DateTime Day => DateTime.Date;

            public SightingDayAndPlace GetSightingDayAndPlace => new SightingDayAndPlace(this.Day, this.Location);
        }

        private class Animal
        {
            public string Species { get; set; }
            public bool IsMammal { get; set; }
        }

        private class SightingCounter
        {
            public int MammalCount { get; set; }
            public int NonMammalCount { get; set; }

            public double MammalPercentage => (MammalCount / ((double)MammalCount + NonMammalCount)) * 100;
        }
    }
}