Sum Values of Python Dictionary (Time/Space Complexity)
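I am trying to solve the following problem:
Given a list of birth dates and a list of death dates, find the year in which the most people were alive.
Here is my code so far: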
b = [1791, 1796, 1691, 1907, 1999, 2001, 1907]  # birth dates
d = [1800, 1803, 1692, 1907, 1852, 1980, 2006]  # death dates
year_dict = {}  # key is the year, value is the net change in people alive
for birth in b:
    year_dict.setdefault(birth, 0)  # sets default value of key to 0
    year_dict[birth] += 1  # adds +1 for each birth and sums duplicates
for death in d:
    year_dict.setdefault(death, 0)  # sets default value of key to 0
    year_dict[death] += -1  # adds -1 for each death and sums duplicates
The code above returns:
{1791: 1, 1796: 1, 1691: 1, 1907: 1, 1999: 1, 2001: 1, 1800: -1, 1803: -1, 1692: -1, 1852: -1, 1980: -1, 2006: -1}
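Now I am looking for a way to build a running sum so I can find which year had the largest population, for example:
(Image of desired result)
As we can see, for the given data set the result shows that 1796 is the year with the most people alive. I am having trouble with the running-sum part, which needs to take each key's value and add it to the total of the previous values. I have tried several different loops and enumerations but am now stuck. Once I find the best way to solve this, I will wrap it in a function for efficiency.
Please let me know if there is a more efficient approach in terms of time/space complexity. I am trying to learn about efficiency with Python. Any help is much appreciated!
Do you want the results stored in a particular data structure? I got the same result as the imgur link printed to the terminal. Writing it to a dictionary is not hard either.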
from collections import OrderedDict

b = [1791, 1796, 1691, 1907, 1999, 2001, 1907]  # birth dates
d = [1800, 1803, 1692, 1907, 1852, 1980, 2006]  # death dates
year_dict = {}  # key is the year, value is the net change in people alive
for birth in b:
    year_dict.setdefault(birth, 0)  # sets default value of key to 0
    year_dict[birth] += 1  # adds +1 for each birth and sums duplicates
for death in d:
    year_dict.setdefault(death, 0)  # sets default value of key to 0
    year_dict[death] += -1  # adds -1 for each death and sums duplicates

# sort by year so the running sum is accumulated chronologically
year_dict = OrderedDict(sorted(year_dict.items(), key=lambda t: t[0]))
solution_dict = {}
total = 0
print('year net_living running_sum')
for year in year_dict:
    total += year_dict[year]
    solution_dict.update({year: {'net_living': year_dict[year],
                                 'running_sum': total}
                          })
    print('{} {:4} {:10}'.format(year, year_dict[year], total))
Output:
year net_living running_sum
1691    1          1
1692   -1          0
1791    1          1
1796    1          2
1800   -1          1
1803   -1          0
1852   -1         -1
1907    1          0
1980   -1         -1
1999    1          0
2001    1          1
2006   -1          0
Contents of solution_dict:
{
1691: {'net_living': 1, 'running_sum': 1},
1692: {'net_living': -1, 'running_sum': 0},
1791: {'net_living': 1, 'running_sum': 1},
1796: {'net_living': 1, 'running_sum': 2},
1800: {'net_living': -1, 'running_sum': 1},
1803: {'net_living': -1, 'running_sum': 0},
1852: {'net_living': -1, 'running_sum': -1},
1907: {'net_living': 1, 'running_sum': 0},
1980: {'net_living': -1, 'running_sum': -1},
1999: {'net_living': 1, 'running_sum': 0},
2001: {'net_living': 1, 'running_sum': 1},
2006: {'net_living': -1, 'running_sum': 0}
}
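Since the question mentions wrapping the logic in a function and asks about complexity, here is a minimal sketch of the same running-sum idea. The function name year_with_most_living is hypothetical, and swapping setdefault for collections.Counter and the manual total for itertools.accumulate is my own substitution, not part of the answer above. It keeps the same convention: +1 in a birth year, -1 in a death year.

```python
from collections import Counter
from itertools import accumulate


def year_with_most_living(births, deaths):
    """Return (year, running_sum) for the year where the running sum peaks.

    Hypothetical helper: +1 in each birth year, -1 in each death year,
    the same convention as the dictionary-based code above.
    """
    net_change = Counter(births)   # +1 per birth, duplicates summed
    net_change.subtract(deaths)    # -1 per death
    years = sorted(net_change)     # distinct years in chronological order
    running = accumulate(net_change[year] for year in years)
    # max() returns the earliest year when several years share the peak
    return max(zip(years, running), key=lambda pair: pair[1])


b = [1791, 1796, 1691, 1907, 1999, 2001, 1907]  # birth dates
d = [1800, 1803, 1692, 1907, 1852, 1980, 2006]  # death dates
print(year_with_most_living(b, d))  # (1796, 2)
```

With n input dates and k distinct years, the tally is O(n), the sort is O(k log k), and the extra space is O(k). The sort dominates, so the sketch is O(n log n) in the worst case. It raises ValueError on empty input because max() has nothing to compare.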
I would use pandas and take advantage of its DataFrame object:
Make a data frame of people's birth and death years:
import numpy as np
import pandas as pd

born = [1791, 1796, 1691, 1907, 1999, 2001, 1907]  # birth dates
died = [1800, 1803, 1692, 1907, 1852, 1980, 2006]  # death dates
people = pd.DataFrame({'born': born, 'died': died})
Make a data frame that contains every year from the first listed birth to the last listed death:
years = pd.DataFrame(index=np.arange(people['born'].min(), people['died'].max() + 1))
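Find the total number of people alive in each of those years:
for year in years.index:
    # a person counts as alive only strictly between birth year and death year
    num_living = ((year > people['born']) & (year < people['died'])).sum()
    years.loc[year, 'total_living'] = num_living
Calling years.tail() produces the following: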
      total_living
2002           1.0
2003           1.0
2004           1.0
2005           1.0
2006           0.0
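From there, you can simply take the argmax of the 'total_living' column. (In current pandas versions, Series.idxmax returns the year label, while Series.argmax returns the integer position.)
To be clear, I have assumed the logical case in which people die after they are born, so that (as a result) there is never a negative number of people alive.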
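On the time/space question: the per-year loop above compares every year against every person, which is roughly O(Y * N) for Y years and N people. A difference array over the same year range brings this down to O(N + Y). The sketch below is plain Python rather than pandas, the function name living_per_year is hypothetical, and it assumes the same rule as the loop above: a person counts as alive only in years strictly between birth and death.

```python
def living_per_year(born, died):
    """Map each year from the first birth to the last death to a living count.

    Hypothetical helper. Assumption carried over from the pandas loop:
    a person is counted only in years strictly between birth and death.
    """
    first, last = min(born), max(died)
    # delta[i] is the change in the living count at year first + i
    delta = [0] * (last - first + 2)
    for b, d in zip(born, died):
        if d <= b:                   # death not after birth: never counted
            continue
        delta[b + 1 - first] += 1    # first counted year is b + 1
        delta[d - first] -= 1        # no longer counted from year d on
    totals, running = {}, 0
    for offset in range(last - first + 1):
        running += delta[offset]
        totals[first + offset] = running
    return totals


born = [1791, 1796, 1691, 1907, 1999, 2001, 1907]  # birth dates
died = [1800, 1803, 1692, 1907, 1852, 1980, 2006]  # death dates
totals = living_per_year(born, died)
peak_year = max(totals, key=totals.get)  # earliest year with the highest count
print(peak_year, totals[peak_year])      # 1797 2
```

Note that under the strict-inequality rule the peak is 1797 rather than 1796, because the person born in 1796 is first counted in the following year.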