How Do I Return a Different Value When Iterating Over a List of Lists?
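Question
I have a for loop that builds a list of lists, where each entry holds an input and its associated output. I can't figure out how to iterate over the outputs once the list is built and return the corresponding input. I was able to solve my problem by converting the list to a DataFrame and using .loc[], but I'm stubborn and want to produce the same result without converting to a DataFrame. I also don't want to convert it to a dictionary; I've already solved that case as well.
I've included the generated list as well as the DataFrame conversion that works. In this case best_tree_size should return 100, because its output is the smallest result.
Current working code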
import pandas as pd

# get_mae, train_X, val_X, train_y and val_y are defined elsewhere
candidate_max_leaf_nodes = [5, 25, 50, 100, 250, 500]
# list placeholder for the loop results
leaf_list = []
# loop to find the ideal tree size from candidate_max_leaf_nodes
for max_leaf_nodes in candidate_max_leaf_nodes:
    # each iteration produces a 2-item list [leaf, MAE], which is appended to leaf_list
    leaf_list.append([max_leaf_nodes, get_mae(max_leaf_nodes, train_X, val_X, train_y, val_y)])
# convert the list of lists into a DataFrame
scores = pd.DataFrame(leaf_list, columns=['Leaf', 'MAE'])
# store the best value of max_leaf_nodes (it will be either 5, 25, 50, 100, 250 or 500)
# idxmin() finds the minimum MAE and returns its DataFrame index
# .loc uses that index to return the corresponding value from the Leaf column
best_tree_size = scores.loc[scores.MAE.idxmin(), 'Leaf']
# clear the list placeholder (if needed)
leaf_list.clear()
Resulting leaf_list
[[5, 35044.51299744237],
[25, 29016.41319191076],
[50, 27405.930473214907],
[100, 27282.50803885739],
[250, 27893.822225701646],
[500, 29454.18598068598]]
Converted scores DataFrame (columns Leaf and MAE; table not preserved)
So you have a list of [leaf, MAE] pairs, and you want the item with the smallest MAE from that list?
You can do it like this:
scores = [
[5, 35044.51299744237],
[25, 29016.41319191076],
[50, 27405.930473214907],
[100, 27282.50803885739],
[250, 27893.822225701646],
[500, 29454.18598068598]
]
from operator import itemgetter
best_leaf, best_mae = min(scores, key=itemgetter(1))
# best_leaf will be equal to 100, best_mae will be equal to 27282.50803885739
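The key here is itemgetter(1), which returns a callable that, when passed a tuple or list, returns the element at index 1 (here, the MAE).
We use it as the key for min() so that the elements are compared by their MAE values.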
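To make the role of the key function more concrete, here is a small sketch that calls itemgetter(1) on its own and then compares it with an equivalent lambda. The names get_mae_field, first_mae, best_with_itemgetter and best_with_lambda are made up for illustration only.

```python
from operator import itemgetter

scores = [
    [5, 35044.51299744237],
    [25, 29016.41319191076],
    [50, 27405.930473214907],
    [100, 27282.50803885739],
    [250, 27893.822225701646],
    [500, 29454.18598068598],
]

# itemgetter(1) builds a callable; calling it on a pair returns pair[1]
get_mae_field = itemgetter(1)  # hypothetical name, for illustration
first_mae = get_mae_field(scores[0])

# min() calls the key on every pair and returns the pair whose key is smallest
best_with_itemgetter = min(scores, key=itemgetter(1))

# a lambda that indexes position 1 should behave the same way
best_with_lambda = min(scores, key=lambda pair: pair[1])

# the leaf count is the first element of the winning pair
best_tree_size = best_with_itemgetter[0]
print(best_tree_size)
```

Note that min() returns the whole [leaf, MAE] pair, not just the MAE, which is why the leaf value can be read straight from the result.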
NumPy style:
leaf_list = [
[5, 35044.51299744237],
[25, 29016.41319191076],
[50, 27405.930473214907],
[100, 27282.50803885739],
[250, 27893.822225701646],
[500, 29454.18598068598]
]
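import numpy as np

# to numpy (the integer leaf values become floats)
leaf_list = np.array(leaf_list)
# second column holds the outputs (MAE); comparing only this column
# avoids matching a leaf value that happens to equal the minimum MAE
mae = leaf_list[:, 1]
# row index (or indices) where the MAE equals the minimum
index = np.where(mae == mae.min())[0]
# output rows
out_list = leaf_list[index]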
Output:
array([[ 100. , 27282.50803886]])
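If only a single best entry is needed, a shorter NumPy variant is probably np.argmin on the MAE column. This is a sketch rather than part of the answer above; leaf_arr and best_row are illustrative names, and ties are resolved by taking the first minimum.

```python
import numpy as np

leaf_arr = np.array([
    [5, 35044.51299744237],
    [25, 29016.41319191076],
    [50, 27405.930473214907],
    [100, 27282.50803885739],
    [250, 27893.822225701646],
    [500, 29454.18598068598],
])

# column 1 holds the MAE values; argmin returns the row index of the first minimum
best_row = leaf_arr[np.argmin(leaf_arr[:, 1])]

# the array is float-typed, so cast the leaf count back to int
best_tree_size = int(best_row[0])
print(best_tree_size)
```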
It also handles several minimum values (the same number):
leaf_list = [[14, 6],
[25, 55],
[5, 6]]
#... same code
Output:
array([[14, 6],
[ 5, 6]])
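The same tie handling can be sketched without NumPy, which may suit the original goal of staying with plain lists. The names lowest, tied and tied_leaves are illustrative assumptions, not taken from either answer.

```python
leaf_list = [[14, 6],
             [25, 55],
             [5, 6]]

# smallest output value across all [input, output] pairs
lowest = min(output for _, output in leaf_list)

# keep every pair whose output equals that minimum, in original order
tied = [pair for pair in leaf_list if pair[1] == lowest]

# the inputs that produced the minimum output
tied_leaves = [leaf for leaf, _ in tied]
print(tied_leaves)
```

This makes two passes over the list, which should be fine for a handful of candidate sizes.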