Iterate over two dictionaries with lists as values

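I have data about employees' punch-clock times.

An employee can report their start time either through an automated process (key card or fingerprint) or manually through a simple web form.

The problem is that some employees accidentally reported their time by more than one method.

The input data comes from two dictionaries, with the values shown in the code below.

I'm trying to iterate over the two dictionaries and count (not sum) the days on which the employee logged in.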

def count_type(logins, punch_clock_data):
    counter = 0
    ans_dict = {}
    for punch_clock_key, punch_clock_value in punch_clock_data.items():
        for element in punch_clock_value:
            if element not in ans_dict:
                l1 = logins.get(element)
                for i in range(len(l1)):
                    if i == 1:
                        counter+=1
                        break
            ans_dict[punch_clock_key] = counter
        print(f'this employee was at work {counter} days this week by {punch_clock_key} login')
    return ans_dict


# each item in the array is a representation of weekday. for example, 
# on Sunday there was not any log in data.
# on Tuesday, this employee reported login both on web form and by 
# card(flag = 1) etc.


logins = { 'card'        :[0, 1, 1, 0, 0, 0, 0],
           'fingerprint' :[0, 0, 0, 1, 1, 0, 0],
           'web form'    :[0, 0, 1, 1, 0, 1, 1]
}


# dictionary contains data on types of punch clock
punch_clock_data  = { 'automated' :['card', 'fingerprint'],
                      'manual'    :['web form'],
                      'all_types' :['card', 'fingerprint', 'web form']
}

res = count_type(logins, punch_clock_data)
print(res)

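My output is not what I expected. This is my output:

{'automated': 2, 'manual': 3, 'all_types': 6}

But what I want to get is:

{'automated': 4, 'manual': 4, 'all_types': 6}

I think my problem is that I need to iterate over all the weekday lists by index rather than by value: for each day of the week, take the matching index from every list and count it (vertically rather than horizontally).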

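It looks like you want to count a day only once even when the employee logged in by more than one method within a given punch-clock category. You can zip together the login lists of the methods in each category and test whether any of them recorded a login on that day.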

logins = {'card': [0, 1, 1, 0, 0, 0, 0], 'fingerprint' :[0, 0, 0, 1, 1, 0, 0], 'web form': [0, 0, 1, 1, 0, 1, 1]}
punch_clock_data = { 'automated': ['card', 'fingerprint'], 'manual': ['web form'], 'all_types': ['card', 'fingerprint', 'web form']}

results = {}
for group, keys in punch_clock_data.items():
    results[group] = sum(any(t) for t in zip(*[logins[k] for k in keys]))

print(results) 
# {'automated': 4, 'manual': 4, 'all_types': 6}

Following your comment asking for a version in which the individual steps are easier to see, here is a bit of a breakdown.

results = {}
for group, keys in punch_clock_data.items():
    # list of lists of logins for each method in the category
    all_logins = [logins[k] for k in keys]

    # zip all_logins by day of the week
    logins_per_day = zip(*all_logins)

    # add 1 for each day where any of the values in the tuple are not zero
    results[group] = sum(any(t) for t in logins_per_day)
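
To make the breakdown concrete, here is a small sketch that prints the intermediate values for the 'automated' category only. The variable name `flags` is introduced just for this illustration; everything else comes from the question's data.

```python
logins = {
    'card':        [0, 1, 1, 0, 0, 0, 0],
    'fingerprint': [0, 0, 0, 1, 1, 0, 0],
    'web form':    [0, 0, 1, 1, 0, 1, 1],
}

# the two methods that belong to the 'automated' category
all_logins = [logins[k] for k in ['card', 'fingerprint']]

# zip(*...) transposes the rows into one tuple per weekday
logins_per_day = list(zip(*all_logins))
print(logins_per_day)
# [(0, 0), (1, 0), (1, 0), (0, 1), (0, 1), (0, 0), (0, 0)]

# any() is True when at least one method recorded a login that day
flags = [any(t) for t in logins_per_day]
print(flags)
# [False, True, True, True, True, False, False]

# True counts as 1 when summed
print(sum(flags))  # 4
```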

Have a look at this code:

def count_type(logins, punch_clock_data):
    ans_dict = {}
    for punch_clock_key, punch_clock_value in punch_clock_data.items():
        counter = 0
        tmp_tab = [0] * 7
        for login_key in punch_clock_value:
            for i in range(len(logins[login_key])):
                tmp_tab[i] += logins[login_key][i]
        for day in tmp_tab:
            counter += day > 0
        ans_dict[punch_clock_key] = counter
    return ans_dict

For all_types, for example, I build a tmp_tab that merges your 3 lists into:

[0, 1, 2, 2, 1, 1, 1]

So tmp_tab holds the sum of each column, and the counter is increased by 1 for every column whose value is greater than 0.
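
As a quick check of that explanation, the column sums and the resulting count for all_types can be reproduced in a few lines. This is only a sketch of the same idea written with zip instead of index loops; the name `keys` is an assumption made for the example.

```python
logins = {
    'card':        [0, 1, 1, 0, 0, 0, 0],
    'fingerprint': [0, 0, 0, 1, 1, 0, 0],
    'web form':    [0, 0, 1, 1, 0, 1, 1],
}
keys = ['card', 'fingerprint', 'web form']  # the 'all_types' category

# add the lists column by column: one total per weekday
tmp_tab = [sum(col) for col in zip(*(logins[k] for k in keys))]
print(tmp_tab)  # [0, 1, 2, 2, 1, 1, 1]

# a day counts once if its total is above zero,
# no matter how many methods were used that day
counter = sum(1 for day in tmp_tab if day > 0)
print(counter)  # 6
```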

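The key here is that the logins have to be combined per day like this (a day is marked 1 if any method has a 1, so it is an element-wise OR rather than a plain sum). Example for all_types:

       'card'        :[0, 1, 1, 0, 0, 0, 0]
       'fingerprint' :[0, 0, 0, 1, 1, 0, 0]
       'web form'    :[0, 0, 1, 1, 0, 1, 1]
       'all type'    :[0, 1, 1, 1, 1, 1, 1]  total = 6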
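
The 'all type' row above can be reproduced with a bitwise OR over the three lists. This is a minimal sketch that assumes every flag is exactly 0 or 1; the variable names are chosen for the illustration.

```python
card        = [0, 1, 1, 0, 0, 0, 0]
fingerprint = [0, 0, 0, 1, 1, 0, 0]
web_form    = [0, 0, 1, 1, 0, 1, 1]

# OR the three flags of each weekday: 1 if any method was used that day
all_type = [c | f | w for c, f, w in zip(card, fingerprint, web_form)]
print(all_type)       # [0, 1, 1, 1, 1, 1, 1]
print(sum(all_type))  # 6
```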

So you can try this:

NUMBER_OF_DAY = 7
def count_type(logins, punch_clock_data):
    ans_dict = {}
    for punch_clock_key, punch_clock_values in punch_clock_data.items():
        # list of all login
        element_list = [logins[punch_clock_value] for punch_clock_value in punch_clock_values]
        # compute the sum of each day
        # EX:
        # [0, 1, 1, 0, 0, 0, 0] + [0, 0, 0, 1, 1, 0, 0]
        # total equal to = [0, 1, 1, 1, 1, 0, 0]
        total_days = [0] * NUMBER_OF_DAY
        for element in element_list:
            for day_index, is_checked in enumerate(element):
                # is_checked is 1 if he checked in that day, otherwise 0
                if is_checked:
                    # he checked in that day: mark it with 1 in the total instead of summing
                    total_days[day_index] = 1

        # now just sum the per-day flags
        ans_dict[punch_clock_key] = sum(total_days)
    return ans_dict

Using zip and a list comprehension helps shorten the code:

def count_type(logins, punch_clock_data):
    ans_dict = {}
    for punch_clock_key, punch_clock_values in punch_clock_data.items():
        # list of all login
        element_list = [logins[punch_clock_value] for punch_clock_value in punch_clock_values]
        # zip them so when we iterate over them we get a tuple of login of one day in each iteration
        element_list = zip(*element_list)
        total = 0
        for days in element_list:
            total += any(days) and 1 or 0
        ans_dict[punch_clock_key] = total
    return ans_dict

Now we can simplify the code further, from this:

  element_list = [logins[punch_clock_value] for punch_clock_value in punch_clock_values]
  element_list = zip(*element_list)

  # to this 
  element_list = zip(*[logins[punch_clock_value] for punch_clock_value in punch_clock_values])

And, thanks to the built-in sum, from this:

    total = 0
    for days in element_list:
        total += any(days) and 1 or 0
    ans_dict[punch_clock_key] = total


    # to this 
    ans_dict[punch_clock_key] = sum(any(days) for days in element_list)

So the final function is:

def count_type(logins, punch_clock_data):
    ans_dict = {}
    for punch_clock_key, punch_clock_values in punch_clock_data.items():
        # zip the login lists so each iteration yields one day's logins across all methods
        element_list = zip(*[logins[punch_clock_value] for punch_clock_value in punch_clock_values])
        ans_dict[punch_clock_key] = sum(any(days) for days in element_list)
    return ans_dict
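
A short usage sketch of the final function with the data from the question follows. The second call uses made-up data (`short`, and the category name 'both') to illustrate one caveat: zip stops at the shortest list, so a list with missing days silently shortens the week.

```python
def count_type(logins, punch_clock_data):
    ans_dict = {}
    for punch_clock_key, punch_clock_values in punch_clock_data.items():
        # one tuple per weekday, holding that day's flag for every method
        element_list = zip(*[logins[v] for v in punch_clock_values])
        ans_dict[punch_clock_key] = sum(any(days) for days in element_list)
    return ans_dict

logins = {
    'card':        [0, 1, 1, 0, 0, 0, 0],
    'fingerprint': [0, 0, 0, 1, 1, 0, 0],
    'web form':    [0, 0, 1, 1, 0, 1, 1],
}
punch_clock_data = {
    'automated': ['card', 'fingerprint'],
    'manual':    ['web form'],
    'all_types': ['card', 'fingerprint', 'web form'],
}

res = count_type(logins, punch_clock_data)
print(res)
# {'automated': 4, 'manual': 4, 'all_types': 6}

# hypothetical data: 'card' has only 3 days, so only 3 days are compared
short = {'card': [0, 1, 1], 'web form': [0, 0, 1, 1, 0, 1, 1]}
res_short = count_type(short, {'both': ['card', 'web form']})
print(res_short)  # {'both': 2}
```

If lists of unequal length should be treated as an error, Python 3.10 and later accept zip(..., strict=True), which raises ValueError instead of truncating.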