Linux

Question

Looking at the scheduling statistics in /proc/<PID>/sched, you get output like the following:

[horro@system ~]$ cat /proc/1/sched
systemd (1, #threads: 1)
-------------------------------------------------------------------
se.exec_start                                :    2499611106.982616
se.vruntime                                  :          7952.917943
se.sum_exec_runtime                          :         58651.279127
se.nr_migrations                             :                53355
nr_switches                                  :               169561
nr_voluntary_switches                        :               168185
nr_involuntary_switches                      :                 1376
se.load.weight                               :              1048576
se.avg.load_sum                              :               343837
se.avg.util_sum                              :               338827
se.avg.load_avg                              :                    7
se.avg.util_avg                              :                    7
se.avg.last_update_time                      :     2499611106982616
policy                                       :                    0
prio                                         :                  120
clock-delta                                  :                  180
mm->numa_scan_seq                            :                    1
numa_pages_migrated                          :                  296
numa_preferred_nid                           :                    0
total_numa_faults                            :                   34
current_node=0, numa_group_id=0
numa_faults node=0 task_private=0 task_shared=23 group_private=0 group_shared=0
numa_faults node=1 task_private=0 task_shared=0 group_private=0 group_shared=0
numa_faults node=2 task_private=0 task_shared=0 group_private=0 group_shared=0
numa_faults node=3 task_private=0 task_shared=11 group_private=0 group_shared=0
numa_faults node=4 task_private=0 task_shared=0 group_private=0 group_shared=0
numa_faults node=5 task_private=0 task_shared=0 group_private=0 group_shared=0
numa_faults node=6 task_private=0 task_shared=0 group_private=0 group_shared=0
numa_faults node=7 task_private=0 task_shared=0 group_private=0 group_shared=0

I have been trying to figure out the difference between migrations and switches, and found some answers here and here. Summarizing those answers:

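nr_switches: the number of context switches.
nr_voluntary_switches: the number of voluntary switches, i.e. the thread blocked, so another thread was picked up.
nr_involuntary_switches: the number of times the scheduler kicked the thread off the CPU because another, starved thread was ready to run.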
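As a quick check on these definitions, here is a minimal Python sketch that parses the integer counters out of a sched dump and confirms that nr_switches is the sum of the voluntary and involuntary counts. The helper names parse_sched and read_sched are made up for this illustration, and the sample text is copied from the output above. Note that /proc/<PID>/sched is not present on every kernel configuration.

```python
import re

# A few lines copied from the /proc/1/sched output shown above.
SAMPLE = """\
se.nr_migrations                             :                53355
nr_switches                                  :               169561
nr_voluntary_switches                        :               168185
nr_involuntary_switches                      :                 1376
"""

def parse_sched(text):
    """Return {field: int} for every 'name : integer' line in a sched dump.

    Lines with non-integer values (timestamps, NUMA fault rows, the header)
    are skipped.
    """
    counters = {}
    for line in text.splitlines():
        match = re.match(r"^(\S+)\s*:\s*(-?\d+)\s*$", line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters

def read_sched(pid="self"):
    """Parse the live file for a process (hypothetical helper)."""
    with open(f"/proc/{pid}/sched") as f:
        return parse_sched(f.read())

stats = parse_sched(SAMPLE)
total = stats["nr_voluntary_switches"] + stats["nr_involuntary_switches"]
print(stats["nr_switches"] == total)  # prints True: 168185 + 1376 == 169561
print(stats["se.nr_migrations"])      # a separate counter, not part of that sum
```

se.nr_migrations is tracked separately from the switch counters: a context switch changes which task is running on a CPU, while a migration changes which CPU a task runs on.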

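So, what are migrations? Are these concepts related or not? Are migrations moves between cores, and switches something that happens within a core?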

Answer 1

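A migration is when a thread, usually after a context switch, gets scheduled on a different CPU than the one it was scheduled on before.

Edit 1:

Here is more information about migration on Wikipedia: https://en.wikipedia.org/wiki/Process_migration

And here is the kernel code that increments the counter: https://github.com/torvalds/linux/blob/master/kernel/sched/core.c#L1175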

if (task_cpu(p) != new_cpu) {
    ...
    p->se.nr_migrations++;
    ...
}

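Edit 2:

A thread can migrate to another CPU in the following cases:

During exec().
During fork().
During thread wake-up.
If the thread's affinity mask has changed.
When the current CPU goes offline.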

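For more information, have a look at the functions set_task_cpu(), move_queued_task() and migrate_tasks() in the same source file: https://github.com/torvalds/linux/blob/master/kernel/sched/core.c

The policy the scheduler follows is implemented in select_task_rq(), and it depends on the scheduler class you are using. The basic version of the policy: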

if (p->nr_cpus_allowed > 1)
    cpu = p->sched_class->select_task_rq(p, cpu, sd_flags, wake_flags);
else
    cpu = cpumask_any(&p->cpus_allowed);

Source: https://github.com/torvalds/linux/blob/master/kernel/sched/core.c#L1534

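So, to avoid migrations, set the CPU affinity mask for your threads using the sched_setaffinity(2) system call or the corresponding pthreads API, pthread_setaffinity_np(3). Note that the latter is a nonstandard extension rather than part of POSIX, which is what the _np ("non-portable") suffix indicates.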
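For illustration, Python exposes the same system call on Linux as os.sched_setaffinity() and os.sched_getaffinity(). The sketch below pins the caller to a single CPU and then restores the original mask. The helper name pin_to_one_cpu and the choice of the lowest-numbered allowed CPU are arbitrary assumptions for this example, not a recommendation.

```python
import os

def pin_to_one_cpu(pid=0):
    """Restrict `pid` (0 means the caller) to one CPU it is already allowed on.

    Returns the original affinity set and the chosen CPU so the caller can
    undo the change later.
    """
    allowed = os.sched_getaffinity(pid)
    target = min(allowed)              # arbitrary pick: lowest allowed CPU
    os.sched_setaffinity(pid, {target})
    return allowed, target

original, cpu = pin_to_one_cpu()
pinned = os.sched_getaffinity(0)       # now a single-element set
print(f"pinned to CPU {cpu}: {pinned}")

# Restore the original mask so the rest of the program is unaffected.
os.sched_setaffinity(0, original)
```

With only one CPU in the mask, the scheduler has no other CPU to move the thread to, so se.nr_migrations should stop growing for as long as the mask stays in place.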

Here is the definition of select_task_rq() for the Completely Fair Scheduler: https://github.com/torvalds/linux/blob/master/kernel/sched/fair.c#L5860

The logic is quite complex, but basically we either select an idle sibling CPU or find the least busy new one.

Hope this answers your question.
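To make that one-line summary concrete, here is a deliberately simplified toy model in Python. It is not the kernel's algorithm: the function pick_cpu, the load table and the siblings map are all invented for this sketch, and the real code also weighs cache topology, scheduling domains, NUMA placement and more.

```python
def pick_cpu(prev_cpu, allowed, load, siblings):
    """Toy CPU selection: idle sibling first, otherwise least loaded.

    prev_cpu -- CPU the task last ran on
    allowed  -- set of CPUs permitted by the affinity mask
    load     -- dict mapping CPU -> number of runnable tasks (0 means idle)
    siblings -- dict mapping CPU -> set of CPUs sharing a cache with it
    """
    # With a single allowed CPU there is nothing to choose.
    if len(allowed) == 1:
        return next(iter(allowed))
    # Prefer staying put, or moving to a nearby sibling, if one is idle.
    for cpu in [prev_cpu] + sorted(siblings.get(prev_cpu, ())):
        if cpu in allowed and load[cpu] == 0:
            return cpu
    # Otherwise fall back to the least busy allowed CPU.
    return min(sorted(allowed), key=lambda c: load[c])

load = {0: 3, 1: 0, 2: 1, 3: 2}
siblings = {0: {1}, 1: {0}, 2: {3}, 3: {2}}

new_cpu = pick_cpu(0, {0, 1, 2, 3}, load, siblings)
migrated = new_cpu != 0   # same condition the kernel uses to bump nr_migrations
print(new_cpu, migrated)  # prints: 1 True
```

In this model, as in the kernel excerpt quoted earlier, a migration is counted only when the chosen CPU differs from the one the task ran on previously.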


Linux - Difference between migrations and switches?

multithreading

scheduler

linux-kernel