What is the benefit of calling ioread functions when using memory mapped IO

Question

To use memory-mapped I/O, we first need to call request_mem_region.

struct resource *request_mem_region(
                unsigned long start,
                unsigned long len,
                char *name);

Then, because the kernel runs in a virtual address space, we need to map the physical addresses into the virtual address space by calling ioremap.

void *ioremap(unsigned long phys_addr, unsigned long size);

So why can't we access the return value directly?

From the Linux Device Drivers book:

Once equipped with ioremap (and iounmap), a device driver can access any I/O memory address, whether or not it is directly mapped to virtual address space. Remember, though, that the addresses returned from ioremap should not be dereferenced directly; instead, accessor functions provided by the kernel should be used.

Can anyone explain the reason behind this, or the advantage of accessor functions such as ioread32 or iowrite8()?

Answer 1

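You need ioread8 / iowrite8 or similar to at least cast to volatile*, so that optimization still results in exactly 1 access (not 0, and not more than 1). In fact they do more than that: they handle endianness (accessing device memory as little-endian, or use ioread32be for big-endian) and provide some compile-time-reordering memory-barrier semantics that Linux chooses to include in these functions. There is even a runtime barrier after reads, because of DMA. Use the _rep versions to copy a block from device memory with only a single barrier.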

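In C, data races are UB (undefined behaviour). This means the compiler is allowed to assume that memory accessed through a non-volatile pointer does not change between accesses. It also means that if (x) y = *ptr; can be transformed into tmp = *ptr; if (x) y = tmp;, i.e. a compile-time speculative load, if *ptr is known not to fault. (Related: Who's afraid of a big bad optimizing compiler?, on why the Linux kernel needs volatile for rolling its own atomics.)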

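MMIO registers may have side effects even for reads, so you must stop the compiler from doing loads that are not in the source, and you must force it to do every load that is in the source exactly once.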

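Same deal for stores. (Compilers are not allowed to invent writes even to non-volatile objects, but they can remove dead stores. For example, *ioreg = 1; *ioreg = 2; would typically compile the same as *ioreg = 2;. The first store gets removed as "dead" because it is not considered to have a visible side effect.)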
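To make the dead-store point concrete, here is a minimal user-space sketch, not kernel code. The names sketch_write32, sketch_read32, fake_reg and demo_two_stores are hypothetical, and fake_reg is ordinary memory standing in for a device register. It only illustrates the bare minimum an accessor has to do: go through a volatile-qualified pointer so that every access written in the source is emitted exactly once.

```c
#include <stdint.h>

/* Hypothetical helpers (not the kernel API): the least an MMIO accessor
 * must do is perform the access through a volatile-qualified pointer. */
static inline void sketch_write32(void *reg, uint32_t val)
{
    *(volatile uint32_t *)reg = val;        /* one store per call, never elided */
}

static inline uint32_t sketch_read32(const void *reg)
{
    return *(const volatile uint32_t *)reg; /* one load per call, never cached */
}

/* Assumption: plain memory standing in for a device register. */
static uint32_t fake_reg;

uint32_t demo_two_stores(void)
{
    /* With a non-volatile pointer the first store could be removed as dead.
     * Through the volatile accessor both stores must be emitted, in order. */
    sketch_write32(&fake_reg, 1);
    sketch_write32(&fake_reg, 2);
    return sketch_read32(&fake_reg);
}
```

The return value cannot show that the first store survived; comparing the compiler's assembly output with and without the volatile qualifier is the way to see that.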

C volatile semantics are ideal for MMIO, but Linux wraps more around these accesses than just volatile.

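From googling ioread8 and poking around in https://elixir.bootlin.com/linux/latest/source/lib/iomap.c#L11, we see that Linux I/O addresses can encode the I/O address space (port I/O, aka PIO; the in / out instructions on x86) versus the memory address space (normal loads/stores to special addresses). The ioread* functions actually check this and dispatch accordingly.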

    /*
     * Read/write from/to an (offsettable) iomem cookie. It might be a PIO
     * access or a MMIO access, these functions don't care. The info is
     * encoded in the hardware mapping set up by the mapping functions
     * (or the cookie itself, depending on implementation and hw).
     *
     * The generic routines don't assume any hardware mappings, and just
     * encode the PIO/MMIO as part of the cookie. They coldly assume that
     * the MMIO IO mappings are not in the low address range.
     *
     * Architectures for which this is not true can't use this generic
     * implementation and should do their own copy.
     */

As an example implementation, here is ioread16. (IO_COND is a macro that checks the address against predefined constants: low addresses are PIO addresses.)

    unsigned int ioread16(void __iomem *addr)
    {
      IO_COND(addr, return inw(port), return readw(addr));
      return 0xffff;
    }
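The dispatch that IO_COND performs can be imitated in user space. The following is only a sketch under stated assumptions: the three constants are my recollection of the values in lib/iomap.c and should be checked against the source, fake_ports is a hypothetical array standing in for the port space (one 16-bit value per port number, which is a simplification), and every sketch_ / demo_ name is invented for illustration.

```c
#include <stdint.h>

/* Assumed values, modelled on the generic lib/iomap.c; verify before relying on them. */
#define PIO_OFFSET   0x10000UL
#define PIO_MASK     0x0ffffUL
#define PIO_RESERVED 0x40000UL

/* Hypothetical stand-in for the x86 port space. */
static uint16_t fake_ports[PIO_MASK + 1];

/* Encode a port number as a low-valued cookie. */
static void *sketch_ioport_map(unsigned long port)
{
    return (void *)(uintptr_t)(PIO_OFFSET + (port & PIO_MASK));
}

static unsigned int sketch_ioread16(const void *cookie)
{
    uintptr_t c = (uintptr_t)cookie;

    if (c >= PIO_RESERVED)                         /* high cookie: MMIO, readw-like */
        return *(const volatile uint16_t *)cookie;
    if (c > PIO_OFFSET)                            /* low cookie: PIO, inw-like */
        return fake_ports[c & PIO_MASK];
    return 0xffff;                                 /* bad cookie */
}

/* Assumption: ordinary variables live well above PIO_RESERVED in user space. */
static uint16_t fake_mmio_reg = 0xBEEF;

unsigned int demo_mmio(void)
{
    return sketch_ioread16(&fake_mmio_reg);
}

unsigned int demo_pio(void)
{
    fake_ports[0x3f8] = 0x1234;
    return sketch_ioread16(sketch_ioport_map(0x3f8));
}
```

The same accessor serves both kinds of cookie, so a driver using ioread16 does not need to know whether its device was mapped as ports or as memory.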

What would go wrong if you just cast the `ioremap` result to `volatile uint32_t*`?

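For example, if you used READ_ONCE / WRITE_ONCE, which just cast to volatile unsigned char* or whatever and are used for atomic access to shared variables. (They are part of Linux's hand-rolled volatile + inline-asm implementation of atomics, which it uses instead of C11 _Atomic.)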

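That might actually work on some little-endian ISAs such as x86 if compile-time reordering were not a problem, but others need more barriers. If you look at the definition of readl (which ioread32 uses for MMIO, as opposed to inl for PIO), it uses barriers around a dereference of a volatile pointer.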

(This and the macros it uses are defined in the same io.h, or you can navigate using the LXR links: every identifier is a hyperlink.)

    static inline u32 readl(const volatile void __iomem *addr) {
        u32 val;
        __io_br();
        val = __le32_to_cpu(__raw_readl(addr));
        __io_ar(val);
        return val;
    }

The generic __raw_readl is just the volatile dereference; some ISAs may provide their own.

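__io_ar() (After Read) uses rmb() or barrier(); its comment reads /* prevent prefetching of coherent DMA data ahead of a dma-complete */. The Before Read barrier, __io_br(), is just barrier(), which blocks compile-time reordering without emitting any asm instructions.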
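The shape of readl can be approximated in user space to show what a bare volatile cast would leave out. This is a sketch, assuming GCC or Clang (for the inline-asm barrier, __BYTE_ORDER__ and __builtin_bswap32); all sketch_ and demo_ names are hypothetical, and the compiler-only barrier after the read is a simplification, since the real __io_ar() may be rmb().

```c
#include <stdint.h>

/* Compiler-only barrier (GCC/Clang syntax): emits no instruction, but the
 * compiler may not move memory accesses across it. */
#define sketch_barrier() __asm__ __volatile__("" ::: "memory")

/* Device registers are treated as little-endian; swap on big-endian hosts. */
static inline uint32_t sketch_le32_to_cpu(uint32_t v)
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    return __builtin_bswap32(v);
#else
    return v;
#endif
}

static inline uint32_t sketch_readl(const volatile void *addr)
{
    uint32_t val;

    sketch_barrier();                                           /* role of __io_br() */
    val = sketch_le32_to_cpu(*(const volatile uint32_t *)addr); /* raw load + endian fix */
    sketch_barrier();                                           /* role of __io_ar(), simplified */
    return val;
}

/* Assumption: plain memory standing in for a little-endian device register. */
static uint32_t fake_reg;

uint32_t demo_readl(void)
{
    unsigned char *b = (unsigned char *)&fake_reg;

    b[0] = 0x78; b[1] = 0x56; b[2] = 0x34; b[3] = 0x12; /* bytes as the device stores them */
    return sketch_readl(&fake_reg);
}
```

On a little-endian host the conversion is a no-op, which is why a plain volatile cast can appear to work on x86 and then break on a big-endian machine or on one that needs a real read barrier.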

Old answer to the wrong question: the text below answers why you need to call ioremap.

Because it is a physical address, and kernel memory is not identity-mapped (virt = phys) to physical addresses.

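And returning a virtual address is not an option: not all systems have enough virtual address space to even direct-map all of the physical address space as one contiguous range of virtual addresses. (But when there is enough space, Linux does do this; e.g. x86-64 Linux's virtual address-space layout is documented in x86_64/mm.txt.)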

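Notably, this applies to 32-bit x86 kernels on systems with more than 1 or 2GB of RAM (depending on how the kernel is configured: a 2:2 or 1:3 kernel:user split of the virtual address space). With PAE providing a 36-bit physical address space, a 32-bit x86 kernel can use far more physical memory than it can map at once. (This is pretty horrible and makes life difficult for the kernel: some random blog reposted Linus Torvalds' comments about how PAE really really sucks.)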
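The size mismatch is easy to put in numbers. A small sketch, using only the figures mentioned above (36 physical address bits, and a kernel share of 1 or 2 GiB of the 32-bit virtual address space); the function names are hypothetical.

```c
#include <stdint.h>

/* Size in bytes of an address space with the given number of address bits. */
static uint64_t addr_space_bytes(unsigned bits)
{
    return (uint64_t)1 << bits;
}

/* Kernel share of the 32-bit virtual address space, given in GiB
 * (1 for a 1:3 kernel:user split, 2 for a 2:2 split). */
static uint64_t kernel_va_bytes(unsigned kernel_gib)
{
    return (uint64_t)kernel_gib << 30;
}

/* How many times larger the PAE physical space is than the kernel's window. */
static uint64_t pae_to_kernel_ratio(unsigned kernel_gib)
{
    return addr_space_bytes(36) / kernel_va_bytes(kernel_gib);
}
```

With a 1:3 split the 64 GiB physical space is 64 times larger than the kernel's whole virtual window, so physical memory and device regions have to be mapped on demand rather than all at once.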

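Other ISAs may have this too, and I don't know what Alpha does about IO memory when byte accesses are needed; maybe the region of the physical address space that maps word loads/stores to byte loads/stores handles that earlier, so that you request the right physical address. (http://www.tldp.org/HOWTO/Alpha-HOWTO-8.html)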

But 32-bit x86 PAE is obviously an ISA that Linux cares a lot about, and did even quite early in Linux's history.
