Building LLVM with Bazel

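I have a project that currently uses CMake, and I would like to switch it to Bazel. The main dependency is LLVM, which I use to generate LLVM IR. Looking around, there doesn't seem to be much guidance on this: TensorFlow appears to be the only project that builds LLVM with Bazel, and as far as I can tell it auto-generates its configuration. I also found a thread on bazel-discuss that discusses a similar problem, but my attempts to replicate it have failed.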

Currently, my best attempt is this (fetcher.bzl):

def _impl(ctx):
    # Download LLVM master
    ctx.download_and_extract(url = "https://github.com/llvm-mirror/llvm/archive/master.zip")

    # Run `cmake llvm-master` to generate configuration.
    ctx.execute(["cmake", "llvm-master"])

    # The bazel-discuss thread says to delete llvm-master, but I've
    # found that only generated files are pulled out of master, so all
    # the non-generated ones get dropped if I delete this.
    # ctx.execute(["rm", "-r", "llvm-master"])

    # Generate a BUILD file for the LLVM dependency.
    ctx.file('BUILD', """
# Build a library with all the LLVM code in it.
cc_library(
    name = "lib",
    srcs = glob(["**/*.cpp"]),
    hdrs = glob(["**/*.h"]),

    # Include the x86 target and all include files.
    # Add those under llvm-master/... as well because only built files
    # seem to appear under include/...
    copts = [
        "-Ilib/Target/X86",
        "-Iinclude",
        "-Illvm-master/lib/Target/X86",
        "-Illvm-master/include",
    ],

    # Include here as well, not sure whether this or copts is
    # actually doing the work.
    includes = [
        "include",
        "llvm-master/include",
    ],
    visibility = ["//visibility:public"],
    # Currently picking up some gtest targets, I have that dependency
    # already, so just link it here until I filter those out.
    deps = [
        "@gtest//:gtest_main",
    ],
)
""")

    # Generate an empty workspace file
    ctx.file('WORKSPACE', '')

get_llvm = repository_rule(implementation = _impl)

My WORKSPACE file then looks like this:

load(":fetcher.bzl", "get_llvm")

git_repository(
    name = "gflags",
    commit = "46f73f88b18aee341538c0dfc22b1710a6abedef", # 2.2.1
    remote = "https://github.com/gflags/gflags.git",
)

new_http_archive(
    name = "gtest",
    url = "https://github.com/google/googletest/archive/release-1.8.0.zip",
    sha256 = "f3ed3b58511efd272eb074a3a6d6fb79d7c2e6a0e374323d1e6bcbcc1ef141bf",
    build_file = "gtest.BUILD",
    strip_prefix = "googletest-release-1.8.0",
)

get_llvm(name = "llvm")

I then build it with bazel build @llvm//:lib --verbose_failures

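I always got errors about missing header files. Eventually I found that running cmake llvm-master generates many header files into the current directory but seems to leave the non-generated ones in llvm-master/. I added the same include directories under llvm-master/, and that seems to catch a lot of the files. However, it currently looks like tblgen is not running, and I am still missing critical headers required for compilation. My current error is: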

In file included from external/llvm/llvm-master/include/llvm/CodeGen/MachineOperand.h:18:0,
                 from external/llvm/llvm-master/include/llvm/CodeGen/MachineInstr.h:24,
                 from external/llvm/llvm-master/include/llvm/CodeGen/MachineBasicBlock.h:22,
                 from external/llvm/llvm-master/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h:20,
                 from external/llvm/llvm-master/include/llvm/CodeGen/GlobalISel/ConstantFoldingMIRBuilder.h:13,
                 from external/llvm/llvm-master/unittests/CodeGen/GlobalISel/PatternMatchTest.cpp:10:
external/llvm/llvm-master/include/llvm/IR/Intrinsics.h:42:38: fatal error: llvm/IR/IntrinsicEnums.inc: No such file or directory

Looking for this file in particular, I don't see any IntrinsicEnums.inc, IntrinsicEnums.h, or IntrinsicEnums.td. I do see a lot of Intrinsics*.td files, so maybe one of them generates this particular file?

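It seems that tblgen is supposed to convert *.td files into *.h and *.cpp files (please correct me if I am misunderstanding). However, it does not appear to be running. I see that TensorFlow has a gentbl() BUILD macro, but copying it is impractical because it depends on too much of the rest of TensorFlow's build infrastructure.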

Is there any way to do this without something as big and complex as TensorFlow's system?
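For reference, the core idea of a gentbl-style macro can be expressed without TensorFlow's infrastructure: one genrule per generated file that runs llvm-tblgen over a .td file. The following is a minimal sketch, not TensorFlow's implementation. The file name tblgen.bzl, the helper tblgen_command, the :llvm-tblgen target, and the include path are all hypothetical. The -gen-intrinsic-enums option is, as far as I can tell, the one LLVM's own CMake build passes to produce IntrinsicEnums.inc. The sketch also assumes you already have a way to build or obtain the llvm-tblgen binary.

```python
# tblgen.bzl (hypothetical file name): a stripped-down stand-in for gentbl().

def tblgen_command(tblgen, td_file, include_dir, option, out):
    """Returns the shell command that runs llvm-tblgen for one output file."""
    return "%s -I %s %s %s -o %s" % (tblgen, include_dir, option, td_file, out)

def gentbl(name, tblgen, td_file, td_srcs, include_dir, option, out):
    """Declares a genrule that generates `out` from `td_file`.

    Only usable inside Bazel, where `native` is provided.

    Args:
      name: name of the generated genrule target.
      tblgen: label of the llvm-tblgen binary (assumed to exist).
      td_file: the top-level .td file; must also be listed in td_srcs.
      td_srcs: every .td file that td_file may include.
      include_dir: execroot-relative directory passed to tblgen via -I.
      option: tblgen backend flag, e.g. "-gen-intrinsic-enums".
      out: path of the generated file, relative to the package.
    """
    native.genrule(
        name = name,
        srcs = td_srcs,
        outs = [out],
        tools = [tblgen],
        cmd = tblgen_command(
            "$(location %s)" % tblgen,
            "$(location %s)" % td_file,
            include_dir,
            option,
            "$@",
        ),
    )

# Possible use in the BUILD file of the LLVM repository (all labels and
# paths below are assumptions about the repository layout):
#
#   load(":tblgen.bzl", "gentbl")
#
#   gentbl(
#       name = "intrinsic_enums_gen",
#       tblgen = ":llvm-tblgen",
#       td_file = "include/llvm/IR/Intrinsics.td",
#       td_srcs = glob(["include/llvm/**/*.td"]),
#       include_dir = "external/llvm/include",
#       option = "-gen-intrinsic-enums",
#       out = "include/llvm/IR/IntrinsicEnums.inc",
#   )
```

The generated file would then have to be added to the hdrs of whichever cc_library includes llvm/IR/Intrinsics.h. With the layout from fetcher.bzl above, where the sources live under llvm-master/, the paths would need that prefix as well.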

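I posted to the llvm-dev mailing list and got some interesting responses. LLVM definitely wasn't designed to support Bazel and doesn't do so particularly well. In theory it appears possible to have Ninja output all the compile commands and then consume them from Bazel. This is likely to be quite difficult, and it would require a separate tool that outputs Skylark code to be run by Bazel.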

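That seemed far too complex for the scale of project I was working on, so my workaround was to download the pre-built binaries from releases.llvm.org. These include all the necessary headers, libraries, and tool binaries. Based on them, I was able to build a simple but powerful Bazel toolchain for my custom programming language.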

Simple example (limited but focused): https://github.com/dgp1130/llvm-bazel-foolang

Full example (more complex and less focused): https://github.com/dgp1130/sanity-lang
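
For a rough idea of the shape of this workaround without opening the repositories above, here is a minimal WORKSPACE sketch. It is not copied from those repositories. The repository name llvm_prebuilt, the release version, the platform archive, and the BUILD contents are all assumptions; check releases.llvm.org for the archive matching your host and add its checksum.

```python
# WORKSPACE fragment (sketch, not taken from the linked repositories).
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "llvm_prebuilt",  # hypothetical repository name
    # Assumed release and platform; verify the exact URL on releases.llvm.org.
    urls = ["https://releases.llvm.org/6.0.0/clang+llvm-6.0.0-x86_64-linux-gnu-ubuntu-16.04.tar.xz"],
    strip_prefix = "clang+llvm-6.0.0-x86_64-linux-gnu-ubuntu-16.04",
    # No sha256 is given here; add the real checksum for reproducible fetches.
    build_file_content = """
# Headers and static libraries shipped in the pre-built archive.
cc_library(
    name = "llvm",
    srcs = glob(["lib/libLLVM*.a"]),
    hdrs = glob(["include/llvm/**", "include/llvm-c/**"]),
    includes = ["include"],
    visibility = ["//visibility:public"],
)

# Tool binaries such as llc and opt, for use from genrules or custom rules.
filegroup(
    name = "tools",
    srcs = glob(["bin/*"]),
    visibility = ["//visibility:public"],
)
""",
)
```

A target in the main workspace could then list @llvm_prebuilt//:llvm in its deps. Linking against the static libraries will probably also need system libraries and a workable library order, which this sketch does not attempt to handle.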