Building LLVM with Bazel
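I have a project that currently uses CMake, and I'd like to switch it to Bazel. The main dependency is LLVM, which I use to generate LLVM IR. Looking around, there doesn't seem to be much guidance on this: only TensorFlow seems to use LLVM from Bazel (and, as far as I can tell, it auto-generates its configuration). I also found a thread on bazel-discuss that discusses a similar problem, but my attempts to replicate it have failed.
Currently, my best attempt is this (fetcher.bzl):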
def _impl(ctx):
    # Download LLVM master
    ctx.download_and_extract(url = "https://github.com/llvm-mirror/llvm/archive/master.zip")

    # Run `cmake llvm-master` to generate configuration.
    ctx.execute(["cmake", "llvm-master"])

    # The bazel-discuss thread says to delete llvm-master, but I've
    # found that only generated files are pulled out of master, so all
    # the non-generated ones get dropped if I delete this.
    # ctx.execute(["rm", "-r", "llvm-master"])

    # Generate a BUILD file for the LLVM dependency.
    ctx.file('BUILD', """
# Build a library with all the LLVM code in it.
cc_library(
    name = "lib",
    srcs = glob(["**/*.cpp"]),
    hdrs = glob(["**/*.h"]),
    # Include the x86 target and all include files.
    # Add those under llvm-master/... as well because only built files
    # seem to appear under include/...
    copts = [
        "-Ilib/Target/X86",
        "-Iinclude",
        "-Illvm-master/lib/Target/X86",
        "-Illvm-master/include",
    ],
    # Include here as well, not sure whether this or copts is
    # actually doing the work.
    includes = [
        "include",
        "llvm-master/include",
    ],
    visibility = ["//visibility:public"],
    # Currently picking up some gtest targets, I have that dependency
    # already, so just link it here until I filter those out.
    deps = [
        "@gtest//:gtest_main",
    ],
)
""")

    # Generate an empty workspace file
    ctx.file('WORKSPACE', '')

get_llvm = repository_rule(implementation = _impl)
My WORKSPACE file then looks like this:
load(":fetcher.bzl", "get_llvm")

git_repository(
    name = "gflags",
    commit = "46f73f88b18aee341538c0dfc22b1710a6abedef", # 2.2.1
    remote = "https://github.com/gflags/gflags.git",
)

new_http_archive(
    name = "gtest",
    url = "https://github.com/google/googletest/archive/release-1.8.0.zip",
    sha256 = "f3ed3b58511efd272eb074a3a6d6fb79d7c2e6a0e374323d1e6bcbcc1ef141bf",
    build_file = "gtest.BUILD",
    strip_prefix = "googletest-release-1.8.0",
)

get_llvm(name = "llvm")
Then I run bazel build @llvm//:lib --verbose_failures.
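I kept getting errors about missing header files. Eventually I found that running cmake llvm-master generates many header files into the current directory but leaves the non-generated ones in llvm-master/. I added the same include directories under llvm-master/, which seems to pick up a lot of the files. However, it now looks like tblgen is not running, and I'm still missing key headers required for compilation. My current error is: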
In file included from external/llvm/llvm-master/include/llvm/CodeGen/MachineOperand.h:18:0,
from external/llvm/llvm-master/include/llvm/CodeGen/MachineInstr.h:24,
from external/llvm/llvm-master/include/llvm/CodeGen/MachineBasicBlock.h:22,
from external/llvm/llvm-master/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h:20,
from external/llvm/llvm-master/include/llvm/CodeGen/GlobalISel/ConstantFoldingMIRBuilder.h:13,
from external/llvm/llvm-master/unittests/CodeGen/GlobalISel/PatternMatchTest.cpp:10:
external/llvm/llvm-master/include/llvm/IR/Intrinsics.h:42:38: fatal error: llvm/IR/IntrinsicEnums.inc: No such file or directory
I looked for this file specifically, but I don't see any IntrinsicEnums.inc, IntrinsicEnums.h, or IntrinsicEnums.td. I do see a lot of Intrinsics*.td files, so maybe one of them generates this particular file?
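It seems that tblgen is supposed to convert *.td files into *.h and *.cpp files (please correct me if I'm misunderstanding), but it doesn't appear to be running. I see that TensorFlow's project has a gentbl() BUILD macro, but copying it isn't practical because it depends too heavily on the rest of TensorFlow's build infrastructure.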
Is there any way to do this without something as large and complicated as TensorFlow's system?
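I posted to the llvm-dev mailing list and received some interesting responses. LLVM definitely wasn't designed to support Bazel and doesn't work with it particularly well. In theory it seems possible to have Ninja output all of the compile commands and then consume them from Bazel. That is likely to be quite difficult and would require a separate tool that emits Skylark code for Bazel to run.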
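That seemed far too complex for the scale of project I was working on, so my workaround was to download the prebuilt binaries from releases.llvm.org. These include all the necessary headers, libraries, and tool binaries. Based on that, I was able to build a simple but capable toolchain in Bazel for my custom programming language.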
Simple example (limited but focused): https://github.com/dgp1130/llvm-bazel-foolang
Full example (more complex and less focused): https://github.com/dgp1130/sanity-lang
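For orientation, the prebuilt-binaries workaround can be sketched roughly as the WORKSPACE fragment below. This is only an illustration under stated assumptions, not the contents of the repositories linked above: the LLVM version, the platform-specific archive name, and the target names headers and llc are all hypothetical choices, and the checksum is left for you to fill in.

```starlark
# WORKSPACE (sketch). Version, platform, and target names are assumptions.
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "llvm",
    # Assumed release archive; choose the one that matches your platform.
    urls = ["https://releases.llvm.org/6.0.0/clang+llvm-6.0.0-x86_64-linux-gnu-ubuntu-16.04.tar.xz"],
    strip_prefix = "clang+llvm-6.0.0-x86_64-linux-gnu-ubuntu-16.04",
    # Add a sha256 attribute with the checksum of the archive you download.
    build_file_content = """
package(default_visibility = ["//visibility:public"])

# Headers shipped with the release. The tblgen-generated files are already
# present in the archive, so no tblgen step is needed at build time.
cc_library(
    name = "headers",
    hdrs = glob(["include/**/*.h"]),
    textual_hdrs = glob([
        "include/**/*.def",
        "include/**/*.inc",
        "include/**/*.gen",
    ]),
    includes = ["include"],
)

# A prebuilt tool binary, usable in the tools attribute of other rules.
filegroup(
    name = "llc",
    srcs = ["bin/llc"],
)
""",
)
```

A target in the main workspace could then depend on @llvm//:headers, or list @llvm//:llc under tools in a genrule. Linking against the static libraries under lib/ takes more care, since they have to be given to the linker in dependency order (llvm-config --libs prints a suitable list), so that part is left out of the sketch.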