Documenting and detailing a single script based on the comments inside

Question

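I am going to write a set of scripts, each independent of the others but with some similarities. The structure will most likely be the same for all of them and will probably look like this: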

# -*- coding: utf-8 -*-
"""
Small description and information
@author: Author
"""

# Imports
import numpy as np
import math
from scipy import signal
...

# Constant definition (always with variable in capital letters)
CONSTANT_1 = 5
CONSTANT_2 = 10

# Main class
class Test():
    def __init__(self, run_id, parameters):
        # Some stuff not too important
        pass

    def _run(self, parameters):
        # Main program returning a result object.
        pass

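For each script, I would like to write documentation and export it as a PDF. I need a library/module/parser that reads the script, extracts the comments and the code, and puts them back together in the desired output format.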

For instance, in the _run() method, several steps might be detailed in the comments:

def _run(self, parameters):
        # Step 1: we start by doing this
        code to do it
            
        # Step 2: then we do this
        code to do it
        code 
        code # this code does that

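Which library/parser could I use to analyze the Python script and output a PDF? At first I thought of Sphinx, but it does not suit my needs because I would have to design a custom extension. Moreover, Sphinx's strength lies in the links and hierarchy between multiple scripts of the same or different modules. In my case, I will only be documenting one script, one file at a time.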

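My second idea is to use the RST format and rst2pdf to create the PDF. For the parser, I could design one that reads the .py file, extracts the commented/decorated lines or sets of lines as proposed below, and then writes the RST file.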

#-description
## Title of something
# doing this here
#-

#-code
some code to extract and put in the doc
some more code
#-
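A minimal sketch of such a marker parser, using only the standard library. The function name blocks_to_rst and the SAMPLE text are hypothetical; the sketch assumes that blocks are never nested, that every block is closed by a bare #- line, and that a ## line inside a description block is a title. Code blocks are emitted as plain RST literal blocks.

```python
SAMPLE = """#-description
## Title of something
# doing this here
#-

#-code
x = 1
y = x + 1
#-
"""


def blocks_to_rst(source):
    """Convert '#-description' / '#-code' blocks (closed by '#-') to RST text."""
    out = []
    kind = None  # None (outside a block), "description" or "code"
    for line in source.splitlines():
        stripped = line.strip()
        if kind is None:
            # Outside a block: only an opening marker is interesting.
            if stripped in ("#-description", "#-code"):
                kind = stripped[2:]
                if kind == "code":
                    out += ["::", ""]  # start an RST literal block
        elif stripped == "#-":
            out.append("")  # blank line terminates the block in the RST output
            kind = None
        elif kind == "code":
            out.append("    " + line)  # literal block content must be indented
        elif stripped.startswith("##"):
            title = stripped[2:].strip()
            out += [title, "=" * len(title), ""]
        else:
            out.append(stripped.lstrip("#").strip())
    return "\n".join(out)


if __name__ == "__main__":
    print(blocks_to_rst(SAMPLE))
```

The resulting text can be written to a .rst file and handed to rst2pdf. Lines outside any marker block are ignored, which keeps the documented parts opt-in.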

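Finally, I would also like to be able to execute some code and capture the result so that it can be put in the output PDF file. For instance, I could run Python code to compute the SHA1 hash of the .py file's content and include it as a reference in the PDF documentation.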
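The hash step could be sketched as follows with hashlib from the standard library. The helper names sha1_of_file and rst_reference_line are hypothetical, and the format of the output line is only an assumption about how the digest might be embedded in the RST source.

```python
import hashlib
from pathlib import Path


def sha1_of_file(path):
    """Return the hexadecimal SHA1 digest of the file's raw bytes."""
    return hashlib.sha1(Path(path).read_bytes()).hexdigest()


def rst_reference_line(path):
    """Build a line of RST recording the script name and its SHA1 digest."""
    return f"SHA1 of {Path(path).name}: ``{sha1_of_file(path)}``"


if __name__ == "__main__":
    # Hash this very file as a demonstration.
    print(rst_reference_line(__file__))
```

Hashing the raw bytes rather than decoded text keeps the digest independent of the encoding declared in the script.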

Answer 1

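Comments are not suitable for documentation; typically they are used to highlight specific aspects that are relevant only to developers (not users). To achieve your goal, you can use __doc__ strings in various places: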

module-level
class-level
function-/method-level

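If your _run method is really long and you feel the docstring is too far from the actual code, that is a strong sign that your function is too long anyway. It should be split into multiple smaller functions to improve clarity, each of which can have its own docstring. For example, the Google style guide suggests that if a function exceeds 40 lines of code, it should be broken into smaller pieces.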

Then you can use, for example, Sphinx to parse that documentation and convert it to PDF.

Here is an example setup (using the Google doc style):

# -*- coding: utf-8 -*-
"""
Small description and information.
@author: Author

Attributes:
    CONSTANT_1 (int): Some description.
    CONSTANT_2 (int): Some description.
"""

import numpy as np
import math
from scipy import signal


CONSTANT_1 = 5
CONSTANT_2 = 10


class Test():
    """Main class."""
    def __init__(self, run_id, parameters):
        """Some stuff not too important."""
        pass
        
    def _run(self, parameters):
        """Main program returning a result object.

        Uses `func1` to compute X and then `func2` to convert it to Y.

        Args:
            parameters (dict): Parameters for the computation

        Returns:
            result
        """
        X = self.func1(parameters)
        Y = self.func2(X)
        return Y

    def func1(self, p):
        """Information on this method."""
        pass

    def func2(self, x):
        """Information on this method."""
        pass

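With Sphinx you can then use the sphinx-quickstart command line utility to set up a sample project. To create documentation for the scripts you can use sphinx-apidoc. For that purpose, create a separate directory scripts, add an empty __init__.py file, and place all your scripts inside that directory. After these steps the directory structure will look like the following (assuming you did not separate the build and source directories during sphinx-quickstart, which is the default):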

$ tree
.
├── _build
├── conf.py
├── index.rst
├── make.bat
├── Makefile
├── scripts
│   ├── __init__.py
│   └── example.py
├── _static
└── _templates

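For sphinx-apidoc to work, you need to enable the autodoc extension. Depending on the doc style you use, you might also need to enable a corresponding extension. The example above uses the Google doc style, which is handled by the Napoleon extension. These extensions can be enabled in conf.py: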

extensions = ['sphinx.ext.autodoc', 'sphinx.ext.napoleon']

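Then you can run sphinx-apidoc as follows (-e puts every module/script on a separate page, -f overwrites existing doc files, -P documents private members, i.e. those starting with _, and -o api sets the output directory):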

$ sphinx-apidoc -efPo api scripts/
Creating file api/scripts.rst.
Creating file api/scripts.example.rst.
Creating file api/modules.rst.

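This command creates the necessary instructions for the actual build command. For the build to be able to import and correctly document your scripts, you also need to set the import path accordingly. This can be done by uncommenting the following three lines near the top of conf.py: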

import os
import sys
sys.path.insert(0, os.path.abspath('.'))

To make your scripts' docs appear in the documentation, you need to link them from the main index.rst file:

Welcome to ExampleProject's documentation!
==========================================

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   api/modules

Finally, you can run the build command:

$ make latexpdf

The generated documentation can then be found at _build/latex/<your-project-name>.pdf.

Note that there are various themes available to change the look of your documentation. Sphinx also supports plenty of configuration options that you can use to customize the build of your documentation.

Answer 2

Docstrings instead of comments

To make things easier for yourself, you probably want to use docstrings rather than comments:

A docstring is a string literal that occurs as the first statement in a module, function, class, or method definition. Such a docstring becomes the __doc__ special attribute of that object.

This way, you can use the __doc__ attribute when parsing the scripts while generating documentation.
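As a minimal sketch of reading __doc__ at generation time, the standard inspect module can collect cleaned docstrings from a class and its methods. The helper collect_docs is hypothetical, and the Test class here is a stand-in for one of the scripts being documented.

```python
import inspect


class Test:
    """Main class."""

    def _run(self, parameters):
        """Main program returning a result object.

        Step 1: we start by doing this.
        Step 2: then we do this.
        """
        return parameters


def collect_docs(cls):
    """Return {qualified name: cleaned docstring} for a class and its functions."""
    docs = {cls.__name__: inspect.getdoc(cls)}
    for name, member in vars(cls).items():
        # Only plain functions defined on the class that carry a docstring.
        if inspect.isfunction(member) and member.__doc__:
            docs[f"{cls.__name__}.{name}"] = inspect.getdoc(member)
    return docs


if __name__ == "__main__":
    for name, doc in collect_docs(Test).items():
        print(name)
        print(doc)
        print()
```

inspect.getdoc strips the common indentation, so the text can be passed on to an RST or Markdown writer without further cleanup. This approach requires importing the script, so any module-level side effects will run.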

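The triple-double-quoted string placed immediately after the function/module definition, which becomes the docstring, is just syntactic sugar. You can edit the __doc__ attribute programmatically as needed.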

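For instance, you can use decorators to make creating docstrings nicer in your specific case, for example to let you comment the steps inline while still adding those comments to the docstring (written in the browser, so it may contain errors):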

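def with_steps(func):
  """Attach an add_step helper that appends step notes to func's docstring."""
  def add_step(n, doc):
    func.__doc__ = (func.__doc__ or "") + "\nStep %d: %s" % (n, doc)
  func.add_step = add_step
  return func  # a decorator must return the function, or the name ends up bound to None

@with_steps
def _run(self, parameters):
  """Initial description that is turned into the initial docstring"""
  # Note: steps are appended when _run executes, and again on every later call.
  _run.add_step(1, "we start by doing this")
  # code to do it

  _run.add_step(2, "then we do this")
  # code to do it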

After _run has been called once, this yields a docstring like this:

Initial description that is turned into the initial docstring
Step 1: we start by doing this
Step 2: then we do this

You get the idea.

Generating PDF from documented scripts

Sphinx

Personally, I would simply try the PDF builders available for Sphinx, either the bundled LaTeXBuilder or rinoh if you do not want to depend on LaTeX.

However, you would have to use a docstring format that Sphinx understands, such as reStructuredText or Google Style Docstrings.

AST

An alternative is to use ast to extract the docstrings. This is probably what the Sphinx autodoc extension uses internally to extract the documentation from the source files. There are a few examples out there on how to do this, like this gist or this blog post.

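This way you can write a script that parses the source and outputs any format you want. For instance, you could output Markdown or reST and convert that to PDF using pandoc.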

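You could write marked-up text directly in the docstrings, which gives you a lot of flexibility. Say you want to write your documentation in Markdown: just write Markdown directly in your docstring.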

def _run(self, parameters):
  """Example script
  ================

  This script does a, b, c

  1. Does something first
  2. Does something else next
  3. Returns something else

  Usage example:
  
      results = script(parameters)
      foo = [r.foo for r in results]
  """

This string can be extracted using ast and then parsed/processed with whatever library you see fit.
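A minimal sketch of that extraction with the standard ast module follows. The helper extract_docstrings and the embedded SOURCE text are hypothetical; in practice the source would be read from the .py file. Unlike reading __doc__, this parses the file without importing or executing it.

```python
import ast

SOURCE = '''
"""Small description and information."""

class Test:
    """Main class."""

    def _run(self, parameters):
        """Main program returning a result object."""
        return parameters
'''


def extract_docstrings(source):
    """Return (name, docstring) pairs for the module, its classes and functions."""
    tree = ast.parse(source)
    found = [("<module>", ast.get_docstring(tree))]
    for node in ast.walk(tree):
        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
            # get_docstring returns None when the node has no docstring.
            found.append((node.name, ast.get_docstring(node)))
    return found


if __name__ == "__main__":
    for name, doc in extract_docstrings(SOURCE):
        print(f"{name}: {doc}")
```

Each docstring comes back as a plain string with indentation cleaned, so it can be written to a Markdown or reST file and passed to a converter such as pandoc.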

Answer 3

Doxygen sounds suitable for this. It supports Python documentation strings and can also parse comments that start with ##, as described here:

https://www.doxygen.nl/manual/docblocks.html#pythonblocks

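To get the output in PDF format you need to install a LaTeX processor, such as MikTeX. When you run Doxygen it will create a latex folder that includes a "make" shell script. Run the shell script and the PDF file will be generated.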

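To include content that is generated elsewhere, e.g. the SHA1 hashes you mentioned, you could use the @include command within a comment. Note that Doxygen's @include commands will only work if you are using ## comments.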

For example:

## Documentation for a class.
#
#  More details.
#  @include PyClassSha1Hash.txt
class PyClass:
    pass
