Why don't I need to declare an encoding when using the Python interpreter?

Question

# -*- coding: utf-8 -*-

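I understand that this line is necessary when a Python script file contains non-ASCII characters.

When I was learning Python, I was told that the two ways of running Python code (line by line in the interpreter versus running a script file) produce the same result. In most cases they do. But when a script contains non-ASCII characters, it turns out I have to declare the encoding first.

I also tried the exec() function to execute a string containing Python code.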

>>> exec ("b='你'")

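This works.

But if I save "b = '你'" to a script and run it, I get a syntax error.

I'm curious why I don't need to declare the encoding when running Python code line by line in the interpreter.

Is there any difference between these two ways of executing a program?

Thanks.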

Answer 1

Because standard input already has an encoding (sys.stdin.encoding).

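The encoding for stdin may come from different sources depending on the platform. On Apple and Windows it is predefined ("utf-8" for Apple, "mbcs" for Windows); otherwise it is determined by the current locale, given by the LC_ALL environment variable or, if LC_ALL is missing, by LANG.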
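To see which encoding your own interpreter picked up, something like the following minimal sketch may help. The function name describe_input_encodings is made up for illustration, and the values it reports depend entirely on your platform, locale, and Python version.

```python
import locale
import sys


def describe_input_encodings():
    """Collect the encodings Python reports for stdin and the current locale.

    Hypothetical helper for illustration; not part of any library.
    """
    return {
        # Encoding used to decode text typed at the interactive prompt.
        # getattr guards against stdin being replaced by an object
        # that has no encoding attribute.
        "stdin": getattr(sys.stdin, "encoding", None),
        # Encoding derived from the locale settings (LC_ALL, LANG, ...).
        "locale_preferred": locale.getpreferredencoding(False),
        # Default used by str.encode() / bytes.decode() with no argument.
        "default_str": sys.getdefaultencoding(),
    }


if __name__ == "__main__":
    for name, value in describe_input_encodings().items():
        print(name, "=", value)
```

Running it in a terminal and again with a different locale shows whether the interactive prompt would be able to decode a character such as '你'.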

If you are on Linux, you can run LC_ALL=en_GB.ascii python and your example should fail.
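The same experiment can be scripted. This is only a sketch: stdin_encoding_under_locale is a hypothetical helper, it assumes a POSIX-like system, and the result varies by version. On recent Python 3 releases, locale coercion or UTF-8 mode may still report UTF-8 even under an ASCII locale, so no particular output is promised here.

```python
import os
import subprocess
import sys


def stdin_encoding_under_locale(lc_all):
    """Start a child interpreter with LC_ALL set and return its stdin encoding.

    Hypothetical helper for illustration only.
    """
    env = dict(os.environ)
    env["LC_ALL"] = lc_all  # overrides LANG and the other LC_* variables
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdin.encoding)"],
        env=env,
        stdin=subprocess.DEVNULL,  # give the child a well-defined stdin
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    for candidate in ("en_GB.ascii", "en_GB.UTF-8"):
        print(candidate, "->", stdin_encoding_under_locale(candidate))
```

If the reported encoding cannot represent '你', typing that character at the prompt of such an interpreter is where the failure would show up.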

Answer 2

I assume Python's interactive session uses the system encoding (see also Python Unicode strings and the Python interactive interpreter).

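When it reads a source file, it needs to know how to interpret the data it is parsing. It is logical that a script does not have to be written in the same encoding as the terminal that executes it, even more so when the script is run in an environment that specifies no encoding at all.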
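A small sketch of that point: a source file is just bytes, and the parser must pick an encoding before it can turn them into text. The helper names below are invented for illustration. tokenize.detect_encoding is the standard-library function that reads the coding declaration; note that it reflects Python 3, where a file with no declaration defaults to UTF-8, whereas the Python 2 behaviour described in the question defaulted to ASCII.

```python
import io
import tokenize

SOURCE_TEXT = "b = '你'\n"
UTF8_BYTES = SOURCE_TEXT.encode("utf-8")  # what an editor saving as UTF-8 writes
COOKIE = b"# -*- coding: utf-8 -*-\n"


def declared_encoding(source_bytes):
    """Return the encoding Python 3's tokenizer selects for these source bytes."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


def try_decode(source_bytes, encoding):
    """Decode source bytes, returning None when the encoding cannot represent them."""
    try:
        return source_bytes.decode(encoding)
    except UnicodeDecodeError:
        return None


if __name__ == "__main__":
    # An ASCII-only reader (the Python 2 default) cannot decode the file.
    print("ascii :", try_decode(UTF8_BYTES, "ascii"))
    # With the right encoding the original text comes back intact.
    print("utf-8 :", try_decode(UTF8_BYTES, "utf-8"))
    # The declaration on the first line is what the parser looks for.
    print("cookie:", declared_encoding(COOKIE + UTF8_BYTES))
```

The string passed to exec() at the prompt never goes through this step, because the terminal input was already decoded with sys.stdin.encoding before the code ran.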

Also, Joel Spolsky's The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets is an interesting read. It might explain why Python chose to require developers to be explicit about the encoding (I would have preferred a default of UTF-8, but their way is coherent with the Zen of Python).

python

character-encoding