Python: decoding non-English text to use as a URL?

Question

I have a variable, for example title:

title = "révolution_essentielle"

I can decode it like this for other purposes:

title1 = unicode(title, encoding = "utf-8")

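But how do I keep the non-English characters and use the string as part of a URL that I then access? Ideally, I would like to get https://mainurl.com/révolution_essentielle.html by concatenating several strings, including title, like this: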

url = main_url + "/" + title + ".html"

Can anyone tell me how to do this? Thanks a lot!

Answer 1

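To summarize what we discussed in the comments: there is a function for quoting URLs (it replaces special characters with %-prefixed escape sequences).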

For Python 2 (as used in this example) it is urllib.quote(), which can be used as follows:

urllib.quote("révolution_essentielle")

When the input is a unicode (wide character) object, we also need to encode it first, for example:

urllib.quote(u'hey_there_who_likes_lego_that\xe3\u019\xe2_\xe3_...'.encode('utf8')).

Make sure the encoding you choose matches what the machine on the other end expects and understands.
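As a rough illustration of why the encoding matters, here is a minimal sketch in Python 3 syntax (using urllib.parse.quote and urllib.parse.unquote). The choice of latin-1 as the "other" encoding is only an assumption for the demonstration; the point is that the same title produces different escape sequences depending on the encoding, and a receiver assuming a different one will not recover the original text.

```python
from urllib.parse import quote, unquote

title = "révolution_essentielle"

# The same character is escaped differently depending on the encoding used.
utf8_quoted = quote(title, encoding="utf-8")      # é is escaped as %C3%A9
latin1_quoted = quote(title, encoding="latin-1")  # é is escaped as %E9

# A receiver that assumes UTF-8 recovers the title only from the UTF-8 form.
roundtrip_ok = unquote(utf8_quoted, encoding="utf-8")
roundtrip_bad = unquote(latin1_quoted, encoding="utf-8", errors="replace")

print(utf8_quoted)
print(latin1_quoted)
print(roundtrip_ok == title, roundtrip_bad == title)
```

UTF-8 is the usual choice for URLs today, but it is worth checking what the target server actually expects.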

If we are talking about Python 3, the equivalent function is urllib.parse.quote():

urllib.parse.quote("révolution_essentielle")

It accepts both str (unicode) arguments and already-encoded values in bytes objects.
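Putting this together with the concatenation from the question, a minimal Python 3 sketch might look like the following. The host https://mainurl.com is just the placeholder from the question, and passing safe="" is an assumption on my part (it also escapes any "/" inside the title, which may or may not be what you want).

```python
from urllib.parse import quote

main_url = "https://mainurl.com"  # placeholder host taken from the question
title = "révolution_essentielle"

# Quote only the title segment: quoting the whole URL would also escape
# the ":" after "https".
url = main_url + "/" + quote(title, safe="") + ".html"

# quote() also accepts bytes that are already encoded.
url_from_bytes = main_url + "/" + quote(title.encode("utf-8"), safe="") + ".html"

print(url)
```

The resulting string can then be passed to whatever you use to fetch the page, for example urllib.request.urlopen().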
