Does "Text" in the HTML5 syntax mean "any character"?
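I can't find any restriction on which characters are allowed in Text. Does this mean that any character is allowed, or is there a general restriction that applies to the whole HTML document?
For example, the Character Reference section states: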
The numeric character reference forms [...] are allowed to reference any Unicode code point other than U+0000, U+000D, permanently undefined Unicode characters (noncharacters), surrogates (U+D800–U+DFFF), and control characters other than space characters.
Are these characters still allowed to appear "unescaped" in text? For example, as an attribute value: <span title="Hello ␀ World"></span>
where ␀ is the U+0000 NULL character (not U+2400).
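The rule quoted in the question can be sketched as a small check on code points. This is only a reading of that one sentence, not of the whole specification: the function names are hypothetical, and the set of "space characters" (tab, LF, FF, CR, space) is an assumption taken from the HTML5 definition of that term. It says nothing about whether the same characters may appear raw in text, which is the open question here.

```python
# Code points HTML5 calls "space characters" (assumed: tab, LF, FF, CR, space).
SPACE_CHARACTERS = {0x09, 0x0A, 0x0C, 0x0D, 0x20}


def is_noncharacter(cp: int) -> bool:
    """Unicode noncharacters: U+FDD0..U+FDEF and U+xxFFFE / U+xxFFFF in every plane."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def is_control(cp: int) -> bool:
    """C0 controls, DEL, and C1 controls."""
    return cp <= 0x1F or 0x7F <= cp <= 0x9F


def may_be_numerically_referenced(cp: int) -> bool:
    """Return True if the quoted rule allows &#...; to reference this code point."""
    if not 0 <= cp <= 0x10FFFF:
        return False
    if cp in (0x00, 0x0D):          # explicitly excluded by the quoted sentence
        return False
    if is_noncharacter(cp):
        return False
    if 0xD800 <= cp <= 0xDFFF:      # surrogates
        return False
    if is_control(cp) and cp not in SPACE_CHARACTERS:
        return False
    return True
```

Under this reading, U+0000 from the example is rejected, while U+2400 (the visible ␀ symbol) is an ordinary character and passes.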
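I don't think there is any restriction on text in the context you are referring to. Here, text means all permitted letters, digits, and alphanumeric characters.
The answer is in the link you provided: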
Text is allowed inside elements, attribute values, and comments. Extra constraints are placed on what is and what is not allowed in text based on where the text is to be put, as described in the other sections
Now, if we go to the syntax definition of CDATA sections:
CDATA sections must consist of the following components, in this order:
- The string "<![CDATA[".
- Optionally, text, with the additional restriction that the text must not contain the string "]]>".
- The string "]]>".
So each type of content has its own set of restrictions, and text is just the superset used to cover all characters, symbols, and so on.
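To make the "text plus a context-specific restriction" idea concrete, here is a minimal sketch of the three CDATA components quoted above. The helper name `make_cdata_section` is hypothetical, and the function checks only the single restriction named in the quote (no "]]>" inside the text); it does not attempt to enforce any other rule from the specification.

```python
def make_cdata_section(text: str) -> str:
    """Wrap text in a CDATA section, enforcing only the quoted restriction."""
    if "]]>" in text:
        # The quoted rule: the text must not contain the closing string.
        raise ValueError('CDATA text must not contain "]]>"')
    return "<![CDATA[" + text + "]]>"
```

For example, `make_cdata_section("a < b")` returns `<![CDATA[a < b]]>`, while passing `"x]]>y"` raises `ValueError`.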
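The character restrictions for text in the page and markup are defined by the character set you choose. If you don't define a character set, the browser will guess or fall back to its default (usually the least restrictive one). The character set is declared using the meta tag with the charset attribute in your document's head section. The most common example of this is the UTF-8 character set declaration: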
<meta charset="UTF-8" />
The value of this attribute can be any character set defined by the Internet Assigned Numbers Authority (IANA). The full list of defined character sets is available here.
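As a rough illustration of how the chosen character set limits which characters a document can carry, the sketch below tries to encode a string with a given codec. The helper name `can_encode` is hypothetical, and Python's codec names are only assumed to line up with the IANA names used here; the two registries overlap but are not identical.

```python
def can_encode(text: str, charset: str) -> bool:
    """Return True if every character in text is representable in charset."""
    try:
        text.encode(charset)
    except UnicodeEncodeError:
        return False
    return True
```

With this helper, the euro sign can be encoded as UTF-8 but not as ISO-8859-1, so a page declared with the latter would need a character reference for it.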
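In addition, unescaped text used in certain elements (or element types) may have specific restrictions. In that case, you will have to read the specification for that tag or tag type, or simply replace the offending characters with their ampersand-encoded HTML entity escape values.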
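Replacing problematic characters with entity escapes can be sketched with Python's standard `html.escape`. The wrapper name `span_with_title` is hypothetical and reuses the `<span title="...">` shape from the question. Note that `html.escape` only rewrites `&`, `<`, `>`, and (with `quote=True`) the two quote characters; it leaves control characters such as U+0000 untouched, so it does not settle the question about NULL.

```python
import html


def span_with_title(title: str) -> str:
    """Build a span whose title attribute value has markup characters escaped."""
    escaped = html.escape(title, quote=True)
    return f'<span title="{escaped}"></span>'
```

For instance, `span_with_title('Hello "World" & <co>')` produces `<span title="Hello &quot;World&quot; &amp; &lt;co&gt;"></span>`.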