Can't figure out character encoding in PHP
I put together a small utility for reading the tags of a YouTube video.
http://www.daviddresden.com/tagreader/
<?php
header("Content-Type: application/json");
error_reporting(E_ERROR | E_PARSE);
$_POST['fn']='https://www.youtube.com/watch?v=OgAt8Ehg0eo';
if(isset($_POST['fn']) && $_POST['fn'] != ''){
    $url = htmlentities($_POST['fn']);
    $page_content = file_get_contents('https://www.youtube.com/watch?v=OgAt8Ehg0eo');
    $dom_obj = new DOMDocument();
    if($dom_obj->loadHTML($page_content)){
        $dom_obj->loadHTML($page_content);
        $meta_val = '';
        foreach($dom_obj->getElementsByTagName('meta') as $meta) {
            if($meta->getAttribute('property')=='og:video:tag'){
                $meta_val = $meta_val.','.$meta->getAttribute('content');
            }
        }
        echo substr($meta_val,1);
    }
    else{
        echo "Invalid Url!";
    }
}
else{
    echo "Empty Url!";
}
?>
It works for ASCII characters, but non-ASCII (UTF-8) characters come out unreadable.
I can't find the problem.
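Use utf8_decode. The PHP manual describes it as: "Converts a string with ISO-8859-1 characters encoded with UTF-8 to single-byte ISO-8859-1".
This most likely helps because DOMDocument::loadHTML() falls back to ISO-8859-1 when it does not detect a charset declaration it understands, so each byte of the UTF-8 page is turned into a separate character and getAttribute() returns doubly encoded text; utf8_decode reverses that extra step.
Note that utf8_decode is deprecated as of PHP 8.2; mb_convert_encoding($string, 'ISO-8859-1', 'UTF-8') is the replacement suggested in the manual.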
Output:
echo utf8_decode(substr($meta_val,1)) ;
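To see why decoding once more repairs the text, here is a minimal sketch that reproduces the double encoding by hand and then reverses it. It assumes the mbstring extension is available and uses mb_convert_encoding in place of utf8_decode; the variable names are only illustrative and are not part of the original script.

```php
<?php
// "café" written with an escape so the bytes are UTF-8 regardless of file encoding.
$original = "caf\u{E9}";

// Simulate the parser's mistake: read the UTF-8 bytes as ISO-8859-1
// and re-encode every byte as its own UTF-8 character.
$mojibake = mb_convert_encoding($original, 'UTF-8', 'ISO-8859-1');

// Reverse it: map each character back to a single byte,
// which restores the original UTF-8 byte sequence.
$repaired = mb_convert_encoding($mojibake, 'ISO-8859-1', 'UTF-8');

echo bin2hex($original), "\n"; // bytes of the correct string
echo bin2hex($mojibake), "\n"; // longer: the "é" now takes four bytes
var_dump($repaired === $original);
```

The second conversion does the same job as utf8_decode in the answer, without relying on the deprecated function.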
Set the Content-Type charset to utf-8:
header('Content-Type: text/html; charset=utf-8');
Full code:
<?php
// Tell the browser that the output is UTF-8.
header('Content-Type: text/html; charset=utf-8');
// Hard-coded test input, as in the question.
$_POST['fn'] = 'https://www.youtube.com/watch?v=OgAt8Ehg0eo';
if (isset($_POST['fn']) && $_POST['fn'] != '') {
    // Fetch the URL as given; htmlentities() would turn "&" into "&amp;" and break it.
    $url = $_POST['fn'];
    $page_content = file_get_contents($url);
    $dom_obj = new DOMDocument();
    // Parse the page once; @ keeps markup warnings out of the output.
    if ($page_content !== false && $page_content !== '' && @$dom_obj->loadHTML($page_content)) {
        $meta_val = '';
        foreach ($dom_obj->getElementsByTagName('meta') as $meta) {
            if ($meta->getAttribute('property') == 'og:video:tag') {
                $meta_val = $meta_val . ',' . $meta->getAttribute('content');
            }
        }
        // Drop the leading comma and undo the double encoding before printing.
        echo utf8_decode(substr($meta_val, 1));
    } else {
        echo "Invalid Url!";
    }
} else {
    echo "Empty Url!";
}
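As an alternative to decoding afterwards, a commonly used workaround is to tell the parser up front that the input is UTF-8 by prepending an XML encoding declaration. The sketch below is not from the original answer: extract_video_tags() and $sample are hypothetical names, the sample markup stands in for the downloaded page, and the file is assumed to be saved as UTF-8. Behaviour may vary between libxml versions, so treat it as something to try rather than a guaranteed fix.

```php
<?php
// Hypothetical helper: returns the content of every og:video:tag meta element.
function extract_video_tags(string $html): array
{
    $dom = new DOMDocument();
    // Collect markup warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    // The prepended declaration hints that the bytes are UTF-8.
    $dom->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();

    $tags = [];
    foreach ($dom->getElementsByTagName('meta') as $meta) {
        if ($meta->getAttribute('property') === 'og:video:tag') {
            $tags[] = $meta->getAttribute('content');
        }
    }
    return $tags;
}

// Stand-in for the fetched page, with non-ASCII tag values.
$sample = '<html><head>'
    . '<meta property="og:video:tag" content="café">'
    . '<meta property="og:video:tag" content="音楽">'
    . '</head><body></body></html>';

header('Content-Type: text/plain; charset=utf-8');
echo implode(',', extract_video_tags($sample));
```

With this approach getAttribute() should already return correct UTF-8, so the utf8_decode call is not needed.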