Why does getBytes() encoding conversion give these results?

Question

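I have a string in UTF-8. I first convert it to ISO-8859-1, then convert it back to UTF-8 and get the ISO8859_1 bytes from it. The result should be ISO-8859-1 again, but instead it gives me the UTF-8 bytes. Why?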

    import java.io.UnsupportedEncodingException;

    public class Test {
        public static void main(String[] args) throws UnsupportedEncodingException {
            String s0 = "H\u00ebllo";
            byte[] bytes = s0.getBytes("ISO8859_1");
            byte[] bytes1 = s0.getBytes("UTF-8");
            printBytes(bytes, "bytes");    // 72 -21 108 108 111  (ISO-8859-1)
            printBytes(bytes1, "bytes1");  // 72 -61 -85 108 108 111  (UTF-8)
            byte[] bytes2 = new String(s0.getBytes("UTF-8"), "ISO8859_1").getBytes("ISO8859_1");
            printBytes(bytes2, "bytes2");  // 72 -61 -85 108 108 111  (UTF-8)
        }

        private static void printBytes(byte[] array, String name) {
            System.out.print(name + ": ");
            for (int i = 0; i < array.length; i++) {
                System.out.print(array[i] + " ");
            }
            System.out.println();
        }
    }

Answer 1

This doesn't make sense:

new String(s0.getBytes("UTF-8"), "ISO8859_1")

You are interpreting a UTF-8 byte[] as if it were ISO8859_1. You should interpret UTF-8 bytes using the UTF-8 encoding:

new String(s0.getBytes("UTF-8"), "UTF-8")

Then it prints:

bytes: 72 -21 108 108 111 
bytes1: 72 -61 -85 108 108 111 
bytes2: 72 -21 108 108 111
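To see why the original code returned the UTF-8 bytes unchanged, it helps to look at the intermediate String. ISO-8859-1 maps every byte value 0x00..0xFF to the Unicode code point with the same number, so decoding with it never fails and never loses information: the two bytes 0xC3 0xAB that UTF-8 uses for "ë" simply become the two characters "Ã" and "«". Encoding that String back to ISO-8859-1 reverses the mapping byte for byte. The following is a minimal sketch of that round trip; the class name `RoundTripDemo` and the helper `misdecode` are illustrative names, not part of the question's code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class RoundTripDemo {
    // Hypothetical helper: decode the UTF-8 bytes of s with the wrong charset (ISO-8859-1).
    static String misdecode(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String s0 = "H\u00ebllo";
        byte[] utf8 = s0.getBytes(StandardCharsets.UTF_8);
        String wrong = misdecode(s0);

        // Each of the 6 UTF-8 bytes became its own character.
        System.out.println(wrong.length());         // 6, not 5
        System.out.println((int) wrong.charAt(1));  // 195 (0xC3)
        System.out.println((int) wrong.charAt(2));  // 171 (0xAB)

        // Re-encoding as ISO-8859-1 maps each character back to the same byte.
        byte[] bytes2 = wrong.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(Arrays.equals(utf8, bytes2)); // true
    }
}
```

Using the `StandardCharsets` constants instead of charset names also removes the need to declare `UnsupportedEncodingException`.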

You also said:

I have a String in UTF-8

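A String does not have an encoding in that sense; how it is stored internally is an implementation detail. Once a String has been created there is no encoding, just a String. You can, however, get a byte[] from it using a specific encoding.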
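Put differently, an encoding only matters at the boundary between bytes and Strings. Converting bytes from one encoding to another means decoding them with the charset they were written in and then encoding the resulting String with the target charset. Here is a small sketch of that idea; `TranscodeDemo` and `transcode` are hypothetical names introduced for illustration only.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class TranscodeDemo {
    // Hypothetical helper: decode with the source charset, encode with the target charset.
    static byte[] transcode(byte[] input, Charset from, Charset to) {
        return new String(input, from).getBytes(to);
    }

    public static void main(String[] args) {
        String s0 = "H\u00ebllo";
        byte[] utf8 = s0.getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = s0.getBytes(StandardCharsets.ISO_8859_1);

        // Decoded with the matching charset, both byte arrays yield the same String.
        String fromUtf8 = new String(utf8, StandardCharsets.UTF_8);
        String fromLatin1 = new String(latin1, StandardCharsets.ISO_8859_1);
        System.out.println(fromUtf8.equals(fromLatin1)); // true

        // UTF-8 bytes converted to ISO-8859-1 bytes.
        byte[] converted = transcode(utf8, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(Arrays.toString(converted)); // [72, -21, 108, 108, 111]
    }
}
```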

