â® characters are converted to question marks when the message is read back

I'm running into a very strange problem. I'm sending and receiving messages through Amazon AWS SQS. Before sending, I compress and encode the message as follows:

String responseMessageBodyOriginal = gson.toJson(responseData);
String responseMessageBodyCompressed = compressToBase64String(responseMessageBodyOriginal);
AmazonSqsHelper.sendMessage(responseMessageBodyCompressed, queue, null);

The compress-and-encode function looks like this:

public static String compressToBase64String(String data) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream(data.length());
    GZIPOutputStream gzip = new GZIPOutputStream(bos);
    gzip.write(data.getBytes());
    gzip.close();
    byte[] compressedBytes = bos.toByteArray();
    bos.close();
    return new String(Base64.encodeBase64(compressedBytes));
}

On the receiving side, this is the code:

List<Message> sqsMessageList = AmazonSqsHelper.receiveMessages(queueUrl, max_message_read_count,
                    default_visibility_timeout);
int num_messages = sqsMessageList.size();
if (num_messages > 0) {
   for (Message m : sqsMessageList) {
       String responseMessageBodyCompressed = m.getBody();
       String responseMessageBodyOriginal = decompressFromBase64String(responseMessageBodyCompressed);
   }
}

The function that decodes and decompresses looks like this:

public static String decompressFromBase64String(String compressedString) throws IOException {
    byte[] compressedBytes = Base64.decodeBase64(compressedString);
    ByteArrayInputStream bis = new ByteArrayInputStream(compressedBytes);
    GZIPInputStream gis = new GZIPInputStream(bis);
    BufferedReader br = new BufferedReader(new InputStreamReader(gis, "UTF-8"));
    StringBuilder sb = new StringBuilder();
    String line;
    while ((line = br.readLine()) != null) {
        sb.append(line);
    }
    br.close();
    gis.close();
    bis.close();
    return sb.toString();
}

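The problem is that sometimes, when I pass characters such as "â®", they come back as ????? when I print the message after decoding.

I can't figure out why the encoding and decoding misbehave. Any help would be greatly appreciated.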

The problem is that the data is encoded with the platform's default charset (data.getBytes()) but decoded with UTF-8.
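
To illustrate how such a mismatch can produce question marks, here is a minimal sketch. It assumes the sending machine's default charset cannot represent the characters; US-ASCII is used here only as a stand-in, since the actual default on your machine is not known from the question. The class and method names are made up for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {
    // Encode with the given charset, then decode as UTF-8,
    // mimicking getBytes() on the sender and "UTF-8" on the receiver.
    static String encodeThenDecodeAsUtf8(String text, Charset encodeCharset) {
        byte[] bytes = text.getBytes(encodeCharset);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u00e2\u00ae"; // the characters "â®"

        // US-ASCII (assumed stand-in for the platform default) cannot map
        // these characters, so each one is replaced by the byte for '?'.
        System.out.println(encodeThenDecodeAsUtf8(original, StandardCharsets.US_ASCII));

        // With UTF-8 on both sides the text survives unchanged.
        System.out.println(encodeThenDecodeAsUtf8(original, StandardCharsets.UTF_8).equals(original));

        // Shows which charset getBytes() without arguments would use here.
        System.out.println(Charset.defaultCharset());
    }
}
```

The replacement happens at encoding time, so the original characters are already lost before the message is compressed and sent; nothing on the receiving side can recover them.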

In compressToBase64String, change data.getBytes() to data.getBytes(StandardCharsets.UTF_8).
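
For reference, a sketch of both functions with UTF-8 used explicitly on each side. This is not a drop-in copy of your code: it uses java.util.Base64 (Java 8+) instead of the Base64 class from the question so that the example compiles on its own, and it reads the decompressed bytes directly rather than line by line. The class name is made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipBase64RoundTrip {
    public static String compressToBase64String(String data) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        // Closing the GZIP stream flushes the trailer into bos.
        try (GZIPOutputStream gzip = new GZIPOutputStream(bos)) {
            // Explicit charset instead of the platform default.
            gzip.write(data.getBytes(StandardCharsets.UTF_8));
        }
        return Base64.getEncoder().encodeToString(bos.toByteArray());
    }

    public static String decompressFromBase64String(String compressed) throws IOException {
        byte[] compressedBytes = Base64.getDecoder().decode(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(compressedBytes))) {
            byte[] buffer = new byte[4096];
            int n;
            // Copy raw bytes so that line breaks in the payload are kept.
            while ((n = gis.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        // Decode with the same charset that was used for encoding.
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        String original = "{\"text\":\"\u00e2\u00ae\"}";
        String roundTripped = decompressFromBase64String(compressToBase64String(original));
        System.out.println(original.equals(roundTripped));
    }
}
```

Copying bytes instead of using readLine() is a side change, not part of the charset fix: readLine() strips line terminators, so a payload containing newlines would not round-trip exactly with the loop in the question.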