How to decompress pbzip2 data in a memory buffer using the libbz2 library in C++
I have a working version that decompresses bzip2 data by calling the BZ2_bzDecompress API. It looks like this:
while (bytes_input < len) {
    isDone = false;
    // Initialize the input buffer and its length
    size_t in_buffer_size = len - bytes_input;
    the_bz2_stream.avail_in = in_buffer_size;
    the_bz2_stream.next_in = (char*)data + bytes_input;
    size_t out_buffer_size =
        output_size - bytes_uncompressed;  // size of output buffer
    if (out_buffer_size == 0) {  // out of space in the output buffer
        break;
    }
    the_bz2_stream.avail_out = out_buffer_size;
    the_bz2_stream.next_out =
        (char*)output + bytes_uncompressed;  // output buffer
    ret = BZ2_bzDecompress(&the_bz2_stream);
    if (ret != BZ_OK && ret != BZ_STREAM_END) {
        throw Bzip2Exception("Bzip2 failed. ", ret);
    }
    bytes_input += in_buffer_size - the_bz2_stream.avail_in;
    bytes_uncompressed += out_buffer_size - the_bz2_stream.avail_out;
    *data_consumed = bytes_input;
    if (ret == BZ_STREAM_END) {
        ret = BZ2_bzDecompressEnd(&the_bz2_stream);
        if (ret != BZ_OK) {
            throw Bzip2Exception("Bzip2 fail. ", ret);
        }
        isDone = true;
    }
}
This works fine for files compressed with plain bzip2, but for pbzip2 (parallel bzip2) and "splittable" bzip2 data it throws BZ_PARAM_ERROR.
The pbzip2 documentation says this:
Data compressed with pbzip2 is broken into multiple streams and each
stream is bzip2 compressed looking like this:
[-----|-----|-----|-----|-----|-----|-----|-----|-----]
If you are writing software with libbzip2 to decompress data created
with pbzip2, you must take into account that the data contains
multiple bzip2 streams so you will encounter end-of-stream markers
from libbzip2 after each stream and must look-ahead to see if there
are any more streams to process before quitting. The bzip2 program
itself will automatically handle this condition.
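Source: http://compression.ca/pbzip2/
Can someone tell me how to handle this? Should I be using some other libbz2 API?
Also, pbzip2 files are compatible with the normal "bunzip2" command. How does bzip2 handle this gracefully when my code throws BZ_PARAM_ERROR?
Thanks.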
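After your BZ2_bzDecompressEnd(), you need to call BZ2_bzDecompressInit() again (you must already be calling it before that loop) if there is still data left to decompress, i.e. bytes_input < len.
To decompress each |-----| block, you need one init, some number of decompress calls, and one end. So if you still have input left, you need to do another init, n*decompress, end.
Make sure you do the final end, to avoid a big memory leak.
You are getting BZ_PARAM_ERROR because you are trying to decompress with an uninitialized bz_stream. Once you have called BZ2_bzDecompressEnd(), you can't use that bz_stream again unless you call BZ2_bzDecompressInit() on it.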