wok-next annotate linux-libre/stuff/001-squashfs-decompressors-add-xz-decompressor-module.patch @ rev 16918

Up glibmm glibmm-dev (2.40.0) to reflect glib version
author Yuri Pourre <yuripourre@gmail.com>
date Thu Jul 17 11:41:17 2014 -0300 (2014-07-17)
rev   line source
gokhlayeh@9258 1 From: Lasse Collin <lasse.collin@tukaani.org>
gokhlayeh@9258 2 Date: Thu, 2 Dec 2010 19:14:19 +0000 (+0200)
gokhlayeh@9258 3 Subject: Decompressors: Add XZ decompressor module
gokhlayeh@9258 4 X-Git-Url: http://git.kernel.org/?p=linux%2Fkernel%2Fgit%2Fpkl%2Fsquashfs-xz.git;a=commitdiff_plain;h=3dbc3fe7878e53b43064a12d4ab31ca4c18ce85f
gokhlayeh@9258 5
gokhlayeh@9258 6 Decompressors: Add XZ decompressor module
gokhlayeh@9258 7
gokhlayeh@9258 8 In userspace, the .lzma format has become mostly a legacy
gokhlayeh@9258 9 file format that got superseded by the .xz format. Similarly,
gokhlayeh@9258 10 LZMA Utils was superseded by XZ Utils.
gokhlayeh@9258 11
gokhlayeh@9258 12 These patches add support for XZ decompression into
gokhlayeh@9258 13 the kernel. Most of the code is as is from XZ Embedded
gokhlayeh@9258 14 <http://tukaani.org/xz/embedded.html>. It was written for
gokhlayeh@9258 15 the Linux kernel but is usable in other projects too.
gokhlayeh@9258 16
gokhlayeh@9258 17 Advantages of XZ over the current LZMA code in the kernel:
gokhlayeh@9258 18 - Nice API that can be used by other kernel modules; it's
gokhlayeh@9258 19 not limited to kernel, initramfs, and initrd decompression.
gokhlayeh@9258 20 - Integrity check support (CRC32)
gokhlayeh@9258 21 - BCJ filters improve compression of executable code on
gokhlayeh@9258 22 certain architectures. These together with LZMA2 can
gokhlayeh@9258 23 produce a few percent smaller kernel or Squashfs images
gokhlayeh@9258 24 than plain LZMA without making the decompression slower.
gokhlayeh@9258 25
gokhlayeh@9258 26 This patch: Add the main decompression code (xz_dec), testing
gokhlayeh@9258 27 module (xz_dec_test), wrapper script (xz_wrap.sh) for the xz
gokhlayeh@9258 28 command line tool, and documentation. The xz_dec module is
gokhlayeh@9258 29 enough to have a usable XZ decompressor e.g. for Squashfs.
gokhlayeh@9258 30
gokhlayeh@9258 31 Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
gokhlayeh@9258 32 ---
gokhlayeh@9258 33
gokhlayeh@9258 34 diff --git a/Documentation/xz.txt b/Documentation/xz.txt
gokhlayeh@9258 35 new file mode 100644
gokhlayeh@9258 36 index 0000000..68329ac
gokhlayeh@9258 37 --- /dev/null
gokhlayeh@9258 38 +++ b/Documentation/xz.txt
gokhlayeh@9258 39 @@ -0,0 +1,122 @@
gokhlayeh@9258 40 +
gokhlayeh@9258 41 +XZ data compression in Linux
gokhlayeh@9258 42 +============================
gokhlayeh@9258 43 +
gokhlayeh@9258 44 +Introduction
gokhlayeh@9258 45 +
gokhlayeh@9258 46 + XZ is a general purpose data compression format with high compression
gokhlayeh@9258 47 + ratio and relatively fast decompression. The primary compression
gokhlayeh@9258 48 + algorithm (filter) is LZMA2. Additional filters can be used to improve
gokhlayeh@9258 49 + compression ratio even further. E.g. Branch/Call/Jump (BCJ) filters
gokhlayeh@9258 50 + improve compression ratio of executable data.
gokhlayeh@9258 51 +
gokhlayeh@9258 52 + The XZ decompressor in Linux is called XZ Embedded. It supports
gokhlayeh@9258 53 + the LZMA2 filter and optionally also BCJ filters. CRC32 is supported
gokhlayeh@9258 54 + for integrity checking. The home page of XZ Embedded is at
gokhlayeh@9258 55 + <http://tukaani.org/xz/embedded.html>, where you can find the
gokhlayeh@9258 56 + latest version and also information about using the code outside
gokhlayeh@9258 57 + the Linux kernel.
gokhlayeh@9258 58 +
gokhlayeh@9258 59 + For userspace, XZ Utils provide a zlib-like compression library
gokhlayeh@9258 60 + and a gzip-like command line tool. XZ Utils can be downloaded from
gokhlayeh@9258 61 + <http://tukaani.org/xz/>.
gokhlayeh@9258 62 +
gokhlayeh@9258 63 +XZ related components in the kernel
gokhlayeh@9258 64 +
gokhlayeh@9258 65 + The xz_dec module provides an XZ decompressor with single-call (buffer
gokhlayeh@9258 66 + to buffer) and multi-call (stateful) APIs. The usage of the xz_dec
gokhlayeh@9258 67 + module is documented in include/linux/xz.h.
gokhlayeh@9258 68 +
gokhlayeh@9258 69 + The xz_dec_test module is for testing xz_dec. xz_dec_test is not
gokhlayeh@9258 70 + useful unless you are hacking the XZ decompressor. xz_dec_test
gokhlayeh@9258 71 + allocates a char device major dynamically to which one can write
gokhlayeh@9258 72 + .xz files from userspace. The decompressed output is thrown away.
gokhlayeh@9258 73 + Keep an eye on dmesg to see diagnostics printed by xz_dec_test.
gokhlayeh@9258 74 + See the xz_dec_test source code for the details.
gokhlayeh@9258 75 +
gokhlayeh@9258 76 + For decompressing the kernel image, initramfs, and initrd, there
gokhlayeh@9258 77 + is a wrapper function in lib/decompress_unxz.c. Its API is the
gokhlayeh@9258 78 + same as in other decompress_*.c files, which is defined in
gokhlayeh@9258 79 + include/linux/decompress/generic.h.
gokhlayeh@9258 80 +
gokhlayeh@9258 81 + scripts/xz_wrap.sh is a wrapper for the xz command line tool found
gokhlayeh@9258 82 + in XZ Utils. The wrapper sets compression options to values suitable
gokhlayeh@9258 83 + for compressing the kernel image.
gokhlayeh@9258 84 +
gokhlayeh@9258 85 + For kernel makefiles, two commands are provided for use with
gokhlayeh@9258 86 + $(call if_needed). The kernel image should be compressed with
gokhlayeh@9258 87 + $(call if_needed,xzkern) which will use a BCJ filter and a big LZMA2
gokhlayeh@9258 88 + dictionary. It will also append a four-byte trailer containing the
gokhlayeh@9258 89 + uncompressed size of the file, which is needed by the boot code.
gokhlayeh@9258 90 + Other things should be compressed with $(call if_needed,xzmisc)
gokhlayeh@9258 91 + which will use no BCJ filter and 1 MiB LZMA2 dictionary.
gokhlayeh@9258 92 +
gokhlayeh@9258 93 +Notes on compression options
gokhlayeh@9258 94 +
gokhlayeh@9258 95 + Since XZ Embedded supports only streams with no integrity check or
gokhlayeh@9258 96 + CRC32, make sure that you don't use some other integrity check type
gokhlayeh@9258 97 + when encoding files that are supposed to be decoded by the kernel. With
gokhlayeh@9258 98 + liblzma, you need to use either LZMA_CHECK_NONE or LZMA_CHECK_CRC32
gokhlayeh@9258 99 + when encoding. With the xz command line tool, use --check=none or
gokhlayeh@9258 100 + --check=crc32.
gokhlayeh@9258 101 +
gokhlayeh@9258 102 + Using CRC32 is strongly recommended unless there is some other layer
gokhlayeh@9258 103 + which will verify the integrity of the uncompressed data anyway.
gokhlayeh@9258 104 + Double checking the integrity would probably be a waste of CPU cycles.
gokhlayeh@9258 105 + Note that the headers will always have a CRC32 which will be validated
gokhlayeh@9258 106 + by the decoder; you can only change the integrity check type (or
gokhlayeh@9258 107 + disable it) for the actual uncompressed data.
gokhlayeh@9258 108 +
gokhlayeh@9258 109 + In userspace, LZMA2 is typically used with dictionary sizes of several
gokhlayeh@9258 110 + megabytes. The decoder needs to have the dictionary in RAM, thus big
gokhlayeh@9258 111 + dictionaries cannot be used for files that are intended to be decoded
gokhlayeh@9258 112 + by the kernel. 1 MiB is probably the maximum reasonable dictionary
gokhlayeh@9258 113 + size for in-kernel use (maybe more is OK for initramfs). The presets
gokhlayeh@9258 114 + in XZ Utils may not be optimal when creating files for the kernel,
gokhlayeh@9258 115 + so don't hesitate to use custom settings. Example:
gokhlayeh@9258 116 +
gokhlayeh@9258 117 + xz --check=crc32 --lzma2=dict=512KiB inputfile
gokhlayeh@9258 118 +
gokhlayeh@9258 119 + An exception to the above dictionary size limitation is when the decoder
gokhlayeh@9258 120 + is used in single-call mode. Decompressing the kernel itself is an
gokhlayeh@9258 121 + example of this situation. In single-call mode, the memory usage
gokhlayeh@9258 122 + doesn't depend on the dictionary size, and it is perfectly fine to
gokhlayeh@9258 123 + use a big dictionary: for maximum compression, the dictionary should
gokhlayeh@9258 124 + be at least as big as the uncompressed data itself.
gokhlayeh@9258 125 +
gokhlayeh@9258 126 +Future plans
gokhlayeh@9258 127 +
gokhlayeh@9258 128 + Creating a limited XZ encoder may be considered if people think it is
gokhlayeh@9258 129 + useful. LZMA2 is slower to compress than e.g. Deflate or LZO even at
gokhlayeh@9258 130 + the fastest settings, so it isn't clear whether an LZMA2 encoder is wanted
gokhlayeh@9258 131 + in the kernel.
gokhlayeh@9258 132 +
gokhlayeh@9258 133 + Support for limited random-access reading is planned for the
gokhlayeh@9258 134 + decompression code. I don't know if it could have any use in the
gokhlayeh@9258 135 + kernel, but I know that it would be useful in some embedded projects
gokhlayeh@9258 136 + outside the Linux kernel.
gokhlayeh@9258 137 +
gokhlayeh@9258 138 +Conformance to the .xz file format specification
gokhlayeh@9258 139 +
gokhlayeh@9258 140 + There are a couple of corner cases where things have been simplified
gokhlayeh@9258 141 + at the expense of detecting errors as early as possible. These should not
gokhlayeh@9258 142 + matter in practice at all, since they don't cause security issues. But
gokhlayeh@9258 143 + it is good to know this if testing the code e.g. with the test files
gokhlayeh@9258 144 + from XZ Utils.
gokhlayeh@9258 145 +
gokhlayeh@9258 146 +Reporting bugs
gokhlayeh@9258 147 +
gokhlayeh@9258 148 + Before reporting a bug, please check that it's not fixed already
gokhlayeh@9258 149 + at upstream. See <http://tukaani.org/xz/embedded.html> to get the
gokhlayeh@9258 150 + latest code.
gokhlayeh@9258 151 +
gokhlayeh@9258 152 + Report bugs to <lasse.collin@tukaani.org> or visit #tukaani on
gokhlayeh@9258 153 + Freenode and talk to Larhzu. I don't actively read LKML or other
gokhlayeh@9258 154 + kernel-related mailing lists, so if there's something I should know,
gokhlayeh@9258 155 + you should email me personally or use IRC.
gokhlayeh@9258 156 +
gokhlayeh@9258 157 + Don't bother Igor Pavlov with questions about the XZ implementation
gokhlayeh@9258 158 + in the kernel or about XZ Utils. While these two implementations
gokhlayeh@9258 159 + include essential code that is directly based on Igor Pavlov's code,
gokhlayeh@9258 160 + these implementations are neither maintained nor supported by him.
gokhlayeh@9258 161 +
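
The documentation above says that $(call if_needed,xzkern) appends a four-byte trailer holding the uncompressed size for the boot code. As a rough illustration only, the sketch below reads such a trailer from an image held in memory. The helper name read_size_trailer() is hypothetical and not part of this patch, and the little-endian byte order is an assumption: the text above does not state the byte order.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): read the four-byte size
 * trailer from the end of a compressed kernel image held in memory.
 *
 * ASSUMPTION: the trailer is stored little-endian. The documentation
 * above only says that it is four bytes long.
 *
 * Returns 0 on success, -1 if the arguments are invalid or the image
 * is too short to contain a trailer.
 */
static int read_size_trailer(const uint8_t *image, size_t image_size,
			     uint32_t *uncompressed_size)
{
	const uint8_t *t;

	if (image == NULL || uncompressed_size == NULL || image_size < 4)
		return -1;

	/* The trailer occupies the last four bytes of the image. */
	t = image + image_size - 4;

	*uncompressed_size = (uint32_t)t[0]
			| ((uint32_t)t[1] << 8)
			| ((uint32_t)t[2] << 16)
			| ((uint32_t)t[3] << 24);

	return 0;
}
```

A caller would typically use the recovered size to choose an output buffer for single-call decoding, since in that mode the output buffer must hold the whole uncompressed image.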
gokhlayeh@9258 162 diff --git a/include/linux/xz.h b/include/linux/xz.h
gokhlayeh@9258 163 new file mode 100644
gokhlayeh@9258 164 index 0000000..64cffa6
gokhlayeh@9258 165 --- /dev/null
gokhlayeh@9258 166 +++ b/include/linux/xz.h
gokhlayeh@9258 167 @@ -0,0 +1,264 @@
gokhlayeh@9258 168 +/*
gokhlayeh@9258 169 + * XZ decompressor
gokhlayeh@9258 170 + *
gokhlayeh@9258 171 + * Authors: Lasse Collin <lasse.collin@tukaani.org>
gokhlayeh@9258 172 + * Igor Pavlov <http://7-zip.org/>
gokhlayeh@9258 173 + *
gokhlayeh@9258 174 + * This file has been put into the public domain.
gokhlayeh@9258 175 + * You can do whatever you want with this file.
gokhlayeh@9258 176 + */
gokhlayeh@9258 177 +
gokhlayeh@9258 178 +#ifndef XZ_H
gokhlayeh@9258 179 +#define XZ_H
gokhlayeh@9258 180 +
gokhlayeh@9258 181 +#ifdef __KERNEL__
gokhlayeh@9258 182 +# include <linux/stddef.h>
gokhlayeh@9258 183 +# include <linux/types.h>
gokhlayeh@9258 184 +#else
gokhlayeh@9258 185 +# include <stddef.h>
gokhlayeh@9258 186 +# include <stdint.h>
gokhlayeh@9258 187 +#endif
gokhlayeh@9258 188 +
gokhlayeh@9258 189 +/* In Linux, this is used to make extern functions static when needed. */
gokhlayeh@9258 190 +#ifndef XZ_EXTERN
gokhlayeh@9258 191 +# define XZ_EXTERN extern
gokhlayeh@9258 192 +#endif
gokhlayeh@9258 193 +
gokhlayeh@9258 194 +/**
gokhlayeh@9258 195 + * enum xz_mode - Operation mode
gokhlayeh@9258 196 + *
gokhlayeh@9258 197 + * @XZ_SINGLE: Single-call mode. This uses less RAM than
gokhlayeh@9258 198 + * multi-call modes, because the LZMA2
gokhlayeh@9258 199 + * dictionary doesn't need to be allocated as
gokhlayeh@9258 200 + * part of the decoder state. All required data
gokhlayeh@9258 201 + * structures are allocated at initialization,
gokhlayeh@9258 202 + * so xz_dec_run() cannot return XZ_MEM_ERROR.
gokhlayeh@9258 203 + * @XZ_PREALLOC: Multi-call mode with preallocated LZMA2
gokhlayeh@9258 204 + * dictionary buffer. All data structures are
gokhlayeh@9258 205 + * allocated at initialization, so xz_dec_run()
gokhlayeh@9258 206 + * cannot return XZ_MEM_ERROR.
gokhlayeh@9258 207 + * @XZ_DYNALLOC: Multi-call mode. The LZMA2 dictionary is
gokhlayeh@9258 208 + * allocated once the required size has been
gokhlayeh@9258 209 + * parsed from the stream headers. If the
gokhlayeh@9258 210 + * allocation fails, xz_dec_run() will return
gokhlayeh@9258 211 + * XZ_MEM_ERROR.
gokhlayeh@9258 212 + *
gokhlayeh@9258 213 + * It is possible to enable support only for a subset of the above
gokhlayeh@9258 214 + * modes at compile time by defining XZ_DEC_SINGLE, XZ_DEC_PREALLOC,
gokhlayeh@9258 215 + * or XZ_DEC_DYNALLOC. The xz_dec kernel module is always compiled
gokhlayeh@9258 216 + * with support for all operation modes, but the preboot code may
gokhlayeh@9258 217 + * be built with fewer features to minimize code size.
gokhlayeh@9258 218 + */
gokhlayeh@9258 219 +enum xz_mode {
gokhlayeh@9258 220 + XZ_SINGLE,
gokhlayeh@9258 221 + XZ_PREALLOC,
gokhlayeh@9258 222 + XZ_DYNALLOC
gokhlayeh@9258 223 +};
gokhlayeh@9258 224 +
gokhlayeh@9258 225 +/**
gokhlayeh@9258 226 + * enum xz_ret - Return codes
gokhlayeh@9258 227 + * @XZ_OK: Everything is OK so far. More input or more
gokhlayeh@9258 228 + * output space is required to continue. This
gokhlayeh@9258 229 + * return code is possible only in multi-call mode
gokhlayeh@9258 230 + * (XZ_PREALLOC or XZ_DYNALLOC).
gokhlayeh@9258 231 + * @XZ_STREAM_END: Operation finished successfully.
gokhlayeh@9258 232 + * @XZ_UNSUPPORTED_CHECK: Integrity check type is not supported. Decoding
gokhlayeh@9258 233 + * is still possible in multi-call mode by simply
gokhlayeh@9258 234 + * calling xz_dec_run() again.
gokhlayeh@9258 235 + * Note that this return value is used only if
gokhlayeh@9258 236 + * XZ_DEC_ANY_CHECK was defined at build time,
gokhlayeh@9258 237 + * which is not used in the kernel. Unsupported
gokhlayeh@9258 238 + * check types return XZ_OPTIONS_ERROR if
gokhlayeh@9258 239 + * XZ_DEC_ANY_CHECK was not defined at build time.
gokhlayeh@9258 240 + * @XZ_MEM_ERROR: Allocating memory failed. This return code is
gokhlayeh@9258 241 + * possible only if the decoder was initialized
gokhlayeh@9258 242 + * with XZ_DYNALLOC. The size of the attempted
gokhlayeh@9258 243 + * allocation was no more than the
gokhlayeh@9258 244 + * dict_max argument given to xz_dec_init().
gokhlayeh@9258 245 + * @XZ_MEMLIMIT_ERROR: A bigger LZMA2 dictionary would be needed than
gokhlayeh@9258 246 + * allowed by the dict_max argument given to
gokhlayeh@9258 247 + * xz_dec_init(). This return value is possible
gokhlayeh@9258 248 + * only in multi-call mode (XZ_PREALLOC or
gokhlayeh@9258 249 + * XZ_DYNALLOC); the single-call mode (XZ_SINGLE)
gokhlayeh@9258 250 + * ignores the dict_max argument.
gokhlayeh@9258 251 + * @XZ_FORMAT_ERROR: File format was not recognized (wrong magic
gokhlayeh@9258 252 + * bytes).
gokhlayeh@9258 253 + * @XZ_OPTIONS_ERROR: This implementation doesn't support the requested
gokhlayeh@9258 254 + * compression options. In the decoder this means
gokhlayeh@9258 255 + * that the header CRC32 matches, but the header
gokhlayeh@9258 256 + * itself specifies something that we don't support.
gokhlayeh@9258 257 + * @XZ_DATA_ERROR: Compressed data is corrupt.
gokhlayeh@9258 258 + * @XZ_BUF_ERROR: Cannot make any progress. Details are slightly
gokhlayeh@9258 259 + * different between multi-call and single-call
gokhlayeh@9258 260 + * mode; more information below.
gokhlayeh@9258 261 + *
gokhlayeh@9258 262 + * In multi-call mode, XZ_BUF_ERROR is returned when two consecutive calls
gokhlayeh@9258 263 + * to XZ code cannot consume any input and cannot produce any new output.
gokhlayeh@9258 264 + * This happens when there is no new input available, or the output buffer
gokhlayeh@9258 265 + * is full while at least one output byte is still pending. Assuming your
gokhlayeh@9258 266 + * code is not buggy, you can get this error only when decoding a compressed
gokhlayeh@9258 267 + * stream that is truncated or otherwise corrupt.
gokhlayeh@9258 268 + *
gokhlayeh@9258 269 + * In single-call mode, XZ_BUF_ERROR is returned only when the output buffer
gokhlayeh@9258 270 + * is too small or the compressed input is corrupt in a way that makes the
gokhlayeh@9258 271 + * decoder produce more output than the caller expected. When it is
gokhlayeh@9258 272 + * (relatively) clear that the compressed input is truncated, XZ_DATA_ERROR
gokhlayeh@9258 273 + * is used instead of XZ_BUF_ERROR.
gokhlayeh@9258 274 + */
gokhlayeh@9258 275 +enum xz_ret {
gokhlayeh@9258 276 + XZ_OK,
gokhlayeh@9258 277 + XZ_STREAM_END,
gokhlayeh@9258 278 + XZ_UNSUPPORTED_CHECK,
gokhlayeh@9258 279 + XZ_MEM_ERROR,
gokhlayeh@9258 280 + XZ_MEMLIMIT_ERROR,
gokhlayeh@9258 281 + XZ_FORMAT_ERROR,
gokhlayeh@9258 282 + XZ_OPTIONS_ERROR,
gokhlayeh@9258 283 + XZ_DATA_ERROR,
gokhlayeh@9258 284 + XZ_BUF_ERROR
gokhlayeh@9258 285 +};
gokhlayeh@9258 286 +
gokhlayeh@9258 287 +/**
gokhlayeh@9258 288 + * struct xz_buf - Passing input and output buffers to XZ code
gokhlayeh@9258 289 + * @in: Beginning of the input buffer. This may be NULL if and only
gokhlayeh@9258 290 + * if in_pos is equal to in_size.
gokhlayeh@9258 291 + * @in_pos: Current position in the input buffer. This must not exceed
gokhlayeh@9258 292 + * in_size.
gokhlayeh@9258 293 + * @in_size: Size of the input buffer
gokhlayeh@9258 294 + * @out: Beginning of the output buffer. This may be NULL if and only
gokhlayeh@9258 295 + * if out_pos is equal to out_size.
gokhlayeh@9258 296 + * @out_pos: Current position in the output buffer. This must not exceed
gokhlayeh@9258 297 + * out_size.
gokhlayeh@9258 298 + * @out_size: Size of the output buffer
gokhlayeh@9258 299 + *
gokhlayeh@9258 300 + * Only the contents of the output buffer from out[out_pos] onward, and
gokhlayeh@9258 301 + * the variables in_pos and out_pos are modified by the XZ code.
gokhlayeh@9258 302 + */
gokhlayeh@9258 303 +struct xz_buf {
gokhlayeh@9258 304 + const uint8_t *in;
gokhlayeh@9258 305 + size_t in_pos;
gokhlayeh@9258 306 + size_t in_size;
gokhlayeh@9258 307 +
gokhlayeh@9258 308 + uint8_t *out;
gokhlayeh@9258 309 + size_t out_pos;
gokhlayeh@9258 310 + size_t out_size;
gokhlayeh@9258 311 +};
gokhlayeh@9258 312 +
gokhlayeh@9258 313 +/**
gokhlayeh@9258 314 + * struct xz_dec - Opaque type to hold the XZ decoder state
gokhlayeh@9258 315 + */
gokhlayeh@9258 316 +struct xz_dec;
gokhlayeh@9258 317 +
gokhlayeh@9258 318 +/**
gokhlayeh@9258 319 + * xz_dec_init() - Allocate and initialize an XZ decoder state
gokhlayeh@9258 320 + * @mode: Operation mode
gokhlayeh@9258 321 + * @dict_max: Maximum size of the LZMA2 dictionary (history buffer) for
gokhlayeh@9258 322 + * multi-call decoding. This is ignored in single-call mode
gokhlayeh@9258 323 + * (mode == XZ_SINGLE). LZMA2 dictionary is always 2^n bytes
gokhlayeh@9258 324 + * or 2^n + 2^(n-1) bytes (the latter sizes are less common
gokhlayeh@9258 325 + * in practice), so other values for dict_max don't make sense.
gokhlayeh@9258 326 + * In the kernel, dictionary sizes of 64 KiB, 128 KiB, 256 KiB,
gokhlayeh@9258 327 + * 512 KiB, and 1 MiB are probably the only reasonable values,
gokhlayeh@9258 328 + * except for kernel and initramfs images where a bigger
gokhlayeh@9258 329 + * dictionary can be fine and useful.
gokhlayeh@9258 330 + *
gokhlayeh@9258 331 + * Single-call mode (XZ_SINGLE): xz_dec_run() decodes the whole stream at
gokhlayeh@9258 332 + * once. The caller must provide enough output space or the decoding will
gokhlayeh@9258 333 + * fail. The output space is used as the dictionary buffer, which is why
gokhlayeh@9258 334 + * there is no need to allocate the dictionary as part of the decoder's
gokhlayeh@9258 335 + * internal state.
gokhlayeh@9258 336 + *
gokhlayeh@9258 337 + * Because the output buffer is used as the workspace, streams encoded using
gokhlayeh@9258 338 + * a big dictionary are not a problem in single-call mode. It is enough that
gokhlayeh@9258 339 + * the output buffer is big enough to hold the actual uncompressed data; it
gokhlayeh@9258 340 + * can be smaller than the dictionary size stored in the stream headers.
gokhlayeh@9258 341 + *
gokhlayeh@9258 342 + * Multi-call mode with preallocated dictionary (XZ_PREALLOC): dict_max bytes
gokhlayeh@9258 343 + * of memory is preallocated for the LZMA2 dictionary. This way there is no
gokhlayeh@9258 344 + * risk that xz_dec_run() could run out of memory, since xz_dec_run() will
gokhlayeh@9258 345 + * never allocate any memory. Instead, if the preallocated dictionary is too
gokhlayeh@9258 346 + * small for decoding the given input stream, xz_dec_run() will return
gokhlayeh@9258 347 + * XZ_MEMLIMIT_ERROR. Thus, it is important to know what kind of data will be
gokhlayeh@9258 348 + * decoded to avoid allocating an excessive amount of memory for the dictionary.
gokhlayeh@9258 349 + *
gokhlayeh@9258 350 + * Multi-call mode with dynamically allocated dictionary (XZ_DYNALLOC):
gokhlayeh@9258 351 + * dict_max specifies the maximum allowed dictionary size that xz_dec_run()
gokhlayeh@9258 352 + * may allocate once it has parsed the dictionary size from the stream
gokhlayeh@9258 353 + * headers. This way excessive allocations can be avoided while still
gokhlayeh@9258 354 + * limiting the maximum memory usage to a sane value to prevent running the
gokhlayeh@9258 355 + * system out of memory when decompressing streams from untrusted sources.
gokhlayeh@9258 356 + *
gokhlayeh@9258 357 + * On success, xz_dec_init() returns a pointer to struct xz_dec, which is
gokhlayeh@9258 358 + * ready to be used with xz_dec_run(). If memory allocation fails,
gokhlayeh@9258 359 + * xz_dec_init() returns NULL.
gokhlayeh@9258 360 + */
gokhlayeh@9258 361 +XZ_EXTERN struct xz_dec *xz_dec_init(enum xz_mode mode, uint32_t dict_max);
gokhlayeh@9258 362 +
gokhlayeh@9258 363 +/**
gokhlayeh@9258 364 + * xz_dec_run() - Run the XZ decoder
gokhlayeh@9258 365 + * @s: Decoder state allocated using xz_dec_init()
gokhlayeh@9258 366 + * @b: Input and output buffers
gokhlayeh@9258 367 + *
gokhlayeh@9258 368 + * The possible return values depend on build options and operation mode.
gokhlayeh@9258 369 + * See enum xz_ret for details.
gokhlayeh@9258 370 + *
gokhlayeh@9258 371 + * Note that if an error occurs in single-call mode (return value is not
gokhlayeh@9258 372 + * XZ_STREAM_END), b->in_pos and b->out_pos are not modified and the
gokhlayeh@9258 373 + * contents of the output buffer from b->out[b->out_pos] onward are
gokhlayeh@9258 374 + * undefined. This is true even after XZ_BUF_ERROR, because with some filter
gokhlayeh@9258 375 + * chains, there may be a second pass over the output buffer, and this pass
gokhlayeh@9258 376 + * cannot be properly done if the output buffer is truncated. Thus, you
gokhlayeh@9258 377 + * cannot give the single-call decoder a too small buffer and then expect to
gokhlayeh@9258 378 + * get that amount of valid data from the beginning of the stream. You must use
gokhlayeh@9258 379 + * the multi-call decoder if you don't want to uncompress the whole stream.
gokhlayeh@9258 380 + */
gokhlayeh@9258 381 +XZ_EXTERN enum xz_ret xz_dec_run(struct xz_dec *s, struct xz_buf *b);
gokhlayeh@9258 382 +
gokhlayeh@9258 383 +/**
gokhlayeh@9258 384 + * xz_dec_reset() - Reset an already allocated decoder state
gokhlayeh@9258 385 + * @s: Decoder state allocated using xz_dec_init()
gokhlayeh@9258 386 + *
gokhlayeh@9258 387 + * This function can be used to reset the multi-call decoder state without
gokhlayeh@9258 388 + * freeing and reallocating memory with xz_dec_end() and xz_dec_init().
gokhlayeh@9258 389 + *
gokhlayeh@9258 390 + * In single-call mode, xz_dec_reset() is always called in the beginning of
gokhlayeh@9258 391 + * xz_dec_run(). Thus, an explicit call to xz_dec_reset() is useful only in
gokhlayeh@9258 392 + * multi-call mode.
gokhlayeh@9258 393 + */
gokhlayeh@9258 394 +XZ_EXTERN void xz_dec_reset(struct xz_dec *s);
gokhlayeh@9258 395 +
gokhlayeh@9258 396 +/**
gokhlayeh@9258 397 + * xz_dec_end() - Free the memory allocated for the decoder state
gokhlayeh@9258 398 + * @s: Decoder state allocated using xz_dec_init(). If s is NULL,
gokhlayeh@9258 399 + * this function does nothing.
gokhlayeh@9258 400 + */
gokhlayeh@9258 401 +XZ_EXTERN void xz_dec_end(struct xz_dec *s);
gokhlayeh@9258 402 +
gokhlayeh@9258 403 +/*
gokhlayeh@9258 404 + * Standalone build (userspace build or in-kernel build for boot time use)
gokhlayeh@9258 405 + * needs a CRC32 implementation. For normal in-kernel use, kernel's own
gokhlayeh@9258 406 + * CRC32 module is used instead, and users of this module don't need to
gokhlayeh@9258 407 + * care about the functions below.
gokhlayeh@9258 408 + */
gokhlayeh@9258 409 +#ifndef XZ_INTERNAL_CRC32
gokhlayeh@9258 410 +# ifdef __KERNEL__
gokhlayeh@9258 411 +# define XZ_INTERNAL_CRC32 0
gokhlayeh@9258 412 +# else
gokhlayeh@9258 413 +# define XZ_INTERNAL_CRC32 1
gokhlayeh@9258 414 +# endif
gokhlayeh@9258 415 +#endif
gokhlayeh@9258 416 +
gokhlayeh@9258 417 +#if XZ_INTERNAL_CRC32
gokhlayeh@9258 418 +/*
gokhlayeh@9258 419 + * This must be called before any other xz_* function to initialize
gokhlayeh@9258 420 + * the CRC32 lookup table.
gokhlayeh@9258 421 + */
gokhlayeh@9258 422 +XZ_EXTERN void xz_crc32_init(void);
gokhlayeh@9258 423 +
gokhlayeh@9258 424 +/*
gokhlayeh@9258 425 + * Update CRC32 value using the polynomial from IEEE-802.3. To start a new
gokhlayeh@9258 426 + * calculation, the third argument must be zero. To continue the calculation,
gokhlayeh@9258 427 + * the previously returned value is passed as the third argument.
gokhlayeh@9258 428 + */
gokhlayeh@9258 429 +XZ_EXTERN uint32_t xz_crc32(const uint8_t *buf, size_t size, uint32_t crc);
gokhlayeh@9258 430 +#endif
gokhlayeh@9258 431 +#endif
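
The xz_dec_init() comment in the header above notes that an LZMA2 dictionary is always 2^n or 2^n + 2^(n-1) bytes, so other values of dict_max don't make sense. A caller could check a candidate value before passing it on. The sketch below is one way to do that; the helper name dict_size_is_lzma2_shaped() is hypothetical and is not provided by xz.h. It tests only the shape of the number and does not enforce any minimum or maximum size.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of xz.h): return true if size has the
 * form 2^n or 2^n + 2^(n-1), which is the only shape an LZMA2
 * dictionary size can have according to the xz_dec_init() comment.
 */
static bool dict_size_is_lzma2_shaped(uint32_t size)
{
	uint32_t low;
	uint32_t rest;

	if (size == 0)
		return false;

	/* 2^n: exactly one bit is set. */
	if ((size & (size - 1)) == 0)
		return true;

	/*
	 * 2^n + 2^(n-1): exactly two adjacent bits are set. Isolate the
	 * lowest set bit; what remains must be the next higher bit.
	 * The shift cannot overflow here because the lowest set bit of
	 * a value with more than one bit set is never bit 31.
	 */
	low = size & (~size + 1);
	rest = size ^ low;

	return rest == (low << 1);
}
```

For example, 1 MiB and 1.5 MiB pass the check, while a round decimal value such as 1000000 does not.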
gokhlayeh@9258 432 diff --git a/lib/Kconfig b/lib/Kconfig
gokhlayeh@9258 433 index fa9bf2c..6090314 100644
gokhlayeh@9258 434 --- a/lib/Kconfig
gokhlayeh@9258 435 +++ b/lib/Kconfig
gokhlayeh@9258 436 @@ -106,6 +106,8 @@ config LZO_COMPRESS
gokhlayeh@9258 437 config LZO_DECOMPRESS
gokhlayeh@9258 438 tristate
gokhlayeh@9258 439
gokhlayeh@9258 440 +source "lib/xz/Kconfig"
gokhlayeh@9258 441 +
gokhlayeh@9258 442 #
gokhlayeh@9258 443 # These all provide a common interface (hence the apparent duplication with
gokhlayeh@9258 444 # ZLIB_INFLATE; DECOMPRESS_GZIP is just a wrapper.)
gokhlayeh@9258 445 diff --git a/lib/Makefile b/lib/Makefile
gokhlayeh@9258 446 index e6a3763..f2f98dd 100644
gokhlayeh@9258 447 --- a/lib/Makefile
gokhlayeh@9258 448 +++ b/lib/Makefile
gokhlayeh@9258 449 @@ -69,6 +69,7 @@ obj-$(CONFIG_ZLIB_DEFLATE) += zlib_deflate/
gokhlayeh@9258 450 obj-$(CONFIG_REED_SOLOMON) += reed_solomon/
gokhlayeh@9258 451 obj-$(CONFIG_LZO_COMPRESS) += lzo/
gokhlayeh@9258 452 obj-$(CONFIG_LZO_DECOMPRESS) += lzo/
gokhlayeh@9258 453 +obj-$(CONFIG_XZ_DEC) += xz/
gokhlayeh@9258 454 obj-$(CONFIG_RAID6_PQ) += raid6/
gokhlayeh@9258 455
gokhlayeh@9258 456 lib-$(CONFIG_DECOMPRESS_GZIP) += decompress_inflate.o
gokhlayeh@9258 457 diff --git a/lib/xz/Kconfig b/lib/xz/Kconfig
gokhlayeh@9258 458 new file mode 100644
gokhlayeh@9258 459 index 0000000..e3b6e18
gokhlayeh@9258 460 --- /dev/null
gokhlayeh@9258 461 +++ b/lib/xz/Kconfig
gokhlayeh@9258 462 @@ -0,0 +1,59 @@
gokhlayeh@9258 463 +config XZ_DEC
gokhlayeh@9258 464 + tristate "XZ decompression support"
gokhlayeh@9258 465 + select CRC32
gokhlayeh@9258 466 + help
gokhlayeh@9258 467 + LZMA2 compression algorithm and BCJ filters are supported using
gokhlayeh@9258 468 + the .xz file format as the container. For integrity checking,
gokhlayeh@9258 469 + CRC32 is supported. See Documentation/xz.txt for more information.
gokhlayeh@9258 470 +
gokhlayeh@9258 471 +config XZ_DEC_X86
gokhlayeh@9258 472 + bool "x86 BCJ filter decoder" if EMBEDDED
gokhlayeh@9258 473 + default y
gokhlayeh@9258 474 + depends on XZ_DEC
gokhlayeh@9258 475 + select XZ_DEC_BCJ
gokhlayeh@9258 476 +
gokhlayeh@9258 477 +config XZ_DEC_POWERPC
gokhlayeh@9258 478 + bool "PowerPC BCJ filter decoder" if EMBEDDED
gokhlayeh@9258 479 + default y
gokhlayeh@9258 480 + depends on XZ_DEC
gokhlayeh@9258 481 + select XZ_DEC_BCJ
gokhlayeh@9258 482 +
gokhlayeh@9258 483 +config XZ_DEC_IA64
gokhlayeh@9258 484 + bool "IA-64 BCJ filter decoder" if EMBEDDED
gokhlayeh@9258 485 + default y
gokhlayeh@9258 486 + depends on XZ_DEC
gokhlayeh@9258 487 + select XZ_DEC_BCJ
gokhlayeh@9258 488 +
gokhlayeh@9258 489 +config XZ_DEC_ARM
gokhlayeh@9258 490 + bool "ARM BCJ filter decoder" if EMBEDDED
gokhlayeh@9258 491 + default y
gokhlayeh@9258 492 + depends on XZ_DEC
gokhlayeh@9258 493 + select XZ_DEC_BCJ
gokhlayeh@9258 494 +
gokhlayeh@9258 495 +config XZ_DEC_ARMTHUMB
gokhlayeh@9258 496 + bool "ARM-Thumb BCJ filter decoder" if EMBEDDED
gokhlayeh@9258 497 + default y
gokhlayeh@9258 498 + depends on XZ_DEC
gokhlayeh@9258 499 + select XZ_DEC_BCJ
gokhlayeh@9258 500 +
gokhlayeh@9258 501 +config XZ_DEC_SPARC
gokhlayeh@9258 502 + bool "SPARC BCJ filter decoder" if EMBEDDED
gokhlayeh@9258 503 + default y
gokhlayeh@9258 504 + depends on XZ_DEC
gokhlayeh@9258 505 + select XZ_DEC_BCJ
gokhlayeh@9258 506 +
gokhlayeh@9258 507 +config XZ_DEC_BCJ
gokhlayeh@9258 508 + bool
gokhlayeh@9258 509 + default n
gokhlayeh@9258 510 +
gokhlayeh@9258 511 +config XZ_DEC_TEST
gokhlayeh@9258 512 + tristate "XZ decompressor tester"
gokhlayeh@9258 513 + default n
gokhlayeh@9258 514 + depends on XZ_DEC
gokhlayeh@9258 515 + help
gokhlayeh@9258 516 + This allows passing .xz files to the in-kernel XZ decoder via
gokhlayeh@9258 517 + a character special file. It calculates CRC32 of the decompressed
gokhlayeh@9258 518 + data and writes diagnostics to the system log.
gokhlayeh@9258 519 +
gokhlayeh@9258 520 + Unless you are developing the XZ decoder, you don't need this
gokhlayeh@9258 521 + and should say N.
gokhlayeh@9258 522 diff --git a/lib/xz/Makefile b/lib/xz/Makefile
gokhlayeh@9258 523 new file mode 100644
gokhlayeh@9258 524 index 0000000..a7fa769
gokhlayeh@9258 525 --- /dev/null
gokhlayeh@9258 526 +++ b/lib/xz/Makefile
gokhlayeh@9258 527 @@ -0,0 +1,5 @@
gokhlayeh@9258 528 +obj-$(CONFIG_XZ_DEC) += xz_dec.o
gokhlayeh@9258 529 +xz_dec-y := xz_dec_syms.o xz_dec_stream.o xz_dec_lzma2.o
gokhlayeh@9258 530 +xz_dec-$(CONFIG_XZ_DEC_BCJ) += xz_dec_bcj.o
gokhlayeh@9258 531 +
gokhlayeh@9258 532 +obj-$(CONFIG_XZ_DEC_TEST) += xz_dec_test.o
gokhlayeh@9258 533 diff --git a/lib/xz/xz_crc32.c b/lib/xz/xz_crc32.c
gokhlayeh@9258 534 new file mode 100644
gokhlayeh@9258 535 index 0000000..34532d1
gokhlayeh@9258 536 --- /dev/null
gokhlayeh@9258 537 +++ b/lib/xz/xz_crc32.c
gokhlayeh@9258 538 @@ -0,0 +1,59 @@
gokhlayeh@9258 539 +/*
gokhlayeh@9258 540 + * CRC32 using the polynomial from IEEE-802.3
gokhlayeh@9258 541 + *
gokhlayeh@9258 542 + * Authors: Lasse Collin <lasse.collin@tukaani.org>
gokhlayeh@9258 543 + * Igor Pavlov <http://7-zip.org/>
gokhlayeh@9258 544 + *
gokhlayeh@9258 545 + * This file has been put into the public domain.
gokhlayeh@9258 546 + * You can do whatever you want with this file.
gokhlayeh@9258 547 + */
gokhlayeh@9258 548 +
gokhlayeh@9258 549 +/*
gokhlayeh@9258 550 + * This is not the fastest implementation, but it is pretty compact.
gokhlayeh@9258 551 + * The fastest versions of xz_crc32() on modern CPUs without hardware
gokhlayeh@9258 552 + * accelerated CRC instruction are 3-5 times as fast as this version,
gokhlayeh@9258 553 + * but they are bigger and use more memory for the lookup table.
gokhlayeh@9258 554 + */
gokhlayeh@9258 555 +
gokhlayeh@9258 556 +#include "xz_private.h"
gokhlayeh@9258 557 +
gokhlayeh@9258 558 +/*
gokhlayeh@9258 559 + * STATIC_RW_DATA is used in the pre-boot environment on some architectures.
gokhlayeh@9258 560 + * See <linux/decompress/mm.h> for details.
gokhlayeh@9258 561 + */
gokhlayeh@9258 562 +#ifndef STATIC_RW_DATA
gokhlayeh@9258 563 +# define STATIC_RW_DATA static
gokhlayeh@9258 564 +#endif
gokhlayeh@9258 565 +
gokhlayeh@9258 566 +STATIC_RW_DATA uint32_t xz_crc32_table[256];
gokhlayeh@9258 567 +
gokhlayeh@9258 568 +XZ_EXTERN void xz_crc32_init(void)
gokhlayeh@9258 569 +{
gokhlayeh@9258 570 + const uint32_t poly = 0xEDB88320;
gokhlayeh@9258 571 +
gokhlayeh@9258 572 + uint32_t i;
gokhlayeh@9258 573 + uint32_t j;
gokhlayeh@9258 574 + uint32_t r;
gokhlayeh@9258 575 +
gokhlayeh@9258 576 + for (i = 0; i < 256; ++i) {
gokhlayeh@9258 577 + r = i;
gokhlayeh@9258 578 + for (j = 0; j < 8; ++j)
gokhlayeh@9258 579 + r = (r >> 1) ^ (poly & ~((r & 1) - 1));
gokhlayeh@9258 580 +
gokhlayeh@9258 581 + xz_crc32_table[i] = r;
gokhlayeh@9258 582 + }
gokhlayeh@9258 583 +
gokhlayeh@9258 584 + return;
gokhlayeh@9258 585 +}
gokhlayeh@9258 586 +
gokhlayeh@9258 587 +XZ_EXTERN uint32_t xz_crc32(const uint8_t *buf, size_t size, uint32_t crc)
gokhlayeh@9258 588 +{
gokhlayeh@9258 589 + crc = ~crc;
gokhlayeh@9258 590 +
gokhlayeh@9258 591 + while (size != 0) {
gokhlayeh@9258 592 + crc = xz_crc32_table[*buf++ ^ (crc & 0xFF)] ^ (crc >> 8);
gokhlayeh@9258 593 + --size;
gokhlayeh@9258 594 + }
gokhlayeh@9258 595 +
gokhlayeh@9258 596 + return ~crc;
gokhlayeh@9258 597 +}
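The table-driven CRC32 in xz_crc32.c above can be tried outside the kernel. The following is a minimal userspace sketch of the same technique, not part of the patch: the names crc32_table_sketch, crc32_init_sketch, crc32_sketch and crc32_selfcheck_sketch are hypothetical, and the self-check relies on the standard CRC-32 check value 0xCBF43926 for the ASCII string "123456789".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace copy of the lookup table; not the kernel symbol. */
static uint32_t crc32_table_sketch[256];

/* Build the 256-entry table for the reflected IEEE-802.3 polynomial. */
static void crc32_init_sketch(void)
{
	const uint32_t poly = 0xEDB88320;
	uint32_t i, j, r;

	for (i = 0; i < 256; ++i) {
		r = i;
		for (j = 0; j < 8; ++j) {
			/*
			 * (r & 1) - 1 is 0 when the low bit is set and all
			 * ones otherwise, so the inverted mask applies poly
			 * only when the bit shifted out is 1.
			 */
			r = (r >> 1) ^ (poly & ~((r & 1) - 1));
		}
		crc32_table_sketch[i] = r;
	}
}

/* Process one byte per table lookup; pass crc = 0 for the first call. */
static uint32_t crc32_sketch(const uint8_t *buf, size_t size, uint32_t crc)
{
	crc = ~crc;
	while (size != 0) {
		crc = crc32_table_sketch[*buf++ ^ (crc & 0xFF)] ^ (crc >> 8);
		--size;
	}
	return ~crc;
}

/* Compare against the well-known CRC-32 check value. */
static void crc32_selfcheck_sketch(void)
{
	uint32_t whole;
	uint32_t part;

	crc32_init_sketch();
	whole = crc32_sketch((const uint8_t *)"123456789", 9, 0);
	assert(whole == 0xCBF43926u);

	/* Feeding the data in two pieces gives the same result. */
	part = crc32_sketch((const uint8_t *)"12345", 5, 0);
	part = crc32_sketch((const uint8_t *)"6789", 4, part);
	assert(part == whole);
}
```

Because the function inverts crc on entry and again on return, the value returned by one call can be passed as the crc argument of the next call to continue over a split buffer.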
gokhlayeh@9258 598 diff --git a/lib/xz/xz_dec_bcj.c b/lib/xz/xz_dec_bcj.c
gokhlayeh@9258 599 new file mode 100644
gokhlayeh@9258 600 index 0000000..e51e255
gokhlayeh@9258 601 --- /dev/null
gokhlayeh@9258 602 +++ b/lib/xz/xz_dec_bcj.c
gokhlayeh@9258 603 @@ -0,0 +1,561 @@
gokhlayeh@9258 604 +/*
gokhlayeh@9258 605 + * Branch/Call/Jump (BCJ) filter decoders
gokhlayeh@9258 606 + *
gokhlayeh@9258 607 + * Authors: Lasse Collin <lasse.collin@tukaani.org>
gokhlayeh@9258 608 + * Igor Pavlov <http://7-zip.org/>
gokhlayeh@9258 609 + *
gokhlayeh@9258 610 + * This file has been put into the public domain.
gokhlayeh@9258 611 + * You can do whatever you want with this file.
gokhlayeh@9258 612 + */
gokhlayeh@9258 613 +
gokhlayeh@9258 614 +#include "xz_private.h"
gokhlayeh@9258 615 +
gokhlayeh@9258 616 +/*
gokhlayeh@9258 617 + * The rest of the file is inside this ifdef. It makes things a little more
gokhlayeh@9258 618 + * convenient when building without support for any BCJ filters.
gokhlayeh@9258 619 + */
gokhlayeh@9258 620 +#ifdef XZ_DEC_BCJ
gokhlayeh@9258 621 +
gokhlayeh@9258 622 +struct xz_dec_bcj {
gokhlayeh@9258 623 + /* Type of the BCJ filter being used */
gokhlayeh@9258 624 + enum {
gokhlayeh@9258 625 + BCJ_X86 = 4, /* x86 or x86-64 */
gokhlayeh@9258 626 + BCJ_POWERPC = 5, /* Big endian only */
gokhlayeh@9258 627 + BCJ_IA64 = 6, /* Big or little endian */
gokhlayeh@9258 628 + BCJ_ARM = 7, /* Little endian only */
gokhlayeh@9258 629 + BCJ_ARMTHUMB = 8, /* Little endian only */
gokhlayeh@9258 630 + BCJ_SPARC = 9 /* Big or little endian */
gokhlayeh@9258 631 + } type;
gokhlayeh@9258 632 +
gokhlayeh@9258 633 + /*
gokhlayeh@9258 634 + * Return value of the next filter in the chain. We need to preserve
gokhlayeh@9258 635 + * this information across calls, because we must not call the next
gokhlayeh@9258 636 + * filter anymore once it has returned XZ_STREAM_END.
gokhlayeh@9258 637 + */
gokhlayeh@9258 638 + enum xz_ret ret;
gokhlayeh@9258 639 +
gokhlayeh@9258 640 + /* True if we are operating in single-call mode. */
gokhlayeh@9258 641 + bool single_call;
gokhlayeh@9258 642 +
gokhlayeh@9258 643 + /*
gokhlayeh@9258 644 + * Absolute position relative to the beginning of the uncompressed
gokhlayeh@9258 645 + * data (in a single .xz Block). We care only about the lowest 32
gokhlayeh@9258 646 + * bits so this doesn't need to be uint64_t even with big files.
gokhlayeh@9258 647 + */
gokhlayeh@9258 648 + uint32_t pos;
gokhlayeh@9258 649 +
gokhlayeh@9258 650 + /* x86 filter state */
gokhlayeh@9258 651 + uint32_t x86_prev_mask;
gokhlayeh@9258 652 +
gokhlayeh@9258 653 + /* Temporary space to hold the variables from struct xz_buf */
gokhlayeh@9258 654 + uint8_t *out;
gokhlayeh@9258 655 + size_t out_pos;
gokhlayeh@9258 656 + size_t out_size;
gokhlayeh@9258 657 +
gokhlayeh@9258 658 + struct {
gokhlayeh@9258 659 + /* Amount of already filtered data in the beginning of buf */
gokhlayeh@9258 660 + size_t filtered;
gokhlayeh@9258 661 +
gokhlayeh@9258 662 + /* Total amount of data currently stored in buf */
gokhlayeh@9258 663 + size_t size;
gokhlayeh@9258 664 +
gokhlayeh@9258 665 + /*
gokhlayeh@9258 666 + * Buffer to hold a mix of filtered and unfiltered data. This
gokhlayeh@9258 667 + * needs to be big enough to hold Alignment + 2 * Look-ahead:
gokhlayeh@9258 668 + *
gokhlayeh@9258 669 + * Type Alignment Look-ahead
gokhlayeh@9258 670 + * x86 1 4
gokhlayeh@9258 671 + * PowerPC 4 0
gokhlayeh@9258 672 + * IA-64 16 0
gokhlayeh@9258 673 + * ARM 4 0
gokhlayeh@9258 674 + * ARM-Thumb 2 2
gokhlayeh@9258 675 + * SPARC 4 0
gokhlayeh@9258 676 + */
gokhlayeh@9258 677 + uint8_t buf[16];
gokhlayeh@9258 678 + } temp;
gokhlayeh@9258 679 +};
gokhlayeh@9258 680 +
gokhlayeh@9258 681 +#ifdef XZ_DEC_X86
gokhlayeh@9258 682 +/*
gokhlayeh@9258 683 + * This is used to test the most significant byte of a memory address
gokhlayeh@9258 684 + * in an x86 instruction.
gokhlayeh@9258 685 + */
gokhlayeh@9258 686 +static inline int bcj_x86_test_msbyte(uint8_t b)
gokhlayeh@9258 687 +{
gokhlayeh@9258 688 + return b == 0x00 || b == 0xFF;
gokhlayeh@9258 689 +}
gokhlayeh@9258 690 +
gokhlayeh@9258 691 +static size_t bcj_x86(struct xz_dec_bcj *s, uint8_t *buf, size_t size)
gokhlayeh@9258 692 +{
gokhlayeh@9258 693 + static const bool mask_to_allowed_status[8]
gokhlayeh@9258 694 + = { true, true, true, false, true, false, false, false };
gokhlayeh@9258 695 +
gokhlayeh@9258 696 + static const uint8_t mask_to_bit_num[8] = { 0, 1, 2, 2, 3, 3, 3, 3 };
gokhlayeh@9258 697 +
gokhlayeh@9258 698 + size_t i;
gokhlayeh@9258 699 + size_t prev_pos = (size_t)-1;
gokhlayeh@9258 700 + uint32_t prev_mask = s->x86_prev_mask;
gokhlayeh@9258 701 + uint32_t src;
gokhlayeh@9258 702 + uint32_t dest;
gokhlayeh@9258 703 + uint32_t j;
gokhlayeh@9258 704 + uint8_t b;
gokhlayeh@9258 705 +
gokhlayeh@9258 706 + if (size <= 4)
gokhlayeh@9258 707 + return 0;
gokhlayeh@9258 708 +
gokhlayeh@9258 709 + size -= 4;
gokhlayeh@9258 710 + for (i = 0; i < size; ++i) {
gokhlayeh@9258 711 + if ((buf[i] & 0xFE) != 0xE8)
gokhlayeh@9258 712 + continue;
gokhlayeh@9258 713 +
gokhlayeh@9258 714 + prev_pos = i - prev_pos;
gokhlayeh@9258 715 + if (prev_pos > 3) {
gokhlayeh@9258 716 + prev_mask = 0;
gokhlayeh@9258 717 + } else {
gokhlayeh@9258 718 + prev_mask = (prev_mask << (prev_pos - 1)) & 7;
gokhlayeh@9258 719 + if (prev_mask != 0) {
gokhlayeh@9258 720 + b = buf[i + 4 - mask_to_bit_num[prev_mask]];
gokhlayeh@9258 721 + if (!mask_to_allowed_status[prev_mask]
gokhlayeh@9258 722 + || bcj_x86_test_msbyte(b)) {
gokhlayeh@9258 723 + prev_pos = i;
gokhlayeh@9258 724 + prev_mask = (prev_mask << 1) | 1;
gokhlayeh@9258 725 + continue;
gokhlayeh@9258 726 + }
gokhlayeh@9258 727 + }
gokhlayeh@9258 728 + }
gokhlayeh@9258 729 +
gokhlayeh@9258 730 + prev_pos = i;
gokhlayeh@9258 731 +
gokhlayeh@9258 732 + if (bcj_x86_test_msbyte(buf[i + 4])) {
gokhlayeh@9258 733 + src = get_unaligned_le32(buf + i + 1);
gokhlayeh@9258 734 + while (true) {
gokhlayeh@9258 735 + dest = src - (s->pos + (uint32_t)i + 5);
gokhlayeh@9258 736 + if (prev_mask == 0)
gokhlayeh@9258 737 + break;
gokhlayeh@9258 738 +
gokhlayeh@9258 739 + j = mask_to_bit_num[prev_mask] * 8;
gokhlayeh@9258 740 + b = (uint8_t)(dest >> (24 - j));
gokhlayeh@9258 741 + if (!bcj_x86_test_msbyte(b))
gokhlayeh@9258 742 + break;
gokhlayeh@9258 743 +
gokhlayeh@9258 744 + src = dest ^ (((uint32_t)1 << (32 - j)) - 1);
gokhlayeh@9258 745 + }
gokhlayeh@9258 746 +
gokhlayeh@9258 747 + dest &= 0x01FFFFFF;
gokhlayeh@9258 748 + dest |= (uint32_t)0 - (dest & 0x01000000);
gokhlayeh@9258 749 + put_unaligned_le32(dest, buf + i + 1);
gokhlayeh@9258 750 + i += 4;
gokhlayeh@9258 751 + } else {
gokhlayeh@9258 752 + prev_mask = (prev_mask << 1) | 1;
gokhlayeh@9258 753 + }
gokhlayeh@9258 754 + }
gokhlayeh@9258 755 +
gokhlayeh@9258 756 + prev_pos = i - prev_pos;
gokhlayeh@9258 757 + s->x86_prev_mask = prev_pos > 3 ? 0 : prev_mask << (prev_pos - 1);
gokhlayeh@9258 758 + return i;
gokhlayeh@9258 759 +}
gokhlayeh@9258 760 +#endif
gokhlayeh@9258 761 +
gokhlayeh@9258 762 +#ifdef XZ_DEC_POWERPC
gokhlayeh@9258 763 +static size_t bcj_powerpc(struct xz_dec_bcj *s, uint8_t *buf, size_t size)
gokhlayeh@9258 764 +{
gokhlayeh@9258 765 + size_t i;
gokhlayeh@9258 766 + uint32_t instr;
gokhlayeh@9258 767 +
gokhlayeh@9258 768 + for (i = 0; i + 4 <= size; i += 4) {
gokhlayeh@9258 769 + instr = get_unaligned_be32(buf + i);
gokhlayeh@9258 770 + if ((instr & 0xFC000003) == 0x48000001) {
gokhlayeh@9258 771 + instr &= 0x03FFFFFC;
gokhlayeh@9258 772 + instr -= s->pos + (uint32_t)i;
gokhlayeh@9258 773 + instr &= 0x03FFFFFC;
gokhlayeh@9258 774 + instr |= 0x48000001;
gokhlayeh@9258 775 + put_unaligned_be32(instr, buf + i);
gokhlayeh@9258 776 + }
gokhlayeh@9258 777 + }
gokhlayeh@9258 778 +
gokhlayeh@9258 779 + return i;
gokhlayeh@9258 780 +}
gokhlayeh@9258 781 +#endif
gokhlayeh@9258 782 +
gokhlayeh@9258 783 +#ifdef XZ_DEC_IA64
gokhlayeh@9258 784 +static size_t bcj_ia64(struct xz_dec_bcj *s, uint8_t *buf, size_t size)
gokhlayeh@9258 785 +{
gokhlayeh@9258 786 + static const uint8_t branch_table[32] = {
gokhlayeh@9258 787 + 0, 0, 0, 0, 0, 0, 0, 0,
gokhlayeh@9258 788 + 0, 0, 0, 0, 0, 0, 0, 0,
gokhlayeh@9258 789 + 4, 4, 6, 6, 0, 0, 7, 7,
gokhlayeh@9258 790 + 4, 4, 0, 0, 4, 4, 0, 0
gokhlayeh@9258 791 + };
gokhlayeh@9258 792 +
gokhlayeh@9258 793 + /*
gokhlayeh@9258 794 + * The local variables take a little bit of stack space, but it's less
gokhlayeh@9258 795 + * than what LZMA2 decoder takes, so it doesn't make sense to reduce
gokhlayeh@9258 796 + * stack usage here without doing that for the LZMA2 decoder too.
gokhlayeh@9258 797 + */
gokhlayeh@9258 798 +
gokhlayeh@9258 799 + /* Loop counters */
gokhlayeh@9258 800 + size_t i;
gokhlayeh@9258 801 + size_t j;
gokhlayeh@9258 802 +
gokhlayeh@9258 803 + /* Instruction slot (0, 1, or 2) in the 128-bit instruction word */
gokhlayeh@9258 804 + uint32_t slot;
gokhlayeh@9258 805 +
gokhlayeh@9258 806 + /* Bitwise offset of the instruction indicated by slot */
gokhlayeh@9258 807 + uint32_t bit_pos;
gokhlayeh@9258 808 +
gokhlayeh@9258 809 + /* bit_pos split into byte and bit parts */
gokhlayeh@9258 810 + uint32_t byte_pos;
gokhlayeh@9258 811 + uint32_t bit_res;
gokhlayeh@9258 812 +
gokhlayeh@9258 813 + /* Address part of an instruction */
gokhlayeh@9258 814 + uint32_t addr;
gokhlayeh@9258 815 +
gokhlayeh@9258 816 + /* Mask used to detect which instructions to convert */
gokhlayeh@9258 817 + uint32_t mask;
gokhlayeh@9258 818 +
gokhlayeh@9258 819 + /* 41-bit instruction stored somewhere in the lowest 48 bits */
gokhlayeh@9258 820 + uint64_t instr;
gokhlayeh@9258 821 +
gokhlayeh@9258 822 + /* Instruction normalized with bit_res for easier manipulation */
gokhlayeh@9258 823 + uint64_t norm;
gokhlayeh@9258 824 +
gokhlayeh@9258 825 + for (i = 0; i + 16 <= size; i += 16) {
gokhlayeh@9258 826 + mask = branch_table[buf[i] & 0x1F];
gokhlayeh@9258 827 + for (slot = 0, bit_pos = 5; slot < 3; ++slot, bit_pos += 41) {
gokhlayeh@9258 828 + if (((mask >> slot) & 1) == 0)
gokhlayeh@9258 829 + continue;
gokhlayeh@9258 830 +
gokhlayeh@9258 831 + byte_pos = bit_pos >> 3;
gokhlayeh@9258 832 + bit_res = bit_pos & 7;
gokhlayeh@9258 833 + instr = 0;
gokhlayeh@9258 834 + for (j = 0; j < 6; ++j)
gokhlayeh@9258 835 + instr |= (uint64_t)(buf[i + j + byte_pos])
gokhlayeh@9258 836 + << (8 * j);
gokhlayeh@9258 837 +
gokhlayeh@9258 838 + norm = instr >> bit_res;
gokhlayeh@9258 839 +
gokhlayeh@9258 840 + if (((norm >> 37) & 0x0F) == 0x05
gokhlayeh@9258 841 + && ((norm >> 9) & 0x07) == 0) {
gokhlayeh@9258 842 + addr = (norm >> 13) & 0x0FFFFF;
gokhlayeh@9258 843 + addr |= ((uint32_t)(norm >> 36) & 1) << 20;
gokhlayeh@9258 844 + addr <<= 4;
gokhlayeh@9258 845 + addr -= s->pos + (uint32_t)i;
gokhlayeh@9258 846 + addr >>= 4;
gokhlayeh@9258 847 +
gokhlayeh@9258 848 + norm &= ~((uint64_t)0x8FFFFF << 13);
gokhlayeh@9258 849 + norm |= (uint64_t)(addr & 0x0FFFFF) << 13;
gokhlayeh@9258 850 + norm |= (uint64_t)(addr & 0x100000)
gokhlayeh@9258 851 + << (36 - 20);
gokhlayeh@9258 852 +
gokhlayeh@9258 853 + instr &= (1 << bit_res) - 1;
gokhlayeh@9258 854 + instr |= norm << bit_res;
gokhlayeh@9258 855 +
gokhlayeh@9258 856 + for (j = 0; j < 6; j++)
gokhlayeh@9258 857 + buf[i + j + byte_pos]
gokhlayeh@9258 858 + = (uint8_t)(instr >> (8 * j));
gokhlayeh@9258 859 + }
gokhlayeh@9258 860 + }
gokhlayeh@9258 861 + }
gokhlayeh@9258 862 +
gokhlayeh@9258 863 + return i;
gokhlayeh@9258 864 +}
gokhlayeh@9258 865 +#endif
gokhlayeh@9258 866 +
gokhlayeh@9258 867 +#ifdef XZ_DEC_ARM
gokhlayeh@9258 868 +static size_t bcj_arm(struct xz_dec_bcj *s, uint8_t *buf, size_t size)
gokhlayeh@9258 869 +{
gokhlayeh@9258 870 + size_t i;
gokhlayeh@9258 871 + uint32_t addr;
gokhlayeh@9258 872 +
gokhlayeh@9258 873 + for (i = 0; i + 4 <= size; i += 4) {
gokhlayeh@9258 874 + if (buf[i + 3] == 0xEB) {
gokhlayeh@9258 875 + addr = (uint32_t)buf[i] | ((uint32_t)buf[i + 1] << 8)
gokhlayeh@9258 876 + | ((uint32_t)buf[i + 2] << 16);
gokhlayeh@9258 877 + addr <<= 2;
gokhlayeh@9258 878 + addr -= s->pos + (uint32_t)i + 8;
gokhlayeh@9258 879 + addr >>= 2;
gokhlayeh@9258 880 + buf[i] = (uint8_t)addr;
gokhlayeh@9258 881 + buf[i + 1] = (uint8_t)(addr >> 8);
gokhlayeh@9258 882 + buf[i + 2] = (uint8_t)(addr >> 16);
gokhlayeh@9258 883 + }
gokhlayeh@9258 884 + }
gokhlayeh@9258 885 +
gokhlayeh@9258 886 + return i;
gokhlayeh@9258 887 +}
gokhlayeh@9258 888 +#endif
gokhlayeh@9258 889 +
gokhlayeh@9258 890 +#ifdef XZ_DEC_ARMTHUMB
gokhlayeh@9258 891 +static size_t bcj_armthumb(struct xz_dec_bcj *s, uint8_t *buf, size_t size)
gokhlayeh@9258 892 +{
gokhlayeh@9258 893 + size_t i;
gokhlayeh@9258 894 + uint32_t addr;
gokhlayeh@9258 895 +
gokhlayeh@9258 896 + for (i = 0; i + 4 <= size; i += 2) {
gokhlayeh@9258 897 + if ((buf[i + 1] & 0xF8) == 0xF0
gokhlayeh@9258 898 + && (buf[i + 3] & 0xF8) == 0xF8) {
gokhlayeh@9258 899 + addr = (((uint32_t)buf[i + 1] & 0x07) << 19)
gokhlayeh@9258 900 + | ((uint32_t)buf[i] << 11)
gokhlayeh@9258 901 + | (((uint32_t)buf[i + 3] & 0x07) << 8)
gokhlayeh@9258 902 + | (uint32_t)buf[i + 2];
gokhlayeh@9258 903 + addr <<= 1;
gokhlayeh@9258 904 + addr -= s->pos + (uint32_t)i + 4;
gokhlayeh@9258 905 + addr >>= 1;
gokhlayeh@9258 906 + buf[i + 1] = (uint8_t)(0xF0 | ((addr >> 19) & 0x07));
gokhlayeh@9258 907 + buf[i] = (uint8_t)(addr >> 11);
gokhlayeh@9258 908 + buf[i + 3] = (uint8_t)(0xF8 | ((addr >> 8) & 0x07));
gokhlayeh@9258 909 + buf[i + 2] = (uint8_t)addr;
gokhlayeh@9258 910 + i += 2;
gokhlayeh@9258 911 + }
gokhlayeh@9258 912 + }
gokhlayeh@9258 913 +
gokhlayeh@9258 914 + return i;
gokhlayeh@9258 915 +}
gokhlayeh@9258 916 +#endif
gokhlayeh@9258 917 +
gokhlayeh@9258 918 +#ifdef XZ_DEC_SPARC
gokhlayeh@9258 919 +static size_t bcj_sparc(struct xz_dec_bcj *s, uint8_t *buf, size_t size)
gokhlayeh@9258 920 +{
gokhlayeh@9258 921 + size_t i;
gokhlayeh@9258 922 + uint32_t instr;
gokhlayeh@9258 923 +
gokhlayeh@9258 924 + for (i = 0; i + 4 <= size; i += 4) {
gokhlayeh@9258 925 + instr = get_unaligned_be32(buf + i);
gokhlayeh@9258 926 + if ((instr >> 22) == 0x100 || (instr >> 22) == 0x1FF) {
gokhlayeh@9258 927 + instr <<= 2;
gokhlayeh@9258 928 + instr -= s->pos + (uint32_t)i;
gokhlayeh@9258 929 + instr >>= 2;
gokhlayeh@9258 930 + instr = ((uint32_t)0x40000000 - (instr & 0x400000))
gokhlayeh@9258 931 + | 0x40000000 | (instr & 0x3FFFFF);
gokhlayeh@9258 932 + put_unaligned_be32(instr, buf + i);
gokhlayeh@9258 933 + }
gokhlayeh@9258 934 + }
gokhlayeh@9258 935 +
gokhlayeh@9258 936 + return i;
gokhlayeh@9258 937 +}
gokhlayeh@9258 938 +#endif
gokhlayeh@9258 939 +
gokhlayeh@9258 940 +/*
gokhlayeh@9258 941 + * Apply the selected BCJ filter. Update *pos and s->pos to match the amount
gokhlayeh@9258 942 + * of data that got filtered.
gokhlayeh@9258 943 + *
gokhlayeh@9258 944 + * NOTE: This is implemented as a switch statement to avoid using function
gokhlayeh@9258 945 + * pointers, which could be problematic in the kernel boot code, which must
gokhlayeh@9258 946 + * avoid pointers to static data (at least on x86).
gokhlayeh@9258 947 + */
gokhlayeh@9258 948 +static void bcj_apply(struct xz_dec_bcj *s,
gokhlayeh@9258 949 + uint8_t *buf, size_t *pos, size_t size)
gokhlayeh@9258 950 +{
gokhlayeh@9258 951 + size_t filtered;
gokhlayeh@9258 952 +
gokhlayeh@9258 953 + buf += *pos;
gokhlayeh@9258 954 + size -= *pos;
gokhlayeh@9258 955 +
gokhlayeh@9258 956 + switch (s->type) {
gokhlayeh@9258 957 +#ifdef XZ_DEC_X86
gokhlayeh@9258 958 + case BCJ_X86:
gokhlayeh@9258 959 + filtered = bcj_x86(s, buf, size);
gokhlayeh@9258 960 + break;
gokhlayeh@9258 961 +#endif
gokhlayeh@9258 962 +#ifdef XZ_DEC_POWERPC
gokhlayeh@9258 963 + case BCJ_POWERPC:
gokhlayeh@9258 964 + filtered = bcj_powerpc(s, buf, size);
gokhlayeh@9258 965 + break;
gokhlayeh@9258 966 +#endif
gokhlayeh@9258 967 +#ifdef XZ_DEC_IA64
gokhlayeh@9258 968 + case BCJ_IA64:
gokhlayeh@9258 969 + filtered = bcj_ia64(s, buf, size);
gokhlayeh@9258 970 + break;
gokhlayeh@9258 971 +#endif
gokhlayeh@9258 972 +#ifdef XZ_DEC_ARM
gokhlayeh@9258 973 + case BCJ_ARM:
gokhlayeh@9258 974 + filtered = bcj_arm(s, buf, size);
gokhlayeh@9258 975 + break;
gokhlayeh@9258 976 +#endif
gokhlayeh@9258 977 +#ifdef XZ_DEC_ARMTHUMB
gokhlayeh@9258 978 + case BCJ_ARMTHUMB:
gokhlayeh@9258 979 + filtered = bcj_armthumb(s, buf, size);
gokhlayeh@9258 980 + break;
gokhlayeh@9258 981 +#endif
gokhlayeh@9258 982 +#ifdef XZ_DEC_SPARC
gokhlayeh@9258 983 + case BCJ_SPARC:
gokhlayeh@9258 984 + filtered = bcj_sparc(s, buf, size);
gokhlayeh@9258 985 + break;
gokhlayeh@9258 986 +#endif
gokhlayeh@9258 987 + default:
gokhlayeh@9258 988 + /* Never reached but silence compiler warnings. */
gokhlayeh@9258 989 + filtered = 0;
gokhlayeh@9258 990 + break;
gokhlayeh@9258 991 + }
gokhlayeh@9258 992 +
gokhlayeh@9258 993 + *pos += filtered;
gokhlayeh@9258 994 + s->pos += filtered;
gokhlayeh@9258 995 +}
gokhlayeh@9258 996 +
gokhlayeh@9258 997 +/*
gokhlayeh@9258 998 + * Flush pending filtered data from temp to the output buffer.
gokhlayeh@9258 999 + * Move the remaining mixture of possibly filtered and unfiltered
gokhlayeh@9258 1000 + * data to the beginning of temp.
gokhlayeh@9258 1001 + */
gokhlayeh@9258 1002 +static void bcj_flush(struct xz_dec_bcj *s, struct xz_buf *b)
gokhlayeh@9258 1003 +{
gokhlayeh@9258 1004 + size_t copy_size;
gokhlayeh@9258 1005 +
gokhlayeh@9258 1006 + copy_size = min_t(size_t, s->temp.filtered, b->out_size - b->out_pos);
gokhlayeh@9258 1007 + memcpy(b->out + b->out_pos, s->temp.buf, copy_size);
gokhlayeh@9258 1008 + b->out_pos += copy_size;
gokhlayeh@9258 1009 +
gokhlayeh@9258 1010 + s->temp.filtered -= copy_size;
gokhlayeh@9258 1011 + s->temp.size -= copy_size;
gokhlayeh@9258 1012 + memmove(s->temp.buf, s->temp.buf + copy_size, s->temp.size);
gokhlayeh@9258 1013 +}
gokhlayeh@9258 1014 +
gokhlayeh@9258 1015 +/*
gokhlayeh@9258 1016 + * The BCJ filter functions are primitive in the sense that they process the
gokhlayeh@9258 1017 + * data in chunks of 1-16 bytes. To hide this issue, this function does
gokhlayeh@9258 1018 + * some buffering.
gokhlayeh@9258 1019 + */
gokhlayeh@9258 1020 +XZ_EXTERN enum xz_ret xz_dec_bcj_run(struct xz_dec_bcj *s,
gokhlayeh@9258 1021 + struct xz_dec_lzma2 *lzma2,
gokhlayeh@9258 1022 + struct xz_buf *b)
gokhlayeh@9258 1023 +{
gokhlayeh@9258 1024 + size_t out_start;
gokhlayeh@9258 1025 +
gokhlayeh@9258 1026 + /*
gokhlayeh@9258 1027 + * Flush pending already filtered data to the output buffer. Return
gokhlayeh@9258 1028 + * immediately if we couldn't flush everything, or if the next
gokhlayeh@9258 1029 + * filter in the chain had already returned XZ_STREAM_END.
gokhlayeh@9258 1030 + */
gokhlayeh@9258 1031 + if (s->temp.filtered > 0) {
gokhlayeh@9258 1032 + bcj_flush(s, b);
gokhlayeh@9258 1033 + if (s->temp.filtered > 0)
gokhlayeh@9258 1034 + return XZ_OK;
gokhlayeh@9258 1035 +
gokhlayeh@9258 1036 + if (s->ret == XZ_STREAM_END)
gokhlayeh@9258 1037 + return XZ_STREAM_END;
gokhlayeh@9258 1038 + }
gokhlayeh@9258 1039 +
gokhlayeh@9258 1040 + /*
gokhlayeh@9258 1041 + * If we have more output space than what is currently pending in
gokhlayeh@9258 1042 + * temp, copy the unfiltered data from temp to the output buffer
gokhlayeh@9258 1043 + * and try to fill the output buffer by decoding more data from the
gokhlayeh@9258 1044 + * next filter in the chain. Apply the BCJ filter on the new data
gokhlayeh@9258 1045 + * in the output buffer. If everything cannot be filtered, copy it
gokhlayeh@9258 1046 + * to temp and rewind the output buffer position accordingly.
gokhlayeh@9258 1047 + */
gokhlayeh@9258 1048 + if (s->temp.size < b->out_size - b->out_pos) {
gokhlayeh@9258 1049 + out_start = b->out_pos;
gokhlayeh@9258 1050 + memcpy(b->out + b->out_pos, s->temp.buf, s->temp.size);
gokhlayeh@9258 1051 + b->out_pos += s->temp.size;
gokhlayeh@9258 1052 +
gokhlayeh@9258 1053 + s->ret = xz_dec_lzma2_run(lzma2, b);
gokhlayeh@9258 1054 + if (s->ret != XZ_STREAM_END
gokhlayeh@9258 1055 + && (s->ret != XZ_OK || s->single_call))
gokhlayeh@9258 1056 + return s->ret;
gokhlayeh@9258 1057 +
gokhlayeh@9258 1058 + bcj_apply(s, b->out, &out_start, b->out_pos);
gokhlayeh@9258 1059 +
gokhlayeh@9258 1060 + /*
gokhlayeh@9258 1061 + * As an exception, if the next filter returned XZ_STREAM_END,
gokhlayeh@9258 1062 + * we can do that too, since the last few bytes that remain
gokhlayeh@9258 1063 + * unfiltered are meant to remain unfiltered.
gokhlayeh@9258 1064 + */
gokhlayeh@9258 1065 + if (s->ret == XZ_STREAM_END)
gokhlayeh@9258 1066 + return XZ_STREAM_END;
gokhlayeh@9258 1067 +
gokhlayeh@9258 1068 + s->temp.size = b->out_pos - out_start;
gokhlayeh@9258 1069 + b->out_pos -= s->temp.size;
gokhlayeh@9258 1070 + memcpy(s->temp.buf, b->out + b->out_pos, s->temp.size);
gokhlayeh@9258 1071 + }
gokhlayeh@9258 1072 +
gokhlayeh@9258 1073 + /*
gokhlayeh@9258 1074 + * If we have unfiltered data in temp, try to fill by decoding more
gokhlayeh@9258 1075 + * data from the next filter. Apply the BCJ filter on temp. Then we
gokhlayeh@9258 1076 + * hopefully can fill the actual output buffer by copying filtered
gokhlayeh@9258 1077 + * data from temp. A mix of filtered and unfiltered data may be left
gokhlayeh@9258 1078 + * in temp; it will be taken care of on the next call to this function.
gokhlayeh@9258 1079 + */
gokhlayeh@9258 1080 + if (s->temp.size > 0) {
gokhlayeh@9258 1081 + /* Make b->out{,_pos,_size} temporarily point to s->temp. */
gokhlayeh@9258 1082 + s->out = b->out;
gokhlayeh@9258 1083 + s->out_pos = b->out_pos;
gokhlayeh@9258 1084 + s->out_size = b->out_size;
gokhlayeh@9258 1085 + b->out = s->temp.buf;
gokhlayeh@9258 1086 + b->out_pos = s->temp.size;
gokhlayeh@9258 1087 + b->out_size = sizeof(s->temp.buf);
gokhlayeh@9258 1088 +
gokhlayeh@9258 1089 + s->ret = xz_dec_lzma2_run(lzma2, b);
gokhlayeh@9258 1090 +
gokhlayeh@9258 1091 + s->temp.size = b->out_pos;
gokhlayeh@9258 1092 + b->out = s->out;
gokhlayeh@9258 1093 + b->out_pos = s->out_pos;
gokhlayeh@9258 1094 + b->out_size = s->out_size;
gokhlayeh@9258 1095 +
gokhlayeh@9258 1096 + if (s->ret != XZ_OK && s->ret != XZ_STREAM_END)
gokhlayeh@9258 1097 + return s->ret;
gokhlayeh@9258 1098 +
gokhlayeh@9258 1099 + bcj_apply(s, s->temp.buf, &s->temp.filtered, s->temp.size);
gokhlayeh@9258 1100 +
gokhlayeh@9258 1101 + /*
gokhlayeh@9258 1102 + * If the next filter returned XZ_STREAM_END, we mark that
gokhlayeh@9258 1103 + * everything is filtered, since the last unfiltered bytes
gokhlayeh@9258 1104 + * of the stream are meant to be left as is.
gokhlayeh@9258 1105 + */
gokhlayeh@9258 1106 + if (s->ret == XZ_STREAM_END)
gokhlayeh@9258 1107 + s->temp.filtered = s->temp.size;
gokhlayeh@9258 1108 +
gokhlayeh@9258 1109 + bcj_flush(s, b);
gokhlayeh@9258 1110 + if (s->temp.filtered > 0)
gokhlayeh@9258 1111 + return XZ_OK;
gokhlayeh@9258 1112 + }
gokhlayeh@9258 1113 +
gokhlayeh@9258 1114 + return s->ret;
gokhlayeh@9258 1115 +}
gokhlayeh@9258 1116 +
gokhlayeh@9258 1117 +XZ_EXTERN struct xz_dec_bcj *xz_dec_bcj_create(bool single_call)
gokhlayeh@9258 1118 +{
gokhlayeh@9258 1119 + struct xz_dec_bcj *s = kmalloc(sizeof(*s), GFP_KERNEL);
gokhlayeh@9258 1120 + if (s != NULL)
gokhlayeh@9258 1121 + s->single_call = single_call;
gokhlayeh@9258 1122 +
gokhlayeh@9258 1123 + return s;
gokhlayeh@9258 1124 +}
gokhlayeh@9258 1125 +
gokhlayeh@9258 1126 +XZ_EXTERN enum xz_ret xz_dec_bcj_reset(struct xz_dec_bcj *s, uint8_t id)
gokhlayeh@9258 1127 +{
gokhlayeh@9258 1128 + switch (id) {
gokhlayeh@9258 1129 +#ifdef XZ_DEC_X86
gokhlayeh@9258 1130 + case BCJ_X86:
gokhlayeh@9258 1131 +#endif
gokhlayeh@9258 1132 +#ifdef XZ_DEC_POWERPC
gokhlayeh@9258 1133 + case BCJ_POWERPC:
gokhlayeh@9258 1134 +#endif
gokhlayeh@9258 1135 +#ifdef XZ_DEC_IA64
gokhlayeh@9258 1136 + case BCJ_IA64:
gokhlayeh@9258 1137 +#endif
gokhlayeh@9258 1138 +#ifdef XZ_DEC_ARM
gokhlayeh@9258 1139 + case BCJ_ARM:
gokhlayeh@9258 1140 +#endif
gokhlayeh@9258 1141 +#ifdef XZ_DEC_ARMTHUMB
gokhlayeh@9258 1142 + case BCJ_ARMTHUMB:
gokhlayeh@9258 1143 +#endif
gokhlayeh@9258 1144 +#ifdef XZ_DEC_SPARC
gokhlayeh@9258 1145 + case BCJ_SPARC:
gokhlayeh@9258 1146 +#endif
gokhlayeh@9258 1147 + break;
gokhlayeh@9258 1148 +
gokhlayeh@9258 1149 + default:
gokhlayeh@9258 1150 + /* Unsupported Filter ID */
gokhlayeh@9258 1151 + return XZ_OPTIONS_ERROR;
gokhlayeh@9258 1152 + }
gokhlayeh@9258 1153 +
gokhlayeh@9258 1154 + s->type = id;
gokhlayeh@9258 1155 + s->ret = XZ_OK;
gokhlayeh@9258 1156 + s->pos = 0;
gokhlayeh@9258 1157 + s->x86_prev_mask = 0;
gokhlayeh@9258 1158 + s->temp.filtered = 0;
gokhlayeh@9258 1159 + s->temp.size = 0;
gokhlayeh@9258 1160 +
gokhlayeh@9258 1161 + return XZ_OK;
gokhlayeh@9258 1162 +}
gokhlayeh@9258 1163 +
gokhlayeh@9258 1164 +#endif
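To illustrate what the BCJ decoders in xz_dec_bcj.c above undo, here is a minimal userspace sketch of the ARM filter in both directions. It is not part of the patch: arm_bcj_sketch and arm_bcj_selfcheck_sketch are hypothetical names, and the encode direction (adding the position instead of subtracting it) is an assumption about how the compressor side prepares the data; only the decode direction mirrors bcj_arm() shown above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: rewrite the 24-bit operand of every ARM BL
 * instruction (0xEB in the top byte of a little endian word).
 * encode != 0 turns a relative target into an absolute one (assumed
 * compressor side); encode == 0 reverses that, like bcj_arm() does.
 * pos is the offset of buf[0] in the uncompressed data.
 */
static size_t arm_bcj_sketch(uint32_t pos, uint8_t *buf, size_t size,
			     int encode)
{
	size_t i;
	uint32_t addr;

	for (i = 0; i + 4 <= size; i += 4) {
		if (buf[i + 3] != 0xEB)
			continue;

		addr = (uint32_t)buf[i] | ((uint32_t)buf[i + 1] << 8)
				| ((uint32_t)buf[i + 2] << 16);
		addr <<= 2;
		if (encode)
			addr += pos + (uint32_t)i + 8;
		else
			addr -= pos + (uint32_t)i + 8;
		addr >>= 2;

		buf[i] = (uint8_t)addr;
		buf[i + 1] = (uint8_t)(addr >> 8);
		buf[i + 2] = (uint8_t)(addr >> 16);
	}

	/* Number of bytes processed; always a multiple of four. */
	return i;
}

/* Round trip: encode, then decode, must restore the original bytes. */
static void arm_bcj_selfcheck_sketch(void)
{
	uint8_t data[8] = { 0x10, 0x00, 0x00, 0xEB, 0x01, 0x02, 0x03, 0x04 };
	size_t done;

	done = arm_bcj_sketch(0, data, sizeof(data), 1);
	assert(done == 8);
	/* (0x10 << 2) + 8 = 0x48, and 0x48 >> 2 = 0x12. */
	assert(data[0] == 0x12 && data[3] == 0xEB);
	/* The second word is not a BL instruction and stays untouched. */
	assert(data[4] == 0x01 && data[7] == 0x04);

	done = arm_bcj_sketch(0, data, sizeof(data), 0);
	assert(done == 8);
	assert(data[0] == 0x10 && data[1] == 0x00 && data[2] == 0x00);
}
```

Making branch targets absolute means that repeated calls to the same function produce identical byte patterns regardless of where the call site is, which is what lets LZMA2 compress executable code a little better; the decoder only has to subtract the same position again.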
gokhlayeh@9258 1165 diff --git a/lib/xz/xz_dec_lzma2.c b/lib/xz/xz_dec_lzma2.c
gokhlayeh@9258 1166 new file mode 100644
gokhlayeh@9258 1167 index 0000000..ea5fa4f
gokhlayeh@9258 1168 --- /dev/null
gokhlayeh@9258 1169 +++ b/lib/xz/xz_dec_lzma2.c
gokhlayeh@9258 1170 @@ -0,0 +1,1171 @@
gokhlayeh@9258 1171 +/*
gokhlayeh@9258 1172 + * LZMA2 decoder
gokhlayeh@9258 1173 + *
gokhlayeh@9258 1174 + * Authors: Lasse Collin <lasse.collin@tukaani.org>
gokhlayeh@9258 1175 + * Igor Pavlov <http://7-zip.org/>
gokhlayeh@9258 1176 + *
gokhlayeh@9258 1177 + * This file has been put into the public domain.
gokhlayeh@9258 1178 + * You can do whatever you want with this file.
gokhlayeh@9258 1179 + */
gokhlayeh@9258 1180 +
gokhlayeh@9258 1181 +#include "xz_private.h"
gokhlayeh@9258 1182 +#include "xz_lzma2.h"
gokhlayeh@9258 1183 +
gokhlayeh@9258 1184 +/*
gokhlayeh@9258 1185 + * Range decoder initialization eats the first five bytes of each LZMA chunk.
gokhlayeh@9258 1186 + */
gokhlayeh@9258 1187 +#define RC_INIT_BYTES 5
gokhlayeh@9258 1188 +
gokhlayeh@9258 1189 +/*
gokhlayeh@9258 1190 + * Minimum number of usable input bytes needed to safely decode one LZMA symbol.
gokhlayeh@9258 1191 + * The worst case is that we decode 22 bits using probabilities and 26
gokhlayeh@9258 1192 + * direct bits. This may consume at most 20 bytes of input. However,
gokhlayeh@9258 1193 + * lzma_main() does an extra normalization before returning, thus we
gokhlayeh@9258 1194 + * need to put 21 here.
gokhlayeh@9258 1195 + */
gokhlayeh@9258 1196 +#define LZMA_IN_REQUIRED 21
gokhlayeh@9258 1197 +
gokhlayeh@9258 1198 +/*
gokhlayeh@9258 1199 + * Dictionary (history buffer)
gokhlayeh@9258 1200 + *
gokhlayeh@9258 1201 + * These are always true:
gokhlayeh@9258 1202 + * start <= pos <= full <= end
gokhlayeh@9258 1203 + * pos <= limit <= end
gokhlayeh@9258 1204 + *
gokhlayeh@9258 1205 + * In multi-call mode, these are also true:
gokhlayeh@9258 1206 + * end == size
gokhlayeh@9258 1207 + * size <= size_max
gokhlayeh@9258 1208 + * allocated <= size
gokhlayeh@9258 1209 + *
gokhlayeh@9258 1210 + * Most of these variables are size_t to support single-call mode,
gokhlayeh@9258 1211 + * in which the dictionary variables address the actual output
gokhlayeh@9258 1212 + * buffer directly.
gokhlayeh@9258 1213 + */
gokhlayeh@9258 1214 +struct dictionary {
gokhlayeh@9258 1215 + /* Beginning of the history buffer */
gokhlayeh@9258 1216 + uint8_t *buf;
gokhlayeh@9258 1217 +
gokhlayeh@9258 1218 + /* Old position in buf (before decoding more data) */
gokhlayeh@9258 1219 + size_t start;
gokhlayeh@9258 1220 +
gokhlayeh@9258 1221 + /* Position in buf */
gokhlayeh@9258 1222 + size_t pos;
gokhlayeh@9258 1223 +
gokhlayeh@9258 1224 + /*
gokhlayeh@9258 1225 + * How full the dictionary is. This is used to detect corrupt input that
gokhlayeh@9258 1226 + * would read beyond the beginning of the uncompressed stream.
gokhlayeh@9258 1227 + */
gokhlayeh@9258 1228 + size_t full;
gokhlayeh@9258 1229 +
gokhlayeh@9258 1230 + /* Write limit; we don't write to buf[limit] or later bytes. */
gokhlayeh@9258 1231 + size_t limit;
gokhlayeh@9258 1232 +
gokhlayeh@9258 1233 + /*
gokhlayeh@9258 1234 + * End of the dictionary buffer. In multi-call mode, this is
gokhlayeh@9258 1235 + * the same as the dictionary size. In single-call mode, this
gokhlayeh@9258 1236 + * indicates the size of the output buffer.
gokhlayeh@9258 1237 + */
gokhlayeh@9258 1238 + size_t end;
gokhlayeh@9258 1239 +
gokhlayeh@9258 1240 + /*
gokhlayeh@9258 1241 + * Size of the dictionary as specified in Block Header. This is used
gokhlayeh@9258 1242 + * together with "full" to detect corrupt input that would make us
gokhlayeh@9258 1243 + * read beyond the beginning of the uncompressed stream.
gokhlayeh@9258 1244 + */
gokhlayeh@9258 1245 + uint32_t size;
gokhlayeh@9258 1246 +
gokhlayeh@9258 1247 + /*
gokhlayeh@9258 1248 + * Maximum allowed dictionary size in multi-call mode.
gokhlayeh@9258 1249 + * This is ignored in single-call mode.
gokhlayeh@9258 1250 + */
gokhlayeh@9258 1251 + uint32_t size_max;
gokhlayeh@9258 1252 +
gokhlayeh@9258 1253 + /*
gokhlayeh@9258 1254 + * Amount of memory currently allocated for the dictionary.
gokhlayeh@9258 1255 + * This is used only with XZ_DYNALLOC. (With XZ_PREALLOC,
gokhlayeh@9258 1256 + * size_max is always the same as the allocated size.)
gokhlayeh@9258 1257 + */
gokhlayeh@9258 1258 + uint32_t allocated;
gokhlayeh@9258 1259 +
gokhlayeh@9258 1260 + /* Operation mode */
gokhlayeh@9258 1261 + enum xz_mode mode;
gokhlayeh@9258 1262 +};
gokhlayeh@9258 1263 +
gokhlayeh@9258 1264 +/* Range decoder */
gokhlayeh@9258 1265 +struct rc_dec {
gokhlayeh@9258 1266 + uint32_t range;
gokhlayeh@9258 1267 + uint32_t code;
gokhlayeh@9258 1268 +
gokhlayeh@9258 1269 + /*
gokhlayeh@9258 1270 + * Number of initializing bytes remaining to be read
gokhlayeh@9258 1271 + * by rc_read_init().
gokhlayeh@9258 1272 + */
gokhlayeh@9258 1273 + uint32_t init_bytes_left;
gokhlayeh@9258 1274 +
gokhlayeh@9258 1275 + /*
gokhlayeh@9258 1276 + * Buffer from which we read our input. It can be either
gokhlayeh@9258 1277 + * temp.buf or the caller-provided input buffer.
gokhlayeh@9258 1278 + */
gokhlayeh@9258 1279 + const uint8_t *in;
gokhlayeh@9258 1280 + size_t in_pos;
gokhlayeh@9258 1281 + size_t in_limit;
gokhlayeh@9258 1282 +};
gokhlayeh@9258 1283 +
gokhlayeh@9258 1284 +/* Probabilities for a length decoder. */
gokhlayeh@9258 1285 +struct lzma_len_dec {
gokhlayeh@9258 1286 + /* Probability of match length being at least 10 */
gokhlayeh@9258 1287 + uint16_t choice;
gokhlayeh@9258 1288 +
gokhlayeh@9258 1289 + /* Probability of match length being at least 18 */
gokhlayeh@9258 1290 + uint16_t choice2;
gokhlayeh@9258 1291 +
gokhlayeh@9258 1292 + /* Probabilities for match lengths 2-9 */
gokhlayeh@9258 1293 + uint16_t low[POS_STATES_MAX][LEN_LOW_SYMBOLS];
gokhlayeh@9258 1294 +
gokhlayeh@9258 1295 + /* Probabilities for match lengths 10-17 */
gokhlayeh@9258 1296 + uint16_t mid[POS_STATES_MAX][LEN_MID_SYMBOLS];
gokhlayeh@9258 1297 +
gokhlayeh@9258 1298 + /* Probabilities for match lengths 18-273 */
gokhlayeh@9258 1299 + uint16_t high[LEN_HIGH_SYMBOLS];
gokhlayeh@9258 1300 +};
gokhlayeh@9258 1301 +
gokhlayeh@9258 1302 +struct lzma_dec {
gokhlayeh@9258 1303 + /* Distances of latest four matches */
gokhlayeh@9258 1304 + uint32_t rep0;
gokhlayeh@9258 1305 + uint32_t rep1;
gokhlayeh@9258 1306 + uint32_t rep2;
gokhlayeh@9258 1307 + uint32_t rep3;
gokhlayeh@9258 1308 +
gokhlayeh@9258 1309 + /* Types of the most recently seen LZMA symbols */
gokhlayeh@9258 1310 + enum lzma_state state;
gokhlayeh@9258 1311 +
gokhlayeh@9258 1312 + /*
gokhlayeh@9258 1313 + * Length of a match. This is updated so that dict_repeat can
gokhlayeh@9258 1314 + * be called again to finish repeating the whole match.
gokhlayeh@9258 1315 + */
gokhlayeh@9258 1316 + uint32_t len;
gokhlayeh@9258 1317 +
gokhlayeh@9258 1318 + /*
gokhlayeh@9258 1319 + * LZMA properties or related bit masks (number of literal
gokhlayeh@9258 1320 + * context bits, a mask derived from the number of literal
gokhlayeh@9258 1321 + * position bits, and a mask derived from the number of
gokhlayeh@9258 1322 + * position bits)
gokhlayeh@9258 1323 + */
gokhlayeh@9258 1324 + uint32_t lc;
gokhlayeh@9258 1325 + uint32_t literal_pos_mask; /* (1 << lp) - 1 */
gokhlayeh@9258 1326 + uint32_t pos_mask; /* (1 << pb) - 1 */
gokhlayeh@9258 1327 +
gokhlayeh@9258 1328 + /* If 1, it's a match. Otherwise it's a single 8-bit literal. */
gokhlayeh@9258 1329 + uint16_t is_match[STATES][POS_STATES_MAX];
gokhlayeh@9258 1330 +
gokhlayeh@9258 1331 + /* If 1, it's a repeated match. The distance is one of rep0 .. rep3. */
gokhlayeh@9258 1332 + uint16_t is_rep[STATES];
gokhlayeh@9258 1333 +
gokhlayeh@9258 1334 + /*
gokhlayeh@9258 1335 + * If 0, distance of a repeated match is rep0.
gokhlayeh@9258 1336 + * Otherwise check is_rep1.
gokhlayeh@9258 1337 + */
gokhlayeh@9258 1338 + uint16_t is_rep0[STATES];
gokhlayeh@9258 1339 +
gokhlayeh@9258 1340 + /*
gokhlayeh@9258 1341 + * If 0, distance of a repeated match is rep1.
gokhlayeh@9258 1342 + * Otherwise check is_rep2.
gokhlayeh@9258 1343 + */
gokhlayeh@9258 1344 + uint16_t is_rep1[STATES];
gokhlayeh@9258 1345 +
gokhlayeh@9258 1346 + /* If 0, distance of a repeated match is rep2. Otherwise it is rep3. */
gokhlayeh@9258 1347 + uint16_t is_rep2[STATES];
gokhlayeh@9258 1348 +
gokhlayeh@9258 1349 + /*
gokhlayeh@9258 1350 + * If 0, the repeated match has a length of one byte. Otherwise
gokhlayeh@9258 1351 + * the length is decoded from rep_len_dec.
gokhlayeh@9258 1352 + */
gokhlayeh@9258 1353 + uint16_t is_rep0_long[STATES][POS_STATES_MAX];
gokhlayeh@9258 1354 +
gokhlayeh@9258 1355 + /*
gokhlayeh@9258 1356 + * Probability tree for the highest two bits of the match
gokhlayeh@9258 1357 + * distance. There is a separate probability tree for match
gokhlayeh@9258 1358 + * lengths of 2 (i.e. MATCH_LEN_MIN), 3, 4, and [5, 273].
gokhlayeh@9258 1359 + */
gokhlayeh@9258 1360 + uint16_t dist_slot[DIST_STATES][DIST_SLOTS];
gokhlayeh@9258 1361 +
gokhlayeh@9258 1362 + /*
gokhlayeh@9258 1363 + * Probability trees for additional bits for match distance
gokhlayeh@9258 1364 + * when the distance is in the range [4, 127].
gokhlayeh@9258 1365 + */
gokhlayeh@9258 1366 + uint16_t dist_special[FULL_DISTANCES - DIST_MODEL_END];
gokhlayeh@9258 1367 +
gokhlayeh@9258 1368 + /*
gokhlayeh@9258 1369 + * Probability tree for the lowest four bits of a match
gokhlayeh@9258 1370 + * distance that is equal to or greater than 128.
gokhlayeh@9258 1371 + */
gokhlayeh@9258 1372 + uint16_t dist_align[ALIGN_SIZE];
gokhlayeh@9258 1373 +
gokhlayeh@9258 1374 + /* Length of a normal match */
gokhlayeh@9258 1375 + struct lzma_len_dec match_len_dec;
gokhlayeh@9258 1376 +
gokhlayeh@9258 1377 + /* Length of a repeated match */
gokhlayeh@9258 1378 + struct lzma_len_dec rep_len_dec;
gokhlayeh@9258 1379 +
gokhlayeh@9258 1380 + /* Probabilities of literals */
gokhlayeh@9258 1381 + uint16_t literal[LITERAL_CODERS_MAX][LITERAL_CODER_SIZE];
gokhlayeh@9258 1382 +};
gokhlayeh@9258 1383 +
gokhlayeh@9258 1384 +struct lzma2_dec {
gokhlayeh@9258 1385 + /* Position in xz_dec_lzma2_run(). */
gokhlayeh@9258 1386 + enum lzma2_seq {
gokhlayeh@9258 1387 + SEQ_CONTROL,
gokhlayeh@9258 1388 + SEQ_UNCOMPRESSED_1,
gokhlayeh@9258 1389 + SEQ_UNCOMPRESSED_2,
gokhlayeh@9258 1390 + SEQ_COMPRESSED_0,
gokhlayeh@9258 1391 + SEQ_COMPRESSED_1,
gokhlayeh@9258 1392 + SEQ_PROPERTIES,
gokhlayeh@9258 1393 + SEQ_LZMA_PREPARE,
gokhlayeh@9258 1394 + SEQ_LZMA_RUN,
gokhlayeh@9258 1395 + SEQ_COPY
gokhlayeh@9258 1396 + } sequence;
gokhlayeh@9258 1397 +
gokhlayeh@9258 1398 + /* Next position after decoding the compressed size of the chunk. */
gokhlayeh@9258 1399 + enum lzma2_seq next_sequence;
gokhlayeh@9258 1400 +
gokhlayeh@9258 1401 + /* Uncompressed size of LZMA chunk (2 MiB at maximum) */
gokhlayeh@9258 1402 + uint32_t uncompressed;
gokhlayeh@9258 1403 +
gokhlayeh@9258 1404 + /*
gokhlayeh@9258 1405 + * Compressed size of LZMA chunk or compressed/uncompressed
gokhlayeh@9258 1406 + * size of uncompressed chunk (64 KiB at maximum)
gokhlayeh@9258 1407 + */
gokhlayeh@9258 1408 + uint32_t compressed;
gokhlayeh@9258 1409 +
gokhlayeh@9258 1410 + /*
gokhlayeh@9258 1411 + * True if dictionary reset is needed. This is false before
gokhlayeh@9258 1412 + * the first chunk (LZMA or uncompressed).
gokhlayeh@9258 1413 + */
gokhlayeh@9258 1414 + bool need_dict_reset;
gokhlayeh@9258 1415 +
gokhlayeh@9258 1416 + /*
gokhlayeh@9258 1417 + * True if new LZMA properties are needed. This is false
gokhlayeh@9258 1418 + * before the first LZMA chunk.
gokhlayeh@9258 1419 + */
gokhlayeh@9258 1420 + bool need_props;
gokhlayeh@9258 1421 +};
gokhlayeh@9258 1422 +
gokhlayeh@9258 1423 +struct xz_dec_lzma2 {
gokhlayeh@9258 1424 + /*
gokhlayeh@9258 1425 + * The order below is important on x86 to reduce code size and
gokhlayeh@9258 1426 + * it shouldn't hurt on other platforms. Everything up to and
gokhlayeh@9258 1427 + * including lzma.pos_mask is in the first 128 bytes on x86-32,
gokhlayeh@9258 1428 + * which allows using smaller instructions to access those
gokhlayeh@9258 1429 + * variables. On x86-64, fewer variables fit into the first 128
gokhlayeh@9258 1430 + * bytes, but this is still the best order without sacrificing
gokhlayeh@9258 1431 + * the readability by splitting the structures.
gokhlayeh@9258 1432 + */
gokhlayeh@9258 1433 + struct rc_dec rc;
gokhlayeh@9258 1434 + struct dictionary dict;
gokhlayeh@9258 1435 + struct lzma2_dec lzma2;
gokhlayeh@9258 1436 + struct lzma_dec lzma;
gokhlayeh@9258 1437 +
gokhlayeh@9258 1438 + /*
gokhlayeh@9258 1439 + * Temporary buffer which holds a small number of input bytes between
gokhlayeh@9258 1440 + * decoder calls. See lzma2_lzma() for details.
gokhlayeh@9258 1441 + */
gokhlayeh@9258 1442 + struct {
gokhlayeh@9258 1443 + uint32_t size;
gokhlayeh@9258 1444 + uint8_t buf[3 * LZMA_IN_REQUIRED];
gokhlayeh@9258 1445 + } temp;
gokhlayeh@9258 1446 +};
gokhlayeh@9258 1447 +
gokhlayeh@9258 1448 +/**************
gokhlayeh@9258 1449 + * Dictionary *
gokhlayeh@9258 1450 + **************/
gokhlayeh@9258 1451 +
gokhlayeh@9258 1452 +/*
gokhlayeh@9258 1453 + * Reset the dictionary state. When in single-call mode, set up the beginning
gokhlayeh@9258 1454 + * of the dictionary to point to the actual output buffer.
gokhlayeh@9258 1455 + */
gokhlayeh@9258 1456 +static void dict_reset(struct dictionary *dict, struct xz_buf *b)
gokhlayeh@9258 1457 +{
gokhlayeh@9258 1458 + if (DEC_IS_SINGLE(dict->mode)) {
gokhlayeh@9258 1459 + dict->buf = b->out + b->out_pos;
gokhlayeh@9258 1460 + dict->end = b->out_size - b->out_pos;
gokhlayeh@9258 1461 + }
gokhlayeh@9258 1462 +
gokhlayeh@9258 1463 + dict->start = 0;
gokhlayeh@9258 1464 + dict->pos = 0;
gokhlayeh@9258 1465 + dict->limit = 0;
gokhlayeh@9258 1466 + dict->full = 0;
gokhlayeh@9258 1467 +}
gokhlayeh@9258 1468 +
gokhlayeh@9258 1469 +/* Set dictionary write limit */
gokhlayeh@9258 1470 +static void dict_limit(struct dictionary *dict, size_t out_max)
gokhlayeh@9258 1471 +{
gokhlayeh@9258 1472 + if (dict->end - dict->pos <= out_max)
gokhlayeh@9258 1473 + dict->limit = dict->end;
gokhlayeh@9258 1474 + else
gokhlayeh@9258 1475 + dict->limit = dict->pos + out_max;
gokhlayeh@9258 1476 +}
gokhlayeh@9258 1477 +
gokhlayeh@9258 1478 +/* Return true if at least one byte can be written into the dictionary. */
gokhlayeh@9258 1479 +static inline bool dict_has_space(const struct dictionary *dict)
gokhlayeh@9258 1480 +{
gokhlayeh@9258 1481 + return dict->pos < dict->limit;
gokhlayeh@9258 1482 +}
gokhlayeh@9258 1483 +
gokhlayeh@9258 1484 +/*
gokhlayeh@9258 1485 + * Get a byte from the dictionary at the given distance. The distance is
gokhlayeh@9258 1486 + * assumed to be valid, or as a special case, zero when the dictionary is
gokhlayeh@9258 1487 + * still empty. This special case is needed for single-call decoding to
gokhlayeh@9258 1488 + * avoid writing a '\0' to the end of the destination buffer.
gokhlayeh@9258 1489 + */
gokhlayeh@9258 1490 +static inline uint32_t dict_get(const struct dictionary *dict, uint32_t dist)
gokhlayeh@9258 1491 +{
gokhlayeh@9258 1492 + size_t offset = dict->pos - dist - 1;
gokhlayeh@9258 1493 +
gokhlayeh@9258 1494 + if (dist >= dict->pos)
gokhlayeh@9258 1495 + offset += dict->end;
gokhlayeh@9258 1496 +
gokhlayeh@9258 1497 + return dict->full > 0 ? dict->buf[offset] : 0;
gokhlayeh@9258 1498 +}
gokhlayeh@9258 1499 +
gokhlayeh@9258 1500 +/*
gokhlayeh@9258 1501 + * Put one byte into the dictionary. It is assumed that there is space for it.
gokhlayeh@9258 1502 + */
gokhlayeh@9258 1503 +static inline void dict_put(struct dictionary *dict, uint8_t byte)
gokhlayeh@9258 1504 +{
gokhlayeh@9258 1505 + dict->buf[dict->pos++] = byte;
gokhlayeh@9258 1506 +
gokhlayeh@9258 1507 + if (dict->full < dict->pos)
gokhlayeh@9258 1508 + dict->full = dict->pos;
gokhlayeh@9258 1509 +}
gokhlayeh@9258 1510 +
gokhlayeh@9258 1511 +/*
gokhlayeh@9258 1512 + * Repeat given number of bytes from the given distance. If the distance is
gokhlayeh@9258 1513 + * invalid, false is returned. On success, true is returned and *len is
gokhlayeh@9258 1514 + * updated to indicate how many bytes were left to be repeated.
gokhlayeh@9258 1515 + */
gokhlayeh@9258 1516 +static bool dict_repeat(struct dictionary *dict, uint32_t *len, uint32_t dist)
gokhlayeh@9258 1517 +{
gokhlayeh@9258 1518 + size_t back;
gokhlayeh@9258 1519 + uint32_t left;
gokhlayeh@9258 1520 +
gokhlayeh@9258 1521 + if (dist >= dict->full || dist >= dict->size)
gokhlayeh@9258 1522 + return false;
gokhlayeh@9258 1523 +
gokhlayeh@9258 1524 + left = min_t(size_t, dict->limit - dict->pos, *len);
gokhlayeh@9258 1525 + *len -= left;
gokhlayeh@9258 1526 +
gokhlayeh@9258 1527 + back = dict->pos - dist - 1;
gokhlayeh@9258 1528 + if (dist >= dict->pos)
gokhlayeh@9258 1529 + back += dict->end;
gokhlayeh@9258 1530 +
gokhlayeh@9258 1531 + do {
gokhlayeh@9258 1532 + dict->buf[dict->pos++] = dict->buf[back++];
gokhlayeh@9258 1533 + if (back == dict->end)
gokhlayeh@9258 1534 + back = 0;
gokhlayeh@9258 1535 + } while (--left > 0);
gokhlayeh@9258 1536 +
gokhlayeh@9258 1537 + if (dict->full < dict->pos)
gokhlayeh@9258 1538 + dict->full = dict->pos;
gokhlayeh@9258 1539 +
gokhlayeh@9258 1540 + return true;
gokhlayeh@9258 1541 +}
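The wrap-around arithmetic in dict_get() and dict_repeat() can be tried in isolation. The following is a minimal sketch, not part of the patch: `struct ring`, `ring_repeat()` and `ring_repeat_example()` are hypothetical names, the structure is a stripped-down stand-in for `struct dictionary`, and the caller is assumed to guarantee that the write position never runs past the end of the buffer (the job dict_limit() does in the real code).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct dictionary. */
struct ring {
	uint8_t *buf;	/* backing storage */
	size_t end;	/* size of buf; the read position wraps here */
	size_t pos;	/* next write position */
	size_t full;	/* number of valid bytes written so far */
};

/*
 * Copy len bytes starting dist + 1 bytes behind pos. Returns 0 if the
 * distance points outside the valid history and 1 on success. The caller
 * must guarantee that pos + len <= end.
 */
int ring_repeat(struct ring *r, uint32_t len, uint32_t dist)
{
	size_t back;

	if (dist >= r->full)
		return 0;

	/* This may wrap around as an unsigned value... */
	back = r->pos - dist - 1;

	/* ...in which case adding the buffer size brings it back in range. */
	if (dist >= r->pos)
		back += r->end;

	while (len-- > 0) {
		r->buf[r->pos++] = r->buf[back++];
		if (back == r->end)
			back = 0;
	}

	if (r->full < r->pos)
		r->full = r->pos;

	return 1;
}

/* "ab" followed by a 4-byte match at distance 1 expands to "ababab". */
void ring_repeat_example(void)
{
	uint8_t storage[8] = { 'a', 'b' };
	struct ring r = { storage, sizeof(storage), 2, 2 };

	assert(ring_repeat(&r, 4, 1));
	assert(memcmp(storage, "ababab", 6) == 0);
	assert(r.pos == 6);
}
```

Note that the match overlaps its own output: the last two bytes are copied from bytes the same call has just written, which is why the copy proceeds one byte at a time instead of using memcpy().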
gokhlayeh@9258 1542 +
gokhlayeh@9258 1543 +/* Copy uncompressed data as is from input to dictionary and output buffers. */
gokhlayeh@9258 1544 +static void dict_uncompressed(struct dictionary *dict, struct xz_buf *b,
gokhlayeh@9258 1545 + uint32_t *left)
gokhlayeh@9258 1546 +{
gokhlayeh@9258 1547 + size_t copy_size;
gokhlayeh@9258 1548 +
gokhlayeh@9258 1549 + while (*left > 0 && b->in_pos < b->in_size
gokhlayeh@9258 1550 + && b->out_pos < b->out_size) {
gokhlayeh@9258 1551 + copy_size = min(b->in_size - b->in_pos,
gokhlayeh@9258 1552 + b->out_size - b->out_pos);
gokhlayeh@9258 1553 + if (copy_size > dict->end - dict->pos)
gokhlayeh@9258 1554 + copy_size = dict->end - dict->pos;
gokhlayeh@9258 1555 + if (copy_size > *left)
gokhlayeh@9258 1556 + copy_size = *left;
gokhlayeh@9258 1557 +
gokhlayeh@9258 1558 + *left -= copy_size;
gokhlayeh@9258 1559 +
gokhlayeh@9258 1560 + memcpy(dict->buf + dict->pos, b->in + b->in_pos, copy_size);
gokhlayeh@9258 1561 + dict->pos += copy_size;
gokhlayeh@9258 1562 +
gokhlayeh@9258 1563 + if (dict->full < dict->pos)
gokhlayeh@9258 1564 + dict->full = dict->pos;
gokhlayeh@9258 1565 +
gokhlayeh@9258 1566 + if (DEC_IS_MULTI(dict->mode)) {
gokhlayeh@9258 1567 + if (dict->pos == dict->end)
gokhlayeh@9258 1568 + dict->pos = 0;
gokhlayeh@9258 1569 +
gokhlayeh@9258 1570 + memcpy(b->out + b->out_pos, b->in + b->in_pos,
gokhlayeh@9258 1571 + copy_size);
gokhlayeh@9258 1572 + }
gokhlayeh@9258 1573 +
gokhlayeh@9258 1574 + dict->start = dict->pos;
gokhlayeh@9258 1575 +
gokhlayeh@9258 1576 + b->out_pos += copy_size;
gokhlayeh@9258 1577 + b->in_pos += copy_size;
gokhlayeh@9258 1578 + }
gokhlayeh@9258 1579 +}
gokhlayeh@9258 1580 +
gokhlayeh@9258 1581 +/*
gokhlayeh@9258 1582 + * Flush pending data from dictionary to b->out. It is assumed that there is
gokhlayeh@9258 1583 + * enough space in b->out. This is guaranteed because caller uses dict_limit()
gokhlayeh@9258 1584 + * before decoding data into the dictionary.
gokhlayeh@9258 1585 + */
gokhlayeh@9258 1586 +static uint32_t dict_flush(struct dictionary *dict, struct xz_buf *b)
gokhlayeh@9258 1587 +{
gokhlayeh@9258 1588 + size_t copy_size = dict->pos - dict->start;
gokhlayeh@9258 1589 +
gokhlayeh@9258 1590 + if (DEC_IS_MULTI(dict->mode)) {
gokhlayeh@9258 1591 + if (dict->pos == dict->end)
gokhlayeh@9258 1592 + dict->pos = 0;
gokhlayeh@9258 1593 +
gokhlayeh@9258 1594 + memcpy(b->out + b->out_pos, dict->buf + dict->start,
gokhlayeh@9258 1595 + copy_size);
gokhlayeh@9258 1596 + }
gokhlayeh@9258 1597 +
gokhlayeh@9258 1598 + dict->start = dict->pos;
gokhlayeh@9258 1599 + b->out_pos += copy_size;
gokhlayeh@9258 1600 + return copy_size;
gokhlayeh@9258 1601 +}
gokhlayeh@9258 1602 +
gokhlayeh@9258 1603 +/*****************
gokhlayeh@9258 1604 + * Range decoder *
gokhlayeh@9258 1605 + *****************/
gokhlayeh@9258 1606 +
gokhlayeh@9258 1607 +/* Reset the range decoder. */
gokhlayeh@9258 1608 +static void rc_reset(struct rc_dec *rc)
gokhlayeh@9258 1609 +{
gokhlayeh@9258 1610 + rc->range = (uint32_t)-1;
gokhlayeh@9258 1611 + rc->code = 0;
gokhlayeh@9258 1612 + rc->init_bytes_left = RC_INIT_BYTES;
gokhlayeh@9258 1613 +}
gokhlayeh@9258 1614 +
gokhlayeh@9258 1615 +/*
gokhlayeh@9258 1616 + * Read the first five initial bytes into rc->code if they haven't been
gokhlayeh@9258 1617 + * read already. (Yes, the first byte gets completely ignored.)
gokhlayeh@9258 1618 + */
gokhlayeh@9258 1619 +static bool rc_read_init(struct rc_dec *rc, struct xz_buf *b)
gokhlayeh@9258 1620 +{
gokhlayeh@9258 1621 + while (rc->init_bytes_left > 0) {
gokhlayeh@9258 1622 + if (b->in_pos == b->in_size)
gokhlayeh@9258 1623 + return false;
gokhlayeh@9258 1624 +
gokhlayeh@9258 1625 + rc->code = (rc->code << 8) + b->in[b->in_pos++];
gokhlayeh@9258 1626 + --rc->init_bytes_left;
gokhlayeh@9258 1627 + }
gokhlayeh@9258 1628 +
gokhlayeh@9258 1629 + return true;
gokhlayeh@9258 1630 +}
gokhlayeh@9258 1631 +
gokhlayeh@9258 1632 +/* Return true if there may not be enough input for the next decoding loop. */
gokhlayeh@9258 1633 +static inline bool rc_limit_exceeded(const struct rc_dec *rc)
gokhlayeh@9258 1634 +{
gokhlayeh@9258 1635 + return rc->in_pos > rc->in_limit;
gokhlayeh@9258 1636 +}
gokhlayeh@9258 1637 +
gokhlayeh@9258 1638 +/*
gokhlayeh@9258 1639 + * Return true if it is possible (from point of view of range decoder) that
gokhlayeh@9258 1640 + * we have reached the end of the LZMA chunk.
gokhlayeh@9258 1641 + */
gokhlayeh@9258 1642 +static inline bool rc_is_finished(const struct rc_dec *rc)
gokhlayeh@9258 1643 +{
gokhlayeh@9258 1644 + return rc->code == 0;
gokhlayeh@9258 1645 +}
gokhlayeh@9258 1646 +
gokhlayeh@9258 1647 +/* Read the next input byte if needed. */
gokhlayeh@9258 1648 +static __always_inline void rc_normalize(struct rc_dec *rc)
gokhlayeh@9258 1649 +{
gokhlayeh@9258 1650 + if (rc->range < RC_TOP_VALUE) {
gokhlayeh@9258 1651 + rc->range <<= RC_SHIFT_BITS;
gokhlayeh@9258 1652 + rc->code = (rc->code << RC_SHIFT_BITS) + rc->in[rc->in_pos++];
gokhlayeh@9258 1653 + }
gokhlayeh@9258 1654 +}
gokhlayeh@9258 1655 +
gokhlayeh@9258 1656 +/*
gokhlayeh@9258 1657 + * Decode one bit. In some versions, this function has been split in three
gokhlayeh@9258 1658 + * functions so that the compiler is supposed to be able to more easily avoid
gokhlayeh@9258 1659 + * an extra branch. In this particular version of the LZMA decoder, this
gokhlayeh@9258 1660 + * doesn't seem to be a good idea (tested with GCC 3.3.6, 3.4.6, and 4.3.3
gokhlayeh@9258 1661 + * on x86). Using a non-split version results in nicer looking code too.
gokhlayeh@9258 1662 + *
gokhlayeh@9258 1663 + * NOTE: This must return an int. Do not make it return a bool or the speed
gokhlayeh@9258 1664 + * of the code generated by GCC 3.x decreases 10-15 %. (GCC 4.3 doesn't care,
gokhlayeh@9258 1665 + * and it generates 10-20 % faster code than GCC 3.x from this file anyway.)
gokhlayeh@9258 1666 + */
gokhlayeh@9258 1667 +static __always_inline int rc_bit(struct rc_dec *rc, uint16_t *prob)
gokhlayeh@9258 1668 +{
gokhlayeh@9258 1669 + uint32_t bound;
gokhlayeh@9258 1670 + int bit;
gokhlayeh@9258 1671 +
gokhlayeh@9258 1672 + rc_normalize(rc);
gokhlayeh@9258 1673 + bound = (rc->range >> RC_BIT_MODEL_TOTAL_BITS) * *prob;
gokhlayeh@9258 1674 + if (rc->code < bound) {
gokhlayeh@9258 1675 + rc->range = bound;
gokhlayeh@9258 1676 + *prob += (RC_BIT_MODEL_TOTAL - *prob) >> RC_MOVE_BITS;
gokhlayeh@9258 1677 + bit = 0;
gokhlayeh@9258 1678 + } else {
gokhlayeh@9258 1679 + rc->range -= bound;
gokhlayeh@9258 1680 + rc->code -= bound;
gokhlayeh@9258 1681 + *prob -= *prob >> RC_MOVE_BITS;
gokhlayeh@9258 1682 + bit = 1;
gokhlayeh@9258 1683 + }
gokhlayeh@9258 1684 +
gokhlayeh@9258 1685 + return bit;
gokhlayeh@9258 1686 +}
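To see how one call to rc_bit() moves the range, the code value, and the adaptive probability, the function can be lifted into a standalone program. This is a sketch under assumptions: the constants below are the values I believe XZ Embedded uses for RC_TOP_VALUE, RC_SHIFT_BITS, RC_BIT_MODEL_TOTAL_BITS and RC_MOVE_BITS, but their definitions are not part of this excerpt, and `struct mini_rc`, `mini_rc_bit()` and `mini_rc_bit_example()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; the real definitions live outside this excerpt. */
#define TOP_VALUE		(1U << 24)
#define SHIFT_BITS		8
#define MODEL_TOTAL_BITS	11
#define MODEL_TOTAL		(1U << MODEL_TOTAL_BITS)
#define MOVE_BITS		5

/* Hypothetical, reduced version of struct rc_dec. */
struct mini_rc {
	uint32_t range;
	uint32_t code;
	const uint8_t *in;
	uint32_t in_pos;
};

/* Same logic as rc_normalize() followed by rc_bit(). */
int mini_rc_bit(struct mini_rc *rc, uint16_t *prob)
{
	uint32_t bound;

	/* Pull in one more input byte when the range has become small. */
	if (rc->range < TOP_VALUE) {
		rc->range <<= SHIFT_BITS;
		rc->code = (rc->code << SHIFT_BITS) + rc->in[rc->in_pos++];
	}

	/* Split the range in proportion to the probability of a 0 bit. */
	bound = (rc->range >> MODEL_TOTAL_BITS) * *prob;
	if (rc->code < bound) {
		rc->range = bound;
		*prob += (MODEL_TOTAL - *prob) >> MOVE_BITS;
		return 0;
	}

	rc->range -= bound;
	rc->code -= bound;
	*prob -= *prob >> MOVE_BITS;
	return 1;
}

void mini_rc_bit_example(void)
{
	static const uint8_t unused_input[1] = { 0 };
	struct mini_rc lo = { 0xFFFFFFFFU, 0, unused_input, 0 };
	struct mini_rc hi = { 0xFFFFFFFFU, 0x80000000U, unused_input, 0 };
	uint16_t prob_lo = MODEL_TOTAL / 2;
	uint16_t prob_hi = MODEL_TOTAL / 2;

	/* code below the bound: bit 0, probability of 0 goes up */
	assert(mini_rc_bit(&lo, &prob_lo) == 0);
	assert(prob_lo == 1056);

	/* code at or above the bound: bit 1, probability of 0 goes down */
	assert(mini_rc_bit(&hi, &prob_hi) == 1);
	assert(prob_hi == 992);
}
```

With a fresh probability of 1024 out of 2048, each decoded bit shifts the estimate by 1/32 of the remaining distance, so a run of identical bits quickly makes that bit cheap to encode.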
gokhlayeh@9258 1687 +
gokhlayeh@9258 1688 +/* Decode a bittree starting from the most significant bit. */
gokhlayeh@9258 1689 +static __always_inline uint32_t rc_bittree(struct rc_dec *rc,
gokhlayeh@9258 1690 + uint16_t *probs, uint32_t limit)
gokhlayeh@9258 1691 +{
gokhlayeh@9258 1692 + uint32_t symbol = 1;
gokhlayeh@9258 1693 +
gokhlayeh@9258 1694 + do {
gokhlayeh@9258 1695 + if (rc_bit(rc, &probs[symbol]))
gokhlayeh@9258 1696 + symbol = (symbol << 1) + 1;
gokhlayeh@9258 1697 + else
gokhlayeh@9258 1698 + symbol <<= 1;
gokhlayeh@9258 1699 + } while (symbol < limit);
gokhlayeh@9258 1700 +
gokhlayeh@9258 1701 + return symbol;
gokhlayeh@9258 1702 +}
gokhlayeh@9258 1703 +
gokhlayeh@9258 1704 +/* Decode a bittree starting from the least significant bit. */
gokhlayeh@9258 1705 +static __always_inline void rc_bittree_reverse(struct rc_dec *rc,
gokhlayeh@9258 1706 + uint16_t *probs,
gokhlayeh@9258 1707 + uint32_t *dest, uint32_t limit)
gokhlayeh@9258 1708 +{
gokhlayeh@9258 1709 + uint32_t symbol = 1;
gokhlayeh@9258 1710 + uint32_t i = 0;
gokhlayeh@9258 1711 +
gokhlayeh@9258 1712 + do {
gokhlayeh@9258 1713 + if (rc_bit(rc, &probs[symbol])) {
gokhlayeh@9258 1714 + symbol = (symbol << 1) + 1;
gokhlayeh@9258 1715 + *dest += 1 << i;
gokhlayeh@9258 1716 + } else {
gokhlayeh@9258 1717 + symbol <<= 1;
gokhlayeh@9258 1718 + }
gokhlayeh@9258 1719 + } while (++i < limit);
gokhlayeh@9258 1720 +}
gokhlayeh@9258 1721 +
gokhlayeh@9258 1722 +/* Decode direct bits (fixed fifty-fifty probability) */
gokhlayeh@9258 1723 +static inline void rc_direct(struct rc_dec *rc, uint32_t *dest, uint32_t limit)
gokhlayeh@9258 1724 +{
gokhlayeh@9258 1725 + uint32_t mask;
gokhlayeh@9258 1726 +
gokhlayeh@9258 1727 + do {
gokhlayeh@9258 1728 + rc_normalize(rc);
gokhlayeh@9258 1729 + rc->range >>= 1;
gokhlayeh@9258 1730 + rc->code -= rc->range;
gokhlayeh@9258 1731 + mask = (uint32_t)0 - (rc->code >> 31);
gokhlayeh@9258 1732 + rc->code += rc->range & mask;
gokhlayeh@9258 1733 + *dest = (*dest << 1) + (mask + 1);
gokhlayeh@9258 1734 + } while (--limit > 0);
gokhlayeh@9258 1735 +}
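The loop body of rc_direct() avoids a conditional branch by turning the sign of `code - range` into an all-ones or all-zeros mask. The sketch below isolates one iteration so the trick can be checked by hand. `direct_bit()` and `direct_bit_example()` are hypothetical names, and normalization is left out on the assumption that the caller has already done it.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Decode one fixed-probability bit. Requires *code < *range on entry,
 * which the range decoder maintains as an invariant.
 */
uint32_t direct_bit(uint32_t *range, uint32_t *code)
{
	uint32_t mask;

	*range >>= 1;
	*code -= *range;

	/* mask is 0xFFFFFFFF if the subtraction went negative, else 0. */
	mask = (uint32_t)0 - (*code >> 31);

	/* Undo the subtraction only in the negative case. */
	*code += *range & mask;

	/* 0xFFFFFFFF + 1 wraps to 0, and 0 + 1 is 1. */
	return mask + 1;
}

void direct_bit_example(void)
{
	uint32_t range = 0x02000000U;
	uint32_t code = 0x00800000U;	/* lower half of the range */

	assert(direct_bit(&range, &code) == 0);
	assert(range == 0x01000000U);
	assert(code == 0x00800000U);

	range = 0x02000000U;
	code = 0x01800000U;		/* upper half of the range */

	assert(direct_bit(&range, &code) == 1);
	assert(code == 0x00800000U);
}
```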
gokhlayeh@9258 1736 +
gokhlayeh@9258 1737 +/********
gokhlayeh@9258 1738 + * LZMA *
gokhlayeh@9258 1739 + ********/
gokhlayeh@9258 1740 +
gokhlayeh@9258 1741 +/* Get pointer to literal coder probability array. */
gokhlayeh@9258 1742 +static uint16_t *lzma_literal_probs(struct xz_dec_lzma2 *s)
gokhlayeh@9258 1743 +{
gokhlayeh@9258 1744 + uint32_t prev_byte = dict_get(&s->dict, 0);
gokhlayeh@9258 1745 + uint32_t low = prev_byte >> (8 - s->lzma.lc);
gokhlayeh@9258 1746 + uint32_t high = (s->dict.pos & s->lzma.literal_pos_mask) << s->lzma.lc;
gokhlayeh@9258 1747 + return s->lzma.literal[low + high];
gokhlayeh@9258 1748 +}
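The index computed by lzma_literal_probs() combines the top lc bits of the previous byte with the low lp bits of the output position. A small sketch of that arithmetic follows. `literal_coder_index()` is a hypothetical helper that takes lc and lp directly instead of the precomputed mask stored in `struct lzma_dec`.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Select a literal coder from the previous byte and the current output
 * position. prev_byte must be in the range 0..255, and lc + lp <= 4 as
 * enforced by lzma_props().
 */
uint32_t literal_coder_index(uint32_t prev_byte, uint32_t pos,
			     uint32_t lc, uint32_t lp)
{
	uint32_t low = prev_byte >> (8 - lc);
	uint32_t high = (pos & ((1U << lp) - 1)) << lc;

	return low + high;
}

void literal_coder_index_example(void)
{
	/* lc=3, lp=0: only the top three bits of 0xE5 matter. */
	assert(literal_coder_index(0xE5, 1234, 3, 0) == 7);

	/* lc=2, lp=2: two bits from the byte, two from the position. */
	assert(literal_coder_index(0xC0, 5, 2, 2) == 7);
}
```

Because lc + lp is capped at 4, the result is always below 16, which is presumably why LITERAL_CODERS_MAX can be a small compile-time constant.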
gokhlayeh@9258 1749 +
gokhlayeh@9258 1750 +/* Decode a literal (one 8-bit byte) */
gokhlayeh@9258 1751 +static void lzma_literal(struct xz_dec_lzma2 *s)
gokhlayeh@9258 1752 +{
gokhlayeh@9258 1753 + uint16_t *probs;
gokhlayeh@9258 1754 + uint32_t symbol;
gokhlayeh@9258 1755 + uint32_t match_byte;
gokhlayeh@9258 1756 + uint32_t match_bit;
gokhlayeh@9258 1757 + uint32_t offset;
gokhlayeh@9258 1758 + uint32_t i;
gokhlayeh@9258 1759 +
gokhlayeh@9258 1760 + probs = lzma_literal_probs(s);
gokhlayeh@9258 1761 +
gokhlayeh@9258 1762 + if (lzma_state_is_literal(s->lzma.state)) {
gokhlayeh@9258 1763 + symbol = rc_bittree(&s->rc, probs, 0x100);
gokhlayeh@9258 1764 + } else {
gokhlayeh@9258 1765 + symbol = 1;
gokhlayeh@9258 1766 + match_byte = dict_get(&s->dict, s->lzma.rep0) << 1;
gokhlayeh@9258 1767 + offset = 0x100;
gokhlayeh@9258 1768 +
gokhlayeh@9258 1769 + do {
gokhlayeh@9258 1770 + match_bit = match_byte & offset;
gokhlayeh@9258 1771 + match_byte <<= 1;
gokhlayeh@9258 1772 + i = offset + match_bit + symbol;
gokhlayeh@9258 1773 +
gokhlayeh@9258 1774 + if (rc_bit(&s->rc, &probs[i])) {
gokhlayeh@9258 1775 + symbol = (symbol << 1) + 1;
gokhlayeh@9258 1776 + offset &= match_bit;
gokhlayeh@9258 1777 + } else {
gokhlayeh@9258 1778 + symbol <<= 1;
gokhlayeh@9258 1779 + offset &= ~match_bit;
gokhlayeh@9258 1780 + }
gokhlayeh@9258 1781 + } while (symbol < 0x100);
gokhlayeh@9258 1782 + }
gokhlayeh@9258 1783 +
gokhlayeh@9258 1784 + dict_put(&s->dict, (uint8_t)symbol);
gokhlayeh@9258 1785 + lzma_state_literal(&s->lzma.state);
gokhlayeh@9258 1786 +}
gokhlayeh@9258 1787 +
gokhlayeh@9258 1788 +/* Decode the length of the match into s->lzma.len. */
gokhlayeh@9258 1789 +static void lzma_len(struct xz_dec_lzma2 *s, struct lzma_len_dec *l,
gokhlayeh@9258 1790 + uint32_t pos_state)
gokhlayeh@9258 1791 +{
gokhlayeh@9258 1792 + uint16_t *probs;
gokhlayeh@9258 1793 + uint32_t limit;
gokhlayeh@9258 1794 +
gokhlayeh@9258 1795 + if (!rc_bit(&s->rc, &l->choice)) {
gokhlayeh@9258 1796 + probs = l->low[pos_state];
gokhlayeh@9258 1797 + limit = LEN_LOW_SYMBOLS;
gokhlayeh@9258 1798 + s->lzma.len = MATCH_LEN_MIN;
gokhlayeh@9258 1799 + } else {
gokhlayeh@9258 1800 + if (!rc_bit(&s->rc, &l->choice2)) {
gokhlayeh@9258 1801 + probs = l->mid[pos_state];
gokhlayeh@9258 1802 + limit = LEN_MID_SYMBOLS;
gokhlayeh@9258 1803 + s->lzma.len = MATCH_LEN_MIN + LEN_LOW_SYMBOLS;
gokhlayeh@9258 1804 + } else {
gokhlayeh@9258 1805 + probs = l->high;
gokhlayeh@9258 1806 + limit = LEN_HIGH_SYMBOLS;
gokhlayeh@9258 1807 + s->lzma.len = MATCH_LEN_MIN + LEN_LOW_SYMBOLS
gokhlayeh@9258 1808 + + LEN_MID_SYMBOLS;
gokhlayeh@9258 1809 + }
gokhlayeh@9258 1810 + }
gokhlayeh@9258 1811 +
gokhlayeh@9258 1812 + s->lzma.len += rc_bittree(&s->rc, probs, limit) - limit;
gokhlayeh@9258 1813 +}
gokhlayeh@9258 1814 +
gokhlayeh@9258 1815 +/* Decode a match. The distance will be stored in s->lzma.rep0. */
gokhlayeh@9258 1816 +static void lzma_match(struct xz_dec_lzma2 *s, uint32_t pos_state)
gokhlayeh@9258 1817 +{
gokhlayeh@9258 1818 + uint16_t *probs;
gokhlayeh@9258 1819 + uint32_t dist_slot;
gokhlayeh@9258 1820 + uint32_t limit;
gokhlayeh@9258 1821 +
gokhlayeh@9258 1822 + lzma_state_match(&s->lzma.state);
gokhlayeh@9258 1823 +
gokhlayeh@9258 1824 + s->lzma.rep3 = s->lzma.rep2;
gokhlayeh@9258 1825 + s->lzma.rep2 = s->lzma.rep1;
gokhlayeh@9258 1826 + s->lzma.rep1 = s->lzma.rep0;
gokhlayeh@9258 1827 +
gokhlayeh@9258 1828 + lzma_len(s, &s->lzma.match_len_dec, pos_state);
gokhlayeh@9258 1829 +
gokhlayeh@9258 1830 + probs = s->lzma.dist_slot[lzma_get_dist_state(s->lzma.len)];
gokhlayeh@9258 1831 + dist_slot = rc_bittree(&s->rc, probs, DIST_SLOTS) - DIST_SLOTS;
gokhlayeh@9258 1832 +
gokhlayeh@9258 1833 + if (dist_slot < DIST_MODEL_START) {
gokhlayeh@9258 1834 + s->lzma.rep0 = dist_slot;
gokhlayeh@9258 1835 + } else {
gokhlayeh@9258 1836 + limit = (dist_slot >> 1) - 1;
gokhlayeh@9258 1837 + s->lzma.rep0 = 2 + (dist_slot & 1);
gokhlayeh@9258 1838 +
gokhlayeh@9258 1839 + if (dist_slot < DIST_MODEL_END) {
gokhlayeh@9258 1840 + s->lzma.rep0 <<= limit;
gokhlayeh@9258 1841 + probs = s->lzma.dist_special + s->lzma.rep0
gokhlayeh@9258 1842 + - dist_slot - 1;
gokhlayeh@9258 1843 + rc_bittree_reverse(&s->rc, probs,
gokhlayeh@9258 1844 + &s->lzma.rep0, limit);
gokhlayeh@9258 1845 + } else {
gokhlayeh@9258 1846 + rc_direct(&s->rc, &s->lzma.rep0, limit - ALIGN_BITS);
gokhlayeh@9258 1847 + s->lzma.rep0 <<= ALIGN_BITS;
gokhlayeh@9258 1848 + rc_bittree_reverse(&s->rc, s->lzma.dist_align,
gokhlayeh@9258 1849 + &s->lzma.rep0, ALIGN_BITS);
gokhlayeh@9258 1850 + }
gokhlayeh@9258 1851 + }
gokhlayeh@9258 1852 +}
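The mapping from a distance slot to the start of the distance range it covers can be worked out without a range decoder. The sketch below mirrors the arithmetic at the top of lzma_match(). `dist_slot_base()` and `dist_slot_extra_bits()` are hypothetical helpers, and the constant 4 is the value I assume for DIST_MODEL_START, which is defined outside this excerpt.

```c
#include <assert.h>
#include <stdint.h>

#define SLOT_MODEL_START 4	/* assumed value of DIST_MODEL_START */

/* Number of additional bits that follow the slot in the stream. */
uint32_t dist_slot_extra_bits(uint32_t dist_slot)
{
	return dist_slot < SLOT_MODEL_START ? 0 : (dist_slot >> 1) - 1;
}

/* Smallest distance value (as stored in rep0) the slot can produce. */
uint32_t dist_slot_base(uint32_t dist_slot)
{
	if (dist_slot < SLOT_MODEL_START)
		return dist_slot;

	/* The slot supplies the top two bits: binary 10 or 11. */
	return (2 + (dist_slot & 1)) << dist_slot_extra_bits(dist_slot);
}

void dist_slot_example(void)
{
	assert(dist_slot_base(3) == 3);		/* slots 0-3 are literal */
	assert(dist_slot_base(4) == 4);		/* 10 + 1 extra bit */
	assert(dist_slot_base(5) == 6);		/* 11 + 1 extra bit */
	assert(dist_slot_base(13) == 96);	/* last slot below 128 */
	assert(dist_slot_base(14) == 128);	/* first slot using direct bits */
}
```

The values at slots 4, 13 and 14 line up with the [4, 127] range and the 128 threshold mentioned in the comments on dist_special and dist_align.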
gokhlayeh@9258 1853 +
gokhlayeh@9258 1854 +/*
gokhlayeh@9258 1855 + * Decode a repeated match. The distance is one of the four most recently
gokhlayeh@9258 1856 + * seen matches. The distance will be stored in s->lzma.rep0.
gokhlayeh@9258 1857 + */
gokhlayeh@9258 1858 +static void lzma_rep_match(struct xz_dec_lzma2 *s, uint32_t pos_state)
gokhlayeh@9258 1859 +{
gokhlayeh@9258 1860 + uint32_t tmp;
gokhlayeh@9258 1861 +
gokhlayeh@9258 1862 + if (!rc_bit(&s->rc, &s->lzma.is_rep0[s->lzma.state])) {
gokhlayeh@9258 1863 + if (!rc_bit(&s->rc, &s->lzma.is_rep0_long[
gokhlayeh@9258 1864 + s->lzma.state][pos_state])) {
gokhlayeh@9258 1865 + lzma_state_short_rep(&s->lzma.state);
gokhlayeh@9258 1866 + s->lzma.len = 1;
gokhlayeh@9258 1867 + return;
gokhlayeh@9258 1868 + }
gokhlayeh@9258 1869 + } else {
gokhlayeh@9258 1870 + if (!rc_bit(&s->rc, &s->lzma.is_rep1[s->lzma.state])) {
gokhlayeh@9258 1871 + tmp = s->lzma.rep1;
gokhlayeh@9258 1872 + } else {
gokhlayeh@9258 1873 + if (!rc_bit(&s->rc, &s->lzma.is_rep2[s->lzma.state])) {
gokhlayeh@9258 1874 + tmp = s->lzma.rep2;
gokhlayeh@9258 1875 + } else {
gokhlayeh@9258 1876 + tmp = s->lzma.rep3;
gokhlayeh@9258 1877 + s->lzma.rep3 = s->lzma.rep2;
gokhlayeh@9258 1878 + }
gokhlayeh@9258 1879 +
gokhlayeh@9258 1880 + s->lzma.rep2 = s->lzma.rep1;
gokhlayeh@9258 1881 + }
gokhlayeh@9258 1882 +
gokhlayeh@9258 1883 + s->lzma.rep1 = s->lzma.rep0;
gokhlayeh@9258 1884 + s->lzma.rep0 = tmp;
gokhlayeh@9258 1885 + }
gokhlayeh@9258 1886 +
gokhlayeh@9258 1887 + lzma_state_long_rep(&s->lzma.state);
gokhlayeh@9258 1888 + lzma_len(s, &s->lzma.rep_len_dec, pos_state);
gokhlayeh@9258 1889 +}
gokhlayeh@9258 1890 +
gokhlayeh@9258 1891 +/* LZMA decoder core */
gokhlayeh@9258 1892 +static bool lzma_main(struct xz_dec_lzma2 *s)
gokhlayeh@9258 1893 +{
gokhlayeh@9258 1894 + uint32_t pos_state;
gokhlayeh@9258 1895 +
gokhlayeh@9258 1896 + /*
gokhlayeh@9258 1897 + * If the dictionary limit was reached during the previous call, try to
gokhlayeh@9258 1898 + * finish the possibly pending repeat in the dictionary.
gokhlayeh@9258 1899 + */
gokhlayeh@9258 1900 + if (dict_has_space(&s->dict) && s->lzma.len > 0)
gokhlayeh@9258 1901 + dict_repeat(&s->dict, &s->lzma.len, s->lzma.rep0);
gokhlayeh@9258 1902 +
gokhlayeh@9258 1903 + /*
gokhlayeh@9258 1904 + * Decode more LZMA symbols. One iteration may consume up to
gokhlayeh@9258 1905 + * LZMA_IN_REQUIRED - 1 bytes.
gokhlayeh@9258 1906 + */
gokhlayeh@9258 1907 + while (dict_has_space(&s->dict) && !rc_limit_exceeded(&s->rc)) {
gokhlayeh@9258 1908 + pos_state = s->dict.pos & s->lzma.pos_mask;
gokhlayeh@9258 1909 +
gokhlayeh@9258 1910 + if (!rc_bit(&s->rc, &s->lzma.is_match[
gokhlayeh@9258 1911 + s->lzma.state][pos_state])) {
gokhlayeh@9258 1912 + lzma_literal(s);
gokhlayeh@9258 1913 + } else {
gokhlayeh@9258 1914 + if (rc_bit(&s->rc, &s->lzma.is_rep[s->lzma.state]))
gokhlayeh@9258 1915 + lzma_rep_match(s, pos_state);
gokhlayeh@9258 1916 + else
gokhlayeh@9258 1917 + lzma_match(s, pos_state);
gokhlayeh@9258 1918 +
gokhlayeh@9258 1919 + if (!dict_repeat(&s->dict, &s->lzma.len, s->lzma.rep0))
gokhlayeh@9258 1920 + return false;
gokhlayeh@9258 1921 + }
gokhlayeh@9258 1922 + }
gokhlayeh@9258 1923 +
gokhlayeh@9258 1924 + /*
gokhlayeh@9258 1925 + * Having the range decoder always normalized when we are outside
gokhlayeh@9258 1926 + * this function makes it easier to correctly handle end of the chunk.
gokhlayeh@9258 1927 + */
gokhlayeh@9258 1928 + rc_normalize(&s->rc);
gokhlayeh@9258 1929 +
gokhlayeh@9258 1930 + return true;
gokhlayeh@9258 1931 +}
gokhlayeh@9258 1932 +
gokhlayeh@9258 1933 +/*
gokhlayeh@9258 1934 + * Reset the LZMA decoder and range decoder state. Dictionary is not reset
gokhlayeh@9258 1935 + * here, because LZMA state may be reset without resetting the dictionary.
gokhlayeh@9258 1936 + */
gokhlayeh@9258 1937 +static void lzma_reset(struct xz_dec_lzma2 *s)
gokhlayeh@9258 1938 +{
gokhlayeh@9258 1939 + uint16_t *probs;
gokhlayeh@9258 1940 + size_t i;
gokhlayeh@9258 1941 +
gokhlayeh@9258 1942 + s->lzma.state = STATE_LIT_LIT;
gokhlayeh@9258 1943 + s->lzma.rep0 = 0;
gokhlayeh@9258 1944 + s->lzma.rep1 = 0;
gokhlayeh@9258 1945 + s->lzma.rep2 = 0;
gokhlayeh@9258 1946 + s->lzma.rep3 = 0;
gokhlayeh@9258 1947 +
gokhlayeh@9258 1948 + /*
gokhlayeh@9258 1949 + * All probabilities are initialized to the same value. This hack
gokhlayeh@9258 1950 + * makes the code smaller by avoiding a separate loop for each
gokhlayeh@9258 1951 + * probability array.
gokhlayeh@9258 1952 + *
gokhlayeh@9258 1953 + * This could be optimized so that only the part of the literal
gokhlayeh@9258 1954 + * probabilities that is actually required gets initialized. In
gokhlayeh@9258 1955 + * the common case we would write 12 KiB less.
gokhlayeh@9258 1956 + */
gokhlayeh@9258 1957 + probs = s->lzma.is_match[0];
gokhlayeh@9258 1958 + for (i = 0; i < PROBS_TOTAL; ++i)
gokhlayeh@9258 1959 + probs[i] = RC_BIT_MODEL_TOTAL / 2;
gokhlayeh@9258 1960 +
gokhlayeh@9258 1961 + rc_reset(&s->rc);
gokhlayeh@9258 1962 +}
gokhlayeh@9258 1963 +
gokhlayeh@9258 1964 +/*
gokhlayeh@9258 1965 + * Decode and validate LZMA properties (lc/lp/pb) and calculate the bit masks
gokhlayeh@9258 1966 + * from the decoded lp and pb values. On success, the LZMA decoder state is
gokhlayeh@9258 1967 + * reset and true is returned.
gokhlayeh@9258 1968 + */
gokhlayeh@9258 1969 +static bool lzma_props(struct xz_dec_lzma2 *s, uint8_t props)
gokhlayeh@9258 1970 +{
gokhlayeh@9258 1971 + if (props > (4 * 5 + 4) * 9 + 8)
gokhlayeh@9258 1972 + return false;
gokhlayeh@9258 1973 +
gokhlayeh@9258 1974 + s->lzma.pos_mask = 0;
gokhlayeh@9258 1975 + while (props >= 9 * 5) {
gokhlayeh@9258 1976 + props -= 9 * 5;
gokhlayeh@9258 1977 + ++s->lzma.pos_mask;
gokhlayeh@9258 1978 + }
gokhlayeh@9258 1979 +
gokhlayeh@9258 1980 + s->lzma.pos_mask = (1 << s->lzma.pos_mask) - 1;
gokhlayeh@9258 1981 +
gokhlayeh@9258 1982 + s->lzma.literal_pos_mask = 0;
gokhlayeh@9258 1983 + while (props >= 9) {
gokhlayeh@9258 1984 + props -= 9;
gokhlayeh@9258 1985 + ++s->lzma.literal_pos_mask;
gokhlayeh@9258 1986 + }
gokhlayeh@9258 1987 +
gokhlayeh@9258 1988 + s->lzma.lc = props;
gokhlayeh@9258 1989 +
gokhlayeh@9258 1990 + if (s->lzma.lc + s->lzma.literal_pos_mask > 4)
gokhlayeh@9258 1991 + return false;
gokhlayeh@9258 1992 +
gokhlayeh@9258 1993 + s->lzma.literal_pos_mask = (1 << s->lzma.literal_pos_mask) - 1;
gokhlayeh@9258 1994 +
gokhlayeh@9258 1995 + lzma_reset(s);
gokhlayeh@9258 1996 +
gokhlayeh@9258 1997 + return true;
gokhlayeh@9258 1998 +}
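The properties byte packs three small numbers as (pb * 5 + lp) * 9 + lc. lzma_props() unpacks it with subtraction loops; the sketch below does the same with division and remainder, which may be easier to follow when reading the format. `struct lzma_lclppb`, `decode_props()` and `decode_props_example()` are hypothetical names, and unlike the real function this one neither builds the bit masks nor resets any decoder state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the three decoded properties. */
struct lzma_lclppb {
	uint32_t lc;	/* literal context bits */
	uint32_t lp;	/* literal position bits */
	uint32_t pb;	/* position bits */
};

/* Returns false for values that lzma_props() would also reject. */
bool decode_props(uint8_t props, struct lzma_lclppb *out)
{
	if (props > (4 * 5 + 4) * 9 + 8)
		return false;

	out->pb = props / (9 * 5);
	props %= 9 * 5;
	out->lp = props / 9;
	out->lc = props % 9;

	/* Same limit as in lzma_props(): lc + lp must not exceed 4. */
	return out->lc + out->lp <= 4;
}

void decode_props_example(void)
{
	struct lzma_lclppb p;

	/* 0x5D = (2 * 5 + 0) * 9 + 3, the common lc=3, lp=0, pb=2 */
	assert(decode_props(0x5D, &p));
	assert(p.lc == 3 && p.lp == 0 && p.pb == 2);

	/* The masks stored by lzma_props() would then be: */
	assert(((1U << p.pb) - 1) == 3);
	assert(((1U << p.lp) - 1) == 0);
}
```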
gokhlayeh@9258 1999 +
gokhlayeh@9258 2000 +/*********
gokhlayeh@9258 2001 + * LZMA2 *
gokhlayeh@9258 2002 + *********/
gokhlayeh@9258 2003 +
gokhlayeh@9258 2004 +/*
gokhlayeh@9258 2005 + * The LZMA decoder assumes that if the input limit (s->rc.in_limit) hasn't
gokhlayeh@9258 2006 + * been exceeded, it is safe to read up to LZMA_IN_REQUIRED bytes. This
gokhlayeh@9258 2007 + * wrapper function takes care of making the LZMA decoder's assumption safe.
gokhlayeh@9258 2008 + *
gokhlayeh@9258 2009 + * As long as there is plenty of input left to be decoded in the current LZMA
gokhlayeh@9258 2010 + * chunk, we decode directly from the caller-supplied input buffer until
gokhlayeh@9258 2011 + * there are LZMA_IN_REQUIRED bytes left. Those remaining bytes are copied into
gokhlayeh@9258 2012 + * s->temp.buf, which (hopefully) gets filled on the next call to this
gokhlayeh@9258 2013 + * function. We decode a few bytes from the temporary buffer so that we can
gokhlayeh@9258 2014 + * continue decoding from the caller-supplied input buffer again.
gokhlayeh@9258 2015 + */
gokhlayeh@9258 2016 +static bool lzma2_lzma(struct xz_dec_lzma2 *s, struct xz_buf *b)
gokhlayeh@9258 2017 +{
gokhlayeh@9258 2018 + size_t in_avail;
gokhlayeh@9258 2019 + uint32_t tmp;
gokhlayeh@9258 2020 +
gokhlayeh@9258 2021 + in_avail = b->in_size - b->in_pos;
gokhlayeh@9258 2022 + if (s->temp.size > 0 || s->lzma2.compressed == 0) {
gokhlayeh@9258 2023 + tmp = 2 * LZMA_IN_REQUIRED - s->temp.size;
gokhlayeh@9258 2024 + if (tmp > s->lzma2.compressed - s->temp.size)
gokhlayeh@9258 2025 + tmp = s->lzma2.compressed - s->temp.size;
gokhlayeh@9258 2026 + if (tmp > in_avail)
gokhlayeh@9258 2027 + tmp = in_avail;
gokhlayeh@9258 2028 +
gokhlayeh@9258 2029 + memcpy(s->temp.buf + s->temp.size, b->in + b->in_pos, tmp);
gokhlayeh@9258 2030 +
gokhlayeh@9258 2031 + if (s->temp.size + tmp == s->lzma2.compressed) {
gokhlayeh@9258 2032 + memzero(s->temp.buf + s->temp.size + tmp,
gokhlayeh@9258 2033 + sizeof(s->temp.buf)
gokhlayeh@9258 2034 + - s->temp.size - tmp);
gokhlayeh@9258 2035 + s->rc.in_limit = s->temp.size + tmp;
gokhlayeh@9258 2036 + } else if (s->temp.size + tmp < LZMA_IN_REQUIRED) {
gokhlayeh@9258 2037 + s->temp.size += tmp;
gokhlayeh@9258 2038 + b->in_pos += tmp;
gokhlayeh@9258 2039 + return true;
gokhlayeh@9258 2040 + } else {
gokhlayeh@9258 2041 + s->rc.in_limit = s->temp.size + tmp - LZMA_IN_REQUIRED;
gokhlayeh@9258 2042 + }
gokhlayeh@9258 2043 +
gokhlayeh@9258 2044 + s->rc.in = s->temp.buf;
gokhlayeh@9258 2045 + s->rc.in_pos = 0;
gokhlayeh@9258 2046 +
gokhlayeh@9258 2047 + if (!lzma_main(s) || s->rc.in_pos > s->temp.size + tmp)
gokhlayeh@9258 2048 + return false;
gokhlayeh@9258 2049 +
gokhlayeh@9258 2050 + s->lzma2.compressed -= s->rc.in_pos;
gokhlayeh@9258 2051 +
gokhlayeh@9258 2052 + if (s->rc.in_pos < s->temp.size) {
gokhlayeh@9258 2053 + s->temp.size -= s->rc.in_pos;
gokhlayeh@9258 2054 + memmove(s->temp.buf, s->temp.buf + s->rc.in_pos,
gokhlayeh@9258 2055 + s->temp.size);
gokhlayeh@9258 2056 + return true;
gokhlayeh@9258 2057 + }
gokhlayeh@9258 2058 +
gokhlayeh@9258 2059 + b->in_pos += s->rc.in_pos - s->temp.size;
gokhlayeh@9258 2060 + s->temp.size = 0;
gokhlayeh@9258 2061 + }
gokhlayeh@9258 2062 +
gokhlayeh@9258 2063 + in_avail = b->in_size - b->in_pos;
gokhlayeh@9258 2064 + if (in_avail >= LZMA_IN_REQUIRED) {
gokhlayeh@9258 2065 + s->rc.in = b->in;
gokhlayeh@9258 2066 + s->rc.in_pos = b->in_pos;
gokhlayeh@9258 2067 +
gokhlayeh@9258 2068 + if (in_avail >= s->lzma2.compressed + LZMA_IN_REQUIRED)
gokhlayeh@9258 2069 + s->rc.in_limit = b->in_pos + s->lzma2.compressed;
gokhlayeh@9258 2070 + else
gokhlayeh@9258 2071 + s->rc.in_limit = b->in_size - LZMA_IN_REQUIRED;
gokhlayeh@9258 2072 +
gokhlayeh@9258 2073 + if (!lzma_main(s))
gokhlayeh@9258 2074 + return false;
gokhlayeh@9258 2075 +
gokhlayeh@9258 2076 + in_avail = s->rc.in_pos - b->in_pos;
gokhlayeh@9258 2077 + if (in_avail > s->lzma2.compressed)
gokhlayeh@9258 2078 + return false;
gokhlayeh@9258 2079 +
gokhlayeh@9258 2080 + s->lzma2.compressed -= in_avail;
gokhlayeh@9258 2081 + b->in_pos = s->rc.in_pos;
gokhlayeh@9258 2082 + }
gokhlayeh@9258 2083 +
gokhlayeh@9258 2084 + in_avail = b->in_size - b->in_pos;
gokhlayeh@9258 2085 + if (in_avail < LZMA_IN_REQUIRED) {
gokhlayeh@9258 2086 + if (in_avail > s->lzma2.compressed)
gokhlayeh@9258 2087 + in_avail = s->lzma2.compressed;
gokhlayeh@9258 2088 +
gokhlayeh@9258 2089 + memcpy(s->temp.buf, b->in + b->in_pos, in_avail);
gokhlayeh@9258 2090 + s->temp.size = in_avail;
gokhlayeh@9258 2091 + b->in_pos += in_avail;
gokhlayeh@9258 2092 + }
gokhlayeh@9258 2093 +
gokhlayeh@9258 2094 + return true;
gokhlayeh@9258 2095 +}
gokhlayeh@9258 2096 +
gokhlayeh@9258 2097 +/*
gokhlayeh@9258 2098 + * Take care of the LZMA2 control layer, and forward the job of actual LZMA
gokhlayeh@9258 2099 + * decoding or copying of uncompressed chunks to other functions.
gokhlayeh@9258 2100 + */
gokhlayeh@9258 2101 +XZ_EXTERN enum xz_ret xz_dec_lzma2_run(struct xz_dec_lzma2 *s,
gokhlayeh@9258 2102 + struct xz_buf *b)
gokhlayeh@9258 2103 +{
gokhlayeh@9258 2104 + uint32_t tmp;
gokhlayeh@9258 2105 +
gokhlayeh@9258 2106 + while (b->in_pos < b->in_size || s->lzma2.sequence == SEQ_LZMA_RUN) {
gokhlayeh@9258 2107 + switch (s->lzma2.sequence) {
gokhlayeh@9258 2108 + case SEQ_CONTROL:
gokhlayeh@9258 2109 + /*
gokhlayeh@9258 2110 + * LZMA2 control byte
gokhlayeh@9258 2111 + *
gokhlayeh@9258 2112 + * Exact values:
gokhlayeh@9258 2113 + * 0x00 End marker
gokhlayeh@9258 2114 + * 0x01 Dictionary reset followed by
gokhlayeh@9258 2115 + * an uncompressed chunk
gokhlayeh@9258 2116 + * 0x02 Uncompressed chunk (no dictionary reset)
gokhlayeh@9258 2117 + *
gokhlayeh@9258 2118 + * Highest three bits (s->control & 0xE0):
gokhlayeh@9258 2119 + * 0xE0 Dictionary reset, new properties and state
gokhlayeh@9258 2120 + * reset, followed by LZMA compressed chunk
gokhlayeh@9258 2121 + * 0xC0 New properties and state reset, followed
gokhlayeh@9258 2122 + * by LZMA compressed chunk (no dictionary
gokhlayeh@9258 2123 + * reset)
gokhlayeh@9258 2124 + * 0xA0 State reset using old properties,
gokhlayeh@9258 2125 + * followed by LZMA compressed chunk (no
gokhlayeh@9258 2126 + * dictionary reset)
gokhlayeh@9258 2127 + * 0x80 LZMA chunk (no dictionary or state reset)
gokhlayeh@9258 2128 + *
gokhlayeh@9258 2129 + * For LZMA compressed chunks, the lowest five bits
gokhlayeh@9258 2130 + * (s->control & 0x1F) are the highest bits of the
gokhlayeh@9258 2131 + * uncompressed size (bits 16-20).
gokhlayeh@9258 2132 + *
gokhlayeh@9258 2133 + * A new LZMA2 stream must begin with a dictionary
gokhlayeh@9258 2134 + * reset. The first LZMA chunk must set new
gokhlayeh@9258 2135 + * properties and reset the LZMA state.
gokhlayeh@9258 2136 + *
gokhlayeh@9258 2137 + * Values that don't match anything described above
gokhlayeh@9258 2138 + * are invalid and we return XZ_DATA_ERROR.
gokhlayeh@9258 2139 + */
gokhlayeh@9258 2140 + tmp = b->in[b->in_pos++];
gokhlayeh@9258 2141 +
gokhlayeh@9258 2142 + if (tmp >= 0xE0 || tmp == 0x01) {
gokhlayeh@9258 2143 + s->lzma2.need_props = true;
gokhlayeh@9258 2144 + s->lzma2.need_dict_reset = false;
gokhlayeh@9258 2145 + dict_reset(&s->dict, b);
gokhlayeh@9258 2146 + } else if (s->lzma2.need_dict_reset) {
gokhlayeh@9258 2147 + return XZ_DATA_ERROR;
gokhlayeh@9258 2148 + }
gokhlayeh@9258 2149 +
gokhlayeh@9258 2150 + if (tmp >= 0x80) {
gokhlayeh@9258 2151 + s->lzma2.uncompressed = (tmp & 0x1F) << 16;
gokhlayeh@9258 2152 + s->lzma2.sequence = SEQ_UNCOMPRESSED_1;
gokhlayeh@9258 2153 +
gokhlayeh@9258 2154 + if (tmp >= 0xC0) {
gokhlayeh@9258 2155 + /*
gokhlayeh@9258 2156 + * When there are new properties,
gokhlayeh@9258 2157 + * state reset is done at
gokhlayeh@9258 2158 + * SEQ_PROPERTIES.
gokhlayeh@9258 2159 + */
gokhlayeh@9258 2160 + s->lzma2.need_props = false;
gokhlayeh@9258 2161 + s->lzma2.next_sequence
gokhlayeh@9258 2162 + = SEQ_PROPERTIES;
gokhlayeh@9258 2163 +
gokhlayeh@9258 2164 + } else if (s->lzma2.need_props) {
gokhlayeh@9258 2165 + return XZ_DATA_ERROR;
gokhlayeh@9258 2166 +
gokhlayeh@9258 2167 + } else {
gokhlayeh@9258 2168 + s->lzma2.next_sequence
gokhlayeh@9258 2169 + = SEQ_LZMA_PREPARE;
gokhlayeh@9258 2170 + if (tmp >= 0xA0)
gokhlayeh@9258 2171 + lzma_reset(s);
gokhlayeh@9258 2172 + }
gokhlayeh@9258 2173 + } else {
gokhlayeh@9258 2174 + if (tmp == 0x00)
gokhlayeh@9258 2175 + return XZ_STREAM_END;
gokhlayeh@9258 2176 +
gokhlayeh@9258 2177 + if (tmp > 0x02)
gokhlayeh@9258 2178 + return XZ_DATA_ERROR;
gokhlayeh@9258 2179 +
gokhlayeh@9258 2180 + s->lzma2.sequence = SEQ_COMPRESSED_0;
gokhlayeh@9258 2181 + s->lzma2.next_sequence = SEQ_COPY;
gokhlayeh@9258 2182 + }
gokhlayeh@9258 2183 +
gokhlayeh@9258 2184 + break;
gokhlayeh@9258 2185 +
gokhlayeh@9258 2186 + case SEQ_UNCOMPRESSED_1:
gokhlayeh@9258 2187 + s->lzma2.uncompressed
gokhlayeh@9258 2188 + += (uint32_t)b->in[b->in_pos++] << 8;
gokhlayeh@9258 2189 + s->lzma2.sequence = SEQ_UNCOMPRESSED_2;
gokhlayeh@9258 2190 + break;
gokhlayeh@9258 2191 +
gokhlayeh@9258 2192 + case SEQ_UNCOMPRESSED_2:
gokhlayeh@9258 2193 + s->lzma2.uncompressed
gokhlayeh@9258 2194 + += (uint32_t)b->in[b->in_pos++] + 1;
gokhlayeh@9258 2195 + s->lzma2.sequence = SEQ_COMPRESSED_0;
gokhlayeh@9258 2196 + break;
gokhlayeh@9258 2197 +
gokhlayeh@9258 2198 + case SEQ_COMPRESSED_0:
gokhlayeh@9258 2199 + s->lzma2.compressed
gokhlayeh@9258 2200 + = (uint32_t)b->in[b->in_pos++] << 8;
gokhlayeh@9258 2201 + s->lzma2.sequence = SEQ_COMPRESSED_1;
gokhlayeh@9258 2202 + break;
gokhlayeh@9258 2203 +
gokhlayeh@9258 2204 + case SEQ_COMPRESSED_1:
gokhlayeh@9258 2205 + s->lzma2.compressed
gokhlayeh@9258 2206 + += (uint32_t)b->in[b->in_pos++] + 1;
gokhlayeh@9258 2207 + s->lzma2.sequence = s->lzma2.next_sequence;
gokhlayeh@9258 2208 + break;
gokhlayeh@9258 2209 +
gokhlayeh@9258 2210 + case SEQ_PROPERTIES:
gokhlayeh@9258 2211 + if (!lzma_props(s, b->in[b->in_pos++]))
gokhlayeh@9258 2212 + return XZ_DATA_ERROR;
gokhlayeh@9258 2213 +
gokhlayeh@9258 2214 + s->lzma2.sequence = SEQ_LZMA_PREPARE; /* Fall through */
gokhlayeh@9258 2215 +
gokhlayeh@9258 2216 + case SEQ_LZMA_PREPARE:
gokhlayeh@9258 2217 + if (s->lzma2.compressed < RC_INIT_BYTES)
gokhlayeh@9258 2218 + return XZ_DATA_ERROR;
gokhlayeh@9258 2219 +
gokhlayeh@9258 2220 + if (!rc_read_init(&s->rc, b))
gokhlayeh@9258 2221 + return XZ_OK;
gokhlayeh@9258 2222 +
gokhlayeh@9258 2223 + s->lzma2.compressed -= RC_INIT_BYTES;
gokhlayeh@9258 2224 + s->lzma2.sequence = SEQ_LZMA_RUN; /* Fall through */
gokhlayeh@9258 2225 +
gokhlayeh@9258 2226 + case SEQ_LZMA_RUN:
gokhlayeh@9258 2227 + /*
gokhlayeh@9258 2228 + * Set dictionary limit to indicate how much we want
gokhlayeh@9258 2229 + * to be decoded at maximum. Decode new data into the
gokhlayeh@9258 2230 + * dictionary. Flush the new data from dictionary to
gokhlayeh@9258 2231 + * b->out. Check if we finished decoding this chunk.
gokhlayeh@9258 2232 + * In case the dictionary got full but we didn't fill
gokhlayeh@9258 2233 + * the output buffer yet, we may run this loop
gokhlayeh@9258 2234 + * multiple times without changing s->lzma2.sequence.
gokhlayeh@9258 2235 + */
gokhlayeh@9258 2236 + dict_limit(&s->dict, min_t(size_t,
gokhlayeh@9258 2237 + b->out_size - b->out_pos,
gokhlayeh@9258 2238 + s->lzma2.uncompressed));
gokhlayeh@9258 2239 + if (!lzma2_lzma(s, b))
gokhlayeh@9258 2240 + return XZ_DATA_ERROR;
gokhlayeh@9258 2241 +
gokhlayeh@9258 2242 + s->lzma2.uncompressed -= dict_flush(&s->dict, b);
gokhlayeh@9258 2243 +
gokhlayeh@9258 2244 + if (s->lzma2.uncompressed == 0) {
gokhlayeh@9258 2245 + if (s->lzma2.compressed > 0 || s->lzma.len > 0
gokhlayeh@9258 2246 + || !rc_is_finished(&s->rc))
gokhlayeh@9258 2247 + return XZ_DATA_ERROR;
gokhlayeh@9258 2248 +
gokhlayeh@9258 2249 + rc_reset(&s->rc);
gokhlayeh@9258 2250 + s->lzma2.sequence = SEQ_CONTROL;
gokhlayeh@9258 2251 +
gokhlayeh@9258 2252 + } else if (b->out_pos == b->out_size
gokhlayeh@9258 2253 + || (b->in_pos == b->in_size
gokhlayeh@9258 2254 + && s->temp.size
gokhlayeh@9258 2255 + < s->lzma2.compressed)) {
gokhlayeh@9258 2256 + return XZ_OK;
gokhlayeh@9258 2257 + }
gokhlayeh@9258 2258 +
gokhlayeh@9258 2259 + break;
gokhlayeh@9258 2260 +
gokhlayeh@9258 2261 + case SEQ_COPY:
gokhlayeh@9258 2262 + dict_uncompressed(&s->dict, b, &s->lzma2.compressed);
gokhlayeh@9258 2263 + if (s->lzma2.compressed > 0)
gokhlayeh@9258 2264 + return XZ_OK;
gokhlayeh@9258 2265 +
gokhlayeh@9258 2266 + s->lzma2.sequence = SEQ_CONTROL;
gokhlayeh@9258 2267 + break;
gokhlayeh@9258 2268 + }
gokhlayeh@9258 2269 + }
gokhlayeh@9258 2270 +
gokhlayeh@9258 2271 + return XZ_OK;
gokhlayeh@9258 2272 +}
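
The control byte table in the SEQ_CONTROL comment can be restated as a small classifier. This is only an illustrative sketch: the enum and function names are hypothetical, and the real decoder also tracks need_dict_reset and need_props, which this sketch ignores.

```c
#include <stdint.h>

/* Hypothetical labels for the chunk types listed in the comment. */
enum chunk_kind_sketch {
	CHUNK_END,                       /* 0x00 */
	CHUNK_UNCOMPRESSED_DICT_RESET,   /* 0x01 */
	CHUNK_UNCOMPRESSED,              /* 0x02 */
	CHUNK_LZMA,                      /* 0x80..0x9F */
	CHUNK_LZMA_STATE_RESET,          /* 0xA0..0xBF */
	CHUNK_LZMA_NEW_PROPS,            /* 0xC0..0xDF */
	CHUNK_LZMA_NEW_PROPS_DICT_RESET, /* 0xE0..0xFF */
	CHUNK_INVALID                    /* 0x03..0x7F */
};

/* Map an LZMA2 control byte to the chunk type it announces. */
static enum chunk_kind_sketch classify_control_sketch(uint8_t control)
{
	if (control >= 0xE0)
		return CHUNK_LZMA_NEW_PROPS_DICT_RESET;
	if (control >= 0xC0)
		return CHUNK_LZMA_NEW_PROPS;
	if (control >= 0xA0)
		return CHUNK_LZMA_STATE_RESET;
	if (control >= 0x80)
		return CHUNK_LZMA;
	if (control == 0x00)
		return CHUNK_END;
	if (control == 0x01)
		return CHUNK_UNCOMPRESSED_DICT_RESET;
	if (control == 0x02)
		return CHUNK_UNCOMPRESSED;
	return CHUNK_INVALID;
}
```

For the LZMA chunk types, the low five bits of the control byte additionally carry bits 16-20 of the uncompressed size, as the comment describes.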
gokhlayeh@9258 2273 +
gokhlayeh@9258 2274 +XZ_EXTERN struct xz_dec_lzma2 *xz_dec_lzma2_create(enum xz_mode mode,
gokhlayeh@9258 2275 + uint32_t dict_max)
gokhlayeh@9258 2276 +{
gokhlayeh@9258 2277 + struct xz_dec_lzma2 *s = kmalloc(sizeof(*s), GFP_KERNEL);
gokhlayeh@9258 2278 + if (s == NULL)
gokhlayeh@9258 2279 + return NULL;
gokhlayeh@9258 2280 +
gokhlayeh@9258 2281 + s->dict.mode = mode;
gokhlayeh@9258 2282 + s->dict.size_max = dict_max;
gokhlayeh@9258 2283 +
gokhlayeh@9258 2284 + if (DEC_IS_PREALLOC(mode)) {
gokhlayeh@9258 2285 + s->dict.buf = vmalloc(dict_max);
gokhlayeh@9258 2286 + if (s->dict.buf == NULL) {
gokhlayeh@9258 2287 + kfree(s);
gokhlayeh@9258 2288 + return NULL;
gokhlayeh@9258 2289 + }
gokhlayeh@9258 2290 + } else if (DEC_IS_DYNALLOC(mode)) {
gokhlayeh@9258 2291 + s->dict.buf = NULL;
gokhlayeh@9258 2292 + s->dict.allocated = 0;
gokhlayeh@9258 2293 + }
gokhlayeh@9258 2294 +
gokhlayeh@9258 2295 + return s;
gokhlayeh@9258 2296 +}
gokhlayeh@9258 2297 +
gokhlayeh@9258 2298 +XZ_EXTERN enum xz_ret xz_dec_lzma2_reset(struct xz_dec_lzma2 *s, uint8_t props)
gokhlayeh@9258 2299 +{
gokhlayeh@9258 2300 + /* This limits dictionary size to 3 GiB to keep parsing simpler. */
gokhlayeh@9258 2301 + if (props > 39)
gokhlayeh@9258 2302 + return XZ_OPTIONS_ERROR;
gokhlayeh@9258 2303 +
gokhlayeh@9258 2304 + s->dict.size = 2 + (props & 1);
gokhlayeh@9258 2305 + s->dict.size <<= (props >> 1) + 11;
gokhlayeh@9258 2306 +
gokhlayeh@9258 2307 + if (DEC_IS_MULTI(s->dict.mode)) {
gokhlayeh@9258 2308 + if (s->dict.size > s->dict.size_max)
gokhlayeh@9258 2309 + return XZ_MEMLIMIT_ERROR;
gokhlayeh@9258 2310 +
gokhlayeh@9258 2311 + s->dict.end = s->dict.size;
gokhlayeh@9258 2312 +
gokhlayeh@9258 2313 + if (DEC_IS_DYNALLOC(s->dict.mode)) {
gokhlayeh@9258 2314 + if (s->dict.allocated < s->dict.size) {
gokhlayeh@9258 2315 + vfree(s->dict.buf);
gokhlayeh@9258 2316 + s->dict.buf = vmalloc(s->dict.size);
gokhlayeh@9258 2317 + if (s->dict.buf == NULL) {
gokhlayeh@9258 2318 + s->dict.allocated = 0;
gokhlayeh@9258 2319 + return XZ_MEM_ERROR;
gokhlayeh@9258 2320 + }
gokhlayeh@9258 2321 + }
gokhlayeh@9258 2322 + }
gokhlayeh@9258 2323 + }
gokhlayeh@9258 2324 +
gokhlayeh@9258 2325 + s->lzma.len = 0;
gokhlayeh@9258 2326 +
gokhlayeh@9258 2327 + s->lzma2.sequence = SEQ_CONTROL;
gokhlayeh@9258 2328 + s->lzma2.need_dict_reset = true;
gokhlayeh@9258 2329 +
gokhlayeh@9258 2330 + s->temp.size = 0;
gokhlayeh@9258 2331 +
gokhlayeh@9258 2332 + return XZ_OK;
gokhlayeh@9258 2333 +}
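
The dictionary size computed in xz_dec_lzma2_reset() alternates between 2^n and 3 * 2^(n-1) as props increases. The sketch below isolates that arithmetic; the function name is hypothetical and returning 0 for an out-of-range value is an assumption made for this example (the patch returns XZ_OPTIONS_ERROR instead).

```c
#include <stdint.h>

/*
 * Compute the dictionary size encoded by an LZMA2 props byte, using
 * the same formula as the patch. Returns 0 for props > 39, since a
 * larger value would not fit the 32-bit size.
 */
static uint32_t dict_size_sketch(uint8_t props)
{
	uint32_t size;

	if (props > 39)
		return 0;

	size = 2 + (props & 1);       /* mantissa: 2 or 3 */
	size <<= (props >> 1) + 11;   /* exponent: 11..30 */

	return size;
}
```

Under this sketch, props 0 gives 4 KiB, props 1 gives 6 KiB, and props 39 gives 3 GiB, which matches the limit named in the comment above.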
gokhlayeh@9258 2334 +
gokhlayeh@9258 2335 +XZ_EXTERN void xz_dec_lzma2_end(struct xz_dec_lzma2 *s)
gokhlayeh@9258 2336 +{
gokhlayeh@9258 2337 + if (DEC_IS_MULTI(s->dict.mode))
gokhlayeh@9258 2338 + vfree(s->dict.buf);
gokhlayeh@9258 2339 +
gokhlayeh@9258 2340 + kfree(s);
gokhlayeh@9258 2341 +}
gokhlayeh@9258 2342 diff --git a/lib/xz/xz_dec_stream.c b/lib/xz/xz_dec_stream.c
gokhlayeh@9258 2343 new file mode 100644
gokhlayeh@9258 2344 index 0000000..ac809b1
gokhlayeh@9258 2345 --- /dev/null
gokhlayeh@9258 2346 +++ b/lib/xz/xz_dec_stream.c
gokhlayeh@9258 2347 @@ -0,0 +1,821 @@
gokhlayeh@9258 2348 +/*
gokhlayeh@9258 2349 + * .xz Stream decoder
gokhlayeh@9258 2350 + *
gokhlayeh@9258 2351 + * Author: Lasse Collin <lasse.collin@tukaani.org>
gokhlayeh@9258 2352 + *
gokhlayeh@9258 2353 + * This file has been put into the public domain.
gokhlayeh@9258 2354 + * You can do whatever you want with this file.
gokhlayeh@9258 2355 + */
gokhlayeh@9258 2356 +
gokhlayeh@9258 2357 +#include "xz_private.h"
gokhlayeh@9258 2358 +#include "xz_stream.h"
gokhlayeh@9258 2359 +
gokhlayeh@9258 2360 +/* Hash used to validate the Index field */
gokhlayeh@9258 2361 +struct xz_dec_hash {
gokhlayeh@9258 2362 + vli_type unpadded;
gokhlayeh@9258 2363 + vli_type uncompressed;
gokhlayeh@9258 2364 + uint32_t crc32;
gokhlayeh@9258 2365 +};
gokhlayeh@9258 2366 +
gokhlayeh@9258 2367 +struct xz_dec {
gokhlayeh@9258 2368 + /* Position in dec_main() */
gokhlayeh@9258 2369 + enum {
gokhlayeh@9258 2370 + SEQ_STREAM_HEADER,
gokhlayeh@9258 2371 + SEQ_BLOCK_START,
gokhlayeh@9258 2372 + SEQ_BLOCK_HEADER,
gokhlayeh@9258 2373 + SEQ_BLOCK_UNCOMPRESS,
gokhlayeh@9258 2374 + SEQ_BLOCK_PADDING,
gokhlayeh@9258 2375 + SEQ_BLOCK_CHECK,
gokhlayeh@9258 2376 + SEQ_INDEX,
gokhlayeh@9258 2377 + SEQ_INDEX_PADDING,
gokhlayeh@9258 2378 + SEQ_INDEX_CRC32,
gokhlayeh@9258 2379 + SEQ_STREAM_FOOTER
gokhlayeh@9258 2380 + } sequence;
gokhlayeh@9258 2381 +
gokhlayeh@9258 2382 + /* Position in variable-length integers and Check fields */
gokhlayeh@9258 2383 + uint32_t pos;
gokhlayeh@9258 2384 +
gokhlayeh@9258 2385 + /* Variable-length integer decoded by dec_vli() */
gokhlayeh@9258 2386 + vli_type vli;
gokhlayeh@9258 2387 +
gokhlayeh@9258 2388 + /* Saved in_pos and out_pos */
gokhlayeh@9258 2389 + size_t in_start;
gokhlayeh@9258 2390 + size_t out_start;
gokhlayeh@9258 2391 +
gokhlayeh@9258 2392 + /* CRC32 value in Block or Index */
gokhlayeh@9258 2393 + uint32_t crc32;
gokhlayeh@9258 2394 +
gokhlayeh@9258 2395 + /* Type of the integrity check calculated from uncompressed data */
gokhlayeh@9258 2396 + enum xz_check check_type;
gokhlayeh@9258 2397 +
gokhlayeh@9258 2398 + /* Operation mode */
gokhlayeh@9258 2399 + enum xz_mode mode;
gokhlayeh@9258 2400 +
gokhlayeh@9258 2401 + /*
gokhlayeh@9258 2402 + * True if the next call to xz_dec_run() is allowed to return
gokhlayeh@9258 2403 + * XZ_BUF_ERROR.
gokhlayeh@9258 2404 + */
gokhlayeh@9258 2405 + bool allow_buf_error;
gokhlayeh@9258 2406 +
gokhlayeh@9258 2407 + /* Information stored in Block Header */
gokhlayeh@9258 2408 + struct {
gokhlayeh@9258 2409 + /*
gokhlayeh@9258 2410 + * Value stored in the Compressed Size field, or
gokhlayeh@9258 2411 + * VLI_UNKNOWN if Compressed Size is not present.
gokhlayeh@9258 2412 + */
gokhlayeh@9258 2413 + vli_type compressed;
gokhlayeh@9258 2414 +
gokhlayeh@9258 2415 + /*
gokhlayeh@9258 2416 + * Value stored in the Uncompressed Size field, or
gokhlayeh@9258 2417 + * VLI_UNKNOWN if Uncompressed Size is not present.
gokhlayeh@9258 2418 + */
gokhlayeh@9258 2419 + vli_type uncompressed;
gokhlayeh@9258 2420 +
gokhlayeh@9258 2421 + /* Size of the Block Header field */
gokhlayeh@9258 2422 + uint32_t size;
gokhlayeh@9258 2423 + } block_header;
gokhlayeh@9258 2424 +
gokhlayeh@9258 2425 + /* Information collected when decoding Blocks */
gokhlayeh@9258 2426 + struct {
gokhlayeh@9258 2427 + /* Observed compressed size of the current Block */
gokhlayeh@9258 2428 + vli_type compressed;
gokhlayeh@9258 2429 +
gokhlayeh@9258 2430 + /* Observed uncompressed size of the current Block */
gokhlayeh@9258 2431 + vli_type uncompressed;
gokhlayeh@9258 2432 +
gokhlayeh@9258 2433 + /* Number of Blocks decoded so far */
gokhlayeh@9258 2434 + vli_type count;
gokhlayeh@9258 2435 +
gokhlayeh@9258 2436 + /*
gokhlayeh@9258 2437 + * Hash calculated from the Block sizes. This is used to
gokhlayeh@9258 2438 + * validate the Index field.
gokhlayeh@9258 2439 + */
gokhlayeh@9258 2440 + struct xz_dec_hash hash;
gokhlayeh@9258 2441 + } block;
gokhlayeh@9258 2442 +
gokhlayeh@9258 2443 + /* Variables needed when verifying the Index field */
gokhlayeh@9258 2444 + struct {
gokhlayeh@9258 2445 + /* Position in dec_index() */
gokhlayeh@9258 2446 + enum {
gokhlayeh@9258 2447 + SEQ_INDEX_COUNT,
gokhlayeh@9258 2448 + SEQ_INDEX_UNPADDED,
gokhlayeh@9258 2449 + SEQ_INDEX_UNCOMPRESSED
gokhlayeh@9258 2450 + } sequence;
gokhlayeh@9258 2451 +
gokhlayeh@9258 2452 + /* Size of the Index in bytes */
gokhlayeh@9258 2453 + vli_type size;
gokhlayeh@9258 2454 +
gokhlayeh@9258 2455 + /* Number of Records (matches block.count in valid files) */
gokhlayeh@9258 2456 + vli_type count;
gokhlayeh@9258 2457 +
gokhlayeh@9258 2458 + /*
gokhlayeh@9258 2459 + * Hash calculated from the Records (matches block.hash in
gokhlayeh@9258 2460 + * valid files).
gokhlayeh@9258 2461 + */
gokhlayeh@9258 2462 + struct xz_dec_hash hash;
gokhlayeh@9258 2463 + } index;
gokhlayeh@9258 2464 +
gokhlayeh@9258 2465 + /*
gokhlayeh@9258 2466 + * Temporary buffer needed to hold Stream Header, Block Header,
gokhlayeh@9258 2467 + * and Stream Footer. The Block Header is the biggest (1 KiB)
gokhlayeh@9258 2468 + * so we reserve space according to that. buf[] has to be aligned
gokhlayeh@9258 2469 + * to a multiple of four bytes; the size_t variables before it
gokhlayeh@9258 2470 + * should guarantee this.
gokhlayeh@9258 2471 + */
gokhlayeh@9258 2472 + struct {
gokhlayeh@9258 2473 + size_t pos;
gokhlayeh@9258 2474 + size_t size;
gokhlayeh@9258 2475 + uint8_t buf[1024];
gokhlayeh@9258 2476 + } temp;
gokhlayeh@9258 2477 +
gokhlayeh@9258 2478 + struct xz_dec_lzma2 *lzma2;
gokhlayeh@9258 2479 +
gokhlayeh@9258 2480 +#ifdef XZ_DEC_BCJ
gokhlayeh@9258 2481 + struct xz_dec_bcj *bcj;
gokhlayeh@9258 2482 + bool bcj_active;
gokhlayeh@9258 2483 +#endif
gokhlayeh@9258 2484 +};
gokhlayeh@9258 2485 +
gokhlayeh@9258 2486 +#ifdef XZ_DEC_ANY_CHECK
gokhlayeh@9258 2487 +/* Sizes of the Check field with different Check IDs */
gokhlayeh@9258 2488 +static const uint8_t check_sizes[16] = {
gokhlayeh@9258 2489 + 0,
gokhlayeh@9258 2490 + 4, 4, 4,
gokhlayeh@9258 2491 + 8, 8, 8,
gokhlayeh@9258 2492 + 16, 16, 16,
gokhlayeh@9258 2493 + 32, 32, 32,
gokhlayeh@9258 2494 + 64, 64, 64
gokhlayeh@9258 2495 +};
gokhlayeh@9258 2496 +#endif
gokhlayeh@9258 2497 +
gokhlayeh@9258 2498 +/*
gokhlayeh@9258 2499 + * Fill s->temp by copying data starting from b->in[b->in_pos]. Caller
gokhlayeh@9258 2500 + * must have set s->temp.size to indicate how much data we are supposed
gokhlayeh@9258 2501 + * to copy into s->temp.buf. Return true once s->temp.pos has reached
gokhlayeh@9258 2502 + * s->temp.size.
gokhlayeh@9258 2503 + */
gokhlayeh@9258 2504 +static bool fill_temp(struct xz_dec *s, struct xz_buf *b)
gokhlayeh@9258 2505 +{
gokhlayeh@9258 2506 + size_t copy_size = min_t(size_t,
gokhlayeh@9258 2507 + b->in_size - b->in_pos, s->temp.size - s->temp.pos);
gokhlayeh@9258 2508 +
gokhlayeh@9258 2509 + memcpy(s->temp.buf + s->temp.pos, b->in + b->in_pos, copy_size);
gokhlayeh@9258 2510 + b->in_pos += copy_size;
gokhlayeh@9258 2511 + s->temp.pos += copy_size;
gokhlayeh@9258 2512 +
gokhlayeh@9258 2513 + if (s->temp.pos == s->temp.size) {
gokhlayeh@9258 2514 + s->temp.pos = 0;
gokhlayeh@9258 2515 + return true;
gokhlayeh@9258 2516 + }
gokhlayeh@9258 2517 +
gokhlayeh@9258 2518 + return false;
gokhlayeh@9258 2519 +}
gokhlayeh@9258 2520 +
gokhlayeh@9258 2521 +/* Decode a variable-length integer (little-endian base-128 encoding) */
gokhlayeh@9258 2522 +static enum xz_ret dec_vli(struct xz_dec *s, const uint8_t *in,
gokhlayeh@9258 2523 + size_t *in_pos, size_t in_size)
gokhlayeh@9258 2524 +{
gokhlayeh@9258 2525 + uint8_t byte;
gokhlayeh@9258 2526 +
gokhlayeh@9258 2527 + if (s->pos == 0)
gokhlayeh@9258 2528 + s->vli = 0;
gokhlayeh@9258 2529 +
gokhlayeh@9258 2530 + while (*in_pos < in_size) {
gokhlayeh@9258 2531 + byte = in[*in_pos];
gokhlayeh@9258 2532 + ++*in_pos;
gokhlayeh@9258 2533 +
gokhlayeh@9258 2534 + s->vli |= (vli_type)(byte & 0x7F) << s->pos;
gokhlayeh@9258 2535 +
gokhlayeh@9258 2536 + if ((byte & 0x80) == 0) {
gokhlayeh@9258 2537 + /* Don't allow non-minimal encodings. */
gokhlayeh@9258 2538 + if (byte == 0 && s->pos != 0)
gokhlayeh@9258 2539 + return XZ_DATA_ERROR;
gokhlayeh@9258 2540 +
gokhlayeh@9258 2541 + s->pos = 0;
gokhlayeh@9258 2542 + return XZ_STREAM_END;
gokhlayeh@9258 2543 + }
gokhlayeh@9258 2544 +
gokhlayeh@9258 2545 + s->pos += 7;
gokhlayeh@9258 2546 + if (s->pos == 7 * VLI_BYTES_MAX)
gokhlayeh@9258 2547 + return XZ_DATA_ERROR;
gokhlayeh@9258 2548 + }
gokhlayeh@9258 2549 +
gokhlayeh@9258 2550 + return XZ_OK;
gokhlayeh@9258 2551 +}
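
dec_vli() is resumable: it keeps s->pos and s->vli across calls so that an integer split over two input buffers can still be decoded. For comparison, a one-shot sketch of the same encoding is shown below. The names are hypothetical, and the 9-byte limit is an assumption based on a 64-bit vli_type; VLI_BYTES_MAX itself is defined outside this excerpt.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed limit: a 63-bit value needs at most nine 7-bit groups. */
#define VLI_BYTES_MAX_SKETCH 9

/*
 * Decode one little-endian base-128 integer from in[0..in_size).
 * Returns the number of bytes consumed, or 0 if the input is
 * truncated, too long, or uses a non-minimal encoding.
 */
static size_t vli_decode_sketch(const uint8_t *in, size_t in_size,
				uint64_t *out)
{
	uint64_t value = 0;
	size_t i;

	for (i = 0; i < in_size && i < VLI_BYTES_MAX_SKETCH; ++i) {
		uint8_t byte = in[i];

		value |= (uint64_t)(byte & 0x7F) << (7 * i);

		if ((byte & 0x80) == 0) {
			/* A trailing zero byte means a non-minimal encoding. */
			if (byte == 0 && i != 0)
				return 0;

			*out = value;
			return i + 1;
		}
	}

	return 0;
}
```

Unlike the patch, this sketch cannot distinguish "more input needed" from "corrupt input"; the three-way XZ_OK / XZ_STREAM_END / XZ_DATA_ERROR result of dec_vli() exists precisely to make that distinction.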
gokhlayeh@9258 2552 +
gokhlayeh@9258 2553 +/*
gokhlayeh@9258 2554 + * Decode the Compressed Data field from a Block. Update and validate
gokhlayeh@9258 2555 + * the observed compressed and uncompressed sizes of the Block so that
gokhlayeh@9258 2556 + * they don't exceed the values possibly stored in the Block Header
gokhlayeh@9258 2557 + * (validation assumes that no integer overflow occurs, since vli_type
gokhlayeh@9258 2558 + * is normally uint64_t). Update the CRC32 if presence of the CRC32
gokhlayeh@9258 2559 + * field was indicated in Stream Header.
gokhlayeh@9258 2560 + *
gokhlayeh@9258 2561 + * Once the decoding is finished, validate that the observed sizes match
gokhlayeh@9258 2562 + * the sizes possibly stored in the Block Header. Update the hash and
gokhlayeh@9258 2563 + * Block count, which are later used to validate the Index field.
gokhlayeh@9258 2564 + */
gokhlayeh@9258 2565 +static enum xz_ret dec_block(struct xz_dec *s, struct xz_buf *b)
gokhlayeh@9258 2566 +{
gokhlayeh@9258 2567 + enum xz_ret ret;
gokhlayeh@9258 2568 +
gokhlayeh@9258 2569 + s->in_start = b->in_pos;
gokhlayeh@9258 2570 + s->out_start = b->out_pos;
gokhlayeh@9258 2571 +
gokhlayeh@9258 2572 +#ifdef XZ_DEC_BCJ
gokhlayeh@9258 2573 + if (s->bcj_active)
gokhlayeh@9258 2574 + ret = xz_dec_bcj_run(s->bcj, s->lzma2, b);
gokhlayeh@9258 2575 + else
gokhlayeh@9258 2576 +#endif
gokhlayeh@9258 2577 + ret = xz_dec_lzma2_run(s->lzma2, b);
gokhlayeh@9258 2578 +
gokhlayeh@9258 2579 + s->block.compressed += b->in_pos - s->in_start;
gokhlayeh@9258 2580 + s->block.uncompressed += b->out_pos - s->out_start;
gokhlayeh@9258 2581 +
gokhlayeh@9258 2582 + /*
gokhlayeh@9258 2583 + * There is no need to separately check for VLI_UNKNOWN, since
gokhlayeh@9258 2584 + * the observed sizes are always smaller than VLI_UNKNOWN.
gokhlayeh@9258 2585 + */
gokhlayeh@9258 2586 + if (s->block.compressed > s->block_header.compressed
gokhlayeh@9258 2587 + || s->block.uncompressed
gokhlayeh@9258 2588 + > s->block_header.uncompressed)
gokhlayeh@9258 2589 + return XZ_DATA_ERROR;
gokhlayeh@9258 2590 +
gokhlayeh@9258 2591 + if (s->check_type == XZ_CHECK_CRC32)
gokhlayeh@9258 2592 + s->crc32 = xz_crc32(b->out + s->out_start,
gokhlayeh@9258 2593 + b->out_pos - s->out_start, s->crc32);
gokhlayeh@9258 2594 +
gokhlayeh@9258 2595 + if (ret == XZ_STREAM_END) {
gokhlayeh@9258 2596 + if (s->block_header.compressed != VLI_UNKNOWN
gokhlayeh@9258 2597 + && s->block_header.compressed
gokhlayeh@9258 2598 + != s->block.compressed)
gokhlayeh@9258 2599 + return XZ_DATA_ERROR;
gokhlayeh@9258 2600 +
gokhlayeh@9258 2601 + if (s->block_header.uncompressed != VLI_UNKNOWN
gokhlayeh@9258 2602 + && s->block_header.uncompressed
gokhlayeh@9258 2603 + != s->block.uncompressed)
gokhlayeh@9258 2604 + return XZ_DATA_ERROR;
gokhlayeh@9258 2605 +
gokhlayeh@9258 2606 + s->block.hash.unpadded += s->block_header.size
gokhlayeh@9258 2607 + + s->block.compressed;
gokhlayeh@9258 2608 +
gokhlayeh@9258 2609 +#ifdef XZ_DEC_ANY_CHECK
gokhlayeh@9258 2610 + s->block.hash.unpadded += check_sizes[s->check_type];
gokhlayeh@9258 2611 +#else
gokhlayeh@9258 2612 + if (s->check_type == XZ_CHECK_CRC32)
gokhlayeh@9258 2613 + s->block.hash.unpadded += 4;
gokhlayeh@9258 2614 +#endif
gokhlayeh@9258 2615 +
gokhlayeh@9258 2616 + s->block.hash.uncompressed += s->block.uncompressed;
gokhlayeh@9258 2617 + s->block.hash.crc32 = xz_crc32(
gokhlayeh@9258 2618 + (const uint8_t *)&s->block.hash,
gokhlayeh@9258 2619 + sizeof(s->block.hash), s->block.hash.crc32);
gokhlayeh@9258 2620 +
gokhlayeh@9258 2621 + ++s->block.count;
gokhlayeh@9258 2622 + }
gokhlayeh@9258 2623 +
gokhlayeh@9258 2624 + return ret;
gokhlayeh@9258 2625 +}
gokhlayeh@9258 2626 +
gokhlayeh@9258 2627 +/* Update the Index size and the CRC32 value. */
gokhlayeh@9258 2628 +static void index_update(struct xz_dec *s, const struct xz_buf *b)
gokhlayeh@9258 2629 +{
gokhlayeh@9258 2630 + size_t in_used = b->in_pos - s->in_start;
gokhlayeh@9258 2631 + s->index.size += in_used;
gokhlayeh@9258 2632 + s->crc32 = xz_crc32(b->in + s->in_start, in_used, s->crc32);
gokhlayeh@9258 2633 +}
gokhlayeh@9258 2634 +
gokhlayeh@9258 2635 +/*
gokhlayeh@9258 2636 + * Decode the Number of Records, Unpadded Size, and Uncompressed Size
gokhlayeh@9258 2637 + * fields from the Index field. That is, Index Padding and CRC32 are not
gokhlayeh@9258 2638 + * decoded by this function.
gokhlayeh@9258 2639 + *
gokhlayeh@9258 2640 + * This can return XZ_OK (more input needed), XZ_STREAM_END (everything
gokhlayeh@9258 2641 + * successfully decoded), or XZ_DATA_ERROR (input is corrupt).
gokhlayeh@9258 2642 + */
gokhlayeh@9258 2643 +static enum xz_ret dec_index(struct xz_dec *s, struct xz_buf *b)
gokhlayeh@9258 2644 +{
gokhlayeh@9258 2645 + enum xz_ret ret;
gokhlayeh@9258 2646 +
gokhlayeh@9258 2647 + do {
gokhlayeh@9258 2648 + ret = dec_vli(s, b->in, &b->in_pos, b->in_size);
gokhlayeh@9258 2649 + if (ret != XZ_STREAM_END) {
gokhlayeh@9258 2650 + index_update(s, b);
gokhlayeh@9258 2651 + return ret;
gokhlayeh@9258 2652 + }
gokhlayeh@9258 2653 +
gokhlayeh@9258 2654 + switch (s->index.sequence) {
gokhlayeh@9258 2655 + case SEQ_INDEX_COUNT:
gokhlayeh@9258 2656 + s->index.count = s->vli;
gokhlayeh@9258 2657 +
gokhlayeh@9258 2658 + /*
gokhlayeh@9258 2659 + * Validate that the Number of Records field
gokhlayeh@9258 2660 + * indicates the same number of Records as
gokhlayeh@9258 2661 + * there were Blocks in the Stream.
gokhlayeh@9258 2662 + */
gokhlayeh@9258 2663 + if (s->index.count != s->block.count)
gokhlayeh@9258 2664 + return XZ_DATA_ERROR;
gokhlayeh@9258 2665 +
gokhlayeh@9258 2666 + s->index.sequence = SEQ_INDEX_UNPADDED;
gokhlayeh@9258 2667 + break;
gokhlayeh@9258 2668 +
gokhlayeh@9258 2669 + case SEQ_INDEX_UNPADDED:
gokhlayeh@9258 2670 + s->index.hash.unpadded += s->vli;
gokhlayeh@9258 2671 + s->index.sequence = SEQ_INDEX_UNCOMPRESSED;
gokhlayeh@9258 2672 + break;
gokhlayeh@9258 2673 +
gokhlayeh@9258 2674 + case SEQ_INDEX_UNCOMPRESSED:
gokhlayeh@9258 2675 + s->index.hash.uncompressed += s->vli;
gokhlayeh@9258 2676 + s->index.hash.crc32 = xz_crc32(
gokhlayeh@9258 2677 + (const uint8_t *)&s->index.hash,
gokhlayeh@9258 2678 + sizeof(s->index.hash),
gokhlayeh@9258 2679 + s->index.hash.crc32);
gokhlayeh@9258 2680 + --s->index.count;
gokhlayeh@9258 2681 + s->index.sequence = SEQ_INDEX_UNPADDED;
gokhlayeh@9258 2682 + break;
gokhlayeh@9258 2683 + }
gokhlayeh@9258 2684 + } while (s->index.count > 0);
gokhlayeh@9258 2685 +
gokhlayeh@9258 2686 + return XZ_STREAM_END;
gokhlayeh@9258 2687 +}
gokhlayeh@9258 2688 +
gokhlayeh@9258 2689 +/*
gokhlayeh@9258 2690 + * Validate that the next four input bytes match the value of s->crc32.
gokhlayeh@9258 2691 + * s->pos must be zero when starting to validate the first byte.
gokhlayeh@9258 2692 + */
gokhlayeh@9258 2693 +static enum xz_ret crc32_validate(struct xz_dec *s, struct xz_buf *b)
gokhlayeh@9258 2694 +{
gokhlayeh@9258 2695 + do {
gokhlayeh@9258 2696 + if (b->in_pos == b->in_size)
gokhlayeh@9258 2697 + return XZ_OK;
gokhlayeh@9258 2698 +
gokhlayeh@9258 2699 + if (((s->crc32 >> s->pos) & 0xFF) != b->in[b->in_pos++])
gokhlayeh@9258 2700 + return XZ_DATA_ERROR;
gokhlayeh@9258 2701 +
gokhlayeh@9258 2702 + s->pos += 8;
gokhlayeh@9258 2703 +
gokhlayeh@9258 2704 + } while (s->pos < 32);
gokhlayeh@9258 2705 +
gokhlayeh@9258 2706 + s->crc32 = 0;
gokhlayeh@9258 2707 + s->pos = 0;
gokhlayeh@9258 2708 +
gokhlayeh@9258 2709 + return XZ_STREAM_END;
gokhlayeh@9258 2710 +}
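
crc32_validate() compares the stored CRC32 one byte at a time, least significant byte first, so it never needs all four bytes in the buffer at once. A non-resumable sketch of the same comparison follows; the function name is hypothetical and it assumes all four bytes are already available.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Check that four input bytes hold crc in little-endian order, using
 * the same shift-and-mask comparison as crc32_validate().
 */
static bool crc32_matches_sketch(uint32_t crc, const uint8_t bytes[4])
{
	uint32_t pos;

	for (pos = 0; pos < 32; pos += 8) {
		if (((crc >> pos) & 0xFF) != bytes[pos / 8])
			return false;
	}

	return true;
}
```

In the patch the loop counter lives in s->pos, which lets the function return XZ_OK when input runs out and continue from the same byte on the next call.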
gokhlayeh@9258 2711 +
gokhlayeh@9258 2712 +#ifdef XZ_DEC_ANY_CHECK
gokhlayeh@9258 2713 +/*
gokhlayeh@9258 2714 + * Skip over the Check field when the Check ID is not supported.
gokhlayeh@9258 2715 + * Returns true once the whole Check field has been skipped over.
gokhlayeh@9258 2716 + */
gokhlayeh@9258 2717 +static bool check_skip(struct xz_dec *s, struct xz_buf *b)
gokhlayeh@9258 2718 +{
gokhlayeh@9258 2719 + while (s->pos < check_sizes[s->check_type]) {
gokhlayeh@9258 2720 + if (b->in_pos == b->in_size)
gokhlayeh@9258 2721 + return false;
gokhlayeh@9258 2722 +
gokhlayeh@9258 2723 + ++b->in_pos;
gokhlayeh@9258 2724 + ++s->pos;
gokhlayeh@9258 2725 + }
gokhlayeh@9258 2726 +
gokhlayeh@9258 2727 + s->pos = 0;
gokhlayeh@9258 2728 +
gokhlayeh@9258 2729 + return true;
gokhlayeh@9258 2730 +}
gokhlayeh@9258 2731 +#endif
gokhlayeh@9258 2732 +
gokhlayeh@9258 2733 +/* Decode the Stream Header field (the first 12 bytes of the .xz Stream). */
gokhlayeh@9258 2734 +static enum xz_ret dec_stream_header(struct xz_dec *s)
gokhlayeh@9258 2735 +{
gokhlayeh@9258 2736 + if (!memeq(s->temp.buf, HEADER_MAGIC, HEADER_MAGIC_SIZE))
gokhlayeh@9258 2737 + return XZ_FORMAT_ERROR;
gokhlayeh@9258 2738 +
gokhlayeh@9258 2739 + if (xz_crc32(s->temp.buf + HEADER_MAGIC_SIZE, 2, 0)
gokhlayeh@9258 2740 + != get_le32(s->temp.buf + HEADER_MAGIC_SIZE + 2))
gokhlayeh@9258 2741 + return XZ_DATA_ERROR;
gokhlayeh@9258 2742 +
gokhlayeh@9258 2743 + if (s->temp.buf[HEADER_MAGIC_SIZE] != 0)
gokhlayeh@9258 2744 + return XZ_OPTIONS_ERROR;
gokhlayeh@9258 2745 +
gokhlayeh@9258 2746 + /*
gokhlayeh@9258 2747 + * Of integrity checks, we support only none (Check ID = 0) and
gokhlayeh@9258 2748 + * CRC32 (Check ID = 1). However, if XZ_DEC_ANY_CHECK is defined,
gokhlayeh@9258 2749 + * we will accept other check types too, but then the check won't
gokhlayeh@9258 2750 + * be verified and a warning (XZ_UNSUPPORTED_CHECK) will be given.
gokhlayeh@9258 2751 + */
gokhlayeh@9258 2752 + s->check_type = s->temp.buf[HEADER_MAGIC_SIZE + 1];
gokhlayeh@9258 2753 +
gokhlayeh@9258 2754 +#ifdef XZ_DEC_ANY_CHECK
gokhlayeh@9258 2755 + if (s->check_type > XZ_CHECK_MAX)
gokhlayeh@9258 2756 + return XZ_OPTIONS_ERROR;
gokhlayeh@9258 2757 +
gokhlayeh@9258 2758 + if (s->check_type > XZ_CHECK_CRC32)
gokhlayeh@9258 2759 + return XZ_UNSUPPORTED_CHECK;
gokhlayeh@9258 2760 +#else
gokhlayeh@9258 2761 + if (s->check_type > XZ_CHECK_CRC32)
gokhlayeh@9258 2762 + return XZ_OPTIONS_ERROR;
gokhlayeh@9258 2763 +#endif
gokhlayeh@9258 2764 +
gokhlayeh@9258 2765 + return XZ_OK;
gokhlayeh@9258 2766 +}
gokhlayeh@9258 2767 +
gokhlayeh@9258 2768 +/* Decode the Stream Footer field (the last 12 bytes of the .xz Stream) */
gokhlayeh@9258 2769 +static enum xz_ret dec_stream_footer(struct xz_dec *s)
gokhlayeh@9258 2770 +{
gokhlayeh@9258 2771 + if (!memeq(s->temp.buf + 10, FOOTER_MAGIC, FOOTER_MAGIC_SIZE))
gokhlayeh@9258 2772 + return XZ_DATA_ERROR;
gokhlayeh@9258 2773 +
gokhlayeh@9258 2774 + if (xz_crc32(s->temp.buf + 4, 6, 0) != get_le32(s->temp.buf))
gokhlayeh@9258 2775 + return XZ_DATA_ERROR;
gokhlayeh@9258 2776 +
gokhlayeh@9258 2777 + /*
gokhlayeh@9258 2778 + * Validate Backward Size. Note that we never added the size of the
gokhlayeh@9258 2779 + * Index CRC32 field to s->index.size, thus we use s->index.size / 4
gokhlayeh@9258 2780 + * instead of s->index.size / 4 - 1.
gokhlayeh@9258 2781 + */
gokhlayeh@9258 2782 + if ((s->index.size >> 2) != get_le32(s->temp.buf + 4))
gokhlayeh@9258 2783 + return XZ_DATA_ERROR;
gokhlayeh@9258 2784 +
gokhlayeh@9258 2785 + if (s->temp.buf[8] != 0 || s->temp.buf[9] != s->check_type)
gokhlayeh@9258 2786 + return XZ_DATA_ERROR;
gokhlayeh@9258 2787 +
gokhlayeh@9258 2788 + /*
gokhlayeh@9258 2789 + * Use XZ_STREAM_END instead of XZ_OK to be more convenient
gokhlayeh@9258 2790 + * for the caller.
gokhlayeh@9258 2791 + */
gokhlayeh@9258 2792 + return XZ_STREAM_END;
gokhlayeh@9258 2793 +}
gokhlayeh@9258 2794 +
gokhlayeh@9258 2795 +/* Decode the Block Header and initialize the filter chain. */
gokhlayeh@9258 2796 +static enum xz_ret dec_block_header(struct xz_dec *s)
gokhlayeh@9258 2797 +{
gokhlayeh@9258 2798 + enum xz_ret ret;
gokhlayeh@9258 2799 +
gokhlayeh@9258 2800 + /*
gokhlayeh@9258 2801 + * Validate the CRC32. We know that the temp buffer is at least
gokhlayeh@9258 2802 + * eight bytes so this is safe.
gokhlayeh@9258 2803 + */
gokhlayeh@9258 2804 + s->temp.size -= 4;
gokhlayeh@9258 2805 + if (xz_crc32(s->temp.buf, s->temp.size, 0)
gokhlayeh@9258 2806 + != get_le32(s->temp.buf + s->temp.size))
gokhlayeh@9258 2807 + return XZ_DATA_ERROR;
gokhlayeh@9258 2808 +
gokhlayeh@9258 2809 + s->temp.pos = 2;
gokhlayeh@9258 2810 +
gokhlayeh@9258 2811 + /*
gokhlayeh@9258 2812 + * Catch unsupported Block Flags. We support only one or two filters
gokhlayeh@9258 2813 + * in the chain, so we catch that with the same test.
gokhlayeh@9258 2814 + */
gokhlayeh@9258 2815 +#ifdef XZ_DEC_BCJ
gokhlayeh@9258 2816 + if (s->temp.buf[1] & 0x3E)
gokhlayeh@9258 2817 +#else
gokhlayeh@9258 2818 + if (s->temp.buf[1] & 0x3F)
gokhlayeh@9258 2819 +#endif
gokhlayeh@9258 2820 + return XZ_OPTIONS_ERROR;
gokhlayeh@9258 2821 +
gokhlayeh@9258 2822 + /* Compressed Size */
gokhlayeh@9258 2823 + if (s->temp.buf[1] & 0x40) {
gokhlayeh@9258 2824 + if (dec_vli(s, s->temp.buf, &s->temp.pos, s->temp.size)
gokhlayeh@9258 2825 + != XZ_STREAM_END)
gokhlayeh@9258 2826 + return XZ_DATA_ERROR;
gokhlayeh@9258 2827 +
gokhlayeh@9258 2828 + s->block_header.compressed = s->vli;
gokhlayeh@9258 2829 + } else {
gokhlayeh@9258 2830 + s->block_header.compressed = VLI_UNKNOWN;
gokhlayeh@9258 2831 + }
gokhlayeh@9258 2832 +
gokhlayeh@9258 2833 + /* Uncompressed Size */
gokhlayeh@9258 2834 + if (s->temp.buf[1] & 0x80) {
gokhlayeh@9258 2835 + if (dec_vli(s, s->temp.buf, &s->temp.pos, s->temp.size)
gokhlayeh@9258 2836 + != XZ_STREAM_END)
gokhlayeh@9258 2837 + return XZ_DATA_ERROR;
gokhlayeh@9258 2838 +
gokhlayeh@9258 2839 + s->block_header.uncompressed = s->vli;
gokhlayeh@9258 2840 + } else {
gokhlayeh@9258 2841 + s->block_header.uncompressed = VLI_UNKNOWN;
gokhlayeh@9258 2842 + }
gokhlayeh@9258 2843 +
gokhlayeh@9258 2844 +#ifdef XZ_DEC_BCJ
gokhlayeh@9258 2845 + /* If there are two filters, the first one must be a BCJ filter. */
gokhlayeh@9258 2846 + s->bcj_active = s->temp.buf[1] & 0x01;
gokhlayeh@9258 2847 + if (s->bcj_active) {
gokhlayeh@9258 2848 + if (s->temp.size - s->temp.pos < 2)
gokhlayeh@9258 2849 + return XZ_OPTIONS_ERROR;
gokhlayeh@9258 2850 +
gokhlayeh@9258 2851 + ret = xz_dec_bcj_reset(s->bcj, s->temp.buf[s->temp.pos++]);
gokhlayeh@9258 2852 + if (ret != XZ_OK)
gokhlayeh@9258 2853 + return ret;
gokhlayeh@9258 2854 +
gokhlayeh@9258 2855 + /*
gokhlayeh@9258 2856 + * We don't support custom start offset,
gokhlayeh@9258 2857 + * so Size of Properties must be zero.
gokhlayeh@9258 2858 + */
gokhlayeh@9258 2859 + if (s->temp.buf[s->temp.pos++] != 0x00)
gokhlayeh@9258 2860 + return XZ_OPTIONS_ERROR;
gokhlayeh@9258 2861 + }
gokhlayeh@9258 2862 +#endif
gokhlayeh@9258 2863 +
gokhlayeh@9258 2864 + /* Valid Filter Flags always take at least two bytes. */
gokhlayeh@9258 2865 + if (s->temp.size - s->temp.pos < 2)
gokhlayeh@9258 2866 + return XZ_DATA_ERROR;
gokhlayeh@9258 2867 +
gokhlayeh@9258 2868 + /* Filter ID = LZMA2 */
gokhlayeh@9258 2869 + if (s->temp.buf[s->temp.pos++] != 0x21)
gokhlayeh@9258 2870 + return XZ_OPTIONS_ERROR;
gokhlayeh@9258 2871 +
gokhlayeh@9258 2872 + /* Size of Properties = 1-byte Filter Properties */
gokhlayeh@9258 2873 + if (s->temp.buf[s->temp.pos++] != 0x01)
gokhlayeh@9258 2874 + return XZ_OPTIONS_ERROR;
gokhlayeh@9258 2875 +
gokhlayeh@9258 2876 + /* Filter Properties contains LZMA2 dictionary size. */
gokhlayeh@9258 2877 + if (s->temp.size - s->temp.pos < 1)
gokhlayeh@9258 2878 + return XZ_DATA_ERROR;
gokhlayeh@9258 2879 +
gokhlayeh@9258 2880 + ret = xz_dec_lzma2_reset(s->lzma2, s->temp.buf[s->temp.pos++]);
gokhlayeh@9258 2881 + if (ret != XZ_OK)
gokhlayeh@9258 2882 + return ret;
gokhlayeh@9258 2883 +
gokhlayeh@9258 2884 + /* The rest must be Header Padding. */
gokhlayeh@9258 2885 + while (s->temp.pos < s->temp.size)
gokhlayeh@9258 2886 + if (s->temp.buf[s->temp.pos++] != 0x00)
gokhlayeh@9258 2887 + return XZ_OPTIONS_ERROR;
gokhlayeh@9258 2888 +
gokhlayeh@9258 2889 + s->temp.pos = 0;
gokhlayeh@9258 2890 + s->block.compressed = 0;
gokhlayeh@9258 2891 + s->block.uncompressed = 0;
gokhlayeh@9258 2892 +
gokhlayeh@9258 2893 + return XZ_OK;
gokhlayeh@9258 2894 +}
gokhlayeh@9258 2895 +
gokhlayeh@9258 2896 +static enum xz_ret dec_main(struct xz_dec *s, struct xz_buf *b)
gokhlayeh@9258 2897 +{
gokhlayeh@9258 2898 + enum xz_ret ret;
gokhlayeh@9258 2899 +
gokhlayeh@9258 2900 + /*
gokhlayeh@9258 2901 + * Store the start position for the case when we are in the middle
gokhlayeh@9258 2902 + * of the Index field.
gokhlayeh@9258 2903 + */
gokhlayeh@9258 2904 + s->in_start = b->in_pos;
gokhlayeh@9258 2905 +
gokhlayeh@9258 2906 + while (true) {
gokhlayeh@9258 2907 + switch (s->sequence) {
gokhlayeh@9258 2908 + case SEQ_STREAM_HEADER:
gokhlayeh@9258 2909 + /*
gokhlayeh@9258 2910 + * Stream Header is copied to s->temp, and then
gokhlayeh@9258 2911 + * decoded from there. This way if the caller
gokhlayeh@9258 2912 + * gives us only a little input at a time, we can
gokhlayeh@9258 2913 + * still keep the Stream Header decoding code
gokhlayeh@9258 2914 + * simple. A similar approach is used in many places
gokhlayeh@9258 2915 + * in this file.
gokhlayeh@9258 2916 + */
gokhlayeh@9258 2917 + if (!fill_temp(s, b))
gokhlayeh@9258 2918 + return XZ_OK;
gokhlayeh@9258 2919 +
gokhlayeh@9258 2920 + /*
gokhlayeh@9258 2921 + * If dec_stream_header() returns
gokhlayeh@9258 2922 + * XZ_UNSUPPORTED_CHECK, it is still possible
gokhlayeh@9258 2923 + * to continue decoding if working in multi-call
gokhlayeh@9258 2924 + * mode. Thus, update s->sequence before calling
gokhlayeh@9258 2925 + * dec_stream_header().
gokhlayeh@9258 2926 + */
gokhlayeh@9258 2927 + s->sequence = SEQ_BLOCK_START;
gokhlayeh@9258 2928 +
gokhlayeh@9258 2929 + ret = dec_stream_header(s);
gokhlayeh@9258 2930 + if (ret != XZ_OK)
gokhlayeh@9258 2931 + return ret;
gokhlayeh@9258 2932 +
gokhlayeh@9258 2933 + case SEQ_BLOCK_START:
gokhlayeh@9258 2934 + /* We need one byte of input to continue. */
gokhlayeh@9258 2935 + if (b->in_pos == b->in_size)
gokhlayeh@9258 2936 + return XZ_OK;
gokhlayeh@9258 2937 +
gokhlayeh@9258 2938 + /* See if this is the beginning of the Index field. */
gokhlayeh@9258 2939 + if (b->in[b->in_pos] == 0) {
gokhlayeh@9258 2940 + s->in_start = b->in_pos++;
gokhlayeh@9258 2941 + s->sequence = SEQ_INDEX;
gokhlayeh@9258 2942 + break;
gokhlayeh@9258 2943 + }
gokhlayeh@9258 2944 +
gokhlayeh@9258 2945 + /*
gokhlayeh@9258 2946 + * Calculate the size of the Block Header and
gokhlayeh@9258 2947 + * prepare to decode it.
gokhlayeh@9258 2948 + */
gokhlayeh@9258 2949 + s->block_header.size
gokhlayeh@9258 2950 + = ((uint32_t)b->in[b->in_pos] + 1) * 4;
gokhlayeh@9258 2951 +
gokhlayeh@9258 2952 + s->temp.size = s->block_header.size;
gokhlayeh@9258 2953 + s->temp.pos = 0;
gokhlayeh@9258 2954 + s->sequence = SEQ_BLOCK_HEADER;
gokhlayeh@9258 2955 +
gokhlayeh@9258 2956 + case SEQ_BLOCK_HEADER:
gokhlayeh@9258 2957 + if (!fill_temp(s, b))
gokhlayeh@9258 2958 + return XZ_OK;
gokhlayeh@9258 2959 +
gokhlayeh@9258 2960 + ret = dec_block_header(s);
gokhlayeh@9258 2961 + if (ret != XZ_OK)
gokhlayeh@9258 2962 + return ret;
gokhlayeh@9258 2963 +
gokhlayeh@9258 2964 + s->sequence = SEQ_BLOCK_UNCOMPRESS;
gokhlayeh@9258 2965 +
gokhlayeh@9258 2966 + case SEQ_BLOCK_UNCOMPRESS:
gokhlayeh@9258 2967 + ret = dec_block(s, b);
gokhlayeh@9258 2968 + if (ret != XZ_STREAM_END)
gokhlayeh@9258 2969 + return ret;
gokhlayeh@9258 2970 +
gokhlayeh@9258 2971 + s->sequence = SEQ_BLOCK_PADDING;
gokhlayeh@9258 2972 +
gokhlayeh@9258 2973 + case SEQ_BLOCK_PADDING:
gokhlayeh@9258 2974 + /*
gokhlayeh@9258 2975 + * Size of Compressed Data + Block Padding
gokhlayeh@9258 2976 + * must be a multiple of four. We don't need
gokhlayeh@9258 2977 + * s->block.compressed for anything else
gokhlayeh@9258 2978 + * anymore, so we use it here to test the size
gokhlayeh@9258 2979 + * of the Block Padding field.
gokhlayeh@9258 2980 + */
gokhlayeh@9258 2981 + while (s->block.compressed & 3) {
gokhlayeh@9258 2982 + if (b->in_pos == b->in_size)
gokhlayeh@9258 2983 + return XZ_OK;
gokhlayeh@9258 2984 +
gokhlayeh@9258 2985 + if (b->in[b->in_pos++] != 0)
gokhlayeh@9258 2986 + return XZ_DATA_ERROR;
gokhlayeh@9258 2987 +
gokhlayeh@9258 2988 + ++s->block.compressed;
gokhlayeh@9258 2989 + }
gokhlayeh@9258 2990 +
gokhlayeh@9258 2991 + s->sequence = SEQ_BLOCK_CHECK;
gokhlayeh@9258 2992 +
gokhlayeh@9258 2993 + case SEQ_BLOCK_CHECK:
gokhlayeh@9258 2994 + if (s->check_type == XZ_CHECK_CRC32) {
gokhlayeh@9258 2995 + ret = crc32_validate(s, b);
gokhlayeh@9258 2996 + if (ret != XZ_STREAM_END)
gokhlayeh@9258 2997 + return ret;
gokhlayeh@9258 2998 + }
gokhlayeh@9258 2999 +#ifdef XZ_DEC_ANY_CHECK
gokhlayeh@9258 3000 + else if (!check_skip(s, b)) {
gokhlayeh@9258 3001 + return XZ_OK;
gokhlayeh@9258 3002 + }
gokhlayeh@9258 3003 +#endif
gokhlayeh@9258 3004 +
gokhlayeh@9258 3005 + s->sequence = SEQ_BLOCK_START;
gokhlayeh@9258 3006 + break;
gokhlayeh@9258 3007 +
gokhlayeh@9258 3008 + case SEQ_INDEX:
gokhlayeh@9258 3009 + ret = dec_index(s, b);
gokhlayeh@9258 3010 + if (ret != XZ_STREAM_END)
gokhlayeh@9258 3011 + return ret;
gokhlayeh@9258 3012 +
gokhlayeh@9258 3013 + s->sequence = SEQ_INDEX_PADDING;
gokhlayeh@9258 3014 +
gokhlayeh@9258 3015 + case SEQ_INDEX_PADDING:
gokhlayeh@9258 3016 + while ((s->index.size + (b->in_pos - s->in_start))
gokhlayeh@9258 3017 + & 3) {
gokhlayeh@9258 3018 + if (b->in_pos == b->in_size) {
gokhlayeh@9258 3019 + index_update(s, b);
gokhlayeh@9258 3020 + return XZ_OK;
gokhlayeh@9258 3021 + }
gokhlayeh@9258 3022 +
gokhlayeh@9258 3023 + if (b->in[b->in_pos++] != 0)
gokhlayeh@9258 3024 + return XZ_DATA_ERROR;
gokhlayeh@9258 3025 + }
gokhlayeh@9258 3026 +
gokhlayeh@9258 3027 + /* Finish the CRC32 value and Index size. */
gokhlayeh@9258 3028 + index_update(s, b);
gokhlayeh@9258 3029 +
gokhlayeh@9258 3030 + /* Compare the hashes to validate the Index field. */
gokhlayeh@9258 3031 + if (!memeq(&s->block.hash, &s->index.hash,
gokhlayeh@9258 3032 + sizeof(s->block.hash)))
gokhlayeh@9258 3033 + return XZ_DATA_ERROR;
gokhlayeh@9258 3034 +
gokhlayeh@9258 3035 + s->sequence = SEQ_INDEX_CRC32;
gokhlayeh@9258 3036 +
gokhlayeh@9258 3037 + case SEQ_INDEX_CRC32:
gokhlayeh@9258 3038 + ret = crc32_validate(s, b);
gokhlayeh@9258 3039 + if (ret != XZ_STREAM_END)
gokhlayeh@9258 3040 + return ret;
gokhlayeh@9258 3041 +
gokhlayeh@9258 3042 + s->temp.size = STREAM_HEADER_SIZE;
gokhlayeh@9258 3043 + s->sequence = SEQ_STREAM_FOOTER;
gokhlayeh@9258 3044 +
gokhlayeh@9258 3045 + case SEQ_STREAM_FOOTER:
gokhlayeh@9258 3046 + if (!fill_temp(s, b))
gokhlayeh@9258 3047 + return XZ_OK;
gokhlayeh@9258 3048 +
gokhlayeh@9258 3049 + return dec_stream_footer(s);
gokhlayeh@9258 3050 + }
gokhlayeh@9258 3051 + }
gokhlayeh@9258 3052 +
gokhlayeh@9258 3053 + /* Never reached */
gokhlayeh@9258 3054 +}
gokhlayeh@9258 3055 +
gokhlayeh@9258 3056 +/*
gokhlayeh@9258 3057 + * xz_dec_run() is a wrapper for dec_main() to handle some special cases in
gokhlayeh@9258 3058 + * multi-call and single-call decoding.
gokhlayeh@9258 3059 + *
gokhlayeh@9258 3060 + * In multi-call mode, we must return XZ_BUF_ERROR when it seems clear that we
gokhlayeh@9258 3061 + * are not going to make any progress anymore. This is to prevent the caller
gokhlayeh@9258 3062 + * from calling us infinitely when the input file is truncated or otherwise
gokhlayeh@9258 3063 + * corrupt. Since the zlib-style API allows the caller to fill the input buffer
gokhlayeh@9258 3064 + * only when the decoder doesn't produce any new output, we have to be careful
gokhlayeh@9258 3065 + * to avoid returning XZ_BUF_ERROR too easily: XZ_BUF_ERROR is returned only
gokhlayeh@9258 3066 + * after the second consecutive call to xz_dec_run() that makes no progress.
gokhlayeh@9258 3067 + *
gokhlayeh@9258 3068 + * In single-call mode, if we couldn't decode everything and no error
gokhlayeh@9258 3069 + * occurred, either the input is truncated or the output buffer is too small.
gokhlayeh@9258 3070 + * Since we know that the last input byte never produces any output, we know
gokhlayeh@9258 3071 + * that if all the input was consumed and decoding wasn't finished, the file
gokhlayeh@9258 3072 + * must be corrupt. Otherwise the output buffer has to be too small or the
gokhlayeh@9258 3073 + * file is corrupt in a way that decoding it produces too big output.
gokhlayeh@9258 3074 + *
gokhlayeh@9258 3075 + * If single-call decoding fails, we reset b->in_pos and b->out_pos back to
gokhlayeh@9258 3076 + * their original values. This is because with some filter chains there won't
gokhlayeh@9258 3077 + * be any valid uncompressed data in the output buffer unless the decoding
gokhlayeh@9258 3078 + * actually succeeds (that's the price to pay for using the output buffer as
gokhlayeh@9258 3079 + * the workspace).
gokhlayeh@9258 3080 + */
gokhlayeh@9258 3081 +XZ_EXTERN enum xz_ret xz_dec_run(struct xz_dec *s, struct xz_buf *b)
gokhlayeh@9258 3082 +{
gokhlayeh@9258 3083 + size_t in_start;
gokhlayeh@9258 3084 + size_t out_start;
gokhlayeh@9258 3085 + enum xz_ret ret;
gokhlayeh@9258 3086 +
gokhlayeh@9258 3087 + if (DEC_IS_SINGLE(s->mode))
gokhlayeh@9258 3088 + xz_dec_reset(s);
gokhlayeh@9258 3089 +
gokhlayeh@9258 3090 + in_start = b->in_pos;
gokhlayeh@9258 3091 + out_start = b->out_pos;
gokhlayeh@9258 3092 + ret = dec_main(s, b);
gokhlayeh@9258 3093 +
gokhlayeh@9258 3094 + if (DEC_IS_SINGLE(s->mode)) {
gokhlayeh@9258 3095 + if (ret == XZ_OK)
gokhlayeh@9258 3096 + ret = b->in_pos == b->in_size
gokhlayeh@9258 3097 + ? XZ_DATA_ERROR : XZ_BUF_ERROR;
gokhlayeh@9258 3098 +
gokhlayeh@9258 3099 + if (ret != XZ_STREAM_END) {
gokhlayeh@9258 3100 + b->in_pos = in_start;
gokhlayeh@9258 3101 + b->out_pos = out_start;
gokhlayeh@9258 3102 + }
gokhlayeh@9258 3103 +
gokhlayeh@9258 3104 + } else if (ret == XZ_OK && in_start == b->in_pos
gokhlayeh@9258 3105 + && out_start == b->out_pos) {
gokhlayeh@9258 3106 + if (s->allow_buf_error)
gokhlayeh@9258 3107 + ret = XZ_BUF_ERROR;
gokhlayeh@9258 3108 +
gokhlayeh@9258 3109 + s->allow_buf_error = true;
gokhlayeh@9258 3110 + } else {
gokhlayeh@9258 3111 + s->allow_buf_error = false;
gokhlayeh@9258 3112 + }
gokhlayeh@9258 3113 +
gokhlayeh@9258 3114 + return ret;
gokhlayeh@9258 3115 +}
gokhlayeh@9258 3116 +
gokhlayeh@9258 3117 +XZ_EXTERN struct xz_dec *xz_dec_init(enum xz_mode mode, uint32_t dict_max)
gokhlayeh@9258 3118 +{
gokhlayeh@9258 3119 + struct xz_dec *s = kmalloc(sizeof(*s), GFP_KERNEL);
gokhlayeh@9258 3120 + if (s == NULL)
gokhlayeh@9258 3121 + return NULL;
gokhlayeh@9258 3122 +
gokhlayeh@9258 3123 + s->mode = mode;
gokhlayeh@9258 3124 +
gokhlayeh@9258 3125 +#ifdef XZ_DEC_BCJ
gokhlayeh@9258 3126 + s->bcj = xz_dec_bcj_create(DEC_IS_SINGLE(mode));
gokhlayeh@9258 3127 + if (s->bcj == NULL)
gokhlayeh@9258 3128 + goto error_bcj;
gokhlayeh@9258 3129 +#endif
gokhlayeh@9258 3130 +
gokhlayeh@9258 3131 + s->lzma2 = xz_dec_lzma2_create(mode, dict_max);
gokhlayeh@9258 3132 + if (s->lzma2 == NULL)
gokhlayeh@9258 3133 + goto error_lzma2;
gokhlayeh@9258 3134 +
gokhlayeh@9258 3135 + xz_dec_reset(s);
gokhlayeh@9258 3136 + return s;
gokhlayeh@9258 3137 +
gokhlayeh@9258 3138 +error_lzma2:
gokhlayeh@9258 3139 +#ifdef XZ_DEC_BCJ
gokhlayeh@9258 3140 + xz_dec_bcj_end(s->bcj);
gokhlayeh@9258 3141 +error_bcj:
gokhlayeh@9258 3142 +#endif
gokhlayeh@9258 3143 + kfree(s);
gokhlayeh@9258 3144 + return NULL;
gokhlayeh@9258 3145 +}
gokhlayeh@9258 3146 +
gokhlayeh@9258 3147 +XZ_EXTERN void xz_dec_reset(struct xz_dec *s)
gokhlayeh@9258 3148 +{
gokhlayeh@9258 3149 + s->sequence = SEQ_STREAM_HEADER;
gokhlayeh@9258 3150 + s->allow_buf_error = false;
gokhlayeh@9258 3151 + s->pos = 0;
gokhlayeh@9258 3152 + s->crc32 = 0;
gokhlayeh@9258 3153 + memzero(&s->block, sizeof(s->block));
gokhlayeh@9258 3154 + memzero(&s->index, sizeof(s->index));
gokhlayeh@9258 3155 + s->temp.pos = 0;
gokhlayeh@9258 3156 + s->temp.size = STREAM_HEADER_SIZE;
gokhlayeh@9258 3157 +}
gokhlayeh@9258 3158 +
gokhlayeh@9258 3159 +XZ_EXTERN void xz_dec_end(struct xz_dec *s)
gokhlayeh@9258 3160 +{
gokhlayeh@9258 3161 + if (s != NULL) {
gokhlayeh@9258 3162 + xz_dec_lzma2_end(s->lzma2);
gokhlayeh@9258 3163 +#ifdef XZ_DEC_BCJ
gokhlayeh@9258 3164 + xz_dec_bcj_end(s->bcj);
gokhlayeh@9258 3165 +#endif
gokhlayeh@9258 3166 + kfree(s);
gokhlayeh@9258 3167 + }
gokhlayeh@9258 3168 +}
gokhlayeh@9258 3169 diff --git a/lib/xz/xz_dec_syms.c b/lib/xz/xz_dec_syms.c
gokhlayeh@9258 3170 new file mode 100644
gokhlayeh@9258 3171 index 0000000..32eb3c0
gokhlayeh@9258 3172 --- /dev/null
gokhlayeh@9258 3173 +++ b/lib/xz/xz_dec_syms.c
gokhlayeh@9258 3174 @@ -0,0 +1,26 @@
gokhlayeh@9258 3175 +/*
gokhlayeh@9258 3176 + * XZ decoder module information
gokhlayeh@9258 3177 + *
gokhlayeh@9258 3178 + * Author: Lasse Collin <lasse.collin@tukaani.org>
gokhlayeh@9258 3179 + *
gokhlayeh@9258 3180 + * This file has been put into the public domain.
gokhlayeh@9258 3181 + * You can do whatever you want with this file.
gokhlayeh@9258 3182 + */
gokhlayeh@9258 3183 +
gokhlayeh@9258 3184 +#include <linux/module.h>
gokhlayeh@9258 3185 +#include <linux/xz.h>
gokhlayeh@9258 3186 +
gokhlayeh@9258 3187 +EXPORT_SYMBOL(xz_dec_init);
gokhlayeh@9258 3188 +EXPORT_SYMBOL(xz_dec_reset);
gokhlayeh@9258 3189 +EXPORT_SYMBOL(xz_dec_run);
gokhlayeh@9258 3190 +EXPORT_SYMBOL(xz_dec_end);
gokhlayeh@9258 3191 +
gokhlayeh@9258 3192 +MODULE_DESCRIPTION("XZ decompressor");
gokhlayeh@9258 3193 +MODULE_VERSION("1.0");
gokhlayeh@9258 3194 +MODULE_AUTHOR("Lasse Collin <lasse.collin@tukaani.org> and Igor Pavlov");
gokhlayeh@9258 3195 +
gokhlayeh@9258 3196 +/*
gokhlayeh@9258 3197 + * This code is in the public domain, but in Linux it's simplest to just
gokhlayeh@9258 3198 + * say it's GPL and consider the authors as the copyright holders.
gokhlayeh@9258 3199 + */
gokhlayeh@9258 3200 +MODULE_LICENSE("GPL");
gokhlayeh@9258 3201 diff --git a/lib/xz/xz_dec_test.c b/lib/xz/xz_dec_test.c
gokhlayeh@9258 3202 new file mode 100644
gokhlayeh@9258 3203 index 0000000..da28a19
gokhlayeh@9258 3204 --- /dev/null
gokhlayeh@9258 3205 +++ b/lib/xz/xz_dec_test.c
gokhlayeh@9258 3206 @@ -0,0 +1,220 @@
gokhlayeh@9258 3207 +/*
gokhlayeh@9258 3208 + * XZ decoder tester
gokhlayeh@9258 3209 + *
gokhlayeh@9258 3210 + * Author: Lasse Collin <lasse.collin@tukaani.org>
gokhlayeh@9258 3211 + *
gokhlayeh@9258 3212 + * This file has been put into the public domain.
gokhlayeh@9258 3213 + * You can do whatever you want with this file.
gokhlayeh@9258 3214 + */
gokhlayeh@9258 3215 +
gokhlayeh@9258 3216 +#include <linux/kernel.h>
gokhlayeh@9258 3217 +#include <linux/module.h>
gokhlayeh@9258 3218 +#include <linux/fs.h>
gokhlayeh@9258 3219 +#include <linux/uaccess.h>
gokhlayeh@9258 3220 +#include <linux/crc32.h>
gokhlayeh@9258 3221 +#include <linux/xz.h>
gokhlayeh@9258 3222 +
gokhlayeh@9258 3223 +/* Maximum supported dictionary size */
gokhlayeh@9258 3224 +#define DICT_MAX (1 << 20)
gokhlayeh@9258 3225 +
gokhlayeh@9258 3226 +/* Device name to pass to register_chrdev(). */
gokhlayeh@9258 3227 +#define DEVICE_NAME "xz_dec_test"
gokhlayeh@9258 3228 +
gokhlayeh@9258 3229 +/* Dynamically allocated device major number */
gokhlayeh@9258 3230 +static int device_major;
gokhlayeh@9258 3231 +
gokhlayeh@9258 3232 +/*
gokhlayeh@9258 3233 + * We reuse the same decoder state, and thus can decode only one
gokhlayeh@9258 3234 + * file at a time.
gokhlayeh@9258 3235 + */
gokhlayeh@9258 3236 +static bool device_is_open;
gokhlayeh@9258 3237 +
gokhlayeh@9258 3238 +/* XZ decoder state */
gokhlayeh@9258 3239 +static struct xz_dec *state;
gokhlayeh@9258 3240 +
gokhlayeh@9258 3241 +/*
gokhlayeh@9258 3242 + * Return value of xz_dec_run(). We need to avoid calling xz_dec_run() after
gokhlayeh@9258 3243 + * it has returned XZ_STREAM_END, so we make this static.
gokhlayeh@9258 3244 + */
gokhlayeh@9258 3245 +static enum xz_ret ret;
gokhlayeh@9258 3246 +
gokhlayeh@9258 3247 +/*
gokhlayeh@9258 3248 + * Input and output buffers. The input buffer is used as a temporary safe
gokhlayeh@9258 3249 + * place for the data coming from the userspace.
gokhlayeh@9258 3250 + */
gokhlayeh@9258 3251 +static uint8_t buffer_in[1024];
gokhlayeh@9258 3252 +static uint8_t buffer_out[1024];
gokhlayeh@9258 3253 +
gokhlayeh@9258 3254 +/*
gokhlayeh@9258 3255 + * Structure to pass the input and output buffers to the XZ decoder.
gokhlayeh@9258 3256 + * A few of the fields are never modified so we initialize them here.
gokhlayeh@9258 3257 + */
gokhlayeh@9258 3258 +static struct xz_buf buffers = {
gokhlayeh@9258 3259 + .in = buffer_in,
gokhlayeh@9258 3260 + .out = buffer_out,
gokhlayeh@9258 3261 + .out_size = sizeof(buffer_out)
gokhlayeh@9258 3262 +};
gokhlayeh@9258 3263 +
gokhlayeh@9258 3264 +/*
gokhlayeh@9258 3265 + * CRC32 of uncompressed data. This is used to give the user a simple way
gokhlayeh@9258 3266 + * to check that the decoder produces correct output.
gokhlayeh@9258 3267 + */
gokhlayeh@9258 3268 +static uint32_t crc;
gokhlayeh@9258 3269 +
gokhlayeh@9258 3270 +static int xz_dec_test_open(struct inode *i, struct file *f)
gokhlayeh@9258 3271 +{
gokhlayeh@9258 3272 + if (device_is_open)
gokhlayeh@9258 3273 + return -EBUSY;
gokhlayeh@9258 3274 +
gokhlayeh@9258 3275 + device_is_open = true;
gokhlayeh@9258 3276 +
gokhlayeh@9258 3277 + xz_dec_reset(state);
gokhlayeh@9258 3278 + ret = XZ_OK;
gokhlayeh@9258 3279 + crc = 0xFFFFFFFF;
gokhlayeh@9258 3280 +
gokhlayeh@9258 3281 + buffers.in_pos = 0;
gokhlayeh@9258 3282 + buffers.in_size = 0;
gokhlayeh@9258 3283 + buffers.out_pos = 0;
gokhlayeh@9258 3284 +
gokhlayeh@9258 3285 + printk(KERN_INFO DEVICE_NAME ": opened\n");
gokhlayeh@9258 3286 + return 0;
gokhlayeh@9258 3287 +}
gokhlayeh@9258 3288 +
gokhlayeh@9258 3289 +static int xz_dec_test_release(struct inode *i, struct file *f)
gokhlayeh@9258 3290 +{
gokhlayeh@9258 3291 + device_is_open = false;
gokhlayeh@9258 3292 +
gokhlayeh@9258 3293 + if (ret == XZ_OK)
gokhlayeh@9258 3294 + printk(KERN_INFO DEVICE_NAME ": input was truncated\n");
gokhlayeh@9258 3295 +
gokhlayeh@9258 3296 + printk(KERN_INFO DEVICE_NAME ": closed\n");
gokhlayeh@9258 3297 + return 0;
gokhlayeh@9258 3298 +}
gokhlayeh@9258 3299 +
gokhlayeh@9258 3300 +/*
gokhlayeh@9258 3301 + * Decode the data given to us from the userspace. CRC32 of the uncompressed
gokhlayeh@9258 3302 + * data is calculated and is printed at the end of successful decoding. The
gokhlayeh@9258 3303 + * uncompressed data isn't stored anywhere for further use.
gokhlayeh@9258 3304 + *
gokhlayeh@9258 3305 + * The .xz file must have exactly one Stream and no Stream Padding. The data
gokhlayeh@9258 3306 + * after the first Stream is considered to be garbage.
gokhlayeh@9258 3307 + */
gokhlayeh@9258 3308 +static ssize_t xz_dec_test_write(struct file *file, const char __user *buf,
gokhlayeh@9258 3309 + size_t size, loff_t *pos)
gokhlayeh@9258 3310 +{
gokhlayeh@9258 3311 + size_t remaining;
gokhlayeh@9258 3312 +
gokhlayeh@9258 3313 + if (ret != XZ_OK) {
gokhlayeh@9258 3314 + if (size > 0)
gokhlayeh@9258 3315 + printk(KERN_INFO DEVICE_NAME ": %zu bytes of "
gokhlayeh@9258 3316 + "garbage at the end of the file\n",
gokhlayeh@9258 3317 + size);
gokhlayeh@9258 3318 +
gokhlayeh@9258 3319 + return -ENOSPC;
gokhlayeh@9258 3320 + }
gokhlayeh@9258 3321 +
gokhlayeh@9258 3322 + printk(KERN_INFO DEVICE_NAME ": decoding %zu bytes of input\n",
gokhlayeh@9258 3323 + size);
gokhlayeh@9258 3324 +
gokhlayeh@9258 3325 + remaining = size;
gokhlayeh@9258 3326 + while ((remaining > 0 || buffers.out_pos == buffers.out_size)
gokhlayeh@9258 3327 + && ret == XZ_OK) {
gokhlayeh@9258 3328 + if (buffers.in_pos == buffers.in_size) {
gokhlayeh@9258 3329 + buffers.in_pos = 0;
gokhlayeh@9258 3330 + buffers.in_size = min(remaining, sizeof(buffer_in));
gokhlayeh@9258 3331 + if (copy_from_user(buffer_in, buf, buffers.in_size))
gokhlayeh@9258 3332 + return -EFAULT;
gokhlayeh@9258 3333 +
gokhlayeh@9258 3334 + buf += buffers.in_size;
gokhlayeh@9258 3335 + remaining -= buffers.in_size;
gokhlayeh@9258 3336 + }
gokhlayeh@9258 3337 +
gokhlayeh@9258 3338 + buffers.out_pos = 0;
gokhlayeh@9258 3339 + ret = xz_dec_run(state, &buffers);
gokhlayeh@9258 3340 + crc = crc32(crc, buffer_out, buffers.out_pos);
gokhlayeh@9258 3341 + }
gokhlayeh@9258 3342 +
gokhlayeh@9258 3343 + switch (ret) {
gokhlayeh@9258 3344 + case XZ_OK:
gokhlayeh@9258 3345 + printk(KERN_INFO DEVICE_NAME ": XZ_OK\n");
gokhlayeh@9258 3346 + return size;
gokhlayeh@9258 3347 +
gokhlayeh@9258 3348 + case XZ_STREAM_END:
gokhlayeh@9258 3349 + printk(KERN_INFO DEVICE_NAME ": XZ_STREAM_END, "
gokhlayeh@9258 3350 + "CRC32 = 0x%08X\n", ~crc);
gokhlayeh@9258 3351 + return size - remaining - (buffers.in_size - buffers.in_pos);
gokhlayeh@9258 3352 +
gokhlayeh@9258 3353 + case XZ_MEMLIMIT_ERROR:
gokhlayeh@9258 3354 + printk(KERN_INFO DEVICE_NAME ": XZ_MEMLIMIT_ERROR\n");
gokhlayeh@9258 3355 + break;
gokhlayeh@9258 3356 +
gokhlayeh@9258 3357 + case XZ_FORMAT_ERROR:
gokhlayeh@9258 3358 + printk(KERN_INFO DEVICE_NAME ": XZ_FORMAT_ERROR\n");
gokhlayeh@9258 3359 + break;
gokhlayeh@9258 3360 +
gokhlayeh@9258 3361 + case XZ_OPTIONS_ERROR:
gokhlayeh@9258 3362 + printk(KERN_INFO DEVICE_NAME ": XZ_OPTIONS_ERROR\n");
gokhlayeh@9258 3363 + break;
gokhlayeh@9258 3364 +
gokhlayeh@9258 3365 + case XZ_DATA_ERROR:
gokhlayeh@9258 3366 + printk(KERN_INFO DEVICE_NAME ": XZ_DATA_ERROR\n");
gokhlayeh@9258 3367 + break;
gokhlayeh@9258 3368 +
gokhlayeh@9258 3369 + case XZ_BUF_ERROR:
gokhlayeh@9258 3370 + printk(KERN_INFO DEVICE_NAME ": XZ_BUF_ERROR\n");
gokhlayeh@9258 3371 + break;
gokhlayeh@9258 3372 +
gokhlayeh@9258 3373 + default:
gokhlayeh@9258 3374 + printk(KERN_INFO DEVICE_NAME ": Bug detected!\n");
gokhlayeh@9258 3375 + break;
gokhlayeh@9258 3376 + }
gokhlayeh@9258 3377 +
gokhlayeh@9258 3378 + return -EIO;
gokhlayeh@9258 3379 +}
gokhlayeh@9258 3380 +
gokhlayeh@9258 3381 +/* Allocate the XZ decoder state and register the character device. */
gokhlayeh@9258 3382 +static int __init xz_dec_test_init(void)
gokhlayeh@9258 3383 +{
gokhlayeh@9258 3384 + static const struct file_operations fileops = {
gokhlayeh@9258 3385 + .owner = THIS_MODULE,
gokhlayeh@9258 3386 + .open = &xz_dec_test_open,
gokhlayeh@9258 3387 + .release = &xz_dec_test_release,
gokhlayeh@9258 3388 + .write = &xz_dec_test_write
gokhlayeh@9258 3389 + };
gokhlayeh@9258 3390 +
gokhlayeh@9258 3391 + state = xz_dec_init(XZ_PREALLOC, DICT_MAX);
gokhlayeh@9258 3392 + if (state == NULL)
gokhlayeh@9258 3393 + return -ENOMEM;
gokhlayeh@9258 3394 +
gokhlayeh@9258 3395 + device_major = register_chrdev(0, DEVICE_NAME, &fileops);
gokhlayeh@9258 3396 + if (device_major < 0) {
gokhlayeh@9258 3397 + xz_dec_end(state);
gokhlayeh@9258 3398 + return device_major;
gokhlayeh@9258 3399 + }
gokhlayeh@9258 3400 +
gokhlayeh@9258 3401 + printk(KERN_INFO DEVICE_NAME ": module loaded\n");
gokhlayeh@9258 3402 + printk(KERN_INFO DEVICE_NAME ": Create a device node with "
gokhlayeh@9258 3403 + "'mknod " DEVICE_NAME " c %d 0' and write .xz files "
gokhlayeh@9258 3404 + "to it.\n", device_major);
gokhlayeh@9258 3405 + return 0;
gokhlayeh@9258 3406 +}
gokhlayeh@9258 3407 +
gokhlayeh@9258 3408 +static void __exit xz_dec_test_exit(void)
gokhlayeh@9258 3409 +{
gokhlayeh@9258 3410 + unregister_chrdev(device_major, DEVICE_NAME);
gokhlayeh@9258 3411 + xz_dec_end(state);
gokhlayeh@9258 3412 + printk(KERN_INFO DEVICE_NAME ": module unloaded\n");
gokhlayeh@9258 3413 +}
gokhlayeh@9258 3414 +
gokhlayeh@9258 3415 +module_init(xz_dec_test_init);
gokhlayeh@9258 3416 +module_exit(xz_dec_test_exit);
gokhlayeh@9258 3417 +
gokhlayeh@9258 3418 +MODULE_DESCRIPTION("XZ decompressor tester");
gokhlayeh@9258 3419 +MODULE_VERSION("1.0");
gokhlayeh@9258 3420 +MODULE_AUTHOR("Lasse Collin <lasse.collin@tukaani.org>");
gokhlayeh@9258 3421 +
gokhlayeh@9258 3422 +/*
gokhlayeh@9258 3423 + * This code is in the public domain, but in Linux it's simplest to just
gokhlayeh@9258 3424 + * say it's GPL and consider the authors as the copyright holders.
gokhlayeh@9258 3425 + */
gokhlayeh@9258 3426 +MODULE_LICENSE("GPL");
gokhlayeh@9258 3427 diff --git a/lib/xz/xz_lzma2.h b/lib/xz/xz_lzma2.h
gokhlayeh@9258 3428 new file mode 100644
gokhlayeh@9258 3429 index 0000000..071d67b
gokhlayeh@9258 3430 --- /dev/null
gokhlayeh@9258 3431 +++ b/lib/xz/xz_lzma2.h
gokhlayeh@9258 3432 @@ -0,0 +1,204 @@
gokhlayeh@9258 3433 +/*
gokhlayeh@9258 3434 + * LZMA2 definitions
gokhlayeh@9258 3435 + *
gokhlayeh@9258 3436 + * Authors: Lasse Collin <lasse.collin@tukaani.org>
gokhlayeh@9258 3437 + * Igor Pavlov <http://7-zip.org/>
gokhlayeh@9258 3438 + *
gokhlayeh@9258 3439 + * This file has been put into the public domain.
gokhlayeh@9258 3440 + * You can do whatever you want with this file.
gokhlayeh@9258 3441 + */
gokhlayeh@9258 3442 +
gokhlayeh@9258 3443 +#ifndef XZ_LZMA2_H
gokhlayeh@9258 3444 +#define XZ_LZMA2_H
gokhlayeh@9258 3445 +
gokhlayeh@9258 3446 +/* Range coder constants */
gokhlayeh@9258 3447 +#define RC_SHIFT_BITS 8
gokhlayeh@9258 3448 +#define RC_TOP_BITS 24
gokhlayeh@9258 3449 +#define RC_TOP_VALUE (1 << RC_TOP_BITS)
gokhlayeh@9258 3450 +#define RC_BIT_MODEL_TOTAL_BITS 11
gokhlayeh@9258 3451 +#define RC_BIT_MODEL_TOTAL (1 << RC_BIT_MODEL_TOTAL_BITS)
gokhlayeh@9258 3452 +#define RC_MOVE_BITS 5
gokhlayeh@9258 3453 +
gokhlayeh@9258 3454 +/*
gokhlayeh@9258 3455 + * Maximum number of position states. A position state is the lowest pb
gokhlayeh@9258 3456 + * number of bits of the current uncompressed offset. In some places there
gokhlayeh@9258 3457 + * are different sets of probabilities for different position states.
gokhlayeh@9258 3458 + */
gokhlayeh@9258 3459 +#define POS_STATES_MAX (1 << 4)
gokhlayeh@9258 3460 +
gokhlayeh@9258 3461 +/*
gokhlayeh@9258 3462 + * This enum is used to track which LZMA symbols have occurred most recently
gokhlayeh@9258 3463 + * and in which order. This information is used to predict the next symbol.
gokhlayeh@9258 3464 + *
gokhlayeh@9258 3465 + * Symbols:
gokhlayeh@9258 3466 + * - Literal: One 8-bit byte
gokhlayeh@9258 3467 + * - Match: Repeat a chunk of data at some distance
gokhlayeh@9258 3468 + * - Long repeat: Multi-byte match at a recently seen distance
gokhlayeh@9258 3469 + * - Short repeat: One-byte repeat at a recently seen distance
gokhlayeh@9258 3470 + *
gokhlayeh@9258 3471 + * The symbol names are in the form STATE_oldest_older_previous. REP means
gokhlayeh@9258 3472 + * either short or long repeated match, and NONLIT means any non-literal.
gokhlayeh@9258 3473 + */
gokhlayeh@9258 3474 +enum lzma_state {
gokhlayeh@9258 3475 + STATE_LIT_LIT,
gokhlayeh@9258 3476 + STATE_MATCH_LIT_LIT,
gokhlayeh@9258 3477 + STATE_REP_LIT_LIT,
gokhlayeh@9258 3478 + STATE_SHORTREP_LIT_LIT,
gokhlayeh@9258 3479 + STATE_MATCH_LIT,
gokhlayeh@9258 3480 + STATE_REP_LIT,
gokhlayeh@9258 3481 + STATE_SHORTREP_LIT,
gokhlayeh@9258 3482 + STATE_LIT_MATCH,
gokhlayeh@9258 3483 + STATE_LIT_LONGREP,
gokhlayeh@9258 3484 + STATE_LIT_SHORTREP,
gokhlayeh@9258 3485 + STATE_NONLIT_MATCH,
gokhlayeh@9258 3486 + STATE_NONLIT_REP
gokhlayeh@9258 3487 +};
gokhlayeh@9258 3488 +
gokhlayeh@9258 3489 +/* Total number of states */
gokhlayeh@9258 3490 +#define STATES 12
gokhlayeh@9258 3491 +
gokhlayeh@9258 3492 +/* The lowest 7 states indicate that the previous symbol was a literal. */
gokhlayeh@9258 3493 +#define LIT_STATES 7
gokhlayeh@9258 3494 +
gokhlayeh@9258 3495 +/* Indicate that the latest symbol was a literal. */
gokhlayeh@9258 3496 +static inline void lzma_state_literal(enum lzma_state *state)
gokhlayeh@9258 3497 +{
gokhlayeh@9258 3498 + if (*state <= STATE_SHORTREP_LIT_LIT)
gokhlayeh@9258 3499 + *state = STATE_LIT_LIT;
gokhlayeh@9258 3500 + else if (*state <= STATE_LIT_SHORTREP)
gokhlayeh@9258 3501 + *state -= 3;
gokhlayeh@9258 3502 + else
gokhlayeh@9258 3503 + *state -= 6;
gokhlayeh@9258 3504 +}
gokhlayeh@9258 3505 +
gokhlayeh@9258 3506 +/* Indicate that the latest symbol was a match. */
gokhlayeh@9258 3507 +static inline void lzma_state_match(enum lzma_state *state)
gokhlayeh@9258 3508 +{
gokhlayeh@9258 3509 + *state = *state < LIT_STATES ? STATE_LIT_MATCH : STATE_NONLIT_MATCH;
gokhlayeh@9258 3510 +}
gokhlayeh@9258 3511 +
gokhlayeh@9258 3512 +/* Indicate that the latest symbol was a long repeated match. */
gokhlayeh@9258 3513 +static inline void lzma_state_long_rep(enum lzma_state *state)
gokhlayeh@9258 3514 +{
gokhlayeh@9258 3515 + *state = *state < LIT_STATES ? STATE_LIT_LONGREP : STATE_NONLIT_REP;
gokhlayeh@9258 3516 +}
gokhlayeh@9258 3517 +
gokhlayeh@9258 3518 +/* Indicate that the latest symbol was a short repeated match. */
gokhlayeh@9258 3519 +static inline void lzma_state_short_rep(enum lzma_state *state)
gokhlayeh@9258 3520 +{
gokhlayeh@9258 3521 + *state = *state < LIT_STATES ? STATE_LIT_SHORTREP : STATE_NONLIT_REP;
gokhlayeh@9258 3522 +}
gokhlayeh@9258 3523 +
gokhlayeh@9258 3524 +/* Test if the previous symbol was a literal. */
gokhlayeh@9258 3525 +static inline bool lzma_state_is_literal(enum lzma_state state)
gokhlayeh@9258 3526 +{
gokhlayeh@9258 3527 + return state < LIT_STATES;
gokhlayeh@9258 3528 +}
gokhlayeh@9258 3529 +
gokhlayeh@9258 3530 +/* Each literal coder is divided into three sections:
gokhlayeh@9258 3531 + * - 0x001-0x0FF: Without match byte
gokhlayeh@9258 3532 + * - 0x101-0x1FF: With match byte; match bit is 0
gokhlayeh@9258 3533 + * - 0x201-0x2FF: With match byte; match bit is 1
gokhlayeh@9258 3534 + *
gokhlayeh@9258 3535 + * Match byte is used when the previous LZMA symbol was something other than
gokhlayeh@9258 3536 + * a literal (that is, it was some kind of match).
gokhlayeh@9258 3537 + */
gokhlayeh@9258 3538 +#define LITERAL_CODER_SIZE 0x300
gokhlayeh@9258 3539 +
gokhlayeh@9258 3540 +/* Maximum number of literal coders */
gokhlayeh@9258 3541 +#define LITERAL_CODERS_MAX (1 << 4)
gokhlayeh@9258 3542 +
gokhlayeh@9258 3543 +/* Minimum length of a match is two bytes. */
gokhlayeh@9258 3544 +#define MATCH_LEN_MIN 2
gokhlayeh@9258 3545 +
gokhlayeh@9258 3546 +/* Match length is encoded with 4, 5, or 10 bits.
gokhlayeh@9258 3547 + *
gokhlayeh@9258 3548 + * Length Bits
gokhlayeh@9258 3549 + * 2-9 4 = Choice=0 + 3 bits
gokhlayeh@9258 3550 + * 10-17 5 = Choice=1 + Choice2=0 + 3 bits
gokhlayeh@9258 3551 + * 18-273 10 = Choice=1 + Choice2=1 + 8 bits
gokhlayeh@9258 3552 + */
gokhlayeh@9258 3553 +#define LEN_LOW_BITS 3
gokhlayeh@9258 3554 +#define LEN_LOW_SYMBOLS (1 << LEN_LOW_BITS)
gokhlayeh@9258 3555 +#define LEN_MID_BITS 3
gokhlayeh@9258 3556 +#define LEN_MID_SYMBOLS (1 << LEN_MID_BITS)
gokhlayeh@9258 3557 +#define LEN_HIGH_BITS 8
gokhlayeh@9258 3558 +#define LEN_HIGH_SYMBOLS (1 << LEN_HIGH_BITS)
gokhlayeh@9258 3559 +#define LEN_SYMBOLS (LEN_LOW_SYMBOLS + LEN_MID_SYMBOLS + LEN_HIGH_SYMBOLS)
gokhlayeh@9258 3560 +
gokhlayeh@9258 3561 +/*
gokhlayeh@9258 3562 + * Maximum length of a match is 273 which is a result of the encoding
gokhlayeh@9258 3563 + * described above.
gokhlayeh@9258 3564 + */
gokhlayeh@9258 3565 +#define MATCH_LEN_MAX (MATCH_LEN_MIN + LEN_SYMBOLS - 1)
gokhlayeh@9258 3566 +
gokhlayeh@9258 3567 +/*
gokhlayeh@9258 3568 + * Different sets of probabilities are used for match distances that have
gokhlayeh@9258 3569 + * very short match length: Lengths of 2, 3, and 4 bytes have a separate
gokhlayeh@9258 3570 + * set of probabilities for each length. The matches with longer length
gokhlayeh@9258 3571 + * use a shared set of probabilities.
gokhlayeh@9258 3572 + */
gokhlayeh@9258 3573 +#define DIST_STATES 4
gokhlayeh@9258 3574 +
gokhlayeh@9258 3575 +/*
gokhlayeh@9258 3576 + * Get the index of the appropriate probability array for decoding
gokhlayeh@9258 3577 + * the distance slot.
gokhlayeh@9258 3578 + */
gokhlayeh@9258 3579 +static inline uint32_t lzma_get_dist_state(uint32_t len)
gokhlayeh@9258 3580 +{
gokhlayeh@9258 3581 + return len < DIST_STATES + MATCH_LEN_MIN
gokhlayeh@9258 3582 + ? len - MATCH_LEN_MIN : DIST_STATES - 1;
gokhlayeh@9258 3583 +}
gokhlayeh@9258 3584 +
gokhlayeh@9258 3585 +/*
gokhlayeh@9258 3586 + * The highest two bits of a 32-bit match distance are encoded using six bits.
gokhlayeh@9258 3587 + * This six-bit value is called a distance slot. This way encoding a 32-bit
gokhlayeh@9258 3588 + * value takes 6-36 bits, larger values taking more bits.
gokhlayeh@9258 3589 + */
gokhlayeh@9258 3590 +#define DIST_SLOT_BITS 6
gokhlayeh@9258 3591 +#define DIST_SLOTS (1 << DIST_SLOT_BITS)
gokhlayeh@9258 3592 +
gokhlayeh@9258 3593 +/* Match distances up to 127 are fully encoded using probabilities. Because
gokhlayeh@9258 3594 + * the highest two bits (distance slot) are always encoded using six bits,
gokhlayeh@9258 3595 + * the distances 0-3 don't need any additional bits: for them the distance
gokhlayeh@9258 3596 + * slot itself is the same as the actual distance. DIST_MODEL_START
gokhlayeh@9258 3597 + * indicates the first distance slot where at least one additional bit is
gokhlayeh@9258 3598 + * needed.
gokhlayeh@9258 3599 + */
gokhlayeh@9258 3600 +#define DIST_MODEL_START 4
gokhlayeh@9258 3601 +
gokhlayeh@9258 3602 +/*
gokhlayeh@9258 3603 + * Match distances greater than 127 are encoded in three pieces:
gokhlayeh@9258 3604 + * - distance slot: the highest two bits
gokhlayeh@9258 3605 + * - direct bits: 2-26 bits below the highest two bits
gokhlayeh@9258 3606 + * - alignment bits: four lowest bits
gokhlayeh@9258 3607 + *
gokhlayeh@9258 3608 + * Direct bits don't use any probabilities.
gokhlayeh@9258 3609 + *
gokhlayeh@9258 3610 + * The distance slot value of 14 is for distances 128-191.
gokhlayeh@9258 3611 + */
gokhlayeh@9258 3612 +#define DIST_MODEL_END 14
gokhlayeh@9258 3613 +
gokhlayeh@9258 3614 +/* Distance slots that indicate a distance <= 127. */
gokhlayeh@9258 3615 +#define FULL_DISTANCES_BITS (DIST_MODEL_END / 2)
gokhlayeh@9258 3616 +#define FULL_DISTANCES (1 << FULL_DISTANCES_BITS)
gokhlayeh@9258 3617 +
gokhlayeh@9258 3618 +/*
gokhlayeh@9258 3619 + * For match distances greater than 127, only the highest two bits and the
gokhlayeh@9258 3620 + * lowest four bits (alignment) are encoded using probabilities.
gokhlayeh@9258 3621 + */
gokhlayeh@9258 3622 +#define ALIGN_BITS 4
gokhlayeh@9258 3623 +#define ALIGN_SIZE (1 << ALIGN_BITS)
gokhlayeh@9258 3624 +#define ALIGN_MASK (ALIGN_SIZE - 1)
gokhlayeh@9258 3625 +
gokhlayeh@9258 3626 +/* Total number of all probability variables */
gokhlayeh@9258 3627 +#define PROBS_TOTAL (1846 + LITERAL_CODERS_MAX * LITERAL_CODER_SIZE)
gokhlayeh@9258 3628 +
gokhlayeh@9258 3629 +/*
gokhlayeh@9258 3630 + * LZMA remembers the four most recent match distances. Reusing these
gokhlayeh@9258 3631 + * distances tends to take less space than re-encoding the actual
gokhlayeh@9258 3632 + * distance value.
gokhlayeh@9258 3633 + */
gokhlayeh@9258 3634 +#define REPS 4
gokhlayeh@9258 3635 +
gokhlayeh@9258 3636 +#endif
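
Note on xz_lzma2.h: the state transitions and the distance slot layout described in the comments above can be tried out in userspace. The following is only a minimal sketch, not part of the patch. The enum and the two state helpers are copied from the header; dist_slot_extra_bits() and dist_slot_base() are hypothetical helper names introduced here, and their formula is an assumption based on how LZMA distance slots are commonly decoded (it agrees with the header's statement that slot 14 covers distances 128-191).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same ordering as enum lzma_state in xz_lzma2.h. */
enum lzma_state {
    STATE_LIT_LIT, STATE_MATCH_LIT_LIT, STATE_REP_LIT_LIT,
    STATE_SHORTREP_LIT_LIT, STATE_MATCH_LIT, STATE_REP_LIT,
    STATE_SHORTREP_LIT, STATE_LIT_MATCH, STATE_LIT_LONGREP,
    STATE_LIT_SHORTREP, STATE_NONLIT_MATCH, STATE_NONLIT_REP
};

#define LIT_STATES 7
#define DIST_SLOTS 64
#define DIST_MODEL_START 4

/* The latest symbol was a literal (copied from the header). */
static void lzma_state_literal(enum lzma_state *state)
{
    if (*state <= STATE_SHORTREP_LIT_LIT)
        *state = STATE_LIT_LIT;
    else if (*state <= STATE_LIT_SHORTREP)
        *state -= 3;
    else
        *state -= 6;
}

/* The latest symbol was a match (copied from the header). */
static void lzma_state_match(enum lzma_state *state)
{
    *state = *state < LIT_STATES ? STATE_LIT_MATCH : STATE_NONLIT_MATCH;
}

/*
 * Hypothetical helper: number of bits that follow the distance slot.
 * Slots 0-3 need none because the slot is the distance itself.
 */
static uint32_t dist_slot_extra_bits(uint32_t slot)
{
    assert(slot < DIST_SLOTS);
    return slot < DIST_MODEL_START ? 0 : (slot >> 1) - 1;
}

/*
 * Hypothetical helper: smallest distance covered by a slot. The slot
 * stores the position of the highest set bit plus the bit below it.
 */
static uint32_t dist_slot_base(uint32_t slot)
{
    uint32_t extra = dist_slot_extra_bits(slot);

    return slot < DIST_MODEL_START ? slot : (2 + (slot & 1)) << extra;
}
```

Starting from STATE_LIT_MATCH, three consecutive literals walk the state through STATE_MATCH_LIT and STATE_MATCH_LIT_LIT back to STATE_LIT_LIT, which shows how the state "forgets" a match after enough literals. With the helpers above, slot 14 has six extra bits and a base of 128, so it spans 128-191 as the header comment says.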
gokhlayeh@9258 3637 diff --git a/lib/xz/xz_private.h b/lib/xz/xz_private.h
gokhlayeh@9258 3638 new file mode 100644
gokhlayeh@9258 3639 index 0000000..a65633e
gokhlayeh@9258 3640 --- /dev/null
gokhlayeh@9258 3641 +++ b/lib/xz/xz_private.h
gokhlayeh@9258 3642 @@ -0,0 +1,156 @@
gokhlayeh@9258 3643 +/*
gokhlayeh@9258 3644 + * Private includes and definitions
gokhlayeh@9258 3645 + *
gokhlayeh@9258 3646 + * Author: Lasse Collin <lasse.collin@tukaani.org>
gokhlayeh@9258 3647 + *
gokhlayeh@9258 3648 + * This file has been put into the public domain.
gokhlayeh@9258 3649 + * You can do whatever you want with this file.
gokhlayeh@9258 3650 + */
gokhlayeh@9258 3651 +
gokhlayeh@9258 3652 +#ifndef XZ_PRIVATE_H
gokhlayeh@9258 3653 +#define XZ_PRIVATE_H
gokhlayeh@9258 3654 +
gokhlayeh@9258 3655 +#ifdef __KERNEL__
gokhlayeh@9258 3656 +# include <linux/xz.h>
gokhlayeh@9258 3657 +# include <asm/byteorder.h>
gokhlayeh@9258 3658 +# include <asm/unaligned.h>
gokhlayeh@9258 3659 + /* XZ_PREBOOT may be defined only via decompress_unxz.c. */
gokhlayeh@9258 3660 +# ifndef XZ_PREBOOT
gokhlayeh@9258 3661 +# include <linux/slab.h>
gokhlayeh@9258 3662 +# include <linux/vmalloc.h>
gokhlayeh@9258 3663 +# include <linux/string.h>
gokhlayeh@9258 3664 +# ifdef CONFIG_XZ_DEC_X86
gokhlayeh@9258 3665 +# define XZ_DEC_X86
gokhlayeh@9258 3666 +# endif
gokhlayeh@9258 3667 +# ifdef CONFIG_XZ_DEC_POWERPC
gokhlayeh@9258 3668 +# define XZ_DEC_POWERPC
gokhlayeh@9258 3669 +# endif
gokhlayeh@9258 3670 +# ifdef CONFIG_XZ_DEC_IA64
gokhlayeh@9258 3671 +# define XZ_DEC_IA64
gokhlayeh@9258 3672 +# endif
gokhlayeh@9258 3673 +# ifdef CONFIG_XZ_DEC_ARM
gokhlayeh@9258 3674 +# define XZ_DEC_ARM
gokhlayeh@9258 3675 +# endif
gokhlayeh@9258 3676 +# ifdef CONFIG_XZ_DEC_ARMTHUMB
gokhlayeh@9258 3677 +# define XZ_DEC_ARMTHUMB
gokhlayeh@9258 3678 +# endif
gokhlayeh@9258 3679 +# ifdef CONFIG_XZ_DEC_SPARC
gokhlayeh@9258 3680 +# define XZ_DEC_SPARC
gokhlayeh@9258 3681 +# endif
gokhlayeh@9258 3682 +# define memeq(a, b, size) (memcmp(a, b, size) == 0)
gokhlayeh@9258 3683 +# define memzero(buf, size) memset(buf, 0, size)
gokhlayeh@9258 3684 +# endif
gokhlayeh@9258 3685 +# define get_le32(p) le32_to_cpup((const uint32_t *)(p))
gokhlayeh@9258 3686 +#else
gokhlayeh@9258 3687 + /*
gokhlayeh@9258 3688 + * For userspace builds, use a separate header to define the required
gokhlayeh@9258 3689 + * macros and functions. This makes it easier to adapt the code into
gokhlayeh@9258 3690 + * different environments and avoids clutter in the Linux kernel tree.
gokhlayeh@9258 3691 + */
gokhlayeh@9258 3692 +# include "xz_config.h"
gokhlayeh@9258 3693 +#endif
gokhlayeh@9258 3694 +
gokhlayeh@9258 3695 +/* If no specific decoding mode is requested, enable support for all modes. */
gokhlayeh@9258 3696 +#if !defined(XZ_DEC_SINGLE) && !defined(XZ_DEC_PREALLOC) \
gokhlayeh@9258 3697 + && !defined(XZ_DEC_DYNALLOC)
gokhlayeh@9258 3698 +# define XZ_DEC_SINGLE
gokhlayeh@9258 3699 +# define XZ_DEC_PREALLOC
gokhlayeh@9258 3700 +# define XZ_DEC_DYNALLOC
gokhlayeh@9258 3701 +#endif
gokhlayeh@9258 3702 +
gokhlayeh@9258 3703 +/*
gokhlayeh@9258 3704 + * The DEC_IS_foo(mode) macros are used in "if" statements. If only some
gokhlayeh@9258 3705 + * of the supported modes are enabled, these macros will evaluate to true or
gokhlayeh@9258 3706 + * false at compile time and thus allow the compiler to omit unneeded code.
gokhlayeh@9258 3707 + */
gokhlayeh@9258 3708 +#ifdef XZ_DEC_SINGLE
gokhlayeh@9258 3709 +# define DEC_IS_SINGLE(mode) ((mode) == XZ_SINGLE)
gokhlayeh@9258 3710 +#else
gokhlayeh@9258 3711 +# define DEC_IS_SINGLE(mode) (false)
gokhlayeh@9258 3712 +#endif
gokhlayeh@9258 3713 +
gokhlayeh@9258 3714 +#ifdef XZ_DEC_PREALLOC
gokhlayeh@9258 3715 +# define DEC_IS_PREALLOC(mode) ((mode) == XZ_PREALLOC)
gokhlayeh@9258 3716 +#else
gokhlayeh@9258 3717 +# define DEC_IS_PREALLOC(mode) (false)
gokhlayeh@9258 3718 +#endif
gokhlayeh@9258 3719 +
gokhlayeh@9258 3720 +#ifdef XZ_DEC_DYNALLOC
gokhlayeh@9258 3721 +# define DEC_IS_DYNALLOC(mode) ((mode) == XZ_DYNALLOC)
gokhlayeh@9258 3722 +#else
gokhlayeh@9258 3723 +# define DEC_IS_DYNALLOC(mode) (false)
gokhlayeh@9258 3724 +#endif
gokhlayeh@9258 3725 +
gokhlayeh@9258 3726 +#if !defined(XZ_DEC_SINGLE)
gokhlayeh@9258 3727 +# define DEC_IS_MULTI(mode) (true)
gokhlayeh@9258 3728 +#elif defined(XZ_DEC_PREALLOC) || defined(XZ_DEC_DYNALLOC)
gokhlayeh@9258 3729 +# define DEC_IS_MULTI(mode) ((mode) != XZ_SINGLE)
gokhlayeh@9258 3730 +#else
gokhlayeh@9258 3731 +# define DEC_IS_MULTI(mode) (false)
gokhlayeh@9258 3732 +#endif
gokhlayeh@9258 3733 +
gokhlayeh@9258 3734 +/*
gokhlayeh@9258 3735 + * If any of the BCJ filter decoders are wanted, define XZ_DEC_BCJ.
gokhlayeh@9258 3736 + * XZ_DEC_BCJ is used to enable generic support for BCJ decoders.
gokhlayeh@9258 3737 + */
gokhlayeh@9258 3738 +#ifndef XZ_DEC_BCJ
gokhlayeh@9258 3739 +# if defined(XZ_DEC_X86) || defined(XZ_DEC_POWERPC) \
gokhlayeh@9258 3740 + || defined(XZ_DEC_IA64) || defined(XZ_DEC_ARM) \
gokhlayeh@9258 3741 + || defined(XZ_DEC_ARMTHUMB) \
gokhlayeh@9258 3742 + || defined(XZ_DEC_SPARC)
gokhlayeh@9258 3743 +# define XZ_DEC_BCJ
gokhlayeh@9258 3744 +# endif
gokhlayeh@9258 3745 +#endif
gokhlayeh@9258 3746 +
gokhlayeh@9258 3747 +/*
gokhlayeh@9258 3748 + * Allocate memory for LZMA2 decoder. xz_dec_lzma2_reset() must be used
gokhlayeh@9258 3749 + * before calling xz_dec_lzma2_run().
gokhlayeh@9258 3750 + */
gokhlayeh@9258 3751 +XZ_EXTERN struct xz_dec_lzma2 *xz_dec_lzma2_create(enum xz_mode mode,
gokhlayeh@9258 3752 + uint32_t dict_max);
gokhlayeh@9258 3753 +
gokhlayeh@9258 3754 +/*
gokhlayeh@9258 3755 + * Decode the LZMA2 properties (one byte) and reset the decoder. Return
gokhlayeh@9258 3756 + * XZ_OK on success, XZ_MEMLIMIT_ERROR if the preallocated dictionary is not
gokhlayeh@9258 3757 + * big enough, and XZ_OPTIONS_ERROR if props indicates something that this
gokhlayeh@9258 3758 + * decoder doesn't support.
gokhlayeh@9258 3759 + */
gokhlayeh@9258 3760 +XZ_EXTERN enum xz_ret xz_dec_lzma2_reset(struct xz_dec_lzma2 *s,
gokhlayeh@9258 3761 + uint8_t props);
gokhlayeh@9258 3762 +
gokhlayeh@9258 3763 +/* Decode raw LZMA2 stream from b->in to b->out. */
gokhlayeh@9258 3764 +XZ_EXTERN enum xz_ret xz_dec_lzma2_run(struct xz_dec_lzma2 *s,
gokhlayeh@9258 3765 + struct xz_buf *b);
gokhlayeh@9258 3766 +
gokhlayeh@9258 3767 +/* Free the memory allocated for the LZMA2 decoder. */
gokhlayeh@9258 3768 +XZ_EXTERN void xz_dec_lzma2_end(struct xz_dec_lzma2 *s);
gokhlayeh@9258 3769 +
gokhlayeh@9258 3770 +#ifdef XZ_DEC_BCJ
gokhlayeh@9258 3771 +/*
gokhlayeh@9258 3772 + * Allocate memory for BCJ decoders. xz_dec_bcj_reset() must be used before
gokhlayeh@9258 3773 + * calling xz_dec_bcj_run().
gokhlayeh@9258 3774 + */
gokhlayeh@9258 3775 +XZ_EXTERN struct xz_dec_bcj *xz_dec_bcj_create(bool single_call);
gokhlayeh@9258 3776 +
gokhlayeh@9258 3777 +/*
gokhlayeh@9258 3778 + * Decode the Filter ID of a BCJ filter. This implementation doesn't
gokhlayeh@9258 3779 + * support custom start offsets, so no decoding of Filter Properties
gokhlayeh@9258 3780 + * is needed. Returns XZ_OK if the given Filter ID is supported.
gokhlayeh@9258 3781 + * Otherwise XZ_OPTIONS_ERROR is returned.
gokhlayeh@9258 3782 + */
gokhlayeh@9258 3783 +XZ_EXTERN enum xz_ret xz_dec_bcj_reset(struct xz_dec_bcj *s, uint8_t id);
gokhlayeh@9258 3784 +
gokhlayeh@9258 3785 +/*
gokhlayeh@9258 3786 + * Decode raw BCJ + LZMA2 stream. This must be used only if there actually is
gokhlayeh@9258 3787 + * a BCJ filter in the chain. If the chain has only LZMA2, xz_dec_lzma2_run()
gokhlayeh@9258 3788 + * must be called directly.
gokhlayeh@9258 3789 + */
gokhlayeh@9258 3790 +XZ_EXTERN enum xz_ret xz_dec_bcj_run(struct xz_dec_bcj *s,
gokhlayeh@9258 3791 + struct xz_dec_lzma2 *lzma2,
gokhlayeh@9258 3792 + struct xz_buf *b);
gokhlayeh@9258 3793 +
gokhlayeh@9258 3794 +/* Free the memory allocated for the BCJ filters. */
gokhlayeh@9258 3795 +#define xz_dec_bcj_end(s) kfree(s)
gokhlayeh@9258 3796 +#endif
gokhlayeh@9258 3797 +
gokhlayeh@9258 3798 +#endif
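
Note on xz_private.h: the DEC_IS_foo(mode) macros fold to constants when a mode is compiled out, so the compiler can drop the corresponding branches. The sketch below illustrates this for a build that requests only the single-call mode. It is an illustration under assumptions, not part of the patch: enum xz_mode is reproduced here rather than taken from <linux/xz.h>, and enum mode_class and classify_mode() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Reproduced for the sketch; the real definition lives in <linux/xz.h>. */
enum xz_mode { XZ_SINGLE, XZ_PREALLOC, XZ_DYNALLOC };

/* Assumption: the build requests only the single-call mode. */
#define XZ_DEC_SINGLE

#ifdef XZ_DEC_SINGLE
# define DEC_IS_SINGLE(mode) ((mode) == XZ_SINGLE)
#else
# define DEC_IS_SINGLE(mode) (false)
#endif

#if !defined(XZ_DEC_SINGLE)
# define DEC_IS_MULTI(mode) (true)
#elif defined(XZ_DEC_PREALLOC) || defined(XZ_DEC_DYNALLOC)
# define DEC_IS_MULTI(mode) ((mode) != XZ_SINGLE)
#else
# define DEC_IS_MULTI(mode) (false)
#endif

/* Hypothetical result type for the sketch. */
enum mode_class { MODE_UNSUPPORTED, MODE_SINGLE_CALL, MODE_MULTI_CALL };

/* Hypothetical helper: report how this build would treat a mode. */
static enum mode_class classify_mode(enum xz_mode mode)
{
    assert(mode <= XZ_DYNALLOC);

    if (DEC_IS_SINGLE(mode))
        return MODE_SINGLE_CALL;

    /* Constant false in this configuration, so this branch is dead. */
    if (DEC_IS_MULTI(mode))
        return MODE_MULTI_CALL;

    return MODE_UNSUPPORTED;
}
```

Removing the `#define XZ_DEC_SINGLE` line and defining XZ_DEC_DYNALLOC instead makes DEC_IS_MULTI() a constant true and DEC_IS_SINGLE() a constant false, which is the opposite trade-off.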
gokhlayeh@9258 3799 diff --git a/lib/xz/xz_stream.h b/lib/xz/xz_stream.h
gokhlayeh@9258 3800 new file mode 100644
gokhlayeh@9258 3801 index 0000000..66cb5a7
gokhlayeh@9258 3802 --- /dev/null
gokhlayeh@9258 3803 +++ b/lib/xz/xz_stream.h
gokhlayeh@9258 3804 @@ -0,0 +1,62 @@
gokhlayeh@9258 3805 +/*
gokhlayeh@9258 3806 + * Definitions for handling the .xz file format
gokhlayeh@9258 3807 + *
gokhlayeh@9258 3808 + * Author: Lasse Collin <lasse.collin@tukaani.org>
gokhlayeh@9258 3809 + *
gokhlayeh@9258 3810 + * This file has been put into the public domain.
gokhlayeh@9258 3811 + * You can do whatever you want with this file.
gokhlayeh@9258 3812 + */
gokhlayeh@9258 3813 +
gokhlayeh@9258 3814 +#ifndef XZ_STREAM_H
gokhlayeh@9258 3815 +#define XZ_STREAM_H
gokhlayeh@9258 3816 +
gokhlayeh@9258 3817 +#if defined(__KERNEL__) && !XZ_INTERNAL_CRC32
gokhlayeh@9258 3818 +# include <linux/crc32.h>
gokhlayeh@9258 3819 +# undef crc32
gokhlayeh@9258 3820 +# define xz_crc32(buf, size, crc) \
gokhlayeh@9258 3821 + (~crc32_le(~(uint32_t)(crc), buf, size))
gokhlayeh@9258 3822 +#endif
gokhlayeh@9258 3823 +
gokhlayeh@9258 3824 +/*
gokhlayeh@9258 3825 + * See the .xz file format specification at
gokhlayeh@9258 3826 + * http://tukaani.org/xz/xz-file-format.txt
gokhlayeh@9258 3827 + * to understand the container format.
gokhlayeh@9258 3828 + */
gokhlayeh@9258 3829 +
gokhlayeh@9258 3830 +#define STREAM_HEADER_SIZE 12
gokhlayeh@9258 3831 +
gokhlayeh@9258 3832 +#define HEADER_MAGIC "\3757zXZ"
gokhlayeh@9258 3833 +#define HEADER_MAGIC_SIZE 6
gokhlayeh@9258 3834 +
gokhlayeh@9258 3835 +#define FOOTER_MAGIC "YZ"
gokhlayeh@9258 3836 +#define FOOTER_MAGIC_SIZE 2
gokhlayeh@9258 3837 +
gokhlayeh@9258 3838 +/*
gokhlayeh@9258 3839 + * Variable-length integer can hold a 63-bit unsigned integer or a special
gokhlayeh@9258 3840 + * value indicating that the value is unknown.
gokhlayeh@9258 3841 + *
gokhlayeh@9258 3842 + * Experimental: vli_type can be defined to uint32_t to save a few bytes
gokhlayeh@9258 3843 + * in code size (no effect on speed). Doing so limits the uncompressed and
gokhlayeh@9258 3844 + * compressed size of the file to less than 256 MiB and may also weaken
gokhlayeh@9258 3845 + * error detection slightly.
gokhlayeh@9258 3846 + */
gokhlayeh@9258 3847 +typedef uint64_t vli_type;
gokhlayeh@9258 3848 +
gokhlayeh@9258 3849 +#define VLI_MAX ((vli_type)-1 / 2)
gokhlayeh@9258 3850 +#define VLI_UNKNOWN ((vli_type)-1)
gokhlayeh@9258 3851 +
gokhlayeh@9258 3852 +/* Maximum encoded size of a VLI */
gokhlayeh@9258 3853 +#define VLI_BYTES_MAX (sizeof(vli_type) * 8 / 7)
gokhlayeh@9258 3854 +
gokhlayeh@9258 3855 +/* Integrity Check types */
gokhlayeh@9258 3856 +enum xz_check {
gokhlayeh@9258 3857 + XZ_CHECK_NONE = 0,
gokhlayeh@9258 3858 + XZ_CHECK_CRC32 = 1,
gokhlayeh@9258 3859 + XZ_CHECK_CRC64 = 4,
gokhlayeh@9258 3860 + XZ_CHECK_SHA256 = 10
gokhlayeh@9258 3861 +};
gokhlayeh@9258 3862 +
gokhlayeh@9258 3863 +/* Maximum possible Check ID */
gokhlayeh@9258 3864 +#define XZ_CHECK_MAX 15
gokhlayeh@9258 3865 +
gokhlayeh@9258 3866 +#endif
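
Note on xz_stream.h: a variable-length integer stores seven payload bits per byte, least significant group first, with the high bit set on every byte except the last. With a 64-bit vli_type, VLI_BYTES_MAX evaluates to 64 / 7 = 9 bytes, which holds 63 bits, matching VLI_MAX. The sketch below is a one-shot decoder modelled on the reference routine in the .xz file format specification. It is not the decoder from this patch (which presumably has to cope with input arriving in pieces), and vli_decode() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t vli_type;

#define VLI_MAX ((vli_type)-1 / 2)
#define VLI_BYTES_MAX (sizeof(vli_type) * 8 / 7)

/*
 * Hypothetical helper: decode one VLI from buf[0..size-1].
 * Returns the number of bytes consumed, or 0 if the input is
 * truncated, too long, or not minimally encoded.
 */
static size_t vli_decode(const uint8_t *buf, size_t size, vli_type *num)
{
    size_t i = 0;

    assert(buf != NULL && num != NULL);

    if (size == 0)
        return 0;

    if (size > VLI_BYTES_MAX)
        size = VLI_BYTES_MAX;

    *num = buf[0] & 0x7F;

    /* The high bit of each byte says whether another byte follows. */
    while (buf[i++] & 0x80) {
        /* A trailing 0x00 byte would be a non-minimal encoding. */
        if (i >= size || buf[i] == 0x00)
            return 0;

        *num |= (vli_type)(buf[i] & 0x7F) << (i * 7);
    }

    return i;
}
```

For example, the two bytes 0x80 0x01 decode to 128, while 0x80 0x00 is rejected because the same value would fit in fewer bytes.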
gokhlayeh@9258 3867 diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
gokhlayeh@9258 3868 index 54fd1b7..b862007 100644
gokhlayeh@9258 3869 --- a/scripts/Makefile.lib
gokhlayeh@9258 3870 +++ b/scripts/Makefile.lib
gokhlayeh@9258 3871 @@ -246,6 +246,34 @@ cmd_lzo = (cat $(filter-out FORCE,$^) | \
gokhlayeh@9258 3872 lzop -9 && $(call size_append, $(filter-out FORCE,$^))) > $@ || \
gokhlayeh@9258 3873 (rm -f $@ ; false)
gokhlayeh@9258 3874
gokhlayeh@9258 3875 +# XZ
gokhlayeh@9258 3876 +# ---------------------------------------------------------------------------
gokhlayeh@9258 3877 +# Use xzkern to compress the kernel image and xzmisc to compress other things.
gokhlayeh@9258 3878 +#
gokhlayeh@9258 3879 +# xzkern uses a big LZMA2 dictionary since it doesn't increase memory usage
gokhlayeh@9258 3880 +# of the kernel decompressor. A BCJ filter is used if it is available for
gokhlayeh@9258 3881 +# the target architecture. xzkern also appends the uncompressed size of the data
gokhlayeh@9258 3882 +# using size_append. The .xz format has the size information available at
gokhlayeh@9258 3883 +# the end of the file too, but it's in a more complex format and it's good to
gokhlayeh@9258 3884 +# avoid changing the part of the boot code that reads the uncompressed size.
gokhlayeh@9258 3885 +# Note that the bytes added by size_append will make the xz tool think that
gokhlayeh@9258 3886 +# the file is corrupt. This is expected.
gokhlayeh@9258 3887 +#
gokhlayeh@9258 3888 +# xzmisc doesn't use size_append, so it can be used to create normal .xz
gokhlayeh@9258 3889 +# files. xzmisc uses a smaller LZMA2 dictionary than xzkern, because a very
gokhlayeh@9258 3890 +# big dictionary would increase the memory usage too much in the multi-call
gokhlayeh@9258 3891 +# decompression mode. A BCJ filter isn't used either.
gokhlayeh@9258 3892 +quiet_cmd_xzkern = XZKERN $@
gokhlayeh@9258 3893 +cmd_xzkern = (cat $(filter-out FORCE,$^) | \
gokhlayeh@9258 3894 + sh $(srctree)/scripts/xz_wrap.sh && \
gokhlayeh@9258 3895 + $(call size_append, $(filter-out FORCE,$^))) > $@ || \
gokhlayeh@9258 3896 + (rm -f $@ ; false)
gokhlayeh@9258 3897 +
gokhlayeh@9258 3898 +quiet_cmd_xzmisc = XZMISC $@
gokhlayeh@9258 3899 +cmd_xzmisc = (cat $(filter-out FORCE,$^) | \
gokhlayeh@9258 3900 + xz --check=crc32 --lzma2=dict=1MiB) > $@ || \
gokhlayeh@9258 3901 + (rm -f $@ ; false)
gokhlayeh@9258 3902 +
gokhlayeh@9258 3903 # misc stuff
gokhlayeh@9258 3904 # ---------------------------------------------------------------------------
gokhlayeh@9258 3905 quote:="
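
Note on cmd_xzkern: the bytes added by size_append sit after the end of the .xz stream, which is why the xz tool reports such an image as corrupt while the boot code can still find the size at a fixed offset from the end. The sketch below shows the idea in C. It assumes that size_append emits the uncompressed size as a 4-byte little-endian integer (the definition of size_append is not part of this hunk), and put_size_trailer() and get_size_trailer() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the trailer is a 32-bit little-endian size. */
#define SIZE_TRAILER_LEN 4

/* Hypothetical helper: write the trailer that follows the .xz data. */
static void put_size_trailer(uint8_t out[SIZE_TRAILER_LEN], uint32_t size)
{
    out[0] = (uint8_t)(size & 0xFF);
    out[1] = (uint8_t)((size >> 8) & 0xFF);
    out[2] = (uint8_t)((size >> 16) & 0xFF);
    out[3] = (uint8_t)((size >> 24) & 0xFF);
}

/* Hypothetical helper: read the size back from the end of the image. */
static uint32_t get_size_trailer(const uint8_t *image, size_t image_len)
{
    const uint8_t *p;

    assert(image != NULL && image_len >= SIZE_TRAILER_LEN);

    p = image + image_len - SIZE_TRAILER_LEN;
    return (uint32_t)p[0]
            | ((uint32_t)p[1] << 8)
            | ((uint32_t)p[2] << 16)
            | ((uint32_t)p[3] << 24);
}
```

Reading from the end of the image keeps the lookup independent of the compression format, which fits the comment's point about not changing the boot code that reads the uncompressed size.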
gokhlayeh@9258 3906 diff --git a/scripts/xz_wrap.sh b/scripts/xz_wrap.sh
gokhlayeh@9258 3907 new file mode 100644
gokhlayeh@9258 3908 index 0000000..17a5798
gokhlayeh@9258 3909 --- /dev/null
gokhlayeh@9258 3910 +++ b/scripts/xz_wrap.sh
gokhlayeh@9258 3911 @@ -0,0 +1,23 @@
gokhlayeh@9258 3912 +#!/bin/sh
gokhlayeh@9258 3913 +#
gokhlayeh@9258 3914 +# This is a wrapper for xz to compress the kernel image using appropriate
gokhlayeh@9258 3915 +# compression options depending on the architecture.
gokhlayeh@9258 3916 +#
gokhlayeh@9258 3917 +# Author: Lasse Collin <lasse.collin@tukaani.org>
gokhlayeh@9258 3918 +#
gokhlayeh@9258 3919 +# This file has been put into the public domain.
gokhlayeh@9258 3920 +# You can do whatever you want with this file.
gokhlayeh@9258 3921 +#
gokhlayeh@9258 3922 +
gokhlayeh@9258 3923 +BCJ=
gokhlayeh@9258 3924 +LZMA2OPTS=
gokhlayeh@9258 3925 +
gokhlayeh@9258 3926 +case $ARCH in
gokhlayeh@9258 3927 + x86|x86_64) BCJ=--x86 ;;
gokhlayeh@9258 3928 + powerpc) BCJ=--powerpc ;;
gokhlayeh@9258 3929 + ia64) BCJ=--ia64; LZMA2OPTS=pb=4 ;;
gokhlayeh@9258 3930 + arm) BCJ=--arm ;;
gokhlayeh@9258 3931 + sparc) BCJ=--sparc ;;
gokhlayeh@9258 3932 +esac
gokhlayeh@9258 3933 +
gokhlayeh@9258 3934 +exec xz --check=crc32 $BCJ --lzma2=$LZMA2OPTS,dict=32MiB