************************************* | Star Fighter 3000 | | Documentation Project | ************************************* | Reference 8: Compression format | | Changed: CJB 03.01.11 | ************************************* N.B. This text is best viewed at a display width of 77 columns. To create an index do a 'List of found' search for ">>". ----------------------------------------------------------------------------- >> Contents: ========= 8.1 Introduction 8.2 File types 8.3 Compression format 8.4 Known implementations 8.5 Implementation quirks ----------------------------------------------------------------------------- >> 8.1 Introduction ================ This document describes a compression format used to reduce the size of (and perhaps obfuscate) files belonging to several RISC OS games: 'Chocks Away' (published by The Fourth Dimension), 'Stunt Racer 2000' and 'Star Fighter 3000' (both published by Fednet Software). ----------------------------------------------------------------------------- >> 8.2 File types ============== There follows a list of filetypes that compressed files may have. Beware that files of type &400 are not necessarily compressed because the 'Stunt Racer 2000' track editor saves non-compressed tracks with this type! Type Contents ---- -------- &154 'Star Fighter 3000' or 'Stunt Racer 2000' compressed code or data. &300 'Star Fighter 3000' compressed polygonal objects set. &400 'Star Fighter 3000' compressed ground map, or 'Stunt Racer 2000' track (not necessarily compressed). &402 'Star Fighter 3000' compressed objects map. &401 'Star Fighter 3000' compressed ground map overlay, or 'Stunt Racer 2000' rec file. &403 'Star Fighter 3000' compressed objects map overlay. &404 'Star Fighter 3000' compressed sky colours. &405 'Star Fighter 3000' compressed mission data. &406 'Star Fighter 3000' compressed sky pictures set. &407 'Star Fighter 3000' compressed map tile graphics set. &408 'Star Fighter 3000' compressed ground map animations. &FFD 'Chocks Away' or 'Stunt Racer 2000' compressed code or data (not easily identifiable). ----------------------------------------------------------------------------- >> 8.3 Compression format ======================= The first 4 bytes of a compressed file give the expected size of the data when decompressed, as a 32 bit signed little-endian integer. Gordon Key's file decompression module 'FDComp', which is presumably normative, rejects input files where the top bit of the fourth byte is set (i.e. negative values). Thereafter, the compressed data consists of tightly packed groups of 1, 8 or 9 bits without any padding between them or alignment with byte boundaries. A decompressor must deal with two main types of directive: The first (store a byte) consists of 1+8=9 bits and the second (copy previously-decompressed data) consists of 1+9+8=18 or 1+9+9=19 bits. The type of each directive is determined by whether its first bit is set: 0. The next 8 bits of the input file give a literal byte value (0-255) to be written at the current output position. When attempting to compress input that contains few repeating patterns, the output may actually be larger than the input because each byte value is encoded using 9 rather than 8 bits. 1. The next 9 bits of the input file give an offset (0-511) within the data already decompressed, relative to a point 512 bytes behind the current output position. If the offset is greater than or equal to 256 (i.e. within the last 256 bytes written) then the next 8 bits give the number of bytes (0-255) to be copied to the current output position. Otherwise, the number of bytes (0-511) to be copied is encoded using 9 bits. If the read pointer is before the start of the output buffer then zeros should be written at the output position until it becomes valid again. This is a legitimate method of initialising areas of memory with zeros. It isn't possible to replicate the whole of the preceding 512 bytes in one operation. ----------------------------------------------------------------------------- >> 8.4 Known implementations ========================= In 'Chocks Away' and 'Stunt Racer 2000' the contents of compressed files were extracted by a module named 'DeComp' (©1993 The fourth dimension) which provides a single command *CLoad - analogous to the standard command *Load. In 'Star Fighter 3000' a similar module is supplied encrypted but it is named 'FDComp' (© Gordon Key 1994) and its command is *LComp. Apart from an unbalanced stack bug having been fixed (when SWI OS_Find returns an error) the two modules appear to be virtually identical. Actually 'FDComp' doesn't behave much better in the same situation because *LComp returns with V set but R0 still pointing to the command tail rather than an error block. It seems safe to assume that Gordon Key wrote both modules. The 'Stunt Racer 2000' track designer released through Visions Of The Impossible saves compressed tracks using a module named 'Comp' (©1993 The fourth dimension) which provides a single command *CSave - analogous to the standard commmand *Save. David O'Shea's desktop version of the track designer incorporates a re-implementation of the 'DeComp' module in C. He also wrote a compression module (not publicly released). Until release 39, Chris Bazley's C programmers' library 'CBLibrary' incorporated code heavily based on David's compressor and decompressor. In later releases, this will be superseded by new state machine implementations. ----------------------------------------------------------------------------- >> 8.5 Implementation quirks ========================= The decompressors written by the Fourth Dimension and David O'Shea always copy at least 1 byte from the source offset, even if the compressed bitstream specified 0 as the number of bytes to be copied. A well-written compressor should not insert directives to copy 0 bytes and no instances are known in the wild. CBLibrary's new decompressor will treat directives to copy 0 bytes as invalid input. The Fourth Dimension's compressor terminates a sequence when the sum of its offset and size is 511, even if the most recent byte matched the data to be compressed. This explains why the decompressor only expects 8 bits for the size of sequences starting at byte 256. All compatible compressors must impose a size limit of 255 or 511 on sequences starting at offsets 256 or 0 respectively. The most recent byte can't be included in a sequence starting at either of those offsets (e.g. [255 .. 511] and [257 .. 511] are valid but [256 .. 511] isn't). David O'Shea's compressor only searches the most recent 511 bytes for matches, so it produces different output from The Fourth Dimension's compressor for the same input. (A sequence starting at the oldest byte may be as long as, or longer than, a later sequence.) An optimal compressor should allow sequences to include the most recent byte (at offset 511) unless the start offset is 0 or 256.