21.4.7 Detailed Specification of the Decoding

The encoding that a decoder must handle is specified in detail below:

Datafile fixed header (32 bytes):

4 byte  magic number
4 byte  total header length (fixed + column info + code trees)
4 byte  minimum packed record length
4 byte  maximum packed record length
4 byte  total number of elements in all code trees
4 byte  total number of bytes collected for distinct column values
2 byte  number of code trees
1 byte  maximum number of bytes required to represent record+blob lengths
1 byte  record pointer length: number of bytes needed to represent the compressed data file length
4 byte  zeros
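
The fixed header layout above can be unpacked in one step. The following is a minimal sketch, not the server's implementation. It assumes the multi-byte fields are stored little endian (this section states the byte order only for record lengths), and the field names in PackHeader are hypothetical labels chosen for illustration.

```python
import struct
from collections import namedtuple

# Assumption: header integers are little endian. This section only states
# the byte order for the record length prefixes, not for the header.
HEADER_FORMAT = "<4sIIIIIHBB4s"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 4*6 + 2 + 1 + 1 + 4 = 32

# Hypothetical field names; the manual describes the fields but does not name them.
PackHeader = namedtuple("PackHeader", [
    "magic",                  # 4 byte magic number, kept as raw bytes
    "header_length",          # fixed + column info + code trees
    "min_record_length",      # minimum packed record length
    "max_record_length",      # maximum packed record length
    "tree_elements",          # total number of elements in all code trees
    "distinct_value_bytes",   # bytes collected for distinct column values
    "tree_count",             # number of code trees
    "length_bytes",           # max bytes to represent record+blob lengths
    "record_pointer_length",  # bytes for compressed data file length
    "reserved",               # 4 byte zeros
])

def parse_fixed_header(data):
    """Unpack the 32-byte fixed header from the start of `data`."""
    if len(data) < HEADER_SIZE:
        raise ValueError("need at least 32 bytes for the fixed header")
    return PackHeader._make(struct.unpack(HEADER_FORMAT, data[:HEADER_SIZE]))
```

The magic number is returned as raw bytes so that the caller can compare it against whatever value it expects.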

Column Information. For every column in the table:

5 bits  field type
    FIELD_NORMAL          0
    FIELD_SKIP_ENDSPACE   1
    FIELD_SKIP_PRESPACE   2
    FIELD_SKIP_ZERO       3
    FIELD_BLOB            4
    FIELD_CONSTANT        5
    FIELD_INTERVALL       6
    FIELD_ZERO            7
    FIELD_VARCHAR         8
    FIELD_CHECK           9

6 bits  pack type as a set of flags
    PACK_TYPE_SELECTED      1
    PACK_TYPE_SPACE_FIELDS  2
    PACK_TYPE_ZERO_FILL     4

5 bits  if pack type contains PACK_TYPE_ZERO_FILL,
            minimum number of trailing zero bytes in this column
        else
            number of bits to encode the number of
            packed bytes in this column (length_bits)

x bits  number of the code tree used to encode this column
    x is the minimum number of bits required to represent the highest
    tree number.

Alignment:

x bits alignment to the next byte border
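
The column information and the alignment that follows it can be read with a small bit reader. This is a sketch under stated assumptions: bits are taken most significant bit first within each byte, and the number of columns and the width of the tree number field (tree_bits) are supplied by the caller. BitReader and read_column_info are hypothetical names, not functions from the MySQL source.

```python
class BitReader:
    """Reads unsigned integers bit by bit from a bytes object.

    Assumption: bits are consumed most significant bit first.
    """

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current position, counted in bits

    def read(self, nbits):
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value

    def align(self):
        """Skip to the next byte border (no-op if already aligned)."""
        self.pos = (self.pos + 7) & ~7


PACK_TYPE_ZERO_FILL = 4

def read_column_info(reader, column_count, tree_bits):
    """Read the per-column fields, then the alignment to the byte border."""
    columns = []
    for _ in range(column_count):
        col = {"field_type": reader.read(5), "pack_type": reader.read(6)}
        shared = reader.read(5)  # meaning depends on PACK_TYPE_ZERO_FILL
        if col["pack_type"] & PACK_TYPE_ZERO_FILL:
            col["zero_fill"] = shared
        else:
            col["length_bits"] = shared
        col["tree"] = reader.read(tree_bits)
        columns.append(col)
    reader.align()
    return columns
```

Passing tree_bits in from the caller keeps the sketch independent of how tree numbers are counted, which this section does not spell out.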

Code Trees. For every tree:

1 bit   compression type
    0 = byte value compression
        8 bits  minimum byte value coded by this tree
        9 bits  number of byte values encoded by this tree
        5 bits  number of bits used to encode the byte values
        5 bits  number of bits used to encode offsets to next tree elements
    1 = distinct column value compression
        15 bits number of distinct column values encoded by this tree
        16 bits length of the buffer with all column values
        5 bits  number of bits used to encode the index of the column value
        5 bits  number of bits used to encode offsets to next tree elements
For each code tree element:
    1 bit   IS_OFFSET
    x bits  the announced number of bits for either a value or an offset
x bits  alignment to the next byte border
If the tree uses distinct column value compression:
    n x 8 bits  the column value buffer, n being the buffer length announced above
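
The code tree description above can be sketched as two small readers, one for the tree header and one for a single tree element. The sketch assumes most-significant-bit-first bit order and reads IS_OFFSET = 1 as "an offset follows". How many elements belong to a tree is not stated in this section, so the caller decides how often to call read_tree_element. All names below are hypothetical.

```python
class BitReader:
    """Minimal bit reader; assumes most significant bit first."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # position in bits

    def read(self, nbits):
        value = 0
        for _ in range(nbits):
            bit = (self.data[self.pos >> 3] >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def read_tree_header(reader):
    """Read the compression type bit and the fields that depend on it."""
    if reader.read(1) == 0:
        # byte value compression
        return {
            "kind": "byte",
            "min_byte": reader.read(8),
            "value_count": reader.read(9),
            "value_bits": reader.read(5),
            "offset_bits": reader.read(5),
        }
    # distinct column value compression
    return {
        "kind": "distinct",
        "value_count": reader.read(15),
        "buffer_length": reader.read(16),
        "value_bits": reader.read(5),
        "offset_bits": reader.read(5),
    }


def read_tree_element(reader, header):
    """Read one element: the IS_OFFSET bit, then a value or an offset.

    The raw bit field is returned unchanged; no adjustment is applied.
    """
    if reader.read(1):
        return ("offset", reader.read(header["offset_bits"]))
    return ("value", reader.read(header["value_bits"]))
```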

Compressed Records. For every record:

1-5 bytes  length of the compressed record in bytes
    1st byte 0..253 length
             254    length encoded in the next two bytes little endian
             255    length encoded in the next  x  bytes little endian
                    x = 3  for pack file version 1
                    x = 4  for pack file version > 1
1-5 bytes  total length of all expanded blobs of this record
    1st byte 0..253 length
             254    length encoded in the next two bytes little endian
             255    length encoded in the next  x  bytes little endian
                    x = 3  for pack file version 1
                    x = 4  for pack file version > 1
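
The two length prefixes share one encoding, so a single helper can decode both. The following is a minimal sketch; read_packed_length and read_record_prefix are hypothetical names, and the sketch assumes both prefixes are present for every record, as listed above.

```python
def read_packed_length(buf, pos, pack_version):
    """Decode one 1-5 byte length starting at buf[pos].

    Returns (length, position of the byte following the length).
    """
    first = buf[pos]
    if first < 254:
        # 0..253: the byte itself is the length
        return first, pos + 1
    if first == 254:
        # length in the next two bytes, little endian
        return int.from_bytes(buf[pos + 1:pos + 3], "little"), pos + 3
    # 255: length in the next x bytes, little endian
    width = 3 if pack_version == 1 else 4
    end = pos + 1 + width
    return int.from_bytes(buf[pos + 1:end], "little"), end


def read_record_prefix(buf, pos, pack_version):
    """Read the compressed record length, then the total blob length."""
    record_length, pos = read_packed_length(buf, pos, pack_version)
    blob_length, pos = read_packed_length(buf, pos, pack_version)
    return record_length, blob_length, pos
```

For example, the bytes 0A FE 00 01 decode to a record length of 10 and a total blob length of 256, with the record data starting at offset 4.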
For every column:
    If pack type includes PACK_TYPE_SPACE_FIELDS,
        1 bit   1 = spaces only, 0 = not only spaces
    In case the field type is of:
        FIELD_SKIP_ZERO
            1 bit   1 = zeros only, 0 = not only zeros
                    In the latter case
                        x bits  the Huffman code for every byte
        FIELD_NORMAL
            x bits  the Huffman code for every byte
        FIELD_SKIP_ENDSPACE
            If pack type includes PACK_TYPE_SELECTED,
                1 bit   1 = more than min endspace, 0 = not more
                        In the former case
                            x bits  number of extra spaces, x = length_bits
            else
                x bits  number of extra spaces, x = length_bits
            x bits  the Huffman code for every byte
        FIELD_SKIP_PRESPACE
            If pack type includes PACK_TYPE_SELECTED,
                1 bit   1 = more than min prespace, 0 = not more
                        In the former case
                            x bits  number of extra spaces, x = length_bits
            else
                x bits  number of extra spaces, x = length_bits
            x bits  the Huffman code for every byte
        FIELD_CONSTANT or FIELD_ZERO or FIELD_CHECK
            nothing for these
        FIELD_INTERVALL
            x bits  the Huffman code for the buffer index of the column value
        FIELD_BLOB
            1 bit   1 = blob is empty, 0 = not empty
                    In the latter case
                        x bits  blob length, x = length_bits
                        x bits  the Huffman code for every byte
        FIELD_VARCHAR
            1 bit   1 = varchar is empty, 0 = not empty
                    In the latter case
                        x bits  varchar length, x = length_bits
                        x bits  the Huffman code for every byte
    x bits  alignment to the next byte border
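
The flag bits that precede the Huffman codes of a FIELD_SKIP_ENDSPACE or FIELD_SKIP_PRESPACE column can be sketched as follows. The sketch covers only these prefix bits, not the Huffman decoding itself. It assumes most-significant-bit-first bit order and that nothing further is stored for a column once the "spaces only" bit is set. BitReader and read_space_prefix are hypothetical names.

```python
class BitReader:
    """Minimal bit reader; assumes most significant bit first."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # position in bits

    def read(self, nbits):
        value = 0
        for _ in range(nbits):
            bit = (self.data[self.pos >> 3] >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


PACK_TYPE_SELECTED = 1
PACK_TYPE_SPACE_FIELDS = 2

def read_space_prefix(reader, pack_type, length_bits):
    """Return (spaces_only, extra_spaces) for a skip-space column.

    extra_spaces is None when no count is stored in the record.
    """
    if (pack_type & PACK_TYPE_SPACE_FIELDS) and reader.read(1):
        # Assumption: a spaces-only column carries no further bits.
        return True, None
    if pack_type & PACK_TYPE_SELECTED:
        if reader.read(1) == 0:
            # not more than the minimum: no count follows
            return False, None
        return False, reader.read(length_bits)
    # without PACK_TYPE_SELECTED the count is always present
    return False, reader.read(length_bits)
```

After this prefix, a decoder would continue with the Huffman codes for the remaining bytes of the column.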
