Thursday, December 7, 2017

LEB128 integer type

LEB128 ("Little-Endian Base 128") is a variable-length encoding for arbitrary signed or unsigned integer quantities. The format was borrowed from the DWARF3 specification. Each byte has its most significant bit set except for the final byte in the sequence, which has its most significant bit clear. The remaining seven bits of each byte are payload, with the least significant seven bits of the quantity in the first byte, the next seven in the second byte and so on.
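The byte layout described above can be sketched as a small encoder for the unsigned case. This is a minimal illustration rather than a reference implementation; the function name `encode_uleb128` is made up for this example.

```python
def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128 (illustrative sketch)."""
    if value < 0:
        raise ValueError("uleb128 cannot encode negative values")
    out = bytearray()
    while True:
        payload = value & 0x7F          # least significant seven bits go first
        value >>= 7
        if value:
            out.append(payload | 0x80)  # more bytes follow: set the top bit
        else:
            out.append(payload)         # final byte: top bit stays clear
            return bytes(out)


print(encode_uleb128(0x3F80).hex(" "))  # 80 7f
```

Every byte except the last carries the continuation bit, so a decoder can find the end of the value without knowing its length in advance.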

An implementation must also fix the maximum encoded length. For example, LEB128-40 (5 bytes) carries 35 payload bits, which is enough to encode any 32-bit quantity, with the highest payload bits left unused.

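In the case of a signed LEB128 (sleb128), the most significant payload bit of the final byte in the sequence is sign-extended to produce the final value. For example, in "80 7f ..." the sequence ends at "7f", so the encoded bytes are "80 7f". Their 14 payload bits are 0x3f80; sign-extending from bit 13 gives the 32-bit value 0xffffff80 (bytes "80 ff ff ff" in little-endian order), which is -128. As another example, "7f ..." is taken as -1.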
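A hedged sketch of signed decoding follows. The function name `decode_sleb128` and its `(value, bytes_consumed)` return shape are assumptions made for this example, and the sketch does not enforce a maximum length.

```python
from typing import Tuple


def decode_sleb128(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode one signed LEB128 value; return (value, bytes_consumed).

    Illustrative sketch: raises IndexError if the input ends mid-value.
    """
    result = 0
    shift = 0
    pos = offset
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift   # seven payload bits per byte
        shift += 7
        if not byte & 0x80:                # top bit clear: final byte
            break
    if byte & 0x40:                        # top payload bit of the final byte
        result -= 1 << shift               # sign-extend
    return result, pos - offset


print(decode_sleb128(bytes.fromhex("807f00")))  # (-128, 2)
print(decode_sleb128(bytes.fromhex("7f")))      # (-1, 1)
```

Because Python integers are unbounded, the sign extension is done by subtracting 2^shift instead of filling a fixed-width register with ones.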

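In the unsigned case (uleb128), any bits not explicitly represented are interpreted as 0. For example, "80 7f ..." is read as the two bytes "80 7f", whose 14 payload bits give 0x3f80 (16256); as a little-endian integer that value is stored as the bytes "80 3f".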
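The unsigned case can be sketched the same way. The name `decode_uleb128` and the `max_bytes` parameter are assumptions for this example; the default of 5 matches the 5-byte limit used for 32-bit quantities.

```python
from typing import Tuple


def decode_uleb128(data: bytes, offset: int = 0, max_bytes: int = 5) -> Tuple[int, int]:
    """Decode one unsigned LEB128 value; return (value, bytes_consumed).

    Illustrative sketch: rejects encodings longer than max_bytes.
    """
    result = 0
    for i in range(max_bytes):
        byte = data[offset + i]
        result |= (byte & 0x7F) << (7 * i)  # missing high bits stay 0
        if not byte & 0x80:                 # top bit clear: final byte
            return result, i + 1
    raise ValueError("uleb128 value exceeds max_bytes")


value, used = decode_uleb128(bytes.fromhex("807faa"))
print(hex(value), used)  # 0x3f80 2
```

The trailing byte `aa` is left untouched, since decoding stops at the first byte whose top bit is clear.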

The variant uleb128p1 is used to represent a signed value, where the representation is of the value plus one encoded as a uleb128. This makes the encoding of -1 (alternatively thought of as the unsigned value 0xffffffff) — but no other negative number — a single byte, and is useful in exactly those cases where the represented number must either be non-negative or -1 (or 0xffffffff), and where no other negative values are allowed (or where large unsigned values are unlikely to be needed).
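As a rough sketch, uleb128p1 is the unsigned encoding applied to the value plus one. The function names below are hypothetical, and the sketch works with Python's unbounded signed integers rather than a 32-bit register, so -1 is handled as -1 and not as 0xffffffff.

```python
from typing import Tuple


def encode_uleb128p1(value: int) -> bytes:
    """Encode value (-1 or any non-negative integer) as uleb128p1 (sketch)."""
    if value < -1:
        raise ValueError("uleb128p1 cannot encode values below -1")
    n = value + 1                          # shift so that -1 becomes 0
    out = bytearray()
    while True:
        payload = n & 0x7F
        n >>= 7
        if n:
            out.append(payload | 0x80)     # more bytes follow
        else:
            out.append(payload)            # final byte
            return bytes(out)


def decode_uleb128p1(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode one uleb128p1 value; return (value, bytes_consumed)."""
    result = 0
    shift = 0
    pos = offset
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result - 1, pos - offset  # undo the plus one


print(encode_uleb128p1(-1).hex())  # 00
```

With this scheme -1 takes a single byte, whereas 127 now needs two bytes because 128 no longer fits in seven payload bits.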
