Encoding

Encoding is a means of converting data from one format to another, for example in order to transmit it, store it, or compress it. Encoding might also be used to describe a data structure or format, for example a file format. Algorithms can encode and decode this data without any sort of key.

Encoding is not Encryption!

As long as someone can determine the rules that were applied to the original data, they can easily reverse the encoding without any special knowledge, like passwords or secret keys. For this reason, encoding should never be used in a situation where the security and confidentiality of data is important.

Binary

Binary encoding uses only two basic symbols, which can be represented by any two distinct values. They might be an ON and OFF state, a clockwise or counter-clockwise spin, or simply the numbers 1 and 0.
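As a quick sketch, Python's built-in bin() and int() functions illustrate how the same value moves between decimal and binary representations:

```python
# bin() converts an integer to its binary string representation.
n = 13
print(bin(n))                # 0b1101

# int() with base 2 parses a binary string back into an integer.
assert int("1101", 2) == 13
```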

Hexadecimal

A long string of 0s and 1s can be difficult to read. Representing binary-encoded data with hexadecimal, or base-16, numbers makes it considerably easier to work with. The highest number we can fit into a single byte is 0b11111111, 0xff, or decimal 255.
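In Python, the built-in bin() and hex() functions show the same byte in all three representations:

```python
n = 255                       # the highest value a single byte can hold
print(bin(n))                 # 0b11111111
print(hex(n))                 # 0xff

# All three notations name the same number.
assert 0b11111111 == 0xff == 255

# Format specifiers print the digits without the 0b/0x prefixes:
# 8 binary digits, 2 hexadecimal digits.
print(f"{n:08b} {n:02x}")     # 11111111 ff
```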

EXAMPLE

Decimal  Binary   Hexadecimal
0        0b0      0x0
1        0b1      0x1
2        0b10     0x2
3        0b11     0x3
4        0b100    0x4
5        0b101    0x5
6        0b110    0x6
7        0b111    0x7
8        0b1000   0x8
9        0b1001   0x9
10       0b1010   0xa
11       0b1011   0xb
12       0b1100   0xc
13       0b1101   0xd
14       0b1110   0xe
15       0b1111   0xf

American Standard Code for Information Interchange

American Standard Code for Information Interchange (ASCII) is a type of encoding used to store and process both printable and non-printable characters. In ASCII every character is represented with a 7-bit binary number, a string of seven 0s or 1s. ASCII contains encoding for all the alphanumeric characters and symbols on a modern keyboard, as well as encoding for things like TABs, Line Feeds, and even Backspaces.
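Python's built-in ord() and chr() functions expose these character codes directly; a small sketch:

```python
# ord() returns a character's code; chr() reverses the mapping.
print(ord("A"))    # 65
print(chr(65))     # A

# ASCII uses 7 bits, so every ASCII code fits in the range 0-127.
assert all(ord(c) < 128 for c in "Hello, World!")

# Non-printable control characters such as TAB and Line Feed
# have ASCII codes too.
print(ord("\t"), ord("\n"))    # 9 10
```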

Unicode and Unicode Transformation Format

Unicode is a standard that provides a number, or unique code point, for each character. Another way to say this is that each character is mapped to a unique value.

Unicode includes code points for the characters of the familiar Latin alphabet, for example, U+0041 for the Latin uppercase letter “A”. There are also Unicode code points for each character in, for example, the Cyrillic, Thai, and Hangul alphabets. In total, the Unicode code space has room for over a million (exactly 1,112,064) code points for visible and non-visible characters.
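In Python, ord() and chr() work with these code points directly; a brief sketch:

```python
# U+0041 is the code point for the Latin uppercase letter "A".
assert ord("A") == 0x41

# Code points exist for other alphabets too, e.g. Cyrillic and Thai.
print(f"U+{ord('Я'):04X}")    # U+042F (Cyrillic capital letter Ya)
print(chr(0x0E01))            # the Thai character Ko Kai
```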

Unicode Transformation Format (UTF) is a way to encode these Unicode code points as bytes. The most common forms of UTF are UTF-8, which uses 8-bit (1-byte) code units, and UTF-16, which uses 16-bit (2-byte) code units.

NOTE

UTF-8 was designed to be backward compatible with ASCII.
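Both points can be sketched in Python: the same code points encode to different byte sequences under UTF-8 and UTF-16, while ASCII text is byte-for-byte unchanged under UTF-8:

```python
# An ASCII character: one byte in UTF-8, two bytes in UTF-16.
print("A".encode("utf-8"))       # b'A'
print("A".encode("utf-16-be"))   # b'\x00A'

# UTF-8 is backward compatible with ASCII: the bytes are identical.
assert "Hi".encode("utf-8") == "Hi".encode("ascii")

# Characters beyond ASCII need multiple UTF-8 code units.
print("é".encode("utf-8"))       # b'\xc3\xa9' (two bytes)
print(len("€".encode("utf-8")))  # 3
```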

Base64

Base64 encoding allows us to transfer binary data over channels that can only represent text data. It essentially converts any binary data into an encoded sequence of printable characters, allowing us to transfer that data over virtually any channel and protocol.

Base64 gets its name from its use of 64 characters, one for each 6-bit value from 0 to 63:

0-25: A to Z
26-51: a to z
52-61: 0 to 9
62: +
63: /

A 65th character, =, may also appear in the encoded output, but only at the end of a string, where it serves as padding.

Base64 works by:

  1. Splitting the binary input into blocks of three bytes each.
  2. Converting every three-byte block into four Base64 characters.
  3. 3x8 = 24 bits of input thus produce 4x6 = 24 bits of output.
  4. When the input bit length is not divisible by six, zero bits are appended to the end of the input to pad it, so that it becomes divisible.

Base64 output will contain:

  • One = character if the last block of input was only two bytes (without the added zeros).
  • Two = characters if the last block of input was only one byte.
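Python's standard base64 module demonstrates the block and padding rules above:

```python
import base64

# Three input bytes -> four output characters, no padding needed.
print(base64.b64encode(b"Man"))   # b'TWFu'

# Two input bytes in the last block -> one '=' of padding.
print(base64.b64encode(b"Ma"))    # b'TWE='

# One input byte in the last block -> two '=' of padding.
print(base64.b64encode(b"M"))     # b'TQ=='

# Decoding reverses the process without any key.
assert base64.b64decode(b"TWFu") == b"Man"
```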
