Base32 is a binary-to-text encoding scheme that represents arbitrary binary data using an alphabet of 32 ASCII characters, with each character carrying 5 bits. It's commonly used where raw binary data can't be used directly, such as in URLs, file names, or when transmitting data over media designed for text.
How Base32 Encoding Works:
Input Data: Base32 encoding takes a stream of binary data (e.g., bytes) as input.
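Divide Data into 5-bit Chunks: The bits of the input are concatenated and divided into 5-bit groups, because 5 bits can take exactly 32 different values (2^5), one per alphabet character. Since 5-bit groups don't line up with 8-bit bytes, the data is effectively processed in 40-bit blocks: 5 input bytes become 8 output characters.
Mapping to Alphabet: Each 5-bit group is then mapped to a character in the Base32 alphabet. The standard alphabet (RFC 4648) consists of the following 32 characters:
A-Z, 2-7 (letters A-Z represent the values 0-25 and digits 2-7 represent the values 26-31).
Padding: If the total number of input bits is not a multiple of 5, the final group is filled out with zero bits. If the input length is not a multiple of 5 bytes, = (equals sign) characters are then appended so that the length of the encoded string is a multiple of 8 characters.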
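The steps above can be sketched in a few lines of Python. This is a minimal, unoptimized illustration that assumes the standard RFC 4648 alphabet; the function name base32_encode_sketch is made up for this example, and real code would normally use a library routine instead.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"  # RFC 4648 Base32 alphabet


def base32_encode_sketch(data: bytes) -> str:
    """Encode bytes as padded Base32 (illustrative helper, not optimized)."""
    # Join all input bytes into one long string of bits.
    bits = "".join(f"{byte:08b}" for byte in data)
    # Fill the final group with zero bits so the bit count is a multiple of 5.
    bits += "0" * (-len(bits) % 5)
    # Map each 5-bit group to its character in the alphabet.
    chars = [ALPHABET[int(bits[i:i + 5], 2)] for i in range(0, len(bits), 5)]
    encoded = "".join(chars)
    # Append '=' so the output length is a multiple of 8 characters.
    return encoded + "=" * (-len(encoded) % 8)


print(base32_encode_sketch(b"hello"))  # NBSWY3DP
print(base32_encode_sketch(b"f"))      # MY======
```

Working on a bit string keeps the sketch close to the description; production encoders typically use integer shifts on 5-byte blocks instead.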
Base32 Example:
Suppose we want to encode the string "hello".
First, convert the text "hello" into its binary representation.
Then, divide the binary data into 5-bit chunks.
Each chunk is converted to its corresponding Base32 character.
If padding is needed, it is added.
Example Conversion (simplified):
Text: hello
Convert "hello" to binary:
"h" = 01101000
"e" = 01100101
"l" = 01101100
"l" = 01101100
"o" = 01101111
Combine the bits:
01101000 01100101 01101100 01101100 01101111
Break it into 5-bit chunks:
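01101 00001 10010 10110 11000 11011 00011 01111
Map each 5-bit chunk to a Base32 character (decimal value in parentheses):
01101 (13) → "N"
00001 (1) → "B"
10010 (18) → "S"
10110 (22) → "W"
11000 (24) → "Y"
11011 (27) → "3"
00011 (3) → "D"
01111 (15) → "P"
Encoded output: "NBSWY3DP"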
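The conversion of "hello" can be reproduced and checked with Python's standard library. The snippet below is only a verification aid: it builds the 5-bit groups by hand and then compares against base64.b32encode; the variable names are chosen for this example.

```python
import base64

text = "hello"
raw = text.encode("ascii")

# Concatenate the 8-bit form of every byte, then slice into 5-bit groups.
bits = "".join(f"{b:08b}" for b in raw)
groups = [bits[i:i + 5] for i in range(0, len(bits), 5)]
values = [int(g, 2) for g in groups]

# Library result for comparison.
encoded = base64.b32encode(raw).decode("ascii")

print(groups)   # ['01101', '00001', '10010', '10110', '11000', '11011', '00011', '01111']
print(values)   # [13, 1, 18, 22, 24, 27, 3, 15]
print(encoded)  # NBSWY3DP
```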
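Because "hello" is exactly 5 bytes (40 bits), it divides evenly into eight 5-bit chunks and needs no padding. When the input length is not a multiple of 5 bytes, the last chunk is filled with zero bits and padding (=) is appended until the result is a multiple of 8 characters long.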
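To see how padding changes with input length, the following sketch encodes successively longer prefixes of "hello" using the standard library. The helper name encode_prefixes is hypothetical and exists only for this illustration.

```python
import base64


def encode_prefixes(data: bytes) -> list:
    """Return the Base32 encoding of data[:1], data[:2], ... data[:len(data)]."""
    return [
        base64.b32encode(data[:n]).decode("ascii")
        for n in range(1, len(data) + 1)
    ]


for n, encoded in enumerate(encode_prefixes(b"hello"), start=1):
    print(n, encoded)
# 1 NA======
# 2 NBSQ====
# 3 NBSWY===
# 4 NBSWY3A=
# 5 NBSWY3DP
```

Inputs of 1, 2, 3, and 4 leftover bytes produce 6, 4, 3, and 1 padding characters respectively; a full 5-byte block produces none.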
Use Cases:
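QR Codes: Base32 output uses only uppercase letters and digits, so it fits the character set of the QR alphanumeric mode. For example, QR codes used to set up authenticator apps typically carry the shared secret as a Base32 string.
File Transfers: When sending binary data over text-based or case-insensitive channels. The cost is size: every 5 input bytes become 8 output characters, an overhead of 60%.
Hashing and Security: Base32 is used in systems like OTP (One-Time Password) algorithms, where a shared secret key needs to be shown in a form that people can read and type. Note that Base32 is only an encoding: it provides no encryption or confidentiality, and anyone can decode it.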
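As a rough illustration of the OTP use case, the sketch below generates a random secret and renders it as Base32 text, then decodes a user-typed version of it. The function names and the 20-byte default length are assumptions made for this example, and stripping the = padding reflects a convention some authenticator apps follow rather than a requirement of Base32 itself.

```python
import base64
import secrets


def new_otp_secret(num_bytes: int = 20) -> str:
    """Return a random secret as unpadded Base32 text (hypothetical helper)."""
    raw = secrets.token_bytes(num_bytes)
    return base64.b32encode(raw).decode("ascii").rstrip("=")


def decode_otp_secret(secret: str) -> bytes:
    """Decode a typed secret, tolerating spaces, lowercase, and missing padding."""
    cleaned = secret.replace(" ", "").upper()
    # Restore the '=' padding that b32decode expects.
    cleaned += "=" * (-len(cleaned) % 8)
    return base64.b32decode(cleaned)


secret = new_otp_secret()
print(secret)                          # 32 random Base32 characters
print(decode_otp_secret("nbsw y3dp"))  # b'hello'
```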