The QR Code® is a two-dimensional code developed by Denso Wave and standardized as JIS X 0510 and ISO/IEC 18004. This article explains its specifications (version, data capacity, encoding modes, structure, and masking) together with generated sample QR codes.
1. Basic Structure and Modules
A QR Code is made up of black and white cells, and the smallest unit is called a module. The square symbol consists of a data area and the function patterns that aid reading.
- Finder patterns (position detection): the large concentric-square marks in three corners. Used to detect orientation and position.
- Separators: the white borders surrounding the finder patterns.
- Timing patterns: one row and one column of alternating black and white modules running between the finder patterns. They serve as the reference for module coordinates (module count).
- Alignment patterns: small concentric squares present in version 2 and above. Used to correct distortion.
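- Format and version information: the format information stores the error correction level and the mask pattern; the version information (present in version 7 and above) stores the version number.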
- Quiet zone: the surrounding margin (4 modules or more recommended).
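
The layout of the fixed patterns listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not a generator: it places the three finder patterns, their separators, and the timing patterns on an empty grid, and leaves out alignment patterns, format and version information, and the data itself. The function names `finder_module` and `place_function_patterns` are made up for this example.

```python
from typing import List, Optional


def finder_module(r: int, c: int) -> bool:
    """Return True if module (r, c) of a 7x7 finder pattern is dark."""
    # Distance from the centre in "rings": ring 3 is the dark outer border,
    # ring 2 is light, and rings 0-1 form the dark 3x3 core.
    return max(abs(r - 3), abs(c - 3)) != 2


def place_function_patterns(size: int) -> List[List[Optional[int]]]:
    """Return a size x size grid: 1 = dark, 0 = light, None = not assigned."""
    grid: List[List[Optional[int]]] = [[None] * size for _ in range(size)]

    # Finder patterns in three corners, each with a one-module light separator.
    for top, left in ((0, 0), (0, size - 7), (size - 7, 0)):
        for r in range(-1, 8):
            for c in range(-1, 8):
                rr, cc = top + r, left + c
                if 0 <= rr < size and 0 <= cc < size:
                    inside = 0 <= r < 7 and 0 <= c < 7
                    grid[rr][cc] = int(inside and finder_module(r, c))

    # Timing patterns: row 6 and column 6, dark on even coordinates.
    for i in range(8, size - 8):
        grid[6][i] = int(i % 2 == 0)
        grid[i][6] = int(i % 2 == 0)
    return grid


if __name__ == "__main__":
    # Version 1 symbol (21 x 21 modules); "?" marks modules left unassigned.
    for row in place_function_patterns(21):
        print("".join("?" if m is None else "#" if m else "." for m in row))
```

Printing the grid makes the three corner marks and the dotted timing lines between them easy to see; everything shown as `?` is left for the remaining function patterns and the data area.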
2. Version and Module Count
A QR Code has versions 1 to 40, and the larger the number, the more cells there are, allowing more data to be stored. The number of modules per side is 4 x version + 17, so version 1 is 21x21 modules and version 40 is 177x177 modules.
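
As a quick check of the version-to-size relationship (4 x version + 17 modules per side), the following minimal sketch computes the side length for a few versions. The helper name `modules_per_side` is chosen for this example only.

```python
def modules_per_side(version: int) -> int:
    """Number of modules along one side of a QR Code of the given version."""
    if not 1 <= version <= 40:
        raise ValueError("QR Code versions range from 1 to 40")
    return 4 * version + 17


if __name__ == "__main__":
    for v in (1, 2, 10, 40):
        n = modules_per_side(v)
        print(f"version {v:>2}: {n} x {n} modules")
```

Each version step adds four modules per side, which is why the cells look finer when the same printed size holds a higher version.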
The samples below encode the same URL at different versions. The amount of data is the same, but you can see that the cells become finer as the version increases.
3. Data Capacity and Encoding Modes
A QR Code uses four encoding modes, chosen according to the type of input data, to pack data efficiently. Digit-only data fits the most characters, while Kanji fits the fewest.
| Mode | Target | Max capacity (V40, error correction L) |
|---|---|---|
| Numeric | 0-9 | 7,089 characters |
| Alphanumeric | 0-9, A-Z, 9 symbols | 4,296 characters |
| 8-bit byte | binary (UTF-8, etc.) | 2,953 bytes |
| Kanji | Shift_JIS Kanji | 1,817 characters |
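
To make the difference between modes concrete, here is a rough sketch that picks the most compact of three modes for a string and estimates the payload size in bits. It assumes the usual packing rules (10 bits per 3 digits in numeric mode, 11 bits per 2 characters in alphanumeric mode, 8 bits per byte in byte mode), encodes byte-mode text as UTF-8, and ignores Kanji mode, the mode indicator, and the character count field. The names `choose_mode` and `payload_bits` are hypothetical helpers.

```python
DIGITS = set("0123456789")
# 0-9, A-Z and the nine symbols: space $ % * + - . / :
ALPHANUMERIC_CHARS = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")


def choose_mode(text: str) -> str:
    """Pick the most compact of numeric / alphanumeric / byte mode."""
    if text and all(ch in DIGITS for ch in text):
        return "numeric"
    if text and all(ch in ALPHANUMERIC_CHARS for ch in text):
        return "alphanumeric"
    return "byte"


def payload_bits(text: str) -> int:
    """Bits needed for the data itself (no mode indicator or count field)."""
    mode = choose_mode(text)
    n = len(text)
    if mode == "numeric":
        # 3 digits -> 10 bits; a trailing group of 2 -> 7 bits, of 1 -> 4 bits.
        return 10 * (n // 3) + (0, 4, 7)[n % 3]
    if mode == "alphanumeric":
        # 2 characters -> 11 bits; a trailing single character -> 6 bits.
        return 11 * (n // 2) + 6 * (n % 2)
    # Byte mode: 8 bits per encoded byte (UTF-8 assumed here).
    return 8 * len(text.encode("utf-8"))


if __name__ == "__main__":
    for sample in ("0123456789", "HELLO WORLD", "https://example.com"):
        print(f"{sample!r}: {choose_mode(sample)}, {payload_bits(sample)} bits")
```

Note that a lowercase URL falls back to byte mode because the alphanumeric set contains only uppercase letters.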
Even at the same version, the amount that fits varies with the type of content. Below are samples for numeric, alphanumeric, and URL (byte mode).
4. Error Correction Levels (L / M / Q / H)
A QR Code carries redundant recovery data so it can still be read even when partly soiled. There are four levels; higher levels are more robust against damage but require more cells for the same information.
| Level | Approx. recoverable | Use case |
|---|---|---|
| L | about 7% | capacity priority, clean environments |
| M | about 15% | general printed materials (standard) |
| Q | about 25% | somewhat harsh environments |
| H | about 30% | design QRs with logos, etc. |
The samples below encode the same URL at each level. As you raise the level from L to H, the cells become denser.
The mechanism of error correction (Reed-Solomon codes) is explained in detail in a separate article, "Understanding QR Code Error Correction Levels".
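
The chosen level is also written into the symbol itself: it forms part of the format information described in section 1, together with the mask number. The sketch below shows how that 15-bit value could be built, assuming the constants commonly given for ISO/IEC 18004 (two-bit level indicators L=01, M=00, Q=11, H=10, a BCH(15,5) code with generator 10100110111, and a final XOR with 101010000010010). Treat these values as assumptions to check against the standard; `format_information` is a name chosen for this example.

```python
# Two-bit error correction level indicators (assumed from ISO/IEC 18004).
EC_LEVEL_BITS = {"L": 0b01, "M": 0b00, "Q": 0b11, "H": 0b10}
FORMAT_GENERATOR = 0b10100110111      # BCH(15,5) generator polynomial
FORMAT_XOR_MASK = 0b101010000010010   # keeps the result from being all zeros


def format_information(level: str, mask_pattern: int) -> int:
    """Return the 15-bit format information for a level and mask number."""
    if not 0 <= mask_pattern <= 7:
        raise ValueError("mask pattern must be in the range 0-7")
    data = (EC_LEVEL_BITS[level] << 3) | mask_pattern  # 5 data bits

    # Polynomial division over GF(2): the 10-bit remainder is the BCH parity.
    remainder = data << 10
    for shift in range(4, -1, -1):
        if remainder & (1 << (shift + 10)):
            remainder ^= FORMAT_GENERATOR << shift

    return ((data << 10) | remainder) ^ FORMAT_XOR_MASK


if __name__ == "__main__":
    for level in "LMQH":
        print(level, format(format_information(level, 0), "015b"))
```

Because the ten parity bits protect only five data bits, a reader can usually recover the level and mask number even when part of the format information is damaged.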
5. Masking
To avoid large one-colored areas and patterns that a reader could mistake for function patterns, a QR Code applies one of eight mask patterns (0-7) to the data area, inverting the modules that the mask selects. The generator evaluates the candidates and automatically selects the most readable one, and which one was used is recorded in the format information. When reading, the same mask is applied again to remove it and recover the original data.
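
A minimal sketch of the masking step follows. It assumes the eight mask conditions as commonly listed for ISO/IEC 18004, with `i` as the row and `j` as the column, and it leaves out the penalty scoring that a real generator uses to pick the best pattern. `apply_mask` and the `is_data` grid (marking which modules belong to the data area) are names invented for this example.

```python
# A module is inverted when the condition for the chosen pattern is true.
MASK_CONDITIONS = [
    lambda i, j: (i + j) % 2 == 0,
    lambda i, j: i % 2 == 0,
    lambda i, j: j % 3 == 0,
    lambda i, j: (i + j) % 3 == 0,
    lambda i, j: (i // 2 + j // 3) % 2 == 0,
    lambda i, j: (i * j) % 2 + (i * j) % 3 == 0,
    lambda i, j: ((i * j) % 2 + (i * j) % 3) % 2 == 0,
    lambda i, j: ((i + j) % 2 + (i * j) % 3) % 2 == 0,
]


def apply_mask(grid, is_data, pattern):
    """Return a copy of grid with the mask XORed onto the data modules only."""
    condition = MASK_CONDITIONS[pattern]
    return [
        [
            bit ^ 1 if is_data[i][j] and condition(i, j) else bit
            for j, bit in enumerate(row)
        ]
        for i, row in enumerate(grid)
    ]


if __name__ == "__main__":
    # Toy 4x4 "data area" of all-light modules; function patterns are ignored.
    blank = [[0] * 4 for _ in range(4)]
    everywhere = [[True] * 4 for _ in range(4)]
    masked = apply_mask(blank, everywhere, 0)
    for row in masked:
        print("".join("#" if bit else "." for bit in row))
    # Applying the same mask a second time restores the original modules.
    print(apply_mask(masked, everywhere, 0) == blank)
```

Since masking is an XOR, the reader undoes it with exactly the same operation once it has read the mask number from the format information.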
6. Derived Standards: Micro QR / rMQR
- Micro QR: a space-saving version with only one finder pattern that stores a small amount of data in a small area.
- rMQR (Rectangular Micro QR): a rectangular type that fits into horizontally or vertically elongated spaces.
- iQR / SQRC / Frame QR: extended versions for larger capacity, restricted reading, design frames, and more.
Frequently Asked Questions (FAQ)
What is a QR Code version?
Versions range from 1 to 40, and the larger the number, the more cells (modules) there are, allowing more data to be stored. The number of modules per side is determined by "4 x version + 17", so V1 is 21x21 and V40 is 177x177.
How much data can a QR Code store?
It packs data efficiently using four encoding modes that match the type of input data. The maximum capacity (V40, error correction L) is 7,089 characters in numeric mode, 4,296 characters in alphanumeric mode, 2,953 bytes in 8-bit byte mode, and 1,817 characters in Kanji mode.
What is QR Code masking?
Masking is a process that overlays one of eight mask patterns (0-7) on the data area to avoid an imbalance between black and white and to prevent false detection. The most readable pattern is selected automatically during generation, and which one was used is recorded in the format information.