In computer programming, a magic number is a numeric literal in source code that has a special meaning that is not clear to the reader. Also in computing, but not limited to programming, the term is used for a number that identifies a particular concept but whose meaning is unclear without additional knowledge. For example, some file formats are identified by a magic number embedded in the file, also known as a file signature. Also, a number that is relatively uniquely associated with a particular concept, such as a universally unique identifier, might be classified as a magic number.

Numeric literal

A magic number or magic constant is a numeric literal in source code which has a special meaning that is less than clear in context. This is considered an anti-pattern and breaks one of the oldest rules of programming, dating back to the COBOL, FORTRAN and PL/1 manuals of the 1960s.

For example, in the following code that computes a price after tax, <code>1.05</code> is a magic number since the value encodes the sales tax rate, 5%, in a way that is less than obvious.

price_after_tax = 1.05 * price

The use of magic numbers in code obscures the developers' intent in choosing that number, increases opportunities for subtle errors, and makes it more difficult for the program to be adapted and extended in the future. As an example, it is difficult to tell whether every digit in <code>3.14159265358979323846</code> is correctly typed, or if this constant for pi can be truncated to <code>3.14159</code> without affecting the functionality of the program with its reduced precision. Replacing all significant magic numbers with named constants (also called explanatory variables) makes programs easier to read, understand and maintain.

The example above can be improved by adding a descriptively named variable:

TAX = 0.05

price_after_tax = (1.0 + TAX) * price

A good name can result in code that is more easily understood by a maintainer who is not the original author, and even by the original author after a period of time.

Pre-Sixth Edition Unix versions read an executable file into memory and jumped to the first low memory address of the program, relative address zero. With the development of paged versions of Unix, a header was created to describe the executable image components. Also, a branch instruction was inserted as the first word of the header to skip the header and start the program. In this way a program could be run in the older relocatable memory reference (regular) mode or in paged mode. As more executable formats were developed, new constants were added by incrementing the branch offset.

In the Sixth Edition source code of the Unix program loader, the exec() function read the executable (binary) image from the file system. The first 8 bytes of the file were a header containing the sizes of the program (text) and initialized (global) data areas. Also, the first 16-bit word of the header was compared to two constants to determine if the executable image contained relocatable memory references (normal), the newly implemented paged read-only executable image, or the separated instruction and data paged image. There was no mention of the dual role of the header constant, but the high-order byte of the constant was, in fact, the operation code for the PDP-11 branch instruction (octal 000407 or hex 0107). Adding seven to the program counter showed that if this constant was executed, it would branch the Unix exec() service over the executable image's eight-byte header and start the program.
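The relationship between the octal and hexadecimal forms of the header constant, and its split into a high-order and a low-order byte, can be checked with a few lines of Python. This is only an illustrative sketch: the variable names are hypothetical, and it does not model PDP-11 address arithmetic or the actual exec() code.

```python
# Illustrative only: the header constant quoted above, written in octal.
HEADER_MAGIC = 0o000407

# The same 16-bit value written in hexadecimal.
assert HEADER_MAGIC == 0x0107

# Split the 16-bit word into its two bytes.
high_byte = HEADER_MAGIC >> 8     # the branch operation code
low_byte = HEADER_MAGIC & 0xFF    # the branch offset, seven

print(f"high byte = {high_byte:#04x}, low byte = {low_byte}")
```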

Since the Sixth and Seventh Editions of Unix employed paging code, the dual role of the header constant was hidden. That is, the exec() service read the executable file header (meta) data into a kernel space buffer, but read the executable image into user space, thereby not using the constant's branching feature. Magic number creation was implemented in the Unix linker and loader and magic number branching was probably still used in the suite of stand-alone diagnostic programs that came with the Sixth and Seventh Editions. Thus, the header constant did provide an illusion and met the criteria for magic.

In Version Seven Unix, the header constant was not tested directly, but assigned to a variable labeled ux_mag and subsequently referred to as the magic number. Probably because of its uniqueness, the term magic number came to mean executable format type, then expanded to mean file system type, and expanded again to mean any type of file.

In files

Magic numbers are common in programs across many operating systems. Magic numbers implement strongly typed data and are a form of in-band signaling to the controlling program that reads the data type(s) at program run-time. Many files have such constants that identify the contained data. Detecting such constants in files is a simple and effective way of distinguishing between many file formats and can yield further run-time information.

;Examples

  • Compiled Java class files (bytecode) and Mach-O binaries start with hex <code>CA&nbsp;FE&nbsp;BA&nbsp;BE</code>. When compressed with Pack200 the bytes are changed to <code>CA&nbsp;FE&nbsp;D0&nbsp;0D</code>.
  • GIF image files have the ASCII code for "GIF89a" (<code>47&nbsp;49&nbsp;46&nbsp;38&nbsp;39&nbsp;61</code>) or "GIF87a" (<code>47&nbsp;49&nbsp;46&nbsp;38&nbsp;37&nbsp;61</code>).
  • JPEG image files begin with <code>FF&nbsp;D8</code> and end with <code>FF&nbsp;D9</code>. JPEG/JFIF files contain the null-terminated string "JFIF" (<code>4A&nbsp;46&nbsp;49&nbsp;46&nbsp;00</code>). JPEG/Exif files contain the null-terminated string "Exif" (<code>45&nbsp;78&nbsp;69&nbsp;66&nbsp;00</code>), followed by more metadata about the file.
  • PNG image files begin with an 8-byte signature which identifies the file as a PNG file and allows detection of common file transfer problems: "\211PNG\r\n\032\n" (<code>89&nbsp;50&nbsp;4E&nbsp;47&nbsp;0D&nbsp;0A&nbsp;1A&nbsp;0A</code>). That signature contains various newline characters to permit detecting unwarranted automated newline conversions, such as transferring the file using FTP with the ASCII transfer mode instead of the binary mode.
  • Standard MIDI audio files have the ASCII code for "MThd" (MIDI Track header, <code>4D&nbsp;54&nbsp;68&nbsp;64</code>) followed by more metadata.
  • Unix or Linux scripts may start with a shebang ("#!", <code>23&nbsp;21</code>) followed by the path to an interpreter, if the interpreter is likely to be different from the one from which the script was invoked.
  • ELF executables start with the byte <code>7F</code> followed by "ELF" (<code>7F&nbsp;45&nbsp;4C&nbsp;46</code>).
  • PostScript files and programs start with "%!" (<code>25&nbsp;21</code>).
  • PDF files start with "%PDF" (hex <code>25&nbsp;50&nbsp;44&nbsp;46</code>).
  • DOS MZ executable files and the EXE stub of the Microsoft Windows PE (Portable Executable) files start with the characters "MZ" (<code>4D&nbsp;5A</code>), the initials of the designer of the file format, Mark Zbikowski. The definition allows the uncommon "ZM" (<code>5A&nbsp;4D</code>) as well for dosZMXP, a non-PE EXE.
  • The Berkeley Fast File System superblock format is identified as either <code>19&nbsp;54&nbsp;01&nbsp;19</code> or <code>01&nbsp;19&nbsp;54</code> depending on version; both represent the birthday of the author, Marshall Kirk McKusick.
  • The Master Boot Record of bootable storage devices on almost all IA-32 IBM PC compatibles has a code of <code>55&nbsp;AA</code> as its last two bytes.
  • Executables for the Game Boy and Game Boy Advance handheld video game systems have a 48-byte or 156-byte magic number, respectively, at a fixed spot in the header. This magic number encodes a bitmap of the Nintendo logo.
  • Amiga software executable Hunk files running on Amiga classic 68000 machines all started with the hexadecimal number $000003f3, nicknamed the "Magic Cookie."
  • In the Amiga, the only absolute address in the system is hex $0000 0004 (memory location 4), which contains the start location called SysBase, a pointer to exec.library, the so-called kernel of Amiga.
  • PEF files, used by the classic Mac OS and BeOS for PowerPC executables, contain the ASCII code for "Joy!" (<code>4A&nbsp;6F&nbsp;79&nbsp;21</code>) as a prefix.
  • TIFF files begin with either "II" or "MM" followed by 42 as a two-byte integer in little or big endian byte ordering. "II" is for Intel, which uses little endian byte ordering, so the magic number is <code>49&nbsp;49&nbsp;2A&nbsp;00</code>. "MM" is for Motorola, which uses big endian byte ordering, so the magic number is <code>4D&nbsp;4D&nbsp;00&nbsp;2A</code>.
  • Unicode text files encoded in UTF-16 often start with the Byte Order Mark to detect endianness (<code>FE&nbsp;FF</code> for big endian and <code>FF&nbsp;FE</code> for little endian). On Microsoft Windows, UTF-8 text files often start with the UTF-8 encoding of the same character, <code>EF&nbsp;BB&nbsp;BF</code>.
  • LLVM Bitcode files start with "BC" (<code>42&nbsp;43</code>).
  • WAD files start with "IWAD" or "PWAD" (for Doom), "WAD2" (for Quake) and "WAD3" (for Half-Life).
  • Microsoft Compound File Binary Format (mostly known as one of the older formats of Microsoft Office documents) files start with <code>D0&nbsp;CF&nbsp;11&nbsp;E0</code>, which is visually suggestive of the word "DOCFILE0".
  • Headers in ZIP files often show up in text editors as "PK♥♦" (<code>50&nbsp;4B&nbsp;03&nbsp;04</code>), where "PK" are the initials of Phil Katz, author of DOS compression utility PKZIP.
  • Headers in 7z files begin with "7z" (full magic number: <code>37&nbsp;7A&nbsp;BC&nbsp;AF&nbsp;27&nbsp;1C</code>).
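The way the PNG signature exposes newline conversion can be illustrated with a short Python sketch. The two conversion functions are hypothetical stand-ins for what a text-mode transfer might do to a file; they are not taken from any FTP implementation.

```python
# The 8-byte PNG signature from the list above: 89 50 4E 47 0D 0A 1A 0A.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def is_png(data: bytes) -> bool:
    """Return True if the data starts with an intact PNG signature."""
    return data.startswith(PNG_SIGNATURE)


def simulate_crlf_to_lf(data: bytes) -> bytes:
    """Hypothetical text-mode transfer that rewrites CR LF as LF."""
    return data.replace(b"\r\n", b"\n")


def simulate_lf_to_crlf(data: bytes) -> bytes:
    """Hypothetical text-mode transfer that rewrites LF as CR LF."""
    return data.replace(b"\n", b"\r\n")


original = PNG_SIGNATURE + b"...image data..."
print(is_png(original))                       # intact signature
print(is_png(simulate_crlf_to_lf(original)))  # CR LF pair was altered
print(is_png(simulate_lf_to_crlf(original)))  # lone LF was altered
```

Because the signature contains both a CR LF pair and a lone LF, a conversion in either direction changes at least one byte, so the check fails before any image data is decoded.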

;Detection

The Unix utility program <code>file</code> can read and interpret magic numbers from files, and the file which is used to parse the information is called magic. The Windows utility TrID has a similar purpose.
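The general idea of signature-based detection can be sketched in Python as a table lookup on the first bytes of a file. This is a deliberately minimal illustration using a few of the signatures listed above. The names <code>SIGNATURES</code>, <code>identify</code> and <code>identify_file</code> are hypothetical, and the sketch makes no claim about how <code>file</code> or TrID are actually implemented.

```python
from typing import Optional

# A small, hand-picked subset of the signatures listed in this section.
SIGNATURES = [
    (bytes.fromhex("89504E470D0A1A0A"), "PNG image"),
    (bytes.fromhex("377ABCAF271C"), "7z archive"),
    (b"GIF89a", "GIF image"),
    (b"GIF87a", "GIF image"),
    (bytes.fromhex("504B0304"), "ZIP archive"),
    (b"\x7fELF", "ELF executable"),
    (b"%PDF", "PDF document"),
    (bytes.fromhex("FFD8"), "JPEG image"),
    (b"%!", "PostScript"),
    (b"#!", "script with shebang"),
]


def identify(data: bytes) -> Optional[str]:
    """Return a description if the data starts with a known signature."""
    for signature, description in SIGNATURES:
        if data.startswith(signature):
            return description
    return None


def identify_file(path: str) -> Optional[str]:
    """Read only the first few bytes of a file and try to identify it."""
    with open(path, "rb") as handle:
        return identify(handle.read(16))


print(identify(b"%PDF-1.7\n"))
print(identify(b"plain text"))
```

A prefix match of this kind only suggests a format; a two-byte signature such as "%!" can easily occur by chance, which is one reason practical tools combine many rules.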

In protocols

;Examples

  • The OSCAR protocol, used in AIM/ICQ, prefixes requests with <code>2A</code>.
  • In the RFB protocol used by VNC, a client starts its conversation with a server by sending "RFB" (<code>52&nbsp;46&nbsp;42</code>, for "Remote Frame Buffer") followed by the client's protocol version number.
  • In the SMB protocol used by Microsoft Windows, each SMB request or server reply begins with <code>FF&nbsp;53&nbsp;4D&nbsp;42</code>, or <code>\xFFSMB</code> at the start of the SMB request.
  • In the MSRPC protocol used by Microsoft Windows, each TCP-based request begins with <code>05</code> at the start of the request (representing Microsoft DCE/RPC Version 5), followed immediately by a <code>00</code> or <code>01</code> for the minor version. In UDP-based MSRPC requests the first byte is always <code>04</code>.
  • In COM and DCOM, marshalled interfaces, called OBJREFs, always start with the byte sequence "MEOW" (<code>4D&nbsp;45&nbsp;4F&nbsp;57</code>). Debugging extensions (used for DCOM channel hooking) are prefaced with the byte sequence "MARB" (<code>4D&nbsp;41&nbsp;52&nbsp;42</code>).
  • Unencrypted BitTorrent peer handshakes begin with a single byte containing the decimal value 19 (hex <code>13</code>) representing the header length, followed immediately by the phrase "BitTorrent protocol" at byte position 1.
  • eDonkey2000/eMule traffic begins with a single byte representing the client version. Currently <code>E3</code> represents an eDonkey client, <code>C5</code> represents eMule, and <code>D4</code> represents compressed eMule.
  • The first 4 bytes of a block in the Bitcoin Blockchain contain a magic number which serves as the network identifier. The value is <code>D9&nbsp;B4&nbsp;BE&nbsp;F9</code>, which indicates the main network, while <code>DA&nbsp;B5&nbsp;BF&nbsp;FA</code> indicates the testnet.
  • SSL transactions always begin with a "client hello" message. The record encapsulation scheme used to prefix all SSL packets consists of two- and three-byte header forms. Typically an SSL version 2 client hello message is prefixed with an <code>80</code> and an SSLv3 server response to a client hello begins with <code>16</code> (though this may vary).
  • DHCP packets use a "magic cookie" value of <code>63&nbsp;82&nbsp;53&nbsp;63</code> at the start of the options section of the packet. This value is included in all DHCP packet types.
  • HTTP/2 connections start with the 24-character string <code>PRI&nbsp;*&nbsp;HTTP/2.0\r\n\r\nSM\r\n\r\n</code>. It is designed to avoid the processing of frames by servers and intermediaries which support earlier versions of HTTP but not 2.0.
  • The WebSocket opening handshake uses a string containing the UUIDv4 <code>258EAFA5-E914-47DA-95CA-C5AB0DC85B11</code>.
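A receiver can use such prefixes to recognise the protocol at the start of a byte stream. The following Python sketch checks a few of the prefixes listed above; the function name <code>classify_stream_start</code> and the returned labels are hypothetical, and real implementations validate far more than the opening bytes.

```python
from typing import Optional

# The HTTP/2 connection preface quoted above; it is 24 bytes long.
HTTP2_PREFACE = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"

# A length byte of 19 followed by the 19-character phrase.
BITTORRENT_PREFIX = bytes([19]) + b"BitTorrent protocol"


def classify_stream_start(data: bytes) -> Optional[str]:
    """Guess the protocol from the first bytes of a stream (sketch only)."""
    if data.startswith(HTTP2_PREFACE):
        return "HTTP/2"
    if data.startswith(BITTORRENT_PREFIX):
        return "BitTorrent"
    if data.startswith(b"\xffSMB"):
        return "SMB"
    if data.startswith(b"RFB"):
        return "RFB"
    return None


print(len(HTTP2_PREFACE))
print(classify_stream_start(b"RFB 003.008\n"))
```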

In interfaces

Magic numbers are common in API functions and interfaces across many operating systems, including DOS, Windows and NetWare:

;Examples

  • IBM PC-compatible BIOSes use magic values <code>00&nbsp;00</code> and <code>12&nbsp;34</code> to decide if the system should count up memory or not on reboot, thereby performing a cold or a warm boot. These values are also used by EMM386 memory managers intercepting boot requests.
  • The MS-DOS disk cache SMARTDRV (codenamed "Bambi") uses magic values <code>BA&nbsp;BE</code> and <code>EB&nbsp;AB</code> in API functions.

GUID

It is possible to create or alter globally unique identifiers (GUIDs) so that they are memorable, but this is highly discouraged as it compromises their strength as near-unique identifiers. The specifications for generating GUIDs and UUIDs are quite complex, which is what leads to them being virtually unique, if properly implemented.

Microsoft Windows product ID numbers for Microsoft Office products sometimes end with <code>0000-0000-0000000FF1CE</code> ("OFFICE"), such as <code>90160000-008C-0000-0000-0000000FF1CE</code>, the product ID for the "Office 16 Click-to-Run Extensibility Component".

Java uses several GUIDs starting with <code>CAFEEFAC</code>.

In the GUID Partition Table of the GPT partitioning scheme, BIOS Boot partitions use the special GUID <code>21686148-6449-6E6F-744E-656564454649</code> which does not follow the GUID definition; instead, it is formed by using the ASCII codes for the string <code>Hah!IdontNeedEFI</code> partially in little endian order.
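The "partially little endian" layout can be reproduced with Python's standard <code>uuid</code> module, whose <code>bytes_le</code> attribute stores the first three GUID fields in little-endian order and the remaining bytes unchanged. This is a small sketch for illustration; the variable names are hypothetical.

```python
import uuid

# The BIOS Boot partition GUID quoted above.
BIOS_BOOT_GUID = uuid.UUID("21686148-6449-6E6F-744E-656564454649")

# bytes_le: first three fields little endian, last eight bytes as written.
stored_bytes = BIOS_BOOT_GUID.bytes_le

# Read as ASCII, the 16 bytes spell out the string.
print(stored_bytes.decode("ascii"))
```

Reading the GUID's bytes in plain big-endian order (<code>BIOS_BOOT_GUID.bytes</code>) would instead begin with "!haH", which is why the string only appears when the first three fields are byte-swapped.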

Debug value

Magic debug values are specific values written to memory during allocation or deallocation, so that it will later be possible to tell whether or not they have become corrupted, and to make it obvious when values taken from uninitialized memory are being used. Memory is usually viewed in hexadecimal, so memorable repeating or hexspeak values are common. Numerically odd values may be preferred so that processors without byte addressing will fault when attempting to use them as pointers (which must fall at even addresses). Values should be chosen that are away from likely addresses (the program code, static data, heap data, or the stack). Similarly, they may be chosen so that they are not valid codes in the instruction set for the given architecture.

Since it is very unlikely, although possible, that a 32-bit integer would take this specific value, the appearance of such a number in a debugger or memory dump most likely indicates an error such as a buffer overflow or an uninitialized variable.
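The technique can be illustrated with a toy allocator in Python that surrounds each block with guard bytes and fills the usable area with a recognizable pattern. The byte values <code>CD</code> and <code>FD</code> are borrowed from the examples listed in this section; the function names and the four-byte guard size are assumptions made for this sketch, which does not reproduce any real debug heap.

```python
FILL_BYTE = 0xCD    # marks allocated but uninitialized memory
GUARD_BYTE = 0xFD   # marks guard bytes before and after the block
GUARD_LEN = 4       # guard size chosen arbitrarily for this sketch


def debug_alloc(size: int) -> bytearray:
    """Return guard bytes + filled payload + guard bytes."""
    guard = [GUARD_BYTE] * GUARD_LEN
    return bytearray(guard + [FILL_BYTE] * size + guard)


def guards_intact(block: bytearray) -> bool:
    """Detect writes that ran past either end of the payload."""
    guard = bytes([GUARD_BYTE]) * GUARD_LEN
    return bytes(block[:GUARD_LEN]) == guard and bytes(block[-GUARD_LEN:]) == guard


def looks_uninitialized(block: bytearray, offset: int, length: int = 4) -> bool:
    """Check whether part of the payload still holds the fill pattern."""
    start = GUARD_LEN + offset
    return bytes(block[start:start + length]) == bytes([FILL_BYTE]) * length


block = debug_alloc(8)
print(looks_uninitialized(block, 0))   # payload not yet written

# Simulate a buffer overflow: write 9 bytes into an 8-byte payload.
block[GUARD_LEN:GUARD_LEN + 9] = b"A" * 9
print(guards_intact(block))            # trailing guard was overwritten
```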

Famous and common examples include:


{| class="wikitable"

|-

! style="background:#D0E0FF"| Code

! style="background:#D0E0FF"| Description

|-

| <code>00008123</code> || Used in MS Visual C++. Deleted pointers are set to this value so that they throw an exception if they are used afterwards; it is a more recognizable alias for the zero address. It is activated with the Security Development Lifecycle (/sdl) option.

|-

| <code>..FACADE</code> || "Facade", Used by a number of RTOSes.

|-

| <code>1BADB002</code> || "1 bad boot", Multiboot header magic number.

|-

| <code>8BADF00D</code> || "Ate bad food", Indicates that an Apple iOS application has been terminated because a watchdog timeout occurred.

|-

| <code>A5A5A5A5</code> || Used in embedded development because the alternating bit pattern (1010 0101) creates an easily recognized pattern on oscilloscopes and logic analyzers.

|-

| <code>A5</code> || Used in FreeBSD's PHK malloc(3) for debugging when /etc/malloc.conf is symlinked to "-J" to initialize all newly allocated memory as this value is not a NULL pointer or ASCII NUL character.

|-

| <code>ABABABAB</code> || Used by Microsoft's debug HeapAlloc() to mark "no man's land" guard bytes after allocated heap memory.

|-

| <code>ABADBABE</code> || "A bad babe", Used by Apple as the "Boot Zero Block" magic number.

|-

| <code>ABBABABE</code> || "ABBA babe", used by Driver: Parallel Lines memory heap.

|-

| <code>ABADCAFE</code> || "A bad cafe", Used to initialize all unallocated memory (Mungwall, AmigaOS).

|-

| <code>B16B00B5</code> || "Big Boobs", Formerly required by Microsoft's Hyper-V hypervisor to be used by Linux guests as the upper half of their "guest id".

|-

| <code>BAADF00D</code> || "Bad food", Used by Microsoft's debug HeapAlloc() to mark uninitialized allocated heap memory.

|-

| <code>BEBEBEBE</code> || Used by AddressSanitizer to fill allocated but not initialized memory.

|-

| <code>BEEFCACE</code> || "Beef cake", Used by Microsoft .NET as a magic number in resource files.

|-

| <code>C00010FF</code> || "Cool off", Indicates Apple iOS app was killed by the operating system in response to a thermal event.

|-

| <code>CDCDCDCD</code> || Used by Microsoft's C/C++ debug malloc() function to mark uninitialized heap memory, usually returned from <code>HeapAlloc</code>.

|-

| <code>DEFEC8ED</code> || "Defecated", Used for OpenSolaris core dumps.

|-

| <code>DEADDEAD</code> || "Dead Dead" indicates that the user deliberately initiated a crash dump from either the kernel debugger or the keyboard under Microsoft Windows.

|-

|<code>D00D2BAD</code>

|"Dude, Too Bad", Used by Safari crashes on macOS Big Sur.

|-

|<code>D00DF33D</code>

|"Dude feed", Used by the devicetree to mark the start of headers.

|-

| <code>EBEBEBEB</code> || From MicroQuill's SmartHeap.

|-

| <code>FADEDEAD</code> || "Fade dead", Comes at the end to identify every AppleScript script.

|-

| <code>FDFDFDFD</code> || Used by Microsoft's C/C++ debug malloc() function to mark "no man's land" guard bytes before and after allocated heap memory.

|-

| <code>FEE1DEAD</code> || "Feel dead", Used by Linux reboot() syscall.

|-

| <code>FEEDFACE</code> || "Feed face", Seen in Mach-O binaries on Apple Inc.'s Mac OS X platform. On Sun Microsystems' Solaris, marks the red zone (KMEM_REDZONE_PATTERN).

Also used by VLC player and some IP cameras in the RTP/RTCP protocol; VLC player sends the four bytes in the order of the endianness of the system. Some IP cameras expect the player to send this magic number and do not start the stream if it is not received.

|-

| <code>FEEEFEEE</code> || "Fee fee", Used by Microsoft's debug HeapFree() to mark freed heap memory. Some nearby internal bookkeeping values may have the high word set to FEEE as well.