thumb|AMD Opteron, the first CPU to introduce the x86-64 extensions in April 2003
thumb|right|The five-volume set of the x86-64 Architecture Programmer's Manual, as published and distributed by AMD in 2002
x86-64 (also known as x86_64, AMD64, Intel 64 and x64) is a 64-bit extension of the x86 instruction set. It was announced in 1999 and first available in the AMD Opteron family in 2003. It introduces two new operating modes: 64-bit mode and compatibility mode, along with a new four-level paging mechanism.
In 64-bit mode, x86-64 supports significantly larger amounts of virtual memory and physical memory compared to its 32-bit predecessors, allowing programs to utilize more memory. The architecture expands the number of general-purpose registers from 8 to 16, all fully general-purpose, and extends their width to 64 bits.
Floating-point arithmetic is supported through mandatory SSE2 instructions in 64-bit mode. While the older x87 FPU and MMX registers are still available, they are generally superseded by a set of sixteen 128-bit vector registers (XMM registers). Each of these vector registers can store one or two double-precision floating-point numbers, up to four single-precision floating-point numbers, or various integer formats.
In 64-bit mode, instructions are modified to support 64-bit operands and 64-bit addressing mode.
The x86-64 architecture defines a compatibility mode that allows 16-bit and 32-bit user applications to run unmodified alongside 64-bit applications, provided the 64-bit operating system supports them. Since the full x86-32 instruction sets remain implemented in hardware without the need for emulation, these older executables can run with little or no performance penalty, while newer or modified applications can take advantage of new features of the processor design to achieve performance improvements. Also, processors supporting x86-64 still power on in real mode to maintain backward compatibility with the original 8086 processor, as has been the case with x86 processors since the introduction of protected mode with the 80286.
The original specification, created by AMD and released in 2000, has been implemented by AMD, Intel, and VIA. The AMD K8 microarchitecture, in the Opteron and Athlon 64 processors, was the first to implement it. This was the first significant addition to the x86 architecture designed by a company other than Intel. Intel was forced to follow suit and introduced a modified NetBurst family which was software-compatible with AMD's specification. VIA Technologies introduced x86-64 in their VIA Isaiah architecture, with the VIA Nano.
The x86-64 architecture was quickly adopted for desktop and laptop personal computers and servers which were commonly configured for 16 GiB (gibibytes) of memory or more. It has effectively replaced the discontinued Intel Itanium architecture (formerly IA-64), which was originally intended to replace the x86 architecture. x86-64 and Itanium are not compatible on the native instruction set level, and operating systems and applications compiled for one architecture cannot be run on the other natively.
In October 2024, Intel and AMD jointly formed the x86 Ecosystem Advisory Group. Its goals included enhancing software consistency and standardizing x86 interfaces and features.
AMD64
thumb|upright=0.5|AMD64 logo
History
AMD64 (also variously referred to by AMD in their literature and documentation as "AMD 64-bit Technology" and "AMD x86-64 Architecture") was created as an alternative to the radically different IA-64 architecture designed by Intel and Hewlett-Packard, which was backward-incompatible with IA-32, the 32-bit extension of the x86 architecture. AMD originally announced AMD64 in 1999 with a full specification available in August 2000. As AMD was never invited to be a contributing party for the IA-64 architecture and any kind of licensing seemed unlikely, the AMD64 architecture was positioned by AMD from the beginning as an evolutionary way to add 64-bit computing capabilities to the existing x86 architecture while supporting legacy 32-bit x86 code, as opposed to Intel's approach of creating an entirely new, completely x86-incompatible 64-bit architecture with IA-64.
The first AMD64-based processor, the Opteron, was released in April 2003.
Implementations
AMD's processors implementing the AMD64 architecture include Opteron, Athlon 64, Athlon 64 X2, Athlon 64 FX, Athlon II (followed by "X2", "X3", or "X4" to indicate the number of cores, and XLT models), Turion 64, Turion 64 X2, Sempron ("Palermo" E6 stepping and all "Manila" models), Phenom (followed by "X3" or "X4" to indicate the number of cores), Phenom II (followed by "X2", "X3", "X4" or "X6" to indicate the number of cores), FX, Fusion/APU and Ryzen/Epyc.
Architectural features
The primary defining characteristic of AMD64 is the availability of 64-bit general-purpose processor registers (for example, <code>rax</code>), 64-bit integer arithmetic and logical operations, and 64-bit virtual addresses. The designers took the opportunity to make other improvements as well.
Notable changes in the 64-bit extensions include:
; 64-bit integer capability
: All general-purpose registers (GPRs) are expanded from 32 bits to 64 bits, and all arithmetic and logical operations, memory-to-register and register-to-memory operations, etc., can operate directly on 64-bit integers. Pushes and pops on the stack default to 8-byte strides, and pointers are 8 bytes wide.
; Additional registers
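: In addition to increasing the size of the general-purpose registers, the number of named general-purpose registers is increased from eight (i.e. <code>eax</code>, <code>ecx</code>, <code>edx</code>, <code>ebx</code>, <code>esp</code>, <code>ebp</code>, <code>esi</code>, <code>edi</code>) in x86 to 16 (i.e. <code>rax</code>, <code>rcx</code>, <code>rdx</code>, <code>rbx</code>, <code>rsp</code>, <code>rbp</code>, <code>rsi</code>, <code>rdi</code>, <code>r8</code>, <code>r9</code>, <code>r10</code>, <code>r11</code>, <code>r12</code>, <code>r13</code>, <code>r14</code>, <code>r15</code>). It is therefore possible to keep more local variables in registers rather than on the stack, and to let registers hold frequently accessed constants; arguments for small and fast subroutines may also be passed in registers to a greater extent.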
: AMD64 still has fewer registers than many RISC instruction sets (e.g. Power ISA has 32 GPRs; 64-bit ARM, RISC-V I, SPARC, Alpha, MIPS, and PA-RISC have 31) or VLIW-like machines such as the IA-64 (which has 128 registers). However, an AMD64 implementation may have far more internal registers than the number of architectural registers exposed by the instruction set (see register renaming). (For example, AMD Zen cores have 168 64-bit integer and 160 128-bit vector floating-point physical internal registers.)
; Additional XMM (SSE) registers
: Similarly, the number of 128-bit XMM<!-- don't confuse this name with MMX; MMX has no hardware registers and is mapped to the FPU stack. Here we talk about SSE (XMM) registers --> registers (used for Streaming SIMD instructions) is also increased from 8 to 16.
: The traditional x87 FPU register stack is not included in the register file size extension in 64-bit mode, compared with the XMM registers used by SSE2, which did get extended. The x87 register stack is not a simple register file although it does allow direct access to individual registers by low cost exchange operations.
; Larger virtual address space
: The AMD64 architecture defines a 64-bit virtual address format, of which the low-order 48 bits are used in current implementations. This allows up to 256 TiB (2<sup>48</sup> bytes) of virtual address space, compared with just 4 GiB (2<sup>32</sup> bytes) for 32-bit x86.
: This means that very large files can be operated on by mapping the entire file into the process address space (which is often much faster than working with file read/write calls), rather than having to map regions of the file into and out of the address space.
; Larger physical address space
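: The original implementation of the AMD64 architecture implemented 40-bit physical addresses and so could address up to 1 TiB (2<sup>40</sup> bytes) of RAM. Later implementations, starting with the AMD 10h microarchitecture, extended this to 48-bit physical addresses and can therefore address up to 256 TiB (2<sup>48</sup> bytes) of RAM. The architecture permits extending this to 52 bits (limited by the page table entry format), which allows up to 4 PiB (2<sup>52</sup> bytes) of RAM. For comparison, 32-bit x86 processors are limited to 64 GiB of RAM in Physical Address Extension (PAE) mode, or 4 GiB of RAM without PAE mode.
; SSE instructions
: SSE and SSE2 are core instructions of the architecture. SSE3 instructions and later Streaming SIMD Extensions instruction sets are not standard features of the architecture.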
; No-Execute bit
: The No-Execute bit or NX bit (bit 63 of the page table entry) allows the operating system to specify which pages of virtual address space can contain executable code and which cannot. An attempt to execute code from a page tagged "no execute" will result in a memory access violation, similar to an attempt to write to a read-only page. This should make it more difficult for malicious code to take control of the system via "buffer overrun" or "unchecked buffer" attacks. A similar feature has been available on x86 processors since the 80286 as an attribute of segment descriptors; however, this works only on an entire segment at a time.
: Segmented addressing has long been considered an obsolete mode of operation, and all current PC operating systems in effect bypass it, setting all segments to a base address of zero and (in their 32-bit implementation) a size of 4 GiB. AMD was the first x86-family vendor to implement no-execute in linear addressing mode. The feature is also available in legacy mode on AMD64 processors, and recent Intel x86 processors, when PAE is used.
; Removal of older features
: A few "system programming" features of the x86 architecture were either unused or underused in modern operating systems and are either not available on AMD64 in long (64-bit and compatibility) mode, or exist only in limited form. These include segmented addressing (although the FS and GS segments are retained in vestigial form for use as extra-base pointers to operating system structures), the task state switch mechanism, and virtual 8086 mode.
Virtual address space details
Windows did not support the entire 48-bit address space until Windows 8.1, which was released in October 2013. Further extensions may allow full 64-bit virtual address space and physical memory with 12-bit page table descriptors and 16- or 21-bit memory offsets for 64 KiB and 2 MiB page allocation sizes; the page table entry would be expanded to 128 bits to support additional hardware flags for page size and virtual address space size.
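The 48-bit virtual address format can be illustrated with a short Python sketch. It checks whether a 64-bit value is a "canonical" address and splits it into the four table indices and the page offset used by four-level paging with 4 KiB pages. The canonical-form rule (bits 63 through 47 all equal) and the 9/9/9/9/12-bit split are assumptions taken from the architecture manuals rather than from the text above, and the function names are hypothetical.

```python
VA_BITS = 48      # implemented virtual-address bits with four-level paging
PAGE_SHIFT = 12   # 4 KiB pages
INDEX_BITS = 9    # 512 entries per paging-structure level


def is_canonical(addr, va_bits=VA_BITS):
    """True if bits 63..(va_bits - 1) of a 64-bit address are all equal."""
    upper = addr >> (va_bits - 1)                # bits 63..(va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return upper == 0 or upper == all_ones


def split_indices(addr):
    """Split a canonical address into four table indices and a page offset."""
    if not is_canonical(addr):
        raise ValueError("non-canonical address")
    mask = (1 << INDEX_BITS) - 1
    return {
        "pml4": (addr >> 39) & mask,   # level 4 (top-level) index
        "pdpt": (addr >> 30) & mask,   # level 3 index
        "pd": (addr >> 21) & mask,     # level 2 index
        "pt": (addr >> 12) & mask,     # level 1 index
        "offset": addr & ((1 << PAGE_SHIFT) - 1),
    }
```

Under these assumptions, <code>0x00007FFFFFFFFFFF</code> is the highest address of the lower canonical half and <code>0xFFFF800000000000</code> is the lowest address of the upper half; everything in between is rejected.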
Operating system limits
The operating system can also limit the virtual address space. Details, where applicable, are given in the "Operating system compatibility and characteristics" section.
Physical address space details
Current AMD64 processors support a physical address space of up to 2<sup>48</sup> bytes of RAM, or 256 TiB. The operating system may place additional limits on the amount of RAM that is usable or supported. Details on this point are given in the "Operating system compatibility and characteristics" section of this article.
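The address-space sizes quoted in this article follow directly from the address width. The Python sketch below converts a bit width into a size with a binary prefix, and also shows how the No-Execute bit (bit 63) and a physical address of up to 52 bits could be read from a page table entry. The placement of the physical address in bits 51 through 12 is an assumption based on the architecture manuals, and all names are hypothetical.

```python
UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]

NX_BIT = 1 << 63                               # No-Execute flag of an entry
PHYS_ADDR_MASK = ((1 << 52) - 1) & ~0xFFF      # assumed address field, bits 51..12


def address_space_size(bits):
    """Format 2**bits bytes with a binary prefix, e.g. 40 gives '1 TiB'."""
    unit_index, remainder = divmod(bits, 10)
    if not 0 <= unit_index < len(UNITS):
        raise ValueError("unsupported address width")
    return f"{1 << remainder} {UNITS[unit_index]}"


def pte_no_execute(pte):
    """True if the entry marks its page as non-executable."""
    return bool(pte & NX_BIT)


def pte_physical_address(pte):
    """Physical base address of the 4 KiB page the entry points to."""
    return pte & PHYS_ADDR_MASK
```

For example, <code>address_space_size(48)</code> yields the 256 TiB figure given above, and widths of 40, 52 and 57 bits correspond to 1 TiB, 4 PiB and 128 PiB.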
Operating modes
The architecture has two primary modes of operation: long mode and legacy mode.
{| class="wikitable" style="text-align: center;"
|-
! colspan="2" | Operating
! rowspan="2" | Operating system required
! rowspan="2" | Type of code being run
! colspan="2" | Size (in bits)
! rowspan="2" | No. of general-purpose registers
|-
! Mode
! Sub-mode
! Addresses
! Operands (default in italics)
|-
| rowspan="3" | Long mode
| 64-bit mode
| 64-bit OS, 64-bit UEFI firmware, or the previous two interacting via a 64-bit firmware's UEFI interface
| 64-bit
| 64
| 8, 16, 32, 64
| 16
|-
| rowspan="2" | Compatibility mode
| rowspan="2" | Bootloader or 64-bit OS
| 32-bit
| 32
| 8, 16, 32
| 8
|-
| 16-bit protected mode
| 16
| 8, 16, 32
| 8
|-
| rowspan="3" | Legacy mode
| rowspan="2" | Protected mode
| Bootloader, 32-bit OS, 32-bit UEFI firmware, or the latter two interacting via the firmware's UEFI interface
| 32-bit
| 32
| 8, 16, 32
| 8
|-
| 16-bit protected mode OS
| 16-bit protected mode
| 16
| 8, 16, 32
| 8
|-
| Real mode
| Bootloader or real mode OS
| real mode
| 16, 20, 21
| 8, 16, 32
| 8
|}
thumb|400px|State diagram of the x86-64 operating modes
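The relationship between the modes can be sketched as a small Python function. It assumes, based on the architecture manuals rather than on this article, that the processor selects the mode from the EFER.LMA bit, the L and D bits of the code segment descriptor, CR0.PE and EFLAGS.VM. The function and parameter names are hypothetical.

```python
def operating_mode(lma, cs_l=False, cs_d=False, pe=True, vm=False):
    """Return (mode, sub-mode, number of GPRs) for the given control bits.

    lma:  EFER.LMA, long mode active
    cs_l: L bit of the current code segment descriptor
    cs_d: D bit of the current code segment descriptor
    pe:   CR0.PE, protection enabled
    vm:   EFLAGS.VM, virtual 8086 mode
    """
    if lma:
        if cs_l:
            if cs_d:
                raise ValueError("CS.L=1 together with CS.D=1 is reserved")
            return ("long", "64-bit", 16)
        sub = "compatibility (32-bit)" if cs_d else "compatibility (16-bit)"
        return ("long", sub, 8)
    if not pe:
        return ("legacy", "real", 8)
    if vm:
        return ("legacy", "virtual 8086", 8)
    sub = "protected (32-bit)" if cs_d else "protected (16-bit)"
    return ("legacy", sub, 8)
```

Only 64-bit mode exposes all 16 general-purpose registers; every other mode in this sketch sees the original eight.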
Long mode
Long mode is the architecture's intended primary mode of operation; it is a combination of the processor's native 64-bit mode and a combined 32-bit and 16-bit compatibility mode. It is used by 64-bit operating systems. Under a 64-bit operating system, 64-bit programs run under 64-bit mode, and 32-bit and 16-bit protected mode applications (that do not need to use either real mode or virtual 8086 mode in order to execute at any time) run under compatibility mode. Real-mode programs and programs that use virtual 8086 mode at any time cannot be run in long mode unless those modes are emulated in software.
Intel 64
History
Intel's project was originally codenamed Yamhill (after the Yamhill River in Oregon's Willamette Valley). After several years of denying its existence, Intel announced at the February 2004 IDF that the project was indeed underway. Intel's chairman at the time, Craig Barrett, admitted that this was one of their worst-kept secrets.
Intel's name for this instruction set has changed several times. The name used at the IDF was CT (presumably for Clackamas Technology, another codename from an Oregon river), and in March 2004 Intel unveiled the "official" name EM64T (Extended Memory 64 Technology). In late 2006 Intel began instead using the name Intel 64 for its implementation, paralleling AMD's use of the name AMD64.
The first processor to implement Intel 64 was the multi-socket Xeon processor code-named Nocona, released in June 2004. In contrast, the initial Prescott chips (February 2004) did not enable this feature. Intel subsequently began selling Intel 64-enabled Pentium 4s using the E0 revision of the Prescott core, sold on the OEM market as the Pentium 4, model F. The E0 revision also added eXecute Disable (XD), Intel's name for the NX bit, to Intel 64, and was included in the then-current Xeon, code-named Irwindale. Intel's official launch of Intel 64 (under the name EM64T at that time) in mainstream desktop processors was the N0 stepping Prescott-2M in February 2005.
The first Intel mobile processor implementing Intel 64 is the Merom Core 2, which was released on July 27, 2006. None of Intel's earlier notebook CPUs (Core Duo, Pentium M, Celeron M, Mobile Pentium 4) implement Intel 64.
Implementations
Intel's processors implementing the Intel 64 architecture include the Pentium 4 F-series/5x1 series, 506, and 516, Celeron D models 3x1, 3x6, 355, 347, 352, 360, and 365 and all later Celerons, all models of Xeon since "Nocona", all models of Pentium Dual-Core processors since "Merom-2M", the Atom 230, 330, D410, D425, D510, D525, N450, N455, N470, N475, N550, N570, N2600 and N2800, all models of the Pentium D, Pentium Extreme Edition, Core 2, Core i9, Core i7, Core i5, and Core i3 processors, and the Xeon Phi 7200 series processors.
VIA's x86-64 implementation
VIA Technologies introduced their first implementation of the x86-64 architecture in 2008 after five years of development by its CPU division, Centaur Technology.
Codenamed "Isaiah", the 64-bit architecture was unveiled on January 24, 2008, and launched on May 29 under the VIA Nano brand name.
The processor supports a number of VIA-specific x86 extensions designed to boost efficiency in low-power appliances.
In pre-release marketing, VIA stated that the Isaiah architecture was expected to be twice as fast in integer performance and four times as fast in floating-point performance as the previous-generation VIA Esther at an equivalent clock speed. Power consumption was also expected to be on par with the previous-generation VIA CPUs, with thermal design power ranging from 5 W to 25 W.
Being a completely new design, the Isaiah architecture was built with support for features like the x86-64 instruction set and x86 virtualization which were unavailable on its predecessors, the VIA C7 line, while retaining their encryption extensions.
Microarchitecture levels
In 2020, through a collaboration between AMD, Intel, Red Hat, and SUSE, three microarchitecture levels (or feature levels) on top of the x86-64 baseline were defined: x86-64-v2, x86-64-v3, and x86-64-v4. These levels define specific features that can be targeted by programmers to provide compile-time optimizations. The features exposed by each level are as follows:
{| class="wikitable"
|+ style="font-size: 105%;" | CPU microarchitecture levels
|-
! Level name
! CPU features
! Example instruction
! Supported processors
|-
| rowspan="9" | x86-64 (baseline)
| CMOV ||
| rowspan="9" | Baseline for all x86-64 CPUs. Features match the common capabilities between the 2003 AMD AMD64 and the 2004 Intel EM64T initial implementations in the AMD K8 and the Intel Prescott processor families.<br />
- Intel
- Prescott processors
- Merom and Conroe processors (SSE3 and SSSE3 supported)
- Penryn and Wolfdale processors (SSE3 and SSSE3 supported; SSE4.1 supported, except Pentium Dual-Core and Celeron)
- AMD
- K8 processors (Later models supported SSE3)
- K10 processors (SSE3, SSE4a, <code>POPCNT</code> and <code>LZCNT</code> supported)
- Low-power Bobcat processors (SSE3, SSSE3, SSE4a, <code>POPCNT</code> and <code>LZCNT</code> supported)
- VIA
- Nano 3000, X2, QuadCore processors (SSE3 and SSE4.1 supported)
|-
| CX8 ||
|-
| FPU ||
|-
| FXSR ||
|-
| MMX ||
|-
| OSFXSR ||
|-
| SCE ||
|-
| SSE ||
|-
| SSE2 ||
|-
| rowspan="7" | x86-64-v2
| CMPXCHG16B ||
| rowspan="7" | Features match the 2008 Intel Nehalem architecture, excluding Intel-specific instructions.<br />
- Intel
- Nehalem and Westmere processors<br />(Except Pentium and Celeron: SSE4.1, SSE4.2 and <code>POPCNT</code> not supported)
- Sandy Bridge and Ivy Bridge processors (AVX supported)
- Low-power Silvermont, Goldmont, Goldmont Plus and Tremont processors
- AMD
- Bulldozer, Piledriver and Steamroller processors<br />(SSE4a, AVX and <code>LZCNT</code> supported)
- Low-power Jaguar-based and Puma-based processors<br />(SSE4a and <code>LZCNT</code> supported)
- VIA
- Nano QuadCore C4000-series processors
- Eden X4 processors
- Zhaoxin
- ZX-C processors and newer
|-
| LAHF-SAHF ||
|-
| POPCNT ||
|-
| SSE3 ||
|-
| SSE4_1 ||
|-
| SSE4_2 ||
|-
| SSSE3 ||
|-
| rowspan=9 | x86-64-v3
| AVX ||
| rowspan=9 | Features match the 2013 Intel Haswell architecture, excluding Intel-specific instructions.<br />
- Intel
- Haswell and Broadwell processors
- Low-power Gracemont processors
- AMD
- Excavator processors (SSE4a supported)
- Zen, Zen+, Zen 2, and Zen 3 processors (SSE4a supported)
- Zhaoxin
- YongFeng and Shijidadao processors
|-
| AVX2 ||
|-
| BMI1 ||
|-
| BMI2 ||
|-
| F16C ||
|-
| FMA ||
|-
| LZCNT ||
|-
| MOVBE ||
|-
| OSXSAVE ||
|-
| rowspan="5" | x86-64-v4
| AVX512F ||
| rowspan="5" | Features match the 2017 Intel Skylake-X architecture, excluding Intel-specific instructions.<br />
- Intel
- Skylake processors and newer
- Removed in Alder Lake processors and newer
- AMD
- Zen 4 processors and newer (SSE4a supported)
|-
| AVX512BW ||
|-
| AVX512CD ||
|-
| AVX512DQ ||
|-
| AVX512VL ||
|}
The x86-64 microarchitecture feature levels may also appear as AMD64-v1, AMD64-v2, and so on (or as AMD64_v1, and so on) in settings where the "AMD64" nomenclature is used. These are synonyms of the x86-64-vX names and are functionally identical. Examples include the Go language documentation and the Fedora Linux distribution.
All levels include features found in the previous levels. Instruction set extensions not concerned with general-purpose computation, including AES-NI and RDRAND, are excluded from the level requirements.
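Because the levels are cumulative, the highest level a CPU satisfies can be found by checking the feature sets in order. The Python sketch below does this using the feature names from the table above; how those names map to the flags reported by a particular operating system or tool is not covered here, and the function name is hypothetical.

```python
# Feature sets per level, in ascending order, using the names from the table.
LEVELS = [
    ("x86-64", {"CMOV", "CX8", "FPU", "FXSR", "MMX", "OSFXSR", "SCE",
                "SSE", "SSE2"}),
    ("x86-64-v2", {"CMPXCHG16B", "LAHF-SAHF", "POPCNT", "SSE3", "SSE4_1",
                   "SSE4_2", "SSSE3"}),
    ("x86-64-v3", {"AVX", "AVX2", "BMI1", "BMI2", "F16C", "FMA", "LZCNT",
                   "MOVBE", "OSXSAVE"}),
    ("x86-64-v4", {"AVX512F", "AVX512BW", "AVX512CD", "AVX512DQ",
                   "AVX512VL"}),
]


def highest_level(features):
    """Return the highest level whose features, and those of every lower
    level, are all present in `features`; None if the baseline is not met."""
    best = None
    for name, required in LEVELS:
        if not required <= set(features):
            break                      # a gap ends the search: levels are cumulative
        best = name
    return best
```

A CPU that has every v3 feature but lacks, say, <code>POPCNT</code> from v2 is therefore classified as baseline only.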
On most recent x86_64 Linux distributions, the x86_64 feature levels supported by a CPU can be checked using the command <code>/lib64/ld-linux-x86-64.so.2 --help</code> (available since glibc 2.33). The result appears at the end of the command's output:
<syntaxhighlight lang="output">
Subdirectories of glibc-hwcaps directories, in priority order:
x86-64-v4
x86-64-v3 (supported, searched)
x86-64-v2 (supported, searched)
</syntaxhighlight>
Here the x86-64-v4 feature level is not supported by the CPU, but x86-64-v3 and x86-64-v2 are; this CPU therefore lacks the AVX-512 support required at the v4 level.
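Output in this form can also be processed by a script. The following Python sketch extracts the supported levels from text shaped like the sample above; it assumes the section header and the "(supported, searched)" marker appear exactly as shown, which may not hold for every glibc version, and the function name is hypothetical.

```python
def supported_levels(help_output):
    """Return the level names marked as supported, in the order listed."""
    levels = []
    in_section = False
    for line in help_output.splitlines():
        if line.startswith("Subdirectories of glibc-hwcaps directories"):
            in_section = True          # the level list starts on the next line
            continue
        if not in_section:
            continue
        text = line.strip()
        if not text:
            break                      # a blank line ends the list
        if "(supported" in text:
            levels.append(text.split()[0])
    return levels
```

The text passed to the function would be the captured output of the command mentioned above; capturing it is left to the caller.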
Differences between AMD64 and Intel 64
Although nearly identical, the two instruction sets differ in the semantics of a few seldom-used machine instructions (or situations), which are mainly used for system programming. Unless instructed otherwise via <code>-march</code> settings, compilers generally produce executables (i.e. machine code) that avoid any differences, at least for ordinary application programs. The differences are therefore of interest mainly to developers of compilers, operating systems and similar software, which must deal with individual and special system instructions.
Recent implementations
- Intel 64 allows <code>SYSCALL</code>/<code>SYSRET</code> only in 64-bit mode (not in compatibility mode), and allows <code>SYSENTER</code>/<code>SYSEXIT</code> in both modes. AMD64 lacks <code>SYSENTER</code>/<code>SYSEXIT</code> in both sub-modes of long mode.
- When returning to a non-canonical address using <code>SYSRET</code>, AMD64 processors execute the general protection fault handler in privilege level 3, while on Intel 64 processors it is executed in privilege level 0.
- The <code>SYSRET</code> instruction will load a set of fixed values into the hidden part of the <code>SS</code> segment register (base-address, limit, attributes) on Intel 64 but leave the hidden part of <code>SS</code> unchanged on AMD64.
- On Intel 64, the <code>SYSRET</code> instruction unconditionally sets the privilege level (RPL) of the <code>SS</code> segment register to 3 (as the instruction causes a return to privilege level 3). On AMD64, the RPL is set to the corresponding bits in the STAR MSR (model-specific register), that is, bits 49 and 48.
- AMD64 requires a different microcode update format and control MSRs, while Intel 64 implements microcode update unchanged from their 32-bit only processors.
- Intel 64 lacks some MSRs that are considered architectural in AMD64. These include <code>SYSCFG</code>, <code>TOP_MEM</code>, and <code>TOP_MEM2</code>.
- Intel 64 lacks the ability to save and restore a reduced (and thus faster) version of the floating-point state (involving the <code>FXSAVE</code> and <code>FXRSTOR</code> instructions).
- On AMD64, the <code>FXSAVE</code>/<code>FXRSTOR</code> instructions will only save/restore x87 exception pointers (FCS/FIP, FDS/FDP, FOP) when an unmasked pending x87 exception is present. On Intel 64, these pointers are always saved and restored regardless of x87 exception status.
- In 64-bit mode, near branches with the 66H (operand size override) prefix behave differently. Intel 64 ignores this prefix: the instruction has a 32-bit sign extended offset, and instruction pointer is not truncated. AMD64 uses a 16-bit offset field in the instruction, and clears the top 48 bits of instruction pointer.
- On Intel 64 but not AMD64, the <code>REX.W</code> prefix can be used with the far-pointer instructions (<code>LFS</code>, <code>LGS</code>, <code>LSS</code>, <code>JMP FAR</code>, <code>CALL FAR</code>) to increase the size of their far pointer argument to 80 bits (64-bit offset + 16-bit segment).
- When the <code>MOVSXD</code> instruction is executed with a memory source operand and an operand-size of 16 bits, the memory operand will be accessed with a 16-bit read on Intel 64, but a 32-bit read on AMD64.
- When the <code>PUSH</code> instruction is used with a segment register and an operand-size of 32 bits in legacy/compatibility mode, AMD64 will zero-extend the register from 2 to 4 bytes and push that 4-byte value onto the stack. Intel 64 will also decrement the stack pointer by 4 but will just write 2 bytes, leaving a 2-byte hole that's not written.
- The <code>FCOMI</code>/<code>FCOMIP</code>/<code>FUCOMI</code>/<code>FUCOMIP</code> (x87 floating-point compare) instructions will clear the OF, SF and AF bits of EFLAGS on Intel 64, but leave these flag bits unmodified on AMD64.
- For the <code>VMASKMOVPS</code>/<code>VMASKMOVPD</code>/<code>VPMASKMOVD</code>/<code>VPMASKMOVQ</code> (AVX/AVX2 masked move to/from memory) instructions, Intel 64 architecturally guarantees that the instructions will not cause memory faults (e.g. page-faults and segmentation-faults) for any zero-masked lanes, while AMD64 does not provide such a guarantee.
- If the <code>RDRAND</code> instruction fails to obtain a random number (as indicated by <code>EFLAGS.CF=0</code>), the destination register is architecturally guaranteed to be set to 0 on Intel 64 but not AMD64.
- For the <code>VPINSRD</code> and <code>VPEXTRD</code> (AVX vector lane insert/extract) instructions outside 64-bit mode, AMD64 requires the instructions to be encoded with <code>VEX.W=0</code>, while Intel 64 also accepts encodings with <code>VEX.W=1</code>. (In 64-bit mode, both AMD64 and Intel 64 require <code>VEX.W=0</code>.)
- When alignment checking is enabled (<code>EFLAGS.AC=1</code>), AVX instructions with misaligned 128-bit or 256-bit memory operands, and the SSE4.2 <code>PCMP*STR*</code> instructions with misaligned 128-bit memory operands, will cause #AC (alignment check) exceptions on AMD64 but not Intel 64.
- The <code>0F 0D /r</code> opcode with the ModR/M byte's Mod field set to <code>11b</code> is a Reserved-NOP on Intel 64 but will cause #UD (invalid-opcode exception) on AMD64.
- The ordering guarantees provided by some memory ordering instructions such as <code>LFENCE</code> and <code>MFENCE</code> differ between Intel 64 and AMD64:
- <code>LFENCE</code> is dispatch-serializing (enabling it to be used as a speculation fence) on Intel 64 but is not architecturally guaranteed to be dispatch-serializing on AMD64.
- <code>MFENCE</code> is a fully serializing instruction (including instruction fetch serialization) on AMD64 but not Intel 64.
- The <code>MOV</code> to CR8 and <code>INVPCID</code> instructions are serializing on AMD64 but not Intel 64.
- The <code>LMSW</code> instruction is serializing on Intel 64 but not AMD64.
- <code>WRMSR</code> to the x2APIC ICR (Interrupt Command Register; MSR <code>830h</code>) is commonly used to produce an IPI (Inter-processor interrupt) — on Intel 64 but not AMD64 CPUs, such an IPI can be reordered before an older memory store.
- On recent AMD64 processors (Zen 4 and later), <code>WRMSR</code> to the <code>FS_BASE</code>, <code>GS_BASE</code> and <code>KernelGSBase</code> MSRs is non-serializing. On Intel 64 processors as well as older AMD64 processors, <code>WRMSR</code> to these MSRs is serializing.
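The difference in near-branch behaviour with the 66H prefix, listed above, can be modelled in a few lines of Python. This is a simplified sketch of the rule as stated in the list, not an emulator: it takes the address of the following instruction as an input (the two interpretations imply different instruction lengths), and the function names and the <code>vendor</code> parameter are hypothetical.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def near_branch_target(next_rip, displacement, vendor):
    """Target of a 66H-prefixed near branch in 64-bit mode.

    next_rip is the address of the instruction following the branch.
    """
    if vendor == "intel":
        # Prefix ignored: 32-bit sign-extended offset, RIP not truncated.
        return (next_rip + sign_extend(displacement, 32)) & MASK64
    if vendor == "amd":
        # 16-bit offset; the top 48 bits of the instruction pointer are cleared.
        return (next_rip + sign_extend(displacement, 16)) & 0xFFFF
    raise ValueError("vendor must be 'intel' or 'amd'")
```

With <code>next_rip = 0x7FFF00001000</code> and a displacement of <code>0x10</code>, the Intel 64 rule gives <code>0x7FFF00001010</code>, whereas the AMD64 rule gives <code>0x1010</code>.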
Older implementations
- The AMD64 processors prior to Revision F (distinguished by the switch from DDR to DDR2 memory and sockets AM2, F and S1) of 2006 lacked the <code>CMPXCHG16B</code> instruction, which is an extension of the <code>CMPXCHG8B</code> instruction present on most post-80486 processors. Similar to <code>CMPXCHG8B</code>, <code>CMPXCHG16B</code> allows for atomic operations on octa-words (128-bit values). This is useful for parallel algorithms that use compare and swap on data larger than the size of a pointer, common in lock-free and wait-free algorithms. Without <code>CMPXCHG16B</code> one must use workarounds, such as a critical section or alternative lock-free approaches. Its absence also prevents 64-bit Windows prior to Windows 8.1 from having a user-mode address space larger than 8 TiB. All 64-bit versions of Windows since Windows 8.1 require the instruction.
- Early AMD64 and Intel 64 CPUs lacked <code>LAHF</code> and <code>SAHF</code> instructions in 64-bit mode. AMD introduced these instructions (also in 64-bit mode) with their 90 nm (revision D) processors, starting with Athlon 64 in October 2004. Intel introduced the instructions in October 2005 with the 0F47h and later revisions of NetBurst. All 64-bit versions of Windows since Windows 8.1 require this feature.
- Implementations have differed in the width of the physical addresses they support; this is not a difference of the user-visible ISAs. In 2007 AMD 10h-based Opteron was the first to provide a 48-bit (256 TiB) physical address space. Intel 64's physical addressing was extended to 44 bits (16 TiB) in Nehalem-EX in 2010 and to 46 bits (64 TiB) in Sandy Bridge E in 2011. With the Ice Lake 3rd gen Xeon Scalable processors, Intel increased the virtual addressing to 57 bits (128 PiB) and physical to 52 bits (4 PiB) in 2021, necessitating 5-level paging. The following year AMD64 added the same in 4th generation EPYC (Genoa). Non-server CPUs retained smaller address spaces for longer.
- On all AMD64 processors, the <code>BSF</code> and <code>BSR</code> instructions will, when given a source value of 0, leave their destination register unmodified. This is mostly the case on Intel 64 processors as well, except that on some older Intel 64 CPUs, executing these instructions with an operand size of 32 bits will clear the top 32 bits of their destination register even with a source value of 0 (with the low 32 bits kept unchanged.)
- AMD64 processors since Opteron Rev. E and Athlon 64 Rev. D reintroduced limited support for segmentation, via the Long Mode Segment Limit Enable (LMSLE) bit, to ease virtualization of 64-bit guests. LMSLE support was removed in the Zen 3 processor.
- On all Intel 64 processors, <code>CLFLUSH</code> is ordered with respect to <code>SFENCE</code> — this is also the case on newer AMD64 processors (Zen 1 and later). On older AMD64 processors, imposing ordering on the <code>CLFLUSH</code> instruction instead required <code>MFENCE</code>.
- On older AMD64 processors, the <code>XSAVE</code>/<code>XRSTOR</code> instructions would abstain from saving/restoring x87 error pointers (FCS/FIP, FDS/FDP, FOP) unless an unmasked pending x87 exception was present. On Intel 64 processors and newer AMD64 processors (Zen 1 and later), these error pointers are always saved and restored. (This Intel/AMD difference continues to exist for the <code>FXSAVE</code>/<code>FXRSTOR</code> instructions, though.)
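The <code>BSF</code> behaviour described in the list above can be modelled as follows. This Python sketch treats the destination as a 64-bit register value and covers only 32-bit and 64-bit operand sizes; the <code>old_intel_quirk</code> flag, which stands for the older Intel 64 CPUs mentioned above, and the function name are hypothetical.

```python
def bsf(dest, src, operand_bits=64, old_intel_quirk=False):
    """Model of BSF: return (new 64-bit destination value, zero flag)."""
    if operand_bits not in (32, 64):
        raise ValueError("only 32-bit and 64-bit operands are modelled")
    src &= (1 << operand_bits) - 1
    if src == 0:
        if old_intel_quirk and operand_bits == 32:
            # Upper 32 bits cleared, low 32 bits kept unchanged.
            return dest & 0xFFFFFFFF, True
        return dest, True              # destination left unmodified
    # Index of the lowest set bit; a 32-bit result is zero-extended.
    index = (src & -src).bit_length() - 1
    return index, False
```

Software that relies on the destination being preserved for a zero source would see different results only in the 32-bit case on the affected CPUs.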
Adoption
thumb|350x350px|An area chart showing the representation of different families of microprocessors in the TOP500 supercomputer ranking list, from 1993 to 2020
In supercomputers tracked by TOP500, the appearance of 64-bit extensions for the x86 architecture enabled 64-bit x86 processors by AMD and Intel to replace most RISC processor architectures previously used in such systems (including PA-RISC, SPARC, Alpha and others), as well as 32-bit x86, even though Intel itself initially tried unsuccessfully to replace x86 with a new incompatible 64-bit architecture in the Itanium processor.
An HPE EPYC-based supercomputer called Frontier has held the number one position. The first ARM-based supercomputer appeared on the list in 2018 and, in recent years, non-CPU architecture co-processors (GPGPU) have also played a big role in performance. Intel's Xeon Phi "Knights Corner" coprocessors, which implement a subset of x86-64 with some vector extensions, are also used, along with x86-64 processors, in the Tianhe-2 supercomputer.
Operating system compatibility and characteristics
The following operating systems and releases support the x86-64 architecture in long mode.
BSD
DragonFly BSD
Preliminary infrastructure work was started in February 2004 for an x86-64 port. This development later stalled. Development started again during July 2007 and continued during Google Summer of Code 2008 and SoC 2009. The first official release to contain x86-64 support was version 2.4 from 2009.
FreeBSD
FreeBSD first added x86-64 support under the name "amd64" as an experimental architecture in 5.1-RELEASE in June 2003. It was included as a standard distribution architecture as of 5.2-RELEASE in January 2004. Since then, FreeBSD has designated it as a Tier 1 platform. The 6.0-RELEASE version cleaned up some quirks with running x86 executables under amd64, and most drivers work just as they do on the x86 architecture. Work is currently being done to integrate more fully the x86 application binary interface (ABI), in the same manner as the Linux 32-bit ABI compatibility currently works.
NetBSD
x86-64 architecture support was first committed to the NetBSD source tree on June 19, 2001. As of NetBSD 2.0, released on December 9, 2004, NetBSD/amd64 is a fully integrated and supported port.
32-bit code is still supported in 64-bit mode, with a netbsd-32 kernel compatibility layer for 32-bit syscalls. The NX bit is used to provide non-executable stack and heap with per-page granularity (segment granularity being used on 32-bit x86).
OpenBSD
OpenBSD has supported AMD64 since OpenBSD 3.5, released on May 1, 2004. Complete in-tree implementation of AMD64 support was achieved prior to the hardware's initial release because AMD had loaned several machines for the project's hackathon that year. OpenBSD developers have taken to the platform because of its support for the NX bit, which allowed for an easy implementation of the W^X feature.
The code for the AMD64 port of OpenBSD also runs on Intel 64 processors, which contain a clone of the AMD64 extensions. Because Intel left out the page table NX bit in early Intel 64 processors, there is no W^X capability on those Intel CPUs; later Intel 64 processors added the NX bit under the name "XD bit". Symmetric multiprocessing (SMP) works on OpenBSD's AMD64 port, starting with release 3.6 on November 1, 2004.
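The W^X policy mentioned above amounts to a simple invariant: no page may be writable and executable at the same time. The following is a minimal sketch of that check in Python. The flag names and values are hypothetical and chosen only for illustration; they are not OpenBSD's kernel definitions.

```python
# Hypothetical page-protection flags, for illustration only.
PROT_READ = 0x1
PROT_WRITE = 0x2
PROT_EXEC = 0x4


def violates_wx(prot: int) -> bool:
    """Return True if a mapping is both writable and executable."""
    return bool(prot & PROT_WRITE) and bool(prot & PROT_EXEC)


if __name__ == "__main__":
    mappings = [
        ("code", PROT_READ | PROT_EXEC),
        ("stack", PROT_READ | PROT_WRITE),
        ("rwx", PROT_READ | PROT_WRITE | PROT_EXEC),
    ]
    for name, prot in mappings:
        print(name, "violates W^X" if violates_wx(prot) else "ok")
```

Enforcing the execute permission separately for each page is what the NX bit makes possible in hardware.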
DOS
It is possible to enter long mode under DOS without a DOS extender, but the user must return to real mode in order to call BIOS or DOS interrupts.
It may also be possible to enter long mode with a DOS extender similar to DOS/4GW, though this is more complex because long mode lacks virtual 8086 mode. DOS itself is not aware of long mode, and no benefits should be expected unless DOS runs under emulation with an adequate virtualization driver backend, for example for the mass storage interface.
Linux
Linux was the first operating system kernel to run the x86-64 architecture in long mode, starting with the 2.4 version in 2001 (preceding the hardware's availability). Linux also provides backward compatibility for running 32-bit executables. This permits programs to be recompiled for long mode while retaining the use of 32-bit programs. Current Linux distributions ship with x86-64-native kernels and userlands. Some, such as Arch Linux, SUSE, Mandriva, and Debian, allow users to install a set of 32-bit components and libraries when installing from a 64-bit distribution medium, thus allowing most existing 32-bit applications to run alongside the 64-bit OS.
x32 ABI (Application Binary Interface), introduced in Linux 3.4, allows programs compiled for the x32 ABI to run in the 64-bit mode of x86-64 while only using 32-bit pointers and data fields.
Though this limits the program to a virtual address space of 4 GiB, it also decreases the memory footprint of the program and in some cases can allow it to run faster.
With 5-level paging enabled, 64-bit Linux supports up to 128 PiB of virtual address space and 4 PiB of physical memory.
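The sizes quoted above follow directly from address widths: an n-bit address can name 2^n bytes. The sketch below works through that arithmetic. The 57-bit and 52-bit widths are not stated in the text; they are assumed here because they are the widths that correspond to the quoted 128 PiB and 4 PiB figures.

```python
def address_space_bytes(address_bits: int) -> int:
    """Number of bytes addressable with the given address width."""
    return 1 << address_bits


def format_binary(size: int) -> str:
    """Format a byte count using binary (IEC) units, exact multiples only."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]
    index = 0
    while size >= 1024 and size % 1024 == 0 and index < len(units) - 1:
        size //= 1024
        index += 1
    return f"{size} {units[index]}"


if __name__ == "__main__":
    # 32-bit pointers, as used by the x32 ABI.
    print(format_binary(address_space_bytes(32)))  # 4 GiB
    # Assumed widths matching the 5-level paging figures.
    print(format_binary(address_space_bytes(57)))  # 128 PiB
    print(format_binary(address_space_bytes(52)))  # 4 PiB
```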
macOS
Mac OS X 10.4.7 and higher versions of Mac OS X 10.4 run 64-bit command-line tools using the POSIX and math libraries on 64-bit Intel-based machines, just as all versions of Mac OS X 10.4 and 10.5 run them on 64-bit PowerPC machines. No other libraries or frameworks work with 64-bit applications in Mac OS X 10.4.
The kernel, and all kernel extensions, are 32-bit only.
Mac OS X 10.5 supports 64-bit GUI applications using Cocoa, Quartz, OpenGL, and X11 on 64-bit Intel-based machines, as well as on 64-bit PowerPC machines.
All non-GUI libraries and frameworks also support 64-bit applications on those platforms. The kernel, and all kernel extensions, are 32-bit only.
Mac OS X 10.6 is the first version of macOS that supports a 64-bit kernel. However, not all 64-bit computers can run the 64-bit kernel, and not all 64-bit computers that can run the 64-bit kernel will do so by default.
The 64-bit kernel, like the 32-bit kernel, supports 32-bit applications; both kernels also support 64-bit applications. 32-bit applications have a virtual address space limit of 4 GiB under either kernel. The 64-bit kernel does not support 32-bit kernel extensions, and the 32-bit kernel does not support 64-bit kernel extensions.
OS X 10.8 includes only the 64-bit kernel, but continues to support 32-bit applications; it does not support 32-bit kernel extensions, however.
macOS 10.15 includes only the 64-bit kernel and no longer supports 32-bit applications. This removal of support has presented a problem for Wine (and the commercial version CrossOver), which still needs to run 32-bit Windows applications. The solution, termed wine32on64, was to add thunks that bring the CPU in and out of 32-bit compatibility mode in the nominally 64-bit application.
macOS uses the universal binary format to package 32- and 64-bit versions of application and library code into a single file; the most appropriate version is automatically selected at load time. In Mac OS X 10.6, the universal binary format is also used for the kernel and for those kernel extensions that support both 32-bit and 64-bit kernels.
Solaris
Solaris 10 and later releases support the x86-64 architecture.
For Solaris 10, just as with the SPARC architecture, there is only one operating system image, which contains a 32-bit kernel and a 64-bit kernel; this is labeled as the "x64/x86" DVD-ROM image. The default behavior is to boot a 64-bit kernel, allowing both 64-bit and existing or new 32-bit executables to be run. A 32-bit kernel can also be manually selected, in which case only 32-bit executables will run. The <code>isainfo</code> command can be used to determine if a system is running a 64-bit kernel.
For Solaris 11, only the 64-bit kernel is provided. However, the 64-bit kernel supports both 32- and 64-bit executables, libraries, and system calls.
Windows
x64 editions of Microsoft Windows client and server, Windows XP Professional x64 Edition and Windows Server 2003 x64 Edition, were released in March 2005. Internally they are the same build (5.2.3790.1830 SP1), as they share the same source base and operating system binaries, so even system updates are released in unified packages, much in the same manner as the Windows 2000 Professional and Server editions for x86. Windows Vista, which also has many different editions, was released in January 2007. Windows 7 was released in July 2009. Windows Server 2008 R2 was sold in only x64 and Itanium editions; later versions of Windows Server only offer an x64 edition.
Versions of Windows for x64 prior to Windows 8.1 and Windows Server 2012 R2 offer the following:
- 8 TiB of virtual address space per process, accessible from both user mode and kernel mode, referred to as the user mode address space. An x86-64 program can use all of this, subject to backing store limits on the system, and provided it is linked with the "large address aware" option, which is present by default. This is a 4096-fold increase over the default 2 GiB user-mode virtual address space offered by 32-bit Windows.
- 8 TiB of kernel mode virtual address space for the operating system.
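The "4096-fold" figure can be checked with a few lines of arithmetic. This is only a worked check of the numbers quoted in the list; the helper name `fold_increase` is made up for the example.

```python
GIB = 1 << 30  # bytes in one gibibyte
TIB = 1 << 40  # bytes in one tebibyte


def fold_increase(new_size: int, old_size: int) -> int:
    """How many times larger new_size is than old_size (exact multiples only)."""
    if new_size % old_size != 0:
        raise ValueError("new_size is not an exact multiple of old_size")
    return new_size // old_size


if __name__ == "__main__":
    # 8 TiB user-mode space on x64 versus the default 2 GiB on 32-bit Windows.
    print(fold_increase(8 * TIB, 2 * GIB))  # 4096
```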
Under Windows 8.1 and Windows Server 2012 R2, both user mode and kernel mode virtual address spaces have been extended to 128 TiB.
- A 32-bit application linked with the "large address aware" option can use up to 4 GiB of virtual address space on x64 Windows. Unlike the use of the <code>/3GB</code> boot option on x86, this does not reduce the kernel mode virtual address space available to the operating system. 32-bit applications can, therefore, benefit from running on x64 Windows even if they are not recompiled for x86-64.
- Both 32- and 64-bit applications, if not linked with "large address aware", are limited to 2 GiB of virtual address space.
- Ability to use up to 128 GiB (Windows XP/Vista), 192 GiB (Windows 7), 512 GiB (Windows 8), 1 TiB (Windows Server 2003), 2 TiB (Windows Server 2008/Windows 10), 4 TiB (Windows Server 2012), or 24 TiB (Windows Server 2016/2019) of physical random access memory (RAM).
- LLP64 data model: in C/C++, "int" and "long" types are 32 bits wide, "long long" is 64 bits, while pointers and types derived from pointers are 64 bits wide.
- Kernel mode device drivers must be 64-bit versions; there is no way to run 32-bit kernel mode executables within the 64-bit operating system. User mode device drivers can be either 32-bit or 64-bit.
- 16-bit Windows (Win16) and DOS applications will not run on x86-64 versions of Windows due to the removal of the virtual DOS machine subsystem (NTVDM) which relied upon the ability to use virtual 8086 mode. Virtual 8086 mode cannot be entered while running in long mode.
- Full implementation of the NX (No Execute) page protection feature. This is also implemented on recent 32-bit versions of Windows when they are started in PAE mode.
- Instead of the FS segment descriptor used on x86 versions of the Windows NT family, the GS segment descriptor is used to point to two operating system defined structures: the Thread Information Block (NT_TIB) in user mode and the Processor Control Region (KPCR) in kernel mode. Thus, for example, in user mode <code>GS:0</code> is the address of the first member of the Thread Information Block. Maintaining this convention made the x86-64 port easier, but required AMD to retain the function of the FS and GS segments in long mode, even though segmented addressing per se is not really used by any modern operating system.
- Some components like Jet Database Engine and Data Access Objects will not be ported to 64-bit architectures such as x86-64 and IA-64.
- Microsoft Visual Studio can compile native applications to target either the x86-64 architecture, which can run only on 64-bit Microsoft Windows, or the IA-32 architecture, which can run as a 32-bit application on 32-bit Microsoft Windows or on 64-bit Microsoft Windows in WoW64 emulation mode. Managed applications can be compiled in IA-32, x86-64, or AnyCPU mode. Software created in the first two modes behaves like its IA-32 or x86-64 native code counterpart, respectively. In AnyCPU mode, however, applications run as 32-bit applications on 32-bit versions of Microsoft Windows and as 64-bit applications on 64-bit editions of Microsoft Windows.
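The LLP64 data model described in the list can be told apart from other models by the widths of `int`, `long`, and pointers. The sketch below classifies a set of widths and then inspects the C types of the running Python build through `ctypes`. The names LP64 and ILP32 do not appear in the list above; they are included here only for contrast, as the usual labels for 64-bit `long` with 64-bit pointers and for all-32-bit models.

```python
import ctypes


def classify_data_model(int_bits: int, long_bits: int, pointer_bits: int) -> str:
    """Name the C data model implied by the given type widths."""
    widths = (int_bits, long_bits, pointer_bits)
    if widths == (32, 32, 64):
        return "LLP64"  # the model described for x64 Windows
    if widths == (32, 64, 64):
        return "LP64"
    if widths == (32, 32, 32):
        return "ILP32"
    return "other"


def current_data_model() -> str:
    """Classify the data model of the C runtime this Python was built against."""
    return classify_data_model(
        ctypes.sizeof(ctypes.c_int) * 8,
        ctypes.sizeof(ctypes.c_long) * 8,
        ctypes.sizeof(ctypes.c_void_p) * 8,
    )


if __name__ == "__main__":
    print("long long bits:", ctypes.sizeof(ctypes.c_longlong) * 8)
    print("data model:", current_data_model())
```

The result depends on the platform and on whether the interpreter itself is a 32-bit or 64-bit build, so the same script can report different models on the same machine.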
Video game consoles
The PlayStation 4 and Xbox One use custom AMD accelerated processing units (APUs) based on the Jaguar microarchitecture. Likewise, the PlayStation 5, Steam Deck and Xbox Series X/S use custom AMD APUs based on the Zen 2 microarchitecture. Firmware and games are written in x86-64 code; no legacy 32-bit x86 code is involved.
Industry naming conventions
Since AMD64 and Intel 64 are substantially similar, many software and hardware products use one vendor-neutral term to indicate their compatibility with both implementations. AMD's original designation for this processor architecture, "x86-64", is still used for this purpose. Other names in use include:
- amd64
  - Sun's Solaris isalist command identifies both AMD64- and Intel 64-based systems as "amd64".
  - Java Development Kit (JDK): the name "amd64" is used in directory names containing x86-64 files.
- x86_64
  - The Linux kernel and the GNU Compiler Collection refer to the 64-bit architecture as "x86_64".
  - Some Linux distributions, such as Fedora, openSUSE, Arch, and Gentoo, refer to this 64-bit architecture as "x86_64".
  - Apple macOS refers to the 64-bit architecture as "x86-64" or "x86_64", as seen in the Terminal command <code>arch</code>.
Patents on techniques used in AMD64 had to be licensed from AMD in order to implement it. Intel entered into a cross-licensing agreement with AMD, licensing to AMD Intel's patents on existing x86 techniques and licensing from AMD its patents on techniques used in x86-64. In 2009, AMD and Intel settled several lawsuits and cross-licensing disagreements, extending their cross-licensing agreements.
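Because the same architecture is reported under several names, as the naming conventions above show, software that inspects the architecture string usually has to treat those names as equivalent. The following is a minimal sketch of such a check. The alias set is an assumption drawn from the names used in this article, and `is_x86_64` is a made-up helper, not part of any standard library.

```python
import platform

# Names this article uses for the architecture, compared case-insensitively.
X86_64_ALIASES = {"x86-64", "x86_64", "amd64", "x64", "intel 64"}


def is_x86_64(name: str) -> bool:
    """Return True if the given architecture name denotes x86-64."""
    return name.strip().lower() in X86_64_ALIASES


if __name__ == "__main__":
    machine = platform.machine()  # the string varies by operating system
    print(machine, is_x86_64(machine))
```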
See also
- AGESA (AMD Generic Encapsulated Software Architecture)
- Transient execution CPU vulnerability
External links
- AMD Developer Guides, Manuals & ISA Documents
- x86-64: Extending the x86 architecture to 64-bits – technical talk by the architect of AMD64 (video archive), and second talk by the same speaker (video archive)
- AMD's "Enhanced Virus Protection"
- Intel tweaks EM64T for full AMD64 compatibility
- Analyst: Intel Reverse-Engineered AMD64
- Early report of differences between Intel IA32e and AMD64
- Porting to 64-bit GNU/Linux Systems, by Andreas Jaeger from GCC Summit 2003. An excellent paper explaining almost all practical aspects for a transition from 32-bit to 64-bit.
- Intel 64 Architecture
- Intel Software Network: "64 bits"
- TurboIRC.COM tutorials, including examples of how to enter protected and long mode the raw way from DOS
- Seven Steps of Migrating a Program to a 64-bit System
- Memory Limits for Windows Releases
