docs: move x86 documentation into Documentation/arch/

Move the x86 documentation under Documentation/arch/ as a way of cleaning up the top-level directory and making the structure of our docs more closely match the structure of the source directories it describes. All in-kernel references to the old paths have been updated. Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: linux-arch@vger.kernel.org Cc: x86@kernel.org Cc: Borislav Petkov <bp@alien8.de> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/lkml/20230315211523.108836-1-corbet@lwn.net/ Signed-off-by: Jonathan Corbet <corbet@lwn.net>
author: Jonathan Corbet <corbet@lwn.net> 2023-03-15 00:06:44 +0100
committer: Jonathan Corbet <corbet@lwn.net> 2023-03-30 20:58:51 +0200
commit: ff61f0791ce969d2db6c9f3b71d74ceec0a2e958 (patch)
tree: fe32be44aaf65f9c436a8f37cd4a18f6ec47c3cb /Documentation/x86/x86_64
parent: Documentation: kernel-parameters: Remove meye entry (diff)
download: linux-ff61f0791ce969d2db6c9f3b71d74ceec0a2e958.tar.xz
linux-ff61f0791ce969d2db6c9f3b71d74ceec0a2e958.zip
9 files changed, 0 insertions, 952 deletions
diff --git a/Documentation/x86/x86_64/5level-paging.rst b/Documentation/x86/x86_64/5level-paging.rst
deleted file mode 100644
index b792bbdc0b01..000000000000
--- a/Documentation/x86/x86_64/5level-paging.rst
+++ /dev/null
@@ -1,67 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-==============
-5-level paging
-==============
-
-Overview
-========
-Original x86-64 was limited by 4-level paging to 256 TiB of virtual address
-space and 64 TiB of physical address space. We are already bumping into
-this limit: some vendors offer servers with 64 TiB of memory today.
-
-To overcome the limitation upcoming hardware will introduce support for
-5-level paging. It is a straight-forward extension of the current page
-table structure adding one more layer of translation.
-
-It bumps the limits to 128 PiB of virtual address space and 4 PiB of
-physical address space. This "ought to be enough for anybody" ©.
-
-QEMU 2.9 and later support 5-level paging.
-
-Virtual memory layout for 5-level paging is described in
-Documentation/x86/x86_64/mm.rst
-
-
-Enabling 5-level paging
-=======================
-CONFIG_X86_5LEVEL=y enables the feature.
-
-Kernel with CONFIG_X86_5LEVEL=y still able to boot on 4-level hardware.
-In this case additional page table level -- p4d -- will be folded at
-runtime.
-
-User-space and large virtual address space
-==========================================
-On x86, 5-level paging enables 56-bit userspace virtual address space.
-Not all user space is ready to handle wide addresses. It's known that
-at least some JIT compilers use higher bits in pointers to encode their
-information. It collides with valid pointers with 5-level paging and
-leads to crashes.
-
-To mitigate this, we are not going to allocate virtual address space
-above 47-bit by default.
-
-But userspace can ask for allocation from full address space by
-specifying hint address (with or without MAP_FIXED) above 47-bits.
-
-If hint address set above 47-bit, but MAP_FIXED is not specified, we try
-to look for unmapped area by specified address. If it's already
-occupied, we look for unmapped area in *full* address space, rather than
-from 47-bit window.
-
-A high hint address would only affect the allocation in question, but not
-any future mmap()s.
-
-Specifying high hint address on older kernel or on machine without 5-level
-paging support is safe. The hint will be ignored and kernel will fall back
-to allocation from 47-bit address space.
-
-This approach helps to easily make application's memory allocator aware
-about large address space without manually tracking allocated virtual
-address space.
-
-One important case we need to handle here is interaction with MPX.
-MPX (without MAWA extension) cannot handle addresses above 47-bit, so we
-need to make sure that MPX cannot be enabled we already have VMA above
-the boundary and forbid creating such VMAs once MPX is enabled.
diff --git a/Documentation/x86/x86_64/boot-options.rst b/Documentation/x86/x86_64/boot-options.rst
deleted file mode 100644
index cbd14124a667..000000000000
--- a/Documentation/x86/x86_64/boot-options.rst
+++ /dev/null
@@ -1,319 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-===========================
-AMD64 Specific Boot Options
-===========================
-
-There are many others (usually documented in driver documentation), but
-only the AMD64 specific ones are listed here.
-
-Machine check
-=============
-Please see Documentation/x86/x86_64/machinecheck.rst for sysfs runtime tunables.
-
-   mce=off
-		Disable machine check
-   mce=no_cmci
-		Disable CMCI(Corrected Machine Check Interrupt) that
-		Intel processor supports.  Usually this disablement is
-		not recommended, but it might be handy if your hardware
-		is misbehaving.
-		Note that you'll get more problems without CMCI than with
-		due to the shared banks, i.e. you might get duplicated
-		error logs.
-   mce=dont_log_ce
-		Don't make logs for corrected errors.  All events reported
-		as corrected are silently cleared by OS.
-		This option will be useful if you have no interest in any
-		of corrected errors.
-   mce=ignore_ce
-		Disable features for corrected errors, e.g. polling timer
-		and CMCI.  All events reported as corrected are not cleared
-		by OS and remained in its error banks.
-		Usually this disablement is not recommended, however if
-		there is an agent checking/clearing corrected errors
-		(e.g. BIOS or hardware monitoring applications), conflicting
-		with OS's error handling, and you cannot deactivate the agent,
-		then this option will be a help.
-   mce=no_lmce
-		Do not opt-in to Local MCE delivery. Use legacy method
-		to broadcast MCEs.
-   mce=bootlog
-		Enable logging of machine checks left over from booting.
-		Disabled by default on AMD Fam10h and older because some BIOS
-		leave bogus ones.
-		If your BIOS doesn't do that it's a good idea to enable though
-		to make sure you log even machine check events that result
-		in a reboot. On Intel systems it is enabled by default.
-   mce=nobootlog
-		Disable boot machine check logging.
-   mce=monarchtimeout (number)
-		monarchtimeout:
-		Sets the time in us to wait for other CPUs on machine checks. 0
-		to disable.
-   mce=bios_cmci_threshold
-		Don't overwrite the bios-set CMCI threshold. This boot option
-		prevents Linux from overwriting the CMCI threshold set by the
-		bios. Without this option, Linux always sets the CMCI
-		threshold to 1. Enabling this may make memory predictive failure
-		analysis less effective if the bios sets thresholds for memory
-		errors since we will not see details for all errors.
-   mce=recovery
-		Force-enable recoverable machine check code paths
-
-   nomce (for compatibility with i386)
-		same as mce=off
-
-   Everything else is in sysfs now.
-
-APICs
-=====
-
-   apic
-	Use IO-APIC. Default
-
-   noapic
-	Don't use the IO-APIC.
-
-   disableapic
-	Don't use the local APIC
-
-   nolapic
-     Don't use the local APIC (alias for i386 compatibility)
-
-   pirq=...
-	See Documentation/x86/i386/IO-APIC.rst
-
-   noapictimer
-	Don't set up the APIC timer
-
-   no_timer_check
-	Don't check the IO-APIC timer. This can work around
-	problems with incorrect timer initialization on some boards.
-
-   apicpmtimer
-	Do APIC timer calibration using the pmtimer. Implies
-	apicmaintimer. Useful when your PIT timer is totally broken.
-
-Timing
-======
-
-  notsc
-    Deprecated, use tsc=unstable instead.
-
-  nohpet
-    Don't use the HPET timer.
-
-Idle loop
-=========
-
-  idle=poll
-    Don't do power saving in the idle loop using HLT, but poll for rescheduling
-    event. This will make the CPUs eat a lot more power, but may be useful
-    to get slightly better performance in multiprocessor benchmarks. It also
-    makes some profiling using performance counters more accurate.
-    Please note that on systems with MONITOR/MWAIT support (like Intel EM64T
-    CPUs) this option has no performance advantage over the normal idle loop.
-    It may also interact badly with hyperthreading.
-
-Rebooting
-=========
-
-   reboot=b[ios] | t[riple] | k[bd] | a[cpi] | e[fi] | p[ci] [, [w]arm | [c]old]
-      bios
-        Use the CPU reboot vector for warm reset
-      warm
-        Don't set the cold reboot flag
-      cold
-        Set the cold reboot flag
-      triple
-        Force a triple fault (init)
-      kbd
-        Use the keyboard controller. cold reset (default)
-      acpi
-        Use the ACPI RESET_REG in the FADT. If ACPI is not configured or
-        the ACPI reset does not work, the reboot path attempts the reset
-        using the keyboard controller.
-      efi
-        Use efi reset_system runtime service. If EFI is not configured or
-        the EFI reset does not work, the reboot path attempts the reset using
-        the keyboard controller.
-      pci
-        Use a write to the PCI config space register 0xcf9 to trigger reboot.
-
-   Using warm reset will be much faster especially on big memory
-   systems because the BIOS will not go through the memory check.
-   Disadvantage is that not all hardware will be completely reinitialized
-   on reboot so there may be boot problems on some systems.
-
-   reboot=force
-     Don't stop other CPUs on reboot. This can make reboot more reliable
-     in some cases.
-
-   reboot=default
-     There are some built-in platform specific "quirks" - you may see:
-     "reboot: <name> series board detected. Selecting <type> for reboots."
-     In the case where you think the quirk is in error (e.g. you have
-     newer BIOS, or newer board) using this option will ignore the built-in
-     quirk table, and use the generic default reboot actions.
-
-NUMA
-====
-
-  numa=off
-    Only set up a single NUMA node spanning all memory.
-
-  numa=noacpi
-    Don't parse the SRAT table for NUMA setup
-
-  numa=nohmat
-    Don't parse the HMAT table for NUMA setup, or soft-reserved memory
-    partitioning.
-
-  numa=fake=<size>[MG]
-    If given as a memory unit, fills all system RAM with nodes of
-    size interleaved over physical nodes.
-
-  numa=fake=<N>
-    If given as an integer, fills all system RAM with N fake nodes
-    interleaved over physical nodes.
-
-  numa=fake=<N>U
-    If given as an integer followed by 'U', it will divide each
-    physical node into N emulated nodes.
-
-ACPI
-====
-
-  acpi=off
-    Don't enable ACPI
-  acpi=ht
-    Use ACPI boot table parsing, but don't enable ACPI interpreter
-  acpi=force
-    Force ACPI on (currently not needed)
-  acpi=strict
-    Disable out of spec ACPI workarounds.
-  acpi_sci={edge,level,high,low}
-    Set up ACPI SCI interrupt.
-  acpi=noirq
-    Don't route interrupts
-  acpi=nocmcff
-    Disable firmware first mode for corrected errors. This
-    disables parsing the HEST CMC error source to check if
-    firmware has set the FF flag. This may result in
-    duplicate corrected error reports.
-
-PCI
-===
-
-  pci=off
-    Don't use PCI
-  pci=conf1
-    Use conf1 access.
-  pci=conf2
-    Use conf2 access.
-  pci=rom
-    Assign ROMs.
-  pci=assign-busses
-    Assign busses
-  pci=irqmask=MASK
-    Set PCI interrupt mask to MASK
-  pci=lastbus=NUMBER
-    Scan up to NUMBER busses, no matter what the mptable says.
-  pci=noacpi
-    Don't use ACPI to set up PCI interrupt routing.
-
-IOMMU (input/output memory management unit)
-===========================================
-Multiple x86-64 PCI-DMA mapping implementations exist, for example:
-
-   1. <kernel/dma/direct.c>: use no hardware/software IOMMU at all
-      (e.g. because you have < 3 GB memory).
-      Kernel boot message: "PCI-DMA: Disabling IOMMU"
-
-   2. <arch/x86/kernel/amd_gart_64.c>: AMD GART based hardware IOMMU.
-      Kernel boot message: "PCI-DMA: using GART IOMMU"
-
-   3. <arch/x86_64/kernel/pci-swiotlb.c> : Software IOMMU implementation. Used
-      e.g. if there is no hardware IOMMU in the system and it is need because
-      you have >3GB memory or told the kernel to us it (iommu=soft))
-      Kernel boot message: "PCI-DMA: Using software bounce buffering
-      for IO (SWIOTLB)"
-
-::
-
-  iommu=[<size>][,noagp][,off][,force][,noforce]
-  [,memaper[=<order>]][,merge][,fullflush][,nomerge]
-  [,noaperture]
-
-General iommu options:
-
-    off
-      Don't initialize and use any kind of IOMMU.
-    noforce
-      Don't force hardware IOMMU usage when it is not needed. (default).
-    force
-      Force the use of the hardware IOMMU even when it is
-      not actually needed (e.g. because < 3 GB memory).
-    soft
-      Use software bounce buffering (SWIOTLB) (default for
-      Intel machines). This can be used to prevent the usage
-      of an available hardware IOMMU.
-
-iommu options only relevant to the AMD GART hardware IOMMU:
-
-    <size>
-      Set the size of the remapping area in bytes.
-    allowed
-      Overwrite iommu off workarounds for specific chipsets.
-    fullflush
-      Flush IOMMU on each allocation (default).
-    nofullflush
-      Don't use IOMMU fullflush.
-    memaper[=<order>]
-      Allocate an own aperture over RAM with size 32MB<<order.
-      (default: order=1, i.e. 64MB)
-    merge
-      Do scatter-gather (SG) merging. Implies "force" (experimental).
-    nomerge
-      Don't do scatter-gather (SG) merging.
-    noaperture
-      Ask the IOMMU not to touch the aperture for AGP.
-    noagp
-      Don't initialize the AGP driver and use full aperture.
-    panic
-      Always panic when IOMMU overflows.
-
-iommu options only relevant to the software bounce buffering (SWIOTLB) IOMMU
-implementation:
-
-    swiotlb=<slots>[,force,noforce]
-      <slots>
-        Prereserve that many 2K slots for the software IO bounce buffering.
-      force
-        Force all IO through the software TLB.
-      noforce
-        Do not initialize the software TLB.
-
-
-Miscellaneous
-=============
-
-  nogbpages
-    Do not use GB pages for kernel direct mappings.
-  gbpages
-    Use GB pages for kernel direct mappings.
-
-
-AMD SEV (Secure Encrypted Virtualization)
-=========================================
-Options relating to AMD SEV, specified via the following format:
-
-::
-
-   sev=option1[,option2]
-
-The available options are:
-
-   debug
-     Enable debug messages.
diff --git a/Documentation/x86/x86_64/cpu-hotplug-spec.rst b/Documentation/x86/x86_64/cpu-hotplug-spec.rst
deleted file mode 100644
index 8d1c91f0c880..000000000000
--- a/Documentation/x86/x86_64/cpu-hotplug-spec.rst
+++ /dev/null
@@ -1,24 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-===================================================
-Firmware support for CPU hotplug under Linux/x86-64
-===================================================
-
-Linux/x86-64 supports CPU hotplug now. For various reasons Linux wants to
-know in advance of boot time the maximum number of CPUs that could be plugged
-into the system. ACPI 3.0 currently has no official way to supply
-this information from the firmware to the operating system.
-
-In ACPI each CPU needs an LAPIC object in the MADT table (5.2.11.5 in the
-ACPI 3.0 specification).  ACPI already has the concept of disabled LAPIC
-objects by setting the Enabled bit in the LAPIC object to zero.
-
-For CPU hotplug Linux/x86-64 expects now that any possible future hotpluggable
-CPU is already available in the MADT. If the CPU is not available yet
-it should have its LAPIC Enabled bit set to 0. Linux will use the number
-of disabled LAPICs to compute the maximum number of future CPUs.
-
-In the worst case the user can overwrite this choice using a command line
-option (additional_cpus=...), but it is recommended to supply the correct
-number (or a reasonable approximation of it, with erring towards more not less)
-in the MADT to avoid manual configuration.
diff --git a/Documentation/x86/x86_64/fake-numa-for-cpusets.rst b/Documentation/x86/x86_64/fake-numa-for-cpusets.rst
deleted file mode 100644
index ff9bcfd2cc14..000000000000
--- a/Documentation/x86/x86_64/fake-numa-for-cpusets.rst
+++ /dev/null
@@ -1,78 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-=====================
-Fake NUMA For CPUSets
-=====================
-
-:Author: David Rientjes <rientjes@cs.washington.edu>
-
-Using numa=fake and CPUSets for Resource Management
-
-This document describes how the numa=fake x86_64 command-line option can be used
-in conjunction with cpusets for coarse memory management.  Using this feature,
-you can create fake NUMA nodes that represent contiguous chunks of memory and
-assign them to cpusets and their attached tasks.  This is a way of limiting the
-amount of system memory that are available to a certain class of tasks.
-
-For more information on the features of cpusets, see
-Documentation/admin-guide/cgroup-v1/cpusets.rst.
-There are a number of different configurations you can use for your needs.  For
-more information on the numa=fake command line option and its various ways of
-configuring fake nodes, see Documentation/x86/x86_64/boot-options.rst.
-
-For the purposes of this introduction, we'll assume a very primitive NUMA
-emulation setup of "numa=fake=4*512,".  This will split our system memory into
-four equal chunks of 512M each that we can now use to assign to cpusets.  As
-you become more familiar with using this combination for resource control,
-you'll determine a better setup to minimize the number of nodes you have to deal
-with.
-
-A machine may be split as follows with "numa=fake=4*512," as reported by dmesg::
-
-	Faking node 0 at 0000000000000000-0000000020000000 (512MB)
-	Faking node 1 at 0000000020000000-0000000040000000 (512MB)
-	Faking node 2 at 0000000040000000-0000000060000000 (512MB)
-	Faking node 3 at 0000000060000000-0000000080000000 (512MB)
-	...
-	On node 0 totalpages: 130975
-	On node 1 totalpages: 131072
-	On node 2 totalpages: 131072
-	On node 3 totalpages: 131072
-
-Now following the instructions for mounting the cpusets filesystem from
-Documentation/admin-guide/cgroup-v1/cpusets.rst, you can assign fake nodes (i.e. contiguous memory
-address spaces) to individual cpusets::
-
-	[root@xroads /]# mkdir exampleset
-	[root@xroads /]# mount -t cpuset none exampleset
-	[root@xroads /]# mkdir exampleset/ddset
-	[root@xroads /]# cd exampleset/ddset
-	[root@xroads /exampleset/ddset]# echo 0-1 > cpus
-	[root@xroads /exampleset/ddset]# echo 0-1 > mems
-
-Now this cpuset, 'ddset', will only allowed access to fake nodes 0 and 1 for
-memory allocations (1G).
-
-You can now assign tasks to these cpusets to limit the memory resources
-available to them according to the fake nodes assigned as mems::
-
-	[root@xroads /exampleset/ddset]# echo $$ > tasks
-	[root@xroads /exampleset/ddset]# dd if=/dev/zero of=tmp bs=1024 count=1G
-	[1] 13425
-
-Notice the difference between the system memory usage as reported by
-/proc/meminfo between the restricted cpuset case above and the unrestricted
-case (i.e. running the same 'dd' command without assigning it to a fake NUMA
-cpuset):
-
-	========	============	==========
-	Name		Unrestricted	Restricted
-	========	============	==========
-	MemTotal	3091900 kB	3091900 kB
-	MemFree		42113 kB	1513236 kB
-	========	============	==========
-
-This allows for coarse memory management for the tasks you assign to particular
-cpusets.  Since cpusets can form a hierarchy, you can create some pretty
-interesting combinations of use-cases for various classes of tasks for your
-memory management needs.
diff --git a/Documentation/x86/x86_64/fsgs.rst b/Documentation/x86/x86_64/fsgs.rst
deleted file mode 100644
index 50960e09e1f6..000000000000
--- a/Documentation/x86/x86_64/fsgs.rst
+++ /dev/null
@@ -1,199 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-Using FS and GS segments in user space applications
-===================================================
-
-The x86 architecture supports segmentation. Instructions which access
-memory can use segment register based addressing mode. The following
-notation is used to address a byte within a segment:
-
-  Segment-register:Byte-address
-
-The segment base address is added to the Byte-address to compute the
-resulting virtual address which is accessed. This allows to access multiple
-instances of data with the identical Byte-address, i.e. the same code. The
-selection of a particular instance is purely based on the base-address in
-the segment register.
-
-In 32-bit mode the CPU provides 6 segments, which also support segment
-limits. The limits can be used to enforce address space protections.
-
-In 64-bit mode the CS/SS/DS/ES segments are ignored and the base address is
-always 0 to provide a full 64bit address space. The FS and GS segments are
-still functional in 64-bit mode.
-
-Common FS and GS usage
-------------------------------
-
-The FS segment is commonly used to address Thread Local Storage (TLS). FS
-is usually managed by runtime code or a threading library. Variables
-declared with the '__thread' storage class specifier are instantiated per
-thread and the compiler emits the FS: address prefix for accesses to these
-variables. Each thread has its own FS base address so common code can be
-used without complex address offset calculations to access the per thread
-instances. Applications should not use FS for other purposes when they use
-runtimes or threading libraries which manage the per thread FS.
-
-The GS segment has no common use and can be used freely by
-applications. GCC and Clang support GS based addressing via address space
-identifiers.
-
-Reading and writing the FS/GS base address
-------------------------------------------
-
-There exist two mechanisms to read and write the FS/GS base address:
-
- - the arch_prctl() system call
-
- - the FSGSBASE instruction family
-
-Accessing FS/GS base with arch_prctl()
---------------------------------------
-
- The arch_prctl(2) based mechanism is available on all 64-bit CPUs and all
- kernel versions.
-
- Reading the base:
-
-   arch_prctl(ARCH_GET_FS, &fsbase);
-   arch_prctl(ARCH_GET_GS, &gsbase);
-
- Writing the base:
-
-   arch_prctl(ARCH_SET_FS, fsbase);
-   arch_prctl(ARCH_SET_GS, gsbase);
-
- The ARCH_SET_GS prctl may be disabled depending on kernel configuration
- and security settings.
-
-Accessing FS/GS base with the FSGSBASE instructions
----------------------------------------------------
-
- With the Ivy Bridge CPU generation Intel introduced a new set of
- instructions to access the FS and GS base registers directly from user
- space. These instructions are also supported on AMD Family 17H CPUs. The
- following instructions are available:
-
-  =============== ===========================
-  RDFSBASE %reg   Read the FS base register
-  RDGSBASE %reg   Read the GS base register
-  WRFSBASE %reg   Write the FS base register
-  WRGSBASE %reg   Write the GS base register
-  =============== ===========================
-
- The instructions avoid the overhead of the arch_prctl() syscall and allow
- more flexible usage of the FS/GS addressing modes in user space
- applications. This does not prevent conflicts between threading libraries
- and runtimes which utilize FS and applications which want to use it for
- their own purpose.
-
-FSGSBASE instructions enablement
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- The instructions are enumerated in CPUID leaf 7, bit 0 of EBX. If
- available /proc/cpuinfo shows 'fsgsbase' in the flag entry of the CPUs.
-
- The availability of the instructions does not enable them
- automatically. The kernel has to enable them explicitly in CR4. The
- reason for this is that older kernels make assumptions about the values in
- the GS register and enforce them when GS base is set via
- arch_prctl(). Allowing user space to write arbitrary values to GS base
- would violate these assumptions and cause malfunction.
-
- On kernels which do not enable FSGSBASE the execution of the FSGSBASE
- instructions will fault with a #UD exception.
-
- The kernel provides reliable information about the enabled state in the
- ELF AUX vector. If the HWCAP2_FSGSBASE bit is set in the AUX vector, the
- kernel has FSGSBASE instructions enabled and applications can use them.
- The following code example shows how this detection works::
-
-   #include <sys/auxv.h>
-   #include <elf.h>
-
-   /* Will be eventually in asm/hwcap.h */
-   #ifndef HWCAP2_FSGSBASE
-   #define HWCAP2_FSGSBASE        (1 << 1)
-   #endif
-
-   ....
-
-   unsigned val = getauxval(AT_HWCAP2);
-
-   if (val & HWCAP2_FSGSBASE)
-        printf("FSGSBASE enabled\n");
-
-FSGSBASE instructions compiler support
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-GCC version 4.6.4 and newer provide instrinsics for the FSGSBASE
-instructions. Clang 5 supports them as well.
-
-  =================== ===========================
-  _readfsbase_u64()   Read the FS base register
-  _readfsbase_u64()   Read the GS base register
-  _writefsbase_u64()  Write the FS base register
-  _writegsbase_u64()  Write the GS base register
-  =================== ===========================
-
-To utilize these instrinsics <immintrin.h> must be included in the source
-code and the compiler option -mfsgsbase has to be added.
-
-Compiler support for FS/GS based addressing
--------------------------------------------
-
-GCC version 6 and newer provide support for FS/GS based addressing via
-Named Address Spaces. GCC implements the following address space
-identifiers for x86:
-
-  ========= ====================================
-  __seg_fs  Variable is addressed relative to FS
-  __seg_gs  Variable is addressed relative to GS
-  ========= ====================================
-
-The preprocessor symbols __SEG_FS and __SEG_GS are defined when these
-address spaces are supported. Code which implements fallback modes should
-check whether these symbols are defined. Usage example::
-
-  #ifdef __SEG_GS
-
-  long data0 = 0;
-  long data1 = 1;
-
-  long __seg_gs *ptr;
-
-  /* Check whether FSGSBASE is enabled by the kernel (HWCAP2_FSGSBASE) */
-  ....
-
-  /* Set GS base to point to data0 */
-  _writegsbase_u64(&data0);
-
-  /* Access offset 0 of GS */
-  ptr = 0;
-  printf("data0 = %ld\n", *ptr);
-
-  /* Set GS base to point to data1 */
-  _writegsbase_u64(&data1);
-  /* ptr still addresses offset 0! */
-  printf("data1 = %ld\n", *ptr);
-
-
-Clang does not provide the GCC address space identifiers, but it provides
-address spaces via an attribute based mechanism in Clang 2.6 and newer
-versions:
-
- ==================================== =====================================
-  __attribute__((address_space(256))  Variable is addressed relative to GS
-  __attribute__((address_space(257))  Variable is addressed relative to FS
- ==================================== =====================================
-
-FS/GS based addressing with inline assembly
--------------------------------------------
-
-In case the compiler does not support address spaces, inline assembly can
-be used for FS/GS based addressing mode::
-
-	mov %fs:offset, %reg
-	mov %gs:offset, %reg
-
-	mov %reg, %fs:offset
-	mov %reg, %gs:offset
diff --git a/Documentation/x86/x86_64/index.rst b/Documentation/x86/x86_64/index.rst
deleted file mode 100644
index a56070fc8e77..000000000000
--- a/Documentation/x86/x86_64/index.rst
+++ /dev/null
@@ -1,17 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-==============
-x86_64 Support
-==============
-
-.. toctree::
-   :maxdepth: 2
-
-   boot-options
-   uefi
-   mm
-   5level-paging
-   fake-numa-for-cpusets
-   cpu-hotplug-spec
-   machinecheck
-   fsgs
diff --git a/Documentation/x86/x86_64/machinecheck.rst b/Documentation/x86/x86_64/machinecheck.rst
deleted file mode 100644
index cea12ee97200..000000000000
--- a/Documentation/x86/x86_64/machinecheck.rst
+++ /dev/null
@@ -1,33 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-===============================================================
-Configurable sysfs parameters for the x86-64 machine check code
-===============================================================
-
-Machine checks report internal hardware error conditions detected
-by the CPU. Uncorrected errors typically cause a machine check
-(often with panic), corrected ones cause a machine check log entry.
-
-Machine checks are organized in banks (normally associated with
-a hardware subsystem) and subevents in a bank. The exact meaning
-of the banks and subevent is CPU specific.
-
-mcelog knows how to decode them.
-
-When you see the "Machine check errors logged" message in the system
-log then mcelog should run to collect and decode machine check entries
-from /dev/mcelog. Normally mcelog should be run regularly from a cronjob.
-
-Each CPU has a directory in /sys/devices/system/machinecheck/machinecheckN
-(N = CPU number).
-
-The directory contains some configurable entries. See
-Documentation/ABI/testing/sysfs-mce for more details.
-
-TBD document entries for AMD threshold interrupt configuration
-
-For more details about the x86 machine check architecture
-see the Intel and AMD architecture manuals from their developer websites.
-
-For more details about the architecture
-see http://one.firstfloor.org/~andi/mce.pdf
diff --git a/Documentation/x86/x86_64/mm.rst b/Documentation/x86/x86_64/mm.rst
deleted file mode 100644
index 35e5e18c83d0..000000000000
--- a/Documentation/x86/x86_64/mm.rst
+++ /dev/null
@@ -1,157 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-=================
-Memory Management
-=================
-
-Complete virtual memory map with 4-level page tables
-====================================================
-
-.. note::
-
- - Negative addresses such as "-23 TB" are absolute addresses in bytes, counted down
-   from the top of the 64-bit address space. It's easier to understand the layout
-   when seen both in absolute addresses and in distance-from-top notation.
-
-   For example 0xffffe90000000000 == -23 TB, it's 23 TB lower than the top of the
-   64-bit address space (ffffffffffffffff).
-
-   Note that as we get closer to the top of the address space, the notation changes
-   from TB to GB and then MB/KB.
-
- - "16M TB" might look weird at first sight, but it's an easier way to visualize size
-   notation than "16 EB", which few will recognize at first sight as 16 exabytes.
-   It also shows it nicely how incredibly large 64-bit address space is.
-
-::
-
-  ========================================================================================================================
-      Start addr    |   Offset   |     End addr     |  Size   | VM area description
-  ========================================================================================================================
-                    |            |                  |         |
-   0000000000000000 |    0       | 00007fffffffffff |  128 TB | user-space virtual memory, different per mm
-  __________________|____________|__________________|_________|___________________________________________________________
-                    |            |                  |         |
-   0000800000000000 | +128    TB | ffff7fffffffffff | ~16M TB | ... huge, almost 64 bits wide hole of non-canonical
-                    |            |                  |         |     virtual memory addresses up to the -128 TB
-                    |            |                  |         |     starting offset of kernel mappings.
-  __________________|____________|__________________|_________|___________________________________________________________
-                                                              |
-                                                              | Kernel-space virtual memory, shared between all processes:
-  ____________________________________________________________|___________________________________________________________
-                    |            |                  |         |
-   ffff800000000000 | -128    TB | ffff87ffffffffff |    8 TB | ... guard hole, also reserved for hypervisor
-   ffff880000000000 | -120    TB | ffff887fffffffff |  0.5 TB | LDT remap for PTI
-   ffff888000000000 | -119.5  TB | ffffc87fffffffff |   64 TB | direct mapping of all physical memory (page_offset_base)
-   ffffc88000000000 |  -55.5  TB | ffffc8ffffffffff |  0.5 TB | ... unused hole
-   ffffc90000000000 |  -55    TB | ffffe8ffffffffff |   32 TB | vmalloc/ioremap space (vmalloc_base)
-   ffffe90000000000 |  -23    TB | ffffe9ffffffffff |    1 TB | ... unused hole
-   ffffea0000000000 |  -22    TB | ffffeaffffffffff |    1 TB | virtual memory map (vmemmap_base)
-   ffffeb0000000000 |  -21    TB | ffffebffffffffff |    1 TB | ... unused hole
-   ffffec0000000000 |  -20    TB | fffffbffffffffff |   16 TB | KASAN shadow memory
-  __________________|____________|__________________|_________|____________________________________________________________
-                                                              |
-                                                              | Identical layout to the 56-bit one from here on:
-  ____________________________________________________________|____________________________________________________________
-                    |            |                  |         |
-   fffffc0000000000 |   -4    TB | fffffdffffffffff |    2 TB | ... unused hole
-                    |            |                  |         | vaddr_end for KASLR
-   fffffe0000000000 |   -2    TB | fffffe7fffffffff |  0.5 TB | cpu_entry_area mapping
-   fffffe8000000000 |   -1.5  TB | fffffeffffffffff |  0.5 TB | ... unused hole
-   ffffff0000000000 |   -1    TB | ffffff7fffffffff |  0.5 TB | %esp fixup stacks
-   ffffff8000000000 | -512    GB | ffffffeeffffffff |  444 GB | ... unused hole
-   ffffffef00000000 |  -68    GB | fffffffeffffffff |   64 GB | EFI region mapping space
-   ffffffff00000000 |   -4    GB | ffffffff7fffffff |    2 GB | ... unused hole
-   ffffffff80000000 |   -2    GB | ffffffff9fffffff |  512 MB | kernel text mapping, mapped to physical address 0
-   ffffffff80000000 |-2048    MB |                  |         |
-   ffffffffa0000000 |-1536    MB | fffffffffeffffff | 1520 MB | module mapping space
-   ffffffffff000000 |  -16    MB |                  |         |
-      FIXADDR_START | ~-11    MB | ffffffffff5fffff | ~0.5 MB | kernel-internal fixmap range, variable size and offset
-   ffffffffff600000 |  -10    MB | ffffffffff600fff |    4 kB | legacy vsyscall ABI
-   ffffffffffe00000 |   -2    MB | ffffffffffffffff |    2 MB | ... unused hole
-  __________________|____________|__________________|_________|___________________________________________________________
-
-
-Complete virtual memory map with 5-level page tables
-====================================================
-
-.. note::
-
- - With 56-bit addresses, user-space memory gets expanded by a factor of 512x,
-   from 0.125 PB to 64 PB. All kernel mappings shift down to the -64 PB starting
-   offset and many of the regions expand to support the much larger physical
-   memory supported.
-
-::
-
-  ========================================================================================================================
-      Start addr    |   Offset   |     End addr     |  Size   | VM area description
-  ========================================================================================================================
-                    |            |                  |         |
-   0000000000000000 |    0       | 00ffffffffffffff |   64 PB | user-space virtual memory, different per mm
-  __________________|____________|__________________|_________|___________________________________________________________
-                    |            |                  |         |
-   0100000000000000 |  +64    PB | feffffffffffffff | ~16K PB | ... huge, still almost 64 bits wide hole of non-canonical
-                    |            |                  |         |     virtual memory addresses up to the -64 PB
-                    |            |                  |         |     starting offset of kernel mappings.
-  __________________|____________|__________________|_________|___________________________________________________________
-                                                              |
-                                                              | Kernel-space virtual memory, shared between all processes:
-  ____________________________________________________________|___________________________________________________________
-                    |            |                  |         |
-   ff00000000000000 |  -64    PB | ff0fffffffffffff |    4 PB | ... guard hole, also reserved for hypervisor
-   ff10000000000000 |  -60    PB | ff10ffffffffffff | 0.25 PB | LDT remap for PTI
-   ff11000000000000 |  -59.75 PB | ff90ffffffffffff |   32 PB | direct mapping of all physical memory (page_offset_base)
-   ff91000000000000 |  -27.75 PB | ff9fffffffffffff | 3.75 PB | ... unused hole
-   ffa0000000000000 |  -24    PB | ffd1ffffffffffff | 12.5 PB | vmalloc/ioremap space (vmalloc_base)
-   ffd2000000000000 |  -11.5  PB | ffd3ffffffffffff |  0.5 PB | ... unused hole
-   ffd4000000000000 |  -11    PB | ffd5ffffffffffff |  0.5 PB | virtual memory map (vmemmap_base)
-   ffd6000000000000 |  -10.5  PB | ffdeffffffffffff | 2.25 PB | ... unused hole
-   ffdf000000000000 |   -8.25 PB | fffffbffffffffff |   ~8 PB | KASAN shadow memory
-  __________________|____________|__________________|_________|____________________________________________________________
-                                                              |
-                                                              | Identical layout to the 47-bit one from here on:
-  ____________________________________________________________|____________________________________________________________
-                    |            |                  |         |
-   fffffc0000000000 |   -4    TB | fffffdffffffffff |    2 TB | ... unused hole
-                    |            |                  |         | vaddr_end for KASLR
-   fffffe0000000000 |   -2    TB | fffffe7fffffffff |  0.5 TB | cpu_entry_area mapping
-   fffffe8000000000 |   -1.5  TB | fffffeffffffffff |  0.5 TB | ... unused hole
-   ffffff0000000000 |   -1    TB | ffffff7fffffffff |  0.5 TB | %esp fixup stacks
-   ffffff8000000000 | -512    GB | ffffffeeffffffff |  444 GB | ... unused hole
-   ffffffef00000000 |  -68    GB | fffffffeffffffff |   64 GB | EFI region mapping space
-   ffffffff00000000 |   -4    GB | ffffffff7fffffff |    2 GB | ... unused hole
-   ffffffff80000000 |   -2    GB | ffffffff9fffffff |  512 MB | kernel text mapping, mapped to physical address 0
-   ffffffff80000000 |-2048    MB |                  |         |
-   ffffffffa0000000 |-1536    MB | fffffffffeffffff | 1520 MB | module mapping space
-   ffffffffff000000 |  -16    MB |                  |         |
-      FIXADDR_START | ~-11    MB | ffffffffff5fffff | ~0.5 MB | kernel-internal fixmap range, variable size and offset
-   ffffffffff600000 |  -10    MB | ffffffffff600fff |    4 kB | legacy vsyscall ABI
-   ffffffffffe00000 |   -2    MB | ffffffffffffffff |    2 MB | ... unused hole
-  __________________|____________|__________________|_________|___________________________________________________________
-
-Architecture defines a 64-bit virtual address. Implementations can support
-less. Currently supported are 48- and 57-bit virtual addresses. Bits 63
-through to the most-significant implemented bit are sign extended.
-This causes hole between user space and kernel addresses if you interpret them
-as unsigned.
-
-The direct mapping covers all memory in the system up to the highest
-memory address (this means in some cases it can also include PCI memory
-holes).
-
-We map EFI runtime services in the 'efi_pgd' PGD in a 64GB large virtual
-memory window (this size is arbitrary, it can be raised later if needed).
-The mappings are not part of any other kernel PGD and are only available
-during EFI runtime calls.
-
-Note that if CONFIG_RANDOMIZE_MEMORY is enabled, the direct mapping of all
-physical memory, vmalloc/ioremap space and virtual memory map are randomized.
-Their order is preserved but their base will be offset early at boot time.
-
-Be very careful vs. KASLR when changing anything here. The KASLR address
-range must not overlap with anything except the KASAN shadow area, which is
-correct as KASAN disables KASLR.
-
-For both 4- and 5-level layouts, the STACKLEAK_POISON value in the last 2MB
-hole: ffffffffffff4111
diff --git a/Documentation/x86/x86_64/uefi.rst b/Documentation/x86/x86_64/uefi.rst
deleted file mode 100644
index fbc30c9a071d..000000000000
--- a/Documentation/x86/x86_64/uefi.rst
+++ /dev/null
@@ -1,58 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-=====================================
-General note on [U]EFI x86_64 support
-=====================================
-
-The nomenclature EFI and UEFI are used interchangeably in this document.
-
-Although the tools below are _not_ needed for building the kernel,
-the needed bootloader support and associated tools for x86_64 platforms
-with EFI firmware and specifications are listed below.
-
-1. UEFI specification:  http://www.uefi.org
-
-2. Booting Linux kernel on UEFI x86_64 platform requires bootloader
-   support. Elilo with x86_64 support can be used.
-
-3. x86_64 platform with EFI/UEFI firmware.
-
-Mechanics
----------
-
-- Build the kernel with the following configuration::
-
-	CONFIG_FB_EFI=y
-	CONFIG_FRAMEBUFFER_CONSOLE=y
-
-  If EFI runtime services are expected, the following configuration should
-  be selected::
-
-	CONFIG_EFI=y
-	CONFIG_EFIVAR_FS=y or m		# optional
-
-- Create a VFAT partition on the disk
-- Copy the following to the VFAT partition:
-
-	elilo bootloader with x86_64 support, elilo configuration file,
-	kernel image built in first step and corresponding
-	initrd. Instructions on building elilo and its dependencies
-	can be found in the elilo sourceforge project.
-
-- Boot to EFI shell and invoke elilo choosing the kernel image built
-  in first step.
-- If some or all EFI runtime services don't work, you can try following
-  kernel command line parameters to turn off some or all EFI runtime
-  services.
-
-	noefi
-		turn off all EFI runtime services
-	reboot_type=k
-		turn off EFI reboot runtime service
-
-- If the EFI memory map has additional entries not in the E820 map,
-  you can include those entries in the kernels memory map of available
-  physical RAM by using the following kernel command line parameter.
-
-	add_efi_memmap
-		include EFI memory map of available physical RAM
author	Jonathan Corbet <corbet@lwn.net>	2023-03-15 00:06:44 +0100
committer	Jonathan Corbet <corbet@lwn.net>	2023-03-30 20:58:51 +0200
commit	ff61f0791ce969d2db6c9f3b71d74ceec0a2e958 (patch)
tree	fe32be44aaf65f9c436a8f37cd4a18f6ec47c3cb /Documentation/x86/x86_64
parent	Documentation: kernel-parameters: Remove meye entry (diff)
download	linux-ff61f0791ce969d2db6c9f3b71d74ceec0a2e958.tar.xz linux-ff61f0791ce969d2db6c9f3b71d74ceec0a2e958.zip