Magellan Linux

Contents of /trunk/kernel-alx-legacy/patches-4.9/0405-4.9.306-all-fixes.patch



Revision 3707
Mon Oct 24 14:08:20 2022 UTC by niro
File size: 102080 byte(s)
-linux-4.9.306
1 diff --git a/Documentation/hw-vuln/index.rst b/Documentation/hw-vuln/index.rst
2 index b5fbc6ae9d5fd..74466ba801678 100644
3 --- a/Documentation/hw-vuln/index.rst
4 +++ b/Documentation/hw-vuln/index.rst
5 @@ -9,6 +9,7 @@ are configurable at compile, boot or run time.
6 .. toctree::
7 :maxdepth: 1
8
9 + spectre
10 l1tf
11 mds
12 tsx_async_abort
13 diff --git a/Documentation/hw-vuln/spectre.rst b/Documentation/hw-vuln/spectre.rst
14 new file mode 100644
15 index 0000000000000..c6c43ac2ba43d
16 --- /dev/null
17 +++ b/Documentation/hw-vuln/spectre.rst
18 @@ -0,0 +1,785 @@
19 +.. SPDX-License-Identifier: GPL-2.0
20 +
21 +Spectre Side Channels
22 +=====================
23 +
24 +Spectre is a class of side channel attacks that exploit branch prediction
25 +and speculative execution on modern CPUs to read memory, possibly
26 +bypassing access controls. Speculative execution side channel exploits
27 +do not modify memory but attempt to infer privileged data in memory.
28 +
29 +This document covers Spectre variant 1 and Spectre variant 2.
30 +
31 +Affected processors
32 +-------------------
33 +
34 +Speculative execution side channel methods affect a wide range of modern
35 +high performance processors, since most modern high speed processors
36 +use branch prediction and speculative execution.
37 +
38 +The following CPUs are vulnerable:
39 +
40 + - Intel Core, Atom, Pentium, and Xeon processors
41 +
42 + - AMD Phenom, EPYC, and Zen processors
43 +
44 + - IBM POWER and zSeries processors
45 +
46 + - Higher end ARM processors
47 +
48 + - Apple CPUs
49 +
50 + - Higher end MIPS CPUs
51 +
52 + - Likely most other high performance CPUs. Contact your CPU vendor for details.
53 +
54 +Whether a processor is affected or not can be read out from the Spectre
55 +vulnerability files in sysfs. See :ref:`spectre_sys_info`.
56 +
57 +Related CVEs
58 +------------
59 +
60 +The following CVE entries describe Spectre variants:
61 +
62 + ============= ======================= ==========================
63 + CVE-2017-5753 Bounds check bypass Spectre variant 1
64 + CVE-2017-5715 Branch target injection Spectre variant 2
65 + CVE-2019-1125 Spectre v1 swapgs Spectre variant 1 (swapgs)
66 + ============= ======================= ==========================
67 +
68 +Problem
69 +-------
70 +
71 +CPUs use speculative operations to improve performance. That may leave
72 +traces of memory accesses or computations in the processor's caches,
73 +buffers, and branch predictors. Malicious software may be able to
74 +influence the speculative execution paths, and then use the side effects
75 +of the speculative execution in the CPUs' caches and buffers to infer
76 +privileged data touched during the speculative execution.
77 +
78 +Spectre variant 1 attacks take advantage of speculative execution of
79 +conditional branches, while Spectre variant 2 attacks use speculative
80 +execution of indirect branches to leak privileged memory.
81 +See :ref:`[1] <spec_ref1>` :ref:`[5] <spec_ref5>` :ref:`[6] <spec_ref6>`
82 +:ref:`[7] <spec_ref7>` :ref:`[10] <spec_ref10>` :ref:`[11] <spec_ref11>`.
83 +
84 +Spectre variant 1 (Bounds Check Bypass)
85 +---------------------------------------
86 +
87 +The bounds check bypass attack :ref:`[2] <spec_ref2>` takes advantage
88 +of speculative execution that bypasses conditional branch instructions
89 +used for memory access bounds check (e.g. checking if the index of an
90 +array results in memory access within a valid range). This results in
91 +memory accesses to invalid memory (with out-of-bound index) that are
92 +done speculatively before validation checks resolve. Such speculative
93 +memory accesses can leave side effects, creating side channels which
94 +leak information to the attacker.
95 +
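[Editor's note] A minimal C sketch of such a variant 1 gadget (all names here are hypothetical, not code from this patch)::

    /*
     * If "x" is attacker-controlled and the bounds check is
     * mis-speculated, the dependent load on array2 leaves a cache
     * footprint indexed by the secret byte read out of bounds.
     */
    static unsigned char array1[16];
    static unsigned char array2[256 * 512];

    unsigned char victim(unsigned long x, unsigned long array1_size)
    {
            unsigned char tmp = 0;

            if (x < array1_size)                  /* may be predicted taken */
                    tmp = array2[array1[x] * 512]; /* speculative OOB load + dependent load */
            return tmp;
    }
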
96 +There are some extensions of Spectre variant 1 attacks for reading data
97 +over the network, see :ref:`[12] <spec_ref12>`. However such attacks
98 +are difficult, low bandwidth, fragile, and are considered low risk.
99 +
100 +Note that, despite the "Bounds Check Bypass" name, Spectre variant 1
101 +is not only about user-controlled array bounds checks. It can affect
102 +any conditional check. The kernel entry code for interrupts,
103 +exceptions, and NMIs contains conditional swapgs checks. Those may be
104 +problematic in the context of Spectre v1, as kernel code can
105 +speculatively run with a user GS.
106 +
107 +Spectre variant 2 (Branch Target Injection)
108 +-------------------------------------------
109 +
110 +The branch target injection attack takes advantage of speculative
111 +execution of indirect branches :ref:`[3] <spec_ref3>`. The indirect
112 +branch predictors inside the processor used to guess the target of
113 +indirect branches can be influenced by an attacker, causing gadget code
114 +to be speculatively executed, thus exposing sensitive data touched by
115 +the victim. The side effects left in the CPU's caches during speculative
116 +execution can be measured to infer data values.
117 +
118 +.. _poison_btb:
119 +
120 +In Spectre variant 2 attacks, the attacker can steer speculative indirect
121 +branches in the victim to gadget code by poisoning the branch target
122 +buffer of a CPU used for predicting indirect branch addresses. Such
123 +poisoning could be done by indirect branching into existing code,
124 +with the address offset of the indirect branch under the attacker's
125 +control. Since the branch prediction on impacted hardware does not
126 +fully disambiguate branch address and uses the offset for prediction,
127 +this could cause privileged code's indirect branch to jump to gadget
128 +code with the same offset.
129 +
130 +The most useful gadgets take an attacker-controlled input parameter (such
131 +as a register value) so that the memory read can be controlled. Gadgets
132 +without input parameters might be possible, but the attacker would have
133 +very little control over what memory can be read, reducing the risk of
134 +the attack revealing useful data.
135 +
136 +One other variant 2 attack vector is for the attacker to poison the
137 +return stack buffer (RSB) :ref:`[13] <spec_ref13>` to cause speculative
138 +subroutine return instruction execution to go to a gadget. An attacker's
139 +imbalanced subroutine call instructions might "poison" entries in the
140 +return stack buffer which are later consumed by a victim's subroutine
141 +return instructions. This attack can be mitigated by flushing the return
142 +stack buffer on context switch, or virtual machine (VM) exit.
143 +
144 +On systems with simultaneous multi-threading (SMT), attacks are possible
145 +from the sibling thread, as level 1 cache and branch target buffer
146 +(BTB) may be shared between hardware threads in a CPU core. A malicious
147 +program running on the sibling thread may influence its peer's BTB to
148 +steer its indirect branch speculations to gadget code, and measure the
149 +speculative execution's side effects left in level 1 cache to infer the
150 +victim's data.
151 +
152 +Yet another variant 2 attack vector is for the attacker to poison the
153 +Branch History Buffer (BHB) to speculatively steer an indirect branch
154 +to a specific Branch Target Buffer (BTB) entry, even if the entry isn't
155 +associated with the source address of the indirect branch. Specifically,
156 +the BHB might be shared across privilege levels even in the presence of
157 +Enhanced IBRS.
158 +
159 +Currently the only known real-world BHB attack vector is via
160 +unprivileged eBPF. Therefore, it's highly recommended to not enable
161 +unprivileged eBPF, especially when eIBRS is used (without retpolines).
162 +For a full mitigation against BHB attacks, it's recommended to use
163 +retpolines (or eIBRS combined with retpolines).
164 +
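[Editor's note] On kernels built with CONFIG_BPF_SYSCALL, unprivileged eBPF can be disabled at run time through the same knob the sysfs reporting code added by this patch checks, for example::

    # echo 1 > /proc/sys/kernel/unprivileged_bpf_disabled
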
165 +Attack scenarios
166 +----------------
167 +
168 +The following list of attack scenarios has been anticipated, but may
169 +not cover all possible attack vectors.
170 +
171 +1. A user process attacking the kernel
172 +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
173 +
174 +Spectre variant 1
175 +~~~~~~~~~~~~~~~~~
176 +
177 + The attacker passes a parameter to the kernel via a register or
178 + via a known address in memory during a syscall. Such parameter may
179 + be used later by the kernel as an index to an array or to derive
180 + a pointer for a Spectre variant 1 attack. The index or pointer
181 + is invalid, but bound checks are bypassed in the code branch taken
182 + for speculative execution. This could cause privileged memory to be
183 + accessed and leaked.
184 +
185 + For kernel code where data pointers have been identified as
186 + potentially attacker-influenced for Spectre attacks, new "nospec"
187 + accessor macros are used to prevent speculative loading of data.
188 +
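[Editor's note] As a sketch of how one such accessor is used, array_index_nospec() from <linux/nospec.h> clamps the index even on mis-speculated paths (the surrounding function and table are hypothetical)::

    #include <linux/errno.h>
    #include <linux/nospec.h>

    long read_entry(long *table, unsigned long idx, unsigned long size)
    {
            if (idx >= size)
                    return -EINVAL;
            /* returns idx when idx < size, 0 otherwise, without a branch */
            idx = array_index_nospec(idx, size);
            return table[idx];
    }
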
189 +Spectre variant 1 (swapgs)
190 +~~~~~~~~~~~~~~~~~~~~~~~~~~
191 +
192 + An attacker can train the branch predictor to speculatively skip the
193 + swapgs path for an interrupt or exception. If they initialize
194 + the GS register to a user-space value and the swapgs is speculatively
195 + skipped, subsequent GS-related percpu accesses in the speculation
196 + window will be done with the attacker-controlled GS value. This
197 + could cause privileged memory to be accessed and leaked.
198 +
199 + For example:
200 +
201 + ::
202 +
203 + if (coming from user space)
204 + swapgs
205 + mov %gs:<percpu_offset>, %reg
206 + mov (%reg), %reg1
207 +
208 + When coming from user space, the CPU can speculatively skip the
209 + swapgs, and then do a speculative percpu load using the user GS
210 + value. So the user can speculatively force a read of any kernel
211 + value. If a gadget exists which uses the percpu value as an address
212 + in another load/store, then the contents of the kernel value may
213 + become visible via an L1 side channel attack.
214 +
215 + A similar attack exists when coming from kernel space. The CPU can
216 + speculatively do the swapgs, causing the user GS to get used for the
217 + rest of the speculative window.
218 +
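[Editor's note] At the C level, the pattern in the assembly above corresponds to a GS-relative percpu load followed by a dependent dereference; a hypothetical sketch, not code from this patch::

    static unsigned int percpu_gadget(void)
    {
            /* %gs-relative percpu load: honours whatever GS base is live */
            struct task_struct *cur = this_cpu_read(current_task);

            /* dependent load whose cache footprint can be timed via L1 */
            return cur->cred->uid.val;
    }
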
219 +Spectre variant 2
220 +~~~~~~~~~~~~~~~~~
221 +
222 + A Spectre variant 2 attacker can :ref:`poison <poison_btb>` the branch
223 + target buffer (BTB) before issuing a syscall to launch an attack.
224 + After entering the kernel, the kernel could use the poisoned branch
225 + target buffer on indirect jump and jump to gadget code in speculative
226 + execution.
227 +
228 + If an attacker tries to control the memory addresses leaked during
229 + speculative execution, they would also need to pass a parameter to
230 + the gadget, either through a register or a known address in memory.
231 + After the gadget has executed, they can measure the side effect.
232 +
233 + The kernel can protect itself against consuming poisoned branch
234 + target buffer entries by using return trampolines (also known as
235 + "retpoline") :ref:`[3] <spec_ref3>` :ref:`[9] <spec_ref9>` for all
236 + indirect branches. Return trampolines trap speculative execution paths
237 + to prevent jumping to gadget code during speculative execution.
238 + x86 CPUs with Enhanced Indirect Branch Restricted Speculation
239 + (Enhanced IBRS) available in hardware should use the feature to
240 + mitigate Spectre variant 2 instead of retpoline. Enhanced IBRS is
241 + more efficient than retpoline.
242 +
243 + There may be gadget code in firmware which could be exploited with
244 + Spectre variant 2 attack by a rogue user process. To mitigate such
245 + attacks on x86, Indirect Branch Restricted Speculation (IBRS) feature
246 + is turned on before the kernel invokes any firmware code.
247 +
248 +2. A user process attacking another user process
249 +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
250 +
251 + A malicious user process can try to attack another user process,
252 + either via a context switch on the same hardware thread, or from the
253 + sibling hyperthread sharing a physical processor core on a simultaneous
254 + multi-threading (SMT) system.
255 +
256 + Spectre variant 1 attacks generally require passing parameters
257 + between the processes, which needs a data passing relationship, such
258 + as remote procedure calls (RPC). Those parameters are used in gadget
259 + code to derive invalid data pointers accessing privileged memory in
260 + the attacked process.
261 +
262 + Spectre variant 2 attacks can be launched from a rogue process by
263 + :ref:`poisoning <poison_btb>` the branch target buffer. This can
264 + influence the indirect branch targets for a victim process that either
265 + runs later on the same hardware thread, or runs concurrently on
266 + a sibling hardware thread sharing the same physical core.
267 +
268 + A user process can protect itself against Spectre variant 2 attacks
269 + by using the prctl() syscall to disable indirect branch speculation
270 + for itself. An administrator can also cordon off an unsafe process
271 + from polluting the branch target buffer by disabling the process's
272 + indirect branch speculation. This comes with a performance cost
273 + from not using indirect branch speculation and clearing the branch
274 + target buffer. When SMT is enabled on x86, for a process that has
275 + indirect branch speculation disabled, Single Threaded Indirect Branch
276 + Predictors (STIBP) :ref:`[4] <spec_ref4>` are turned on to prevent the
277 + sibling thread from controlling the branch target buffer. In addition,
278 + the Indirect Branch Prediction Barrier (IBPB) is issued to clear the
279 + branch target buffer when context switching to and from such a process.
280 +
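[Editor's note] A process can request this for itself through the speculation-control prctl() interface referenced above; a minimal sketch using the upstream PR_SET_SPECULATION_CTRL constants::

    #include <sys/prctl.h>
    #include <linux/prctl.h>

    /*
     * Disable indirect branch speculation for the calling task.
     * On x86 this turns on STIBP for the task and triggers IBPB on
     * context switches to/from it.
     */
    int restrict_indirect_branches(void)
    {
            return prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH,
                         PR_SPEC_DISABLE, 0, 0);   /* returns 0 on success */
    }
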
281 + On x86, the return stack buffer is stuffed on context switch.
282 + This prevents the branch target buffer from being used for branch
283 + prediction when the return stack buffer underflows while switching to
284 + a deeper call stack. Any poisoned entries in the return stack buffer
285 + left by the previous process will also be cleared.
286 +
287 + User programs should use address space randomization to make attacks
288 + more difficult (Set /proc/sys/kernel/randomize_va_space = 1 or 2).
289 +
290 +3. A virtualized guest attacking the host
291 +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
292 +
293 + The attack mechanism is similar to how user processes attack the
294 + kernel. The kernel is entered via hyper-calls or other virtualization
295 + exit paths.
296 +
297 + For Spectre variant 1 attacks, rogue guests can pass parameters
298 + (e.g. in registers) via hyper-calls to derive invalid pointers to
299 + speculate into privileged memory after entering the kernel. For places
300 + where such kernel code has been identified, nospec accessor macros
301 + are used to stop speculative memory access.
302 +
303 + For Spectre variant 2 attacks, rogue guests can :ref:`poison
304 + <poison_btb>` the branch target buffer or return stack buffer, causing
305 + the kernel to jump to gadget code in the speculative execution paths.
306 +
307 + To mitigate variant 2, the host kernel can use return trampolines
308 + for indirect branches to bypass the poisoned branch target buffer,
309 + and flushing the return stack buffer on VM exit. This prevents rogue
310 + guests from affecting indirect branching in the host kernel.
311 +
312 + To protect host processes from rogue guests, host processes can have
313 + indirect branch speculation disabled via prctl(). The branch target
314 + buffer is cleared before context switching to such processes.
315 +
316 +4. A virtualized guest attacking another guest
317 +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
318 +
319 + A rogue guest may attack another guest to get data accessible by the
320 + other guest.
321 +
322 + Spectre variant 1 attacks are possible if parameters can be passed
323 + between guests. This may be done via mechanisms such as shared memory
324 + or message passing. Such parameters could be used to derive data
325 + pointers to privileged data in the guest. The privileged data could be
326 + accessed by gadget code in the victim's speculation paths.
327 +
328 + Spectre variant 2 attacks can be launched from a rogue guest by
329 + :ref:`poisoning <poison_btb>` the branch target buffer or the return
330 + stack buffer. Such poisoned entries could be used to influence
331 + speculative execution paths in the victim guest.
332 +
333 + The Linux kernel mitigates attacks against other guests running on the
334 + same CPU hardware thread by flushing the return stack buffer on VM exit,
335 + and clearing the branch target buffer before switching to a new guest.
336 +
337 + If SMT is used, Spectre variant 2 attacks from an untrusted guest
338 + in the sibling hyperthread can be mitigated by the administrator,
339 + by turning off the unsafe guest's indirect branch speculation via
340 + prctl(). A guest can also protect itself by turning on microcode
341 + based mitigations (such as IBPB or STIBP on x86) within the guest.
342 +
343 +.. _spectre_sys_info:
344 +
345 +Spectre system information
346 +--------------------------
347 +
348 +The Linux kernel provides a sysfs interface to enumerate the current
349 +mitigation status of the system for Spectre: whether the system is
350 +vulnerable, and which mitigations are active.
351 +
352 +The sysfs file showing Spectre variant 1 mitigation status is:
353 +
354 + /sys/devices/system/cpu/vulnerabilities/spectre_v1
355 +
356 +The possible values in this file are:
357 +
358 + .. list-table::
359 +
360 + * - 'Not affected'
361 + - The processor is not vulnerable.
362 + * - 'Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers'
363 + - The swapgs protections are disabled; otherwise it has
364 + protection in the kernel on a case by case basis with explicit
365 + pointer sanitization and usercopy LFENCE barriers.
366 + * - 'Mitigation: usercopy/swapgs barriers and __user pointer sanitization'
367 + - Protection in the kernel on a case by case basis with explicit
368 + pointer sanitization, usercopy LFENCE barriers, and swapgs LFENCE
369 + barriers.
370 +
371 +However, the protections are put in place on a case by case basis,
372 +and there is no guarantee that all possible attack vectors for Spectre
373 +variant 1 are covered.
374 +
375 +The spectre_v2 kernel file reports if the kernel has been compiled with
376 +retpoline mitigation or if the CPU has hardware mitigation, and if the
377 +CPU has support for additional process-specific mitigation.
378 +
379 +This file also reports CPU features enabled by microcode to mitigate
380 +attack between user processes:
381 +
382 +1. Indirect Branch Prediction Barrier (IBPB) to add additional
383 + isolation between processes of different users.
384 +2. Single Thread Indirect Branch Predictors (STIBP) to add additional
385 + isolation between CPU threads running on the same core.
386 +
387 +These CPU features may impact performance when used and can be enabled
388 +per process on a case-by-case basis.
389 +
390 +The sysfs file showing Spectre variant 2 mitigation status is:
391 +
392 + /sys/devices/system/cpu/vulnerabilities/spectre_v2
393 +
394 +The possible values in this file are:
395 +
396 + - Kernel status:
397 +
398 + ======================================== =================================
399 + 'Not affected' The processor is not vulnerable
400 + 'Mitigation: None' Vulnerable, no mitigation
401 + 'Mitigation: Retpolines' Use Retpoline thunks
402 + 'Mitigation: LFENCE' Use LFENCE instructions
403 + 'Mitigation: Enhanced IBRS' Hardware-focused mitigation
404 + 'Mitigation: Enhanced IBRS + Retpolines' Hardware-focused + Retpolines
405 + 'Mitigation: Enhanced IBRS + LFENCE' Hardware-focused + LFENCE
406 + ======================================== =================================
407 +
408 + - Firmware status: Show if Indirect Branch Restricted Speculation (IBRS) is
409 + used to protect against Spectre variant 2 attacks when calling firmware (x86 only).
410 +
411 + ========== =============================================================
412 + 'IBRS_FW' Protection against user program attacks when calling firmware
413 + ========== =============================================================
414 +
415 + - Indirect branch prediction barrier (IBPB) status for protection between
416 + processes of different users. This feature can be controlled through
417 + prctl() per process, or through kernel command line options. This is
418 + an x86 only feature. For more details see below.
419 +
420 + =================== ========================================================
421 + 'IBPB: disabled' IBPB unused
422 + 'IBPB: always-on' Use IBPB on all tasks
423 + 'IBPB: conditional' Use IBPB on SECCOMP or indirect branch restricted tasks
424 + =================== ========================================================
425 +
426 + - Single threaded indirect branch prediction (STIBP) status for protection
427 + between different hyper threads. This feature can be controlled through
428 + prctl() per process, or through kernel command line options. This is
429 + an x86 only feature. For more details see below.
430 +
431 + ==================== ========================================================
432 + 'STIBP: disabled' STIBP unused
433 + 'STIBP: forced' Use STIBP on all tasks
434 + 'STIBP: conditional' Use STIBP on SECCOMP or indirect branch restricted tasks
435 + ==================== ========================================================
436 +
437 + - Return stack buffer (RSB) protection status:
438 +
439 + ============= ===========================================
440 + 'RSB filling' Protection of RSB on context switch enabled
441 + ============= ===========================================
442 +
443 +Full mitigation might require a microcode update from the CPU
444 +vendor. When the necessary microcode is not available, the kernel will
445 +report the vulnerability.
446 +
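[Editor's note] These files are plain text and can be read like any other sysfs attribute; a small C sketch::

    #include <stdio.h>

    int main(void)
    {
            char line[256];
            FILE *f = fopen("/sys/devices/system/cpu/vulnerabilities/spectre_v2", "r");

            if (!f)
                    return 1;
            if (fgets(line, sizeof(line), f))
                    printf("%s", line); /* e.g. "Mitigation: Retpolines, IBPB: conditional, ..." */
            fclose(f);
            return 0;
    }
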
447 +Turning on mitigation for Spectre variant 1 and Spectre variant 2
448 +-----------------------------------------------------------------
449 +
450 +1. Kernel mitigation
451 +^^^^^^^^^^^^^^^^^^^^
452 +
453 +Spectre variant 1
454 +~~~~~~~~~~~~~~~~~
455 +
456 + For Spectre variant 1, vulnerable kernel code (as determined
457 + by code audit or scanning tools) is annotated on a case by case
458 + basis to use nospec accessor macros for bounds clipping :ref:`[2]
459 + <spec_ref2>` to avoid any usable disclosure gadgets. However, it may
460 + not cover all attack vectors for Spectre variant 1.
461 +
462 + Copy-from-user code has an LFENCE barrier to prevent the access_ok()
463 + check from being mis-speculated. The barrier is done by the
464 + barrier_nospec() macro.
465 +
466 + For the swapgs variant of Spectre variant 1, LFENCE barriers are
467 + added to interrupt, exception and NMI entry where needed. These
468 + barriers are done by the FENCE_SWAPGS_KERNEL_ENTRY and
469 + FENCE_SWAPGS_USER_ENTRY macros.
470 +
471 +Spectre variant 2
472 +~~~~~~~~~~~~~~~~~
473 +
474 + For Spectre variant 2 mitigation, the compiler turns indirect calls or
475 + jumps in the kernel into equivalent return trampolines (retpolines)
476 + :ref:`[3] <spec_ref3>` :ref:`[9] <spec_ref9>` to go to the target
477 + addresses. Speculative execution paths under retpolines are trapped
478 + in an infinite loop to prevent any speculative execution jumping to
479 + a gadget.
480 +
481 + To turn on retpoline mitigation on a vulnerable CPU, the kernel
482 + needs to be compiled with a gcc compiler that supports the
483 + -mindirect-branch=thunk-extern -mindirect-branch-register options.
484 + If the kernel is compiled with a Clang compiler, the compiler needs
485 + to support -mretpoline-external-thunk option. The kernel config
486 + CONFIG_RETPOLINE needs to be turned on, and the CPU needs to run with
487 + the latest updated microcode.
488 +
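[Editor's note] Whether the toolchain supports these options can be probed the same way the kernel build's cc-option test does; a sketch, not part of the patch::

    gcc -Werror -mindirect-branch=thunk-extern -mindirect-branch-register \
        -c -x c /dev/null -o /dev/null && echo "retpoline-capable"
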
489 + On Intel Skylake-era systems the mitigation covers most, but not all,
490 + cases. See :ref:`[3] <spec_ref3>` for more details.
491 +
492 + On CPUs with hardware mitigation for Spectre variant 2 (e.g. Enhanced
493 + IBRS on x86), retpoline is automatically disabled at run time.
494 +
495 + The retpoline mitigation is turned on by default on vulnerable
496 + CPUs. It can be forced on or off by the administrator
497 + via the kernel command line and sysfs control files. See
498 + :ref:`spectre_mitigation_control_command_line`.
499 +
500 + On x86, indirect branch restricted speculation is turned on by default
501 + before invoking any firmware code to prevent Spectre variant 2 exploits
502 + using the firmware.
503 +
504 + Using kernel address space randomization (CONFIG_RANDOMIZE_BASE=y
505 + and CONFIG_SLAB_FREELIST_RANDOM=y in the kernel configuration) makes
506 + attacks on the kernel generally more difficult.
507 +
508 +2. User program mitigation
509 +^^^^^^^^^^^^^^^^^^^^^^^^^^
510 +
511 + User programs can mitigate Spectre variant 1 using LFENCE or "bounds
512 + clipping". For more details see :ref:`[2] <spec_ref2>`.
513 +
514 + For Spectre variant 2 mitigation, individual user programs
515 + can be compiled with return trampolines for indirect branches.
516 + This protects them from consuming poisoned entries in the branch
517 + target buffer left by malicious software. Alternatively, the
518 + programs can disable their indirect branch speculation via prctl()
519 + (See Documentation/spec_ctrl.txt).
520 + On x86, this will turn on STIBP to guard against attacks from the
521 + sibling thread when the user program is running, and use IBPB to
522 + flush the branch target buffer when switching to/from the program.
523 +
524 + Restricting indirect branch speculation on a user program will
525 + also prevent the program from launching a variant 2 attack
526 + on x86. All sand-boxed SECCOMP programs have indirect branch
527 + speculation restricted by default. Administrators can change
528 + that behavior via the kernel command line and sysfs control files.
529 + See :ref:`spectre_mitigation_control_command_line`.
530 +
531 + Programs that disable their indirect branch speculation will have
532 + more overhead and run slower.
533 +
534 + User programs should use address space randomization
535 + (/proc/sys/kernel/randomize_va_space = 1 or 2) to make attacks more
536 + difficult.
537 +
538 +3. VM mitigation
539 +^^^^^^^^^^^^^^^^
540 +
541 + Within the kernel, Spectre variant 1 attacks from rogue guests are
542 + mitigated on a case by case basis in VM exit paths. Vulnerable code
543 + uses nospec accessor macros for "bounds clipping", to avoid any
544 + usable disclosure gadgets. However, this may not cover all variant
545 + 1 attack vectors.
546 +
547 + For Spectre variant 2 attacks from rogue guests to the kernel, the
548 + Linux kernel uses retpoline or Enhanced IBRS to prevent consumption of
549 + poisoned entries in branch target buffer left by rogue guests. It also
550 + flushes the return stack buffer on every VM exit, both to prevent an
551 + underflow that would let the poisoned branch target buffer be used and
552 + to clear poisoned return stack buffer entries left by attacker guests.
553 +
554 + To mitigate guest-to-guest attacks in the same CPU hardware thread,
555 + the branch target buffer is sanitized by flushing it before switching
556 + to a new guest on a CPU.
557 +
558 + The above mitigations are turned on by default on vulnerable CPUs.
559 +
560 + To mitigate guest-to-guest attacks from the sibling thread when SMT is
561 + in use, an untrusted guest running in the sibling thread can have
562 + its indirect branch speculation disabled by the administrator via prctl().
563 +
564 + The kernel also allows guests to use any microcode based mitigation
565 + they choose (such as IBPB or STIBP on x86) to protect themselves.
566 +
567 +.. _spectre_mitigation_control_command_line:
568 +
569 +Mitigation control on the kernel command line
570 +---------------------------------------------
571 +
572 +Spectre variant 2 mitigation can be disabled or force enabled at the
573 +kernel command line.
574 +
575 + nospectre_v1
576 +
577 + [X86,PPC] Disable mitigations for Spectre Variant 1
578 + (bounds check bypass). With this option data leaks are
579 + possible in the system.
580 +
581 + nospectre_v2
582 +
583 + [X86] Disable all mitigations for the Spectre variant 2
584 + (indirect branch prediction) vulnerability. System may
585 + allow data leaks with this option, which is equivalent
586 + to spectre_v2=off.
587 +
588 +
589 + spectre_v2=
590 +
591 + [X86] Control mitigation of Spectre variant 2
592 + (indirect branch speculation) vulnerability.
593 + The default operation protects the kernel from
594 + user space attacks.
595 +
596 + on
597 + unconditionally enable, implies
598 + spectre_v2_user=on
599 + off
600 + unconditionally disable, implies
601 + spectre_v2_user=off
602 + auto
603 + kernel detects whether your CPU model is
604 + vulnerable
605 +
606 + Selecting 'on' will, and 'auto' may, choose a
607 + mitigation method at run time according to the
608 + CPU, the available microcode, the setting of the
609 + CONFIG_RETPOLINE configuration option, and the
610 + compiler with which the kernel was built.
611 +
612 + Selecting 'on' will also enable the mitigation
613 + against user space to user space task attacks.
614 +
615 + Selecting 'off' will disable both the kernel and
616 + the user space protections.
617 +
618 + Specific mitigations can also be selected manually:
619 +
620 + retpoline auto pick between generic,lfence
621 + retpoline,generic Retpolines
622 + retpoline,lfence LFENCE; indirect branch
623 + retpoline,amd alias for retpoline,lfence
624 + eibrs enhanced IBRS
625 + eibrs,retpoline enhanced IBRS + Retpolines
626 + eibrs,lfence enhanced IBRS + LFENCE
627 +
628 + Not specifying this option is equivalent to
629 + spectre_v2=auto.
630 +
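[Editor's note] For example, to force the retpoline-based mitigation regardless of auto-detection, the kernel could be booted with::

    spectre_v2=retpoline,generic
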
631 +For user space mitigation:
632 +
633 + spectre_v2_user=
634 +
635 + [X86] Control mitigation of Spectre variant 2
636 + (indirect branch speculation) vulnerability between
637 + user space tasks
638 +
639 + on
640 + Unconditionally enable mitigations. Is
641 + enforced by spectre_v2=on
642 +
643 + off
644 + Unconditionally disable mitigations. Is
645 + enforced by spectre_v2=off
646 +
647 + prctl
648 + Indirect branch speculation is enabled,
649 + but mitigation can be enabled via prctl
650 + per thread. The mitigation control state
651 + is inherited on fork.
652 +
653 + prctl,ibpb
654 + Like "prctl" above, but only STIBP is
655 + controlled per thread. IBPB is issued
656 + always when switching between different user
657 + space processes.
658 +
659 + seccomp
660 + Same as "prctl" above, but all seccomp
661 + threads will enable the mitigation unless
662 + they explicitly opt out.
663 +
664 + seccomp,ibpb
665 + Like "seccomp" above, but only STIBP is
666 + controlled per thread. IBPB is issued
667 + always when switching between different
668 + user space processes.
669 +
670 + auto
671 + Kernel selects the mitigation depending on
672 + the available CPU features and vulnerability.
673 +
674 + Default mitigation:
675 + If CONFIG_SECCOMP=y then "seccomp", otherwise "prctl"
676 +
677 + Not specifying this option is equivalent to
678 + spectre_v2_user=auto.
679 +
680 + In general the kernel by default selects
681 + reasonable mitigations for the current CPU. To
682 + disable Spectre variant 2 mitigations, boot with
683 + spectre_v2=off. Spectre variant 1 mitigations
684 + cannot be disabled.
685 +
686 +Mitigation selection guide
687 +--------------------------
688 +
689 +1. Trusted userspace
690 +^^^^^^^^^^^^^^^^^^^^
691 +
692 + If all userspace applications are from trusted sources and do not
693 + execute externally supplied untrusted code, then the mitigations can
694 + be disabled.
695 +
696 +2. Protect sensitive programs
697 +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
698 +
699 + For security-sensitive programs that have secrets (e.g. crypto
700 + keys), protection against Spectre variant 2 can be put in place by
701 + disabling indirect branch speculation when the program is running
702 + (See Documentation/spec_ctrl.txt).
703 +
704 +3. Sandbox untrusted programs
705 +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
706 +
707 + Untrusted programs that could be a source of attacks can be cordoned
708 + off by disabling their indirect branch speculation when they are run
709 + (See Documentation/spec_ctrl.txt).
710 + This prevents untrusted programs from polluting the branch target
711 + buffer. All programs running in SECCOMP sandboxes have indirect
712 + branch speculation restricted by default. This behavior can be
713 + changed via the kernel command line and sysfs control files. See
714 + :ref:`spectre_mitigation_control_command_line`.
715 +
716 +4. High security mode
717 +^^^^^^^^^^^^^^^^^^^^^
718 +
719 + All Spectre variant 2 mitigations can be forced on
720 + at boot time for all programs (See the "on" option in
721 + :ref:`spectre_mitigation_control_command_line`). This will add
722 + overhead as indirect branch speculations for all programs will be
723 + restricted.
724 +
725 + On x86, the branch target buffer will be flushed with IBPB when switching
726 + to a new program. STIBP is left on all the time to protect programs
727 + against variant 2 attacks originating from programs running on
728 + sibling threads.
729 +
730 + Alternatively, STIBP can be used only when running programs
731 + whose indirect branch speculation is explicitly disabled,
732 + while IBPB is still used all the time when switching to a new
733 + program to clear the branch target buffer (See "ibpb" option in
734 + :ref:`spectre_mitigation_control_command_line`). This "ibpb" option
735 + has less performance cost than the "on" option, which leaves STIBP
736 + on all the time.
737 +
738 +References on Spectre
739 +---------------------
740 +
741 +Intel white papers:
742 +
743 +.. _spec_ref1:
744 +
745 +[1] `Intel analysis of speculative execution side channels <https://newsroom.intel.com/wp-content/uploads/sites/11/2018/01/Intel-Analysis-of-Speculative-Execution-Side-Channels.pdf>`_.
746 +
747 +.. _spec_ref2:
748 +
749 +[2] `Bounds check bypass <https://software.intel.com/security-software-guidance/software-guidance/bounds-check-bypass>`_.
750 +
751 +.. _spec_ref3:
752 +
753 +[3] `Deep dive: Retpoline: A branch target injection mitigation <https://software.intel.com/security-software-guidance/insights/deep-dive-retpoline-branch-target-injection-mitigation>`_.
754 +
755 +.. _spec_ref4:
756 +
757 +[4] `Deep Dive: Single Thread Indirect Branch Predictors <https://software.intel.com/security-software-guidance/insights/deep-dive-single-thread-indirect-branch-predictors>`_.
758 +
759 +AMD white papers:
760 +
761 +.. _spec_ref5:
762 +
763 +[5] `AMD64 technology indirect branch control extension <https://developer.amd.com/wp-content/resources/Architecture_Guidelines_Update_Indirect_Branch_Control.pdf>`_.
764 +
765 +.. _spec_ref6:
766 +
767 +[6] `Software techniques for managing speculation on AMD processors <https://developer.amd.com/wp-content/resources/Managing-Speculation-on-AMD-Processors.pdf>`_.
768 +
769 +ARM white papers:
770 +
771 +.. _spec_ref7:
772 +
773 +[7] `Cache speculation side-channels <https://developer.arm.com/support/arm-security-updates/speculative-processor-vulnerability/download-the-whitepaper>`_.
774 +
775 +.. _spec_ref8:
776 +
777 +[8] `Cache speculation issues update <https://developer.arm.com/support/arm-security-updates/speculative-processor-vulnerability/latest-updates/cache-speculation-issues-update>`_.
778 +
779 +Google white paper:
780 +
781 +.. _spec_ref9:
782 +
783 +[9] `Retpoline: a software construct for preventing branch-target-injection <https://support.google.com/faqs/answer/7625886>`_.
784 +
785 +MIPS white paper:
786 +
787 +.. _spec_ref10:
788 +
789 +[10] `MIPS: response on speculative execution and side channel vulnerabilities <https://www.mips.com/blog/mips-response-on-speculative-execution-and-side-channel-vulnerabilities/>`_.
790 +
791 +Academic papers:
792 +
793 +.. _spec_ref11:
794 +
795 +[11] `Spectre Attacks: Exploiting Speculative Execution <https://spectreattack.com/spectre.pdf>`_.
796 +
797 +.. _spec_ref12:
798 +
799 +[12] `NetSpectre: Read Arbitrary Memory over Network <https://arxiv.org/abs/1807.10535>`_.
800 +
801 +.. _spec_ref13:
802 +
803 +[13] `Spectre Returns! Speculation Attacks using the Return Stack Buffer <https://www.usenix.org/system/files/conference/woot18/woot18-paper-koruyeh.pdf>`_.
804 diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
805 index 713765521c451..6c0957c67d207 100644
806 --- a/Documentation/kernel-parameters.txt
807 +++ b/Documentation/kernel-parameters.txt
808 @@ -4174,8 +4174,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
809 Specific mitigations can also be selected manually:
810
811 retpoline - replace indirect branches
812 - retpoline,generic - google's original retpoline
813 - retpoline,amd - AMD-specific minimal thunk
814 + retpoline,generic - Retpolines
815 + retpoline,lfence - LFENCE; indirect branch
816 + retpoline,amd - alias for retpoline,lfence
817 + eibrs - enhanced IBRS
818 + eibrs,retpoline - enhanced IBRS + Retpolines
819 + eibrs,lfence - enhanced IBRS + LFENCE
820
821 Not specifying this option is equivalent to
822 spectre_v2=auto.
823 diff --git a/Makefile b/Makefile
824 index 308c848b01dc2..482b841188572 100644
825 --- a/Makefile
826 +++ b/Makefile
827 @@ -1,6 +1,6 @@
828 VERSION = 4
829 PATCHLEVEL = 9
830 -SUBLEVEL = 305
831 +SUBLEVEL = 306
832 EXTRAVERSION =
833 NAME = Roaring Lionus
834
835 diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
836 index 7d727506096f6..2fa3fd30a9d61 100644
837 --- a/arch/arm/include/asm/assembler.h
838 +++ b/arch/arm/include/asm/assembler.h
839 @@ -108,6 +108,16 @@
840 .endm
841 #endif
842
843 +#if __LINUX_ARM_ARCH__ < 7
844 + .macro dsb, args
845 + mcr p15, 0, r0, c7, c10, 4
846 + .endm
847 +
848 + .macro isb, args
849 + mcr p15, 0, r0, c7, c5, 4
850 + .endm
851 +#endif
852 +
853 .macro asm_trace_hardirqs_off, save=1
854 #if defined(CONFIG_TRACE_IRQFLAGS)
855 .if \save
856 diff --git a/arch/arm/include/asm/spectre.h b/arch/arm/include/asm/spectre.h
857 new file mode 100644
858 index 0000000000000..d1fa5607d3aa3
859 --- /dev/null
860 +++ b/arch/arm/include/asm/spectre.h
861 @@ -0,0 +1,32 @@
862 +/* SPDX-License-Identifier: GPL-2.0-only */
863 +
864 +#ifndef __ASM_SPECTRE_H
865 +#define __ASM_SPECTRE_H
866 +
867 +enum {
868 + SPECTRE_UNAFFECTED,
869 + SPECTRE_MITIGATED,
870 + SPECTRE_VULNERABLE,
871 +};
872 +
873 +enum {
874 + __SPECTRE_V2_METHOD_BPIALL,
875 + __SPECTRE_V2_METHOD_ICIALLU,
876 + __SPECTRE_V2_METHOD_SMC,
877 + __SPECTRE_V2_METHOD_HVC,
878 + __SPECTRE_V2_METHOD_LOOP8,
879 +};
880 +
881 +enum {
882 + SPECTRE_V2_METHOD_BPIALL = BIT(__SPECTRE_V2_METHOD_BPIALL),
883 + SPECTRE_V2_METHOD_ICIALLU = BIT(__SPECTRE_V2_METHOD_ICIALLU),
884 + SPECTRE_V2_METHOD_SMC = BIT(__SPECTRE_V2_METHOD_SMC),
885 + SPECTRE_V2_METHOD_HVC = BIT(__SPECTRE_V2_METHOD_HVC),
886 + SPECTRE_V2_METHOD_LOOP8 = BIT(__SPECTRE_V2_METHOD_LOOP8),
887 +};
888 +
889 +void spectre_v2_update_state(unsigned int state, unsigned int methods);
890 +
891 +int spectre_bhb_update_vectors(unsigned int method);
892 +
893 +#endif
894 diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
895 index 9bddd762880cf..1738d5b61eaa1 100644
896 --- a/arch/arm/kernel/Makefile
897 +++ b/arch/arm/kernel/Makefile
898 @@ -100,4 +100,6 @@ endif
899
900 obj-$(CONFIG_HAVE_ARM_SMCCC) += smccc-call.o
901
902 +obj-$(CONFIG_GENERIC_CPU_VULNERABILITIES) += spectre.o
903 +
904 extra-y := $(head-y) vmlinux.lds
905 diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
906 index 2cac25a69a85d..1040efcb98db6 100644
907 --- a/arch/arm/kernel/entry-armv.S
908 +++ b/arch/arm/kernel/entry-armv.S
909 @@ -1036,12 +1036,11 @@ vector_\name:
910 sub lr, lr, #\correction
911 .endif
912
913 - @
914 - @ Save r0, lr_<exception> (parent PC) and spsr_<exception>
915 - @ (parent CPSR)
916 - @
917 + @ Save r0, lr_<exception> (parent PC)
918 stmia sp, {r0, lr} @ save r0, lr
919 - mrs lr, spsr
920 +
921 + @ Save spsr_<exception> (parent CPSR)
922 +2: mrs lr, spsr
923 str lr, [sp, #8] @ save spsr
924
925 @
926 @@ -1062,6 +1061,44 @@ vector_\name:
927 movs pc, lr @ branch to handler in SVC mode
928 ENDPROC(vector_\name)
929
930 +#ifdef CONFIG_HARDEN_BRANCH_HISTORY
931 + .subsection 1
932 + .align 5
933 +vector_bhb_loop8_\name:
934 + .if \correction
935 + sub lr, lr, #\correction
936 + .endif
937 +
938 + @ Save r0, lr_<exception> (parent PC)
939 + stmia sp, {r0, lr}
940 +
941 + @ bhb workaround
942 + mov r0, #8
943 +1: b . + 4
944 + subs r0, r0, #1
945 + bne 1b
946 + dsb
947 + isb
948 + b 2b
949 +ENDPROC(vector_bhb_loop8_\name)
950 +
951 +vector_bhb_bpiall_\name:
952 + .if \correction
953 + sub lr, lr, #\correction
954 + .endif
955 +
956 + @ Save r0, lr_<exception> (parent PC)
957 + stmia sp, {r0, lr}
958 +
959 + @ bhb workaround
960 + mcr p15, 0, r0, c7, c5, 6 @ BPIALL
961 + @ isb not needed due to "movs pc, lr" in the vector stub
962 + @ which gives a "context synchronisation".
963 + b 2b
964 +ENDPROC(vector_bhb_bpiall_\name)
965 + .previous
966 +#endif
967 +
968 .align 2
969 @ handler addresses follow this label
970 1:
971 @@ -1070,6 +1107,10 @@ ENDPROC(vector_\name)
972 .section .stubs, "ax", %progbits
973 @ This must be the first word
974 .word vector_swi
975 +#ifdef CONFIG_HARDEN_BRANCH_HISTORY
976 + .word vector_bhb_loop8_swi
977 + .word vector_bhb_bpiall_swi
978 +#endif
979
980 vector_rst:
981 ARM( swi SYS_ERROR0 )
982 @@ -1184,8 +1225,10 @@ vector_addrexcptn:
983 * FIQ "NMI" handler
984 *-----------------------------------------------------------------------------
985 * Handle a FIQ using the SVC stack allowing FIQ act like NMI on x86
986 - * systems.
987 + * systems. This must be the last vector stub, so lets place it in its own
988 + * subsection.
989 */
990 + .subsection 2
991 vector_stub fiq, FIQ_MODE, 4
992
993 .long __fiq_usr @ 0 (USR_26 / USR_32)
994 @@ -1218,6 +1261,30 @@ vector_addrexcptn:
995 W(b) vector_irq
996 W(b) vector_fiq
997
998 +#ifdef CONFIG_HARDEN_BRANCH_HISTORY
999 + .section .vectors.bhb.loop8, "ax", %progbits
1000 +.L__vectors_bhb_loop8_start:
1001 + W(b) vector_rst
1002 + W(b) vector_bhb_loop8_und
1003 + W(ldr) pc, .L__vectors_bhb_loop8_start + 0x1004
1004 + W(b) vector_bhb_loop8_pabt
1005 + W(b) vector_bhb_loop8_dabt
1006 + W(b) vector_addrexcptn
1007 + W(b) vector_bhb_loop8_irq
1008 + W(b) vector_bhb_loop8_fiq
1009 +
1010 + .section .vectors.bhb.bpiall, "ax", %progbits
1011 +.L__vectors_bhb_bpiall_start:
1012 + W(b) vector_rst
1013 + W(b) vector_bhb_bpiall_und
1014 + W(ldr) pc, .L__vectors_bhb_bpiall_start + 0x1008
1015 + W(b) vector_bhb_bpiall_pabt
1016 + W(b) vector_bhb_bpiall_dabt
1017 + W(b) vector_addrexcptn
1018 + W(b) vector_bhb_bpiall_irq
1019 + W(b) vector_bhb_bpiall_fiq
1020 +#endif
1021 +
1022 .data
1023
1024 .globl cr_alignment
1025 diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
1026 index 178a2a9606595..fb0f505c9924f 100644
1027 --- a/arch/arm/kernel/entry-common.S
1028 +++ b/arch/arm/kernel/entry-common.S
1029 @@ -142,6 +142,29 @@ ENDPROC(ret_from_fork)
1030 *-----------------------------------------------------------------------------
1031 */
1032
1033 + .align 5
1034 +#ifdef CONFIG_HARDEN_BRANCH_HISTORY
1035 +ENTRY(vector_bhb_loop8_swi)
1036 + sub sp, sp, #PT_REGS_SIZE
1037 + stmia sp, {r0 - r12}
1038 + mov r8, #8
1039 +1: b 2f
1040 +2: subs r8, r8, #1
1041 + bne 1b
1042 + dsb
1043 + isb
1044 + b 3f
1045 +ENDPROC(vector_bhb_loop8_swi)
1046 +
1047 + .align 5
1048 +ENTRY(vector_bhb_bpiall_swi)
1049 + sub sp, sp, #PT_REGS_SIZE
1050 + stmia sp, {r0 - r12}
1051 + mcr p15, 0, r8, c7, c5, 6 @ BPIALL
1052 + isb
1053 + b 3f
1054 +ENDPROC(vector_bhb_bpiall_swi)
1055 +#endif
1056 .align 5
1057 ENTRY(vector_swi)
1058 #ifdef CONFIG_CPU_V7M
1059 @@ -149,6 +172,7 @@ ENTRY(vector_swi)
1060 #else
1061 sub sp, sp, #PT_REGS_SIZE
1062 stmia sp, {r0 - r12} @ Calling r0 - r12
1063 +3:
1064 ARM( add r8, sp, #S_PC )
1065 ARM( stmdb r8, {sp, lr}^ ) @ Calling sp, lr
1066 THUMB( mov r8, sp )
1067 diff --git a/arch/arm/kernel/spectre.c b/arch/arm/kernel/spectre.c
1068 new file mode 100644
1069 index 0000000000000..0dcefc36fb7a0
1070 --- /dev/null
1071 +++ b/arch/arm/kernel/spectre.c
1072 @@ -0,0 +1,71 @@
1073 +// SPDX-License-Identifier: GPL-2.0-only
1074 +#include <linux/bpf.h>
1075 +#include <linux/cpu.h>
1076 +#include <linux/device.h>
1077 +
1078 +#include <asm/spectre.h>
1079 +
1080 +static bool _unprivileged_ebpf_enabled(void)
1081 +{
1082 +#ifdef CONFIG_BPF_SYSCALL
1083 + return !sysctl_unprivileged_bpf_disabled;
1084 +#else
1085 + return false;
1086 +#endif
1087 +}
1088 +
1089 +ssize_t cpu_show_spectre_v1(struct device *dev, struct device_attribute *attr,
1090 + char *buf)
1091 +{
1092 + return sprintf(buf, "Mitigation: __user pointer sanitization\n");
1093 +}
1094 +
1095 +static unsigned int spectre_v2_state;
1096 +static unsigned int spectre_v2_methods;
1097 +
1098 +void spectre_v2_update_state(unsigned int state, unsigned int method)
1099 +{
1100 + if (state > spectre_v2_state)
1101 + spectre_v2_state = state;
1102 + spectre_v2_methods |= method;
1103 +}
1104 +
1105 +ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr,
1106 + char *buf)
1107 +{
1108 + const char *method;
1109 +
1110 + if (spectre_v2_state == SPECTRE_UNAFFECTED)
1111 + return sprintf(buf, "%s\n", "Not affected");
1112 +
1113 + if (spectre_v2_state != SPECTRE_MITIGATED)
1114 + return sprintf(buf, "%s\n", "Vulnerable");
1115 +
1116 + if (_unprivileged_ebpf_enabled())
1117 + return sprintf(buf, "Vulnerable: Unprivileged eBPF enabled\n");
1118 +
1119 + switch (spectre_v2_methods) {
1120 + case SPECTRE_V2_METHOD_BPIALL:
1121 + method = "Branch predictor hardening";
1122 + break;
1123 +
1124 + case SPECTRE_V2_METHOD_ICIALLU:
1125 + method = "I-cache invalidation";
1126 + break;
1127 +
1128 + case SPECTRE_V2_METHOD_SMC:
1129 + case SPECTRE_V2_METHOD_HVC:
1130 + method = "Firmware call";
1131 + break;
1132 +
1133 + case SPECTRE_V2_METHOD_LOOP8:
1134 + method = "History overwrite";
1135 + break;
1136 +
1137 + default:
1138 + method = "Multiple mitigations";
1139 + break;
1140 + }
1141 +
1142 + return sprintf(buf, "Mitigation: %s\n", method);
1143 +}
1144 diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
1145 index aa316a7562b1f..7fca7ece8f979 100644
1146 --- a/arch/arm/kernel/traps.c
1147 +++ b/arch/arm/kernel/traps.c
1148 @@ -31,6 +31,7 @@
1149 #include <linux/atomic.h>
1150 #include <asm/cacheflush.h>
1151 #include <asm/exception.h>
1152 +#include <asm/spectre.h>
1153 #include <asm/unistd.h>
1154 #include <asm/traps.h>
1155 #include <asm/ptrace.h>
1156 @@ -819,10 +820,59 @@ static inline void __init kuser_init(void *vectors)
1157 }
1158 #endif
1159
1160 +#ifndef CONFIG_CPU_V7M
1161 +static void copy_from_lma(void *vma, void *lma_start, void *lma_end)
1162 +{
1163 + memcpy(vma, lma_start, lma_end - lma_start);
1164 +}
1165 +
1166 +static void flush_vectors(void *vma, size_t offset, size_t size)
1167 +{
1168 + unsigned long start = (unsigned long)vma + offset;
1169 + unsigned long end = start + size;
1170 +
1171 + flush_icache_range(start, end);
1172 +}
1173 +
1174 +#ifdef CONFIG_HARDEN_BRANCH_HISTORY
1175 +int spectre_bhb_update_vectors(unsigned int method)
1176 +{
1177 + extern char __vectors_bhb_bpiall_start[], __vectors_bhb_bpiall_end[];
1178 + extern char __vectors_bhb_loop8_start[], __vectors_bhb_loop8_end[];
1179 + void *vec_start, *vec_end;
1180 +
1181 + if (system_state >= SYSTEM_RUNNING) {
1182 + pr_err("CPU%u: Spectre BHB workaround too late - system vulnerable\n",
1183 + smp_processor_id());
1184 + return SPECTRE_VULNERABLE;
1185 + }
1186 +
1187 + switch (method) {
1188 + case SPECTRE_V2_METHOD_LOOP8:
1189 + vec_start = __vectors_bhb_loop8_start;
1190 + vec_end = __vectors_bhb_loop8_end;
1191 + break;
1192 +
1193 + case SPECTRE_V2_METHOD_BPIALL:
1194 + vec_start = __vectors_bhb_bpiall_start;
1195 + vec_end = __vectors_bhb_bpiall_end;
1196 + break;
1197 +
1198 + default:
1199 + pr_err("CPU%u: unknown Spectre BHB state %d\n",
1200 + smp_processor_id(), method);
1201 + return SPECTRE_VULNERABLE;
1202 + }
1203 +
1204 + copy_from_lma(vectors_page, vec_start, vec_end);
1205 + flush_vectors(vectors_page, 0, vec_end - vec_start);
1206 +
1207 + return SPECTRE_MITIGATED;
1208 +}
1209 +#endif
1210 +
1211 void __init early_trap_init(void *vectors_base)
1212 {
1213 -#ifndef CONFIG_CPU_V7M
1214 - unsigned long vectors = (unsigned long)vectors_base;
1215 extern char __stubs_start[], __stubs_end[];
1216 extern char __vectors_start[], __vectors_end[];
1217 unsigned i;
1218 @@ -843,17 +893,20 @@ void __init early_trap_init(void *vectors_base)
1219 * into the vector page, mapped at 0xffff0000, and ensure these
1220 * are visible to the instruction stream.
1221 */
1222 - memcpy((void *)vectors, __vectors_start, __vectors_end - __vectors_start);
1223 - memcpy((void *)vectors + 0x1000, __stubs_start, __stubs_end - __stubs_start);
1224 + copy_from_lma(vectors_base, __vectors_start, __vectors_end);
1225 + copy_from_lma(vectors_base + 0x1000, __stubs_start, __stubs_end);
1226
1227 kuser_init(vectors_base);
1228
1229 - flush_icache_range(vectors, vectors + PAGE_SIZE * 2);
1230 + flush_vectors(vectors_base, 0, PAGE_SIZE * 2);
1231 +}
1232 #else /* ifndef CONFIG_CPU_V7M */
1233 +void __init early_trap_init(void *vectors_base)
1234 +{
1235 /*
1236 * on V7-M there is no need to copy the vector table to a dedicated
1237 * memory area. The address is configurable and so a table in the kernel
1238 * image can be used.
1239 */
1240 -#endif
1241 }
1242 +#endif
1243 diff --git a/arch/arm/kernel/vmlinux-xip.lds.S b/arch/arm/kernel/vmlinux-xip.lds.S
1244 index 37b2a11af3459..d80ef8c2bb461 100644
1245 --- a/arch/arm/kernel/vmlinux-xip.lds.S
1246 +++ b/arch/arm/kernel/vmlinux-xip.lds.S
1247 @@ -12,6 +12,19 @@
1248 #include <asm/memory.h>
1249 #include <asm/page.h>
1250
1251 +/*
1252 + * ld.lld does not support NOCROSSREFS:
1253 + * https://github.com/ClangBuiltLinux/linux/issues/1609
1254 + */
1255 +#ifdef CONFIG_LD_IS_LLD
1256 +#define NOCROSSREFS
1257 +#endif
1258 +
1259 +/* Set start/end symbol names to the LMA for the section */
1260 +#define ARM_LMA(sym, section) \
1261 + sym##_start = LOADADDR(section); \
1262 + sym##_end = LOADADDR(section) + SIZEOF(section)
1263 +
1264 #define PROC_INFO \
1265 . = ALIGN(4); \
1266 VMLINUX_SYMBOL(__proc_info_begin) = .; \
1267 @@ -148,19 +161,31 @@ SECTIONS
1268 * The vectors and stubs are relocatable code, and the
1269 * only thing that matters is their relative offsets
1270 */
1271 - __vectors_start = .;
1272 - .vectors 0xffff0000 : AT(__vectors_start) {
1273 - *(.vectors)
1274 + __vectors_lma = .;
1275 + OVERLAY 0xffff0000 : NOCROSSREFS AT(__vectors_lma) {
1276 + .vectors {
1277 + *(.vectors)
1278 + }
1279 + .vectors.bhb.loop8 {
1280 + *(.vectors.bhb.loop8)
1281 + }
1282 + .vectors.bhb.bpiall {
1283 + *(.vectors.bhb.bpiall)
1284 + }
1285 }
1286 - . = __vectors_start + SIZEOF(.vectors);
1287 - __vectors_end = .;
1288 -
1289 - __stubs_start = .;
1290 - .stubs ADDR(.vectors) + 0x1000 : AT(__stubs_start) {
1291 + ARM_LMA(__vectors, .vectors);
1292 + ARM_LMA(__vectors_bhb_loop8, .vectors.bhb.loop8);
1293 + ARM_LMA(__vectors_bhb_bpiall, .vectors.bhb.bpiall);
1294 + . = __vectors_lma + SIZEOF(.vectors) +
1295 + SIZEOF(.vectors.bhb.loop8) +
1296 + SIZEOF(.vectors.bhb.bpiall);
1297 +
1298 + __stubs_lma = .;
1299 + .stubs ADDR(.vectors) + 0x1000 : AT(__stubs_lma) {
1300 *(.stubs)
1301 }
1302 - . = __stubs_start + SIZEOF(.stubs);
1303 - __stubs_end = .;
1304 + ARM_LMA(__stubs, .stubs);
1305 + . = __stubs_lma + SIZEOF(.stubs);
1306
1307 PROVIDE(vector_fiq_offset = vector_fiq - ADDR(.vectors));
1308
1309 diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
1310 index f7f55df0bf7b3..0d560a24408f0 100644
1311 --- a/arch/arm/kernel/vmlinux.lds.S
1312 +++ b/arch/arm/kernel/vmlinux.lds.S
1313 @@ -14,6 +14,19 @@
1314 #include <asm/page.h>
1315 #include <asm/pgtable.h>
1316
1317 +/*
1318 + * ld.lld does not support NOCROSSREFS:
1319 + * https://github.com/ClangBuiltLinux/linux/issues/1609
1320 + */
1321 +#ifdef CONFIG_LD_IS_LLD
1322 +#define NOCROSSREFS
1323 +#endif
1324 +
1325 +/* Set start/end symbol names to the LMA for the section */
1326 +#define ARM_LMA(sym, section) \
1327 + sym##_start = LOADADDR(section); \
1328 + sym##_end = LOADADDR(section) + SIZEOF(section)
1329 +
1330 #define PROC_INFO \
1331 . = ALIGN(4); \
1332 VMLINUX_SYMBOL(__proc_info_begin) = .; \
1333 @@ -169,19 +182,31 @@ SECTIONS
1334 * The vectors and stubs are relocatable code, and the
1335 * only thing that matters is their relative offsets
1336 */
1337 - __vectors_start = .;
1338 - .vectors 0xffff0000 : AT(__vectors_start) {
1339 - *(.vectors)
1340 + __vectors_lma = .;
1341 + OVERLAY 0xffff0000 : NOCROSSREFS AT(__vectors_lma) {
1342 + .vectors {
1343 + *(.vectors)
1344 + }
1345 + .vectors.bhb.loop8 {
1346 + *(.vectors.bhb.loop8)
1347 + }
1348 + .vectors.bhb.bpiall {
1349 + *(.vectors.bhb.bpiall)
1350 + }
1351 }
1352 - . = __vectors_start + SIZEOF(.vectors);
1353 - __vectors_end = .;
1354 -
1355 - __stubs_start = .;
1356 - .stubs ADDR(.vectors) + 0x1000 : AT(__stubs_start) {
1357 + ARM_LMA(__vectors, .vectors);
1358 + ARM_LMA(__vectors_bhb_loop8, .vectors.bhb.loop8);
1359 + ARM_LMA(__vectors_bhb_bpiall, .vectors.bhb.bpiall);
1360 + . = __vectors_lma + SIZEOF(.vectors) +
1361 + SIZEOF(.vectors.bhb.loop8) +
1362 + SIZEOF(.vectors.bhb.bpiall);
1363 +
1364 + __stubs_lma = .;
1365 + .stubs ADDR(.vectors) + 0x1000 : AT(__stubs_lma) {
1366 *(.stubs)
1367 }
1368 - . = __stubs_start + SIZEOF(.stubs);
1369 - __stubs_end = .;
1370 + ARM_LMA(__stubs, .stubs);
1371 + . = __stubs_lma + SIZEOF(.stubs);
1372
1373 PROVIDE(vector_fiq_offset = vector_fiq - ADDR(.vectors));
1374
1375 diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
1376 index 93623627a0b68..5c98074010d25 100644
1377 --- a/arch/arm/mm/Kconfig
1378 +++ b/arch/arm/mm/Kconfig
1379 @@ -803,6 +803,7 @@ config CPU_BPREDICT_DISABLE
1380
1381 config CPU_SPECTRE
1382 bool
1383 + select GENERIC_CPU_VULNERABILITIES
1384
1385 config HARDEN_BRANCH_PREDICTOR
1386 bool "Harden the branch predictor against aliasing attacks" if EXPERT
1387 @@ -823,6 +824,16 @@ config HARDEN_BRANCH_PREDICTOR
1388
1389 If unsure, say Y.
1390
1391 +config HARDEN_BRANCH_HISTORY
1392 + bool "Harden Spectre style attacks against branch history" if EXPERT
1393 + depends on CPU_SPECTRE
1394 + default y
1395 + help
1396 + Speculation attacks against some high-performance processors can
1397 + make use of branch history to influence future speculation. When
1398 + taking an exception, a sequence of branches overwrites the branch
1399 + history, or branch history is invalidated.
1400 +
1401 config TLS_REG_EMUL
1402 bool
1403 select NEED_KUSER_HELPERS
1404 diff --git a/arch/arm/mm/proc-v7-bugs.c b/arch/arm/mm/proc-v7-bugs.c
1405 index 9a07916af8dd2..1b6e770bc1cd3 100644
1406 --- a/arch/arm/mm/proc-v7-bugs.c
1407 +++ b/arch/arm/mm/proc-v7-bugs.c
1408 @@ -7,8 +7,36 @@
1409 #include <asm/cp15.h>
1410 #include <asm/cputype.h>
1411 #include <asm/proc-fns.h>
1412 +#include <asm/spectre.h>
1413 #include <asm/system_misc.h>
1414
1415 +#ifdef CONFIG_ARM_PSCI
1416 +#define SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED 1
1417 +static int __maybe_unused spectre_v2_get_cpu_fw_mitigation_state(void)
1418 +{
1419 + struct arm_smccc_res res;
1420 +
1421 + arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
1422 + ARM_SMCCC_ARCH_WORKAROUND_1, &res);
1423 +
1424 + switch ((int)res.a0) {
1425 + case SMCCC_RET_SUCCESS:
1426 + return SPECTRE_MITIGATED;
1427 +
1428 + case SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED:
1429 + return SPECTRE_UNAFFECTED;
1430 +
1431 + default:
1432 + return SPECTRE_VULNERABLE;
1433 + }
1434 +}
1435 +#else
1436 +static int __maybe_unused spectre_v2_get_cpu_fw_mitigation_state(void)
1437 +{
1438 + return SPECTRE_VULNERABLE;
1439 +}
1440 +#endif
1441 +
1442 #ifdef CONFIG_HARDEN_BRANCH_PREDICTOR
1443 DEFINE_PER_CPU(harden_branch_predictor_fn_t, harden_branch_predictor_fn);
1444
1445 @@ -37,13 +65,61 @@ static void __maybe_unused call_hvc_arch_workaround_1(void)
1446 arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_WORKAROUND_1, NULL);
1447 }
1448
1449 -static void cpu_v7_spectre_init(void)
1450 +static unsigned int spectre_v2_install_workaround(unsigned int method)
1451 {
1452 const char *spectre_v2_method = NULL;
1453 int cpu = smp_processor_id();
1454
1455 if (per_cpu(harden_branch_predictor_fn, cpu))
1456 - return;
1457 + return SPECTRE_MITIGATED;
1458 +
1459 + switch (method) {
1460 + case SPECTRE_V2_METHOD_BPIALL:
1461 + per_cpu(harden_branch_predictor_fn, cpu) =
1462 + harden_branch_predictor_bpiall;
1463 + spectre_v2_method = "BPIALL";
1464 + break;
1465 +
1466 + case SPECTRE_V2_METHOD_ICIALLU:
1467 + per_cpu(harden_branch_predictor_fn, cpu) =
1468 + harden_branch_predictor_iciallu;
1469 + spectre_v2_method = "ICIALLU";
1470 + break;
1471 +
1472 + case SPECTRE_V2_METHOD_HVC:
1473 + per_cpu(harden_branch_predictor_fn, cpu) =
1474 + call_hvc_arch_workaround_1;
1475 + cpu_do_switch_mm = cpu_v7_hvc_switch_mm;
1476 + spectre_v2_method = "hypervisor";
1477 + break;
1478 +
1479 + case SPECTRE_V2_METHOD_SMC:
1480 + per_cpu(harden_branch_predictor_fn, cpu) =
1481 + call_smc_arch_workaround_1;
1482 + cpu_do_switch_mm = cpu_v7_smc_switch_mm;
1483 + spectre_v2_method = "firmware";
1484 + break;
1485 + }
1486 +
1487 + if (spectre_v2_method)
1488 + pr_info("CPU%u: Spectre v2: using %s workaround\n",
1489 + smp_processor_id(), spectre_v2_method);
1490 +
1491 + return SPECTRE_MITIGATED;
1492 +}
1493 +#else
1494 +static unsigned int spectre_v2_install_workaround(unsigned int method)
1495 +{
1496 + pr_info("CPU%u: Spectre V2: workarounds disabled by configuration\n",
1497 + smp_processor_id());
1498 +
1499 + return SPECTRE_VULNERABLE;
1500 +}
1501 +#endif
1502 +
1503 +static void cpu_v7_spectre_v2_init(void)
1504 +{
1505 + unsigned int state, method = 0;
1506
1507 switch (read_cpuid_part()) {
1508 case ARM_CPU_PART_CORTEX_A8:
1509 @@ -52,29 +128,32 @@ static void cpu_v7_spectre_init(void)
1510 case ARM_CPU_PART_CORTEX_A17:
1511 case ARM_CPU_PART_CORTEX_A73:
1512 case ARM_CPU_PART_CORTEX_A75:
1513 - per_cpu(harden_branch_predictor_fn, cpu) =
1514 - harden_branch_predictor_bpiall;
1515 - spectre_v2_method = "BPIALL";
1516 + state = SPECTRE_MITIGATED;
1517 + method = SPECTRE_V2_METHOD_BPIALL;
1518 break;
1519
1520 case ARM_CPU_PART_CORTEX_A15:
1521 case ARM_CPU_PART_BRAHMA_B15:
1522 - per_cpu(harden_branch_predictor_fn, cpu) =
1523 - harden_branch_predictor_iciallu;
1524 - spectre_v2_method = "ICIALLU";
1525 + state = SPECTRE_MITIGATED;
1526 + method = SPECTRE_V2_METHOD_ICIALLU;
1527 break;
1528
1529 -#ifdef CONFIG_ARM_PSCI
1530 default:
1531 /* Other ARM CPUs require no workaround */
1532 - if (read_cpuid_implementor() == ARM_CPU_IMP_ARM)
1533 + if (read_cpuid_implementor() == ARM_CPU_IMP_ARM) {
1534 + state = SPECTRE_UNAFFECTED;
1535 break;
1536 + }
1537 /* fallthrough */
1538 - /* Cortex A57/A72 require firmware workaround */
1539 + /* Cortex A57/A72 require firmware workaround */
1540 case ARM_CPU_PART_CORTEX_A57:
1541 case ARM_CPU_PART_CORTEX_A72: {
1542 struct arm_smccc_res res;
1543
1544 + state = spectre_v2_get_cpu_fw_mitigation_state();
1545 + if (state != SPECTRE_MITIGATED)
1546 + break;
1547 +
1548 if (psci_ops.smccc_version == SMCCC_VERSION_1_0)
1549 break;
1550
1551 @@ -84,10 +163,7 @@ static void cpu_v7_spectre_init(void)
1552 ARM_SMCCC_ARCH_WORKAROUND_1, &res);
1553 if ((int)res.a0 != 0)
1554 break;
1555 - per_cpu(harden_branch_predictor_fn, cpu) =
1556 - call_hvc_arch_workaround_1;
1557 - cpu_do_switch_mm = cpu_v7_hvc_switch_mm;
1558 - spectre_v2_method = "hypervisor";
1559 + method = SPECTRE_V2_METHOD_HVC;
1560 break;
1561
1562 case PSCI_CONDUIT_SMC:
1563 @@ -95,29 +171,97 @@ static void cpu_v7_spectre_init(void)
1564 ARM_SMCCC_ARCH_WORKAROUND_1, &res);
1565 if ((int)res.a0 != 0)
1566 break;
1567 - per_cpu(harden_branch_predictor_fn, cpu) =
1568 - call_smc_arch_workaround_1;
1569 - cpu_do_switch_mm = cpu_v7_smc_switch_mm;
1570 - spectre_v2_method = "firmware";
1571 + method = SPECTRE_V2_METHOD_SMC;
1572 break;
1573
1574 default:
1575 + state = SPECTRE_VULNERABLE;
1576 break;
1577 }
1578 }
1579 -#endif
1580 }
1581
1582 - if (spectre_v2_method)
1583 - pr_info("CPU%u: Spectre v2: using %s workaround\n",
1584 - smp_processor_id(), spectre_v2_method);
1585 + if (state == SPECTRE_MITIGATED)
1586 + state = spectre_v2_install_workaround(method);
1587 +
1588 + spectre_v2_update_state(state, method);
1589 +}
1590 +
1591 +#ifdef CONFIG_HARDEN_BRANCH_HISTORY
1592 +static int spectre_bhb_method;
1593 +
1594 +static const char *spectre_bhb_method_name(int method)
1595 +{
1596 + switch (method) {
1597 + case SPECTRE_V2_METHOD_LOOP8:
1598 + return "loop";
1599 +
1600 + case SPECTRE_V2_METHOD_BPIALL:
1601 + return "BPIALL";
1602 +
1603 + default:
1604 + return "unknown";
1605 + }
1606 +}
1607 +
1608 +static int spectre_bhb_install_workaround(int method)
1609 +{
1610 + if (spectre_bhb_method != method) {
1611 + if (spectre_bhb_method) {
1612 + pr_err("CPU%u: Spectre BHB: method disagreement, system vulnerable\n",
1613 + smp_processor_id());
1614 +
1615 + return SPECTRE_VULNERABLE;
1616 + }
1617 +
1618 + if (spectre_bhb_update_vectors(method) == SPECTRE_VULNERABLE)
1619 + return SPECTRE_VULNERABLE;
1620 +
1621 + spectre_bhb_method = method;
1622 + }
1623 +
1624 + pr_info("CPU%u: Spectre BHB: using %s workaround\n",
1625 + smp_processor_id(), spectre_bhb_method_name(method));
1626 +
1627 + return SPECTRE_MITIGATED;
1628 }
1629 #else
1630 -static void cpu_v7_spectre_init(void)
1631 +static int spectre_bhb_install_workaround(int method)
1632 {
1633 + return SPECTRE_VULNERABLE;
1634 }
1635 #endif
1636
1637 +static void cpu_v7_spectre_bhb_init(void)
1638 +{
1639 + unsigned int state, method = 0;
1640 +
1641 + switch (read_cpuid_part()) {
1642 + case ARM_CPU_PART_CORTEX_A15:
1643 + case ARM_CPU_PART_BRAHMA_B15:
1644 + case ARM_CPU_PART_CORTEX_A57:
1645 + case ARM_CPU_PART_CORTEX_A72:
1646 + state = SPECTRE_MITIGATED;
1647 + method = SPECTRE_V2_METHOD_LOOP8;
1648 + break;
1649 +
1650 + case ARM_CPU_PART_CORTEX_A73:
1651 + case ARM_CPU_PART_CORTEX_A75:
1652 + state = SPECTRE_MITIGATED;
1653 + method = SPECTRE_V2_METHOD_BPIALL;
1654 + break;
1655 +
1656 + default:
1657 + state = SPECTRE_UNAFFECTED;
1658 + break;
1659 + }
1660 +
1661 + if (state == SPECTRE_MITIGATED)
1662 + state = spectre_bhb_install_workaround(method);
1663 +
1664 + spectre_v2_update_state(state, method);
1665 +}
1666 +
1667 static __maybe_unused bool cpu_v7_check_auxcr_set(bool *warned,
1668 u32 mask, const char *msg)
1669 {
1670 @@ -146,16 +290,17 @@ static bool check_spectre_auxcr(bool *warned, u32 bit)
1671 void cpu_v7_ca8_ibe(void)
1672 {
1673 if (check_spectre_auxcr(this_cpu_ptr(&spectre_warned), BIT(6)))
1674 - cpu_v7_spectre_init();
1675 + cpu_v7_spectre_v2_init();
1676 }
1677
1678 void cpu_v7_ca15_ibe(void)
1679 {
1680 if (check_spectre_auxcr(this_cpu_ptr(&spectre_warned), BIT(0)))
1681 - cpu_v7_spectre_init();
1682 + cpu_v7_spectre_v2_init();
1683 }
1684
1685 void cpu_v7_bugs_init(void)
1686 {
1687 - cpu_v7_spectre_init();
1688 + cpu_v7_spectre_v2_init();
1689 + cpu_v7_spectre_bhb_init();
1690 }
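
The proc-v7-bugs.c rework above separates detection from installation: a switch on the CPU part number picks a (state, method) pair, and only when the state is SPECTRE_MITIGATED does the install step run and deliver the final verdict fed to spectre_v2_update_state(). A condensed user-space model of that flow, with hypothetical part numbers and a stubbed install step (the kernel reads the real part id from MIDR via read_cpuid_part()):

#include <stdio.h>

enum { SPECTRE_UNAFFECTED, SPECTRE_MITIGATED, SPECTRE_VULNERABLE };
enum { METHOD_NONE, METHOD_BPIALL, METHOD_LOOP8 };

static int install_workaround(int method)
{
	/* the kernel hooks a per-CPU function pointer and may rewrite
	 * the exception vectors here; the sketch just reports success */
	printf("installing method %d\n", method);
	return SPECTRE_MITIGATED;
}

static void spectre_bhb_init(unsigned int part)
{
	int state, method = METHOD_NONE;

	switch (part) {
	case 0xc0f:			/* stand-in for Cortex-A15 */
		state = SPECTRE_MITIGATED;
		method = METHOD_LOOP8;
		break;
	case 0xd09:			/* stand-in for Cortex-A73 */
		state = SPECTRE_MITIGATED;
		method = METHOD_BPIALL;
		break;
	default:
		state = SPECTRE_UNAFFECTED;
		break;
	}

	if (state == SPECTRE_MITIGATED)
		state = install_workaround(method);

	printf("state=%d method=%d\n", state, method);
}

int main(void)
{
	spectre_bhb_init(0xc0f);
	return 0;
}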
1691 diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
1692 index 3ce5b5bd1dc45..fa202cd53b619 100644
1693 --- a/arch/x86/Kconfig
1694 +++ b/arch/x86/Kconfig
1695 @@ -418,10 +418,6 @@ config RETPOLINE
1696 branches. Requires a compiler with -mindirect-branch=thunk-extern
1697 support for full protection. The kernel may run slower.
1698
1699 - Without compiler support, at least indirect branches in assembler
1700 - code are eliminated. Since this includes the syscall entry path,
1701 - it is not entirely pointless.
1702 -
1703 if X86_32
1704 config X86_EXTENDED_PLATFORM
1705 bool "Support for extended (non-PC) x86 platforms"
1706 diff --git a/arch/x86/Makefile b/arch/x86/Makefile
1707 index 0bc35e3e6c5cd..a77737a979c8c 100644
1708 --- a/arch/x86/Makefile
1709 +++ b/arch/x86/Makefile
1710 @@ -221,9 +221,7 @@ ifdef CONFIG_RETPOLINE
1711 RETPOLINE_CFLAGS_CLANG := -mretpoline-external-thunk
1712
1713 RETPOLINE_CFLAGS += $(call cc-option,$(RETPOLINE_CFLAGS_GCC),$(call cc-option,$(RETPOLINE_CFLAGS_CLANG)))
1714 - ifneq ($(RETPOLINE_CFLAGS),)
1715 - KBUILD_CFLAGS += $(RETPOLINE_CFLAGS) -DRETPOLINE
1716 - endif
1717 + KBUILD_CFLAGS += $(RETPOLINE_CFLAGS)
1718 endif
1719
1720 archscripts: scripts_basic
1721 @@ -239,6 +237,13 @@ archprepare:
1722 ifeq ($(CONFIG_KEXEC_FILE),y)
1723 $(Q)$(MAKE) $(build)=arch/x86/purgatory arch/x86/purgatory/kexec-purgatory.c
1724 endif
1725 +ifdef CONFIG_RETPOLINE
1726 +ifeq ($(RETPOLINE_CFLAGS),)
1727 + @echo "You are building the kernel with a non-retpoline compiler." >&2

1728 + @echo "Please update your compiler." >&2
1729 + @false
1730 +endif
1731 +endif
1732
1733 ###
1734 # Kernel objects
1735 diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
1736 index 8ceb7a8a249c7..5b197248d5465 100644
1737 --- a/arch/x86/include/asm/cpufeatures.h
1738 +++ b/arch/x86/include/asm/cpufeatures.h
1739 @@ -195,7 +195,7 @@
1740 #define X86_FEATURE_FENCE_SWAPGS_USER ( 7*32+10) /* "" LFENCE in user entry SWAPGS path */
1741 #define X86_FEATURE_FENCE_SWAPGS_KERNEL ( 7*32+11) /* "" LFENCE in kernel entry SWAPGS path */
1742 #define X86_FEATURE_RETPOLINE ( 7*32+12) /* "" Generic Retpoline mitigation for Spectre variant 2 */
1743 -#define X86_FEATURE_RETPOLINE_AMD ( 7*32+13) /* "" AMD Retpoline mitigation for Spectre variant 2 */
1744 +#define X86_FEATURE_RETPOLINE_LFENCE ( 7*32+13) /* "" Use LFENCE for Spectre variant 2 */
1745
1746 #define X86_FEATURE_MSR_SPEC_CTRL ( 7*32+16) /* "" MSR SPEC_CTRL is implemented */
1747 #define X86_FEATURE_SSBD ( 7*32+17) /* Speculative Store Bypass Disable */
1748 diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
1749 index 204a5ce65afda..19829b00e4fed 100644
1750 --- a/arch/x86/include/asm/nospec-branch.h
1751 +++ b/arch/x86/include/asm/nospec-branch.h
1752 @@ -119,7 +119,7 @@
1753 ANNOTATE_NOSPEC_ALTERNATIVE
1754 ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *\reg), \
1755 __stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE, \
1756 - __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *\reg), X86_FEATURE_RETPOLINE_AMD
1757 + __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *\reg), X86_FEATURE_RETPOLINE_LFENCE
1758 #else
1759 jmp *\reg
1760 #endif
1761 @@ -130,7 +130,7 @@
1762 ANNOTATE_NOSPEC_ALTERNATIVE
1763 ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; call *\reg), \
1764 __stringify(RETPOLINE_CALL \reg), X86_FEATURE_RETPOLINE,\
1765 - __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *\reg), X86_FEATURE_RETPOLINE_AMD
1766 + __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *\reg), X86_FEATURE_RETPOLINE_LFENCE
1767 #else
1768 call *\reg
1769 #endif
1770 @@ -164,29 +164,35 @@
1771 _ASM_PTR " 999b\n\t" \
1772 ".popsection\n\t"
1773
1774 -#if defined(CONFIG_X86_64) && defined(RETPOLINE)
1775 +#ifdef CONFIG_RETPOLINE
1776 +#ifdef CONFIG_X86_64
1777
1778 /*
1779 - * Since the inline asm uses the %V modifier which is only in newer GCC,
1780 - * the 64-bit one is dependent on RETPOLINE not CONFIG_RETPOLINE.
1781 + * Inline asm uses the %V modifier which is only in newer GCC
1782 + * which is ensured when CONFIG_RETPOLINE is defined.
1783 */
1784 # define CALL_NOSPEC \
1785 ANNOTATE_NOSPEC_ALTERNATIVE \
1786 - ALTERNATIVE( \
1787 + ALTERNATIVE_2( \
1788 ANNOTATE_RETPOLINE_SAFE \
1789 "call *%[thunk_target]\n", \
1790 "call __x86_indirect_thunk_%V[thunk_target]\n", \
1791 - X86_FEATURE_RETPOLINE)
1792 + X86_FEATURE_RETPOLINE, \
1793 + "lfence;\n" \
1794 + ANNOTATE_RETPOLINE_SAFE \
1795 + "call *%[thunk_target]\n", \
1796 + X86_FEATURE_RETPOLINE_LFENCE)
1797 # define THUNK_TARGET(addr) [thunk_target] "r" (addr)
1798
1799 -#elif defined(CONFIG_X86_32) && defined(CONFIG_RETPOLINE)
1800 +#else /* CONFIG_X86_32 */
1801 /*
1802 * For i386 we use the original ret-equivalent retpoline, because
1803 * otherwise we'll run out of registers. We don't care about CET
1804 * here, anyway.
1805 */
1806 # define CALL_NOSPEC \
1807 - ALTERNATIVE( \
1808 + ANNOTATE_NOSPEC_ALTERNATIVE \
1809 + ALTERNATIVE_2( \
1810 ANNOTATE_RETPOLINE_SAFE \
1811 "call *%[thunk_target]\n", \
1812 " jmp 904f;\n" \
1813 @@ -201,9 +207,14 @@
1814 " ret;\n" \
1815 " .align 16\n" \
1816 "904: call 901b;\n", \
1817 - X86_FEATURE_RETPOLINE)
1818 + X86_FEATURE_RETPOLINE, \
1819 + "lfence;\n" \
1820 + ANNOTATE_RETPOLINE_SAFE \
1821 + "call *%[thunk_target]\n", \
1822 + X86_FEATURE_RETPOLINE_LFENCE)
1823
1824 # define THUNK_TARGET(addr) [thunk_target] "rm" (addr)
1825 +#endif
1826 #else /* No retpoline for C / inline asm */
1827 # define CALL_NOSPEC "call *%[thunk_target]\n"
1828 # define THUNK_TARGET(addr) [thunk_target] "rm" (addr)
1829 @@ -212,11 +223,11 @@
1830 /* The Spectre V2 mitigation variants */
1831 enum spectre_v2_mitigation {
1832 SPECTRE_V2_NONE,
1833 - SPECTRE_V2_RETPOLINE_MINIMAL,
1834 - SPECTRE_V2_RETPOLINE_MINIMAL_AMD,
1835 - SPECTRE_V2_RETPOLINE_GENERIC,
1836 - SPECTRE_V2_RETPOLINE_AMD,
1837 - SPECTRE_V2_IBRS_ENHANCED,
1838 + SPECTRE_V2_RETPOLINE,
1839 + SPECTRE_V2_LFENCE,
1840 + SPECTRE_V2_EIBRS,
1841 + SPECTRE_V2_EIBRS_RETPOLINE,
1842 + SPECTRE_V2_EIBRS_LFENCE,
1843 };
1844
1845 /* The indirect branch speculation control variants */
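
CALL_NOSPEC above is patched at boot into one of three sequences: a plain indirect call, a retpoline thunk (X86_FEATURE_RETPOLINE), or LFENCE followed by the indirect call (X86_FEATURE_RETPOLINE_LFENCE). The stand-alone sketch below shows the retpoline trick itself in GCC-style inline assembly for x86-64; it illustrates the technique and is not the kernel's thunk (ABI details such as stack alignment are ignored):

#include <stdio.h>

static volatile int hits;

static void target(void) { hits++; }

static void call_nospec(void (*fn)(void))
{
	__asm__ volatile(
		"call 1f\n\t"		/* push the real return address (the jmp below) */
		"jmp 4f\n\t"
		"1: call 3f\n\t"	/* push the address of the speculation trap */
		"2: pause\n\t"
		"lfence\n\t"
		"jmp 2b\n\t"		/* mispredicted returns spin harmlessly here */
		"3: mov %0, (%%rsp)\n\t"	/* overwrite the trap address with the target */
		"ret\n\t"		/* architecturally jumps to fn */
		"4:\n\t"
		:
		: "r"(fn)
		: "rax", "rcx", "rdx", "rsi", "rdi",
		  "r8", "r9", "r10", "r11", "memory", "cc");
}

int main(void)
{
	call_nospec(target);
	printf("hits=%d\n", hits);
	return 0;
}

Architecturally the final ret lands on the real target; speculatively, the return predictor sends execution into the pause/lfence loop, so the indirect branch never consumes attacker-controlled prediction state.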
1846 diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
1847 index a884bb7e7b01d..94aa0206b1f98 100644
1848 --- a/arch/x86/kernel/cpu/bugs.c
1849 +++ b/arch/x86/kernel/cpu/bugs.c
1850 @@ -30,6 +30,7 @@
1851 #include <asm/cacheflush.h>
1852 #include <asm/intel-family.h>
1853 #include <asm/e820.h>
1854 +#include <linux/bpf.h>
1855
1856 #include "cpu.h"
1857
1858 @@ -585,7 +586,7 @@ static enum spectre_v2_user_mitigation spectre_v2_user_stibp __ro_after_init =
1859 static enum spectre_v2_user_mitigation spectre_v2_user_ibpb __ro_after_init =
1860 SPECTRE_V2_USER_NONE;
1861
1862 -#ifdef RETPOLINE
1863 +#ifdef CONFIG_RETPOLINE
1864 static bool spectre_v2_bad_module;
1865
1866 bool retpoline_module_ok(bool has_retpoline)
1867 @@ -606,6 +607,32 @@ static inline const char *spectre_v2_module_string(void)
1868 static inline const char *spectre_v2_module_string(void) { return ""; }
1869 #endif
1870
1871 +#define SPECTRE_V2_LFENCE_MSG "WARNING: LFENCE mitigation is not recommended for this CPU, data leaks possible!\n"
1872 +#define SPECTRE_V2_EIBRS_EBPF_MSG "WARNING: Unprivileged eBPF is enabled with eIBRS on, data leaks possible via Spectre v2 BHB attacks!\n"
1873 +#define SPECTRE_V2_EIBRS_LFENCE_EBPF_SMT_MSG "WARNING: Unprivileged eBPF is enabled with eIBRS+LFENCE mitigation and SMT, data leaks possible via Spectre v2 BHB attacks!\n"
1874 +
1875 +#ifdef CONFIG_BPF_SYSCALL
1876 +void unpriv_ebpf_notify(int new_state)
1877 +{
1878 + if (new_state)
1879 + return;
1880 +
1881 + /* Unprivileged eBPF is enabled */
1882 +
1883 + switch (spectre_v2_enabled) {
1884 + case SPECTRE_V2_EIBRS:
1885 + pr_err(SPECTRE_V2_EIBRS_EBPF_MSG);
1886 + break;
1887 + case SPECTRE_V2_EIBRS_LFENCE:
1888 + if (sched_smt_active())
1889 + pr_err(SPECTRE_V2_EIBRS_LFENCE_EBPF_SMT_MSG);
1890 + break;
1891 + default:
1892 + break;
1893 + }
1894 +}
1895 +#endif
1896 +
1897 static inline bool match_option(const char *arg, int arglen, const char *opt)
1898 {
1899 int len = strlen(opt);
1900 @@ -620,7 +647,10 @@ enum spectre_v2_mitigation_cmd {
1901 SPECTRE_V2_CMD_FORCE,
1902 SPECTRE_V2_CMD_RETPOLINE,
1903 SPECTRE_V2_CMD_RETPOLINE_GENERIC,
1904 - SPECTRE_V2_CMD_RETPOLINE_AMD,
1905 + SPECTRE_V2_CMD_RETPOLINE_LFENCE,
1906 + SPECTRE_V2_CMD_EIBRS,
1907 + SPECTRE_V2_CMD_EIBRS_RETPOLINE,
1908 + SPECTRE_V2_CMD_EIBRS_LFENCE,
1909 };
1910
1911 enum spectre_v2_user_cmd {
1912 @@ -693,6 +723,13 @@ spectre_v2_parse_user_cmdline(enum spectre_v2_mitigation_cmd v2_cmd)
1913 return SPECTRE_V2_USER_CMD_AUTO;
1914 }
1915
1916 +static inline bool spectre_v2_in_eibrs_mode(enum spectre_v2_mitigation mode)
1917 +{
1918 + return (mode == SPECTRE_V2_EIBRS ||
1919 + mode == SPECTRE_V2_EIBRS_RETPOLINE ||
1920 + mode == SPECTRE_V2_EIBRS_LFENCE);
1921 +}
1922 +
1923 static void __init
1924 spectre_v2_user_select_mitigation(enum spectre_v2_mitigation_cmd v2_cmd)
1925 {
1926 @@ -755,10 +792,12 @@ spectre_v2_user_select_mitigation(enum spectre_v2_mitigation_cmd v2_cmd)
1927 }
1928
1929 /*
1930 - * If enhanced IBRS is enabled or SMT impossible, STIBP is not
1931 + * If no STIBP, enhanced IBRS is enabled or SMT impossible, STIBP is not
1932 * required.
1933 */
1934 - if (!smt_possible || spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED)
1935 + if (!boot_cpu_has(X86_FEATURE_STIBP) ||
1936 + !smt_possible ||
1937 + spectre_v2_in_eibrs_mode(spectre_v2_enabled))
1938 return;
1939
1940 /*
1941 @@ -770,12 +809,6 @@ spectre_v2_user_select_mitigation(enum spectre_v2_mitigation_cmd v2_cmd)
1942 boot_cpu_has(X86_FEATURE_AMD_STIBP_ALWAYS_ON))
1943 mode = SPECTRE_V2_USER_STRICT_PREFERRED;
1944
1945 - /*
1946 - * If STIBP is not available, clear the STIBP mode.
1947 - */
1948 - if (!boot_cpu_has(X86_FEATURE_STIBP))
1949 - mode = SPECTRE_V2_USER_NONE;
1950 -
1951 spectre_v2_user_stibp = mode;
1952
1953 set_mode:
1954 @@ -784,11 +817,11 @@ set_mode:
1955
1956 static const char * const spectre_v2_strings[] = {
1957 [SPECTRE_V2_NONE] = "Vulnerable",
1958 - [SPECTRE_V2_RETPOLINE_MINIMAL] = "Vulnerable: Minimal generic ASM retpoline",
1959 - [SPECTRE_V2_RETPOLINE_MINIMAL_AMD] = "Vulnerable: Minimal AMD ASM retpoline",
1960 - [SPECTRE_V2_RETPOLINE_GENERIC] = "Mitigation: Full generic retpoline",
1961 - [SPECTRE_V2_RETPOLINE_AMD] = "Mitigation: Full AMD retpoline",
1962 - [SPECTRE_V2_IBRS_ENHANCED] = "Mitigation: Enhanced IBRS",
1963 + [SPECTRE_V2_RETPOLINE] = "Mitigation: Retpolines",
1964 + [SPECTRE_V2_LFENCE] = "Mitigation: LFENCE",
1965 + [SPECTRE_V2_EIBRS] = "Mitigation: Enhanced IBRS",
1966 + [SPECTRE_V2_EIBRS_LFENCE] = "Mitigation: Enhanced IBRS + LFENCE",
1967 + [SPECTRE_V2_EIBRS_RETPOLINE] = "Mitigation: Enhanced IBRS + Retpolines",
1968 };
1969
1970 static const struct {
1971 @@ -799,8 +832,12 @@ static const struct {
1972 { "off", SPECTRE_V2_CMD_NONE, false },
1973 { "on", SPECTRE_V2_CMD_FORCE, true },
1974 { "retpoline", SPECTRE_V2_CMD_RETPOLINE, false },
1975 - { "retpoline,amd", SPECTRE_V2_CMD_RETPOLINE_AMD, false },
1976 + { "retpoline,amd", SPECTRE_V2_CMD_RETPOLINE_LFENCE, false },
1977 + { "retpoline,lfence", SPECTRE_V2_CMD_RETPOLINE_LFENCE, false },
1978 { "retpoline,generic", SPECTRE_V2_CMD_RETPOLINE_GENERIC, false },
1979 + { "eibrs", SPECTRE_V2_CMD_EIBRS, false },
1980 + { "eibrs,lfence", SPECTRE_V2_CMD_EIBRS_LFENCE, false },
1981 + { "eibrs,retpoline", SPECTRE_V2_CMD_EIBRS_RETPOLINE, false },
1982 { "auto", SPECTRE_V2_CMD_AUTO, false },
1983 };
1984
1985 @@ -810,11 +847,6 @@ static void __init spec_v2_print_cond(const char *reason, bool secure)
1986 pr_info("%s selected on command line.\n", reason);
1987 }
1988
1989 -static inline bool retp_compiler(void)
1990 -{
1991 - return __is_defined(RETPOLINE);
1992 -}
1993 -
1994 static enum spectre_v2_mitigation_cmd __init spectre_v2_parse_cmdline(void)
1995 {
1996 enum spectre_v2_mitigation_cmd cmd = SPECTRE_V2_CMD_AUTO;
1997 @@ -842,16 +874,30 @@ static enum spectre_v2_mitigation_cmd __init spectre_v2_parse_cmdline(void)
1998 }
1999
2000 if ((cmd == SPECTRE_V2_CMD_RETPOLINE ||
2001 - cmd == SPECTRE_V2_CMD_RETPOLINE_AMD ||
2002 - cmd == SPECTRE_V2_CMD_RETPOLINE_GENERIC) &&
2003 + cmd == SPECTRE_V2_CMD_RETPOLINE_LFENCE ||
2004 + cmd == SPECTRE_V2_CMD_RETPOLINE_GENERIC ||
2005 + cmd == SPECTRE_V2_CMD_EIBRS_LFENCE ||
2006 + cmd == SPECTRE_V2_CMD_EIBRS_RETPOLINE) &&
2007 !IS_ENABLED(CONFIG_RETPOLINE)) {
2008 - pr_err("%s selected but not compiled in. Switching to AUTO select\n", mitigation_options[i].option);
2009 + pr_err("%s selected but not compiled in. Switching to AUTO select\n",
2010 + mitigation_options[i].option);
2011 + return SPECTRE_V2_CMD_AUTO;
2012 + }
2013 +
2014 + if ((cmd == SPECTRE_V2_CMD_EIBRS ||
2015 + cmd == SPECTRE_V2_CMD_EIBRS_LFENCE ||
2016 + cmd == SPECTRE_V2_CMD_EIBRS_RETPOLINE) &&
2017 + !boot_cpu_has(X86_FEATURE_IBRS_ENHANCED)) {
2018 + pr_err("%s selected but CPU doesn't have eIBRS. Switching to AUTO select\n",
2019 + mitigation_options[i].option);
2020 return SPECTRE_V2_CMD_AUTO;
2021 }
2022
2023 - if (cmd == SPECTRE_V2_CMD_RETPOLINE_AMD &&
2024 - boot_cpu_data.x86_vendor != X86_VENDOR_AMD) {
2025 - pr_err("retpoline,amd selected but CPU is not AMD. Switching to AUTO select\n");
2026 + if ((cmd == SPECTRE_V2_CMD_RETPOLINE_LFENCE ||
2027 + cmd == SPECTRE_V2_CMD_EIBRS_LFENCE) &&
2028 + !boot_cpu_has(X86_FEATURE_LFENCE_RDTSC)) {
2029 + pr_err("%s selected, but CPU doesn't have a serializing LFENCE. Switching to AUTO select\n",
2030 + mitigation_options[i].option);
2031 return SPECTRE_V2_CMD_AUTO;
2032 }
2033
2034 @@ -860,6 +906,16 @@ static enum spectre_v2_mitigation_cmd __init spectre_v2_parse_cmdline(void)
2035 return cmd;
2036 }
2037
2038 +static enum spectre_v2_mitigation __init spectre_v2_select_retpoline(void)
2039 +{
2040 + if (!IS_ENABLED(CONFIG_RETPOLINE)) {
2041 + pr_err("Kernel not compiled with retpoline; no mitigation available!");
2042 + return SPECTRE_V2_NONE;
2043 + }
2044 +
2045 + return SPECTRE_V2_RETPOLINE;
2046 +}
2047 +
2048 static void __init spectre_v2_select_mitigation(void)
2049 {
2050 enum spectre_v2_mitigation_cmd cmd = spectre_v2_parse_cmdline();
2051 @@ -880,50 +936,64 @@ static void __init spectre_v2_select_mitigation(void)
2052 case SPECTRE_V2_CMD_FORCE:
2053 case SPECTRE_V2_CMD_AUTO:
2054 if (boot_cpu_has(X86_FEATURE_IBRS_ENHANCED)) {
2055 - mode = SPECTRE_V2_IBRS_ENHANCED;
2056 - /* Force it so VMEXIT will restore correctly */
2057 - x86_spec_ctrl_base |= SPEC_CTRL_IBRS;
2058 - wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
2059 - goto specv2_set_mode;
2060 + mode = SPECTRE_V2_EIBRS;
2061 + break;
2062 }
2063 - if (IS_ENABLED(CONFIG_RETPOLINE))
2064 - goto retpoline_auto;
2065 +
2066 + mode = spectre_v2_select_retpoline();
2067 break;
2068 - case SPECTRE_V2_CMD_RETPOLINE_AMD:
2069 - if (IS_ENABLED(CONFIG_RETPOLINE))
2070 - goto retpoline_amd;
2071 +
2072 + case SPECTRE_V2_CMD_RETPOLINE_LFENCE:
2073 + pr_err(SPECTRE_V2_LFENCE_MSG);
2074 + mode = SPECTRE_V2_LFENCE;
2075 break;
2076 +
2077 case SPECTRE_V2_CMD_RETPOLINE_GENERIC:
2078 - if (IS_ENABLED(CONFIG_RETPOLINE))
2079 - goto retpoline_generic;
2080 + mode = SPECTRE_V2_RETPOLINE;
2081 break;
2082 +
2083 case SPECTRE_V2_CMD_RETPOLINE:
2084 - if (IS_ENABLED(CONFIG_RETPOLINE))
2085 - goto retpoline_auto;
2086 + mode = spectre_v2_select_retpoline();
2087 + break;
2088 +
2089 + case SPECTRE_V2_CMD_EIBRS:
2090 + mode = SPECTRE_V2_EIBRS;
2091 + break;
2092 +
2093 + case SPECTRE_V2_CMD_EIBRS_LFENCE:
2094 + mode = SPECTRE_V2_EIBRS_LFENCE;
2095 + break;
2096 +
2097 + case SPECTRE_V2_CMD_EIBRS_RETPOLINE:
2098 + mode = SPECTRE_V2_EIBRS_RETPOLINE;
2099 break;
2100 }
2101 - pr_err("Spectre mitigation: kernel not compiled with retpoline; no mitigation available!");
2102 - return;
2103
2104 -retpoline_auto:
2105 - if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) {
2106 - retpoline_amd:
2107 - if (!boot_cpu_has(X86_FEATURE_LFENCE_RDTSC)) {
2108 - pr_err("Spectre mitigation: LFENCE not serializing, switching to generic retpoline\n");
2109 - goto retpoline_generic;
2110 - }
2111 - mode = retp_compiler() ? SPECTRE_V2_RETPOLINE_AMD :
2112 - SPECTRE_V2_RETPOLINE_MINIMAL_AMD;
2113 - setup_force_cpu_cap(X86_FEATURE_RETPOLINE_AMD);
2114 - setup_force_cpu_cap(X86_FEATURE_RETPOLINE);
2115 - } else {
2116 - retpoline_generic:
2117 - mode = retp_compiler() ? SPECTRE_V2_RETPOLINE_GENERIC :
2118 - SPECTRE_V2_RETPOLINE_MINIMAL;
2119 + if (mode == SPECTRE_V2_EIBRS && unprivileged_ebpf_enabled())
2120 + pr_err(SPECTRE_V2_EIBRS_EBPF_MSG);
2121 +
2122 + if (spectre_v2_in_eibrs_mode(mode)) {
2123 + /* Force it so VMEXIT will restore correctly */
2124 + x86_spec_ctrl_base |= SPEC_CTRL_IBRS;
2125 + wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
2126 + }
2127 +
2128 + switch (mode) {
2129 + case SPECTRE_V2_NONE:
2130 + case SPECTRE_V2_EIBRS:
2131 + break;
2132 +
2133 + case SPECTRE_V2_LFENCE:
2134 + case SPECTRE_V2_EIBRS_LFENCE:
2135 + setup_force_cpu_cap(X86_FEATURE_RETPOLINE_LFENCE);
2136 + /* fallthrough */
2137 +
2138 + case SPECTRE_V2_RETPOLINE:
2139 + case SPECTRE_V2_EIBRS_RETPOLINE:
2140 setup_force_cpu_cap(X86_FEATURE_RETPOLINE);
2141 + break;
2142 }
2143
2144 -specv2_set_mode:
2145 spectre_v2_enabled = mode;
2146 pr_info("%s\n", spectre_v2_strings[mode]);
2147
2148 @@ -949,7 +1019,7 @@ specv2_set_mode:
2149 * the CPU supports Enhanced IBRS, kernel might un-intentionally not
2150 * enable IBRS around firmware calls.
2151 */
2152 - if (boot_cpu_has(X86_FEATURE_IBRS) && mode != SPECTRE_V2_IBRS_ENHANCED) {
2153 + if (boot_cpu_has(X86_FEATURE_IBRS) && !spectre_v2_in_eibrs_mode(mode)) {
2154 setup_force_cpu_cap(X86_FEATURE_USE_IBRS_FW);
2155 pr_info("Enabling Restricted Speculation for firmware calls\n");
2156 }
2157 @@ -1019,6 +1089,10 @@ void arch_smt_update(void)
2158 {
2159 mutex_lock(&spec_ctrl_mutex);
2160
2161 + if (sched_smt_active() && unprivileged_ebpf_enabled() &&
2162 + spectre_v2_enabled == SPECTRE_V2_EIBRS_LFENCE)
2163 + pr_warn_once(SPECTRE_V2_EIBRS_LFENCE_EBPF_SMT_MSG);
2164 +
2165 switch (spectre_v2_user_stibp) {
2166 case SPECTRE_V2_USER_NONE:
2167 break;
2168 @@ -1263,7 +1337,6 @@ static int ib_prctl_set(struct task_struct *task, unsigned long ctrl)
2169 if (spectre_v2_user_ibpb == SPECTRE_V2_USER_NONE &&
2170 spectre_v2_user_stibp == SPECTRE_V2_USER_NONE)
2171 return 0;
2172 -
2173 /*
2174 * With strict mode for both IBPB and STIBP, the instruction
2175 * code paths avoid checking this task flag and instead,
2176 @@ -1610,7 +1683,7 @@ static ssize_t tsx_async_abort_show_state(char *buf)
2177
2178 static char *stibp_state(void)
2179 {
2180 - if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED)
2181 + if (spectre_v2_in_eibrs_mode(spectre_v2_enabled))
2182 return "";
2183
2184 switch (spectre_v2_user_stibp) {
2185 @@ -1640,6 +1713,27 @@ static char *ibpb_state(void)
2186 return "";
2187 }
2188
2189 +static ssize_t spectre_v2_show_state(char *buf)
2190 +{
2191 + if (spectre_v2_enabled == SPECTRE_V2_LFENCE)
2192 + return sprintf(buf, "Vulnerable: LFENCE\n");
2193 +
2194 + if (spectre_v2_enabled == SPECTRE_V2_EIBRS && unprivileged_ebpf_enabled())
2195 + return sprintf(buf, "Vulnerable: eIBRS with unprivileged eBPF\n");
2196 +
2197 + if (sched_smt_active() && unprivileged_ebpf_enabled() &&
2198 + spectre_v2_enabled == SPECTRE_V2_EIBRS_LFENCE)
2199 + return sprintf(buf, "Vulnerable: eIBRS+LFENCE with unprivileged eBPF and SMT\n");
2200 +
2201 + return sprintf(buf, "%s%s%s%s%s%s\n",
2202 + spectre_v2_strings[spectre_v2_enabled],
2203 + ibpb_state(),
2204 + boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "",
2205 + stibp_state(),
2206 + boot_cpu_has(X86_FEATURE_RSB_CTXSW) ? ", RSB filling" : "",
2207 + spectre_v2_module_string());
2208 +}
2209 +
2210 static ssize_t srbds_show_state(char *buf)
2211 {
2212 return sprintf(buf, "%s\n", srbds_strings[srbds_mitigation]);
2213 @@ -1662,12 +1756,7 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr
2214 return sprintf(buf, "%s\n", spectre_v1_strings[spectre_v1_mitigation]);
2215
2216 case X86_BUG_SPECTRE_V2:
2217 - return sprintf(buf, "%s%s%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
2218 - ibpb_state(),
2219 - boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "",
2220 - stibp_state(),
2221 - boot_cpu_has(X86_FEATURE_RSB_CTXSW) ? ", RSB filling" : "",
2222 - spectre_v2_module_string());
2223 + return spectre_v2_show_state(buf);
2224
2225 case X86_BUG_SPEC_STORE_BYPASS:
2226 return sprintf(buf, "%s\n", ssb_strings[ssb_mode]);
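
Taken together, the bugs.c changes replace the old goto ladder with a flat command-to-mode mapping, add the three eIBRS combinations, and report degraded configurations (plain LFENCE, or eIBRS with unprivileged eBPF) as "Vulnerable" in sysfs rather than as mitigations. A small user-space model of the mode table and the eIBRS-mode test (simplified: only a few commands are shown, and feature detection is a plain flag):

#include <stdio.h>
#include <string.h>

enum mode { V2_NONE, V2_RETPOLINE, V2_LFENCE, V2_EIBRS,
	    V2_EIBRS_RETPOLINE, V2_EIBRS_LFENCE };

static const char * const strings[] = {
	[V2_NONE]		= "Vulnerable",
	[V2_RETPOLINE]		= "Mitigation: Retpolines",
	[V2_LFENCE]		= "Mitigation: LFENCE",
	[V2_EIBRS]		= "Mitigation: Enhanced IBRS",
	[V2_EIBRS_RETPOLINE]	= "Mitigation: Enhanced IBRS + Retpolines",
	[V2_EIBRS_LFENCE]	= "Mitigation: Enhanced IBRS + LFENCE",
};

static int in_eibrs_mode(enum mode m)
{
	return m == V2_EIBRS || m == V2_EIBRS_RETPOLINE ||
	       m == V2_EIBRS_LFENCE;
}

static enum mode select_mode(int has_eibrs, const char *cmd)
{
	if (!strcmp(cmd, "auto"))
		return has_eibrs ? V2_EIBRS : V2_RETPOLINE;
	if (!strcmp(cmd, "eibrs") && has_eibrs)
		return V2_EIBRS;
	if (!strcmp(cmd, "retpoline,lfence"))
		return V2_LFENCE;	/* accepted, but warned about */
	return V2_NONE;
}

int main(void)
{
	enum mode m = select_mode(1, "auto");

	printf("%s (eIBRS family: %d)\n", strings[m], in_eibrs_mode(m));
	return 0;
}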
2227 diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
2228 index d420597b0d2b4..17ea0ba50278d 100644
2229 --- a/drivers/block/xen-blkfront.c
2230 +++ b/drivers/block/xen-blkfront.c
2231 @@ -1266,17 +1266,16 @@ static void blkif_free_ring(struct blkfront_ring_info *rinfo)
2232 list_for_each_entry_safe(persistent_gnt, n,
2233 &rinfo->grants, node) {
2234 list_del(&persistent_gnt->node);
2235 - if (persistent_gnt->gref != GRANT_INVALID_REF) {
2236 - gnttab_end_foreign_access(persistent_gnt->gref,
2237 - 0, 0UL);
2238 - rinfo->persistent_gnts_c--;
2239 - }
2240 + if (persistent_gnt->gref == GRANT_INVALID_REF ||
2241 + !gnttab_try_end_foreign_access(persistent_gnt->gref))
2242 + continue;
2243 +
2244 + rinfo->persistent_gnts_c--;
2245 if (info->feature_persistent)
2246 __free_page(persistent_gnt->page);
2247 kfree(persistent_gnt);
2248 }
2249 }
2250 - BUG_ON(rinfo->persistent_gnts_c != 0);
2251
2252 for (i = 0; i < BLK_RING_SIZE(info); i++) {
2253 /*
2254 @@ -1333,7 +1332,8 @@ free_shadow:
2255 rinfo->ring_ref[i] = GRANT_INVALID_REF;
2256 }
2257 }
2258 - free_pages((unsigned long)rinfo->ring.sring, get_order(info->nr_ring_pages * XEN_PAGE_SIZE));
2259 + free_pages_exact(rinfo->ring.sring,
2260 + info->nr_ring_pages * XEN_PAGE_SIZE);
2261 rinfo->ring.sring = NULL;
2262
2263 if (rinfo->irq)
2264 @@ -1417,9 +1417,15 @@ static int blkif_get_final_status(enum blk_req_status s1,
2265 return BLKIF_RSP_OKAY;
2266 }
2267
2268 -static bool blkif_completion(unsigned long *id,
2269 - struct blkfront_ring_info *rinfo,
2270 - struct blkif_response *bret)
2271 +/*
2272 + * Return values:
2273 + * 1 response processed.
2274 + * 0 waiting for further responses.
2275 + * -1 error while processing.
2276 + */
2277 +static int blkif_completion(unsigned long *id,
2278 + struct blkfront_ring_info *rinfo,
2279 + struct blkif_response *bret)
2280 {
2281 int i = 0;
2282 struct scatterlist *sg;
2283 @@ -1493,42 +1499,43 @@ static bool blkif_completion(unsigned long *id,
2284 }
2285 /* Add the persistent grant into the list of free grants */
2286 for (i = 0; i < num_grant; i++) {
2287 - if (gnttab_query_foreign_access(s->grants_used[i]->gref)) {
2288 + if (!gnttab_try_end_foreign_access(s->grants_used[i]->gref)) {
2289 /*
2290 * If the grant is still mapped by the backend (the
2291 * backend has chosen to make this grant persistent)
2292 * we add it at the head of the list, so it will be
2293 * reused first.
2294 */
2295 - if (!info->feature_persistent)
2296 - pr_alert_ratelimited("backed has not unmapped grant: %u\n",
2297 - s->grants_used[i]->gref);
2298 + if (!info->feature_persistent) {
2299 + pr_alert("backend has not unmapped grant: %u\n",
2300 + s->grants_used[i]->gref);
2301 + return -1;
2302 + }
2303 list_add(&s->grants_used[i]->node, &rinfo->grants);
2304 rinfo->persistent_gnts_c++;
2305 } else {
2306 /*
2307 - * If the grant is not mapped by the backend we end the
2308 - * foreign access and add it to the tail of the list,
2309 - * so it will not be picked again unless we run out of
2310 - * persistent grants.
2311 + * If the grant is not mapped by the backend we add it
2312 + * to the tail of the list, so it will not be picked
2313 + * again unless we run out of persistent grants.
2314 */
2315 - gnttab_end_foreign_access(s->grants_used[i]->gref, 0, 0UL);
2316 s->grants_used[i]->gref = GRANT_INVALID_REF;
2317 list_add_tail(&s->grants_used[i]->node, &rinfo->grants);
2318 }
2319 }
2320 if (s->req.operation == BLKIF_OP_INDIRECT) {
2321 for (i = 0; i < INDIRECT_GREFS(num_grant); i++) {
2322 - if (gnttab_query_foreign_access(s->indirect_grants[i]->gref)) {
2323 - if (!info->feature_persistent)
2324 - pr_alert_ratelimited("backed has not unmapped grant: %u\n",
2325 - s->indirect_grants[i]->gref);
2326 + if (!gnttab_try_end_foreign_access(s->indirect_grants[i]->gref)) {
2327 + if (!info->feature_persistent) {
2328 + pr_alert("backend has not unmapped grant: %u\n",
2329 + s->indirect_grants[i]->gref);
2330 + return -1;
2331 + }
2332 list_add(&s->indirect_grants[i]->node, &rinfo->grants);
2333 rinfo->persistent_gnts_c++;
2334 } else {
2335 struct page *indirect_page;
2336
2337 - gnttab_end_foreign_access(s->indirect_grants[i]->gref, 0, 0UL);
2338 /*
2339 * Add the used indirect page back to the list of
2340 * available pages for indirect grefs.
2341 @@ -1610,12 +1617,17 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
2342 }
2343
2344 if (bret.operation != BLKIF_OP_DISCARD) {
2345 + int ret;
2346 +
2347 /*
2348 * We may need to wait for an extra response if the
2349 * I/O request is split in 2
2350 */
2351 - if (!blkif_completion(&id, rinfo, &bret))
2352 + ret = blkif_completion(&id, rinfo, &bret);
2353 + if (!ret)
2354 continue;
2355 + if (unlikely(ret < 0))
2356 + goto err;
2357 }
2358
2359 if (add_id_to_freelist(rinfo, id)) {
2360 @@ -1717,8 +1729,7 @@ static int setup_blkring(struct xenbus_device *dev,
2361 for (i = 0; i < info->nr_ring_pages; i++)
2362 rinfo->ring_ref[i] = GRANT_INVALID_REF;
2363
2364 - sring = (struct blkif_sring *)__get_free_pages(GFP_NOIO | __GFP_HIGH,
2365 - get_order(ring_size));
2366 + sring = alloc_pages_exact(ring_size, GFP_NOIO);
2367 if (!sring) {
2368 xenbus_dev_fatal(dev, -ENOMEM, "allocating shared ring");
2369 return -ENOMEM;
2370 @@ -1728,7 +1739,7 @@ static int setup_blkring(struct xenbus_device *dev,
2371
2372 err = xenbus_grant_ring(dev, rinfo->ring.sring, info->nr_ring_pages, gref);
2373 if (err < 0) {
2374 - free_pages((unsigned long)sring, get_order(ring_size));
2375 + free_pages_exact(sring, ring_size);
2376 rinfo->ring.sring = NULL;
2377 goto fail;
2378 }
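
The blkfront changes convert blkif_completion() from bool to a tri-state result so that the interrupt handler can tell "the second half of a split request is still outstanding" apart from "the backend kept a grant it should have released", and treat the latter as fatal instead of merely rate-limiting a log line. A condensed model of that contract, with plain flags standing in for the real ring state:

#include <stdio.h>

enum { COMPLETION_ERROR = -1, COMPLETION_PENDING = 0, COMPLETION_DONE = 1 };

static int complete_one(int backend_still_maps_grant, int split_half_pending)
{
	if (backend_still_maps_grant)
		return COMPLETION_ERROR;	/* grant not returned: stop trusting the backend */
	if (split_half_pending)
		return COMPLETION_PENDING;	/* wait for the other half of the split I/O */
	return COMPLETION_DONE;
}

int main(void)
{
	switch (complete_one(0, 0)) {
	case COMPLETION_DONE:
		puts("response processed");
		break;
	case COMPLETION_PENDING:
		puts("waiting for further responses");
		break;
	default:
		puts("error while processing, disabling device");
		break;
	}
	return 0;
}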
2379 diff --git a/drivers/firmware/psci.c b/drivers/firmware/psci.c
2380 index 79a48c37fb35b..2a6d9572d6397 100644
2381 --- a/drivers/firmware/psci.c
2382 +++ b/drivers/firmware/psci.c
2383 @@ -64,6 +64,21 @@ struct psci_operations psci_ops = {
2384 .smccc_version = SMCCC_VERSION_1_0,
2385 };
2386
2387 +enum arm_smccc_conduit arm_smccc_1_1_get_conduit(void)
2388 +{
2389 + if (psci_ops.smccc_version < SMCCC_VERSION_1_1)
2390 + return SMCCC_CONDUIT_NONE;
2391 +
2392 + switch (psci_ops.conduit) {
2393 + case PSCI_CONDUIT_SMC:
2394 + return SMCCC_CONDUIT_SMC;
2395 + case PSCI_CONDUIT_HVC:
2396 + return SMCCC_CONDUIT_HVC;
2397 + default:
2398 + return SMCCC_CONDUIT_NONE;
2399 + }
2400 +}
2401 +
2402 typedef unsigned long (psci_fn)(unsigned long, unsigned long,
2403 unsigned long, unsigned long);
2404 static psci_fn *invoke_psci_fn;
2405 diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
2406 index 65a50bc5661d2..82dcd44b3e5e2 100644
2407 --- a/drivers/net/xen-netfront.c
2408 +++ b/drivers/net/xen-netfront.c
2409 @@ -413,14 +413,12 @@ static bool xennet_tx_buf_gc(struct netfront_queue *queue)
2410 queue->tx_link[id] = TX_LINK_NONE;
2411 skb = queue->tx_skbs[id];
2412 queue->tx_skbs[id] = NULL;
2413 - if (unlikely(gnttab_query_foreign_access(
2414 - queue->grant_tx_ref[id]) != 0)) {
2415 + if (unlikely(!gnttab_end_foreign_access_ref(
2416 + queue->grant_tx_ref[id], GNTMAP_readonly))) {
2417 dev_alert(dev,
2418 "Grant still in use by backend domain\n");
2419 goto err;
2420 }
2421 - gnttab_end_foreign_access_ref(
2422 - queue->grant_tx_ref[id], GNTMAP_readonly);
2423 gnttab_release_grant_reference(
2424 &queue->gref_tx_head, queue->grant_tx_ref[id]);
2425 queue->grant_tx_ref[id] = GRANT_INVALID_REF;
2426 @@ -840,7 +838,6 @@ static int xennet_get_responses(struct netfront_queue *queue,
2427 int max = XEN_NETIF_NR_SLOTS_MIN + (rx->status <= RX_COPY_THRESHOLD);
2428 int slots = 1;
2429 int err = 0;
2430 - unsigned long ret;
2431
2432 if (rx->flags & XEN_NETRXF_extra_info) {
2433 err = xennet_get_extras(queue, extras, rp);
2434 @@ -871,8 +868,13 @@ static int xennet_get_responses(struct netfront_queue *queue,
2435 goto next;
2436 }
2437
2438 - ret = gnttab_end_foreign_access_ref(ref, 0);
2439 - BUG_ON(!ret);
2440 + if (!gnttab_end_foreign_access_ref(ref, 0)) {
2441 + dev_alert(dev,
2442 + "Grant still in use by backend domain\n");
2443 + queue->info->broken = true;
2444 + dev_alert(dev, "Disabled for further use\n");
2445 + return -EINVAL;
2446 + }
2447
2448 gnttab_release_grant_reference(&queue->gref_rx_head, ref);
2449
2450 @@ -1076,6 +1078,10 @@ static int xennet_poll(struct napi_struct *napi, int budget)
2451 err = xennet_get_responses(queue, &rinfo, rp, &tmpq);
2452
2453 if (unlikely(err)) {
2454 + if (queue->info->broken) {
2455 + spin_unlock(&queue->rx_lock);
2456 + return 0;
2457 + }
2458 err:
2459 while ((skb = __skb_dequeue(&tmpq)))
2460 __skb_queue_tail(&errq, skb);
2461 @@ -1673,7 +1679,7 @@ static int setup_netfront(struct xenbus_device *dev,
2462 struct netfront_queue *queue, unsigned int feature_split_evtchn)
2463 {
2464 struct xen_netif_tx_sring *txs;
2465 - struct xen_netif_rx_sring *rxs;
2466 + struct xen_netif_rx_sring *rxs = NULL;
2467 grant_ref_t gref;
2468 int err;
2469
2470 @@ -1693,21 +1699,21 @@ static int setup_netfront(struct xenbus_device *dev,
2471
2472 err = xenbus_grant_ring(dev, txs, 1, &gref);
2473 if (err < 0)
2474 - goto grant_tx_ring_fail;
2475 + goto fail;
2476 queue->tx_ring_ref = gref;
2477
2478 rxs = (struct xen_netif_rx_sring *)get_zeroed_page(GFP_NOIO | __GFP_HIGH);
2479 if (!rxs) {
2480 err = -ENOMEM;
2481 xenbus_dev_fatal(dev, err, "allocating rx ring page");
2482 - goto alloc_rx_ring_fail;
2483 + goto fail;
2484 }
2485 SHARED_RING_INIT(rxs);
2486 FRONT_RING_INIT(&queue->rx, rxs, XEN_PAGE_SIZE);
2487
2488 err = xenbus_grant_ring(dev, rxs, 1, &gref);
2489 if (err < 0)
2490 - goto grant_rx_ring_fail;
2491 + goto fail;
2492 queue->rx_ring_ref = gref;
2493
2494 if (feature_split_evtchn)
2495 @@ -1720,22 +1726,28 @@ static int setup_netfront(struct xenbus_device *dev,
2496 err = setup_netfront_single(queue);
2497
2498 if (err)
2499 - goto alloc_evtchn_fail;
2500 + goto fail;
2501
2502 return 0;
2503
2504 /* If we fail to setup netfront, it is safe to just revoke access to
2505 * granted pages because backend is not accessing it at this point.
2506 */
2507 -alloc_evtchn_fail:
2508 - gnttab_end_foreign_access_ref(queue->rx_ring_ref, 0);
2509 -grant_rx_ring_fail:
2510 - free_page((unsigned long)rxs);
2511 -alloc_rx_ring_fail:
2512 - gnttab_end_foreign_access_ref(queue->tx_ring_ref, 0);
2513 -grant_tx_ring_fail:
2514 - free_page((unsigned long)txs);
2515 -fail:
2516 + fail:
2517 + if (queue->rx_ring_ref != GRANT_INVALID_REF) {
2518 + gnttab_end_foreign_access(queue->rx_ring_ref, 0,
2519 + (unsigned long)rxs);
2520 + queue->rx_ring_ref = GRANT_INVALID_REF;
2521 + } else {
2522 + free_page((unsigned long)rxs);
2523 + }
2524 + if (queue->tx_ring_ref != GRANT_INVALID_REF) {
2525 + gnttab_end_foreign_access(queue->tx_ring_ref, 0,
2526 + (unsigned long)txs);
2527 + queue->tx_ring_ref = GRANT_INVALID_REF;
2528 + } else {
2529 + free_page((unsigned long)txs);
2530 + }
2531 return err;
2532 }
2533
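setup_netfront()'s error handling above collapses several unwind labels into a single fail: label that inspects state: a ring page whose grant was already established must be revoked with gnttab_end_foreign_access(), which defers the actual freeing until the backend drops its mapping, while a page that was never granted can simply be freed. A generic sketch of that idiom (grant() and end_access_and_free() are stand-ins, not Xen APIs):

#include <stdlib.h>

#define INVALID_REF 0

static int next_ref = 1;

/* stand-ins for the granting and revoke-then-free helpers */
static int grant(void *page) { (void)page; return next_ref++; }
static void end_access_and_free(int ref, void *page) { (void)ref; free(page); }

static int setup(void)
{
	void *tx = NULL, *rx = NULL;
	int tx_ref = INVALID_REF, rx_ref = INVALID_REF;

	tx = malloc(4096);
	if (!tx)
		goto fail;
	tx_ref = grant(tx);

	rx = malloc(4096);
	if (!rx)
		goto fail;
	rx_ref = grant(rx);

	return 0;

fail:
	/* granted pages must be revoked; only ungranted ones are freed directly */
	if (rx_ref != INVALID_REF)
		end_access_and_free(rx_ref, rx);
	else
		free(rx);
	if (tx_ref != INVALID_REF)
		end_access_and_free(tx_ref, tx);
	else
		free(tx);
	return -1;
}

int main(void)
{
	return setup() ? 1 : 0;
}
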
2534 diff --git a/drivers/scsi/xen-scsifront.c b/drivers/scsi/xen-scsifront.c
2535 index e1b32ed0aa205..bdfe94c023dcd 100644
2536 --- a/drivers/scsi/xen-scsifront.c
2537 +++ b/drivers/scsi/xen-scsifront.c
2538 @@ -210,12 +210,11 @@ static void scsifront_gnttab_done(struct vscsifrnt_info *info, uint32_t id)
2539 return;
2540
2541 for (i = 0; i < s->nr_grants; i++) {
2542 - if (unlikely(gnttab_query_foreign_access(s->gref[i]) != 0)) {
2543 + if (unlikely(!gnttab_try_end_foreign_access(s->gref[i]))) {
2544 shost_printk(KERN_ALERT, info->host, KBUILD_MODNAME
2545 "grant still in use by backend\n");
2546 BUG();
2547 }
2548 - gnttab_end_foreign_access(s->gref[i], 0, 0UL);
2549 }
2550
2551 kfree(s->sg);
2552 diff --git a/drivers/xen/gntalloc.c b/drivers/xen/gntalloc.c
2553 index 7a47c4c9fb1bb..24f8900eccadd 100644
2554 --- a/drivers/xen/gntalloc.c
2555 +++ b/drivers/xen/gntalloc.c
2556 @@ -166,20 +166,14 @@ undo:
2557 __del_gref(gref);
2558 }
2559
2560 - /* It's possible for the target domain to map the just-allocated grant
2561 - * references by blindly guessing their IDs; if this is done, then
2562 - * __del_gref will leave them in the queue_gref list. They need to be
2563 - * added to the global list so that we can free them when they are no
2564 - * longer referenced.
2565 - */
2566 - if (unlikely(!list_empty(&queue_gref)))
2567 - list_splice_tail(&queue_gref, &gref_list);
2568 mutex_unlock(&gref_mutex);
2569 return rc;
2570 }
2571
2572 static void __del_gref(struct gntalloc_gref *gref)
2573 {
2574 + unsigned long addr;
2575 +
2576 if (gref->notify.flags & UNMAP_NOTIFY_CLEAR_BYTE) {
2577 uint8_t *tmp = kmap(gref->page);
2578 tmp[gref->notify.pgoff] = 0;
2579 @@ -193,21 +187,16 @@ static void __del_gref(struct gntalloc_gref *gref)
2580 gref->notify.flags = 0;
2581
2582 if (gref->gref_id) {
2583 - if (gnttab_query_foreign_access(gref->gref_id))
2584 - return;
2585 -
2586 - if (!gnttab_end_foreign_access_ref(gref->gref_id, 0))
2587 - return;
2588 -
2589 - gnttab_free_grant_reference(gref->gref_id);
2590 + if (gref->page) {
2591 + addr = (unsigned long)page_to_virt(gref->page);
2592 + gnttab_end_foreign_access(gref->gref_id, 0, addr);
2593 + } else
2594 + gnttab_free_grant_reference(gref->gref_id);
2595 }
2596
2597 gref_size--;
2598 list_del(&gref->next_gref);
2599
2600 - if (gref->page)
2601 - __free_page(gref->page);
2602 -
2603 kfree(gref);
2604 }
2605
2606 diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
2607 index 775d4195966c4..02754b4923e96 100644
2608 --- a/drivers/xen/grant-table.c
2609 +++ b/drivers/xen/grant-table.c
2610 @@ -114,12 +114,9 @@ struct gnttab_ops {
2611 */
2612 unsigned long (*end_foreign_transfer_ref)(grant_ref_t ref);
2613 /*
2614 - * Query the status of a grant entry. Ref parameter is reference of
2615 - * queried grant entry, return value is the status of queried entry.
2616 - * Detailed status(writing/reading) can be gotten from the return value
2617 - * by bit operations.
2618 + * Read the frame number related to a given grant reference.
2619 */
2620 - int (*query_foreign_access)(grant_ref_t ref);
2621 + unsigned long (*read_frame)(grant_ref_t ref);
2622 };
2623
2624 struct unmap_refs_callback_data {
2625 @@ -254,17 +251,6 @@ int gnttab_grant_foreign_access(domid_t domid, unsigned long frame,
2626 }
2627 EXPORT_SYMBOL_GPL(gnttab_grant_foreign_access);
2628
2629 -static int gnttab_query_foreign_access_v1(grant_ref_t ref)
2630 -{
2631 - return gnttab_shared.v1[ref].flags & (GTF_reading|GTF_writing);
2632 -}
2633 -
2634 -int gnttab_query_foreign_access(grant_ref_t ref)
2635 -{
2636 - return gnttab_interface->query_foreign_access(ref);
2637 -}
2638 -EXPORT_SYMBOL_GPL(gnttab_query_foreign_access);
2639 -
2640 static int gnttab_end_foreign_access_ref_v1(grant_ref_t ref, int readonly)
2641 {
2642 u16 flags, nflags;
2643 @@ -295,6 +281,11 @@ int gnttab_end_foreign_access_ref(grant_ref_t ref, int readonly)
2644 }
2645 EXPORT_SYMBOL_GPL(gnttab_end_foreign_access_ref);
2646
2647 +static unsigned long gnttab_read_frame_v1(grant_ref_t ref)
2648 +{
2649 + return gnttab_shared.v1[ref].frame;
2650 +}
2651 +
2652 struct deferred_entry {
2653 struct list_head list;
2654 grant_ref_t ref;
2655 @@ -324,12 +315,9 @@ static void gnttab_handle_deferred(unsigned long unused)
2656 spin_unlock_irqrestore(&gnttab_list_lock, flags);
2657 if (_gnttab_end_foreign_access_ref(entry->ref, entry->ro)) {
2658 put_free_entry(entry->ref);
2659 - if (entry->page) {
2660 - pr_debug("freeing g.e. %#x (pfn %#lx)\n",
2661 - entry->ref, page_to_pfn(entry->page));
2662 - put_page(entry->page);
2663 - } else
2664 - pr_info("freeing g.e. %#x\n", entry->ref);
2665 + pr_debug("freeing g.e. %#x (pfn %#lx)\n",
2666 + entry->ref, page_to_pfn(entry->page));
2667 + put_page(entry->page);
2668 kfree(entry);
2669 entry = NULL;
2670 } else {
2671 @@ -354,9 +342,18 @@ static void gnttab_handle_deferred(unsigned long unused)
2672 static void gnttab_add_deferred(grant_ref_t ref, bool readonly,
2673 struct page *page)
2674 {
2675 - struct deferred_entry *entry = kmalloc(sizeof(*entry), GFP_ATOMIC);
2676 + struct deferred_entry *entry;
2677 + gfp_t gfp = (in_atomic() || irqs_disabled()) ? GFP_ATOMIC : GFP_KERNEL;
2678 const char *what = KERN_WARNING "leaking";
2679
2680 + entry = kmalloc(sizeof(*entry), gfp);
2681 + if (!page) {
2682 + unsigned long gfn = gnttab_interface->read_frame(ref);
2683 +
2684 + page = pfn_to_page(gfn_to_pfn(gfn));
2685 + get_page(page);
2686 + }
2687 +
2688 if (entry) {
2689 unsigned long flags;
2690
2691 @@ -377,11 +374,21 @@ static void gnttab_add_deferred(grant_ref_t ref, bool readonly,
2692 what, ref, page ? page_to_pfn(page) : -1);
2693 }
2694
2695 +int gnttab_try_end_foreign_access(grant_ref_t ref)
2696 +{
2697 + int ret = _gnttab_end_foreign_access_ref(ref, 0);
2698 +
2699 + if (ret)
2700 + put_free_entry(ref);
2701 +
2702 + return ret;
2703 +}
2704 +EXPORT_SYMBOL_GPL(gnttab_try_end_foreign_access);
2705 +
2706 void gnttab_end_foreign_access(grant_ref_t ref, int readonly,
2707 unsigned long page)
2708 {
2709 - if (gnttab_end_foreign_access_ref(ref, readonly)) {
2710 - put_free_entry(ref);
2711 + if (gnttab_try_end_foreign_access(ref)) {
2712 if (page != 0)
2713 put_page(virt_to_page(page));
2714 } else
2715 @@ -1018,7 +1025,7 @@ static const struct gnttab_ops gnttab_v1_ops = {
2716 .update_entry = gnttab_update_entry_v1,
2717 .end_foreign_access_ref = gnttab_end_foreign_access_ref_v1,
2718 .end_foreign_transfer_ref = gnttab_end_foreign_transfer_ref_v1,
2719 - .query_foreign_access = gnttab_query_foreign_access_v1,
2720 + .read_frame = gnttab_read_frame_v1,
2721 };
2722
2723 static void gnttab_request_version(void)
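
gnttab_try_end_foreign_access() above folds the old query-then-end pair into a single operation: either the grant is reclaimed (and the reference freed) in one step, or the caller learns that the backend still holds it. Together with the deferred-free behaviour of gnttab_end_foreign_access(), this removes the window in which a page could be handed back to the allocator while the backend could still write to it. The calling pattern frontends now follow, in miniature (try_end_foreign_access() and defer_free() are stand-ins for the Xen helpers):

#include <stdio.h>
#include <stdlib.h>

/* stand-ins: pretend even-numbered refs are still mapped by the backend */
static int try_end_foreign_access(int ref) { return ref & 1; }

static void defer_free(int ref, void *page)
{
	(void)page;
	printf("deferring free of ref %d\n", ref);
}

static void release_grant(int ref, void *page)
{
	if (try_end_foreign_access(ref))
		free(page);		/* backend is done; memory may be reused */
	else
		defer_free(ref, page);	/* still mapped: must not be reused yet */
}

int main(void)
{
	release_grant(1, malloc(4096));
	release_grant(2, malloc(4096));
	return 0;
}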
2724 diff --git a/drivers/xen/xenbus/xenbus_client.c b/drivers/xen/xenbus/xenbus_client.c
2725 index 8bbd887ca422b..5ee38e939165c 100644
2726 --- a/drivers/xen/xenbus/xenbus_client.c
2727 +++ b/drivers/xen/xenbus/xenbus_client.c
2728 @@ -387,7 +387,14 @@ int xenbus_grant_ring(struct xenbus_device *dev, void *vaddr,
2729 unsigned int nr_pages, grant_ref_t *grefs)
2730 {
2731 int err;
2732 - int i, j;
2733 + unsigned int i;
2734 + grant_ref_t gref_head;
2735 +
2736 + err = gnttab_alloc_grant_references(nr_pages, &gref_head);
2737 + if (err) {
2738 + xenbus_dev_fatal(dev, err, "granting access to ring page");
2739 + return err;
2740 + }
2741
2742 for (i = 0; i < nr_pages; i++) {
2743 unsigned long gfn;
2744 @@ -397,23 +404,14 @@ int xenbus_grant_ring(struct xenbus_device *dev, void *vaddr,
2745 else
2746 gfn = virt_to_gfn(vaddr);
2747
2748 - err = gnttab_grant_foreign_access(dev->otherend_id, gfn, 0);
2749 - if (err < 0) {
2750 - xenbus_dev_fatal(dev, err,
2751 - "granting access to ring page");
2752 - goto fail;
2753 - }
2754 - grefs[i] = err;
2755 + grefs[i] = gnttab_claim_grant_reference(&gref_head);
2756 + gnttab_grant_foreign_access_ref(grefs[i], dev->otherend_id,
2757 + gfn, 0);
2758
2759 vaddr = vaddr + XEN_PAGE_SIZE;
2760 }
2761
2762 return 0;
2763 -
2764 -fail:
2765 - for (j = 0; j < i; j++)
2766 - gnttab_end_foreign_access_ref(grefs[j], 0);
2767 - return err;
2768 }
2769 EXPORT_SYMBOL_GPL(xenbus_grant_ring);
2770
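xenbus_grant_ring() above now reserves every grant reference before touching a page, so the per-page step can no longer fail and the old partial-unwind loop disappears entirely. The reserve-then-claim shape in miniature (alloc_refs() and claim_ref() are stand-ins for gnttab_alloc_grant_references() and gnttab_claim_grant_reference()):

#include <stdio.h>

#define MAX_REFS 16

static int next_ref = 1;

/* stand-ins: reserve a batch up front, then claim members infallibly */
static int alloc_refs(unsigned int n) { return n <= MAX_REFS ? 0 : -1; }
static int claim_ref(void) { return next_ref++; }

static int grant_ring(unsigned int nr_pages, int *grefs)
{
	unsigned int i;

	if (alloc_refs(nr_pages))	/* the only point of failure */
		return -1;

	for (i = 0; i < nr_pages; i++)
		grefs[i] = claim_ref();	/* cannot fail: already reserved */

	return 0;
}

int main(void)
{
	int grefs[4];

	if (!grant_ring(4, grefs))
		printf("granted refs %d..%d\n", grefs[0], grefs[3]);
	return 0;
}
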
2771 diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h
2772 index 18863d56273cc..6366b04c7d5f4 100644
2773 --- a/include/linux/arm-smccc.h
2774 +++ b/include/linux/arm-smccc.h
2775 @@ -89,6 +89,22 @@
2776
2777 #include <linux/linkage.h>
2778 #include <linux/types.h>
2779 +
2780 +enum arm_smccc_conduit {
2781 + SMCCC_CONDUIT_NONE,
2782 + SMCCC_CONDUIT_SMC,
2783 + SMCCC_CONDUIT_HVC,
2784 +};
2785 +
2786 +/**
2787 + * arm_smccc_1_1_get_conduit()
2788 + *
2789 + * Returns the conduit to be used for SMCCCv1.1 or later.
2790 + *
2791 + * When SMCCCv1.1 is not present, returns SMCCC_CONDUIT_NONE.
2792 + */
2793 +enum arm_smccc_conduit arm_smccc_1_1_get_conduit(void);
2794 +
2795 /**
2796 * struct arm_smccc_res - Result from SMC/HVC call
2797 * @a0-a3 result values from registers 0 to 3
2798 @@ -311,5 +327,63 @@ asmlinkage void __arm_smccc_hvc(unsigned long a0, unsigned long a1,
2799 #define SMCCC_RET_NOT_SUPPORTED -1
2800 #define SMCCC_RET_NOT_REQUIRED -2
2801
2802 +/*
2803 + * Like arm_smccc_1_1* but always returns SMCCC_RET_NOT_SUPPORTED.
2804 + * Used when the SMCCC conduit is not defined. The empty asm statement
2805 + * avoids compiler warnings about unused variables.
2806 + */
2807 +#define __fail_smccc_1_1(...) \
2808 + do { \
2809 + __declare_args(__count_args(__VA_ARGS__), __VA_ARGS__); \
2810 + asm ("" __constraints(__count_args(__VA_ARGS__))); \
2811 + if (___res) \
2812 + ___res->a0 = SMCCC_RET_NOT_SUPPORTED; \
2813 + } while (0)
2814 +
2815 +/*
2816 + * arm_smccc_1_1_invoke() - make an SMCCC v1.1 compliant call
2817 + *
2818 + * This is a variadic macro taking one to eight source arguments, and
2819 + * an optional return structure.
2820 + *
2821 + * @a0-a7: arguments passed in registers 0 to 7
2822 + * @res: result values from registers 0 to 3
2823 + *
2824 + * This macro will make either an HVC call or an SMC call depending on the
2825 + * current SMCCC conduit. If no valid conduit is available then -1
2826 + * (SMCCC_RET_NOT_SUPPORTED) is returned in @res.a0 (if supplied).
2827 + *
2828 + * The return value also provides the conduit that was used.
2829 + */
2830 +#define arm_smccc_1_1_invoke(...) ({ \
2831 + int method = arm_smccc_1_1_get_conduit(); \
2832 + switch (method) { \
2833 + case SMCCC_CONDUIT_HVC: \
2834 + arm_smccc_1_1_hvc(__VA_ARGS__); \
2835 + break; \
2836 + case SMCCC_CONDUIT_SMC: \
2837 + arm_smccc_1_1_smc(__VA_ARGS__); \
2838 + break; \
2839 + default: \
2840 + __fail_smccc_1_1(__VA_ARGS__); \
2841 + method = SMCCC_CONDUIT_NONE; \
2842 + break; \
2843 + } \
2844 + method; \
2845 + })
2846 +
2847 +/* Paravirtualised time calls (defined by ARM DEN0057A) */
2848 +#define ARM_SMCCC_HV_PV_TIME_FEATURES \
2849 + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
2850 + ARM_SMCCC_SMC_64, \
2851 + ARM_SMCCC_OWNER_STANDARD_HYP, \
2852 + 0x20)
2853 +
2854 +#define ARM_SMCCC_HV_PV_TIME_ST \
2855 + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
2856 + ARM_SMCCC_SMC_64, \
2857 + ARM_SMCCC_OWNER_STANDARD_HYP, \
2858 + 0x21)
2859 +
2860 #endif /*__ASSEMBLY__*/
2861 #endif /*__LINUX_ARM_SMCCC_H*/
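
arm_smccc_1_1_invoke() above picks the conduit at each call site, which is what lets the ARM Spectre code earlier in this patch issue ARM_SMCCC_ARCH_WORKAROUND_1 without caring whether firmware is reached by SMC or HVC. The selection logic modeled in plain C (flags replace the real psci_ops probing):

#include <stdio.h>

enum conduit { CONDUIT_NONE, CONDUIT_SMC, CONDUIT_HVC };

static enum conduit get_conduit(int smccc_1_1_present, int hvc_available)
{
	if (!smccc_1_1_present)
		return CONDUIT_NONE;	/* SMCCC v1.0: no safe probing */
	return hvc_available ? CONDUIT_HVC : CONDUIT_SMC;
}

int main(void)
{
	switch (get_conduit(1, 1)) {
	case CONDUIT_HVC:
		puts("invoke workaround via hvc");
		break;
	case CONDUIT_SMC:
		puts("invoke workaround via smc");
		break;
	default:
		puts("SMCCC_RET_NOT_SUPPORTED");
		break;
	}
	return 0;
}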
2862 diff --git a/include/linux/bpf.h b/include/linux/bpf.h
2863 index 7995940d41877..fe520d40597ff 100644
2864 --- a/include/linux/bpf.h
2865 +++ b/include/linux/bpf.h
2866 @@ -295,6 +295,11 @@ static inline void bpf_long_memcpy(void *dst, const void *src, u32 size)
2867
2868 /* verify correctness of eBPF program */
2869 int bpf_check(struct bpf_prog **fp, union bpf_attr *attr);
2870 +
2871 +static inline bool unprivileged_ebpf_enabled(void)
2872 +{
2873 + return !sysctl_unprivileged_bpf_disabled;
2874 +}
2875 #else
2876 static inline void bpf_register_prog_type(struct bpf_prog_type_list *tl)
2877 {
2878 @@ -322,6 +327,12 @@ static inline struct bpf_prog *bpf_prog_inc(struct bpf_prog *prog)
2879 {
2880 return ERR_PTR(-EOPNOTSUPP);
2881 }
2882 +
2883 +static inline bool unprivileged_ebpf_enabled(void)
2884 +{
2885 + return false;
2886 +}
2887 +
2888 #endif /* CONFIG_BPF_SYSCALL */
2889
2890 /* verifier prototypes for helper functions called from eBPF programs */
2891 diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
2892 index d830eddacdc60..1c1ca41685162 100644
2893 --- a/include/linux/compiler-gcc.h
2894 +++ b/include/linux/compiler-gcc.h
2895 @@ -107,7 +107,7 @@
2896 #define __weak __attribute__((weak))
2897 #define __alias(symbol) __attribute__((alias(#symbol)))
2898
2899 -#ifdef RETPOLINE
2900 +#ifdef CONFIG_RETPOLINE
2901 #define __noretpoline __attribute__((indirect_branch("keep")))
2902 #endif
2903
2904 diff --git a/include/linux/module.h b/include/linux/module.h
2905 index 99f330ae13da5..be4a3a9fd89ca 100644
2906 --- a/include/linux/module.h
2907 +++ b/include/linux/module.h
2908 @@ -791,7 +791,7 @@ static inline void module_bug_finalize(const Elf_Ehdr *hdr,
2909 static inline void module_bug_cleanup(struct module *mod) {}
2910 #endif /* CONFIG_GENERIC_BUG */
2911
2912 -#ifdef RETPOLINE
2913 +#ifdef CONFIG_RETPOLINE
2914 extern bool retpoline_module_ok(bool has_retpoline);
2915 #else
2916 static inline bool retpoline_module_ok(bool has_retpoline)
2917 diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h
2918 index f9d8aac170fbc..c51ae64b6dcb8 100644
2919 --- a/include/xen/grant_table.h
2920 +++ b/include/xen/grant_table.h
2921 @@ -97,17 +97,32 @@ int gnttab_end_foreign_access_ref(grant_ref_t ref, int readonly);
2922 * access has been ended, free the given page too. Access will be ended
2923 * immediately iff the grant entry is not in use, otherwise it will happen
2924 * some time later. page may be 0, in which case no freeing will occur.
2925 + * Note that the granted page might still be accessed (read or write) by the
2926 + * other side after gnttab_end_foreign_access() returns, so even if page was
2927 + * specified as 0 it is not allowed to just reuse the page for other
2928 + * purposes immediately. gnttab_end_foreign_access() will take an additional
2929 + * reference to the granted page in this case, which is dropped only after
2930 + * the grant is no longer in use.
2931 + * This requires that multi page allocations for areas subject to
2932 + * gnttab_end_foreign_access() are done via alloc_pages_exact() (and freeing
2933 + * via free_pages_exact()) in order to avoid high order pages.
2934 */
2935 void gnttab_end_foreign_access(grant_ref_t ref, int readonly,
2936 unsigned long page);
2937
2938 +/*
2939 + * End access through the given grant reference, iff the grant entry is
2940 + * no longer in use. In case of success ending foreign access, the
2941 + * grant reference is deallocated.
2942 + * Return 1 if the grant entry was freed, 0 if it is still in use.
2943 + */
2944 +int gnttab_try_end_foreign_access(grant_ref_t ref);
2945 +
2946 int gnttab_grant_foreign_transfer(domid_t domid, unsigned long pfn);
2947
2948 unsigned long gnttab_end_foreign_transfer_ref(grant_ref_t ref);
2949 unsigned long gnttab_end_foreign_transfer(grant_ref_t ref);
2950
2951 -int gnttab_query_foreign_access(grant_ref_t ref);
2952 -
2953 /*
2954 * operations on reserved batches of grant references
2955 */
2956 diff --git a/kernel/sysctl.c b/kernel/sysctl.c
2957 index 78b445562b81e..184d462339e65 100644
2958 --- a/kernel/sysctl.c
2959 +++ b/kernel/sysctl.c
2960 @@ -222,6 +222,11 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
2961 #endif
2962
2963 #ifdef CONFIG_BPF_SYSCALL
2964 +
2965 +void __weak unpriv_ebpf_notify(int new_state)
2966 +{
2967 +}
2968 +
2969 static int bpf_unpriv_handler(struct ctl_table *table, int write,
2970 void *buffer, size_t *lenp, loff_t *ppos)
2971 {
2972 @@ -239,6 +244,9 @@ static int bpf_unpriv_handler(struct ctl_table *table, int write,
2973 return -EPERM;
2974 *(int *)table->data = unpriv_enable;
2975 }
2976 +
2977 + unpriv_ebpf_notify(unpriv_enable);
2978 +
2979 return ret;
2980 }
2981 #endif
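
The __weak unpriv_ebpf_notify() stub above keeps configurations without the x86 bugs.c override linking cleanly: the strong definition in bugs.c, when present, replaces the empty default at link time. The same idiom in a standalone C file (GCC/Clang __attribute__((weak))):

#include <stdio.h>

/* weak default: used only if no other object file defines notify() */
void __attribute__((weak)) notify(int new_state)
{
	(void)new_state;
	puts("no override linked in");
}

int main(void)
{
	notify(1);
	return 0;
}

Linking in a second object file that provides a strong notify() silently replaces the weak default, which is exactly how bugs.c supplies the real warning logic for the sysctl handler.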
2982 diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
2983 index 9abcdf2e8dfe8..62b0552b7b718 100644
2984 --- a/scripts/mod/modpost.c
2985 +++ b/scripts/mod/modpost.c
2986 @@ -2147,7 +2147,7 @@ static void add_intree_flag(struct buffer *b, int is_intree)
2987 /* Cannot check for assembler */
2988 static void add_retpoline(struct buffer *b)
2989 {
2990 - buf_printf(b, "\n#ifdef RETPOLINE\n");
2991 + buf_printf(b, "\n#ifdef CONFIG_RETPOLINE\n");
2992 buf_printf(b, "MODULE_INFO(retpoline, \"Y\");\n");
2993 buf_printf(b, "#endif\n");
2994 }
2995 diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h
2996 index f6d1bc93589c7..f032dfed00a93 100644
2997 --- a/tools/arch/x86/include/asm/cpufeatures.h
2998 +++ b/tools/arch/x86/include/asm/cpufeatures.h
2999 @@ -194,7 +194,7 @@
3000 #define X86_FEATURE_PROC_FEEDBACK ( 7*32+ 9) /* AMD ProcFeedbackInterface */
3001
3002 #define X86_FEATURE_RETPOLINE ( 7*32+12) /* "" Generic Retpoline mitigation for Spectre variant 2 */
3003 -#define X86_FEATURE_RETPOLINE_AMD ( 7*32+13) /* "" AMD Retpoline mitigation for Spectre variant 2 */
3004 +#define X86_FEATURE_RETPOLINE_LFENCE ( 7*32+13) /* "" Use LFENCEs for Spectre variant 2 */
3005
3006 #define X86_FEATURE_MSR_SPEC_CTRL ( 7*32+16) /* "" MSR SPEC_CTRL is implemented */
3007 #define X86_FEATURE_SSBD ( 7*32+17) /* Speculative Store Bypass Disable */