Magellan Linux

Contents of /trunk/kernel-alx-legacy/patches-4.9/0418-4.9.319-all-fixes.patch



Revision 3720
Mon Oct 24 14:08:31 2022 UTC by niro
File size: 41879 byte(s)
-linux-4.9.319
diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index a5225df4a0702..22c078d1a23bf 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -361,6 +361,7 @@ What: /sys/devices/system/cpu/vulnerabilities
/sys/devices/system/cpu/vulnerabilities/srbds
/sys/devices/system/cpu/vulnerabilities/tsx_async_abort
/sys/devices/system/cpu/vulnerabilities/itlb_multihit
+ /sys/devices/system/cpu/vulnerabilities/mmio_stale_data
Date: January 2018
Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
Description: Information about CPU vulnerabilities
diff --git a/Documentation/hw-vuln/index.rst b/Documentation/hw-vuln/index.rst
index 74466ba801678..608afc9667761 100644
--- a/Documentation/hw-vuln/index.rst
+++ b/Documentation/hw-vuln/index.rst
@@ -15,3 +15,4 @@ are configurable at compile, boot or run time.
tsx_async_abort
multihit
special-register-buffer-data-sampling
+ processor_mmio_stale_data
diff --git a/Documentation/hw-vuln/processor_mmio_stale_data.rst b/Documentation/hw-vuln/processor_mmio_stale_data.rst
new file mode 100644
index 0000000000000..9393c50b5afc9
--- /dev/null
+++ b/Documentation/hw-vuln/processor_mmio_stale_data.rst
@@ -0,0 +1,246 @@
+=========================================
+Processor MMIO Stale Data Vulnerabilities
+=========================================
+
+Processor MMIO Stale Data Vulnerabilities are a class of memory-mapped I/O
+(MMIO) vulnerabilities that can expose data. The sequences of operations for
+exposing data range from simple to very complex. Because most of the
+vulnerabilities require the attacker to have access to MMIO, many environments
+are not affected. System environments using virtualization where MMIO access is
+provided to untrusted guests may need mitigation. These vulnerabilities are
+not transient execution attacks. However, these vulnerabilities may propagate
+stale data into core fill buffers where the data can subsequently be inferred
+by an unmitigated transient execution attack. Mitigation for these
+vulnerabilities includes a combination of microcode update and software
+changes, depending on the platform and usage model. Some of these mitigations
+are similar to those used to mitigate Microarchitectural Data Sampling (MDS) or
+those used to mitigate Special Register Buffer Data Sampling (SRBDS).
+
+Data Propagators
+================
+Propagators are operations that result in stale data being copied or moved from
+one microarchitectural buffer or register to another. Processor MMIO Stale Data
+Vulnerabilities are operations that may result in stale data being directly
+read into an architectural, software-visible state or sampled from a buffer or
+register.
+
+Fill Buffer Stale Data Propagator (FBSDP)
+-----------------------------------------
+Stale data may propagate from fill buffers (FB) into the non-coherent portion
+of the uncore on some non-coherent writes. Fill buffer propagation by itself
+does not make stale data architecturally visible. Stale data must be propagated
+to a location where it is subject to reading or sampling.
+
+Sideband Stale Data Propagator (SSDP)
+-------------------------------------
+The sideband stale data propagator (SSDP) is limited to the client (including
+Intel Xeon server E3) uncore implementation. The sideband response buffer is
+shared by all client cores. For non-coherent reads that go to sideband
+destinations, the uncore logic returns 64 bytes of data to the core, including
+both requested data and unrequested stale data, from a transaction buffer and
+the sideband response buffer. As a result, stale data from the sideband
+response and transaction buffers may now reside in a core fill buffer.
+
+Primary Stale Data Propagator (PSDP)
+------------------------------------
+The primary stale data propagator (PSDP) is limited to the client (including
+Intel Xeon server E3) uncore implementation. Similar to the sideband response
+buffer, the primary response buffer is shared by all client cores. For some
+processors, MMIO primary reads will return 64 bytes of data to the core fill
+buffer including both requested data and unrequested stale data. This is
+similar to the sideband stale data propagator.
+
+Vulnerabilities
+===============
+Device Register Partial Write (DRPW) (CVE-2022-21166)
+-----------------------------------------------------
+Some endpoint MMIO registers incorrectly handle writes that are smaller than
+the register size. Instead of aborting the write or only copying the correct
+subset of bytes (for example, 2 bytes for a 2-byte write), more bytes than
+specified by the write transaction may be written to the register. On
+processors affected by FBSDP, this may expose stale data from the fill buffers
+of the core that created the write transaction.
+
+Shared Buffers Data Sampling (SBDS) (CVE-2022-21125)
+----------------------------------------------------
+After propagators may have moved data around the uncore and copied stale data
+into client core fill buffers, processors affected by MFBDS can leak data from
+the fill buffer. It is limited to the client (including Intel Xeon server E3)
+uncore implementation.
+
+Shared Buffers Data Read (SBDR) (CVE-2022-21123)
+------------------------------------------------
+It is similar to Shared Buffer Data Sampling (SBDS) except that the data is
+directly read into the architectural software-visible state. It is limited to
+the client (including Intel Xeon server E3) uncore implementation.
+
+Affected Processors
+===================
+Not all the CPUs are affected by all the variants. For instance, most
+processors for the server market (excluding Intel Xeon E3 processors) are
+impacted by only Device Register Partial Write (DRPW).
+
+Below is the list of affected Intel processors [#f1]_:
+
+ =================== ============ =========
+ Common name Family_Model Steppings
+ =================== ============ =========
+ HASWELL_X 06_3FH 2,4
+ SKYLAKE_L 06_4EH 3
+ BROADWELL_X 06_4FH All
+ SKYLAKE_X 06_55H 3,4,6,7,11
+ BROADWELL_D 06_56H 3,4,5
+ SKYLAKE 06_5EH 3
+ ICELAKE_X 06_6AH 4,5,6
+ ICELAKE_D 06_6CH 1
+ ICELAKE_L 06_7EH 5
+ ATOM_TREMONT_D 06_86H All
+ LAKEFIELD 06_8AH 1
+ KABYLAKE_L 06_8EH 9 to 12
+ ATOM_TREMONT 06_96H 1
+ ATOM_TREMONT_L 06_9CH 0
+ KABYLAKE 06_9EH 9 to 13
+ COMETLAKE 06_A5H 2,3,5
+ COMETLAKE_L 06_A6H 0,1
+ ROCKETLAKE 06_A7H 1
+ =================== ============ =========
+
+If a CPU is in the affected processor list, but not affected by a variant, it
+is indicated by new bits in MSR IA32_ARCH_CAPABILITIES. As described in a later
+section, mitigation largely remains the same for all the variants, i.e. to
+clear the CPU fill buffers via VERW instruction.
+
+New bits in MSRs
+================
+Newer processors and microcode update on existing affected processors added new
+bits to IA32_ARCH_CAPABILITIES MSR. These bits can be used to enumerate
+specific variants of Processor MMIO Stale Data vulnerabilities and mitigation
+capability.
+
+MSR IA32_ARCH_CAPABILITIES
+--------------------------
+Bit 13 - SBDR_SSDP_NO - When set, processor is not affected by either the
+ Shared Buffers Data Read (SBDR) vulnerability or the sideband stale
+ data propagator (SSDP).
+Bit 14 - FBSDP_NO - When set, processor is not affected by the Fill Buffer
+ Stale Data Propagator (FBSDP).
+Bit 15 - PSDP_NO - When set, processor is not affected by Primary Stale Data
+ Propagator (PSDP).
+Bit 17 - FB_CLEAR - When set, VERW instruction will overwrite CPU fill buffer
+ values as part of MD_CLEAR operations. Processors that do not
+ enumerate MDS_NO (meaning they are affected by MDS) but that do
+ enumerate support for both L1D_FLUSH and MD_CLEAR implicitly enumerate
+ FB_CLEAR as part of their MD_CLEAR support.
+Bit 18 - FB_CLEAR_CTRL - Processor supports read and write to MSR
+ IA32_MCU_OPT_CTRL[FB_CLEAR_DIS]. On such processors, the FB_CLEAR_DIS
+ bit can be set to cause the VERW instruction to not perform the
+ FB_CLEAR action. Not all processors that support FB_CLEAR will support
+ FB_CLEAR_CTRL.
+
+MSR IA32_MCU_OPT_CTRL
+---------------------
+Bit 3 - FB_CLEAR_DIS - When set, VERW instruction does not perform the FB_CLEAR
+action. This may be useful to reduce the performance impact of FB_CLEAR in
+cases where system software deems it warranted (for example, when performance
+is more critical, or the untrusted software has no MMIO access). Note that
+FB_CLEAR_DIS has no impact on enumeration (for example, it does not change
+FB_CLEAR or MD_CLEAR enumeration) and it may not be supported on all processors
+that enumerate FB_CLEAR.
+
+Mitigation
+==========
+Like MDS, all variants of Processor MMIO Stale Data vulnerabilities have the
+same mitigation strategy to force the CPU to clear the affected buffers before
+an attacker can extract the secrets.
+
+This is achieved by using the otherwise unused and obsolete VERW instruction in
+combination with a microcode update. The microcode clears the affected CPU
+buffers when the VERW instruction is executed.
+
+Kernel reuses the MDS function to invoke the buffer clearing:
+
+ mds_clear_cpu_buffers()
+
+On MDS affected CPUs, the kernel already invokes CPU buffer clear on
+kernel/userspace, hypervisor/guest and C-state (idle) transitions. No
+additional mitigation is needed on such CPUs.
+
+For CPUs not affected by MDS or TAA, mitigation is needed only for the attacker
+with MMIO capability. Therefore, VERW is not required for kernel/userspace. For
+virtualization case, VERW is only needed at VMENTER for a guest with MMIO
+capability.
+
+Mitigation points
+-----------------
+Return to user space
+^^^^^^^^^^^^^^^^^^^^
+Same mitigation as MDS when affected by MDS/TAA, otherwise no mitigation
+needed.
+
+C-State transition
+^^^^^^^^^^^^^^^^^^
+Control register writes by CPU during C-state transition can propagate data
+from fill buffer to uncore buffers. Execute VERW before C-state transition to
+clear CPU fill buffers.
+
+Guest entry point
+^^^^^^^^^^^^^^^^^
+Same mitigation as MDS when processor is also affected by MDS/TAA, otherwise
+execute VERW at VMENTER only for MMIO capable guests. On CPUs not affected by
+MDS/TAA, guest without MMIO access cannot extract secrets using Processor MMIO
+Stale Data vulnerabilities, so there is no need to execute VERW for such guests.
+
+Mitigation control on the kernel command line
+---------------------------------------------
+The kernel command line allows to control the Processor MMIO Stale Data
+mitigations at boot time with the option "mmio_stale_data=". The valid
+arguments for this option are:
+
+ ========== =================================================================
+ full If the CPU is vulnerable, enable mitigation; CPU buffer clearing
+ on exit to userspace and when entering a VM. Idle transitions are
+ protected as well. It does not automatically disable SMT.
+ full,nosmt Same as full, with SMT disabled on vulnerable CPUs. This is the
+ complete mitigation.
+ off Disables mitigation completely.
+ ========== =================================================================
+
+If the CPU is affected and mmio_stale_data=off is not supplied on the kernel
+command line, then the kernel selects the appropriate mitigation.
+
+Mitigation status information
+-----------------------------
+The Linux kernel provides a sysfs interface to enumerate the current
+vulnerability status of the system: whether the system is vulnerable, and
+which mitigations are active. The relevant sysfs file is:
+
+ /sys/devices/system/cpu/vulnerabilities/mmio_stale_data
+
+The possible values in this file are:
+
+ .. list-table::
+
+ * - 'Not affected'
+ - The processor is not vulnerable
+ * - 'Vulnerable'
+ - The processor is vulnerable, but no mitigation enabled
+ * - 'Vulnerable: Clear CPU buffers attempted, no microcode'
+ - The processor is vulnerable, but microcode is not updated. The
+ mitigation is enabled on a best effort basis.
+ * - 'Mitigation: Clear CPU buffers'
+ - The processor is vulnerable and the CPU buffer clearing mitigation is
+ enabled.
+
+If the processor is vulnerable then the following information is appended to
+the above information:
+
+ ======================== ===========================================
+ 'SMT vulnerable' SMT is enabled
+ 'SMT disabled' SMT is disabled
+ 'SMT Host state unknown' Kernel runs in a VM, Host SMT state unknown
+ ======================== ===========================================
+
+References
+----------
+.. [#f1] Affected Processors
+ https://www.intel.com/content/www/us/en/developer/topic-technology/software-security-guidance/processors-affected-consolidated-product-cpu-model.html
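The buffer clearing that the documentation above delegates to mds_clear_cpu_buffers() boils down to executing VERW with a memory operand that references a valid, writable data segment selector; with the updated microcode, overwriting the fill buffers is a side effect of that instruction. A minimal sketch, modeled on the upstream helper in arch/x86/include/asm/nospec-branch.h (the exact 4.9 backport may differ in detail):

    static inline void mds_clear_cpu_buffers(void)
    {
            /* Any valid, writable data segment selector works; __KERNEL_DS is handy. */
            static const u16 ds = __KERNEL_DS;

            /*
             * Has to be the memory-operand form of VERW; only that form is
             * documented to trigger the microcode's buffer overwrite.
             */
            asm volatile("verw %[ds]" : : [ds] "m" (ds) : "cc");
    }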
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index f2b10986ab889..97c0ff0787eaf 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2520,6 +2520,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
kvm.nx_huge_pages=off [X86]
no_entry_flush [PPC]
no_uaccess_flush [PPC]
+ mmio_stale_data=off [X86]

Exceptions:
This does not have any effect on
@@ -2541,6 +2542,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
Equivalent to: l1tf=flush,nosmt [X86]
mds=full,nosmt [X86]
tsx_async_abort=full,nosmt [X86]
+ mmio_stale_data=full,nosmt [X86]

mminit_loglevel=
[KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this
@@ -2550,6 +2552,40 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
log everything. Information is printed at KERN_DEBUG
so loglevel=8 may also need to be specified.

+ mmio_stale_data=
+ [X86,INTEL] Control mitigation for the Processor
+ MMIO Stale Data vulnerabilities.
+
+ Processor MMIO Stale Data is a class of
+ vulnerabilities that may expose data after an MMIO
+ operation. Exposed data could originate or end in
+ the same CPU buffers as affected by MDS and TAA.
+ Therefore, similar to MDS and TAA, the mitigation
+ is to clear the affected CPU buffers.
+
+ This parameter controls the mitigation. The
+ options are:
+
+ full - Enable mitigation on vulnerable CPUs
+
+ full,nosmt - Enable mitigation and disable SMT on
+ vulnerable CPUs.
+
+ off - Unconditionally disable mitigation
+
+ On MDS or TAA affected machines,
+ mmio_stale_data=off can be prevented by an active
+ MDS or TAA mitigation as these vulnerabilities are
+ mitigated with the same mechanism so in order to
+ disable this mitigation, you need to specify
+ mds=off and tsx_async_abort=off too.
+
+ Not specifying this option is equivalent to
+ mmio_stale_data=full.
+
+ For details see:
+ Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst
+
module.sig_enforce
[KNL] When CONFIG_MODULE_SIG is set, this means that
modules without (valid) signatures will fail to load.
diff --git a/Makefile b/Makefile
index 46bea19a6c96d..bf4a7b0fe8e74 100644
--- a/Makefile
+++ b/Makefile
@@ -1,6 +1,6 @@
VERSION = 4
PATCHLEVEL = 9
-SUBLEVEL = 318
+SUBLEVEL = 319
EXTRAVERSION =
NAME = Roaring Lionus

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 5b197248d5465..910304aec2e66 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -362,5 +362,6 @@
#define X86_BUG_TAA X86_BUG(22) /* CPU is affected by TSX Async Abort(TAA) */
#define X86_BUG_ITLB_MULTIHIT X86_BUG(23) /* CPU may incur MCE during certain page attribute changes */
#define X86_BUG_SRBDS X86_BUG(24) /* CPU may leak RNG bits if not mitigated */
+#define X86_BUG_MMIO_STALE_DATA X86_BUG(25) /* CPU is affected by Processor MMIO Stale Data vulnerabilities */

#endif /* _ASM_X86_CPUFEATURES_H */
diff --git a/arch/x86/include/asm/intel-family.h b/arch/x86/include/asm/intel-family.h
index 74ee597beb3e4..8b6c01774ca23 100644
--- a/arch/x86/include/asm/intel-family.h
+++ b/arch/x86/include/asm/intel-family.h
@@ -9,6 +9,10 @@
*
* Things ending in "2" are usually because we have no better
* name for them. There's no processor called "SILVERMONT2".
+ *
+ * While adding a new CPUID for a new microarchitecture, add a new
+ * group to keep logically sorted out in chronological order. Within
+ * that group keep the CPUID for the variants sorted by model number.
*/

#define INTEL_FAM6_CORE_YONAH 0x0E
@@ -48,6 +52,24 @@
#define INTEL_FAM6_KABYLAKE_MOBILE 0x8E
#define INTEL_FAM6_KABYLAKE_DESKTOP 0x9E

+#define INTEL_FAM6_CANNONLAKE_MOBILE 0x66
+
+#define INTEL_FAM6_ICELAKE_X 0x6A
+#define INTEL_FAM6_ICELAKE_XEON_D 0x6C
+#define INTEL_FAM6_ICELAKE_DESKTOP 0x7D
+#define INTEL_FAM6_ICELAKE_MOBILE 0x7E
+
+#define INTEL_FAM6_COMETLAKE 0xA5
+#define INTEL_FAM6_COMETLAKE_L 0xA6
+
+#define INTEL_FAM6_ROCKETLAKE 0xA7
+
+/* Hybrid Core/Atom Processors */
+
+#define INTEL_FAM6_LAKEFIELD 0x8A
+#define INTEL_FAM6_ALDERLAKE 0x97
+#define INTEL_FAM6_ALDERLAKE_L 0x9A
+
/* "Small Core" Processors (Atom) */

#define INTEL_FAM6_ATOM_BONNELL 0x1C /* Diamondville, Pineview */
@@ -67,7 +89,10 @@
#define INTEL_FAM6_ATOM_GOLDMONT 0x5C /* Apollo Lake */
#define INTEL_FAM6_ATOM_GOLDMONT_X 0x5F /* Denverton */
#define INTEL_FAM6_ATOM_GOLDMONT_PLUS 0x7A /* Gemini Lake */
+
#define INTEL_FAM6_ATOM_TREMONT_X 0x86 /* Jacobsville */
+#define INTEL_FAM6_ATOM_TREMONT 0x96 /* Elkhart Lake */
+#define INTEL_FAM6_ATOM_TREMONT_L 0x9C /* Jasper Lake */

/* Xeon Phi */

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 1fdea3c334e7c..5131146e1bd86 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -89,6 +89,30 @@
* Not susceptible to
* TSX Async Abort (TAA) vulnerabilities.
*/
+#define ARCH_CAP_SBDR_SSDP_NO BIT(13) /*
+ * Not susceptible to SBDR and SSDP
+ * variants of Processor MMIO stale data
+ * vulnerabilities.
+ */
+#define ARCH_CAP_FBSDP_NO BIT(14) /*
+ * Not susceptible to FBSDP variant of
+ * Processor MMIO stale data
+ * vulnerabilities.
+ */
+#define ARCH_CAP_PSDP_NO BIT(15) /*
+ * Not susceptible to PSDP variant of
+ * Processor MMIO stale data
+ * vulnerabilities.
+ */
+#define ARCH_CAP_FB_CLEAR BIT(17) /*
+ * VERW clears CPU fill buffer
+ * even on MDS_NO CPUs.
+ */
+#define ARCH_CAP_FB_CLEAR_CTRL BIT(18) /*
+ * MSR_IA32_MCU_OPT_CTRL[FB_CLEAR_DIS]
+ * bit available to control VERW
+ * behavior.
+ */

#define MSR_IA32_FLUSH_CMD 0x0000010b
#define L1D_FLUSH BIT(0) /*
@@ -106,6 +130,7 @@
/* SRBDS support */
#define MSR_IA32_MCU_OPT_CTRL 0x00000123
#define RNGDS_MITG_DIS BIT(0)
+#define FB_CLEAR_DIS BIT(3) /* CPU Fill buffer clear disable */

#define MSR_IA32_SYSENTER_CS 0x00000174
#define MSR_IA32_SYSENTER_ESP 0x00000175
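Taken together, the ARCH_CAP_*_NO bits added above let software decide whether a CPU enumerates immunity to all three stale data propagators; the patch performs this exact check in arch_cap_mmio_immune() in arch/x86/kernel/cpu/common.c further below. A hedged sketch of the same test (only the bit macros come from this hunk; the function name here is illustrative, not part of the patch):

    static bool mmio_propagators_immune(void)
    {
            u64 ia32_cap = 0;

            /* IA32_ARCH_CAPABILITIES exists only when enumerated via CPUID. */
            if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
                    rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);

            /* Immune only if every propagator variant is enumerated as absent. */
            return (ia32_cap & ARCH_CAP_FBSDP_NO) &&
                   (ia32_cap & ARCH_CAP_PSDP_NO) &&
                   (ia32_cap & ARCH_CAP_SBDR_SSDP_NO);
    }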
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 19829b00e4fed..8a618fbf569f0 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -323,6 +323,8 @@ DECLARE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
DECLARE_STATIC_KEY_FALSE(mds_user_clear);
DECLARE_STATIC_KEY_FALSE(mds_idle_clear);

+DECLARE_STATIC_KEY_FALSE(mmio_stale_data_clear);
+
#include <asm/segment.h>

/**
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 94aa0206b1f98..b4416df41d63a 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -39,8 +39,10 @@ static void __init spectre_v2_select_mitigation(void);
static void __init ssb_select_mitigation(void);
static void __init l1tf_select_mitigation(void);
static void __init mds_select_mitigation(void);
-static void __init mds_print_mitigation(void);
+static void __init md_clear_update_mitigation(void);
+static void __init md_clear_select_mitigation(void);
static void __init taa_select_mitigation(void);
+static void __init mmio_select_mitigation(void);
static void __init srbds_select_mitigation(void);

/* The base value of the SPEC_CTRL MSR that always has to be preserved. */
@@ -75,6 +77,10 @@ EXPORT_SYMBOL_GPL(mds_user_clear);
DEFINE_STATIC_KEY_FALSE(mds_idle_clear);
EXPORT_SYMBOL_GPL(mds_idle_clear);

+/* Controls CPU Fill buffer clear before KVM guest MMIO accesses */
+DEFINE_STATIC_KEY_FALSE(mmio_stale_data_clear);
+EXPORT_SYMBOL_GPL(mmio_stale_data_clear);
+
void __init check_bugs(void)
{
identify_boot_cpu();
@@ -107,16 +113,9 @@ void __init check_bugs(void)
spectre_v2_select_mitigation();
ssb_select_mitigation();
l1tf_select_mitigation();
- mds_select_mitigation();
- taa_select_mitigation();
+ md_clear_select_mitigation();
srbds_select_mitigation();

- /*
- * As MDS and TAA mitigations are inter-related, print MDS
- * mitigation until after TAA mitigation selection is done.
- */
- mds_print_mitigation();
-
arch_smt_update();

#ifdef CONFIG_X86_32
@@ -256,14 +255,6 @@ static void __init mds_select_mitigation(void)
}
}

-static void __init mds_print_mitigation(void)
-{
- if (!boot_cpu_has_bug(X86_BUG_MDS) || cpu_mitigations_off())
- return;
-
- pr_info("%s\n", mds_strings[mds_mitigation]);
-}
-
static int __init mds_cmdline(char *str)
{
if (!boot_cpu_has_bug(X86_BUG_MDS))
@@ -311,7 +302,7 @@ static void __init taa_select_mitigation(void)
/* TSX previously disabled by tsx=off */
if (!boot_cpu_has(X86_FEATURE_RTM)) {
taa_mitigation = TAA_MITIGATION_TSX_DISABLED;
- goto out;
+ return;
}

if (cpu_mitigations_off()) {
@@ -325,7 +316,7 @@ static void __init taa_select_mitigation(void)
*/
if (taa_mitigation == TAA_MITIGATION_OFF &&
mds_mitigation == MDS_MITIGATION_OFF)
- goto out;
+ return;

if (boot_cpu_has(X86_FEATURE_MD_CLEAR))
taa_mitigation = TAA_MITIGATION_VERW;
@@ -357,18 +348,6 @@ static void __init taa_select_mitigation(void)

if (taa_nosmt || cpu_mitigations_auto_nosmt())
cpu_smt_disable(false);
-
- /*
- * Update MDS mitigation, if necessary, as the mds_user_clear is
- * now enabled for TAA mitigation.
- */
- if (mds_mitigation == MDS_MITIGATION_OFF &&
- boot_cpu_has_bug(X86_BUG_MDS)) {
- mds_mitigation = MDS_MITIGATION_FULL;
- mds_select_mitigation();
- }
-out:
- pr_info("%s\n", taa_strings[taa_mitigation]);
}

static int __init tsx_async_abort_parse_cmdline(char *str)
@@ -392,6 +371,151 @@ static int __init tsx_async_abort_parse_cmdline(char *str)
}
early_param("tsx_async_abort", tsx_async_abort_parse_cmdline);

+#undef pr_fmt
+#define pr_fmt(fmt) "MMIO Stale Data: " fmt
+
+enum mmio_mitigations {
+ MMIO_MITIGATION_OFF,
+ MMIO_MITIGATION_UCODE_NEEDED,
+ MMIO_MITIGATION_VERW,
+};
+
+/* Default mitigation for Processor MMIO Stale Data vulnerabilities */
+static enum mmio_mitigations mmio_mitigation __ro_after_init = MMIO_MITIGATION_VERW;
+static bool mmio_nosmt __ro_after_init = false;
+
+static const char * const mmio_strings[] = {
+ [MMIO_MITIGATION_OFF] = "Vulnerable",
+ [MMIO_MITIGATION_UCODE_NEEDED] = "Vulnerable: Clear CPU buffers attempted, no microcode",
+ [MMIO_MITIGATION_VERW] = "Mitigation: Clear CPU buffers",
+};
+
+static void __init mmio_select_mitigation(void)
+{
+ u64 ia32_cap;
+
+ if (!boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA) ||
+ cpu_mitigations_off()) {
+ mmio_mitigation = MMIO_MITIGATION_OFF;
+ return;
+ }
+
+ if (mmio_mitigation == MMIO_MITIGATION_OFF)
+ return;
+
+ ia32_cap = x86_read_arch_cap_msr();
+
+ /*
+ * Enable CPU buffer clear mitigation for host and VMM, if also affected
+ * by MDS or TAA. Otherwise, enable mitigation for VMM only.
+ */
+ if (boot_cpu_has_bug(X86_BUG_MDS) || (boot_cpu_has_bug(X86_BUG_TAA) &&
+ boot_cpu_has(X86_FEATURE_RTM)))
+ static_branch_enable(&mds_user_clear);
+ else
+ static_branch_enable(&mmio_stale_data_clear);
+
+ /*
+ * If Processor-MMIO-Stale-Data bug is present and Fill Buffer data can
+ * be propagated to uncore buffers, clearing the Fill buffers on idle
+ * is required irrespective of SMT state.
+ */
+ if (!(ia32_cap & ARCH_CAP_FBSDP_NO))
+ static_branch_enable(&mds_idle_clear);
+
+ /*
+ * Check if the system has the right microcode.
+ *
+ * CPU Fill buffer clear mitigation is enumerated by either an explicit
+ * FB_CLEAR or by the presence of both MD_CLEAR and L1D_FLUSH on MDS
+ * affected systems.
+ */
+ if ((ia32_cap & ARCH_CAP_FB_CLEAR) ||
+ (boot_cpu_has(X86_FEATURE_MD_CLEAR) &&
+ boot_cpu_has(X86_FEATURE_FLUSH_L1D) &&
+ !(ia32_cap & ARCH_CAP_MDS_NO)))
+ mmio_mitigation = MMIO_MITIGATION_VERW;
+ else
+ mmio_mitigation = MMIO_MITIGATION_UCODE_NEEDED;
+
+ if (mmio_nosmt || cpu_mitigations_auto_nosmt())
+ cpu_smt_disable(false);
+}
+
+static int __init mmio_stale_data_parse_cmdline(char *str)
+{
+ if (!boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA))
+ return 0;
+
+ if (!str)
+ return -EINVAL;
+
+ if (!strcmp(str, "off")) {
+ mmio_mitigation = MMIO_MITIGATION_OFF;
+ } else if (!strcmp(str, "full")) {
+ mmio_mitigation = MMIO_MITIGATION_VERW;
+ } else if (!strcmp(str, "full,nosmt")) {
+ mmio_mitigation = MMIO_MITIGATION_VERW;
+ mmio_nosmt = true;
+ }
+
+ return 0;
+}
+early_param("mmio_stale_data", mmio_stale_data_parse_cmdline);
+
+#undef pr_fmt
+#define pr_fmt(fmt) "" fmt
+
+static void __init md_clear_update_mitigation(void)
+{
+ if (cpu_mitigations_off())
+ return;
+
+ if (!static_key_enabled(&mds_user_clear))
+ goto out;
+
+ /*
+ * mds_user_clear is now enabled. Update MDS, TAA and MMIO Stale Data
+ * mitigation, if necessary.
+ */
+ if (mds_mitigation == MDS_MITIGATION_OFF &&
+ boot_cpu_has_bug(X86_BUG_MDS)) {
+ mds_mitigation = MDS_MITIGATION_FULL;
+ mds_select_mitigation();
+ }
+ if (taa_mitigation == TAA_MITIGATION_OFF &&
+ boot_cpu_has_bug(X86_BUG_TAA)) {
+ taa_mitigation = TAA_MITIGATION_VERW;
+ taa_select_mitigation();
+ }
+ if (mmio_mitigation == MMIO_MITIGATION_OFF &&
+ boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA)) {
+ mmio_mitigation = MMIO_MITIGATION_VERW;
+ mmio_select_mitigation();
+ }
+out:
+ if (boot_cpu_has_bug(X86_BUG_MDS))
+ pr_info("MDS: %s\n", mds_strings[mds_mitigation]);
+ if (boot_cpu_has_bug(X86_BUG_TAA))
+ pr_info("TAA: %s\n", taa_strings[taa_mitigation]);
+ if (boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA))
+ pr_info("MMIO Stale Data: %s\n", mmio_strings[mmio_mitigation]);
+}
+
+static void __init md_clear_select_mitigation(void)
+{
+ mds_select_mitigation();
+ taa_select_mitigation();
+ mmio_select_mitigation();
+
+ /*
+ * As MDS, TAA and MMIO Stale Data mitigations are inter-related, update
+ * and print their mitigation after MDS, TAA and MMIO Stale Data
+ * mitigation selection is done.
+ */
+ md_clear_update_mitigation();
+}
+
#undef pr_fmt
#define pr_fmt(fmt) "SRBDS: " fmt

@@ -453,11 +577,13 @@ static void __init srbds_select_mitigation(void)
return;

/*
- * Check to see if this is one of the MDS_NO systems supporting
- * TSX that are only exposed to SRBDS when TSX is enabled.
+ * Check to see if this is one of the MDS_NO systems supporting TSX that
+ * are only exposed to SRBDS when TSX is enabled or when CPU is affected
+ * by Processor MMIO Stale Data vulnerability.
*/
ia32_cap = x86_read_arch_cap_msr();
- if ((ia32_cap & ARCH_CAP_MDS_NO) && !boot_cpu_has(X86_FEATURE_RTM))
+ if ((ia32_cap & ARCH_CAP_MDS_NO) && !boot_cpu_has(X86_FEATURE_RTM) &&
+ !boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA))
srbds_mitigation = SRBDS_MITIGATION_TSX_OFF;
else if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
srbds_mitigation = SRBDS_MITIGATION_HYPERVISOR;
@@ -1065,6 +1191,8 @@ static void update_indir_branch_cond(void)
/* Update the static key controlling the MDS CPU buffer clear in idle */
static void update_mds_branch_idle(void)
{
+ u64 ia32_cap = x86_read_arch_cap_msr();
+
/*
* Enable the idle clearing if SMT is active on CPUs which are
* affected only by MSBDS and not any other MDS variant.
@@ -1076,14 +1204,17 @@ static void update_mds_branch_idle(void)
if (!boot_cpu_has_bug(X86_BUG_MSBDS_ONLY))
return;

- if (sched_smt_active())
+ if (sched_smt_active()) {
static_branch_enable(&mds_idle_clear);
- else
+ } else if (mmio_mitigation == MMIO_MITIGATION_OFF ||
+ (ia32_cap & ARCH_CAP_FBSDP_NO)) {
static_branch_disable(&mds_idle_clear);
+ }
}

#define MDS_MSG_SMT "MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.\n"
#define TAA_MSG_SMT "TAA CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html for more details.\n"
+#define MMIO_MSG_SMT "MMIO Stale Data CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/processor_mmio_stale_data.html for more details.\n"

void arch_smt_update(void)
{
@@ -1128,6 +1259,16 @@ void arch_smt_update(void)
break;
}

+ switch (mmio_mitigation) {
+ case MMIO_MITIGATION_VERW:
+ case MMIO_MITIGATION_UCODE_NEEDED:
+ if (sched_smt_active())
+ pr_warn_once(MMIO_MSG_SMT);
+ break;
+ case MMIO_MITIGATION_OFF:
+ break;
+ }
+
mutex_unlock(&spec_ctrl_mutex);
}

@@ -1681,6 +1822,20 @@ static ssize_t tsx_async_abort_show_state(char *buf)
sched_smt_active() ? "vulnerable" : "disabled");
}

+static ssize_t mmio_stale_data_show_state(char *buf)
+{
+ if (mmio_mitigation == MMIO_MITIGATION_OFF)
+ return sysfs_emit(buf, "%s\n", mmio_strings[mmio_mitigation]);
+
+ if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
+ return sysfs_emit(buf, "%s; SMT Host state unknown\n",
+ mmio_strings[mmio_mitigation]);
+ }
+
+ return sysfs_emit(buf, "%s; SMT %s\n", mmio_strings[mmio_mitigation],
+ sched_smt_active() ? "vulnerable" : "disabled");
+}
+
static char *stibp_state(void)
{
if (spectre_v2_in_eibrs_mode(spectre_v2_enabled))
@@ -1778,6 +1933,9 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr
case X86_BUG_SRBDS:
return srbds_show_state(buf);

+ case X86_BUG_MMIO_STALE_DATA:
+ return mmio_stale_data_show_state(buf);
+
default:
break;
}
@@ -1829,4 +1987,9 @@ ssize_t cpu_show_srbds(struct device *dev, struct device_attribute *attr, char *
{
return cpu_show_common(dev, attr, buf, X86_BUG_SRBDS);
}
+
+ssize_t cpu_show_mmio_stale_data(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ return cpu_show_common(dev, attr, buf, X86_BUG_MMIO_STALE_DATA);
+}
#endif
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index ff3253b9a8798..48843fc766953 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -962,18 +962,42 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = {
X86_FEATURE_ANY, issues)

#define SRBDS BIT(0)
+/* CPU is affected by X86_BUG_MMIO_STALE_DATA */
+#define MMIO BIT(1)
+/* CPU is affected by Shared Buffers Data Sampling (SBDS), a variant of X86_BUG_MMIO_STALE_DATA */
+#define MMIO_SBDS BIT(2)

static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = {
VULNBL_INTEL_STEPPINGS(IVYBRIDGE, X86_STEPPING_ANY, SRBDS),
VULNBL_INTEL_STEPPINGS(HASWELL_CORE, X86_STEPPING_ANY, SRBDS),
VULNBL_INTEL_STEPPINGS(HASWELL_ULT, X86_STEPPING_ANY, SRBDS),
VULNBL_INTEL_STEPPINGS(HASWELL_GT3E, X86_STEPPING_ANY, SRBDS),
+ VULNBL_INTEL_STEPPINGS(HASWELL_X, BIT(2) | BIT(4), MMIO),
+ VULNBL_INTEL_STEPPINGS(BROADWELL_XEON_D,X86_STEPPINGS(0x3, 0x5), MMIO),
VULNBL_INTEL_STEPPINGS(BROADWELL_GT3E, X86_STEPPING_ANY, SRBDS),
+ VULNBL_INTEL_STEPPINGS(BROADWELL_X, X86_STEPPING_ANY, MMIO),
VULNBL_INTEL_STEPPINGS(BROADWELL_CORE, X86_STEPPING_ANY, SRBDS),
+ VULNBL_INTEL_STEPPINGS(SKYLAKE_MOBILE, X86_STEPPINGS(0x3, 0x3), SRBDS | MMIO),
VULNBL_INTEL_STEPPINGS(SKYLAKE_MOBILE, X86_STEPPING_ANY, SRBDS),
+ VULNBL_INTEL_STEPPINGS(SKYLAKE_X, BIT(3) | BIT(4) | BIT(6) |
+ BIT(7) | BIT(0xB), MMIO),
+ VULNBL_INTEL_STEPPINGS(SKYLAKE_DESKTOP, X86_STEPPINGS(0x3, 0x3), SRBDS | MMIO),
VULNBL_INTEL_STEPPINGS(SKYLAKE_DESKTOP, X86_STEPPING_ANY, SRBDS),
- VULNBL_INTEL_STEPPINGS(KABYLAKE_MOBILE, X86_STEPPINGS(0x0, 0xC), SRBDS),
- VULNBL_INTEL_STEPPINGS(KABYLAKE_DESKTOP,X86_STEPPINGS(0x0, 0xD), SRBDS),
+ VULNBL_INTEL_STEPPINGS(KABYLAKE_MOBILE, X86_STEPPINGS(0x9, 0xC), SRBDS | MMIO),
+ VULNBL_INTEL_STEPPINGS(KABYLAKE_MOBILE, X86_STEPPINGS(0x0, 0x8), SRBDS),
+ VULNBL_INTEL_STEPPINGS(KABYLAKE_DESKTOP,X86_STEPPINGS(0x9, 0xD), SRBDS | MMIO),
+ VULNBL_INTEL_STEPPINGS(KABYLAKE_DESKTOP,X86_STEPPINGS(0x0, 0x8), SRBDS),
+ VULNBL_INTEL_STEPPINGS(ICELAKE_MOBILE, X86_STEPPINGS(0x5, 0x5), MMIO | MMIO_SBDS),
+ VULNBL_INTEL_STEPPINGS(ICELAKE_XEON_D, X86_STEPPINGS(0x1, 0x1), MMIO),
+ VULNBL_INTEL_STEPPINGS(ICELAKE_X, X86_STEPPINGS(0x4, 0x6), MMIO),
+ VULNBL_INTEL_STEPPINGS(COMETLAKE, BIT(2) | BIT(3) | BIT(5), MMIO | MMIO_SBDS),
+ VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPINGS(0x1, 0x1), MMIO | MMIO_SBDS),
+ VULNBL_INTEL_STEPPINGS(COMETLAKE_L, X86_STEPPINGS(0x0, 0x0), MMIO),
+ VULNBL_INTEL_STEPPINGS(LAKEFIELD, X86_STEPPINGS(0x1, 0x1), MMIO | MMIO_SBDS),
+ VULNBL_INTEL_STEPPINGS(ROCKETLAKE, X86_STEPPINGS(0x1, 0x1), MMIO),
+ VULNBL_INTEL_STEPPINGS(ATOM_TREMONT, X86_STEPPINGS(0x1, 0x1), MMIO | MMIO_SBDS),
+ VULNBL_INTEL_STEPPINGS(ATOM_TREMONT_X, X86_STEPPING_ANY, MMIO),
+ VULNBL_INTEL_STEPPINGS(ATOM_TREMONT_L, X86_STEPPINGS(0x0, 0x0), MMIO | MMIO_SBDS),
{}
};

@@ -994,6 +1018,13 @@ u64 x86_read_arch_cap_msr(void)
return ia32_cap;
}

+static bool arch_cap_mmio_immune(u64 ia32_cap)
+{
+ return (ia32_cap & ARCH_CAP_FBSDP_NO &&
+ ia32_cap & ARCH_CAP_PSDP_NO &&
+ ia32_cap & ARCH_CAP_SBDR_SSDP_NO);
+}
+
static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
{
u64 ia32_cap = x86_read_arch_cap_msr();
@@ -1045,12 +1076,27 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
/*
* SRBDS affects CPUs which support RDRAND or RDSEED and are listed
* in the vulnerability blacklist.
+ *
+ * Some of the implications and mitigation of Shared Buffers Data
+ * Sampling (SBDS) are similar to SRBDS. Give SBDS same treatment as
+ * SRBDS.
*/
if ((cpu_has(c, X86_FEATURE_RDRAND) ||
cpu_has(c, X86_FEATURE_RDSEED)) &&
- cpu_matches(cpu_vuln_blacklist, SRBDS))
+ cpu_matches(cpu_vuln_blacklist, SRBDS | MMIO_SBDS))
setup_force_cpu_bug(X86_BUG_SRBDS);

+ /*
+ * Processor MMIO Stale Data bug enumeration
+ *
+ * Affected CPU list is generally enough to enumerate the vulnerability,
+ * but for virtualization case check for ARCH_CAP MSR bits also, VMM may
+ * not want the guest to enumerate the bug.
+ */
+ if (cpu_matches(cpu_vuln_blacklist, MMIO) &&
+ !arch_cap_mmio_immune(ia32_cap))
+ setup_force_cpu_bug(X86_BUG_MMIO_STALE_DATA);
+
if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN))
return;

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 622e5a7eb7db5..c287fcc310b3b 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -211,6 +211,9 @@ static const struct {
#define L1D_CACHE_ORDER 4
static void *vmx_l1d_flush_pages;

+/* Control for disabling CPU Fill buffer clear */
+static bool __read_mostly vmx_fb_clear_ctrl_available;
+
static int vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf)
{
struct page *page;
@@ -794,6 +797,8 @@ struct vcpu_vmx {
*/
u64 msr_ia32_feature_control;
u64 msr_ia32_feature_control_valid_bits;
+ u64 msr_ia32_mcu_opt_ctrl;
+ bool disable_fb_clear;
};

enum segment_cache_field {
@@ -1573,6 +1578,60 @@ static inline void __invept(unsigned long ext, u64 eptp, gpa_t gpa)
: : "a" (&operand), "c" (ext) : "cc", "memory");
}

+static void vmx_setup_fb_clear_ctrl(void)
+{
+ u64 msr;
+
+ if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES) &&
+ !boot_cpu_has_bug(X86_BUG_MDS) &&
+ !boot_cpu_has_bug(X86_BUG_TAA)) {
+ rdmsrl(MSR_IA32_ARCH_CAPABILITIES, msr);
+ if (msr & ARCH_CAP_FB_CLEAR_CTRL)
+ vmx_fb_clear_ctrl_available = true;
+ }
+}
+
+static __always_inline void vmx_disable_fb_clear(struct vcpu_vmx *vmx)
+{
+ u64 msr;
+
+ if (!vmx->disable_fb_clear)
+ return;
+
+ rdmsrl(MSR_IA32_MCU_OPT_CTRL, msr);
+ msr |= FB_CLEAR_DIS;
+ wrmsrl(MSR_IA32_MCU_OPT_CTRL, msr);
+ /* Cache the MSR value to avoid reading it later */
+ vmx->msr_ia32_mcu_opt_ctrl = msr;
+}
+
+static __always_inline void vmx_enable_fb_clear(struct vcpu_vmx *vmx)
+{
+ if (!vmx->disable_fb_clear)
+ return;
+
+ vmx->msr_ia32_mcu_opt_ctrl &= ~FB_CLEAR_DIS;
+ wrmsrl(MSR_IA32_MCU_OPT_CTRL, vmx->msr_ia32_mcu_opt_ctrl);
+}
+
+static void vmx_update_fb_clear_dis(struct kvm_vcpu *vcpu, struct vcpu_vmx *vmx)
+{
+ vmx->disable_fb_clear = vmx_fb_clear_ctrl_available;
+
+ /*
+ * If guest will not execute VERW, there is no need to set FB_CLEAR_DIS
+ * at VMEntry. Skip the MSR read/write when a guest has no use case to
+ * execute VERW.
+ */
+ if ((vcpu->arch.arch_capabilities & ARCH_CAP_FB_CLEAR) ||
+ ((vcpu->arch.arch_capabilities & ARCH_CAP_MDS_NO) &&
+ (vcpu->arch.arch_capabilities & ARCH_CAP_TAA_NO) &&
+ (vcpu->arch.arch_capabilities & ARCH_CAP_PSDP_NO) &&
+ (vcpu->arch.arch_capabilities & ARCH_CAP_FBSDP_NO) &&
+ (vcpu->arch.arch_capabilities & ARCH_CAP_SBDR_SSDP_NO)))
+ vmx->disable_fb_clear = false;
+}
+
static struct shared_msr_entry *find_msr_entry(struct vcpu_vmx *vmx, u32 msr)
{
int i;
@@ -3407,9 +3466,13 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
}
break;
}
- ret = kvm_set_msr_common(vcpu, msr_info);
+ ret = kvm_set_msr_common(vcpu, msr_info);
}

+ /* FB_CLEAR may have changed, also update the FB_CLEAR_DIS behavior */
+ if (msr_index == MSR_IA32_ARCH_CAPABILITIES)
+ vmx_update_fb_clear_dis(vcpu, vmx);
+
return ret;
}

@@ -5544,6 +5607,8 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
update_exception_bitmap(vcpu);

vpid_sync_context(vmx->vpid);
+
+ vmx_update_fb_clear_dis(vcpu, vmx);
}

/*
@@ -9177,6 +9242,11 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
vmx_l1d_flush(vcpu);
else if (static_branch_unlikely(&mds_user_clear))
mds_clear_cpu_buffers();
+ else if (static_branch_unlikely(&mmio_stale_data_clear) &&
+ kvm_arch_has_assigned_device(vcpu->kvm))
+ mds_clear_cpu_buffers();
+
+ vmx_disable_fb_clear(vmx);

asm(
/* Store host registers */
@@ -9295,6 +9365,8 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
#endif
);

+ vmx_enable_fb_clear(vmx);
+
/*
* We do not use IBRS in the kernel. If this vCPU has used the
* SPEC_CTRL MSR it may have left it on; save the value and
@@ -11879,8 +11951,11 @@ static int __init vmx_init(void)
}
}

+ vmx_setup_fb_clear_ctrl();
+
for_each_possible_cpu(cpu) {
INIT_LIST_HEAD(&per_cpu(loaded_vmcss_on_cpu, cpu));
+
INIT_LIST_HEAD(&per_cpu(blocked_vcpu_on_cpu, cpu));
spin_lock_init(&per_cpu(blocked_vcpu_on_cpu_lock, cpu));
}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c0f7e746722d9..78c1838b9fffd 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1090,6 +1090,10 @@ u64 kvm_get_arch_capabilities(void)

/* KVM does not emulate MSR_IA32_TSX_CTRL. */
data &= ~ARCH_CAP_TSX_CTRL_MSR;
+
+ /* Guests don't need to know "Fill buffer clear control" exists */
+ data &= ~ARCH_CAP_FB_CLEAR_CTRL;
+
return data;
}

diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 100850398dd3f..88f89a38588a1 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -556,6 +556,12 @@ ssize_t __weak cpu_show_srbds(struct device *dev,
return sprintf(buf, "Not affected\n");
}

+ssize_t __weak cpu_show_mmio_stale_data(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ return sysfs_emit(buf, "Not affected\n");
+}
+
static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL);
static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL);
static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL);
@@ -565,6 +571,7 @@ static DEVICE_ATTR(mds, 0444, cpu_show_mds, NULL);
static DEVICE_ATTR(tsx_async_abort, 0444, cpu_show_tsx_async_abort, NULL);
static DEVICE_ATTR(itlb_multihit, 0444, cpu_show_itlb_multihit, NULL);
static DEVICE_ATTR(srbds, 0444, cpu_show_srbds, NULL);
+static DEVICE_ATTR(mmio_stale_data, 0444, cpu_show_mmio_stale_data, NULL);

static struct attribute *cpu_root_vulnerabilities_attrs[] = {
&dev_attr_meltdown.attr,
@@ -576,6 +583,7 @@ static struct attribute *cpu_root_vulnerabilities_attrs[] = {
&dev_attr_tsx_async_abort.attr,
&dev_attr_itlb_multihit.attr,
&dev_attr_srbds.attr,
+ &dev_attr_mmio_stale_data.attr,
NULL
};

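With the attribute wired up above, the mitigation status can be read from user space like any other entry in the vulnerabilities directory. An illustrative stand-alone reader, not part of the patch:

    #include <stdio.h>

    int main(void)
    {
            char line[256];
            FILE *f = fopen("/sys/devices/system/cpu/vulnerabilities/mmio_stale_data", "r");

            if (!f) {
                    /* Kernels without this patch do not expose the file. */
                    perror("mmio_stale_data");
                    return 1;
            }
            if (fgets(line, sizeof(line), f))
                    fputs(line, stdout); /* e.g. "Mitigation: Clear CPU buffers; SMT disabled" */
            fclose(f);
            return 0;
    }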
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index e19bbc38a722c..6c1cb1f1bd4b3 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -61,6 +61,10 @@ extern ssize_t cpu_show_tsx_async_abort(struct device *dev,
char *buf);
extern ssize_t cpu_show_itlb_multihit(struct device *dev,
struct device_attribute *attr, char *buf);
+extern ssize_t cpu_show_srbds(struct device *dev, struct device_attribute *attr, char *buf);
+extern ssize_t cpu_show_mmio_stale_data(struct device *dev,
+ struct device_attribute *attr,
+ char *buf);

extern __printf(4, 5)
struct device *cpu_device_create(struct device *parent, void *drvdata,