<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Linux-Kernel on virtfunc</title><link>https://virtfunc.com/tags/linux-kernel/</link><description>Recent content in Linux-Kernel on virtfunc</description><generator>Hugo -- gohugo.io</generator><language>en-us</language><lastBuildDate>Wed, 25 Mar 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://virtfunc.com/tags/linux-kernel/index.xml" rel="self" type="application/rss+xml"/><item><title>Unintercept CPUID</title><link>https://virtfunc.com/projects/unintercept-cpuid/</link><pubDate>Wed, 25 Mar 2026 00:00:00 +0000</pubDate><guid>https://virtfunc.com/projects/unintercept-cpuid/</guid><description>&lt;h2 id="introduction"&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Most hypervisors intercept the CPUID instruction to present a modified feature set to the guest. This behavior is critical for use cases such as live migration across different hardware models and for disabling features that the emulator does not handle.&lt;/p&gt;
&lt;p&gt;However, this interception introduces a detectable side effect. Timing attacks are a common method for virtual machine (VM) detection because VM exits are relatively expensive operations. On Intel CPUs that support Virtual Machine Extensions (&lt;a href="https://en.wikipedia.org/wiki/X86_virtualization#Intel_virtualization_%28VT-x%29"&gt;VMX&lt;/a&gt;), executing CPUID unconditionally causes a VM exit. Without manual mitigation of the transition overhead, the resulting latency serves as a flag of hypervisor presence.&lt;/p&gt;</description><content>&lt;h2 id="introduction"&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Most hypervisors intercept the CPUID instruction to present a modified feature set to the guest. This behavior is critical for use cases such as live migration across different hardware models and for disabling features that the emulator does not handle.&lt;/p&gt;
&lt;p&gt;However, this interception introduces a detectable side effect. Timing attacks are a common method for virtual machine (VM) detection because VM exits are relatively expensive operations. On Intel CPUs that support Virtual Machine Extensions (&lt;a href="https://en.wikipedia.org/wiki/X86_virtualization#Intel_virtualization_%28VT-x%29"&gt;VMX&lt;/a&gt;), executing CPUID unconditionally causes a VM exit. Without manual mitigation of the transition overhead, the resulting latency serves as a flag of hypervisor presence.&lt;/p&gt;
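&lt;p&gt;A minimal user-space sketch of such a timing check might look like the following. The iteration count and the baseline figures in the comments are illustrative assumptions, not thresholds used by any specific detector:&lt;/p&gt;
&lt;pre tabindex="0"&gt;&lt;code&gt;#include &amp;lt;stdio.h&amp;gt;
#include &amp;lt;stdint.h&amp;gt;
#include &amp;lt;cpuid.h&amp;gt;
#include &amp;lt;x86intrin.h&amp;gt;

/* Average CPUID latency in TSC cycles. CPUID is itself a serializing
 * instruction, so the measurement needs no extra fences. */
static uint64_t avg_cpuid_cycles(int iterations)
{
    unsigned int a, b, c, d;
    uint64_t total = 0;

    for (int i = 0; i &amp;lt; iterations; i++) {
        uint64_t start = __rdtsc();
        __get_cpuid(0, &amp;amp;a, &amp;amp;b, &amp;amp;c, &amp;amp;d); /* VM exit happens here when intercepted */
        total += __rdtsc() - start;
    }
    return total / iterations;
}

int main(void)
{
    /* On bare metal this tends to land in the low hundreds of cycles;
     * an intercepting hypervisor typically pushes it far higher. */
    printf(&amp;#34;average CPUID latency: %llu cycles\n&amp;#34;,
           (unsigned long long)avg_cpuid_cycles(1000));
    return 0;
}
&lt;/code&gt;&lt;/pre&gt;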
&lt;p&gt;However, AMD CPUs do not impose this restriction on guests virtualized under the Secure Virtual Machine (&lt;a href="https://en.wikipedia.org/wiki/X86_virtualization#AMD_virtualization_%28AMD-V%29"&gt;SVM&lt;/a&gt;) extension. An SVM guest can be configured to bypass &lt;code&gt;CPUID&lt;/code&gt; interception by clearing bit 18 at offset &lt;code&gt;0xC&lt;/code&gt; in the Virtual Machine Control Block (VMCB) (&lt;a href="https://docs.amd.com/api/khub/documents/68GKiN0gMEd6bMddsmhPwg/content?#G27.1021954.9Y"&gt;Appendix B of AMD document 24593&lt;/a&gt;).&lt;/p&gt;
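&lt;p&gt;As a standalone illustration of that encoding, the control area can be modeled as an array of 32-bit intercept vectors, with the CPUID intercept living in the dword at offset &lt;code&gt;0xC&lt;/code&gt;, bit 18. The flat array here is a deliberate simplification of the real VMCB layout:&lt;/p&gt;
&lt;pre tabindex="0"&gt;&lt;code&gt;#include &amp;lt;assert.h&amp;gt;
#include &amp;lt;stdint.h&amp;gt;

/* CPUID intercept: dword at VMCB offset 0xC, bit 18
 * (AMD APM vol. 2, Appendix B). */
#define CPUID_INTERCEPT_OFFSET 0x0C
#define CPUID_INTERCEPT_BIT    18

static void set_cpuid_intercept(uint32_t *control, int enable)
{
    uint32_t *vec = &amp;amp;control[CPUID_INTERCEPT_OFFSET / sizeof(uint32_t)];

    if (enable)
        *vec |= (1u &amp;lt;&amp;lt; CPUID_INTERCEPT_BIT);
    else
        *vec &amp;amp;= ~(1u &amp;lt;&amp;lt; CPUID_INTERCEPT_BIT);
}

int main(void)
{
    uint32_t control[16] = { 0 };

    set_cpuid_intercept(control, 1);
    assert(control[3] == 262144u); /* bit 18 of the fourth dword is set */
    set_cpuid_intercept(control, 0);
    assert(control[3] == 0);       /* cleared: CPUID no longer intercepted */
    return 0;
}
&lt;/code&gt;&lt;/pre&gt;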
&lt;h2 id="challenges-in-implementation"&gt;Challenges in Implementation&lt;/h2&gt;
&lt;p&gt;Flipping this bit in the VMCB should, in theory, make a hypervisor immune to &lt;code&gt;CPUID&lt;/code&gt; timing checks. However, bypassing the intercept passes the raw silicon data and &lt;code&gt;CPUID&lt;/code&gt; leaves directly to the guest, which introduces several critical issues:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Topology Mismatches:&lt;/strong&gt; The host CPU topology may differ from the hardware allocated to the VM, such as running a 6-core VM on an 8-core host.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Feature Set Inconsistency:&lt;/strong&gt; Passing through raw data may change the feature set presented to the virtualized OS, which occasionally leads to system instability if changed at runtime.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Boot Failures:&lt;/strong&gt; The VM may fail to boot entirely. This is caused by CET not being implemented in current QEMU versions (v10.2.x), as well as some MSR issues.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The first issue is resolvable by matching the VM core count to the host and using &lt;a href="https://libvirt.org/formatdomain.html#cpu-tuning"&gt;libvirt pinning&lt;/a&gt; to ensure vCPUs do not migrate between physical cores and confuse the guest kernel.&lt;/p&gt;
&lt;p&gt;The second issue can be addressed using tools like &lt;a href="https://github.com/daaximus/arch_enum"&gt;arch_enum&lt;/a&gt; to identify differences between host and guest features. Emulators like QEMU often expose emulated Intel features by default for AMD CPUs; these must be disabled to maintain stealth and stability. Depending on the method used to disable the &lt;code&gt;CPUID&lt;/code&gt; intercepts, this step may not be needed.&lt;/p&gt;
&lt;p&gt;The third issue is the most significant hurdle, as a VM that cannot boot is of questionable utility.&lt;/p&gt;
&lt;h2 id="solution-theory"&gt;Solution Theory&lt;/h2&gt;
&lt;p&gt;To fix the third issue, we first need to patch QEMU to support CET for KVM guests. Recent Linux kernels already have CET support in KVM, but QEMU does not virtualize it. A patch to enable support is available &lt;a href="https://patchwork.ozlabs.org/series/485044/mbox/"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;A hacky solution involves disabling &lt;code&gt;CPUID&lt;/code&gt; intercepts at runtime only after the guest VM has successfully initialized all CPU cores. While one could create a KVM parameter to toggle the intercepts manually, doing so at every boot is tedious and error-prone.&lt;/p&gt;
&lt;p&gt;A more automated approach follows this logic:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Enable &lt;code&gt;CPUID&lt;/code&gt; exits during the initial guest start sequence.&lt;/li&gt;
&lt;li&gt;Hook a VM exit that reliably occurs after all cores have been initialized.&lt;/li&gt;
&lt;li&gt;From that hook, loop through all vCPUs to disable their &lt;code&gt;CPUID&lt;/code&gt; intercepts.&lt;/li&gt;
&lt;li&gt;Conduct runtime tests without the performance overhead of &lt;code&gt;CPUID&lt;/code&gt; transitions.&lt;/li&gt;
&lt;li&gt;Restore the intercepts upon a guest reboot to ensure the next boot sequence succeeds.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="implementation"&gt;Implementation&lt;/h2&gt;
&lt;p&gt;Enabling &lt;code&gt;CPUID&lt;/code&gt; exits during the initial startup sequence requires no action on our part, since intercepting the instruction is the default behavior. After the cores are initialized, Windows appears to enumerate CPUID leaves starting from 0 up to the maximum value returned in &lt;code&gt;EAX&lt;/code&gt; when &lt;code&gt;CPUID&lt;/code&gt; is invoked with &lt;code&gt;EAX = 0&lt;/code&gt; (which indicates the highest supported leaf). According to the AMD manual, leaves &lt;code&gt;0x00000008&lt;/code&gt; through &lt;code&gt;0x0000000A&lt;/code&gt; are reserved, meaning calls to &lt;code&gt;CPUID&lt;/code&gt; with these values do not provide useful information and are most likely part of a simple enumeration loop.&lt;/p&gt;
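&lt;p&gt;The leaf-0 convention is easy to observe from user space. This small program is only an illustration of the enumeration bound described above:&lt;/p&gt;
&lt;pre tabindex="0"&gt;&lt;code&gt;#include &amp;lt;stdio.h&amp;gt;
#include &amp;lt;cpuid.h&amp;gt;

static unsigned int max_standard_leaf(void)
{
    unsigned int eax, ebx, ecx, edx;

    /* Leaf 0: EAX holds the highest supported standard leaf;
     * EBX/EDX/ECX hold the vendor string. */
    __get_cpuid(0, &amp;amp;eax, &amp;amp;ebx, &amp;amp;ecx, &amp;amp;edx);
    return eax;
}

int main(void)
{
    printf(&amp;#34;max standard leaf: 0x%08x\n&amp;#34;, max_standard_leaf());
    return 0;
}
&lt;/code&gt;&lt;/pre&gt;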
&lt;p&gt;Any leaf in this range will work, so we choose &lt;code&gt;0x00000008&lt;/code&gt; as the trigger point for disabling CPUID intercepts. It is worth noting that this enumeration appears to occur after boot-start drivers have loaded. A simple CPUID hook that catches this could be implemented as:&lt;/p&gt;
&lt;pre tabindex="0"&gt;&lt;code&gt;#define RESERVED_LEAF 0x00000008
static int windows_boot_hook(struct kvm_vcpu *vcpu)
{
u32 eax = kvm_rax_read(vcpu);
int ret = kvm_emulate_cpuid(vcpu);
if (eax == RESERVED_LEAF) svm_set_cpuid_intercept_for_all_vcpus(vcpu, false);
return ret;
}
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Here &lt;code&gt;svm_set_cpuid_intercept_for_all_vcpus()&lt;/code&gt; is a function that, for every vCPU, records the desired intercept state and sets a flag requesting an intercept update. After setting the flags for all cores, it kicks the vCPUs so they have a chance to process the update on their next VM entry.&lt;/p&gt;
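&lt;p&gt;The request-flag scheme can be modeled outside the kernel as follows. &lt;code&gt;struct vcpu_model&lt;/code&gt; and these function names are hypothetical stand-ins for the KVM structures, and the kick is reduced to simply running the per-vCPU loop body:&lt;/p&gt;
&lt;pre tabindex="0"&gt;&lt;code&gt;#include &amp;lt;assert.h&amp;gt;
#include &amp;lt;stdbool.h&amp;gt;

/* hypothetical stand-in for the per-vCPU state kept in struct vcpu_svm */
struct vcpu_model {
    bool cpuid_update_req;  /* an intercept update was requested */
    bool cpuid_intercept;   /* desired intercept state */
    bool intercept_active;  /* stands in for the VMCB intercept bit */
};

static void set_cpuid_intercept_for_all(struct vcpu_model *vcpus, int n, bool enable)
{
    for (int i = 0; i &amp;lt; n; i++) {
        vcpus[i].cpuid_intercept = enable;
        vcpus[i].cpuid_update_req = true; /* in KVM, followed by a kick */
    }
}

/* models the check at the top of the vCPU run loop */
static void vcpu_run_once(struct vcpu_model *v)
{
    if (v-&amp;gt;cpuid_update_req) {
        v-&amp;gt;cpuid_update_req = false;
        v-&amp;gt;intercept_active = v-&amp;gt;cpuid_intercept;
    }
}

int main(void)
{
    struct vcpu_model vcpus[2] = {
        { false, true, true },
        { false, true, true },
    };

    set_cpuid_intercept_for_all(vcpus, 2, false);
    vcpu_run_once(&amp;amp;vcpus[0]);          /* only vCPU 0 re-enters the guest */

    assert(!vcpus[0].intercept_active); /* update applied on entry */
    assert(vcpus[1].intercept_active);  /* still pending until its next entry */
    return 0;
}
&lt;/code&gt;&lt;/pre&gt;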
&lt;p&gt;A reset/reboot detection hook could be implemented as shown below. Since &lt;code&gt;svm_set_segment()&lt;/code&gt; is called in cases other than processor resets, we filter to ensure the core is the Bootstrap Processor (BSP) and is actually being reset.&lt;/p&gt;
&lt;pre tabindex="0"&gt;&lt;code&gt;static void svm_set_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg)
{
...
if (unlikely(vcpu-&amp;gt;vcpu_id == 0 &amp;amp;&amp;amp; seg == VCPU_SREG_CS &amp;amp;&amp;amp;
var-&amp;gt;base == 0xffff0000)) {//detect BSP x86 reset
printk(KERN_INFO &amp;#34;kvm_amd: BSP reset detected\n&amp;#34;);
svm_set_cpuid_intercept_for_all_vcpus(vcpu, true);
}
...
}
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;To actually update the state of the intercept, something like the following is used. The flag controlling the CPUID intercept update exists as a performance optimization: since &lt;code&gt;struct vcpu_svm&lt;/code&gt; is already part of the fast path and typically cached, using a flag avoids both a comparison against the VMCB intercept region and an unconditional &lt;code&gt;vmcb_mark_dirty()&lt;/code&gt; call on every &lt;code&gt;svm_vcpu_run()&lt;/code&gt;.&lt;/p&gt;
&lt;pre tabindex="0"&gt;&lt;code&gt;static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu, u64 run_flags)
{
...
if (unlikely(svm-&amp;gt;cpuid_update_req)) { // simple check instead of cmp against vcpu_svm
svm-&amp;gt;cpuid_update_req = false; // clear request
if (svm-&amp;gt;cpuid_intercept) svm_set_intercept(svm, INTERCEPT_CPUID);
else svm_clr_intercept(svm, INTERCEPT_CPUID);
vmcb_mark_dirty(svm-&amp;gt;vmcb, VMCB_INTERCEPTS);
}
...
}
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id="results"&gt;Results&lt;/h2&gt;
&lt;p&gt;Using this patch, it is possible to fully bypass CPUID-based timing attacks. With correctly configured firmware and QEMU, it is possible to obtain 0/91 detections on the latest version of &lt;a href="https://github.com/kernelwernel/VMAware"&gt;VMAware&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://virtfunc.com/images/unintercept-cpuid/vmaware_bypass.png" alt=""&gt;&lt;/p&gt;
&lt;h2 id="alternatives"&gt;Alternatives&lt;/h2&gt;
&lt;p&gt;A simpler method, passing through the host &lt;code&gt;CPUID&lt;/code&gt; for the entire VM runtime, is possible but outside the scope of this article. With that method, the guest &lt;code&gt;CPUID&lt;/code&gt; configuration in QEMU is completely irrelevant, because only what the silicon reports will be seen when executing &lt;code&gt;CPUID&lt;/code&gt;.&lt;/p&gt;</content></item></channel></rss>