The first question many people are asking is: does the recent Meltdown patch take effect on systems with AMD processors? The answer is yes. You will see proof of this later in this article.
- Intel Atom C, E, A, x3, Z and Celeron- + Pentium-Series J & N
- Xeon 3400, 3600, 5500, 5600, 6500 and 7500
- Xeon family E3 (v1 to v6), E5 (v1 to v4), E7 (v1 to v4)
- Xeon Scalable, Xeon Phi 3200, 5200 & 7200
- Older CPUs aren’t affected because of their in-order architecture (newer CPU generations use an out-of-order architecture)
- nVidia is still analysing its Tegra processors at this time
- ARM: Cortex-R7, Cortex-R8, Cortex-A8, Cortex-A9, Cortex-A15, Cortex-A17, Cortex-A57, Cortex-A72, Cortex-A73 & Cortex-A75
All executable binaries have an “entry” point. Those of you who have a background in user-mode software engineering with languages like C or C++ will be familiar with the “main” entry-point. Those of you with a background in the .NET Framework will be familiar with the Form “Load” and “Shown” events. This isn’t a tutorial on programming, therefore I’m not going to explain how run-time libraries like the CRT (C Run-Time) are initialised before they call the user-defined “entry-point” and the like, but what I will mention is that ntoskrnl.exe, like all software, has an entry-point too. Whether it is a kernel or user image does not change this.
The entry-point of NTOSKRNL is a routine called KiSystemStartup. This user-defined entry-point will return an NTSTATUS error-code, where a return of STATUS_SUCCESS indicates successful execution. The routine follows the “entry-point” tradition for kernel-mode device driver development; there are two parameters to KiSystemStartup, a PDRIVER_OBJECT and a PUNICODE_STRING, just like a DriverEntry routine. Bear in mind that *.exe, *.dll and *.sys files are all Portable Executables, so this isn’t strange, and NTOSKRNL is bound to follow kernel-mode device driver development style because device drivers run under it and have access to its memory.
When NTOSKRNL is loaded up in Interactive Disassembler (referenced as IDA from now on throughout this thread), it’ll automatically detect KiSystemStartup as being the user-defined entry-point. We’ll be able to see it clearly in graph view.
We can see that the parameters of KiSystemStartup are:
– DriverObject (PDRIVER_OBJECT)
– RegistryPath (PUNICODE_STRING)
The PDRIVER_OBJECT data-type is a pointer to the DRIVER_OBJECT structure, which holds data about the current driver object. That doesn’t explain it very well to people who are wondering what data a “driver object” will hold. To explain it as simply as possible, it stores data which the driver itself can reference in the future, and the driver can alter data through the pointer to change the functionality available to it. An example would be granting support for the driver to be dynamically unloaded: by default, a loaded device driver cannot be “unloaded” by sending a stop request to its service unless it has registered an unload routine (PDRIVER_UNLOAD) in the DriverUnload member of the DRIVER_OBJECT structure. This is one reason why you can sometimes encounter an error message when trying to stop a service which represents a loaded kernel-mode device driver.
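To make the DriverUnload point a bit more concrete, here is a minimal user-mode sketch. It does not use the real WDK headers: every type and constant ending in _SIM, as well as SampleDriverEntry, SampleUnload and StopDriver, is a hypothetical stand-in invented for this illustration, and the error value is made up.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the real WDK types; all *_SIM names are hypothetical. */
typedef long NTSTATUS_SIM;
#define STATUS_SUCCESS_SIM        ((NTSTATUS_SIM)0)
#define STATUS_NOT_UNLOADABLE_SIM ((NTSTATUS_SIM)-1)   /* made-up error value */

struct DRIVER_OBJECT_SIM;
typedef void (*PDRIVER_UNLOAD_SIM)(struct DRIVER_OBJECT_SIM *DriverObject);

typedef struct DRIVER_OBJECT_SIM {
    PDRIVER_UNLOAD_SIM DriverUnload;   /* NULL means "cannot be unloaded" */
    int Loaded;
} DRIVER_OBJECT_SIM;

/* The driver's own clean-up routine. */
static void SampleUnload(DRIVER_OBJECT_SIM *DriverObject)
{
    DriverObject->Loaded = 0;
}

/* DriverEntry-style routine: opts in to unloading by filling in DriverUnload. */
static NTSTATUS_SIM SampleDriverEntry(DRIVER_OBJECT_SIM *DriverObject, int supportUnload)
{
    assert(DriverObject != NULL);
    DriverObject->Loaded = 1;
    DriverObject->DriverUnload = supportUnload ? SampleUnload : NULL;
    return STATUS_SUCCESS_SIM;
}

/* What a stop request roughly boils down to in this simplified model. */
static NTSTATUS_SIM StopDriver(DRIVER_OBJECT_SIM *DriverObject)
{
    if (DriverObject->DriverUnload == NULL)
        return STATUS_NOT_UNLOADABLE_SIM;   /* no unload routine registered */
    DriverObject->DriverUnload(DriverObject);
    return STATUS_SUCCESS_SIM;
}
```

Calling SampleDriverEntry with supportUnload set to 0 and then StopDriver gives back the error value and leaves the driver loaded, which mirrors the failed stop request described above.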
Speaking of driver objects, we can see in the graph view above that one of the first things the KiSystemStartup routine does is back up the address of its first parameter, which allows it to reference and change the members later. It assigns that address to a global variable named KeLoaderBlock. The name of that global is a hint that the value is really a pointer to the loader parameter block handed over by the boot loader, and that the DriverObject (PDRIVER_OBJECT) typing comes from IDA applying a DriverEntry-style prototype to the entry-point. I’ll keep using the names IDA shows so that the text matches the disassembly.
The KiSystemStartup routine itself is actually quite small compared to what you’d expect; it is more-or-less responsible for basic initialisation. It calls other routines which handle the more important or sensitive work, and it receives back a status code for how each operation went. Those routines call further routines which report their own status codes in the same way, and so on down the chain. At the end of the KiSystemStartup routine, a do-while loop is used to hold up the thread which is executing KiSystemStartup until a flag value is changed to FALSE.
If we have a variable called “Opcode” and its value is non-zero (let’s say it is an integer, which is what the Windows BOOL data-type is defined as in C and C++), then a conditional statement which checks whether the value is TRUE will come back as positive. The reason is that 0 is used to indicate FALSE, and any value other than 0 is treated as TRUE. Therefore, to explain what the term “flag” means, think of it as a value which indicates whether something is true or not. A flag is merely used as a reference by other components when they need help deciding execution flow. For example, if we have a group of 500 cats in our house and we want to track whether they have all been fed, we can use a flag which indicates that all 500 cats were fed when its value is TRUE. Of course you must feed them more than once each day, and I’m sure you can think of something more efficient, but that should do the trick for explaining it.
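Here is a small sketch of the flag idea combined with a do-while loop like the one at the end of KiSystemStartup. The names BOOL_SIM, flag_is_set and feed_all_cats are made up for this example; nothing here is taken from NTOSKRNL.

```c
#include <assert.h>

typedef int BOOL_SIM;   /* mirrors how the Windows headers define BOOL as an int */
#define TRUE_SIM  1
#define FALSE_SIM 0

/* Any non-zero value counts as "true" in a C condition, including negatives. */
static int flag_is_set(BOOL_SIM flag)
{
    return flag ? 1 : 0;
}

/* Feeds the cats one at a time and clears the "still hungry" flag once all
   of them are fed. Returns the number of loop iterations that ran. */
static int feed_all_cats(int cat_count)
{
    BOOL_SIM cats_still_hungry = TRUE_SIM;
    int fed = 0;
    int iterations = 0;

    assert(cat_count > 0);
    do {
        fed++;
        iterations++;
        if (fed == cat_count)
            cats_still_hungry = FALSE_SIM;   /* the flag flips, the loop ends */
    } while (cats_still_hungry);

    return iterations;
}
```

The loop body always runs at least once, which is the defining property of do-while, and it keeps spinning until something changes the flag to FALSE.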
There’s a non-exported, undocumented and internally private routine in NTOSKRNL called KiInitializeBootStructures. The KiInitializeBootStructures routine takes in two parameters, and funnily enough, the address of the DriverObject (PDRIVER_OBJECT) from the user-defined entry-point of KiSystemStartup is passed in as the first parameter.
We’ll start by moving into KiInitializeBootStructures. I will tease it… KiInitializeBootStructures plays a big part in the AMD enforcement for the recent patch!
In the disassembly of KiSystemStartup, we can see that KeLoaderBlock_O is being moved into the RCX register. KiInitializeBootStructures is then being called, and the start of the KiInitializeBootStructures routine is going to extract the value from RCX register to recover the parameter being passed in.
I’m going to move into pseudo-code view, and I’ll clean-up the routine by changing the data-type names, variable names, etcetera. It’ll make it clearer to understand, and it will be less confusing to look at this way.
Ignore everything except the highlighted part, this is what we need to be focusing on because this will introduce how the AMD-processor systems are being enforced for the patch update. We can see a type-definition here, with the calling convention __fastcall being used. No parameters for the type-definition.
KiSystemCall32 and KiSystemCall64 are critical key parts to the patch update. I’m going to explain to you what these routines are and why they are so important, as well as some additional background information regarding trust levels in Windows which was already present prior to the patch update. This is essential before I can proceed and explain how the AMD enforcement is done (which is actually quite brief and straight forward).
KiSystemCall32 and KiSystemCall64 are routines which will be triggered in the Windows Kernel for system calls. System calls are used by user-mode software which needs to have the kernel execute an operation via a Native API routine. For those of you which are unaware, a Native API routine is a lower-level routine which is usually less-documented and less-stable, but there are many different types of “Native API” routines. It is a lot closer to the Windows Kernel, but it isn’t as deep as it seems.
One type of Native API routine is the group which starts with “Nt” or “Zw” (Nt/Zw). These routines exist in kernel-mode memory, however they are also the ones which are invoked by user-mode software (usually unintentionally, since documented interfaces such as the Win32 API rely heavily on the Native API). As a means of separating kernel-mode memory from user-mode memory, every user-mode process is given its own “Virtual Address Space”, and a user-mode process cannot access kernel-mode memory by default. This means that when a user-mode program needs a Native API routine to be performed by the Windows Kernel, it must have some mechanism for communicating with the Windows Kernel to make this happen. This is where NTDLL and system calls come in.
NTDLL (ntdll.dll – located at “SystemDrive:\Windows\System32\” and for 64-bit systems also “SystemDrive:\Windows\SysWOW64\“) is loaded into the address space of every single user-mode process (I am not sure about Pico processes with WSL but for Windows PEs it will be). Without NTDLL, the process cannot operate. The reason for this is because NTDLL is depended on too much, it provides the ability for user-mode to kernel-mode transition for Native API operation usage. Without the ability for a user-mode process to work the Native API (e.g. in the background by the Windows Loader/documented interfaces internally), the process cannot work. Anything from opening a handle to a process, reading the contents of a file, or as simple but critical to creating a thread will require support from the Windows Kernel. This is why NTDLL.DLL is the first module loaded into every single process – there’s a specific group order and other modules like KERNEL32 and USER32 are important most of the time.
As mentioned, a communication method from user-mode to kernel-mode must exist for the correct support to be available. Windows relies on system calls for this communication, allowing user-mode software to perform a transition from user-mode, have code execution moved to the kernel temporarily, and receive back a response on how the operation went. The instructions for system calls are part of the CPU instruction set. In the x86 instruction set, there are two widely used system call instructions: SYSCALL and SYSENTER. The former is used on a 64-bit kernel.
When the SYSCALL or SYSENTER instruction is executed, code execution is moved to a kernel-mode routine whose address is held in a Model Specific Register (MSR). For the SYSCALL instruction, execution is directed to the address held in the IA32_LSTAR MSR, which is supposed to be the address of KiSystemCall64 (NTOSKRNL). This means that whenever the SYSCALL instruction is executed, KiSystemCall64 is executed in kernel-mode (where the routine exists). At the same time, if an attacker has code execution in kernel-mode or write access to kernel-mode memory, they can intercept every single user-mode to kernel-mode system call transition for the 64-bit environment, either by hijacking the IA32_LSTAR MSR (replacing the address it holds) or by patching the memory of KiSystemCall64. In the first case, execution flow is directed to the rogue KiSystemCall64 routine as soon as SYSCALL is executed. In the second case, the real KiSystemCall64 is triggered, but when the Instruction Pointer reaches the patch (e.g. at the start of the routine), an inserted JMP or CALL instruction redirects execution flow over to the rogue copy.
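The MSR indirection can be modelled in a few lines of user-mode C. This is only a simulation: a function pointer stands in for the raw 64-bit address a real MSR holds, and write_msr_sim, syscall_sim, real_handler and rogue_handler are hypothetical names invented for the sketch.

```c
#include <assert.h>
#include <stdint.h>

#define MSR_LSTAR_SIM 0xC0000082u   /* IA32_LSTAR: target of the 64-bit SYSCALL */

typedef int (*syscall_handler_t)(uint32_t service_id);

/* One simulated MSR slot; a real MSR holds a raw 64-bit address. */
static syscall_handler_t g_lstar;

/* Stands in for KiSystemCall64: simply echoes the service identifier. */
static int real_handler(uint32_t service_id)
{
    return (int)service_id;
}

/* Stands in for an attacker-controlled copy of the handler. */
static int rogue_handler(uint32_t service_id)
{
    (void)service_id;
    return -1;
}

static void write_msr_sim(uint32_t msr, syscall_handler_t target)
{
    assert(msr == MSR_LSTAR_SIM);   /* only LSTAR is modelled here */
    g_lstar = target;
}

/* What SYSCALL conceptually does: jump to whatever LSTAR currently holds. */
static int syscall_sim(uint32_t service_id)
{
    assert(g_lstar != 0);
    return g_lstar(service_id);
}
```

The caller of syscall_sim never names the handler directly, so whoever controls the MSR slot controls where every system call lands. That is exactly why the value in IA32_LSTAR is so sensitive.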
I’ve left unnecessary details in the screenshot because despite them not being so important right now, I’d rather explain it a bit more in-depth to make this worth-while reading for specific people who may be interested. The screenshot might look a bit complicated despite it being quite empty for people who are a bit more beginner but trying to get a grasp on an understanding of all of this, so let’s go through the screenshot bit by bit – focus on one part at a time, starting from the top of the screenshot.
The start of the screenshot has two lines which mention an “Exported entry”, one for “NtAddBootEntry” and one for “ZwAddBootEntry”. You might be wondering why this is. The reason is that in user-mode, the Nt* routines are identical to the Zw* routines. That’s right! When you acquire the address of NtAddBootEntry (NTDLL), you’ll be given the same address as if you had asked for ZwAddBootEntry (NTDLL). Therefore, regardless of whether you use Nt* or Zw* in user-mode, the exact same instructions are going to be executed. I believe that Microsoft must have had plans early on to make some differences here, but then likely changed their mind. Because it would have been too late to make a massive change (for compatibility reasons, since some people prefer using Zw* and others prefer Nt*), I suspect they just decided to leave it in. After all, the routines aren’t duplicated for the Nt* and Zw* names, so the file size of NTDLL doesn’t grow because of it. In kernel-mode, there is indeed a difference between using Nt* and Zw*, but I’ll explain this later down the road.
The next part is commented by Interactive Disassembler as the “Subroutine”. This is telling us that the following instructions belong to the actual sub-routine itself. The part at the bottom is still part of the subroutine, but it is obsolete in the modern world. I’ll still be going through it though for old-school sake.
The instructions for the routine start with the MOV R10, RCX line. We can see that to our left, we have the hexadecimal representation of the instructions. We can reference the instructions in “byte” form by adding a “0x” before each hexadecimal pair. E.g. MOV R10, RCX in byte-form would be 0x4C, 0x8B, 0xD1. If we wanted to put this into traditional shell-code form with the string format, it’d become “\x4c\x8b\xd1”.
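As a quick illustration of that byte-form to string-form conversion, here is a small helper. The function name bytes_to_shellcode_string is my own and not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Formats raw bytes as "\x4c\x8b\xd1"-style text.
   Returns the number of characters written (excluding the terminator),
   or -1 if the output buffer is too small. */
static int bytes_to_shellcode_string(const unsigned char *bytes, size_t count,
                                     char *out, size_t out_size)
{
    size_t needed = count * 4 + 1;   /* 4 characters per byte plus terminator */
    size_t i;

    assert(bytes != NULL && out != NULL);
    if (out_size < needed)
        return -1;

    out[0] = '\0';                   /* keeps the result valid when count is 0 */
    for (i = 0; i < count; i++)
        snprintf(out + i * 4, 5, "\\x%02x", bytes[i]);

    return (int)(count * 4);
}
```

Passing the three bytes 0x4C, 0x8B, 0xD1 with a 16-byte buffer produces the 12-character text shown in the paragraph above.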
The important part we need to start noticing is the usage of the EAX register. On the second line, a value is being moved into the EAX register. In this case, 69 (69h) is being moved into the EAX register. See the following image.
The reason 69h is being moved into the EAX register is because 69 is the identifier for the Native API routine which needs to be called. These “identifiers” change across Windows patch updates and major versions, therefore you must never hard-code them. When code execution is triggered in kernel-mode to carry out the Native API operation due to the system call, the Windows Kernel needs to know which routine was being requested to be called. It achieves this through extracting the identifier from the EAX register, and in the Windows Kernel, the identifier for the routine is safely stored along with the pointer address of where the routine resides in kernel-mode memory.
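Since the identifiers must never be hard-coded, one common approach is to read them out of the stub bytes at run-time instead. The sketch below assumes the stub starts with exactly the two instructions discussed above (MOV R10, RCX followed by MOV EAX, imm32, the latter encoded as B8 plus four little-endian bytes) and bails out on anything else. The function name read_service_id is hypothetical, and the byte buffer would come from wherever you obtained the stub.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Checks for "mov r10, rcx" (4C 8B D1) followed by "mov eax, imm32"
   (B8 xx xx xx xx) at the start of a stub and extracts the immediate.
   Returns 1 on success, 0 if the bytes do not match that layout. */
static int read_service_id(const uint8_t *stub, size_t size, uint32_t *id_out)
{
    assert(stub != NULL && id_out != NULL);
    if (size < 8)
        return 0;
    if (stub[0] != 0x4C || stub[1] != 0x8B || stub[2] != 0xD1 || stub[3] != 0xB8)
        return 0;   /* unexpected layout, e.g. a hooked or different stub */

    /* imm32 is stored little-endian */
    *id_out = (uint32_t)stub[4]
            | ((uint32_t)stub[5] << 8)
            | ((uint32_t)stub[6] << 16)
            | ((uint32_t)stub[7] << 24);
    return 1;
}
```

Returning 0 instead of guessing is deliberate: if the first bytes are not what we expect, the stub may have been modified, and any identifier read from it would be meaningless.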
The third and fourth lines use the TEST and JNZ instructions to decide whether another mechanism should be used for the system call. Back in the early days, Windows 2000 relied on the 0x2E interrupt, and Windows XP moved to the SYSENTER instruction on processors which supported it; the SYSCALL instruction is what 64-bit Windows uses. Microsoft made a wide variety of changes to the Windows Kernel with Windows Vista, and without Windows Vista, many things which are still used today (extended of course via patch updates and new versions of Windows) would likely not exist. Despite security not being so great in the Vista days, it did change the future for Windows. Depending on the result of the TEST instruction, code execution flow will move to loc_1800A0B55, which uses the 0x2E interrupt instead of the now-traditional SYSCALL and SYSENTER instructions. On a typical modern installation of Windows this path is not taken, and execution will not end up at loc_1800A0B55.
Now we have the SYSCALL instruction. The bytes for this instruction are 0x0F 0x05. This version of NTDLL is 64-bit compiled and thus the SYSCALL instruction is being used.
For 32-bit compiled programs running in a 64-bit environment, WOW64 (Windows On Windows 64) is used. WOW64 exists for compatibility, because a lot of software is compiled for 32-bit and may not have a 64-bit build available. WOW64 processes have two copies of NTDLL loaded in their address space: the SysWOW64 copy and the System32 copy. The former is always 32-bit compiled and the latter is always 64-bit compiled. When a Native API call is made in user-mode, it leads to the 32-bit NTDLL function prologue for that Native API routine, but a system call isn’t performed at this point. Instead, execution flow passes through the WOW64 layer, and then lands at the function prologue for the correct routine in the 64-bit NTDLL (which a WOW64 process cannot access by default, because 32-bit code cannot reference 64-bit memory addresses, despite the 64-bit compiled NTDLL being present within the process). WOW64 achieves the redirection to the 64-bit compiled DLL by using the code segment selector, which determines the “world” the processor is currently executing in. When the segment selector is 0x23, you are executing under the context of the “32-bit world”. When the segment selector is 0x33, you’re executing under the context of the “64-bit world”. The Windows Kernel still relies on 64-bit system calls even for 32-bit compiled software operating on a 64-bit machine, hence why the segment selector is used to redirect to the 64-bit compiled NTDLL. This behaviour can be replicated by third-party developers, despite being unstable and strongly discouraged, and the technique is commonly dubbed “Heaven’s Gate”. I suspect it got this name because it is a gate to 64-bit memory access from within a 32-bit compiled process on a 64-bit environment, which can be seen as “Heaven” since you aren’t supposed to be doing it.
I’ve finished explaining NTDLL and why the MSRs which point to KiSystemCall32/KiSystemCall64 are important and when they are triggered. Now I can progress to the final explanation (about trust levels) before I move on and finalise this thread with the ending regarding the AMD enforcement for the latest patch update.
When KiSystemServiceStart is called, it will call KiSystemServiceRepeat. KiSystemServiceRepeat will actually expose the address of KeServiceDescriptorTable and KeServiceDescriptorTableShadow, even on 64-bit systems where these tables are not exported and accessible by default. However, it must expose the addresses because this routine, KiSystemServiceRepeat, is responsible for extracting the address of the NT Service System Routine which needs to be called (the system call was performed so a kernel routine could be executed) from the System Service Descriptor Table. Just to be clear, the KeServiceDescriptorTable is what I’m referring to right now. The Shadow variant is the same however it also contains win32k.sys routine addresses, whereas the KeServiceDescriptorTable does not.
When a kernel-mode device driver makes a call to a Zw* kernel routine, execution will eventually land at KiSystemServiceRepeat. The difference, however, is that Zw* routines call a routine named KiServiceInternal (NTOSKRNL), which changes the PreviousMode of the current thread within KTHREAD, an undocumented, opaque structure which every thread on the system is given. The kernel uses PreviousMode to track the trust level of the caller requesting the System Service Routine (SSR). If the PreviousMode of the current thread is set to KernelMode, then the Windows Kernel knows that the caller is a kernel-mode caller, and can grant permission for the routines to make use of kernel-mode addresses. With a user-mode caller, KiServiceInternal is not invoked, and thus the PreviousMode value for the current thread in the KTHREAD structure is not changed, which prevents a user-mode program from referencing a kernel-mode address through the NT System Service Routines via system calls. The only difference in general is that in kernel-mode, using Zw* lets the Windows Kernel understand that the request is coming from a trusted source, and this doesn’t happen in user-mode. This is why, if you manage to find the base address of the kernel image (NTOSKRNL) from user-mode, likely via NtQuerySystemInformation with the undocumented SystemModuleInformation class, you won’t be allowed to use a routine from user-mode to access that memory (e.g. read/write), even though the routine invoked from user-mode relies on kernel-mode support. In user-mode, the Nt* and Zw* versions are the same, as we have established, and a system call is performed for the user-mode to kernel-mode transition, so execution is handed to the kernel to handle the operation.
The trust level will be seen as lower than if a kernel-mode device driver had made the call, because the PreviousMode of the current thread is only changed for a kernel-mode caller.
Down the line, after what is necessary has been carried out, KiSystemServiceGdiTebAccess (NTOSKRNL) will be called. Other routines such as KiSystemServiceCopyEnd (NTOSKRNL) will finally be called, which eventually leads to KiSystemServiceExit (NTOSKRNL) being called, which in turn leads to another usage of the SWAPGS instruction and a SYSRET instruction being executed. Of course, anything like saved debug registers will be restored, the MXCSR register will be updated again, and so forth.
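The trust check can be pictured with a small simulation. To be clear about assumptions: the address boundary constant, the status values and the function nt_read_memory_sim are all invented for this sketch, and the real validation performed by individual Nt* routines is more involved than a single comparison.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { KernelModeSim = 0, UserModeSim = 1 } MODE_SIM;

/* Hypothetical boundary: addresses at or above this count as kernel memory. */
#define KERNEL_SPACE_START_SIM 0xFFFF800000000000ull

#define STATUS_OK_SIM       0
#define STATUS_DENIED_SIM (-1)

/* Sketch of the check an Nt* routine makes before it touches an address
   supplied by the caller. */
static int nt_read_memory_sim(MODE_SIM previous_mode, uint64_t address)
{
    assert(previous_mode == KernelModeSim || previous_mode == UserModeSim);

    if (previous_mode == UserModeSim && address >= KERNEL_SPACE_START_SIM)
        return STATUS_DENIED_SIM;   /* user-mode callers may not name kernel addresses */

    return STATUS_OK_SIM;           /* kernel-mode callers are trusted */
}
```

The same kernel address is accepted or rejected purely on the PreviousMode value, which is why a user-mode process cannot simply hand the kernel image base to a Native API routine and read it back.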
To help explain the situation regarding the PreviousMode, let’s take a look at the ZwRemoveIoCompletion (NTOSKRNL) routine.
The above means the following:
1. Disables maskable interrupts on the current processor (performed via the cli instruction)
2. Reads the EFLAGS (Status and Control) register
3. Execution flow is directed to KiServiceInternal (NTOSKRNL)
I’ve highlighted some parts as you can see. I’ve done this to direct focus on the parts I wish to explain. The routine gets the base address of the PKTHREAD structure for the current thread by moving it into the RBX register. See the first two highlight boxes in red.
Following this, the PreviousMode value gets backed up into a local variable because the value at the address of the PKTHREAD structure (the address is the value of the RBX register) + the offset to reach the PreviousMode entry is assigned to a local variable. Notice the third highlight with the movzx instruction.
Finally, regarding the PreviousMode for this moment in time, the value of 0 is being set as the PreviousMode of the current thread in the fourth highlight. 0 represents KernelMode and 1 represents UserMode. Since the caller is trusted, due to being other kernel-mode software, the system ensures the PreviousMode is indeed KernelMode when Zw* is used, as a safety precaution.
To be completely clear, PKTHREAD is a pointer to the KTHREAD structure of the current thread. When I used the term “opaque” earlier, I was referring to the practice of leaving the structure undocumented, so that the fields cannot be accessed or altered through the structure definition. It’s closed off for such access in the Windows Driver Kit (WDK). Of course, you can still use the PKTHREAD data-type (as you can with PEPROCESS), but you won’t be able to access or alter the members without knowing the offsets.
The last highlighted box is regarding the execution flow being changed again. Once the PreviousMode has been dealt with, and thus the Windows Kernel will know the request is from a trusted source when the Native API routine is executed, KiSystemServiceStart is going to be called.
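The back-up, set and dispatch sequence walked through above can be condensed into a short simulation. KTHREAD_SIM holds a single field, whereas the real KTHREAD has many more at offsets that vary between builds. nt_routine_sim and zw_routine_sim are hypothetical names, and the restore step at the end is my assumption about what happens on the way out rather than something shown in the screenshot.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { KernelModeSim = 0, UserModeSim = 1 } MODE_SIM;

typedef struct {
    MODE_SIM PreviousMode;   /* the real KTHREAD has many more fields */
} KTHREAD_SIM;

/* Stand-in for an Nt* routine: reports which trust level it observed. */
static MODE_SIM nt_routine_sim(const KTHREAD_SIM *thread)
{
    return thread->PreviousMode;
}

/* Stand-in for the Zw* path through KiServiceInternal. */
static MODE_SIM zw_routine_sim(KTHREAD_SIM *thread)
{
    MODE_SIM saved;
    MODE_SIM observed;

    assert(thread != NULL);
    saved = thread->PreviousMode;           /* back up the current value */
    thread->PreviousMode = KernelModeSim;   /* mark the caller as trusted */
    observed = nt_routine_sim(thread);      /* dispatch to the real routine */
    thread->PreviousMode = saved;           /* assumption: restored on exit */
    return observed;
}
```

Whatever the thread's PreviousMode was beforehand, the routine reached through the Zw* path always observes KernelMode, while a direct call observes the unmodified value.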
KiSystemServiceStart doesn’t do much in itself; it calls KiSystemServiceRepeat. KiSystemServiceRepeat gets the address of the KeServiceDescriptorTable and KeServiceDescriptorTableShadow. This is necessary because the addresses of the Native API routines which can be invoked from user-mode via a system call are held within the KeServiceDescriptorTable (also known as the System Service Descriptor/Dispatch Table, SSDT for short). The Shadow variant holds the same Native API routine addresses, but also holds the addresses of the WIN32K routines. The user-mode module WIN32U needs a way to have the Windows Kernel execute Win32k routines for various functionality. NTDLL isn’t the only one.
We’ve covered the following since I postponed the continuation of KiInitializeBootStructures analysis.
1. What KiSystemCall32/KiSystemCall64 is used for.
2. What NTDLL is used for and how it shares a relationship with the Windows Kernel.
3. How Windows originally enforced (and still does for the record) Trust level validation between kernel-mode and user-mode Native API calls.
As a final summary in the case of a misunderstanding with #3, kernel-mode software should be using Zw* because this will allow the kernel-mode software to be treated as a trusted source, which it should be. Kernel-mode software does not trigger KiSystemCall32 or KiSystemCall64, this only happens when a user-mode program needs the Windows Kernel to execute a Native API routine (which has an export entry in the user-mode module, NTDLL) and thus executes a system call. KiSystemCall32 and KiSystemCall64 will both end up landing at KiSystemServiceRepeat, in which the correct NTAPI address is extracted from the KeServiceDescriptorTable, however the difference is that for kernel-mode software, the PreviousMode will already have been updated for the current thread. When the Nt* routine is executed, it will check the PreviousMode so it understands the trust level of the caller. The Zw* routines in kernel-mode are not the real routines, they are wrappers which ensure another routine which will update the PreviousMode is called. In user-mode, Nt/Zw has no difference. The other difference is that kernel-mode software will have a different code execution path, since as already noted, it doesn’t pass through KiSystemCall32 or KiSystemCall64.
We can move back to KiInitializeBootStructures now and finalise this thread.
If we take a look at KiInitializeBootStructures on a recent version of NTOSKRNL, but without the recent patch update which was released due to the Meltdown vulnerability, we will see that the address of KiSystemCall32 and KiSystemCall64 is going to be pointed to by two Model Specific Registers (MSRs).
The __writemsr routine is actually a compiler intrinsic, and its back-end implementation relies on the wrmsr Assembly instruction. There’s another intrinsic called __readmsr for reading the value of an MSR.
The 0xC0000083 is for the IA32_CSTAR MSR. This MSR is used for the SYSCALL instruction when it is executed in compatibility mode only. Whereas, the 0xC0000082 is for the IA32_LSTAR MSR. The former is going to point to the address of KiSystemCall32 and the latter is going to point to the address of KiSystemCall64. You can learn quite a bit about this from the OSDEV Wiki page, I’ll be sure to leave a link at the end.
Continuing our analysis, we can evidently see that two MSRs are pointed to an address (one different to the other) at system start-up, because KiInitializeBootStructures is one of the first routines executed when the Windows Kernel is started up into memory. Remember, all of this needs to happen before a single user-mode process can even think about being executed, because it would need system call support, which is being set up at this moment in time.
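The pre-patch set-up can be summarised in a user-mode simulation. Real kernel code would call the __writemsr intrinsic; here writemsr_sim and readmsr_sim are stand-ins which store the values in two globals, and init_syscall_msrs_prepatch is a hypothetical name for the relevant part of KiInitializeBootStructures. The handler addresses passed in are placeholders.

```c
#include <assert.h>
#include <stdint.h>

#define IA32_LSTAR 0xC0000082u   /* 64-bit SYSCALL entry point */
#define IA32_CSTAR 0xC0000083u   /* compatibility-mode SYSCALL entry point */

/* Simulated MSR storage; the real registers live in the processor. */
static uint64_t g_lstar_value;
static uint64_t g_cstar_value;

static void writemsr_sim(uint32_t msr, uint64_t value)
{
    assert(msr == IA32_LSTAR || msr == IA32_CSTAR);
    if (msr == IA32_LSTAR)
        g_lstar_value = value;
    else
        g_cstar_value = value;
}

static uint64_t readmsr_sim(uint32_t msr)
{
    assert(msr == IA32_LSTAR || msr == IA32_CSTAR);
    return msr == IA32_LSTAR ? g_lstar_value : g_cstar_value;
}

/* Pre-patch behaviour as described: both MSRs receive the original handlers. */
static void init_syscall_msrs_prepatch(uint64_t ki_system_call32, uint64_t ki_system_call64)
{
    writemsr_sim(IA32_CSTAR, ki_system_call32);
    writemsr_sim(IA32_LSTAR, ki_system_call64);
}
```

Nothing is conditional in this version: whatever addresses KiSystemCall32 and KiSystemCall64 have are written straight into the two registers.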
We are going to switch back over to the patched and updated NTOSKRNL now and see the difference in the KiInitializeBootStructures routine.
I’ve highlighted the difference, this chunk of code was added into the KiInitializeBootStructures routine.
Firstly, a routine called KiEnableKvaShadowing (which has just been introduced with the recent patch as well) is being called.
Secondly, a flag named KiKvaShadow is being checked. If the flag check for KiKvaShadow is TRUE, a local variable will be assigned the address of KiSystemCall32Shadow, and another local variable will be assigned the address of KiSystemCall64Shadow. This is another change: local variables are now set up at the start of the function to store the addresses of whichever KiSystemCall32 and KiSystemCall64 variants are going to be used.
Thirdly, if the KiKvaShadow flag was indeed TRUE and the local variables holding the addresses of KiSystemCall32 and KiSystemCall64 have been replaced with the new versions carrying the “Shadow” suffix, but another check identifies the CPU as being AMD, then the value of the local variables will be changed again from KiSystemCall32Shadow and KiSystemCall64Shadow to KiSystemCall32AmdShadow and KiSystemCall64AmdShadow.
By default, the local variables are given the normal, original addresses of KiSystemCall32 and KiSystemCall64. This means that if the KiKvaShadow flag comes back as FALSE, the system calls performed by user-mode software will behave exactly as they always have over the past few years, because the normal code we are used to will be executed in the kernel each time. However, if the flag comes back as TRUE, the addresses are changed. If an AMD CPU is detected after this, the addresses held in the local variables are switched again, to another KiSystemCallXx alternative which appears to be specifically designed for AMD. Whereas, if you have any processor other than AMD, and the KiKvaShadow flag comes back as TRUE, you’ll get the normal KiSystemCallXxShadow address.
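The three-way decision described above boils down to very little logic. In the sketch below, handler_kind and choose_handler are names I made up, and the AMD check is modelled as a comparison against the CPUID vendor string “AuthenticAMD”. How NTOSKRNL actually identifies an AMD processor is not something I have confirmed, so treat that part as an assumption.

```c
#include <assert.h>
#include <string.h>

typedef enum {
    HANDLER_ORIGINAL,     /* KiSystemCall32 / KiSystemCall64 */
    HANDLER_SHADOW,       /* KiSystemCall32Shadow / KiSystemCall64Shadow */
    HANDLER_AMD_SHADOW    /* KiSystemCall32AmdShadow / KiSystemCall64AmdShadow */
} handler_kind;

/* Assumed vendor check; the caller supplies the CPUID vendor string. */
static int vendor_is_amd(const char *vendor)
{
    assert(vendor != NULL);
    return strcmp(vendor, "AuthenticAMD") == 0;
}

/* Mirrors the described flow: start with the original handlers, switch to
   the Shadow ones when KiKvaShadow is set, then to AmdShadow on AMD. */
static handler_kind choose_handler(int kva_shadow_enabled, const char *vendor)
{
    handler_kind kind = HANDLER_ORIGINAL;   /* the default */

    if (kva_shadow_enabled) {
        kind = HANDLER_SHADOW;
        if (vendor_is_amd(vendor))
            kind = HANDLER_AMD_SHADOW;
    }
    return kind;
}
```

Note that the vendor only matters once the flag is set: with KiKvaShadow at FALSE, an AMD system and an Intel system both end up with the original handlers.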
As previously with the pre-patch NTOSKRNL, the MSRs for IA32_CSTAR and IA32_LSTAR are written to via the __writemsr macro. This is the last step with this, but we are going to back-track briefly in a few minutes. The local variable used to store the KiSystemCall32 variant which is going to be applied for system calls is written to the IA32_CSTAR MSR and the IA32_LSTAR MSR is written to using the other local variable for the KiSystemCall64 variant which is to be targeted.
If a patch for Meltdown is implemented from hardware level then KiKvaShadow may not be kept enabled for that machine and thus the original KiSystemCall32 and KiSystemCall64 is likely to be used, however if the system is detected as being vulnerable then KiSystemCall32Shadow and KiSystemCall64Shadow is used, unless you’re an AMD user of course. Then you get that fancy AmdShadow variant.
Indeed, the flag is set to 1 (which will cause a TRUE response with the flag check after this routine has been called) in the KiEnableKvaShadowing routine.
There are actually many new routines which have come with this new patch update, and there’s even a new section in the NTOSKRNL Portable Executable. The new section is dubbed “KVASCODE” and there are 143 routines in this section at the time of posting this.
Here are some of the new routines added within the latest patch update.
If you’re wondering about KiSystemCall32ShadowCommon and KiSystemCall64ShadowCommon, KiSystemCall32Shadow calls the former and KiSystemCall64Shadow calls the latter. KiSystemCall32AmdShadow and KiSystemCall64AmdShadow call the same respective ShadowCommon routines as their non-AMD counterparts.
In the end, regardless of whether your system has an Intel, AMD or other manufacturer based CPU, the code path meets after one routine. For example, if you have AMD then you will be using the KiSystemCallXxAmdShadow routine, but this routine will call KiSystemCallXxShadowCommon, which the non-AMD KiSystemCallXxShadow routines will call as well. The only difference is a few operations before the call. “Xx” being “32” or “64”.
I’ve compared the two routines, and the non-AMD and AMD versions hardly differ; the changes between the two wrappers being called (AmdShadow/non-AMD Shadow) are very minimal.
- SWAPGS Assembly instruction – this is used because the kernel stack is unavailable and kernel data structures will need to be accessed, etcetera.
- An offset and a segment is provided with the MK_FP macro and a far pointer from this calculation is returned.
- For the non-AMD version, a bit test conditional check has to be performed, however this is not done in the AMD one. If the bit test check is successful in the non-AMD one, or unconditionally in the AMD one where the check doesn’t happen, the __writecr3 intrinsic is used to write to the CR3 register.
- The rest for both of them is just some small MK_FP macro usage for either changing memory or querying from it.
Between the non-AMD one and the AMD one, there are really just pointer manipulation differences. The difference between them in terms of performance shouldn’t even be noticeable, and straight after this routine, the KiSystemCallXxShadowCommon routine is called, regardless of whether it’s the AMD one or not. This means that for AMD processors, as well as any others using the patch feature, the code execution path is the same for the rest of the system call operation being handled by the Windows Kernel.
This also makes me believe what Intel have said regarding the performance difference, because it isn’t all that different to my eye looking into this. Code execution will still land at KiSystemServiceRepeat, after having passed through routines like KiSystemServiceUser (which the original KiSystemCall64 does as well). I’ll be sure to do a proper bench-mark test once I get the patch sorted.
This thread also demonstrates an idea of how you could update Windows and not have the feature enforced even if it needs to be, but I don’t see why one would do that given the damage that the patched vulnerability can do. This new patch update improves security considerably. However, you would just re-patch the IA32_CSTAR and IA32_LSTAR MSRs to point to KiSystemCall32 and KiSystemCall64, not the Shadow variants. You’d need a work-around for PatchGuard on 64-bit though, so if you’re using virtualisation (e.g. a hypervisor) then it is definitely a simpler job. As I said though, this is not recommended. The vulnerability can cause real damage because, despite the Meltdown exploit not allowing write access to the kernel, it still allows read access, which can lead to data theft.
Anyway, there is a lot to this patch, and this thread doesn’t explain how the patch actually mitigates the recent vulnerability; that was not its purpose. I don’t know all the facts of it either, and there is so much in this latest patch which is still to be uncovered. All I have done is show that it is enforced for AMD processors and explain how this changes things for system calls. Here are some links to learn more about the vulnerabilities in general, and also the OSDEV Wiki which can be a helpful resource:
Remember to take this article with a grain of salt, so please do not just assume that anything said here is 100% truthful. This is simply based on my current checking, and I can make mistakes as well. More of the facts about everything likely won’t be out in the air for many months to come from now, and I am sure thousands of researchers are probably reversing parts of the recent patch update or trying to replicate exploitation of various vulnerabilities for research purposes.