Theory: EDR Syscall hooking and Ghost Hunting, my approach to detection
A deep exploration of Windows EDR syscall hooking with a new “Ghost Hunting” theory for detecting direct and indirect syscalls, plus insights into kernel-level callbacks and communication between usermode and driver components.
Intro
Alright, the foundations of the Sanctum EDR are now set up, which include:
- The Driver and IOCTL, process handle interception, process creation interception
- Driver control
- Injecting DLLs into newly created processes
- Process monitoring in the usermode engine
- Communication regarding processes between driver and usermode engine
- GUI
- Logging
And now we can begin with the actual research and deepening of concepts, aka the fun bit! As ever, if interested you can check out this project at my GitHub.
Syscall hooking
I have talked in detail about syscalls and how malware abuses them in my Hell’s Gate blog post. I won’t cover old ground in depth, but as somewhat of an introduction: whenever we perform an action in usermode, we somehow need to talk to the Windows kernel to make it do the thing.
The way Microsoft implements this in Windows is to have an interface library between ‘user land’ and ‘kernel land’ which allows the user application to call a number (SSN - System Service Number) which in turn maps to a function in the kernel.
One way malware tries to go undetected on a system is to directly call these syscalls to make the kernel perform functionality which ordinarily would look suspicious. This is mostly relevant when trying to bypass an EDR (Endpoint Detection and Response, basically corporate antivirus).
For example, using remote process DLL injection, the following needs to occur:
- Call OpenProcess on the target process, let’s say the old favourite, Notepad.exe.
- Call VirtualAllocEx to allocate space in the target process for the DLL path you wish to inject.
- Call WriteProcessMemory to write the DLL path.
- Call CreateRemoteThread (or use APC Queue Hijacking) to spawn a thread which loads the DLL.
Now, given this is a ‘routine’ pattern for malware performing your bog standard, run of the mill DLL injection, an EDR has the ability to ‘hook’ these functions in userland just before they make the syscall so it can inspect parameters, and any other environment data that it collects.
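To make the idea of a ‘routine’ pattern concrete, here is a minimal sketch of how hooked calls could be matched against the injection chain above. This is not how Sanctum does it; `InjectionTracker`, `ApiCall` and `observe` are hypothetical names, and the sketch assumes the hook has already resolved each handle to a caller and target PID.

```rust
use std::collections::HashMap;

/// The four API calls that make up the classic remote DLL injection chain.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ApiCall {
    OpenProcess,
    VirtualAllocEx,
    WriteProcessMemory,
    CreateRemoteThread,
}

/// The order in which the calls are expected to appear.
const CHAIN: [ApiCall; 4] = [
    ApiCall::OpenProcess,
    ApiCall::VirtualAllocEx,
    ApiCall::WriteProcessMemory,
    ApiCall::CreateRemoteThread,
];

/// Hypothetical tracker: maps (caller pid, target pid) to how far along the
/// chain that pair has progressed.
#[derive(Default)]
pub struct InjectionTracker {
    progress: HashMap<(u32, u32), usize>,
}

impl InjectionTracker {
    /// Record one hooked call. Calls that are not the next expected step are
    /// ignored. Returns true once the full chain has been seen for this pair.
    pub fn observe(&mut self, caller: u32, target: u32, call: ApiCall) -> bool {
        let step = self.progress.entry((caller, target)).or_insert(0);
        if *step < CHAIN.len() && CHAIN[*step] == call {
            *step += 1;
        }
        *step == CHAIN.len()
    }
}
```

Feeding the four calls in order for the same caller and target makes the final `observe` return `true`; a lone `CreateRemoteThread` does not. A real engine would weigh far more context than call order alone.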
Casting our eyes over the function OpenProcess, the syscall for this lives in ZwOpenProcess of Ntdll.dll, and it looks like this:
Here, you can see the SSN (System Service Number) is 0x26, which will correspond to the kernel’s receiver for an application calling OpenProcess.
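As a rough illustration of where that number sits, the sketch below pulls the SSN out of a stub’s raw bytes. The byte layout in `SAMPLE_STUB` (`mov r10, rcx; mov eax, imm32; ...; syscall; ret`) is what I would typically expect on recent x64 builds of Windows, but treat it as an assumption: it can vary between builds, and `read_ssn` is a hypothetical helper rather than anything from the project.

```rust
/// Assumed layout of an unhooked x64 syscall stub, using the SSN 0x26
/// mentioned above. Real bytes should be read from ntdll at runtime.
pub const SAMPLE_STUB: [u8; 24] = [
    0x4C, 0x8B, 0xD1,                               // mov r10, rcx
    0xB8, 0x26, 0x00, 0x00, 0x00,                   // mov eax, 0x26 (the SSN)
    0xF6, 0x04, 0x25, 0x08, 0x03, 0xFE, 0x7F, 0x01, // test byte ptr [0x7FFE0308], 1
    0x75, 0x03,                                     // jne to the int 2Eh path
    0x0F, 0x05,                                     // syscall
    0xC3,                                           // ret
    0xCD, 0x2E,                                     // int 2Eh
    0xC3,                                           // ret
];

/// Returns the SSN if `stub` starts with `mov r10, rcx; mov eax, imm32`.
/// Returns None if the prologue differs, e.g. because it has been patched.
pub fn read_ssn(stub: &[u8]) -> Option<u32> {
    const PROLOGUE: [u8; 4] = [0x4C, 0x8B, 0xD1, 0xB8];
    if stub.len() < 8 || !stub.starts_with(&PROLOGUE) {
        return None;
    }
    // The 32-bit immediate follows the opcode, stored little-endian.
    Some(u32::from_le_bytes([stub[4], stub[5], stub[6], stub[7]]))
}
```

A stub whose first bytes have been replaced, for example by a `jmp`, no longer matches the prologue, so `read_ssn` returns `None`. That is roughly the check Hell’s Gate style tooling relies on to spot a hooked stub.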
So, what we can do now is overwrite these instructions with a jump into a DLL we have injected into the process, so we can examine the function parameters (and any other environment information) that we like. In Sanctum EDR, we currently use IPC to communicate between the engine and the GUI, so one option is to also use IPC between a process that has hit the hook and the usermode engine; the engine can use IOCTLs if it needs to talk to the driver, and can then respond back to the DLL via IPC.
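One common way to encode such a jump on x64 is `mov rax, imm64; jmp rax`, which takes 12 bytes and can reach any address. The sketch below builds that patch and applies it to a plain byte buffer standing in for the stub. `build_jmp_patch` and `apply_patch` are hypothetical helpers, and this is not necessarily the encoding Sanctum will end up using. Patching live memory would also need the page protection changed first, which is left out here.

```rust
/// Builds `mov rax, imm64; jmp rax` targeting `hook_addr`.
pub fn build_jmp_patch(hook_addr: u64) -> [u8; 12] {
    let mut patch = [0u8; 12];
    patch[0] = 0x48; // REX.W prefix
    patch[1] = 0xB8; // mov rax, imm64
    patch[2..10].copy_from_slice(&hook_addr.to_le_bytes());
    patch[10] = 0xFF; // jmp rax (FF /4)
    patch[11] = 0xE0;
    patch
}

/// Overwrites the first 12 bytes of `stub` with `patch` and returns the bytes
/// that were there before, so they can be restored or replayed later.
/// Returns None if the buffer is too small to hold the patch.
pub fn apply_patch(stub: &mut [u8], patch: &[u8; 12]) -> Option<[u8; 12]> {
    if stub.len() < patch.len() {
        return None;
    }
    let mut original = [0u8; 12];
    original.copy_from_slice(&stub[..12]);
    stub[..12].copy_from_slice(patch);
    Some(original)
}
```

Clobbering `rax` at the top of the stub should be harmless, since the original stub overwrites `eax` with the SSN anyway. Keeping the original bytes matters because the DLL later needs to make the real syscall itself.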
The approach
I’ve thought long and hard about how I want to implement the hooking and inspection logic, and this so far is the best method I have come up with to do so. I’ll be using ‘vanilla’ remote process DLL injection as the testing candidate for detection.
There are 3 levels of depth to my approach for a ‘defence in depth’ strategy for detecting and blocking this behaviour. Here is what I am looking at:
Level | Description | Component |
---|---|---|
1 | Syscall hooking and execution flow redirection to examine properties | Injected EDR’s DLL, Usermode Engine |
2 | Kernel level callbacks | Driver |
3 | Communication | DLL, Engine, Driver |
Taking DLL injection as our use case, we can diagram the attack, and thus opportunity for EDR intervention like so:
Ghost hunting
Now, the interesting part of this project is going into areas which are not heavily documented, or which I have not read about before whilst doing my own research & learning. How do we make the EDR tick? What systems and logic can we build to detect bad activity?
The first thing I want to theorise and reason about, which I am calling Ghost Hunting, is a technique I have thought up to directly combat the avoidance of EDR hooks through techniques like SysWhispers, Hell’s Gate, direct syscalls, and indirect syscalls. Traditionally, from what I have read in prior research, one good way of detecting syscall abuse by malware is to look at the call stack. There was also an interesting article by Palo Alto on Cortex XDR, which seems to hook the kernel side of the syscall; this is a nice approach. My approach (Ghost Hunting) is different again.
NOTE TO THE READER: This technique is experimental, and simply an area of research I wish to test. In writing about this, I am expressing my inner dialogue and thought process whilst investigating this topic. What lies beyond is not written as a proven technique, but instead as a theory which I wish to build and test. Ultimately, this is what the process of security research looks like. This research may amount to nothing. That said, whilst the theory behind this technique may be flawed, we will still build my own implementation of syscall hooking, so please, if you wish to at least join me on that journey, you are more than welcome!
The Ghost Hunting technique looks like this (using a hooked OpenProcess as our example):
1. On new process start, inject the EDR’s DLL into the process.
2. The injected DLL will have to reflectively resolve function pointers to its own hook receiver functions, storing them in a struct for step 3.
3. Overwrite the syscall stub we wish to hook with a jmp instruction to the virtual address of the hook receiver function.
4. From the hook receiver, inspect arguments and communicate with the engine: is the process at a high risk level? What else has it done?
5. The engine will communicate with the EDR driver to signal that an OpenProcess syscall is incoming.
6. Ghost hunt: if a handle has been issued by the kernel before we make the syscall, we know some form of syscall evasion has taken place.
7. If abuse is detected, kill the process.
8. If no handle was yet issued, then the process is playing by the rules and not trying to evade the EDR.
9. At some point, either after point 5 or point 7, the EDR DLL will make the syscall itself, to make it more difficult for malware to simply patch the jmp and continue execution to the syscall.
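The core of the ghost hunt can be sketched as a small piece of bookkeeping: the engine announces an incoming syscall, and the driver checks every handle creation against those announcements. Everything here is hypothetical (`GhostHunter`, `announce`, `on_handle_created`, keying on a caller and target PID pair), and it models the theory in plain Rust rather than real driver code.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    /// The handle matches a syscall the hook announced beforehand.
    Expected,
    /// A handle appeared with no announcement: a "ghost".
    Ghost,
}

/// Hypothetical driver-side state: how many announced but not yet observed
/// OpenProcess syscalls exist per (caller pid, target pid) pair.
#[derive(Default)]
pub struct GhostHunter {
    pending: HashMap<(u32, u32), u32>,
}

impl GhostHunter {
    /// Called when the engine signals that a hooked OpenProcess is incoming.
    pub fn announce(&mut self, caller: u32, target: u32) {
        *self.pending.entry((caller, target)).or_insert(0) += 1;
    }

    /// Called when the driver sees a process handle being created.
    /// Consumes one pending announcement if there is one.
    pub fn on_handle_created(&mut self, caller: u32, target: u32) -> Verdict {
        if let Some(count) = self.pending.get_mut(&(caller, target)) {
            if *count > 0 {
                *count -= 1;
                return Verdict::Expected;
            }
        }
        Verdict::Ghost
    }
}
```

An announced call followed by one handle creation yields `Expected`; a second handle creation for the same pair, or any creation from a process that never announced, yields `Ghost`. Counting announcements rather than storing a single flag is one possible way to cope with several requests to the same target, though it cannot tell which handle belongs to which call.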
There are two obvious downsides to this methodology:
- Since we are subscribing to process handle events, there are multiple NTAPI functions which could cause the creation or duplication of a process handle, so each and every one would need to be hooked. This is a lot of work.
- Tracking which handle creation relates to which API call will be difficult; what if there are 5 requests from 1 process all at once to the same target process (or itself), how do you align these requests?
As stipulated above, this technique is a theory, and may not work. The detail of the implementation will be dealt with later once we have achieved syscall hooking.
Whilst the downsides are present for OpenProcess, they are much less cumbersome when dealing with the CreateRemoteThread API, which will be a much better candidate for the Ghost Hunting technique. When we deal with the implementation detail of Ghost Hunting, we can examine the difference between using the technique on CreateRemoteThread vs OpenProcess. We can also explore some Event Tracing for Windows: Threat Intelligence subscriptions we can attach to in the kernel to look at other ‘areas’ of the attack flow, which again can use the Ghost Hunting technique to aid syscall evasion detection.
Why Ghost Hunting?
I have named it as such for 2 reasons. First - Ghost Hunting here refers to the fact a syscall made it to the kernel without going through our hooked function, if you like, a ghost. Second - I’m about to go on an IRL ghost hunt with some friends and it feels topical.
Next steps
There’s a lot to digest here, but the next obvious step is to implement syscall hooking which is a challenge in itself. Once a basic POC is working for that, I will likely create some internal tooling within the EDR to streamline the hooking of functions.
Finally, once the hooks are working properly and we can make syscalls from the DLL without breaking execution, we will look more closely at Ghost Hunting!