Theory: EDR Syscall hooking and Ghost Hunting, my approach to detection

A deep exploration of Windows EDR syscall hooking with a new “Ghost Hunting” theory for detecting direct and indirect syscalls, plus insights into kernel-level callbacks and communication between usermode and driver components.

TL;DR

I have moved the Ghost Hunting technique into the kernel - you can see the merge of the refactor here, or to see the specific source file, check here.

I’m writing this from the future, having already implemented one Ghost Hunting technique. Here is what sets my technique apart:

If NTDLL gets remapped by malware (i.e. a fresh copy loaded from the system into the process) - it will not affect the ability of Sanctum to detect the malicious behaviour. If anything, this will raise the risk score generated.
This technique can detect direct syscalls, indirect syscalls, Hell’s Gate, etc.
This technique thwarts dynamic resolution of syscall numbers at runtime, as we overwrite the entire stub (including the SSN) with NOPs before writing our jump instruction.
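To make that last point concrete, here is a minimal sketch of the patch as a pure byte-buffer operation. It is not Sanctum's actual code: the 32-byte stub size, the `patch_stub` name and the `mov rax, imm64; jmp rax` encoding are my assumptions for illustration, and a real implementation would also need to change the page protection before writing.

```rust
/// Assumed size of one x64 ntdll syscall stub slot (illustrative).
const STUB_LEN: usize = 32;
/// Single-byte x86 NOP.
const NOP: u8 = 0x90;

/// Overwrites the whole stub with NOPs, then writes an absolute jump to
/// `hook_addr` at the start of the stub.
///
/// Jump encoding (12 bytes):
///   48 B8 <imm64>   mov rax, hook_addr
///   FF E0           jmp rax
fn patch_stub(stub: &mut [u8; STUB_LEN], hook_addr: u64) {
    // Wipe everything first, so the original `mov eax, SSN` is gone and
    // cannot be scraped out of the stub at runtime.
    stub.fill(NOP);

    stub[0] = 0x48;
    stub[1] = 0xB8;
    stub[2..10].copy_from_slice(&hook_addr.to_le_bytes());
    stub[10] = 0xFF;
    stub[11] = 0xE0;
}
```

Because the NOP fill happens before the jump is written, nothing of the original stub survives, which is what defeats techniques that read the SSN back out of NTDLL in memory.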

Intro

Alright, the foundations of the Sanctum EDR are now set up, including:

The Driver and IOCTL, process handle interception, process creation interception
Driver control
Injecting DLLs into newly created processes
Process monitoring in the usermode engine
Communication regarding processes between driver and usermode engine
GUI
Logging

And now we can begin with the actual research and deepening of concepts, aka the fun bit! As ever, if interested you can check out this project at my GitHub.

Syscall hooking

I have talked in detail about syscalls and how malware abuses them in my Hell’s Gate blog post. I won’t cover old ground in depth, but by way of introduction: whenever we perform an action in usermode, we somehow need to talk to the Windows kernel to make it do the thing.

The way Microsoft implements this in Windows is to have an interface library (ntdll.dll) between ‘user land’ and ‘kernel land’, which allows the user application to pass a number (the SSN, or System Service Number) which in turn maps to a function in the kernel.

One way malware tries to go undetected on a system is to directly call these syscalls to make the kernel perform functionality which ordinarily would look suspicious. This is mostly relevant when trying to bypass an EDR (Endpoint Detection and Response, basically corporate antivirus).

For example, using remote process DLL injection, the following needs to occur:

Call OpenProcess on the target process, let’s say the old favourite, Notepad.exe.
Call VirtualAllocEx to allocate space in the target process for the DLL path you wish to inject.
Call WriteProcessMemory to write the DLL path.
Call CreateRemoteThread (or use APC Queue Hijacking) to spawn a thread in the target which loads the DLL.

Now, given this is a ‘routine’ pattern for malware performing your bog standard, run of the mill DLL injection, an EDR has the ability to ‘hook’ these functions in userland just before they make the syscall so it can inspect parameters, and any other environment data that it collects.
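To illustrate why this pattern is so easy to reason about once the calls are hooked, here is a toy sketch of matching the four-step chain against a per-process call log. The `ApiCall` enum, the `INJECTION_CHAIN` constant and the function name are hypothetical and exist only for this example; a real EDR would also compare the target process and the arguments, not just the call names.

```rust
/// Hooked API calls observed for a single process (illustrative subset).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ApiCall {
    OpenProcess,
    VirtualAllocEx,
    WriteProcessMemory,
    CreateRemoteThread,
    Other,
}

/// The classic remote DLL injection chain, in order.
const INJECTION_CHAIN: [ApiCall; 4] = [
    ApiCall::OpenProcess,
    ApiCall::VirtualAllocEx,
    ApiCall::WriteProcessMemory,
    ApiCall::CreateRemoteThread,
];

/// Returns true if `calls` contains the injection chain as an ordered
/// subsequence; unrelated calls in between are ignored.
fn looks_like_dll_injection(calls: &[ApiCall]) -> bool {
    let mut next = 0;
    for call in calls {
        if next < INJECTION_CHAIN.len() && *call == INJECTION_CHAIN[next] {
            next += 1;
        }
    }
    next == INJECTION_CHAIN.len()
}
```

A log of OpenProcess, some unrelated call, VirtualAllocEx, WriteProcessMemory, CreateRemoteThread would match, whereas the same calls out of order would not.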

Casting our eyes over the function OpenProcess, the syscall for this lives in NtOpenProcess (also exported as ZwOpenProcess) in ntdll.dll, and it looks like this:

[Image: disassembly of the NtOpenProcess syscall stub]

Here, you can see the SSN (System Service Number) is 0x26 (SSNs can change between Windows builds), which will correspond to the kernel’s receiver for an application calling OpenProcess.
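For reference, an unhooked x64 stub begins with `mov r10, rcx` (bytes 4C 8B D1) followed by `mov eax, imm32` (byte B8, then the SSN as a little-endian 32-bit value). The sketch below reads the SSN out of a byte slice laid out that way, which is roughly what dynamic resolution techniques do against NTDLL in memory. The `read_ssn` name is my own, and the function works on a plain buffer rather than live process memory.

```rust
/// Reads the SSN from the start of an unhooked x64 syscall stub.
///
/// Expected prologue:
///   4C 8B D1         mov r10, rcx
///   B8 <imm32>       mov eax, SSN
///
/// Returns `None` if the prologue does not match, for example because
/// the stub has been hooked or overwritten.
fn read_ssn(stub: &[u8]) -> Option<u32> {
    match stub {
        [0x4C, 0x8B, 0xD1, 0xB8, a, b, c, d, ..] => {
            Some(u32::from_le_bytes([*a, *b, *c, *d]))
        }
        _ => None,
    }
}
```

Given the bytes `4C 8B D1 B8 26 00 00 00`, this returns `Some(0x26)`; given a stub that no longer starts with that prologue, it returns `None`.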

So, what we can do now is overwrite these instructions with a jump into a DLL we have injected into the process, so we can examine the function parameters (and any other environment information) that we like. In Sanctum EDR, currently we use IPC to communicate between the engine and the GUI; so one option we have is to use IPC to communicate from a process that’s hit the trigger, to talk to the usermode engine, which can use IOCTLs if it needs to talk to the driver, and can then respond back to the DLL via IPC.

The approach

I’ve thought long and hard about how I want to implement the hooking and inspection logic, and this so far is the best method I have come up with to do so. I’ll be using ‘vanilla’ remote process DLL injection as the testing candidate for detection.

There are 3 levels of depth to my approach for a ‘defence in depth’ strategy for detecting and blocking this behaviour. Here is what I am looking at:

Level	Description	Component
1	Syscall hooking and execution flow redirection to examine properties	Injected EDR’s DLL, Usermode Engine
2	Kernel level callbacks	Driver
3	Communication	DLL, Engine, Driver

Taking DLL injection as our use case, we can diagram the attack, and thus opportunity for EDR intervention like so:

[Image: EDR diagram]

Ghost hunting

Now, the interesting part of this project is going into areas which are not heavily documented, or which I have not read about before whilst doing my own research & learning. How do we make the EDR tick? What systems and logic can we build to detect bad activity?

The first thing I want to theorise and reason about, which I am calling Ghost Hunting, is a technique I have thought up which looks at directly combatting the avoidance of EDR hooks through techniques like SysWhispers, Hell’s Gate, direct syscalls, and indirect syscalls. Traditionally, from what I have read in prior research, one good way of detecting syscall abuse by malware is to look at the call stack. There was also an interesting article by Palo Alto on Cortex XDR, which seems to hook the kernel side of the syscall; this is a nice approach. My approach (Ghost Hunting) is different again.

NOTE TO THE READER: This technique is experimental, and simply an area of research I wish to test. In writing about this, I am expressing my inner dialogue and thought process whilst investigating this topic. What lies beyond is not written as a proven technique, but instead a theory which I wish to build and test. Ultimately, this is what the process of security research looks like. This research may amount to nothing. That said, whilst the theory behind this technique may be flawed, we will still build my own implementation of syscall hooking, so please, if you wish to at least join me on that journey, you are more than welcome!

The Ghost Hunting technique looks like this (using a hooked OpenProcess as our example):

On new process start, inject the EDR’s DLL into the process.
The injected DLL will reflectively have to resolve function pointers to its own hook receiver functions, storing them in a struct for step 3.
Overwrite the syscall stub we wish to hook with a jmp instruction to the Virtual Address of the function hook receiver.
From the hook receiver, inspect arguments, communicate with the engine - is the process at a high risk level? What else has it done?
The engine will communicate with the EDR driver to signal an OpenProcess syscall is incoming.
Ghost hunt: if a handle has been issued by the kernel before we make the syscall, we know some form of syscall evasion has taken place.
1. If abuse is detected, kill the process.
2. If no handle was yet issued, then the process is playing by the rules and not trying to evade the EDR.
At some point, either after point 5 or point 7, the EDR DLL will make the syscall to make it more difficult for malware to simply patch the jmp and continue execution to the syscall.
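The bookkeeping behind the ghost hunt in the steps above could be sketched roughly as follows. This is a simplified model under my own assumptions, not the implementation: `GhostHunter`, `announce`, `on_handle_issued` and `Verdict` are hypothetical names, calls are keyed only by (caller PID, target PID), and the real signals would arrive over IPC and from a driver callback rather than as direct method calls.

```rust
use std::collections::HashMap;

type Pid = u32;

#[derive(Debug, PartialEq, Eq)]
enum Verdict {
    /// The kernel event was preceded by a hooked call we were told about.
    Expected,
    /// The kernel issued a handle we never saw pass through our hook.
    Ghost,
}

#[derive(Default)]
struct GhostHunter {
    /// (caller, target) -> hooked calls announced but not yet matched.
    pending: HashMap<(Pid, Pid), u32>,
}

impl GhostHunter {
    /// The hook receiver reports that an OpenProcess syscall is incoming.
    fn announce(&mut self, caller: Pid, target: Pid) {
        *self.pending.entry((caller, target)).or_insert(0) += 1;
    }

    /// The driver reports that a process handle was issued.
    fn on_handle_issued(&mut self, caller: Pid, target: Pid) -> Verdict {
        match self.pending.get_mut(&(caller, target)) {
            Some(count) if *count > 0 => {
                // Consume one announced call for this caller/target pair.
                *count -= 1;
                Verdict::Expected
            }
            _ => Verdict::Ghost,
        }
    }
}
```

An announced call followed by a handle event yields `Verdict::Expected`; a handle event with nothing announced yields `Verdict::Ghost`. Counting per pair is only a crude answer to several requests arriving at once, since it cannot say which handle belongs to which call.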

There are two obvious downsides to this methodology:

Since we are subscribing to process handle events, there are multiple NTAPI functions which could cause the creation or duplication of a process handle so each and every one would need to be hooked. This is a lot of work.
Tracking which handle creation relates to which API call will be difficult; what if there are 5 requests from 1 process all at once to the same target process (or itself), how do you align these requests?

As stipulated above, this technique is a theory, and may not work. The detail of the implementation will be dealt with later once we have achieved syscall hooking.

Whilst the downsides are present for OpenProcess, they are much less cumbersome when dealing with the CreateRemoteThread API - that will be a much better candidate for the Ghost Hunting technique. When we deal with the implementation detail of Ghost Hunting we can examine the difference between using the technique on CreateRemoteThread vs OpenProcess. We can also explore some Event Tracing for Windows: Threat Intelligence subscriptions we can attach to in the kernel to look at other ‘areas’ of the attack flow, which again can use the Ghost Hunting technique to aid syscall evasion detection.

Why Ghost Hunting?

I have named it as such for 2 reasons. First - Ghost Hunting here refers to the fact a syscall made it to the kernel without going through our hooked function, if you like, a ghost. Second - I’m about to go on an IRL ghost hunt with some friends and it feels topical.

Next steps

There’s a lot to digest here, but the next obvious step is to implement syscall hooking which is a challenge in itself. Once a basic POC is working for that, I will likely create some internal tooling within the EDR to streamline the hooking of functions.

Finally, once the hooks are working properly and we can make syscalls from the DLL without breaking execution, we will look more closely at Ghost Hunting!