Assembly.Lie – Using Transactional NTFS and API Hooking to Trick the CLR into Loading Your Code “From Disk”
Assembly.Load is a method that has been one of the primary reasons for the meteoric rise in offensive tooling written in C# over the past few years. Its most commonly used overload in offensive tooling, Assembly.Load(byte[]), allows for memory-only loading of .NET assemblies (.exe / .dll) directly from a byte array representing the assembly’s contents, effectively granting the ability to reflectively load and execute a program entirely in memory in just 2-3 lines of code. This has enabled all sorts of multi-staged payloads, modular program functionality, and fileless post-exploitation operations.
A few months back, some tooling I was working on caused me to take a closer look into the mechanisms behind loading assemblies into the .NET Common Language Runtime (CLR). I found that while Assembly.Load has several overloads that all share the same managed method name, the unmanaged functions they call vary quite drastically. Through this process, I attempted to determine if it would be possible to intercept the function calls used by the CLR to load an assembly from disk, which would allow us to return spoofed data and trick it into thinking a non-existent assembly was being loaded from an on-disk location. I believed it to be possible because, at the end of the day, regardless of whether an assembly is loaded from an on-disk file or directly from a byte array, the code being executed is simply a series of bytes that exists in a pre-defined region of memory.
Why does this matter? First, because AMSI only scans assemblies loaded directly from (what the CLR believes to be) memory, this technique serves as another category of AMSI bypass. Secondly, it serves as a prototype that can be implemented for other .NET assembly loading methods which provide additional functionality over the traditional Assembly.Load call, but which historically have not been used due to their reliance on loading from disk. Finally, inspection of assemblies loaded into a process via this technique would show that they appear to be backed by a (nonexistent) file on disk.
Mechanics of Assembly.Load
When the CLR encounters an instruction to load an assembly from disk, several steps are immediately taken to attempt to locate it, depending on the overload called and the information provided. The CLR’s most-preferred way of loading an assembly is to be provided a “full” reference in the form of an AssemblyName object. This includes data on the target assembly’s name, version, culture, and public key token (if one exists). In practice, loading assemblies in this way is unnecessary for most use-cases as a “partial” reference consisting of just the assembly’s name can be provided instead. This partial reference means the CLR does not search the global assembly cache for the target assembly, but rather will first check to see if it is already loaded in the current AppDomain. If the assembly has not been previously loaded, the current directory the application is executing from will be searched, after which a “file not found” exception will be thrown if a matching file is not located. This process can be seen below in API Monitor when we execute the following code that looks for a non-existent file (executing as x64, running from Z:\test):
As can be seen above, the GetFileAttributes function is used to attempt to identify either a dll or exe with the provided name. Finding none, we get the expected crash:
Conversely, if a valid file is found, it kicks off a series of function calls to grab additional information and load the file into memory. This primary chain of calls to load an assembly consists of the following:
In this long chain of function calls, of most interest to us are the following:
- GetFileAttributes: Responsible for initial identification of a valid file in the execution directory (does this file exist?).
- GetFileAttributesEx: Queries the file now believed to exist, and provides a pointer to a WIN32_FILE_ATTRIBUTE_DATA structure that gets populated with data on file size and creation/access/modification times.
- CreateFile: Returns a handle to the file that has now been validated as existing through the successful completion of the two prior calls. This call has a dwCreationDisposition value of OPEN_EXISTING, meaning the call will fail if the file does not already exist.
- GetFileInformationByHandle: Uses the handle to the file returned in the prior call. Provides a pointer to a BY_HANDLE_FILE_INFORMATION structure that gets populated with the same data that exists in the WIN32_FILE_ATTRIBUTE_DATA structure created during the initial GetFileAttributesEx call, as well as additional data on volume and file fingerprint. The result gets checked against the other results returned to ensure the handle is to the correct file.
- (another) GetFileAttributesEx: Final check to validate that no changes have been made to the file that is being loaded.
- CreateFileMapping: Takes the file handle returned by CreateFile and creates a file mapping with PAGE_READONLY | SEC_IMAGE protections set.
- MapViewOfFile (or a variant of it): Creates a view to read the mapped file. Due to the SEC_IMAGE flag being set on the file mapping object, (read-only) access is inherited and as a result no additional access can be set.
Tricking the CLR (and beating AMSI along the way)
Note: this section is somewhat of a wall of text regarding the specifics on how this works. I’ve also included a diagram at the bottom that pretty well sums stuff up if that’s more your speed.
Outside of the context of API hooking as performed by EDR vendors and the stubs of code that can be used to unhook them, I didn’t have much experience with implementing my own hooks in code until earlier this year, when @FuzzySec dropped Dendrobate (https://github.com/FuzzySecurity/Dendrobate). This is a cool project that uses the EasyHook library to place a hook onto CreateFile, causing calls to that function from the current process to instead be redirected to an operator-controlled detour method, from where the call can be sent along or have additional operations performed on it. If you’re interested in an in-depth explanation on hooking, I would definitely recommend checking out the official EasyHook documentation here: http://easyhook.github.io/documentation.html.
After gaining an understanding of the calls being made when a load command was executed, my thought was to implement what is essentially a scaled-up version of the code in Dendrobate to hook, modify, and provide faked responses to the calls in the above loading process. This was necessary as each of the calls builds off those prior, meaning any interception and modification of responses needs to start at the beginning of the load process with GetFileAttributes, returning a spoofed response to tell the CLR that a nonexistent file of our choosing does indeed exist in the current execution directory. Assuming we are able to hook unmanaged function calls and thus can intercept / return faked responses, this call is pretty easy to handle. It doesn’t need us to fill any structs with data, and only requires that a single dword be returned. As a result, the code to implement this is straightforward and gives a good example of what a detour method looks like:
In this case, loadedAssemblyName corresponds with the name of the nonexistent assembly we told Assembly.Load to attempt to identify in the current execution directory. As calls to this function are intercepted, this code compares the filename passed in against this initial value. An unsuccessful match indicates other normal program behavior, and so the call is simply forwarded along, whereas a successful match indicates that the CLR is searching for our nonexistent file. We return a value of 32 (0x20) to this call, indicating that the requested file exists and is of type FILE_ATTRIBUTE_ARCHIVE.
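The decision logic of that detour can be modeled without any hooking library. The following is a platform-neutral sketch in Python, not the project's actual code: the `original` callable stands in for the real GetFileAttributes, and matching on the lowercased file name (extension included) is an assumption about how the comparison is done.

```python
import ntpath

FILE_ATTRIBUTE_ARCHIVE = 0x20          # value returned for the spoofed file
INVALID_FILE_ATTRIBUTES = 0xFFFFFFFF   # what the real API returns for a missing file


def make_get_file_attributes_detour(loaded_assembly_name, original):
    """Build a detour that spoofs only lookups for the target assembly.

    `original` is a hypothetical callable standing in for the unhooked
    GetFileAttributes; every non-matching call is forwarded to it untouched.
    """
    target = loaded_assembly_name.lower()

    def detour(path):
        # Assumption: compare the file name portion, case-insensitively,
        # since Windows paths are not case-sensitive.
        if ntpath.basename(path).lower() == target:
            return FILE_ATTRIBUTE_ARCHIVE
        return original(path)

    return detour
```

A lookup for `Z:\test\test.exe` with `loaded_assembly_name="test.exe"` would return 0x20 without ever reaching `original`, while any other path is passed straight through.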
Receiving this response, the CLR immediately calls GetFileAttributesEx for the same filename passed in the prior call. Intercepting this as well, the main task associated with this step of the loading process is populating the WIN32_FILE_ATTRIBUTE_DATA structure pointed to by lpFileInformation. Returning valid data requires generating (somewhat) random times in the form of FileTime structures, as well as knowing the size of the assembly we are attempting to load through this process. Once the pointed-to structure has been populated with data, a simple Boolean true can be returned.
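To make the structure concrete, here is a minimal Python sketch of what such a populated WIN32_FILE_ATTRIBUTE_DATA could look like. The structure layouts follow the Win32 definitions, but the helper names and the way the timestamps are randomized (creation some months back, write shortly after, access within the last day) are assumptions for illustration only.

```python
import ctypes
import random
from datetime import datetime, timedelta, timezone

FILE_ATTRIBUTE_ARCHIVE = 0x20
# A FILETIME counts 100-nanosecond intervals since 1601-01-01 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


class FILETIME(ctypes.Structure):
    _fields_ = [("dwLowDateTime", ctypes.c_uint32),
                ("dwHighDateTime", ctypes.c_uint32)]


class WIN32_FILE_ATTRIBUTE_DATA(ctypes.Structure):
    _fields_ = [("dwFileAttributes", ctypes.c_uint32),
                ("ftCreationTime", FILETIME),
                ("ftLastAccessTime", FILETIME),
                ("ftLastWriteTime", FILETIME),
                ("nFileSizeHigh", ctypes.c_uint32),
                ("nFileSizeLow", ctypes.c_uint32)]


def to_filetime(moment):
    """Convert a timezone-aware datetime to a FILETIME structure."""
    ticks = (moment - FILETIME_EPOCH) // timedelta(microseconds=1) * 10
    return FILETIME(ticks & 0xFFFFFFFF, ticks >> 32)


def fake_attribute_data(assembly_size, now=None, rng=random):
    """Fill the structure with plausible times and the real assembly size."""
    now = now or datetime.now(timezone.utc)
    # Hypothetical ranges, chosen so that created <= written <= accessed <= now.
    created = now - timedelta(days=rng.randint(30, 365))
    written = created + timedelta(minutes=rng.randint(1, 60 * 24 * 29))
    accessed = now - timedelta(minutes=rng.randint(1, 60 * 24))
    return WIN32_FILE_ATTRIBUTE_DATA(
        FILE_ATTRIBUTE_ARCHIVE,
        to_filetime(created),
        to_filetime(accessed),
        to_filetime(written),
        assembly_size >> 32,          # high 32 bits of the size
        assembly_size & 0xFFFFFFFF,   # low 32 bits of the size
    )
```

Whatever values are generated here need to be kept around, since the later calls in the chain are expected to report the same times and size.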
Having obtained confirmation that the target file exists and received some preliminary data on its size, the CLR proceeds to attempt to open it via CreateFile. This is the point in the chain where things started getting a bit interesting for me, as although there are many types of I/O channels CreateFile can interface with and return a handle from, the majority are not compatible with CreateFileMapping. Things such as mailslots and pipes seemed like interesting ways to store bytes in a way that didn’t touch disk, but always resulted in an invalid handle being returned when creation of a section was attempted using the handle passed back by CreateFile. Early in the development process, I simply hooked this call and redirected it to open a test file I had in another directory. This worked as it was a valid file on disk, but it didn’t really do a whole lot beyond being a neat party trick that allowed me to make it appear as though I was loading assemblies from locations where they didn’t reside. Further attempts at creating an empty file that had a dwFlagsAndAttributes value of FILE_ATTRIBUTE_TEMPORARY set and subsequently writing to it also proved to be unsuccessful, as even with this flag, data appeared to still be immediately written through to disk.
The breakthrough on this came when I was talking with my buddy @anthemtotheego, who had recently released a tool called CredBandit (https://github.com/anthemtotheego/CredBandit) that utilized transactional NTFS to write an LSASS dump file to memory instead of disk. He recommended I look into this mechanism a bit further, as opening a transacted file returns a valid file handle identical to one returned for a file backed by disk. At the time I didn’t know much about transactional NTFS beyond that it was used as a part of process doppelgänging, but pretty quickly after writing up a PoC to test it out, it seemed to be exactly what I was looking for. Taking a look at Microsoft’s page on the subject, which has a giant disclaimer at the top, did nothing but further reinforce that I was probably on the right track.
For those that may be in the same boat I was last month on their level of knowledge around transactional NTFS, here is a quick diagram explaining how it is being used in the scope of this project:
Back to the assembly loading process - as we have been intercepting and providing faked responses to each of the CLR’s outbound queries attempting to ensure it is opening the correct file, it accepts the valid file handle we pass back to it from our memory-only transacted file without any issues.
With a valid handle to what it believes to be an on-disk file in the current execution directory, two more queries are performed by the CLR to ensure that the handle it has points to the file it is expecting to be opened. GetFileInformationByHandle grabs similar data to the above GetFileAttributesEx call, the results of which are compared to ensure the open handle is to the correct file. One final GetFileAttributesEx call is run to ensure the loaded file has not been changed since the initial query. Intercepting these two calls allows us to return forged data for these as well, populating the pointed-to structures with the same data originally generated for the first GetFileAttributesEx call, and finally convincing the CLR that the file it is in the process of loading is correct and does in fact exist. Things are pretty much done at this point: a section is created, a view is mapped to it, and the bytes are read out. These calls don’t need to be intercepted or modified, as the CLR already believes it has loaded a valid file.
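One way to keep the three forged responses consistent is to derive all of them from a single cached record. The sketch below models that idea in Python by packing both structures from the same fields; the class and method names are hypothetical, and the volume serial number, link count, and file index values are placeholders rather than anything the project is known to use.

```python
import struct
from dataclasses import dataclass

# Little-endian layouts. A FILETIME is two DWORDs (low, high), which packs
# identically to one 64-bit little-endian value.
ATTRIBUTE_DATA_LAYOUT = "<I3Q2I"   # WIN32_FILE_ATTRIBUTE_DATA, 36 bytes
BY_HANDLE_LAYOUT = "<I3Q6I"        # BY_HANDLE_FILE_INFORMATION, 52 bytes


@dataclass(frozen=True)
class SpoofedFile:
    """Metadata generated once and replayed for every intercepted query."""
    attributes: int
    created: int                            # FILETIME ticks
    accessed: int
    written: int
    size: int
    volume_serial: int = 0x1234ABCD         # placeholder value
    file_index: int = 0x0001000000000042    # placeholder value

    def attribute_data(self):
        """Bytes for both GetFileAttributesEx responses."""
        return struct.pack(
            ATTRIBUTE_DATA_LAYOUT, self.attributes,
            self.created, self.accessed, self.written,
            self.size >> 32, self.size & 0xFFFFFFFF)

    def by_handle_information(self):
        """Bytes for the GetFileInformationByHandle response."""
        return struct.pack(
            BY_HANDLE_LAYOUT, self.attributes,
            self.created, self.accessed, self.written,
            self.volume_serial,
            self.size >> 32, self.size & 0xFFFFFFFF,
            1,                               # nNumberOfLinks (assumed)
            self.file_index >> 32, self.file_index & 0xFFFFFFFF)
```

Because the record is frozen, the second GetFileAttributesEx response cannot drift from the first, which is the property the CLR's final check relies on.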
Here is a diagram summarizing the entire process:
So how does this beat AMSI? Currently, AMSI hooks are limited to Assembly.Load overloads that load directly from a byte array, so through the process of convincing the CLR that we are loading from an on-disk location, we can simply sidestep AMSI entirely. This means no modifications must be made to process memory in terms of overwriting the amsi.dll instance loaded into the process, etc. and allows us to do fun stuff like this:
While also having the assembly we’re loading showing up as existing on-disk:
What’s in a Name?
At this point in development, I had a working PoC that successfully loaded files from memory that appeared to be backed by a file on-disk and bypassed AMSI. Ready to release, right? Not so much. Up until this point, all my testing had used a static assembly I had converted into a byte array, and it just so happened the filename I was attempting to locate with Assembly.Load shared the same name as this assembly. Basically, I had converted test.exe to a byte array and was loading that through the above process, but when kicking off this whole chain I was also telling the CLR to look for a nonexistent test.exe file on-disk. When modifying my code to be more dynamic (instead of only being able to just load a single test array from memory), I quickly found I was running into errors that looked like this:
Through some testing I found that the CLR performs an additional check during the loading process: it compares the name originally supplied in the Assembly.Load(filename) call against the filename contained within the metadata of the loaded file. If the two don’t match, it will throw an error, and your assembly will not be successfully loaded. I figured there were two ways around this: the easy (but lazy) way of just making an operator provide the name of their assembly as an argument, and the better but more difficult way that actually required me to do some work. At the end of the day, I ended up implementing both to give the operator more flexibility. An assembly name (including extension) can optionally be supplied to the load call. Alternatively, the PE header will be walked to automatically pull the name the assembly was compiled with (this is the value the CLR checks, not the current display name). The result of the walking process looks like this:
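A full walk down to the assembly name involves the section table and the CLR metadata tables, which is too long to show here. As a rough illustration of how such a walk begins, the Python sketch below only locates the CLR header entry (data directory 14) in a PE image; the function name is hypothetical, and resolving the returned RVA and parsing the metadata that holds the name is deliberately left out.

```python
import struct

CLR_DIRECTORY_INDEX = 14  # IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR


def clr_directory(pe):
    """Return (rva, size) of the CLR header data directory, or None."""
    if pe[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at offset 0x3C points at the "PE\0\0" signature.
    (e_lfanew,) = struct.unpack_from("<I", pe, 0x3C)
    if pe[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    optional = e_lfanew + 4 + 20   # skip the signature and the COFF file header
    (magic,) = struct.unpack_from("<H", pe, optional)
    if magic == 0x10B:             # PE32
        directories = optional + 96
    elif magic == 0x20B:           # PE32+
        directories = optional + 112
    else:
        raise ValueError("unknown optional header magic")
    # NumberOfRvaAndSizes sits immediately before the data directory array.
    (count,) = struct.unpack_from("<I", pe, directories - 4)
    if count <= CLR_DIRECTORY_INDEX:
        return None
    rva, size = struct.unpack_from(
        "<II", pe, directories + CLR_DIRECTORY_INDEX * 8)
    return (rva, size) if rva else None
```

A `None` result means the image carries no CLR header and is therefore not a managed assembly, which is a useful early check before attempting the rest of the walk.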
Putting the Pieces Together
With this last hurdle overcome, everything was working pretty much as intended. I ended up publishing the tool as a library to make it more portable for use in other projects and have put together a super basic PE that leverages it so people can get an idea of how the functionality works. This PE is based on another of my buddy @anthemtotheego’s tools, SharpCradle, and takes a small bit of its functionality (web loading only). As currently built, this library and the SharperCradle PoC PE are compatible with execute-assembly, Donut, etc. but are not meant to be “op-ready” tools. Rather, they are meant to serve as templates from which parts can be grabbed or further modifications can be made. If you’re interested in checking out the code, both the library and PoC PE are available here: https://github.com/G0ldenGunSec/SharpTransactedLoad
Assembly.Load is just one of several different methods that exist to load .NET assemblies into memory. These other methods typically provide some type of additional functionality, with their main caveat being that almost none support direct loading from byte arrays as the aforementioned Assembly.Load does. As a result, any code to be executed must first be dropped to disk. With the new capabilities STL (SharpTransactedLoad) brings to the table, some of these other methods may now come into play for offensive operations. There are definitely some use-cases for the library as it exists today, but some of the more interesting applications have not been covered within the scope of this post.
Regardless of the call being hooked / method for kicking the process off, this technique still relies on transactional NTFS. The CreateTransaction and CreateFileTransacted functions seem to have very few legitimate uses and would be the first things I recommend implementing additional detections for in an environment. Additionally, as the files supposedly backing the loaded assemblies do not in fact exist, a scan cross-referencing loaded images against files on disk should catch this anomaly.
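The cross-referencing idea reduces to a simple check once the list of image paths for a process is available. This is a minimal Python sketch under the assumption that those paths have already been collected by some process-inspection tooling (not shown); the function name is hypothetical.

```python
import os


def unbacked_images(image_paths, exists=os.path.isfile):
    """Return the reported image paths that have no file on disk.

    `image_paths` is assumed to come from whatever tool enumerates the
    modules or assemblies loaded into the process being inspected.
    """
    return [path for path in image_paths if not exists(path)]
```

Any path this returns is worth a closer look, keeping in mind that files legitimately deleted or moved after being loaded would show up in the same way.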