Assembly.Lie – Using Transactional NTFS and API Hooking to Trick the CLR into Loading Your Code “From Disk”
Introduction:
Assembly.Load is a method that has been one of the primary
reasons for the meteoric rise in offensive tooling written in C# over the past
few years. Its most commonly used
overload in offensive tooling, Assembly.Load(byte[]), allows for memory-only
loading of .NET assemblies (.exe / .dll) directly from a byte array
representing the assembly’s contents, effectively granting the ability to
reflectively load and execute a program entirely in memory in just 2-3 lines of code. This has enabled
all sorts of multi-staged payloads, modular program functionality, and fileless
post-exploitation operations.
A few months back, some tooling I was working on caused me to
take a closer look into the mechanisms behind loading assemblies into the .NET
Common Language Runtime (CLR). I found
that while Assembly.Load has several overloads that all correspond to the same
managed method, the unmanaged functions they call vary quite drastically. Through this process, I attempted to
determine whether it would be possible to intercept the function calls used by the
CLR to load an assembly from disk, which would allow us to return spoofed data and
trick it into thinking a nonexistent assembly was being loaded from an on-disk
location. I believed it to be possible
because, at the end of the day, regardless of whether an assembly is loaded from an
on-disk file or directly from a byte array, the code being executed is simply a
series of bytes that exists in a pre-defined region of memory.
Why does this matter? First, because AMSI only scans assemblies
loaded directly from (what the CLR believes to be) memory, this technique
serves as another category of AMSI bypass.
Secondly, this approach serves as a prototype that can be implemented
for other .NET assembly loading methods which provide additional functionality
over the traditional Assembly.Load call, but which historically have not been
used due to their reliance on loading from disk. Finally, inspection of assemblies loaded into
a process via this technique would show that they appear to be backed by a (nonexistent) file
on disk.
Mechanics of Assembly.Load
When the CLR encounters an instruction to load an assembly
from disk, several steps are immediately taken to attempt to locate it,
depending on the overload called and the information provided. The CLR’s most-preferred way of loading an
assembly is to be provided a “full” reference in the form of an AssemblyName
object. This includes data on the target
assembly’s name, version, culture, and public key token (if one exists). In
practice, loading assemblies in this way is unnecessary for most use-cases as a
“partial” reference consisting of just the assembly’s name can be provided
instead. This partial reference means
the CLR does not search the global assembly cache for the target assembly, but
rather will first check to see if it is already loaded in the current AppDomain. If the assembly has not been previously
loaded, the current directory the application is executing from will be
searched, after which a “file not found” exception will be thrown if a matching
file is not located. This process can be
seen below in API Monitor when we execute the following code that looks for a
non-existent file (executing as x64, running from Z:\test):
As can be seen above, the GetFileAttributes function is used
to attempt to identify either a dll or exe with the provided name. Finding none,
we get the expected crash:
Conversely, if a valid file is found, it kicks off a series
of function calls to grab additional information and load the file into
memory. This primary chain of calls to
load an assembly consists of the following:
In this long chain of function calls, of most interest to us
are the following:
- GetFileAttributes
  - Responsible for initial identification of a valid file in the execution directory (does this file exist?)
- GetFileAttributesEx
  - Queries the file now believed to exist, provides a pointer to a WIN32_FILE_ATTRIBUTE_DATA structure that gets populated with data on file size and creation/access/modification times.
- CreateFile
  - Returns a handle to the file that has now been validated as existing through the successful completion of the two prior calls. This call has a dwCreationDisposition value of OPEN_EXISTING, meaning the call will fail if the file does not already exist.
- GetFileInformationByHandle
  - Uses the handle to the file returned in the prior call. Provides a pointer to a BY_HANDLE_FILE_INFORMATION structure that gets populated with the same data that exists in the WIN32_FILE_ATTRIBUTE_DATA structure created during the initial GetFileAttributesEx call, as well as additional data on volume and file fingerprint. Gets checked against other results returned to ensure the handle is to the correct file.
- (another) GetFileAttributesEx
  - Final check to validate no changes have been made to the file that is being loaded.
- CreateFileMapping
  - Takes the file handle returned by CreateFile and creates a file mapping with PAGE_READONLY | SEC_IMAGE protections set.
- MapViewOfFileEx
  - Creates a view to read the mapped file. Due to the SEC_IMAGE flag being set on the file mapping object, (read-only) access is inherited and as a result no additional access can be set.
Tricking the CLR (and beating AMSI along the way)
Note: this section is somewhat of a wall of text regarding
the specifics on how this works. I’ve
also included a diagram at the bottom that pretty well sums stuff up if that’s
more your speed.
Outside of the context of API hooking as performed by EDR
vendors and the stubs of code that can be used to unhook them, I didn’t have
much experience with implementing my own hooks in code until earlier this year,
when @FuzzySec dropped Dendrobate (https://github.com/FuzzySecurity/Dendrobate). This is a cool project that uses the EasyHook
library to place a hook onto CreateFile, causing calls to that function from
the current process to instead be redirected to an operator-controlled detour
method, from where the call can be sent along or have additional operations
performed on it. If you’re interested in an in-depth explanation on hooking, I
would definitely recommend checking out the official EasyHook documentation
here: http://easyhook.github.io/documentation.html.
After gaining an understanding of the calls being made when
a load command was executed, my thought was to implement what is essentially a
scaled-up version of the code in Dendrobate to hook, modify, and provide faked
responses to the calls in the above loading process. This was necessary as each of the calls builds
off those prior, meaning any interception and modification of responses needs
to start at the beginning of the load process with GetFileAttributes, returning
a spoofed response to tell the CLR that a nonexistent file of our choosing does
indeed exist in the current execution directory. Assuming we are able to hook unmanaged
function calls and thus can intercept / return faked responses, this call is
pretty easy to handle. It doesn’t need us to fill any structs with data, and
only requires that a single dword be returned.
As a result, the code to implement this is straightforward and gives a good
example of what a detour method looks like:
In this case, loadedAssemblyName corresponds with the name
of the nonexistent assembly we told Assembly.Load to attempt to identify in the
current execution directory. As calls to this function are intercepted, this
code compares the filename passed in against this initial value. An
unsuccessful match indicates other normal program behavior, and so the call is
simply forwarded along, whereas a successful match indicates that the CLR is
searching for our nonexistent file. We
return a value of 32 (0x20) to this call, indicating that the requested file
exists and is of type FILE_ATTRIBUTE_ARCHIVE.
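To make the branching concrete, here is a minimal model of that decision in Python. It is not the library's detour and hooks nothing; `make_get_file_attributes_detour` and `original` are hypothetical names, and comparing only the final path component case-insensitively is my assumption about how the match could be done.

```python
FILE_ATTRIBUTE_ARCHIVE = 0x20          # value returned for the spoofed file (32)
INVALID_FILE_ATTRIBUTES = 0xFFFFFFFF   # what the real API returns for a missing path


def make_get_file_attributes_detour(loaded_assembly_name, original):
    """Build a detour-style callable around `original` (a stand-in for the real API)."""
    target = loaded_assembly_name.lower()

    def detour(file_name):
        # Assumption: match on the last path component, ignoring case.
        leaf = file_name.replace("/", "\\").rsplit("\\", 1)[-1].lower()
        if leaf == target:
            # The CLR is probing for our nonexistent file: claim it exists.
            return FILE_ATTRIBUTE_ARCHIVE
        # Anything else is normal program behavior: forward it untouched.
        return original(file_name)

    return detour
```

Every call that does not match the target name reaches `original` unchanged, so the rest of the process sees normal results.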
Receiving this response, the CLR immediately calls GetFileAttributesEx for the same filename passed in the
prior call. We intercept this as well; the
main task associated with this step of the loading process is populating the
WIN32_FILE_ATTRIBUTE_DATA structure pointed to by lpFileInformation. Returning valid data requires
generating (somewhat) random times in the form of FILETIME structures, as
well as knowing the size of the assembly we are attempting to load through
this process. Once the pointed-to
structure is populated with data, a simple Boolean true can be returned.
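For reference, here is a rough sketch of the data that structure holds, using Python's ctypes to mirror the documented Win32 layout. The helper names (`to_filetime`, `build_attribute_data`) are hypothetical, and how the library picks its timestamps is not shown in this post; the sketch only covers how a time and a 64-bit size are packed into the fields.

```python
import ctypes
from datetime import datetime, timezone


class FILETIME(ctypes.Structure):
    _fields_ = [("dwLowDateTime", ctypes.c_uint32),
                ("dwHighDateTime", ctypes.c_uint32)]


class WIN32_FILE_ATTRIBUTE_DATA(ctypes.Structure):
    _fields_ = [("dwFileAttributes", ctypes.c_uint32),
                ("ftCreationTime", FILETIME),
                ("ftLastAccessTime", FILETIME),
                ("ftLastWriteTime", FILETIME),
                ("nFileSizeHigh", ctypes.c_uint32),
                ("nFileSizeLow", ctypes.c_uint32)]


# Seconds between 1601-01-01 (FILETIME epoch) and 1970-01-01 (Unix epoch).
EPOCH_DELTA_SECONDS = 11_644_473_600


def to_filetime(dt):
    """Convert a timezone-aware datetime to FILETIME (100 ns ticks since 1601)."""
    # Sub-second precision is dropped here for simplicity.
    ticks = (int(dt.timestamp()) + EPOCH_DELTA_SECONDS) * 10_000_000
    return FILETIME(ticks & 0xFFFFFFFF, ticks >> 32)


def build_attribute_data(size, created, accessed, written):
    """Fill the structure the way a GetFileAttributesEx response would look."""
    return WIN32_FILE_ATTRIBUTE_DATA(
        0x20,                        # FILE_ATTRIBUTE_ARCHIVE, matching the earlier call
        to_filetime(created),
        to_filetime(accessed),
        to_filetime(written),
        (size >> 32) & 0xFFFFFFFF,   # high 32 bits of the assembly size
        size & 0xFFFFFFFF,           # low 32 bits of the assembly size
    )
```

The size is split across two DWORDs, so an assembly under 4 GB leaves nFileSizeHigh at zero.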
Having obtained confirmation that the target file exists and
receiving some preliminary data on its size, the CLR proceeds to attempt to
open it via CreateFile. This is the
point in the chain things started getting a bit interesting for me, as although
there are many types of I/O channels CreateFile can interface with and return a
handle from, the majority are not compatible with CreateFileMapping. Things such as mailslots and pipes seemed
like interesting ways to store bytes in a way that didn’t touch disk, but
always resulted in an invalid handle being returned when I attempted to
create a section using the handle passed back by CreateFile. Early in the development process, I simply
hooked this call and redirected it to open a test file I had in another
directory. This worked as it was a valid
file on disk, but it didn’t really do a whole lot beyond being a neat party
trick that allowed me to make it appear as though I was loading assemblies from
locations where they didn’t reside. Further
attempts at creating an empty file with a dwFlagsAndAttributes value of FILE_ATTRIBUTE_TEMPORARY
set and subsequently writing to it also proved unsuccessful, as even with
this flag data still appeared to be written through to disk immediately.
The breakthrough on this came when I was talking with my
buddy @anthemtotheego, who had recently released a tool called CredBandit (https://github.com/anthemtotheego/CredBandit)
that utilized transactional NTFS to write an LSASS dump file to memory instead
of disk. He recommended I look into this
mechanism a bit further, as CreateFileTransacted returns a valid file handle
indistinguishable from one for a file backed by disk. At
the time I didn’t know much about transactional NTFS beyond that it was used as
a part of process doppelgänging, but pretty quickly after writing up a PoC to test
it out, it seemed to be exactly what I was looking for. Taking a look at Microsoft’s page on the
subject that has a giant disclaimer at the top did nothing but further
reinforce I was probably on the right track.
For those that may be in the same boat I was last month on
their level of knowledge around transactional NTFS, here is a quick diagram
explaining how it is being used in the scope of this project:
Back to the assembly loading process - as we have been intercepting and providing faked responses to each of the CLR’s outbound queries attempting to ensure it is opening the correct file, it accepts the valid file handle we pass back to it from our memory-only transacted file without any issues.
With a valid handle to what it believes to be an on-disk
file in the current execution directory, two more queries are performed by the
CLR to ensure that the handle it has points to the file it is expecting to be
opened. GetFileInformationByHandle grabs similar data to the above GetFileAttributesEx
call, the results of which are compared to ensure the open handle is to the
correct file. One final GetFileAttributesEx
call is run to ensure the loaded file has not been changed since the initial
query. Intercepting these two calls
allows us to return forged data for these as well, populating the pointed-to structures
with the same data originally generated for the first GetFileAttributesEx call,
and finally convincing the CLR that the file it is in the process of loading is
correct and does in fact exist. Things
are pretty much done at this point: a section is created, a view is mapped to it,
and the bytes are read out. These calls
don’t need to be intercepted or modified, as the CLR already believes it has
loaded a valid file.
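The key constraint in these last two checks is consistency: whatever was reported the first time has to be reported again. One way to model that (a sketch only, not the library's code) is to generate the metadata once, cache it, and fill every later structure from the cache. `SpoofedMetadata` and `fill_by_handle_info` are hypothetical names, and the volume serial and file index values passed in are placeholders I made up.

```python
import ctypes
from dataclasses import dataclass


class FILETIME(ctypes.Structure):
    _fields_ = [("dwLowDateTime", ctypes.c_uint32),
                ("dwHighDateTime", ctypes.c_uint32)]


class BY_HANDLE_FILE_INFORMATION(ctypes.Structure):
    _fields_ = [("dwFileAttributes", ctypes.c_uint32),
                ("ftCreationTime", FILETIME),
                ("ftLastAccessTime", FILETIME),
                ("ftLastWriteTime", FILETIME),
                ("dwVolumeSerialNumber", ctypes.c_uint32),
                ("nFileSizeHigh", ctypes.c_uint32),
                ("nFileSizeLow", ctypes.c_uint32),
                ("nNumberOfLinks", ctypes.c_uint32),
                ("nFileIndexHigh", ctypes.c_uint32),
                ("nFileIndexLow", ctypes.c_uint32)]


@dataclass(frozen=True)
class SpoofedMetadata:
    """Values generated once for the first GetFileAttributesEx response."""
    attributes: int
    creation_ticks: int   # FILETIME ticks (100 ns units since 1601)
    access_ticks: int
    write_ticks: int
    size: int


def _filetime(ticks):
    return FILETIME(ticks & 0xFFFFFFFF, (ticks >> 32) & 0xFFFFFFFF)


def fill_by_handle_info(meta, volume_serial, file_index):
    """Reuse the cached values so this response agrees with the earlier one."""
    info = BY_HANDLE_FILE_INFORMATION()
    info.dwFileAttributes = meta.attributes
    info.ftCreationTime = _filetime(meta.creation_ticks)
    info.ftLastAccessTime = _filetime(meta.access_ticks)
    info.ftLastWriteTime = _filetime(meta.write_ticks)
    info.dwVolumeSerialNumber = volume_serial        # placeholder value
    info.nFileSizeHigh = (meta.size >> 32) & 0xFFFFFFFF
    info.nFileSizeLow = meta.size & 0xFFFFFFFF
    info.nNumberOfLinks = 1
    info.nFileIndexHigh = (file_index >> 32) & 0xFFFFFFFF   # placeholder value
    info.nFileIndexLow = file_index & 0xFFFFFFFF
    return info
```

Because `SpoofedMetadata` is frozen, the final GetFileAttributesEx response can be built from the same object with no risk of drifting timestamps.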
Here is a diagram summarizing the entire process:
So how does this beat AMSI? Currently, AMSI hooks are limited to Assembly.Load overloads that load directly from a byte array, so through the process of convincing the CLR that we are loading from an on-disk location, we can simply sidestep AMSI entirely. This means no modifications must be made to process memory in terms of overwriting the amsi.dll instance loaded into the process, etc. and allows us to do fun stuff like this:
While also having the assembly we’re loading showing up as
existing on-disk:
What’s in a Name?
At this point in development, I had a working PoC that
successfully loaded files from memory that appeared to be backed by a file
on-disk and bypassed AMSI. Ready to
release, right? Not so much. Up until this point, all my testing had used a static
assembly I had converted into a byte array, and it just so happened the filename
I was attempting to locate with Assembly.Load shared the same name as this
assembly. Basically, I had converted test.exe to a byte array and was loading that
through the above process, but when kicking off this whole chain I was also
telling the CLR to look for a nonexistent test.exe file on-disk. When modifying my code to be more dynamic (instead
of only being able to just load a single test array from memory), I quickly
found I was running into errors that looked like this:
Through some testing I found that the CLR performs an
additional check during the loading process - it will compare the name
originally supplied in the Assembly.Load(filename) call against the filename
contained within the metadata of the loaded file. If the two don’t match it will throw an error,
and your assembly will not be successfully loaded. I figured there were two ways around this,
the easy (but lazy) way of just making an operator provide the name of their
assembly as an argument, and the better but more difficult way that
actually required me to do some work. At
the end of the day, I ended up implementing both to give the operator more
flexibility. An assembly name (including
extension) can optionally be supplied to the load call. Alternatively, the PE
header will be walked to automatically pull the name the assembly was compiled
with (this is the value the CLR checks, not the current display name). The result of the walking process looks like
this:
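As a rough illustration of where such a walk starts (this is neither the library's code nor its output), the sketch below locates the CLR header entry in a PE's data directories, which is the starting point for reaching the .NET metadata. `find_cli_header_directory` is a hypothetical name. Actually extracting the compiled name takes more steps that are omitted here: mapping the RVA to a file offset through the section table, then parsing the metadata tables and string heap.

```python
import struct

CLI_HEADER_DIRECTORY_INDEX = 14  # IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR


def find_cli_header_directory(pe_bytes):
    """Return (rva, size) of the CLR header directory, or None if absent."""
    if pe_bytes[:2] != b"MZ":
        raise ValueError("missing DOS header")
    # e_lfanew at offset 0x3C points to the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", pe_bytes, 0x3C)
    if pe_bytes[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # The optional header follows the 4-byte signature and 20-byte COFF header.
    optional = e_lfanew + 4 + 20
    (magic,) = struct.unpack_from("<H", pe_bytes, optional)
    if magic == 0x10B:        # PE32
        directories = optional + 96
    elif magic == 0x20B:      # PE32+
        directories = optional + 112
    else:
        raise ValueError("unknown optional header magic")
    # NumberOfRvaAndSizes sits immediately before the directory array.
    (count,) = struct.unpack_from("<I", pe_bytes, directories - 4)
    if count <= CLI_HEADER_DIRECTORY_INDEX:
        return None
    rva, size = struct.unpack_from(
        "<II", pe_bytes, directories + CLI_HEADER_DIRECTORY_INDEX * 8)
    if rva == 0 or size == 0:
        return None           # native image, no CLR header
    return rva, size
```

A `None` result means the bytes are not a managed assembly, which is a cheap sanity check to run before attempting a load.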
Putting the Pieces Together
With this last hurdle overcome, everything was working
pretty much as intended. I ended up
publishing the tool as a library to make it more portable for use in other projects
and have put together a super basic PE that leverages it so people can get an
idea of how the functionality works.
This PE is based on another of my buddy @anthemtotheego’s tools,
SharpCradle, and takes a small bit of its functionality (web loading only). As currently built, this library and the
SharperCradle PoC PE are compatible with execute-assembly, Donut, etc. but are
not meant to be “op-ready” tools, rather they are meant to serve as templates
from which parts can be grabbed or further modifications can be made. If you’re
interested in checking out the code, both the library and PoC PE are available
here: https://github.com/G0ldenGunSec/SharpTransactedLoad
Beyond Assembly.Load
Assembly.Load is just one of several different methods that
exist to load .NET assemblies into memory.
These other methods typically provide some type of additional
functionality, with their main caveat being that almost none of them support
direct loading from byte arrays as the aforementioned Assembly.Load does. As a result, any code to be executed
must first be dropped to disk. With the
new capabilities SharpTransactedLoad (STL) brings to the table, some of these other methods may now
come into play for offensive operations.
There are definitely some use-cases for the library as it exists today,
but some of the more interesting applications have not been covered within the
scope of this post.
Detections
Regardless of the call being hooked / method for kicking the
process off, this technique still relies on transactional NTFS. The CreateTransaction and
CreateFileTransacted functions seem to have very few legitimate uses and would
be the first things I recommend implementing additional detections for in an
environment. Additionally, as the files supposedly backing the loaded
assemblies do not in fact exist, a scan cross-referencing loaded images against
files on disk should catch this anomaly.
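The cross-referencing idea can be sketched in a few lines. This assumes the list of loaded image paths has already been collected by some other means (module enumeration is out of scope here); `find_unbacked_images` is a hypothetical name, not an existing tool.

```python
import os


def find_unbacked_images(image_paths, exists=os.path.isfile):
    """Return the loaded-image paths that have no corresponding file on disk.

    `image_paths` is assumed to come from an existing module-enumeration step;
    `exists` can be swapped out for testing or for a remote file check.
    """
    return [path for path in image_paths if not exists(path)]
```

Any path this returns claims to be file-backed but has nothing behind it, which is the anomaly described above. It is a starting point for triage rather than a verdict, since a file can also legitimately be deleted after it was loaded.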