"Advanced" Format String Exploitation
06 Apr 2017Influenced by this awesome live stream by Gynvael Coldwind, where he talks about format string exploitation
While simple format string vulnerabilities are becoming relatively less common these days, every once in a while, we come across some interesting cases in either CTFs or (less likely) real world programs, where having a better understanding of how to attack these vulnerabilities helps immensely.
Simple format string exploits:
You can use the %p
to see what's on the stack. If the format string
itself is on the stack, then one can place an address (say foo) onto
the stack, and then seek to it using the position specifier n$
(for
example, AAAA %7$p
might return AAAA 0x41414141
, if 7 is the
position on the stack). We can then use this to build a read-where
primitive, using the %s
format specifier instead (for example, AAAA
%7$s
would return the value at the address 0x41414141, continuing the
previous example). We can also use the %n
format specifier to make
it into a write-what-where primitive. Usually instead, we use
%hhn
(a glibc extension, iirc), which lets us write one byte at a
time.
We use the above primitives to initially beat ASLR (if any) and then
overwrite an entry in the GOT (say exit()
or fflush()
or ...) to
then raise it to an arbitrary-eip-control primitive, which
basically gives us arbitrary-code-execution.
Possible difficulties (and "advanced" exploitation):
If we have partial ASLR, then we can still use format strings and beat it, but this becomes much harder if we only have one-shot exploit (i.e., our exploit needs to run instantaneously, and the addresses are randomized on each run, say). The way we would beat this is to use addresses that are already in the memory, and overwrite them partially (since ASLR affects only higher order bits). This way, we can gain reliability during execution.
If we have a read only .GOT section, then the "standard" attack of
overwriting the GOT will not work. In this case, we look for
alternative areas that can be overwritten (preferably function
pointers). Some such areas are: __malloc_hook
(see man
page for
the same), stdin
's vtable pointer to write
or flush
, etc. In
such a scenario, having access to the libc sources is extremely
useful. As for overwriting the __malloc_hook
, it works even if the
application doesn't call malloc
, since it is calling printf
(or
similar), and internally, if we pass a width specifier greater than
64k (say %70000c
), then it will call malloc, and thus whatever
address was specified at the global variable __malloc_hook
.
If we have our format string buffer not on the stack, then we can
still gain a write-what-where primitive, though it is a little
more complex. First off, we need to stop using the position specifiers
n$
, since if this is used, then printf
internally copies the stack
(which we will be modifying as we go along). Now, we find two pointers
that point ahead into the stack itself, and use those to overwrite
the lower order bytes of two further ahead pointing pointers on the
stack, so that they now point to x+0
and x+2
where x
is some
location further ahead on the stack. Using these two overwrites, we
are able to completely control the 4 bytes at x
, and this becomes
our where in the primitive. Now we just have to ignore more
positions on the format string until we come to this point, and we
have a write-what-where primitive.