Introduction to Format String Bugs

Format string bugs are a result of the facilities that the C programming language provides for handling functions with a variable number of arguments.

Because it is C itself that makes format string bugs possible, they can affect any OS that has a C compiler.

What is a Format String?

To understand what a format string is, you need to understand the problem that format strings solve. Most programs output textual data in some form, often including numerical data.

Say, for example, that a program wants to output a string containing an amount of money.

double amountInDollars;

Say the amount in dollars is $1234.88, with a decimal point and two places after it.

Without format strings we would need to write a substantial amount of code just to format a number this way.

Format strings provide a more generic solution to this problem by allowing a string to be output that includes the values of variables, formatted precisely as dictated by the programmer.

To output the number as specified, we would simply call the printf function, which outputs the string to the process’s standard output (stdout):

printf( "$%.2f\n", amountInDollars );

To output a double, you use the format specifier %f.
In this case the format string is "$%.2f\n", and the format specifier inside it is %.2f.
We are using the precision component (.2) to specify that we require two places after the decimal point.

Why are they useful?

Let’s say that we want to print the same variable in three different ways:

  • In decimal
  • In hex
  • In ASCII

We can use format strings to do that:

#include <stdio.h>

int main ( void )
{
    int c;

    printf ("=====================\n");
    printf ("Decimal Hex Character\n");
    printf ("=====================\n");

    /* print every value from 0x20 to 255 as decimal, hex and character */
    for ( c=0x20; c<256; c++ ){
        printf( "%03d %02x %c \n", c, c, c);
    }

    return 0;
}

If we execute this program, we can see that the same variable is printed in three different ways, one per format specifier (%03d, %02x and %c).

What is a Format String bug?

A format string bug occurs when user-supplied data is used as, or included in, the format string argument of one of the printf family of functions, including:

printf
fprintf
sprintf
snprintf
vfprintf
vsprintf
vsnprintf
...

The attacker supplies a number of format specifiers that have no corresponding arguments on the stack, and values from the stack are used in their place.
This leads to information disclosure and potentially the execution of arbitrary code.

So, let’s create a vulnerable example program:

#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>

int target;

void vuln(char *string)
{
  printf(string);
  
  if(target) {
      printf("you have modified the target :)\n");
  }
}

int main(int argc, char **argv)
{
  vuln(argv[1]);
}

Now let’s compile it as a 32-bit binary with the stack protector and PIE disabled and with an executable stack:

gcc -fno-stack-protector -m32 -z execstack -no-pie -o example example.c

And let’s supply some malicious user input that makes the program print values from its own stack:

./example `python2 -c 'print ("A"*4 + "%x."*8)'`%x

That is all I wanted to cover in this introduction to format strings. In the coming days I will try to work through the ProtoStar exploitation CTF box to learn a bit more about this vulnerability.

Here is the link to ProtoStar:

https://exploit-exercises.lains.space/protostar/

I will try to write some blog posts about it to keep as a reference for the future, when I will probably have forgotten how to exploit this.

See you soon and happy hacking! 🙂
