# Acing a C Homework Using Inline Assembly

February 7, 2019

In courses taught in Java (such as a data structures class), you can build pretty robust autograders for homeworks — after all, the student’s code runs in the JVM, which prevents it from doing spooky things like corrupting memory or branching to weird places. So you can run the grader instantly after submission on a service like Gradescope and boom, they’ve got their grade. But C homeworks provide students a lot more opportunities to write… unconventional solutions.

I’ve posted an example homework/POC on GitHub; clone it if you want. The assignment asks students to complete `fib()` in `assignment.c`, like the following:

``````int fib(int n) {
if (n < 0)
return -1;

int *arr = malloc(sizeof (int) * (n + 1));
if (!arr)
return -1;

arr = 0;
if (n > 0)
arr = 1;

for (int i = 2; i <= n; i++)
arr[i] = arr[i - 1] + arr[i - 2];

int ret = arr[n];

free(arr);

return ret;
}
``````

The test suite `assignment_suite.c` tests their implementation (don’t worry about this too much, it’s libcheck stuff):

``````static int input[] =  {-200, 0, 1, 2, 3, 5, 10,  15,     27};
static int output[] = {  -1, 0, 1, 1, 2, 5, 55, 610, 196418};

START_TEST(test_fib) {
ck_assert_int_eq(fib(input[_i]), output[_i]);
}
END_TEST
``````

But let’s be honest, sometimes you’re a little stressed and you just don’t have time for that homework. That’s 19 whole lines of code! This leaves you with two options:

1. Copy your fraternity brother’s homework
2. Use inline assembly to jump past the failing assertions back into the grader — specifically, to the end of the `test_fib` function

I’ll write a post on how to do #1 later, but first I’ll stub out `fib()` as follows:

``````int fib(int n) {
(void)n; // don't complain about how this is unused
return -1;
}
``````

and then run it in gdb and step instruction-by-instruction until we’re back in `test_fib()`:

``````\$ make run-gdb
(gdb) layout asm
(gdb) b fib
(gdb) r
(gdb) si
(gdb) si
(gdb) si
``````

Then gdb shows the following, which shows the instruction immediately following the instruction that calls `fib()` is 65 bytes after the beginning of `test_fib()`:

`````` |0x555555555d4d <test_fib+55>    mov    (%rdx,%rax,1),%eax
|0x555555555d50 <test_fib+58>    mov    %eax,%edi
|0x555555555d52 <test_fib+60>    callq  0x555555555bb0 <fib>
>|0x555555555d57 <test_fib+65>    cltq
|0x555555555d59 <test_fib+67>    mov    %rax,-0x8(%rbp)
|0x555555555d5d <test_fib+71>    mov    -0x14(%rbp),%eax
|0x555555555d60 <test_fib+74>    cltq
``````

Why do we care? Well, this means the return address pushed onto the stack when calling `fib()` must be 65 bytes after the beginning of `test_fib()`. If we grab this return address off the stack, we can add some predetermined offset to it and perform an indirect jump to skip to the end of the test function, past any nasty failing assertions!

Looking at the disassembly of the test suite object file `assignment_suite.o` (`assignment_suite.asm` in the repository), we can see the teardown for `test_fib()` begins at `0xc3 = 195` bytes after the beginning of `test_fib()`:

``````  b2:	48 8d 3d 00 00 00 00 	lea    0x0(%rip),%rdi        # b9 <test_fib+0xb9>
b9:	b8 00 00 00 00       	mov    \$0x0,%eax
be:	e8 00 00 00 00       	callq  c3 <test_fib+0xc3>
END_TEST
c3:	c9                   	leaveq
c4:	c3                   	retq
``````

So we need to add `195-65 = 130` bytes to the return address to jump to the end of `test_fib()`. Now we can write our cooked solution using inline assembly:

``````int fib(int n) {
__asm__ ("leave\n\t"
"popq %rax\n\t"
"jmp *%rax");

(void)n;
return -1;
}
``````

Each instruction does the following:

1. `leave` will restore the frame pointer of the caller so it doesn’t get confused
2. `popq %rax` pops the return return address off the stack and puts it in `%rax`
3. `addq \$130, %rax` adds our precalculated offset to the return address
4. `jmp *%rax` jumps to the location we want, the end of `test_fib()`, past all the assertions

And behold, it “passes” all tests!

``````\$ make run-tests
./tests
Running suite(s): fun assignment
100%: Checks: 9, Failures: 0, Errors: 0
``````

## Mitigation

You’d think you could avoid this by passing `-fno-asm` to gcc, but it turns out this only disables `asm`, not `__asm__` which still works according to `gcc(1)`:

``````-fno-asm
Do not recognize "asm", "inline" or "typeof" as a
keyword, so that code can use these words as
identifiers.  You can use the keywords "__asm__",
You could also pass `-D__asm__=YOUREABADBOY` to gcc, but this seems to break standard library headers, and the student could thwart this anyway by simply saying `#undef __asm__`.
So if you ask me, the best way to prevent this form of cheating is to tweak the Makefile rule that builds `.c` files:
``````%.o: %.c \$(HFILES)