The memory addresses a process uses do not point directly to physical memory; they point into virtual memory. Multiple processes can use the same
address and still see different data there.
This works because an MMU (memory management unit) translates each virtual address a process uses into a physical address before memory is accessed.
When the system boots, the CPU uses physical addresses directly. The kernel then builds the lookup structure the MMU uses, called the page table, and tells the
CPU to enable the MMU and start using virtual memory.
This way, the kernel prevents processes from accessing memory that was not allocated to them (an attempt to do so results in a SEGFAULT).
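The translation step can be sketched in a few lines of C++. This is only a toy model under stated assumptions: 4 KiB pages, a single-level table stored in a std::unordered_map, and the hypothetical names PageTable and translate. Real MMUs walk multi-level tables in hardware. It needs C++17 for std::optional.

```cpp
#include <cstdint>
#include <optional>
#include <unordered_map>

// Assumption: 4 KiB pages, so the low 12 bits of an address are the offset.
constexpr std::uint32_t kPageShift = 12;
constexpr std::uint32_t kOffsetMask = (1u << kPageShift) - 1;

// Hypothetical toy page table: virtual page number -> physical frame number.
using PageTable = std::unordered_map<std::uint32_t, std::uint32_t>;

// Returns the physical address, or std::nullopt when the page is not mapped
// (the case where a real kernel would deliver a segmentation fault).
std::optional<std::uint32_t> translate(const PageTable& table, std::uint32_t vaddr) {
    auto it = table.find(vaddr >> kPageShift);
    if (it == table.end()) return std::nullopt;
    return (it->second << kPageShift) | (vaddr & kOffsetMask);
}
```

Giving two processes their own PageTable lets the same virtual address resolve to two different physical addresses, which is why both can use that address without seeing each other's data.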
Let's take the sample below:
struct Result {
    int Sum, Sub;
};
Result* foo(int a, int b) {
    Result* r = new Result{ a, b };
    return r;
}
foo(int, int):
    push {r4, r5, r11, lr}
    mov r4, r0
    mov r0, #8
    mov r5, r1
    bl operator new(unsigned int)
    strd r4, r5, [r0]
    pop {r4, r5, r11, pc}
An optimizing compiler targeting 32-bit ARM would compile the function foo into something like the code above.
Registers are small storage locations inside the CPU. Each one holds a value, which can be data or the address of a memory location.
The above code would do the following:
r4 = a;
r5 = b;
r0 = allocate(8);
*r0 = r4;
*(r0 + 4) = r5;
You can see that there is no struct Result in the output. That is because structs are a programming-language concept; in reality they just describe data in memory,
which is just a bunch of bytes.
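To see this from the C++ side, the sketch below reads the fields of a Result back using nothing but a base address and a byte offset, much like the store instruction does. The helper read_int_at is a hypothetical name introduced for illustration.

```cpp
#include <cstddef>
#include <cstring>

struct Result {
    int Sum, Sub;
};

// Reads the int stored `offset` bytes past `base`. No struct type is involved:
// only an address, an offset, and a copy of sizeof(int) bytes.
int read_int_at(const void* base, std::size_t offset) {
    int value;
    std::memcpy(&value, static_cast<const unsigned char*>(base) + offset, sizeof value);
    return value;
}
```

For a Result r, read_int_at(&r, 0) yields r.Sum and read_int_at(&r, offsetof(Result, Sub)) yields r.Sub. On the 32-bit ARM target above that second offset is 4, matching *(r0 + 4).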
What about if statements? What about loops? What about function calls?
Let's check the code below:
int ArraySum(int *nums, int size) {
    int n = 0;
    for (int i = 0; i < size; i++)
        n += nums[i];
    return n;
}
ArraySum(int*, int):
    cmp r1, #1
    blt .LBB0_4
    mov r2, r0
    mov r0, #0
.LBB0_2:
    ldr r3, [r2], #4
    subs r1, r1, #1
    add r0, r3, r0
    bne .LBB0_2
    bx lr
.LBB0_4:
    mov r0, #0
    bx lr
The CPU fetches the instruction at pc (the program counter), executes it, then advances pc by the instruction's size. In the case of loops and branches, the
instruction itself changes pc, so execution continues from a different address.
The above assembly can be represented by the following pseudo code:
if (size < 1)
    goto LABEL_END;
r2 = nums;
r0 = 0;
LABEL_START:
r3 = *r2;
r2 += 4;
size -= 1;
r0 += r3;
if (size != 0)
    goto LABEL_START;
return r0;
LABEL_END:
return 0;
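The fetch, execute, advance cycle described earlier, and the way a branch simply overwrites pc, can be sketched as a toy interpreter. The instruction set here (Op, Instr, run) is invented for illustration and is not ARM; it only mirrors the shape of the loop.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical toy instruction set, not a real architecture.
enum class Op { LoadImm, Add, SubImm, JumpIfNotZero, Halt };

struct Instr {
    Op op;
    int a;  // destination register, or the register being tested
    int b;  // source register, immediate value, or jump target
};

// Runs the program and returns the value left in register 0.
// Register indices must be in the range 0..3.
int run(const std::vector<Instr>& program) {
    int reg[4] = {0, 0, 0, 0};
    std::size_t pc = 0;
    while (pc < program.size()) {
        const Instr& in = program[pc];  // fetch
        ++pc;                           // by default, move on to the next instruction
        switch (in.op) {                // execute
            case Op::LoadImm: reg[in.a] = in.b; break;
            case Op::Add:     reg[in.a] += reg[in.b]; break;
            case Op::SubImm:  reg[in.a] -= in.b; break;
            case Op::JumpIfNotZero:
                // A taken branch is nothing more than a write to pc.
                if (reg[in.a] != 0) pc = static_cast<std::size_t>(in.b);
                break;
            case Op::Halt: return reg[0];
        }
    }
    return reg[0];
}
```

A six-instruction program (load 0 into r0, load 3 into r1, add r1 to r0, subtract 1 from r1, jump back to the add while r1 is not zero, halt) returns 3 + 2 + 1 = 6. Because the program is ordinary data, overwriting the jump target in that vector changes what runs without touching the interpreter.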
What if these instructions were manipulated at runtime? What if the jump addresses of the compare-and-branch instructions were changed? Is this even possible?
While systems and apps try to mitigate this in many ways, such as Apple's memory protection, which requires executable pages to be signed, it is possible to modify them.
That's how HIT works!
With the proper tools and with protections disabled, you can get full control over a program, changing its behavior or even implementing something that is missing (example: auto
shoot in games).
But didn't we talk about process isolation? How is this even possible?
It is possible to inject code into a process at runtime, pre-load it before the program starts, or even have it loaded with the program by modifying the program ahead of time.
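As one illustration of the pre-load route, here is a minimal sketch of a library meant to be pre-loaded on Linux. It assumes the glibc dynamic loader and a target that calls a function named get_damage through a shared library. That function name is hypothetical and stands in for whatever the real target imports.

```cpp
// hook.cpp: a sketch of a pre-loaded library (Linux, glibc dynamic loader assumed).
// `get_damage` is a hypothetical function that the target program is assumed
// to import from one of its shared libraries; it is not a real API.

// extern "C" disables C++ name mangling, so the exported symbol is exactly
// "get_damage" and can take the place of the original at load time.
extern "C" int get_damage() {
    return 9999;  // replacement behavior chosen for illustration
}
```

Built with something like g++ -shared -fPIC hook.cpp -o libhook.so and started with LD_PRELOAD=./libhook.so ./target (where ./target is a placeholder), the loader resolves get_damage to this definition first. This only affects calls that are resolved dynamically; calls bound inside the target binary itself would need one of the other approaches.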