Create your own Virtual CPU...in C++
Writing a Virtual machine teaches you how the processor actually deals with the assembly instructions, memory and stack.
We will simulate the working of a processor & memory. A new assembly like language will be generated in the process which will be used to program our virtual machine. We will make heavy use of a stack to perform all the operations like add or subtract instead of CPU registers for now. So let us begin!
Complete Source code is available on my github.
CPU & memory
Working of a CPU
An abstract CPU works in 4 stages, namely the fetch, decode, execute,write. Our CPU will have to perform these operations but we will merge the last two stages into one.
A fetch stage is where an instruction is fetched from the memory. In particular, a special segment in memory is accessed, code segment. An array will be used for the code segment memory segment.
A decode stage is when the instruction is decoded into the operation which is to be performed. We shall use a switch-case statement to decode the instruction fetched from the memory in the fetch stage.
An execute stage is the one in where we have decoded the instruction and settled for executing a function. We will also do the writing back to the stack in this stage.
Registers
There are a few registers that are essential for the working of the CPU stages that were discussed above.
- Instruction Pointer: This register holds the address of the next operation that is to be run on the cpu.
- Stack Pointer(SP): This holds the address of the stack top.
- Base Pointer(BP): This holds the address of the current function stack frame.
Working of our memory
We need three types of memory, first is the code segment
and the second is the stack
. A third one to keep the data
. Register ip
, which points to the next instruction(an integer position) actually acts as an index in code segment array. Stack pointer and Base Pointer registers(both integers) points to some addresses in stack
array. Actually we will be using std::vector
for these arrays implementation.
Operations
All the operations are going to be involving stack if they need any kind of data transfer to some other part of the code. We will start by creating our CPU class. Add the registers - all integers, and a flag and 3 vectors one for each stack, code, data along with a constructor to initialize these variables.
class CPU{ //cpu registers int ip,sp,bp; int flag=0; //memory regions std::vector<int> stack; std::vector<int> code; std::vector<int> data; public: CPU(std::vector<int>&& instructions,int main_address,int datasize=100,int stacksize=200) :sp(-1),bp(0),ip(main_address),stack(stacksize),data(datasize),code(instructions) { } };
We need a method to fetch
, and one to decode, will call it run
. Add these methods to the CPU class.
// mark these methods private private: int fetch() { //grabbing the next instruction from code vector return code[ip++]; //incrementing the ip to point to the next location in ip. } void run() { //decoding and executing here guys! std::cout<<"Cpu Started" << std::endl; }
This is our skeleton CPU class with the registers initialized. Our job from now on is to implement a switch-case statement inside the run
method so that we could execute some instructions, but before that let's look at the types of instructions we are going to use.
Instructions
We shall use instructions as simple as we can. So it is going to be the compiler's job to do the type deduction and provide our CPU with tagged instructions like for PUSH we will have integer tagged IPUSH. Following is an incomplete subset of instructions just for a glance of the curious minded.
Push: IPUSH 5
Increments the stack pointer by one and pushes the integer '5'.
Pop: IPOP
Pops off whatever was there on the stack and decrement the stack pointer.
Print: IPOP
Prints the top of the stack.
IADD: IADD A B
Increments the stack pointer by one and pushes the result of adding two integers picked from code segment on the stack.
Similarly, ISUB, IMUL can also be implemented. There is a little issue in working with IDIV because there can be an exception while executing the division operation when denomenator is a 0. We need exception handling for that and it is out of the scope of this blog but it can easily be implemented using the flag register. Readers can try that for themselves as an exercise after finishing up the blog.
IADDS: IADDS indexA indexB
This variant of iadd
instruction adds two integers taken from the indexes at indexA
and indexB
relatively to bp
register in the stack. The result is then pushed onto the stack.
Similarly, ISUBS, IMULS can also be implemented.
Jumping: JUMP address
Jump to the address
by setting the IP register equal to it. There are cousins of jump that we are going to use the flag
register to determine if the jump is to be made or not. These cousins include, JLT
, JGT
, `JEQ`.
Halt: HALT
Halt the machine.
Loading from data memory: LOAD address
Loads the stack top from the data array indexed at address
.
Storing: STORE address
Stores the value at the top of the stack at the address
index in the data memory/array/segment.
Next two instructions are going to be used to implement function calls.
Calling : CALL address
Calling the function at the index address
in the code segment.
Return: RET
Returns to the calling function/ code from inside a function.
Comparison : ICMP A B
Boolean comparison is done using icmp
instruction, which is short for integer compare. It compares 2 integers and sets the flag accordingly.
ICMPS indexA indexB
This variant of ICMP
compares two integers placed on indexes indexA
and indexB
in the stack relative to the bp
register
Instructions
In order to define these instructions we can use an enum. Let's create another file called instructions.hpp
and add the enum. This file defines the names of the instructions that we are going to use. It is commonly called an Instruction Set of a CPU.
#ifndef __INSTRUCTIONS_HEAD_ADDED__ #define __INSTRUCTIONS_HEAD_ADDED__ enum Instructions : int { HALT = 10, //binary operations IADD, ISUB, IMUL, IADDS, ISUBS, IMULS, //boolean operations- implemented using flags! ICMP, //memory operations IPUSH , PRINT , POP, STORE , LOAD, LOCAL_LOAD, //branching operations JUMP, JLT, JGT, JEQ, //functions CALL, RET }; #endif
I have given a value of 10 to the HALT
instructions meaning that the rest of the instructions will acquire contiguous values 11, 12, 13...
The Run Method!
The core of our CPU is the run method, let's add to it now with the case handling of PUSH & POP instructions to manipulate the stack.
std::cout<<"Cpu Started" << std::endl; ip=0; int value; while(ip<code.size()) { int instruction = fetch();//fetch switch(instruction)//decode { case IPUSH: value=code[ip++]; stack[++sp] = value; break; case POP: sp--; break; } }
In order to print what is on the top of stack adding a print instruction next!
case PRINT: if(sp>-1) { std::cout << "Stack top: "<<stack[sp] <<std::endl; } else{ std::cout << "Stack Empty!"<<std::endl; } break;
Now it is actually possible to provide the machine with some push instructions and print the values on the top!
Add cases for the binary operations(iadd,imul,isub),
case IADD: stack[++sp]=code[ip++]+code[ip++]; break; case ISUB: stack[++sp]=code[ip++]-code[ip++]; break; case IMUL: stack[++sp]=code[ip++]*code[ip++]; break; case IADDS:// instruction for when operands are on the stack! IADDS 0 1 stack[++sp]=stack[bp-code[ip++]]+stack[bp-code[ip++]]; break; case ISUBS:// instruction for when operands are on the stack! ISUBS 0 1 stack[++sp]=stack[bp-code[ip++]]-stack[bp-code[ip++]]; break; case IMULS:// instruction for when operands are on the stack! IMULS 0 1 stack[sp]=stack[bp-code[ip++]]*stack[bp-code[ip++]]; break;
Now this VM has become a calculator, it lacks the ability to be able to compute conditional statements or even create loops, so we fix it right away.
Conditionals and Loops
Jumps
Jumps are simple. For an unconditional jump, just change the ip
register to the required address. For conditional jumps, decide jump based on the flag value. This flag value was the result of a previous icmp
instruction which compared 2 values passed to it.
case JUMP: ip=code[ip]; break; case JEQ: if(flag==0) { ip=code[ip]; } else{ ip++; } break; case JLT: if(flag<0) { ip=code[ip]; } else{ ip++; } break; case JGT: if(flag>0) { ip=code[ip]; } else{ ip++; } break;
We can write conditional statements like if-else
or loops like while
or for
using boolean conditionals icmp
paired with the jump
statements.
For boolean operation icmp
we shall use the flag register, if the comparison results in an equality flag is set to 0, otherwise either -1 or 1 depending on the actual relationship.
case ICMP: value=code[ip++]; if(value == code[ip]) { flag = 0; } else if(value > code[ip]) { flag =1; } else{ flag=-1; } ip++; break; case ICMPS: // relative to the base pointer value=stack[bp-code[ip++]]; int i =stack[bp-code[ip]]; if(value == i) { flag = 0; } else if(value > i) { flag =1; } else{ flag=-1; } ip++; break;
Loading and Storing from the data segment
case STORE: data[code[ip++]]=stack[sp--]; break; case LOAD: stack[++sp]=data[code[ip++]]; break;
Main.cpp
Let's start by defining our code segment with some initial code at the start of our main.cpp
file
#include<iostream> #include<vector> #include "instructions.hpp" #include "machine.cpp" std::vector<int> fabricate_instructions() { //read file or throw an exception; return { IPUSH, 10, IPUSH, 20, STORE, 1, LOAD, 1, PRINT, POP, PRINT, POP, PRINT, PRINT, HALT }; }
Inside our main function we shall initialize our CPU
class and pass the code segment to the CPU object. Then just call the run method to start executing the code segment we just passed, or we can pass an argument to the name of the file which contains instructions.
int main(int argc, char *argv[]) { std::vector<int> program; //read file and get instructions! try{ program=fabricate_instructions(); CPU vm(std::move(program),0,120,300); vm.run(); // std::cout<<"starting cpu!"<<program[0]<<program[1]; } catch(char* msg) { std::cout << "Error!"; std::cout << msg; } return 0; }
With this in place, we can run a sample test code to ensure things are working fine and then we shall move on towards implementing function calls!!
Compiling and executing...
To compile the code above we will use GCC compiler for C++. Open a terminal at the root of the project and enter
$g++ simulator.cpp machine.cpp instructions.hpp -o machine
check the type of the file generated: machine
$ file machine
which outputs
machine: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=cd3f6932facca50ccb3c90fda698fd4caf245491, for GNU/Linux 3.2.0, not stripped
I am using an i5(x86_64) processor so the executable is generated to match that architecture.
Execution
Simple run the machine
using ./machine
command, output should show,
Cpu Started Pushing now: 10 Pushing now: 20 Stack top: 20 Popping now: 20 Stack top: 10 Popping now: 10 Stack Empty! Stack Empty! Halting now
Next we add the most awaited Function calling feature in our VM.
Function Calls
One last thing left to add in our virtual machine is a mechanism to call functions. There is a little theory first that we have to get out of the way in order to code it up! The code which calls the function is called a CALLER and the code that gets called is called the CALLEE.
So when we normally jump to a location we don't have any intention to come back to the next instructions and carry on from there. But this is what happens after a function call returns. To start executing the function, the ip
register is set with the address(index) of the new function in the code segment
.
What about the arguments to the function? We push them on the stack before calling the function along with a number representing the number of arguments
What else? How will we be able to come back to the next instruction of the caller? We will push the address(index) to the stack as well. This is called the return address.
For a new function call we will have to change the values of the sp
and bp
to point to the current position in stack for that function and the current base of the stack of that pointer.
But what about the old values of these registers? Since we will be changing both of these and after the function returns we need the old values back...Push them on the stack as well!
So the full code for the CALL
instruction looks like this:
case CALL: value = code[ip++]; stack[++sp] = code[ip++];;//number of arguments stack[++sp] = bp;//base pointer stack[++sp] = ip;//return address bp=sp; ip = value; break;
Syntax to use it: CALL 20 #arguments
if the function starts at index 20 in the code segment
along with the number of arguments constant(required to clear the stack during RET
).
When the function returns using RET
instruction, all we have to do is clear the stack and bring back the old values of the registers ip
, sp
, and bp
. The following code handles the case for the RET
instruction.
case RET: int return_value = stack[sp--]; sp=bp;//deleting everything above the bp ip=stack[sp--]; bp=stack[sp--]; sp -= stack[sp--]; stack[++sp] = return_value; break;
With this taken care of, let's compile and run our machine.
Time to do some real code : factorial()!
We have the ability to call functions therefore we shall write a recursive algorithm, simplest one being, the Factorial of an integer n.
Just to remind you that a factorial is simply a sequence of multiplication from 1..n. The general recursion follows,
Fact(n) = n * Fact(n-1)
First we check if the argument value is not equal to 1, call self recursively with n-1 value till it is equal to 1! Without extending this article further I present the code, just replace the fabricate_instructions
function in the main.cpp
file. In this example, I am calculating the factorial(5) = 120.
std::vector<int> fabricate_instructions() { return { //Start of the factorial function!, index=0 IPUSH, 1, ICMPS, -1, 3,//compare the 1 just pushed @ stack[bp-(-1)] in stack, with // the argument, @ stack[bp-(3)] JEQ, 16,// jump to return if it is equal, 1==1, then just return 1, which is alread on the stack top. ISUBS, 3,-1, // if the argument is not equal to 1 then call self with n-1 as argument. CALL, 0, 1,// call self, return value will be stored at the stack! IMULS,3,-2, // multiply n @stack[bp-(3)] with stack[bp+(2)] RET,//end of the factorial function, index=16 0, 0, 0, IPUSH,5,// Main function: index=20. pushing 5 to call factorial(5) CALL, 0, 1,//calling the factorial function with 5 at the stack! PRINT,//printing the top of the stack = factorial(5) = 120 HALT }; }
The main function starts at index=20, so we also change that when we initialize the CPU.
int address_of_main=20,data_segment_size=20,stack_size=100; CPU vm(std::move(program), address_of_main, data_segment_size, stack_size );
If you want to calculate higher number factorials then make sure you increase the stack size to 500 or 1K.
Run this code and see the output should be,
$ ./machine Cpu Started Stack top: 120 and flag: 0 Halting now
So that is it for our Virtual Machine. We could have gone crazy with it, adding exception handling and garbage collection, but that is left for the readers to implement.
Thank-you for reading!!