---
title: "Deconstructing a C Program"
date: 2024-10-16T21:30:50-05:00
draft: false
---
I am going to break down this C program:
## The source
```c
#include <stdio.h>

int times(int a, int b)
{
    return a * b;
}

int main(int argc, char** argv)
{
    int someVariable = times(6,7);
    printf("%i\n", someVariable);
    return 0;
}
```
## High Level Overview
The `#include` directive pulls in header files, such as stdio.h from the standard library in this example:
```c
#include <stdio.h> //if the include uses quotes like this -> #include "yourheader.h" it will search your project first
```
We create a function called **times** which returns an **int** (usually a 32 bit integer) and takes two arguments of type **int** named **a** and **b**.
Inside the function we return the product of **a** and **b**.
We declare another function called **main**, which is required unless you are coding without the C runtime. It returns an **int** and takes two arguments that are basically your program's command line arguments, including the filename. The first, **int argc**, is the number of arguments (including the filename, of course). The second, **char\*\* argv**, is a pointer to where the pointers to the argument strings are stored in memory. For the filename you would use:
```c
printf("%s\n", argv[0]);
```
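To see every argument rather than just the filename, a small helper can walk the array. This is only a minimal sketch: the name `print_args` is made up for illustration, and you would call it from **main** as `print_args(argc, argv)`.

```c
#include <stdio.h>

/* Hypothetical helper: prints every command line argument, one per line,
   and returns how many it printed. */
int print_args(int argc, char **argv)
{
    int printed = 0;
    for (int i = 0; i < argc; i++) {
        /* argv[0] is usually the program name, argv[1] the first real argument */
        printf("argv[%d] = %s\n", i, argv[i]);
        printed++;
    }
    return printed;
}
```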
Inside this function we declare a variable of type **int** with the name **someVariable** which we set to the result of calling the function **times** with the arguments **6** and **7**.
Then we call the function **printf**, which is declared in **stdio.h**, with the arguments **"%i\n"** and **someVariable**, the variable we declared on the previous line.
Then we return the value **0** from the **main** function to say the program succeeded in what it was supposed to do.
## How are Functions declared
```c
RETURNTYPE FUNCTION_NAME(ARGTYPE a, ARGTYPE b) {/*body, use return EXPRESSION; unless RETURNTYPE is void*/ }
```
| Name of token | What it is |
| --------------| ---------- |
| RETURNTYPE | The return type of the function. It can be void, meaning the function returns nothing, or any other type, meaning the function returns an expression of that type |
| ARGTYPE | The type of an argument to the function. It can be anything RETURNTYPE can be except void. (Technically void can appear on its own, as in RETURNTYPE FUNCTION_NAME(void), often seen in header files, to say the function takes exactly zero arguments, but otherwise you can't use it.) A function can have any number of arguments, even zero, and they are comma separated |
| a | the first argument's variable |
| b | the second argument's variable |
| EXPRESSION | anything that produces a value (such as a number, an assignment to a variable, a calculation (such as a + b), a string (such as "Hello, world"), a char (such as '*'), a variable name (such as a), or a function name (for function pointers) (such as times)) |
| FUNCTION_NAME | What you name the function |
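
A few concrete definitions may make the table easier to follow. This is a rough sketch rather than anything from the program above: `greet`, `answer` and `apply` are names invented for illustration, and only `times` comes from the original source.

```c
#include <stdio.h>

/* RETURNTYPE is void: the function prints something and returns nothing. */
void greet(const char *name)
{
    printf("Hello, %s\n", name);
}

/* (void) in the argument list means the function takes exactly zero arguments. */
int answer(void)
{
    return 42;
}

/* The same function as in the program being deconstructed. */
int times(int a, int b)
{
    return a * b;
}

/* A function name used as an expression becomes a pointer to that function,
   so it can be passed to another function like any other value. */
int apply(int (*operation)(int, int), int a, int b)
{
    return operation(a, b);
}
```

With these in place, `apply(times, 6, 7)` passes the function name **times** as an expression and gives the same result as calling `times(6, 7)` directly.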
## How are functions called
```c
int a = myfunction(FIRST_EXPRESSION, SECOND_EXPRESSION, THIRD_EXPRESSION);
```
In this example the function **myfunction** returns an **int** and takes three arguments: **FIRST_EXPRESSION**, **SECOND_EXPRESSION** and **THIRD_EXPRESSION**. They are also comma separated and can be any expression, as long as the expression's type matches the argument type. There is leeway, however: if your expression is a 32 bit integer and the argument type is a 64 bit integer, it will be converted for you.
If the function returns void, its result can't be assigned to a variable or treated as an expression, because there is no result.
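
Both points can be seen in a small sketch. The functions `double_it`, `report` and `conversion_demo` are hypothetical names chosen for this example, and it assumes **int** is narrower than **long long**, which is the usual case.

```c
#include <stdio.h>

/* Expects an integer argument that is at least 64 bits wide. */
long long double_it(long long value)
{
    return value * 2;
}

/* Returns nothing, so a call to it can only be used as a statement. */
void report(long long value)
{
    printf("%lld\n", value);
}

long long conversion_demo(void)
{
    int small = 21;                       /* usually a 32 bit integer */
    long long result = double_it(small);  /* small is converted to long long for the call */
    report(result);                       /* fine: nothing is done with a result */
    /* int bad = report(result);             would not compile: report returns void */
    return result;
}
```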
## How are variables declared
```c
VARIABLE_TYPE variableName; //this variable is unset
VARIABLE_TYPE variableName = EXPRESSION; //this variable is set; it can be any expression, and conversions behave the same as for function arguments
```
## How are variables set
```c
variableName = EXPRESSION; //any expression; conversions behave the same as for function arguments
```
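Declaring and setting variables can be put together in one short function. This is just an illustrative sketch, and `variable_demo` is a made-up name.

```c
/* Hypothetical function showing declaration, initialization and assignment. */
int variable_demo(void)
{
    int unset;          /* declared, but holds no meaningful value yet */
    int total = 6 * 7;  /* declared and set in one statement */

    unset = total;      /* set later with a plain assignment */
    total = unset + 1;  /* the expression may use other variables */
    return total;
}
```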
## A simplified lexing of the C program
NOTE: because a preprocessor is involved, I have ignored the include statement.
```
KEYWORD: "int"
IDENTIFIER: "times"
SYMBOL: "("
KEYWORD: "int"
IDENTIFIER: "a"
SYMBOL: ","
KEYWORD: "int"
IDENTIFIER: "b"
SYMBOL: ")"
SYMBOL: "{"
KEYWORD: "return"
IDENTIFIER: "a"
SYMBOL: "*"
IDENTIFIER: "b"
SYMBOL: ";"
SYMBOL: "}"
KEYWORD: "int"
IDENTIFIER: "main"
SYMBOL: "("
KEYWORD: "int"
IDENTIFIER: "argc"
SYMBOL: ","
KEYWORD: "char"
SYMBOL: "*"
SYMBOL: "*"
IDENTIFIER: "argv"
SYMBOL: ")"
SYMBOL: "{"
KEYWORD: "int"
IDENTIFIER: "someVariable"
SYMBOL: "="
IDENTIFIER: "times"
SYMBOL: "("
NUMBER: "6"
SYMBOL: ","
NUMBER: "7"
SYMBOL: ")"
SYMBOL: ";"
IDENTIFIER: "printf"
SYMBOL: "("
STRING: "%i\n"
SYMBOL: ","
IDENTIFIER: "someVariable"
SYMBOL: ")"
SYMBOL: ";"
KEYWORD: "return"
NUMBER: "0"
SYMBOL: ";"
SYMBOL: "}"
```
Notice there are no spaces or new lines in the lexed output; they are trimmed out.
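
To give a feel for how such a token list could be produced, here is a very small lexer sketch. It is not how a real C compiler does it: it only knows the three keywords used in this program, treats every other punctuation character as a one-character symbol, and does not handle comments or the preprocessor. The names `lex` and `is_keyword` are my own, and the output format is slightly simpler than the listing above (only string tokens are shown with quotes).

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if word is one of the keywords used in the program above. */
static int is_keyword(const char *word)
{
    static const char *keywords[] = {"int", "char", "return"};
    for (size_t i = 0; i < sizeof keywords / sizeof keywords[0]; i++) {
        if (strcmp(word, keywords[i]) == 0)
            return 1;
    }
    return 0;
}

/* Prints one token per line and returns the number of tokens found. */
int lex(const char *src)
{
    int count = 0;
    while (*src != '\0') {
        const char *start = src;
        const char *kind;

        if (isspace((unsigned char)*src)) {
            src++;               /* whitespace separates tokens but is not one */
            continue;
        } else if (isalpha((unsigned char)*src) || *src == '_') {
            while (isalnum((unsigned char)*src) || *src == '_')
                src++;
            kind = "IDENTIFIER"; /* may be changed to KEYWORD below */
        } else if (isdigit((unsigned char)*src)) {
            while (isdigit((unsigned char)*src))
                src++;
            kind = "NUMBER";
        } else if (*src == '"') {
            src++;                                   /* opening quote */
            while (*src != '\0' && *src != '"') {
                if (*src == '\\' && src[1] != '\0')
                    src++;                           /* skip the escaped character */
                src++;
            }
            if (*src == '"')
                src++;                               /* closing quote */
            kind = "STRING";
        } else {
            src++;               /* any other single character */
            kind = "SYMBOL";
        }

        char text[64];
        size_t length = (size_t)(src - start);
        if (length >= sizeof text)
            length = sizeof text - 1;                /* truncate very long tokens */
        memcpy(text, start, length);
        text[length] = '\0';

        if (strcmp(kind, "IDENTIFIER") == 0 && is_keyword(text))
            kind = "KEYWORD";

        printf("%s: %s\n", kind, text);
        count++;
    }
    return count;
}
```

Calling `lex("return a * b;")` prints five tokens, one per line, and the spaces between them never show up in the output.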
## A simplified AST of this program
```json
{
  "type": "program",
  "nodes": [
    {
      "type": "function_definition",
      "returnType": "int",
      "name": "times",
      "arguments": [
        {
          "type": "int",
          "name": "a"
        },
        {
          "type": "int",
          "name": "b"
        }
      ],
      "body": {
        "type": "scope_node",
        "statements": [
          {
            "type": "return_node",
            "expression": {
              "type": "times_expression",
              "left": {
                "type": "get_variable_expression",
                "name": "a"
              },
              "right": {
                "type": "get_variable_expression",
                "name": "b"
              }
            }
          }
        ]
      }
    },
    {
      "type": "function_definition",
      "returnType": "int",
      "name": "main",
      "arguments": [
        {
          "type": "int",
          "name": "argc"
        },
        {
          "type": "char**",
          "name": "argv"
        }
      ],
      "body": {
        "type": "scope_node",
        "statements": [
          {
            "type": "declare_variable_with_value",
            "variableType": "int",
            "name": "someVariable",
            "expression": {
              "type": "call_function_expression",
              "name": "times",
              "arguments": [
                {
                  "type": "const_int_expression",
                  "number": 6
                },
                {
                  "type": "const_int_expression",
                  "number": 7
                }
              ]
            }
          },
          {
            "type": "call_function_expression",
            "name": "printf",
            "arguments": [
              {
                "type": "const_string_expression",
                "value": "%i\n"
              },
              {
                "type": "get_variable_expression",
                "name": "someVariable"
              }
            ]
          },
          {
            "type": "return_node",
            "expression": {
              "type": "const_int_expression",
              "number": 0
            }
          }
        ]
      }
    }
  ]
}
```
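In C itself, a tree like this could be stored as structs that point at each other. The sketch below covers only three of the node types from the JSON and evaluates the body of **times**. The struct layout and the names `node`, `evaluate` and `evaluate_times_body` are assumptions made for illustration, not how any particular compiler represents its AST.

```c
#include <stddef.h>

/* The node kinds mirror a few of the "type" values from the JSON above. */
enum node_type {
    CONST_INT_EXPRESSION,
    GET_VARIABLE_EXPRESSION,
    TIMES_EXPRESSION
};

struct node {
    enum node_type type;
    int number;                 /* used by CONST_INT_EXPRESSION */
    char name;                  /* used by GET_VARIABLE_EXPRESSION: 'a' or 'b' */
    const struct node *left;    /* used by TIMES_EXPRESSION */
    const struct node *right;   /* used by TIMES_EXPRESSION */
};

/* Walks the tree and computes its value; a and b stand in for the
   arguments of times. */
int evaluate(const struct node *n, int a, int b)
{
    switch (n->type) {
    case CONST_INT_EXPRESSION:
        return n->number;
    case GET_VARIABLE_EXPRESSION:
        return n->name == 'a' ? a : b;
    case TIMES_EXPRESSION:
        return evaluate(n->left, a, b) * evaluate(n->right, a, b);
    }
    return 0; /* not reached for the node kinds above */
}

/* Builds the tree for "a * b" from the body of times and evaluates it. */
int evaluate_times_body(int a, int b)
{
    struct node left = { GET_VARIABLE_EXPRESSION, 0, 'a', NULL, NULL };
    struct node right = { GET_VARIABLE_EXPRESSION, 0, 'b', NULL, NULL };
    struct node product = { TIMES_EXPRESSION, 0, 0, &left, &right };
    return evaluate(&product, a, b);
}
```

Evaluating the tree with 6 and 7, as in `evaluate_times_body(6, 7)`, walks the **times_expression** node, looks up both variables and multiplies them, just like the original `return a * b;`.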
## Well this is the end.
Hope this blessed you in some way.