|
1 | 1 | ---
|
2 | 2 | title: Reverse Engineering
|
3 | 3 | ---
|
4 |
| -# Reverse Engineering |
5 |
| -<embed src="./rev.pdf" type="application/pdf" style="width: 100%; height: 80vh;"> |
| 4 | + |
| 5 | +# **Reverse Engineering** |
| 6 | + |
| 7 | +## **TL;DR** |
| 8 | + |
| 9 | +- The goal: understand the program's behavior.
| 10 | +- Usually this means finding the input that makes the program output "Success!" or speeding up a slow algorithm. |
| 11 | + |
| 12 | +### **Past Meetings** |
| 13 | + |
| 14 | +- Check out our [Reverse Engineering Setup meeting](https://sigpwny.com/meetings/general/2025-09-18/) for more details on installing our recommended tools. |
| 15 | + |
| 16 | +## **Introduction** |
| 17 | + |
| 18 | +### **Terminology** |
| 19 | + |
| 20 | +Rev, short for **Rev**erse Engineering, is the process of understanding computer programs. The |
| 21 | +goal is to figure out what the program does. Usually, programs are difficult |
| 22 | +to understand, either intentionally or unintentionally. |
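As a minimal sketch of what "figuring out what the program does" can look like, assume a hypothetical challenge whose check XORs every input byte with a fixed key and compares the result to a stored value. The names `KEY`, `TARGET`, `check`, and `recover_input` are made up for illustration and do not come from any real challenge. Once the check is understood, the expected input falls out by undoing the transformation:

```python
# Hypothetical challenge logic; KEY and TARGET are made-up values.
KEY = 0x42
TARGET = bytes([0x31, 0x2B, 0x25, 0x32, 0x35, 0x2C, 0x3B])


def check(guess: bytes) -> str:
    """The 'program': XOR every input byte with KEY and compare to TARGET."""
    scrambled = bytes(b ^ KEY for b in guess)
    return "Success!" if scrambled == TARGET else "Wrong!"


def recover_input() -> bytes:
    """XOR is its own inverse, so XORing TARGET with KEY yields the input."""
    return bytes(b ^ KEY for b in TARGET)


if __name__ == "__main__":
    solution = recover_input()
    print(solution, check(solution))
```

Real challenges are rarely this short, but the approach is the same: read the check, work out how to invert it, then confirm the answer by running the program.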
| 23 | + |
| 24 | +#### **Abstractions** |
| 25 | + |
| 26 | +Abstractions are simplifications built into a programming language that hide some of the underlying complexity to make the language easier to use.
| 27 | + |
| 28 | +- Abstract (higher level) programs are easier to understand |
| 29 | +- Languages like Python and JavaScript are higher level |
| 30 | +- Languages like assembly and C are lower level |
| 31 | +- As you modify a program to become more abstract (to better understand it), you lose some information in the process |
| 32 | + |
| 33 | +#### **Static and Dynamic Analysis** |
| 34 | + |
| 35 | +- Static analysis: reading code, using tools to understand code without running it |
| 36 | + - Good place to start, not great if there's a lot of code |
| 37 | +- Dynamic analysis: running code, inspecting or modifying the program as it's running |
| 38 | + - Generally faster, captures entire program environment |
| 39 | + |
| 40 | +## **Tools** |
| 41 | + |
| 42 | +### **Bytecode Viewer** |
| 43 | + |
| 44 | +#### **Installation** |
| 45 | + |
| 46 | +See [https://github.com/Konloch/bytecode-viewer](https://github.com/Konloch/bytecode-viewer) |
| 47 | + |
| 48 | +#### **When to use** |
| 49 | + |
| 50 | +Use this program to decompile compiled Java programs, which usually have the .jar
| 51 | +extension.
| 52 | + |
| 53 | +#### **How to use** |
| 54 | + |
| 55 | +Simply import the .jar file into Bytecode Viewer and see the
| 56 | +decompiled Java code! This works by recovering Java source code from the
| 57 | +compiled Java bytecode.
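To see what a decompiler has to work with, it can help to look inside a .jar yourself. A .jar is a ZIP archive, and each compiled class inside it starts with the magic bytes `CA FE BA BE`. The sketch below builds a small demo archive in memory (the entry `Chal.class` is a made-up placeholder, not a valid class) and lists the entries that look like class files. The helper names are assumptions for illustration only.

```python
import io
import zipfile

CLASS_MAGIC = b"\xCA\xFE\xBA\xBE"  # first four bytes of every Java class file


def list_class_files(jar_bytes: bytes) -> list[str]:
    """Return the names of archive entries that look like compiled classes."""
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        return [
            name
            for name in jar.namelist()
            if name.endswith(".class") and jar.read(name)[:4] == CLASS_MAGIC
        ]


def make_demo_jar() -> bytes:
    """Build a tiny in-memory archive with one placeholder class entry."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as jar:
        jar.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
        # Placeholder body: only the magic bytes are real, the rest is padding.
        jar.writestr("Chal.class", CLASS_MAGIC + b"\x00" * 8)
    return buf.getvalue()


if __name__ == "__main__":
    print(list_class_files(make_demo_jar()))
```

For a real challenge you would pass the bytes of the actual .jar instead of the demo archive, then open the listed classes in Bytecode Viewer.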
| 58 | + |
| 59 | +### **Ghidra** |
| 60 | + |
| 61 | +#### **Installation** |
| 62 | + |
| 63 | +- See our [Reverse Engineering Setup meeting](https://sigpwny.com/meetings/general/2025-09-18/) |
| 64 | +- Or read the [Installation Guide](https://ghidra-sre.org/InstallationGuide.html) |
| 65 | + |
| 66 | +#### **When to use** |
| 67 | + |
| 68 | +Use this tool for binaries, not Python scripts. Ghidra "decompiles", or simplifies, binary programs into more human-readable "pseudo-C" code.
| 69 | + |
| 70 | +Ghidra is a **static analysis** tool. |
| 71 | + |
| 72 | +#### **Interface** |
| 73 | + |
| 74 | + |
| 75 | +To open a program in Ghidra, go to File -> Import File... -> select the file you want to analyze. |
| 76 | + |
| 77 | +Click "OK" for all the auto analyze popups (there should be several). |
| 78 | +Now, the interface should look like the above image. |
| 79 | + |
| 80 | +1. is the decompiled code output. This is what you will be looking at for
| 81 | + the most part. You can rename variables by clicking a variable and pressing
| 82 | + "**L**". Change the type by right-clicking and selecting **Retype Variable**.
| 83 | +2. is the assembly instructions. This won't be very helpful if you don't
| 84 | + know assembly, and can be mostly ignored for the challenges at Fall CTF.
| 85 | +3. is the symbol tree. This shows you different named values that
| 86 | + are present in the file. Click **Functions** and scroll down to select the **main**
| 87 | + function. This is usually where the program's own code starts running.
| 88 | + |
| 89 | + |
| 90 | + |
| 91 | +Here we can see the **main** function in the symbol tree. If there is no **main**, |
| 92 | +click **\_start** and see what that function calls. |
| 93 | + |
| 94 | + |
| 95 | + |
| 96 | +Above is a picture of the decompilation (disclaimer: this is not a challenge |
| 97 | +from Fall CTF). Almost every function you see will have an if statement with |
| 98 | +**\_\_stack_chk_fail** at the bottom. This is a check for the stack canary, |
| 99 | +which is not relevant to any challenges here. It may be of more interest in
| 100 | +pwn challenges. The "local_10 = \*(long \*)(in_FS_OFFSET + 0x28);" line
| 101 | +at the top sets up the stack canary and can also be ignored. |
| 102 | + |
| 103 | +Note that the variables have undescriptive names, such as
| 104 | +**iVar1** and **local_28**. This is because the original variable names are
| 105 | +usually discarded during compilation, so the decompiler has to generate
| 106 | +names of its own.
| 107 | + |
| 108 | +### **GDB** |
| 109 | + |
| 110 | +#### **Installation** |
| 111 | + |
| 112 | +- See our [Reverse Engineering Setup meeting](https://sigpwny.com/meetings/general/2025-09-18/) |
| 113 | + |
| 114 | +#### **When to use** |
| 115 | + |
| 116 | +As with Ghidra, use this tool for binaries, not Python scripts. GDB is
| 117 | +a debugger that runs programs, giving you the ability to stop, inspect, and
| 118 | +modify the program as it is executing.
| 119 | + |
| 120 | +GDB is a **dynamic analysis** tool. |
| 121 | + |
| 122 | +#### **Basics** |
| 123 | + |
| 124 | +Run **gdb ./chal** on the command line, where **chal** is the name of the program. |
| 125 | +Note that you must be on Linux (WSL works too). This will not work |
| 126 | +for Apple Silicon Mac users. |
| 127 | + |
| 128 | +**GDB** will launch you into a program with a different terminal prompt,
| 129 | +where each line starts with **(gdb)**. You interact with the program by typing
| 130 | +in commands.
| 131 | + |
| 132 | +#### **Commands** |
| 133 | + |
| 134 | +- misc |
| 135 | + - help \<command\>: get help about any of the commands listed here |
| 136 | +- running |
| 137 | + - run: run the program from the start |
| 138 | + - quit: exit GDB |
| 139 | + - start: start the program and break on the **main** function |
| 140 | +- breakpoints |
| 141 | + - break \*\<func\>+\<offset\>: set a breakpoint \<offset\> bytes past the start of
| 142 | + the function \<func\>. The leading \* tells GDB to treat the expression as an
| 143 | + address. You can read the offset from the **disas** output
| 144 | +- inspecting the program |
| 145 | + - disas \<func\>: disassemble the \<func\> function |
| 146 | + - info reg: print all the registers |
| 147 | + - x: print data (see help x for more info) |
| 148 | + - x/4gx 0x1234: print 4 QWORDS (64-bit values) in hex starting at address 0x1234 |
| 149 | + - x/10i $rip: print 10 instructions starting at $rip (current |
| 150 | + instruction pointer) |
| 151 | + - x/7wx $rsp: print 7 WORDS (32-bit values) in hex starting |
| 152 | + at $rsp (stack pointer) |
| 153 | + - x/8bd $rdi: print 8 bytes in decimal starting at the address |
| 154 | + in $rdi |
| 155 | + - set: set values |
| 156 | + - set $rax=23: sets $rax to 23 |
| 157 | + - set $rip+=4: adds 4 to $rip |
| 158 | + - this skips the current instruction, if it is 4 bytes long |
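The size letters in the **x** examples above (b, w, g) only change how the same bytes are grouped. As a rough sketch, assuming a little-endian target such as x86-64, the following mimics that grouping on a made-up 32-byte buffer. The `examine` helper and the `memory` contents are hypothetical and are not part of GDB.

```python
import struct

# Hypothetical memory contents: the bytes 0x00, 0x01, ..., 0x1f.
memory = bytes(range(32))

# GDB size letter -> (struct format code, size in bytes)
SIZES = {"b": ("B", 1), "h": ("H", 2), "w": ("I", 4), "g": ("Q", 8)}


def examine(data: bytes, count: int, size: str) -> list[int]:
    """Group `data` into `count` little-endian units, like x/<count><size>."""
    code, _width = SIZES[size]
    return list(struct.unpack_from("<" + code * count, data))


if __name__ == "__main__":
    print([hex(v) for v in examine(memory, 4, "g")])  # like x/4gx
    print([hex(v) for v in examine(memory, 7, "w")])  # like x/7wx
    print(examine(memory, 8, "b"))                    # like x/8bd
```

Note how the first 8-byte unit reads as 0x0706050403020100: the lowest address holds the least significant byte, which is why multi-byte values can look "backwards" compared to a byte-by-byte dump.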
| 159 | + |
| 160 | +#### **General Workflow** |
| 161 | + |
| 162 | +- first, identify interesting places to set a breakpoint in Ghidra |
| 163 | +- use the assembly instructions window in Ghidra to see the offset to break at |
| 164 | +- run the program in GDB and set a breakpoint |
| 165 | +- modify or print values as desired |
| 166 | +- repeat until solved |
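The second step of this workflow is simple arithmetic: subtract the address where the function starts from the address of the instruction you care about, both as shown in Ghidra. A minimal sketch follows, with made-up addresses and a hypothetical helper name. It emits the `break *<func>+<offset>` form, since GDB expects a leading `*` before an address expression.

```python
def gdb_breakpoint(func_name: str, func_start: int, target: int) -> str:
    """Turn two addresses read from Ghidra into a GDB breakpoint command."""
    if target < func_start:
        raise ValueError("target address is before the start of the function")
    offset = target - func_start
    return f"break *{func_name}+{offset:#x}"


if __name__ == "__main__":
    # Made-up addresses: main starts at 0x101189, interesting compare at 0x1011b3.
    print(gdb_breakpoint("main", 0x101189, 0x1011B3))
```

Working with an offset from the function start, rather than the absolute address Ghidra displays, keeps the breakpoint valid even when the program is loaded at a different base address at runtime.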