Assembly language for CE newbie #2: Pointers
This article is for newbies.
Note: I'm not an expert. I'm just someone who knows some instructions and can write Auto Assembler scripts. This article shares my experiences.
Warning: This post features AI-assisted content. While I created the first document, an AI arranged the syntax & wording, which I then curated. If you prefer not to engage with such material, please use your browser's back button.
I never fully understood pointers, especially in C/C++, with things like: *ptr, **ptr, int *(*flt)(void).
Well, I'm not a programmer, so I'll leave it at that.
Pointers show up often in memory structures, where they are used to organize complex data. Imagine we have a backpack:
Items in Backpack:
- a pouch
- a knife
- an apple
- a torch
- a small bag
Here we have two containers inside the backpack:
- a pouch
- a small bag
Items in the pouch:
- a gem
- a coin
Items in the small bag:
- some herbs
Players can put/remove items from the backpack, even drop the pouch on the ground without removing the items in the pouch first.
Now we have a structure for the current backpack like:
Backpack
-----------------------------------------------------
 pouch    knife    apple    torch    small_bag
   +------------------------------------+--------------------
   |                                    |
--------------                          |
 gem   coin                             |
                                        +--------------------
                                        herb1  herb2  ...
Backpack structure in memory
Since contents in the backpack are dynamic, a structure with pointers is often used.
Let's convert the backpack into a memory structure. It may look like this:
(Please note this is not a working structure, it's just for explanation.)
Address   Object
--------------------------------------
0x1000    Backpack Item ID
0x1004    Backpack Unique ID
0x1008    Points to the address list of objects in the backpack: 0x3000
0x1010    Points to the address list of containers: 0x2000
0x1018    ...other data
.
.
0x2000    Points to the address of container #1 (pouch): 0x5000
0x2008    Points to the address of container #2 (small bag): 0x5100
0x2010    8 bytes of 0 (null)
.
.
0x3000    Item ID: knife (1)
0x3004    Knife unique ID
0x3008    Item ID: apple (2)
0x300C    Apple unique ID
0x3010    Item ID: torch (3)
0x3014    Torch unique ID
0x3018    4 bytes of 0 (null)
.
.
0x5000    Item type ID: pouch
0x5004    Pouch unique ID
0x5008    Item ID: gem
0x500C    Gem unique ID
0x5010    Item ID: coin
0x5014    Coin unique ID
0x5018    4 bytes of 0 (null)
.
.
0x5100    Item type ID: small bag
0x5104    Small bag unique ID
0x5108    Item ID: herb1
0x510C    Herb1 unique ID
0x5110    Item ID: herb2
0x5114    Herb2 unique ID
0x5118    ...
.
.
0x51??    4 bytes of 0 (null)
To get item info from the backpack, we need to start from the backpack and go through pointers to get the final data.
Example 1: To get the knife:
- Find the backpack (0x1000)
- The item list is reached through the first pointer, which sits at the address of the backpack + 8 = 0x1008
- Get the value at address 0x1008; we get 0x3000, meaning the item list of the backpack is in memory at 0x3000
- Finally, walk the item list starting at 0x3000 and compare each item ID with the knife ID.
Address   Value
--------------------------------------
0x1000    ???
0x1004    ???
0x1008    0x3000   <== content of address 0x1008 = 0x3000
0x1010    0x2000
0x1018    ???
.
.
0x3000    1        <== search for the knife ID (=1) from here; if not found, add an 8-byte offset
0x3004    ???
0x3008    2
0x300C    ???
0x3010    3
0x3014    ???
0x3018    0
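The steps of Example 1 can be modeled outside of CE. The following is only a sketch in Python, not Auto Assembler code: memory is a plain dict from address to value, and `read` and `find_item` are hypothetical helper names made up for this illustration. It assumes each item entry is 8 bytes (a 4-byte item ID followed by a 4-byte unique ID).

```python
# Toy model of the table above: address -> value stored there.
# This is a plain dict, not real process memory.
MEMORY = {
    0x1008: 0x3000,  # backpack + 8: pointer to the item list
    0x1010: 0x2000,  # backpack + 0x10: pointer to the container list
    0x3000: 1,       # item ID: knife
    0x3008: 2,       # item ID: apple
    0x3010: 3,       # item ID: torch
    0x3018: 0,       # item ID 0 = end of the list
}


def read(address):
    """Return the value stored at an address (hypothetical helper)."""
    return MEMORY[address]


def find_item(backpack, item_id):
    """Follow backpack + 8 to the item list, then scan it entry by entry."""
    entry = read(backpack + 8)       # 0x1008 holds 0x3000
    while read(entry) != 0:          # stop at the terminating 0
        if read(entry) == item_id:
            return entry             # address of the matching entry
        entry += 8                   # skip item ID (4) + unique ID (4)
    return None


print(hex(find_item(0x1000, 1)))     # prints 0x3000 (the knife entry)
print(find_item(0x1000, 9))          # prints None (no such item ID)
```

The function returns the address of the entry rather than the item itself, because in a real game you usually want that address so you can read or change the fields next to it.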
Example 2:
Using the same approach, to get the gem in the pouch we start from the top (the backpack) and go through the pointers:
Backpack -> containers -> pouch -> gem
Searching for the gem may be more complex:
Backpack -> items -> (not found)
Backpack -> containers -> container #1 (pouch) -> items in container #1 -> (Found)
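The search order above (backpack items first, then each container) can be sketched in Python with memory modeled as a dict. This is an illustration only: the item IDs 10 (gem), 11 (coin), 20 and 21 (herbs) are made up, `scan` and `find_anywhere` are hypothetical names, and every item entry is assumed to be 8 bytes, with a container's items starting at container + 8.

```python
# Toy memory: address -> value. IDs 10, 11, 20, 21 are made up for the example.
MEMORY = {
    0x1008: 0x3000, 0x1010: 0x2000,              # backpack: item list, container list
    0x2000: 0x5000, 0x2008: 0x5100, 0x2010: 0,   # container pointers, null-terminated
    0x3000: 1, 0x3008: 2, 0x3010: 3, 0x3018: 0,  # knife, apple, torch, end
    0x5008: 10, 0x5010: 11, 0x5018: 0,           # pouch: gem, coin, end
    0x5108: 20, 0x5110: 21, 0x5118: 0,           # small bag: herb1, herb2, end
}


def scan(first_entry, item_id):
    """Scan 8-byte entries until item ID 0; return the matching address or None."""
    entry = first_entry
    while MEMORY[entry] != 0:
        if MEMORY[entry] == item_id:
            return entry
        entry += 8
    return None


def find_anywhere(backpack, item_id):
    """Return (where, address) for the first match, or None if nothing matches."""
    hit = scan(MEMORY[backpack + 8], item_id)     # Backpack -> items
    if hit is not None:
        return ("backpack", hit)
    slot = MEMORY[backpack + 0x10]                # Backpack -> containers
    number = 1
    while MEMORY[slot] != 0:                      # a null pointer ends the list
        hit = scan(MEMORY[slot] + 8, item_id)     # items start at container + 8
        if hit is not None:
            return ("container #%d" % number, hit)
        slot += 8                                 # next 8-byte pointer
        number += 1
    return None


print(find_anywhere(0x1000, 10))   # the gem is found in container #1
print(find_anywhere(0x1000, 99))   # prints None
```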
A game's data structures can be analyzed with CE's built-in function: Dissect data/structures. In most cases, you may need to change the data types in the dissected structure.
Using assembly language to access pointers
Back to the topic: in assembly, we can follow a pointer and read the structure it points to directly.
Using the example above, suppose we want to get the pouch ID and we already know where it is:
Example 1: Get pouch ID directly
mov rdx, 1000 ; suppose we already know the address of the backpack
mov rdx, [rdx+10] ; value at 0x1010 = 0x2000 = list of containers
; now rdx = 0x2000
mov rdx, [rdx] ; value at 0x2000 = 0x5000 = the address of the pouch
mov edx, [rdx+4] ; value at 0x5004 = pouch unique ID
As you can see, there is no "back" operation for a pointer. To go back to the upper level, you need to save the address of the upper level first.
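The same chain can be written as a base address plus a list of offsets, where each step does value = [value + offset]. The sketch below is Python, not CE code; `follow` is a hypothetical helper and 0xABCD is a made-up pouch unique ID. It also records every address it visits, which is one way to keep the upper levels around since a pointer cannot be followed backwards.

```python
# Toy memory: only the addresses touched by the chain are filled in.
MEMORY = {
    0x1010: 0x2000,   # backpack + 0x10 -> container list
    0x2000: 0x5000,   # first container pointer -> pouch
    0x5004: 0xABCD,   # pouch unique ID (made-up value)
}


def follow(base, offsets):
    """Apply value = [value + offset] per offset; also return visited addresses."""
    visited = []
    value = base
    for offset in offsets:
        address = value + offset
        visited.append(address)    # saved so we can go "back" later
        value = MEMORY[address]
    return value, visited


# Mirrors: mov rdx,[rdx+10] / mov rdx,[rdx] / mov edx,[rdx+4]
pouch_id, visited = follow(0x1000, [0x10, 0x0, 0x4])
print(hex(pouch_id))                   # prints 0xabcd
print([hex(a) for a in visited])       # prints ['0x1010', '0x2000', '0x5004']
```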
In most cases, we only know part of the data structure, although the full layout can sometimes be obtained with tools (e.g., for Unity or Unreal games).
Example 2: Find if there is a torch in the backpack
mov rdx, 1000 ; suppose we already know the address of the backpack
mov r15, [rdx+8] ; address of the item list in the backpack
xor r14, r14 ; set r14 (the entry index) to zero
chk_loop: ; define a label for the loop
mov r13d, [r15+r14*8] ; get the item ID of the current entry (8 bytes per entry)
cmp r13d, 0 ; if end of item list (item ID = 0)
je not_found ; go to the not-found code block
cmp r13d, 3 ; if item ID is 3 = torch
je do_found ; go to the found code block
inc r14 ; if not found, move to the next entry
jmp chk_loop
do_found: ; Found code block
.
.
jmp endp
not_found: ; Not found code block
.
.
endp:
Avoid invalid pointer operations
In most cases, the pointer data is dynamic, or the code is shared by different objects. To avoid invalid pointer operations, we should perform some checks:
- Avoid access to a null pointer (0x0).
- Invalid pointer access can cause the program to crash.
To avoid access to a null pointer, we can add a test instruction after each pointer is loaded:
mov rdx, 1000 ; suppose we already know the address of the backpack
mov rdx, [rdx+10] ; value at 0x1010 = 0x2000 = list of containers
test rdx, rdx ; test if rdx is zero = invalid pointer
jz end_search
mov rdx, [rdx] ; value at 0x2000 = 0x5000 = the address of the pouch
test rdx, rdx ; test if rdx is zero = invalid pointer
jz end_search
mov edx, [rdx+4] ; value at 0x5004 = pouch ID
test edx, edx ; test if the ID is zero = end of the list
jz end_search
end_search:
.
.
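The idea of checking for zero before every dereference can be sketched in Python, again with memory modeled as a dict. This is an illustration under assumptions: `follow_checked` is a hypothetical helper and 0xABCD is a made-up pouch ID. The `if value == 0` line plays the role of test rdx, rdx followed by jz end_search.

```python
def follow_checked(memory, base, offsets):
    """Follow a pointer chain, stopping as soon as a pointer is zero."""
    value = base
    for offset in offsets:
        if value == 0:             # like: test rdx, rdx / jz end_search
            return None
        value = memory[value + offset]
    return value


# Toy memory where the pouch exists (0xABCD is a made-up unique ID).
with_pouch = {0x1010: 0x2000, 0x2000: 0x5000, 0x5004: 0xABCD}
# Toy memory where the first container slot is empty (null pointer).
without_pouch = {0x1010: 0x2000, 0x2000: 0}

print(follow_checked(with_pouch, 0x1000, [0x10, 0x0, 0x4]))     # the pouch ID
print(follow_checked(without_pouch, 0x1000, [0x10, 0x0, 0x4]))  # prints None
```

Without the check, the second call would try to read address 0 + 4, which in this model raises a KeyError and in a real process would be an access violation.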
Or use CE's built-in {$try}/{$except} feature:
mov rdx, 1000 ; suppose we already know the address of the backpack
{$try}
mov rdx, [rdx+10] ; value at 0x1010 = 0x2000 = list of containers
; now rdx = 0x2000
mov rdx, [rdx] ; value at 0x2000 = 0x5000 = the address of the pouch
mov edx, [rdx+4] ; value at 0x5004 = pouch ID
jmp short noerror ; force a short jump if no error found
{$except}
do_if_error:
; Error handling code block
.
.
noerror:
; No error code block
.
.
Please note that {$try}/{$except} is slower. If the code is executed often (e.g., 100 times per second), you may notice the process becoming very slow.
{$try}/{$except} is useful for pointers that cannot be validated with a simple check. For example: a pointer that is not zero but points to the wrong area.
In most cases, testing whether the register is zero will be enough.
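To see why a zero test alone cannot catch a non-zero pointer that points to the wrong area, here is a small Python model. It is only an analogy: Python's try/except stands in for {$try}/{$except}, a KeyError stands in for an access violation, and the address 0x7FFF0000 and the name `read_pouch_id` are made up.

```python
# Toy memory: the container-list pointer is non-zero but nothing lives there.
MEMORY = {
    0x1010: 0x7FFF0000,   # made-up stale pointer, not present in MEMORY
}


def read_pouch_id(backpack):
    """Return the pouch ID, None for a null pointer, or an error string."""
    try:
        containers = MEMORY[backpack + 0x10]
        if containers == 0:            # the cheap null check passes here...
            return None
        pouch = MEMORY[containers]     # ...but this read fails (KeyError)
        return MEMORY[pouch + 4]
    except KeyError:                   # stands in for the {$except} branch
        return "invalid pointer"


print(read_pouch_id(0x1000))           # prints invalid pointer
```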
Reference: Auto Assembler:TRY_EXCEPT