Assembly language for CE newbie #2: Pointers

Posted: Thu Apr 11, 2024 2:48 am
by bbfox

This article is for newbies.

Note: I'm not an expert. I'm just someone who knows some instructions and can write Auto Assembler scripts. This article shares my experiences.

Warning: This post features AI-assisted content. While I created the first document, an AI arranged the syntax & wording, which I then curated. If you prefer not to engage with such material, please use your browser's back button.






I never fully understood pointers, especially in C/C++, where you see things like *ptr, **ptr, or int *(*flt)(void).
Well, I'm not a programmer, so I leave those alone.


Pointers show up all the time in memory structures, where they are used to store complex data. Imagine we have a backpack:

Items in Backpack:

- a pouch
- a knife
- an apple
- a torch
- a small bag

Here we have two containers inside the backpack:

- a pouch
- a small bag

Items in the pouch:

- a gem
- a coin

Items in the small bag:

- some herbs

Players can add items to the backpack or remove them, and can even drop the pouch on the ground without emptying it first.

The current backpack then has a structure like this:

Backpack
-----------------------------------------------------
pouch   knife   apple   torch   small_bag
+-------------------------------+--------------------
|                               |
--------------                  |
gem   coin                      |
                                +--------------------
                                herb1   herb2   ...

Backpack structure in memory
Since contents in the backpack are dynamic, a structure with pointers is often used.
Converted into a memory structure, the backpack may look like this:
(Please note this is not a working structure, it's just for explanation.)

Address      Object
--------------------------------------
0x1000       Backpack Item ID
0x1004       Backpack Unique ID
0x1008       Points to the address list of objects in the backpack: 0x3000
0x1010       Points to the address list of containers: 0x2000
0x1018       ...other data
.
.
0x2000       Points to the address of container #1 (pouch): 0x5000
0x2008       Points to the address of container #2 (small bag): 0x5100
0x2010       8 bytes of 0 (null): end of the container list
.
.
0x3000       Item ID: knife (1)
0x3004       Knife unique ID
0x3008       Item ID: apple (2)
0x300C       Apple unique ID
0x3010       Item ID: torch (3)
0x3014       Torch unique ID
0x3018       4 bytes of 0 (null): end of the item list
.
.
0x5000       Item type ID: pouch
0x5004       Pouch unique ID
0x5008       Item ID: gem
0x500C       Gem unique ID
0x5010       Item ID: coin
0x5014       Coin unique ID
0x5018       4 bytes of 0 (null): end of the item list
.
.
0x5100       Item type ID: small bag
0x5104       Small bag unique ID
0x5108       Item ID: herb1
0x510C       Herb1 unique ID
0x5110       Item ID: herb2
0x5114       Herb2 unique ID
0x5118       ...
.
.
0x51??       4 bytes of 0 (null): end of the item list

To get item info from the backpack, we need to start from the backpack and go through pointers to get the final data.

Example 1: To get the knife:

  • Find the backpack (0x1000)
  • The item list is reached through the first pointer, which is stored at the backpack address + 8 = 0x1008
  • Read the value at address 0x1008; we get 0x3000, meaning the item list of the backpack is in memory at 0x3000
  • Finally, walk the item list starting at 0x3000 and compare each item ID with the knife ID (1).
Address      Value
--------------------------------------
0x1000       ???
0x1004       ???
0x1008       0x3000   <== content of address 0x1008 = 0x3000
0x1010       0x2000
0x1018       ???
.
.
0x3000       1        <== compare with the knife ID (=1); if it does not match, move on by 8 bytes
0x3004       ???
0x3008       2
0x300C       ???
0x3010       3
0x3014       ???
0x3018       0

Example 2:
The same approach works for the gem in the pouch: start from the top (the backpack) and follow the pointers:
Backpack -> containers -> pouch -> gem

Searching for the gem can take more steps:
Backpack -> items -> (not found)
Backpack -> containers -> container #1 (pouch) -> items in container #1 -> (Found)
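
The container branch of this search could be sketched in Auto Assembler style as below. It is only a rough sketch under assumptions, not a working script: the gem item ID (4) and all label names are made up for illustration, the addresses come from the example table, and each item entry is assumed to be 8 bytes (4-byte item ID followed by 4-byte unique ID), as in the backpack's item list. Depending on your CE version you may need to declare the labels with label(...).

```asm
mov rdx, 1000        ; assumed: address of the backpack (hex, from the table)
mov r15, [rdx+10]    ; r15 = container list (0x2000)
test r15, r15        ; null pointer = no container list
jz gem_not_found

next_container:
mov r14, [r15]       ; r14 = address of the current container (0x5000, 0x5100, ...)
test r14, r14        ; null entry = end of the container list
jz gem_not_found
lea r13, [r14+8]     ; r13 = first item entry (skip type ID + unique ID)

next_item:
mov r12d, [r13]      ; r12d = item ID of the current entry
test r12d, r12d      ; item ID 0 = end of this container's items
jz container_done
cmp r12d, 4          ; 4 = gem (hypothetical ID, not given in the table)
je gem_found
add r13, 8           ; next entry: item ID + unique ID = 8 bytes
jmp next_item

container_done:
add r15, 8           ; next pointer in the container list
jmp next_container

gem_found:
mov eax, [r13+4]     ; eax = unique ID of the gem
; Found code block
jmp gem_end

gem_not_found:
; Not found code block

gem_end:
```

Note that r15 keeps the position in the container list while r13 walks the items, so the outer loop can continue after an inner loop ends. In a real injected script, every register the sketch overwrites would have to be saved and restored (push/pop) so the game's own code is not disturbed.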

A game's data structures can be analyzed with CE's built-in "Dissect data/structures" tool. In most cases, you will need to adjust the data types in the dissected structure.



Using assembly language to access pointers

Back to the topic: in assembly, we can follow a pointer into its structure directly with a few instructions.
Using the example above, suppose we want the pouch's unique ID and already know where the backpack is:


Example 1: Get pouch ID directly

mov rdx, 1000     ; suppose we already know the address of the backpack (Auto Assembler reads numbers as hex: 0x1000)
mov rdx, [rdx+10] ; read the value stored at 0x1010 = 0x2000 = list of containers
                  ; now rdx = 0x2000
mov rdx, [rdx]    ; read the value stored at 0x2000 = 0x5000 = the address of the pouch
mov edx, [rdx+4]  ; read the value stored at 0x5004 = pouch unique ID

As you can see, there is no "back" operation for a pointer. To go back to the upper level, you need to save the address of the upper level first.
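
A small sketch of that idea, using the example table: each level is kept in its own register, so the container list is still available after we have looked inside the pouch. The register choice is arbitrary and only for illustration.

```asm
mov rdx, 1000     ; assumed: address of the backpack (hex, from the table)
mov rcx, [rdx+10] ; rcx = container list (0x2000); rdx still holds the backpack
mov r8, [rcx]     ; r8 = pouch (0x5000); rcx still holds the container list
mov eax, [r8+4]   ; eax = pouch unique ID
mov r8, [rcx+8]   ; "go back" one level via rcx: r8 = small bag (0x5100)
mov r9d, [r8+4]   ; r9d = small bag unique ID
```

If Example 1 had reused rdx for every step, the address 0x2000 would already be overwritten by the time we wanted container #2, and we would have to start again from the backpack.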

In most cases, we only know part of the data structure, although for some games (e.g., Unity or Unreal games) tools can recover more of it.


Example 2: Find if there is a torch in the backpack

mov rdx, 1000          ; suppose we already know the address of the backpack
mov r15, [rdx+8]       ; address list of items in the backpack
xor r14, r14           ; set r14 to zero

chk_loop:              ; define a label for the loop
mov r13d, [r15+r14*8]  ; get the current item ID (the index must be 64-bit, like the base register)

cmp r13d, 0            ; if end of item list (item ID = 0)
je not_found           ;   go to the not-found code block

cmp r13d, 3            ; if item id is 3 = torch
je do_found            ;   go to the found code block

inc r14d               ; if not found, search for the next
jmp chk_loop           ;

do_found:
; Found code block
.
.

jmp endp
not_found:
; Not found code block
.
.

endp:


Avoid invalid pointer operations

In most cases, the pointer data is dynamic, or the code is shared by different objects. To avoid invalid pointer operations, we should perform some checks:

  • Avoid access to a null pointer (0x0).
  • Invalid pointer access can cause the program to crash.

To avoid access to a null pointer, we can add a test instruction right after each pointer is loaded:

mov rdx, 1000     ; suppose we already know the address of the backpack
mov rdx, [rdx+10] ; read the value stored at 0x1010 = 0x2000 = list of containers
test rdx, rdx     ; test if rdx is zero = invalid pointer
jz end_search

mov rdx, [rdx]    ; read the value stored at 0x2000 = 0x5000 = the address of the pouch
test rdx, rdx     ; test if rdx is zero = invalid pointer
jz end_search

mov edx, [rdx+4]  ; read the value stored at 0x5004 = pouch unique ID
test edx, edx     ; test if the ID is zero = nothing there (edx holds a 4-byte value, not a pointer)
jz end_search

end_search:
.
.

Or use CE's built-in {$try}/{$except} blocks:

mov rdx, 1000     ; suppose we already know the address of the backpack

{$try}
mov rdx, [rdx+10] ; read the value stored at 0x1010 = 0x2000 = list of containers
                  ; now rdx = 0x2000
mov rdx, [rdx]    ; read the value stored at 0x2000 = 0x5000 = the address of the pouch
mov edx, [rdx+4]  ; read the value stored at 0x5004 = pouch unique ID
jmp short noerror ; no error occurred: short jump over the error handling block
{$except}

do_if_error:
; Error handling code block
.
.
noerror:
; No error code block
.
.

Please note that {$try}/{$except} is slower. If the code is executed often (e.g., 100 times per second), you may notice the process becoming very slow.
{$try}/{$except} is best kept for pointers that a zero test cannot catch. For example: the pointer is not zero but points to the wrong area.

In most cases, testing whether the register is zero is enough.
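
As an illustration, here is the torch search from Example 2 with one extra zero test on the item list pointer. This is a sketch based on the example table only; the label names are made up, and the loop steps the pointer itself instead of using an index register.

```asm
mov rdx, 1000     ; assumed: address of the backpack (hex, from the table)
mov r15, [rdx+8]  ; r15 = item list pointer (0x3000 in the table)
test r15, r15     ; null pointer = no item list
jz torch_done     ;   skip the loop, never read from address 0

torch_loop:
mov r13d, [r15]   ; r13d = current item ID
test r13d, r13d   ; item ID 0 = end of the list
jz torch_done
cmp r13d, 3       ; 3 = torch
je torch_found
add r15, 8        ; next entry: item ID + unique ID = 8 bytes
jmp torch_loop

torch_found:
; Found code block

torch_done:
```

The zero test only protects against a missing list. A pointer that is not zero but points to freed or unrelated memory would still get through, which is the case {$try}/{$except} is meant for.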

Reference: Auto Assembler:TRY_EXCEPT