GetVRAMAddress

From WikiPrizm
Revision as of 13:33, 22 May 2012 by Tari (talk | contribs)


Synopsis

Header: fxcg/display.h
Syscall index: 0x01E6
Function signature: void *GetVRAMAddress(void)

Returns

A pointer to the system's VRAM.
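As a minimal usage sketch, the returned pointer can be treated as an array of 16-bit pixels, which is how the test program on this page uses it. The helper name set_pixel and the 384-pixel row width are assumptions made for illustration and are not part of the syscall's interface.

```c
#include <stddef.h>

/* Assumed row width in pixels (illustrative; not defined by the syscall). */
#define ROW_WIDTH 384

/* Hypothetical helper: writes one 16-bit color value at (x, y),
 * assuming pixels are stored row by row starting at `base`. */
static void set_pixel(unsigned short *base, int x, int y, unsigned short color)
{
    base[(size_t)y * ROW_WIDTH + (size_t)x] = color;
}
```

On the calculator this might be called as set_pixel(GetVRAMAddress(), 10, 20, 0xFFFF), followed by a display update such as Bdisp_PutDisp_DD().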

Comments

This function currently always returns 0xA8000000. That value is unlikely to change in the future, but the VRAM address has differed between calculator models, so using this syscall can help with portability or with detecting the calculator model.

If speed is a concern, it is slightly faster to use a hard-coded address for VRAM (a preprocessor define). If portability is a concern, use this function. Where both speed and portability matter, the program should compare the return value of this function against the hard-coded value, for example once at startup, to confirm that the hard-coded address is valid on the calculator it is running on.
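The check described above could look roughly like the following. This is a sketch under assumptions: the names VRAM_HARDCODED and vram_address_matches are hypothetical, and what the program does when the check fails (fall back to the dynamic pointer, or exit) is left to the caller.

```c
#include <stdint.h>

/* Hard-coded VRAM address quoted on this page. */
#define VRAM_HARDCODED ((uintptr_t)0xA8000000u)

/* Hypothetical helper: returns 1 when the address reported by the
 * syscall equals the compile-time constant, 0 otherwise. */
static int vram_address_matches(const void *reported)
{
    return (uintptr_t)reported == VRAM_HARDCODED;
}
```

A program would call vram_address_matches(GetVRAMAddress()) once at startup and use its hard-coded drawing routines only when the result is nonzero.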

In most real-world code, the cost of the additional indirection in the dynamic approach is amortized across a whole function, because the compiler can usually load the VRAM address once and keep it in a register for the rest of the function. If writes to VRAM are concentrated in a few functions, the speed penalty of the dynamic approach will probably be negligible.
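To illustrate the amortization argument, here is a sketch of a drawing routine that receives the VRAM pointer once and then performs many writes through it. The function name fill_pixels and its parameters are hypothetical; the point is only that the address is obtained a single time per call rather than once per pixel.

```c
#include <stddef.h>

/* Hypothetical helper: writes `color` to `count` consecutive 16-bit
 * pixels starting at `base`. The base pointer is read once, so the
 * cost of obtaining it is spread over all `count` stores. */
static void fill_pixels(unsigned short *base, size_t count, unsigned short color)
{
    unsigned short *p = base;
    unsigned short *end = base + count;

    while (p != end)
        *p++ = color;
}
```

On the calculator this might be called as fill_pixels(GetVRAMAddress(), count, 0), where count is the number of pixels to clear.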

Speed proof

The following program demonstrates that using a hard-coded VRAM address performs better (built with GCC 4.7.0):

#include <fxcg/display.h>

#define VRAM ((unsigned short *)0xA8000000)
static unsigned short *vram;

// VRAM location determined at runtime
void dynamic(void) {
    vram[0] = 0;
}

// Hard-coded VRAM location
void hard(void) {
    VRAM[0] = 0;
}

int main(void) {
    // Fetch the VRAM address once through the syscall
    vram = GetVRAMAddress();

    dynamic();
    Bdisp_PutDisp_DD();
    hard();
    Bdisp_PutDisp_DD();

    return 0;
}

Running objdump on the resulting object file (built with GCC's -Os option) shows the difference:

$ sh3eb-elf-objdump -rd vramtest.o
Disassembly of section .text:

00000000 <_dynamic>:
   0:   d1 02           mov.l   c <_dynamic+0xc>,r1     ! 0 <_dynamic>
   2:   e2 00           mov     #0,r2
   4:   61 12           mov.l   @r1,r1
   6:   00 0b           rts
   8:   21 21           mov.w   r2,@r1
   a:   00 09           nop
   c:   00 00           .word 0x0000
                        c: R_SH_DIR32   .bss
        ...

00000010 <_hard>:
  10:   d1 01           mov.l   18 <_hard+0x8>,r1       ! a8000000
  12:   e2 00           mov     #0,r2
  14:   00 0b           rts
  16:   21 21           mov.w   r2,@r1
  18:   a8 00           bra     fffff01c <_hard+0xfffff00c>

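The dynamic version needs one extra instruction, mov.l @r1,r1, to fetch the pointer from memory before it can store through it, while the hard-coded version loads the address directly from a constant placed after the function. (The bra shown at offset 0x18 is not code: it is the stored constant 0xA8000000, which objdump decodes as an instruction.) More aggressive optimization options (-O2 or higher, -flto) may reduce or even completely remove the advantage of the hard-coded address, but the dynamic version can never be faster.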