Sonic CD decompilation
Purpose
The aim of this matching decompilation project is to recreate the original source code of Sonic CD as faithfully as possible.
History
Sonic the Hedgehog CD was released in 1993 for the Sega Mega CD. Like other games developed for the platform, it was written in Motorola 68000 assembly.
In 1995, Sega and Intel collaborated to port Sonic CD to PCs equipped with a Pentium processor running Windows 95. As part of this venture, the game's assembly code had to be translated to C code.
The Windows PC version was then used as the basis for the version released as part of Sonic Gems Collection for the Sony PlayStation 2 and the Nintendo GameCube.
Some time after, it was found that the Sonic Gems Collection release contains debug symbols. However, the full extent of this was not discovered until 2022. After Sonic and Sega Retro member Devon Artmeier used the original compiler, Metrowerks CodeWarrior, to produce a disassembly, they found that the game's binaries contained a complete set of DWARF debugging information, which could be used to reconstruct the original source code.
A year later, in 2023, I read about the findings and started the decompilation project.
Status
All the ELF files that have debug information (which is all of them, save for the main program) have been decompiled, and 99.99% of the decompiled code compiles back to the same assembly.
The source code can be found in the Git repository.
Documentation
Sprite status table
Name | Description | Type |
---|---|---|
actno | Action Number | unsigned char |
actflg | Action Flags | unsigned char |
sproffset | Offset in VRAM | unsigned short |
patbase | Base address of pattern table | sprite_pattern** |
xposi | X direction offset within playfield | int_union |
yposi | Y direction offset within playfield | int_union |
xspeed | X direction speed | short_union |
yspeed | Y direction speed | short_union |
mspeed | Movement speed | short_union |
sprhsize | Horizontal offset from center of character to edges. | unsigned char |
sprvsize | Vertical offset from center of character to bottom. | unsigned char |
sprhs | Horizontal width | unsigned char |
sprpri | Sprite priority | unsigned char |
patno | Pattern number | unsigned char |
mstno | Master pattern number | short_union |
patcnt | Pattern counter | unsigned char |
pattim | Pattern timer counter | unsigned char |
pattimm | Pattern timer master | unsigned char |
colino | Collision size | unsigned char |
colicnt | Collision counter | unsigned char |
cddat | ? | unsigned char |
cdsts | ? | unsigned char |
r_no0 | Routine number 0 | unsigned char |
r_no1 | Routine number 1 | unsigned char |
direc | Angular orientation | short_union |
userflag | ? | short_union |
dummy | Padding bytes | unsigned char[2] |
actfree | Custom data | unsigned char[22] |
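
As a rough illustration, the table above could map to a C declaration along these lines. This is only a sketch: the field order is assumed to follow the table, the members of int_union and short_union are guesses (only the access path `.w.h` appears elsewhere on this page), a 32-bit int is assumed, and sprite_pattern is left as an opaque placeholder. The helper sprite_y_distance is hypothetical and is not part of the game's code.

```c
/* Opaque placeholder: the layout of sprite_pattern is not described here. */
typedef struct sprite_pattern sprite_pattern;

/* ASSUMPTION: a 32-bit value that can also be read as two 16-bit halves.
   Only `.w.h` is attested on this page; `l` and `w.l` are guesses. */
typedef union {
    int l;
    struct { short l; short h; } w;
} int_union;

/* ASSUMPTION: a 16-bit value with byte access. All member names are guesses. */
typedef union {
    short w;
    struct { unsigned char l; unsigned char h; } b;
} short_union;

/* Field names and types are taken from the table; the order is assumed. */
typedef struct {
    unsigned char    actno;       /* Action number */
    unsigned char    actflg;      /* Action flags */
    unsigned short   sproffset;   /* Offset in VRAM */
    sprite_pattern** patbase;     /* Base address of pattern table */
    int_union        xposi;       /* X offset within playfield */
    int_union        yposi;       /* Y offset within playfield */
    short_union      xspeed;      /* X direction speed */
    short_union      yspeed;      /* Y direction speed */
    short_union      mspeed;      /* Movement speed */
    unsigned char    sprhsize;    /* Center to edges, horizontal */
    unsigned char    sprvsize;    /* Center to bottom, vertical */
    unsigned char    sprhs;       /* Horizontal width */
    unsigned char    sprpri;      /* Sprite priority */
    unsigned char    patno;       /* Pattern number */
    short_union      mstno;       /* Master pattern number */
    unsigned char    patcnt;      /* Pattern counter */
    unsigned char    pattim;      /* Pattern timer counter */
    unsigned char    pattimm;     /* Pattern timer master */
    unsigned char    colino;      /* Collision size */
    unsigned char    colicnt;     /* Collision counter */
    unsigned char    cddat;       /* Unknown */
    unsigned char    cdsts;       /* Unknown */
    unsigned char    r_no0;       /* Routine number 0 */
    unsigned char    r_no1;       /* Routine number 1 */
    short_union      direc;       /* Angular orientation */
    short_union      userflag;    /* Unknown */
    unsigned char    dummy[2];    /* Padding bytes */
    unsigned char    actfree[22]; /* Custom data */
} sprite_status;

/* Hypothetical helper: vertical distance between two sprites, using the
   same `.yposi.w.h` access path that appears in TAGAMEB4::a_check. */
short sprite_y_distance(const sprite_status* a, const sprite_status* b)
{
    return (short)(a->yposi.w.h - b->yposi.w.h);
}
```

The sketch makes no claim about padding or the total size of the structure in the original binaries.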
Remaining differences
PLAYER6::sibi2
In the original assembly, the variable block_tbl is put on the stack. Compiling the decompiled code puts it in CPU register s0 instead. Presumably the original function took the address of block_tbl somehow (maybe in dead debug code?), which is why it was put on the stack.
A band-aid solution to temporarily get matching assembly so it doesn't throw off the rest of the ELF addresses is to declare block_tbl as short* volatile.
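
A minimal sketch of what that declaration looks like in C. Only the declaration `short* volatile block_tbl` comes from the text above; the function name sibi2_sketch, its signature, its body, and the table data are all hypothetical.

```c
/* Hypothetical table data, for illustration only. */
short sample_block_tbl[4] = { 10, 20, 30, 40 };

/* Hypothetical stand-in for PLAYER6::sibi2; the real function differs. */
short sibi2_sketch(int index)
{
    /* `volatile` placed after the `*` qualifies the pointer variable
       itself, not the shorts it points to. Every read of block_tbl must
       then go through memory, so the compiler cannot keep it only in a
       register. */
    short* volatile block_tbl = sample_block_tbl;
    return block_tbl[index];
}
```

Writing `volatile short*` instead would qualify the pointed-to data rather than the pointer, which is a different declaration from the one described above.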
SNDDO::MC_SONICCreate
The original assembly zeroes part of the stack after returning from calling hmx_bitmap_get_scan0_module while loading "YAMA_L3.BMP" and "YAMA_R3.BMP".
TA::GetPlayRound
When assigning lpScoreData->roundNo to ret, the original assembly has an extra addu instruction that adds zero.
TAGAMEB4::a_check
When computing actwk[0].yposi.w.h - pActwk->yposi.w.h, the original assembly has an extra addu instruction that adds zero.
Missing symbols
Types
The majority of structs and all unions were anonymous. This means that they were defined without a tag, and used through type aliases, which are not recorded in the debug information. As a result, it's easier to list the known types than the ones that were made up:
- _POINTL
- _RECT
- _RECTL
- brankodata
- draw_context
- dlink_export
- hmx_background
- hmx_bitmap
- hmx_ddagrid
- hmx_environment
- hmx_grid
- hmx_renderer_base
- hmx_renderer_context
- hmx_sprite
- hmx_surface
- ld_bitmap_inf
- ld_pack_header
- ld_scroll_header
- ld_sprite_header
- ld_sprite_inf
- tagPALETTEENTRY
- tagPOINT
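
The difference between a tagged and an anonymous struct can be sketched as follows. The member lists are illustrative rather than taken from the project, and example_size, point_sum, and size_area are hypothetical names.

```c
/* Tagged struct: the tag "tagPOINT" is part of the type itself, so the
   name can be recovered. The members shown here are illustrative. */
struct tagPOINT { long x; long y; };
typedef struct tagPOINT POINT; /* alias for the tagged type */

/* Anonymous struct: there is no tag, so the alias is its only name.
   If the alias is not recorded, the type has no recoverable name. */
typedef struct { short width; short height; } example_size;

/* Small helpers so that both types are actually used. */
long point_sum(const POINT* p) { return p->x + p->y; }
int size_area(const example_size* s) { return s->width * s->height; }
```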
Parameters
Unused parameters are not added to the debug information. However, they do affect the stack during compilation, which is how we can tell that they were present.
BIGBOM8::m_move1, BIGBOM8::m_move4
These functions are empty, so their parameter is unused. From the stack size and sister functions it can be deduced that the missing parameter is sprite_status* pActwk.
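
A sketch of what such a restored empty function looks like. The one-field sprite_status is a placeholder, the void return type is an assumption, and m_move0_sketch is a hypothetical sister function included only for contrast.

```c
/* Placeholder with a single field; the real sprite_status is larger. */
typedef struct { unsigned char r_no0; } sprite_status;

/* Hypothetical sister function that does read its parameter. */
void m_move0_sketch(sprite_status* pActwk)
{
    pActwk->r_no0 += 2;
}

/* Restored signature: the body is empty, so pActwk is never read. */
void m_move1(sprite_status* pActwk)
{
    (void)pActwk; /* added in this sketch only, to avoid an unused-parameter warning */
}
```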
BOSS_5::door_open
This function's parameter is unused. However, BOSS_5::egg6_dead2 calls it with a sprite_status pointer. From this it can be deduced that the missing parameter is sprite_status* pActwk.
BOSS_7::egg7beam_ini, BOSS_7::egg7beam_kemuri2
These functions' second parameter is unused. However, they are part of the object table created in BOSS_7::egg7beam, and by looking at the other functions it can be deduced that the missing parameter is sprite_status* a2.
BOSS_8::egg8_ini, BOSS_8::egg8_wait, BOSS_8::egg8_move_r, BOSS_8::egg8_move_l, BOSS_8::egg8_move_d, BOSS_8::egg8_move_u, BOSS_8::egg8_move_c, BOSS_8::egg8_tobi_u, BOSS_8::egg8_esc, BOSS_8::egg8_esc2
These functions' second parameter is unused. However, they are part of the object table egg8_act_tbl, and by looking at the other functions it can be deduced that the missing parameter is sprite_status* pMecawk.
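
The reasoning can be sketched in C: every entry in a table of function pointers must share one signature, so a function that ignores its second parameter still has to declare it. In this sketch the one-field sprite_status, the typedef name egg8_act_fn, the first parameter name, the void return type, all function bodies, and the functions egg8_move_sketch and egg8_dispatch_sketch are assumptions made for illustration.

```c
/* Placeholder with a single field; the real sprite_status is larger. */
typedef struct { unsigned char r_no0; } sprite_status;

/* Hypothetical typedef: the signature shared by all table entries. */
typedef void (*egg8_act_fn)(sprite_status* pActwk, sprite_status* pMecawk);

/* Ignores pMecawk, but must still accept it to fit the table. */
void egg8_ini(sprite_status* pActwk, sprite_status* pMecawk)
{
    (void)pMecawk;
    pActwk->r_no0 = 1; /* hypothetical body: advance to the next routine */
}

/* Hypothetical entry that uses both parameters, revealing the type. */
void egg8_move_sketch(sprite_status* pActwk, sprite_status* pMecawk)
{
    pMecawk->r_no0 = pActwk->r_no0;
}

egg8_act_fn egg8_act_tbl[] = { egg8_ini, egg8_move_sketch };

/* Hypothetical dispatcher: picks the entry by routine number. */
void egg8_dispatch_sketch(sprite_status* pActwk, sprite_status* pMecawk)
{
    egg8_act_tbl[pActwk->r_no0](pActwk, pMecawk);
}
```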
BOSS_8::egg8meca_ini, BOSS_8::egg8meca_chg1, BOSS_8::egg8meca_chg2
These functions' second parameter is unused. However, they are part of the object table meca_act_tbl, and by looking at the other functions it can be deduced that the missing parameter is sprite_status* pEggwk.
BOSS_8::egg8hane_normal, BOSS_8::egg8hane_wait
These functions' third parameter is unused. However, they are part of the object table hane_act_tbl, and by looking at the other functions it can be deduced that the missing parameter is sprite_status* pEggwk.
BOSS_8::egg8hane_kill
This function's second and third parameters are unused. However, it is part of the object table hane_act_tbl, and by looking at the other functions it can be deduced that the missing parameters are sprite_status* pMecawk and sprite_status* pEggwk.
COLI4::pcolspecial, COLI5::pcolspecial, COLI7::pcolspecial, COLI8::pcolspecial
These functions have three unused parameters. However, ColliHitChk calls them. As it passes its own parameters through as the arguments, the types and names can be deduced to be short iXposi, short iChkPosi, and short iD5.
DAI_RD5::type06_02
This function is empty, so its parameter is unused. From the stack size and sister functions it can be deduced that the missing parameter is sprite_status* pActwk.
DLLMAIN::LibMain
This old Windows 3.1 function was stubbed, so all of its parameters are unused. Old documentation contains the expected function signature: `LibMain(void* hInstance, unsigned short wDataSeg, unsigned short wHeapSize, char* lpsxCmdLine)`.
DLLMAIN::WEP
This old Windows 3.1 function was stubbed, so its parameter is unused. Old documentation contains the expected function signature: WEP(int nParameter).
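
As a sketch, the two restored DLLMAIN stubs might look like this. The parameter lists follow the signatures quoted above; the int return type and the return value of 1 are assumptions based on the usual Windows 3.1 convention, not something recovered from the debug information.

```c
/* Stubbed DLL entry point: none of the parameters are read. */
int LibMain(void* hInstance, unsigned short wDataSeg,
            unsigned short wHeapSize, char* lpsxCmdLine)
{
    (void)hInstance;
    (void)wDataSeg;
    (void)wHeapSize;
    (void)lpsxCmdLine;
    return 1; /* ASSUMPTION: non-zero reports success */
}

/* Stubbed DLL exit procedure: the parameter is not read. */
int WEP(int nParameter)
{
    (void)nParameter;
    return 1; /* ASSUMPTION: same convention as above */
}
```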
ENEMY::ene_tama
This function is empty, so its parameter is unused. From the stack size and sister functions it can be deduced that the missing parameter is sprite_status* pActwk.
GAITOU73::get_x
This function's parameter is unused. However, GAITOU73::gaitou73_01 calls it. From this it can be deduced that the missing parameter is sprite_status* pActwk.
GAME::SetUseOk
This function was stubbed, so its parameters are unused. Apart from the stack size, the only information remaining is that other code calls it with three numbers. As a result, a placeholder parameter list is used: short unknown1, short unknown2, short unknown3.
PLAYSUB4::bou_move2
This function is empty, so its parameter is unused. From the stack size and sister functions it can be deduced that the missing parameter is sprite_status* pActwk.
UDBLK4::udblk4_type2
This function is empty, so its parameter is unused. From the stack size and sister functions it can be deduced that the missing parameter is sprite_status* pActwk.
ASCIISET::ascchg
This function was stubbed, so its parameter is unused. However, it is part of the object table created in ASCIISET::sprascii, where the function parameter is defined as sprite_status_lpl*.
ASCIISET::ascspr_chk, ASCIISET::ascspr_set
These functions' second parameter is unused. However, they are part of the object table created in ASCIISET::ascii_sprite, and by looking at the other functions it can be deduced that the missing parameter is sprite_status_lpl* pAscwk.
AVIOPNDO::AVIPaint
This function was stubbed, so its parameter is unused. However, AVIOPNEN::DLLPaint calls it with unsigned int hdc.
AVIOPNEN::DLLNotify
This function was stubbed, so its parameters are unused. It is not called.
AVIOPNDO::ReadDIB
This function was stubbed, so its parameter is unused. However, AVIOPNEN::DLLInit calls it with char fileName[27]. The parameter was restored as char* fileName.
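
The step from char fileName[27] to char* fileName follows from how C passes arrays: an array argument decays to a pointer to its first element. A minimal sketch, in which the return types, the return value, the file name string, and the caller DLLInit_sketch are all hypothetical:

```c
#include <string.h>

/* Stubbed callee: the parameter is declared but never read.
   The int return type and value are assumptions. */
int ReadDIB(char* fileName)
{
    (void)fileName;
    return 0;
}

/* Hypothetical caller standing in for AVIOPNEN::DLLInit. */
int DLLInit_sketch(void)
{
    char fileName[27];
    strcpy(fileName, "EXAMPLE.BMP"); /* hypothetical file name */
    return ReadDIB(fileName);        /* char[27] decays to char* here */
}
```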
BESTGRID::OEGridCreate, BESTGRID::OEGridDelete, SNDGRID::OEGridCreate, SNDGRID::OEGridDelete, STGGRID::OEGridCreate, STGGRID::OEGridDelete
These functions were stubbed, so their parameter is unused. However, these functions are not stubbed in other parts of the code, from which the parameter, `unsigned short indx`, could be recovered.
HMX_OEEACTL::EAError
This function was stubbed, so its parameters are unused. However, this function is not stubbed in TA, from which the parameters could be recovered: int ret, int line, and char* str.
HMX_OEEACTL::ld_bitmap_file2
This function has two unused parameters. However, SOUNDTST::SNDDO calls it. From this call, and the names of the other parameters, it can be deduced that the missing parameters are int sy and int dy.
TACOLOR::TAColorSet
This function's parameter is unused. However, TA::game_init calls it with a constant number. From the stack size it can be deduced that the parameter is an int, but the name is unknown.