Sonic CD decompilation

Purpose

The aim of this matching decompilation project is to recreate the original source code of Sonic CD as faithfully as possible.

History

Sonic the Hedgehog CD was released in 1993 for the Sega Mega CD. Like other games developed for the platform, it was written in Motorola 68000 assembly.

In 1995, Sega and Intel collaborated to port Sonic CD to PCs equipped with a Pentium processor running Windows 95. As part of this venture, the game's assembly code had to be translated to C code.

The Windows PC version was then used as the basis for the version released as part of Sonic Gems Collection for the Sony PlayStation 2 and the Nintendo GameCube.

Some time after, it was found that the Sonic Gems Collection release contains debug symbols. However, the full extent of this was not discovered until 2022. After Sonic and Sega Retro member Devon Artmeier used the original compiler, Metrowerks CodeWarrior, to produce a disassembly, they found that the game's binaries contained a complete set of DWARF debugging information, which could be used to reconstruct the original source code.

A year later, in 2023, I read about the findings and started the decompilation project.

Status

All the ELF files that have debug information (which is all of them, save for the main program) have been decompiled, and 99.99% of the decompiled code compiles back to the same assembly.

The source code can be found in the Git repository.

Documentation

Sprite status table

| Name | Description | Type |
| --- | --- | --- |
| actno | Action Number | unsigned char |
| actflg | Action Flags | unsigned char |
| sproffset | Offset in VRAM | unsigned short |
| patbase | Base address of pattern table | sprite_pattern** |
| xposi | X direction offset within playfield | int_union |
| yposi | Y direction offset within playfield | int_union |
| xspeed | X direction speed | short_union |
| yspeed | Y direction speed | short_union |
| mspeed | Movement speed | short_union |
| sprhsize | Horizontal offset from center of character to edges. | unsigned char |
| sprvsize | Vertical offset from center of character to bottom. | unsigned char |
| sprhs | Horizontal width | unsigned char |
| sprpri | Sprite priority | unsigned char |
| patno | Pattern number | unsigned char |
| mstno | Master pattern number | short_union |
| patcnt | Pattern counter | unsigned char |
| pattim | Pattern timer counter | unsigned char |
| pattimm | Pattern timer master | unsigned char |
| colino | Collision size | unsigned char |
| colicnt | Collision counter | unsigned char |
| cddat | ? | unsigned char |
| cdsts | ? | unsigned char |
| r_no0 | Routine number 0 | unsigned char |
| r_no1 | Routine number 1 | unsigned char |
| direc | Angular orientation | short_union |
| userflag | ? | short_union |
| dummy | Padding bytes | unsigned char[2] |
| actfree | Custom data | unsigned char[22] |
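
Read as C, the table corresponds roughly to the struct below. This is a minimal sketch, not the project's actual header: the field names, order, and types come from the table, but the definitions of int_union, short_union, and sprite_pattern are assumptions, since the table only names them. The w.h member path is modelled on the expression actwk[0].yposi.w.h quoted under TAGAMEB4::a_check; the helper function sprite_y_high is hypothetical.

```c
/* Hypothetical helper types: the table names them but does not define them. */
typedef struct sprite_pattern sprite_pattern;   /* left opaque in this sketch */

typedef union {
    int l;                      /* whole value; member name is assumed */
    struct { short l, h; } w;   /* two halves; member order is assumed */
} int_union;

typedef union {
    short w;                            /* whole value; member name is assumed */
    struct { unsigned char l, h; } b;   /* two bytes; names and order are assumed */
} short_union;

/* Fields in the order listed in the sprite status table. */
typedef struct sprite_status {
    unsigned char    actno;       /* action number */
    unsigned char    actflg;      /* action flags */
    unsigned short   sproffset;   /* offset in VRAM */
    sprite_pattern** patbase;     /* base address of pattern table */
    int_union        xposi;       /* X offset within playfield */
    int_union        yposi;       /* Y offset within playfield */
    short_union      xspeed;
    short_union      yspeed;
    short_union      mspeed;
    unsigned char    sprhsize;
    unsigned char    sprvsize;
    unsigned char    sprhs;
    unsigned char    sprpri;
    unsigned char    patno;
    short_union      mstno;
    unsigned char    patcnt;
    unsigned char    pattim;
    unsigned char    pattimm;
    unsigned char    colino;
    unsigned char    colicnt;
    unsigned char    cddat;       /* meaning unknown */
    unsigned char    cdsts;       /* meaning unknown */
    unsigned char    r_no0;       /* routine number 0 */
    unsigned char    r_no1;       /* routine number 1 */
    short_union      direc;       /* angular orientation */
    short_union      userflag;    /* meaning unknown */
    unsigned char    dummy[2];    /* padding bytes */
    unsigned char    actfree[22]; /* custom per-object data */
} sprite_status;

/* Hypothetical helper: reads one half of the Y position via the w.h path. */
short sprite_y_high(const sprite_status* pActwk)
{
    return pActwk->yposi.w.h;
}
```

Which half of the value w.h selects depends on the real union definition and the target's byte order, neither of which is stated here.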

Remaining differences

PLAYER6::sibi2

In the original assembly, the variable block_tbl is put on the stack. Compiling the decompiled code puts it in CPU register s0 instead. Presumably the original function took the address of block_tbl somehow (maybe in dead debug code?), which is why it was put on the stack.

A band-aid solution to temporarily get matching assembly so it doesn't throw off the rest of the ELF addresses is to declare block_tbl as short* volatile.
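
The shape of that workaround can be sketched as follows. Only the name block_tbl and the declaration short* volatile come from the text; the function name sibi2_sketch, its parameter, its body, and the example_blocks data are hypothetical stand-ins.

```c
/* Hypothetical stand-in data; the real table is not shown in this document. */
static short example_blocks[4] = { 10, 20, 30, 40 };

/* Hypothetical function body illustrating the declaration only. */
short sibi2_sketch(int index)
{
    /* volatile qualifies the pointer variable itself, not the shorts it
       points at, so every read of block_tbl must go through memory. */
    short* volatile block_tbl = example_blocks;

    return block_tbl[index];
}
```

Note the position of the qualifier: volatile short* would instead make the pointed-to data volatile and leave the compiler free to keep the pointer in a register. Keeping a volatile local in a stack slot is what compilers do in practice, not something the C standard spells out.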

decomp.me scratch

SNDDO::MC_SONICCreate

The original assembly zeroes part of the stack after returning from calling hmx_bitmap_get_scan0_module while loading "YAMA_L3.BMP" and "YAMA_R3.BMP".

decomp.me scratch

TA::GetPlayRound

When assigning lpScoreData->roundNo to ret, the original assembly has an extra addu instruction that adds zero.

decomp.me scratch

TAGAMEB4::a_check

When computing actwk[0].yposi.w.h - pActwk->yposi.w.h, the original assembly has an extra addu instruction that adds zero.

decomp.me scratch

Missing symbols

Types

The majority of structs and all unions were anonymous. This means that they were defined without a tag, and used through type aliases, which are not recorded in the debug information. As a result, it's easier to list the known types than the ones that were made up:

Parameters

Unused parameters are not added to the debug information. However, they do affect the stack during compilation, which is how we can tell that they were present.

BIGBOM8::m_move1, BIGBOM8::m_move4

These functions are empty, so their parameter is unused. From the stack size and sister functions it can be deduced that the missing parameter is sprite_status* pActwk.

BOSS_5::door_open

This function's parameter is unused. However, BOSS_5::egg6_dead2 calls it with a sprite_status pointer. From this it can be deduced that the missing parameter is sprite_status* pActwk.

BOSS_7::egg7beam_ini, BOSS_7::egg7beam_kemuri2

These functions' second parameter is unused. However, they are part of the object table created in BOSS_7::egg7beam, and by looking at the other functions it can be deduced that the missing parameter is sprite_status* a2.

BOSS_8::egg8_ini, BOSS_8::egg8_wait, BOSS_8::egg8_move_r, BOSS_8::egg8_move_l, BOSS_8::egg8_move_d, BOSS_8::egg8_move_u, BOSS_8::egg8_move_c, BOSS_8::egg8_tobi_u, BOSS_8::egg8_esc, BOSS_8::egg8_esc2

These functions' second parameter is unused. However, they are part of the object table egg8_act_tbl, and by looking at the other functions it can be deduced that the missing parameter is sprite_status* pMecawk.
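
The reasoning rests on the fact that every entry in a table of function pointers must share one function type, so a routine that ignores an argument still has to declare it. A minimal sketch, assuming a void return type, a first parameter named pEggwk, and a reduced stand-in for sprite_status; all names ending in _sketch are hypothetical, as are the function bodies.

```c
/* Reduced stand-in for sprite_status; only one field is needed here. */
typedef struct sprite_status {
    unsigned char r_no0;
} sprite_status;

/* This routine never reads pMecawk, so the debug information would not
   list it, but the table's function type still requires it. */
void egg8_wait_sketch(sprite_status* pEggwk, sprite_status* pMecawk)
{
    (void)pMecawk;
    pEggwk->r_no0 += 1;
}

/* A sister routine that does use the second parameter, which is how its
   type and name can be recovered for the routines that do not. */
void egg8_move_sketch(sprite_status* pEggwk, sprite_status* pMecawk)
{
    pMecawk->r_no0 = pEggwk->r_no0;
}

/* All entries have the same type, whether or not they use every argument. */
static void (*const act_tbl_sketch[])(sprite_status*, sprite_status*) = {
    egg8_wait_sketch,
    egg8_move_sketch,
};

/* Hypothetical dispatcher that calls a table entry by index. */
void egg8_dispatch_sketch(sprite_status* pEggwk, sprite_status* pMecawk, int routine)
{
    act_tbl_sketch[routine](pEggwk, pMecawk);
}
```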

BOSS_8::egg8meca_ini, BOSS_8::egg8meca_chg1, BOSS_8::egg8meca_chg2

These functions' second parameter is unused. However, they are part of the object table meca_act_tbl, and by looking at the other functions it can be deduced that the missing parameter is sprite_status* pEggwk.

BOSS_8::egg8hane_normal, BOSS_8::egg8hane_wait

These functions' third parameter is unused. However, they are part of the object table hane_act_tbl, and by looking at the other functions it can be deduced that the missing parameter is sprite_status* pEggwk.

BOSS_8::egg8hane_kill

This function's second and third parameters are unused. However, it is part of the object table hane_act_tbl, and by looking at the other functions it can be deduced that the missing parameters are sprite_status* pMecawk and sprite_status* pEggwk.

COLI4::pcolspecial, COLI5::pcolspecial, COLI7::pcolspecial, COLI8::pcolspecial

These functions have three unused parameters. However, ColliHitChk calls them. As it's passing in the same arguments as the function's own parameters, the types and names can be deduced to be short iXposi, short iChkPosi, and short iD5.

DAI_RD5::type06_02

This function is empty, so its parameter is unused. From the stack size and sister functions it can be deduced that the missing parameter is sprite_status* pActwk.

DLLMAIN::LibMain

This old Windows 3.1 function was stubbed, so all of its parameters are unused. Old documentation contains the expected function signature: LibMain(void* hInstance, unsigned short wDataSeg, unsigned short wHeapSize, char* lpsxCmdLine).

DLLMAIN::WEP

This old Windows 3.1 function was stubbed, so its parameter is unused. Old documentation contains the expected function signature: WEP(int nParameter).

ENEMY::ene_tama

This function is empty, so its parameter is unused. From the stack size and sister functions it can be deduced that the missing parameter is sprite_status* pActwk.

GAITOU73::get_x

This function's parameter is unused. However, GAITOU73::gaitou73_01 calls it. From this it can be deduced that the missing parameter is sprite_status* pActwk.

GAME::SetUseOk

This function was stubbed, so its parameters are unused. Apart from the stack size, the only information remaining is that other code calls it with three numbers. As a result, a placeholder parameter list is used: short unknown1, short unknown2, short unknown3.

PLAYSUB4::bou_move2

This function is empty, so its parameter is unused. From the stack size and sister functions it can be deduced that the missing parameter is sprite_status* pActwk.

UDBLK4::udblk4_type2

This function is empty, so its parameter is unused. From the stack size and sister functions it can be deduced that the missing parameter is sprite_status* pActwk.

ASCIISET::ascchg

This function was stubbed, so its parameter is unused. However, it is part of the object table created in ASCIISET::sprascii, where the function parameter is defined as sprite_status_lpl*.

ASCIISET::ascspr_chk, ASCIISET::ascspr_set

These functions' second parameter is unused. However, they are part of the object table created in ASCIISET::ascii_sprite, and by looking at the other functions it can be deduced that the missing parameter is sprite_status_lpl* pAscwk.

AVIOPNDO::AVIPaint

This function was stubbed, so its parameter is unused. However, AVIOPNEN::DLLPaint calls it with unsigned int hdc.

AVIOPNEN::DLLNotify

This function was stubbed, so its parameters are unused. It is not called.

AVIOPNDO::ReadDIB

This function was stubbed, so its parameter is unused. However, AVIOPNEN::DLLInit calls it with char fileName[27]. The parameter was restored as char* fileName.
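
Restoring the parameter as char* follows from how C passes arrays: a char[27] argument decays to a pointer to its first element at the call site. A minimal sketch, assuming an int return type and a made-up file name; the names ending in _sketch and both function bodies are hypothetical.

```c
#include <string.h>

/* Stub: the parameter is declared but never read. */
int ReadDIB_sketch(char* fileName)
{
    (void)fileName;
    return 0;
}

/* Hypothetical caller with a 27-byte local buffer, as described in the text. */
int DLLInit_sketch(void)
{
    char fileName[27];

    strcpy(fileName, "EXAMPLE.BMP");   /* made-up name, fits in 27 bytes */

    /* The array argument decays to char*, matching the stub's parameter. */
    return ReadDIB_sketch(fileName);
}
```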

BESTGRID::OEGridCreate, BESTGRID::OEGridDelete, SNDGRID::OEGridCreate, SNDGRID::OEGridDelete, STGGRID::OEGridCreate, STGGRID::OEGridDelete

These functions were stubbed, so their parameter is unused. However, these functions are not stubbed in other parts of the code, from which the parameter, `unsigned short indx`, could be recovered.

HMX_OEEACTL::EAError

This function was stubbed, so its parameters are unused. However, this function is not stubbed in TA, from which the parameters could be recovered: int ret, int line, and char* str.

HMX_OEEACTL::ld_bitmap_file2

This function has two unused parameters. However, SOUNDTST::SNDDO calls it. From this call, and the names of the other parameters, it can be deduced that the missing parameters are int sy and int dy.

TACOLOR::TAColorSet

This function's parameter is not used. However, TA::game_init calls it with a constant number. From the stack size it can be deduced that the parameter is an int, but the name is unknown.