Last update: 26.10.2014: video of properly textured X-wing

Introduction

Star Wars: Rogue Squadron 3D from Factor5/LucasArts, released in 1998, was one of my favourite games back in the day. With fast, action-filled gameplay (reminding me of my other all-time favourite - Tyrian) and the attractive setting of the SW universe presented in outstanding graphics, it was my clear choice for many fun-filled afternoons. The game never reached the level of popularity of the X-wing/TIE Fighter space sim series among PC gamers though. It was very popular on Nintendo 64 however, and the series continued on GameCube (RS2: Rogue Leader, RS3: Rebel Strike), never to be seen on PC again. (There actually was a similar follow-up game called Battle for Naboo, but I didn't play it until many years later and never quite liked it.)

As was often the case, curious fans researched the data formats of their beloved X-wing space sims and created many modifications, for example higher resolution model replacements, new missions etc. However, modifications for Rogue Squadron never reached this level, for one reason or another. There never was any specification for the 3D model formats the game used (or any of its other file formats), so no improved models were created. And many times, years after the game's release, I thought that somewhat higher-poly 3D models might improve the game a bit. And having some new levels would be nice too...

Basically all tools that dealt with the game data came from one person, Luka/Araz (thank you!). One program especially caught my attention: RogueDat. It was able to view and extract files from the packed filesystem file the game uses. However, there weren't any file extensions, so the data formats remained a mystery to me. The source code for the game was (sadly) never released, and with Factor5 and now LucasArts gone, it's probably lost forever.

Recently, I've been doing some data reverse-engineering, and somehow Rogue Squadron crossed my mind. I got curious, and as I like solving riddles, I decided to try to figure out at least some of the data formats. And inspired by the work of Fabien Sanglard, I decided to write down and publish my notes about what I've found out so far. The info is split into several sections:

Program

RS had some quite cool graphics for its time, and it was crammed into a mere 50MB of data, which always fascinated me. The data size had to be kept low so it would fit on a Nintendo 64 cartridge. The PC version is built on DirectX 6 and can use Direct3D or Glide for 3D acceleration. Sadly, there are some inconveniences with running the game on new machines:

The fix for the first issue that worked for me is to run the game through D3DWindower: it forces the game to run in windowed mode instead of fullscreen. This works nicely enough, but the window resolution switches between the menu (640x480) and the level stages, which is distracting. Higher resolutions can be enabled by using a Glide wrapper - nGlide and dgVoodoo2 work nicely. They also help with the camera issue somewhat, but it's not perfect. The scrolling-text issue is likely related to the camera problem.

I've been looking at the game disassembly in IDA Free; digging out interesting info is a tedious task. What I've figured out so far is that if you put a variable "ROGUE_SQUADRON_ENABLE_LOGFILE" with value "1" into your environment variables, the game writes a logfile. Assertions have been compiled into the executable, and assertion messages contain the complete path of the file they came from, so for some blocks of code you can figure out which source files they came from.

D:\Scoundrel\Code\RoguePC\src\ seems to contain the main game source, F:\btm\INSANE\source\ seems to contain the code for INSANE - a LucasArts library, though I have no idea to what extent it was used. BTM probably stands for "Behind the Magic", an interactive CD-ROM Star Wars encyclopedia released around the same time as Rogue Squadron.
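The logfile switch mentioned above can be set for a single launch instead of system-wide. This is a minimal sketch: only the variable name and its value come from the disassembly; the helper names are made up for illustration, the executable path is left to the caller, and where the game puts the logfile is not covered here.

```python
import os
import subprocess

# Variable name and value as found in the game executable.
LOG_VAR = "ROGUE_SQUADRON_ENABLE_LOGFILE"


def logging_env(base=None):
    """Return a copy of the environment with the logfile switch set to "1"."""
    env = dict(os.environ if base is None else base)
    env[LOG_VAR] = "1"
    return env


def launch_with_logfile(exe_path, cwd=None):
    """Start the game with logging enabled (hypothetical helper).

    exe_path is whatever the game executable is called on your machine.
    """
    return subprocess.Popen([exe_path], cwd=cwd, env=logging_env())
```

Passing a modified copy of the environment keeps the variable out of the rest of the system.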

Having the source would be nice: to fix some issues with running the game, add higher resolutions or a new renderer, make the game multiplatform (Linux, ...) and probably more stuff. One can dream.

Filesystem

The main game data is packed into one archive file: DATA.DAT. It's split into 2 or 3 sections (depending on the game version), which are described by entries in the DATA.HDR file. The program doesn't verify any checksum of the data, so it can be freely modified (great!). The archive can be (partially) extracted by RogueDat or Game Extractor, and I found a description of the format on the Xentax wiki.

DAT+HDR pseudo-spec at Github: file_data_spec.txt. I don't understand the type flag completely, but the program doesn't make a fuss if it's not the same in a repacked file. Also, there seems to be some padding between files, but it's likely not mandatory, so a repacked file can be smaller even if it contains the same data. The largest section by far is "data". It contains several folders:

Folder - Description
backdrop - Images used in game menus, intro and level selection.
blueprints - Player craft images, used in the level selection menu.
cuts - In-game cutscene animations.
dbg - Files containing basic 3D objects, most likely used for engine debugging: xyz axis, ball, cube, valve.
demo - Unknown.
fonts - Bitmap fonts.
frontend - Data for the level ("bridge") and craft ("hangar") selection screens. The screens are actually 3D environments.
imp_stuff - 3D objects of Imperial craft.
level - Level data: terrain, textures, level objects, etc.
ov - In-game HUD overlay icons: player craft status, radar, targeting circle etc.
pl_crafts - 3D models of player-playable craft, including cockpits, and lower detail models for wingmen. The higher detail models are used in cutscenes for wingmen too. The models are not freely interchangeable.
reb_stuff - (Very few) 3D objects of Rebel (or captured) craft.
sound - Sound and music.
weapon_spr - Weapon sprites? Contains only one file.
(root) - Various files.

The filenames are case sensitive. It seems to be the convention that file extensions were written in uppercase, and separated from filename with an underscore; but often they are missing altogether.
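The naming convention can be turned into a small heuristic for guessing a file's type. This is only a sketch based on names seen in the archive (such as lv_0_TEXT, lv_0_TEX and hmp); the function name is made up, and a name that merely ends in an uppercase word would be misread as having an extension.

```python
def split_packed_name(name):
    """Split a packed file name into (stem, extension).

    Assumes the convention described above: an uppercase, alphabetic
    suffix after the last underscore is the extension. Returns
    (name, "") when no such suffix is present.
    """
    stem, sep, ext = name.rpartition("_")
    if sep and stem and ext.isalpha() and ext.isupper():
        return stem, ext
    return name, ""


# Names taken from the level folder of the archive.
print(split_packed_name("lv_0_TEXT"))
print(split_packed_name("hmp"))
```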

The menu uses data stored in bundle files, which have a different structure. Description is at Xentax wiki again.

Terrain

Terrain is stored as a heightmap: in-game levels are rectangular, the polygon structure looks regular, and a dev confirmed in a web interview that they were working with heightmaps before Rogue Leader. Starting with this assumption, I was able to recognize enough of the structure of HMP files to create a parser and an exporter to OBJ format.

Some terrain renders from Blender - level 16 and 1, they look kind of plain without the level objects (also the colors aren't accurate):

A level map, presumably generated the same way as the game does:

HMP format description at Github. The format is quite efficient and interesting. The terrain is segmented into a 2D array of tile indices. Each tile index points to a square tile. Each tile contains a 5x5 array of bytes, each byte representing terrain height at that point (one row and one column seem to overlap with a neighboring tile). Multiple indices can reference the same tile, which allows more efficient storage by removing redundancies (for example large flat areas can all reuse the same tile).
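The tile scheme described above can be sketched as follows. This is not the real HMP parser: it assumes the tile indices and the 25-byte tiles have already been read from the file, that tile bytes are row-major and unsigned, and that exactly one row and one column are shared with the neighbor, so each tile advances the grid by 4 samples.

```python
TILE = 5          # samples per tile edge, from the format description
STEP = TILE - 1   # assumption: one shared row/column between neighbors


def assemble_heightmap(indices, tiles):
    """Expand a 2D grid of tile indices into a 2D grid of heights.

    indices: list of rows, each a list of tile numbers
    tiles:   list of 25-byte sequences (assumed row-major, unsigned)
    """
    rows, cols = len(indices), len(indices[0])
    height = [[0] * (cols * STEP + 1) for _ in range(rows * STEP + 1)]
    for ty, index_row in enumerate(indices):
        for tx, tile_index in enumerate(index_row):
            tile = tiles[tile_index]
            for y in range(TILE):
                for x in range(TILE):
                    # Shared edges are simply overwritten by the later tile.
                    height[ty * STEP + y][tx * STEP + x] = tile[y * TILE + x]
    return height


# A 2x2 index grid that reuses one flat tile four times.
flat = bytes([7] * 25)
grid = assemble_heightmap([[0, 0], [0, 0]], [flat])
print(len(grid), len(grid[0]))
```

Reusing one tile for all four grid cells is exactly the redundancy removal the format relies on: 25 bytes of height data cover the whole 9x9 sample area.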

Terrain textures are stored in a TEXT file. Tiles reference them through an indirect scheme: the tile itself has a texmap index, which is remapped to the real texture index through a TEX file that defines the mapping between them. No idea why it was done this way. The terrain texture file format itself is simple: multiple images (64x64 pixels, each with a palette of 16 RGB entries of 3 bytes) without any headers, stored consecutively.
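Decoding one such texture could look roughly like this. The description above only fixes the image size and the palette size; everything else in this sketch is an assumption: 4 bits per pixel packed two per byte, high nibble first, and pixel data stored before the palette (the palette_first switch covers the other ordering). Check the format description at Github before relying on any of it.

```python
WIDTH = HEIGHT = 64
PALETTE_BYTES = 16 * 3                      # 16 RGB entries, 3 bytes each
PIXEL_BYTES = WIDTH * HEIGHT // 2           # assumption: 4 bits per pixel
RECORD_BYTES = PIXEL_BYTES + PALETTE_BYTES  # size of one headerless image


def split_textures(data):
    """Cut a TEXT file into consecutive fixed-size image records."""
    last = len(data) - RECORD_BYTES
    return [data[o:o + RECORD_BYTES] for o in range(0, last + 1, RECORD_BYTES)]


def decode_texture(record, palette_first=False):
    """Return a row-major list of (r, g, b) tuples for one record."""
    if len(record) != RECORD_BYTES:
        raise ValueError("unexpected record size")
    if palette_first:
        pal, pix = record[:PALETTE_BYTES], record[PALETTE_BYTES:]
    else:
        pix, pal = record[:PIXEL_BYTES], record[PIXEL_BYTES:]
    palette = [tuple(pal[i:i + 3]) for i in range(0, PALETTE_BYTES, 3)]
    pixels = []
    for byte in pix:
        pixels.append(palette[byte >> 4])    # assumed: high nibble first
        pixels.append(palette[byte & 0x0F])
    return pixels
```

If the decoded image looks scrambled, the nibble order and the palette position are the first two things to flip.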

The game seems to be doing some LOD-ing on the terrain, which is visible for example in the intro of the second level. Also, the heights of each level are scaled differently; this scaling info is probably somewhere in the unrecognized bytes of the file.

Images

The key to keeping the assets small was the frequent use of palettized images. In the case of terrain textures, a palette with just 16 entries was used for a 64x64 texture, which is very space-efficient compared to a true color image (roughly a 6x space reduction). I guess there was hardly the computing power to use some compressed texture format and still have almost instant level loading, so it was a very reasonable thing to do.
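The space saving can be checked with quick arithmetic. The numbers below assume 4 bits per pixel (the minimum for 16 palette entries) and 3 bytes per pixel for a true color image; both are assumptions for the sake of the estimate.

```python
pixels = 64 * 64

# True color: 3 bytes (R, G, B) per pixel.
true_color_bytes = pixels * 3

# Palettized: 4 bits per pixel plus a 16-entry RGB palette.
palettized_bytes = pixels * 4 // 8 + 16 * 3

ratio = true_color_bytes / palettized_bytes
print(true_color_bytes, palettized_bytes, round(ratio, 2))
```

This gives 12288 bytes against 2096 bytes, a ratio of about 5.86.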

JPEG images were used for menu background screens, level "icons" in the level selection screen, and a few other places. Some images can be easily exchanged, like the player craft overlay icon. A level icon is a small JPEG file with a separate JPEG used as the alpha layer. I replaced one with a screenshot; the game shows some scanline artifacts though.

Model format

Each 3D model seems to consist of two files: HOB and HMT. HOB contains the vertex data, HMT contains the material definition and model texture.

Most mesh formats consist basically of the same things in various arrangements: faces, vertices, texture coordinates. Vertex coordinates usually go as sets of 3 floats (x,y,z). This is what I believed to be true also for the HOB format. To find out more, I made a few assumptions and started experimenting with the file data:
* one usual mesh arrangement is having a separate list of vertices and a separate list of faces, which contain indices into the vertex list
* most of the time, there should be little variation of coordinate values between neighboring vertices
* 1ky.hob is the skydome object for each level: if the game is run through 3DRipperDX and set in wireframe mode, one can see the (slightly irregular) skydome. Its circumference is about 32 vertices (quads).

Now if you look at the 1ky.hob file in a hex editor (or most other hob files), you will see 3 notable areas:
* an area filled with a lot of empty space right after the header at the beginning of the file
* an area of repeating structures about 40 bytes wide
* an area of a repeating 8B structure, of which 6 bytes contain a varying value and 2 bytes are zero

Let's have a look at the third area: 6B can be an x/y/z value where each coordinate takes 2 bytes. Could this be it? I tested by zeroing some bytes in this area in the xwing model file and sure enough, after the game started with the modified model, some vertices of the model were squashed to the zero coordinate. So most likely I was right about it.

Let's do some more digging. I took a look at other hob-s, and the vertices really seem to be the last part of the data files. Two bytes per coordinate are too few for a usual float value (32 and 64bit floats are used most of the time), and I doubted the devs used 16bit floats, so could it be a 2 byte signed int? Time for a test: I made a simple OpenGL viewer that reads groups of 8 bytes from the end of the file to the beginning and displays them as single vertices, using 2 byte signed int coordinates. There will be some "dirt" without knowing the exact structure (a group of bytes that's not a vertex but is interpreted as one), but the mesh should be recognizable. Results are promising (but what's the "ball" inside, I have no idea yet):
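The backward scan described above can be sketched in a few lines. Two things here are assumptions rather than findings: the coordinates are read as little-endian (plausible for a PC game, but not verified here), and the scan stops at the first record whose last two bytes are not zero, which is a shortcut added for this sketch; the viewer described above simply displayed everything, "dirt" included.

```python
import struct

# x, y, z as signed 16-bit ints, then 2 bytes expected to be zero.
# Little-endian byte order is an assumption.
RECORD = struct.Struct("<3hH")


def scan_vertices_from_end(data, max_count=None):
    """Read 8-byte records backward from the end of a HOB file's bytes."""
    vertices = []
    offset = len(data) - RECORD.size
    while offset >= 0:
        x, y, z, pad = RECORD.unpack_from(data, offset)
        if pad != 0:
            break  # heuristic: probably left the vertex area
        vertices.append((x, y, z))
        if max_count is not None and len(vertices) >= max_count:
            break
        offset -= RECORD.size
    vertices.reverse()  # restore file order
    return vertices
```

A non-vertex record that happens to end in two zero bytes still slips through, so the output needs a visual check just like the original experiment did.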

"Gray area", face groups and faces

All this was nice, but I had to tune the offsets to vertex data by hand and had no face info. In addition, some files seemed to contain multiple meshes, for example opkg.hob - an object package file combining several independent level objects (each level has one file like this). My initial guess was that the craft models also contained several versions of one model for multiple LOD stages. (This didn't turn out to be the case; there are two versions of the major player craft however: the regular version, used as the player ship and in intros/outros, and a simplified wingman model.)

So the header and the "grey area" had to be decoded (grey because it contained many null bytes and thus was quite dark in the visualization). My way of doing it was to compare various HOB files and look for similarities and differences, beginning with the small files, which were probably simpler.

To compare contents of various HOB files I first wanted to print hexviews of the files to make notes, mark similarities etc. But then I thought - why not do this digitally? So I imported the hex data into Inkscape (giving it a hard time with some non-printable characters - I had to fix a saved SVG file by hand) and this was more useful for decoding the format than I initially thought.

After some more experiments and tuning the HOB viewer to the supposed file structure, the result was this:

The HOB file is mostly offset based: in short, the header contains an offset to face group definitions, which contain offsets to extended group definitions, which contain offsets to the used faces and vertices. Multiple meshes can be stored sort of in parallel: face group definitions, faces and vertices are interleaved. Given the current format description, the vast majority of the HOB files can be viewed.

Other

What is still needed? The parser skips over some groups that contain useful faces, and last but not least, the impact of all fields that are still unknown should be tested. Once the model definition is properly understood, custom models should be possible.

Frontend

The pilot image in the main menu (with moving bright rectangle overlays on menu item changes) is stored in SANM video format and can be played back by mplayer.

Other

In-game text is most likely stored with some kind of compression; I'd probably need to disassemble the decoder code to be able to figure it out. Some sort of sound compression was employed as well. Music is stored using some MIDI-like scheme; Araz knows better.

Tools

Source of all tools is hosted at Github: ReRogue.

Repacker - packs files from folder structure to DATA.DAT+DATA.HDR or unpacks them.

Hmp2obj - creates Wavefront OBJ files from .hmp and corresponding .tex + .text files. Takes three parameters: the HMP file, the texture file and the texmap file. Example: for extracting the first level heightmap, you need to specify the hmp file \data\level\lv_0\hmp, the textures \data\level\lv_0\lv_0_TEXT and the texmap \data\level\lv_0\lv_0_TEX.

Image exporter - exports some images to pnm/pgm/tga, according to their format.
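For readers who want to experiment without the tools, turning a grid of heights into Wavefront OBJ text takes only a few lines. This is a generic sketch and not the code of Hmp2obj: it emits one vertex per height sample and one quad per grid cell, ignores textures, and the scale parameters and axis choice (height on Y) are arbitrary assumptions.

```python
def heightmap_to_obj(height, xy_scale=1.0, z_scale=1.0):
    """Return OBJ text for a 2D list of heights (rows of equal length)."""
    rows, cols = len(height), len(height[0])
    lines = []
    # One "v" line per sample; height goes on the Y axis.
    for y in range(rows):
        for x in range(cols):
            lines.append(f"v {x * xy_scale} {height[y][x] * z_scale} {y * xy_scale}")
    # One quad per grid cell; OBJ vertex indices start at 1.
    for y in range(rows - 1):
        for x in range(cols - 1):
            a = y * cols + x + 1
            b = a + 1
            c = a + cols + 1
            d = a + cols
            lines.append(f"f {a} {b} {c} {d}")
    return "\n".join(lines) + "\n"


# A single cell with four different corner heights.
print(heightmap_to_obj([[0, 1], [2, 3]]))
```

The resulting text can be saved with an .obj extension and opened in Blender; if the faces appear inside out, reverse the vertex order in the "f" lines.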