Reverse Engineering the Recorded Game File Format
This document is a work in progress, but feel free to add resources/tools/tips in the meantime :)
Since Microsoft doesn't publish the recorded game file format anywhere, we have to figure it out ourselves to add new features or support new game versions in RecAnalyst.
Many people have worked on reverse-engineering the format over the years, and their work is often a useful reference when adding new RecAnalyst features. Some resources:
- bari's MGX format description: https://web.archive.org/web/20160310131209/http://aocai.ninja-web.net/tool/mgx_format.html. This has pretty much been the go-to resource.
- AoK Trigger studio: https://github.com/mullikine/aokts. It deals with Scenario files, but they share some parts with recorded game files, such as the Trigger Info block. AoKTS supports most versions of Age of Empires, including recent HD Editions.
- Biegleux's original Pascal RecAnalyst: https://github.com/biegleux/recanalyst. This or older versions of PHP RecAnalyst can be good resources, particularly if a newer version of RecAnalyst fails for a file that did work in an older version.
- stefan-kolb's AoC MGX format repository: https://github.com/stefan-kolb/aoc-mgx-format. It contains a Ruby parser, and describes many more Body actions (see High-level format) than RecAnalyst currently does.
High-level format
Recorded games are split into two parts. The first is the Header, which is compressed using the DEFLATE algorithm and contains a lot of metadata about the game: player information, the map, and starting units are all stored there. The second is the Body, which is not compressed. It consists of a list of actions, chat messages, and Sync operations.
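To make the two-part split concrete, here is a minimal sketch in Python (not RecAnalyst's PHP, just to keep it self-contained). The layout it uses is an assumption and not something this page states: a 4-byte little-endian offset to the Body at the start of the file, the compressed Header starting at byte 8, and a raw DEFLATE stream without a zlib wrapper. `split_recorded_game` and `build_sample` are hypothetical helper names; check the resources above for the real layout of the game version you are dealing with.

```python
import struct
import zlib


def split_recorded_game(data, header_start=8):
    """Return (decompressed_header, raw_body) for a recorded game buffer.

    ASSUMED layout: bytes 0-3 hold a little-endian offset to the start of
    the Body, and the compressed Header runs from `header_start` up to
    that offset.
    """
    (body_offset,) = struct.unpack_from("<I", data, 0)
    compressed = data[header_start:body_offset]
    # Negative wbits selects a raw DEFLATE stream (no zlib wrapper).
    header = zlib.decompress(compressed, -zlib.MAX_WBITS)
    return header, data[body_offset:]


def build_sample(header, body):
    """Build a synthetic file in the assumed layout, for trying things out."""
    compressor = zlib.compressobj(wbits=-zlib.MAX_WBITS)
    compressed = compressor.compress(header) + compressor.flush()
    body_offset = 8 + len(compressed)
    # Second 4-byte field is a placeholder so the Header starts at byte 8.
    return struct.pack("<II", body_offset, 0) + compressed + body


sample = build_sample(b"player info" * 10, b"\x01\x02\x03")
header, body = split_recorded_game(sample)
print(len(header), len(body))
```

The `header_start` parameter is there because the position where the compressed data begins may differ between game versions.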
Most of the time, the changes between versions aren't major. Usually only a few offsets change, because a few fields were added. Changes in the Body of the file (the list of actions that players take) don't tend to break RecAnalyst, because all actions follow a similar data format and can be skipped in a generic way: the length of the action data comes first, the action data second. It's the metadata in the Header that's more volatile.
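The "length first, data second" layout is what makes generic skipping possible. A minimal sketch in Python, assuming a 4-byte little-endian length prefix (the width and byte order are assumptions here) and a buffer that contains only such records; the real Body also interleaves chat messages and Sync operations, which this hypothetical `iter_actions` helper ignores:

```python
import struct


def iter_actions(buffer):
    """Yield (offset, payload) for each length-prefixed record in `buffer`."""
    position = 0
    while position + 4 <= len(buffer):
        # ASSUMPTION: the length is an unsigned 32-bit little-endian integer.
        (length,) = struct.unpack_from("<I", buffer, position)
        start = position + 4
        end = start + length
        if end > len(buffer):
            raise ValueError(
                f"action at offset {position} runs past the end of the buffer"
            )
        yield position, buffer[start:end]
        # We never need to understand the payload to find the next record.
        position = end


records = (
    struct.pack("<I", 3) + b"abc"
    + struct.pack("<I", 0)
    + struct.pack("<I", 2) + b"xy"
)
for offset, payload in iter_actions(records):
    print(offset, payload)
```

An unknown action type costs nothing here: its payload is stepped over using only the length.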
When RecAnalyst does break, it usually breaks only after a certain point in the file, because data fields were added or removed in a new game version. To make RecAnalyst read the new version properly, it's sufficient to make it skip the new fields (if they're interesting, we can add support for reading them later). To skip the new fields, all we need to know is their size. It's not very elegant, but adding printfs all over the place with a label and the current position in the file can help:
printf("Breaks at %d\n", $this->position);
[WIP, the actually useful bit is coming soon™]
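As a rough sketch of that skip-and-print workflow, here is a small position-tracking reader in Python (again, not RecAnalyst's PHP). `TracingReader`, its methods, and the 12-byte gap are all hypothetical and only illustrate the idea: print a label with the current position, skip the fields you don't understand yet, and check that the next known field reads sensibly.

```python
import io
import struct


class TracingReader:
    """Wraps a byte buffer and records labelled positions while reading."""

    def __init__(self, data):
        self._stream = io.BytesIO(data)
        self.trace_log = []

    @property
    def position(self):
        return self._stream.tell()

    def trace(self, label):
        # Same idea as the printf above: a label plus the current offset.
        line = f"{label} at {self.position}"
        self.trace_log.append(line)
        print(line)

    def read_uint32(self):
        raw = self._stream.read(4)
        if len(raw) != 4:
            raise EOFError("unexpected end of data")
        return struct.unpack("<I", raw)[0]

    def skip(self, size):
        # Move past fields we don't parse yet; only their size matters.
        self._stream.seek(size, io.SEEK_CUR)


# Synthetic data: one known field, 12 bytes of "new" fields, another known field.
data = struct.pack("<I", 7) + b"\x00" * 12 + struct.pack("<I", 42)
reader = TracingReader(data)
reader.trace("before known field")
known = reader.read_uint32()
reader.trace("before new fields")
reader.skip(12)  # hypothetical size of the fields added in a newer version
reader.trace("after new fields")
value = reader.read_uint32()
```

If the value read after the skip looks like garbage, the guessed size is probably wrong; adjusting it and comparing the printed positions against a hex dump is usually the quickest way to narrow it down.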