Learning new languages
Last October I wrote about my progress and the
first release of bjxa. I published a reference
implementation of a BandJAM XA decoder in C able to reproduce bit-for-bit the
same audio output as
xa.exe. After that I reviewed DTXMania’s code, which to
this date I still can’t build completely on Linux with Mono.
I remember seeing years ago a piece of advice for developers, encouraging them to learn one new language each year to challenge their comfort zone. While I never followed this advice, I think that learning a new language can be counter-productive if there isn’t an actionable project to back the learning. Porting bjxa to C# was a perfect excuse, since the scope is small enough and the language is somewhat similar to Java, which I already know.
My conclusion after conducting this work is that C# is indeed a much more pleasant language than Java. I’m still not satisfied with the code I came up with, but it was enough to even challenge my C implementation and drive some improvements.
I then sent an email to the DTXMania maintainer in November to raise his awareness of BJXA and was greeted with deafening silence. This was of course my own fault for not creating an account and opening a ticket on the project tracker, but that’s a consequence of a distinct lack of confidence.
That made me wonder about the languages I learned over the years and whether the total number would average around one per year. Here is a list I could produce off the top of my head, in alphabetical order:
- POSIX shell
- x86 assembler (Intel syntax)
That’s a fair number of languages, but I’m a beginner in most of them. Some of them I only learned to write exactly one program or library for a transient need and never touched again. Some I use out of necessity every once in a while without ever becoming proficient, but I digress…
Reverse script kidding
Tools keep making astonishing progress in all areas, and even a beginner can get results pretty fast with the help of IDEs or similar tools. And yet, as I settled into programming, my personal trend was to move away from such tools and instead learn the underlying tools they tend to abstract away.
Using IDEs is fine, but learning the fundamental tools is critical. Keeping IDEs despite being proficient with the underlying tools is also fine, although I personally prefer to avoid IDEs.
But as a beginner it is sometimes much easier to use graphical tools that try
to be more user friendly by hiding the dirty details. Despite my success in
uncovering the BJXA codec and documenting it
thoroughly there are
still unknown parts. The reverse engineering of
xadec.dll was easy even for
an x86 beginner because it came with a header file that gave a significant
head start. The fact that it was a shared library also gave me by definition
access in the assembly to all the public functions.
Now to find the answers to the remaining mysteries of the codec I would need
xa.exe but there is a catch: it’s a statically linked program
and the only starting point I have is the entry point of the program, which is
responsible for calling the
main() function of a C program. In addition it’s
a Win32 executable, which I’m not familiar with.
Just when I decided to live with those mysteries, the NSA knocked on my proverbial door and convinced me to give a try to their IDE.
Here be dragons
xadec.dll I was convinced that it was originally written
in C and I suspected
xa.exe would be too. Finding the main() function
would in theory allow me to unravel the encoding code and see how it treats
the undefined behavior of the codec.
Trying to use retdec
didn’t work out because it failed to identify the calling convention and I
gave up (without reaching out for help) when I couldn’t figure out how to let
it know. The result was even worse with
Trying to decompile the assembly by hand resulted in a total failure to even
find the main() function. Even with the knowledge of xa.exe’s command line
arguments I failed to use a reference to the usage code and find my way back:
    $ wine xa.exe
    WAV to XA v1.22 Copyright 2000-2001 bandjam.net
    Usage : xa.exe <option> [filename<.wav/.xa>]
    Option : -e[n] Encode[WAV->XA](Default) / n:BitCount(4/6/8)
             -d Decode[XA->WAV]
             -p Play File
             -o[dir] Output Directory
             -u Update
Then the NSA announced it would open source one of their reverse engineering tools and eventually made it available, although the source code isn’t yet. Ghidra is a graphical tool that is to reverse engineering what an IDE is to development.
In less than 2 minutes (I timed it!) I was able to find the main() function
and confirm that it used a C-like signature taking the two famous arguments
int argc and a char **argv.
I think that Ghidra’s main advantage over retdec was its ability to find the
right calling convention from the get go. I managed to find the equivalent of
xaDecodeOpen by luck and found the decompiled code to be quite
Overall, I think the NSA deserves to be congrat… to be cong… I think the NSA did an OK job.
One reason why I had a hard time finding the
main() function and its command
line parsing is again an incarnation of the robustness principle. With no hint
from the usage description I couldn’t guess that options were case-insensitive
and that they also supported the Windows style for options:
    $ wine xa.exe -d square-mono-8.xa
    WAV to XA v1.22 Copyright 2000-2001 bandjam.net
    square-mono-8.xa -> square-mono-8.wav : 44100Hz / monaural / 8bits.
    1 Files Decoded.
    Completed.
    $ wine xa.exe /d square-mono-8.xa
    WAV to XA v1.22 Copyright 2000-2001 bandjam.net
    square-mono-8.xa -> square-mono-8.wav : 44100Hz / monaural / 8bits.
    1 Files Decoded.
    Completed.
    $ wine xa.exe /D square-mono-8.xa
    WAV to XA v1.22 Copyright 2000-2001 bandjam.net
    square-mono-8.xa -> square-mono-8.wav : 44100Hz / monaural / 8bits.
    1 Files Decoded.
    Completed.
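To make the behavior above concrete, here is a minimal sketch in C of option matching that accepts both the `-` and `/` prefixes and ignores case. It is not taken from xa.exe: the helper name `opt_letter` and its return convention are assumptions made purely for illustration.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper, not the code found in xa.exe: returns the
 * lowercase option letter when arg looks like an option ("-d", "/d",
 * "-D", "/D"...) and 0 when it should be treated as a file name.
 */
static int
opt_letter(const char *arg)
{
    if (arg == NULL || (arg[0] != '-' && arg[0] != '/'))
        return (0);     /* no option prefix */
    if (arg[1] == '\0')
        return (0);     /* a lone "-" or "/" */
    return (tolower((unsigned char)arg[1]));
}
```

With this sketch `opt_letter("-d")`, `opt_letter("/d")` and `opt_letter("/D")` all yield `'d'`, while `opt_letter("square-mono-8.xa")` yields 0. One side effect of accepting the Windows style in such a scheme is that an absolute Unix path starting with `/` would also be mistaken for an option.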
I also couldn’t have guessed that the command line parsing would intertwine
options parsing and a lot of processing. Thanks to Ghidra I have a semi-clean
switch statement and was able to figure out that while some processing is done
immediately when an option is identified, some checks are deferred until needed.
For example I can ask to encode something and ultimately say that I want to do
decoding instead and it won’t complain:
    $ wine xa.exe -e5 square-mono-8.wav
    WAV to XA v1.22 Copyright 2000-2001 bandjam.net
    bitcount error.
    $ wine xa.exe -e5 -d square-mono-8.xa
    WAV to XA v1.22 Copyright 2000-2001 bandjam.net
    bitcount error.
    $ wine xa.exe -e8 square-mono-8.wav
    WAV to XA v1.22 Copyright 2000-2001 bandjam.net
    square-mono-8.wav -> square-mono-8.xa : 44100Hz / 16bit / monaural
    1 Files Encoded.
    Completed.
    $ wine xa.exe -e8 -d square-mono-8.xa
    WAV to XA v1.22 Copyright 2000-2001 bandjam.net
    square-mono-8.xa -> square-mono-8.wav : 44100Hz / monaural / 8bits.
    1 Files Decoded.
    Completed.
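The transcripts above are consistent with a loop in which the bit count is validated as soon as `-e` is seen, while the operating mode is simply overwritten by whichever option comes last. Below is a rough sketch in C of such a loop. It is not the decompiled code: `struct xa_opts`, `parse_opts` and the use of 0 for "no bit count given" are all hypothetical, and only the `-e` and `-d` options are handled.

```c
#include <ctype.h>
#include <stddef.h>

enum xa_mode { MODE_ENCODE, MODE_DECODE };

struct xa_opts {            /* hypothetical option state */
    enum xa_mode mode;
    int bits;               /* 0 means "no bit count given" */
};

/* Returns 0 on success and -1 for what xa.exe calls a "bitcount error". */
static int
parse_opts(int argc, char * const *argv, struct xa_opts *opts)
{
    int i;

    opts->mode = MODE_ENCODE;   /* encoding is the documented default */
    opts->bits = 0;

    for (i = 1; i < argc; i++) {
        const char *arg = argv[i];

        if (arg[0] != '-' && arg[0] != '/')
            continue;           /* file name, handled elsewhere */

        switch (tolower((unsigned char)arg[1])) {
        case 'e':
            opts->mode = MODE_ENCODE;
            if (arg[2] == '\0')
                break;          /* plain -e, no bit count */
            /* validated right away, whatever options follow */
            if (arg[3] != '\0' ||
                (arg[2] != '4' && arg[2] != '6' && arg[2] != '8'))
                return (-1);
            opts->bits = arg[2] - '0';
            break;
        case 'd':
            /* silently overrides an earlier -e */
            opts->mode = MODE_DECODE;
            break;
        default:
            break;              /* other options left out of the sketch */
        }
    }
    return (0);
}
```

In this sketch `-e5 -d` fails because the bit count is rejected before `-d` is even looked at, whereas `-e8 -d` succeeds and ends up in decoding mode, which matches the four runs shown above.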
I also found an interesting undocumented
-l option and its effects are
puzzling to say the least… Looking at the shape of the code, it looks as if
xa.exe was written in C++ and if I’m right that’s probably Visual
I hope to solve the remaining mysteries in the BandJAM XA codec with the help of Ghidra and so far it looks much more within reach. Unfortunately that’s a project I will have to shelve for later, but I digress.
19 years of xadec.dll
Eventually, I received a response from the DTXMania maintainer regarding my
struggles when it comes to running it on Linux using Wine, and especially the
xadec.dll. The response was overwhelmingly positive and I learned
that they were struggling too because of it and the fact that so many songs
playable for the game use that ancient codec.
They wanted to maintain the capability to read BandJAM XA files, but that
prevented 64-bit builds of the game. Out of nowhere came bjxa and it got them
closer to solving the problem for good (they still have an ancient OGG and MP3
decoder similar to
xadec.dll: no source code).
I also learned in the process that the original DTXMania maintainer was
working on a DTXMania2 but unfortunately that one is even harder to run with
Wine. Without much help from my side they managed to integrate
their projects and the
bjxa.exe was available to see how to use the library probably
When I got the first response, the maintainer was apologetic because it had taken a couple months to reply to my inquiry. Working on free and open source software I have experienced the demanding tone of some users. I have found myself in demanding positions too, and have probably given that impression more than once even when it wasn’t the case. I find it a bit sad though when maintainers apologize when they shouldn’t and keep doing so even after being told that all is fine, but I digress…
No matter how I thanked them they would thank me even more. I’d like to thank them here once more and I hope to hit them hard with more demanding patches as I hope to iron out the remaining problems inside Wine that I identified.
19 years of buffer overflows
I found two interesting classes of bugs in
xadec.dll, one of which could
lead to a security vulnerability. The first one is that most tainted pointer
dereferences are done without a proper null check. The second one is a buffer
overflow that can be triggered easily with a specially crafted XA file. Can it
result in arbitrary code execution? I don’t know and even if I managed to own
a DTXMania process running in Wine that wouldn’t prove that the same exploit
could be used on Windows and vice versa. And considering that I won’t install
Windows to run the game and introduced myself to reverse engineering just for
the sake of running the game on Fedora, I won’t shave that yak.
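The two bug classes are easy to picture with a generic sketch in C. This is not the code of xadec.dll, and the details of the actual overflow are not described here: the `struct decoder` layout, the `BLOCK_MAX` size and the `decoder_load_block` function are hypothetical and only show the kind of checks whose absence leads to such bugs.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_MAX 64            /* hypothetical fixed-size buffer */

struct decoder {                /* hypothetical decoder state */
    uint8_t block[BLOCK_MAX];
    size_t block_len;
};

/*
 * Copy one block of untrusted input into the decoder. claimed_len
 * stands for a length read from the file itself. Returns 0 on
 * success and -1 when the input must be rejected.
 */
static int
decoder_load_block(struct decoder *dec, const uint8_t *src, size_t src_len,
    size_t claimed_len)
{
    /* first class: pointers received from the caller are checked */
    if (dec == NULL || src == NULL)
        return (-1);
    /* second class: a length from the file must fit both buffers */
    if (claimed_len > sizeof dec->block || claimed_len > src_len)
        return (-1);
    memcpy(dec->block, src, claimed_len);
    dec->block_len = claimed_len;
    return (0);
}
```

Without the second check, a crafted file announcing a length larger than `BLOCK_MAX` would make `memcpy()` write past the end of `block`, which is the general shape of a buffer overflow triggered by malformed input.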
If you would like to see what real reverse engineering looks like, I strongly recommend this conference from people who could probably have done in minutes what initially took me hours. My plan is to level up my game (all kinds of puns intended) by cheating. I will rely on Ghidra, my new power tool, but before that I have other things to tend to, and hopefully I will write on other topics in the future.