Fuzzing WeChat’s Wxam Parser

August 7, 2022

By: Christopher Vella


Background

WeChat (if you haven’t heard of it) is a hugely popular chat app along the lines of WhatsApp, and it runs on iOS, Android, Windows and macOS.

Being a chat app, it handles various file formats like images and videos, and also proprietary formats like “Wxam” (which I hadn’t researched before, so you’ll see how I approached that).

You’ll also see some of the challenges I hit while harnessing the target, and how the fuzzing framework I initially chose had to be replaced because it couldn’t handle certain functionality that WeChat uses (and how I debugged this).

Researching the Target

Now that we know what WeChat is we can look at how I decided to write a fuzzer (in 1 day!) for this target!

It started with me deciding I wanted to blog about fuzzing something. My previous blogs covered logic bugs, and I wanted to balance that with a cool fuzzing target I hadn’t looked at before, so I started by browsing ZDI to see if any of the listed targets looked interesting.

I noticed a few entries for WeChat like the below:

ZDI WeChat bug disclosures

Now at this point I know what WeChat is, but I have no idea what WXAM is (though it’s safe to guess it’s some format that gets parsed).

So my next step was to simply install WeChat in a VM! Note that here I’m targeting the Windows build of WeChat, for the following reasons:

  1. I want this to be quick: it’s primarily for this blog post, and I know I can fuzz Windows targets faster than iOS/Android
  2. If this parser also exists on other platforms, it probably isn’t much different (so a bug I find on Windows may well exist on the other platforms)

Now it’s installed and I have a bunch of executables and DLL files in C:\Program Files (x86)\Tencent\WeChat, so how do I find the WXAM parsing functionality?

Finding the Target

A good starting point may be to dump all the imported & exported functions from all the executables and DLLs and search for anything with the name “wxam” in it, but I went a different route — I simply guessed and opened the DLL that sounded interesting in IDA!
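A minimal sketch of that first approach, for anyone who wants to script it: imported and exported names are stored as plain ASCII in a PE file, so a case-insensitive byte search over every binary in the install directory is a cheap first pass. The function name and its parameters are made up for illustration, and this will miss UTF-16 strings and anything packed or encrypted.

```python
from pathlib import Path


def find_files_mentioning(root, needle=b"wxam", suffixes=(".dll", ".exe")):
    """Return the sorted names of binaries under `root` whose raw bytes
    contain `needle`, ignoring ASCII case."""
    needle = needle.lower()
    hits = []
    for path in Path(root).rglob("*"):
        # Only look at regular files with an executable-looking extension.
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        # bytes.lower() only touches ASCII letters, which is what we want here.
        if needle in path.read_bytes().lower():
            hits.append(path.name)
    return sorted(hits)
```

Pointing it at the WeChat install directory should give a short list of candidate files to open in a disassembler, though it says nothing about which function inside them does the parsing.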

For me, looking at the list of DLLs, I spotted “WeChatWin.dll”. It sounds like a main DLL for WeChat that handles Windows-specific APIs or something similar. Who knows, but it stood out more than the other DLLs, so I opened it in IDA.

This DLL took a while to load as it’s pretty large (~40 MB). Once it finished, the first thing I did was search the functions, imports and exports for the name “wxam”, and there I found:

wxam2pic imported function shown in WeChatWin.dll

We spot an imported function named “wxam2pic” that lives in “VoipEngine.dll” — nice! This is a great starting point, it even sounds like a parser.

Before looking at wxam2pic in VoipEngine, I first examined the cross-references to this import within WeChatWin.dll to see how WeChatWin uses it. I spotted two functions that call it, including this one:

Usage of wxam2pic in WeChatWin.dll

Scrolling to the top of this function we spot:

Don’t you love debug prints?

This string alone implies the function we’re looking at is a “WxAMDecoderHelper”, specifically this function handles the “DecodeWxam” functionality — Awesome! This is exactly the type of function that corresponds with the ZDI entries we saw.

There’s something else notable about this function, look at how IDA shows the prototype:

It’s a custom calling convention!

This means that if we targeted this function for fuzzing directly, we’d have to match this custom parameter-passing convention ourselves instead of using one of the standard conventions the Visual Studio compiler supports (fastcall, cdecl, etc).

Instead, I took a look at the function that calls this function, and I got:

(Note: ignore the function name itself, I named it this from what I saw!)

Nice, this function uses a standard calling convention (fastcall), takes only two arguments and calls the DecodeWxam function (handling the custom calling convention for us!)

We also see from the debug print that this function appears to decode the Wxam and then re-encode it as a JPEG. This would be a great function to fuzz!

(Note: There’s another decoder that transforms the Wxam to a GIF! We’re not going to look at that one in this blog, but it’s essentially the same).

Reversing the Target Function

Alright, so I want to fuzz this function as it appears to take a Wxam file and parse it. Let’s analyze the parameters.

Let’s view cross-references to this function to see how it’s called:

(Note: I named the read_file function myself, if you open this function you see a simple CreateFile + ReadFile operation on the provided fName variable!)

From this, I see the following:

  • A filename is passed to the function I named “read_file”, and a buffer is returned in v11
  • The buffer and a value are passed to “isWxGF”, which reads a header and the flag to determine whether we should parse it further
    • Actually, it turns out the input is a structure made up of a 32-bit input buffer pointer followed by a 32-bit input size, so isWxGF takes (pBuffer, buf_sz)
  • If we pass the “isWxGF” check, we call the decoder function, passing:
    • The address of an input structure that contains (pBuffer, buf_sz); the pseudocode looks similar to
      • InputStruct inputStruct = (pBuffer, buf_sz)
      • Where the first argument to the decoder function is a pointer to our inputStruct
    • A pointer to an int containing the value 0
      • This pointer seems to receive some output from the decoder; if it’s non-zero it’s assumed to be another valid pointer
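To make the layout above concrete, here is a small ctypes model of the two decoder arguments. The field names pBuffer and buf_sz come from the notes above; the helper name make_decoder_args and the use of an unsigned 32-bit output slot are assumptions of this sketch, not something recovered from the binary. The pointer is modelled as a plain 32-bit integer because the target is a 32-bit process.

```python
import ctypes


class InputStruct(ctypes.LittleEndianStructure):
    """Two consecutive 32-bit fields: buffer address, then buffer size."""
    _fields_ = [
        ("pBuffer", ctypes.c_uint32),  # address of the input bytes (32-bit target)
        ("buf_sz", ctypes.c_uint32),   # number of valid bytes at that address
    ]


def make_decoder_args(buffer_address, size):
    """Build the (input struct, zeroed output slot) pair the decoder expects."""
    input_struct = InputStruct(buffer_address, size)
    out_value = ctypes.c_uint32(0)  # decoder appears to write a pointer here
    return input_struct, out_value
```

In a real harness both objects would be passed by address, for example with ctypes.byref, or simply declared as locals in a native harness.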

This seems super easy to fuzz:

  • We can fuzz using shared-memory mode in a fuzzer like WinAFL
  • Our fuzz function will:
    • Call isWxGF, and if successful:
    • Call the decoder
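The control flow of that fuzz function can be modelled in a few lines. This is only a sketch of the logic, not the native harness: is_wxgf and decode are stand-in callables for the two target functions, and the shared-memory layout (a 4-byte little-endian length followed by the sample bytes) is an assumption based on how Jackalope's example harness reads its sample, so check your fuzzer's documentation before relying on it.

```python
import struct


def read_sample(shm):
    """Extract the sample from a shared-memory style buffer:
    4-byte little-endian length, then the sample bytes (assumed layout)."""
    (size,) = struct.unpack_from("<I", shm, 0)
    size = min(size, len(shm) - 4)  # never read past the end of the region
    return bytes(shm[4:4 + size])


def fuzz_one(shm, is_wxgf, decode):
    """Run one iteration: gate on the header check, then call the decoder.
    Returns True if the decoder was reached."""
    data = read_sample(shm)
    if not is_wxgf(data, len(data)):
        return False  # rejected by the header check, decoder never runs
    decode(data, len(data))
    return True
```

The return value is handy while debugging a corpus: if it is False for every seed, the fuzzer is only ever exercising the header check.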

So I wrote a harness to do this in WinAFL, however:

This usually means our program is crashing before reaching our fuzz function.

So I ran WinAFL under WinDbg and saw an invalid address dereference when trying to load the “WeChatWin.dll” file!

I analyzed the DLL entry point and spotted:

I see, this DLL uses the CRT (and also thread-local storage), which causes issues with DynamoRIO (which I was using with WinAFL).

This can be confirmed by compiling my executable with CRT support and noting that WinAFL crashes before our process main executes at all!

So this means we can’t use DynamoRIO. Our options include:

  • Using WinAFL in Intel PT mode (I’m using an AMD CPU, so no go here)
  • Using a different fuzzer

Well, I chose a different fuzzer.

I could have gone the snapshot route with Nyx or what the fuzz, but instead I decided to try Jackalope.

This has a very similar command line to WinAFL, and uses TinyInst for instrumentation (no DynamoRIO!)

Upon trying this, it worked:

It’s fuzzing, and we are getting new coverage!

At this point I stopped. I had the fuzzer working well enough that I was happy for the day. Next steps would include:

  • Analyzing coverage, ensuring we’re not hitting any roadblocks
  • Checking stability / determinism, ensuring there are no globals we need to reset
    • Or just throwing this into a snapshot fuzzer
  • Reversing the WXAM format and creating a better corpus and a format-aware mutator
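As a rough illustration of the stability check, one option is to replay the same sample several times and compare the coverage each run produces. In this sketch run_once is a hypothetical callable that executes the target on a sample and returns the covered offsets; how you obtain that coverage depends entirely on your instrumentation.

```python
def is_stable(run_once, sample, runs=5):
    """Return True if every one of `runs` executions of `sample`
    produces exactly the same set of covered offsets."""
    baseline = frozenset(run_once(sample))
    # Any difference from the first run suggests leftover global state
    # or some other source of non-determinism in the target.
    return all(frozenset(run_once(sample)) == baseline for _ in range(runs - 1))
```

If this returns False for a sample, that is a hint to look for state that needs resetting between iterations, or to move to a snapshot fuzzer.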

Also note that in the isWxGF function, I noted the header bytes it checks for and ensured my initial corpus had that header (so we start with an input that successfully passes that check).
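Building that first seed can be scripted. The sketch below just writes the header followed by some filler bytes into a corpus directory. The header value is deliberately left as a parameter because the real bytes have to be read out of isWxGF, and the filler body, file name and function name are arbitrary choices for illustration.

```python
from pathlib import Path


def write_seed(corpus_dir, header, body=b"\x00" * 32, name="seed_0"):
    """Write `header + body` as a seed file in `corpus_dir` and return its path."""
    corpus = Path(corpus_dir)
    corpus.mkdir(parents=True, exist_ok=True)  # create the corpus dir if needed
    seed = corpus / name
    seed.write_bytes(header + body)
    return seed
```

A real sample captured from the application would make a far better seed than zero filler; this only guarantees the fuzzer starts from something that gets past the header check.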

There are other things I did in the harness that are general fuzzing chores, like obtaining pointers to the non-exported target functions we want to fuzz.

I’ve included the harness I used below, along with the Jackalope command line I used to kick off fuzzing. Feel free to take this and expand on it, or view coverage to see how far it gets!

Overall this was a fun half-day exercise in quickly writing a basic fuzzing harness based on a ZDI entry.
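The usual trick for non-exported functions is to add the function's offset (RVA), taken from the disassembler, to the base address the module was loaded at. Here is a sketch of that arithmetic using ctypes, with a Python callback standing in for the target so it runs anywhere. The helper name and the offset 0x1234 are made up. Note that ctypes only supports cdecl and stdcall, so a fastcall target like the one in this post would still need a native harness.

```python
import ctypes


def resolve_function(module_base, rva, prototype):
    """Turn (module base + RVA) into a callable with the given prototype."""
    return prototype(module_base + rva)


# Demo: a callback plays the role of a function inside a loaded module.
Proto = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)
stand_in = Proto(lambda x: x + 1)  # keep a reference so it stays alive

# Pretend the "module" was loaded 0x1234 bytes before our stand-in function.
address = ctypes.cast(stand_in, ctypes.c_void_p).value
fake_base = address - 0x1234

target = resolve_function(fake_base, 0x1234, Proto)
```

Calling target(41) goes through the resolved address and reaches the stand-in. With a real DLL, the base would come from the loader (the handle LoadLibrary returns is the module's base address), and the offset has to be re-checked for every build of the DLL.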

Update — Android Bugs!

So, turns out some of the bugs I found from this fuzzer were reproducible on Android:

Files

I put all the files on my GitHub: https://github.com/Kharos102/BasicWXAMFuzzer


