resolution changes considered harmful

+++
date = "2023-06-14"
+++

TL;DR People were able to extract unitialized memory from the machines at Discord that created thumbnails. That includes images uploaded to the CDN. This issue has been discovered AS OF JANUARY 6 2021 and fixed AS OF JANUARY 7 2021. I did not discover it myself.

NOTE: This article was originally written IN 2021, it was left in the backburner because I did not, and still do not, have enough drive to finish the investigation of the issue. However, as it's been 2 years and Discord has not publically stated this vulnerability existed, I'm going to publish this for the sake of the users that deserve to know.

The TODOs left in the article are NOT going to be finished. If you wish to do the research and submit text for me to add to this artlcle, you're welcome to.

an image of hatsune miku is looking at a piece of abstract art

hatsune miku stares at abstract art, AI generated, by me, approximately 1h spent working on it, 2023

what #

Before this exploit was discovered the "Discord meme meta" at the time were videos that allegedly could crash the Discord desktop client. It was later discovered it was a Chromium bug (TODO: add reference chromium bug).

I got samples of these types of videos to experiment with, for example, they would cause a massive memory allocation when played back through VLC, and then proceed to crash.

[00007f26340a0480] chain filter error: Too high level of recursion (3)
[00007f26340a0010] main filter error: Failed to create video converter
[00007f26340a0480] chain filter error: Too high level of recursion (3)
[00007f26340a0010] main filter error: Failed to create video converter
[00007f26340a0480] chain filter error: Too high level of recursion (3)
[00007f26340a0010] main filter error: Failed to create video converter
[00007f26340a0480] chain filter error: Too high level of recursion (3)
[...A lot of the same...]
[00007f26343b80d0] main vout display error: Failed to create video converter
[00007f26343b80d0] main vout display error: Failed to adapt decoder format to display
[00007f26380a7310] main video output error: video output creation failed
[00007f2644c2f530] main decoder error: failed to create video output

If played with mpv, it would work at first, playing the video normally, but by the time the "crashing" part of the video would come up, the resolution changed, mpv changed the window's resolution accordingly, and finished the video:

AO: [pulse] 44100Hz stereo 2ch float
VO: [gpu] 480x452 yuv420p
AV: 00:00:03 / 00:00:07 (45%) A-V:  0.000
Invalid audio PTS: 4.038821 -> 4.379864
AV: 00:00:03 / 00:00:07 (51%) A-V:  0.261 ct:  0.084
VO: [gpu] 1920x1080 yuv444p
AV: 00:00:07 / 00:00:07 (92%) A-V:  0.000 ct:  0.339 Dropped: 17

While talking with a colleague about this, they cited that someone else was automating the process of creating such crashing videos, and found out that one of them gave a different result when uploaded to Discord, where the thumbnail would give a wildly different result than expected (usually the thumbnails are the first frames of the video file).

don't upload sensitive stuff to discord right now, there's a bug in the thumbnail generator that leaks uninitialized memory https://t.co/AGxVeivMd4

- Khangaroo (Khangarood), 2021-01-06T20:16:31+00:00

Turns out that's unitialized memory from the process that creates thumbnails on Discord's servers.

Some time after discovery (it is unknown how much data was actually gotten, or which types of data could've been at risk), the vulnerability was communicated through Discord's vulnerability disclosure channels, and was patched on the next day.

how #

Discord uses lilliput, which is a bunch of C and Go to drive image resizing (for thumbnails) or video processing.

When you upload an image or a video to discord, a cluster of servers inside Discord's infrastructure (living in the https://media.discordapp.net domain) will process it to generate the thumbnails using lilliput. This is separate from the part of the Discord system where it creates thumbnails for direct image URLs (commonly reffered as mediaproxy). It is possible to believe it also uses lilliput internally, but it lives in a separate URL (https://images-ext-2.discordapp.net/external/).

TODO: it uses https://ffmpeg.org/libswscale.html

TODO: reference: https://github.com/discord/lilliput/commit/dbb0328436e844c1ee148f7df3ae6e15b5bb9e8b

TODO: liliput reads too much memory? need to actually walk the code

This is the nonpatched version of avcodec_decoder_copy_frame inside lilliput, (TODO: how the fuck does this work, i believe the issue happens because the frame data mismatches the video metadata?)

static int avcodec_decoder_copy_frame(const avcodec_decoder d, opencv_mat mat, AVFrame *frame) {
    auto cvMat = static_cast<cv::Mat *>(mat);
    int res = avcodec_receive_frame(d->codec, frame);

    if (res >= 0 && frame->width == cvMat->cols && frame->height == cvMat->rows) {

        int stepSize = 4  frame->width;

        if (frame->width % 32 != 0) {

            int width = frame->width + 32 - (frame->width % 32);

            stepSize = 4  width;

        }

        if (!opencv_mat_set_row_stride(mat, stepSize)) {

            return -1;

        }
        struct SwsContext sws = sws_getContext(frame->width, frame->height, (AVPixelFormat)(frame->format),

                                                frame->width, frame->height, AV_PIX_FMT_BGRA,

                                                0, NULL, NULL, NULL);

        int linesizes[] = {stepSize, 0, 0, 0};

        uint8_t data_ptrs[] = {cvMat->data, NULL, NULL, NULL};

        sws_scale(sws, frame->data, frame->linesize, 0, frame->height, data_ptrs, linesizes);

        sws_freeContext(sws);

    }
    return res;

}

this is a test #

ldkjgldskjf