Cache Deception: How I discovered a vulnerability in Medium and helped them fix it

By Yuval Shprinz

In my previous post, I tried to demonstrate how powerful and cool reverse engineering Android apps can be. I did this by showing how to modify Medium’s app so all membership-required stories in it would be available for free.

Well, there was a bit more to the story :)

While working towards that goal, I found a large collection of API endpoints that Medium declared in their code, and after briefly going through them I discovered a neat Cache Deception vulnerability. I was especially excited about that find because cache-based attacks are exceptionally awesome, and it could have been a great addition to my story.

Unfortunately, it took Medium three months and a couple of reminders to respond, so I had to hold off on public disclosure for a bit.

In this post, I will try to explain intuitively what Cache Deception is, describe the bug at Medium, and reference two outstanding articles about cache-based attacks.

Cache Deception

Web browsers cache servers’ static responses so they won’t need to request them again — saving both time and bandwidth.

On a similar principle, servers and CDNs (content delivery networks, such as Cloudflare) cache their own responses too, so they won’t need to waste time generating them again. Instead of passing the server a request whose response the CDN already knows (e.g. a static image), the CDN can return that response to the client immediately, reducing both server load and response time.

When servers cache static responses, everyone benefits from it. But what happens when a server caches a non-static response that contains some sensitive information? The server will start serving the cached response to everyone from now on, hence making any sensitive information in it public!

So that’s basically what Cache Deception is: making servers cache sensitive data by exploiting badly configured caching rules. Once the sensitive data is cached, an attacker can come back and collect it.
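To make the idea concrete, here is a minimal toy model in Python. It is only a sketch: the NaiveCache class, the origin function, the extension list and the /profile.png path are all made up for illustration and don’t describe any real server. It shows a cache that decides what to store by looking at the path alone, while the origin personalizes the page for whoever asked.

```python
from typing import Callable, Dict

# Assumed list of "static" extensions; a real cache rule may differ.
STATIC_EXTENSIONS = (".png", ".jpg", ".css", ".js")


class NaiveCache:
    """Toy cache that stores any response whose path looks like a static file."""

    def __init__(self, origin: Callable[[str, str], str]) -> None:
        self.origin = origin
        self.store: Dict[str, str] = {}

    def get(self, path: str, user: str) -> str:
        if path in self.store:
            # Cache hit: the origin is never asked, whoever the user is.
            return self.store[path]
        body = self.origin(path, user)
        if path.lower().endswith(STATIC_EXTENSIONS):
            # The flaw: the cache key is the path only, not the requesting user.
            self.store[path] = body
        return body


def origin(path: str, user: str) -> str:
    """Hypothetical origin that personalizes every page for the logged-in user."""
    return f"page={path} viewer={user} csrf=token-for-{user}"


cache = NaiveCache(origin)
victim_view = cache.get("/profile.png", user="victim")      # stored in the cache
attacker_view = cache.get("/profile.png", user="attacker")  # served from the cache
```

In this model the attacker receives the exact document that was rendered for the victim, including the victim’s token, simply because the path ended in .png.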

Caching User Profiles

Medium uses the Retrofit library to turn its HTTP APIs into Java interfaces for its Android app, so basically every endpoint sits neatly in the code with all of its available parameters specified. I extracted them all into a list that ended up containing about 900 endpoints.
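For readers who want to try something similar, below is a rough Python sketch of how such a list could be pulled out of decompiled sources. It is not the script I used: it assumes the decompiler prints Retrofit’s HTTP method annotations in the plain @GET("...") form, and the ExampleApi interface with its paths is invented purely for the demo.

```python
import re
from typing import List, Tuple

# Matches Retrofit HTTP method annotations such as @GET("/some/path").
# Assumption: annotations appear with a single string literal argument.
RETROFIT_ANNOTATION = re.compile(
    r'@(GET|POST|PUT|DELETE|PATCH|HEAD|OPTIONS)\(\s*"([^"]*)"\s*\)'
)


def extract_endpoints(java_source: str) -> List[Tuple[str, str]]:
    """Return (HTTP method, path) pairs found in a piece of Java source."""
    return [(m.group(1), m.group(2)) for m in RETROFIT_ANNOTATION.finditer(java_source)]


# Hypothetical decompiled interface; none of these paths are real endpoints.
SAMPLE = '''
public interface ExampleApi {
    @GET("/example/users/{userId}")
    Call<Profile> fetchProfile(@Path("userId") String userId);

    @POST("/example/posts/{postId}/claps")
    Call<Void> clap(@Path("postId") String postId, @Body ClapRequest body);
}
'''

endpoints = extract_endpoints(SAMPLE)
for method, path in endpoints:
    print(method, path)
```

Running the same function over every decompiled file and collecting the results would give a flat list of endpoints to go through.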

Some extracted endpoints

That list was a real treasure, so I couldn’t stop myself from spending some time going through it. Among other things, I looked for URLs that ended with user-controlled input, because a common misconfiguration of caching services is to cache every resource path that looks like a file. Remember, our goal is to find endpoints that both contain sensitive information and are cached by Medium’s servers, so finding an API endpoint that gets cached would be great.
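The filtering step can be sketched in a few lines of Python. This assumes the paths use Retrofit’s {name} placeholder syntax for path parameters; the candidate paths below are hypothetical and only there to show the filter working.

```python
import re
from typing import Iterable, List

# A path "ends with user-controlled input" when its last piece is a
# {placeholder} that the caller fills in.
TRAILING_PARAM = re.compile(r"\{[^{}/]+\}$")


def ends_with_user_input(path: str) -> bool:
    """True if the path template ends with a {placeholder}."""
    return TRAILING_PARAM.search(path) is not None


def shortlist(paths: Iterable[str]) -> List[str]:
    """Keep only the path templates whose tail is attacker-controllable."""
    return [p for p in paths if ends_with_user_input(p)]


# Hypothetical path templates, not taken from the real list.
CANDIDATES = [
    "/example/settings",
    "/example/users/{userId}/followers",
    "/example/tags/{tagSlug}",
    "/@{username}",
]

hits = shortlist(CANDIDATES)
```

Anything on the shortlist is worth a closer look, since the attacker may be able to make its last segment end with something like .png.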

As it turned out, Medium indeed cached paths that looked like files by default, but only for resources that were right under the root directory of the site, URLs like https://medium.com/niceImage.png.

Fortunately, my beautiful list contained one endpoint that met the above requirements: user profile pages. By setting my username to “yuval.png”, my profile page URL became https://medium.com/@yuval.png, and when someone visited it, the response was cached for a while (4 hours, then the server dropped it). And that was actually the whole bug: setting a username that ends with a file extension causes the profile page to be cached.
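The behavior I observed can be approximated with a small predicate. To be clear, this is my guess at the shape of the rule, written from the outside: I never saw Medium’s actual configuration, and treating any extension as “file-like” is an assumption.

```python
from posixpath import splitext
from urllib.parse import urlsplit


def looks_cacheable(url: str) -> bool:
    """Approximate the observed rule: cache file-like paths directly under the root."""
    path = urlsplit(url).path
    segments = [s for s in path.split("/") if s]
    if len(segments) != 1:
        # Only resources right under the root directory were cached.
        return False
    _, extension = splitext(segments[0])
    # Assumption: any non-empty extension makes the path look like a file.
    return len(extension) > 1


for url in (
    "https://medium.com/niceImage.png",
    "https://medium.com/@yuval.png",
    "https://medium.com/@yuval",
    "https://medium.com/some/dir/niceImage.png",
):
    print(url, looks_cacheable(url))
```

Under this model a regular profile URL is not cached, but the same profile with a username ending in .png is, which matches what I saw.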

What sensitive information can be extracted from cached responses of visits to my profile page?

CSRF (Cross-Site Request Forgery) tokens, which are embedded in the returned document.
Information about who viewed my profile, since the currently logged-in user can also be extracted from the returned document.

The fact that each cached response stayed there for 4 hours and blocked other responses from being cached wasn’t a problem, because a simple script can change the username repeatedly, generating new URLs that aren’t cached yet.
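Here is a small simulation of why rotating usernames gets around the 4-hour window. Nothing in it talks to a real server: the TtlCache class, the rendered text and the numbered username scheme are all made up for the sketch, and time is passed in by hand so the example stays deterministic.

```python
import itertools
from typing import Dict, Iterator, Tuple

TTL_SECONDS = 4 * 60 * 60  # the four-hour lifetime mentioned above


class TtlCache:
    """Toy cache keyed by URL whose entries expire after TTL_SECONDS."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[float, str]] = {}

    def fetch(self, url: str, viewer: str, now: float) -> str:
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < TTL_SECONDS:
            return entry[1]  # still fresh: later viewers get the old copy
        body = f"profile rendered for {viewer}"
        self._entries[url] = (now, body)
        return body


def rotating_usernames(base: str, extension: str = ".png") -> Iterator[str]:
    """Yield base1.png, base2.png, ... (hypothetical naming scheme)."""
    for counter in itertools.count(1):
        yield f"{base}{counter}{extension}"


cache = TtlCache()
names = rotating_usernames("yuval")
first, second = next(names), next(names)

cache.fetch(f"/@{first}", viewer="alice", now=0.0)
stale = cache.fetch(f"/@{first}", viewer="bob", now=60.0)   # occupied by alice's copy
fresh = cache.fetch(f"/@{second}", viewer="bob", now=60.0)  # new URL, nothing cached yet
```

The first URL keeps returning the copy rendered for the first visitor until it expires, while the renamed profile gets a fresh cache slot right away.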

Note that this bug could also have been used by users who wanted to hide the “block user” option on their own profile page, by visiting it repeatedly (again, using a script). This would work because users don’t have the option to block themselves on their own profile, so others wouldn’t have it either when they received a cached response that was generated for the account owner.

Report Timeline

I sent Medium my report through their bug bounty program, and here’s the timeline:

Aug 24 — Sent my initial report, and received an automatic email which said that Medium would try to get back to me within 48 hours.

Sep 14 — Checked with them if something wasn’t clear since they hadn’t responded yet.

Nov 1 — Sent another message, saying it was fine with me if my report got rejected, and asking for a response so I would know they had received it.

Nov 20 — Response from Medium! They apologized for the delay and rewarded my bug with $100 and a shirt.

I guess it took them a while because Cache Deception isn’t the usual kind of bug people report — but I was just hoping for a quick response asking me for more explanation or something. I assumed no one was reading their inbox.

P.S. the bug was rewarded only $100 because Medium’s program is small, not because it’s lame :P

Cache Based Attacks — Further Readings

Cache-based attacks have been known for a long time, but were considered mostly theoretical until the recent publication of two outstanding works by Omer Gil and James Kettle. If you find the subject interesting, don’t miss these:

Web Cache Deception Attack — Omer Gil, Feb 2017

While demonstrating it on PayPal, Omer coined the term Cache Deception for this new and amazing attack vector.

Practical Web Cache Poisoning — James Kettle, Aug 2018

Cache Poisoning has been known for years, but by publishing his extensive research James made it practical. Check out his follow-up article on the subject, “Bypassing Web Cache Poisoning Countermeasures”, too.

See you next time…