For months, a bug in Cloudflare’s content optimization systems exposed sensitive information sent by users to websites that use the company’s content delivery network. The data included passwords, session cookies, authentication tokens and even private messages.

Cloudflare acts as a reverse proxy for millions of websites, including those of major internet services and Fortune 500 companies, for which it provides security and content optimization services behind the scenes. As part of that process, the company’s systems modify HTML pages as they pass through its servers in order to rewrite HTTP links to HTTPS, hide certain content from bots, obfuscate email addresses, enable Accelerated Mobile Pages (AMP) and more.

The bug that exposed user data was in an older HTML parser that the company had used for many years. However, it didn’t get activated until a newer HTML parser was added last year, changing the way in which internal web server buffers were used when certain features were active.

As a result, internal memory containing potentially sensitive information was being leaked into some of the responses returned to users as well as to search engine crawlers. Web pages with the sensitive data were cached and made searchable by search engines like Google, Yahoo and Bing.

The leakage was discovered almost accidentally by Google security engineer Tavis Ormandy while he worked on an unrelated project. As soon as he and his colleagues realized what the strange data they were seeing was, and where it was coming from, they alerted Cloudflare.

This happened on February 18th. Cloudflare immediately assembled an incident response team and killed the feature that was causing most of the leakage within hours. A complete fix was in place by February 20th. The rest of the time, until the incident was publicly disclosed Thursday, was spent working with search engines to scrub the sensitive data from their caches.

“With the help of Google, Yahoo, Bing and others, we found 770 unique URIs that had been cached and which contained leaked memory,” said John Graham-Cumming, Cloudflare’s CTO, in a blog post. “Those 770 unique URIs covered 161 unique domains.” A URI (Uniform Resource Identifier) is a character string that identifies a resource on the web, and is sometimes used interchangeably with the term URL (Universal Resource Locator).

According to Graham-Cumming, the leakage might have been going on since September 22, but the period of greatest impact was between February 13 and February 18, when the email obfuscation feature was migrated to the new parser. Cloudflare estimates that around one in every 3.3 million HTTP requests that passed through its system potentially resulted in memory leakage. That’s about 0.00003 percent of all requests.



Source link