De-bloating my piece of the web

While tackling a TODO task on a client's mobile web app, I noticed it was loading nearly 15MB of non-app scripts. This included several third-party marketing/user assistance/analytics scripts, CDN-hosted libraries, and some ad scripts, and their embedded videos were also loading their own cornucopia of scripts, cookies, and trackers. FIFTEEN MEGABYTES! I sighed as I injected the 13th "user-experience-enhancing" tool into the app…

"Oh the modern web." I lamented - my tired smugness tinged with sadness - "Why can't you be more like my blog. It's a beacon. An oasis in this privacy-hostile, bloated world. No tracking, no ads, no client-side frameworks. Just hand-rolled, artisanal code - the epitome of integrity… what the web should be."

Then, out of interest, I opened a dev console on mrspeaker.net.

What I saw shook me to my very core.

I got fat, and chatty

The simple old homepage of Mr Speaker weighed in at over five megabytes. Five. I did a cartoon-style double take and restarted Firefox to make sure it was not a mistake. It was not. The network requests linked to six third-party domains… including Doubleclick! Cookies for Google and YouTube, a bunch of CDN calls, a huge custom emoji file (particularly disheartening as there is a site-wide emoji ban on mrspeaker.net).

I didn't even know where two-thirds of the scripts were getting loaded from, or what they were doing... all those page-loads going off to random data silos. It happened so slowly I didn't even realise I'd done it: I'd become the modern web. I was making the problem worse.

Rocky montage

This was critical. I had to make the world a better place for poor souls who accidentally googled "hacking windows pinball" and landed in the middle of my privacy-invading, bloated mess.

Begone, third parties

Step 1: remove all data leeches. Just like pretty much everyone, I use about 0.01% of my injected analytics services, and I don't really care about the results anyway. Additionally, I'm not in danger of being in the Alexa Top 500 sites, so I can handle serving a couple of JavaScript files myself. Time to banish as many non-local requests as possible:

  • Removed Google Analytics. I think there's charting/reporting on logs from my webserver if I needed it. (I don't).
  • Set privacy on embedded YouTube videos. Stop setting them cookies.
  • Removed web fonts linked to Google. Google doesn't need to know every page you visited.
  • Moved hot-linked images from GitHub to local. Microsoft doesn't need to know every page you visited.
  • Removed Three.js. Now instead of spinning cubes, you get a crappy triangle: but I'm capable of hand-rolling some neato WebGL. Also, I was CDN-ing Three.js and the CDN doesn't need to know every page you visited.
  • Moved jQuery local. The jQuery team doesn't need to know every page you visited.
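
One way to check that a clean-up like the list above actually worked is to list every host a page still asks the browser to contact. The following is a minimal sketch in Python using only the standard library; the function name `third_party_hosts`, the tag/attribute table, and the sample markup are illustrative assumptions, not the blog's real code or markup. It only looks at static HTML, so requests made later by scripts will not show up.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Tags whose listed attribute usually makes the browser fetch something.
# This table is an assumption for the sketch and is not exhaustive.
URL_ATTRS = {
    "script": "src",
    "img": "src",
    "iframe": "src",
    "link": "href",
}


class ResourceCollector(HTMLParser):
    """Collects the URLs referenced by the tags in URL_ATTRS."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        wanted = URL_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                self.urls.append(value)


def third_party_hosts(html, page_url):
    """Return a sorted list of hosts that differ from the page's own host."""
    own_host = urlparse(page_url).hostname
    collector = ResourceCollector()
    collector.feed(html)
    hosts = set()
    for url in collector.urls:
        # urljoin resolves relative and protocol-relative URLs against the page.
        host = urlparse(urljoin(page_url, url)).hostname
        if host and host != own_host:
            hosts.add(host)
    return sorted(hosts)


# Hypothetical sample markup, for illustration only.
sample = """
<script src="https://www.google-analytics.com/analytics.js"></script>
<script src="/js/main.js"></script>
<link rel="stylesheet" href="//fonts.googleapis.com/css?family=Lato">
<img src="images/popcorn.jpg">
"""
print(third_party_hosts(sample, "https://mrspeaker.net/"))
# → ['fonts.googleapis.com', 'www.google-analytics.com']
```

An empty list means the static markup no longer points anywhere but home. The browser's network tab remains the real source of truth.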

WordPress, you're not helping

Next up, time to take a look at my self-hosted WordPress install. It was always pretty bloated - and I use such a small percentage of its features that I really should switch - but hey, I've had 15 years of not doing any maintenance on it, so I can't be bothered starting now. Over the years they've added a lot of extra stuff, and I was forced to dig in a bit to de-cruft it:

  • Removed HTML cruft. It was spitting out a LOT of meta tags and other stuff in the HTML that I did not need. I also tidied up the view-source formatting so it's less HTML soup-y.
  • Removed Emoji/embeds. There were a few large JavaScript files for handling emojis and object embedding that have no place here: "Windows Live Writer" and DNS prefetch garbage be gone.
  • Audited my WordPress "theme". I used some minimal theme from over a decade ago that I'd never looked at. It was outputting so much redundant HTML and so many class names that I could chop.
  • Audited WordPress functions. Didn't reduce the external bloat, but the PHP pages are a lot smaller now!
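
To confirm the de-crufting stuck, the rendered HTML can be scanned for the usual leftovers. This is a rough sketch, assuming the page source is already available as a string; the pattern list and the name `find_cruft` are my own illustrative choices, based on markup a default WordPress install commonly emits, and the list is not exhaustive. The removal itself would normally happen on the WordPress side, which this sketch does not cover.

```python
import re

# Label -> regex for markup a default WordPress install commonly emits.
# Illustrative assumptions for this sketch, not a complete inventory.
CRUFT_PATTERNS = {
    "generator meta tag": r'<meta[^>]+name=["\']generator["\']',
    "Windows Live Writer manifest": r'<link[^>]+rel=["\']wlwmanifest["\']',
    "Really Simple Discovery link": r'<link[^>]+rel=["\']EditURI["\']',
    "DNS prefetch hint": r'<link[^>]+rel=["\']dns-prefetch["\']',
    "emoji script": r"wp-emoji",
    "embed script": r"wp-embed",
}


def find_cruft(html):
    """Return the labels of every cruft pattern still present in the HTML."""
    return [
        label
        for label, pattern in CRUFT_PATTERNS.items()
        if re.search(pattern, html, re.IGNORECASE)
    ]


# Hypothetical page head, for illustration only.
page = (
    '<head><meta name="generator" content="WordPress 5.0">'
    '<link rel="wlwmanifest" href="/wlwmanifest.xml">'
    '<script src="/wp-includes/js/wp-emoji-release.min.js"></script></head>'
)
for label in find_cruft(page):
    print("still present:", label)
```

Running it against a saved copy of each page after a theme change gives a quick regression check for anything WordPress quietly adds back.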

Fix my own shit

  • Refactored/removed a bunch of nonsense in my JS files.
  • Re-encoded large images - I like images, but 250k for a picture of some popcorn kernels baked in sourdough bread? That's too much.
  • Made it validate. Now the W3C validator says "You're good"… Party like it's 2008!
  • Forced HTTPS. And some other refactoring/tweaking of my .htaccess file.
  • Made the experience better for NoScript-ers. I have a lot of odd JavaScript experiments around, so my blog is "Best viewed with JavaScript" - but these days we know script should rarely be enabled on the web. I wanted to ensure the experience was not as bad as, say, your average React app, and that everything works sans JavaScript.

I'm slowly going through each page and un-bit-rotting and un-bloating everything. The homepage is 4.5MB lighter. What remains is mostly images, though the largest single file is jQuery (90KB), which I still need to load in the head of my page for historical reasons: 15 years ago I in-lined lots of my wacky JavaScript experiments and I haven't had a chance to fix this (and a quick test with Zepto didn't work - I might need to re-write some selectors).

My goal is to make every page (not including games and other standalone experiments) come in under 500KB, and (where possible) not give people's info to other people's businesses. Perhaps I'll look into self-hosting videos too - YouTube is pretty much the last third party I can't see how to ditch.
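
A per-page budget like this is easy to check mechanically. Below is a small Python sketch, assuming the resource sizes for a page have already been collected (for example, copied from the browser's network tab). The name `weight_report`, the 1000-bytes-per-K convention, and the `index.html` size are assumptions for illustration; the jQuery and popcorn figures are the rough sizes mentioned above.

```python
# The 500K-per-page goal, treating K as 1000 bytes (an assumption).
BUDGET_BYTES = 500 * 1000


def weight_report(resources, budget=BUDGET_BYTES):
    """Summarise a page's weight from a {resource name: size in bytes} dict."""
    total = sum(resources.values())
    largest_first = sorted(
        resources.items(), key=lambda item: item[1], reverse=True
    )
    return {
        "total": total,
        "within_budget": total <= budget,
        "largest_first": largest_first,
    }


# Illustrative sizes only; index.html is a made-up figure.
page = {
    "jquery.js": 90_000,
    "popcorn.jpg": 250_000,
    "index.html": 12_000,
}
report = weight_report(page)
print(report["total"], report["within_budget"])
# → 352000 True
print(report["largest_first"][0])
# → ('popcorn.jpg', 250000)
```

Sorting largest-first puts the next thing worth shrinking at the top of the list.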

Web smugness restored

What I learned from this experience is that every tool, every dependency, everything you put in your codebase that you didn't make yourself comes at a cost. Sometimes the cost is small, but we're so addicted to convenience that those costs accumulate.

The cost of convenience can be bloat (your blog platform thinks everyone needs emojis). Or the cost can be privacy (your CDNs and providers receive all your user activity and data; they know more about your business and users than you do). Or the cost can be that you just don't know what's going on in your system anymore.

Also I learned that I've written a lot of weird stuff on this blog over the last decade-and-a-bit. Here's to many more weird, non-bloated, non-privacy-invasive years to come!