Optimizations implemented into my website

Hii guys,

I've gotten some questions about what optimizations I have implemented on my website to allow it to load so fast (although the speed at which my site loads may vary for some people), so in this post, I'll be going over some optimizations I have made to achieve this.
Please do note that I'll only cover things that you can most likely implement on your own website as well (so no CakePHP-specific techniques); still, since I'll be covering my specific website, implementing these optimizations may take some work on your end depending on what stack you use.
I'll be going into my stack and why I chose certain things in a bit.
Additionally, this guide is written with commit ab3eace, so depending on when you read it, things may have already changed.
And finally, while I have done a lot of reading + experimentation, not all optimizations might be ideal and some of you may facepalm extremely hard.
Feel free to inform me of any mistake I made so I can see what I can improve.

My stack

My stack is fairly simple.
Everything is Dockerized, however, this doesn't affect performance too much (only marginally at best).
In this section, I'll go over each of the components I use in my core stack and go into a bit of detail on why I chose these components specifically.
I use more than just this; however, the things not listed here will be mentioned later, as those are more part of my toolkit and often don't make it to the servers.

In the front, I have a reverse-proxy with Traefik which handles routing requests to the appropriate containers (since I run multiple services on my servers like my website, DevineHQ's website, GamingHQ's website, MaidBot, my Matrix Synapse server etc.).
It also serves as a load-balancer, however, since one container per "component" is enough at this moment, there is no real load-balancing done.
And finally, Traefik is in charge of managing and renewing the HTTPS certificates (provided by Let's Encrypt) for each service I'm running.
Traefik doesn't include any real special sauce.
The only downside is that Traefik does not support HTTP/2 Server Push, to which I'll come back later.

Next, I run Nginx as my webserver.
This means that the configurations I'll show later only apply to Nginx.
I chose Nginx because it is what I had been used to for years.
Back in the day (think 2012-2013), Nginx was way, way faster than Apache2 (like Apache2 didn't even come close) and there weren't many big alternatives (think Caddy and H2O) like we have today.
And by the time I discovered Caddy (and later H2O), I was already using Traefik in my infrastructure so there was no real need for the special sauce that both Caddy and H2O provide (namely their integrated HTTPS).
Additionally, the configuration of Nginx is a lot easier compared to Apache2, mainly because the configuration syntax that Nginx uses is a lot simpler than that of Apache2.
I don't know whether Nginx is still a lot faster than Apache2, however, I'll stick with Nginx.
Don't get me wrong, if you like Apache2, then, by all means, do go ahead, I just personally think Nginx is a lot better for production than Apache2.

For my dynamic stuff, I use PHP version 7.4.
PHP is hands-down my favourite back-end language (sue me), so well... It's kind of a no-brainer that it'd end up being used in my stack right?
I still use PHP 7.4 at this moment despite PHP 8.0 already being released because I still need to find out what stuff breaks when I upgrade.
One feature PHP 8.0 includes is the very hyped JIT, however, while it does certainly excel in CPU-bound workloads, my website doesn't have much of these workloads (the only one that comes to mind is watermarking my images, but I'm not even sure about that) so the real-world performance gains would be marginal at best.
PHP 8.0 does include a lot of nice new ways to write code (like named arguments, union types and constructor promotion), but I'll need to see if it breaks anything that's already in place.
So the upgrade to PHP 8.0 may take a little while as it doesn't have any real priority to me just yet.
My PHP containers talk to Nginx using FastCGI.
Nginx serves static content as-is without passing it to the PHP container (provided Nginx can find it locally); anything not found as a static file on the webserver is passed to the PHP container.
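As a rough sketch (the paths and the `php` upstream name are placeholders, not my actual config), the Nginx side of this looks something like:

```nginx
server {
    root /var/www/html/webroot;

    # Try to serve the request as a static file first;
    # only fall through to PHP when no file matches.
    location / {
        try_files $uri /index.php?$args;
    }

    # Anything ending up at index.php is handed to the PHP container over FastCGI.
    location ~ \.php$ {
        fastcgi_pass php:9000;
        fastcgi_index index.php;
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    }
}
```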

Next, my database, for which I use MySQL 5.7.
I honestly can't say much about it, except that I chose it because I'm familiar with it.
I have looked at MariaDB but still need to see whether it's really worth upgrading.
No special sauce here either.

Now that we know my core stack, we can start talking about some things I took into consideration while optimizing my website.

Optimizing Serving: Protocols

It is important to at least grasp the fundamentals of the protocols used in your website in order to understand how to optimize for it.
Most websites nowadays use HTTP/2 (which is just a short way of saying "HTTP version 2"), which can be served over any reliable transport protocol.
In most cases, this is TCP; however, HTTP/3 is already looking around the corner as well, which uses QUIC (HTTP/3 was formerly known as "HTTP-over-QUIC"), but HTTP/3 adoption is still rare, and since Traefik doesn't support it yet either, I don't support it at the moment.

In order to understand what I'm going to talk about, we need to understand the basics of how a TCP connection is set up through the so-called "handshake".
This handshake is the most "expensive" part in setting up a connection, and as such, you do want to limit the number of handshakes that need to be done.
You have probably seen this diagram somewhere else.

HTTP/2 improves on the older HTTP/1.1 protocol in a variety of ways.
One of them is that HTTP/1.1 could only serve a single request at a time.
So in order to download all required files (eg. scripts, CSS and images), your browser needed to open more connections (which involves the expensive handshake).
Most browsers limited this to 6 connections at a time per domain (Internet Explorer 10 and Internet Explorer 11 increased this to 8 and 13 respectively).
So, a lot of developers implemented what is called "domain sharding" to work around this limitation.

Luckily, most browsers could re-use a single connection to request another file later down the line, which would be a slight benefit when needing to download multiple files from a single source.
With HTTP/2, we can use a single connection for multiple files by using a technique called "multiplexing", thus eliminating the need for domain sharding.

While HTTP/2 is supported over both HTTP and HTTPS as per the specification, most browsers only allow it to be used with HTTPS, which you should be using anyway (and if you don't: stop reading right here, implement HTTPS, then come back).
Most modern webservers and load balancers support HTTP/2.
So, by using HTTP/2 instead of HTTP/1.1, you can increase performance significantly with reduced complexity as opposed to domain sharding.

HTTP/3 works by sending data over UDP instead of TCP, which eliminates the separate TCP handshake (QUIC combines connection setup with the TLS handshake); however, since Traefik doesn't support it yet, I won't talk much about it here.
I will, however, keep an eye on when Traefik will support HTTP/3.

Optimizing Serving: Compression

Compression allows files to be sent using fewer bytes by encoding the data that is sent more efficiently.
This serves two main benefits:
- Saving bandwidth
- Saving time (bigger files take longer to download)

There are two different ways of compression:
- Lossy compression
- Lossless compression

Lossy compression is a compression technique where you sacrifice some data to make a file smaller, often at the cost of quality.
This is often done with images, audio and video (JPEG and MP3 are examples of lossy compression formats, and the codecs commonly used in MP4 are lossy too), as it doesn't always matter there that some data (and as such, quality) is lost.
With lossy compression, you just need to find a balance between quality and filesize.
For example, when using a lossy compression on images, you can take every group of 4 pixels and take their average values to create one single pixel.
eg: 4 + 3 + 2 + 1 becomes 2.5 and 8 + 7 + 6 + 5 becomes 6.5.
You now saved the additional data of 3 pixels, however, these 3 pixels are now lost forever.
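The 4-pixel averaging above can be sketched in a few lines of Python (a toy illustration, not a real image codec):

```python
def downsample(pixels):
    # Crude lossy compression: replace every group of 4 pixel values
    # with their average. The original values are lost for good.
    return [sum(group) / len(group)
            for group in (pixels[i:i + 4] for i in range(0, len(pixels), 4))]

print(downsample([4, 3, 2, 1, 8, 7, 6, 5]))  # → [2.5, 6.5]
```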

However, it may not always be desirable to lose some data, as is the case with scripts and CSS (imagine missing half your scripts!).
For these, we use what is known as "lossless compression".
Using lossless compression, we trade a bit of filesize in for not losing any data.
Lossless compression works best when there are repeating bits of data.
Eg. the string "AAABBBCDDEFFFFFF" is 16-bytes long, but we see some repeating bits of data, so if we instead write it as "A3B3C1D2E1F6", we now have 12-bytes instead, saving us 4 bytes!
And since we didn't leave out any data, we can easily translate it back (also known as "decompression"):
- "A3" becomes "AAA"
- "B3" becomes "BBB"
- "C1" becomes "C"
- "D2" becomes "DD"
- etc.
Trying to compress non-repeating runs, however, may actually increase our filesize.
Eg. "AABCDDEFGHIIIJJJ" is 16-bytes long and becomes "A2B1C1D2E1F1G1H1I3J3", which is actually 20-bytes in size, so we increased our size by 4-bytes, which isn't desirable.
So, it may be better to write it as "A2BCD2EFGHI3J3" instead, which is 14-bytes (still saving us 2 bytes). All we need to know when decompressing is that when there is a number after a character, we repeat that character N times, but if there is no number after a character, we just put it there once:
- "A2" becomes "AA"
- "B" becomes "B"
- "C" becomes "C"
- etc.
This "compression algorithm" is very crude and better algorithms obviously exist.
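For the curious, this crude run-length scheme can be sketched in Python (toy code; it assumes the input contains no digits):

```python
import re

def rle_encode(s):
    # Collapse runs of the same character; omit the count for runs of 1.
    out = []
    i = 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1
        run = j - i
        out.append(s[i] + (str(run) if run > 1 else ""))
        i = j
    return "".join(out)

def rle_decode(s):
    # A character followed by a number is repeated that many times;
    # a character without a number appears once.
    return "".join(ch * (int(n) if n else 1)
                   for ch, n in re.findall(r"([^\d])(\d*)", s))

print(rle_encode("AABCDDEFGHIIIJJJ"))  # → A2BCD2EFGHI3J3
```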
Speaking of which, let's talk about those.

In the webspace we have 2 major compression algorithms for sending data:
- GZip
- Brotli

GZip is the "old gold" compression algorithm that has served outside the webspace as well (often seen on Linux as ".tar.gz"); it has served us very well and doesn't have any real requirements to be used.
GZip works similar to the algorithm we mentioned above by compressing down repeating bits of data.
However, in recent years, a new kid has been on the block specifically designed for the web: Brotli.
Brotli works in a totally different way and is best suited towards text (since most things we find on the web are actually just text, like our scripts and CSS).
Instead of just compressing repeating bits of data, Brotli also keeps a predefined dictionary of common strings, along with using a Huffman tree.
Sadly, even a small example would be very complex to demonstrate here, so I've opted not to, sorry.
If you are interested in the nitty-gritty, you can read the specification on it.

Compression often comes in multiple "levels", indicating how much effort your server will put into compression.
Brotli starts from 1 ("only the obvious") all the way to 11 ("compress me harder daddy") while Gzip starts at 1 and goes up to 9.
GZip can get a savings of 78% when using GZip-1 but can reach all the way up to 81% when using GZip-9.
Brotli, however, can get a savings of 81% when using Brotli-1 but can reach all the way up to 87% when using Brotli-11.
However, it isn't a wise thing to just slam everything on the highest level and call it a day since the higher the compression you want to achieve, the more CPU time it'll cost you (and as a result, you need beefier servers and/or have your visitor wait longer).
What you need to do is find a nice balance between speed and compression.
Additionally, you can mix&match between Brotli and GZip as well.
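You can see the level trade-off for yourself using Python's standard gzip module (Brotli would need a third-party package, so gzip has to do for the sketch):

```python
import gzip

# A repetitive, text-like payload (compression loves repetition).
data = b"<div class='post'>hello world</div>\n" * 2000

fast = gzip.compress(data, compresslevel=1)  # cheap on CPU, bigger output
best = gzip.compress(data, compresslevel=9)  # more CPU time, smaller output

print(len(data), len(fast), len(best))
```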

Nginx allows me to use "pre-compressed" files, which lets me compress some files beforehand and some on-the-fly.
Pre-compression is often done during the build process, where waiting a second more or less per file won't matter too much.
I have made use of this by pre-compressing my static assets (like scripts, CSS and some images) in my build process as well with maximum compression.
Doing so allows me to have maximum compression (at the cost of some extra time for the build process) for files that never change once the build process is done without adding additional wait time once someone actually wants the file.
If I were to use maximum compression on-the-fly, not only would my CPU have to work even harder for each file when requested, it would also have to do this every time someone wants it, which wastes resources.
This would leave me with two options:
- Having you wait longer for compression to complete (not exactly ideal).
- Serving files that are bigger in size (not exactly ideal either).
So instead, by doing it during the build process, I can do it once, but do it REALLY well.
For the visitor, the only thing you'll notice is faster load times because I now have to send less data over my already pretty slow connection (and maybe yours isn't that fast either).
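On the Nginx side, serving pre-compressed files comes down to the `gzip_static` directive (from the built-in ngx_http_gzip_static_module) and `brotli_static` (from the third-party ngx_brotli module); a minimal sketch:

```nginx
# Compress dynamic responses on-the-fly at a cheap level...
gzip on;
gzip_comp_level 1;
gzip_types text/css application/javascript application/json;

# ...but prefer pre-compressed files when they exist on disk:
# for /bundles/app.js, Nginx looks for /bundles/app.js.br and
# /bundles/app.js.gz before compressing anything itself.
brotli_static on;   # requires the ngx_brotli module
gzip_static on;     # requires ngx_http_gzip_static_module
```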

So what compression level should you be using?
Well, it depends on your use-case.
As I said, you need to find a balance between waiting time and compression.

Generally, I follow these guidelines to decide what levels I need to use:
- Highly dynamic content (eg. the HTML page itself or APIs): GZip-1.
- Highly static content (eg. scripts and CSS): Brotli-11 (pre-compressed).
- Already compressed content (eg. zipfiles): don't bother.

One downside about Brotli is that browsers only support it when using HTTPS, which, again, you should be using anyway.

Optimizing Serving: Browser Caching

In order to save resources, all browsers leverage a local cache to store assets in.
This means that when an asset is loaded, your browser will check whether it exists in the cache and, if so, use that.
If it doesn't exist in the cache, it'll download it from the server.
You can optimize your serving by setting proper cache policies.

One thing you basically must do is set an expiration; often, this can be set to be as long as possible, but there is a catch.
Even if you tell the browser to set the expiration to "infinite", the browser may clean the asset up regardless in order to save space, meaning it needs to be re-downloaded.
Generally, you want to set this expiration date to be as long as possible.
On my website, I've set it to 6 months, however, anything above a week is acceptable.
However, dynamic items (like API calls and HTML pages) shouldn't be cached at all since this can cause issues.
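In Nginx, such a cache policy is only a few directives; a sketch along those lines (the file extensions are just examples):

```nginx
# Long-lived caching for static assets...
location ~* \.(css|js|png|jpg|webp|woff2)$ {
    expires 6M;
    add_header Cache-Control "public";
}

# ...but none for dynamic pages and API responses.
location /api/ {
    add_header Cache-Control "no-store";
}
```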

But then how would the browser know whether an asset has changed?
Well, this can be done in two ways: the ETag header and/or by appending a query parameter.
The ETag header is used in conjunction with the server; however, in my opinion, it can be inefficient to use.
To see why, we need to know how ETag headers work.
When your browser downloads an asset, the server also adds an ETag header to its response.
This header contains an identifier of the asset, usually a hash.
When a browser now tries to load the asset, it'll send an If-None-Match header with the request, containing the ETag hash.
The server will then check whether the content of this header matches and two possible scenarios can play out:
- The server will send a regular "200 OK" status back along with the asset in its body.
- The server will send a "304 Not Modified" status back with no content body.
In the first case, this is fine and we can just use our "fresh" asset.
In the second case, we wasted a request (albeit small) on basically nothing.
Instead, what I do is append a small hash to all my asset URLs as a query parameter (basically telling the browser beforehand: "yo, this has changed") instead of using ETags.
This hash is calculated on the server and changes our scenarios a bit:
- The browser realizes it's not in the cache and will download it as normal.
- The browser realizes it's in the cache and immediately uses the cached value.

I personally think that the ETag concept is dumb and that query parameters like these should be enough to decide whether to use the cache: I can slam the hash into the HTML, and the browser can see "hey, I already have this asset with this query parameter, I don't need to check it on the server", saving us an additional request (which saves a few bytes and a few milliseconds of time, and allows our server to process other requests instead).
The downside is that this isn't compatible with most caching proxies, but imo, those caching proxies need to be smarter about it.
I do use ETags as a fallback because, to achieve this effect, I need to set the "Cache-Control" header to "immutable", which isn't supported by all browsers according to MDN.
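The query-parameter approach can be sketched in Python (the helper name, the hash choice and the 9-character truncation are my own assumptions for the illustration):

```python
import hashlib

def asset_url(path, content):
    # Hash the asset's content and append it as a query parameter.
    # If the file changes, the hash (and thus the URL) changes, so the
    # browser re-downloads it; otherwise the cached copy is used as-is.
    h = hashlib.sha256(content).hexdigest()[:9]
    return f"{path}?h={h}"

print(asset_url("/bundles/js/app-js.js", b"console.log('hi');"))
```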

Optimizing Serving: To CDN or not to CDN?

I personally don't use a CDN, however, using a CDN may be desirable for most.
CDNs have servers scattered all over the globe, which means latency between client and server can be lower.
This means requests for assets can go a lot faster in theory, especially if people are further away from your own server.
However, the reason I don't use a CDN is due to mixed reasons:
- Using a CDN can be pricey (unless you use "public" CDNs, which give you less control over the optimizations listed above).
- Using a CDN can cause downtime (in theory).
- Using a CDN exposes your visitors to privacy invasions.

First, let's cover the cost.
Unless you use a "public" CDN (eg. the CDN jQuery provides for easy use of their assets), prices can go up quite badly.
Taking the pricing of KeyCDN, for example, it costs me about $0.04/GB for the first 10TB with a minimum of $4 per month.
Additionally, their price varies depending on where your visitors come from, so if you run internationally, well, it can add up really quickly.
Taking some numbers from my November 2020 analytics, I have had about 905 unique visitors.
During November, my website lacked some optimizations I have now.
So, downloading all the required assets took about 455.41KB.
Assuming none of these visitors had any of the assets cached (and excluding images), this means that I had served about 0.412GB worth of assets.
This isn't enough to break KeyCDN's $4 boundary so I'd just pay $4 instead for a single month, which isn't a lot, however, it's still money.
My server was already running anyway, so I don't lose much money whether I serve 1GB worth of assets or 20GB worth of assets.
Most websites may run a lot more assets, so prices can go up really easily there as well.

Next, let's cover the theoretical downtime.
Every service is bound to run into some issues sooner or later; most CDNs have a lot of redundancy, but on the 17th of July, 2020, Cloudflare had an issue with their network, which caused it to go down here and there.
Any CDN can go down at some point, leaving your website without its needed assets.
While very unlikely, it played a crucial role as to why I don't use any CDN at all.

Finally, there are some privacy concerns.
I don't think CDNs really do anything fishy but I simply do not want to trust 3rd parties like this.
I don't always know what information they collect and how they process it (and honestly, their ToS and privacy policies are a lot of legal jargon, and I'm not a lawyer...).
Again, I don't think CDNs are inherently bad when it comes to privacy, but I'd like to keep as much with me as I can.

There are some more downsides to CDNs that should be self-explanatory like:
- Requiring additional DNS lookups to connect.
- It may require more HTTP requests to fetch your asset.
- You may not have perfect control over how things are served (eg. cache control, compression levels etc.).

In my opinion, the benefits of having less latency do not really outweigh the downsides.

Optimizing PHP: OPCache

Next let's look a bit more at PHP.
PHP is an interpreted language, which means the code being executed is read, "translated" (compiled) into an intermediary language (the "opcode") and then executed by the engine, which turns the opcode into machine code (that your CPU can actually understand).
This process looks something like this:

PHP opcode is similar to the bytecode found in the Java world.
For PHP, it translates this code:
function foo(string $s1, string $s2, string $s3, string $s4) {
  $x = ($s1 . $s2) . ($s3 . $s4);
  return $x;
}
To something like this opcode:
foo: (lines=8, args=4, vars=4, tmps=0)
  L0: CV0($s1) = RECV 1
  L1: CV1($s2) = RECV 2
  L2: CV2($s3) = RECV 3
  L3: CV3($s4) = RECV 4
  L4: T6 = CONCAT CV0($s1) CV1($s2)
  L5: T7 = CONCAT CV2($s3) CV3($s4)
  L6: T5 = CONCAT T6 T7
  L7: ASSIGN CV4($x) T5
  L8: RETURN string($x)  

The parsing and compiling to opcode has to be done every time the PHP script is executed (so basically, for every request made).
This is highly inefficient and rather unnecessary since our script is very unlikely to have been changed (unless you're one of those goofballs that still deploys code to staging and live using (S)FTP).
What if we could store this opcode somewhere so we can use that instead of always having to waste resources on recompiling?
Oh wait!
We can!
Meet: OPCache.
The OPCache is a place where we store the opcode and retrieve it from, often this is located in memory.
So instead of our diagram above, it now looks more like this:

We now check whether a script has already been put into the OPCache, and if so, we obtain the cached opcode and use that instead of going through the expensive parsing and compilation, which saves CPU time (and as a result, money if using a VPS) and wait time.

Enabling OPCache is really easy.
All you need is to install the OPCache extension and enable it in your php.ini:
; Enable the OPCache
zend_extension=opcache
opcache.enable=1

There are some additional things you may want to change depending on your scenario.
You can find my config, along with why I set certain values in here.

Optimizing PHP: APCu

APCu is the "new" version of APC.
APC in the past provided the opcode cache we talked about in the last section; however, as PHP has shipped its own OPCache since PHP 5.5, this part became redundant, and the developers worked on the next version of APC: APCu.
As you may have been able to deduce, APCu is a caching extension that can be used to store objects in.
This is very helpful when dealing with things like external API (so that you don't hit a rate-limit as easily) or when you don't expect data to change anytime soon.
APCu uses the shared memory of PHP, so this means that if you restart PHP (eg. your server crashes or you make a new deployment), all data in the store is gone.
Additionally, you may not want to store too much into it as this can easily clog up your memory (eg. it's not advisable to store too much user-specific data in APCu).
I don't use APCu nearly as much as I should, only really using it for external APIs, but some examples of things you may want to cache in APCu include (but are not limited to):
- Non-user specific external API data (eg. my Twitch livestream status).
- Non-dynamic partials (eg. bits of HTML that don't have any conditionals).
- Compiled models.
- File hashes for assets (so you don't have to re-compute them each time they are requested).

All you need is to install the APCu extension and start using it:
// Check if data exists in APCu
// If so fetch it from cache
// Else, fetch data from source and cache it for 1 hour
if (apcu_exists('my_key')) {
  $data = apcu_fetch('my_key');
} else {
  $data = get_data();
  apcu_store('my_key', $data, 3600);
}

// ... Use data

Optimizing PHP: Queueing

By using a queue system (I use a CakePHP plugin called Queuesadilla for this), I can decrease load times a bit by delaying certain things to be picked up by another container in my stack.
Eg. I use IPFS to serve the images you've seen above, however, in order to do so, I obviously need to upload said images to IPFS.
In order to not increase the response time of my server, I use a queue to do this in the background.
It allows my website to not have to wait for this upload to happen before serving it out (if an image is not uploaded to IPFS yet, it'll just serve you a local version instead).
This means that if my IPFS node is slow to respond to uploads, your load times won't suffer as a consequence for this.

Things you may want to put into a queue:
- Sending emails.
- Uploading things elsewhere.
- Basically anything that doesn't need to be available immediately with this request.
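The pattern itself is simple; here's a toy in-process sketch in Python (the real setup uses Queuesadilla with a separate worker container, and `upload_to_ipfs` is a made-up stand-in):

```python
import queue
import threading

jobs = queue.Queue()
uploaded = []

def upload_to_ipfs(path):
    # Stand-in for the real (slow) upload.
    uploaded.append(path)

def worker():
    # Background worker: pull jobs off the queue and run them.
    while True:
        fn, args = jobs.get()
        fn(*args)
        jobs.task_done()

threading.Thread(target=worker, daemon=True).start()

# A request handler only enqueues the job and returns immediately;
# the visitor never waits for the upload itself.
jobs.put((upload_to_ipfs, ("/images/header.webp",)))
jobs.join()  # only here so the demo waits; a web request wouldn't
```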

Optimizing Assets: Treeshaking

When using 3rd party CSS libraries like Bootstrap4, Bulma, Materialize etc. it can often occur that there is a lot of CSS on your website that you don't actually use.
This means that you'll add data that would be wasted.
Treeshaking allows us to scan what CSS rules we actually use in our application and only include these in our served CSS files.
During a big update I've made a little while ago, I started doing some treeshaking myself, which allowed me to remove a lot of unused CSS from my CSS, saving about 13.5KiB, which is actually a lot considering this is even with Brotli-11 (so you can imagine how much unused CSS there was).
Treeshaking can be done in a variety of different ways, TailwindCSS offers it as part of their build process, however, since I don't use TailwindCSS, I just use Gulp along with purgecss, which scans all my handlebars templates, my Javascript files and my Cake template (though it works with any PHP and HTML file) and checks every rule in my CSS to see whether it exists.
This procedure can take quite a while, which is why it's done during my build process.

It should also be possible to treeshake Javascript files to remove any unused Javascript, however, I am still trying to figure that part out.

Optimizing Assets: Bundling

Bundling allows us to serve some scripts and CSS as one big file instead of multiple small files.
While this decreases granular control for caching (as we've discussed previously) since now the entire bundle needs to be re-downloaded, it does trade this for fewer requests.
Remember, each request adds a bit of load time due to latency; however, if we serve one big file instead of 7 small files, we only pay for this latency once instead of 7 times.
By bundling those 7 files into one bigger file, we can as such lower our total load time by a few milliseconds (or more, depending on said latency).
Especially when not using a CDN, bundling can be a big saving in this regard and the benefits only become bigger the longer the physical distance packets have to travel.
An additional benefit of bundling means that often files can be compressed a bit better.
Just by bundling some files, I saved about 7KiB for my CSS and about 32.4KiB for the JS, which is actually quite significant.
The most popular bundling tool is called "Webpack", however, getting this setup can be a pain.
So instead, I just opted for writing my own very crude bundler (which just concatenates all files belonging to a bundle).
While probably less ideal than Webpack for creating bundles, it works alright and is easy for me to configure with new bundles and assets, so it'll do.
Additionally, Webpack was pretty difficult to add into my stack without increasing the number of dependencies significantly.
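Since the bundler really is just concatenation, its core fits in a few lines; a sketch in Python (file names hypothetical):

```python
from pathlib import Path

def bundle(sources, out_file):
    # Crude bundler: concatenate the source files in order, separated
    # by newlines so the last line of one file can't run into the
    # first line of the next.
    Path(out_file).write_text(
        "\n".join(Path(src).read_text() for src in sources)
    )

# Hypothetical usage:
# bundle(["js/nav.js", "js/lazyload.js"], "bundles/js/app-js.js")
```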

Optimizing Assets: Thumbnails

Previously, when you visited a blog post, my blog would serve the entire full-resolution, high-quality image to you.
This had the downside that some blog posts would serve you 30MB worth of images as-is.
It obviously isn't very efficient to serve a 1920x1080 image when it only displays as a 450x253 image.
As such, I make a set of thumbnails when an image is first loaded (this is why some blog posts load up a bit slower when it's the first time).
Thumbnails are:
- lower quality (this is quite visible).
- smaller in resolution.

Additionally, I use the WebP format to allow for higher compression.

By doing this, it allows me to instead serve "temporary" images that are quite tiny in size (a few KB on average) and only serve the full ones when the user requests it (by clicking on an image).
Sure, they don't often look as good but get the job done just well.

Client Optimizing: External Data (APIs)

Sometimes you may need to access external data from your client but this can significantly slow down the webpages if done improperly.
Previously, I used to access my Twitch status through PHP and then render it on the server in the HTML body.
This is the way a lot of people do it, however, this has some issues:
- If the external API is slow, your page is slow.
- If the external API is dead, your page is dead, or you've wasted time pointlessly.
- The first visitor to trigger the fetch might be screwed with waiting.

So instead, I decided to move external data fetching to my own API.
This allows me to use Javascript to fetch this data after the page is loaded, wasting little to no time at all (just delaying the API calls a bit).
Additionally, I use APCu to cache the data as well so that if there is a sudden influx of people, I don't hit the rate limit of Twitch's API, though I did this as well before moving to an API.

Always use your own server as a proxy (with caching) when trying to access external APIs or you'll be bound to run into issues (besides, it's more secure to use a proxy anyways).
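The proxy-with-cache pattern looks roughly like this in Python (a dict stands in for APCu, and `fetch_twitch_status` is a hypothetical fetcher):

```python
import time

_cache = {}  # key -> (expires_at, value); a stand-in for APCu

def cached(key, fetch, ttl=300):
    # Serve from cache while fresh; otherwise hit the external API
    # once and cache the result for `ttl` seconds.
    now = time.time()
    entry = _cache.get(key)
    if entry and entry[0] > now:
        return entry[1]
    value = fetch()
    _cache[key] = (now + ttl, value)
    return value

# Hypothetical usage:
# status = cached("twitch_status", fetch_twitch_status, ttl=120)
```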

Client Optimizing: DOM Blocking

Blocking your DOM can lead to a severe impact on the load time of your page.
When your browser receives HTML, it looks over it and builds a DOM (which is an internal object that can be manipulated).
However, when your browser encounters a script, a CSS file or any other external asset, it'll want to download this.
While it downloads this asset, your DOM runs the risk of blocking, meaning it will pause until the asset is downloaded and processed.
This is undesirable as the time that it spent on downloading, it could also have spent on building the DOM further.

This is why I use the defer keyword so heavily.
Defer tells the browser to not bother with it just yet, instead, first finish up the rest then download and process the asset.
This can decrease load times significantly; however, deferring the wrong assets can also be severely detrimental to your user experience.
You can also use the async keyword; however, this is not my preferred way, since it says: "Download the asset in the background, then come back and process it as soon as it's done".
Additionally, it means that some assets may finish loading before others, which can cause some nasty bugs if one of those assets depends on another.

A deferred script can look like this (ignore the ="1"; it's an artifact from CakePHP but doesn't affect anything):
<script type="text/javascript" src="/bundles/js/app-js.js?h=a4a20570c" defer="1"></script>

I'll go into how you can use the benefits of defer with the benefits of async in a bit.

Client Optimizing: Preconnect

If you use multiple domains (eg. a CDN) you may want to have a look at the preconnect attribute.
What it tells your browser is to start connecting to the other server in the background.
Connecting to a different server can take a little bit of time due to the handshakes involved, which consist of both the TCP/IP handshake and the TLS handshake.

This is especially noticeable when the server is further away from you due to the time it takes for a packet to go from one place to another.
By preconnecting, you can do this while your client is working on other stuff and once it actually needs something from the server, the connection is all ready to go.
This can shave off a few milliseconds (which is a lot) easily.
Preconnecting can be done by adding this to the head of your HTML:
<link rel="preconnect" href="https://www.finlaydag33k.nl">  

Obviously, that HTML code won't work on my blog since... well... you're already connected.
A downside of using preconnects is that they're only useful when you know the browser will do something with the connection before the server closes it.
So don't use preconnects when you won't use the connection nearly immediately.

If you don't know whether you'll really use the connection immediately, but do know it is likely to be used at a later point in your lifecycle, you can alternatively use the "dns-prefetch" hint instead.
This resolves the DNS lookup ahead of time (which saves you a wee bit of time) but doesn't actually open a connection just yet.
<link rel="dns-prefetch" href="https://www.finlaydag33k.nl">  

Client Optimizing: Loading Assets - Preload

Preloading can offset the penalty of using defer by still downloading the asset in the background.
By using preload in combination with defer, you can create a similar effect to the async keyword, without the nasty bugs.
What preload does is download the asset in the background, but not execute it just yet.
Preloading is a very strong strategy when you know you'll be using that asset in a bit.

Preloads can look like this:
<!-- Preload and apply immediately -->
<link rel="preload" href="/css/main.css?h=738352e4e" as="style" onload="this.rel='stylesheet'">

<!-- Preload but do not apply -->
<link rel="preload" href="/bundles/js/app-js.js?h=a4a20570c" as="script">

<!-- Apply preloaded script -->
<script type="text/javascript" src="/bundles/js/app-js.js?h=a4a20570c" defer="1"></script>  

Things you may want to preload:
- Scripts
- Fonts
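For fonts specifically, note that a preload needs the crossorigin attribute, even when the font is served from your own origin (the path here is illustrative):

```html
<link rel="preload" href="/fonts/opensans.woff2" as="font" type="font/woff2" crossorigin>
```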

Client Optimizing: Loading Assets - Lazyloading

I use lazyloading on most images on my website.
By doing this, I load images only when the user is actually about to see them, saving them some bandwidth on images they may not see.

When you loaded this post, you may have noticed the message that images were loading up, that's due to the lazy loading!
For this, I use verlok's vanilla-lazyload.

There is not much else to say about this. The only downside is that an image may not be fully loaded by the time the user scrolls to it, but that small trade-off vs. saving them a few KB is well worth it in my opinion.
I mean, why waste bandwidth loading images that the user won't see anyways?
If you pay for bandwidth (e.g. metered egress on AWS) and you have a lot of traffic, lazy loading can save you quite a bit of money in the long run.
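A minimal sketch of how vanilla-lazyload can be wired up (the paths and class name are illustrative, check the library's own documentation for the exact options):

```html
<!-- The real image URL goes into data-src; the library swaps it into src
     when the image approaches the viewport -->
<img class="lazy" data-src="/img/photo.jpg" alt="A photo" width="450" height="300">

<script src="/js/lazyload.min.js" defer></script>
<script>
  window.addEventListener('DOMContentLoaded', () => {
    // Watch every element with the "lazy" class
    new LazyLoad({ elements_selector: '.lazy' });
  });
</script>
```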

Client Optimizing: Javascript Frameworks are a nono

I have explicitly chosen not to use a Javascript framework because they can be severely detrimental for the user-experience.
Sure, having a SPA can be nice since you can make nice transitions but it often comes at a severe tradeoff since you now put a lot of additional load on the client.
All that JS has to be parsed and executed on the client, so if you use a poorly written framework, or your own JS is poor, the client can easily become unresponsive for a while.
And if the visitor has JS disabled or blocks specific scripts (which is becoming more common nowadays), the site can just outright break.
Additionally, all this JS can easily take up more bandwidth than a single HTML page.
Next, every CPU cycle spent executing all that JS costs energy. It may not be a lot, but it surely adds up over time. This is especially an issue on mobile devices, where batteries have limited capacity and can only be recharged a limited number of times, which can cause additional e-waste.
And finally, each time you make a change to your JS, your visitors have to re-download the entire JS bundle, which most of the time means quite significant bandwidth and additional energy usage (especially on mobile devices on wireless networks).
Going off a free service an acquaintance of mine made called "Whyp" (which is actually quite awesome): each time he makes a change to his website, my browser has to fetch 266KB worth of JS, and of course parse and execute it on every load.
In contrast, the entire homepage of my website, including lazyloaded images, only costs 590KB to completely load, most of which never changes and can be cached, saving that bandwidth.
This is difficult to get around when using a framework, however, since all that JS is in charge of actually building the page you visit.

As such, Javascript frameworks have become a nono for me on this site.
That's not to say JS itself is bad, just that you should do as much work as possible on your server.
If you optimize your server properly, you don't even need that much server capacity either.
My own server, for example, runs anyway, and hosting this site adds barely anything to its power usage.

Client Optimizing: What do you actually need?

Okay, so it's no secret that adding more stuff to your code can easily increase the amount of stuff you need to send out to your client.
The problem is that a lot of web developers don't really consider what they serve vs. what they actually need.

A commonly used library is jQuery.
This library is about 26KB after compressing it with Brotli (though its effective size goes down a bit after bundling it with other JS).
But I didn't really use jQuery that much; only a very tiny portion of it was what I actually used.
The main reason I used jQuery was that Bootstrap's JS depends on it (this one is for you, library devs: try not to rely on other libraries to build your own).
As such, in a recent update, I've set out to get rid of jQuery from my website completely.
All in all, when factoring in everything I had to change for this, I saved about 43KB in total.
This may not seem like a lot, but that is a lot of code to download, parse and execute, which, again, adds up over time.
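To give an idea of what such a replacement can look like, here is a small sketch of plain-JS equivalents for some common jQuery helpers (these are generic examples, not the exact code from my site):

```javascript
// $.extend({}, defaults, options) becomes the object spread (or Object.assign):
const defaults = { theme: 'dark', lazyload: true };
const options  = { lazyload: false };
const settings = { ...defaults, ...options };

// $.each(list, fn) becomes Array.prototype.forEach:
const doubled = [];
[1, 2, 3].forEach((n) => doubled.push(n * 2));

// $(selector) becomes document.querySelectorAll(selector), and
// $(el).addClass('foo') becomes el.classList.add('foo')
// (not shown running here, since those need a DOM).
```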

As such, it is wise to think about the following when wanting to use a library:
- Do I actually need it?
- Will I make a lot of use of it?
- Are there any better alternatives?
- Can I write the portions that I am gonna use myself in plain JS (albeit with some effort)?

If one of these answers made you doubt the need for a library, then you probably don't really need it.

Additionally, serving only what you need ties in with the sections about Lazyloading and Thumbnails:
- A visitor may not always see a certain image, so I can lazyload it when they actually do.
- An image on my blog is never wider than 450 pixels, so I don't need to serve anything bigger than that.

Getting into the "Do I actually need this?" mindset can help you save a lot of resources in the long run.

Client Optimizing: Limit inline assets (BONUS)

Every now and then, I come across a site that has most of its stylesheet embedded into the page itself. This is a disaster for performance for multiple reasons:
- First, parsing that CSS blocks the DOM (which is not desirable).
- Second, it means that it can't be cached locally (meaning that people will have to download this CSS again and again and again on each visit), which wastes bandwidth.

The same goes for any asset, be it stylesheets, images or scripts.
Generally, you do not want to add your assets in-line so your browser can properly cache the assets instead, which saves both you and the visitor bandwidth.
If you *must* use in-line CSS or scripts, keep them very, very tiny and on an element-level.
Once you start using that same piece of code again elsewhere, consider moving it to an external file which can be cached.
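A minimal sketch of the idea (the file name and styles are illustrative):

```html
<!-- Acceptable: a tiny, element-level inline style -->
<span style="white-space: nowrap">v1.2.3</span>

<!-- Anything reused belongs in an external, cacheable stylesheet instead -->
<link rel="stylesheet" href="/css/main.css">
```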


There are a lot of optimizations that can be done, and I surely haven't covered all of them, but these are the ones I have currently implemented and that you may want to investigate as well if you haven't done so.
I hope you learned a thing or two and manage to get your website optimized nicely as well.
A lot of websites these days don't seem to care much about optimizing but I personally think we should.

Anyways, that's it for now, as always, feel free to join my subreddit.

