Getting to Know… WPO (Web Performance Optimization)

What

So, WPO is just a massive beast. There are so many parts, strewn across so many branches of tech and divisions and teams, that each “part” really deserves its own “Getting to Know” series. Maybe some day.

But for now, I am going to cover what I consider to be the most important high-level topics, drilling down into each topic a little bit, offering best practices, suggestions, options, tips and tricks that I have collected from around the Interwebs!

So, let’s get to know… WPO!

Why

The first couple of things to understand are that a) “web performance” is not just about making a page load faster so someone can get to your shopping cart faster, and b) not everyone has blazing fast Internet and high-powered devices.

The web is more than just looking at cat pics, sharing recent culinary conquests or booking upcoming vacations. People also use the web for serious life issues, like applying for public assistance, shopping for groceries, dealing with healthcare, education and much more.

And for the 2021 calendar year, GlobalStats puts Internet users at 54.86% Mobile, 42.65% Desktop and 2.49% Tablet, and of those mobile users, 29.24% are Apple and 26.93% are Samsung, with average worldwide network speeds in November 2021 of 29.06 Mbps Download and 8.53 Mbps Upload.

And remember, those are averages, skewed heavily by the highly-populated regions of the industrialized world. Rural areas and developing countries are lucky to get connections at all.

So for people that really depend on the Internet, and may not have the greatest connection, nor the most powerful device, let’s see what we can do about getting them the content they want/need, as fast and reliably as possible.

Getting Started

This was a tough one to get started on, and certainly to collect notes for, because, as I mentioned above, the topics are so wide, that it took a lot to try to pull them all together…

WPO touches on server settings, CDNs, cache control, build tools, HTML, CSS, JS, file optimizations, Service Workers and more.

In most organizations, this means pulling together several teams, and that means getting someone “up the ladder” to buy into all of this to help convince department heads to allocate resources (read: people, so read: money)…

Luckily, there have been a LOT of success stories, and they tend to want to brag (rightfully so!), so it has actually never been easier to convince bosses to at least take a look at WPO as a philosophy!

BLUF

You’ll find details for all of these below, but here are the bullets, more-or-less in order…

  1. HTTP2
  2. Cache-Control header, not Expires
  3. CDN
  4. preconnect to third-party domains
  5. preload for important files coming up later in page
  6. prefetch resources for next page
  7. prerender pages likely to navigate to next
  8. Split CSS into components/@media sizes, load conditionally
  9. Inline critical CSS, load full CSS after
  10. Replace JS libraries with native HTML, CSS and JS, when possible
  11. async / defer attributes for JS, especially 3rd party
  12. Split JS into components, load conditionally
  13. Avoid Data-URIs unless very small code
  14. WOFF2 before WOFF
  15. font-display: swap
  16. WEBP before JPG/PNG
  17. Multiple crops for various screen sizes / connection speeds
  18. srcset / sizes attributes for automated size swap
  19. media attribute for manual size swap
  20. loading="lazy" for below-the-fold images
  21. WEBM before MP4
  22. preload="none" for all video elements
  23. width / height attributes on media and embeds, use CSS to make responsive
  24. Optimize all media assets
  25. Reserve space for delayed-loading content, like ads, 3rd-party widgets
  26. Lazy-load below the fold content
  27. Create flat/static versions of dynamic content
  28. Minify / compress text-based files
  29. requestIdleCallback, requestAnimationFrame and IntersectionObserver to reduce CPU load / find better time to run tasks
  30. Service Worker to cache / reduce requests, swap content before requesting

Glossary

We need to define some acronyms and terms…

CLS (Cumulative Layout Shift)
Visible layout shift as a page loads and renders
CRP (Critical Rendering Path)
Steps a browser must complete to render page
CrUX (Chrome User Experience Report)
Perf data gathered from RUM within Chrome
CWV (Core Web Vitals)

Three metrics that score a page load:

  1. LCP: ideally < 2.5s
  2. FID: ideally < 100ms
  3. CLS: ideally < 0.1
DSA (Dynamic Site Acceleration)
Store dynamic content on CDN or edge server
FCP (First Contentful Paint)
First DOM content paint is complete
FMP (First Meaningful Paint)
Primary content paint is complete; deprecated in favor of LCP
FID (First Input Delay)
Time a user must wait before they can interact with the page
FP (First Paint)
First pixel painted; not really used anymore, use LCP instead
LCP (Largest Contentful Paint)
Time for largest page element to render
Lighthouse
Google lab testing software. Analyzes and evaluates site, returns score and possible improvements
PoP (Points of Presence)
CDN data centers
Rel mCvR (Relative Mobile Conversion Rate)
Mobile Conversion Rate / Desktop Conversion Rate
RUM (Real User Monitoring)
Not simulated; data from real user experiences
SI (Speed Index)
Calculation for how quickly page content is visible
TBT (Total Blocking Time)
Total time the main thread is blocked by long tasks between FCP and TTI
TTFB (Time to First Byte)
Time until first byte is received by browser
TTI (Time to Interactive)
Time until entire page is interactive
Tree Shaking
Removing dead code from code base
WebPageTest
Live testing site, configurable, returns treasure-trove of performance data
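
To make the CWV thresholds from the glossary a little more concrete, here is a minimal sketch that checks a set of measurements against them. The function name and the sample numbers are made up for illustration; the limits are simply the “ideally” values listed above.

```javascript
// Thresholds copied from the CWV glossary entry above.
const CWV_LIMITS = { lcp: 2500, fid: 100, cls: 0.1 }; // ms, ms, unitless

// Returns, for each metric, whether the measured value is under the "ideal" limit.
function checkCoreWebVitals(measured) {
  const result = {};
  for (const [metric, limit] of Object.entries(CWV_LIMITS)) {
    result[metric] = measured[metric] < limit;
  }
  return result;
}

// Hypothetical measurements, e.g. copied from a Lighthouse report.
console.log(checkCoreWebVitals({ lcp: 4500, fid: 80, cls: 0.05 }));
// → { lcp: false, fid: true, cls: true }
```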

Notes

  • Three cardinal rules

    1. Reduce bytes: the fewer bytes, the faster the download
    2. Reduce critical resources:
      HTML      = 1
      each CSS += 1
      each JS  += 1 // unless `async` or `defer`
      
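    3. Reduce CRP length (the number of round trips needed to fetch the critical resources): the browser can download CSS/JS at the same time, but only after the HTML that references them has arrived, and each round trip can only carry so much data (roughly 8kb is assumed here), so if: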
      HTML      =  5kb
      CSS       =  4kb
      JS        =  2kb
      ----------------
      TOTAL     = 11kb
      

      CRP length = 2 (1 for HTML, 1 for CSS/JS)
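
The counting rule for critical resources (rule 2 above) can be sketched in a few lines. This is only an illustration of the rule of thumb, not how any browser actually works; the function name and the sample asset list are assumptions.

```javascript
// Counts critical resources using the rule of thumb above:
// the HTML counts as 1, each CSS file adds 1, and each JS file adds 1
// unless it is loaded with async or defer.
function countCriticalResources(assets) {
  let count = 1; // the HTML document itself
  for (const asset of assets) {
    if (asset.type === 'css') count += 1;
    if (asset.type === 'js' && !asset.async && !asset.defer) count += 1;
  }
  return count;
}

// Hypothetical page: one stylesheet, one blocking script, one deferred script.
const page = [
  { type: 'css', url: 'style.css' },
  { type: 'js', url: 'main.js' },
  { type: 'js', url: 'analytics.js', defer: true },
];
console.log(countCriticalResources(page)); // → 3
```

Adding `defer` to `main.js` in this made-up page would drop the count to 2.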

  • Three levels of UX

    These three questions drive Google’s CWVs:

    1. Is anything happening?

      Once a user clicks something, if there is no visual indicator that something is happening, they wonder if something broke, and so the experience suffers.

    2. Is it useful?

      Once stuff does start to appear, if all they see is placeholders, or partial content, it is not useful yet, so the experience suffers.

    3. Is it usable?

      Finally, once everything looks ready, is it? Can they read anything, or interact yet? If not, the experience suffers.

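  • Three key metrics

    These map to the three UX questions above. Note that of these three, only LCP is one of Google’s official CWVs (LCP, FID and CLS; see the Glossary).
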
    1. TTFB (Time to First Byte)

      Time between requesting page and getting the first byte back.

      • Connection speed, latency, server speed, database speed and device power all affect TTFB
      • Distance of user from server also affects TTFB: “Even at near light speed, it still takes time to travel around the world.”
      • Static is always faster than dynamic
      • Remember the above is “per request”, including 3rd party
    2. LCP (Largest Contentful Paint)

      When the primary content section becomes visible to the user.

      • CSS, JS, custom fonts, images, can all slow LCP
      • Not just download time, but also processing time: DOM has to be rendered, CSS & JS processed, etc.
      • Remember, each asset can request additional assets, creating a snowball effect
    3. TTI (Time to Interactive)

      When the user can finally interact with the UI.

      • HTML has been downloaded, elements are shown, but if necessary JS is not ready yet, the user can see something, but they cannot do anything
      • Again, speed of download and processing affects TTI
  • Analysis Prep

    The first step to testing is analysis.

    1. Find out who your audience is

      • Where they are geographically, what type of network connection they typically have, what devices they use.
      • This is “field data”, coming from your real-life market, ideally via analytics.
    2. Look for key indicators

      • Any problems your project has, and determine goals.
      • Perhaps your site has a TTFB of 3.5 seconds, an LCP of 4.5 seconds, and a TTI of 6 seconds.
      • All of these should be able to be improved, so they are great candidates for goals.
  • Goals and Budgets

    A goal is a target that you set and work toward, such as “TTFB should be under 300ms” or “LCP should be under 2s”.

    A budget is a limit you set and try to stay under, such as “no more than 100kb of JS per page” or “hero image must be under 90kb”.

    • How to choose goals?

      • Might want to compare against competition, or look for statistics / best practices
    • How to create budgets?

      • Similar to goals, can compare against competition or research stats, or look for problem areas and set limits to control them
      • For existing projects, starting budget can be “no worse than it is right now”…
      • For new sites, can start with “talking points” for the team, to help set limits on a project, then refine as needed
      • Budgets can change as the site changes; Reach a new goal? Adjust budget to reflect that. Adding a new section/feature? That will likely affect the budget.
    • How to stick to budgets?

      • Lighthouse CI integrates with GitHub to test changes during builds, and can stop a deployment when a budget is broken
      • SpeedCurve dashboard sets budgets, monitors, and notifies team of failures
      • Calibre estimates how 3rd party add-ons will affect site, or how ad-blockers will affect page loads
    • What if a change breaks the budgets?

      • Optimize feature to get back under budget
      • Remove some other feature to make room for the new one
      • Don’t add the new feature
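
If you are not ready for one of the tools above, the idea of a budget check is simple enough to sketch yourself. The following is a rough illustration only: the budget numbers mirror the examples in this section, while the function name and the “measured” sizes are invented.

```javascript
// Budgets in kilobytes; the numbers mirror the examples in this section.
const budgets = { js: 100, heroImage: 90 };

// Returns a list of human-readable failures; an empty list means "under budget".
function checkBudgets(actualKb, limitsKb) {
  const failures = [];
  for (const [name, limit] of Object.entries(limitsKb)) {
    const actual = actualKb[name];
    if (actual !== undefined && actual > limit) {
      failures.push(`${name}: ${actual}kb is over the ${limit}kb budget`);
    }
  }
  return failures;
}

// Hypothetical sizes measured during a build.
const failures = checkBudgets({ js: 128, heroImage: 72 }, budgets);
console.log(failures); // → [ 'js: 128kb is over the 100kb budget' ]
```

In a build script, a non-empty list could be used to fail the build, which is roughly what the dedicated tools do for you.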
  • Analysis process

    • Lab testing

      With goals set, design and test improvements to see if they get closer to goals. This is “lab data”: testing, trying, evaluating, repeating, until you think you have a solution.

    • Evaluate

      Each test should try one improvement, to be sure your results are the direct effect of your improvement. This tells you if that improvement is worth keeping, needs polishing, or is ready to ship.

    • RUM testing

      Finally, test “in the wild” to make sure you are seeing the same results there as in the lab.

  • Analysis tools

    Some tools allow you to test in live browsers, on a variety of devices, alter connection speeds, run multiple tests, and receive reports of request, delivery, and processing times for all of it.

    Other tools allow you to run your site against a defined set of benchmarks, receiving estimations of speeds and reports with improvement suggestions.

    Still other tools let you work right in your browser, testing live or local sites, hunting for issues, testing quickly.

    • WebPageTest

      • Actual run-tests, in real browsers on real devices
      • Device and Browser options vary depending on Location
      • Advanced Settings vary depending on Device and Browser
      • Advanced Settings tabs allow you to Disable JS, set UA string, custom headers, inject JS, send username/password, add Script steps to automate actions on the page, block requests that match substrings or domains, set SPOF to see what happens if assets completely fail to load
      • “Performance Runs” displays median runs, if you do multiple runs, for both first and repeat runs
      • Waterfall reports:
        • Note color codes above report
        • Start Render, First Paint, etc. draw vertical lines down the entire waterfall, so you can see what happens before & after these events, as well as which assets affect those events
        • Wait, DNS, etc. show in the steps of a connection
        • Light | Dark color for assets indicate Request | Download time
        • Click each request for details in a scrollable overlay; also downloadable as JSON
        • JS execution tells how long each file takes to process
        • Bottom of waterfall, Main Thread is flame chart showing how hard the browser was working across the timeline
        • To right of each waterfall is filmstrip to help view TTFB, LCP; Timeline compares filmstrip with waterfall, so you see how waterfall becomes visible, and how assets affect it
        • Check all tabs across top (Performance Review, Content Breakdown, Processing Breakdown, etc.) for many more features
    • PageSpeed Insights

      • Lab testing via Lighthouse, no real devices, just simulations based on the code; when enough Chrome users have visited the URL, it also shows CrUX field data
      • First split for Mobile & Desktop, then nice score up front, including grades on the CWVs, followed by things that you could try to improve these scores, and finally things that went well
      • This tool is powered by the same tech that powers the Lighthouse DevTools and Chrome extension tool
    • BrowserStack

      • Test real browsers on remote servers
      • Also offer automated testing, including pixel-perfect testing
    • Browser DevTools

      • All modern browsers have one, but all vary slightly
      • Firefox’s Performance tab shows inspectable flamechart; Network shows all network activity
      • Chrome has the above, but also offers a Lighthouse tab (formerly Audits) that runs Lighthouse in your browser, so you can run against localhost sites, useful for testing before pushing to a live server
  • Analysis Example

    For this process, I recommend Chrome, if only for the Lighthouse integration:

    1. Open site in Incognito
    2. Open DevTools
    3. Go to Lighthouse tab
    4. Under “Category”, only check “Performance”
    5. Under “Device” check “Mobile”
    6. Check “Simulated throttling”
      (click “Clear storage” to simulate a fresh cache)
    7. Click “Generate Report”
    8. Look for issues
    9. Fix one issue
    10. Re-run audit
    11. Evaluate audit for that one change
    12. Determine if change was improvement, decide to keep or revert
    13. Repeat from step 7, until no issues or happy with results

    [Screenshot: Chrome > DevTools > Lighthouse tab]

Tips

  • Server

    • Use HTTP/2

      HTTP/2 is a reimagined version of HTTP, standardized in 2015 and primarily focused on mobile and server-intensive graphics/videos. It is based on Google’s SPDY protocol, which focuses on compression, multiplexing, and prioritization.

      Key differences:

      • Binary, instead of textual
      • Fully multiplexed, instead of ordered and blocking
      • Can use one connection for parallelism
      • Uses header compression to reduce overhead
      • Allows servers to “push” responses proactively into client caches
        (browsers are dropping support for this, with Chrome disabling it by default, so do not rely on it)
      • If a device does not support HTTP/2, it will automatically degrade to HTTP/1.x

      All major browsers support it; IE11 works on Win10 only; IE<11 not supported at all

    • HTTP/3

      Not widely adopted yet, but growing.

      Key difference so far is that it uses QUIC, which runs over UDP, instead of TCP; this speeds up connection setup, avoids TCP’s head-of-line blocking, and always encrypts the connection.

    • TLS

      Upgrade to 1.3 to benefit from reduced handshakes for renewed sessions, but be sure to follow RFC 8446 best practices for possible security concerns.

    • Cache Control

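      While not helping with the initial page load, setting a far-future Cache-Control header for files that do not change often tells the browser it does not even need to ask for the file again.

      This replaces the Expires header that we used to use, which relies on an absolute date (and so on the user’s clock being right); Cache-Control: max-age is relative, and wins when both are present. It is also in contrast to revalidation via ETag or Last-Modified, which avoids re-sending a file that has not changed, but still requires the browser to ask the server about it.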

      And there is no faster response than one not made…
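
As a rough sketch of what “far-future” can look like, here is a small helper that picks a Cache-Control value per file. The function name, the one-year lifetime and the extension list are all assumptions for illustration, and the far-future branch assumes your file names change whenever their content does (for example via a hash in the name).

```javascript
// Picks a Cache-Control value by file type. The one-year lifetime and the
// extension list are assumptions; tune them for your own site.
const ONE_YEAR_IN_SECONDS = 60 * 60 * 24 * 365;
const LONG_LIVED = /\.(css|js|woff2|webp|jpg|png|svg)$/;

function cacheControlFor(path) {
  if (LONG_LIVED.test(path)) {
    // Far-future: the browser can reuse its copy without asking the server.
    return `public, max-age=${ONE_YEAR_IN_SECONDS}, immutable`;
  }
  // HTML and everything else: keep a copy, but revalidate before using it.
  return 'no-cache';
}

console.log(cacheControlFor('/css/main.3f9a1c.css'));
// → public, max-age=31536000, immutable
console.log(cacheControlFor('/index.html'));
// → no-cache
```

In practice this usually lives in your server or CDN configuration rather than in application code, but the decision being made is the same.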

  • Website (CMS, Site Generator, etc.)

    Nothing could ever be faster than static HTML, but it is not very scalable, unless you really like hand-cranking HTML.

    Assuming you are not doing that…

    Database-driven

    • Any site that uses a database, like WordPress or some other CMS, suffers from the numerous database requests required in order to build the HTML before it can even send anything to the user.
    • The best thing you can do here is caching the site pages, likely via a caching plugin.
    • Caching plugins pre-process all possible pages of a site (dynamic pages, like Search, are hard to do) and create flat HTML versions, which they then send to users.
    • This (mostly) bypasses the database per page request, delivering as close to the static HTML experience as possible.

    SSG

    • SSG sites, like Jekyll, Gatsby, Hugo, Eleventy, etc., leave little work to do in this section.
    • As they are already pre-processed, static HTML and have no database connections, there is not much to do aside from the sections covered below under “Frontend”.

    CSR

    • CSR sites, like Angular, React, Vue, etc., have an initial advantage of delivering typically very small HTML files to the user, because a CSR site typically starts with a “shell” of a page, but no content. This makes for a very fast TTFB!
    • But then the page needs to download, and process, a ton of JS before it can even begin to build the page and show something to the user. This often makes for a pretty terrible LCP and FID.
    • Aside from framework-specific performance optimizations, there is not much to do aside from the sections covered below under “Frontend”.

    SSR

    • In an attempt to solve their LCP and FID issues, CSR frameworks realized they could also render the initial content page on the server, deliver that, then handle site interactions as a typical CSR site.
    • Depending on the server speed and content interactions, this does solve the LCP, but even the SSR often needs to download, and process, a lot of JS, in order to be interactive. Therefore, FID can still suffer.
    • Aside from framework-specific performance optimizations, there is not much to do aside from the sections covered below under “Frontend”.

    SSR w/ Hydration

    • Another wave of JS-created sites arrived, like Svelte, that realized they could benefit by using a build system to create “encapsulated” pages and code base.
    • Rather than delivering all of the JS needed for the entire app experience, these sites package their code in a way that allows it to deliver “only the code that is needed for this page”.
    • This method typically maintains the great TTFB of its predecessors, but also takes a great leap toward better LCP and FID.
    • Aside from framework-specific performance optimizations, there is not much to do aside from the sections covered below under “Frontend”.
  • Database

    • Create indexes on tables.
    • Cache queries.
    • When possible, create static cached pages to reduce database hits.
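
The “cache queries” idea above can be sketched as a tiny in-memory wrapper. Most databases and CMSs have their own query caches, so treat this purely as an illustration of the concept; `cacheQueries`, `fakeRunQuery` and the 60-second lifetime are made-up names and values.

```javascript
// Wraps a query function so repeated calls with the same SQL string are
// answered from memory until the entry is older than ttlMs.
function cacheQueries(runQuery, ttlMs, now = Date.now) {
  const cache = new Map();
  return function cachedQuery(sql) {
    const hit = cache.get(sql);
    if (hit && now() - hit.storedAt < ttlMs) return hit.rows;
    const rows = runQuery(sql); // only reached on a miss or an expired entry
    cache.set(sql, { rows, storedAt: now() });
    return rows;
  };
}

// Hypothetical stand-in for a real database call; counts how often it runs.
let databaseHits = 0;
const fakeRunQuery = (sql) => { databaseHits += 1; return [`rows for: ${sql}`]; };

const query = cacheQueries(fakeRunQuery, 60000);
query('SELECT title FROM posts');
query('SELECT title FROM posts');
console.log(databaseHits); // → 1
```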
  • CDN

    • Distributed assets = shorter latency, quicker return
    • Load balancers increase capacity and reliability
    • Typically offer automated caching patterns
    • Some also offer automated media compression
    • Some also offer dynamic content acceleration
  • Frontend

    The basics here read like a modified “Reduce, Reuse, Recycle”: “Reduce, Minify, Compress, Cache”.

    Reduce

    Our first goal should always be to reduce whatever we can: the less we send through the Intertubes, the faster it can get to the user.

    • HTML/XML/SVG
      • Removing components from the HTML and lazy loading will probably have better results, and is certainly easier to implement.
    • Images/Videos
      • Do you need every image or video that you plan to send?
    • CSS/JS/JSON
      • Tree Shake to remove old crap
      • Componentize so only sending what “this” page needs
      • Possible to remove frameworks/libraries in favor of native functionality?
    • Fonts
      • Do you really need custom fonts?
      • If so, do you really need all the variations of those fonts that are in those font files?
      • New Variable Fonts are coming, which will offer a great reduction!

    Minify

    If something must be sent to the browser, remove everything from it that you can.

    • HTML/XML/SVG
      • Removing comments and whitespace is trivial during a build or deployment process, and has a massive payoff.
      • One could also easily get carried away with stuff like removing optional closing tags and optional quotes, but doing so by hand or even template would be nauseating.
      • If you decide to go this route, look for an automation during the build or deployment.
    • Images/Videos
      • Optimize, automate, during upload, build, deployment or ideally via CDN
    • CSS/JS/JSON
      • Again, removing comments and whitespace is trivial during a build or deployment process, and has a massive payoff.
      • The JS minification will even rename functions and variable names into tiny, cryptic names, since humans do not need to be able to read minified files, and computers do not care.
    • Fonts

      • Depending on which fonts you use and how they get served, they may or may not already be minified.
      • Investigate, and minify if they are not already so.

    Compress

    Once only the absolutely necessary items are being sent, and everything is as small as it can be, it is time to compress it before sending.

    The good thing here is that compression is mostly a set-it-and-forget-it technique. Once you know what works for your users, set it up, make sure it is working, and move on…

    • HTML/XML/SVG/CSS/JS/JSON/Fonts
    • Images/Videos
      • If you have optimized your images properly, then they are already compressed and should NOT be compressed again.

    Cache

    Once the deliverables are as few and small as possible, there is nothing more we can do for the initial page load.

    But we can make things better for the next time by making it so the browser does not have to fetch it again.

    In addition to the server/CDN caching, we can also cache some data in the browser. Depending on the data type, we can use:

    • Cookies
      • That old stand-by, good for small bits of text, but beware these travel to and from the server with every request, so they could be sniffed.
      • Additionally, the more cookies you send back and forth, the more bandwidth you consume, making your page a little slower…
      • Also note that while third-party cookies are being phased out, this shouldn’t affect cookies that you plant on your own site.
      • Lastly, although you can set expiration dates on cookies, the user can also delete them anytime they want.
    • Local Storage
      • In the browser, limited capacity, varies by browser.
      • Good for strings or data, but everything must be “stringified” before saving.
      • Easy “set” and “get” API.
      • Once set, is there until it is removed.
      • Although again, the user can delete or edit anytime they want.
    • Session Storage
      • Identical to Local Storage, except the lifespan is only for “this browser session”, then automatically deleted by browser.
      • Again, the user can delete or edit anytime they want.
    • Service Workers
      • Among other things, allow you to monitor all network requests.
      • Means you can cache pages using the Cache API, then instead of letting the browser fetch from the server, the SW can interrupt and serve the cached version instead.
      • This can work for all file types, including full pages.
      • This also means that, depending on your site, you may be able to handle offline situations very gracefully.
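
To give the Service Worker bullets above some shape, here is a minimal “cache first” sketch. The lookup logic is written as a plain function so it can be read (and tried) outside a browser; the cache name `site-cache-v1` and the function name are assumptions, and a real worker would also need install/activate handling and a plan for expiring old entries.

```javascript
// Cache-first lookup: answer from the cache when possible, otherwise go to
// the network and keep a copy of successful responses for next time.
// `cache` needs match() and put(); `fetchFn` returns a promise of a response.
async function cacheFirst(request, cache, fetchFn) {
  const cached = await cache.match(request);
  if (cached) return cached;
  const response = await fetchFn(request);
  if (response.ok) await cache.put(request, response.clone());
  return response;
}

// Only wire it up when actually running inside a Service Worker.
if (typeof ServiceWorkerGlobalScope !== 'undefined' && self instanceof ServiceWorkerGlobalScope) {
  self.addEventListener('fetch', (event) => {
    if (event.request.method !== 'GET') return; // leave POSTs etc. alone
    event.respondWith(
      caches.open('site-cache-v1') // cache name is an assumption
        .then((cache) => cacheFirst(event.request, cache, (req) => fetch(req)))
    );
  });
}
```

Cache-first suits files that rarely change; for pages that update often, a network-first variant of the same function may be a better fit.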
  • Tools

    There are many, many tools and options for each option above. I will list a few, feel free to share your favorites in the comments.

    Minify

    • Can do manually (Minify, BeautifyTools), but that gets pretty tiresome for things like CSS and JS, which you edit often.
    • Usually handled during build or deployment, but there are so many options, you would need to search for one related to your process.
    • Can also set JS minifiers to obfuscate code, reducing size beyond just whitespace.
    • UnCSS also looks for code that your site isn’t using and removes it; online version, and build version.
    • Tree Shaking tries to do the same thing for JS. Rollup offers this, Webpack, and probably others; again, it would depend greatly on your build/deployment process.

    Compress

    This is handled on your server, and is, for the most part, a “set it and forget it” feature.

    Nearly every browser supports Gzip. Brotli is nearly there, but lacks support in older and less common browsers.

    Check your target market to see if Brotli is right for you…

    • Gzip compresses and decompresses files on the fly, as requested, deflating before sending, inflating again in the browser. Gzip is available in basically every browser.
    • Brotli also processes at run-time, and usually gets better results than Gzip, but does lack support in older and less common browsers.
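
If you want to see the difference for yourself before touching server config, Node’s built-in zlib module can do both. This is just a quick comparison sketch on made-up sample text, using default compression settings; your real files and server settings will give different numbers.

```javascript
const zlib = require('node:zlib');

// Repetitive sample text standing in for an HTML/CSS/JS file.
const original = Buffer.from('<li class="item">Hello, world!</li>\n'.repeat(200));

const gzipped = zlib.gzipSync(original);
const brotlied = zlib.brotliCompressSync(original);

console.log('original bytes:', original.length);
console.log('gzip bytes:    ', gzipped.length);
console.log('brotli bytes:  ', brotlied.length);

// Both formats round-trip back to the exact original bytes.
console.log(zlib.gunzipSync(gzipped).equals(original));            // → true
console.log(zlib.brotliDecompressSync(brotlied).equals(original)); // → true
```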

    Optimizing Images

    • OptImage is a desktop app, offering subscriptions or limited optimizations per month for free. Can handle multiple image formats, including WEBP.
    • Can also do during build time or during deployment. Essentially every process you could choose offers a service for this, you would just need to search for one for your process.
    • Can also do during run-time with a service like Cloudinary, which works as a media repo, though it comes at the cost of added latency and another possible point of failure.

    Optimizing Videos

    In my opinion, all videos should be served via YouTube or Vimeo, as they will always be better at compressing and serving videos than the rest of us.

    But of course there are situations where that isn’t wanted or just isn’t practical or ideal.

    So if you must serve your own videos…

    Optimizing Fonts

    Reducing fonts is called “subsetting”, where you remove parts you don’t use (characters, weights, styles, etc.).

Tricks

  • This is a collection of tips that you might want to try employing. Remember, very few things are right everywhere, and not everything is going to fix the problems you might have…

    Throw a “pre” party

    • preconnect

      • For resources hosted on another domain that will be fetched in the current page.
      • Goes a step beyond a DNS pre-lookup: the browser also opens the connection (TCP and TLS) ahead of time.
      • Add a link element to the head to tell the browser “you will be fetching from this site soon, so try to connect as soon as you can”.
        <head>
        ...
        <link rel="preconnect" href="https://maps.googleapis.com">
        ...
        <script src="https://maps.googleapis.com/maps/api/js?key=1234567890&callback=initMap" async defer></script>
        </head>
        
    • preload

      • For resources that will be needed later in the current page.
      • Add a link element to the head to tell the browser “you will need this soon, so try to download it as soon as you can”:
        <head>
        ...
        <link rel="preload" as="style" href="style.css">
        <link rel="preload" as="script" href="main.js">
        ...
        <link rel="stylesheet" href="style.css">
        </head>
        <body>
        ...
        <script src="main.js" defer></script>
        </body>
        
      • You can preload several types of files.
      • The rel attribute should be "preload".
      • The href is the asset’s URL.
      • It also needs an as attribute.
      • You can optionally add a type attribute to indicate the MIME type:
        <link rel="preload" as="video" href="video.mp4" type="video/mp4">
        
      • You can optionally add a crossorigin attribute for CORS fetches:
        <link rel="preload" as="font" href="https://font-store.com/comic-sans.woff2" type="font/woff2" crossorigin>
        
      • You can optionally add a media attribute to conditionally load something:
        <link rel="preload" as="image" href="image-big.png" media="(min-width: 600px)">
        
    • prefetch

      • For resources that might be used in the next page load.
      • Add a link element to the head to tell the browser “you might need this on a future page, so try to download it as soon as you can”:
        <link rel="prefetch" href="blog.css">
        
    • prerender

      • To pre-render a page in the background, if expecting the user to navigate to it.
      • Add a link element to the head to tell the browser “the user might go to this page next, so try to render it in the background”:
        <link rel="prerender" href="/blog">
        

    Add critical CSS in-page

    Conditionally load CSS

    • Use a media attribute on link tags to conditionally load them:
      <!-- only referenced for printing -->
      <link rel="stylesheet" href="./css/main-print.css" media="print">
      <!-- only referenced for landscape viewing -->
      <link rel="stylesheet" href="./css/main-landscape.css" media="(orientation: landscape)">
      <!-- only referenced for screens at least 40em wide -->
      <link rel="stylesheet" href="./css/main-wide.css" media="(min-width: 40em)">
      
    • Note that while all of the above are only applied in the scenarios indicated (print, etc.), all are actually downloaded and parsed by the browser as the page loads. But the ones that do not currently apply are downloaded and parsed in the background… (This is foreshadowing, so remember it!)

    Split CSS into “breakpoint” files

    • Taking the above conditional-loading technique beyond just a separate file for print, you could also split your CSS based on @media queries, then use the same media trick above.
      <link rel="stylesheet" href="main-min-480.css" media="(min-width: 480px)">
      <link rel="stylesheet" href="main-min-600.css" media="(min-width: 600px)">
      <link rel="stylesheet" href="main-min-1080.css" media="(min-width: 1080px)">
      
    • Even if not needed right now, will still download in the background, then will be ready if it is needed later.

    Split CSS into component files

    • Taking the above splitting and conditional-loading approach beyond print or min-width, you can also break your CSS into sections.
    • Create one file for all of your global CSS (header, footer, main content layout), then create a separate file for just Home Page CSS that is only loaded on that page, Contact Us CSS that is only loaded on that page, etc.
    • The practicality of this technique would depend on the size of your site and your overall CSS.
    • If your site and CSS are small, then a single file cached for all pages makes sense
    • If you have lots of sections and components and widgets and layouts, then there is no need for users to download the CSS for those sections until they visit those sections.

    Prevent blocking CSS, load it “async”

    • Remember that “downloaded and parsed in the background” bit from above? Well here is where it gets interesting…
    • Because, while neither link nor style blocks recognize async or defer attributes, files with media attributes that are not currently true do actually load async, meaning they are not blocking…
    • This means we can kind of “gently abuse” that feature with something like this:
      <link rel="stylesheet" href="style.css" media="print" onload="this.media='all'">
      

      Note that onload event at the end? Once the file has downloaded, async “in the background”, the onload event changes the link’s media value to all, meaning it will now affect the entire current page!

    • While you wouldn’t want this “async CSS” to change the visible layout, as it might harm your CLS, it can be useful for below-the-scroll content or lazy-loaded widgets/modules/components.

    Enhancing optimistically

    • The Filament Group came up with something they coined “Enhancing optimistically”.
    • This is when you want to add something to the page via JS (like a carousel, etc.), but know something could go wrong (like some JS might not load).
    • To prepare for the layout, you add a CSS class to the html that mimics how the page will look after your JS loads.
    • This helps put your layout into a “final state”, and helps prevent CLS when the JS-added component does load.
    • Ideally you would also prepare a fallback just in case the JS doesn’t finish loading, maybe a fallback image, or some code letting the user know something went sideways.
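
Here is one way the steps above could look in code. To be clear, this is my own rough sketch of the idea, not the Filament Group’s implementation; the function name, the `carousel-enhanced` class and the 8-second timeout are all assumptions.

```javascript
// Adds the "enhanced" class right away, then removes it again if the
// enhancement has not confirmed itself within timeoutMs.
// `root` is anything with a classList (normally document.documentElement).
function enhanceOptimistically(root, className, timeoutMs) {
  root.classList.add(className);
  const timer = setTimeout(() => root.classList.remove(className), timeoutMs);
  // Call the returned function once the JS component has really loaded.
  return function confirmEnhanced() {
    clearTimeout(timer);
  };
}

// Browser usage; the class name and timeout are assumptions.
if (typeof document !== 'undefined') {
  // Expose the confirm function so the (hypothetical) carousel script
  // can call it after it has finished initialising.
  window.confirmCarousel = enhanceOptimistically(
    document.documentElement, 'carousel-enhanced', 8000
  );
}
```

If the carousel script never calls the confirm function, the class is removed and the page falls back to its non-enhanced layout, which is where your fallback image or message would come in.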

    Reduce JS

    Conditionally load JS

    • While you cannot conditionally load JS as easily as you can with CSS:
      <!-- This does NOT work -->
      <script src="script.js" media="(min-width: 40em)"></script>
      

      You can conditionally append JS files:

      // if the screen is at least 480px...
      if ( window.innerWidth >= 480 ) {
          // create a new script
          let script = document.createElement('script');
              script.type = 'text/javascript';
              script.src = 'script-min-480.js';
          // and insert it before the first script in the DOM
          let script0 = document.getElementsByTagName('script')[0];
              script0.parentNode.insertBefore(script, script0);
      }
      
    • And the above script could of course be converted into an array and loop for multiple conditions and scripts.
    • There are also libraries that handle this, like require.js, modernizr.js, and others.
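
Here is one way the array-and-loop version might look. Only `script-min-480.js` comes from the example above; the other file names, the breakpoints and the function names are assumptions.

```javascript
// Hypothetical rules: each pairs a minimum viewport width with a script file.
const scriptRules = [
    { minWidth: 480, src: 'script-min-480.js' },
    { minWidth: 768, src: 'script-min-768.js' },
    { minWidth: 1200, src: 'script-min-1200.js' }
];

// return the `src` of every rule the given width satisfies
function scriptsForWidth(rules, width) {
    return rules
        .filter(rule => width >= rule.minWidth)
        .map(rule => rule.src);
}

// append each matching script to the page
function appendScripts(sources, doc) {
    sources.forEach(src => {
        const script = doc.createElement('script');
        script.src = src;
        doc.head.appendChild(script);
    });
}

// only touch the DOM when one exists (i.e. in a browser)
if (typeof window !== 'undefined' && typeof document !== 'undefined') {
    appendScripts(scriptsForWidth(scriptRules, window.innerWidth), document);
}
```

Keeping the decision (`scriptsForWidth`) separate from the DOM work (`appendScripts`) makes the rules easy to test and to extend with other conditions.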

    Split JS into component files

    • Similar to how we can break CSS into components and add them to the page only as and when needed, we can do the same for JS.
    • If you have some complex code for a carousel, or accordion, or filtered search, why include that with every page load when you could break it into separate files and only add them to pages that use that functionality?
    • Smaller files mean smaller downloads, and smaller JS files mean less blocking time.
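
As a rough sketch of the idea, a tiny bootstrap script could check which components are actually on the page and request only their files. The selectors, file names and function name below are assumptions.

```javascript
// Hypothetical map of "marker" selectors to the component file each needs.
const componentScripts = {
    '.carousel': 'carousel.js',
    '.accordion': 'accordion.js',
    '.filtered-search': 'filtered-search.js'
};

// list only the files whose component actually appears in the document
function neededScripts(doc, registry) {
    return Object.keys(registry)
        .filter(selector => doc.querySelector(selector) !== null)
        .map(selector => registry[selector]);
}

// usage in a browser:
// neededScripts(document, componentScripts).forEach(src => {
//     const script = document.createElement('script');
//     script.src = src;
//     document.head.appendChild(script);
// });
```

On a page with only an accordion, this would request `accordion.js` and nothing else.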

    Prevent blocking JS

    • When the browser encounters a script (one without async or defer), it stops parsing the HTML until that script is downloaded and executed, just in case it will modify something.
    • If this JS is inserted into the DOM before CSS or your content, it harms the entire render process.
    • If possible, move all JS to the end of the body.
    • If this is not possible, add a defer attribute to tell the browser “go ahead and download this now, but in the background, and wait until the DOM is completely constructed before implementing it”.
    • Deferred scripts maintain the same order in which they were encountered in the DOM; this can be quite important in cases where dependencies exist, such as:
      <!-- will load and process in THIS order -->
      <script src="jquery.js" defer></script>
      <script src="jquery-plugin.js" defer></script>
      
    • In the above case, both JS files will download in the background, but, regardless of which downloads and parses first, the first will always process completely before the second.
    • A similar option is to add an async attribute. This tells the browser “go ahead and download this now, but in the background, and you can process it any time”.
    • Async scripts download and process just as the name implies: asynchronously. This means the following scripts could download and process in any order, and that order could change from one page load to another, depending on latency and download speeds:
      <!-- could load and process in ANY order -->
      <script src="script-1.js" async></script>
      <script src="script-2.js" async></script>
      <script src="script-3.js" async></script>
      
    • defer and async are particularly useful tactics for third-party JS, as they keep any other-domain latency and download time from blocking the HTML parser.
    • And remember, third-party JS has a tendency to add more JS and even CSS, all of which would otherwise block the page load.

    Optimize running JS

    Move JS to a new thread

    • A final method for removing blocking JS is to move it to a JS Worker.
    • A JS Worker is sort of a component that can run some scripts, in the background, without affecting the main JS thread.
    • This means that, even while a Worker is processing some massive calculation, your page can continue to load, unencumbered.
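    • There are two main types of JS Workers, plus a third, similarly named technology that often gets lumped in with them:
      1. Service Workers
      2. Web Workers
      3. Web Sockets
    • Note that, although the names are similar, each is quite different: Service Workers and Web Workers run scripts off the main thread, while Web Sockets are a protocol for a persistent two-way connection to a server. Each has different pros and cons, and suits different situations.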
    • I personally find the similar-sounding names very confusing, so I wrote a high-level article discussing the differences between them, what makes each unique, and some basic use-cases.
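
To make the Web Worker case concrete, here is a minimal sketch. The function names and the sample calculation are assumptions, and it builds the worker from a Blob URL only to keep the example in one file; a separate worker file is the more common setup. Where `Worker` is unavailable, it simply runs the function inline.

```javascript
// the expensive work; it must be self-contained because it is serialized
// into the worker as source text
function sumOfSquares(limit) {
    let total = 0;
    for (let i = 1; i <= limit; i++) {
        total += i * i;
    }
    return total;
}

// run `fn(input)` in a Web Worker when available, otherwise inline
function runOffMainThread(fn, input) {
    const canUseWorker = typeof Worker !== 'undefined' &&
        typeof Blob !== 'undefined' &&
        typeof URL !== 'undefined' &&
        typeof URL.createObjectURL === 'function';
    if (!canUseWorker) {
        return Promise.resolve(fn(input)); // fallback: no Worker support
    }
    const source = 'onmessage = e => postMessage((' + fn.toString() + ')(e.data));';
    const url = URL.createObjectURL(new Blob([source], { type: 'text/javascript' }));
    const worker = new Worker(url);
    return new Promise((resolve, reject) => {
        function cleanup() {
            worker.terminate();
            URL.revokeObjectURL(url);
        }
        worker.onmessage = event => { cleanup(); resolve(event.data); };
        worker.onerror = error => { cleanup(); reject(error); };
        worker.postMessage(input);
    });
}

// usage: the page stays responsive while the worker does the math
// runOffMainThread(sumOfSquares, 50000000).then(result => console.log(result));
```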

    Use image size attributes

    • Be sure to use width and height attributes.
    • These work as a “hint” to the browser, letting it know the image’s aspect-ratio:
      <img src="/image.jpg" width="600" height="400" alt="..." />
      
    • If you inspect an image in a modern browser, you should see a User Agent style like this:
      img {
          aspect-ratio: attr(width) / attr(height);
      }
      
    • You can then use CSS to make the image responsive:
      img {
          max-width: 100%;
          height: auto;
      }
      
    • Combined, these let the browser calculate how big the image will be, and “reserve space” for it while waiting for it to download.
    • This helps greatly reduce CLS.
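
The arithmetic the browser performs here is simple enough to sketch. The function name is an assumption; the numbers come from the 600 by 400 image above.

```javascript
// height the browser can reserve before the image arrives, given the
// width/height attributes and the width the image is laid out at
function reservedHeight(attrWidth, attrHeight, renderedWidth) {
    return Math.round(renderedWidth * attrHeight / attrWidth);
}

// the 600x400 image from above, squeezed into a 300px-wide column
const placeholderHeight = reservedHeight(600, 400, 300); // 200
```

So even though the attributes say 600 by 400, a 300px-wide layout slot gets a 200px-tall placeholder, and nothing jumps when the file finishes downloading.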

    Use srcset and sizes attributes

    • The srcset and sizes attributes allow you to specify multiple image sizes in the same img element, and at which breakpoints they should switch:
      <img src="/image-sm.jpg" 
          srcset="/image-sm.jpg 300w, /image-lg.jpg 800w"
          sizes="(max-width: 500px) 100vw, 50vw"
          width="300" height="200" alt=""
      />
      
    • The browser will determine which image size makes the most sense based on the environment.
    • By offering multiple sizes, you ensure the user downloads the smallest file size possible, thus saving bandwidth and helping page load speeds.
    • And as long as all image variations have the same aspect ratio, the width and height attribute hinting from above will continue to work and prevent CLS.
    • If you want to help with perceived performance, you can also add a background-color to the img element that is similar to the actual image.
    • I have also seen examples of people using background gradients to mimic the coming image…
    • Supported in all browsers except for IE.
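
To get a feel for how srcset and sizes work together, here is a deliberately simplified model of the selection. It is an illustration only: real browsers are free to weigh other factors (cache, network, user settings), and the function name and data shapes are assumptions.

```javascript
// candidates: [{ url, width }] taken from the srcset `w` descriptors
// sizes: [{ maxViewport, vw }] in source order; the last entry has no
//        maxViewport and acts as the default
function pickCandidate(candidates, sizes, viewportWidth, devicePixelRatio) {
    // 1. the first matching `sizes` entry gives the layout slot width in CSS px
    const rule = sizes.find(s => s.maxViewport === undefined || viewportWidth <= s.maxViewport);
    const slotWidth = viewportWidth * rule.vw / 100;
    // 2. scale by pixel density to get the number of image pixels needed
    const needed = slotWidth * devicePixelRatio;
    // 3. smallest candidate that is wide enough, else the largest available
    const sorted = [...candidates].sort((a, b) => a.width - b.width);
    return sorted.find(c => c.width >= needed) || sorted[sorted.length - 1];
}

// the values from the example above
const exampleCandidates = [
    { url: '/image-sm.jpg', width: 300 },
    { url: '/image-lg.jpg', width: 800 }
];
const exampleSizes = [
    { maxViewport: 500, vw: 100 }, // (max-width: 500px) 100vw
    { vw: 50 }                     // 50vw
];

pickCandidate(exampleCandidates, exampleSizes, 280, 1).url; // '/image-sm.jpg'
pickCandidate(exampleCandidates, exampleSizes, 280, 2).url; // '/image-lg.jpg'
```

Note how the same 280px viewport gets the larger file on a 2x screen, since the slot needs 560 image pixels there.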

    Lazy load images & iframes

    • Lazy loading is a huge help! It tells the browser “no need to fetch it yet, but if the user gets close, go get it”.
    • The easiest implementation is to add loading="lazy" to an img or iframe element:
      <img src="image.jpg" loading="lazy" alt="..."/>
      
    • Support for the native implementation is good, but not great yet. It is also a perfect progressive enhancement: it works where it is supported, causes no harm where it isn’t, and will patiently wait for browsers to implement it.
    • There are also countless libraries that implement this via JS.
    • Libraries typically ask you to move the src to a data-src attribute, then add some specific class, like:
      <img data-src="image.jpg" class="lazy-loading" alt="..."/>
      
    • As the user scrolls, most of the libraries use Intersection Observer (supported by all except IE) to determine when to switch the data-src back to the src, thus fetching the image.
    • If you are worried about native support, and really want to use this feature, you can run a combo:
      // add the native `loading` attribute,
      // do NOT use a `src`,
      // but DO use a `data-src`,
      // then add the library `class`
      <img data-src="image.jpg" loading="lazy" class="lazy-loading" alt="..."/>
      
      // then feature-detect native lazy loading,
      if ( 'loading' in HTMLImageElement.prototype ) {
          // if available, move `data-src` to `src`
      } else {
          // otherwise add js library script
      }
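
The two branches above could be filled in roughly as follows. Note the swap: instead of pulling in a library in the fallback branch, this sketch uses Intersection Observer directly, which is what most of those libraries do internally. The function names and the 200px margin are assumptions.

```javascript
// copy `data-src` into `src`, which triggers the actual fetch
function activate(img) {
    img.src = img.dataset.src;
}

// `supportsNative` and `observe` are passed in so the branching can be tested
function initLazyImages(images, supportsNative, observe) {
    if (supportsNative) {
        // the browser defers the fetch itself thanks to loading="lazy"
        images.forEach(activate);
    } else {
        // otherwise wait until each image nears the viewport
        images.forEach(img => observe(img, () => activate(img)));
    }
}

// browser wiring
if (typeof document !== 'undefined' && typeof HTMLImageElement !== 'undefined') {
    const images = Array.from(document.querySelectorAll('img.lazy-loading'));
    const supportsNative = 'loading' in HTMLImageElement.prototype;
    const observe = (img, onVisible) => {
        if (typeof IntersectionObserver === 'undefined') {
            onVisible(); // no observer support either: just load the image
            return;
        }
        const observer = new IntersectionObserver(entries => {
            if (entries.some(entry => entry.isIntersecting)) {
                observer.disconnect();
                onVisible();
            }
        }, { rootMargin: '200px' }); // start fetching a little before it is visible
        observer.observe(img);
    };
    initLazyImages(images, supportsNative, observe);
}
```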
      

    Reserve space for delayed content

    • This is similar to using the width and height attributes on images to preserve space while waiting for them to load.
    • If you have ad units or fixed-size content that is loaded later, you can reserve space by giving fixed sizes to container elements.
    • Then when the content arrives, it simply fills the pre-sized container, without pushing the surrounding content around.
    • This helps reduce CLS.

    Do not use Data-URIs

    • Data-URIs can be good, but only if the string is really small.
    • Otherwise, they typically just bloat the page and delay the rendering of the content below it.

    Use preload="none" on audio and video elements

    • Unless a video should autoplay, like in a hero section, the preload="none" attribute should always be used.
    • And I would argue this should probably always be used for audio.
    • preload="none" is really important for performance, as it hints to the browser not to download any of the media until the user presses play, thus saving bandwidth:
      <video preload="none" poster="poster.jpg">
          <source src="video.webm" type="video/webm">
          <source src="video.mp4" type="video/mp4">
      </video>
      
    • It is also recommended that you always use a poster attribute so that the user has something to look at.

    Order source elements properly

    • Note that the picture, audio and video elements can all include multiple source elements, which should be placed in order of preference:
      <video preload="none" poster="poster.jpg">
          <source src="video.webm" type="video/webm">
          <source src="video.mp4" type="video/mp4">
      </video>
      
    • As in the example above, you can offer webm first, and if that is not supported, the browser will fall back to the mp4 format.
    • This will also work for audio and picture format options.
    • The source elements can also have media attributes to offer different size resources for different screen sizes, further saving bandwidth.
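
The browser does this selection for you, but a simplified model of it for audio and video might look like the sketch below. The function name is an assumption; `canPlayType` mirrors the real media element method, which returns an empty string for unsupported types.

```javascript
// Simplified model of how a media element walks its <source> children:
// the first one whose type the browser says it can play wins.
function pickSource(sources, canPlayType) {
    return sources.find(source => canPlayType(source.type) !== '') || null;
}

// usage in a browser:
// const video = document.createElement('video');
// pickSource(
//     [{ src: 'video.webm', type: 'video/webm' }, { src: 'video.mp4', type: 'video/mp4' }],
//     type => video.canPlayType(type)
// );
```

This is why order matters: a browser that supports both formats never looks past the first source.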

    Progressively load fonts

    • Mobile browsers will typically wait up to 3s to display web fonts, hoping the desired font is ready before showing the text.
    • That is a long time for a user to look at a blank screen, and triggers CLS if/when the font does finally render.
    • And note that if you have multiple fonts that need to download, each font can trigger a new re-paint, further hurting CLS.
    • font-display: swap allows the browser to show the content as soon as possible using a backup from your font stack until the font arrives, then it will swap.
    • You can use a Promise to fetch fonts, even multiple fonts at one time, then do a single re-paint for all of your fonts at the same time:
      // declare all of your fonts (family name first, then the source)
      const font1 = new FontFace('myFont1', 'url(myFont1.woff2)');
      const font2 = new FontFace('myFont2', 'url(myFont2.woff2)');
      const font3 = new FontFace('myFont3', 'url(myFont3.woff2)');
      // setup a Promise that *all* 
      // must resolve before returning
      Promise.all([
          font1.load(),
          font2.load(),
          font3.load()
      ]).then(allFonts => {
          // once all have resolved,
          // loop through and `add` each
          allFonts.forEach(font => {
              document.fonts.add(font)
          });
      });
      
    • You can also check the user’s preference for data saving; if they are trying to save data, maybe don’t download the font(s) at all:
      // could be written more concisely,
      // but this makes the point more clearly...
      if ( navigator.connection &&
           navigator.connection.saveData ) {
          // maybe do nothing?
      } else {
          // load all of your fonts
      }
      
    • You can also check the effective connection speed before downloading; if the connection is bad, maybe don’t download the font(s) at all:
      if ( navigator.connection &&
           ( navigator.connection.effectiveType === "slow-2g" ||
             navigator.connection.effectiveType === "2g" ) 
         ) {
          // maybe do nothing?
      } else {
          // load all your fonts
      }
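
Putting those pieces together, a combined sketch could look like this. The function names, font family names and file names are assumptions, and `navigator.connection` is not available in every browser, which is why a missing value is treated as "load as usual".

```javascript
// decide from the (optional) Network Information API whether fonts are worth fetching
function shouldLoadFonts(connection) {
    if (!connection) return true;            // API unavailable: load as usual
    if (connection.saveData) return false;   // user asked to save data
    return !['slow-2g', '2g'].includes(connection.effectiveType);
}

// load every font, then add them together so the text re-renders once
function loadFonts(fontFaces, fontSet) {
    return Promise.all(fontFaces.map(face => face.load()))
        .then(loaded => {
            loaded.forEach(face => fontSet.add(face));
            return loaded.length;
        });
}

// browser wiring; family names and file names are hypothetical
if (typeof document !== 'undefined' && typeof FontFace !== 'undefined') {
    if (shouldLoadFonts(navigator.connection)) {
        loadFonts([
            new FontFace('My Font 1', 'url(myFont1.woff2)'),
            new FontFace('My Font 2', 'url(myFont2.woff2)')
        ], document.fonts).catch(() => {
            // a font failed to load: the fallback font stack stays in place
        });
    }
}
```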
      

    Use a Cache-Control header

    • Previously, you might have used the Expires header with a specific date to tell the browser when its cached copy would go stale.
    • But that date had to be kept current on the server, and once it passed, the browser went back to asking the server whether the file had been updated, which still required a request.
    • Instead, we can now use the Cache-Control header, which tells the browser “don’t even bother asking until [some directive]”:
      # cache for 7 days
      Cache-Control: max-age=604800
      
    • You can assign different cache durations for different file types or even specific files:
      # cache these file types for 7 days
      <FilesMatch "\.(ico|pdf|flv|jpg|jpeg|png|gif|js|css|swf)$">
          Header set Cache-Control "max-age=604800"
      </FilesMatch>
      
      # cache just this file for 7 days
      <FilesMatch "logo\.svg$">
          Header set Cache-Control "max-age=604800"
      </FilesMatch>
      
    • Cache-Control accepts several different directives, but they all boil down to the browser not having to even ask until that directive has expired.
    • All CDNs should offer this feature as well.

    Cache files using a Service Worker

    Use Service Worker to swap resources

    • As all network requests go through the Service Worker, it is possible to make changes to the requests before they go out.
    • One use for this would be to replace images like JPGs or PNGs with WEBPs, only if the browser supports them.
    • The same technique could be used to replace MP4s with WEBMs, or any other resources you want to swap.
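
A rough sketch of the image swap might look like this. It assumes a `.webp` file sits next to every JPG and PNG on your server, and the function name is made up. The browser's support for WEBP is inferred from the request's Accept header.

```javascript
// rewrite a JPG/PNG URL to its WEBP sibling when the request accepts WEBP;
// assumes a .webp file exists alongside every original (hypothetical setup)
function swapImageUrl(url, acceptHeader) {
    const acceptsWebp = (acceptHeader || '').includes('image/webp');
    if (!acceptsWebp) return url;
    // keep any query string that follows the extension
    return url.replace(/\.(jpe?g|png)(\?.*)?$/i, '.webp$2');
}

// only register the handler when running inside a Service Worker
if (typeof ServiceWorkerGlobalScope !== 'undefined' && self instanceof ServiceWorkerGlobalScope) {
    self.addEventListener('fetch', event => {
        const request = event.request;
        const swapped = swapImageUrl(request.url, request.headers.get('Accept'));
        if (swapped !== request.url) {
            // fall back to the original request if the WEBP is missing
            event.respondWith(
                fetch(swapped)
                    .then(response => (response.ok ? response : fetch(request)))
                    .catch(() => fetch(request))
            );
        }
    });
}
```

Requests that are not swapped fall through untouched, so the browser handles them as it normally would.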

Summary

Where do I begin?

Well, as I said at the beginning, WPO is just a massive beast.

The hardest part of this article was not so much the info-gathering, as it was the info-ordering.

I toyed with absolute subject-ordering (“Here is everything I found about servers, now everything I found about CDNs, etc.”), but decided on the grouping above because I felt it was important for all of the teams to have some general idea of the entire scope.

No, the server team probably doesn’t care or want to know anything about CSS or JS file splitting and conditional loading, nor do most frontend developers want to get into Cache-Control headers or CDN configurations.

But again, I think it is important that we all have some idea of what the other people are going through, even if just conceptually.

So, my bottom line is this: Probably no one needs everything above. In fact, there may even be some stuff that contradicts some other stuff. But if you are testing first, and find a troublesome issue, maybe there is something up above that could help alleviate it…

So, read through, get familiar, share this with team members, maybe you can create a conversation.

Because WPO is not a set-it-and-forget-it subject. It is a constant battle to make sure that you are always implementing things the best way possible; to make sure that those ways are still the best way possible; to make sure that your business goals have not changed; to make sure you aren’t experiencing some new problem…

And those concerns stretch across all teams concerned.

So WPO is not a “skillset” or a “project”; it must be a philosophy. A company-wide philosophy.

Resources

Below is a very short list of resources that I found useful. I highly recommend visiting them all as well!

Learning

Tools

Okay, well, thanks a lot for following along, I know this was a crazy-long one, and please, let me know if you think I missed anything or if you differ with my interpretation or understanding on something. My goal here is for everyone to read this and get to know this topic better! And you can help with that!

Happy performing,
Atg