March Madness: Getting To The CrUX of The Matter.

It's that time of year when the NBA playoffs are around the corner and division rankings are starting to settle in, but more importantly, it's tournament time: March Madness.
I've lately been spending some time doing two of my fave things: watching basketball and poking around BigQuery.
It's been interesting piecing together findings from the vast amount of data available in BigQuery, using the data sets provided by both HTTP Archive and the Chrome UX Report.
I've also been closely mirroring the rudimentary examples and simple recipes provided by Google engineer Rick Viscomi, the anointed Crowned Custodian, Regent of the Repository that is the HTTP Archive.
How did I arrive here? After Rick's last presentation @ the NY Performance Meetup, he posted a simple query comparing 2 origins (sites) and their page view densities by form factor and connection type. These are available in the Chrome UX Report (CrUX) data, which roughly provides the following fields:

  • origin
  • effective_connection_type
  • form_factor
  • first_paint
  • first_contentful_paint
  • dom_content_loaded
  • onload

Following the example Rick presented, and being in the midst of March Madness, I personally wondered which sites were being frequented by sports fans.

Overwhelmed with responses, I conferred with Alexa. From the list provided, I picked a small number of sites for a quick first look.

The sites: bleacherreport.com, cbssports.com, cricbuzz.com, espn.com, sports.yahoo.com, si.com.

So what did I do? This was the original query:

#standardSQL
SELECT
  origin,
  form_factor.name AS device,
  effective_connection_type.name AS connection, 
  ROUND(SUM(fcp.density), 4) AS density
FROM
  `chrome-ux-report.country_us.201802`,
  UNNEST(first_contentful_paint.histogram.bin) AS fcp
WHERE
  origin IN ('http://www.espn.com',
             'http://www.cricbuzz.com',
             'https://sports.yahoo.com',
             'http://bleacherreport.com',
             'https://www.cbssports.com',
             'https://www.si.com')
GROUP BY
  origin,
  device, 
  connection
ORDER BY
  origin,
  device,
  connection

This was essentially Rick's recipe with different ingredients. It returns the first contentful paint (fcp) density for each origin, broken down by form factor and connection type. The form factors are split between desktop, tablet and phone, and the connection types follow the Network Information API's networkInformation.effectiveType: 4G, 3G, 2G and slow-2G.
The result of the query looked like this:

Origin                      Device    Connection   Density
http://bleacherreport.com   desktop   4G           0.4214
http://bleacherreport.com   phone     3G           0.03
http://bleacherreport.com   phone     4G           0.5244
http://bleacherreport.com   tablet    4G           0.024
http://www.cricbuzz.com     desktop   3G           0.0929
http://www.cricbuzz.com     desktop   4G           0.8878
http://www.cricbuzz.com     tablet    4G           0.0184
http://www.espn.com         desktop   4G           0.2556
http://www.espn.com         phone     3G           0.0387
http://www.espn.com         phone     4G           0.6924
http://www.espn.com         tablet    4G           0.0132
https://sports.yahoo.com    desktop   3G           0.0161
https://sports.yahoo.com    desktop   4G           0.4367
https://sports.yahoo.com    phone     3G           0.0344
https://sports.yahoo.com    phone     4G           0.4764
https://sports.yahoo.com    tablet    4G           0.0357
https://www.cbssports.com   desktop   4G           0.375
https://www.cbssports.com   phone     3G           0.0361
https://www.cbssports.com   phone     4G           0.5545
https://www.cbssports.com   tablet    4G           0.0345
https://www.si.com          desktop   4G           0.3734
https://www.si.com          phone     3G           0.0324
https://www.si.com          phone     4G           0.5666
https://www.si.com          tablet    4G           0.0274

For a more pleasant layout, we then create a pivot table (thank you Paul Calvano):

A reminder: these are results from the February 2018 CrUX release for the USA: chrome-ux-report.country_us.201802.
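The same pivot can also be sketched in a few lines of Python for anyone without a spreadsheet handy. This is only an illustration: the rows are copied from the query output above, and the helper names (pivot, phone_minus_desktop_4g) are made up for this sketch and are not part of CrUX or BigQuery.

```python
from collections import defaultdict

# Rows copied from the query result above: (origin, device, connection, density).
ROWS = [
    ("http://bleacherreport.com", "desktop", "4G", 0.4214),
    ("http://bleacherreport.com", "phone", "3G", 0.03),
    ("http://bleacherreport.com", "phone", "4G", 0.5244),
    ("http://bleacherreport.com", "tablet", "4G", 0.024),
    ("http://www.cricbuzz.com", "desktop", "3G", 0.0929),
    ("http://www.cricbuzz.com", "desktop", "4G", 0.8878),
    ("http://www.cricbuzz.com", "tablet", "4G", 0.0184),
    ("http://www.espn.com", "desktop", "4G", 0.2556),
    ("http://www.espn.com", "phone", "3G", 0.0387),
    ("http://www.espn.com", "phone", "4G", 0.6924),
    ("http://www.espn.com", "tablet", "4G", 0.0132),
    ("https://sports.yahoo.com", "desktop", "3G", 0.0161),
    ("https://sports.yahoo.com", "desktop", "4G", 0.4367),
    ("https://sports.yahoo.com", "phone", "3G", 0.0344),
    ("https://sports.yahoo.com", "phone", "4G", 0.4764),
    ("https://sports.yahoo.com", "tablet", "4G", 0.0357),
    ("https://www.cbssports.com", "desktop", "4G", 0.375),
    ("https://www.cbssports.com", "phone", "3G", 0.0361),
    ("https://www.cbssports.com", "phone", "4G", 0.5545),
    ("https://www.cbssports.com", "tablet", "4G", 0.0345),
    ("https://www.si.com", "desktop", "4G", 0.3734),
    ("https://www.si.com", "phone", "3G", 0.0324),
    ("https://www.si.com", "phone", "4G", 0.5666),
    ("https://www.si.com", "tablet", "4G", 0.0274),
]


def pivot(rows):
    """Return {origin: {(device, connection): density}}."""
    table = defaultdict(dict)
    for origin, device, connection, density in rows:
        table[origin][(device, connection)] = density
    return dict(table)


def phone_minus_desktop_4g(table):
    """Phone-minus-desktop 4G density gap per origin, in percentage points."""
    gaps = {}
    for origin, cells in table.items():
        phone = cells.get(("phone", "4G"))
        desktop = cells.get(("desktop", "4G"))
        if phone is None or desktop is None:
            continue  # skip origins with no phone or desktop 4G row
        gaps[origin] = round((phone - desktop) * 100, 1)
    return gaps


table = pivot(ROWS)
gaps = phone_minus_desktop_4g(table)
# Print origins from the smallest to the largest phone-vs-desktop gap.
for origin, gap in sorted(gaps.items(), key=lambda kv: kv[1]):
    print(f"{origin:28s} {gap:5.1f}")
```

Origins that lack a phone or desktop 4G row are simply left out of the gap calculation rather than treated as zero.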

The original list of origins included cricbuzz.com, which I ended up discarding entirely as the mobile site was an m-dot redirect. CrUX reports per origin, so phone traffic landing on the mobile subdomain is the reason the cricbuzz.com results above skewed only to desktop (a reminder of where I had erred before).
Now, before going any further, I should also mention that the findings do not involve native apps. As such, these results are far from final assertions, but are simply there to provide some insights and ideas about the web.
It's clear that mobile is the primary form factor used by sports buffs, and proven to be the same worldwide (remember this moment in internet history?). We can see that the delta between mobile web and desktop densities @ 4G ranges from about 4 percentage points (sports.yahoo.com) and 10 (bleacherreport.com) to nearly 44 (espn.com). What might be making the company formerly known as Entertainment and Sports Programming Network, better known today as ESPN, click better on mobile than all others? I decided to look a touch closer @ some data to see what could be revealed. I went ahead with the following query:

#standardSQL
SELECT
  origin,
  form_factor.name AS device,
  effective_connection_type.name AS connection,
  fcp.start,
  ROUND(SUM(fcp.density) * 100,2) AS pct 
FROM
  `chrome-ux-report.country_us.201802`,
   UNNEST(first_contentful_paint.histogram.bin) AS fcp
WHERE
  origin IN ('http://www.espn.com',
             'https://sports.yahoo.com',
             'http://bleacherreport.com',
             'https://www.cbssports.com',
             'https://www.si.com')
  AND form_factor.name IN ('phone')
GROUP BY
  origin,
  device,
  connection,
  start
ORDER BY
  origin,
  device,
  connection,
  start

This time, we added fcp.start, the start time of each histogram bin, to get a snapshot of first contentful paint timings in increments. This gave us more interesting insight into how, and specifically when, first contentful paint was taking place. Let’s look at some histograms of the results.
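To turn those per-bin rows into a "by 3s" figure, the bins that start before the threshold are summed. Here is a minimal sketch of that step, assuming the rows for a single origin, device and connection have already been exported from BigQuery. The Bin type, the cumulative_pct helper and the EXAMPLE_BINS values are hypothetical and purely illustrative, not real CrUX data.

```python
from typing import NamedTuple


class Bin(NamedTuple):
    start: int   # bin start in ms (fcp.start in the query)
    pct: float   # share of page views in this bin, in percent


def cumulative_pct(bins, threshold_ms):
    """Sum the percentages of all bins that start before threshold_ms.

    A bin is counted whole, so threshold_ms should line up with a bin edge.
    """
    return round(sum(b.pct for b in bins if b.start < threshold_ms), 2)


# Hypothetical rows for illustration only; these are NOT real CrUX figures.
EXAMPLE_BINS = [
    Bin(0, 5.0),
    Bin(1000, 20.0),
    Bin(2000, 15.0),
    Bin(3000, 6.0),
    Bin(3500, 4.0),
]

print(cumulative_pct(EXAMPLE_BINS, 3000))
```

Because each bin is counted whole, picking a threshold in the middle of a bin would overstate the cumulative figure.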

bleacherreport.com

I wanted to expose what was taking place within the first 5s of the contentful paint timeline, and according to the query, these were the results. We know how much the first 3s mean to users on mobile (and publishers too), as we look for enough content to appear in order to retain their attention and maintain an acceptable user experience. We know that on bleacherreport.com, 52.44% of page views had a contentful paint on mobile @ 4G (previous chart), with 45.5% getting their fcp by the 3s mark, and in fact 35% by 2s. We also know that, for a successful user experience, we'd like those views to be front-loaded as much as possible. Meaning, most first contentful paints will ideally occur before the 3s mark, and visually, the tallest bars should sit as far left as possible. Pretty simple. Let's look @ more results.

cbssports.com

Here, we have very similar results with cbssports.com: 55.45% density, with 50.74% by 3s. The data showed the remaining 5% or so falling between 3s and 40s (uncharted).

espn.com

Getting back to the original findings - espn.com was seemingly doing things a little better and this histogram does make the case. Let's look @ the figures: by 3s, 65% of fcp density. In fact, 52% by 1.6s and 35% by 1s. Impressive.

si.com

Peculiar, no? I ran this test a few times to make sure I wasn't doing anything incorrectly or catching the servers at a bad moment (I originally ran these during March Madness after all, and there were several upsets), and ran it again a few days later, during a lull in updates. All delivered the same results.
As you can tell, a seemingly large share of the fcp was taking place @ 3s+. By the 3s mark, we saw 31% of page views getting fcp. But from 3s+ and beyond, we saw 12% within the next 1500ms and 15% from 5s to 10s (uncharted). Thus, with 57% density on mobile @ 4G speeds, nearly half of it (roughly 27 percentage points) is fcp happening after 3s. Something possibly worth investigating.
Nota Bene: The nature of the results from CrUX is that at 3000ms and beyond, they use 500ms increments. Though this histogram is bound to look different @ uniform increments (without what looks like an exaggerated set of findings), the results remain the same.
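One way to see how much of that visual exaggeration comes from bin width alone is to divide each bin's percentage by its width. A small sketch follows, under the assumption that each bin's start and end are both available; the density_per_100ms helper and the BINS values are hypothetical and only illustrate the arithmetic.

```python
def density_per_100ms(bins):
    """Normalize each bin's percentage by its width.

    bins: list of (start_ms, end_ms, pct) tuples.
    Returns a list of (start_ms, pct per 100 ms of bin width).
    """
    normalized = []
    for start, end, pct in bins:
        width = end - start
        if width <= 0:
            raise ValueError(f"bin starting at {start} has non-positive width")
        normalized.append((start, round(pct * 100 / width, 3)))
    return normalized


# Hypothetical bins for illustration only (NOT real CrUX data):
# two narrow bins before 3000 ms, two wider bins after it.
BINS = [
    (2600, 2800, 2.0),
    (2800, 3000, 2.0),
    (3000, 3500, 4.0),
    (3500, 4000, 3.0),
]

for start, per_100ms in density_per_100ms(BINS):
    print(start, per_100ms)
```

In this made-up example the 3000ms bar is twice as tall as its neighbours in raw terms, yet lower once width is accounted for, which is the effect the note above warns about.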

sports.yahoo.com

This final chart from sports.yahoo.com gets us back to what we were used to: 44% of page views were getting fcp by 3s, which is in fact the lion's share of the mobile density @ 4G. The remaining 3% or so takes place from 3s+ right up to 40s. Oddly, they also seem to be getting a sizeable amount of desktop activity, only about 4 percentage points less than mobile @ 4G (43.67% vs 47.64%), which runs counter to the pattern in the rest of these CrUX results.
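Since each site has a different total phone density @ 4G, it can help to express the "by 3s" figure as a share of that total. The arithmetic below is a rough sketch using only the figures quoted above (totals from the first results table, "by 3s" values as stated per site, several of them rounded). The share_by_3s helper is a hypothetical name for this illustration.

```python
# (total phone 4G density in %, % of page views with fcp by 3s),
# as quoted in the text above; several "by 3s" values are rounded.
FIGURES = {
    "bleacherreport.com": (52.44, 45.5),
    "cbssports.com": (55.45, 50.74),
    "espn.com": (69.24, 65.0),
    "si.com": (56.66, 31.0),
    "sports.yahoo.com": (47.64, 44.0),
}


def share_by_3s(total_pct, by_3s_pct):
    """Share of the phone 4G density that gets fcp by 3s, in percent."""
    return round(by_3s_pct / total_pct * 100, 1)


shares = {site: share_by_3s(*values) for site, values in FIGURES.items()}

# Print sites from the lowest to the highest share of fcp by 3s.
for site, share in sorted(shares.items(), key=lambda kv: kv[1]):
    print(f"{site:20s} {share:5.1f}")
```

Read this way, si.com is the clear outlier, with only a little over half of its phone 4G page views painting by 3s, while the other four sit well above 85%.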

One Step Further

Having seen the breakdown of the first contentful paint timeline on mobile, one would ask what exactly was taking place with each site. The goal, though, was a quick look at some of the performance data; this was never meant to be a deep dive. Still, one more quick check was due, in order to get one last look @ each site. This is where we call on WebPageTest. Mind you, the CrUX data is all RUM, while WebPageTest runs are synthetic, so this isn’t exactly an oranges to oranges comparison. Let’s take a look either way.

The above are WPT results for all 5 origins, done on a Moto G4 (4th gen) on 4G. We can make some quick assessments.

  • espn.com: despite what seemed like a higher speed index than most, it had some of the better results overall. We also saw better results than most from the FCP histogram.

  • sports.yahoo.com: in spite of a low-ish speed index, the fewest total requests (sub 100) and the lowest amount of bytes transferred, it had some higher readings than one would expect. It feels like they’re likely some small tweaks away from much better readings and possibly better fcp times.

  • notables: the above were the only two with requests <= 178 (494, 704 and 710 for the other origins), the only two with times to interactive < 20s and the only two with bytes <= 4.5MB (8MB, 13.7MB and 20.1MB for the other origins). Again, you can poke around and see some loose correlations as to what is allowing espn.com and sports.yahoo.com to post better overall results.

As a nota bene, it bears repeating: always try to run your tests on very ordinary, middle of the road devices. We chose a Moto G4. Why? We were just recently reminded of the state of low end phone ownership during #googleGDC18.

In closing, this was a simple case of fun curiosity and fiddling with BigQuery, turned no-frills fact finding. Certainly a deeper dive was possible, but the initial point was to keep it relatively brief. The bigger lesson is the idea of keeping an eye on competitors, or at least the players in your business category. With tools like BigQuery, the Chrome UX Report, HTTP Archive and even WebPageTest, you have the means to dive as deeply as you choose and provide yourself or your team with insights into what's happening with your online properties.
Finally, I would like to thank both Rick Viscomi and Paul Calvano for the inspiration, technical review and help of course.