# Response:

Hello!  This Webtortoise post was written 2014-MAR-31 at 10:35 PM ET.

## Keep in Mind:

#- Credit to Lee Humphries over at http://www.thinkingapplied.com/means_folder/deceptive_means.htm for the idea-inspiring post.  Thank you, sir.

## Story:

Hello, Everyone.  In this Webtortoise Story, we are going to expand on the concepts of the Web Performance “Hockey Stick” Cumulative Distribution Function (“CDF”) chart and explore “rates of change between the percentiles”.  Specifically, we want to see how the geometric rate of change between percentiles compares with the actual rate of change between percentiles.  We will do this by:

– constructing an actual CDF;

– constructing a geometric CDF; and

– then charting both the actual and the geometric.
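For readers without the spreadsheet handy, the three steps above can be roughed out in a few lines of Python. This is only a sketch under assumptions: the sample values are made up (they are not the Internet Retailer data), the function names are hypothetical, and the "geometric" curve is assumed to mean a curve that grows by one constant ratio from the 1st percentile to the 99th percentile. The spreadsheet may build its geometric accumulation differently.

```python
import statistics


def actual_percentiles(samples):
    """Return the 1st..99th percentiles of the samples (the 'actual' curve)."""
    return statistics.quantiles(samples, n=100, method="inclusive")


def geometric_percentiles(first, last, count):
    """Return `count` values growing by one constant ratio from `first` to `last`.

    Assumption: this is one plausible reading of a 'geometric CDF'.
    """
    ratio = (last / first) ** (1 / (count - 1))
    return [first * ratio ** i for i in range(count)]


# Hypothetical page-load times in milliseconds.
load_times_ms = [1200, 1350, 1400, 1500, 1650, 1800, 2100, 2600, 3900, 9000]

actual = actual_percentiles(load_times_ms)
geometric = geometric_percentiles(actual[0], actual[-1], len(actual))

# Delta between the two curves at each percentile (actual minus geometric).
deltas = [a - g for a, g in zip(actual, geometric)]

for pct in (25, 50, 75, 95):
    i = pct - 1  # index 0 holds the 1st percentile
    print(f"p{pct}: actual={actual[i]:.0f} geometric={geometric[i]:.0f} delta={deltas[i]:.0f}")
```

Charting `actual` and `geometric` against the percentile number gives the two lines, and `deltas` gives the gap between them at each percentile.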

In this Excel sheet, take a look at the raw data sample for the Internet Retailer Top 10 sites (these were simple default home page loads, but the underlying measurement theory can apply to any performance metric).  In chart 01 (cell AD5), have constructed the aggregate Hockey Stick CDF chart by calculating all the percentiles and then charting them as a line. In chart 02 (cell AD32), have constructed the aggregate Hockey Stick again, but used the geometric accumulation instead of the actual accumulation. In chart 03 (cell AD59), have plotted the information from charts 01 and 02 together. In chart 04 (cell AY119), can now do things like compare the delta between the “actual” and the “geometric” and use it as another data point for studying your web performance.  Note the scroll button in cell AX120, which scrolls through the IR Top 10.

#Analytics #CatchpointUser #ChartsAndDimensions #ChartsAndGraphs #Performance #SiteSpeed #WebPerformance #Webtortoise #WebPerf #WPO #DataVis #ExcelHockeyStick #WebPerformanceHockeyStick #Percentile #GeometricMean

# Response:

Hello! This #WebTortoise post was written 2013-JAN-31 at 09:06 PM ET (about #WebTortoise).

## Main Points

#- An Arithmetic Mean will, for all intents and purposes in WebTortoise World, result in a higher value than its Geometric Mean counterpart. Relative to “faster is better” in web performance, might say an Arithmetic Mean is a pessimistic calculation.

#- A Geometric Mean will, for all intents and purposes in WebTortoise World, result in a lower value than its Arithmetic Mean counterpart. Relative to “faster is better” in web performance, might say a Geometric Mean is an optimistic calculation.

#- Define: What is a Percentile?
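To make the first two points concrete, here is a minimal sketch using Python's standard `statistics` module. The load times are hypothetical (four ordinary measurements plus one slow outlier), chosen only to show how each calculation reacts to the outlier.

```python
import statistics

# Hypothetical load times in milliseconds, with one slow outlier.
load_times_ms = [1500, 1600, 1700, 1800, 12000]

arithmetic = statistics.fmean(load_times_ms)            # 3720.0
geometric = statistics.geometric_mean(load_times_ms)    # lower than the arithmetic mean
median = statistics.median(load_times_ms)               # 1700

print(f"arithmetic mean: {arithmetic:.0f} ms")
print(f"geometric mean:  {geometric:.0f} ms")
print(f"median:          {median:.0f} ms")
```

The single 12,000 ms measurement pulls the Arithmetic Mean well above every ordinary measurement, while the Geometric Mean and the Median stay much closer to the bulk of the data. For positive values the Geometric Mean can never exceed the Arithmetic Mean; the two are equal only when every value is the same.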

## Story

Had an opportunity to discuss which statistical calculation should be used when looking at Performance charts. The discussion summary goes something like this.

First, assume consideration for a central-tendency calculation. Then:

If, in fact, looking for spurious outliers, consider plotting the Arithmetic Mean average.

Otherwise, consider plotting either the Geometric Mean or the Median, as they are very good central-tendency calculations.

To start, see this XY scatter plot taken from a day’s worth of synthetic test runs. In this Story, are using data from Catchpoint’s US node network (Thank you, Catchpoint), measuring about 3,500 times a day (about 170 per hour). Intentionally chose this webpage as it contained a third-party ad network having particular host issues (the waterfall data was invaluable for troubleshooting, but that’s a Story for another day).

Eyeballing the chart, notice the thick band of majority data is less than 5,000 ms (right around 1,500 – 3,000 ms) with thinner pockets and bands throughout. Also notice that between about 10:00 AM and 02:00 PM, there were no measurements higher than around 14,000 ms.

Second, will take the above XY scatter plot and draw a bar graph representing the middle 25th-75th percentile range (See, “What is a Percentile”). The idea here is to show a middle range (which might better represent overall Performance) versus just a single line (which can sometimes ‘lie’ or misrepresent).

Third, using the same data from the XY scatter plot, overlay line charts showing the respective Arithmetic Mean, Geometric Mean and Median calculations. Critical thing to notice is the height of the Arithmetic Mean (Y axis) versus either the Geometric Mean or the Median. Notice how the Arithmetic Mean is, at times, either very near the upper limit of the middle range or, in some cases, even above the upper limit of the middle range! Now notice the Geometric Mean and Median are always comfortably within the middle range.
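The hourly bars and lines described above can be approximated with a short script. This is a sketch, not the chart's actual source: the `hourly_summary` function and its field names are hypothetical, the measurements are invented, and the percentiles use the standard library's "inclusive" interpolation, which may differ slightly from what a spreadsheet or monitoring tool reports.

```python
import statistics
from collections import defaultdict


def hourly_summary(measurements):
    """Group (hour, load_time_ms) pairs by hour and summarize each hour."""
    by_hour = defaultdict(list)
    for hour, load_time_ms in measurements:
        by_hour[hour].append(load_time_ms)

    summary = {}
    for hour, values in sorted(by_hour.items()):
        # Quartile cut points: 25th percentile, median, 75th percentile.
        p25, _, p75 = statistics.quantiles(values, n=4, method="inclusive")
        mean = statistics.fmean(values)
        summary[hour] = {
            "p25": p25,
            "p75": p75,
            "arithmetic_mean": mean,
            "geometric_mean": statistics.geometric_mean(values),
            "median": statistics.median(values),
            "mean_above_middle_range": mean > p75,
        }
    return summary


# Hypothetical data: hour 0 has one slow outlier, hour 14 is tightly packed.
hour_0 = [1500, 1700, 1900, 2000, 2100, 2200, 2400, 2600, 2900, 9000]
hour_14 = [1900, 1950, 2000, 2050, 2100, 2150, 2200, 2250, 2300, 2350]
measurements = [(0, v) for v in hour_0] + [(14, v) for v in hour_14]

summary = hourly_summary(measurements)
for hour, stats in summary.items():
    print(hour, {name: round(value) for name, value in stats.items()})
```

In this made-up data, the outlier in hour 0 lifts the Arithmetic Mean above the 75th percentile while the Geometric Mean and Median stay inside the middle range, and the tightly packed hour 14 produces a much narrower middle range.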

Other:

Notice the Arithmetic Mean for the 12:00 AM and 07:00 AM hours is above the middle range. Now, quickly glance back at the XY scatter plot to see the measurement data.

Notice the middle ranges for the 02:00 PM and 03:00 PM hours are narrower than those of other hours. Glancing back at the XY scatter plot, can see the thick band of measurement data is more tightly packed.

Last, want to give a fair warning when looking at these types of charts: The amount of the data will generally affect the height and patterns of the lines and bars. Do not be caught off guard if, for example, the Arithmetic Mean average is always above your middle range. This is a function of the amount of data.