Cary Millsap: January 2011

Friday, January 21, 2011

Describing Performance Improvements (Beware of Ratios)

Recently, I received into my Spam folder an ad claiming that a product could “...improve performance 1000%.” Claims in that format have bugged me for a long time, at least as far back as the 1990s, when some of the most popular Oracle “tips & techniques” books of the era used that format a lot to state claims.

Beware of claims worded like that.

Whenever I see “...improve performance 1000%,” I have to do extra work to decode what the author has encoded in his tidy numerical package with a percent-sign bow. The two performance improvement formulas that make sense to me are these:

Improvement = (b – a)/b, where b is the response time of the task before repair, and a is the response time of the task after repair. This formula expresses the proportion (or percentage, if you multiply by 100%) of the original response time that you have eliminated. It can’t be bigger than 1 (or 100%) without invoking reverse time travel.
Improvement = b/a, where b and a are defined exactly as above. This formula expresses how many times faster the after response time is than the before one.

Since 1000% is bigger than 100%, it can’t have been calculated using formula #1. I assume, then, that when someone says “...improve performance 1000%,” he means that b/a = 10, which, expressed as a percentage, is 1000%. What I really want to know, though, is what were b and a? Were they 1000 and 1? 1 and .001? 6 and .4? (...In which case, I would have to search for a new formula #3.) Why won’t you tell me?

Any time you see a ‘%’ character, beware: you’re looking at a ratio. The principal benefit of ratios is also their biggest flaw. A ratio conceals its denominator. That, of course, is exactly what ratios are meant to do—it’s called normalization—but it’s not always good to normalize. Here’s an example. Imagine two SQL queries A and B that return the exact same result set. What’s better: query A, with a 90% hit ratio on the database buffer cache? or query B, with a 99% hit ratio?

Query	Cache hit ratio
A	90%
B	99%

As tempting as it might be to choose the query with the higher cache hit ratio, the correct answer is...

There’s not enough information given in the problem to answer. It could be either A or B, depending on information that has not yet been revealed.

Here’s why. Consider the two distinct situations listed below. Each situation matches the problem statement. For situation 1, the answer is: query B is better. But for situation 2, the answer is: query A is better, because it does far less overall work. Without knowing more about the situation than just the ratio, you can’t answer the question.

Situation 1
Query	Cache lookups	Cache hits	Cache hit ratio
A	100	90	90%
B	100	99	99%

Situation 2
Query	Cache lookups	Cache hits	Cache hit ratio
A	10	9	90%
B	100	99	99%

Because a ratio hides its denominator, it’s insufficient for explaining your performance results to people (unless your aim is intentionally to hide information, which I’ll suggest is not a sustainable success strategy). It is still useful to show a normalized measure of your result, and a ratio is good for that. I didn’t say you shouldn’t use them. I just said they’re insufficient. You need something more.

The best way to think clearly about performance improvements is with the ratio as a parenthetical additional interesting bit of information, as in:

I improved response time of T from 10s to .1s (99% reduction).
I improved throughput of T from 42t/s to 420t/s (10-fold increase).

There are three critical pieces of information you need to include here: the before measurement (b), the after measurement (a), and the name of the task (here, T) that you made faster. I’ve talked about b and a before, but this I’ve slipped this T thing in on you all of a sudden, haven’t I!

Even authors who give you b and a have a nasty habit of leaving off the T, which is far worse even than leaving off the before and after numbers, because it implies that using their magic has improved the performance of every task on the system by exactly the same proportion (either p% or n-fold), which is almost never true. That is because it’s rare for any two tasks on a given system to have “similar” response time profiles (defining similar in the proportional sense). For example, imagine the following quite dissimilar two profiles:

Task A
Response time	Resource
100%	Total
90%	CPU
10%	Disk I/O

Task B
Response time	Resource
100%	Total
90%	Disk I/O
10%	CPU

No single component upgrade can have equal performance improvement effects upon both these tasks. Making CPU processing 2× faster will speed up task A by 45% and task B by 5%. Likewise, making Disk I/O processing 10× faster will speed up task A by 9% and task B by 80%.

For a vendor to claim any noticeable, homogeneous improvement across the board on any computer system containing tasks A and B would be an outright lie.

Friday, January 14, 2011

An Axiomatic Approach to Algebra and Other Aspects of Life

Not many days pass that I don’t think a time or two about James R. Harkey. Mr. Harkey was my high school mathematics teacher. He taught me algebra, geometry, analytic geometry, trigonometry, and calculus. What I learned from Mr. Harkey influences—to this day—how I write, how I teach, how I plead a case, how I troubleshoot, .... These are the skills I’ve used to earn everything I own.

Prior to Mr. Harkey’s algebra class, algebra for me just was a morass of tricks to memorize: “Take the constant to the other side...”; “Cancel the common factors...”; “Flip the fraction and multiply...” I could practice for a while and then solve problems just like the ones I had been practicing, by applying memorized transformations to superficial patterns that I recognized, but I didn’t understand what I had been taught to do. Without continual practice, the rules I had memorized would evaporate, and then once more I’d be able to solve only those problems for which I could intuit the answer: “7x + 6 = 20” would have been easy, but “7/x – 6 = 20” would have stumped me. This made, for example, studying for final exams quite difficult.

On the first day of Mr. Harkey’s class, he gave us his rules. First, his strict rules of conduct in the classroom lived up to his quite sinister reputation, which was important. Our studies began with a single 8.5" × 14" sheet of paper that apparently he asked us to label “Properties A” (because that’s what I wrote in the upper right-hand corner; and yes, I still have it). He told us that we could consult this sheet of paper on every homework assignment and every exam he’d give. And here’s how we were to use it: every problem would be executed one step at a time; every step would be written down; and beside every step we would write the name of the rule from Properties A that we invoked to perform that step.

You can still hear us now: Holy cow, that’s going to be a lot of extra work.

Well, that’s how it was going to be. Here’s what each homework and test problem had to look like:

The first few days of class, we spent time reviewing every single item on Properties A. Mr. Harkey made sure we all agreed that each axiom and property was true before we moved on to the real work. He was filling our toolbox.

And then we worked problem after problem after problem.

Throughout the year, we did get to shift gears a few times. Not every ax + b = c problem required fourteen steps all year long. After some sequence of accomplishments (I don’t remember what it was—maybe some set number of ‘A’ grades on homework?), I remember being allowed to write the number of the rule instead of the whole name. (When did you first learn about foreign keys? ☺) Some accomplishments after that, we’d be allowed to combine steps like 3, 4 and 5 into one. But we had to demonstrate a pattern of consistent mastery to earn a privilege like that.

Mr. Harkey taught algebra as most teachers teach geometry or predicate logic. Every problem was a proof, documented one logical step at a time. In Mr. Harkey’s algebra class, your “answer” to a homework problem or test question wasn’t the number that x equals, it was the whole proof of how you arrived at the value of x in your answer. Mr. Harkey wasn’t interested in grading your answers. He was going to grade how you got your answers.

The result? After a whole semester of this, I understood algebra, and I mean thoroughly. You couldn’t make a good grade in Mr. Harkey’s algebra class without creating an intimate comprehension of why algebra works the way it does. Learning that way supplies you for a whole lifetime: I still understand it. I can make dimensioned drawings of the things I’m going to build in my shop. I can calculate the tax implications of my business decisions. I can predict the response time behavior of computer software. I can even help my children with their algebra. Nothing about algebra scares me, because I still understand all the rules.

When I help my boys with their homework, I make them use Mr. Harkey’s axiomatic approach with my own Properties A that I made for them. (I rearranged Mr. Harkey’s rules to better illuminate the symmetries among them. If Mr. Harkey had been handy with the laptop computer, which didn’t exist when I was in school, I imagine he’d have done the same thing.)

Invariably, when my one of boys misses a math problem, it’s for the same stupid reason that I make mistakes in my shop or at work. It’s because he’s tried to do steps in his head instead of writing them all down, and of course he’s accidentally integrated an assumption into his work that’s not true. When you don’t have a neat and orderly audit trail to debug, the only way you can fix your work is to start over, which takes more time (which itself increases frustration levels and degrades learning) and which bypasses perhaps the most important technical skill in all of Life today: the ability to troubleshoot.

Theory: Redoing an n-step math problem instead of learning how to propagate a correction to an error made in step n – k through step n is how we get to a society in which our support analysts know only two solutions to any problem: (a) reboot, and (b) reinstall.

It’s difficult to teach people the value of mastering the basics. It’s difficult enough with children, and it’s even worse with adults, but great teachers and great coaches understand how important it is. I’m grateful to have met my share, and I love meeting new ones. Actually, I believe my 11-year old son has a baseball practice with one tomorrow. We’ll have to check his blog in about 30 years.

Thursday, January 13, 2011

New paper "Mastering Performance with Extended SQL Trace"

Happy New Year.

It’s been a busy few weeks. I finally have something tangible to show for it: “Mastering Performance with Extended SQL Trace” is the new paper I’ve written for this year’s RMOUG conference. Think of it a 15-page update to chapter 5 of Optimizing Oracle Performance.

There’s lots of new detail in there. Some highlights:

How to enable and disable traces, even in un-cooperative applications.
How to instrument your application so that tracing the right code path during production operation of your application becomes dead simple.
How to make that instrumentation highly scalable (think 100,000+ tps).
How timestamps since 10.2 allow you to know your recursive call relationships without guessing.
How to create response time profiles for calls and groups of calls, with examples.
Why you don’t want to be on Oracle 11g prior to 11.2.0.2.0.

I hope you’ll be able to make productive use of it.