Thursday, June 18, 2009

Profiling with my Boy

We have an article online called "Can you explain Method R so even my boss could understand it?" Today I'm going to raise the stakes, because yesterday I think I explained Method R so that an eleven-year-old could understand it.

Yesterday I took my 11-year-old son Alex to lunch. I talked him into eating at one of my favorite restaurants, called Mercado Juarez, over in Irving, so it was a half hour in the car together, just getting over there. It was a big day for the two of us because we were very excited about the new June 17 iPhone OS 3.0 release. I told him about some of the things I've learned about it on the Internet over the past couple of weeks. One subject in particular that we were both interested in was performance. He likes not having to wait for click results just as much as I do.

According to Apple, the new iPhone OS 3.0 software has some important code paths in it that are 3× faster. Then, upgrading to the new iPhone 3G S hardware is supposed to yield yet another 3× performance improvement for some code paths. It's what Philip Schiller talks about at 1:42:00 in the WWDC 2009 keynote video. Very exciting.

Alex of course, like many of us, wants to interpret "3× faster" as "everything I do is going to be 3× faster." As in everything that took 10 seconds yesterday will take 3 seconds tomorrow. It's a nice dream. But it's not what seeing a benchmark run 3× faster means. So we talked about it.

I asked him to imagine that it takes me an hour to do my grocery shopping when I walk to the store. Should I buy a car? He said yes, probably, because a car is a lot faster than walking. So I asked him, what if the total time I spent walking to and from the grocery store was only one minute? Then, he said, having a car wouldn't make that much of a difference. He said you might want a car for other reasons, but he wouldn't recommend it just to make grocery shopping faster.

Good.

I said, what if grocery shopping were taking me five hours, and four of it was time spent walking? "Then you probably should get the car," he told me. "Or a bicycle."

Commit.

On the back of his menu (photo above), I drew him a sequence diagram (A) showing how he, running Safari on an iPhone 3G, might look to a performance analyst. I showed him how to read the sequence diagram (time runs top-down, control passes from one tier to another), and I showed him two extreme ways that his sequence diagram might turn out for a given experience. Maybe the majority of the time would be spent on the 3G network tier (B), or maybe the majority of the time would be spent on the Safari software tier (C). We talked about how if B were what was happening, then a 3× faster Safari tier wouldn't really make any difference. Apple wouldn't be lying if they said their software was 3× faster, but he really wouldn't notice a performance improvement. But if C were what was happening, then a 3× faster Safari tier would be a smoking hot upgrade that we'd be thrilled with.

Sequence diagrams, check. Commit.

Now, to profiles. So I drew up a simple profile for him, with 101 seconds of response time consumed by 100 seconds of software and 1 second of 3G (D):

Software    100
3G            1
---------------
Total       101

I asked him, if we made the software 2× faster, what would happen to the software's part of the response time? He wrote down "50" in a new column to the right of the "100." Yep. Then I asked him what would happen to total response time. He said to wait a minute, he needed to use the calculator on his iPod Touch. Huh? A few keystrokes later, he came up with a response time of 50.5.

Oops. Rollback.

He made the same mistake that just about every forty-year-old I've ever met makes. He figured if one component of response time were 2× faster, then the total response time must be 2× faster, too. Nope. In this case, the wrong answer was close to the right answer, but only because of the particular numbers I had chosen.
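Written out, the arithmetic for profile D looks like this. The symbols are introduced here only for illustration: R is the total response time, t is the time spent in the component being sped up, and k is the speedup factor.

```latex
\[
R_{\text{new}} = R - t + \frac{t}{k} = 101 - 100 + \frac{100}{2} = 51,
\qquad \text{whereas} \qquad
\frac{R}{k} = \frac{101}{2} = 50.5 .
\]
```

Only the software's 100 seconds get divided by 2; the 1 second of 3G stays put. The two answers land close together here only because t is nearly all of R.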

So, to illustrate, I drew up another profile (E):

Software      4
3G           10
---------------
Total        14

Now, if we were to make the software 2× faster, what happens to the total? We worked through it together:

Software     4     2
3G          10    10
--------------------
Total       14    12

Click. So then we spent the next several minutes doing little quizzes. If this is your profile, and we make this component X times faster, then what's the new response time going to be? Over and over, we did several more of these, some on paper (F), and others just talking.
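The quiz arithmetic can be sketched in a few lines of Python. This is only an illustration, assuming a profile is a plain dictionary of component names and seconds; the function names `speed_up` and `total` are made up for this example and are not part of any Method R tool.

```python
def speed_up(profile, component, factor):
    """Return a copy of `profile` in which `component` runs `factor` times faster."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    new_profile = dict(profile)  # leave the original profile untouched
    new_profile[component] = profile[component] / factor
    return new_profile


def total(profile):
    """Total response time is just the sum of the component times."""
    return sum(profile.values())


# Profile E from the menu: 4 seconds of software, 10 seconds of 3G.
profile_e = {"Software": 4, "3G": 10}
after = speed_up(profile_e, "Software", 2)

print(total(profile_e), "->", total(after))  # 14 -> 12.0
```

Only the component that was sped up changes; everything else in the profile carries over as-is, which is why the total drops from 14 to 12 rather than to 7.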

Commit.

Next step. "What if I told you it takes me an hour to get into my email at home? Do I need to upgrade my network connection?" A couple of minutes of conversation later, he figured out that he couldn't answer that question until he got some more information from me. Specifically, he had to ask me how much of that hour is presently being spent by the network. So we did this over and over a few times. I'd say things like, "It takes me an hour to run my report. Should I spend $4,800 tuning my SQL?" Or, "Clicking this button takes 20 seconds. Should I upgrade my storage area network?"

And, with just a little bit of practice, he learned that he had to say, "How much of the however-long-you-said is being spent by the whatever-it-was-you-said?" I was happy with how he answered, because it illustrated that he had caught onto the pattern. He realized that the specific blah-blah-blah proposed remedies I was asking him about didn't really matter. He had to ask the same question regardless. (He was answering me with a sentence using bind variables.)
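One way to see why that question has to come first: the time spent in a component puts a ceiling on what any remedy for that component can buy you. Here is a rough sketch, assuming hypothetical numbers for the hour-long email example (the function name `best_case_total` and the 1-minute and 55-minute network figures are invented for illustration).

```python
def best_case_total(total_time, time_in_component):
    """Response time if the component's time dropped all the way to zero.

    No remedy aimed at that component can do better than this.
    """
    if not 0 <= time_in_component <= total_time:
        raise ValueError("component time must be between 0 and the total")
    return total_time - time_in_component


# "It takes me an hour to get into my email. Should I upgrade my network?"
# The answer depends entirely on how much of the hour the network is using.
for network_minutes in (1, 55):  # two hypothetical cases
    best = best_case_total(60, network_minutes)
    print(f"network uses {network_minutes} min -> best possible total {best} min")
```

With 1 minute on the network, even a perfect upgrade leaves 59 minutes. With 55 minutes on the network, the upgrade might be worth a lot.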

Commit.

Alex hears me talk about our Method R Profiler software tool a lot, and he knows conceptually that it helps people make their systems faster, but he's never known in any real detail very much about what it does. So I told him that the profile tables are what our Profiler makes for people. To demonstrate how it does that, I drew him up a list of calls (G), which I told him was a list of visits between a disk and a CPU. ...Just a list that says the same thing that a sequence diagram (annotated with timings) would say:

D 2
C 1
D 6
D 4
D 8
C 3

I told him to make a profile for these calls, and he did (H):

Disk     20
CPU       4
-----------
Total    24

Excellent. So I explained that instead of adding up lists in our head all day, we wrote the Profiler to aggregate the call-by-call durations (from an Oracle extended SQL trace file) for you into a profile table that lets you answer the performance questions we had been practicing over lunch. ...Even if there are millions of lines to add up.
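The adding-up step Alex did by hand can be sketched as follows. This is a toy illustration only: it works on the six calls from the menu, it does not read a real trace file, and it is not how the Method R Profiler is implemented. The names `NAMES`, `calls`, and `make_profile` are invented for the example.

```python
from collections import defaultdict

# Map the one-letter codes in the call list to component names.
NAMES = {"D": "Disk", "C": "CPU"}

# The call list from the menu: (component code, duration).
calls = [("D", 2), ("C", 1), ("D", 6), ("D", 4), ("D", 8), ("C", 3)]


def make_profile(calls):
    """Aggregate call-by-call durations into (component, total) rows, biggest first."""
    totals = defaultdict(int)
    for code, duration in calls:
        totals[NAMES[code]] += duration
    return sorted(totals.items(), key=lambda row: row[1], reverse=True)


profile = make_profile(calls)
for name, duration in profile:
    print(f"{name:<8}{duration:>4}")
print("-" * 12)
print(f"{'Total':<8}{sum(duration for _, duration in profile):>4}")
```

Sorting the rows by descending duration puts the component that dominates response time at the top, which is the row you want to look at first.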

The finish-up conversation in the car ride back was about how to use everything we had talked about when you fix people's performance problems. I told him the most vital thing about helping someone solve a performance problem is to make sure that the operation (the business task) that you're analyzing right now is actually the most important business task to fix first. If you're looking at anything other than the most important task first, then you're asking for trouble.

I asked him to imagine that there are five important tasks that are too slow. Maybe every one of those things has its response time dominated by a different component than all the others. Maybe they're all the same. But if they're all different, then no single remedy you can perform is going to fix all five. A given remedy will have a different performance impact upon each of the five tasks, depending on how much of the fixed thing that task was using to begin with.

So the important thing is to know which of the five profiles it is that you ought to be paying attention to first. Maybe one remedy will fix all five tasks, maybe not. You just can't know until you look at the profiles. (Or until you try your proposed remedy. But trial-and-error is an awfully expensive way to find out.)
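To make that concrete, here is a small sketch of one remedy applied to several task profiles. Everything in it is hypothetical: the three tasks, their timings, and the function names `after_remedy` and `remedy_report` are invented for illustration (three tasks instead of five, just to keep it short).

```python
# Hypothetical profiles (seconds per component) for three slow tasks.
tasks = {
    "task 1": {"CPU": 2, "Disk": 18},
    "task 2": {"CPU": 15, "Disk": 1},
    "task 3": {"CPU": 1, "Network": 30},
}


def after_remedy(profile, component, factor):
    """Total response time if `component` becomes `factor` times faster."""
    return sum(
        seconds / factor if name == component else seconds
        for name, seconds in profile.items()
    )


def remedy_report(tasks, component, factor):
    """Map each task to (total before, total after) for a single remedy."""
    return {
        task: (sum(profile.values()), after_remedy(profile, component, factor))
        for task, profile in tasks.items()
    }


# Proposed remedy: make the disk 2x faster.
report = remedy_report(tasks, "Disk", 2)
for task, (before, after) in report.items():
    print(f"{task}: {before} -> {after}")
```

In this made-up case, the faster disk nearly halves task 1, barely touches task 2, and does nothing at all for task 3, which never used the disk in the first place.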

Commit.

It was a really good lunch. I'll look forward to taking my 9-year-old (Alex's little brother) out the week after next when I get back from ODTUG.

15 comments:

Jeff Hunter said...

We expect your 9yo to be able to discuss 3NF by the end of a Happy Meal.

Cary Millsap said...

:-)

Doug said...

... Or more to the point, the benefits and drawbacks of OFA in a non-standard environment.

See you at ODTUG!

Joel Garry said...

Explaining things so your kid or boss can understand them may be completely different than teaching things that are learned better in a mentoring role because they are difficult to verbalize.

In other words, for some things it's better to throw them in the deep end.

"We don't experience the physical world itself, but rather the end product of a long causal chain of processes, from the senses to the brain. Not only that, but that end product is so amazingly elaborate and functional that most people live out their lives never realizing they have only indirect contact with the world." - Jack Loomis

Domen said...

Nice dialog, is your kid 9 or 12 years old?

Cary Millsap said...

Domen: Eleven.

Harold Fowler said...

Actually sounds like a pretty good plan to me dude!

RT
www.anonymize.tk

mrjbluedevil said...

That's a great, if laborious, way to explain Amdahl's law to someone without the mathematical background to understand a simpler formulation :)

FF said...

You could also extend this to apply to non-technical, social issues, such as managing the national deficit. Our lawmakers (as well as many non-technical people) have a tendency to get really excited about line-items that sound outrageous but don't really make a big difference to the bottom line.

John said...

Now THAT is quality time! BTW, as an ex-Dallas resident I LOVE Mercado Juarez :o)

Bill Smith said...

I enjoyed your post. Thanks for sharing it. I'll use that someday when my 6-year-old is a little older.

Rog said...

At the beginning you say it's an 11-year-old and at the end a 9-year-old, so I suppose this is a way to teach "find the bottleneck" the easy way instead of a real story.

Good article btw.

Cary Millsap said...

Rog, my 11 year-old boy is the one with whom I had the discussion. My reference at the end was that I'll need to take his younger brother (age 9) to lunch for his turn later. —Cary

Andreas said...

I had a similar discussion with my son (back then he was 11). My customer had placed a couple of machines in a south-facing room with large windows - during an untypically hot summer. Obviously they (we) lost a couple of machines due to heat failure. Also the air-conditioners were placed in the same room but the windows were shut in the night - leaving them to pump air inside the room instead of out of the window.

He understood the concepts of thermic extraction and the impact of large glass structures (we live in the Netherlands where a lot of fruit and vegetables are grown in large greenhouses).

I suggested that my customer hire my son - as he would have prevented the loss of the machines. My customer could not laugh about it :-$

Cary Millsap said...

Andreas, the tail of your story reminds me of this. It's a lot easier to see the broader perspective when it's somebody else's problem.