Stale Bread Lunch

Literate and nerdy. By Michael James Boyle.

October, 2013

Coffee Science

Oct 2, 2013

Via Marco Arment, I encountered this interesting blog post from Dr. Bunsen (Seth Brown) on “Coffee Science.” His goal was to determine experimentally what among a few common differences in technique actually had a measurable effect on his friend’s enjoyment of the cup. Because how good your coffee tastes is such a fun subject, I think these experiments provide a nice little platform to think about how this kind of thing plays out when we try to write about scientific approaches to more serious subjects.

It’s always great to see someone attempt to measure things people usually just assume, but taking on the name “Science” also brings some risks. When we apply science to our everyday lives it’s hard not to take shortcuts that wouldn’t (or shouldn’t) fly doing it “for real.” That’s understandable. You’d think measuring such everyday things would be an easier task than the work that goes on in labs, but really it’s often much harder because the subjects are so much squishier and harder to control. No one is expecting a clinical trial here… But then again, there’s a reason one is required when the results really matter.

I should point out that Seth Brown’s about page states that he has a background in computational genomics. I have no doubt that he is well aware of everything I’m about to point out here and is a far more qualified statistician than I am with my less computational developmental biology background. And indeed he points out some of these issues himself. That said, let’s take a closer look at some of his results and use them to think about two things: the pitfall of expecting a big result and the danger in assumptions about what you are really testing.

First let’s look at his marquee result. Seth found that he could not measure a significant difference between using a burr grinder vs a blade grinder to prepare coffee for brewing with an Aeropress for his guests. This is set up to be a surprising conclusion because of a few assumptions:

  1. That a burr grinder’s purpose is to improve the taste of the brewed coffee.
  2. That a taster must be able to reliably state a preference in a blind test for a difference to count. (As opposed to, say, something that might arise out of a longer-running survey.)
  3. That any individual element in the brewing system should, on its own, be able to make for a striking difference in quality. (Versus, say, more complicated combinations of grinding and brewing methods.)

I’m not saying that those are bad or unreasonable assumptions. In particular, numbers 2 and 3 come out of a critical need to limit the scope of the experiment to something achievable. But it is important to be aware of them. What he found was that his guests stated a preference for coffee brewed from beans ground by the blade grinder over the burr grinder at about the rate of a coin flip. So if he can’t show that the burr grinder outperforms the blade grinder, a burr grinder is a useless expense, right? Here he falls into a frequent problem with presenting science to the general public. His result is a failure of his assay to tell a difference between the two conditions. As he goes on to state himself, he doesn’t have the power to detect subtle differences. This means we can’t be sure whether the result tells us something about the difference in grinders or something about the assay.

But before he gets to the details, he expresses this lack of result as follows: “Surprisingly, 13/24 or, ~ 54% of subjects actually preferred the blade grinder. This data suggested that blade grinders might actually produce better tasting coffee than burr grinders.” Seth knows this is a premature conclusion and his next step is to go on to perform the statistics. But if we were talking about, say, public health research, the article written about this research would be very likely to stop there. We have our result! Blade better than burr! Of course an educated reader should know better, but this demonstrates how difficult it can be to talk carefully about science.
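To put a number on how unremarkable 13 out of 24 is, here is a minimal sketch of an exact two-sided binomial test against a coin flip. It is only an illustration and not necessarily the analysis Seth ran. The function names are made up for this example, and the only data used is the 13/24 figure quoted above.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_binomial_p(k, n, p=0.5):
    """Exact two-sided p-value: total probability of every outcome
    that is no more likely than the observed one under the null."""
    observed = binomial_pmf(k, n, p)
    # The small tolerance guards against floating-point ties.
    return sum(
        binomial_pmf(i, n, p)
        for i in range(n + 1)
        if binomial_pmf(i, n, p) <= observed * (1 + 1e-9)
    )


# 13 of 24 tasters preferred the blade grinder (figure quoted in the post).
p_value = two_sided_binomial_p(13, 24)
print(f"two-sided p-value for 13/24 vs a coin flip: {p_value:.3f}")
```

Under these assumptions the p-value comes out around 0.84, which is about as far from “blade beats burr” as a result can get.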

What actually happened? He had a weak assay (because it would be a monumental task to produce a strong one) and one of his assumptions was that he’d see a large difference and that the difference would be one of coffee taste. If the benefit of a burr grinder turns out to be subtle or to be beneficial for other reasons (such as allowing different brews) he’d be unable to pick up on it, but because of the way he presents the data, this isn’t the message that a reader would be likely to take away. Though it most certainly isn’t what he’s trying to say, a naive reader might come to the conclusion that he is arguing that blade grinders are fundamentally better for coffee taste than burr grinders.
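How weak is “weak” here? A rough way to see it is to ask how often a 24-taster blind test would reach significance if tasters really did prefer one grinder at some fixed rate. The sketch below assumes a symmetric exact binomial test at the 5% level; the cutoff rule, the function names, and the list of “true” preference rates are all assumptions chosen for illustration, not anything taken from Seth’s analysis.

```python
from math import comb


def pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def rejection_threshold(n, alpha=0.05):
    """Smallest cutoff c such that rejecting when X >= c or X <= n - c
    has a false-positive rate of at most alpha under a fair coin."""
    for c in range(n // 2 + 1, n + 1):
        upper_tail = sum(pmf(k, n, 0.5) for k in range(c, n + 1))
        if 2 * upper_tail <= alpha:
            return c
    return None  # n is too small to ever reach significance


def power(n, true_p, alpha=0.05):
    """Chance the test rejects when the real preference rate is true_p."""
    c = rejection_threshold(n, alpha)
    if c is None:
        return 0.0
    return sum(pmf(k, n, true_p) for k in range(n + 1) if k >= c or k <= n - c)


# Hypothetical true preference rates for one grinder over the other.
for true_p in (0.55, 0.60, 0.70, 0.80):
    print(f"true preference {true_p:.0%}: power = {power(24, true_p):.2f}")
```

With only 24 tasters, a modest real preference would usually go undetected, which is the sense in which a null result here says as much about the assay as about the grinders.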

That’s a presentation issue and Seth’s audience is an informed one. What about the result? We can agree that in his tested brewing method, grinding via blade vs burr doesn’t make a mind-blowing difference. That conclusion is solid. But let’s take a closer look at the assay and see what else might have been going on. The assumption was that a burr grinder is going to produce better coffee grounds. I’d argue that’s not the primary purpose of a burr grinder. It’s really there to produce more consistent grinds. When tasting coffee, the most striking feature is the overall strength. Plenty of other features matter, but that’s the one out front. It’s a little like the loudness of music. One set of speakers might be better than another, but a quick listen will always tend to favor the louder of the two. Moving beyond that takes much more careful testing.

Coffee strength depends on many factors, but one of the most important is the size of the grounds. Smaller grounds give the water better access to the coffee (it’s a classic surface area to volume deal), so all other factors being equal, more coffee will end up in the water in a shorter period of time if the particles of coffee are smaller. By grinding with a burr grinder, you are likely to reliably hit a narrow range of particle sizes. Using a blade grinder will produce a much wider range, with some very fine powder and some coarser fragments. It will also be very difficult to get the same average grind size between repetitions.

So if we assume that the grind size produced by the burr grinder is the midpoint, the blade grinder would be pretty likely to fluctuate randomly between producing particles that are larger on average and particles that are smaller on average. Half the time the coffee produced by the blade grinder might come out stronger than the burr grinder’s and half the time weaker. Whether that is better or worse depends on the preference of the taster, but either way, what could easily be the most dominant factor in taste is going to flip randomly between the two cups. As long as strength is a more important factor than any other differences produced between the two, this would be enough to drive a 50% result right there, even if the less important factor would otherwise have been noticeable in the assay. So the conclusion might be that the difference between burr and blade grinders is very subtle… or it might be that it is simply more subtle than coffee strength, a rather expected result when you phrase it like that, given how important a factor strength is.
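This argument is easy to play with in a toy simulation. The sketch below is a deliberately crude model, and every number in it is invented: the burr cup sits at a fixed strength of zero with a small quality bonus, the blade cup’s strength wobbles randomly around zero, and each taster happens to like either stronger or weaker coffee. None of these parameters come from Seth’s data.

```python
import random


def blade_preference_rate(strength_spread, burr_bonus=0.1,
                          trials=100_000, seed=0):
    """Fraction of simulated tastings in which the blade cup wins.

    strength_spread: std. dev. of the blade cup's strength relative to
                     the burr cup (hypothetical units).
    burr_bonus:      hypothetical intrinsic quality edge of the burr cup,
                     in the same units.
    """
    rng = random.Random(seed)
    blade_wins = 0
    for _ in range(trials):
        # Each taster likes either stronger (+1) or weaker (-1) coffee.
        taste = rng.choice((1, -1))
        # The burr cup is the reference strength (0); the blade cup varies.
        blade_strength = rng.gauss(0.0, strength_spread)
        blade_score = taste * blade_strength
        burr_score = burr_bonus
        if blade_score > burr_score:
            blade_wins += 1
    return blade_wins / trials


# No strength variation: the burr cup's small edge decides every tasting.
print("spread 0.0:", blade_preference_rate(0.0))
# Strength variation much larger than the edge: close to a coin flip.
print("spread 1.0:", blade_preference_rate(1.0))
```

In this toy model the burr grinder’s real (if small) advantage is almost entirely masked once strength varies by much more than the size of that advantage, which is the scenario described above. Whether real tastings behave like this is exactly what the experiment cannot tell us.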

It’s hard to say whether this is what was really going on here or not. It’s quite possible that the differences in strength were too subtle to measure as well, or that the blade grinder always produced stronger or weaker coffee. But I thought this was a lovely example of how the results of an experiment can sometimes be telling you as much about what is going on in the assay and in your assumptions as they can about the thing you’re trying to test. So it’s important, even when writing for a lay audience, to talk about these factors and avoid the temptation to oversimplify them away, even though that’s usually what the general public wants.

So do I think Seth was wrong? Absolutely not! He concludes that you can make a fantastic cup of coffee with a blade grinder and an Aeropress. That you can. In fact, it’s one of the reasons an Aeropress is such a popular brewing method. It’s very flexible and very tolerant of different brewing situations. Unlike a French press, you don’t have to worry about the super fine grinds that may be produced as a part of the blade grinding process. And unlike drip brewing, you can (to varying extents depending on the technique you use) immerse the grounds for as long as you need, giving the larger particles the time they need to soak.

But I still think a burr grinder is a good investment for a coffee lover. Not all brewing methods are so forgiving. I’ve made halfway decent espresso with a blade grinder. It’s possible, but it is hit or miss at the very best and is never great. Even more importantly, I see a burr grinder much like a kitchen scale. You wouldn’t expect a kitchen scale to improve the quality of the coffee you weigh out. But it makes measuring out a specific amount of coffee much easier. Similarly a burr grinder lets you dial in a specific setting and just let it go, giving you just what you ordered every time. That may not directly lead to a striking difference in coffee taste, but it leads to a much better overall experience and certainly reduces the potential for making a bad cup of coffee because you screwed up.
