Philosophical Reflection on Scoring


Renowned American philosopher Alvin Plantinga once said that philosophical reflection is really not much different from just thinking really hard about something. As a group, we have been doing some serious thinking and debating about how best to score wines during our tastings. The discussion was largely brought about by RH’s modification of his scoring system. Assuming (from general impressions based on his own tasting) that an 88 in Wine Spectator’s (WS) 100 point system was an average wine, RH wanted to be able to correlate his scores on the 10 point scale with the WS 100 point scale. So he made 4 out of 10 average and created the following formula: 2×(RH’s score)+80. Thus, a 4 would yield an 88 (2×4+80), an 8 a 96, and so on.
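For the arithmetically inclined, RH’s conversion can be sketched as a one-line function (the function name is mine, not RH’s):

```python
def rh_to_ws(rh_score):
    """Convert RH's 10-point score to a WS-style 100-point score.

    RH's formula: WS = 2 * (RH score) + 80,
    so a 4 ("average") maps to 88 and an 8 maps to 96.
    """
    return 2 * rh_score + 80

print(rh_to_ws(4))  # 88
print(rh_to_ws(8))  # 96
```

Note that the formula only reaches the 80–100 band of the 100 point scale, which is consistent with the observation below that the critics effectively use a compressed scale themselves.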

Now we have all been working on our own scores, trying to move from a ‘Gestalt’ system to breaking the score into categories such as aroma, mouthfeel, finish, flavors by mouth, or whatever. Working with a 10 point system can be difficult. But it isn’t really that different from a 100 point system. And even the 100 point system isn’t really a 100 point system. WS and Parker never give anything below 50, and rarely 50. Essentially they are working on a 50 point system.

Whatever the system, we all would like to have some idea of the accuracy of any particular rating. But first, is there such a thing as great wine? Or is wine simply just what you like? All of us in the group believe the answer is yes to both questions. From the variation in our scores one can make a strong case that wine quality is merely a subjective judgment. But isn’t there an ideal out there? If quality were purely subjective, then if the entire wine drinking population tasted and rated all wines, we would expect all wines to receive the same average score. Although the entire wine drinking population has not tasted and rated all wines, there are average scores determined for certain wines from several different groups of people that year in and year out score higher than other wines. Though not an airtight line of reasoning, I believe this at the very least allows us to unapologetically argue about the merits and quality of wines, and yes, even resort to name calling when people are clearly wrong about a wine’s virtues.

The following is a summary of philosophical reflection on RH’s scoring that touches on some important themes regarding wine rating. First it was duly noted that “we’ve always known RH was full of sh$#% - but I didn’t think we cared!” JR goes on to point out that “some obvious considerations are internal consistency (do you rate the same wine similarly each time?), use of the range (it’s not a 10 pt range if you only give 4, 5 or 6) and preference vs. “quality” (or even someone else’s preference - a production tasting might be geared towards picking the wine the target consumer will like, or picking the wine the winery owner will like, even if that’s not the blend you actually bottle!). Unless instructed differently, I typically rate wines based on my preference - I don’t expect that this tracks with a particular critic for a particular wine although perhaps it would overall. I think that quality is a real phenomenon and, generally, better wines will be rated higher by most people most of the time. Of course any one person can disagree on any one wine. There will be trends to these differences of opinion and these trends reflect our individual tastes. For instance, maybe someone likes a little brett, maybe someone considers varietal typicity important. It is not our goal to agree on each wine - with each other or the critics. Instead, each of us should have confidence in our own rating system. And, if we are to use the group total as some sort of metric, we should all be using the whole scale similarly; otherwise the person who gives more 1’s and 9’s has a disproportionate influence over the 4-5-6’er.” Internal consistency is important, and JR ended this missive by suggesting we start inserting wines we have already tasted so we can examine whether or not we score them similarly each time.
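JR’s point about group totals can be made concrete with a toy example. Suppose two tasters rate three wines (the wines and scores here are hypothetical): one taster clusters in the 4-6 band, the other uses the whole 1-9 range. When the scores are summed, the group ranking simply reproduces the wide scorer’s preferences, even though the narrow scorer’s order was exactly the reverse:

```python
# Toy illustration of scale-usage skew: a taster who uses the whole scale
# dominates the group total over one who clusters around the middle.
wines = ["A", "B", "C"]
narrow = {"A": 6, "B": 5, "C": 4}   # prefers A, but only spans 4-6
wide   = {"A": 1, "B": 5, "C": 9}   # prefers C, and spans 1-9

totals = {w: narrow[w] + wide[w] for w in wines}   # A: 7, B: 10, C: 13
ranked = sorted(wines, key=totals.get, reverse=True)
print(ranked)  # ['C', 'B', 'A'] -- the group order matches the wide scorer
```

The narrow scorer’s vote for A is completely overturned, which is exactly why JR argues we should all use the scale similarly if the group total is to mean anything.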

RH continued the discussion: “…nice to know JR thinks I’m full of shit…At least he didn’t go so far as to call me a ‘Euro-imitator’.” In response to an earlier comment of mine that it looked as though RH was basing his system on the assumption that the WS and Parker palates are correct, RH says: “I’m not predicating my system on the assumption that Parker and Spectator are right, but rather because I believe it is useful to be able to easily convert my scores to a 100 point scale. I think there is a lot of value in correlating my scores with the critics. If I find I often agree with one critic more than another it will help me in purchasing decisions. But most importantly - we can argue over whether the critics are right or wrong until the cows come home but that doesn’t change the fact that they wield enormous power and all of our wines would be selling a lot better if Laube hadn’t trashed so many of them. I want to make wines that I like, but just because I like it doesn’t mean the critics will; if I have a good sense of how my palate tracks with a particular writer that will probably impact my decision of which critics to send wines to.” These are excellent points, and in addition to knowing which writer you may track with, and which to send your wines to, you may also be able to tell when they are full of it. That is, if your score differs greatly from theirs, you may suspect that they (because of course it wouldn’t be you) have messed up (fatigue, bias, whatever).

RH goes on, as he is in the habit of doing: “I would argue that my system is probably more consistent than one that often gives out high scores. I think one of the reasons it tracks fairly well with the critics is that it is more of an absolute scale than some of you may be using. I suspect those of you who routinely give 8-10 pts may give the same wine lower scores when tasted in a flight of better wines. I’m not claiming that my system is immune to the influence of the context of the tasting, but given the general quality of the wines we have been tasting of late (fair to good/very good) I think my scores reflect this. JR argues it’s not a ten point scale if you only give 4-6 points, but that’s what the wines generally deserve. If we have a really bad flight and give everything a 1 it doesn’t mean I’m not using the whole scale.” I agree it may be true that “those of [us] who routinely give 8-10 pts may give the same wine lower scores when tasted in a flight of better wines.” But RH’s scale does not preclude him from this possibility (as he admits). And are we really routinely giving 8-10? I don’t think so. Rarely 9, very rarely 10, and sometimes 8. RH says himself that “the general quality of the wines we have been tasting of late [is] (fair to good/very good),” so why is an occasional 8 surprising? I think an 8 on a 10 point scale is very good. RH’s system is good for converting to a 100 point scale, but the fact remains that we use a 10 point scale at our tastings, so he is not in the same framework as the other tasters. But is that a problem, or does it simply represent the subjective nature of rating wine quality and of trying to get several people to agree about one wine? Interestingly, nothing has received over a 7 (rounding to the nearest 0.5) in the last three tastings, which is consistent with RH’s idea that we have been tasting good/very good wines. In fact, in the last two tastings, the most broadly appealing wine (not necessarily the highest scoring wine from one individual) has won. That is, perhaps we are rating the better wines higher the majority of the time (at least as a group).

Have you ever known a philosophical discussion that had an end? I’m not sure this one does either. Maybe, as JR has suggested, we should simply move to writing haikus for our ratings. I think it would still engender passionate discussion. I mean, come on: finishes like Scope? Clearly it was Listerine!

Pellegrini Vineyards North Fork Merlot 1999 (Long Island, NY)
Barbecued handbag
Carpet cleaner marinade
Finishes like Scope