Archive for May, 2006

Judgment of Paris: a technical review.

Friday, May 26th, 2006

While searching for something else, I found this summary of a statistical analysis of the original Paris Tasting in 1976 (my summary follows; go to the link for more technical information and tables). Apropos, the 30-year anniversary of this tasting recently supplied impetus to perform it again (with the original and new vintages). I don’t know much about Dennis Lindley, but he clearly understands statistics, and I have often wondered what a statistical analysis would do to the conclusions drawn regarding the ‘Judgment of Paris,’ especially because the difference between the means of the winning American wines and their second-place French counterparts was so small.

Let’s start simple: analyzing the means. For both Chardonnay and Cabernet, there were true differences among the wines, but there was no clear winner. Rather, there was a cluster at the top, in the middle, and a bottom dweller. Without getting into the technical reasons why Lindley slightly adjusts the means, I’ll report that he concludes “that the American Chardonnays did as well as the French…[and] if the French were expecting to give high marks to their own wines in comparison with those from Napa, they failed.” By simply looking at the means it seems clear that the Americans held their own with the Chardonnay but did not fare as well with the reds. Lindley summarizes his first analysis nicely: “The claim that the Americans won is presumably based on the fact that both the top wines were from the Napa valley. We will later see why this claim is probably fallacious.”

The next step in the analysis was to determine the probability that one wine was truly better than another. Let’s take the Cabernet flight as our example. The adjusted mean scores were as follows: Stag’s Leap Wine Cellars 1973 – 14.1, Château Mouton Rothschild 1970 – 14.1, Château Montrose 1970 – 13.6, Château Haut-Brion 1970 – 13.2, Ridge Monte Bello 1971 – 12.1, Château Léoville-Las-Cases 1971 – 11.2, Heitz “Martha’s Vineyard” 1970 – 10.4, Clos du Val 1972 – 10.1, Mayacamas 1971 – 9.8, Freemark Abbey 1969 – 9.6. The probability that SLWC was truly better (beyond chance) than the second-place French wine is only 52%, and therefore there is a 48% probability it is worse. Lindley summarizes: “It is not until SLWC is compared with Ridge, another American, that there is substantial probability of a real difference. Similar remarks apply to the whites. It can now be seen why the claim that, since a US wine was the best in each class, the Americans won, is doubtful.”
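The idea can be sketched with a quick Monte Carlo simulation. Note the caveats: the adjusted means are the ones quoted above, but the per-wine standard error (`se`) here is an assumed placeholder, not a value from Lindley’s paper, so this only illustrates the logic, not his actual calculation.

```python
import random

random.seed(0)

def prob_better(mean_a, mean_b, se=0.9, n=100_000):
    """Estimate P(wine A is truly better than wine B), assuming each
    adjusted mean carries independent normal uncertainty of size `se`
    (an assumed value for illustration only)."""
    wins = sum(
        random.gauss(mean_a, se) > random.gauss(mean_b, se)
        for _ in range(n)
    )
    return wins / n

# SLWC (14.1) vs. Mouton Rothschild (14.1): nearly identical means,
# so the probability hovers around 50% -- echoing Lindley's 52%.
p_mouton = prob_better(14.1, 14.1)

# SLWC (14.1) vs. Ridge (12.1): a 2-point gap, so the probability of a
# real difference is finally substantial.
p_ridge = prob_better(14.1, 12.1)

print(round(p_mouton, 2), round(p_ridge, 2))
```

With equal means the coin-flip result falls out immediately; only the two-point gap down to Ridge produces a convincing probability of a real difference, which is exactly Lindley’s point.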

Finally, Lindley looks at the precision of the scorers and the system as a whole (put on your thinking caps). One of the primary (and safe) assumptions is that a given score of a taster will be affected by the wine and by any bias of the taster. Additionally (and this is what he is trying to get at in the stats) the score will be affected by the fact that tasting is not a precise science. But how much does the imprecision impact the score? To use Lindley’s example: “consider taster #4 with Ridge Monte Bello. She gave it a score of 16. The mean for all Cabernets is 11.84. Ridge had a mean over all tasters of 12.14.” Thus the Ridge average was 0.30 higher than the average for all Cabernets. Taster #4 had a mean over all Cabernets of 13.90, or 2.06 higher than the average for all Cabernets. Stay with us: “So taking account of the wine and the taster involved, the score expected would be 11.84 [mean for all Cabs] + 0.30 [Ridge’s excess over Cab means] + 2.06 [taster #4 excess over Cab means] = 14.20.” Since the observed value of 16 is in excess of this by 1.80, Lindley believes it is a measure of the imprecision of tasting and scoring. In stats, it is called a residual. The magnitude of these residuals reveals the variability (or variance) in each taster’s judgments. No taster stood out as being exceptionally variable but the variances showed that the tasters found the wines difficult to judge.
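Lindley’s residual bookkeeping above is just a simple additive model, and can be reproduced directly from the numbers quoted in the post (only those quoted means are used here; the full 1976 score matrix is not reconstructed):

```python
# Residual for taster #4 scoring Ridge Monte Bello, per Lindley's example.
grand_mean = 11.84     # mean score over all Cabernets and all tasters
ridge_mean = 12.14     # Ridge Monte Bello's mean over all tasters
taster4_mean = 13.90   # taster #4's mean over all Cabernets
observed = 16.0        # taster #4's actual score for Ridge

wine_effect = ridge_mean - grand_mean       # Ridge's excess: 0.30
taster_effect = taster4_mean - grand_mean   # taster #4's excess: 2.06

expected = grand_mean + wine_effect + taster_effect  # 14.20
residual = observed - expected                       # 1.80

print(f"expected = {expected:.2f}, residual = {residual:.2f}")
```

The residual, 1.80, is the part of the score explained by neither the wine nor the taster’s bias, i.e., the imprecision of tasting itself.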

The mean imprecision value (1.8 in the example above) for all tasters was ~2.9 for both red and white. So what does this mean? “The near identity demonstrates that the tasters found the two sets of wines equally difficult to judge. Recall that 2.9 measures the lack of precision in scoring and is not affected by either the wine or the bias of the taster. A technical argument (based on the normal distribution) shows that about one third of the time a taster will be [off] by at least 2.9, in either direction, when giving a score…Thus when a wine is given a score of 12, it could easily [have been] 9 or 15, and one third of the time even larger, though a residual of twice 2.9, 5.8, is unlikely to be exceeded.” Taking all of this together, “a score of 12 only asserts a true score of a little more than 6, or a little less than 18.” The conclusion is that tasting wine is imprecise – big surprise, right? But this fact does demonstrate the need for several independent tasters to get an accurate score of quality. Lindley suggests that “to reduce that 2.9 to 1.0, so that the mean score is most likely not to be more than 1 out, requires 9 tasters. To be fairly sure of a discrepancy of no more than 1, requires about 33 [tasters].”
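Those taster counts follow from the standard error of a mean: averaging n independent tasters shrinks the 2.9 imprecision to 2.9/√n. A quick check (my interpretation of Lindley’s two thresholds, so the exact reading of “fairly sure” is an assumption):

```python
import math

sigma = 2.9  # residual standard deviation quoted above

def tasters_needed(target_se):
    """Smallest n such that sigma / sqrt(n) <= target_se."""
    return math.ceil((sigma / target_se) ** 2)

# "Most likely not more than 1 out": one standard error of 1 point.
n_likely = tasters_needed(1.0)   # (2.9/1.0)^2 = 8.41 -> 9 tasters

# "Fairly sure of no more than 1": read here as two standard errors
# within 1 point, i.e. a standard error of 0.5.
n_sure = tasters_needed(0.5)     # (2.9/0.5)^2 = 33.64 -> 34 tasters

print(n_likely, n_sure)
```

The first figure reproduces Lindley’s 9 tasters exactly, and the second lands at 34, matching his “about 33.”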

The analysis reveals that, despite the expertise of the participants in the 1976 Paris tasting, it is very difficult even for skilled people to judge wine. What does this tell us about the articles and scores handed to us (or bought by us) by one individual? Lindley suggests that “unless the author is much more skilled than the participants in this tasting, the opinions cannot be relied upon.” Perhaps they are much more skilled?

Tasting: Sauvignon Blanc (Sancerre and NZ)

Wednesday, May 24th, 2006

The weather has warmed and our mouths have begun to water for the thirst-quenching nectar that is white wine. Last week’s tasting was a tantalizing mix of Sauvignon blancs from Sancerre and New Zealand, with one ringer from Anderson Valley, CA. After recently reviewing an article detailing yeast contribution to the varietal flavor of Sauvignon blanc, I was eager to taste the magic of winemaking. The clear winner for this tasting was the ringer, Breggo 2005 Sauvignon blanc. The wine garnered the rare honor of receiving four scores of 8 from the six judges. Wonderfully tart, with a broad flavor profile and a lovely mouthfeel, this wine will be gone before the summer is up.

Characterizing Sancerre and NZ Savvy was not overtly difficult, and most people guessed correctly for the majority of the wines. The French wines were much more subtle and layered, with flavors of green apple, ‘flint’, and sweet spices compared to their uber-tropical NZ counterparts. If anything, the NZ Savvies were almost too tropical; that is to say, they tend to lean toward a singular character and lack a little complexity. Nevertheless, the tropical flavors are wonderful and certainly parallel the flavors of fruits other than grapes, like guava, lychee, and pineapple.

Breggo, Anderson Valley, CA, 2005, $22 (Avg: 7; 3 and 8). What more can I add? Four 8s is rare in our group. Nice flavor profile, very nice mouthfeel.

White Haven, New Zealand, 2005, $17 (Avg: 6.5; 5 and 7). Classic tropical Kiwi Savvy. Guava, mandarin, lychee and orange blossom. Some thought it was slightly sweet, but with good acid on the finish. I’d buy it again.

La Moussiere, Sancerre, Alphonse Mellot, 2004 (Avg: 6; 4 and 8). Honey and baked apple, well balanced, with a minerally finish of clove and cinnamon. Subtle, complex, and very nice. I’d buy this again too.

Sacred Hill, White Cliff, 2005, $12 (Avg: 6; 5 and 7 – low range, broad appeal).

Brancott, Marlborough, New Zealand, 2004 (Avg: 5.5; 4 and 7).

Domaine Girard, Sancerre, ‘La Garenne’, $17 (Avg: 4.5; 3 and 7).

Domaine Andre Vat (sp?), Sancerre, ‘Les Charmes’, 2004 (Avg: 4.5; 3 and 7).

Fruit Set

Tuesday, May 9th, 2006

As shoots continue to vault skyward, we turn our thoughts to the subject of fruit set. Certain varieties, for whatever reason, are prone to poor fruit set. Merlot is a good example because you can see both poor berry set (coulure) and “hens and chicks” (millerandage). In 2004 we had significant fruit-set problems in all our Merlot blocks. In fact, we harvested all the clusters with “shot” berries separately. We fermented 7 tons of this stuff and, although we never thought it would make the final blend, it did. The resulting wine had a very low pH, 3.37, but was wonderfully fruity, if simple. I digress. In 2005 our set was much improved. What happened? My initial suspicion is that the late rain of 2005 improved set, but in NorCal there is rarely a water-deficit problem as early as bloom. 2004 was marked by an early budbreak and warm weather, but with subsequent cooler weather during bloom.

Mark Greenspan offers a few possibilities in this piece for Wine Business.com. Molybdenum deficiency is an interesting possibility, but since we have had bad and then good set without adding any Mo, it is doubtful that is our issue. All this raises the question: is poor set bad? Well, I suppose that depends on your goals and how much it impacts your yield. If you believe lower yields improve your quality (which is debatable, depending on what level you are reducing your yields to and from), then why worry about set? But certainly you want to be sure you have product to sell as well. Finally, having a significant proportion of clusters with shot berries will impact your wine, and it may not be economical to harvest them separately as we have done. All in all, it seems it would be best to have even fruit set, controlling your yields with other mechanisms.

Babies and Wine

Thursday, May 4th, 2006

Sorry, no updates lately – our second son, Elias Timothy, was born this past Saturday. All is sleepless and well. Sparkling all around! I’ll be back online ASAP. We just had an interesting tasting of Ribera del Duero. Tell me, how do you taste terroir through Brettanomyces characteristics (medicinal, horsey, etc.)?