A Spectrophotometric Romance

Assessing inter-instrument agreement

This is a romantic love story. The usual … boy spectrophotometer meets girl spectrophotometer. Sparks fly, and naturally, they fall in love.

I haven’t cast the parts yet, but I am thinking Jennifer Aniston could play the female lead. My wife would like to see Antonio Banderas as the male spectro. But like all romantic comedies, there has to be a conflict.

It isn’t long before the two ill-fated spectros start to disagree. It is obvious to everyone in the theater that the two should be making little ΔEs together, but sigh! One man’s teal is another woman’s aquamarine.

The disagreement scene of the movie is quite familiar to me. As a guy, I see this scene with my little chromophiliac wife on a daily basis. Is the blouse on that cute young lady teal, aquamarine, cyan or turquoise? No matter what I say, I know there will be an argument.

As a mathematician, I can readily calculate that my odds of winning this argument are no better than 0 in n, where n is a really big number. I mean, really big. Like so close to infinite that you can taste it.

That’s what I know as a mathematician, but as a color scientist with an ego the size of the planet Jupiter, it’s hard for me to just let this go. I should know my colors, right?!?!?!??

So I can relate, as can any married male applied mathematician color scientist.

Agreement Among Spectrophotometers

One would think that two spectrophotometers, given the proper care and feeding, would always agree “pretty well”. They are in love, so of course they should. These are the expectations. IFRA published a report [1] to quantify this expectation:

Inter-instrument agreement is usually indicated by a color difference value between two instruments or between a master instrument and the average of a group of production instruments. Although various ways are used to describe this color difference, a common value is the average or mean value for a series of 12 British Ceramic Research Association (BCRA) Ceramic Colour Standards Series II (CCS II) ceramic tiles. A value of 0.3 ΔEab is acceptable.
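For readers who want to see the arithmetic behind that number, here is a minimal sketch. It assumes that ΔEab means the CIE 1976 color difference (the straight-line distance between two L*a*b* triples) and that "average" means a plain mean over the tiles. The function names and the tile readings are made up for illustration; they are not from the IFRA report.

```python
import math

def delta_e_ab(lab1, lab2):
    """CIE 1976 color difference between two (L*, a*, b*) triples."""
    return math.dist(lab1, lab2)

def mean_delta_e(readings_a, readings_b):
    """Mean color difference over paired readings of the same samples."""
    diffs = [delta_e_ab(p, q) for p, q in zip(readings_a, readings_b)]
    return sum(diffs) / len(diffs)

# Hypothetical readings of three tiles on two instruments (made-up numbers).
instrument_a = [(50.0, 10.0, -20.0), (70.0, -5.0, 30.0), (30.0, 0.0, 0.0)]
instrument_b = [(50.3, 10.0, -20.4), (70.0, -5.0, 30.0), (30.0, 0.6, 0.8)]

agreement = mean_delta_e(instrument_a, instrument_b)
print(f"mean color difference: {agreement:.2f}")
print("within 0.3" if agreement <= 0.3 else "outside 0.3")
```

A real assessment would use all 12 tiles and whatever illuminant and observer the two instruments are set to report.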

How much do our own hapless lovers disagree? I did a little research. I went digging for technical papers and reports where others had brought spectros together to see how much they agreed, to assess inter-instrument agreement.

What to Do?

For an answer, I turn to a couple of smart color science kinda guys, Danny Rich [9] and Harold Van Aken [10]. Danny’s smart-guy-ness is indisputable, since he just received the Robert F. Reed Technology Medal, recognizing outstanding engineers, scientists, inventors and researchers in the graphic communications industry.

The idea put forth by these two really smart guys is that at least some of the discrepancies between spectrophotometers are due to understandable and predictable phenomena. If the understandable phenomena can be quantified, they can be corrected.

Here is where the BCRA tiles show up in the movie. They attempt to reconcile the two spectros using a perfectly rational, time-tested approach. Who says romantic comedies are predictable? If I have any say in the casting for this movie, I would have George Clooney play the part of the BCRA tiles. I am thinking that he would play a therapist.

George “BCRA” Clooney does all the usual psychotherapeutic stuff, and there appears to be some agreement on the difference between beige and taupe. But, alas, the improved relations falter and once again the couple are disagreeing, in some cases louder than before. This is a totally unexpected turn of events in a romantic comedy, right?

Results from an actual experiment

To keep this from being just a mildly entertaining bunch of technodrivel, I summarize some results from a paper that I presented at the spring TAGA conference [11]. For this research, I collected a collection of spectrophotometers and assembled an assemblage of sample sets. All the sample sets were measured with all the spectros.

I then used variants on the methods described by Rich and Van Aken to adjust the measurements from one spectrophotometer to match those of another. This is commonly called “calibration,” but I prefer the more hoity-toity word “standardization.” Technically, calibration is what happens in the factory that makes spectros. Standardization is what happens out in the field where they are used.

The table below shows one example of what happens when the BCRA tiles are used to calibrate the standardization. The first column, labeled “BCRA,” shows the before and after agreement between two instruments, as measured on the BCRA tiles. The median value of the color differences was basically unchanged, going from 0.35 ΔE to 0.41 ΔE. The 90th percentiles, on the other hand, have been nearly cut in half, from 1.84 ΔE down to 0.95 ΔE.
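The two summary numbers in each cell of the table can be computed along these lines. This is a sketch, not the code behind the TAGA paper: the percentile rule (linear interpolation between sorted values) is my assumption, since there are several conventions, and the list of color differences is invented.

```python
import statistics

def percentile(values, pct):
    """Percentile by linear interpolation between sorted values."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * pct / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])

def summarize(delta_es):
    """Median (typical sample) and 90th percentile (worst offenders)."""
    return {
        "median": statistics.median(delta_es),
        "p90": percentile(delta_es, 90),
    }

# Hypothetical per-sample color differences between two instruments.
example = [0.2, 0.3, 0.35, 0.4, 0.5, 0.6, 0.9, 1.2, 1.6, 2.0, 2.4]
print(summarize(example))
```

The median says how a typical sample fares; the 90th percentile says how bad the tail gets, which is usually what causes the arguments.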

Based on that evidence and that evidence alone, I would conclude that standardizing with the BCRA tiles will have little effect on most samples, but will really rein in some of the worst offenders. This would be good, but …

What I have described violates the fourth rule of good scientific practice. One should never use the same data set to calibrate and to assess the calibration.

From a practical standpoint, this sort of assessment only tells you how well the two instruments will agree on the color of BCRA tiles. I’m gonna guess that the percentage of spectrophotometer users in the graphic arts who are primarily interested in measuring BCRA tiles is not terribly large.
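The rule is easier to respect once it is written down as code. The sketch below fits a correction on one set of samples and scores it on a different set. To be clear about the assumptions: the gain-and-offset model is a deliberately simple stand-in, not the Rich and Van Aken method, and every number is invented.

```python
def fit_gain_offset(x, y):
    """Least-squares fit of y ≈ gain * x + offset."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    gain = sxy / sxx
    return gain, mean_y - gain * mean_x

def mean_abs_error(x, y, gain=1.0, offset=0.0):
    """Mean absolute disagreement after applying gain and offset to x."""
    return sum(abs(gain * xi + offset - yi) for xi, yi in zip(x, y)) / len(x)

# Hypothetical reflectances at one wavelength, instrument A versus B.
# The fitting set plays the role of the tiles.
fit_a = [0.10, 0.30, 0.50, 0.70, 0.90]
fit_b = [0.12, 0.33, 0.53, 0.74, 0.96]
# The held-out set plays the role of the samples you actually care about.
held_a = [0.20, 0.40, 0.60, 0.80]
held_b = [0.21, 0.43, 0.62, 0.85]

gain, offset = fit_gain_offset(fit_a, fit_b)
print("fitting set, before:", mean_abs_error(fit_a, fit_b))
print("fitting set, after: ", mean_abs_error(fit_a, fit_b, gain, offset))
print("held-out,    before:", mean_abs_error(held_a, held_b))
print("held-out,    after: ", mean_abs_error(held_a, held_b, gain, offset))
```

Only the last two lines say anything about how the correction will behave on samples it has not seen.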

The next two columns (Pantone primaries and Pantone ramps) are perhaps a bit more indicative of the effectiveness of standardization.

For these two columns, the BCRA tiles were used to calibrate, but the assessment of agreement was based on sets of samples from a Pantone book.

Comparing the before and after rows, I see that some numbers are up and some are down, but overall there is little change. Based on these two results I would say, “Why bother?”

The final column shows the result of BCRA-based standardization on a set of Behr paint samples. The median color disagreement was unchanged, but the 90th percentile was made significantly worse. This demonstrates that, in certain cases, standardization is deleterious.

As the great color scientist and poet Robert Burns wrote, “The best laid standardizations of mice and men oft go awry.” See my blog post [12] for further explanation of the foibles of regression.

Back to the movie

So. George Clooney (who was favored because he is the analytical psychotherapist and because he is just a darn sexy guy) failed. Now we get the unexpected twist. Note that all romantic comedies have an unexpected twist. It’s just expected. Enter Owen Wilson, dufus extraordinaire.

Owen plays “Behr,” an unkempt ne’er do well with little or no couth. In his normal inept way, he proves himself to be fully ept in getting the pitiable instruments together. The tie-in to the serious side of this blog is a set of matte paint samples. I walked into a Home Depot. Please don’t let them know, but I was just pretending to be buying paint.

I looked at the Behr paint samples and selected a set based on being cute.

My wife thinks Owen Wilson is cute. I just don’t see it.

The table on the next page shows the results from using the Behr paint samples to standardize the instruments. Note that the 90th percentiles are all more better, and most are way more better. So thanks to the help of Owen “Behr” Wilson, they lived happily ever after.

Scientific conclusions

OK, now for the serious part. Before Home Depot has a run on samples of the pretty color set, let me say that I did not choose the set scientifically. I know that it is uncharacteristic of me, but I was actually telling the truth when I said that I picked them out because they were pretty.

I don’t see the Behr paint samples as being an adequate replacement for the BCRA tiles. They are definitely not durable or cleanable, and there is no data available on lightfastness or sensitivity of the color to changes in temperature.

I am not sure just yet about why the paint samples worked better than the BCRA tiles. Here are some possible reasons:

1 The set of paint samples has a richer set of spectral transitions — both positive and negative derivatives with respect to wavelength. This means the paint samples would be better at diagnosing wavelength shift if there is any.

2 The set of BCRA tiles is very glossy compared with almost all print samples. Thus, if there is a small but significant difference between the instruments in terms of geometry, the BCRA tiles would be blind to that.

3 The paint samples are a good sampling of reflectance values at pretty much every wavelength. If there is an issue with nonlinearity between the two instruments, this would be important.

4 Two of the BCRA tiles have a fair amount of lateral diffusion, which is to say, light travels laterally in the tile. By and large, printing on opaque substrate has less of this effect. If there is a difference between instruments in terms of the area illuminated and the areas measured, then measurements of the BCRA tiles might not translate to the other samples.

5 There are more samples in the set of paint samples. If measurement noise is an appreciable part of the disagreement, then more is better.
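Reason 1 can be made concrete with a first-order sketch. If instrument B reads each spectrum as though its wavelength scale were offset by a small amount δ, then B minus A is roughly δ times the slope dR/dλ, so δ can be estimated by least squares wherever the slope is not zero. This is my own illustration under that assumption, with hypothetical function names and an invented spectrum; it is not the model from the TAGA paper.

```python
def derivative(spectrum, step_nm):
    """Central-difference dR/dλ, one-sided at the two ends."""
    n = len(spectrum)
    out = []
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        out.append((spectrum[hi] - spectrum[lo]) / ((hi - lo) * step_nm))
    return out

def estimate_shift(spec_a, spec_b, step_nm):
    """Least-squares δ (in nm) for spec_b - spec_a ≈ δ * dR/dλ.

    Returns None when the spectrum is flat, because a flat spectrum
    carries no information about wavelength shift.
    """
    slope = derivative(spec_a, step_nm)
    den = sum(s * s for s in slope)
    if den == 0:
        return None
    num = sum(s * (b - a) for s, a, b in zip(slope, spec_a, spec_b))
    return num / den

# Hypothetical ramp spectrum, 400 to 700 nm in 10 nm steps.
ramp_a = [0.001 * (400 + 10 * i) for i in range(31)]
# Instrument B sees the same ramp with a 2 nm offset (made-up value).
ramp_b = [r + 0.001 * 2 for r in ramp_a]

print(estimate_shift(ramp_a, ramp_b, 10.0))
print(estimate_shift([0.5] * 31, [0.5] * 31, 10.0))
```

A sample set with many rising and falling edges gives a large denominator and a well-determined δ; a set of mostly flat spectra gives almost nothing to work with.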

Practical conclusions

It is unlikely that two spectrophotometers of different design will agree to better than 1 ΔE. Standardization can help, but should be approached cautiously.

A little healthy skepticism is in order. Test the standardization on your own samples.

The issue of inter-instrument agreement is being addressed. IDEAlliance (in the U.S.) and Fogra (in Europe) have assembled groups of experts from all the graphic arts spectrophotometer manufacturers to work on improving agreement.

I have been impressed by the level of commitment and the sincerity of the individuals involved, so I am cautiously optimistic about the eventual results.

The U.S. group noted that the different manufacturers traced their white calibration back to different national standards labs.

This has a surprisingly large effect on measurements. In the U.S. meeting, a single set of patches was measured with all the assembled instruments.

There was modest improvement when all the readings were calibrated to the same white level. Thus, one relatively simple step that can be taken is for all spectro manufacturers to calibrate to the same national standards lab.

The work represented in the TAGA paper does not imply that standardization of one instrument to another is hopeless; merely that it cannot be expected to work if indiscriminately applied.

For the printing industry, it would appear that the BCRA tiles as they currently stand are not the ideal choice. Other industries have managed to find means for standardizing instruments, and work is underway to make that happen in this industry.

This article was adapted from a recent post to the John the Math Guy blog.
