Back in the early 1800s, not long after the first analog signals were transmitted over wires, a tool was developed to measure the quality of those signals. The tool had a name. It was called “mean square error.”

Mean square error is a relatively simple mathematical formula. Remarkably, even though image and audio quality assessment has evolved by leaps and bounds in the roughly two centuries since, the mean square error quality assessment tool is still in use today.

And it is, even though it doesn’t work very well.

“Even many years ago there were complaints about this method because people felt that it didn’t reflect reality,” says Prof. Zhou Wang, SSIMWAVE co-founder and Chief Science Officer.

The problem with mean square error is that the formula has very little to do with human perception, with quality on the ground, as it were. It’s a purely numerical way to reflect differences in signal transmission. As a result, what the mean square error numbers say and what a person hears or sees often don’t square.
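A minimal sketch makes the point. Mean square error is just the average of squared sample differences, so two distortions a viewer would judge very differently can receive identical scores. The signals below are illustrative, not from any real system:

```python
def mean_square_error(original, received):
    """Average of squared sample differences: purely numerical,
    with no model of human perception."""
    assert len(original) == len(received)
    return sum((o - r) ** 2 for o, r in zip(original, received)) / len(original)

# Two distorted versions of the same signal can have identical MSE
# yet look very different to a viewer.
signal  = [10, 20, 30, 40]
shifted = [12, 22, 32, 42]   # uniform +2 offset (barely visible as a brightness shift)
noisy   = [10, 24, 30, 40]   # one sample off by 4 (a localized glitch)

print(mean_square_error(signal, shifted))  # 4.0
print(mean_square_error(signal, noisy))    # 4.0
```

Both distortions score 4.0, yet a uniform brightness shift and a localized glitch are not perceptually equivalent.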

“If humans are the final receiver of a signal, a human would look at the signal, and look at the image in front of them, and say this is good or this is bad, but this [assessment] may or may not have good correlation with the numerical mean square error finding you compute,” says Dr. Wang.

For researchers and scientists looking to optimize video delivery, it’s imperative that there be good correlation between the measurement tool in use and the actual images a viewer sees. Otherwise, their efforts at making improvements would be obstructed, even misdirected.

“Once I have a measure of error I can use this to optimize my system,” says Prof. Wang. “I need something to guide these processes, guide my design, guide my optimization. If you don’t have this error measurement how do you know one design is better than another? How do I know if I change something it will make the quality better or worse?”

“If I have two systems, which system is better? I compute the error for each and if the error is smaller in one, that’s the one I’m going to use and trust.”


“But if this error measurement is not correlating with what I want in the end, then either I’m picking the wrong thing or going in the wrong direction.”

Bad outcomes, obviously.

And so the hunt continued for other measurement tools.

In the 1970s, a metric was developed called “just-noticeable difference,” or JND.

As the name implies, JND is the smallest change to a visual signal that a viewer can detect, and it can be expressed as a mathematical model.
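One classical way to model such a threshold is Weber’s law: a change becomes noticeable once it exceeds a fixed fraction of the base intensity. The sketch below is illustrative only; the 2% Weber fraction is an assumed value, and real JND models for video are far more elaborate:

```python
def just_noticeable(base, changed, weber_fraction=0.02):
    # Weber's law: a difference is detectable once it exceeds a
    # constant fraction of the base stimulus intensity.
    # The 0.02 fraction here is an illustrative assumption.
    return abs(changed - base) >= weber_fraction * base

print(just_noticeable(100, 103))  # True: a 3% change, above threshold
print(just_noticeable(100, 101))  # False: a 1% change, below threshold
```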

“It’s very useful,” says Prof. Wang.

Useful, but as Wang and his SSIMWAVE co-founder, Abdul Rehman, would discover as they attempted to create new, more accurate tools for video quality analysis, there were limits to that usefulness. Once again, “there was a significant gap between this JND idea and the reality,” says Prof. Wang.

In other words, as with mean square error, what the measurement tool said and what the viewer saw didn’t necessarily match up.

There are other approaches to video quality analysis, including the “full-reference method.” Full-reference models compare the original video signal to the received video signal.

Then there is the “reduced-reference method.” With reduced reference, a compact set of features is extracted from the original video and compared with the received video to obtain a score.

Finally, there is the “no-reference method,” which attempts to assess the quality of a distorted video without reference to the original.
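The three families differ mainly in what information each has access to, which can be sketched by their input signatures. The toy scoring logic below is purely illustrative (MSE for full reference, a mean-only summary for reduced reference, a neighbour-difference statistic for no reference), not any production metric:

```python
def full_reference_score(original, received):
    # Full access to the pristine original: compare sample by sample (MSE here).
    return sum((o - r) ** 2 for o, r in zip(original, received)) / len(original)

def extract_features(frame):
    # Reduced reference: only a compact summary of the original travels with it.
    return {"mean": sum(frame) / len(frame), "peak": max(frame)}

def reduced_reference_score(original_features, received):
    # Compare the received signal against the transmitted feature summary.
    received_mean = sum(received) / len(received)
    return abs(original_features["mean"] - received_mean)

def no_reference_score(received):
    # No original at all: judge the signal on its own statistics
    # (here, a toy proxy: the average jump between neighbouring samples).
    jumps = [abs(a - b) for a, b in zip(received, received[1:])]
    return sum(jumps) / len(jumps)
```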

There are other metrics, too. Peak signal-to-noise ratio, for instance, or PSNR. Dynamic range is another. Bit rate, yet another.
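Of these, PSNR is the most direct descendant of mean square error: it is simply MSE re-expressed in decibels relative to the signal’s peak value. A minimal sketch, assuming 8-bit samples with a peak of 255:

```python
import math

def psnr(original, received, max_value=255.0):
    """Peak signal-to-noise ratio in dB, derived directly from MSE."""
    mse = sum((o - r) ** 2 for o, r in zip(original, received)) / len(original)
    if mse == 0:
        return float("inf")  # identical signals: no noise at all
    return 10 * math.log10(max_value ** 2 / mse)

print(psnr([0, 255], [0, 251]))  # roughly 39.1 dB
```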

What each of these metrics has in common is that it measures an isolated, specific aspect of video quality; none provides a complete quality assessment. None can do what a person would do while sitting and watching a video on their phone or tablet or TV, which is to evaluate what they see and ask, simply, “Is it good or is it bad?”

SSIMWAVE developed its product with that viewer in mind. Its SSIMPLUS algorithm provides an answer to that critical question – good or bad? – displaying the answer as a Viewer Score ranging from zero to 100. The higher the number, the better the quality, from the point of view of the human being sitting down to watch the video in question.
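SSIMPLUS itself is proprietary, but the structural-similarity idea behind the company’s name, the SSIM index that Prof. Wang co-authored, gives a flavour of how a perceptual metric differs from raw error: it compares luminance, contrast and structure statistics rather than per-sample differences. The version below is a simplified single-window sketch (real SSIM is computed over local sliding windows and then averaged):

```python
def ssim_global(x, y, data_range=255.0):
    """Simplified single-window SSIM: compares luminance, contrast and
    structure statistics rather than raw per-sample error.
    Constants follow the usual K1=0.01, K2=0.03 convention."""
    n = len(x)
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx = sum(x) / n                                  # mean (luminance)
    my = sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n           # variance (contrast)
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my)                    # covariance (structure)
              for a, b in zip(x, y)) / n
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

Identical signals score 1.0, and the score falls as structure diverges; a perceptually driven score of this kind, rescaled to a 0–100 range, is the general shape of a viewer-facing quality number.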

“The goal is to put all of these [measurements] into a Viewer Score,” says Dr. Rehman. “That score reflects what viewers, in general, would say when they look at content.”

Adds Dr. Wang: “People trust their eyes.”

And that’s what SSIMWAVE has done – created a computer system capable of seeing as a person would see, combining all the diagnostic tools that have been used before.

“The human perception is the final thing that should be used to define quality,” says Dr. Wang. “If I have something that can monitor these things 24/7 and tell me what is happening as a person would experience it, then I have the awareness.”