Is the THX Standard Website Comparing AV Electronics Fairly?
For years, the THX certification process was nebulous to the end user. What did their standards mean?
When we attempted to pry information from them, they weren’t very forthcoming, claiming it was IP (intellectual property). So, unless you were a licensing partner, you never really knew what was involved in their certification process or what specifications governed it.
All of this seems to be changing since Razer acquired the company, as evidenced by the new THX Standard Website recently launched with this mission statement:
"With the launch of the THX Standard website, we give you data-driven comparisons of popular electronics to make informed purchase decisions unbiased by an editorial perspective."
Their focus seems to be "objectively" comparing products across various categories, including HDTVs, Amplifiers, and Powered Speakers.
On the surface, this seems like a great idea. It's obvious they've put considerable effort into creating a database of products they've measured that consumers can search and directly compare. Yes, we need something like this!
We took a closer look at the THX mission statement, with an emphasis on amplifiers, to determine whether their efforts are of value to the consumer and whether their measurement and scorecard comparisons are truly objective and impartial. Unfortunately, our initial analysis of THX’s testing has us questioning some of their results and recommendations. Has THX made measurement errors and even rated possibly defective products?
Read on to see our thoughts, and why audio consumers should take note that they are likely getting more consistent and objective measurement results from audio publications like Audioholics that publish test results in addition to their subjective impressions of the gear they review.
See: Audioholics Amplifier Measurement Procedure to see how we objectively evaluate receiver and amplifier performance.
Manufacturer's Editorial Note about THX Certification
In the beginning, THX standards for speakers related only to home theater use. Front LCR had to have restricted vertical dispersion to minimize floor bounce and ceiling reflections. This was done to enhance dialog intelligibility. Frequency response had to be within a tight +/-3dB window for tonal accuracy. The surrounds had to be dipole, so that one pair would remain non-localizable and could do a reasonable job of mimicking the sound of the several pairs of surrounds found in a commercial theater.
Early on THX-certified speakers were thought to be “compromised” for music-only use.
But from the very beginning, THX-cert for amplifiers was viewed as a valuable indicator of performance and capability. Spec requirements were clear, strenuous and unambiguous. If it was a “THX amp,” it was a solid, high-performance unit. But over the years, they seem to have lost their way as they spread their licensing out to too many categories of products, thus watering down their brand.
Caution: It's important to note that THX does in fact still certify audio equipment and license their reference designs to be manufactured by third-party companies. Do they disclose this in their comparisons? Let's dig in to find out.
Amplifier Tests
Running a comparison of the limited products in their database, one quickly notices that Benchmark Audio is clearly "superior" to the competition by their measurement metrics. The Benchmark AHB2, for example, earns a 93 overall rating while the next best amp, the NAD M22, earns an 85. In fact, this very amplifier is on their homepage rotator as the "latest featured product tested". But, what exactly is behind these ratings and are they truly objective like THX claims?
THX comes up with these ratings based on a variety of tests listed below:
- Frequency Response
- THD vs. Output Power
- THD vs. Frequency
- 1 kHz FFT
- 10 kHz FFT
- IMD (DFD) FFT
- IMD (SMPTE) FFT
- Crosstalk
- Noise
- Power Supply Injection
- Burst Power
All of these tests are conducted for 8 and 4 ohm loads with some test conditions listed in the measurement graphs and brief accompanying summary on their website. While these are all good metrics to look at and compare, the devil is in the details, and what THX isn't saying is quite revealing.
Frequency Response
It is a desirable goal for an amplifier to be able to produce a flat frequency response within the audio bandwidth (20Hz to 20kHz) when driving both 8 and 4-ohm loads. The less an amplifier's response varies in this frequency range when driving various load impedances, the more consistent it will sound when driving different types of loudspeakers since, in the real world, the impedance of even a so-called "8-ohm" speaker can vary from below 4 ohms to over 20 ohms. Some Class D amplifiers exhibit more variance in frequency response than linear Class AB amps under various loading conditions due to interactions with their output filter, which should show up even under the limitations of resistive load testing.
THX Spec: "The desired frequency response is flat; 0.5 dB at any point between 20 Hz and 20 kHz."
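To make the spec concrete, here is a minimal Python sketch of how a flatness check like this could be applied to a measured sweep. It is not THX's procedure: the helper names, the choice of the 1 kHz level as the reference, the reading of "0.5 dB" as a window of plus or minus 0.5 dB, and the sample data are all assumptions made for illustration.

```python
def max_deviation_db(freqs_hz, levels_db, f_lo=20.0, f_hi=20_000.0, ref_hz=1_000.0):
    """Largest absolute deviation (dB) from the reference level inside the band."""
    points = list(zip(freqs_hz, levels_db))
    # Assumption: the reference is the measured point closest to ref_hz.
    ref_level = min(points, key=lambda p: abs(p[0] - ref_hz))[1]
    in_band = [level for freq, level in points if f_lo <= freq <= f_hi]
    return max(abs(level - ref_level) for level in in_band)


def meets_flatness(freqs_hz, levels_db, tol_db=0.5):
    """True if every in-band point stays within tol_db of the reference."""
    return max_deviation_db(freqs_hz, levels_db) <= tol_db


# Made-up sweep for illustration only (not THX or Audioholics data).
freqs = [10, 20, 100, 1_000, 10_000, 20_000, 50_000]
levels = [-1.2, -0.3, 0.0, 0.0, -0.1, -0.4, -2.5]

print(f"Max in-band deviation: {max_deviation_db(freqs, levels):.2f} dB")
print("Meets 0.5 dB window:", meets_flatness(freqs, levels))
```

Note that the roll-off at 10 Hz and 50 kHz in the sample data does not count against the amplifier here, because only points between 20 Hz and 20 kHz are evaluated.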
Editorial Note About Amplifier Frequency Response
It’s important not to draw the wrong conclusions when looking at the results of these tests. Most amplifier manufacturers deliberately roll off frequency response beyond both ends of the audio frequency spectrum. Filtering above 20kHz helps minimize noise pick-up from the amplifier inputs as well as overall noise of the system. You don’t want an audio amplifier to be ruler flat to 100kHz, or its inputs can act like an RF antenna when not connected to a preamplifier, which is why most manufacturers employ low pass filtering above the audio band. The same can be said about the other end of the spectrum. Generally, it's not a good idea for an amplifier to measure flat to DC, as there is no program material that digs down much below 10Hz. You can actually wind up wasting amplifier power and damaging speakers if you're passing a subsonic signal from a turntable, for example. If you see a gentle roll-off at very low frequencies and above the audio band (20kHz), this is typically a deliberate design attribute to improve performance in real world systems while also reducing susceptibility to noise pickup.
THX Frequency Response Measurement of Emotiva XPA-7 Gen3
I noticed some amplifiers in the THX comparisons achieve their 0.5dB requirement from 20Hz to 20kHz but still receive a lower score than others that also meet this requirement. The Emotiva XPA-7 Gen3 scored a 70 despite meeting their frequency response criteria. In fact, it even had perfect channel-to-channel tracking, which surprisingly isn't even listed or discussed as an important metric in the THX test procedure! The NAD M22, by comparison, showed a 0.6dB spike above 20kHz into 4 ohms and its channel-to-channel tracking wasn’t nearly as good as the Emotiva’s, yet it scored a 94 in this test.
THX Frequency Response Measurement of NAD M22
Taking a closer look at the NAD M22 frequency response revealed that either the THX test fixture is faulty or the Hypex NCore module inside the M22 has a hardware problem on channel 2 (more likely). You can see channel 2 at 8 ohms starts to rise again above 50kHz indicating there may very well be a problem with the output filter implementation for that channel. It may have been prudent for THX to have examined this result more closely in this case and either rejected the sample they tested or requested a new product from NAD to reaffirm their findings. If THX paid closer attention to channel-to-channel variance, this is something that may not have been overlooked. Close channel-to-channel tracking is a critical aspect of a good multi-channel amplifier design and something that most manufacturers pay meticulous attention to if they want to ensure consistent sound quality.
There doesn’t seem to be much rhyme or reason to the THX scores based on their measured frequency response, and how could they overlook channel-to-channel consistency? It's unclear how THX is awarding a score in this category, as it certainly doesn't appear to be based solely on meeting their frequency response requirement. If this is a comparison between products based on metrics, these scores require a rigorous standard. There should be a detailed description of the metric and weighting used to determine the score.
Editorial Note about Ultra Wide Bandwidth Amplifiers
Any manufacturer can easily just adjust the filtering of their amps to measure like a straight line from DC to 100kHz to pass the THX frequency response test with flying colors and earn a high score at the expense of potentially introducing added noise or stability issues. Given the choice between amplifying a DC signal vs. having a gentle HPF to prevent DC from ever reaching a speaker output, I'd choose the latter. Given the choice between implementing a shallow LPF above 20kHz vs extending bandwidth flat out to 100kHz to get a higher score on this test while potentially adding noise and RF pickup into the system, I'd choose the former.
It is useful to see these measurements, particularly how some Class D amps alter frequency response under various loading conditions, but be careful in interpreting the THX scoring here. Be equally cautious in interpreting THX's published frequency response sweeps, like in the case of the NAD, to determine if they really do make sense or if there is a problem with the test unit or testing rig, leading to erroneous results. In fact, this is a good piece of advice when looking at measurements (3rd-party or manufacturer) for any products you are considering. Context and proper interpretation are necessary, otherwise, false conclusions are inevitable.
THD vs Output Power
THX Spec: "The desired result is less than 0.001% THD, from 100 mW to half power (3 dB below the point at which the amplifier exhibits 1% THD)."
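For readers who want to see what that spec works out to in practice, the short Python sketch below converts it into numbers. The function names and the 200-watt example amplifier are hypothetical; the only inputs taken from the spec are the 3 dB back-off from the 1% THD point and the 0.001% target.

```python
import math


def half_power_watts(p_at_1pct_thd_watts, backoff_db=3.0):
    """Power that sits backoff_db below the 1% THD point."""
    return p_at_1pct_thd_watts * 10 ** (-backoff_db / 10)


def thd_percent_to_db(thd_percent):
    """Express a THD percentage as dB relative to the fundamental."""
    return 20 * math.log10(thd_percent / 100)


# Hypothetical amplifier that reaches 1% THD at 200 watts.
upper_limit = half_power_watts(200.0)
print(f"Test window: 0.1 W to {upper_limit:.1f} W")
print(f"0.001% THD is about {thd_percent_to_db(0.001):.0f} dB")
```

In other words, a 0.001% target means the harmonic products must sit roughly 100 dB below the fundamental across that entire power window.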
First, I'm wondering why they are looking at THD alone and not THD+N, which is usually the standard way of measuring amplifier distortion and system noise over power.
Editorial Note About Amplifier Distortion
Designing an amplifier with ultra-low distortion, especially at lower power levels, is quite trivial these days. You can employ lots of open loop gain and compensate with a ton of negative feedback to achieve this at the expense of high power bandwidth linearity, stability and, in most cases, fidelity. Think of the mass market receivers of the 1980’s that touted < .008% THD+N. They looked great on paper but most of them sounded lifeless when driving real loudspeakers and they didn’t handle low impedance loads well either. Thankfully, most manufacturers smartened up and stopped making ultra-low distortion a primary design goal in “good” amplifier design moving forward.
Note: I am not suggesting the Benchmark AHB2 amp is anything like these receivers from the 1980's. But, it's easy for a manufacturer to go down this road simply to get a better score on the THD vs Output distortion test for the THX scorecard.
It's common sense to most designers that amplifier distortion is USUALLY not the limiting factor in a system. Loudspeakers have orders of magnitude higher distortion than solid state amplifiers so to place so much emphasis on ultra-low distortion seems more of an exercise in textbook design vs real-world design. Some of the best amplifiers today don’t tout such ridiculously low distortion numbers. But, they're usually able to nearly double power with halving load impedance, thus acting like an ideal voltage source, which is what good amplifiers with adequate heatsinking, robust output devices and capable power supplies should do.
THX THD vs Output Power Measurement of Monoprice Monolith-7
It is quite puzzling how THX measured the Monolith-7 as having the same power (200 watts) into 8-ohm and 4-ohm loads while most of the other amps they tested delivered 1.5-2X the power into 4-ohm loads. Yet, somehow the Monolith-7 scored an 84 while the Emotiva XPA-7 Gen3 scored a 73 despite the Emotiva amp having delivered 2X its 8-ohm power into a 4-ohm load (500 watts) according to THX's own testing! Once again, we have no way of determining how THX derives their scoring for this category.
Audioholics Power Table Measurements of the Monolith-2 Amplifier
Our own tests of the Monoprice Monolith-2 revealed that the amp was, in fact, able to deliver 1.6X as much power into a 4-ohm load as it did into an 8-ohm load. You can see in our chart above that it delivered 212 watts/ch into 8 ohms and 335 watts/ch into 4 ohms (2CH driven, 1kHz, 0.1% THD+N). Since ALL of the Monolith amps are based on the same design, THX clearly published inaccurate results and their score card doesn't reflect this. Everyone makes errors from time to time, which is why it's always a good idea to have data comparisons like these peer reviewed for accuracy.
Burst Power
THX uses a burst test signal with a 10ms on / 90ms off duty cycle and compares expected vs. measured power at 100Hz and 1kHz, with the goal that the power being delivered should double each time the load impedance is halved. The score is based on the ratio of measured vs. expected power, where expected power is the ideal doubling of power with each halving of load impedance. The deviation from ideal is expressed as 10*log (Pmeas/Pexp), and a higher score is awarded to the amplifier with the least deviation.
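The deviation formula is easy to try for yourself. The Python sketch below is only an illustration: it assumes the expected power is scaled from the 8-ohm burst result, which THX does not spell out, and the function names and wattage figures are hypothetical rather than taken from the THX site.

```python
import math


def expected_power_watts(p_8ohm_watts, load_ohms):
    """Ideal voltage-source power: doubles each time the load is halved."""
    return p_8ohm_watts * 8.0 / load_ohms


def deviation_db(p_measured_watts, p_expected_watts):
    """10*log10(Pmeas/Pexp); 0 dB means the amp matched the ideal."""
    return 10 * math.log10(p_measured_watts / p_expected_watts)


# Hypothetical burst results in watts (assumed for illustration).
p_8ohm = 100.0
measured = {4: 190.0, 2: 340.0}

for load_ohms, p_meas in measured.items():
    p_exp = expected_power_watts(p_8ohm, load_ohms)
    print(f"{load_ohms} ohms: expected {p_exp:.0f} W, "
          f"deviation {deviation_db(p_meas, p_exp):.2f} dB")
```

Notice that the result says nothing about how much power the amplifier actually delivers. A 100-watt amp and a 1000-watt amp that each fall 15% short of ideal doubling produce the same deviation figure.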
THX Burst Test Measurement: Benchmark AHB2 vs NAD M22
In this comparison, we look at the Benchmark AHB2 vs the NAD M22. The Benchmark amp wins the test because it had a narrower margin of expected vs. measured power, yet ironically the NAD amplifier is far more powerful, producing 1315 watts vs 426 watts at 1kHz, 2 ohms, which is almost a 5dB advantage. Dollar for dollar, the NAD is a better value when you consider how much more power you're getting, but you'd never know it from this test. The scores seem to be arbitrarily assigned by THX, with no indication of how far an amplifier may deviate from their ideal doubling of power with each halving of load impedance and still be rewarded with a high score.
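The size of that power gap can be checked with a few lines of Python. The two wattage figures are the 1kHz, 2-ohm burst results quoted above; the function and variable names are just illustrative.

```python
import math


def power_ratio_db(p_a_watts, p_b_watts):
    """Difference between two power levels, in decibels."""
    return 10 * math.log10(p_a_watts / p_b_watts)


nad_watts = 1315.0        # NAD M22, 1kHz burst into 2 ohms
benchmark_watts = 426.0   # Benchmark AHB2, same test condition

ratio = nad_watts / benchmark_watts
advantage_db = power_ratio_db(nad_watts, benchmark_watts)
print(f"{ratio:.2f}x the power, {advantage_db:.1f} dB")
```

That works out to roughly three times the power, or a little under 5 dB.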
Note: THX doesn't specify the line voltage, nor do they say if the line was sagging during their power tests, which would greatly affect the delivered power of a high-power amp like the NAD when driving a 2-ohm load. Something seems a bit skewed that the Benchmark amp won this comparison considering that the NAD amp delivered 2 to 3 times the power into all loads.
What Happened to Meeting Published Power Specs?
The power tests that THX conducts don’t seem to reward a manufacturer that delivers high-power, high-value designs, especially when they meet or exceed rated power. This is likely because THX doesn’t seem to place ANY emphasis on full power steady-state measurements of amplifiers. They also don't appear to factor in value when calculating their scores; otherwise, the Benchmark amp would be at a disadvantage to many of the products in these comparisons.
Editorial Note on 1kHz vs Full Bandwidth Power Testing
ALL of THX power testing on amplifiers appears to be done at 1kHz only. As we've discussed in past articles on Audioholics, it's critical to do full power bandwidth testing to reveal any design limitations that can crop up. Some Class D amplifiers, for example, cannot deliver full power at full bandwidth when driving low impedance loads due to limitations in their post filter feedback or output filter. This is something you would never see if only testing power at 1kHz. As a result, at Audioholics, we test power at full bandwidth for up to 2 channels driven, and 1kHz for ALL channels driven to flush this out. We also do power burst testing based on the CEA 2006 protocol to determine available amplifier headroom.
Based on available output power and value, the THX-designed Benchmark AHB2 would rank lower than most of the amps in their very own comparison! As you can see, it’s all a matter of perspective regarding which metrics they deem important in amplifier design. It’s clear from their examples that their goal is ultra-low distortion and wide bandwidth, with very little emphasis on sustained high output power or even validating whether or not each manufacturer meets or exceeds FTC rated power at full bandwidth with at least two channels driven.
Crosstalk
THX Crosstalk Measurement of Monoprice Monolith-7 Amplifier
THX Goal: "The desired level of crosstalk is less than -70 dB throughout the audible spectrum; from 20 Hz to 20 kHz."
Crosstalk is an important measurement metric to ensure you have good channel-to-channel isolation, which leads to good stereo separation. To keep the channel under test from being polluted by signals from adjacent channels, -70dB of channel-to-channel isolation from 20Hz to 20kHz is a reasonable goal. In fact, we always talk about how amplifiers should be able to achieve at least -60dB of isolation between channels at 20 kHz in our testing.
Looking at their Monolith-7 test results, I find it unbelievable that a difference of 30dB at 20kHz is seen between driving the amp at 8 ohms vs 4 ohms. Are they even testing the same adjacent channels when driving 8 and 4-ohm loads? Did they terminate the inputs of the unused channels during this test, or just leave them open and susceptible to RF ingress?
In my 20+ years of measuring power amplifiers and designing telecom equipment, I’ve never seen crosstalk measurements vary this much solely by changing the load impedance. You can see a difference at low frequencies due to magnetic coupling when current demand increases going from 8-ohm to 4-ohm loads, but at 20kHz varying the load impedance shouldn't affect this measurement since you're dealing mostly with parasitic capacitive coupling in that region.
Oddly, the Monolith-7 received their highest rating for crosstalk of all multi-channel amplifiers they’ve tested thus far despite it failing their -70dB requirement at 20kHz for a 4-ohm load by almost 10 dB. This is a head scratcher for sure.
Editorial Note on Amplifier Crosstalk Measurements
At Audioholics, when we test crosstalk, ALL amplifier inputs are terminated to the test gear and ALL output channels are terminated with a load. We run each test channel undriven with all other channels driven, which results in what's called an "all-to-one crosstalk" measurement that plots a frequency vs crosstalk sweep for EVERY channel of the amp. This verifies that the entire amp has good crosstalk and not just 2 arbitrary channels that were tested.
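As a rough sketch of how a set of all-to-one sweeps could be reduced to a single pass/fail figure, the Python snippet below finds the least-isolated point across every channel and compares it against a -70 dB goal. The data layout, function name and sweep values are hypothetical and are not Audioholics or THX measurements.

```python
def worst_crosstalk_db(sweeps, f_lo=20.0, f_hi=20_000.0):
    """Return (channel, freq_hz, level_db) for the least-isolated in-band point."""
    worst = None
    for channel, points in sweeps.items():
        for freq_hz, level_db in points:
            in_band = f_lo <= freq_hz <= f_hi
            # A higher (less negative) level means worse isolation.
            if in_band and (worst is None or level_db > worst[2]):
                worst = (channel, freq_hz, level_db)
    return worst


# Made-up sweeps: each channel is measured undriven while all others are driven.
sweeps = {
    "ch1": [(20, -95.0), (1_000, -90.0), (20_000, -72.0)],
    "ch2": [(20, -93.0), (1_000, -88.0), (20_000, -61.5)],
    "ch3": [(20, -96.0), (1_000, -91.0), (20_000, -74.0)],
}

channel, freq_hz, level_db = worst_crosstalk_db(sweeps)
meets_goal = level_db <= -70.0
print(f"Worst case: {channel} at {freq_hz} Hz, {level_db} dB, meets -70 dB goal: {meets_goal}")
```

With this made-up data, two of the three channels look fine on their own, which is exactly why checking only an arbitrary pair of channels can hide a problem.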
A Failure of Disclosure of a Conflict of Interest?
It’s interesting to note that Benchmark Audio touts the extremely low distortion design of its products, which incidentally is a THX reference design, something THX conveniently does a poor job of mentioning on their website. Most consumers will likely not see this connection, which, in my opinion, is intellectually dishonest on THX's part.
Someone at THX clearly has an idea of what they think the "ideal" amplifier should be and it seems to be based on their Benchmark AHB2 reference design which is both expensive to produce and has limited output power compared to other designs. However, I'd caution that some of their design preferences, which can look good on paper, may lead to amplifiers that may not perform better in real-world applications and in some cases may sacrifice fidelity for reasons previously stated.
There are additional tests THX conducts on amplifiers but I think by now you get the point regarding the issues with their published data and scoring of products.
Conclusion
In my brief exposé of the new THX Standard website, I’ve seen numerous examples of either poorly documented amplifier measurement procedures, erroneous test results, or subjective biases in their inexplicable scorecard ratings that never once factor value into the equation. I haven't really looked at what they are doing with HDTV displays or loudspeakers, but it's quite possible there are issues with those test results too, based on what I'm seeing in the small sampling of their website relating to amplifier testing and scoring.
Data Driven and Unbiased?
It’s hard not to conclude the THX benchmarks for performance were derived to match the “Benchmark AHB2” amplifier. That behavior is something that is expected from a manufacturer and not an impartial website. Their criteria become suspect when the scores seem arbitrary, and bias becomes the logical conclusion. There is also no concept of suitability to environments (i.e., recommended for small rooms, efficient speakers, modest listening levels, etc). THX used to be known for those metrics but now they just seem to want to declare "winning" products based on their scorecard results. Based on this, I find their statement of “unbiased by an editorial perspective” to be a bit hypocritical. One has to wonder if they will ever publish measurements of a competing amplifier that scores higher than their reference standard. This is not dissimilar to how some loudspeaker manufacturers NEVER lose their own blind listening tests.
A cynic could conclude that THX is trying to seduce manufacturers into soliciting their services in order to “improve” their products and earn higher scorecard results. If successful, this could be quite lucrative for THX and could make their brand relevant again in consumer audio.
I will cut them some slack given the fact that their results lack editorial bias only because they aren’t actually writing reviews based on subjective experiences. But their test results, and their resultant scoring, in my opinion, are NOT without bias. As it stands now, I commend THX's effort in attempting to provide a method of objectively comparing product performance measured by a single source that allegedly tests all of the products under the same test conditions and with the same test rig. However, their procedure is in desperate need of improvement and they should aspire to greater transparency and actual objectivity with their scorecard. Perhaps just dropping the scorecard altogether would be more prudent. I do like many of the measurement parameters they've chosen, but I feel more emphasis on full power bandwidth testing and channel-channel tracking should also be included. Perhaps THX could validate rated power per FTC with at least 2 channels driven, and they could test channel-channel tracking to make sure it's under a reasonable threshold, say 0.5dB, for the entire audio passband.
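A channel-to-channel tracking check along those lines would not be hard to add. The Python sketch below shows one way it might be computed from per-channel frequency response sweeps; the function name, the data layout and the sample values are assumptions for illustration, and the 0.5dB limit is simply the threshold suggested above.

```python
def max_channel_spread_db(freqs_hz, channel_levels_db, f_lo=20.0, f_hi=20_000.0):
    """Largest level difference (dB) between any two channels inside the band."""
    spreads = [
        max(levels) - min(levels)
        # zip(*...) regroups the per-channel sweeps into per-frequency tuples.
        for freq, levels in zip(freqs_hz, zip(*channel_levels_db))
        if f_lo <= freq <= f_hi
    ]
    return max(spreads)


# Made-up two-channel sweep; channel 2 misbehaves only above the audio band.
freqs = [20, 1_000, 20_000, 50_000]
ch1 = [-0.2, 0.0, -0.3, -1.0]
ch2 = [-0.1, 0.0, -0.5, 1.5]

spread = max_channel_spread_db(freqs, [ch1, ch2])
print(f"Max in-band channel spread: {spread:.2f} dB, within 0.5 dB: {spread <= 0.5}")
```

Widening f_hi beyond 20kHz in this sketch would also flag the kind of out-of-band channel mismatch that a passband-only check lets through.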
Anyone can pull amplifier measurements using an Audio Precision. But it takes care, peer review, and knowledge in interpreting those results to ensure accuracy in reporting as well as establishing meaningful metrics that produce better-sounding and better-performing amplifiers in real-world scenarios, as opposed to just getting good measurement results on a test bench.
On a final note, the THX Standard website user interface is a bit difficult to use. When you start a comparison and dig down, there is no way back to the comparison list. Instead, you must start a new comparison but that cannot be done without de-selecting and selecting the products you wish to compare. The splitting of stereo and multi-channel amps is arbitrary since a number of multi-channel amps are also available in two channel mode. It would be useful for the consumer to be able to compare measurements for all amplifier types.
Article Updates
5/9/18: THX reached out to let us know they've updated the Monoprice power and crosstalk measurements on their website with the correct graphs. Their scorecard results were originally based on the correct AP measurements and thus haven't changed.