Objective and Subjective benchmark goals and procedures

This "master" page defines my long-term plan for objective and subjective evaluation. For the actual evaluation and results, see "Objective and Subjective benchmark results".
________________________________

Goals

________________________________

  •    Put a detailed subjective rating system in place for audio gear. Share results.
  •    Put a detailed objective rating system in place for audio gear. Share results.
  •    See if a correlation can be found between objective measurements and subjective impressions.
  •    Optimize my system to be as good as possible. (Yes, that's selfish, but at least I chase performance without any conflict of interest.)
  •    Learn new measuring / listening techniques.
  •    As a middle man, see whether any agreement is at all possible between objectivists (ASR) and subjectivists (SBAF).

________________________________

Subjective evaluation
________________________________

See results under "Objective and Subjective benchmark results".
Before starting the evaluation, when the setup allows, I perform volume matching to within 0.1 dB (multimeter across a 150 ohm dummy load for headphone amps).
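The dummy-load voltage readings convert to a level difference as 20·log10(V1/V2); a minimal Python sketch of that 0.1 dB check (the function names are my own, for illustration only):

```python
import math

def level_difference_db(v_ref_rms: float, v_dut_rms: float) -> float:
    """dB difference between two RMS voltages measured on the same load."""
    return 20.0 * math.log10(v_dut_rms / v_ref_rms)

def is_volume_matched(v_ref_rms: float, v_dut_rms: float,
                      tol_db: float = 0.1) -> bool:
    """True when the two devices sit within +/- tol_db of each other."""
    return abs(level_difference_db(v_ref_rms, v_dut_rms)) <= tol_db

# Example: 2.000 V vs 2.020 V across the 150 ohm dummy load
# -> 20*log10(2.020/2.000) is about 0.086 dB, so still matched
```

Note that at these small offsets roughly 1% of voltage difference corresponds to about 0.09 dB, which is why a multimeter with a stable last digit is enough for the 0.1 dB target.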
The evaluation takes place during working hours and lasts a few days. I listen to background music while working without paying too much attention, and occasionally listen critically at a higher volume while taking notes, which allows me to go back to odd observations noted during that first phase.
During the final critical listening sessions (where differences between components are well identified and hunted for), I fill in an Excel table where the following criteria are rated on at least 8 different songs and averaged for consistency:

  •       Treble*:
    •       Rating focusing only on high-frequency reproduction. The high frequencies should be extended, well defined, have good micro-dynamics and show no glare or veil. They should be neither harsh nor too sweet.
  •       Mids*:
    •       Rating focusing only on midrange reproduction. The mids should be articulate, sound sweet and never harsh; grain on voices is a good indicator of midrange definition. Neither too forward nor too laid back.
  •       Bass*:
    •       Rating focusing only on low-frequency reproduction. The bass should be articulate, have no bloom and not bleed into the mids (for example, double bass and voice should remain distinct). Bass depth and slam have a large impact on P.R.A.T. A blooming bass typically gives more slam, but quickly creates a mess when the bass overlaps with other content.
  •       Texture:
    •     Texture can be rated on instruments with a high amount of harmonic content, like trumpet, trombone, violin, cello or flute. Those harmonics should remain attached to the main instrument, i.e. coherent in phase with the fundamental, and not be overdone. Wood percussion also helps to rate the naturalness of textures.
  •       Transparency:
    •       The opposite of veil: a clearly see-through character, as opposed to looking through a small dirty window. Somewhat related to "air", but not exclusively. Not to be mistaken for a forward presentation.
  •       Micro-dynamics:
    •       Small details pop out of the background and are not muted/dull. Short-scale dynamics relate to Details/Definition/Plankton, and to some extent to texture.
  •        P.R.A.T:
    •       Engaging performance, with good large-scale dynamics and good timing; foot-tapping quality. When one simply forgets the gear is being rated and just enjoys the music. Somewhat related to "Emotion", but without the euphonic aspect.
  •       Sound Stage:
    •       Depth (Sound Stage) and width (Imaging) of the audio scene are the two components of the global Sound Stage rating. Sound stage depth can be associated with a 3D/holographic quality, and is more difficult to perceive on headphones than on speakers. Imaging is left/right panning accuracy, and is best evaluated with headphones to avoid room coloration.
  •       Composure:
    •       The ability to remain clean and keep its coherence/qualities during complex passages (big orchestra or crowded electro) just as well as during simpler tracks. Typically I like to follow a given instrument as it gradually gets mixed with others and monitor how it evolves. Poor composure is the reason why spectrally simple content, with little overlap (both spectral and imaging) between instruments, is often played at HiFi shows. Note: wide spectral content also helps to highlight phase issues.

* = Might overlap with other, non-frequency-dependent ratings. When integration between bass/mids/treble is poor, the lowest performer of the three is further downgraded.

The final ratings are then lightly tempered (typically an offset) to keep them consistent across all items rated, and to check that everything makes sense in the bigger picture.
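The per-song averaging and the offset-style tempering described above can be sketched in a few lines of Python (the function names are illustrative, not from an actual tool):

```python
from statistics import mean

def average_ratings(per_song_scores: dict) -> dict:
    """Average each criterion over all songs rated (at least 8 in practice)."""
    return {criterion: mean(scores)
            for criterion, scores in per_song_scores.items()}

def temper(ratings: dict, offset: float) -> dict:
    """Apply a light global offset so scores stay comparable across
    all items rated (the 'bigger picture' consistency pass)."""
    return {criterion: score + offset
            for criterion, score in ratings.items()}
```

A global offset preserves the relative ordering of criteria within one device while letting the whole score sheet be nudged up or down to line up with previously rated gear.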

To help perform blind A/B comparisons, I built a switch box based on a guitar-pedal push button (so the operator does not know which device is selected), which can switch between sources (RCA) and outputs (jack) at the same time.
It is usable for headphone amplifiers as well as DACs (then sharing the same SPDIF source).

________________________________

Objective evaluation

________________________________



See results under "Objective and Subjective benchmark results".

Fortunately, I have access to an Audio Precision APx555 and a SYS-2722, a Keysight DSOX2024A scope and a G.R.A.S. head, so I am able to properly measure electronics and headphones, both as isolated components and as a whole system.
I will do my best to measure the different components under the same conditions so that comparisons can be made and conclusions drawn. (I still need to learn here.)

Here's the list of what I intend to measure:
  •     Scope + signal generator:
    •     10 kHz square wave (@ 2 Vp-p)
      •     Rise/fall time (or better, slew rate?)
      •     Ringing & stability of power supply

  •      APx555:
    •     THD+N vs frequency
    •     Full-band FFT spectrum, averaged (3)
      •     100 Hz
      •     1 kHz
      •     10 kHz
      •     Multitone (32?)
    •     IMD measurements:
      •     60 Hz and 7 kHz, at an amplitude ratio of 4:1, following the SMPTE standard
      •     18 kHz and 19 kHz at an amplitude ratio of 1:1
      •     1 kHz + 5.5 kHz intermodulation distortion
    •     Channel crosstalk
    •     Impulse / acoustical measurements
      •     Which metrics could help measure attack and decay per frequency? CSD/waterfall for decay; to be checked for attack.
    •     Noise:
      •     Residual with inputs shorted, volume at 0% (silence played for DACs)
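For reference, the SMPTE stimulus in the list above (60 Hz + 7 kHz at a 4:1 amplitude ratio) is easy to synthesize for sanity checks outside the analyzer. A minimal sketch in Python, assuming NumPy is available and a 48 kHz sample rate (the function name `smpte_stimulus` is my own):

```python
import numpy as np

def smpte_stimulus(fs: int = 48000, duration_s: float = 1.0,
                   f_low: float = 60.0, f_high: float = 7000.0,
                   ratio: float = 4.0) -> np.ndarray:
    """Two-tone SMPTE-style IMD stimulus: f_low + f_high at the given
    amplitude ratio, normalized so the peak stays at or below full scale."""
    t = np.arange(int(fs * duration_s)) / fs
    sig = ratio * np.sin(2 * np.pi * f_low * t) + np.sin(2 * np.pi * f_high * t)
    return sig / (ratio + 1.0)  # worst-case peak is ratio + 1
```

With a 1 s capture at 48 kHz, both 60 Hz and 7 kHz land exactly on FFT bins, so the 4:1 ratio (and later, the sideband products around 7 kHz) can be read off an un-windowed FFT.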


________________________________

Subjective Vs Objective Correlation

________________________________

[TO BE UPDATED FOR EACH ITEM TESTED]
[CONCLUSION TO BE ADDED]


________________________________

Reference setup - Headphones

________________________________

Foobar2000 (WASAPI Push direct output) -> USB 5 -> Schiit Bifrost Multibit Rev. B -> Whammy DIY HPA -> Sennheiser HD58X modded.



Note: The modified Whammy and HD58X make an excellent combo that is very resolving (to my ears) and allows me to evaluate the upstream gear, until better headphones come along.
While the JDS Atom is subjectively less resolving/dynamic/transparent/extended than the Whammy, it is used as a fixed reference so that modifications to the Whammy (op-amps, capacitors, output bias, etc.) can be rated consistently against the JDS Atom.

________________________________

Reference setup - 2ch Loudspeakers:

________________________________

Coming soon.
