Please cite this online document as:
The Unbearable Fuzziness of NMR Data,
Presentation at XLVI GIDRM, Fisciano (Italy), September 27-29, 2017.
It is almost ten years since I started working with Mestrelab Research (Santiago de Compostela) on some aspects of the NMR plug-in of their Mnova software. One of the tasks I handle is automatic structure verification (ASV), which consists of answering the question "to what degree is this proposed molecular structure compatible with this set of NMR data?" This means "teaching" the computer to "think NMR", or even to "think NMR the way chemists do", something that brings up an amazing number of almost philosophical dilemmas and even grotesque situations!
Here, I do not want to present any specifics of Mnova; this is not a Company presentation. Rather, I will attempt a very personal reflection on the lessons I have learned so far from my work. These lessons concern the unavoidable fuzziness of any experimental data and the propagation of that fuzziness to higher logical levels in any system complex enough to resemble an artificial intelligence. It is much like studying error propagation in complex calculations, but applied to logical systems.
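To make the analogy with error propagation a little more concrete, here is a minimal sketch in Python. It is purely illustrative and has nothing to do with how Mnova actually works: the function names, the assumption that the elementary judgments are independent, and the figure of 20 peaks at 95% reliability are all hypothetical choices of mine.

```python
from math import prod

def all_hold(confidences):
    """Confidence that every elementary judgment is right.

    Assumes (hypothetically) that the judgments are independent,
    so the individual confidences simply multiply.
    """
    return prod(confidences)

def any_holds(confidences):
    """Confidence that at least one of several independent judgments is right."""
    return 1.0 - prod(1.0 - c for c in confidences)

# Hypothetical example: 20 peak assignments, each judged 95% reliable.
per_peak = [0.95] * 20
overall = all_hold(per_peak)
print(f"confidence that all 20 assignments are right: {overall:.2f}")
```

Under these assumptions, a conclusion that needs all twenty assignments to be right is more likely wrong than right, even though each single step looks very reliable. Real judgments are rarely independent, so the numbers should not be taken literally; the point is only that modest fuzziness at the bottom grows quickly as it climbs the logical ladder.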
While teaching a computer to dabble in human logic is a hard proposition, teaching it to apply human logic to real data is something that can drive a programmer insane.
In NMR, many of the problems arise from the fact that no spectrum is ever fully defined: there is noise, field inhomogeneity and instability, baseline roll, massive overlap of quantum transitions, the mathematical ill-definedness of spectral decomposition into peaks, etc. On top of this, there are imperfections (errors) in the predictions of NMR parameters, unexpected interactions between sample components, the weird behavior of real molecules (as opposed to their structural formulas), and the impossibility of fully relying on chemical NMR 'rules', because they are either too bland to be useful or else have quite frequent 'exceptions'.
I will also attempt to answer the hard question of how much we can really trust the information deduced from NMR spectra, and which are the principal traps for novices, experienced NMR spectroscopists, and AIs alike. They are all apt to err, but in ways different enough to start charting some characteristic patterns.
Note: I do collaborate quite closely with many people at Mestrelab Research, Santiago de Compostela, Spain: Carlos Cobas, Felipe Seoane, Mike Bernstein, Vadim Zorin, Esther Vaz, and Maruxa Sordo, to name just a few.
However, the opinions I express here are rather personal; I am not quite sure whether my colleagues would approve of them (well, probably yes). A Company should tell you that the Verification wizard is a marvel of unchallengeable logic, while I will try to show you that such a dream is impossible. Yet we try hard to keep improving it.
After all, we humans are not perfect either, and we all try hard to improve, don't we?