> _Probability is the most important concept in modern science, especially as nobody has the slightest notion what it means._
>
> Bertrand Russell [^1]
At this point, you might argue that, beyond economics, philosophy, and linguistics, entire branches of hard science have developed objective, empirical methods for examining uncertainty. You might say that uncertainty is quantifiable—that it can be captured through **probabilistic** means. After all, probability theory underpins much of modern empirical science, providing the foundation for analyzing experimental data. In computer science, probabilistic methods are the backbone of machine learning, signal processing, and various forms of pattern recognition, from speech to image processing. Surely, these stochastic models can quantify and represent uncertainty, right?
> Disclaimer: The next few paragraphs are a bit math-heavy. Feel free to [[FUD|skip ahead]] if this starts to make your head spin!
In probability theory, uncertainty is represented using probability distributions. A probability distribution assigns probabilities to the possible outcomes, with the total probability summing to one. However, to use these distributions, we already need some **prior knowledge**. For example:
- A ***normal distribution*** requires knowledge of the mean (the central value) and the standard deviation (which measures the spread of values).
- A ***binomial distribution*** depends on knowing the number of trials (n) and the probability of success (p).
- A ***Poisson distribution*** requires an estimate of the average rate of occurrence (λ).
- An ***exponential distribution***, which describes the waiting time between events in a Poisson process, also needs a rate parameter (λ).
- ***Bayesian statistical methods*** need not just one but two probability distributions: a prior and a likelihood function. The Bayesian approach, however, is dynamic; it continuously updates its predictions as new information arrives, much like the way humans revise their understanding as they learn (see the sketch after this list).
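To make this concrete, here is a minimal sketch in Python (scipy and numpy are assumed to be available; the specific numbers are arbitrary illustrations). Each distribution must be handed its parameters before it can produce a single probability, and the Bayesian update needs both a prior and observed data from the start.

```python
from scipy import stats

# Each distribution is unusable until its parameters are supplied up front.
normal = stats.norm(loc=0.0, scale=1.0)      # mean and standard deviation
binomial = stats.binom(n=10, p=0.3)          # number of trials and success probability
poisson = stats.poisson(mu=4.2)              # average rate (lambda)
exponential = stats.expon(scale=1.0 / 4.2)   # rate (scipy expects scale = 1/lambda)

print(normal.pdf(0.5), binomial.pmf(3), poisson.pmf(2), exponential.cdf(1.0))

# Bayesian updating needs two ingredients: a prior and a likelihood.
# Beta-Binomial example: a Beta(2, 2) prior over a coin's bias, then 7 heads
# observed in 10 flips; the conjugate posterior is Beta(2 + 7, 2 + 3).
prior = stats.beta(a=2, b=2)
posterior = stats.beta(a=2 + 7, b=2 + 3)
print(prior.mean(), posterior.mean())  # the estimate shifts toward the data
```

Nothing in this sketch discovers the parameters; they are fed in by hand, which is exactly the prior knowledge the argument is about.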
Without delving too deeply into the technical details, the key point here is clear: probabilistic methods require a great deal of prior knowledge to function effectively. And therein lies the dilemma. Uncertainty, by definition, deals with the **unknown**—yet these methods rely on substantial *pre-existing knowledge* to generate meaningful predictions. How can we apply probabilistic techniques in situations of radical uncertainty, where this information is unavailable? The answer is: **we can’t**. Probabilistic methods only work when partial knowledge exists.
To be fair, this is often the case. When probability distributions are known, or when we have ample historical data, such as insurance actuarial tables or economic time series, stochastic inference is a useful tool. When the underlying distribution is unknown, we can even apply ***non-parametric*** methods,[^2] which make no strict assumptions about the form of the data and can be effective when it fits no predefined distribution. However, even non-parametric techniques rely on certain assumptions, such as data independence or stationarity.
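As a rough sketch of the non-parametric route (Python again, with numpy and scipy assumed, and the data entirely synthetic), kernel density estimation fits no named distribution, yet it still leans on a large, independent sample:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Pretend these are observations whose true distribution we do not know:
# a lopsided, two-humped sample that matches no textbook family.
samples = np.concatenate([rng.normal(-2, 0.5, 500), rng.normal(3, 1.0, 500)])

kde = stats.gaussian_kde(samples)   # no parametric form is assumed
grid = np.linspace(-5, 7, 200)
density = kde(grid)                 # estimated density over the grid
print(grid[np.argmax(density)])     # location of the estimated mode
```

With only a handful of samples instead of a thousand, the same estimate becomes unreliable, which is the large-sample caveat noted in footnote 2.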
But when so much is already known, are we still dealing with uncertainty? The answer is yes, but only in the ***resolvable*** sense—the type of uncertainty that can be overcome with the right models and sufficient data. However, we must be cautious about applying these models when probability distributions are unknown. This is especially critical in high-stakes fields such as environmental forecasting, energy grid management, or financial market modelling, where conventional probabilistic models often fail due to the ***non-stationary*** nature of these systems.[^3]
In statistics, a process is *non-stationary* when its statistical properties, such as the mean or variance, change over time. While some probabilistic techniques attempt to address non-stationary processes, they struggle when applied to ***complex dynamical systems***: systems composed of many interdependent parts that interact in unpredictable ways. Complex dynamical systems are found everywhere, from ecosystems and climate models to economies and social networks. The sheer number of interacting variables makes it impossible to generate reliable probability-based predictions.
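For a loose illustration of what non-stationarity means in practice (synthetic data, Python with numpy assumed), compare an early and a late window of a series whose mean and spread both drift:

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(1_000)
# Synthetic non-stationary series: the mean trends upward and the noise level
# grows, so neither the mean nor the variance stays constant over time.
series = 0.05 * t + rng.normal(0.0, 1.0 + t / 500.0)

early, late = series[:200], series[-200:]
print(f"early mean/var: {early.mean():.2f} / {early.var():.2f}")
print(f"late  mean/var: {late.mean():.2f} / {late.var():.2f}")
# A distribution fitted to the early window says little about the late one.
```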
All probabilistic methods require prior knowledge to be effective. This fundamental limitation confines their applicability to cases where relevant information is available. In situations of radical uncertainty—where the unknowns are too numerous or too unpredictable—probabilistic approaches lose their validity.
Therefore, probability should not be equated with uncertainty. At best, it can model *some* forms of uncertainty, but only when dealing with well-behaved, stationary systems where sufficient data already exists. In the broader landscape of the unknown, probability offers a map, but it cannot chart territories that have yet to be discovered.
[[FUD|Next page]]
<hr>
[^1]: Bertrand Russell, 1929 Lecture (cited in Bell 1945, 587).
[^2]: Non-parametric methods: histograms, kernel density estimation, empirical distribution function, quantile-quantile plots. These methods have the advantage that they do not presuppose knowledge about probability distributions. However, they require large sample sizes to make meaningful statements.
[^3]: Non-stationary processes can be modelled with autoregressive integrated moving average (ARIMA) models or stochastic differential equations (SDEs). These methods require even more knowledge of the process being modelled, such as its autocorrelation structure, probability distributions, and the appropriate differential equations.