An interesting and complicated place is the stock market, and how prices move there. If we use Nobel laureate Robert Shiller's data to look at the real total return of the stock market over the last 150 years, we get a striking image (*total return* means that dividends are included, and *real* means that it is adjusted for inflation):

Maybe a bit too striking since the exponential growth means that everything before 1950 disappears into the x axis. When studying exponential growth, it's almost always better to look at it with a logarithmic scale, where a doubling looks the same regardless of whether it is 1 to 2, or 1 million to 2 million.
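To make the point about logarithmic scales concrete, here is a minimal sketch. The numbers are illustrative, not taken from the data, and `log_distance` is just a helper name I made up. On a log axis the distance between two values depends only on their ratio:

```python
import math

def log_distance(start: float, end: float) -> float:
    """Vertical distance between two values on a base-10 logarithmic axis."""
    return math.log10(end) - math.log10(start)

small_doubling = log_distance(1, 2)
large_doubling = log_distance(1_000_000, 2_000_000)

# Both doublings take up the same amount of space on the chart.
print(small_doubling, large_doubling)
```

In matplotlib, calling `plt.yscale("log")` before showing the chart gives this kind of axis.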

Many different models, of different complexity, can be created to describe the process we see in these charts. Some are very simplified like "it goes up and then down and then up again" or "the stock market grows by 5% per year" and they are not wrong, just a bit **too** simple to be interesting.

There is a lot to gain, psychologically and financially, from having the most accurate model. With the most accurate model you can make the most accurate predictions of future gains, losses and risks, which gives you the lowest risk of life-altering negative surprises and the highest chance of making money, or of avoiding losing it.

One common model, though not perfect in any way, is to assume the stock market moves randomly, in a kind of random walk (similar to Brownian motion), where every step has a random length and direction.
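A random walk like this can be sketched in a few lines. This is only an illustration: the monthly mean of 0.5% and standard deviation of 4% are assumptions I picked for the example, not values fitted to the data, and `simulate_random_walk` is a hypothetical helper.

```python
import random

def simulate_random_walk(months, mean=0.005, stdev=0.04, start=1.0, seed=None):
    """Compound `months` random monthly returns, starting from `start`."""
    rng = random.Random(seed)
    value = start
    path = [value]
    for _ in range(months):
        monthly_return = rng.gauss(mean, stdev)  # random length and direction
        value *= 1.0 + monthly_return
        path.append(value)
    return path

path = simulate_random_walk(months=12 * 150, seed=42)
print(f"Value of 1.0 after 150 simulated years: {path[-1]:.1f}")
```

Changing the seed gives a different path each time, which is the whole point: the model describes the kind of moves to expect, not any particular history.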

If we look at what kind of steps, or moves, the stock market has made every month since 1870, we get this picture:

We can see that most months the market moves somewhere between up 5% and down 5%, except when it doesn't and moves much more. I've labelled the most spectacular one-month changes. In general it looks very random, and it is hard to see any meaningful pattern. If we count how many times each size of price change has happened and plot the counts as a histogram, we get this:

Looking at the data this way, we can now see a pattern. Moves closer to 0% are more common than moves closer to 5%. In fact the peak, the most common move, is slightly above 0%, which reflects pretty well our belief that the market slowly grows over time.
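For readers who want to reproduce this kind of chart, here is a rough sketch of the two steps: turning a price series into monthly percentage changes, and counting the changes per bin. The six prices are made up for the example, and the function names are my own.

```python
from collections import Counter

def monthly_changes(prices):
    """Percentage change from each month to the next."""
    return [100.0 * (later / earlier - 1.0)
            for earlier, later in zip(prices, prices[1:])]

def histogram(changes, bin_width=1.0):
    """Count how many changes fall in each bin of `bin_width` percentage points."""
    counts = Counter((change // bin_width) * bin_width for change in changes)
    return dict(sorted(counts.items()))

# Made-up index values, one per month.
prices = [100.0, 102.5, 101.0, 104.0, 99.0, 100.5]
changes = monthly_changes(prices)

for bin_start, count in histogram(changes).items():
    print(f"{bin_start:+.0f}% to {bin_start + 1:+.0f}%: {'#' * count}")
```

With the real series of roughly 1,800 months, the same counting produces the bell-like shape in the histogram.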

The question now is whether we can use this pattern to create a model. One common model for something similar to what we see here is the Normal Distribution, also known as the Gaussian distribution. In many cases the Normal Distribution is a good model for random events because, when you sum a lot of independent random events with finite variance, the sum tends towards the Normal Distribution (this is the central limit theorem).
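The claim about sums is easy to check with a small experiment. The sketch below sums 50 uniform random numbers, which individually look nothing like a bell curve, and measures how many of the sums land within one standard deviation of the mean. For a Normal Distribution that share is about 68%. The choice of uniform inputs and of 50 terms is arbitrary.

```python
import random
import statistics

rng = random.Random(0)

def sum_of_uniforms(n_terms):
    """Sum of `n_terms` independent draws, each uniform on [-1, 1]."""
    return sum(rng.uniform(-1.0, 1.0) for _ in range(n_terms))

sums = [sum_of_uniforms(50) for _ in range(20_000)]
mean = statistics.fmean(sums)
stdev = statistics.stdev(sums)

# Share of the sums that lie within one standard deviation of the mean.
within_one_sigma = sum(abs(s - mean) <= stdev for s in sums) / len(sums)
print(f"Share within one standard deviation: {within_one_sigma:.3f}")
```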

Moves in the stock market are not entirely independent of earlier moves, but despite that, if you assume that every random step on the stock market follows the Normal Distribution, you end up with something that seems reasonably accurate and believable.

Using SciPy to configure the best fitting Normal Distribution we end up with this model, where the orange line is the Normal Distribution's idea of how the stock market moves:
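One plausible way to do such a fit with SciPy is `scipy.stats.norm.fit`. I am not claiming this is the exact routine behind the chart. The data below is synthetic placeholder data, so swap in the real monthly returns to get meaningful numbers.

```python
import numpy as np
from scipy import stats

# Placeholder data: synthetic fat-tailed "monthly returns" as fractions.
# Replace with the real series of monthly changes.
rng = np.random.default_rng(0)
monthly_returns = 0.005 + 0.03 * rng.standard_t(df=4, size=1800)

# Maximum-likelihood fit. For the Normal Distribution this is simply the
# sample mean and the (population) standard deviation.
mu, sigma = stats.norm.fit(monthly_returns)

# Density of the fitted model on a grid, ready to draw on top of a histogram.
grid = np.linspace(monthly_returns.min(), monthly_returns.max(), 200)
fitted_density = stats.norm.pdf(grid, loc=mu, scale=sigma)

print(f"mean = {mu:.4f}, standard deviation = {sigma:.4f}")
```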

It is not a perfect fit. When fitting the Normal Distribution to reality, the algorithm has to choose between covering the less likely events and matching the more likely ones. In the compromise chosen above, it fails at both.

The problems we observe with the Normal model are the lack of *skewness* and the lack of *fat tails*. Skewness means that the distribution is not symmetric: it looks different to the right and to the left of the middle point.

The Normal Distribution has no skewness at all, but if you look at the historical stock market data, there are clear signs of skewness. The price of something typically takes many small steps up and some big steps down, or the other way around. Prices can also never go below zero while they can, in theory, go up by any amount. That in itself forces the model to be skewed.
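Skewness can be put into a number. A common definition, and the one sketched here, is the average cubed distance from the mean, measured in standard deviations. The three tiny data sets are invented to show the sign convention: many small steps up and one big step down gives a negative value.

```python
import math
import statistics

def sample_skewness(values):
    """Average of the cubed standardized deviations (population skewness)."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    return statistics.fmean(((v - mean) / stdev) ** 3 for v in values)

small_ups_big_down = [1, 1, 1, 1, 1, 1, 1, 1, 1, -9]
small_downs_big_up = [-v for v in small_ups_big_down]
symmetric = [-2, -1, 0, 1, 2]

print(sample_skewness(small_ups_big_down))  # negative: long tail to the left
print(sample_skewness(small_downs_big_up))  # positive: long tail to the right
print(sample_skewness(symmetric))           # zero: no skew
```

`scipy.stats.skew` computes the same quantity for larger data sets.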

The Normal Distribution also lacks fat tails, or "high kurtosis". With the Normal Distribution it is rare (about 0.3% of the time) for random samples to be more than 3 standard deviations away from the mean, and the probability drops off extremely fast beyond that, but when we look at the actual data we see samples appearing far, far away from the middle. In the Normal Distribution, one such sample would be rare, and two just don't happen. To communicate that events far from the usual distribution do occur, we say that the distribution has a fat tail, or a heavy tail, or has high kurtosis. Ignoring that fat tail means that your models will ignore black swan events, which can cost you dearly.
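How rare is "rare" under the Normal Distribution? The sketch below uses Python's `statistics.NormalDist` to estimate how many months out of roughly 150 years we should expect to land beyond a given number of standard deviations. The helper name and the choice of thresholds are mine.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def expected_outliers(sigmas, n_samples):
    """Expected number of samples more than `sigmas` standard deviations from the mean."""
    tail_probability = 2 * STANDARD_NORMAL.cdf(-sigmas)  # both tails
    return n_samples * tail_probability

months = 12 * 150  # roughly the length of the monthly series
for sigmas in (3, 4, 5, 6):
    expected = expected_outliers(sigmas, months)
    print(f"beyond {sigmas} standard deviations: {expected:.6f} months expected")
```

A handful of months beyond 3 standard deviations is within what the Normal model allows. Months beyond 5 or 6 are not, and that is where the real data and the model part ways.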

After a lot of research, comparing dozens of probability distributions in thousands of configurations, I have determined that the Johnson S_{U} Distribution is a really good fit to the observed pattern. It is not well known, and sadly many statistics packages lack support for it, but it supports both negative skewness and fat tails, and the more recent the data we look at, the better the Johnson S_{U} Distribution fits, statistically. Remember, it's still just a model.
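SciPy is one package that does support it, as `scipy.stats.johnsonsu`. Here is a hedged sketch of fitting it and comparing its view of a crash with the Normal model's view. The data is synthetic, generated from made-up Johnson S_U parameters, and the 15% threshold is an arbitrary choice for the example, so only the mechanics carry over to real data.

```python
import numpy as np
from scipy import stats

# Placeholder data drawn from a Johnson SU distribution with made-up
# parameters (negative skew, fat tails). Replace with real monthly returns.
monthly_returns = stats.johnsonsu.rvs(
    0.5, 1.5, loc=0.02, scale=0.04, size=1800, random_state=1
)

# Fit both models by maximum likelihood.
a, b, loc, scale = stats.johnsonsu.fit(monthly_returns)
mu, sigma = stats.norm.fit(monthly_returns)

# Probability that a single month loses more than 15% under each model.
crash = -0.15
p_johnson = stats.johnsonsu.cdf(crash, a, b, loc=loc, scale=scale)
p_normal = stats.norm.cdf(crash, loc=mu, scale=sigma)

print(f"P(month < -15%)  Johnson SU: {p_johnson:.2e}  Normal: {p_normal:.2e}")
```

The two shape parameters `a` and `b` are what give the distribution its skewness and tail weight. `loc` and `scale` shift and stretch it, as with any SciPy distribution.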

Comparing this distribution to actual data shows a rather good fit. It might still miss the most extreme movements from the 1929 crash and the great depression, but everything else looks good.

In general, no distribution I researched was able to cover the 1929 crash and the years after that. This can be seen pretty well in this plot where I was looking for optimal parameters for different time periods (looking at 30 year blocks starting at the chart date). Anything that covers 1929-1935 (i.e. 1899 to 1935 in the chart) has to have completely different parameters and the result is still not that good. In the end I elected to choose distribution parameters that handled everything else well. After all, there have been many other crashes so the model does not ignore them completely.
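A search over time periods like this can be sketched as a loop over 30-year blocks, refitting the distribution for each block. This is my reconstruction of the idea, not the code behind the plot. The 10-year step, the start year of 1871 and the synthetic data are all assumptions for the example.

```python
import numpy as np
from scipy import stats

def fit_windows(returns, start_year, window_years=30, step_years=10):
    """Fit Johnson SU to each block of `window_years`, keyed by the block's start year."""
    window = window_years * 12
    step = step_years * 12
    fits = {}
    for first in range(0, len(returns) - window + 1, step):
        block = returns[first:first + window]
        fits[start_year + first // 12] = stats.johnsonsu.fit(block)
    return fits

# Placeholder data: 150 years of synthetic monthly returns.
monthly_returns = stats.johnsonsu.rvs(
    0.5, 1.5, loc=0.02, scale=0.04, size=150 * 12, random_state=3
)

fits = fit_windows(monthly_returns, start_year=1871)
for year, (a, b, loc, scale) in fits.items():
    print(f"{year}-{year + 30}: a={a:.2f} b={b:.2f} loc={loc:.3f} scale={scale:.3f}")
```

With real data, a block whose parameters jump far away from its neighbours is a sign that something unusual, like 1929, sits inside it.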

This also tells us another thing. Whatever model we select today will probably not be the best model for the future, since the world is complicated. Not to mention that we have a kind of Heisenberg effect: using a certain model will contribute to making that model unsuitable, since people will work hard to take advantage of the model’s weaknesses.

While Johnson S_{U} has the best fit, there are plenty of other probability distributions that are better than the Normal Distribution but they all have their quirks.
A common choice for financial modelling is Student's t distribution.
It also has fatter tails than the Normal Distribution, which does help. In its standard form it is symmetric, though, so any skewness has to come from a skewed variant. It is better known as the basis of the t-test, the popular test for whether an observed difference is statistically significant or not.

As can be seen in the plot above, Student's t distribution and the Johnson S_{U} distribution both look pretty good. Johnson S_{U} has a slightly fatter negative tail, and a skewness that fits the data slightly better, but both are ok. Johnson S_{U} is just slightly better. I am not the first one to discover it (see this Morningstar article from 2011),
but maybe I am the second one.
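To compare candidates with a number rather than by eye, one option is the Akaike information criterion (AIC), which rewards a high likelihood and penalizes extra parameters. Lower is better. Using AIC here is my choice for the sketch, not necessarily the measure used elsewhere in this post, and the data is again synthetic.

```python
import numpy as np
from scipy import stats

# Placeholder data with negative skew and fat tails. Replace with real returns.
monthly_returns = stats.johnsonsu.rvs(
    0.5, 1.5, loc=0.02, scale=0.04, size=1800, random_state=2
)

candidates = {
    "Normal": stats.norm,
    "Student's t": stats.t,
    "Johnson SU": stats.johnsonsu,
}

def aic(distribution, data):
    """AIC of the maximum-likelihood fit of `distribution` to `data`."""
    params = distribution.fit(data)
    log_likelihood = np.sum(distribution.logpdf(data, *params))
    return 2 * len(params) - 2 * log_likelihood

scores = {name: aic(dist, monthly_returns) for name, dist in candidates.items()}
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name:12s} AIC = {score:.1f}")
```

Since the placeholder data was generated from a Johnson S_U distribution, this run says nothing about the real market. It only shows how the comparison works.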

They are far from the only probability distributions with fat tails and skewness though. Honorable mentions go to the Tukey lambda distribution and the Cauchy distribution. Tukey lambda is a hybrid distribution that also fits the data well, especially in the middle of the last century, and Cauchy distributions have the interesting property that their tails are so heavy that they put no practical limit on extreme movements: neither the mean nor the variance is even defined. All the other models make sufficiently extreme moves vanishingly unlikely, but not Cauchy. If you like fat tails, few tails are fatter than the Cauchy tail. Unfortunately that also means that simulations using the Cauchy distribution can go completely bonkers. Maybe that is realistic, but in our timeline, nothing has been that crazy, yet.
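To get a feeling for just how wild the Cauchy tail is, the sketch below draws 100,000 samples each from a standard Normal and a standard Cauchy distribution and compares the most extreme value in each. The Cauchy draws use the inverse-CDF method. The sample size and seed are arbitrary.

```python
import math
import random

rng = random.Random(7)

def cauchy_draw():
    """One draw from the standard Cauchy distribution (inverse-CDF method)."""
    return math.tan(math.pi * (rng.random() - 0.5))

n = 100_000
normal_draws = [rng.gauss(0.0, 1.0) for _ in range(n)]
cauchy_draws = [cauchy_draw() for _ in range(n)]

largest_normal = max(abs(x) for x in normal_draws)
largest_cauchy = max(abs(x) for x in cauchy_draws)

print(f"Most extreme Normal draw: {largest_normal:.1f}")
print(f"Most extreme Cauchy draw: {largest_cauchy:.1f}")
```

The most extreme Normal draw stays within a few units of zero, while the most extreme Cauchy draw is typically thousands of times larger. Feed moves like that into a market simulation and a single month can dominate everything else.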

A couple of notes about the data I'm trying to match. It is based on the S&P 500, adjusted for inflation and including dividends. Other markets have different characteristics. In particular, this is based on the US dollar, which has seen a gradual increase in value in the last 150 years. Whether this helps (cheaper imports) or hinders (harder to sell abroad) in the long term is always up for debate, but it does play a role.

Finally, note that this model, which assumes that the market does a random walk following a Johnson S_{U} distribution, is a simplified model, not reality. It is not as simplified as "5% growth every year", but there is room for many other models. You can, for instance, build supermodels where you model multiple areas of the economy (statistically or otherwise) and let those models interact. Vanguard has such a model that they use internally and seem proud of. Those might be better, or they may have more human biases built in.
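For completeness, here is roughly what using the simple model looks like in practice: draw monthly moves from a Johnson S_U distribution, compound them, and repeat many times to get a spread of outcomes. The parameters below are made up for illustration and are not the fitted values, and `simulate_outcomes` is a hypothetical helper.

```python
import numpy as np
from scipy import stats

def simulate_outcomes(a, b, loc, scale, years=30, n_paths=1000, seed=0):
    """Final value of 1.0 invested, for many simulated random walks."""
    months = years * 12
    returns = stats.johnsonsu.rvs(
        a, b, loc=loc, scale=scale, size=(n_paths, months), random_state=seed
    )
    returns = np.maximum(returns, -1.0)  # a price cannot fall below zero
    return np.prod(1.0 + returns, axis=1)

# Made-up parameters: negative skew and fat tails, small positive drift.
outcomes = simulate_outcomes(0.5, 1.5, loc=0.02, scale=0.04)
p5, p50, p95 = np.percentile(outcomes, [5, 50, 95])

print(f"After 30 years: 5th percentile {p5:.2f}, median {p50:.2f}, 95th percentile {p95:.2f}")
```

The gap between the 5th and 95th percentiles is the useful part. It shows how much of the outcome the model attributes to luck.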
