Outlier Risk, Part I

Published 10/26/2021, 07:59 AM
Updated 07/09/2023, 06:31 AM

Would you know an outlier if you saw one? They’re everywhere and easy to spot, or so one can argue. But casual observation is one thing and shouldn’t be confused with robust statistical definitions.

Indeed, definitions matter a lot in this space. Alas, there’s no consensus on a single best way to identify “extreme” values in a data set for every analytical project. Regardless, the stakes are high: extreme numbers can reduce the reliability of modeling and analysis, so it’s often essential to filter these outliers.

“An outlier is an observation that lies an abnormal distance from other values in a random sample from a population,” advises the Engineering Statistics Handbook. Unfortunately, there’s plenty of room for debate since “this definition leaves it up to the analyst (or a consensus process) to decide what will be considered abnormal.”

The good news: there are a number of techniques for identifying outliers. The catch is that each technique has its own pros and cons, so there’s no one-size-fits-all solution.

To understand what’s available and how to identify the best option for your data analytics, let’s briefly review the choices. That starts with recognizing that finding “abnormal” data points first requires defining “normal.”

One of the standard approaches is to use the interquartile range (IQR), which measures the statistical dispersion of a data set based on quartiles. The IQR is the span between the 25th and 75th percentiles, and in the basic application of this tool, data within that span is considered “normal.” Numbers outside this range are the outliers.
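The percentile rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`iqr_bounds`, `outside_iqr`) and the sample returns are hypothetical, and the quartiles use the standard library's "inclusive" interpolation, which may differ slightly from the method behind the charts in this article.

```python
from statistics import quantiles


def iqr_bounds(values):
    """Return the 25th and 75th percentiles (Q1, Q3) of `values`."""
    # method="inclusive" treats the data as the full population and
    # interpolates linearly between neighboring observations.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    return q1, q3


def outside_iqr(values):
    """Return the values that fall below Q1 or above Q3."""
    q1, q3 = iqr_bounds(values)
    return [v for v in values if v < q1 or v > q3]


# Hypothetical one-year returns in percent, for illustration only.
sample_returns = [-12.0, -3.0, 2.0, 5.0, 8.0, 11.0, 15.0, 21.0, 30.0]

q1, q3 = iqr_bounds(sample_returns)
flagged = outside_iqr(sample_returns)
print(f"Q1={q1}, Q3={q3}, IQR={q3 - q1}")
print("Outside the 25th-75th percentile range:", flagged)
```

For this toy sample, the quartiles land on 2.0 and 15.0, so the four values outside that range are flagged.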

As an example, let’s run the analytics using rolling one-year percentage changes for the US stock market (S&P 500) since 1959. For perspective, here’s how the raw data looks through time.
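For readers who want to reproduce this kind of series, here is a minimal sketch of a rolling one-year percentage change. It assumes monthly observations (so one year equals 12 periods), which the article does not specify. The function name `rolling_pct_change` and the price levels are hypothetical and are not real S&P 500 data.

```python
def rolling_pct_change(prices, window=12):
    """Percentage change of each price versus the price `window` periods earlier."""
    return [
        (prices[i] / prices[i - window] - 1.0) * 100.0
        for i in range(window, len(prices))
    ]


# Hypothetical month-end index levels (assumption: monthly frequency).
monthly_prices = [100.0 + 2.0 * i for i in range(15)]

one_year_returns = rolling_pct_change(monthly_prices, window=12)
for r in one_year_returns:
    print(f"{r:.2f}%")
```

The first 12 observations produce no output because there is no price a full year earlier to compare against. With daily data, the window would be roughly 252 trading days instead of 12.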

[Chart: S&P 500 Rolling 1-Year Return]

It’s not obvious how to define outliers by looking at the chart above. That’s where IQR analysis can help, at least as an initial filtering step. The boxplot below shows the IQR for these returns in the grey box, which covers performances from roughly 0% to 19%. By this measure, returns that are negative or above 19% are considered outliers.

[Boxplot: Rolling 1-Year S&P 500 Change]

But that’s a bit harsh since negative one-year returns for the S&P 500 are common, or at least not unusual, through time. In other words, the standard approach for identifying outliers via IQR isn’t practical. Fortunately, there are other techniques that are better suited to finding outliers in financial markets.
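One way to see why the rule is harsh: any filter based purely on the 25th and 75th percentiles flags roughly half of a sample by construction, whatever the data looks like. The sketch below illustrates this with a hypothetical, evenly spaced sample. The name `share_outside_iqr` is an illustrative assumption, and the exact share can vary slightly with ties and the quantile method used.

```python
from statistics import quantiles


def share_outside_iqr(values):
    """Fraction of observations below the 25th or above the 75th percentile."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    outside = sum(1 for v in values if v < q1 or v > q3)
    return outside / len(values)


# Hypothetical sample: the integers 1 through 100 (not market data).
sample = list(range(1, 101))

share = share_outside_iqr(sample)
print(f"Share flagged as outliers: {share:.0%}")
```

A filter that labels about half of all observations as “abnormal” says little about which values are genuinely extreme.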

In upcoming installments of this series, we’ll take a closer look at the possibilities for improving on the IQR method for outlier detection.
