Expectation Modeling (White Paper)
The Starting Point
To understand tradescapes, it is necessary to have a grasp of EM, Expectation Modeling. This is the science that generates the trading signals used for all of the backtests represented in a tradescape.
We begin with what appears on the surface to be something quite elementary, an expectation model that describes a time series. In simplest terms, an expectation model is a mathematical estimation where each point in the estimation represents the maximum likelihood of price based on the order within the price movements.
If the dynamics of a price series are seen as a blend of orderly and chaotic components, expectation modeling can be summarized as the technology that extracts just the orderly component.
The analytical procedures that accomplish this function are many. They typically fall under the class of numerical methods that consist of smoothers, filters, denoisers, and methods for isolating signal components. These are traditionally performed in the time domain, but they can also be Fourier or wavelet filters that remove high-frequency content, or the eigenfiltering done in an SSA (singular spectrum analysis) algorithm.
The separation of chaotic movements from those that are ordered is a far more challenging problem than basic smoothing. If one were to look at the daily price movements in a financial time series, the density of those movements would have a skew, more movements to either the upside or downside and any number of what are known as "fat tail" events that can dramatically alter trading system performance. At any point in time, and for a given time horizon, some portion of the trending represented in that skew will persist. The same is true of the fat tail events. EM seeks to isolate the persistent or ordered component, and to strip away the chaotic portion of the movements that lack this persistence in time.
Causal vs. Non-Causal Smoothing or Filtering
If we reduce the function of an expectation algorithm to its simplest form, it is to offer a smooth representation of the ordered movements within a time series. In this context, we wish to filter out what signal scientists call noise, the higher frequency oscillations and seemingly random scatter that make the underlying deterministic trends difficult to discern.
In a live financial system, any smoothing or filtering must be causal: it can use only the information available at any given point in time. It cannot look ahead. Moving averages are typically causal methods since they use only data up to and including the present bar, and do not usually look ahead. All causal methods that actually reduce the noise in a time series introduce some measure of lag in the measurement.
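The lag of a causal smoother is easy to see on a pure trend. The following is a minimal sketch, not our EM algorithm: it compares a trailing (causal) simple moving average with a centered (non-causal) one on a hypothetical price ramp. The function names and the 15-bar window are illustrative assumptions.

```python
import numpy as np

def trailing_sma(x, window):
    """Causal SMA: each output uses the current bar and the window-1 bars before it."""
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)
    csum = np.cumsum(np.insert(x, 0, 0.0))
    out[window - 1:] = (csum[window:] - csum[:-window]) / window
    return out

def centered_sma(x, window):
    """Non-causal SMA: each output averages bars on both sides, so it looks ahead."""
    half = window // 2                      # window is assumed odd
    out = np.full(len(x), np.nan)
    out[half:len(x) - half] = trailing_sma(x, window)[window - 1:]
    return out

ramp = np.arange(50, dtype=float)           # hypothetical series rising 1 unit per bar
window = 15
causal = trailing_sma(ramp, window)
noncausal = centered_sma(ramp, window)

# On a ramp, the shortfall of the smoothed value equals the lag in bars.
lag_causal = ramp[30] - causal[30]
lag_noncausal = ramp[30] - noncausal[30]
print(lag_causal, lag_noncausal)            # 7.0 0.0
```

The trailing average sits (window - 1) / 2 = 7 bars behind the trend, while the centered average has no lag but needs 7 bars of future data, which is why it cannot be used for live signaling.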
If one can generate a smooth causal representation of the underlying trending within a data stream, one can potentially signal the turns, the maxima and minima, that occur in a financial time series. If there is sufficient order in the underlying movement of prices, that is to say, the data are sufficiently 'well behaved', it may be possible to signal entries and exits where the lag is small enough, and the accuracy high enough, to realize profitable investing. This describes the signalers used in most automated trading systems.
Trading Signaling – The Inside-the-Box Paradigm
The art of designing a computerized trading system has a great many elements. One must often select the entities that will be traded. One must find a signaling system that furnishes profitable entries and exits. That signaling system may or may not act upon the actual entity being traded. Most signaling systems have a good number of adjustable parameters that can be optimized for historical data, the premise being that the changes in the underlying dynamics going forward in time will not be so great as to invalidate the optimization walking forward.
Let's look at a typical scenario that might occur in a large investment firm's numerical group. Let's say someone designs a new indicator that offers what is supposedly a faster lower-lag relative strength estimate. To see if such an analytical signaling system has merit anywhere in the actively managed fund portfolios within the company, the software scientists at the firm may design an experiment that consists of the 500 securities currently under investment in the firm. To mimic the real world, let's assume that this signaling system has an information content parameter, such as a window length, and a parameter that tweaks the adaptive property of this new indicator. Let's also assume the company's signaling systems typically use a small but variable epsilon as a turn confirmation to avoid the fast and costly whipsaws that can occur in a binary signaler. We could equally assume a volatility channel or a confidence band that attenuates the signaling process and is likewise adjustable. Further, let's assume that the company's signaling systems sometimes take their signals from overall market surrogates, such as major market or sector indices, instead of signaling directly on the security.
For this hypothetical case, we have a candidate pool of 500 securities, a set of three dozen sector and overall index surrogates as potential signal targets, a two-parameter signaling algorithm with 20 reasonable values of a window length and 20 reasonable values of the adaptive parameter, and a signal conditioning parameter with 10 reasonable values. If a design of experiments matrix is ruled out, and a blind brute force study is done, we have 500 × 36 × 20 × 20 × 10 = 72 million backtests for each set of selected and reserved data ranges used in a properly designed study. For a well-structured Monte Carlo analysis that will give strong confidence, one would like to shuffle the design and walkforward portions of the data quite a few times, and each shuffle repeats all 72 million backtests.
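The size of this brute force grid can be checked in a few lines. The dictionary keys and the `total_backtests` helper are illustrative names, and the number of shuffles passed to it is a hypothetical value, since the text does not fix one.

```python
from math import prod

# Grid dimensions taken from the hypothetical case in the text.
grid = {
    "securities": 500,
    "signal_targets": 36,         # three dozen sector and index surrogates
    "window_lengths": 20,
    "adaptive_values": 20,
    "conditioning_values": 10,
}

backtests_per_setup = prod(grid.values())
print(f"{backtests_per_setup:,}")   # 72,000,000

def total_backtests(n_shuffles):
    """Backtests needed when the design/walkforward split is reshuffled n_shuffles times."""
    return backtests_per_setup * n_shuffles

# n_shuffles=5 is an arbitrary illustration, not a recommendation.
print(f"{total_backtests(5):,}")    # 360,000,000
```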
Computers can do this. Computers are doing exactly this. It may take many hours, or even days or weeks, and when all is said and done, the analyst will have to distill the results from this immense matrix. Ideally the response surface of the two parameters in the signaling algorithm will also be inspected to be certain that the signal settings are in a large, robust, stable zone of profitability. And those response surfaces should ideally consist of plots of risk-adjusted returns.
In truth, trading systems can be more complex than this, with perhaps as many as a half dozen parameters in the signaling algorithm, two tiered signalers working together, complex stops that become a part of the system, and the pyramiding of multiple trades. There are actually genetic optimizers and neural nets that can sometimes very effectively tune such a system without overfitting.
If this sounds familiar, or at least makes some sense to you hypothetically, you are looking at the box, the way advanced computer-aided trading system design is often done.
EM works well outside the box. EM, executed properly, makes much of this unnecessary.
Let us assume the aforementioned experiment is a success, and that a small number of securities are traded very effectively using this new signaler. Let us assume that some of those are signaled directly, while others are traded using signals from an overall market index and still others using signals from a sector index. Let us also assume that for each, there are rather sharply varying parameters in the algorithm optimization for best risk-adjusted return. From our experience, unless the indicator is completely worthless, such is actually the likely outcome of such an experiment.
It isn't difficult to look upon the process from a scientific mindset. We ask why. Why certain securities and not others? Of those that did work, why are some using a relatively fast signaling while others are much slower? Why are certain securities best traded directly and others best traded using a more generic market signal? Basically, for those that work, why do they work? And for those that do not, why is there no useful trading function? What is the dynamic that is in play and more importantly, how stable is that dynamic across time? How likely is it to cease to work further along? In multidimensional mathematical space, is one operating on a fragile knife-edge or squarely in the center of a very stable plateau?
Our experience suggests there are seldom answers to those questions. The thinking tends to go something like this: if you have something that works now, it's fine to not know why. Count your blessings and trade profitably while you can. The trading function could disappear tomorrow with a major sea-change in the market, or even a small one. Enjoy it while you can and then move on. And what is the use of knowing anyway, especially if one is exploiting an ephemeral opening for profitability? It is all fleeting, anyway.
Or perhaps your technologist will start talking about it having to do with Hurst coefficient compressions, or shifts in the Lyapunov exponents. Or perhaps the analysts will explain the favorable securities as those that are often ripe for breakouts due to patterns of pent-up demand and serious market undervaluation.
If you happen to be shrewd and insightful, you generally grasp that no one has the slightest clue. Why else could there be such a huge third party market for trading system black boxes?
Using EM to Go Outside-the-Box
What if the signaling process could be understood scientifically, analytically, mathematically? What if you could actually know why one entity works and another does not? What if you could actually understand the true efficacy of signalers in the context of what genuinely matters for profitable trading?
To do this, we have to think far, far outside the box. EM, expectation modeling, can be causal or non-causal. The latter is generally deemed worthless in the financial arena, since such models cannot be used in real-world signaling. Inside the box, one asks why it could possibly make any sense to expend energy on anything that can't generate live signals.
Outside the box, we look at life very differently. We can actually use non-causal EM to answer those nagging questions. With the proper EM algorithm, we can not only know which entities can be effectively traded on price, we can know how good the signaler has to be in order to generate a specific reward to pain. Further, we can use an EM reference as a benchmark, as the gold standard, to evaluate the accuracy and the lag in the real-world signaling algorithm.
With a bit of innovation, we can generate tradescapes, 3D visualizations of the reward-pain that can be achieved at various information content and lags using an ideal signaler, one that has what we assume to be full accuracy. We can plot the real-world signals atop the tradescape and immediately see the cost of the inaccuracy incurred from the real-world causal signaling.
By understanding the order within a price series and the lag and accuracy realized by one's signaler, it is possible to answer a great many of the seemingly impossible questions. Further, much of it is possible in just a few seconds' time.
A Little Lag Reduction Goes a Long Way
The adage that one could go from a mediocre trader to the richest person on earth with but a day's reduction in signaling lag is perhaps a bit much, but the influence is profound, and this we will illustrate using our EM reference as a trading signaler. Bear in mind that this cannot be done in the real world, but this should help to demonstrate why EM can be so useful in answering critical questions.
This is the SP500 index signaled with our EM algorithm. We use an information content of 15 days, simulating a system that works with 14 past days' information and the current day in order to generate a signal. The information content, which we call the EM length, is the time horizon that determines the trade density, and accordingly the average trade length of the signaler. In this example, we deliberately lag this ideal signal by 14 days.
The white curve is the SP data, the cyan curve is the traded or equity curve for the system. We are looking at 3500 days or bars of data, 14 years in all.
Forget for the moment how the signal was generated, and let's say that this was one of the trading systems one developed for trading SPY or ES futures. The robust trend for the buy and hold for the 14 years is -0.07%, essentially flat as a consequence of the Internet bubble collapse and financial meltdown events. The trading system, unleveraged, trading long only, generates a robust CAGR of +3.9%. There are 46 trades with an average duration of 45 days, and just 45.6% of those trades are wins. The d(R³), the improvement in R³ over the underlying, is 0.28, and the d(RRt) is 0.59. While hardly spectacular, this is what one might expect from a typical signaling system operating on the SP index.
If this were to actually represent the optimum signaling system for a given design, this state of affairs would represent the best-case scenario. We would expect to see inferior performance with other parameter settings on the signaler, for example.
With a non-causal EM signaler, however, we can study what we gain by an improvement in lag. Let's see what the equity curve looks like if we build a super-signaler that has just 12 days lag, but retains 100% accuracy in terms of detecting the turns.
The first instance with a lag of 14 had a lag fraction of 14/15, or 0.93. Here we have a lag of 12, and a lag fraction of 12/15, or 0.8, quite realizable in real-world causal signalers, but seldom with anything close to 100% accuracy. This change from 0.93 to 0.8 lag fraction has changed the scenario appreciably. We now have a +7.1% robust CAGR, a 59% win rate, and the d(R³) is 0.87 and the d(RRt) is 1.43. By the definition of pain in R³, we have almost as much reward as pain. By the definition of pain in RRt, we have more reward than pain.
Now, let's see what a world-take-notice signaling system would look like. We will shift the EM lag to just 10 days. The lag fraction is now 10/15 or 0.67. This level of lag is often possible in a real-world signaler, but usually at a high cost in accuracy of signaling the true turns in prices.
We now see very little in the way of drawdowns. The return by robust CAGR is +11.1% for the 14 years. The win rate is now 74%, the d(R³) is 3.32, the d(RRt) is 3.45. There is clearly far more reward than pain.
Do you begin to see the value of an EM signal? Without as yet looking at any real-world signal, we are beginning to get a picture of how the entity, here the SP500 index, responds to improvements in lag, and we also see what we can realistically expect at each lag if we achieve full accuracy. In practice, achieving close to full accuracy in a real-world signal is surprisingly easy. A two-pass sequential simple moving average trading signal typically comes very close, but its lag is usually prohibitive. The EM technology at this point is telling us more about the measure of well-behaved trending in the entity than anything else. That is useful. If we want to build a signaling system for the SP500, how good do we have to be? How well rewarded are we for any reduction in lag or any improvement in more accurately catching the price turns?
What happens if we carry EM to its unique extreme, that of no lag whatsoever?
At this stage, we have entered imaginary space. No such real world signaler is even remotely possible. Still, there is information to be gleaned from this ideal equity curve. The robust trend is a CAGR of 25.8%. That is spectacular for this entity, but there are individual securities that have done better than this across the last 14 years. Even with the Internet bubble burst included where AAPL suffered a horrific 80% retracement, its buy and hold robust CAGR is 31.1%.
This should mean a great deal to the fundamental analyst selecting securities and wanting to factor in historical price behavior. One earned more from buying and holding AAPL, doing no trading whatsoever for those 14 years, than one would have realized from trading the SP with an absolutely ideal signaler, full accuracy, no lag.
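The lag experiment in this section can be sketched in code. This is only an illustration under stated assumptions: a centered moving average stands in for the EM reference (it is not our EM algorithm), the prices are a synthetic random walk rather than SP500 data, and the function names are hypothetical. The numbers it prints therefore bear no relation to the figures quoted above; the point is the mechanics of delaying an ideal turn signal by a chosen lag.

```python
import numpy as np

def centered_sma(x, length):
    """Non-causal smoother used here only as a stand-in for an EM reference."""
    pad = length // 2   # length is assumed odd so the output matches the input size
    return np.convolve(np.pad(x, pad, mode="edge"), np.ones(length) / length, mode="valid")

def lagged_equity(prices, em_length, lag):
    """Long-only equity curve of an ideal turn signal delayed by `lag` bars."""
    log_p = np.log(prices)
    rising = np.diff(centered_sma(log_p, em_length)) > 0   # state of the reference at each bar
    position = np.zeros_like(rising)                        # flat until the first delayed signal
    position[lag:] = rising[:len(rising) - lag]             # act `lag` bars after the reference
    return np.exp(np.cumsum(np.diff(log_p) * position))

# Synthetic prices (assumed drift and volatility), 3500 bars as in the example above.
rng = np.random.default_rng(0)
prices = 100 * np.exp(np.cumsum(rng.normal(0.0002, 0.01, 3500)))

em_length = 15
for lag in (14, 12, 10, 0):
    equity = lagged_equity(prices, em_length, lag)
    print(f"lag {lag:2d}  lag fraction {lag / em_length:.2f}  final equity {equity[-1]:.2f}")
```

With this stand-in, the zero-lag curve benefits from look-ahead by construction, which mirrors the "imaginary space" case: it bounds what is attainable rather than describing a signaler one could run live.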
Tradescapes – Visualizing the EM Response Surface
We looked at just one EM length of information content, 15 days, and we looked at just 4 lags, 14, 12, 10, and 0 days. The extrapolation to a tradescape is straightforward. What would happen if we plotted a 3D surface consisting of a whole range of EM lengths and a whole range of lags?
This is a compact representation, in a contour plot, of a 2500-day tradescape for SPY, the primary tradable SP entity outside of futures. Depending on the ranges generated, a tradescape will consist of anywhere from 500 to 1000 backtests, each similar to the four specific instances plotted individually. In effect, the whole universe of EM information is presented in a single plot. By using RRt, or another risk-adjusted reward, as the dependent or Z variable (the contour gradient variable), one can immediately see where one must be in terms of lag and trade density in order to have more reward than pain, assuming one can build a signaler with close to full accuracy.
Because there is so much information, it is often useful to view the response surface in a 3D plot. The information conveyed in the tradescape suggests SPY can be traded, but it is not easy to realize more reward than pain. Everywhere there is a color other than the baseline red, there is more reward than pain by the RRt metric, but the EM signals are assumed to have 100% accuracy, and that is not realistically achievable at low lags.
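The construction of such a response surface can be sketched as a grid of lagged backtests. Everything specific here is an assumption for illustration: a centered moving average stands in for the EM reference, the prices are synthetic, the lag axis is expressed as a lag fraction, and the Z value is a simple annualized log return divided by maximum drawdown rather than RRt or R³.

```python
import numpy as np

def centered_sma(x, length):
    """Non-causal stand-in for an EM reference; length is assumed odd."""
    pad = length // 2
    return np.convolve(np.pad(x, pad, mode="edge"), np.ones(length) / length, mode="valid")

def reward_to_pain(prices, em_length, lag, bars_per_year=252):
    """Stand-in metric: annualized log return / max drawdown of the log equity curve."""
    log_p = np.log(prices)
    rising = np.diff(centered_sma(log_p, em_length)) > 0
    position = np.zeros_like(rising)
    position[lag:] = rising[:len(rising) - lag]            # ideal signal delayed by `lag` bars
    log_equity = np.concatenate(([0.0], np.cumsum(np.diff(log_p) * position)))
    max_dd = (np.maximum.accumulate(log_equity) - log_equity).max()
    annual = log_equity[-1] * bars_per_year / (len(log_equity) - 1)
    return annual / max_dd if max_dd > 0 else np.inf

def tradescape(prices, em_lengths, lag_fractions):
    """One backtest per (EM length, lag fraction) pair; returns the Z surface."""
    surface = np.empty((len(em_lengths), len(lag_fractions)))
    for i, length in enumerate(em_lengths):
        for j, frac in enumerate(lag_fractions):
            surface[i, j] = reward_to_pain(prices, length, int(round(frac * length)))
    return surface

rng = np.random.default_rng(1)
prices = 100 * np.exp(np.cumsum(rng.normal(0.0002, 0.01, 2500)))   # synthetic 2500 bars
em_lengths = np.arange(5, 65, 2)            # 30 odd EM lengths
lag_fractions = np.linspace(0.0, 1.0, 21)   # 21 lag fractions from 0 to 1
surface = tradescape(prices, em_lengths, lag_fractions)
print(surface.shape)                         # (30, 21), i.e. 630 backtests
```

The resulting array can be handed to any contour or 3D surface plotting routine, with EM length and lag on the two horizontal axes and the reward-to-pain value as the height.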
Think of lag in terms of how your human mind would process twice as much future information. In terms of a turn actually taking place at the present point, that transition's detection is dramatically improved if you have all of this additional information to work it out. In a causal signaling system, you assign a signal to the present point, but in truth it represents the state of affairs some number of bars back in time. A simple price signaler, such as a moving average, has an algorithmic lag. From that lag, one knows the point in the time series that is actually being estimated. The points between this defined lag and the current point are in truth future points the algorithm has available to work it out, to get it right, to find an accurate state for the smoothed causal estimate.
Even a predictive system, which seeks to forecast a future price, is almost certainly working with some form of momentum and volatility, and these will each have their own intrinsic lag. In so doing, one is not really estimating tomorrow's price from today's information, but rather from a set of predictors which are each centered somewhere in the past, each at their own lag.
Even a system managed by human decisions for entry and exit will have lag, however much there is an effort to preempt or anticipate turns. And that lag is likely to wildly vary. Still, we wonder if the system analysis aspect of the EM technology will turn up human traders that can beat any computerized signaling system. EM does furnish the means to evaluate the lag and accuracy of any signaling system, algorithmic or human.
One place where a tradescape is particularly useful is telling you where not to build your signaling system. This is a shaded surface plot for SPY, the angles changed to show a deep depression in the RRt response surface. The EM lengths are approximately 35-50% of the average trade lengths. This suggests to us an area that is being very heavily worked by traders. We would not want to trade SPY with any system whose average trade lengths were around 100 days unless we had a very effective low lag signaling system.
Tradescapes for Security Screening
Before you begin to build trading systems, wouldn't it be helpful to know which systems signal on price in a well-behaved orderly way and which ones require heroics? Even if one will be making buy and sell decisions from human judgment, wouldn't it be helpful to know which securities might be forgiving and quite tolerant of some very lagged decisions, and which ones are likely to punish brutally when there is a delay in acting? To sum it up succinctly, it is very useful to know just how good one has to be, algorithmically or on the human side, in order to actively manage any given investment instrument.
Why do hedge funds love to put an actively-managed AAPL in their portfolios? Compare this 2500-day AAPL tradescape with that of SPY. At the appropriate trade density, one can trade AAPL very effectively with nothing more than a two-pass SMA for a signal. We use that as the upper limit algorithm in our work since it approaches the accuracy of the EM reference, with a lag typically around 1.1 or so. With AAPL, at least in terms of its historic behavior, the demands upon the signaler are far more forgiving.
This is not to say that AAPL doesn't suffer punishing movements. It certainly does, and on almost every time scale. Rather, the tradescape indicates that the reward, the long term trend, is so great relative to those drawdowns or retracements that one can have a very lagged entry and exit and still realize a very good reward to pain.
The explanation is actually quite simple in a mathematical sense. Imagine an entity like AAPL whose 10-year robust trend is actually greater than a +40% CAGR. If one builds a system that generates totally random entries and exits, and is in the market half of the time, one would expect to see about half that trend across some large Monte Carlo sampling study. With no signaling intelligence whatsoever, one would expect to see about a +20% return if the historical growth continues into the future. The higher the long term trend, and the more stably and consistently that trend is distributed across time, the more one can signal inefficiently and still come out looking good. With some random luck, it can look impressive.
With SPY we have a 3.7% robust CAGR over the past 10 years. That is far less leeway to work with for erroneous or badly lagged signals. The same experiment with random entries and exits would realize only about half of this far smaller long term trend, on average less than 2%. For such an entity random positive luck could easily make for a nice return and negative luck could produce sharp losses. This is why luck is such a consideration when evaluating any system's performance.
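The random-entry thought experiment can be sketched as a small Monte Carlo study. The assumptions are loud ones: the daily returns are synthetic noise with a hypothetical 2% daily volatility, shifted so that buy and hold matches the stated CAGR exactly, and "random entries and exits" is modeled as an independent coin flip on each bar. Any timing rule that is independent of price has the same expected capture, so the coin flip is a convenience, not a model of real trades.

```python
import numpy as np

def random_timing_study(annual_cagr, rng, years=10, bars_per_year=252,
                        daily_vol=0.02, n_runs=2000):
    """Return (buy-and-hold log growth, array of log growth for each random-timing run)."""
    n_bars = years * bars_per_year
    noise = rng.normal(0.0, daily_vol, n_bars)
    # Shift the noise so the path's buy-and-hold CAGR equals annual_cagr exactly.
    log_ret = noise - noise.mean() + np.log1p(annual_cagr) / bars_per_year
    in_market = rng.random((n_runs, n_bars)) < 0.5      # held about half of the bars
    growth = (in_market * log_ret).sum(axis=1)
    return log_ret.sum(), growth

rng = np.random.default_rng(42)
years = 10
for label, cagr in (("+40% trend", 0.40), ("+3.7% trend", 0.037)):
    total, growth = random_timing_study(cagr, rng, years=years)
    typical = np.expm1(growth.mean() / years)
    low, high = np.expm1(np.percentile(growth, [5, 95]) / years)
    print(f"{label}: typical CAGR {typical:.1%}, 5th to 95th percentile {low:.1%} to {high:.1%}")
```

Under compounding, capturing half of the log growth of a +40% CAGR works out to sqrt(1.40) - 1, roughly +18.3%, which is in line with the "about +20%" rule of thumb. The percentile range shows the role of luck: the same spread of outcomes that is a minor perturbation on a large trend can swamp a small one.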
In general, the faster the signaling, the greater the return. One is able to mine more of the movements with more frequent trades. Still, it is not always true that faster signaling is better. When AAPL is viewed as a 3D response surface, it is clear that there are three optimal zones for fast, low-lag signaling systems. Or to put it a different way, there are two zones that appear to be heavily exercised where one might more wisely target the signal design for one of the three peak areas where the RRt is higher. If one lacks the resources to produce a fast low-lag signaler, the issue is less. Note how the tradescape surface shows more reward than pain at every point in the surface. That is quite rare.
Our initial EM algorithm has been designed to trade the orderly movements within price. Tradescapes plot the trading landscape for what is possible when extracting the order that is present in the time series. The EM algorithm is designed to filter out the chaotic movements, acknowledging that it is possible to trade the disorder in movements as well as the order. While we may yet find a way to produce generalized landscapes for countertrend and chaotic trading, nearly all trading systems that have any time horizon in terms of count of bars will trade order to some extent. And few would dispute that understanding what one can realize from trading order is a good benchmark.
What we found so very challenging was to build an EM modeling system that uses just one algorithm for every type of data and market, and which produces the highest backtest return and the highest accuracy for a given count of zero crossings in the first derivative of the signaler. We needed this to work for any entity, with only an information content or length parameter as an adjustment. We wanted the EM algorithm to be adaptive, incorporating the fat tail events that occur in financial time series, but not to overreact to them. In other words, we wanted the EM to not only be good, we wanted it to be a gold standard of sorts for mapping the tradable order in a time series. In our design, the issue was always accuracy as measured by real-world trading backtests, not by some more theoretical measure in the signal processing domain.
In effect, we wanted EM to be an absolute reference, our measure of what was possible for trading order at a given lag and trade density for any financial entity. We expected real world signals to produce less risk-adjusted return simply because the accuracy of catching the price turns would be less. We felt that the tradescape point for a given time horizon and lag should be exceeded only rarely, by those exceptional real-world signalers with a strong synergy of trading both order and chaos, and those few instances where random chance and fat tail events just happened to coincide for unexpected returns at one specific time horizon and lag.
While we hardly assert there is no better method available, we have found our EM procedure to serve as a sufficient gold standard across all of the international and US markets tested. We tested securities, ETFs, commodities, forex, and index futures. We probably had as many as several hundred candidate algorithms and it took us three years to settle on the procedure that generates our tradescapes.