THE APPLICATION OF NEURAL NETWORKS TO U.S. EQUITY PRICING AND CLASSIFICATION
by S. Kris Kaufman, President; Douglas Frick, Vice President; and Alex Kaufman, Research Associate
Parallax Financial Research, Inc.
2004
White Paper for Parallax Financial Research, Inc. products:
Price Wizard™, Dynamic Price Wizard™, ValueNet™, Beta Predictor™, Wizard Explorer™
Abstract

Neural network modeling technology has been applied to the problem of determining the fair market price for a company based on all significant stores of value as reported through required SEC filings and current economic conditions. No attempt was made to use analyst estimates or other future-looking assumptions. Significant biases due to trend persistence, price, economic conditions, and company size were reduced using proprietary splitting and sampling techniques. Sector and industry biases were eliminated by producing one network for each of 16 economic sectors, each of which included binary flags to signify industry membership. Data used for this work came from Zacks databases of US corporate data between 1990 and 2001. After a preparation step which separated the data samples into training and test sets, and applied some outlier elimination filters, the NGO product by BioComp was used to train the networks. Training for most sectors resulted in R-squared numbers above 0.8 as follows:

Sector                        R²     Samples
Consumer Staples              .88    16,413
Consumer Discretionary        .79    12,943
Retail/Wholesale              .75    21,511
Medical                       .77    14,321
Auto/Tires/Trucks             .90     3,891
Basic Materials               .86    13,893
Industrial Products           .87    16,935
Construction                  .89     8,619
Multi-Sector Conglomerates    .93     4,795
Computer And Technology       .63    27,991
Aerospace                     .88     3,387
Oils/Energy                   .82    11,035
Finance                       .91    32,000
Utilities                     .96    15,567
Transportation                .88     6,451
Business Services             .75     4,065
The final neural models return an estimated price. There are numerous applications for these price estimates such as pricing IPOs, portfolio selection, macro modeling of sectors and industries, stock market index pricing, asset allocation, and construction of enhanced indices. The asset allocation application required that an additional network be built to classify "value" and "growth" stocks. Another stock screening application required that the recent rates of change of valuation and price be combined with our estimated price in a dynamic neural network model. Beta is an often used management variable, and we wished to estimate beta to the S&P 500 as well as beta to an industry average. A beta predictor neural net was constructed for this purpose. Test results show that our methodologies significantly improve the stock screening process at all levels.
TABLE OF CONTENTS
List of Figures
List of Tables
Acknowledgements
Glossary
Introduction
Chapter I: Building an Equity Pricing Model (Price Wizard™)
    The Search for Value
    Data Preparation
    Neural Network Background
    Neural Net Training Results
    Value/Growth Classification Network (ValueNet™)
    Predicting Beta to the S&P 500 and Industry Indices (Beta Predictor™)
    Adding Rate-of-Change to Price Wizard (F% or Dynamic Price Wizard™)
Chapter II: Model Delivery & Visualization
    Model Processing & Delivery
    Interactive Price Calculator (Wizard Explorer™)
    TradeStation™ Historical Graphing
    TradeStation™ Aggregate Portfolio Graphing
    EXCEL Database and Summary Spreadsheets
Chapter III: Historical Testing Methods
    Survivorship and In-Sample Considerations
Chapter IV: Applications
    Stock Selection Screening
    CSFB Holt versus Parallax Equity Pricing Model
    Aggregate Valuation Statistics
    Aggregate Portfolio Valuation
    Sector and Industry Valuation
    Index/ETF Valuation
    Enhanced Index Applications
    Long/Short Applications
    Asset Allocation Applications
Chapter V: Client Performance Examples
    The Influence of Price Trend Persistence
    Equity Screening Example 1 (Marque Millennium)
    Equity Screening Example 2 (Corporate Consulting)
Glossary
Bibliography
Index
Appendix A: Parallax Financial Research
Appendix B: Zacks Historical Data
LIST OF FIGURES
1. Zacks Current Database
2. Three Layer Neural Network
3. Neural Network Activation Functions
4. Network Training for the Finance Sector
5. Predicted Price vs. Actual for the Test Data
6. Predicted Price vs. Actual for the Training Data
7. Consumer Staples Network Training Results
8. Utilities Network Training Results
9. Retail/Wholesale Network Training Results
10. Finance Network Training Results
11. Medical Network Training Results
12. Oil/Energy Network Training Results
13. Neural Response Surface
14. Training Statistics for Best Value/Growth Network
15. Neural Response Curve for Value/Growth Network
16. Price Calculator Data Retrieval
17. Price Calculator Data Editing and Recalculation
18. TradeStation Display of Estimated Prices
19. CSCO Estimated Prices shows Bubble Formation
20. June 1996 Relative Valuation Histogram
21. March 2001 Relative Valuation Histogram
22. Percentage of Undervalued Stocks in the Banking Sector
23. Percentage of Undervalued Stocks in the Retail Sector
24. Percentage of Undervalued Stocks in the Oil/Energy Sector
25. Percentage of Undervalued Stocks in the Nasdaq 100
26. Percentage of Undervalued Stocks by Sector
27. Percentage of Undervalued Stocks by Market Cap Band
28. Percentage of Undervalued Stocks by Industry
29. Percentage of Undervalued Stocks by Asset Class
LIST OF TABLES
1. List of Fundamental Factors used to Estimate Price
2. Top Price Factors for Consumer Discretionary Stocks
3. Neural Network Training Results
4. TradeStation Inputs for Indicator PFR_Wizard
5. Monthly EXCEL Spreadsheet of Estimated Equity Prices
6. Monthly EXCEL Spreadsheet with the Statistical Valuation Summary
ACKNOWLEDGEMENTS
Work on this product began in 1990 and has gone through three major revisions over the intervening 14 years. The author wishes to express sincere appreciation to Edward Raha, Doug Frick, Phillip Zachary, Gary Gould, John Rickmeier, Bill Meckel, Ken Garvey, Diana Kaufman, and Alex Kaufman for their support and assistance in the development of these products and related applications. This work was partially supported by contracts with Bankers Trust, Daiwa Securities, Managed Quantitative Advisors, and Marque Millennium Capital Management.
GLOSSARY
Neural Network. A mathematical modeling technique which has the capacity to learn by example. It is also referred to as a "non-parametric" modeling technique.

ETF. Exchange traded funds are portfolios of stocks that have similar characteristics and are available for trading as a single entity.

IPO. Initial public offering of a stock.

EXCEL. Microsoft Office spreadsheet software program.

TradeStation. Stock charting and back testing software program from TradeStation Securities.

Sector. One of 16 divisions of the US economy by corporate characteristics.

SEC. The Securities and Exchange Commission.

NGO. NeuroGenetic Optimizer software produced by Dr. Carl Cook of BioComp Systems (www.biocompsystems.com).

Zacks. Zacks Investment Research is a Chicago-based firm with over 24 years of experience in providing institutional and individual investors with the analytical tools and financial information necessary to the success of their investment process (www.zacks.com).

Russell 3000. The Russell 3000® Index offers investors access to the broad U.S. equity universe, representing approximately 98% of the U.S. market. The Russell 3000 is constructed to provide a comprehensive, unbiased, and stable barometer of the broad market and is completely reconstituted annually to ensure new and growing equities are reflected (www.russell.com/US/Indexes/US/3000.asp).

I/B/E/S. Institutional Brokers Estimate System. A system that gathers and compiles the different estimates made by stock analysts on the future earnings for the majority of U.S. publicly traded companies (www.firstcall.com).
INTRODUCTION
“Intrinsic value is the investment concept on which our views of security analysis are founded. Without some defined standards of value for judging whether securities are over- or under-priced in the marketplace, the analyst is a potential victim of the tides of pessimism and euphoria which sweep the security markets.” ...“Intrinsic value is therefore dynamic in that it is a moving target which can be expected to move forward but in a much less volatile manner than typical cyclical or other gyrations of market price. Thus, if intrinsic value is accurately estimated, price will fluctuate about it.” Graham and Dodd, Security Analysis (pg. 41 and 43, Fifth Ed.)
The price of a stock is an opinion, or more correctly, the sum total of thousands of opinions which are expressed through buy and sell decisions every day in the marketplace. These opinions are based on published corporate and economic fundamentals, as well as imprecise factors such as future estimates, recent trends, news, and lawsuits.
They are also subject to irrational investment behavior, rumors, individual biases, and to some extent random fluctuations. Since price is dependent on so many factors, it is natural to try and separate out the effect of the most intrinsic components first. In fact, our goal was to find a mathematical way of pricing a stock that may not have been priced by the marketplace. As they say though, if the problem were that easy, it would have been solved already. Even the intrinsic components of a stock price depend on many interrelated factors, and just as identical real estate varies in price by location, the same corporate earnings numbers will result in different valuations within different sectors or in different economic conditions. Our model had to be based on known fundamentals and economic conditions, compensate (normalize) for
sector and industry, and properly weight the many stores of value that give a company an intrinsic worth. Mathematical models are normally built by making a priori assumptions about the functional form of the solution. These are called parametric models, and are solved by regression methods to determine a number of coefficients. This is fine if you know that the solution must be a straight line or some other simple, well-known function. But in the real world, relationships are not necessarily simple, and we don't always know the form of the solution. Inputs and outputs could even be related in a non-linear fashion. If you don't have to guess the functional form of the answer, you have a big advantage. We chose to use neural network modeling for this reason. A neural network is a mathematical modeling tool which has the capacity to learn by example.
This is an extraordinarily useful ability, especially in financial
modeling, where the inputs & outputs are usually well documented, and there are countless examples. Networks are “trained” by being presented with thousands of facts, each fact consisting of inputs and corresponding outputs. Through a unique feedback process, the network learns how those inputs are related to the outputs, and develops a general model to describe the relationship. In our case, fundamentals were fed in, and the corresponding stock price was used as the output. Again, if this problem were so easy…. There were significant sources of bias and data error which had to be dealt with, and some of our techniques are so innovative that they must remain proprietary. Data preparation steps were critical to the success of the model, and will be discussed in as much depth as possible.
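To make the parametric versus non-parametric distinction concrete, here is a small illustrative sketch (in Python with scikit-learn, which is not the toolchain used in this study); the toy target function and model settings are our own assumptions. A straight-line fit is forced into an assumed functional form, while a small neural network learns the shape from the examples alone.

```python
# Illustrative comparison of a parametric (linear) model and a non-parametric
# neural network on a deliberately non-linear toy problem. The toy function,
# scikit-learn models, and settings are assumptions made for this sketch.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=(2000, 1))
y = np.sin(x[:, 0]) * x[:, 0] + rng.normal(0, 0.2, size=2000)  # non-linear target

linear = LinearRegression().fit(x, y)              # assumes y = a*x + b in advance
net = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=3000,
                   random_state=0).fit(x, y)       # learns the shape from examples

print("linear  R^2:", round(linear.score(x, y), 3))   # poor fit: wrong assumed form
print("network R^2:", round(net.score(x, y), 3))      # much closer fit
```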
Once trained, the model knows how to form a price opinion that should be close to a reasonable consensus opinion based solely on economic and fundamental report data. We expect that when the model price is lower than the actual price, then there is probably a degree of optimism that things will be far better in the future. On the flip side, if actual price is lower than the model, then there may be significant pessimism. Whatever the cause of the discrepancy, it is the analyst’s job to try and understand why it exists. A successful model will have many applications depending on the kind of client being served. Our prices could be used to evaluate portfolios of all sorts, from individual to professional funds. Model prices could even be used to assess the economy based on broad collections of stocks. For clients that need investment products, our model would be useful at all levels. The prices could be used to rank stocks from under-valued purchase candidates to over-valued sell or short sale candidates.
Chapter 1
BUILDING AN EQUITY PRICING MODEL
The Search for Value
The search for corporate value is complex. Do earnings alone determine a fair price? How about cash flow, book value, sales, or dividends? How important are interest rates, debt, or the industry in which the company operates? These are all good questions, and the answer is that they all matter. Judging how much to weight each factor, and how the factors change relative to each other, is hopelessly complex. Add to that the challenge of remaining unbiased by the times in which we live. It was rare, for example, to find an analyst who was willing to say that tech stocks were overvalued in 1999. Instead, many chose to adjust their models to reflect an irrational reality.
So Parallax has chosen
mathematical modeling for its ability to provide consistent results to an unmanageably complex problem. Our goal was to build a computer model with no assumptions about the relationship between price and fundamental factors. We wanted the model to learn from example and be able to generalize what it had learned to any other set of corporate data.
More specifically, our equity-pricing model used neural
network technology to learn how a stock's fundamental balance sheet data had been translated to a market price in the past, within the context of its industry group and economic sector. Then, going forward in time, the trained model simultaneously evaluates all these factors as they change month to month, in order to produce an estimated stock price.
To accomplish this ambitious goal we needed to include the factors that have been cited by the financial community as key to stock valuation. This also included such economic numbers as PPI, CPI, GDP growth and interest rates. The data we chose for this project came from Zacks corporation's DBDQ, DBCM, and ECON databases, and are listed below along with a screenshot of the DBCM database reference guide from which all of the values and their explanation were gleaned:
Figure 1 – Database Items reference page for the Zacks DBCM current database. The table below was compiled from this document.
Inputs

stale – This is an integer value representing the number of months old the data is.
roi – Return on invested capital. Calc: (Income before extras & discontinued operations) / (Long-term debt, convertible debt + non-current capital leases + mortgages + book value preferred stock + common equity).
sales_q – Sales quarterly per share.
inc_bnri12 – Twelve month income before non-recurring items per share.
book/share – Book value of common equity per share.
trend_bvsg – Trend book value per share growth rate calculated from beta of an exponential regression on the last 20 quarters of fiscal book value per share.
cash/share – Cash flow per common share. Calc: CASH FLOW / SHARES OUT.
op_margin – (Income before extraordinary items and discontinued operations for the previous 12 month period / Sales for the previous 12 month period) x 100.
net_margin – (Net income for the previous 12 month period / Sales for the previous 12 month period) x 100.
pretax_mrg – Trailing four quarter pretax profit margin. Calc: (PRETAX INC / SALES) x 100.
curr_ratio – Current ratio: Current Assets / Current Liabilities.
payout_rat – Dividend payout ratio. Calc: INDICATED ANNUAL DIVIDEND / ACT EPS 12.
inventory – Inventory value per share.
tot_c_asst – Total current assets per share.
tot_c_liab – Total current liabilities per share.
beta – Stock return volatility relative to the S&P 500 over the last 60 months (includes dividends).
roe_12m – Return on equity; 12-month EPS / Book Value per Share.
roa_12m – Return on assets; 12-month EPS / Total Assets per Share.
div_yield% – Dividend yield, based on IND AN DIV and PRICE.
ttl_lt_dbt – Total long-term debt quarterly; debt due more than one year from balance sheet date; long-term debt, convertible debt, non-current capital leases and mortgages.
act_eps_q – Diluted quarterly actual EPS before non-recurring items.
act_eps_12 – Diluted 12-month actual earnings per share before non-recurring items.
sales_12mo – Sales during the last 12 months.
12msls/-4q – Last 12 months of sales divided by the last 12 months of sales 4 quarters ago.
12meps/-4q – Last 12 months of earnings divided by the last 12 months of earnings 4 quarters ago.
quick_rati – Quick ratio, most recent: (Current Assets - Inventory) / Current Liabilities.
gdp_growth – Real Gross Domestic Product divided by GDP one quarter ago.
Ppi_fin_gds – Producer Price Index, Finished Goods, Seasonally Adjusted Index.
Cpi – Consumer Price Index, All Urban Consumers, Not Seasonally Adjusted.
TBond Yield – Yield on long-term Treasury bonds.
TBill Discount – Discount rate on new issues of 91-day Treasury bills.
Prime_Rate – Average prime rate charged by banks.
Industry – Zacks industry classification.
Table 1. List of all fundamental factors used to estimate stock price via neural network.
We selected this list for a few reasons besides the obvious connection to valuation. First, these items had the highest historical availability for US stocks in the Zacks databases, and the more samples the better. Some of the items are primary and some are expressed as formulas. There are certainly items that are dependent on each other, and even a few that are redundant. We found that neural net training is less likely to become biased if the inputs are chosen in this manner. At the end of training we can look at what the neural net learned about the relative importance of factors in general, and a bit about how price varies with each factor. It is essential that what the network learns also makes sense. Price should be positively correlated with earnings, cash flow, and book value for instance.
Choosing a good set of inputs is important, but data quality and
preparation is critical. In the next section we will discuss how training and testing data were prepared.

Data Preparation
The most significant source of bias is due to sector and industry norms. Different types of companies show their worth in different ways. For example, we can't hope to evaluate utility and tech stocks with the same model, since their business structures are just too different. For this reason we split the data up by Zacks economic sector. Twelve years of monthly data samples for 2,547 stocks were physically divided into sixteen EXCEL spreadsheet files by sector (223,000 facts in all).
The Zacks sectors are as follows: Consumer Staples, Consumer Discretionary, Retail/Wholesale, Medical, Auto/Tires/Trucks, Basic Materials, Industrial Products, Construction, Multi-Sector Conglomerates, Computer and Technology, Aerospace, Oils/Energy, Finance, Utilities, Transportation, and Business Services.
Then, within each sector file, columns were inserted and filled with a one or zero binary flag to signify industry membership. In the end, the full impact of sector and industry membership was preserved and this source of bias removed. The following is the complete list of all industry classifications placed within their respective sectors:
The trickiest bias to remove was price. Stocks priced less than $1 are simply not valued in quite the same way as those at $100, and what about Berkshire Hathaway, which is priced in the tens of thousands of dollars? The proprietary trick we developed to eliminate price bias also had the side benefit of stopping economic bias. Data errors and extreme outliers also had to be eliminated. We used a multiple pass Gaussian filter for this step. Lastly, cycles of optimism and pessimism are also a source of bias. We were careful to balance price trend persistence over the twelve year period in order to deal with this problem.
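The sketch below illustrates the flavor of these preparation steps under stated assumptions: hypothetical column names, pandas rather than EXCEL, and a reading of the "multiple pass Gaussian filter" as iterative three-sigma clipping. The proprietary price, economic, and trend-persistence de-biasing steps are deliberately omitted.

```python
# Hedged sketch of the data-preparation steps described above. Column names,
# the 3-sigma threshold, and the number of passes are assumptions; Parallax's
# proprietary price/trend de-biasing is not reproduced here.
import pandas as pd

def prepare_sector_file(df: pd.DataFrame, sector: str) -> pd.DataFrame:
    # Keep one Zacks economic sector per training file.
    sec = df[df["sector"] == sector].copy()

    # Insert one binary (0/1) column per industry to flag industry membership.
    flags = pd.get_dummies(sec["industry"], prefix="ind").astype(int)
    sec = pd.concat([sec, flags], axis=1)

    # Multiple-pass outlier filter, read here as iterative sigma-clipping on price.
    for _ in range(3):
        mu, sigma = sec["price"].mean(), sec["price"].std()
        sec = sec[(sec["price"] - mu).abs() <= 3 * sigma]
    return sec
```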
Neural Network Background
The NeuroGenetic Optimizer (NGO) software by BioComp Systems facilitates the creation of artificial neural networks by employing genetic algorithms to modify neural net structure. The term neural network could refer to the real-life networking of neurons in various biological organisms, but it is used here (and in most cases) to reference the concept of an artificial neural network, which is a model created inside the computer that mimics the most fundamental functions of a biological neural network. In real-world neural networks, it is believed that "learning" takes place when the strength of an axon connection is increased by virtue of its usage. For instance, when a small child is learning how to define a tree, he could be shown a pine tree, and the connections supporting a tree as a tall, green, spiky thing with brown bark are reinforced and strengthened. As the concept of "tree" evolves, green will be given more weight, and spiky will be given less weight, because, while most trees he is shown are still green, they do not all have spikes (i.e., a maple tree). This is a real-world example of a neural network: in order for the boy to accurately predict what is and is not considered a "tree", he must have seen enough examples of different kinds of trees.
This way, he can determine the
connection between the various characteristics of the tree (independent variables) and whether or not it is a tree (classification output). This real-world example might be solved with a "classification" neural network. For the young boy's learning, he had to classify objects as 'tree' or 'not tree' by looking at the variables that are important.
He knew which variables were
important by experience and training with many different sorts of trees. If he had only seen pine trees all his life, and classified them as 'tree', then he would not be
able to accept that a palm tree is also a 'tree'. For this project, the computer was simply asked to learn how corporate fundamentals contribute to stock price, which is a "function approximation" problem. The NeuroGenetic Optimizer allows the user to train a number of different application types: function approximation, diagnosis, clustering, time series prediction, and classification. Classification neural networks look at examples with multiple outputs or categories, and correlate that to the input data provided. Once a classification net has been properly trained, it will be able to look at a set of data and determine what category it falls into. Neural networks are powerful tools, but nobody has placed biological neurons inside the computer, and they are certainly not comparable to human consciousness, so how exactly is this machine learning achieved? In order to understand neural networks it is important to understand the 'hidden layer' of nodes in between the input and output layers. Input nodes could be compared to the neurons in the eyes, and output nodes could be the neurons leading to the vocal cords where an answer or result is spoken aloud. The hidden layer simulates the set of neurons in between, where both learning and problem solving occur. All the input nodes send information to all the hidden nodes, and all the hidden nodes send data to all the output nodes. Here is a simplified diagram of just such a basic three-layer neural network:
Figure 2. Picture provided from Louis Francis' article "The Basics of Neural Networks Demystified" in Contingencies magazine (www.contingencies.org/novdec01/workshop.pdf).
In the diagram, the term "feedforward" simply references the direction of data flow from input layer to hidden layer to output layer. A quote from the magazine article from which this diagram was taken goes a long way to summarize the "learning" that goes on in the hidden layer: Neural networks "learn" by adjusting the strength of the signal coming from nodes in the previous layer connecting to it. As the neural network better learns how to predict the target value from the input pattern, each of the connections between the input neurons and the hidden or intermediate neurons and between the intermediate neurons and the output neurons increases or decreases in strength… A function called a threshold or activation function modifies the signal coming into the hidden layer nodes… Currently, activation functions are typically sigmoid in shape and can take on any value between 0 and 1 or between -1 and 1, depending on the particular function chosen. The modified signal is then output to the output layer nodes, which also apply activation functions. Thus, the information about the pattern being learned is encoded in the signals carried to and from the nodes. These signals map a relationship between the input nodes (the data) and the output nodes (the dependent variable(s)). (3)

Just as with our analogy of the young boy and the trees, the neural network takes each independent variable (input) and weights them according to their importance in discerning the dependent variable(s) (output).
The activation
functions that are triggered in each hidden node of the neural network record the
signal strength and thereby encode the patterns correlating input and output data. In summary, this program “evolves” to a point where it can continually solve similar problems over and over again. These tools are extraordinarily useful, because they can “learn” about how to solve a problem that may be non-linear, such as stock market price estimation. The trained network is the answer to the problem. If one were to distill the neural network down to a corresponding equation it probably would be pages in length and be too confusing to comprehend.
The weights and activation
functions of the hidden layer can be viewed by the user, since they are written into the file, but only when each of the weights and interconnections are taken together in their entirety can the pattern be discerned by the neural network. Understanding how and why the chosen variables, the activation functions, and the connection weights interact to produce the correct answer is next to impossible for a human being to discern by simply looking at the net. Neural networks are often referred to as "black box" modeling, because the correlation that's determined is not easily discernible, but the trained neural network will allow the user to look at "response curves." These response curves show a graphed correlation between data variables used for input, and allow the user to judge whether or not the neural network has learned about the independent input variables adequately. There will be an elaboration on response curves in the results section. The last bit of information that is important to know concerns the activation functions. The NGO utilizes three different types of activation functions, which are the functions in the hidden layer that specify the relationship between inputs and outputs. The different types of functions used by the NGO are as follows:
1. "Lo" – Logistic Sigmoid
2. “T” – Hyperbolic Tangent
3. “Li” – Linear
Figure 3. Three types of neural network activation functions used by NGO.
These functions adjust during training so that the neural net as a whole can simulate the learning process. Depending on the problem being solved, different combinations of these activation functions will be employed, and the sum of the dynamics of these functions will produce the trained network.
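As a minimal, self-contained illustration of these ideas (not the NGO implementation itself), the sketch below defines the three activation types and pushes a single fact through a tiny three-layer network; the layer sizes are arbitrary and the weights are random rather than trained.

```python
# Minimal three-layer feedforward pass using the three NGO-style activations.
# Layer sizes are arbitrary and weights are random; training would set them.
import numpy as np

def logistic(z):              # "Lo" - logistic sigmoid, output in (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def hyperbolic_tangent(z):    # "T" - hyperbolic tangent, output in (-1, 1)
    return np.tanh(z)

def linear(z):                # "Li" - linear (identity)
    return z

rng = np.random.default_rng(0)
n_inputs, n_hidden, n_outputs = 36, 8, 1            # e.g. 36 fundamentals -> 1 price
W1, b1 = rng.normal(size=(n_inputs, n_hidden)), np.zeros(n_hidden)
W2, b2 = rng.normal(size=(n_hidden, n_outputs)), np.zeros(n_outputs)

def forward(fact):
    hidden = hyperbolic_tangent(fact @ W1 + b1)     # hidden layer applies an activation
    return linear(hidden @ W2 + b2)                 # output layer returns the estimate

print(forward(rng.normal(size=n_inputs)))
```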
Neural Net Training Results
Recall from the data preparation discussion that we prepared training and test data containing thousands of facts for all sixteen sectors being considered. Each fact contained a monthly snapshot of our fundamental inputs and the corresponding month-end price as output. The facts were run through NGO directly from EXCEL on a modest 1GHz Pentium IV PC with 512M RAM. It typically took several hours to complete the training and evaluation of each sector net.

A successfully trained network will understand how to interpret new information. In all neural net training runs, some data is held back for testing and additional validation. If the net has not learned anything, and merely memorized its training data, then it will score well when presented with the training data, but poorly when presented with new fresh data. A successfully trained net performs virtually the same on both sets. The network was presented with test facts to gauge its progress after every training loop.

The error criterion chosen for training was r-squared. Low r-squared numbers close to zero signify a poor training result, while high r-squared numbers close to one signify good results. Poor results are also confirmed when the training set has a high r-squared and the test set a low one. This signifies memorization occurred instead of true learning. We were hoping for r-squared results that were consistent and in excess of 0.8. An r-squared of 1.0 means that the predicted stock prices exactly matched the actual prices. We know this is not possible simply because of random noise and other effects that we did not model.
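The r-squared comparison itself is straightforward to reproduce. The sketch below uses synthetic stand-in arrays (not study data) to show the check: similar scores on the training and test sets suggest learning, while a large gap suggests memorization.

```python
# Illustrative r-squared check on synthetic stand-in data (not study data).
import numpy as np

def r_squared(actual, predicted):
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(1)
actual_train = rng.uniform(5, 100, 5000)             # stand-in month-end prices
pred_train = actual_train + rng.normal(0, 8, 5000)   # stand-in network estimates
actual_test = rng.uniform(5, 100, 1000)
pred_test = actual_test + rng.normal(0, 8, 1000)

print(f"train R^2 = {r_squared(actual_train, pred_train):.2f}")
print(f"test  R^2 = {r_squared(actual_test, pred_test):.2f}")
```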
The picture shown below is an actual screenshot of the training run for the Finance sector network. Notice the training and test r-squared results were almost identical at 0.88.
Figure 4. This is a screenshot taken during training of the finance sector network. Notice how similar the r-squared figure is between the test and training sets.
Figure 5. Predicted price for the test set fits the actual results very well. This indicates the network has learned how to estimate price.
Figures 5 and 6 are actual screenshots of “Predicted vs. Actual” prices from the Finance sector training run. Notice that the quality of fit on the training set is almost identical to the fit on the test set. This indicates that our network is learning instead of memorizing.
Figure 6. Predicted price fits the actual training prices, which indicates that the fundamental factors relate strongly to market price.
Figure 7. This picture shows the predicted versus actual prices we found from the Consumer Staples network training (r-squared = 0.8866).
Figure 8. This figure shows the network training results for the Utilities sector (r-squared=0.9594).
Figure 9. Network training results for the Retail/Wholesale sector (r-squared=0.7529)
Figure 10. Network training results for the Finance sector (r-squared=0.9077)
Figure 11. Network training results for the Medical sector (r-squared=0.7657)
Figure 12. Network training results for the Oils/Energy sector
A response curve, or "neural response surface", provides information about how two or more variables relate to one another in the prediction of a particular output. These response curves are generated by isolating one or two of the inputs and changing them over a particular range of values in order to see what relationship the net perceives between the variables and the output. All other inputs are left at average values. At this point a graph is produced so the relationship can be visualized. This tool can be used to show whether or not the connections that the neural network has made between input and output variables make sense.
Figure 13. Does the result make sense? A neural response surface shows the relationship between some inputs and an output.
One such relationship that can be easily verified with response curves is that price should increase as book value and earnings rise. The figure above shows that the network has learned this lesson from all the examples it was fed.
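A response surface of this kind can be produced by sweeping one or two inputs over a grid while pinning every other input at its average value. The sketch below assumes a generic trained regression model with a scikit-learn-style `predict` method and hypothetical column names; it illustrates the procedure rather than the NGO's internal routine.

```python
# Sketch of response-surface generation: sweep two inputs, hold the rest at
# their averages, and record the model's output. `model` is any trained
# regressor with a predict() method; column names are hypothetical.
import numpy as np

def response_surface(model, train_df, x_col, y_col, steps=25):
    base = train_df.mean(numeric_only=True)          # all other inputs at average
    xs = np.linspace(train_df[x_col].min(), train_df[x_col].max(), steps)
    ys = np.linspace(train_df[y_col].min(), train_df[y_col].max(), steps)
    surface = np.empty((steps, steps))
    for i, xv in enumerate(xs):
        for j, yv in enumerate(ys):
            point = base.copy()
            point[x_col], point[y_col] = xv, yv
            surface[i, j] = model.predict(point.to_frame().T)[0]
    return xs, ys, surface                           # e.g. plot as a 3-D surface
```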
Another way to analyze the neural net training results is by a sensitivity analysis. This test determines which of the inputs have the most influence on the output, and if they are positively or negatively correlated. Here are the top factors for the Consumer Discretionary sector network:

Causal Variable (Input)   Direction   Pct. Effect   Cum. Effect
act_eps_12                Positive    9.07%         9.07%
inc_bnri12                Positive    6.98%         16.05%
book/share                Positive    5.35%         21.40%
payout_ratio              Positive    4.41%         25.80%
Table 2. Top factors affecting price for Consumer Discretionary sector stocks
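A rough, one-at-a-time approximation of such a sensitivity table can be built by nudging each input away from its average and measuring the change in the model's output, as in the sketch below. The placeholder `model`, the 10% bump size, and the column handling are our assumptions; NGO's exact sensitivity procedure may differ.

```python
# Hedged sketch of a one-at-a-time sensitivity analysis: perturb each input by
# +10% from its mean and record the direction and relative size of the change.
import pandas as pd

def sensitivity_table(model, train_df, input_cols, bump=0.10):
    base = train_df[input_cols].mean().to_frame().T           # average fact
    base_price = float(model.predict(base)[0])
    effects = {}
    for col in input_cols:
        bumped = base.copy()
        bumped[col] *= (1.0 + bump)
        effects[col] = float(model.predict(bumped)[0]) - base_price
    total = sum(abs(v) for v in effects.values()) or 1.0      # avoid divide-by-zero
    table = pd.DataFrame({
        "Direction": ["Positive" if v >= 0 else "Negative" for v in effects.values()],
        "Pct.Effect": [100 * abs(v) / total for v in effects.values()],
    }, index=list(effects.keys()))
    return table.sort_values("Pct.Effect", ascending=False)
```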
If a net didn’t train well, it could mean that a few bad facts made the process difficult. It could also mean that the problem has no solution. Our results were excellent in terms of high r-squared numbers, the equivalence of training and test r-squared, and most importantly, the response curves made economic sense. Price did in fact correlate positively with earnings, cash flow, book value, etc.
Sector                       R²     Samples   Trendline
Consumer Staples             .88     16,413   A = 0.97P + 6.70
Consumer Discretionary       .79     12,943   A = 0.98P - 0.68
Retail/Wholesale             .75     21,511   A = 0.92P + 4.16
Medical                      .77     14,321   A = 0.97P + 0.39
Auto/Tires/Trucks            .90      3,891   A = 0.99P + 0.95
Basic Materials              .86     13,893   A = 0.98P + 2.34
Industrial Products          .87     16,935   A = 1.00P - 1.09
Construction                 .89      8,619   A = 0.98P + 1.70
Multi-Sector Conglomerates   .93      4,795   A = 0.99P + 1.08
Computer And Technology      .63     27,991   A = 0.97P + 0.11
Aerospace                    .88      3,387   A = 1.00P + 0.70
Oils/Energy                  .82     11,035   A = 0.96P + 9.47
Finance                      .91     32,000   A = 0.99P - 0.22
Utilities                    .96     15,567   A = 0.99P + 0.17
Transportation               .88      6,451   A = 0.96P + 2.66
Business Services            .75      4,065   A = 0.92P + 4.66
Table 3. The table shows the sector r-squared, number of samples, and best regression trendline found during neural net modeling
The table above contains the r-squared results for all sample facts used during training and testing. It also includes the formula for a best fit regression trendline through the results graph.
If the fundamental model could completely explain
price, the trendline formula would simply be A (actual price) = P (predicted price). Our results are very close to this. Slopes were slightly less than 1.0 and most intercepts slightly more than 0, which could be interpreted as a slight tendency for predicted price to be less than actual price at the lower price end. This effect is canceled by the method we use for constructing a final estimate, since we sample the model numerous times across the whole price spectrum. The sector networks were successfully trained and are now ready to accept fresh corporate data and produce price estimates.
No further training is needed,
although we expect to retrain every 5 years in order to benefit from the additional data. In later chapters we will explore the monthly processing, distribution, visualization, investment uses, and historical test results for this pricing tool we call Price Wizard™.
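For reference, the trendlines reported in Table 3 are the kind of result a single least-squares regression of actual price on predicted price produces. The sketch below repeats that check on synthetic stand-in data; the numbers it generates are illustrative and are not the study's results.

```python
# Illustrative trendline check on synthetic stand-in data: regress actual price
# on predicted price; a perfect model would give A = 1.00P + 0.00.
import numpy as np

rng = np.random.default_rng(2)
predicted = rng.uniform(5, 100, 10_000)                       # stand-in model prices
actual = 0.98 * predicted + 1.5 + rng.normal(0, 6, 10_000)    # stand-in market prices

slope, intercept = np.polyfit(predicted, actual, deg=1)
print(f"A = {slope:.2f}P {intercept:+.2f}")
```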
Value/Growth Classification Network
Before examining the test results for our equity pricing model, we found the need for one more piece of information. The world of equity management is subject to numerous product divisions based on broad classifications of stocks as "value" or "growth". We needed to build a neural network that would be able to look at a company's fundamental data and determine whether it should be classified as a value stock or a growth stock, or maybe somewhere in between. Value stocks are thought to be safer, growing slowly, but having solid intrinsic value and often paying dividends to compensate investors for the slower growth. Growth stocks often do not pay dividends, may have little intrinsic value, but have been growing very fast. It is useful to have a consistent method of classifying stocks as value or growth since it enables us to build asset allocation tools and better evaluate client portfolios. Training data was taken from the Russell 3000 value and growth indices and the fundamental data from Zacks. Data preparation was the same as for the price network except that all sixteen sector training files were combined into one, with Industry membership replaced by Sector membership.
The Russell method
includes I/B/E/S earnings estimates in their formula, which Parallax found to be misleading in past studies. We expect that by leaving expectations out of our training, the classification system produced may even outperform the Russell technique.
Figure 14 – Training statistics for the best Value/Growth neural network found
The best network used 16 of the possible 42 inputs for value vs. growth prediction as follows:

1. book value of common equity per share
2. trend book value per share growth rate
3. cash flow per common share
4. operating margin
5. pretax margin
6. inventory value
7. total current assets per share
8. total current liabilities per share
9. beta
10. return on assets
11. dividend yield
12. 12-month sales divided by 12-month sales 4 quarters back
13. presence in Sector 1 (Consumer Staples)
14. presence in Sector 3 (Retail/Wholesale)
15. presence in Sector 6 (Basic Materials)
16. presence in Sector 10 (Computers and Technology)
Results show that our model was able to match 92% of the Russell classifications without the IBES estimates, which is sufficient for our purposes. Response curves again help us verify that the model makes sense. One such relationship that can be easily verified with a response curve is between book value per share, dividend yield per share, and the value stock designation. Generally when a company has a high dividend yield and a high book value then it will be designated as a value stock. However, if the opposite is true, and both the dividend yield and book value are low, then it is more likely going to be listed as a growth stock. Here is a picture of the response surface that shows that the neural network learned to make that distinction.
Figure 15 – The response surface showing the relationship between dividend yield, book value per share, and the value stock designation.
A few other things can be inferred from the graph as well. For instance, when dividend yield is really high, an increase in the book value will have little effect on the stock being a value stock. If dividend yield is low however, book value has a large impact on the classification. In this way human beings can check the reasoning of the neural network and declare whether it is marred by confusion or has the same common sense perceptions about market variables as an analyst. The end product of this training is our ability to assign a rank, which we call Value%, to every stock we process. A Value% of 0 means the stock is a growth stock, while a rank of 100 means it is a pure value selection. A Value% in between 0 and 100 represents a stock with qualities of both classes.
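The paper does not spell out exactly how the network's raw output is mapped onto the 0 to 100 scale; one plausible convention, shown below purely as an assumption, is to clip a value-class score or probability to the unit interval and rescale it.

```python
# Hypothetical mapping from a classifier's value-class score to a 0-100 Value%
# rank. The scaling convention is an assumption, not necessarily Parallax's.
def value_percent(value_score: float) -> int:
    """value_score in [0, 1]: 0.0 means pure growth, 1.0 means pure value."""
    clipped = min(max(value_score, 0.0), 1.0)
    return int(round(100 * clipped))

print(value_percent(0.07))   # 7  - essentially a growth stock
print(value_percent(0.95))   # 95 - essentially a value stock
```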
Beta Predictor Network
Beta is a measure of a stock's price volatility relative to a chosen benchmark index. It is often calculated in relation to the S&P 500 Index in order to judge whether a security is more or less volatile than the market. A stock with a beta of 1.00 will tend to move higher and lower in tandem with the S&P 500. Securities with a beta greater than 1.00 tend to be more volatile than the S&P 500, and those with betas below 1.00 tend to be less volatile than the underlying index. Securities with betas of zero generally move independently of the overall market. Just as price is dependent on fundamental factors, the volatility of price also is dependent upon fundamental factors. It isn't hard to imagine that a company with high debt and poor earnings might be subject to bigger price swings than a company that is on a more solid financial footing. Our goal was to train a neural network to predict beta to the S&P 500 and beta to corresponding sector indices. Zacks provides us with a 60-month beta to the S&P 500 which includes dividends. We have added sector betas using Zacks sixteen sectors. As in the pricing network, we used all the data items in Table 1 except beta to train the sixteen sector nets. Response curves help us verify that the model makes sense. One such relationship that can be easily verified with a response curve is between current ratio and predicted beta. The current ratio is the ratio of current assets over current liabilities, which measures the company's liquidity. The lower the current ratio, the less liquid the company and the higher we might expect beta to become. Here is a picture of the response surface that shows that the neural network learned to make that distinction.
Figure 15 – The response surface showing the relationship between current ratio and beta to the S&P 500.
The next two pictures show the predicted versus actual betas for the Consumer Staples sector after training.
The end product of this training is our ability to assign a predicted beta to the S&P 500 as well as a predicted beta to each corresponding sector. The following table lists the r-squared results from training each sector and the number of samples used for training. Note that as expected it is easier to predict betas to the sector than to the S&P 500:
Sector                       S&P 500 5-Yr Beta R²   Sector 1-Yr Beta R²   Samples
Consumer Staples                    0.68                   0.83            8,645
Consumer Discretionary              0.71                   0.63            6,207
Retail/Wholesale                    0.59                   0.69           12,159
Medical                             0.56                   0.86            7,839
Auto/Tires/Trucks                   0.83                   0.98            2,037
Basic Materials                     0.50                   0.75            6,591
Industrial Products                 0.68                   0.76            7,593
Construction                        0.78                   0.80            4,535
Multi-Sector Conglomerates          0.79                   0.85            2,627
Computer And Technology             0.54                   0.73           15,170
Aerospace                           0.67                   0.81            2,001
Oils/Energy                         0.64                   0.81            6,249
Finance                             0.61                   0.72           18,319
Utilities                           0.50                   0.66            7,939
Transportation                      0.68                   0.78            3,521
Business Services                   0.38                   0.74            1,929
Chapter 2
EQUITY PRICING MODEL DELIVERY & VISUALIZATION
Model Processing and Delivery
On the third Friday following the last trading day of the month, Zacks sends a custom report file to Parallax that contains the most recent fundamentals for about 8,600 stocks. We chose Zacks because from our experience they have the longest historical databases with the highest quality of any corporate data vendor (see Appendix B). Despite this, only about 3,600 stocks (Aug 2004 count) have all the data required for pricing.
Some of the “dropout” is linked to new
companies, since we need company records to go back at least 5 years. It seems, though, that most of the dropout is due to incomplete SEC filings. We will come back to this topic later in the results section. The pricing process for each stock is actually quite involved. Numerous price estimates are retrieved from the trained networks by repeatedly splitting the per share data, presenting the artificial data to the net, retrieving the price estimate, and then un-splitting the estimate. If the model is internally consistent and the stock is relatively easy to value, then the price estimates will all be about the same and independent of split factor. In practice, non-linearities and biases that remain in the model can cause scattered or even unstable estimates. We have even noticed cases of "dual" stable solutions that are widely separated in price. Our technique identifies these cases. The benefit of this type of error analysis is that it samples different regions of the valuation model for a single stock.
A big error
also may signal that the stock's balance sheet is a bit atypical. Negative earnings, cash flow, ROE, etc. can cause low or unstable prices. We produce price estimates in this manner for all stocks with at least one stable solution. Monthly results are delivered to clients via three products:

1. A monthly database of estimated prices for each stock in an EXCEL spreadsheet format.
2. A summary report in EXCEL format containing statistics on various collections of stocks. For example, we list the percentage of large cap stocks that are undervalued, and by what average and median amount.
3. A database file for the TradeStation™ charting program, so that all price estimates can be displayed.
This same database also powers our interactive price calculator program called Wizard Explorer™. Clients are notified monthly by email that these products are available for download. A message such as the following is sent from our operations center in Hawaii:

The August Wizard database is now available in:
ftp://ftp.pfr.com/users/dfrick/private/wiz_mwx/wizdb200408.exe
This is a self-extracting Zip archive that will put the database files into their proper location.
The Wizard spreadsheet and summary report are in:
ftp://ftp.pfr.com/users/dfrick/private/wiz_mwx/wizrep200408.zip
Interactive Price Calculator
The interactive price calculator software, which we call Wizard Explorer™, is the simplest way for a client to explore our neural network pricing model. When the calculator appears, enter the stock symbol, month, and year that you wish to examine in the top three boxes and then press the “Get Data” button. All of the fundamental factors will be read from our database and displayed in the edit windows. The windows with a white background may be edited, including the pulldown list of industry names. If, for instance, you want to see what Aetna is worth as an insurance company instead of a health care company, just change the industry before pressing the “Re-Calculate” button at the bottom. This button invokes our neural net model and causes the estimated price to be calculated. The price above this estimate is the month-end price from the prior month, so if you selected August 2004 at the top, the month-end price date would be July 31, 2004. Sometimes a field displays “N/A”, which means that no data was available from Zacks for that field. Just type in an estimate for the missing data and then recalculate. If all the input data is present and the price estimate is listed as N/A, then you know that a stable solution could not be found.
Our database goes back to 1990, but the displayed historical data is split corrected to September 2001 only. We have placed the appropriate split factor in an edit box on the lower right side for splits that have occurred since then.
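As a simple illustration of how that factor is applied (the helper below is hypothetical and not part of the product), a 2:1 split since publication halves the pre-split per-share figures so they compare with current quotes:

```python
def split_adjust(per_share_value, split_factor):
    """Bring a per-share figure published before a split onto the post-split basis.

    A 2:1 split since publication means split_factor = 2.0.
    """
    return per_share_value / split_factor

# Example: an estimate of $40.00 published before a 2:1 split
# corresponds to split_adjust(40.00, 2.0) == 20.00 afterwards.
```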
Figure 16. Neural network price calculator for “what-if” exploration. Enter the stock symbol, month, and year at the top, then press the “Get Data” button to bring in the fundamental data and populate the fields below.
Figure 17. Edit any of the fields with a white background and recalculate the estimated price. Compare the actual month-end price with our estimated price.
TradeStation™ Historical Graphing
The best way for clients to view our Price Wizard pricing model results is within the TradeStation charting program from TradeStation Technologies. The graph below shows our estimated stock valuations in purple for IBM. The solid purple line is the published record, while the dashed line refers back to the company quarter from which the data was taken. Additional text describes the average valuation of the industry and sector. The graph shows only valuations from 1998-2000; the text describes IBM on Dec 27, 2002, at which time we estimated it was worth $143.
Figure 18. TradeStation display of Price Wizard historical stock price estimates from our neural net model. Notice how overvalued IBM became.
In TradeStation this feature is an indicator called PFR_Wizard and has the following inputs:

INPUT NAME   DESCRIPTION                                                        DEFAULT
WizText      Prints descriptive text if true                                    TRUE
CoRepLine    A dashed line is drawn if true that leads to the quarter-end       TRUE
             date upon which the valuation is made
DumpFile     Dump valuation data to an ASCII file if true (C:\IBM_Wiz.csv)      FALSE
Split        Wizard is published once a month, and price is split corrected     1
             to the end of the month prior to the published date. If your
             stock split since publication, then you must change this
             factor. A 2:1 split requires Split=2.0
DBS          Reserved for testing other databases                               "Wizard"

Table 4. TradeStation inputs for Parallax indicator PFR_Wizard.
The stock price bubble of the late nineties is best seen in the chart of Cisco Systems:
Figure 19. Model prices of Cisco clearly showed the bubble forming
TradeStation™ Aggregate Portfolio Graphing
All of the stocks we price each month are tagged with an industry, sector, market cap, and now a value/growth rank. We can calculate the percentage that they are away from the actual market price and plot histograms of relative valuation for many different collections of stocks. Statistics can be used on these distributions to characterize the collection. For instance, we may choose to look at the relative valuations of large cap finance stocks, or maybe small cap growth stocks. The figure below is a histogram of relative valuation from June of 1996. 55% of all the stocks we priced were under their estimated price at that time.
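The percent-undervalued statistic and the other summary measures discussed later can be computed along the following lines. The column names ('price', 'netprice', 'sector', 'market_cap') are hypothetical stand-ins for whatever fields the monthly database actually carries.

```python
import pandas as pd

def relative_valuation_stats(df):
    """Summarize relative valuation for a collection of stocks.

    Assumes columns 'price' (prior month-end market price) and
    'netprice' (our estimated price); positive values mean undervalued.
    """
    rel = (df["netprice"] - df["price"]) / df["price"] * 100.0
    return {
        "pct_undervalued": float((rel > 0).mean() * 100.0),
        "mean_rel_val": float(rel.mean()),
        "median_rel_val": float(rel.median()),
        "std_rel_val": float(rel.std()),
    }

# Example: large cap finance stocks only (filter columns are illustrative)
# finance_large = df[(df["sector"] == "Finance") & (df["market_cap"] > 10e9)]
# print(relative_valuation_stats(finance_large))
```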
Figure 20. June 1996 relative valuation histogram shows 55% of all stocks are undervalued.
By March 2001, however, only 25% of stocks were undervalued. What came next was an enormous stock market drop.
Figure 21. March 2001 relative valuation histogram shows 25% of all stocks are undervalued.
The percentage of undervalued stocks is one useful statistic; the average of the relative valuation histogram is another, and the median and standard deviation may also be important. We will come back to the question of which statistics work best in the testing chapter. As far as visualization is concerned, we have built several tools in TradeStation that plot “percent under-valued” graphs for stock collections by sector (PFR_WizardSector), industry (PFR_WizardIndustry), and any arbitrary list of stock symbols (PFR_WizardList). In the following figures, the size of the “percent undervalued” bar at the bottom of the chart represents the sampling error. Arrows mark extreme readings, which can be used to guide buy and sell decisions.
Figure 22. The figure shows the bank index and the percentage of stocks that were undervalued in the bank industry over time. The size of the bar represents the sampling error. Arrows mark extreme readings, which can be used to guide buy and sell decisions.
Figure 23. The figure shows the retail index and the percentage of stocks that were undervalued in that sector over time. The size of the bar represents the sampling error. Arrows mark extreme readings, which can be used to guide buy and sell decisions.
Figure 24. The figure shows the oil index and the percentage of stocks that were undervalued in the oil sector over time. The size of the bar represents the sampling error. Arrows mark extreme readings, which can be used to guide buy and sell decisions.
Figure 25. The figure shows the percentage of stocks that make up QQQ (Nasdaq 100) that are under their fair value.
EXCEL™ Database and Summary Spreadsheets
The results of our stock pricing model are sent to clients each month in several forms, as was mentioned before. The following picture shows the EXCEL database format for the August 2004 report:
Table 5. EXCEL spreadsheet containing estimated prices for all stocks with sufficient data.
The fields listed across the top include symbol, company name, sector name, industry name, market cap, month-end price from the prior month, estimated price (called “netprice”), and the percent change between the month-end price and our estimated price.
Another EXCEL report contains statistics on various collections of stocks. The August 2004 report is shown below. At the top of this report is the statistical analysis of the stocks by common industry. For instance, in the apparel industry there are 37 stocks, of which 16% are undervalued; on average the market price is 18% above our estimated price, and 20% above at the median. Not a very attractive industry. The report looks at sectors, market cap ranges, value versus growth, and industry breakdowns. The next two pages contain pictures created in EXCEL and derived from this August 2004 summary report.
Table 6. EXCEL spreadsheet containing valuation statistics by sector, industry, market cap, and value/growth rank.
Figure 26.
Figure 27.
Figure 28.
Figure 29.
Chapter 3
HISTORICAL TESTING
Survivorship and In-Sample Considerations
Historical testing errors are often the undoing of great-looking investment management systems. There are four main sources of these errors, all dangerous: curve fitting, clairvoyance, random data errors, and survivorship bias. We have already discussed curve fitting, although it is called memorization in neural network lingo. It means that a neural net has not really learned anything about the facts presented but, because of the size of the network, was able simply to remember every fact. We have shown that this did not happen in our work; both the training and test sets scored the same when processed by our networks. An additional memorization test requires that we compare how well the networks performed in the “out-of-sample” period versus the “in-sample” period. Our out-of-sample period began in October 2001, so we will examine results before and after that date. Clairvoyance in our case would mean that information used to construct historical price estimates was not available on or before the publishing date for those estimates. We are confident that Zacks data is date consistent, since the databases we used had been carefully vetted over many years.
We do know, however, that occasionally corporate data is entered into these databases with a decimal place in the wrong spot. These random errors are worked out of a database over time, but would have been present in real time.
We do need to set some limits for our testing so that data errors are a bit more limited. Toward that end, we have chosen a set of all stocks, sampled monthly, with a month-end price prior and subsequent to the published valuation report that was above $5.00 a share at the time. We assume that if a stock is found mid-month, it is bought at month end if it is still above $5.00. Further, we have restricted market capitalizations to be above $250 million as of the prior month-end. We also require that the database list an earnings report date. In order to evaluate subsequent performance, we require at least one additional price point within the following 12 months. Performance for missing price periods is interpolated, but never extrapolated; missing prices at the end of the evaluation period are simply the last known price carried forward. A sketch of these screening rules follows.
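A minimal sketch of the screening rules, assuming a monthly record table with hypothetical column names:

```python
import pandas as pd

def build_test_set(records: pd.DataFrame) -> pd.DataFrame:
    """Apply the screening rules described above to monthly stock records.

    Assumed (hypothetical) columns: 'price_prior' and 'price_next' (month-end
    prices before and after the published report), 'mktcap_prior',
    'earnings_report_date', and 'n_prices_next_12m'.
    """
    mask = (
        (records["price_prior"] > 5.00)
        & (records["price_next"] > 5.00)
        & (records["mktcap_prior"] > 250e6)           # $250 million minimum
        & records["earnings_report_date"].notna()      # must list an earnings report date
        & (records["n_prices_next_12m"] >= 1)          # at least one follow-up price point
    )
    return records[mask]
```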
This set of filters produces 350,215 records spanning the date range from Oct-1989 through Aug-2004 (181 months). Out of these, a subset of 193,467 have been priced using our model. The average 12-month performance of all stocks during the period was 9.23%. We were not able to price every stock because of missing data, but data availability itself turns out to carry information. The 12-month performance of stocks that had valuations was 9.67%, while the average was only 8.69% for the stocks that had none. This means that simply having a valuation gives an advantage of 98 basis points over those that didn’t, or 44 bp over the set of all stocks, as shown in the graph below (from w1_2.xls):
Figure 30. This shows the average percentage gain for all stocks, all priced stocks, and all unpriced stocks during our test period from Oct-1989 to August 2004 ($250 million minimum market cap). Note there is a 44 basis point advantage at 12 months just because the company reported sufficient data to allow pricing.
The next pictures show the average 12 month performance of all priced stocks during the test period that were undervalued by different percentages at the time of hypothetical purchase. The first chart shows the straight average, while the second shows the excess performance above the average. Note that just being undervalued by any amount resulted in a performance gain of 400 basis points.
Figure 31. This shows the average percentage gain for all priced stocks that were undervalued by some percentage at the time of hypothetical purchase during our test period from Oct-1989 to August 2004 ($250 million minimum market cap). Note there is a 400 basis point advantage at 12 months just by being undervalued by any amount.
Figure 32. This shows the 12-month excess average percentage gain in basis points for all priced stocks that were undervalued by different percentages at the time of hypothetical purchase during our test period from Oct-1989 to August 2004 ($250 million minimum market cap). Note there appears to be an optimal 12-month performance of 600 basis points achieved by being undervalued by at least 20% at purchase.
Figure 33. This shows the average percentage gain for all priced stocks that were either over or undervalued by some percentage at the time of hypothetical purchase during our test period from Oct-1989 to August 2004 (250 million min market cap).
Figure 34. This shows the average percentage gain for all priced stocks that were undervalued by some percentage at the time of hypothetical purchase during our test period from Oct-1989 to August 2004 ($5 billion minimum market cap). Note there is a 325 basis point advantage at 12 months just by being undervalued by any amount. However, large cap stocks more than 40% undervalued performed worse than average at 12 months.
Figure 35. This shows the average excess percentage gain by year for all priced stocks that were undervalued by some percentage at the time of hypothetical purchase during our test period from Oct-1989 to August 2004. Note that during the bubble years of 1998 and 1999, undervalued stocks underperformed the index, while after the bubble in 2000, they soared.
Information Coefficient
Information Coefficient, or IC, is an accepted measure of skill for ranking stocks. IC is calculated as follows:
1. Rank the stocks from best long ideas (forecasted highest return) to best short ideas (forecasted lowest return).
2. Rank the subsequent period's actual returns, e.g. one month or 6 months, from highest return to lowest return.
3. Calculate the correlation between those two series of ranks.
The higher the IC, the better the skill. A good IC is typically in the .05-.06 range, and above .07 is generally considered excellent. IC is a measure of the slope of the graph of forecasted returns compared with actual returns. The weakness of IC is that it is a linear calculation. You could therefore have a high IC but lower accuracy on the tails of the distribution (the best long and short recommendations), or a low IC yet excellent forecast ability on the tails. A minimal sketch of the calculation is given below.
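Because the IC is a correlation of ranks, it can be computed directly as a Spearman rank correlation between forecasted and realized returns; the snippet below is a straightforward rendering of the three steps above.

```python
from scipy.stats import spearmanr

def information_coefficient(forecast_returns, realized_returns):
    """Rank correlation between forecasted returns and subsequent realized returns."""
    ic, _ = spearmanr(forecast_returns, realized_returns)
    return ic

# A value around 0.05-0.06 is good; above 0.07 is generally considered excellent.
```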
Figure 36. This shows the average percentage gain per month for all priced stocks, and for undervalued stocks with varying predicted betas, during our test period from Oct-1989 to August 2004 ($5 billion minimum market cap). Note that having a predicted beta from 0.8 to 1.2 and being undervalued performed better than stocks with predicted betas outside of this range.
Predictor Statistical Testing Methods
Parallax has been producing neural-network-based financial predictors since 1990, so it is an integral part of our business to validate these predictors using reliable statistical methods. With each predictor, we need to answer the following set of questions:
1. What price behavior is being predicted?
2. What is the effective duration of the predictor?
3. How statistically significant is the predictor at each time step forward?
4. Is the predictor effective on all time scales?
5. Is the predictor effective on all financial series?
6. What combinations of predictors are the most effective?
The following section is an overview of a simple and reliable statistical method and an 8-year analysis of the Price Wizard predictor. In order to carry out this analysis, it is necessary to measure the price action during the time period immediately following each prediction event. We call this the “post-predictor” or “outrun” period.
Post-Predictor Z Scores
We are interested in characterizing the post-predictor time period using a scale-independent method that allows comparisons between financial series. The “Z Score” is the most appropriate measure for this job. A Z-Score is the measure of how many standard deviations price has moved away from its price at the prediction event, assuming that the probability of either an up or down move is random at 50%. By measuring local volatility at the prediction event, a normal probability distribution can be drawn going forward in time that acts as a roadmap for subsequent price moves. The map is centered at the closing price of the prediction event. Each day the map widens according to normal diffusion, with the width of the region where the future price is most likely to be found growing in proportion to the square root of time. An example of such a probability map is shown below for the stock Home Depot on Feb 3, 2005:
Viewed from the side at two time steps, diffusion acts to spread out the region where we might find the stock price as time elapses:
Diffusion causes expected prices to spread out over time
At each time step there is a larger standard deviation and the same mean. If we represent the actual price achieved at each step in terms of that standard deviation, we produce a series of Z-Scores. For example, if the price at a prediction event is $5, and then moves to $7 on day ten with a standard deviation of $1.60, then the Z-Score on day ten would be (7 - 5)/1.6 = 1.25. This means that price moved 1.25 standard deviations above the price at the prediction event. Since the standard deviation continues to increase, price would have to keep increasing by the same relative amount in order to maintain the same Z. A minimal sketch of the calculation follows.
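The calculation can be written compactly as follows; the local one-day volatility is whatever volatility estimate is taken at the prediction event.

```python
import numpy as np

def post_predictor_z(prices, event_price, daily_sigma):
    """Z-Scores of post-predictor prices relative to the prediction-event price.

    prices      -- closing prices on days 1..N after the event
    event_price -- closing price at the prediction event
    daily_sigma -- local one-day volatility (in price units) measured at the event
    """
    days = np.arange(1, len(prices) + 1)
    sigma_t = daily_sigma * np.sqrt(days)            # diffusion: std dev grows with sqrt(time)
    return (np.asarray(prices, dtype=float) - event_price) / sigma_t

# Worked example from the text: event price $5, price $7 on day ten, std dev $1.60 on day ten
# (so daily_sigma = 1.60 / sqrt(10)) gives a day-ten Z-Score of (7 - 5) / 1.60 = 1.25.
```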
If all post-predictor prices for all financial series are converted to Z-Scores, and if the predictors do not work, and if markets are random, then at every time step a histogram of all the Z-Scores would be expected to form a “standardized” normal distribution with a mean of zero and a standard deviation of one, N(0,1), as shown below. We of course hope that our predictors actually predict non-random price behavior, so the degree of deviation from the normal curve is critical. We will use a Chi-squared test to determine whether the distributions are significantly different, along the lines sketched below.
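A sketch of that comparison, with an illustrative binning of the Z-Scores (the production binning may differ):

```python
import numpy as np
from scipy.stats import norm, chisquare

def z_scores_vs_normal(z_scores, bins=np.linspace(-4, 4, 21)):
    """Chi-squared comparison of an observed Z-Score histogram against N(0,1)."""
    observed, edges = np.histogram(z_scores, bins=bins)
    expected = np.diff(norm.cdf(edges)) * len(z_scores)   # counts implied by N(0,1)
    expected *= observed.sum() / expected.sum()            # match totals for the test
    stat, p_value = chisquare(observed, expected)
    return stat, p_value    # a small p-value means the distributions differ significantly
```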
Are Financial Series Random?
We have assumed that price movement is best characterized by a random walk model, but this might not be the case. There is gathering evidence for other characteristic distributions such as a biased random walk, truncated Lévy flights, or the Cauchy distribution. Our solution is to produce a quantitative background distribution based on randomly selected dates across our entire 3,000-day test period, and for all of the 2,500 stocks being considered. The figure below shows this background distribution of Z-Scores corresponding to random prediction dates, plotted together with a normal distribution based on a random walk assumption. It is clear from this figure that a strict random walk assumption is inappropriate during our 8-year test period; instead there appears to be a positive return bias.
To illustrate this another way, we could ask what percentage of buy predictions are winners (Z > 0) if the timing is random, and plot this percentage each day following the randomly selected purchase dates. Normally this would be 50%, but since the background distribution is positively biased, the figure below shows that the percentage of winners for randomly selected buys climbs steadily over time. The reverse is of course true for randomly selected sells (mirror image).
On a weekly scale the effect is even more pronounced, as shown in the next figure. This could be called the dartboard effect: even a random selection of stocks during this period showed a 57% win rate at 30 weeks after purchase, and shorting was decidedly unprofitable.
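The background (“dartboard”) win-rate curve itself is straightforward to compute once the random-date Z-Scores are in hand; a minimal sketch:

```python
import numpy as np

def random_timing_win_rate(z_scores_by_horizon):
    """Percent of randomly timed buys that are winners (Z > 0) at each horizon.

    z_scores_by_horizon -- 2-D array, one row per randomly selected (stock, date)
    pair and one column per day (or week) after the random entry.
    """
    return (np.asarray(z_scores_by_horizon) > 0).mean(axis=0) * 100.0

# Under a strict random walk this curve would hover near 50%; over this test period
# it drifts upward instead (about 57% at 30 weeks), with the mirror image for sells.
```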
We will use these curves as benchmarks for our predictors over this test period and stock set. In order to have a non-random buy predictor, then, we need to see a win rate in excess of 57% at 30 weeks, for example, and sells would need a win rate better than 43%. So far we have examined the background distribution of Z-Scores and the percentage of winners. What we still need to know is how much gain is possible from randomly selected buys and sells, so that each predictor’s excess gain profile can be evaluated. The pictures below show the positively biased background gain performance for randomly selected buys and sells:
Price Wizard™ as a Predictor
Price is often out of sync with value, being bid up or down based on future expectations and irrational trend persistence. The Price Wizard model discussed in this paper incorporates all major stores of corporate value, normalized simultaneously for sector, industry, and economic factors. It predicts what price should be, and if it works, then we should be able to detect a significant effect. The figure below shows the Z-Scores 30 weeks after a stock moves from overvalued to undervalued. Note the significant positive bias of the undervalued-stock Z distribution relative to the background distribution. A Chi-squared test is designed for comparing distributions, and p < 0.01 leads us to reject the hypothesis that the predictor distribution is merely the random background. In the case below p = 0, which confirms that these two distributions are indeed statistically different.
These two graphs show the percentage of winners, daily and weekly, for buys and shorts, which exceed the background “dartboard” rate.
The next four graphs show the median and average percentage gain, daily and weekly, for longs and shorts, which exceed the background “dartboard” rate.
Dynamic Price Wizard™ (F% for short) as a Predictor
The knowledge of how much a stock is undervalued or overvalued is of primary importance, but the rate of change in value and price during the preceding year is also important. The Dynamic Price Wizard neural net (we call the indicator F% for short) adds value by capturing these dynamics explicitly. The figure below shows the Z-Scores 30 weeks after a stock’s F% rank moves above 90 out of 100. Note the significant positive bias of these stocks’ Z distribution relative to the background distribution.
These two graphs show the percentage of winners, daily and weekly, for buys and shorts, which exceed the background “dartboard” rate.
The next two graphs show the median and average percentage gain weekly, for longs and shorts, which exceed the background “dartboard” rate.