Pricing Illiquid Assets: A Deep Learning Approach
Oded Luria
Deep Learning Meetup, Dec 2015
Deep Learning in Nature (May 2015)
"Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech."
See also: Jürgen Schmidhuber, Critique of Paper by "Deep Learning Conspiracy" (Nature 521, p. 436). http://people.idsia.ch/~juergen/deep-learning-conspiracy.html
Takeuchi and Lee, 2013
Examine whether deep learning techniques can discover features in time series of stock prices that successfully predict future returns.
- Main idea: an autoencoder reduces the inputs to 4-dimensional features; a classifier then outputs probabilities of two classes (returns below/above the median).
- Performance: overall accuracy rate of 53%.
- Open questions: using separate autoencoders for different categories of features; the impact of updating the weights over time.
See also: Gilberto Batres-Estrada, 2015. http://cs229.stanford.edu/proj2013/takeuchilee-applyingdeeplearningtoenhancemomentumtradingstrategiesinstocks.pdf
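A minimal sketch of the two-stage pipeline above, on synthetic data. For brevity the autoencoder here is linear and the classifier is a logistic regression; the paper's actual networks are deeper and nonlinear, and all shapes and hyperparameters below are illustrative assumptions, not the authors' settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the paper's inputs: 200 "stocks", 12 past monthly returns each.
X = rng.standard_normal((200, 12))
y = (X.sum(axis=1) > np.median(X.sum(axis=1))).astype(float)  # above/below median return

# --- Stage 1. Autoencoder: compress the 12 inputs to a 4-dimensional code ---
n_in, n_code = X.shape[1], 4
W_enc = rng.standard_normal((n_in, n_code)) * 0.1
W_dec = rng.standard_normal((n_code, n_in)) * 0.1
lr = 0.01
mse_before = ((X @ W_enc @ W_dec - X) ** 2).mean()
for _ in range(500):
    code = X @ W_enc
    err = code @ W_dec - X                    # reconstruction error
    W_dec -= lr * code.T @ err / len(X)       # gradient of mean squared error
    W_enc -= lr * X.T @ (err @ W_dec.T) / len(X)
mse_after = ((X @ W_enc @ W_dec - X) ** 2).mean()

codes = X @ W_enc                              # the learned 4-d features

# --- Stage 2. Classifier on the codes: two classes (below/above the median) ---
w = np.zeros(n_code)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(codes @ w)))     # class probabilities
    w -= 0.1 * codes.T @ (p - y) / len(X)      # logistic-regression gradient step
accuracy = ((codes @ w > 0) == (y > 0.5)).mean()
```

The point of the sketch is the division of labor: the autoencoder learns a compact representation without using the labels, and only the small classifier on top sees them.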
Everyone is exploring (*from public sources) http://robusttechhouse.com/list-of-funds-or-trading-firms-using-artificial-intelligence-or-machine-learning/
"If you have a deep learning architecture that can identify those patterns across different time scales, you can arguably use that information to better forecast what will happen next." Binatix's software doesn't just learn from static data points, but incorporates temporal signals: essentially, how the information continually changes over time.
[Photo: Nadav Ben-Efraim (left) and Itamar Arel (right)]
Finance and Deep Learning? (*from public sources)
PROS:
- Lots of features, lots of examples
- Complex data
- Versatile regimes
- Evolving market conditions
- Competitive advantage?
CONS:
- Weak spatial correlation / parameter tying
- Requires a transparent model
- Conservative industry
Pricing illiquid assets
Business Requirements:
1. Accurate pricing
2. Prediction confidence
3. Transparency
[Figure: theoretical price curve vs. time around a trade; the price uncertainty increases with time since the trade]
Experimental settings: data preparation
- Mixture of numerical, boolean & categorical features
- Filtering outliers
- Scaling
- Filling missing values
- Splitting categorical variables
- Some degree of feature engineering
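The preparation steps above can be sketched end-to-end on a toy bond-like record. The feature names (coupon, callable, rating) and the percentile/median choices are illustrative assumptions, not the deck's actual pipeline.

```python
import numpy as np

# Toy rows: a numerical feature (coupon), a boolean (callable), a categorical (rating).
rows = [
    {"coupon": 5.0,  "callable": True,  "rating": "AA"},
    {"coupon": None, "callable": False, "rating": "BBB"},   # missing value
    {"coupon": 3.5,  "callable": True,  "rating": "AA"},
    {"coupon": 40.0, "callable": False, "rating": "B"},     # outlier
]

# 1. Fill missing numerical values with the column median.
observed = [r["coupon"] for r in rows if r["coupon"] is not None]
median = float(np.median(observed))
for r in rows:
    if r["coupon"] is None:
        r["coupon"] = median

# 2. Filter outliers by clipping to the 5th-95th percentile range.
lo, hi = np.percentile([r["coupon"] for r in rows], [5, 95])
for r in rows:
    r["coupon"] = float(np.clip(r["coupon"], lo, hi))

# 3. Scale the numerical feature to zero mean / unit variance.
vals = np.array([r["coupon"] for r in rows])
scaled = (vals - vals.mean()) / vals.std()

# 4. Split the categorical variable into one indicator column per category.
categories = sorted({r["rating"] for r in rows})
one_hot = np.array([[1.0 if r["rating"] == c else 0.0 for c in categories]
                    for r in rows])

# Final design matrix: scaled numerical + boolean + one-hot categorical columns.
X = np.column_stack([scaled, [float(r["callable"]) for r in rows], one_hot])
```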
Feature engineering?
- Trends
- Cross-sectional information
- Information about other bonds?
*Molecular Activity Challenge: http://blog.kaggle.com/2012/11/01/deep-learning-how-i-did-it-merck-1st-place-interview/
Deep Learning aspects
Research methodology
This (supervised regression) problem can be approached with a number of methods:
1. Discriminative models (supervised learning): use TRACE spreads as target labels for a classification / regression problem.
2. Hybrid models (combining supervised and unsupervised learning): discrimination assisted by the outcomes of generative / unsupervised networks.
3. More options?
Open questions: classification or regression? Network depth? Type of units? Type of optimizer? Use ensemble learning?
Regression vs. classification for price prediction
Regression:
- Direct approach
- Cost function: mean square error
- Per-example error is difficult to estimate*
- Unimodal model
Classification:
- Ordered classes
- Cost function: categorical cross-entropy
- Performance vs. resolution tradeoff
- Good error estimate for each example
- Supports multi-modal decisions
How deep should the network be?
- Shallow networks: easier to train, but could suffer from high bias error.
- Deep networks: superior when the data is complex; have more parameters; more difficult to train; more prone to overfitting?
*units removed from figure
Are the Deep Learning features better?
- Random Forests applied on raw features (raw features → Random Forests classifier → class predictions)
- Random Forests applied on Deep Learning features
*units removed from figure
Are the DL features better?
[Figure: PCA applied on the Deep Learning features (570-60 spread)]
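One way to produce a plot like the one above is to project the learned features onto their top two principal components and check whether examples with different spreads separate. A sketch on mock features (the two clusters, cluster offsets, and dimensions are all illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Mock "Deep Learning features": two clusters standing in for two spread regimes.
feats = np.vstack([
    rng.standard_normal((100, 8)) + 2.0,   # e.g. wide-spread bonds
    rng.standard_normal((100, 8)) - 2.0,   # e.g. tight-spread bonds
])

# PCA via SVD of the centered feature matrix.
centered = feats - feats.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
projected = centered @ Vt[:2].T            # top-2 components, ready for a 2-D scatter

explained = S**2 / (S**2).sum()            # variance explained per component
```

If the learned features carry spread information, the regimes should already look separable in this 2-D projection, before any classifier is applied.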
Assessing prediction quality (classification net)
- Ordinal classification → a smooth probability distribution over spread [bps]
- Indicates spread-prediction certainty
- Similarity to a bid-offer spread
- The predictive power of this index needs to be verified
[Figure: predicted probability vs. spread [bps]; a flat distribution means low confidence / high error, a peaked one means high confidence / low error]
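One way to turn the class distribution into a single certainty index, as suggested above, is normalized entropy: a peaked distribution scores near 1, a flat one near 0. This particular formula is an assumption for illustration, not necessarily the index used in the deck.

```python
import numpy as np

def confidence(probs):
    """Entropy-based certainty index in [0, 1]: 1 = all mass on one class,
    0 = uniform over the classes."""
    probs = np.asarray(probs, dtype=float)
    ent = -np.sum(probs * np.log(probs + 1e-12), axis=-1)
    return 1.0 - ent / np.log(probs.shape[-1])   # normalize by max entropy

sharp = confidence([0.90, 0.05, 0.03, 0.02])     # peaked -> high confidence
flat = confidence([0.25, 0.25, 0.25, 0.25])      # uniform -> low confidence
```

Whether this index actually correlates with realized pricing error is exactly the verification step the slide calls for.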
Training process
[Figure: categorical cross-entropy and bin coverage [%] vs. training epoch]
Ensemble Learning
- Avoids being trapped in saddle points / bad minima
- Handles initialization problems
- Improves resolution (1 network vs. 5 networks)
"A very simple way to improve the performance of almost any machine learning algorithm is to train many different models on the same data and then to average their predictions." (Hinton)
"Another way to exploit computing power to push performance is model averaging... After training them, the outputs of different networks can be averaged." (text modified from Yoshua Bengio)
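The averaging recipe in the quotes above is simple to sketch: train several models from different random initializations on the same data, then average their predicted probabilities. For brevity the "networks" below are logistic models trained by gradient descent; everything here is a stand-in for the deck's actual networks.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_net(seed, X, y):
    """Stand-in for training one network: a logistic model fit by gradient
    descent from a seed-dependent random initialization."""
    r = np.random.default_rng(seed)
    w = r.standard_normal(X.shape[1]) * 0.5     # different start per seed
    for _ in range(200):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w -= 0.5 * X.T @ (p - y) / len(X)
    return w

# Toy two-class data: the label depends (noisily) on the first feature.
X = rng.standard_normal((300, 6))
y = (X[:, 0] + 0.3 * rng.standard_normal(300) > 0).astype(float)

# Train 5 "networks" with different initializations, then average their outputs.
ws = [train_net(s, X, y) for s in range(5)]
probs = np.mean([1.0 / (1.0 + np.exp(-(X @ w))) for w in ws], axis=0)
accuracy = ((probs > 0.5) == (y > 0.5)).mean()
```

Because each run starts from a different initialization, averaging smooths over bad minima reached by individual runs, which is the effect the slide attributes to the 5-network ensemble.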
Integration into business processes
Conclusions
Applying Deep Learning to financial datasets is not trivial. Use it when:
- There are many features and many examples
- The data has complex relationships between variables
- There are many regimes
- Transparency is not required (mostly)
Think of ways to assess your error.
Use benchmark methods to assess the Deep Learning contribution.
Thank you oded.luria@citi.com