Knowledge snippets

Miscellaneous knowledge snippets.

Modelling

Question	Answer	Reference
Naive Bayes assumptions	Conditional independence among predictors: given the class, the presence of a particular feature is assumed to be unrelated to the presence of any other feature.
Which modelling techniques require predictors to have a common scale?	1. Neural networks 2. K-means clustering 3. SVMs 4. K-nearest neighbours 5. Any technique using regularisation
What is gradient descent?	Gradient descent is an optimization algorithm for finding a local minimum of a differentiable function. It repeatedly moves a function’s parameters (coefficients) a small step in the direction of the negative gradient, so as to reduce a cost function as far as possible.	Wikipedia
What is regularization?	Regularization adds a penalty term to the objective function to control model complexity. When a model overfits the data or the predictors are collinear, parameter estimates can become inflated. The penalty shrinks the estimates and can result in a lower prediction error.
What are regularisation techniques for neural networks?	1. Learning rate shrinkage 2. Early stopping 3. Batch normalization 4. Ensembles	Empirical Asset Pricing via Machine Learning (https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3159577) provides a good summary
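
The gradient descent and regularization answers above can be illustrated together. The following is a minimal sketch, not a reference implementation: it fits a straight line y = a + b*x by gradient descent, with an optional ridge (squared) penalty on the slope. The function names, toy data, learning rate and penalty value are all assumptions made for illustration.

```python
def gradient_descent(grad, start, learning_rate=0.1, n_steps=200):
    """Repeatedly step against the gradient, starting from `start`."""
    params = list(start)
    for _ in range(n_steps):
        g = grad(params)
        params = [p - learning_rate * gi for p, gi in zip(params, g)]
    return params


def make_ridge_gradient(xs, ys, penalty):
    """Gradient of mean squared error + penalty * b**2 for the line y = a + b*x."""
    n = len(xs)

    def grad(params):
        a, b = params
        residuals = [a + b * x - y for x, y in zip(xs, ys)]
        grad_a = 2.0 * sum(residuals) / n
        # The penalty contributes 2 * penalty * b, pulling the slope towards zero.
        grad_b = 2.0 * sum(r * x for r, x in zip(residuals, xs)) / n + 2.0 * penalty * b
        return [grad_a, grad_b]

    return grad


# Hypothetical toy data; x is already centred on zero.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [-3.9, -2.1, 0.1, 1.9, 4.0]

unpenalised = gradient_descent(make_ridge_gradient(xs, ys, penalty=0.0), [0.0, 0.0])
penalised = gradient_descent(make_ridge_gradient(xs, ys, penalty=1.0), [0.0, 0.0])

print("slope without penalty:", round(unpenalised[1], 4))
print("slope with penalty:   ", round(penalised[1], 4))
```

The penalised slope comes out smaller in magnitude than the unpenalised one, which is the shrinkage effect described in the regularization answer. Because the penalty acts on the raw coefficient, its effect depends on the scale of the predictor, which is why regularised techniques appear in the common-scale list.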

Preprocessing

Question	Answer	Reference
What is polynomial contrast?	The process of transforming ordered categorical variables / predictors to numeric quantities.	SO 105115
What characteristics of a response variable indicate that it should be transformed prior to modelling?	A response whose frequency decreases proportionally as its values get larger (a right-skewed distribution) may follow a log-normal distribution. In this case, log-transforming the response would induce a normal (bell-shaped, symmetric) distribution and will often enable a model to have better predictive performance.	FES¹ s.4.2.1
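
The log-transform answer above can be checked numerically. This is a small sketch under stated assumptions: the response is simulated from a log-normal distribution (hypothetical data, fixed seed), and sample skewness is used as a rough measure of asymmetry. The `skewness` helper is written here for illustration and is not taken from FES.

```python
import math
import random


def skewness(values):
    """Sample skewness: the third standardised moment (near 0 for a symmetric sample)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return sum((v - mean) ** 3 for v in values) / n / var ** 1.5


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Simulated right-skewed response: many small values, few large ones.
response = [rng.lognormvariate(0.0, 1.0) for _ in range(5000)]

# Log-transforming a log-normal variable yields a normal one.
log_response = [math.log(v) for v in response]

raw_skew = skewness(response)
log_skew = skewness(log_response)

print("skewness before transform:", round(raw_skew, 2))
print("skewness after transform: ", round(log_skew, 2))
```

A strongly positive skewness before the transform and a value close to zero afterwards is the pattern that suggests a log transform is worth trying. Note that the log is only defined for strictly positive responses.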

Feature selection

Question	Answer	Reference
What is the primary purpose of feature selection?	Removal of non-informative or redundant predictors	FES¹
Which types of model perform automatic feature selection?	1. Tree-based models 2. MARS 3. Elastic net / LASSO 4. Nearest shrunken centroids 5. GAMs (use the argument “select = TRUE”, which has an effect similar to LASSO)	gam.selection
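
To illustrate how a LASSO-type penalty performs feature selection automatically, here is a minimal coordinate-descent sketch. It assumes centred predictors and response (so no intercept is fitted), and the data, penalty value and function names are hypothetical. It is a teaching sketch and not how any particular library implements LASSO.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero, returning exactly 0 inside the threshold band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(columns, y, penalty, n_sweeps=100):
    """Minimise (1 / 2n) * sum of squared residuals + penalty * sum(|coef|)."""
    n = len(y)
    coefs = [0.0] * len(columns)
    for _ in range(n_sweeps):
        for j, col in enumerate(columns):
            # Residual with predictor j's current contribution left out.
            partial = [
                y[i] - sum(coefs[k] * columns[k][i] for k in range(len(columns)) if k != j)
                for i in range(n)
            ]
            rho = sum(c * r for c, r in zip(col, partial)) / n
            scale = sum(c * c for c in col) / n
            coefs[j] = soft_threshold(rho, penalty) / scale
    return coefs


# Hypothetical centred data: y depends strongly on x1 and barely on x2.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [1.0, -2.0, 0.0, 2.0, -1.0]
y = [-4.0, -2.2, 0.1, 2.1, 4.0]

coefs = lasso_coordinate_descent([x1, x2], y, penalty=0.5)
print("coefficients:", coefs)
```

The soft-thresholding step sets the coefficient of the weakly related predictor to exactly zero, which removes it from the model, while the informative predictor keeps a (shrunken) non-zero coefficient.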

Reference

¹FES : Feature Engineering and Selection: A Practical Approach for Predictive Models

Source material : Google sheets