Miscellaneous knowledge snippets.


Modelling


Question Answer Reference
Naive Bayes assumptions Conditional independence among predictors: given the class, the presence of a particular feature is assumed to be unrelated to the presence of any other feature.
Which modeling techniques require predictors to have a common scale? 1. Neural networks
2. K-means clustering
3. SVMs
4. K-nearest neighbours
5. Any technique using regularisation
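To see why distance-based methods such as K-nearest neighbours and K-means care about scale, here is a minimal sketch using only the Python standard library. The income and age figures are made up for illustration, and `standardize` and `distance` are hypothetical helper names, not part of any library.

```python
import math
import statistics


def standardize(column):
    """Rescale a list of numbers to mean 0 and population standard deviation 1."""
    mean = statistics.fmean(column)
    sd = statistics.pstdev(column)
    return [(x - mean) / sd for x in column]


def distance(i, j, columns):
    """Euclidean distance between rows i and j across the given columns."""
    return math.sqrt(sum((col[i] - col[j]) ** 2 for col in columns))


# Made-up data for three people: income in dollars, age in years.
income = [30_000.0, 32_000.0, 90_000.0]
age = [25.0, 60.0, 27.0]

raw = [income, age]
scaled = [standardize(income), standardize(age)]

# Raw scale: income dominates, so the 35-year age gap between
# person 0 and person 1 barely registers in the distance.
raw_d01 = distance(0, 1, raw)
raw_d02 = distance(0, 2, raw)

# Standardized scale: both predictors contribute comparably.
scaled_d01 = distance(0, 1, scaled)
scaled_d02 = distance(0, 2, scaled)

print(f"raw:    d(0,1)={raw_d01:.1f}  d(0,2)={raw_d02:.1f}")
print(f"scaled: d(0,1)={scaled_d01:.3f}  d(0,2)={scaled_d02:.3f}")
```

On the raw scale person 0 looks far closer to person 1 than to person 2; after standardizing, the two distances are of similar size because the age difference is no longer drowned out by income.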
What is gradient descent? Gradient descent is an iterative optimization algorithm for finding a local minimum of a differentiable function. It repeatedly steps in the direction of the negative gradient, and in model fitting it is used to find the parameter (coefficient) values that minimize a cost function. Wikipedia
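A minimal sketch of the idea, assuming a one-dimensional cost function (x - 3)^2 chosen purely for illustration. The function names, learning rate and step count are arbitrary choices, not taken from the reference.

```python
def gradient_descent(grad, start, learning_rate=0.1, n_steps=100):
    """Repeatedly step against the gradient, starting from `start`."""
    x = start
    for _ in range(n_steps):
        x = x - learning_rate * grad(x)
    return x


def cost(x):
    """Illustrative cost function with its minimum at x = 3."""
    return (x - 3.0) ** 2


def cost_grad(x):
    """Derivative of the cost function above."""
    return 2.0 * (x - 3.0)


x_min = gradient_descent(cost_grad, start=0.0)
print(x_min, cost(x_min))
```

With this learning rate each step shrinks the distance to the minimum by a constant factor, so the result ends up very close to 3. A learning rate that is too large would make the iterates diverge instead.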
What is regularization? Regularization adds a penalty term to the objective function to control model complexity. When a model overfits the data or the predictors are collinear, parameter estimates can become inflated. The penalty shrinks the estimates towards zero, which can result in a lower prediction error.
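As a small worked example of the shrinkage effect, the sketch below fits a single-predictor regression without an intercept under a ridge (squared) penalty, where the minimizer of sum((y - b*x)^2) + lam*b^2 has the closed form b = sum(x*y) / (sum(x*x) + lam). The data and the name `ridge_slope` are made up for illustration.

```python
def ridge_slope(x, y, lam):
    """Slope minimizing sum((y - b*x)^2) + lam * b^2 (no intercept)."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)


# Made-up data, roughly y = 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]

# lam = 0 is ordinary least squares; larger penalties shrink the slope.
slopes = {lam: ridge_slope(x, y, lam) for lam in (0.0, 1.0, 10.0, 100.0)}
for lam, slope in slopes.items():
    print(f"lambda={lam:>5}: slope={slope:.4f}")
```

The slope moves steadily towards zero as the penalty grows, which is the "stops estimates becoming inflated" behaviour described above.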
What are regularisation techniques for neural networks? 1. Learning rate shrinkage
2. Early stopping
3. Batch normalization
4. Ensembles
Empirical Asset Pricing via Machine Learning (https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3159577) provides a good summary.
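Of the techniques listed, early stopping is the easiest to sketch without a deep-learning library. The version below is a simplified illustration, not the procedure from the paper: it assumes the validation loss for each epoch is already available as a list, and `early_stopping` and `patience` are hypothetical names.

```python
def early_stopping(val_losses, patience=2):
    """Scan per-epoch validation losses and stop once the loss has not
    improved for `patience` consecutive epochs.

    Returns (best_epoch, stopped_at): the epoch whose weights would be
    kept, and the epoch at which training would halt.
    Assumes `val_losses` is non-empty.
    """
    best_epoch = 0
    best_loss = float("inf")
    stopped_at = None
    for epoch, loss in enumerate(val_losses):
        stopped_at = epoch
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            break
    return best_epoch, stopped_at


# Made-up validation losses: improvement stalls after epoch 2.
losses = [0.90, 0.70, 0.60, 0.62, 0.65, 0.55, 0.70]
best_epoch, stopped_at = early_stopping(losses, patience=2)
print(best_epoch, stopped_at)
```

Training halts at epoch 4 and keeps the weights from epoch 2, so the later dip to 0.55 is never reached. That is the trade-off controlled by `patience`.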


Preprocessing


Question Answer Reference
What is a polynomial contrast? A coding scheme that transforms an ordered categorical predictor into numeric columns representing linear, quadratic and higher-order trends across its levels. SO 105115
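A rough sketch of how such contrasts can be built, assuming equally spaced levels: take powers of the centred level scores and orthonormalize them with Gram-Schmidt. The function name `polynomial_contrasts` is made up for illustration, and this is one way to construct the columns, not necessarily how any particular package does it.

```python
import math


def polynomial_contrasts(n_levels):
    """Orthonormal polynomial contrasts for an ordered factor with
    `n_levels` equally spaced levels.

    Returns a list of n_levels - 1 vectors (linear, quadratic, ...),
    each with one entry per level.
    """
    if n_levels < 2:
        raise ValueError("need at least two levels")
    # Centred scores, e.g. [-1, 0, 1] for three levels.
    scores = [i - (n_levels - 1) / 2 for i in range(n_levels)]
    # Start from the normalized constant vector so every contrast sums to 0.
    basis = [[1 / math.sqrt(n_levels)] * n_levels]
    for degree in range(1, n_levels):
        v = [s ** degree for s in scores]
        # Remove the components along all lower-degree vectors.
        for b in basis:
            proj = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - proj * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        basis.append([vi / norm for vi in v])
    return basis[1:]  # drop the constant


linear, quadratic = polynomial_contrasts(3)
print([round(c, 4) for c in linear])
print([round(c, 4) for c in quadratic])
```

For a three-level factor such as low < medium < high, the linear column increases steadily across the levels while the quadratic column is positive at both ends and negative in the middle.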
What characteristics of a response variable indicate it should be transformed prior to modelling? Responses whose frequency decreases proportionally as values get larger (a right-skewed distribution) may follow a log-normal distribution. In this case, log-transforming the response induces a normal (bell-shaped, symmetric) distribution and will often enable a model to have better predictive performance. FES1 s.4.2.1
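A quick way to check this on data is to compare skewness before and after the transform. The sketch below uses a simulated log-normal response, not real data. The `skewness` helper is a plain moment-based estimate written for illustration, and the seed and sample size are arbitrary.

```python
import math
import random
import statistics


def skewness(values):
    """Moment-based sample skewness (0 for a symmetric distribution)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Simulated strictly positive, right-skewed response.
response = [rng.lognormvariate(0.0, 1.0) for _ in range(5000)]
log_response = [math.log(v) for v in response]

skew_raw = skewness(response)
skew_log = skewness(log_response)
print(f"skewness before: {skew_raw:.2f}, after log: {skew_log:.2f}")
```

The raw response is strongly right-skewed, while the logged response has skewness close to zero. Note that the log transform only applies to strictly positive responses.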


Feature selection


Question Answer Reference
What is the primary purpose of feature selection? Removal of non-informative or redundant predictors FES1
Which types of models perform automatic feature selection? 1. Tree-based models
2. MARS
3. Elastic net / LASSO
4. Nearest shrunken centroids
5. GAMs (use the argument “select = TRUE”, which adds a penalty that can shrink terms to zero, similar in effect to LASSO)
gam.selection
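To illustrate why penalized models like the LASSO perform selection, the sketch below applies the soft-thresholding rule, which gives the LASSO solution in the special case of orthonormal predictors (an assumption made here to keep the example short). The coefficient values, predictor names and penalty are made up.

```python
def soft_threshold(beta, lam):
    """Shrink a coefficient towards zero by lam, setting it to exactly
    zero when its magnitude is at most lam."""
    if beta > lam:
        return beta - lam
    if beta < -lam:
        return beta + lam
    return 0.0


# Made-up least-squares coefficients for four predictors.
ols_coefficients = {"x1": 2.5, "x2": -0.3, "x3": 0.1, "x4": -1.2}
lam = 0.5

lasso_coefficients = {
    name: soft_threshold(beta, lam) for name, beta in ols_coefficients.items()
}
selected = [name for name, beta in lasso_coefficients.items() if beta != 0.0]
print(lasso_coefficients)
print(selected)
```

Predictors with small coefficients (`x2` and `x3` here) are set to exactly zero and drop out of the model, while the remaining coefficients are shrunk. A ridge penalty, by contrast, shrinks coefficients without zeroing them.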