Miscellaneous knowledge snippets.
Question | Answer | Reference |
---|---|---|
Naive Bayes assumptions | Independence among predictors - assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature. | |
Which modeling techniques require predictors to have a common scale? | 1. Neural networks 2. K means clustering 3. SVM’s 4. K Nearest Neighbours 5. Any technique using regularisation |
|
What is gradient descent? | Gradient Descent is an optimization algorithm for finding a local minimum of a differentiable function. Gradient descent is simply used to find the values of a function’s parameters (coefficients) that minimize a cost function as far as possible. | Wikipedia |
What is regularization? | Regularization is adding a penalty term to the objective function to control the model complexity using that penalty term. When a model overfits data or the predictors are collinear, parameter estimates can become inflated. Adding a penalty stops this and can result in a lower error. | |
What are regularisation techniques for neural networks? | 1. Learning rate shrinkage 2. Early stopping 3. Batch normalization 4. Ensembles |
Empirical Asset Pricing via Machine Learning(https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3159577) provides a good summary |
Question | Answer | Reference |
---|---|---|
What is polynomial contrast? | The process of transforming ordered categorical variables / predictors to numeric quantities. | SO 105115 |
What characteristics of a response variable indicates it should be transformed prior to modelling? | Responses that have a distribution where the frequency of response proportionally decreases with larger values may indicate that the response follows a log-normal distribution. In this case, log-transforming the response would induce a normal (bell-shaped, symmetric) distribution and often will enable a model to have better predictive performance. | FES1 s.4.2.1 |
xxxxx | xxxxx | xxxxx |
Question | Answer | Reference |
---|---|---|
What is the primary purpose of feature selection? | Removal of non-informative or redundant predictors | FES1 |
What type of models feature automatic feature selection? | 1. Tree-based models 2. MARS 3. Elastic net / LASSO 4. Nearest shrunken centroids 5. GAM’s (use parameter “select = TRUE”, effectively the same as LASSO) |
gam.selection |
xxxxx | xxxxx | xxxxx |
xxxxx | xxxxx | xxxxx |
1FES : Feature Engineering and Selection: A Practical Approach for Predictive Models
Source material : Google sheets