How to perform feature selection and hyperparameter optimization in cross validation?
Note: I have read many of the questions already posted on this topic, but I am still confused.
I want to perform feature selection and model selection for multiple models, e.g. random forest (RF), support vector machine (SVM), and lasso regression. There seem to be a few ways to do feature selection (FS) or hyperparameter optimization (HPO) through cross-validation (CV). My data set has n ~ 700 samples and p = 272 features. However, adding another set of features could increase p to ~20272.
My current plan is the following:
1. Run a resampling method (k-fold or Monte Carlo) to get different splits into pseudo-training and pseudo-test data.
2. In each iteration of resampling:
   a. Run feature selection on the pseudo-training data.
   b. Increment the counts for the top variables that were selected.
   c. Train the model on the pseudo-training data using those features.
   d. Estimate how well it does by testing on the pseudo-test data.
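To make the plan above concrete, here is a minimal sketch of the resampling loop in plain Python. It is only an illustration under stated assumptions: the split is Monte Carlo (repeated random splits), the feature filter ranks features by absolute correlation with the label, and the model is a nearest-centroid classifier. Both are stand-ins I chose for brevity, not the RF, SVM, or lasso models from the question, and all function names (`select_top_features`, `fit_centroids`, `resample_selection`, `make_toy_data`) are hypothetical.

```python
import random
from collections import Counter


def abs_corr(xs, ys):
    """Absolute Pearson correlation; 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((v - my) ** 2 for v in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def select_top_features(X, y, k):
    """Stand-in filter (assumption): keep the k features most correlated with y."""
    p = len(X[0])
    scores = [abs_corr([row[j] for row in X], y) for j in range(p)]
    return sorted(range(p), key=lambda j: scores[j], reverse=True)[:k]


def fit_centroids(X, y, features):
    """Stand-in model (assumption): per-class mean of the selected features."""
    centroids = {}
    for label in set(y):
        rows = [[row[j] for j in features] for row, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, row, features):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    x = [row[j] for j in features]
    return min(
        centroids,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
    )


def resample_selection(X, y, k=3, n_splits=20, test_frac=0.25, seed=0):
    """Monte Carlo resampling: select, count, train, and test within each split."""
    rng = random.Random(seed)
    n = len(X)
    n_test = int(n * test_frac)
    counts = Counter()   # how often each feature index was selected
    accuracies = []      # pseudo-test accuracy per split
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)
        test_idx, train_idx = idx[:n_test], idx[n_test:]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        # Feature selection sees only the pseudo-training data.
        feats = select_top_features(X_tr, y_tr, k)
        counts.update(feats)
        model = fit_centroids(X_tr, y_tr, feats)
        correct = sum(predict(model, X[i], feats) == y[i] for i in test_idx)
        accuracies.append(correct / n_test)
    return counts, accuracies


def make_toy_data(n=120, p=20, seed=1):
    """Synthetic data (assumption): only features 0 and 1 carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(p)]
        row[0] += 2.0 * label
        row[1] -= 2.0 * label
        X.append(row)
        y.append(label)
    return X, y
```

Calling `counts, accuracies = resample_selection(*make_toy_data())` returns the per-feature selection counts and one accuracy per split; `counts.most_common()` then lists the features in order of how often they were selected. Because selection is repeated inside every split, the pseudo-test rows never influence which features are chosen for that split.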
Now we can select our feature set by taking the top k selected varia