Important Considerations when Predictive Modeling with Tabular Data

My Notes for “How to Win a Data Science Competition From Top Kagglers” on Coursera.org

Rohan Kotwani
10 min read · Oct 10, 2020
https://www.kaggle.com/progression

Overall, I found the course helpful and insightful: 4.79/5. It covered many ideas that I had not considered before, so I am posting some of my notes here. More than likely, you have already seen most of these ideas, so I will try to focus on the most interesting ones. Here is the link to the course.

TOC

  1. Data Exploration Checklist
  2. Validation
  3. Target Leakage
  4. Metrics and Loss Functions
  5. Metric Optimization
  6. Mean Encoding
  7. Coding Tips
  8. Advanced Feature Engineering
  9. Ensemble Strategies
  10. StackNet
  11. Creating a Diverse Set of Models
  12. Tips on Meta-Learning and Stacking
  13. Text Based Features in XGBoost
  14. Sequence Feature Extraction (XGBoost)
  15. Semi-supervised & Pseudo Labeling
  16. My Opinion

Data Exploration Checklist


