Predicting Pet Life Expectancy: An AI Hackathon Challenge

MARS hosts an internal data science/AI hackathon every year, and for 2022 the challenge was to create a model that could predict pet life expectancy based on medical history.

I decided to compete this year to take a break from my daily deep learning work and exercise some AI muscles I hadn’t used in a while (regression problems and tabular data 😅):

  • The model I submitted scored an RMSE of 0.695 years (roughly 8 months).

  • The training data consisted of the medical history of 2000 pets (canine and feline), including the age at which each pet passed away (e.g. 10.856 years).

  • The test data contained the medical history of 1000 pets.

  • The basic workflow was:

    1. Exploratory data analysis to understand the data and remove obviously bad records (e.g. pets with negative ages).

    2. Basic feature selection (about 15 columns of data, e.g. number of vet visits over a lifetime).

    3. Feature engineering: creating new features from our understanding of the data (over 3,000 features/columns).

    4. Final feature selection: using feature selection packages to determine which features had the biggest impact on RMSE (down to ~500 columns).

    5. Baseline model testing (XGBoost, CatBoost, LightGBM). LightGBM had the best RMSE scores and was the fastest to train.

    6. Hyperparameter tuning: used Optuna and LightGBMTuner to find the best hyperparameters for the model.

    7. Final round of training and testing before model submission!

Technology Used:

  • Python 🐍

  • LightGBM ⚡

  • Optuna 📊

  • Honorable Mentions

    • CatBoost

    • XGBoost

Third place had an RMSE of 0.687 and first place had 0.661, so it was a very close competition!
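
To put those scores in perspective, the short sketch below converts each RMSE from years into months and expresses the gaps between entries in days. The three scores come from this post. The helper names and the 365.25 days-per-year constant are assumptions for illustration. Keep in mind that an RMSE gap is a difference in typical error size, not a per-pet difference.

```python
MONTHS_PER_YEAR = 12
DAYS_PER_YEAR = 365.25  # assumption: average year length including leap years


def years_to_months(years):
    """Convert a duration in years to months."""
    return years * MONTHS_PER_YEAR


def years_to_days(years):
    """Convert a duration in years to days."""
    return years * DAYS_PER_YEAR


# RMSE scores in years, as reported in this post.
scores = {"my submission": 0.695, "3rd place": 0.687, "1st place": 0.661}

for name, score in scores.items():
    print(f"{name}: {score:.3f} years = {years_to_months(score):.2f} months")

# Gaps between my submission and the podium entries, in days.
gap_to_third_days = years_to_days(scores["my submission"] - scores["3rd place"])
gap_to_first_days = years_to_days(scores["my submission"] - scores["1st place"])

print(f"Gap to 3rd place: {gap_to_third_days:.1f} days")
print(f"Gap to 1st place: {gap_to_first_days:.1f} days")
```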
