Overview

Obesity is a global health crisis influenced by multiple factors, including diet, physical activity, genetics, and lifestyle choices. This project uses a survey dataset of 1,610 respondents to explore obesity trends, identify the factors most strongly associated with obesity, and propose practical recommendations.

Dataset Details

This dataset contains rich survey data covering lifestyle, dietary, and health-related variables. Below is a breakdown of the key attributes and their possible values:

  • Sex

    • Male (712)
    • Female (898)
  • Age

    • Integer values (in years)
  • Height

    • Integer values (in centimeters)
  • Family History of Overweight/Obesity

    • Yes (266)
    • No (1344)
  • Consumption of Fast Food

    • Yes (436)
    • No (1174)
  • Frequency of Vegetable Consumption

    • Rarely (400)
    • Sometimes (708)
    • Always (502)
  • Number of Main Meals Per Day

    • 1–2 meals (444)
    • 3 meals (928)
    • More than 3 meals (238)
  • Food Intake Between Meals

    • Rarely (346)
    • Sometimes (564)
    • Usually (417)
    • Always (283)
  • Smoking

    • Yes (492)
    • No (1118)
  • Daily Liquid Intake

    • Less than 1 liter (456)
    • 1–2 liters (523)
    • More than 2 liters (631)
  • Calorie Monitoring

    • Yes (286)
    • No (1324)
  • Physical Activity Frequency

    • None (206)
    • 1–2 days/week (290)
    • 3–4 days/week (370)
    • 5–6 days/week (358)
    • 6+ days/week (386)
  • Daily Technology Use

    • 0–2 hours (382)
    • 3–5 hours (826)
    • More than 5 hours (402)
  • Mode of Transportation

    • Automobile (660)
    • Motorbike (94)
    • Bicycle (116)
    • Public transportation (602)
    • Walking (138)
  • Target Class (Obesity Category)

    • Underweight (73)
    • Normal (658)
    • Overweight (592)
    • Obesity (287)
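
The target classes are not evenly represented, which matters later when models are compared. As a quick check, the sketch below turns the counts listed above into percentages. The counts come straight from the list; the variable names are only illustrative.

```python
# Counts copied from the "Target Class (Obesity Category)" list above.
target_counts = {
    "Underweight": 73,
    "Normal": 658,
    "Overweight": 592,
    "Obesity": 287,
}

total = sum(target_counts.values())  # 1610 respondents in total

# Share of each class as a percentage, rounded to one decimal place.
class_share = {
    label: round(100 * count / total, 1)
    for label, count in target_counts.items()
}

for label, pct in class_share.items():
    print(f"{label:<12}{pct:>5.1f}%")
# Underweight   4.5%
# Normal       40.9%
# Overweight   36.8%
# Obesity      17.8%
```

Underweight respondents make up under 5% of the sample, so per-class results for that group rest on few observations.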

Step 1: Understanding the Dataset

  • Description: The dataset contains survey responses, each labeled with one of four obesity classes (Underweight, Normal, Overweight, Obesity).
    • Key Features: Age, dietary habits, physical activity, smoking habits, and fast food consumption.
  • Purpose: Build a foundation for:
    • Visualizing obesity trends.
    • Creating machine learning-based predictions.

Step 2: Data Exploration

  • Objective: Familiarize yourself with the dataset structure.
    • Import libraries and load the dataset.
    • Examine the data types, missing values, and distributions.
  • Learning Opportunity: Practice exploratory data analysis (EDA) to discover patterns and relationships.
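
A first pass over the data might look like the sketch below. It uses a tiny hand-made table as a stand-in so that it runs on its own. The column names, the `summarize` helper, and the file name mentioned in the comment are assumptions for illustration, not the dataset's actual schema.

```python
import pandas as pd

# Tiny stand-in for the survey table. In practice the frame would come from
# something like pd.read_csv("obesity.csv") (hypothetical file name).
df = pd.DataFrame({
    "Age": [21, 35, 48, 29, None],
    "Height": [170, 165, 180, 158, 175],
    "Smoking": ["Yes", "No", "No", "Yes", "No"],
    "Class": ["Normal", "Overweight", "Obesity", "Normal", "Underweight"],
})


def summarize(frame: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column: dtype, missing count, distinct values."""
    return pd.DataFrame({
        "dtype": frame.dtypes.astype(str),
        "missing": frame.isna().sum(),
        "unique": frame.nunique(),
    })


summary = summarize(df)
print(summary)

# Distribution of a numeric column and of a categorical column.
print(df["Age"].describe())
print(df["Class"].value_counts())
```

The summary table makes it easy to spot columns that need imputation (non-zero `missing`) or encoding (text dtypes) before modeling.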

Step 3: Data Visualization

  • Focus Areas:
    • Analyze obesity class distributions.
    • Explore correlations between lifestyle factors and obesity:
      • Age and obesity class.
      • Impact of fast food consumption.
      • Relationship with physical activity levels.
  • Visualization Tools: Matplotlib, Seaborn.
    • Pro Tip: Use a correlation heatmap to highlight key relationships.
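
The heatmap idea can be sketched with Matplotlib alone, as below. The data here is synthetic and integer-coded purely so the example runs; it is not the survey, and the column names are assumptions. Seaborn's `heatmap` function would produce a similar figure from the same correlation matrix.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic, integer-coded stand-in data (NOT the survey).
rng = np.random.default_rng(0)
n = 200
fast_food = rng.integers(0, 2, size=n)   # 0 = No, 1 = Yes
activity = rng.integers(1, 6, size=n)    # 1 = none ... 5 = most active
obesity = np.clip(
    np.round(2 + fast_food - 0.4 * (activity - 3) + rng.normal(0, 0.7, size=n)),
    1, 4,
).astype(int)                            # 1 = Underweight ... 4 = Obesity

df = pd.DataFrame({
    "age": rng.integers(18, 60, size=n),
    "fast_food": fast_food,
    "activity": activity,
    "obesity_class": obesity,
})

# Pairwise Pearson correlations; df.corr(method="spearman") is an option
# when the codes are ordinal rather than truly numeric.
corr = df.corr()

fig, ax = plt.subplots(figsize=(5, 4))
im = ax.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
ax.set_xticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns, rotation=45, ha="right")
ax.set_yticks(range(len(corr.index)))
ax.set_yticklabels(corr.index)

# Write each coefficient into its cell.
for i in range(corr.shape[0]):
    for j in range(corr.shape[1]):
        ax.text(j, i, f"{corr.iat[i, j]:.2f}", ha="center", va="center")

fig.colorbar(im, ax=ax, label="Correlation")
fig.tight_layout()
fig.savefig("correlation_heatmap.png")
```

Fixing the color scale to the range -1 to 1 keeps the colors comparable across different subsets of the data.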

Step 4: Machine Learning Implementation

  1. Data Preparation:

    • Split data into training and testing sets.
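    • Standardize features (zero mean, unit variance) so that scale-sensitive models such as K-Nearest Neighbors and Support Vector Regression are not dominated by features with large numeric ranges. Fit the scaler on the training set only.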
  2. Modeling:

    • Train and evaluate multiple algorithms. These are regressors, so the obesity class must first be encoded as an ordered number:
      • Linear Regression
      • Random Forest
      • K-Nearest Neighbors
      • Decision Tree
      • Support Vector Regression
    • Compare performance using metrics such as RMSE and R².
  3. Prediction Insights:

    • Understand which factors have the most predictive power.
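
The three steps above can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions: the data is synthetic, the feature names are hypothetical, and the obesity class is assumed to be encoded as an ordered integer from 1 (Underweight) to 4 (Obesity) so that regressors and RMSE/R² apply. Default hyperparameters are used throughout.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (NOT the survey); feature names are hypothetical.
rng = np.random.default_rng(42)
n = 400
feature_names = ["age", "fast_food", "activity", "vegetables"]
X = np.column_stack([
    rng.integers(18, 60, n),   # age in years
    rng.integers(0, 2, n),     # fast food: 0 = No, 1 = Yes
    rng.integers(1, 6, n),     # physical activity level, 1..5
    rng.integers(1, 4, n),     # vegetable frequency, 1..3
]).astype(float)
# Ordinal target, 1 = Underweight ... 4 = Obesity (assumed encoding).
y = np.clip(
    np.round(2 + X[:, 1] - 0.4 * (X[:, 2] - 3) + rng.normal(0, 0.7, n)), 1, 4
)

# 1. Data preparation: hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# 2. Modeling: each model sits behind a StandardScaler in a pipeline, so the
#    scaler is fitted on the training rows only.
models = {
    "Linear Regression": LinearRegression(),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=42),
    "K-Nearest Neighbors": KNeighborsRegressor(n_neighbors=5),
    "Decision Tree": DecisionTreeRegressor(random_state=42),
    "Support Vector Regression": SVR(),
}

results = {}
for name, model in models.items():
    pipe = make_pipeline(StandardScaler(), model)
    pipe.fit(X_train, y_train)
    pred = pipe.predict(X_test)
    results[name] = {
        "rmse": float(np.sqrt(mean_squared_error(y_test, pred))),
        "r2": float(r2_score(y_test, pred)),
    }

for name, m in sorted(results.items(), key=lambda kv: kv[1]["rmse"]):
    print(f"{name:<26} RMSE={m['rmse']:.3f}  R2={m['r2']:.3f}")

# 3. Prediction insights: impurity-based importances from a random forest.
forest = RandomForestRegressor(n_estimators=100, random_state=42)
forest.fit(X_train, y_train)
importances = dict(zip(feature_names, forest.feature_importances_))
for feat, score in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{feat:<12}{score:.3f}")
```

Because the target is really a category, classifiers scored with accuracy or F1 would be a natural alternative; the regression framing here simply mirrors the algorithms and metrics listed above.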

Step 5: Answering Key Questions

Leverage machine learning and EDA to address these critical health-related questions:

  1. Does eating more vegetables reduce obesity?
  2. Are individuals eating more than three meals a day more likely to be obese?
  3. How do obesity rates differ between smokers and non-smokers?
  4. How does obesity vary across age groups?
  5. Do low physical activity and high fast food consumption together lead to higher obesity rates?
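
Most of these questions reduce to comparing class shares across groups. A minimal sketch for question 3, using a ten-row made-up table with hypothetical column names (`smoking`, `obesity_class`), could look like this:

```python
import pandas as pd

# Ten made-up rows for illustration only (NOT the survey).
df = pd.DataFrame({
    "smoking": ["Yes", "Yes", "Yes", "Yes",
                "No", "No", "No", "No", "No", "No"],
    "obesity_class": ["Obesity", "Obesity", "Normal", "Overweight",
                      "Normal", "Normal", "Obesity", "Overweight",
                      "Normal", "Underweight"],
})

# Share of each obesity class within smokers and within non-smokers.
# normalize="index" makes every row sum to 1.
class_share = pd.crosstab(
    df["smoking"], df["obesity_class"], normalize="index"
)
print(class_share.round(2))

# Obesity rate per group: fraction of rows labeled "Obesity".
obesity_rate = (df["obesity_class"] == "Obesity").groupby(df["smoking"]).mean()
print(obesity_rate.round(3))
# Yes -> 0.5 (2 of 4), No -> 0.167 (1 of 6)
```

The same pattern applies to the other questions by swapping the grouping column. A difference in rates from survey data indicates an association, not that one factor causes the other.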

Step 6: Insights and Recommendations

  • Findings:
    • Relationships between lifestyle choices and obesity.
    • The role of dietary habits and physical activity in maintaining a healthy weight.
    • Age-based obesity trends and risk factors.
  • Recommendations:
    • Encourage healthy eating habits.
    • Promote regular physical activity.
    • Target interventions based on age-specific trends.