Learn AI With Me

Overview

Obesity is a global health crisis influenced by multiple factors, including diet, physical activity, genetics, and lifestyle choices. This project utilizes a comprehensive dataset to uncover insights about obesity trends, identify key influencing factors, and propose actionable solutions.

Dataset Details

This dataset contains rich survey data covering lifestyle, dietary, and health-related variables. Below is a breakdown of the key attributes and their possible values:

Sex
- Male (712)
- Female (898)
Age
- Integer values (in years)
Height
- Integer values (in centimeters)
Overweight/Obese Families
- Yes (266)
- No (1344)
Consumption of Fast Food
- Yes (436)
- No (1174)
Frequency of Vegetable Consumption
- Rarely (400)
- Sometimes (708)
- Always (502)
Number of Main Meals Per Day
- 1–2 meals (444)
- 3 meals (928)
- More than 3 meals (238)
Food Intake Between Meals
- Rarely (346)
- Sometimes (564)
- Usually (417)
- Always (283)
Smoking
- Yes (492)
- No (1118)
Daily Liquid Intake
- Less than 1 liter (456)
- 1–2 liters (523)
- More than 2 liters (631)
Calorie Monitoring
- Yes (286)
- No (1324)
Physical Activity Frequency
- None (206)
- 1–2 days/week (290)
- 3–4 days/week (370)
- 5–6 days/week (358)
- 6+ days/week (386)
Daily Technology Use
- 0–2 hours (382)
- 3–5 hours (826)
- More than 5 hours (402)
Mode of Transportation
- Automobile (660)
- Motorbike (94)
- Bicycle (116)
- Public transportation (602)
- Walking (138)
Target Class (Obesity Category)
- Underweight (73)
- Normal (658)
- Overweight (592)
- Obesity (287)
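
As a quick consistency check, the per-category counts listed above can be summed for each attribute; every attribute should account for the same number of respondents. The sketch below copies the counts for a few attributes verbatim from the list. The dictionary layout and variable names are only illustrative.

```python
# Category counts copied from the attribute list above (a subset of attributes).
counts = {
    "Sex": {"Male": 712, "Female": 898},
    "Consumption of Fast Food": {"Yes": 436, "No": 1174},
    "Smoking": {"Yes": 492, "No": 1118},
    "Physical Activity Frequency": {
        "None": 206,
        "1–2 days/week": 290,
        "3–4 days/week": 370,
        "5–6 days/week": 358,
        "6+ days/week": 386,
    },
    "Target Class": {
        "Underweight": 73,
        "Normal": 658,
        "Overweight": 592,
        "Obesity": 287,
    },
}

# Total number of respondents implied by each attribute.
totals = {name: sum(values.values()) for name, values in counts.items()}
for name, total in totals.items():
    print(f"{name}: {total}")

# Percentage share of each obesity category among all respondents.
target = counts["Target Class"]
n_respondents = sum(target.values())
shares = {label: round(100 * count / n_respondents, 1) for label, count in target.items()}
print(shares)
```

Each attribute totals 1610 respondents. The Underweight class makes up only about 4.5% of them, which is worth keeping in mind when the data is split into training and testing sets later.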

Step 1: Understanding the Dataset

Description: The dataset contains survey responses, each labeled with one of four obesity classes (Underweight, Normal, Overweight, Obesity).
Key Features: Age, dietary habits, physical activity, smoking habits, and fast food consumption.
Purpose: Build a foundation for:
- Visualizing obesity trends.
- Creating machine learning-based predictions.

Step 2: Data Exploration

Objective: Familiarize yourself with the dataset structure.
- Import libraries and load the dataset.
- Examine the data types, missing values, and distributions.
Learning Opportunity: Practice exploratory data analysis (EDA) to discover patterns and relationships.
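
A minimal sketch of these exploration steps with pandas follows. It assumes the survey is available as a CSV file; because the real file name and column names are not given here, the example reads a tiny made-up sample from memory, and every column name in it is hypothetical.

```python
import io

import pandas as pd

# Tiny made-up sample standing in for the real survey file.
# Column names are hypothetical, not the dataset's actual headers.
SAMPLE_CSV = """Sex,Age,Height,Fast_Food,Smoking,Class
Male,25,178,Yes,No,Overweight
Female,31,164,No,No,Normal
Female,,158,Yes,Yes,Obesity
Male,44,181,No,Yes,Normal
Female,19,170,No,No,Underweight
"""

# With the real data this would be pd.read_csv(<path to the dataset file>).
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

print(df.shape)      # (rows, columns)
print(df.dtypes)     # data type of each column

missing = df.isna().sum()  # number of missing values per column
print(missing)

print(df.describe())       # summary statistics for numeric columns

class_counts = df["Class"].value_counts()  # distribution of the target class
print(class_counts)
```

Running `value_counts()` on each categorical column of the real file should reproduce the counts listed under Dataset Details, which is a useful check that the file loaded as expected.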

Step 3: Data Visualization

Focus Areas:
- Analyze obesity class distributions.
- Explore correlations between lifestyle factors and obesity:
  - Age and obesity class.
  - Impact of fast food consumption.
  - Relationship with physical activity levels.
Visualization Tools: Matplotlib, Seaborn.
Pro Tip: Use a correlation heatmap of the numerically encoded features to highlight key relationships.
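
One way to draw such a heatmap is sketched below using pandas and Matplotlib only. The rows are made up, and the column names and numeric codes are assumptions for illustration; the real survey columns would first need to be encoded as numbers. Seaborn's `heatmap` function can draw the same matrix with less code.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen; remove this line in a notebook

import matplotlib.pyplot as plt
import pandas as pd

# Made-up, already numeric-encoded rows. Names and codes are hypothetical.
df = pd.DataFrame({
    "Age": [19, 25, 31, 44, 52, 38],
    "Fast_Food": [0, 1, 0, 1, 1, 0],          # 0 = No, 1 = Yes
    "Physical_Activity": [4, 1, 3, 0, 1, 2],  # ordinal code, higher = more active
    "Class": [0, 2, 1, 3, 3, 1],              # 0 = Underweight ... 3 = Obesity
})

corr = df.corr()  # pairwise Pearson correlation coefficients

fig, ax = plt.subplots(figsize=(5, 4))
image = ax.imshow(corr.to_numpy(), cmap="coolwarm", vmin=-1, vmax=1)
ax.set_xticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns, rotation=45, ha="right")
ax.set_yticks(range(len(corr.index)))
ax.set_yticklabels(corr.index)

# Write each coefficient into its cell.
for i in range(corr.shape[0]):
    for j in range(corr.shape[1]):
        ax.text(j, i, f"{corr.iloc[i, j]:.2f}", ha="center", va="center")

fig.colorbar(image, ax=ax, label="Pearson r")
ax.set_title("Correlation heatmap (illustrative data)")
fig.tight_layout()
# In a notebook the figure renders inline; in a script, save it with fig.savefig.
```

Pearson correlation on ordinal codes is only a rough guide. Nominal columns such as mode of transportation have no natural order, so they would need one-hot encoding or a different association measure.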

Step 4: Machine Learning Implementation

Data Preparation:
- Split data into training and testing sets.
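- Standardize numeric features, fitting the scaler on the training set only, so that scale-sensitive models (K-Nearest Neighbors, Support Vector Regression) are not dominated by features with large ranges.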
Modeling:
- Train and evaluate multiple algorithms:
  - Linear Regression
  - Random Forest
  - K-Nearest Neighbors
  - Decision Tree
  - Support Vector Regression
- Compare performance using metrics like RMSE and R².
Prediction Insights:
- Understand which factors have the most predictive power.

Step 5: Answering Key Questions

Leverage machine learning and EDA to address these critical health-related questions:

Does eating more vegetables reduce obesity?
Are individuals eating more than three meals a day more likely to be obese?
How do obesity rates differ between smokers and non-smokers?
How does obesity vary across age groups?
Do low physical activity and high fast food consumption lead to higher obesity rates?
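
Several of these questions come down to comparing the obesity rate across groups. The sketch below shows one way to compute such rates in plain Python. The records are made up for illustration, and the field names (`smoking`, `vegetables`, `category`) are hypothetical rather than the dataset's real column names.

```python
from collections import defaultdict

# Made-up records for illustration only; field names and values are hypothetical.
rows = [
    {"smoking": "Yes", "vegetables": "Rarely", "category": "Obesity"},
    {"smoking": "Yes", "vegetables": "Always", "category": "Normal"},
    {"smoking": "No", "vegetables": "Sometimes", "category": "Overweight"},
    {"smoking": "No", "vegetables": "Always", "category": "Normal"},
    {"smoking": "No", "vegetables": "Rarely", "category": "Obesity"},
    {"smoking": "Yes", "vegetables": "Rarely", "category": "Obesity"},
]


def obesity_rate_by(records, key):
    """Return the share of records labelled 'Obesity' within each value of `key`."""
    totals = defaultdict(int)
    obese = defaultdict(int)
    for record in records:
        group = record[key]
        totals[group] += 1
        if record["category"] == "Obesity":
            obese[group] += 1
    return {group: obese[group] / totals[group] for group in totals}


by_smoking = obesity_rate_by(rows, "smoking")
by_vegetables = obesity_rate_by(rows, "vegetables")
print(by_smoking)
print(by_vegetables)
```

A difference in rates between groups shows an association in the survey responses; it does not by itself show that one factor causes the other.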

Step 6: Insights and Recommendations

Findings:
- Relationships between lifestyle choices and obesity.
- The role of dietary habits and physical activity in maintaining a healthy weight.
- Age-based obesity trends and risk factors.
Recommendations:
- Encourage healthy eating habits.
- Promote regular physical activity.
- Target interventions based on age-specific trends.