I love doing data analysis, don’t get me wrong. But what if it could be enhanced?
Or, better yet, automated?
So this journey started with a bold question:
What If Data Analysis Didn’t Need You Anymore?
But no. Someone still has to sit in the driver's seat, and that will be us. In this article, you and I will explore how AI can automate and enhance data analysis. And trust me, it is not as hard as you think!
And if you don’t want to build it yourself, I’ll provide you with a link where you can use it. Let’s get started!
Life Longevity Analysis
For this analysis, we are going to use the dataset from this data project.
Five years ago, you would have downloaded the dataset, read it with pandas, and explored it with code like this.
import pandas as pd
df = pd.read_csv("/Users/learnai/Downloads/LiveLongerData (1).csv")
df.head()
Here is the output.
SS of the outputs
What about the column names? Let’s see.
df.info()
Here is the output.
SS of the outputs
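If you prefer the same overview as one table instead of two separate calls, a small helper can summarize each column. This is only a sketch: `summarize_columns` and the toy data are hypothetical and are not taken from the longevity dataset.

```python
import pandas as pd

def summarize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column: dtype, non-null count, and unique count."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "non_null": df.notna().sum(),
        "unique": df.nunique(),
    })

# Toy data, made up for illustration only
toy = pd.DataFrame({
    "factor": ["sleep", "smoking", None],
    "years": [2.0, -10.0, 1.5],
})
summary = summarize_columns(toy)
print(summary)
```

On the real data you would call `summarize_columns(df)` right after `pd.read_csv`.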
Good, but outdated.
What about building an AI agent powered by ChatGPT? Trust me, it is not too complicated. Just paste the code I’ll give you.
Can a Simple Prompt Replace Hours of Work?
First, let’s install these libraries.
pip install langchain-openai
pip install langchain-experimental
pip install pandas openai
Now you are good to go. Before building the agent, set your API key.
api_key = "api-key-here"
Now let’s see the entire code.
from langchain_openai import ChatOpenAI
from langchain_experimental.agents.agent_toolkits import create_pandas_dataframe_agent
from langchain.agents.agent_types import AgentType

# The agent writes and runs pandas code against df to answer each question
agent = create_pandas_dataframe_agent(
    ChatOpenAI(temperature=0, model="gpt-3.5-turbo", api_key=api_key),
    df,
    verbose=True,
    agent_type=AgentType.OPENAI_FUNCTIONS,
    allow_dangerous_code=True,  # explicit opt-in: the agent executes model-generated Python
)
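Hardcoding the key is fine for a quick test, but it is easy to leak if you share the notebook. One possible alternative, sketched here, is to read it from an environment variable. `OPENAI_API_KEY` is the variable the OpenAI client looks for by default; the helper name `load_api_key` is my own and not part of any library.

```python
import os

def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Read the key from an environment variable and fail loudly if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key

# Usage (uncomment once the variable is set in your shell):
# api_key = load_api_key()
```

With this in place, the key never appears in the code you paste or publish.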
Good, now let’s ask the agent a question.
agent.invoke("how many rows are there?")
Here is the output.
SS of the output
But let’s make it look better. I wrote a small helper that hides the agent’s verbose chain messages and shows only the final answer. Here is the code.
from IPython.display import Markdown, display
import contextlib
import io

def display_clean_output(agent, prompt):
    buffer = io.StringIO()
    # Temporarily redirect stdout so the chain messages are suppressed
    with contextlib.redirect_stdout(buffer):
        result = agent.invoke(prompt)
    # Show only the clean 'output' part of the result
    output = result.get("output", "").strip()
    display(Markdown(output))
Good, let’s use this code.
display_clean_output(agent, "Show me first two rows of the dataframe. And also show me all column names of df")
Here is the output.
SS of the output
All of that in seconds. It is an improvement, but it can be better.
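To see why the redirect trick works without spending API credits, here is a minimal sketch that swaps the real agent for a stand-in. `FakeAgent` and `run_quietly` are hypothetical names used only for this illustration; the stand-in simply prints a log line and returns a dict shaped like the agent’s result.

```python
import contextlib
import io

class FakeAgent:
    """Hypothetical stand-in: prints a noisy log line, returns a result dict."""
    def invoke(self, prompt):
        print("> Entering new chain...")  # the kind of noise we want to hide
        return {"input": prompt, "output": "  There are 3 rows.  "}

def run_quietly(agent, prompt):
    """Return (clean answer, captured log text) without printing the logs."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        result = agent.invoke(prompt)
    return result.get("output", "").strip(), buffer.getvalue()

answer, logs = run_quietly(FakeAgent(), "how many rows are there?")
print(answer)
```

Everything printed inside the `with` block lands in `buffer` instead of the notebook cell, so only the final answer is shown, and the captured logs stay available if you need to debug.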
Streamlit
Now I’ll give you the entire code, because I know some of you are just here for the code, which is totally okay with me.
Here is the entire code:
import io
import warnings

import pandas as pd
import streamlit as st
from langchain.agents.agent_types import AgentType
from langchain_experimental.agents.agent_toolkits import create_pandas_dataframe_agent
from langchain_openai import ChatOpenAI

warnings.filterwarnings('ignore')

# 📌 OpenAI API key
api_key = "api-key-here"

# Page config
st.set_page_config(page_title="Advanced CSV Explorer", layout="wide")
st.title("📊 Chat With Your File – Powered by Langchain & Magic")

# Upload CSV
uploaded_file = st.file_uploader("📂 Upload your CSV or Excel file", type=["csv", "xlsx"])

if uploaded_file:
    # Detect file type and read accordingly
    if uploaded_file.name.endswith(".csv"):
        df = pd.read_csv(uploaded_file)
    elif uploaded_file.name.endswith(".xlsx"):
        df = pd.read_excel(uploaded_file)

    # Preview the first rows, then show df.info() as plain text
    st.dataframe(df.head())
    buffer = io.StringIO()
    df.info(buf=buffer)
    s = buffer.getvalue()
    st.text(s)

    # Create agent with hardcoded API key
    agent = create_pandas_dataframe_agent(
        ChatOpenAI(
            temperature=0,
            model="gpt-3.5-turbo",
            api_key=api_key
        ),
        df,
        verbose=False,
        agent_type=AgentType.OPENAI_FUNCTIONS,
        allow_dangerous_code=True,
    )

    # Ask prompt
    prompt = st.text_input("💬 Ask a question about your data")
    if prompt:
        with st.spinner("Thinking..."):
            response = agent.invoke(prompt)
        st.success("✅ Answer:")
        st.markdown(f"> {response['output']}")
Now save this code in a file named automated_analysis.py.
Install Streamlit if you have not already.
pip install streamlit
Go to the directory that contains this .py file. (Let’s say it is in Downloads.)
cd Downloads
Then run this command.
streamlit run automated_analysis.py
That’s it. Wait a second, and the app will start on your localhost. If the browser window does not open automatically, go to:
http://localhost:8501/
Let’s see the output.
SS of the output
Good, now let’s upload the dataset.
SS of the output
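The upload step boils down to picking a pandas reader from the file extension. Here is a standalone sketch of that idea that you can try without Streamlit. `read_table` is a hypothetical helper, not part of the app; unlike the app code, it ignores letter case in the extension and raises an error for unsupported files.

```python
import io

import pandas as pd

def read_table(name: str, data) -> pd.DataFrame:
    """Pick a pandas reader from the file extension (case-insensitive)."""
    lower = name.lower()
    if lower.endswith(".csv"):
        return pd.read_csv(data)
    if lower.endswith(".xlsx"):
        return pd.read_excel(data)  # needs an Excel engine such as openpyxl
    raise ValueError(f"Unsupported file type: {name}")

# Toy CSV bytes, made up for illustration only
demo = read_table("Demo.CSV", io.BytesIO(b"factor,years\nsleep,2\nsmoking,-10\n"))
print(demo.shape)
```

In the Streamlit app, the equivalent call would be `read_table(uploaded_file.name, uploaded_file)`.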
Good, now we can ask anything. Here are the questions we will use.
- Top 5 Factors Affecting Life Expectancy.
SS of the output
- Top 5 Factors Losing Years from Life
SS of the output
- Eating Less or Eating Healthier
SS of the outputs
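If you would rather send all three questions in one go from a notebook, a small loop is enough. The sketch below assumes only that the agent has an `invoke` method returning a dict with an `"output"` key, as in the code earlier. `ask_all` and `EchoAgent` are hypothetical names; the stand-in lets the example run without an API key.

```python
def ask_all(agent, questions):
    """Ask each question in order; map question -> answer (or error text)."""
    answers = {}
    for q in questions:
        try:
            answers[q] = agent.invoke(q)["output"]
        except Exception as exc:  # keep going if one question fails
            answers[q] = f"ERROR: {exc}"
    return answers

class EchoAgent:
    """Hypothetical stand-in so the sketch runs without an API key."""
    def invoke(self, prompt):
        if not prompt:
            raise ValueError("empty prompt")
        return {"output": f"answer to: {prompt}"}

questions = [
    "Top 5 Factors Affecting Life Expectancy.",
    "Top 5 Factors Losing Years from Life",
    "Eating Less or Eating Healthier",
]
results = ask_all(EchoAgent(), questions)
for question, answer in results.items():
    print(question, "->", answer)
```

Swap `EchoAgent()` for the real `agent` to get actual answers; each call is still one request to the model.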
No Time? No Code = No Problem
If you don’t have time to write all of this code, or simply don’t want to, you can use the agents on our platform.
Let’s see.
Let’s upload the file.
SS of the output
There is a lot more to discover here: you can automate data exploration, data visualization, or even the model-building process.
SS of the Output
To discover more, visit our platform to find Assistants, AI News, AI Projects, and more!
Final Thoughts
In this article, we automated the data analysis process, first inside a Jupyter notebook and then with a Streamlit dashboard.
Or, if you don’t want to build it yourself but still want to use it, visit our platform, click on the agents, and go!
Thanks for reading.
“Machine intelligence is the last invention that humanity will ever need to make.” Nick Bostrom