I read numerous articles about RAG, but I often got lost in the lengthy code examples. Does it have to be in that way?
I’ve found myself replying silently, saying, I don’t think so!
Now, do not get me wrong, I am not a promoter like you see in X, or everywhere, saying, copy my method, steal my system, use this to be a billionaire!!
But why should I spend so much time writing the entire system from scratch?
I have massive respect for the people who are doing this, but let’s do it with CLI agents, like Claude Code. And I know you need a paid account, but you can also use Gemini CLI for free.
What is RAG?
Retrieval Augment Generation is used to improve the performance of the LLMs.(ChatGPT, Claude … ).
A notable example is splitting uploaded documents into chunks, which allows for faster access. How would I know? Because I downloaded RAG’s research document from here, uploaded the streamlit app that I developed with RAG, and asked it.
RAG-based app answers questions about RAG by using RAG’s academic research paper
RAG-based app answers questions about RAG by using RAG’s academic research paper.
Technicalities
I am going to finish this section as fast as I can. Please do not leave this article and send claps and comments. Subscribe to my channel. Oops, I am sorry, just send claps and comments.
Now, let’s talk about the technicalities a little bit. To build a system like this, you don’t need a Ph.D. in computer science, at least now.
Here is the schema.
RAG Schema
Your PDF is turned into raw text and then split into small vector chunks(embedding).
Those vector pieces are stored in the database (FAISSDB), and when you ask a question, the system quickly finds the best match from those small pieces and sends it to the Chatbot to generate a response. That’s it.
The Prompt That Builds It
You know the deal right now. Here is the prompt we will use.
> Build a RAG (Retrieval-Augmented Generation) application using Python.
>
> Goal: A web app where users can upload PDF documents and ask questions about their content.
>
> Core Technologies:
> - Backend: Flask
> - Frontend: Streamlit
> - Vector Search: FAISS (faiss-cpu)
> - Embeddings: Sentence-Transformers (all-MiniLM-L6-v2 model)
> - QA Model: OpenAI (gpt-3.5-turbo)
>
> Architecture (File Structure):
> - /ingest: A script to handle PDF uploads and extract raw text.
> - /process: A script to chunk the text and create vector embeddings using Sentence-Transformers.
> - /index: A script to manage storing and retrieving vectors from a FAISS index file.
> - /backend: A Flask API with endpoints for /upload, /ask, and /status. The /ask endpoint should retrieve context from FAISS and use the OpenAI API to generate an answer.
> - /interface: A Streamlit script (ui.py) that provides the user interface for file uploads and the chat window.
> - requirements.txt: Please generate a list of all necessary Python libraries.
> - run.sh: A shell script to start both the Flask backend and the Streamlit frontend simultaneously.
>
> Please generate the complete code for this application, with each part in its respective file and run the app afterwards.
But use it where? I disliked technical articles that omit the most crucial aspect of the concept.
No-code Agents
Cursor, windsurf, lovable, Claude code, Gemini CLI, pick your go-to tool. I used to love Claude Code. Why did I use to?
Because they are lowering the usage limit day by day.
And accusing the users as a reason.
Claude Code
But they are the best no matter what. However, if you want a free option, use Gemini CLI.
To run Claude Code, all you have to do is type its name in your terminal. Check out this article for a full installation guide.
claude
Same with Gemini CLI.
gemini
Of course, they wanted you to log in to their account on your browser next, but you get the idea.
Gemini CLI vs Claude Code
Testing the Stremlit APP
Streamlit APP
I admit the front-end is not optimal, but these are the details where you can work on with your no-code tool. Let’s test this app. Here is the GIF that cost me $30 per month to be able to send to you. I downloaded the app. (So send claps, please.)
Sorry for the quality; I couldn't upload a file larger than Medium's specific limit.
Final Thoughts
Thanks for reading this one. If you like this, send claps, and if you have any questions, send comments. I'll use AI to answer them since English is my second language, and I am a bit lazy.
If you like what you read, we also have a Substack and a platform where we have agents and no-code AI tools. However, we also have free resources if you are a bit stingy.
Here are the free resources.
Here is the ChatGPT cheat sheet.
Here is the Prompt Techniques cheat sheet.
Here is my NumPy cheat sheet.
Here is the source code of the “How to be a Billionaire” data project.
Here is the source code of the “Classification Task with 6 Different Algorithms using Python” data project.
Here is the source code of the “Decision Tree in Energy Efficiency Analysis” data project.
Here is the source code of the “DataDrivenInvestor 2022 Articles Analysis” data project.
“Machine learning is the last invention that humanity will ever need to make.” Nick Bostrom
Originally published at https://medium.com/@geencay.
📧 Stay Updated with AI Insights
Join 10,000+ subscribers getting the latest AI, Data Science, and tech insights delivered to your inbox.
💡 No spam, unsubscribe anytime. We respect your privacy.