The ITRex team designed and built the Melody Sage platform entirely on GCP, focusing on two key flows: automated course generation from uploaded content and interactive consultation with a tool-augmented AI agent.
The end-to-end pipeline functions as follows:
● Automated content ingestion & structuring. Instructional materials (PDFs, DOCX files, etc.) are uploaded into Google Cloud Storage. Google's Document AI reads these files and extracts clean text, which is then segmented into semantic chunks. These chunks are vectorized using Vertex AI Embeddings and stored in Vertex AI Vector Search (see the ingestion sketch after this list).
● Generative curriculum assembly. A back-end service constructs a structured course, generating lessons and corresponding quizzes. Google's Gemini 2.5 Pro model is prompted to create instructional text for each lesson, drawing on context retrieved from the vector database (see the lesson-generation sketch below). The Imagen 3 model then creates a relevant cover image for each lesson based on a Gemini-generated description.
● Personalized learning flow. Students access their personalized curriculum via a secure front end. Their progress and interactions are tracked in real time in Firestore (see the progress-tracking example below), allowing the system to adjust lesson difficulty and pace.
● Real-time AI consultation agent. Students may ask open-ended questions at any time. A dedicated AI agent retrieves relevant information from the internal knowledge base and augments it with real-time web results from the Google Search API (see the agent sketch below). Gemini then uses this information to generate a comprehensive, context-aware response.
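To make the ingestion step concrete, here is a simplified Python sketch. The processor and index resource names are placeholders, the text-embedding-004 model ID and the fixed-size splitter are assumptions standing in for the production configuration, and we assume vertexai.init() has already been called:

```python
from google.cloud import aiplatform
from google.cloud import documentai_v1 as documentai
from google.cloud.aiplatform_v1.types import IndexDatapoint
from vertexai.language_models import TextEmbeddingModel

# Placeholder resource names.
PROCESSOR = "projects/PROJECT/locations/us/processors/PROCESSOR_ID"
INDEX = "projects/PROJECT/locations/us-central1/indexes/INDEX_ID"

def ingest_document(pdf_bytes: bytes) -> None:
    # 1. Extract clean text from the uploaded file with Document AI.
    docai = documentai.DocumentProcessorServiceClient()
    response = docai.process_document(
        request=documentai.ProcessRequest(
            name=PROCESSOR,
            raw_document=documentai.RawDocument(
                content=pdf_bytes, mime_type="application/pdf"
            ),
        )
    )
    text = response.document.text

    # 2. Segment the text into chunks (naive fixed-size split for brevity;
    #    the production pipeline chunks semantically).
    chunks = [text[i : i + 1500] for i in range(0, len(text), 1500)]

    # 3. Vectorize the chunks with Vertex AI Embeddings (batched in
    #    production to respect per-call input limits).
    embedder = TextEmbeddingModel.from_pretrained("text-embedding-004")
    vectors = [e.values for e in embedder.get_embeddings(chunks)]

    # 4. Store the chunk vectors in Vertex AI Vector Search.
    index = aiplatform.MatchingEngineIndex(index_name=INDEX)
    index.upsert_datapoints(
        datapoints=[
            IndexDatapoint(datapoint_id=f"chunk-{i}", feature_vector=vec)
            for i, vec in enumerate(vectors)
        ]
    )
```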
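Lesson generation then follows the classic retrieval-augmented pattern: embed the lesson topic, pull the nearest chunks from Vector Search, and hand them to Gemini as context. In the sketch below, the endpoint name, deployed index ID, and prompt wording are illustrative, and the ID-to-text lookup is omitted:

```python
from google.cloud.aiplatform import MatchingEngineIndexEndpoint
from vertexai.generative_models import GenerativeModel
from vertexai.language_models import TextEmbeddingModel

# Placeholder resource name.
ENDPOINT = "projects/PROJECT/locations/us-central1/indexEndpoints/ENDPOINT_ID"

def generate_lesson(topic: str) -> str:
    # Embed the lesson topic and retrieve the most relevant chunks.
    embedder = TextEmbeddingModel.from_pretrained("text-embedding-004")
    query_vector = embedder.get_embeddings([topic])[0].values
    endpoint = MatchingEngineIndexEndpoint(index_endpoint_name=ENDPOINT)
    neighbors = endpoint.find_neighbors(
        deployed_index_id="melody_sage_chunks",  # hypothetical deployed index
        queries=[query_vector],
        num_neighbors=5,
    )
    # Datapoint IDs map back to chunk text kept in a separate store
    # (lookup omitted here for brevity).
    context = "\n\n".join(match.id for match in neighbors[0])

    # Prompt Gemini to write the lesson grounded in the retrieved context.
    gemini = GenerativeModel("gemini-2.5-pro")
    prompt = (
        f"Using only the context below, write a structured lesson on "
        f"'{topic}'.\n\nContext:\n{context}"
    )
    return gemini.generate_content(prompt).text
```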
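Progress tracking itself is a thin layer over Firestore writes. A minimal example of recording a quiz result (the collection and field names are illustrative, not our production schema):

```python
from google.cloud import firestore

db = firestore.Client()

def record_quiz_result(student_id: str, lesson_id: str, score: float) -> None:
    # Each interaction lands in the student's progress subcollection; the
    # adaptive-pacing logic reacts to these writes via snapshot listeners.
    doc = (
        db.collection("students").document(student_id)
          .collection("progress").document(lesson_id)
    )
    doc.set(
        {"score": score, "completed_at": firestore.SERVER_TIMESTAMP},
        merge=True,
    )
```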
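Finally, the consultation agent combines both retrieval paths before calling Gemini. This sketch uses the Custom Search JSON API as a stand-in for the web-search call, a hypothetical retrieve_internal_chunks() helper for the Vector Search lookup shown earlier, and placeholder credentials:

```python
import requests
from vertexai.generative_models import GenerativeModel

def answer_question(question: str) -> str:
    # Internal retrieval reuses the Vector Search lookup sketched above.
    internal_context = retrieve_internal_chunks(question)  # hypothetical helper

    # Augment with live web results (Custom Search JSON API; the key and
    # engine ID are placeholders).
    web = requests.get(
        "https://www.googleapis.com/customsearch/v1",
        params={"key": "API_KEY", "cx": "ENGINE_ID", "q": question},
        timeout=10,
    ).json()
    web_context = "\n".join(item["snippet"] for item in web.get("items", [])[:3])

    # Gemini composes the final, context-aware answer from both sources.
    gemini = GenerativeModel("gemini-2.5-pro")
    prompt = (
        "Answer the student's question using the sources below.\n\n"
        f"Internal notes:\n{internal_context}\n\n"
        f"Web results:\n{web_context}\n\nQuestion: {question}"
    )
    return gemini.generate_content(prompt).text
```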
While working on the Melody Sage project, our R&D team had to address several complex engineering problems:
● Streamlining quiz generation. The engineering team faced a major hurdle: creating assessments for each dynamically generated lesson without manual review. Our solution was a prompting strategy that instructed the LLM to perform two tasks in a single pass: first, generate relevant multiple-choice quiz questions based on the lesson's content, and second, identify and provide the correct answer to each question. The result was a self-contained assessment module that required no additional validation: the model generated both the questions and the answer key simultaneously, ensuring that each lesson was instantly paired with an accurate quiz (see the quiz-generation sketch after this list).
● Reconciling conflicting information for the AI agent. The consultation agent occasionally received conflicting information from the internal knowledge base and external web searches. We addressed the issue by introducing a self-reflection step into the agent's reasoning process: before responding to a user's question, the agent explicitly assesses the credibility and contextual alignment of all retrieved sources, prioritizing vetted internal content (sketched after this list).
● Balancing LLM quality, latency & cost. Early in the project, we experimented with different LLMs and found that they excelled at different tasks. Claude 3.5 initially produced higher-quality lessons, but Gemini 2.5 Pro, released later, offered a better balance of output quality, latency, and seamless integration with the GCP ecosystem, prompting a strategic migration.
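To illustrate the single-pass quiz strategy described above, here is a simplified version of the generation call. The prompt wording, the five-question count, and the JSON schema are illustrative rather than our production prompt:

```python
import json
from vertexai.generative_models import GenerationConfig, GenerativeModel

def generate_quiz(lesson_text: str) -> list[dict]:
    # A single prompt produces both the questions and the answer key, so no
    # separate validation pass is required.
    gemini = GenerativeModel("gemini-2.5-pro")
    prompt = (
        "From the lesson below, write 5 multiple-choice questions. Return a "
        "JSON array in which each item has 'question', 'choices' (a list of "
        "4 strings), and 'answer' (the correct choice, copied verbatim).\n\n"
        f"Lesson:\n{lesson_text}"
    )
    response = gemini.generate_content(
        prompt,
        generation_config=GenerationConfig(
            response_mime_type="application/json"
        ),
    )
    return json.loads(response.text)
```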
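The self-reflection step can likewise be sketched as two chained Gemini calls, with the first call's source critique fed into the second. Again, the prompt wording is illustrative:

```python
from vertexai.generative_models import GenerativeModel

gemini = GenerativeModel("gemini-2.5-pro")

def answer_with_reflection(question: str, internal: str, web: str) -> str:
    # Step 1: grade the credibility and relevance of the retrieved sources
    # before answering, with vetted internal content taking priority.
    critique = gemini.generate_content(
        "Assess the credibility and relevance of each source below for the "
        f"question '{question}'. Flag contradictions, and note that vetted "
        f"internal content takes priority.\n\nInternal:\n{internal}\n\n"
        f"Web:\n{web}"
    ).text

    # Step 2: answer with the critique in context, so the model relies on
    # the sources it judged most credible.
    return gemini.generate_content(
        f"Question: {question}\n\nInternal:\n{internal}\n\nWeb:\n{web}\n\n"
        f"Source assessment:\n{critique}\n\n"
        "Answer the question, relying on the sources the assessment found "
        "most credible."
    ).text
```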