
Melody Sage: a Gen AI music learning platform

Client
An internal R&D project by the ITRex AI/Gen AI team
Industry
EduTech, AI research
Services
Gen AI development, agent implementation, RAG, cloud platform engineering
Tech stack
Gemini 2.0, Imagen3, custom RAG & LLM-as-Judge pipelines, Terraform, Google Search API, Python, Google Cloud Platform (Vertex AI, Firestore, Cloud Storage, Firebase Authentication, Cloud Run)

Challenge

Music education has long relied on a one-size-fits-all approach. Adult beginners with no prior training often find music theory intimidating, and traditional online platforms frequently fail to adapt to an individual's learning pace or answer nuanced questions unless a human tutor is available. To investigate the future of personalized education, our Gen AI development company launched Melody Sage, an internal R&D project aimed at creating a fully autonomous music tutor that would push the boundaries of agent-based AI solutions. Our main challenge was to create an end-to-end system on Google Cloud Platform (GCP) that could not only generate a curriculum from scratch but also interact intelligently and adaptively with a learner in real time.

Solution

The ITRex team designed and built the Melody Sage platform entirely on GCP, focusing on two key flows: automated course generation from uploaded content and interactive consultation with a tool-augmented AI agent. The end-to-end pipeline functions as follows:
Automated content ingestion & structuring. Instructional materials (PDFs, DOCXs, etc.) are uploaded into Google Cloud Storage. Google's Document AI reads these files and extracts clean text, which is then segmented into semantic chunks. These chunks are vectorized using Vertex AI Embeddings and saved in Vertex AI Vector Search.
Generative curriculum assembly. A back-end service constructs a structured course, generating lessons and corresponding quizzes. Google's Gemini 2.5 Pro model is prompted to create instructional text for each lesson, extracting context from the vector database. The Imagen3 model then creates a relevant cover image for each lesson based on the Gemini description.
Personalized learning flow. Students can access their personalized curriculum via a secure front end. Their progress and interactions are tracked in real time by Firestore, allowing the system to adjust the lesson difficulty and pace.
Real-time AI consultation agent. Students may ask open-ended questions at any time. A dedicated AI agent extracts relevant information from the internal knowledge base and augments it with real-time web results from the Google Search API. Gemini then uses this information to generate a comprehensive, context-aware response.
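The ingestion and consultation steps above can be illustrated with a minimal, self-contained sketch. It is not the Melody Sage implementation: the bag-of-words `embed` function is a toy stand-in for Vertex AI Embeddings, the in-memory ranking stands in for Vertex AI Vector Search, and all function names, parameters, and the prompt wording are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping word windows (a simple stand-in for semantic chunking)."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    chunks = []
    step = max_words - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; assumption: a real system calls an embedding model here."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    query = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, internal: list[str], web: list[str]) -> str:
    """Assemble a prompt that labels internal and web sources separately (hypothetical wording)."""
    lines = ["Answer the learner's question using the sources below."]
    lines += [f"[internal {i}] {c}" for i, c in enumerate(internal, 1)]
    lines += [f"[web {i}] {s}" for i, s in enumerate(web, 1)]
    lines.append(f"Question: {question}")
    return "\n".join(lines)
```

In this sketch, the agent would call `retrieve` against the course chunks, fetch web snippets separately, and pass both to `build_prompt` before sending the result to the LLM. Labeling each source by origin keeps internal and external material distinguishable in the prompt.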
While working on the Melody Sage project, our R&D team had to address several complex engineering problems:
Streamlining quiz generation. A major hurdle for the engineering team was creating assessments for each dynamically generated lesson without manual review. Our solution was a prompting strategy in which the LLM performs two tasks in one step: first, it generates relevant multiple-choice quiz questions based on the lesson's content; second, it identifies and provides the correct answer to each question. As a result, we developed a self-contained assessment module that required no additional validation: the model generated the questions and the answer key together, so each lesson was instantly paired with an accurate quiz.
Verifying conflicting information for the AI agent. The consultation agent occasionally received conflicting information from the internal knowledge base and external web searches. We addressed the issue by introducing a self-reflection step into the agent's reasoning process. Before responding to a user's question, the agent explicitly assesses the credibility and contextual alignment of all retrieved sources, prioritizing vetted internal content.
Balancing LLM quality, latency & cost. Early in the project, we experimented with different LLMs and found that they excelled at different tasks. Claude 3.5 initially produced higher-quality lessons, but Gemini 2.5 Pro, which came later, offered a better balance of performance and latency, along with seamless integration with the GCP ecosystem, which prompted a strategic migration.
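The single-step quiz generation described above can be sketched as a prompt that asks for questions and their answer key in one JSON response, plus a parser that pairs them. This is a minimal illustration under stated assumptions: the prompt wording, the JSON field names (`questions`, `options`, `answer_index`), and the function names are hypothetical and are not taken from the Melody Sage code base. The parser only checks that the response is structurally usable; it does not judge whether an answer is musically correct.

```python
import json

# Hypothetical response schema, shown to the model as part of the prompt.
QUIZ_SCHEMA = (
    '{"questions": [{"question": "...", '
    '"options": ["...", "...", "...", "..."], "answer_index": 0}]}'
)


def build_quiz_prompt(lesson_text: str, num_questions: int = 3) -> str:
    """Ask for questions AND the answer key in a single model call."""
    return (
        f"Write {num_questions} multiple-choice questions about the lesson below. "
        "For each question, also give the index of the correct option. "
        f"Reply with JSON only, in this shape: {QUIZ_SCHEMA}\n\n"
        f"Lesson:\n{lesson_text}"
    )


def parse_quiz(raw_response: str) -> list[dict]:
    """Parse the model's JSON reply and check that every answer index points at an option."""
    data = json.loads(raw_response)
    questions = data["questions"]
    for item in questions:
        index = item["answer_index"]
        if not isinstance(index, int) or not 0 <= index < len(item["options"]):
            raise ValueError(f"answer_index out of range for: {item['question']!r}")
    return questions


def answer_key(questions: list[dict]) -> list[str]:
    """Extract the text of the correct option for each question."""
    return [q["options"][q["answer_index"]] for q in questions]
```

A caller would send the output of `build_quiz_prompt` to the LLM, pass the reply to `parse_quiz`, and store the returned questions alongside the lesson. Because the answer key arrives in the same response as the questions, no second model call is needed to grade the quiz.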

Impact

As an internal R&D initiative, the Melody Sage project helped advance our team's expertise in Gen AI and agentic systems:
The project showcases ITRex's ability to build complex generative AI applications, from data ingestion and RAG to agent-based interactions, using the latest technologies
Through this project, we developed and validated reusable architectural patterns for agent-based reasoning, automated content generation, and LLM-as-judge evaluation, which can accelerate future client engagements
The platform demonstrates a sophisticated application of Google Cloud's AI suite, including tools such as Gemini, Imagen3, and Vertex AI
Melody Sage provides a tangible blueprint for how generative AI can transform education, moving from static content to truly adaptive and personalized learning paths
