As the core engineering partner, ITRex developed a
generative AI training platform that personalizes the learning process at scale.
Our approach was based on a modular, purpose-driven large language model (LLM) architecture, ensuring that each component of the training experience was precise, scalable, and efficient.
ITRex developed the solution using a sophisticated retrieval-augmented generation (RAG) pipeline built on a high-fidelity knowledge base.
We devised a multi-stage procedure to avoid the common pitfalls of LLM hallucination and content repetition:
● Advanced data processing. The ITRex R&D team created custom parsers for a variety of source documents (PDFs, PPTX, DOCX, audio/video subtitles) to normalize all incoming data into a structured text format (a parser sketch follows this list).
● Intelligent chunking & embedding. To improve semantic segmentation, we implemented an adaptive chunk splitter that used positional encoding, so that the context fed into the model stayed relevant. The processed chunks were then transformed into vector embeddings via the OpenAI Embeddings model and domain-specific SentenceTransformers (see the chunking and embedding sketch below).
● Few-shot learning for factual consistency. To further ground the model's output in the source material, we enhanced the RAG pipeline with few-shot learning. By providing the model with curated, high-quality question-and-answer pairs directly within the prompt, we guided its responses to be more factually consistent and less repetitive, achieving high accuracy without retraining the model (see the prompt-assembly sketch below).
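The custom parsers themselves are proprietary, but a minimal sketch of the dispatch layer this first step implies might look as follows, assuming the open-source pypdf, python-docx, and python-pptx libraries plus a stdlib cleanup for SRT subtitles:

```python
import re
from pathlib import Path

from pypdf import PdfReader    # pip install pypdf
from docx import Document      # pip install python-docx
from pptx import Presentation  # pip install python-pptx


def parse_pdf(path: Path) -> str:
    reader = PdfReader(str(path))
    return "\n\n".join(page.extract_text() or "" for page in reader.pages)


def parse_docx(path: Path) -> str:
    doc = Document(str(path))
    return "\n\n".join(p.text for p in doc.paragraphs if p.text.strip())


def parse_pptx(path: Path) -> str:
    prs = Presentation(str(path))
    texts = []
    for slide in prs.slides:
        for shape in slide.shapes:
            if shape.has_text_frame and shape.text.strip():
                texts.append(shape.text)
    return "\n\n".join(texts)


def parse_srt(path: Path) -> str:
    # Strip cue numbers and "00:00:01,000 --> 00:00:04,000" timestamp lines,
    # keeping only the spoken text.
    raw = path.read_text(encoding="utf-8", errors="ignore")
    lines = [
        line for line in raw.splitlines()
        if line.strip() and not line.strip().isdigit() and "-->" not in line
    ]
    return " ".join(lines)


PARSERS = {".pdf": parse_pdf, ".docx": parse_docx, ".pptx": parse_pptx, ".srt": parse_srt}


def normalize(path: Path) -> str:
    """Dispatch on file extension and return normalized plain text."""
    parser = PARSERS.get(path.suffix.lower())
    if parser is None:
        raise ValueError(f"Unsupported source format: {path.suffix}")
    return parser(path)
```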
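The details of the adaptive chunk splitter are not public, so the sketch below substitutes a simpler paragraph-aware splitter with a character overlap, followed by the two embedding routes the second bullet names. The model identifiers and size parameters are illustrative assumptions:

```python
from openai import OpenAI                               # pip install openai
from sentence_transformers import SentenceTransformer   # pip install sentence-transformers


def split_into_chunks(text: str, max_chars: int = 1200, overlap: int = 200) -> list[str]:
    """Paragraph-aware splitter: pack paragraphs into ~max_chars chunks,
    carrying a tail overlap so context is preserved across boundaries.
    (A simplified stand-in for the adaptive splitter described above;
    single paragraphs longer than max_chars are kept whole here.)"""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        if current and len(current) + len(para) > max_chars:
            chunks.append(current)
            current = current[-overlap:]  # carry overlap into the next chunk
        current = f"{current}\n\n{para}".strip()
    if current:
        chunks.append(current)
    return chunks


def embed_openai(chunks: list[str]) -> list[list[float]]:
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.embeddings.create(model="text-embedding-3-small", input=chunks)
    return [item.embedding for item in resp.data]


def embed_domain(chunks: list[str]):
    model = SentenceTransformer("all-MiniLM-L6-v2")  # swap in a domain-tuned model
    return model.encode(chunks, normalize_embeddings=True)
```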
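The few-shot grounding step can be sketched as prompt assembly: curated Q&A pairs are injected as prior conversation turns ahead of the retrieved context. The system prompt wording, example pairs, and model name below are assumptions, not production values:

```python
from openai import OpenAI

client = OpenAI()

# Curated examples that demonstrate the desired grounded answer style,
# including a refusal when the context is insufficient.
FEW_SHOT_PAIRS = [
    ("What does module 3 cover?",
     "Module 3 covers incident escalation paths, per the source handbook."),
    ("Who approves exceptions?",
     "The source documents do not specify this, so I can't answer reliably."),
]


def answer(question: str, retrieved_chunks: list[str]) -> str:
    messages = [{
        "role": "system",
        "content": "Answer strictly from the provided context. "
                   "If the context is insufficient, say so instead of guessing.",
    }]
    for q, a in FEW_SHOT_PAIRS:  # inject few-shot grounding examples
        messages.append({"role": "user", "content": q})
        messages.append({"role": "assistant", "content": a})
    context = "\n\n".join(retrieved_chunks)
    messages.append({"role": "user",
                     "content": f"Context:\n{context}\n\nQuestion: {question}"})
    resp = client.chat.completions.create(
        model="gpt-4o-mini", messages=messages, temperature=0.2
    )
    return resp.choices[0].message.content
```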
With a solid knowledge base in place, ITRex used specialized generative AI components to develop the platform's core features, such as automated lesson generation, dynamic personalization based on resume and role analysis, and a real-time interactive Q&A module (a streaming sketch of the Q&A flow follows below).
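One common way to make such a Q&A module feel instantaneous is to stream tokens to the learner as they are generated; whether the platform streams exactly this way is an assumption, but a minimal sketch against the OpenAI Chat Completions API looks like this:

```python
from openai import OpenAI

client = OpenAI()


def stream_answer(question: str, context: str) -> str:
    """Stream a grounded answer token-by-token for a responsive Q&A UI."""
    stream = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer only from the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
        stream=True,
    )
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)  # push to the UI as it arrives
            parts.append(delta)
    return "".join(parts)
```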
A project of this magnitude involved overcoming several engineering hurdles. The primary challenges, and the solutions we applied, were:
● Reducing hallucinations and repetition during lesson generation. Early LLM outputs were frequently repetitive or contained hallucinated content that did not match the source documents. To tackle the problem, our R&D engineers enhanced the RAG pipeline with an adaptive chunk splitter and introduced retrieval-based filtering layers. Combining this technique with few-shot learning on curated examples resulted in a strong, multi-layered defense against inaccurate content generation (a filtering sketch follows this list).
● Personalizing educational content across different organizations. We discovered that seniority definitions, such as "junior" or "senior," differed greatly between clients, making resume-only personalization inconsistent. We addressed the issue by configuring the system to perform a dual analysis: the platform compares parsed CV data to role requirements extracted from the client's own company documents, dynamically adjusting lesson complexity and ensuring relevance (see the matching sketch below).
● Achieving real-time performance and low latency. The LLM powering the platform was initially hosted in the cloud (Azure) and accessed via APIs, which introduced noticeable lag during real-time Q&A sessions. We solved the problem by benchmarking providers and, after careful evaluation, migrating the LLM-related services to a direct OpenAI API endpoint. The switch was handled with minimal disruption thanks to the platform's modular architecture, significantly improving response time and user experience (see the provider-abstraction sketch below).
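The retrieval-based filtering layers from the first challenge can be approximated with a similarity threshold plus near-duplicate suppression: chunks scoring too low against the query, or too close to an already selected chunk, never reach the prompt. The thresholds below are illustrative:

```python
import numpy as np


def filter_chunks(query_vec: np.ndarray,
                  chunk_vecs: np.ndarray,
                  chunks: list[str],
                  min_score: float = 0.75,
                  dedup_threshold: float = 0.95) -> list[str]:
    """Keep chunks relevant to the query; drop near-duplicates that
    would push the generator toward repetitive output.
    Assumes all vectors are L2-normalized, so dot product == cosine."""
    scores = chunk_vecs @ query_vec
    order = np.argsort(scores)[::-1]  # best match first
    selected, selected_vecs = [], []
    for i in order:
        if scores[i] < min_score:
            break                     # everything after is scored lower
        vec = chunk_vecs[i]
        if any(float(vec @ s) > dedup_threshold for s in selected_vecs):
            continue                  # near-duplicate of a kept chunk
        selected.append(chunks[i])
        selected_vecs.append(vec)
    return selected
```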
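The dual analysis from the second challenge reduces to scoring parsed CV data against role requirements extracted from the client's own documents and mapping the resulting gap to a complexity tier. The field names and thresholds in this sketch are hypothetical simplifications:

```python
def lesson_complexity(cv_skills: set[str], role_requirements: set[str]) -> str:
    """Compare CV skills to the client's own role requirements (not a
    global 'junior'/'senior' scale) and pick a lesson complexity tier.
    Coverage thresholds are illustrative."""
    if not role_requirements:
        return "intermediate"  # no client data available: safe default
    coverage = len(cv_skills & role_requirements) / len(role_requirements)
    if coverage >= 0.8:
        return "advanced"      # learner already meets most requirements
    if coverage >= 0.4:
        return "intermediate"
    return "foundational"


# Example: the same CV maps to different tiers at different clients.
cv = {"python", "sql", "airflow"}
client_a = {"python", "sql"}                               # -> advanced
client_b = {"python", "sql", "spark", "kafka", "airflow"}  # -> intermediate
print(lesson_complexity(cv, client_a), lesson_complexity(cv, client_b))
```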
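The smooth provider migration in the third challenge is what a modular architecture buys: when every service talks to a narrow LLM interface, swapping Azure-hosted inference for a direct OpenAI endpoint becomes a configuration change rather than a rewrite. A minimal sketch of such a seam, with illustrative class and parameter names:

```python
from abc import ABC, abstractmethod

from openai import AzureOpenAI, OpenAI


class LLMProvider(ABC):
    """Narrow interface the rest of the platform codes against."""

    @abstractmethod
    def complete(self, prompt: str) -> str: ...


class OpenAIProvider(LLMProvider):
    def __init__(self, model: str = "gpt-4o-mini"):
        self.client, self.model = OpenAI(), model

    def complete(self, prompt: str) -> str:
        resp = self.client.chat.completions.create(
            model=self.model, messages=[{"role": "user", "content": prompt}]
        )
        return resp.choices[0].message.content


class AzureProvider(LLMProvider):
    def __init__(self, deployment: str, endpoint: str, api_version: str = "2024-02-01"):
        # AzureOpenAI reads AZURE_OPENAI_API_KEY from the environment.
        self.client = AzureOpenAI(azure_endpoint=endpoint, api_version=api_version)
        self.deployment = deployment

    def complete(self, prompt: str) -> str:
        resp = self.client.chat.completions.create(
            model=self.deployment, messages=[{"role": "user", "content": prompt}]
        )
        return resp.choices[0].message.content


def make_provider(name: str, **kwargs) -> LLMProvider:
    # Selecting the provider from config is what kept the migration low-risk.
    providers = {"openai": OpenAIProvider, "azure": AzureProvider}
    return providers[name](**kwargs)
```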