What are foundation models, and how could they help your company excel at AI?
Unless you’ve been living under a rock, you’ve heard about OpenAI’s ChatGPT. This language model has absorbed tremendous volumes of conversational text using the supervised learning and, at the fine-tuning stage, the reinforcement learning from human feedback (RLHF) approaches.
The generative AI solution can analyze input data against 175 billion parameters and profoundly understand the written language. The smart tool can answer questions, summarize and translate text, produce articles on a given topic, write code, and much more. All you need is to provide ChatGPT with the right prompts.
OpenAI’s groundbreaking product is just one example of foundation models that transform AI application development as we know it.
With foundation AI models like ChatGPT, companies no longer have to train algorithms from scratch for every task they want to enhance or automate. Instead, you only need to select a foundation model that best fits your use case—and fine-tune its performance for a specific objective you’d like to achieve.
The smart approach could help businesses significantly reduce the cost of implementing AI, especially when training data is scarce or difficult to obtain due to privacy regulations. This often happens in healthcare, banking, finance, education, and other heavily regulated industries.
On a side note, companies looking to join the Gen AI bandwagon should be aware of the specifics of generative AI pricing. In some cases, opting for a readily available tool like ChatGPT may be less cost-effective than using open-source foundation models “as is” or retraining them on your corporate data.
What types of foundation AI models are there?
Several types of foundation AI models are commonly used in business applications:
-
Semi-supervised learning models are trained on a dataset that contains a mixture of labeled and unlabeled data. The goal is to use the labeled data to improve the model’s performance on the unlabeled data. AI experts turn to semi-supervised learning when training data is difficult to obtain or would cost your company an arm and a leg. This, for instance, may happen in medical settings where various healthcare IT regulations are enacted. Some common examples of semi-supervised models include pre-trained text document and web content classification algorithms.
-
Unsupervised learning models are fully trained on unlabeled datasets. They discover patterns in training data or structurize it on their own. Such models, among other things, could segment information into clusters based on the parameters they’ve uncovered in a training dataset. ML engineers turn to auto-encoders, K-Means, hierarchical clustering, and other techniques to create unsupervised machine learning solutions and improve their accuracy.
-
Reinforcement learning models interact with their environment without specific training. When achieving a desired outcome — i.e., making a prediction the developers have hoped for — the models get rewarded. On the contrary, reinforcement learning models are penalized when making wrongful assumptions. The approach allows AI algorithms to make more complex decisions than their supervised and semi-supervised counterparts. An example of reinforcement learning in action would be autonomous vehicles or game-playing artificial intelligence like AlphaGo.
-
Generative AI models produce new data similar to the data they’ve been trained on. This data may include text, images, audio clips, and videos. The ChatGPT solution mentioned in the previous section belongs to this category of foundation AI models. Other examples of generative AI solutions include the DALL-E 2 tool, which creates images based on descriptions written in natural language, and the Synthesia.io video platform, which uses text-based inputs to produce video content.
-
Transfer learning models can solve tasks other than they’ve been trained on. For instance, computer vision engineers may leverage pre-trained image classification algorithms for object detection. Or harness existing NLP solutions for more knowledge-intensive tasks, such as customer sentiment analysis. Some popular pre-trained machine learning solutions include OpenCV, a computer vision library containing robust models for object classification and image detection, and Hugging Face’s Transformers library offerings, such as generative pre-trained transformer (GPT) — i.e., a rich language model whose third generation (GPT-3) powers the ChatGPT service.
-
Meta-learning models, unlike their task-orientated equivalents, literally learn to learn (no pun intended). Instead of devouring data to solve a specific problem, such models develop general strategies for problem-solving. This way, meta-learning solutions can easily adapt to new challenges while using their resources, such as memory and computing power, more efficiently. ML experts tap into meta-learning when training data is scarce, or a company lacks definitive plans regarding AI implementation in business. TensorFlow, PyTorch, and other open-source machine learning libraries and frameworks offer tools that allow developers to explore meta-learning techniques. And cloud computing providers like Google help ML experts and newbies train custom machine learning models using AutoML.
Depending on the specific application and the type of data you have, one foundation model may be more appropriate than another. And your company is free to choose between an open-source solution, which needs a bit of tweaking, or a ready-to-use third-party product — provided it meets your business targets.
Top 3 reasons to leverage foundation AI models for your next project
Below you will find a rundown of foundation AI models’ advantages:
-
Foundation models will help you implement AI faster, cheaper, and with fewer resources involved. Creating and deploying an AI solution requires considerable time and resources. For every new application, you need a separate well-labeled data set. And if you don’t have it, you’ll need a team of data experts to find, cleanse, and label that information. According to Dakshi Agrawal, CTO of IBM AI, foundation models help cut down on data labeling requirements by 10-200 times depending on a given use case, which translates into significant cost savings. On the business side, you should also consider the rising cloud computing expenses. Google, for instance, spent as much as $35 million to teach DeepMind to play Go. And while your AI project may not be half as ambitious, you could easily spend $300,000 in cloud server costs alone to get your AI app up and running. Another reason to use foundation models, such as generative AI solutions, is the opportunity to quickly prototype and test different concepts without investing heavily in R&D.
-
You can reuse foundation AI models to create different applications. As their name implies, AI foundation models can serve as a basis for multiple AI applications. Think about driving a car. Once you’ve got a driver’s license, you don’t need to pass the exam every time you buy another vehicle. Similarly, you can use a smaller amount of labeled data to train a general-purpose foundation model that summarizes texts to process domain-specific content. And foundation models possess “emergence” capabilities, too, which means that a model, once trained, may learn to solve problems it was not supposed to address or glean unexpected insights from training data.
-
Foundation AI models help achieve your company’s sustainability goals. Training one large machine learning model can have the same environmental footprint as running five cars over their lifetime. Such a heavy carbon footprint stands in sharp contrast with the fact that 66% and 49% of businesses are increasing the efficiency of energy use and developing new climate-friendly services and products, respectively. With foundation AI models, you can train intelligent algorithms faster and utilize computing resources wisely — not the least thanks to the models’ architecture that takes advantage of hardware parallelism, executing several tasks simultaneously.
Deemed “the future of AI,” foundation models lower the threshold for tapping into artificial intelligence and could potentially end the failed AI proof of concept cycle by helping businesses scale models across other use cases and company wide.
But with every opportunity comes a challenge.
Things to consider when using foundation models
The only glaring drawback of foundation AI models is a lack of explainability.
Large foundation models can use so much training data and have so many deep layers that it’s sometimes hard to determine how algorithms arrive at their conclusions.
The black-box nature of foundation models leaves a backdoor for cybercriminals, too. Hackers can launch data poisoning attacks and introduce AI bias, further exacerbating artificial intelligence’s ethical issues.
Technology companies should join forces with governments to set up infrastructure for public AI projects to avoid contention surrounding foundation AI models’ usage. AI vendors should also disclose what datasets they use and how they train their models.
As Percy Liang, Stanford HAI faculty and computer science professor, opined during his recent interview with Venture Beat, “We’re very much in the early days, so the professional norms are underdeveloped. It’s therefore imperative that we, as a community, act now to ensure that this technology is developed and deployed in an ethically and socially responsible fashion.”
What does it take to start using foundation models in your organization
As someone who’s spent the last ten years helping companies implement AI systems, the ITRex team is witnessing a shift in artificial intelligence.
Systems that execute specific tasks in a single domain give way to broad AI that learns more generally and works across industries and use cases. Foundation models, trained on large, unlabeled datasets and fine-tuned for various applications, are driving this transformation.
If your company is ready to leapfrog your competitors and get ROI from your AI systems faster, here’s a high-level strategy for implementing foundation models:
-
Collect and pre-process your data. The first step involves collecting and pre-processing the data you will feed to a foundation AI model. The quality and diversity of this data are critical for ensuring that the fine-tuned model is accurate and robust.
-
Choose a foundation model. Many pre-trained AI foundation models are available on the market. Some popular solutions include BERT, GPT, and ResNet, among others. It’s important to choose the right foundation model based on the task you want to solve and the type of data you have.
-
Tweak the model in line with your business objectives. Once your foundation model and data are ready, you can adjust the model’s parameters to your specific task. One way to achieve this goal is transfer learning, where you use the pre-trained weights of the foundation model as a starting point and adjust them based on your training data.
-
Evaluate the model. After fine-tuning, it’s crucial to determine if the model works well and if further adjustment is necessary. To assess the foundation model’s performance, you can use standard metrics such as accuracy, precision, recall, and F1 score.
-
Deploy your AI solution. Once you’re satisfied with the performance of your fine-tuned model, you can deploy it in a production environment. Several options for deploying AI models include cloud-based platforms, on-premise servers, or edge devices.
It’s important to remember that implementing AI foundation models requires technical expertise and access to specialized hardware and software tools. Therefore, it may be helpful to partner with a specialized AI vendor or consult with a team of AI experts to ensure the process is done effectively.