Four steps to implement a large language model successfully

Organizations are eager to adopt LLMs, but navigating potential pitfalls can be challenging.


In brief
  • Companies are pondering how best to prepare for adopting an LLM.
  • Implementation issues include a lack of tools or platforms to develop models, as well as a shortage of artificial intelligence (AI) skills.
  • A comprehensive framework that considers the use case, cost of ownership, AI readiness and hosting architecture can help overcome these obstacles.

Generative AI (GenAI) is revolutionizing industries and reformulating how organizations engage with customers, design products and streamline operations. Large language models (LLMs) are GenAI models that are trained on an enormous amount of text data to understand and generate human-like language, which means they can help organizations process and synthesize data more quickly, uncover patterns and generate valuable insights. However, many companies are grappling with how to get ready to adopt LLMs.

[Chart: Top factors hindering the adoption of AI, such as an LLM. Source: IBM Global AI Adoption Index 2022, IBM Corporation, May 2022]

How a comprehensive AI and LLM framework helps prepare companies

A thorough AI framework that evaluates readiness and addresses potential issues before investing can help organizations get on the right path. For example, some private equity firms are experimenting with LLMs to analyze market trends and patterns, manage documents and automate some functions. They are also considering how GenAI may impact their investing strategy. The following four-step analysis can assist an organization in deciding whether to build its own LLM or work with a partner to facilitate an LLM implementation.

1) Define the use case for adopting an LLM

There is a lot of hype around GenAI and all that it can do. Although it’s a powerful technology, it may not be suitable for some problems and could prove costly if deployed without a clearly defined use case. Use cases related to lower-level customer support, content creation and document analysis tend to be best suited for GenAI experimentation.

Once businesses have zeroed in on the right set of use cases, it is beneficial to start experimenting with one of the pretrained enterprise-scale LLMs, such as OpenAI’s GPT-4. Using a state-of-the-art pretrained model can lead to multiple operational efficiencies by:
  • Streamlining hybrid and multi-cloud management, which enables teams to communicate with cloud infrastructure using natural language queries
  • Simplifying tasks such as monitoring, troubleshooting and maintaining multi-cloud deployments
  • Facilitating the automation of testing by analyzing input materials, such as requirements documents and user stories, to generate structured test cases that can be executed automatically or manually and to create realistic test data (see the sketch after this list)
  • Accelerating software development by automating code reviews and suggesting optimizations; advanced natural language understanding enables the model to analyze code quality and provide actionable feedback, reducing the time required for manual code inspections and improvements
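
To make the testing use case concrete, here is a minimal sketch that asks a pretrained model to turn a user story into structured test cases. It assumes the OpenAI Python SDK (v1+) and an OPENAI_API_KEY environment variable; the model name, prompt wording and the generate_test_cases helper are illustrative choices, not a prescribed setup.

```python
# Minimal sketch: generating structured test cases from a user story
# with a pretrained LLM via the OpenAI Python SDK (v1+).
# Assumes OPENAI_API_KEY is set; model name and prompt are illustrative.
from openai import OpenAI

client = OpenAI()

def generate_test_cases(user_story: str) -> str:
    """Ask the model to draft structured test cases for a user story."""
    response = client.chat.completions.create(
        model="gpt-4",  # any enterprise-scale pretrained model would do
        messages=[
            {"role": "system",
             "content": "You are a QA engineer. Return numbered test cases "
                        "with steps, test data and expected results."},
            {"role": "user", "content": f"User story:\n{user_story}"},
        ],
        temperature=0.2,  # favor consistent, reviewable output
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    story = ("As a customer, I want to reset my password via email "
             "so that I can regain access to my account.")
    print(generate_test_cases(story))
```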

Still, in certain scenarios where pretrained models fail to meet accuracy goals, companies may opt to train or fine-tune a model on proprietary data to improve overall performance.
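
Where a hosted provider supports it, fine-tuning can be as simple as uploading examples and launching a job. The sketch below uses the OpenAI fine-tuning endpoints; the proprietary_examples.jsonl file name and the base model are chosen purely for illustration, and other vendors expose similar workflows.

```python
# Minimal sketch: fine-tuning a hosted model on proprietary data
# via the OpenAI Python SDK (v1+). File name and model are illustrative.
from openai import OpenAI

client = OpenAI()

# 1) Upload proprietary training examples (JSONL of chat-formatted records).
training_file = client.files.create(
    file=open("proprietary_examples.jsonl", "rb"),
    purpose="fine-tune",
)

# 2) Launch the fine-tuning job against a base model.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-3.5-turbo",  # base model; availability varies by provider
)

# 3) Track the job and use the resulting model ID once it completes.
print("Fine-tuning job started:", job.id)
```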

 

2) Evaluate AI readiness

The next step is to assess the organization’s AI and machine learning (ML) readiness across three categories: AI capabilities, data and data practices, and analytics capabilities. Skipping this step risks going in unprepared and failing to achieve the project’s goals.

To implement an LLM, it’s also helpful for companies to focus on acquiring or building AI skill sets, including prompt engineering, retrieval-augmented generation (RAG), vector databases and ethical AI practices (a minimal RAG sketch follows the list below). To accomplish this, businesses can:

  • Provide continuous training and upskilling
  • Access professionals with experience in AI development
  • Create a systematic approach to embed ethical AI practices in every aspect of implementation
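
To make the RAG and vector database skill sets concrete, the sketch below embeds a handful of documents, retrieves the closest match by cosine similarity and grounds the model’s answer in it. The embedding and chat model names, the sample documents and the tiny in-memory index are illustrative assumptions; a production system would use a dedicated vector database.

```python
# Minimal RAG sketch: embed documents, retrieve the closest chunk,
# and ground the model's answer in it. Model names and the in-memory
# "index" are illustrative assumptions, not a recommended stack.
import numpy as np
from openai import OpenAI

client = OpenAI()

DOCS = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Premium support is available 24/7 for enterprise customers.",
    "Invoices are issued on the first business day of each month.",
]

def embed(texts):
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

doc_vectors = embed(DOCS)

def answer(question: str) -> str:
    q_vec = embed([question])[0]
    # Cosine similarity against every stored chunk; keep the best match.
    sims = doc_vectors @ q_vec / (
        np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q_vec)
    )
    context = DOCS[int(np.argmax(sims))]
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context: {context}\n\nQ: {question}"},
        ],
    )
    return resp.choices[0].message.content

print(answer("How long do customers have to return a product?"))
```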

Next, it’s important for a company to analyze its data readiness by determining the following (a toy ingestion sketch appears after the list):

  • The appropriate data analytics and learning pipeline architecture
  • The selection of automated tools that import data from various sources to one target location
  • The level of security in the analytics environment
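
As a toy illustration of importing data from various sources to one target location, the sketch below normalizes records from a hypothetical CSV file and JSON file into a single SQLite table; the file names and schema are invented for the example.

```python
# Toy ingestion sketch: pull records from two hypothetical sources
# (CSV and JSON) and land them in one SQLite target table.
import csv
import json
import sqlite3

conn = sqlite3.connect("analytics.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS documents (source TEXT, doc_id TEXT, body TEXT)"
)

def load_csv(path: str):
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            yield ("csv", row["id"], row["text"])

def load_json(path: str):
    with open(path) as f:
        for record in json.load(f):
            yield ("json", record["id"], record["text"])

for source in (load_csv("support_tickets.csv"), load_json("contracts.json")):
    conn.executemany("INSERT INTO documents VALUES (?, ?, ?)", source)

conn.commit()
print(conn.execute("SELECT COUNT(*) FROM documents").fetchone()[0], "rows loaded")
```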

Lastly, it’s beneficial for companies to build and fine-tune analytics capabilities, including models and algorithms, to improve performance (a simple validation sketch follows the list). This includes:

  • Rigorous model validation and testing
  • Iterative improvements based on real-world feedback
  • Integration of AI insights into decision-making processes for measurable impact
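
As one way to approach rigorous model validation, the sketch below scores an LLM-backed answering function against a small golden set; the test questions and the answer_fn stand-in are placeholders for whatever system is under evaluation.

```python
# Minimal validation sketch: score a model-backed function against a
# golden set and report accuracy. `answer_fn` and the cases are placeholders.
from typing import Callable

GOLDEN_SET = [
    ("What is the return window?", "30 days"),
    ("When are invoices issued?", "first business day"),
]

def evaluate(answer_fn: Callable[[str], str]) -> float:
    """Fraction of golden answers whose key phrase appears in the output."""
    hits = 0
    for question, expected in GOLDEN_SET:
        output = answer_fn(question)
        if expected.lower() in output.lower():
            hits += 1
    return hits / len(GOLDEN_SET)

# Example with a trivial stand-in model; swap in a real LLM call here.
score = evaluate(lambda q: "Returns are accepted within 30 days.")
print(f"accuracy: {score:.0%}")  # 50% on this two-item golden set
```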

3) Choose the appropriate hosting architecture

Selecting the right architecture for an organization’s unique needs encompasses three vital layers: the pre-processing, middleware and post-processing layers, each described below.


The pre-processing layer in an LLM architecture serves a critical role in handling data. Its responsibilities include collecting and consolidating structured and unstructured data into a container and employing optical character recognition (OCR) to convert non-text inputs into text. It is also responsible for ranking relevant chunks of content and selecting those to send to the model within the token limit, where a token is a fundamental unit of text that a language model reads and processes and the limit is the maximum length of the prompt. Furthermore, it may apply custom detection of personally identifiable information (PII) and mask it to protect sensitive information.
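
The sketch below illustrates two of these pre-processing responsibilities: masking simple PII patterns and greedily packing the highest-ranked chunks into a token budget. Counting tokens as whitespace-separated words and the regular expressions shown are simplifying assumptions; a production system would use a real tokenizer and a vetted PII detector.

```python
# Pre-processing sketch: mask simple PII patterns, then pack the
# highest-ranked chunks into a token budget. Word-count "tokens" and
# the regexes are simplifying assumptions, not production-grade tools.
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def mask_pii(text: str) -> str:
    return SSN.sub("[SSN]", EMAIL.sub("[EMAIL]", text))

def pack_chunks(scored_chunks, token_limit: int):
    """Greedily keep the best-scoring chunks that fit the prompt budget."""
    selected, used = [], 0
    for score, chunk in sorted(scored_chunks, reverse=True):
        tokens = len(chunk.split())  # crude proxy for a real tokenizer
        if used + tokens <= token_limit:
            selected.append(mask_pii(chunk))
            used += tokens
    return selected

chunks = [
    (0.9, "Contact john.doe@example.com about the Q3 contract renewal."),
    (0.4, "Unrelated marketing boilerplate that scores poorly."),
    (0.7, "Customer SSN 123-45-6789 appears in the intake form."),
]
print(pack_chunks(chunks, token_limit=20))
```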

The middleware layer facilitates interaction between the operating system and various applications. It supports a wide range of programming languages, including Python, .NET and Java, enabling compatibility and smooth communication across different platforms.

The post-processing layer refines the LLM’s output by using prompt engineering to frame queries and by offering a fine-tuning application programming interface (API) for customization on domain-specific data (such as curating financial data for training). It then consolidates and evaluates the results for correctness, addressing bias and drift with targeted mitigation strategies, to improve output consistency, understandability and quality.
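
As one simple way to consolidate and evaluate results in this layer, the sketch below samples the model several times and keeps the most common normalized answer, which can improve output consistency; the sample count and model name are illustrative.

```python
# Post-processing sketch: sample the model several times and keep the
# most frequent answer to improve consistency. Sample count is arbitrary.
from collections import Counter
from openai import OpenAI

client = OpenAI()

def consistent_answer(prompt: str, samples: int = 5) -> str:
    answers = []
    for _ in range(samples):
        resp = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": prompt}],
            temperature=0.7,  # diversity across samples
        )
        answers.append(resp.choices[0].message.content.strip().lower())
    # Majority vote over normalized answers.
    return Counter(answers).most_common(1)[0][0]

print(consistent_answer("In one word, which month has 28 or 29 days?"))
```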

4) Assess cost, data ownership options and resources

The choice of an LLM implementation approach impacts the complexity and costs, including those associated with:

  • Training
  • Data collection, ingestion and cleansing
  • Hiring data scientists
  • Maintaining the model in production

The selection also greatly affects how much control a company retains over its proprietary data. Proprietary data matters because it can differentiate a company’s product in ways competitors cannot easily replicate, potentially creating a competitive advantage. In addition, proprietary data can be crucial for addressing narrow, business-specific use cases.

 

Also, there are regulatory and ethical reasons for retaining control. For example, depending on the data being stored and processed, regulators may require secure storage and auditability. In addition, uncontrolled language models may generate misleading or inaccurate advice; implementing control measures can help prevent the spread of false information and the harm it poses, for instance, to individuals seeking medical guidance.

Typically, there are three ways to implement an LLM: via an API, platform as a service (PaaS) or self-hosting. Each presents different considerations.

 

Off-the-shelf model via API

Using an API can relieve a company of maintaining both a sizable team of data scientists and the language model itself, which involves handling updates, bug fixes and improvements. Using an API shifts much of this maintenance burden to the provider, allowing the company to focus on its core functionality. In addition, an API can enable on-demand access to the LLM, which is essential for applications that require immediate responses to user queries or interactions.

When a company uses an LLM API, it typically shares data with the API provider. It’s important to review and understand the data usage policies and terms of service to confirm they align with the company’s privacy and compliance requirements. Data ownership also depends on the provider’s terms and conditions: in many cases, companies retain ownership of their data but grant the provider certain usage rights for processing it. It’s beneficial for companies to clarify data ownership in their provider contracts before investing.

 

PaaS

PaaS gives companies access to an LLM as part of a broader platform offering and allows them to operate the model without managing the underlying application infrastructure, middleware or hardware. This approach lets companies control their data, allows domain specificity and model customization during deployment, and can minimize time to value and cost compared with self-hosting. However, companies may incur higher model costs associated with purchasing the rights to build on top of the LLM using their own data. Auditability of the data and the ability to provide comprehensive explanations for results can also pose challenges, since PaaS providers don’t expose the underlying data. In addition, PaaS can result in a greater total cost of ownership for the LLM and can be more complex than using an API.

 

Self-hosting an LLM

This is the most expensive approach because it means building the entire model from scratch and requires mature data processes to fully train, operationalize and deploy an LLM. Furthermore, upgrading the underlying model in a self-hosted implementation is considerably more intensive than a typical software upgrade. On the other hand, it provides maximum control, since the company owns the LLM, along with the ability to customize extensively.

No matter what stage a company is at in defining its GenAI strategy, the EY-Parthenon Software Strategy Group can help transform its vision into reality with frameworks to help identify the right approach for any organization’s unique circumstances.

Conclusion

Advances in deep learning networks are foreshadowing a productivity revolution, which is spurring companies to keep up with the adoption of GenAI technologies. When embarking on an AI initiative that includes an LLM implementation, companies can better inform their decisions by employing a comprehensive AI implementation framework. Taking this approach can help businesses prepare by analyzing their purpose, goals, costs and readiness factors, including regulatory compliance and ethical safeguards.

Lianda Luo, Austin Chen, Naina Wodon, Andrea Lamas-Nino and Ioannis Wallingford of Ernst & Young LLP also contributed to this article.

Summary 

As companies increasingly focus on adopting LLMs, using a comprehensive framework that evaluates readiness and addresses potential issues before investing can help organizations overcome implementation challenges.

