Case Study

How to accelerate your search speed with natural language processing

Technologies like optical character recognition help a managed care leader track compliance requirements in a fraction of the previous time.

The better the question

How fast can a million words become the one you need?

When a managed care company was losing time to manual searching, NLP answered the need for speed.


Health care is a priority for countries around the globe. In the United States alone, more than 85 million Americans spanning every state depend on Medicaid, a government-funded health care program, for coverage. The federally regulated program is especially vital in helping pregnant women, young mothers and children receive routine checkups, vaccinations and preventive care, a benefit for the US health system overall.

Because of the high volume of Medicaid patients and the complexities inherent in a system that is federally funded but state-managed, many states struggle to keep pace with the compliance requirements that allow Medicaid consumers to receive necessary health care. Large, private managed care organizations (MCOs) take on the states’ administrative obligations so that health benefits keep flowing to those who need them most.


One of the largest MCOs in the competitive US Health Insurance Marketplace found that it was losing valuable time to arduous compliance verification processes. The company’s compliance analysts were spending hundreds of hours manually combing through Medicaid contracts to confirm that state requirements were being met. In a highly competitive industry, that lost time was costing the MCO the edge it needed to win additional contracts.

Medicaid compliance requirements differ from state to state, and non-compliance results in stiff penalties to providers with state service level agreements (SLAs). Continual contract updates were adding to the time and labor drain for analysts at this MCO and negatively impacting the company’s bottom line.

The managed care organization sought the help of Ernst & Young LLP (EY US) practitioners to create a more efficient system for its compliance analysts. EY teams combined current technologies with the backing of health care regulatory teams to bring firsthand health industry insight, helping the client sharpen its decision-making and improve people’s lives.

The outcome is a searchable database, configured for the MCO’s needs and trained by the smartest emerging technology available.

“EY health care regulatory teams are dedicated to tracking the national policies and practices that impact health care,” says Heather Meade, Principal, Washington Council Ernst & Young LLP. “We’re passionate about helping health organizations make more informed decisions to improve people’s lives by providing leading-class technology innovations with a human touch. When our policy insights join forces with our technology experience, improving the tools that health care organizations depend on, everyone wins.”

The better the answer

Word search on warp speed

When contractual language is tangled, NLP unravels the web for better business.


EY professionals began by interviewing the MCO’s compliance analysts to gain a thorough understanding of the organization’s core challenges. They also assessed the existing Medicaid contracts across multiple states to broadly identify the commonalities and variations in the contractual formats and language.

Data scientists turned to artificial intelligence (AI), which emulates human cognition by analyzing large data sets, then applying algorithms to metabolize the raw material into usable information that can be harvested at astonishing speed. Natural language processing (NLP) is a form of AI that uses machine learning to serve as a translator between words as raw data – all of them – and words as functional data – only those relevant to a specific need or task.

Unstructured data represents the highest volume of data in organizations today, yet it remains largely overlooked for insights and can be an ongoing burden to manage.

For this client, the raw data consisted of numerous state contracts written with little or no consistency in language, wording, topical organization or formatting. The words used to write contractual requirements may be similar, but they are often organized and combined differently.

EY teams used NLP to custom-train an algorithm to extract requirements that would then populate a central database, enabling the company’s analysts to rapidly search keywords, phrases and characters by both state and business function. The searchable business areas include claims centers, call centers, appeals and grievances, and others with compliance implications.
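To make the idea of searching by keyword, state and business function concrete, here is a minimal Python sketch. It is not the EY implementation: the `Requirement` record, the sample entries and the `search` helper are all hypothetical, and a production system would sit on a real database rather than an in-memory list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    state: str      # state code (hypothetical sample data)
    function: str   # business area, e.g. "call center"
    text: str       # requirement sentence extracted from a contract


# Hypothetical records; real entries would come from the extraction step.
REQUIREMENTS = [
    Requirement("OH", "call center", "Calls must be answered within 30 seconds."),
    Requirement("TX", "call center", "Calls must be answered within 60 seconds."),
    Requirement("TX", "claims", "The contractor shall pay clean claims within 30 days."),
]


def search(records, keyword=None, state=None, function=None):
    """Return the records that match every filter that was supplied."""
    results = []
    for rec in records:
        if state is not None and rec.state != state:
            continue
        if function is not None and rec.function != function:
            continue
        # Keyword matching is a case-insensitive substring test.
        if keyword is not None and keyword.lower() not in rec.text.lower():
            continue
        results.append(rec)
    return results


texas_call_center = search(REQUIREMENTS, state="TX", function="call center")
within_hits = search(REQUIREMENTS, keyword="within")
```

Each filter is optional, so an analyst-style query can narrow by state alone, by function alone, or by any combination with a keyword.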

In the first phase, the EY teams identified common, relevant paragraph and sentence structures, flagging phrases such as “In the case of” and “is required to.” They also classified characters such as bullets and Roman numerals. Since these special characters can vary by state contract, different combinations were necessary for the NLP algorithm to learn the individual contracts and requirements. In some cases, the state contracts were only available in PDF format, which required the application of optical character recognition software to convert pictures of words to readable characters before NLP could be taught what to find.
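The first-phase flagging described above can be approximated with simple pattern matching. The Python sketch below is an illustration under stated assumptions, not the trained algorithm: the two trigger phrases come from the text, while the marker patterns and the `classify_marker` and `flag_line` helpers are hypothetical.

```python
import re

# Trigger phrases quoted in the case study; a real list would be far longer.
TRIGGER_PHRASES = ("in the case of", "is required to")

# Assumed list-marker shapes: a bullet character, or a Roman numeral
# written as "iv." or "(iv)". Single letters such as "c)" also match the
# Roman pattern, which a real classifier would need to disambiguate.
BULLET = re.compile(r"^\s*[•\-\*]\s+")
ROMAN = re.compile(r"^\s*\(?([ivxlcdm]+)[\.\)]\s+", re.IGNORECASE)


def classify_marker(line):
    """Return "bullet", "roman" or None for the marker that opens a line."""
    if BULLET.match(line):
        return "bullet"
    if ROMAN.match(line):
        return "roman"
    return None


def flag_line(line):
    """Report the leading marker and any trigger phrases found in a line."""
    lowered = line.lower()
    phrases = [p for p in TRIGGER_PHRASES if p in lowered]
    return {"marker": classify_marker(line), "phrases": phrases}
```

Because marker styles vary by state contract, the patterns would likely need to be maintained per contract rather than shared globally.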

The teams then took a deep dive into the syntactical and semantic language to build a dictionary of about 450 critical search terms for the algorithm to zoom in on. The keywords are primarily related to formal obligations and requirements. Words like “required,” “shall” and “must” denote obligation, while “within” signals a quantitative requirement, as in “within 48 hours.” Verbs such as “comply,” “fulfill” and “reside” figured prominently in the NLP application dictionaries and libraries built specifically for the MCO’s state contracts.
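A small Python sketch shows how such a dictionary might be applied to a single sentence. Only the six example words and the “within 48 hours” pattern come from the text; the set names, the list of time units and the `analyze` helper are assumptions made for illustration, and the matching is exact-word only, with no stemming.

```python
import re

# Example terms taken from the case study; the full dictionary held about 450.
OBLIGATION_TERMS = {"required", "shall", "must"}
ACTION_VERBS = {"comply", "fulfill", "reside"}

# "within <number> <unit>"; the unit list is an assumption.
WITHIN = re.compile(
    r"\bwithin\s+(\d+)\s+(seconds?|minutes?|hours?|days?)\b", re.IGNORECASE
)
WORD = re.compile(r"[a-z]+")


def analyze(sentence):
    """Find obligation terms, action verbs and any "within N units" deadline."""
    words = set(WORD.findall(sentence.lower()))
    match = WITHIN.search(sentence)
    deadline = (int(match.group(1)), match.group(2).lower()) if match else None
    return {
        "obligation": sorted(words & OBLIGATION_TERMS),
        "verbs": sorted(words & ACTION_VERBS),
        "deadline": deadline,
    }


result = analyze("The contractor shall comply with the request within 48 hours.")
```

Here `result` holds the obligation term “shall”, the verb “comply” and a deadline of 48 hours.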

The NLP algorithm took about four months to develop and fine-tune, through a staged process that increased its accuracy from 50% to 84% to nearly 100%, with repeated testing, additions and clarifications to the searchable library. In total, some 170,000 compliance requirements were identified and classified by the words and characters that describe them in the state Medicaid contracts.
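The case study does not say how accuracy was measured. One common approach, sketched below in Python as an assumption, is to compare the algorithm’s output with a hand-labeled sample after each round of additions to the term library. The sentences, labels and the `make_rule` helper are hypothetical, and the toy figures say nothing about the actual engagement.

```python
def accuracy(predict, labeled):
    """Share of labeled sentences for which predict() matches the human label."""
    correct = sum(1 for text, label in labeled if predict(text) == label)
    return correct / len(labeled)


# Hypothetical hand-labeled sample: True means "this is a requirement".
LABELED = [
    ("The plan shall respond within 48 hours.", True),
    ("The plan must notify the state.", True),
    ("The contractor is required to report monthly.", True),
    ("This section describes the program.", False),
]


def make_rule(terms):
    """Build a classifier that flags a sentence containing any listed term."""
    def predict(text):
        lowered = text.lower()
        return any(term in lowered for term in terms)
    return predict


first_pass = make_rule(["shall"])
second_pass = make_rule(["shall", "must", "required"])

first_score = accuracy(first_pass, LABELED)
second_score = accuracy(second_pass, LABELED)
```

Rerunning the same labeled sample after every change makes it easy to see whether an addition to the library helped or hurt.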

“Unstructured data is the largest type of data in organizations today, yet it remains largely untapped for insights and can be an ongoing burden to manage,” says Traci Gusher, EY Americas Data and Analytics Leader. “Application of modern artificial intelligence changes our ability to rapidly use this impactful type of information and drive value from it.”

The better the world works

Better compliance analysis on its own terms

A new database reduces manual search time from about 250 hours to 15 minutes.


The NLP-driven database developed and trained by EY teams allows the managed care organization’s compliance analysts to access, search and review designated Medicaid contracts quickly in one centralized source.

Those responsible for the MCO’s contract compliance across multiple states and requirements can now open a dashboard and search highly differentiated Medicaid contracts by fields, including state, function, requirement, compliance terms and keywords. The new database has decreased compliance analysts’ search time from about 250 hours to 15 minutes across multiple state contracts, greatly reducing the potential for instances of non-compliance and related penalties.

The new NLP-enabled dashboard adds value

Months to develop the database: about four

Medicaid patients better served: 10+ million

Manual search time reduced from about 250 hours to 15 minutes

Search terms defined in NLP algorithm: about 450

Compliance term search accuracy grew from 50% to nearly 100%

The user-friendly dashboard also reveals state contract differentiation across capabilities. For example, if one state has a call center response time of 60 seconds, and another has a call center response time of 30 seconds, analysts can note and highlight the information for relevant business stakeholders in the organization.
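The call center comparison can be sketched in a few lines of Python. The state names and the `differences` helper are hypothetical; only the 60-second and 30-second figures come from the example above.

```python
# Hypothetical extracted values, in seconds, keyed by state.
RESPONSE_TIME_SECONDS = {"State A": 60, "State B": 30}


def differences(values):
    """Return the strictest state and each state's gap from it, in seconds."""
    strictest = min(values, key=values.get)
    baseline = values[strictest]
    gaps = {state: secs - baseline for state, secs in values.items()}
    return strictest, gaps


strictest_state, gaps = differences(RESPONSE_TIME_SECONDS)
```

In this toy case State B sets the strictest target, and State A allows 30 seconds more.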

The 200-plus hours saved from manual searches, along with new insights, equip the managed care company’s analysts to boost business understanding and more effectively compete for state Medicaid contracts.

Over the longer term, this capability helps Medicaid patients receive a critical benefit that is already funded but challenging to disburse. For millions of pregnant mothers and young children around the US, daily struggles are made easier when improved MCO compliance makes fundamental health care more accessible.

“EY teams are using leading-edge, AI-enabled technologies to drive innovations forward,” says Sezin Palmer, Managing Director, Consulting, Data and Analytics, Ernst & Young LLP. “Our teams unlock the power of data to infuse health care organizations with strategies and solutions that redefine value.”
