Expert’s Opinion

Solving Compliance Challenges and Beyond with Natural Language Processing

Understanding new data sources and elevating compliance to drive innovation.

By: Updesh Dosanjh

Practice Leader, Pharmacovigilance Technology Solutions, IQVIA

Currently, a mere 15 percent of drugs successfully make it from clinical trials to FDA approval — with nearly 75 percent of failures attributed to safety and efficacy concerns. Once drugs do make it to market, keeping treatments there requires increasingly in-depth pharmacovigilance practices.

With 80 percent of health care data currently residing in unstructured formats, adverse event monitoring and reporting has long been a challenge for the industry. The challenge is compounded as unstructured data sources grow and monitoring must account for new kinds of natural language, such as slang and the informal descriptions of drug side effects that patients provide. Patient-reported outcomes are being shared through a multitude of unstructured channels, including patient discussion boards and online forums, doctors’ notes captured in electronic health records, phone conversations with medical call centers, social media platforms and more. As a result, the industry requires digital transformation to capture this information.

Natural Language Processing (NLP) is an Artificial Intelligence (AI) technology that can mine and “read” unstructured text-based documents, extract the key information and convert it into structured information that can be analyzed by a computer. It is one tool that pharmaceutical companies are increasingly adopting as part of their digital transformation initiatives. NLP is not a new technology for the industry: almost all of the top 20 pharmaceutical companies in the world leverage NLP in some way. NLP has already seen strong use cases in recent years from bench to bedside. These include helping to identify gene-disease associations through the rapid review of literature landscapes in early discovery phases, as well as patient identification for clinical trials via data mining from electronic medical records and previous clinical studies. Still, NLP has significant untapped potential in other areas of the industry. Here are three key challenges it can help address in the area of safety and regulatory compliance.

Meeting Evolving Compliance Demand
Pharmaceutical professionals working in compliance have traditionally been the most reluctant to adopt technology; however, new regulations and expectations are forcing a shift. Standards from regulatory bodies around the world increasingly demand more holistic and timely safety reporting. With this change, real-time processing becomes critical to maintaining compliance.

NLP can solve the issue of timely reporting while handling increased complexity of data. It does so by combining and comparing adverse events from decades of static legacy data (such as previously published medical literature) with new incoming patient data, which can be captured and processed in near real time.

Today, NLP is capable of standardizing and reporting potential adverse events with a high degree of accuracy. In one example, rare disease biopharma company CSL Behring doubled its accurate auto-coding of adverse events to Medical Dictionary for Regulatory Activities (MedDRA) terms from 30 percent (with the simple use of verbatim text-matching) to more than 60 percent with NLP technology.
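To make the difference between verbatim text-matching and a more language-aware approach concrete, here is a minimal Python sketch. It is a toy illustration only, not the method used by CSL Behring or any vendor: the term list, the informal phrasings, the sample reports and all function names are hypothetical and are not taken from MedDRA.

```python
import re

# Hypothetical, hand-made term list for illustration only; not taken from MedDRA.
PREFERRED_TERMS = {"headache", "nausea", "dizziness"}

# Hypothetical informal phrasings mapped to the toy terms above (assumption).
SYNONYMS = {
    "head is pounding": "headache",
    "felt sick to my stomach": "nausea",
    "room was spinning": "dizziness",
}


def verbatim_code(text):
    """Code a report only if the whole text equals a known term exactly."""
    cleaned = text.strip().lower()
    return cleaned if cleaned in PREFERRED_TERMS else None


def normalized_code(text):
    """Lowercase, strip punctuation, then look for known terms or phrasings."""
    cleaned = re.sub(r"[^a-z\s]", " ", text.lower())
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    # First try to find a known term anywhere in the report.
    for term in sorted(PREFERRED_TERMS):
        if re.search(rf"\b{term}\b", cleaned):
            return term
    # Then fall back to the informal phrasings.
    for phrase, term in SYNONYMS.items():
        if phrase in cleaned:
            return term
    return None


def coding_rate(reports, coder):
    """Fraction of reports for which the coder returned a term."""
    coded = sum(1 for report in reports if coder(report) is not None)
    return coded / len(reports)


# Made-up sample reports (assumption), written the way a patient might phrase them.
REPORTS = [
    "Headache",
    "my head is pounding since the second dose",
    "Felt sick to my stomach all morning!",
    "no complaints",
]

print(coding_rate(REPORTS, verbatim_code))
print(coding_rate(REPORTS, normalized_code))
```

On these made-up reports, exact matching codes only the first one, while the normalized lookup also picks up the two informal descriptions. A lookup this simple still ignores context such as negation, which is one reason production systems rely on fuller NLP pipelines.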

Understanding New Data Sources
The biggest concern that arises with the availability of so much data in unstructured formats is the possibility that something could be missed. The average patient does not always knowingly report an adverse event, nor do they communicate it in absolutely certain terms. The intricacies of natural language, including slang and exaggeration, must also be taken into consideration, particularly in social media environments and similar online repositories. In this case, NLP becomes a critical tool for contextualizing information beyond a simple keyword search.
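As a small illustration of what "beyond a simple keyword search" can mean, the Python sketch below compares a plain keyword match with a check that also looks for a negation cue in the few words before the keyword. The cue list, the window size and the function names are assumptions chosen for this example; real NLP systems handle context in far more sophisticated ways.

```python
import re

# Hypothetical cue words that negate a following symptom mention (assumption).
NEGATION_CUES = ("no", "not", "never", "without", "didn't", "don't")


def keyword_hit(text, keyword):
    """Plain keyword search: True if the word appears anywhere in the text."""
    return re.search(rf"\b{re.escape(keyword)}\b", text.lower()) is not None


def contextual_hit(text, keyword, window=3):
    """True only if the keyword appears with no negation cue among the
    `window` tokens immediately before it. `keyword` must be one lowercase word."""
    tokens = re.findall(r"[a-z']+", text.lower())
    for i, token in enumerate(tokens):
        if token == keyword:
            preceding = tokens[max(0, i - window):i]
            if not any(cue in preceding for cue in NEGATION_CUES):
                return True
    return False


# Made-up example posts (assumption).
negated = "I did not get a rash this time"
affirmed = "Broke out in a rash after two days"

print(keyword_hit(negated, "rash"), contextual_hit(negated, "rash"))
print(keyword_hit(affirmed, "rash"), contextual_hit(affirmed, "rash"))
```

The keyword search flags both posts, while the context-aware check flags only the second, where the patient actually reports the rash.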

NLP becomes still more important as we think about the trajectory of emerging or entirely new data sources. For example, the number of connected wearable devices worldwide is expected to grow to more than 1.1 billion in 2022. Data from these devices can be expected to grow in tandem as technology improves its ability to measure biometrics as well as process natural language via built-in voice assistants. With more available data sources, the quality of manual data processing will inevitably degrade. Thus, NLP will become essential for continuously making sense of these growing critical data sources.

Elevating Compliance to Drive Innovation
Compliance has long been viewed as a cost center for pharmaceutical companies. However, the industry is waking up to how automation via NLP and other technologies can truly evolve the entire function of safety and regulatory departments. Freeing up resources from manual reporting challenges enables companies to reinvest in activities that drive true business value, with the ability to analyze the depth and breadth of previously untapped data to fuel future research and development activities.

Thinking past regulatory and safety compliance, understanding natural-language data sources will provide a competitive edge for biopharmaceutical companies seeking new opportunities for clinical development. The same channels that detect adverse events may lead to the discovery of entirely new indications for their product pipeline or a broader need for future treatments. This will prove to be a critical differentiator, both for the industry at large and particularly for companies driving toward proofs of concept as we enter the era of precision medicine.


Jane Reed is head of life science strategy at Linguamatics, an IQVIA company. She is responsible for developing the strategic vision for Linguamatics’ growing product portfolio and business development in the life science domain. Jane has extensive experience in life science informatics. She worked for more than 15 years in vendor companies supplying data products, data integration and analysis, and consultancy to pharma and biotech—with roles at Instem, BioWisdom, Incyte, and Hexagen. Before moving into the life science industry, Jane worked in academia with post-docs in genetics and genomics.

As Practice Leader for the Technology Solutions business unit of IQVIA, Updesh Dosanjh is responsible for developing the overarching strategy regarding Artificial Intelligence and Machine Learning as it relates to safety and pharmacovigilance. He is focused on the adoption of these innovative technologies and processes that will help optimize pharmacovigilance activities for better, faster results. Dosanjh has over 25 years of knowledge and experience in the management, development, implementation, and operation of processes and systems within the life sciences and other industries.
