Online Exclusives

Leveraging AI to Advance Discovery, Development and Delivery

Sanofi leveraged the Linguamatics NLP AI text-mining software to identify links between specific genes, diseases, and drug sensitivity

By: Kristin Brooks

Managing Editor, Contract Pharma

To advance the discovery, development and delivery of multiple sclerosis (MS) drugs, global pharma company Sanofi wanted to identify links between specific genes, diseases, and drug sensitivity. A key requirement for any drug development project – and increasingly in precision/personalized medicine and pharmacogenomics – is a comprehensive understanding of the genetic associations for the disease of interest.

For a specific MS biomarker project, Sanofi needed to explore a broad and comprehensive knowledge base to identify potential new biomarkers. Sanofi leveraged the Linguamatics AI-based natural language processing (NLP) text-mining software to process an extensive collection of literature sources. The analysis identified all 22 previously published autoimmune diseases and drug sensitivities associated with HLA alleles and haplotypes, along with 33 additional, previously unpublished disease and drug sensitivity associations. The observations were fed into a searchable knowledge base for broad use within the Sanofi team in its search for novel biomarkers.

Sanofi is leveraging NLP and text analytics in other areas of R&D including target identification and prioritization, drug repurposing, interpretation of genes/proteins identified by ‘omics experiments, and full patent text mining for new targets. Beyond R&D, Sanofi is also using text mining along the bench-to-bedside pipeline in areas such as clinical trial site selection and study design, opportunity scouting, pharmacovigilance, competitive intelligence, and social media analysis.

Dongyu Liu, Associate Director of Translational Sciences at Sanofi, discusses the use of AI across drug development, current obstacles, and opportunities. –KB
 
Contract Pharma: What aspects of the drug development continuum have the most potential to benefit from NLP AI?

Dongyu Liu: NLP AI technology has been used along the whole drug development continuum, but I’m more in the early research part. We have been using NLP AI (specifically the Linguamatics I2E platform) for disease mapping, mutation analysis, target identification and prioritization, and biomarker discovery. My colleagues have also applied NLP to the later part of drug development for such things as pharmacovigilance, opportunity scouting and competitive intelligence. Other pharma companies also use NLP AI in clinical trials for trial site selection and study design. So basically, it’s being used across all the different phases of the drug development continuum.

I think the most potential benefit from NLP AI probably is for Real World Data. All pharma companies are heavily invested in digital now, including Sanofi, and have become more data-driven and heavily invested in Real World Evidence. They’re all using electronic health records, insurance claims, patient surveys, and other data to discover, develop, and deliver insights from structured and unstructured text. We want to understand how drugs work outside the clinical trial environment and demonstrate value so we can improve outcomes. By applying NLP AI to all these sources, which are mostly stored as unstructured data, we can extract information and transform the data from an unstructured to structured format. We can then apply other machine learning or AI operations to get even more value from the information.

CP: What are some of the current obstacles to employing NLP AI in the pharmaceutical industry? How can they be overcome?
 
DL: One of the biggest obstacles is the result of the high expectations people have for these AI technologies. Many people don’t quite understand NLP and think that if you can extract all this information it will solve all the problems. We are immersed in several technologies that aren’t cheap. We need to be able to show the value of NLP to different research projects and get upper-level management buy-in to continue to invest in further development. One way to overcome these obstacles is to start small, address specific questions and build from there.
 
CP: Where do you see the most opportunities for NLP AI in drug development?
 
DL: I think that NLP will help with Real World Evidence. Pharma companies are heavily implementing these digital strategies, and Real World Evidence has the potential to validate the therapeutic value of pharmaceutical products and help to customize product development in a more patient-centric manner. Using NLP AI to extract information from unstructured data is valuable because it provides more Real World Evidence. Real World Evidence holds great promise for precision medicine because the information extracted from EHRs and other sources will help us gain insight into more effective research directions in drug development.
 
CP: How is NLP AI currently being leveraged in drug development and what can we expect in the future?
 
DL: Currently NLP AI is widely used to transform unstructured or semi-structured data from literature, reports, health claims and EHRs into structured knowledge. This helps in different processes throughout drug development. I think in the future we can expect the application of more machine learning algorithms to the extracted data and more integration with other AI technologies. This will allow us to advance our findings and make even better use of the data we extract.


Dongyu Liu is associate director in the science computing group of the Translational Sciences department at Sanofi. Dongyu’s research interests include bioinformatics, data mining and text mining. He plays a key role in bringing in and employing text mining technology to support ongoing research projects at Sanofi. He received a Ph.D. from the University of Rochester and did postdoctoral research at the Whitehead Institute.
