An Industrial Approach to Using Artificial Intelligence and Natural Language Processing for Accelerated Document Preparation in Drug Development
Journal of Pharmaceutical Innovation (IF 2.7), Pub Date: 2020-05-15, DOI: 10.1007/s12247-020-09449-x
Shekhar Viswanath, Jared W. Fennell, Kalpesh Balar, Praful Krishna

Purpose

Because scientific regulatory reporting demands exceptionally high standards of accuracy and data integrity, any tool that aims to streamline the process must gather data at least as efficiently as a team of scientists, without costing more in time or resources. For this reason, an artificial intelligence-based tool with parallel search, document creation, and data integrity review capabilities is being investigated as a potential solution. This paper describes a proof-of-concept project to develop an AI-based tool that rapidly assembles an end-of-phase 2 (EOP2) briefing document for a potential medicine. We call the tool the Intelligent Machine for Document Preparation (IMDP).

Methods

A training corpus of approximately 65,000 PDF documents, derived from electronic lab notebooks and technical reports related to five molecules (including Merestinib), was ingested, and prior EOP2 documents from the remaining four molecules were used to generate training questions and answers. Then, an annotation-light natural language processing algorithm analyzed a set of structured and unstructured data regarding Merestinib. A simple user interface was created that allows scientists to query the system in natural language, and table-builder, image/plot-finder, and free-text addition features were added to allow advanced search without dependence on keywords.

Results

Three significant innovations were designed in to improve overall performance relative to our benchmark solution without sacrificing usability. First, the AI-based IMDP was built to improve accuracy and accelerate document creation with a remarkably small amount of training. Second, image search capability was added to enrich the knowledge base. Third, the IMDP was integrated with the existing process rather than adding a step to the workflow. Finally, accuracy and total document creation time were compared with those of the existing (benchmark) tool. Our experiments show that the AI-based technology reached 89% accuracy, surpassing the internal benchmark of 54%, and retrieved the right information 3.6 times faster.
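
To put the reported figures in perspective, the short sketch below derives the absolute and relative accuracy gain and the implied retrieval time from the numbers stated above. The function name `compare` and the dictionary keys are hypothetical, and it assumes that "3.6 times faster" means the retrieval time is 1/3.6 of the benchmark's; the abstract does not spell out how the speedup was measured.

```python
# Figures as reported in the abstract.
IMDP_ACCURACY = 0.89       # AI-based IMDP
BENCHMARK_ACCURACY = 0.54  # existing internal benchmark tool
SPEEDUP = 3.6              # reported retrieval speedup factor


def compare(imdp_acc: float, bench_acc: float, speedup: float) -> dict:
    """Derive simple comparison figures (illustrative helper, not from the paper)."""
    return {
        # Absolute gain in percentage points.
        "gain_points": round((imdp_acc - bench_acc) * 100, 1),
        # Ratio of IMDP accuracy to benchmark accuracy.
        "relative_gain": round(imdp_acc / bench_acc, 2),
        # Assumption: "N times faster" means time = benchmark time / N.
        "time_fraction": round(1 / speedup, 3),
    }


if __name__ == "__main__":
    for key, value in compare(IMDP_ACCURACY, BENCHMARK_ACCURACY, SPEEDUP).items():
        print(f"{key}: {value}")
```

Under that reading, the IMDP gains 35 percentage points of accuracy over the benchmark (about 1.65 times the benchmark's accuracy) and needs roughly 28% of the benchmark's retrieval time.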

Conclusions

The main contribution of this study is to show the value of artificial intelligence-based tools in accelerating all major stages of regulatory report creation while allowing a team of scientists to seamlessly collaborate.




Updated: 2020-05-15