Affiliations: [1] Information Center, Xuanwu Hospital, Capital Medical University, Beijing 100053, China; [2] Ministry of Education Key Laboratory of Bioinformatics, Tsinghua University, Beijing 100084, China; [3] Bioinformatics Division, Center for Synthetic and Systems Biology, Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China; [4] Department of Automation, Tsinghua University, Beijing 100084, China; [5] Beijing INFI-SAGACITY Technology Company, Ltd., Beijing 100086, China
Ultrasound reports in Chinese hospitals are currently written mostly in free-text format. Important clinical information, such as stenosis rate and plaque location, is recorded in long sentences, especially in ultrasound reports of cerebrovascular diseases. Because they lack structure and standardization, these reports cannot be used directly for further automatic analysis. The goal of this paper is to assess the feasibility of applying natural language processing technology to automatically extract disease entities and related information, such as the stenosis rate and plaque location, from free-text ultrasound reports of cerebrovascular diseases. A structured model using conditional random fields (CRFs) is first constructed. Then, clause optimization and segmentation are performed at the word level to achieve data structuring. Seven categories of terms, including symptoms, plaque locations, diseases, and degree, were manually annotated in 1980 de-identified ultrasound reports to form a training dataset. With this model, 7937 ultrasound reports were automatically processed into structured data within 40 min. The true positive rates of the model for the seven categories of terms are 96%, 94%, 97%, 100%, 100%, 100%, and 97%, respectively. The CRF model can be used in Chinese natural language processing to support unstructured data analysis. Standardized segmentation results can be obtained based on medical ontology libraries. However, real-time processing and scientific annotation remain challenges if intelligent clinical decision making is to be applied in a real-world clinical environment.
Funding:
Capital Medical University Research and Development Fund [PYZ2018125]; National Natural Science Foundation of China [NSFC U1736210]; National Key Research and Development Program of China [SQ2018YFC090002]; Tsinghua-Fuzhou Institute for Data Technology [TFIDT2018004]
Language:
Foreign language
Times cited:
WOS:
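Chinese Academy of Sciences (CAS) journal ranking:
Edition for year of publication [2018]:
Major category | Tier 2, Engineering and Technology
Subcategory | Tier 2, Computer Science: Information Systems; Tier 3, Engineering: Electrical and Electronic; Tier 3, Telecommunications
Latest edition [2023]:
Major category | Tier 3, Computer Science
Subcategory | Tier 3, Engineering: Electrical and Electronic; Tier 4, Computer Science: Information Systems; Tier 4, Telecommunications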
JCR quartile:
Edition for year of publication [2017]:
Q1 ENGINEERING, ELECTRICAL & ELECTRONIC; Q1 TELECOMMUNICATIONS; Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS
Latest edition [2023]:
Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS; Q2 ENGINEERING, ELECTRICAL & ELECTRONIC; Q2 TELECOMMUNICATIONS
First author affiliation: [1] Information Center, Xuanwu Hospital, Capital Medical University, Beijing 100053, China
Co-first authors:
Corresponding author:
Corresponding author affiliations: [1] Information Center, Xuanwu Hospital, Capital Medical University, Beijing 100053, China; [2] Ministry of Education Key Laboratory of Bioinformatics, Tsinghua University, Beijing 100084, China; [3] Bioinformatics Division, Center for Synthetic and Systems Biology, Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China; [4] Department of Automation, Tsinghua University, Beijing 100084, China
Recommended citation (GB/T 7714):
Chen Pengyu, Liu Qiao, Wei Lan, et al. Automatically Structuring on Chinese Ultrasound Report of Cerebrovascular Diseases via Natural Language Processing[J]. IEEE ACCESS, 2019, 7: 89043-89050. doi:10.1109/ACCESS.2019.2923221.
APA:
Chen, Pengyu, Liu, Qiao, Wei, Lan, Zhao, Beier, Jia, Yin ... & Fei, Xiaolu. (2019). Automatically Structuring on Chinese Ultrasound Report of Cerebrovascular Diseases via Natural Language Processing. IEEE ACCESS, 7,
MLA:
Chen, Pengyu, et al. "Automatically Structuring on Chinese Ultrasound Report of Cerebrovascular Diseases via Natural Language Processing." IEEE ACCESS 7 (2019): 89043-89050