Corpus Construction for Named Entities and Entity Relations on Chinese Electronic Medical Records

doi:10.13328/j.cnki.jos.004880

Volume 27, Issue 11, 2016, pp. 2725-2746

Abstract:

An electronic medical record (EMR) is a patient's individual medical record, written by health care providers and stored in digital format, that holds extensive medical knowledge and information about the patient's personal health condition. Constructing an annotated corpus of named entities and entity relations on EMRs is a primary and fundamental task for information extraction, which plays an important role in clinical decision support, the practice of evidence-based medicine, and other medical applications. Based on a survey of current research on corpus construction for named entities and entity relations on EMRs, this work proposes an annotation scheme for named entities and entity relations on Chinese electronic medical records (CEMRs), tailored to the characteristics of those records. Under the supervision of physicians, a complete and detailed annotation specification for CEMRs was formulated, and an annotated corpus with high agreement was constructed. The corpus comprises 992 medical text documents; the inter-annotator agreement (IAA) reaches 0.922 for named entity annotations and 0.895 for entity relation annotations. This work lays a substantial foundation for subsequent research on information extraction from CEMRs.
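
The abstract reports IAA figures but does not say how they were computed. As an illustration only, the sketch below shows one common way to score agreement between two annotators on entity annotations: exact-match F1 over (start, end, type) spans. The choice of F1, the function name `pairwise_f1`, and the entity type labels are assumptions made for this example, not details taken from the paper.

```python
from typing import Set, Tuple

# An entity annotation: (start offset, end offset, entity type).
# The type labels used below are hypothetical placeholders.
Entity = Tuple[int, int, str]


def pairwise_f1(a: Set[Entity], b: Set[Entity]) -> float:
    """Exact-match F1 agreement between two annotators' entity sets.

    Annotator A is treated as the reference; because F1 is symmetric,
    swapping A and B gives the same score.
    """
    if not a and not b:
        return 1.0  # both annotators marked nothing: full agreement
    matched = len(a & b)  # spans identical in offsets and type
    if matched == 0:
        return 0.0
    precision = matched / len(b)
    recall = matched / len(a)
    return 2 * precision * recall / (precision + recall)


annotator_a = {(0, 2, "symptom"), (5, 8, "disease"), (10, 12, "treatment")}
annotator_b = {(0, 2, "symptom"), (5, 8, "disease"),
               (10, 13, "treatment"), (15, 17, "test")}

# Two of the spans match exactly, so F1 = 2*2 / (3 + 4) = 4/7.
print(round(pairwise_f1(annotator_a, annotator_b), 3))  # 0.571
```

Under this scheme a boundary disagreement such as (10, 12) versus (10, 13) counts as a full miss; a more lenient variant could give partial credit for overlapping spans.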

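Citation:

Yang Jinfeng, Guan Yi, He Bin, Qu Chunyan, Yu Qiubin, Liu Yaxin, Zhao Yongjie. Corpus construction for named entities and entity relations on Chinese electronic medical records. Journal of Software (软件学报), 2016, 27(11): 2725-2746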

History

Received: December 3, 2014
Revised: June 24, 2015
Online: March 24, 2016
