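[Keywords]
[Abstract]
The stroke segment is the basic feature representation of Chinese characters. This paper proposes a segment extraction and merging algorithm based on polygon approximation and finite state machines. The algorithm first finds the inflection point of a stroke (the point whose interior angle is the smallest and below a specified threshold), then searches for inflection points on the curve segments on either side of it, and repeats this until no further inflection point can be found. Connecting, in order, the start and end points of all the curves in a stroke yields the segment sequence of that stroke. Finite state machines are then used to describe and determine the state of each segment, and on that basis to decide whether segments should be merged, so as to minimize redundant segments. Experiments show that the algorithm has low computational complexity and approximates strokes well, and that it meets the requirements of segment extraction and merging for handwritten Chinese characters.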
[Keywords]
[Abstract]
This paper presents a stroke-segment extraction and merging algorithm for on-line Chinese character recognition (OLCCR), based on polygon approximation and finite state machines. The method first detects the point with the smallest interior angle on a stroke. If that angle is less than a given threshold, the point is taken as a cut-off point (or inflection point) and the stroke is split there into two adjacent curves. The same step is then applied to each of the two curves to detect further cut-off points, and the procedure is repeated until the smallest interior angle in every curve is larger than the threshold. The cut-off points, together with the start and end points of the stroke, represent the stroke, and each pair of adjacent points defines a segment. After the segments have been extracted, finite state machines are used to check whether adjacent segments should be merged, so that redundant segments are reduced. Experiments show that this method has lower computational complexity and better approximation quality than other methods.
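The recursive splitting step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the interior angle at a point is assumed to be the angle between the rays from that point to the two end points of the current curve, the 150-degree default threshold is arbitrary, and the function names `interior_angle`, `cut_points`, and `extract_segments` are hypothetical. The finite-state-machine merging step is not shown, because the abstract does not specify its states or transitions.

```python
import math


def interior_angle(a, p, b):
    """Angle at p, in degrees, between the rays p->a and p->b (assumed definition)."""
    v1 = (a[0] - p[0], a[1] - p[1])
    v2 = (b[0] - p[0], b[1] - p[1])
    n1 = math.hypot(v1[0], v1[1])
    n2 = math.hypot(v2[0], v2[1])
    if n1 == 0 or n2 == 0:
        return 180.0  # assumption: a point coinciding with an end point is never a cut
    cos = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to [-1, 1] to guard against rounding error before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


def cut_points(points, lo, hi, threshold):
    """Indices of cut-off points strictly between lo and hi, in stroke order."""
    best, best_angle = None, threshold
    for i in range(lo + 1, hi):
        angle = interior_angle(points[lo], points[i], points[hi])
        if angle < best_angle:  # smallest angle that is also below the threshold
            best, best_angle = i, angle
    if best is None:
        return []  # no angle below the threshold: this curve is kept as one segment
    # Split at the cut-off point and repeat on both adjacent curves.
    return (cut_points(points, lo, best, threshold)
            + [best]
            + cut_points(points, best, hi, threshold))


def extract_segments(points, threshold=150.0):
    """Split one stroke (a list of (x, y) samples) into straight segments."""
    if len(points) < 2:
        return []
    last = len(points) - 1
    idx = [0] + cut_points(points, 0, last, threshold) + [last]
    # Every pair of adjacent points (start, cut-offs, end) forms a segment.
    return [(points[i], points[j]) for i, j in zip(idx, idx[1:])]
```

For example, an L-shaped stroke sampled at five points is split at its corner:

```python
stroke = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
print(extract_segments(stroke))
# [((0, 0), (2, 0)), ((2, 0), (2, 2))]
```

Each recursive call works on a strictly shorter index range, so the procedure terminates once no point in any curve has an interior angle below the threshold.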
[CLC number]
[Funding]