This paper builds a single unified model, with a corresponding learning algorithm, for several natural language processing tasks: part-of-speech tagging, chunking, named entity recognition, and semantic role labeling. The model abandons the traditional man-made input features, and rather than encoding large amounts of task-specific prior knowledge, it learns more fundamental features from large quantities of unlabeled training data.
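To make the idea concrete, here is a minimal sketch (not the paper's exact architecture) of a window-based tagger in which hand-crafted features are replaced by a learned word-embedding table feeding a small neural network. All names and sizes below are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch: score tags for each word from a window of learned
# word embeddings instead of hand-crafted features. Sizes are arbitrary.
rng = np.random.default_rng(0)

vocab = {"the": 0, "cat": 1, "sat": 2, "<pad>": 3}
n_tags = 3          # a tiny hypothetical tag set
emb_dim = 4         # embedding size (assumption)
win = 3             # window of words around the target position
hidden = 8          # hidden-layer size (assumption)

# Parameters: an embedding table plus a small MLP.
E = rng.normal(scale=0.1, size=(len(vocab), emb_dim))
W1 = rng.normal(scale=0.1, size=(win * emb_dim, hidden))
W2 = rng.normal(scale=0.1, size=(hidden, n_tags))

def tag_scores(window_ids):
    """Score each tag for the middle word of a window of word ids."""
    x = E[window_ids].reshape(-1)   # concatenate the window's embeddings
    h = np.tanh(x @ W1)             # hidden layer
    return h @ W2                   # unnormalized tag scores

sent = ["the", "cat", "sat"]
ids = [vocab["<pad>"]] + [vocab[w] for w in sent] + [vocab["<pad>"]]
for i, w in enumerate(sent):
    scores = tag_scores(ids[i:i + win])
    print(w, int(np.argmax(scores)))
```

In a real system the embeddings would be trained (e.g. on unlabeled text) rather than left random; the point here is only the shape of the pipeline: word ids → embeddings → learned representation → tag scores.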
Imagine that we could effectively represent a piece of text as some data structure (even though there is no consensus on what form that structure should take); it would then be relatively easy to extract a more concise representation from it. "These representations can also be motivated by our belief that they capture something more general about language." Such representations can better encode syntactic and semantic information. Although the earlier ad-hoc representations perform reasonably well, they contribute little additional information toward this goal.
"Text corpora have been manually annotated with such data structures in order to compare the performance of various systems. The availability of standard benchmarks has stimulated research in NLP and effective systems have been designed for all these tasks." The method in this paper is built on top of these tasks. Why not target a single task? The paper puts it this way: "Instead we use a single learning system able to discover adequate internal representations. In fact we view the benchmarks as indirect measurements of the relevance of the internal representations, and we posit that these intermediate representations are more general than any of the benchmarks." And the final experiments show that the paper's method indeed achieves good results on all of these tasks.
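The multi-task idea quoted above can be sketched as a single shared internal representation with one small task-specific output layer per benchmark. This is a hypothetical illustration, assuming made-up sizes and tag counts, not the paper's exact setup.

```python
import numpy as np

# Hypothetical sketch: several tasks share one internal representation
# (an embedding table plus a shared hidden layer); each benchmark task
# only adds its own small scoring head. All sizes are assumptions.
rng = np.random.default_rng(1)

vocab_size, emb_dim, hidden = 10, 4, 6
E = rng.normal(scale=0.1, size=(vocab_size, emb_dim))  # shared embeddings
W_h = rng.normal(scale=0.1, size=(emb_dim, hidden))    # shared layer

# Task-specific heads, e.g. a 5-tag POS set and a 3-tag NER set.
heads = {
    "pos": rng.normal(scale=0.1, size=(hidden, 5)),
    "ner": rng.normal(scale=0.1, size=(hidden, 3)),
}

def predict(word_id, task):
    """The shared representation feeds every task's own scoring layer."""
    h = np.tanh(E[word_id] @ W_h)   # shared internal representation
    return h @ heads[task]          # task-specific scores

print(predict(2, "pos").shape)  # (5,)
print(predict(2, "ner").shape)  # (3,)
```

Training all heads against the same shared parameters is what lets each benchmark act as an "indirect measurement" of how good the shared representation is.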