Solving and Generating Chinese Character Riddles
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 846–855, Austin, Texas, November 1-5, 2016. © 2016 Association for Computational Linguistics
riddles always imply the radicals of characters.

We show an example of a Chinese character riddle in Figure 2. The riddle description is “千 里 会 千 金” and the riddle solution is “妈”. In this example, “千 里” (thousand kilometer) aligns with “马” (horse) because in Chinese culture it is said that a good horse can run thousands of kilometers per day. Furthermore, “千 金” (thousand gold) aligns with “女” (daughter) because of the analogy that a daughter is very important in the family. The final solution “妈” is composed of these two metaphors because the radical “女” meets (“会”) the radical “马”. Radicals can be derived not only from the meaning of metaphors, but also from the structure of characters. We will describe the alignments and rules in detail in Section 3.

[Figure 2 diagram: the riddle “千 里 会 千 金” (thousand kilometer meet thousand gold); “千 里” maps to “马” (horse), “千 金” maps to “女” (daughter), and the two radicals combine into “妈” (mother).]
Figure 2: An example of a Chinese character riddle. The solution “妈” is composed of the radical “女” derived from “千 金” and “马” derived from “千 里”.

In this paper, we propose a statistical framework to solve and generate Chinese character riddles. We show our pipeline in Figure 3. First, we learn the common alignments and the combination rules from a large set of riddle-solution pairs mined from the Web. The alignments and rules are used to identify the metaphors in the riddles. Second, in the solving phase, we use a dynamic programming algorithm based on the alignments and rules to find the candidate solutions. In the generation phase, we use a template-based method and a replacement-based method, both based on the decomposition of the character, to generate the candidate riddles. Finally, we employ Ranking SVM to rank the candidates in both the solving and the generation task. We conduct the evaluation on 2,000 riddles in the riddle solving task and 100 Chinese characters in the riddle generation task. Experimental results show that the proposed method outperforms baseline methods in the solving task. We also get very promising results in the generation task according to human judges.

2 Related Work

To the best of our knowledge, no previous work has studied Chinese riddles. For other languages, a few approaches concentrate on solving English riddles. Pepicello and Green (1984) describe the various strategies incorporated in riddles. De Palma and Weiner (1992) and Weiner and De Palma (1993) use a knowledge representation system to solve English riddles that consist of a single-sentence question followed by a single-sentence answer. They propose to build the relation between the phonemic representation and the associated lexical concepts. Binsted and Ritchie (1994) implement a program, JAPE, which generates riddles from humour-independent lexical entries, and evaluate the behaviour of the program with 120 children (Binsted et al., 1997). Olaosun and Faleye (2015) identify meaning construction strategies in selected English riddles on the web and account for the mental processes involved in their production, which shows that the meaning of a riddle is an imposed meaning that relates to the logical, experiential, linguistic, literary and intuitive judgments of the riddles. Besides, there are some studies in Yoruba (Akínyemí, 2015b; Akínyemí, 2015a; Magaji, 2014). All of these works focus on semantic meaning, which is different from Chinese character riddles, which focus on the structure of characters.

Another popular word game is the Crossword Puzzle (CP), which normally has the form of a square or rectangular grid of white and black shaded squares. The white squares on the border of the grid or adjacent to the black ones are associated with clues. Compared with our riddle task, the clues in CPs are derived from each question, whereas the radicals in the solution are derived from the metaphors in the riddles. Proverb (Littman et al., 2002) is the first system for the automatic resolution of CPs. Ernandes et al. (2005) use a web-search module to find sensible candidates for questions expressed in natural language and get the final answer by ranking the candidates. A rule-based module and a dictionary module are also mentioned in their work. Barlacchi et al. (2014) use a tree kernel to rerank the candidates for the automatic resolution of crossword puzzles.

From another perspective, there are a few projects
on Chinese language cultures, such as couplet generation and poem generation. A statistical machine translation (SMT) framework has been proposed to generate Chinese couplets and classic Chinese poetry (He et al., 2012; Zhou et al., 2009; Jiang and Zhou, 2008). Jiang and Zhou (2008) use a phrase-based SMT model with linguistic filters to generate Chinese couplets that satisfy couplet constraints, using both human judgments and BLEU scores for evaluation. Zhou et al. (2009) use the SMT model to generate quatrains, with a human evaluation. He et al. (2012) generate Chinese poems from given topic words by combining a statistical machine translation model with an ancient poetic phrase taxonomy. Following the approaches in the SMT framework, it is valid to regard the metaphors and their radicals as alignments. Several works use neural networks to generate Chinese poems (Zhang and Lapata, 2014; Yi et al., 2016). Due to the limited data and strict rules, it is hard to transfer them to riddle generation.

[Figure 3 diagram, three panels. Offline Learning: Riddle/Solution Pairs → Phrase-Radical Alignment and Rule Learning → Alignment Table, Rule Table. Riddle Solving: Riddle Description → Solution Candidate Generation → Solution Ranking → Solution. Riddle Generation: Solution (Chinese Character) → Riddle Candidate Generation → Riddle Ranking → Riddle Description.]
Figure 3: The pipeline of offline learning, riddle solving and riddle generation

3 Phrase-Radical Alignments and Rules

The metaphor is one of the key components in both solving and generation. On the one hand, we need to identify these metaphors, since each of them aligns with a radical in the final solution. On the other hand, we need to integrate these metaphors into the riddle descriptions to generate riddles. Thus, extracting the metaphors of riddles becomes a major challenge in our task. Below we introduce our method to extract the metaphors based on the phrase-radical alignments and rules.

We exploit the phrase-radical alignments to describe the simple metaphors, e.g. “千 里” aligns with “马”, where the phrase and the radical are aligned by meaning. We employ a statistical framework with a word alignment algorithm to automatically mine phrase-radical metaphors from the riddle dataset. Since an alignment is usually a match between successive words in the riddle and a radical in the solution, we propose two methods to extract alignments. The first method, in accordance with Och and Ney (2003), is as follows. Given a riddle description q and its solution s, we tokenize q into characters (w_1, w_2, ..., w_n) and decompose s into radicals (r_1, r_2, ..., r_m). We count all ([w_i, w_j], r_k) (i, j ∈ [1, n], k ∈ [1, m]) as alignments. The second method takes more structural information of characters into account. Let (w_1, w_2) denote two successive characters in the riddle q. If w_1 is a radical of w_2 and the remaining part of w_2, denoted r, appears in the solution s, we take ((w_1, w_2), r) as a strong alignment candidate. The same holds if w_2 is a radical of w_1. We count all alignments and filter out those that occur fewer than 3 times. Some high-frequency alignments are shown in Table 1. For example, “四方” (square) aligns with “口” (mouth) because of the similar shape, and “二十载” (two decades) aligns with “艹” (grass) because “艹” looks like two small “十”s.

Besides the alignments, which are represented as common collocations, there is another kind of common metaphor that concentrates on the structure of characters. We define 6 categories of rules, shown in Table 2, to identify this kind of metaphor. A
Bigram Alignment | Radical | Frequency | Trigram Alignment | Radical | Frequency
西湖 (west lake) | 氵 (water) | 77 | 二十载 (two decades) | 艹 (grass) | 21
四方 (square) | 口 (mouth) | 40 | 党中央 (center of party) | 口 (mouth) | 19
千里 (thousand kilometer) | 马 (horse) | 36 | 意中人 (sweetheart) | 日 (sun) | 16
Table 1: The high-frequency alignments
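The first extraction method of Section 3, which produces counts like those in Table 1, can be sketched roughly as follows. This is a minimal illustration rather than the authors' implementation: the function name `extract_alignments`, the `decompose` lookup table, the span limit `max_span=3` (the text does not bound the span length) and the toy riddle pairs are all assumptions made for the example. Only the frequency threshold of 3 comes from the text.

```python
from collections import Counter

def extract_alignments(pairs, decompose, max_span=3, min_count=3):
    """Count every (contiguous riddle span, solution radical) pair and
    keep the pairs that occur at least `min_count` times."""
    counts = Counter()
    for riddle, solution in pairs:
        chars = [ch for ch in riddle if not ch.isspace()]
        radicals = decompose.get(solution, [])
        for i in range(len(chars)):
            # spans [w_i, w_j] of length 1..max_span (max_span is an assumption)
            for j in range(i, min(i + max_span, len(chars))):
                span = "".join(chars[i:j + 1])
                for radical in radicals:
                    counts[(span, radical)] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}

# Hypothetical toy data: one riddle seen three times, another seen once.
decompose = {"妈": ["女", "马"], "口": ["口"]}
pairs = [("千 里 会 千 金", "妈")] * 3 + [("四方", "口")]
alignments = extract_alignments(pairs, decompose)
```

Exhaustive counting keeps many spurious pairs, such as (“会”, “马”), alongside the genuine ones. The frequency filter only removes pairs that are rare across the whole dataset, so a real system still depends on a large corpus to separate metaphors from noise.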
rule is often represented as an operation that applies to a character to obtain parts of it as radicals. For example, the character “上” (up) usually represents an operation that takes the upper radical of the corresponding character. We extract the rules from the phrase-radical alignments obtained above. In a phrase-radical alignment, if the radical appears as one part of a character in the phrase, we assume that the radical is derived from this character, which means the other words in the phrase may describe an operation on this character. We replace this character with a placeholder and generate a candidate rule whose direction is given by the position of the radical in the character. Thus, for each phrase-radical alignment ([w_1, w_n], r), we count (w_1, ..., w_{i−1}, (.), w_{i+1}, ..., w_n) as a potential rule only if r is a radical of w_i. We count all rules learned from the data and filter out those that occur fewer than 5 times. Some rules are shown in Table 2. The word or phrase in the rule “A-B” mostly has a meaning analogous to “removing”. The word or phrase in the rule “Half” mostly has a meaning analogous to “half”. As for the rules “LeftRemove”, “RightRemove”, “UpperRemove” and “LowerRemove”, there is usually a word or phrase that means “removing”, while the others indicate the “position” and “direction”.

We mine 14,090 phrase-radical alignments in total. More than 1,000 Chinese characters have at least one alignment, and there are 27 characters with more than 100 alignments. Almost all common radicals are contained in our alignment set. Chinese characters are mostly composed of these common radicals, so these alignments are enough for our task. We extract 193 rules in total across all categories, and all of them are applied to riddle solving and riddle generation.

4 Riddle Solving and Generation

4.1 Solving Chinese Character Riddles

The process of solving riddles has two components. First, we identify as many metaphors in the riddle as possible by matching the phrase-radical alignments and rules, and integrate these metaphors to obtain a candidate set of solutions. Each candidate carries, as its features, the parsing clues that indicate how and why it was generated. Second, we employ a ranking model to determine the best solution as output. Below we introduce our method to generate solution candidates, and we will
introduce the ranking model in Section 4.3.

It is common that two metaphors do not share a character and that a metaphor is composed of successive characters. Therefore, we use a dynamic programming algorithm based on the CYK algorithm (Kasami, 1965) to identify the metaphors with the help of the learned alignments and the predefined rules. We describe the algorithm in Algorithm 1.

Algorithm 1: Candidate generation for riddle solving
Input: Riddle q, Alignment, Rule
Output: Path[1, n]
Tokenize the input riddle q to w_1, w_2, ..., w_n;
for len ← 0 to n − 1 do
    for each (i, j) with j − i = len do
        if len = 0 then
            // a character can align with itself
            Path[i, j].Add([w_i, w_i] → w_i);
        else if [w_i, w_j] in Alignment then
            Obtain the corresponding radical r in Alignment;
            Path[i, j].Add([w_i, w_j] → r);
        else if [w_i, w_j] matches Rule then
            Run the predefined operation of the Rule, obtain radical r;
            Path[i, j].Add([w_i, w_j] → r);
        end
        foreach k in [i, j − 1] do
            Path[i, j].Add(Path[i, k] ⊕ Path[k + 1, j]);
        end
    end
end

An example to illustrate our algorithm is “上 岗 必 戴 安 全 帽”, where the corresponding solution is “密”. As shown in Figure 4, “上 岗” (on sentry) aligns with “山” by matching the rule “上 (up) (.)”, which means to take the upper part of the character “岗”. “必” and “戴” each align with themselves. The phrase “安 全 帽” (safety helmet) aligns with the radical “宀” through the alignments because of the analogous shape. Our ranking model will get the final solution “密” from these clues.

[Figure 4 diagram: parse chart over the riddle 上(1) 岗(2) 必(3) 戴(4) 安(5) 全(6) 帽(7) (on sentry must wear safety helmet). Path[1,2] → 山 (R); Path[3,3] → 必 (S); Path[4,4] → 戴 (S); Path[5,7] → 宀 (A); Path[1,7] → 宀, 必, 山 → 密 (F).]
Figure 4: The decoding process of “上 岗 必 戴 安 全 帽”. R: Path[1,2] records the clue that “上 岗” matches “山” by the rule. S: Path[3,3] records the clue that “必” matches itself and Path[4,4] records “戴”. A: Path[5,7] records the clue that “安 全 帽” matches “宀” by the alignment. F: We get a final solution candidate in Path[1,7] from the above clues. In this example, the character “戴” from Path[4,4] is irrelevant to the solution.

4.2 Generating Chinese Character Riddles

Two major components are required in the process of riddle generation. The first step is to generate a list of candidate riddle descriptions for a Chinese character as the solution. The second step is to rank the candidate riddle descriptions and select the top-N (e.g. 10) candidates as the output. Below we introduce our method to generate candidates of riddle descriptions, and we will introduce the ranking model in Section 4.3.

We propose two strategies to generate the candidate riddle descriptions for a given Chinese character, called the template-based method and the replacement-based method, respectively. First we show our template-based method. The most natural method is to connect the metaphor of each radical. For a character and one of its possible splittings RD = {rd_i}, we select a corresponding metaphor for each radical from the alignments or rules, and then we connect all metaphors without any conjunction words to form a riddle. A further method is to add a few conjunction words between the metaphors, which can make the riddle more coherent. We remove the recognized metaphors in riddle sentences,
Feature | Description
Correct Radical | number of radicals matched
Missing Radical | number of radicals not matched
Disappearing Radical | number of radicals that disappear in all characters of riddle descriptions
Single Matching | number of clues derived from the character itself
Alignment Matching | number of clues derived from alignments
Rule Matching | number of clues derived from rules
Length Rate | ratio of the length of clues
Frequency | prior probability of this character as a solution
Table 3: Features for riddle solving

Feature | Description
Riddle Length | length in characters of the candidate riddle
Riddle Relative Length | abs(Riddle Length − 5), because the length of common riddles is between 3 and 7
Number Radical | number of radicals that the character decomposes into
Avg Freq Character | average frequency of characters in the riddle
Max Freq Radical | maximum frequency of characters in the riddle
Number Alignment | number of alignments used for generating the candidate
Length Alignment | length of words from alignments
Number Rule | number of rules used for generating the candidate
Length Rule | length of words from rules
LM Score R | score of the language model trained on Chinese riddles, poems and couplets
LM Score G | score of the language model trained on web documents
Table 4: Features for riddle generation
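The candidate generation of Algorithm 1 (Section 4.1), whose clues feed the solving features in Table 3, can be sketched in Python roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the tiny `ALIGNMENT` table, the `UPPER_PART` lookup, the single hard-coded rule “上 (.)” in `apply_rules`, and the representation of a path as a tuple of radicals are all hypothetical. Indices are 0-based here, whereas the paper uses 1-based indices.

```python
from itertools import product

# Hypothetical toy resources for the example riddle only.
ALIGNMENT = {"安全帽": ["宀"]}   # phrase -> radicals it can imply
UPPER_PART = {"岗": "山"}        # upper radical of a character

def apply_rules(span):
    """Toy version of the rule '上 (.)': take the upper part of the
    character that follows 上. A real system would hold many rules."""
    if len(span) == 2 and span[0] == "上" and span[1] in UPPER_PART:
        return [UPPER_PART[span[1]]]
    return []

def generate_paths(riddle):
    """CYK-style chart: path[(i, j)] holds every tuple of radicals that
    can be derived from the characters i..j of the riddle."""
    w = [ch for ch in riddle if not ch.isspace()]
    n = len(w)
    path = {}
    for length in range(n):
        for i in range(n - length):
            j = i + length
            span = "".join(w[i:j + 1])
            cell = set()
            if length == 0:
                cell.add((w[i],))                 # a character aligns with itself
            elif span in ALIGNMENT:
                cell.update((r,) for r in ALIGNMENT[span])
            else:
                cell.update((r,) for r in apply_rules(span))
            for k in range(i, j):                 # combine adjacent sub-spans
                for left, right in product(path[(i, k)], path[(k + 1, j)]):
                    cell.add(left + right)
            path[(i, j)] = cell
    return path

chart = generate_paths("上 岗 必 戴 安 全 帽")
full_span = chart[(0, 6)]   # corresponds to Path[1,7] in Figure 4
```

Each tuple in `full_span` is one way of reading the riddle as a sequence of radicals. Turning a tuple into a solution character, and tolerating irrelevant characters such as “戴”, is left to the ranking step and is not shown here.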
and count the unigram and bigram frequencies of the remaining words. These words are usually common conjunctions. We sample these words based on the frequency distribution and add them into the riddles to connect the metaphor of each radical.

Second, we use an alternative replacement-based method to generate the candidate riddle descriptions. Instead of generating the riddle descriptions entirely from scratch, we replace part of an existing riddle to generate a new riddle description. Let w = (w_1, w_2, ..., w_n) denote the word sequence of a riddle description in our dataset, where n denotes the length of the riddle in characters. Let [w_i, w_j] (i, j ∈ [1, n]) denote the word span that can be aligned to a radical rd, and let X = (x_1, ..., x_m) denote the corresponding phrase descriptions of rd. We then replace [w_i, w_j] ∈ X with the other alternative phrase descriptions of rd in X. We try all the possible replacements to generate riddle candidates. This method can generate candidate riddles that are more natural and fluent.

4.3 Ranking Model

Above we introduced the algorithms to generate candidates for solving and generation, respectively. We then develop a ranking model to determine the final output. Below we show the ranking model.

The ranking score is calculated as

Score(c) = \sum_{i=1}^{m} \lambda_i \cdot g_i(c)    (1)

where c represents a candidate, g_i(c) represents the i-th feature in the ranking model, m represents the total number of features, and λ_i represents the weight of the i-th feature. The features of riddle solving and riddle generation are in Table 3 and Table 4, respectively. We use Ranking SVM (Joachims, 2006) (footnote 1) to train the model and obtain the feature weights.

Footnote 1: https://www.cs.cornell.edu/people/tj/svm_light/svm_rank.html

The weights of the features are trained with riddle-solution pairs. Specifically, in the riddle solving task, for the set of solution candidates, we take the original solution as the positive sample and the others as negative samples. After the dynamic programming algorithm produces a list of solution candidates, the training process tries to optimize the feature weights so that the ranking score of the original solution is greater than any of the ones from the
candidate list. In the riddle generation task, we select 100 characters on the basis of the frequency distribution of characters as solutions. For each character we use the riddle generation module to generate a list of riddle candidates. We label these candidates manually, so that better riddle descriptions get higher scores. Then the training process optimizes the feature weights.

5 Experimental Study

5.1 Dataset

We crawl 77,308 character riddles, each consisting of a riddle description and its solution, from the Web. All of these riddle-solution pairs concentrate on the structure of characters.

A stroke table, which contains the 3,755 characters encoded in the first level of GB2312-80, is provided to describe how a Chinese character is decomposed into its corresponding radicals. A character may have more than one splitting form, and a character is typically composed of no more than 3 radicals.

The data for training the riddle-style language model include two parts: one is the corpus of riddles mentioned above, and the other is a corpus of Chinese poems and Chinese couplets, chosen because of the similar language style. We follow the method proposed by He et al. (2012) and Zhou et al. (2009) to download the <Tang Poems>, <Song Poems>, <Ming Poems>, <Qing Poems>, <Tai Poems> from the Internet, and use the method proposed by Fan et al. (2007) to recursively mine those data with the help of some seed poems and couplets. It amounts to more than 3,500,000 sentences and 670,000 couplets. Besides the language model trained in riddle style, we also train a general language model on web documents.

5.2 Evaluation on Riddle Solving

We randomly select 2,000 riddles from the riddle dataset as the test data and 500 riddles as the development data, and use the rest as training data.

Our system always returns a ranked list of candidate solutions, so we use Acc@k (k = 1, 5, 10) as the evaluation metric. Acc@k is the fraction of questions that obtain the correct answer in their top-k results.

Giza++ (Och, 2001) is a common tool to extract the alignment between bilingual corpora. We use it as our baseline system that extracts the alignments automatically. We use the Jaccard similarity coefficient as the baseline ranking metric. The Jaccard similarity coefficient is defined as:

J(A, B) = \frac{|A \cap B|}{|A \cup B|}    (2)

where A is the radical set of the solution and B is the radical set of the candidate.

Feature Set | Acc@1 | Acc@5 | Acc@10
G | 10.3 | 12.0 | 13.6
G+A | 17.0 | 19.2 | 19.9
A | 18.7 | 22.7 | 24.2
G+A+R | 28.4 | 31.0 | 31.4
A+R | 28.8 | 31.8 | 32.1
Table 5: Results of evaluation on the test dataset with 2,000 riddles. G: the alignments from GIZA++. A: the alignments extracted following our method in Section 3. R: using the rules to identify the metaphors between the phrase and the radical following our method in Section 3. Our method (A+R) achieves better performance than the baseline methods from GIZA++.

Ranking Method | Acc@1 | Acc@5 | Acc@10
Jaccard Similarity | 26.2 | 30.2 | 31.2
Ranking SVM | 28.8 | 31.8 | 32.1
Table 6: Results of evaluation between ranking methods using the feature set (A+R). The Ranking SVM achieves better performance than the baseline metric based on the Jaccard similarity coefficient.

The results are reported in Table 5 and Table 6. The baseline method gives the correct solution for only about one-tenth of the riddles at Acc@1. Compared with the baseline model (G), adding the alignments extracted by our method (G+A) improves the system by 6.7 points at Acc@1 and 6.3 points at Acc@10. Notably, using only the alignments we extract (A) gives better results than combining them with the alignments from Giza++ (G+A), because the matching between phrases and characters is particular to our riddle task. Small changes in the phrase can affect the character that it implies, and a phrase may no longer be a metaphor if even one of its characters is changed. Furthermore, by using rules to identify the metaphors in riddles (A+R versus A), we get an improvement of 10.1 points at Acc@1, which proves the validity of the
Score | Criterion
5 | Elegant metaphors, totally coherent
4 | Correct metaphors, mostly coherent
3 | Acceptable metaphors, more or less coherent
2 | Tolerable metaphors, little coherent
1 | Wrong metaphors, incoherent
Table 7: The criterion of riddle evaluation

Method | Avg(Score)
Template-based Method | 3.49
Replacement-based Method | 4.14
Riddle from dataset | 4.38
Table 8: Human evaluation of different methods
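The two evaluation ingredients of Section 5.2, the Acc@k metric and the Jaccard similarity coefficient of Equation (2), are simple enough to sketch directly. The function names and the toy candidate lists below are illustrative assumptions, not part of the paper's system.

```python
def jaccard(a, b):
    """Jaccard similarity coefficient |A ∩ B| / |A ∪ B| of two radical sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)

def acc_at_k(ranked_lists, gold, k):
    """Fraction of riddles whose gold solution is among the top-k candidates."""
    hits = sum(1 for cands, g in zip(ranked_lists, gold) if g in cands[:k])
    return hits / len(gold)

# Radicals of the solution "密" versus radicals found in the example riddle.
similarity = jaccard({"宀", "必", "山"}, {"山", "必", "戴", "宀"})

# Two hypothetical riddles with ranked candidate lists and gold solutions.
ranked = [["密", "蜜"], ["好", "妈"]]
gold = ["密", "妈"]
acc1 = acc_at_k(ranked, gold, 1)
acc2 = acc_at_k(ranked, gold, 2)
```

In this toy case the irrelevant character “戴” lowers the Jaccard score to 0.75 even though every radical of the solution was found. This is the kind of situation in which extra features from the riddle description could help a learned ranker.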
rule we define. The results prove that it is valid to use the alignments and rules that we extract to identify the metaphors in our character riddle task. The comparison between the Jaccard similarity coefficient and our Ranking SVM method shows that the Ranking SVM is better, with an improvement of 2.6 points at Acc@1. This shows that, compared to the Jaccard similarity coefficient, the Ranking SVM determines the solution more accurately when we successfully identify all metaphors in the riddle descriptions. Moreover, the improvement is smaller beyond Acc@5, which suggests that the ranking model still gets better results even if the system cannot identify all metaphors in the riddle descriptions. We think that, unlike the Jaccard similarity coefficient, which only uses the features between the candidate character and the correct solution, the ranking model uses extra features from the riddle descriptions, e.g. the number of disappearing radicals, which helps to exclude obviously wrong candidates.

5.3 Evaluation on Riddle Generation

Because there is no previous work on Chinese riddle generation, we conduct human evaluations on this task to assess its soundness, for the following two reasons. Firstly, the generated riddles are varied, unlike the certain and unique solution in the riddle solving task, so it is hard to measure the quality of generated riddles with a well-defined answer set. Secondly, small differences in riddles have a great effect on the corresponding solution. A riddle may imply distinct radicals even if only one character in the metaphors is changed. Existing metrics such as BLEU are therefore not suitable for our task. Based on the above analysis, each riddle that the system generates is evaluated by human annotators according to the 5-point criterion described in Table 7.

We randomly sample 100 characters following the distribution of characters as solutions. The system generates riddle descriptions following the methods in Section 4.2 for each character. Sometimes the riddles we generate exist in our training data. We remove these riddles because we want to evaluate the ability to generate new riddles. In order to avoid annotator bias and to compare the riddles generated by the system with the riddles written by human beings, the riddles are randomly shuffled so that the annotators do not know the generating method of each riddle. For each character, we select 5 riddles generated by the template-based method, 5 riddles generated by the replacement-based method, and 2 riddles written by human beings from the riddle dataset, which form a set of 12 riddles in total. The annotators score each riddle according to the above criterion.

The result is shown in Table 8. The riddles written by human beings from the riddle dataset get a higher score than the riddles generated by the system. The riddles generated by the replacement-based method score considerably higher than those from the basic template-based method. We consider that the replacement-based method retains some human information, which makes the generated riddles more coherent.

Another result is that a riddle whose solution is a common character or is composed of common radicals gets a higher score, which indicates that we get better results when more alternative metaphors are available for a radical.

Below we show two examples of the riddle descriptions generated with the solution “思” (miss), which often decomposes into “田” (field) and “心” (heart), as shown in Figure 1(b).

• 三 星 伴 月 似 画 里 (Three stars with the moon, like in the picture): The radical “田” is the inside part of “画”. The shape of “心” is three points and a curved line, which looks like three stars around a crescent.

• 日 日 相 系 在 心 头 (Every day in my heart):
The radical “田” is composed of two “日”s, and “心” occurs in the riddle description. The character “头” (top) means the radical “田” is in the top position.

6 Conclusion

We introduce a novel approach to solving and generating Chinese character riddles. We extract alignments and rules to capture the metaphors between phrases in riddle descriptions and radicals in the solution characters. In total, we obtain 14,090 alignments that imply the metaphors between phrases and radicals, as well as 193 rules in 6 categories formed as regular expressions. To solve riddles, we use a dynamic programming algorithm to combine the identified metaphors based on the alignments and rules to obtain the candidate solutions. To generate riddles, we propose a template-based method and a replacement-based method to generate candidate riddle descriptions. We employ the Ranking SVM to rank the candidates in both riddle solving and generation. Our method outperforms baseline methods in the solving task. We also get promising results in the generation task by human evaluation.

Acknowledgments

The first author and the fourth author are supported by the National Natural Science Foundation of China (Grant No. 61421003).

References

Akíntúndé Akínyemí. 2015a. Riddles and metaphors: The creation of meaning. pages 37–87. Springer.
Akíntúndé Akínyemí. 2015b. Yorùbá riddles in performance: Content and context. In Orature and Yoruba Riddles, pages 11–35. Springer.
Gianni Barlacchi, Massimo Nicosia, and Alessandro Moschitti. 2014. Learning to rank answer candidates for automatic resolution of crossword puzzles. In CoNLL, pages 39–48.
Kim Binsted and Graeme Ritchie. 1994. An implemented model of punning riddles. Technical report, University of Edinburgh, Department of Artificial Intelligence.
Kim Binsted, Helen Pain, and Graeme Ritchie. 1997. Children’s evaluation of computer-generated punning riddles. Pragmatics & Cognition, 5(2):305–354.
Paul De Palma and E Judith Weiner. 1992. Riddles: accessibility and knowledge representation. In Proceedings of the 14th conference on Computational linguistics-Volume 4, pages 1121–1125. Association for Computational Linguistics.
Marco Ernandes, Giovanni Angelini, and Marco Gori. 2005. Webcrow: A web-based system for crossword solving. In AAAI, pages 1412–1417.
Cong Fan, Long Jiang, Ming Zhou, and Shi-Long Wang. 2007. Mining collective pair data from the web. In Machine Learning and Cybernetics, 2007 International Conference on, volume 7, pages 3997–4002. IEEE.
Jing He, Ming Zhou, and Long Jiang. 2012. Generating Chinese classical poems with statistical machine translation models. In Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, July 22-26, 2012, Toronto, Ontario, Canada.
Long Jiang and Ming Zhou. 2008. Generating Chinese couplets using a statistical MT approach. In Proceedings of the 22nd International Conference on Computational Linguistics-Volume 1, pages 377–384. Association for Computational Linguistics.
Thorsten Joachims. 2006. Training linear SVMs in linear time. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 217–226. ACM.
Tadao Kasami. 1965. An efficient recognition and syntax analysis algorithm for context-free languages. Technical report, DTIC Document.
Michael L Littman, Greg A Keim, and Noam Shazeer. 2002. A probabilistic approach to solving crossword puzzles. Artificial Intelligence, 134(1):23–55.
Maryam Yusuf Magaji. 2014. Morphology, syntax and functions of the Kilba folk riddles. International Journal on Studies in English Language and Literature (IJSELL).
Franz Josef Och and Hermann Ney. 2003. A systematic comparison of various statistical alignment models. Computational Linguistics, 29(1):19–51.
Franz Josef Och. 2001. Training of statistical translation models.
Ibrahim Esan Olaosun and James Oladunjoye Faleye. 2015. A cognitive semantic study of some English riddles and their answers in amidst a tangled web. Asian Journal of Social Sciences & Humanities Vol, 4:2.
William J Pepicello and Thomas A Green. 1984. Language of riddles: new perspectives. The Ohio State University Press.
E Judith Weiner and Paul De Palma. 1993. Some pragmatic features of lexical ambiguity and simple riddles. Language & Communication, 13(3):183–193.
Xiaoyuan Yi, Ruoyu Li, and Maosong Sun. 2016. Generating Chinese classical poems with RNN encoder-decoder. arXiv preprint arXiv:1604.01537.
Xingxing Zhang and Mirella Lapata. 2014. Chinese poetry generation with recurrent neural networks. In EMNLP, pages 670–680.
Ming Zhou, Long Jiang, and Jing He. 2009. Generating Chinese couplets and quatrain using a statistical approach. In PACLIC, pages 43–52.