Solving and Generating Chinese Character Riddles

This document proposes a statistical framework to solve and generate Chinese character riddles. It discusses how Chinese characters have unique structures like radicals that provide meaning and metaphors. It describes learning alignments between riddle phrases and character radicals, and combination rules. These are used to identify metaphors and solve riddles by combining metaphors. Template-based and replacement-based methods are used to generate candidate riddles. Ranking models rerank candidates in solving and generation. The framework outperforms baselines in solving, and gets promising results in generation according to human judges.


Chuanqi Tan†∗, Furu Wei‡, Li Dong+, Weifeng Lv†, Ming Zhou‡

† State Key Laboratory of Software Development Environment, Beihang University, China
‡ Microsoft Research Asia
+ University of Edinburgh

† [email protected] + [email protected]
‡ {fuwei, mingzhou}@microsoft.com † [email protected]

∗ The work was done when the first author and the third author were interns at Microsoft Research Asia.

Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 846–855, Austin, Texas, November 1-5, 2016. © 2016 Association for Computational Linguistics

Abstract

The Chinese character riddle is a riddle game in which the riddle solution is a single Chinese character. It is closely connected with the shape, pronunciation or meaning of Chinese characters. The riddle description (sentence) is usually composed of phrases with rich linguistic phenomena (such as pun, simile, and metaphor), which are associated with different parts (namely radicals) of the solution character. In this paper, we propose a statistical framework to solve and generate Chinese character riddles. Specifically, we learn the alignments and rules to identify the metaphors between phrases in riddles and radicals in characters. Then, in the solving phase, we utilize a dynamic programming method to combine the identified metaphors to obtain candidate solutions. In the riddle generation phase, we use a template-based method and a replacement-based method to obtain candidate riddle descriptions. We then use Ranking SVM to rerank the candidates both in the solving and generation process. Experimental results in the solving task show that the proposed method outperforms baseline methods. We also get very promising results in the generation task according to human judges.

1 Introduction

The riddle is regarded as one of the most unique and vital elements in traditional Chinese culture, and is usually composed of a riddle description and a corresponding solution. The character riddle is one of the most popular forms of riddle, in which the riddle solution is a single Chinese character. While English words are strings of letters, Chinese characters are composed of radicals that are associated with meaning or metaphor. In other words, Chinese characters are usually arranged in some common structures, such as upper-lower structure, left-right structure, and inside-outside structure, which means they can be decomposed into other characters or radicals. For example, “好” (good), a character with left-right structure, can be decomposed into “女” (daughter) and “子” (son). As illustrated in Figure 1(a), the left part of “好” is “女” and the right part is “子”. “女” and “子” are called the “radicals” of “好”. Figure 1(b) is another example, the character “思” (miss), with an upper-lower structure.

[Figure 1(a), Left-Right Structure: “好” (good) = “女” (daughter) + “子” (son). Figure 1(b), Upper-Lower Structure: “思” (miss) = “田” (field) over “心” (heart).]

Figure 1: Examples of the structure of Chinese characters

One of the most important characteristics of the character riddle lies in the structure of Chinese characters. Unlike common riddles, which imply the object in the riddle descriptions, character riddles pay more attention to structures such as the combination of radicals and the decomposition of characters. According to these characteristics, metaphors in the
riddles always imply the radicals of characters.

We show an example of a Chinese character riddle in Figure 2. The riddle description is “千里会千金” and the riddle solution is “妈”. In this example, “千里” (thousand kilometer) aligns with “马” (horse) because in Chinese culture it is said that a good horse can run thousands of kilometers per day. Furthermore, “千金” (thousand gold) aligns with “女” (daughter) because of the analogy that a daughter is very important in the family. The final solution “妈” is composed of these two metaphors because the radical “女” meets the radical “马”. Radicals can be derived not only from the meaning of metaphors, but also from the structure of characters. We will describe the alignments and rules in detail in Section 3.

[Figure 2 diagram: the riddle “千里会千金” (thousand kilometer / meet / thousand gold) yields the radicals “马” (horse) and “女” (daughter), which combine into “妈” (mother).]

Figure 2: An example of Chinese character riddle: The solution “妈” is composed of the radical “女” derived from “千金” and “马” derived from “千里”.

In this paper, we propose a statistical framework to solve and generate Chinese character riddles. We show our pipeline in Figure 3. First, we learn the common alignments and the combination rules from a large set of riddle-solution pairs mined from the Web. The alignments and rules are used to identify the metaphors in the riddles. Second, in the solving phase, we utilize a dynamic programming algorithm on the basis of the alignments and rules to figure out the candidate solutions. For the generating phase, we use a template-based method and a replacement-based method based on the decomposition of the character to generate the candidate riddles. Finally, we employ Ranking SVM to rank the candidates in both the solving and generation task. We conduct the evaluation on 2,000 riddles in the riddle solving task and 100 Chinese characters in the riddle generation task. Experimental results show that the proposed method outperforms baseline methods in the solving task. We also get very promising results in the generation task according to human judges.

2 Related Work

To the best of our knowledge, no previous work has studied Chinese riddles. For other languages, a few approaches concentrate on solving English riddles. Pepicello and Green (1984) describe the various strategies incorporated in riddles. De Palma and Weiner (1992) and Weiner and De Palma (1993) use a knowledge representation system to solve English riddles that consist of a single sentence question followed by a single sentence answer. They propose to build the relation between the phonemic representation and their associated lexical concepts. Binsted and Ritchie (1994) implement a program, JAPE, which generates riddles from humour-independent lexical entries, and evaluate the behaviour of the program with 120 children (Binsted et al., 1997). Olaosun and Faleye (2015) identify meaning construction strategies in selected English riddles on the web and account for the mental processes involved in their production, which shows that the meaning of a riddle is an imposed meaning that relates to the logical, experiential, linguistic, literary and intuitive judgments of the riddles. Besides, there are some studies in Yoruba (Akínyemí, 2015b; Akínyemí, 2015a; Magaji, 2014). All of these works focus on the semantic meaning, which is different from Chinese character riddles that focus on the structure of characters.

Another popular word game is Crossword Puzzles (CPs), which normally have the form of a square or rectangular grid of white and black shaded squares. The white squares on the border of the grid or adjacent to the black ones are associated with clues. Compared with our riddle task, the clues in the CPs are derived from each question, whereas the radicals in the solution are derived from the metaphors in the riddles. Proverb (Littman et al., 2002) is the first system for the automatic resolution of CPs. Ernandes et al. (2005) utilize a web-search module to find sensible candidates to questions expressed in natural language and get the final answer by ranking the candidates. A rule-based module and a dictionary module are also mentioned in their work. Barlacchi et al. (2014) use a tree kernel to rerank the candidates for automatic resolution of crossword puzzles.

From another perspective, there are a few projects
[Figure 3 diagram, three stages. Offline Learning: Riddle/Solution Pairs → Phrase-Radical Alignment and Rule Learning → Alignment Table, Rule Table. Riddle Solving: Riddle Description → Solution Candidate Generation → Solution Ranking → Solution. Riddle Generation: Solution (Chinese Character) → Riddle Candidate Generation → Riddle Ranking → Riddle Description.]

Figure 3: The pipeline of offline learning, riddle solving and riddle generation

on Chinese language cultures, such as couplet generation and poem generation. A statistical machine translation (SMT) framework is proposed to generate Chinese couplets and classic Chinese poetry (He et al., 2012; Zhou et al., 2009; Jiang and Zhou, 2008). Jiang and Zhou (2008) use a phrase-based SMT model with linguistic filters to generate Chinese couplets that satisfy couplet constraints, using both human judgments and BLEU scores as the evaluation. Zhou et al. (2009) use the SMT model to generate quatrains with a human evaluation. He et al. (2012) generate Chinese poems with the given topic words by combining a statistical machine translation model with an ancient poetic phrase taxonomy. Following the approaches in the SMT framework, it is valid to regard the metaphors with their radicals as the alignments. There are several works using neural networks to generate Chinese poems (Zhang and Lapata, 2014; Yi et al., 2016). Due to the limited data and strict rules, it is hard to transfer them to riddle generation.

3 Phrase-Radical Alignments and Rules

The metaphor is one of the key components in both solving and generation. On the one hand, we need to identify these metaphors since each of them aligns with a radical in the final solution. On the other hand, we need to integrate these metaphors into the riddle descriptions to generate riddles. Thus, how to extract the metaphors of riddles becomes a big challenge in our task. Below we introduce our method to extract the metaphors based on the phrase-radical alignments and rules.

We exploit the phrase-radical alignments to describe the simple metaphors, e.g. “千里” aligns with “马”, which align the phrase and the radical by meaning. We employ a statistical framework with a word alignment algorithm to automatically mine phrase-radical metaphors from the riddle dataset. Considering that an alignment is often represented as the matching between successive words in the riddle and a radical in the solution, we propose two methods specifically to extract alignments. The first method, in accordance with (Och and Ney, 2003), is described as follows. With a riddle description $q$ and corresponding solution $s$, we tokenize the input riddle $q$ into characters as $(w_1, w_2, \ldots, w_n)$ and decompose the solution $s$ into radicals as $(r_1, r_2, \ldots, r_m)$. We count all $([w_i, w_j], r_k)$ $(i, j \in [1, n], k \in [1, m])$ as alignments. The second method takes into account more structural information of characters. Let $(w_1, w_2)$ denote two successive characters in the riddle $q$. If $w_1$ is a radical of $w_2$ and the remaining part of $w_2$, denoted $r$, appears in the solution $s$, we strongly support that $((w_1, w_2), r)$ is an alignment. The same holds if $w_2$ is a radical of $w_1$. We count all alignments and filter out the alignments that occur fewer than 3 times. Some high-frequency alignments are shown in Table 1. For example, “四方” (square) aligns with “口” (mouth) because of the similar shape, and “二十载” (two decades) aligns with “艹” (grass) because “艹” looks like two small “十”s.

Besides the alignments, which are represented as common collocations, there is another kind of common metaphor concentrating on the structure of characters. We define 6 categories of rules, shown in Table 2, to identify this kind of metaphor. A
Bigram Alignment | Radical | Frequency | Trigram Alignment | Radical | Frequency
西湖 (west lake) | 氵 (water) | 77 | 二十载 (two decades) | 艹 (grass) | 21
四方 (square) | 口 (mouth) | 40 | 党中央 (center of party) | 口 (mouth) | 19
千里 (thousand kilometer) | 马 (horse) | 36 | 意中人 (sweetheart) | 日 (sun) | 16

Table 1: The high-frequency alignments
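The first extraction method of Section 3, which yields frequency counts like those in Table 1, can be sketched in Python as follows. This is a minimal sketch rather than the authors' implementation: the function name `extract_alignments`, the `max_len` cap on span length, the toy decomposition table and the toy riddle pairs (only the first pair appears in the paper) are assumptions made for illustration. Only the enumeration of contiguous spans against every radical of the solution and the frequency threshold of 3 come from the text.

```python
from collections import Counter

# Toy decomposition table (assumption): solution character -> radicals.
TOY_DECOMPOSITION = {"妈": ["女", "马"]}


def extract_alignments(pairs, decomposition, max_len=3, min_count=3):
    """Count every (phrase, radical) pair and keep the frequent ones.

    pairs: iterable of (riddle, solution) strings.
    max_len: longest span considered (assumption; the text enumerates all spans).
    min_count: alignments seen fewer than this many times are dropped.
    """
    counts = Counter()
    for riddle, solution in pairs:
        # dict.fromkeys removes duplicate radicals while keeping their order.
        radicals = dict.fromkeys(decomposition.get(solution, ()))
        n = len(riddle)
        for i in range(n):
            for j in range(i + 1, min(i + max_len, n) + 1):
                phrase = riddle[i:j]  # contiguous span [w_i, ..., w_j]
                for radical in radicals:
                    counts[(phrase, radical)] += 1
    return {pair: c for pair, c in counts.items() if c >= min_count}


# Toy data: the second and third riddles are made up for illustration.
pairs = [("千里会千金", "妈"), ("千里姻缘", "妈"), ("千里相逢", "妈")]
alignments = extract_alignments(pairs, TOY_DECOMPOSITION)
```

On such a tiny sample the counts are pure co-occurrence, so “千里” is paired with both radicals of “妈”; the intended alignments would only stand out on a large riddle collection, which is why the frequency filter matters.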

Category | Description | Examples
Half | take half of the matched placeholder as radicals | [半折断边](.) / [half,snap,break,side](.)
A-B | remove the B as radical in A to compose a new Chinese character | [减走无缺](.)(.) / [subtract,leave,not,lack](.)(.)
UpperRemove | remove the upper-side radical of the matched placeholder | (.)[字]{0,1}[下南] / (.)[character]{0,1}[lower,south]
LowerRemove | remove the lower-side radical of the matched placeholder | [首前上北](.) / [top,front,up,north](.)
LeftRemove | remove the left-side radical of the matched placeholder | (.)[字]{0,1}[右东] / (.)[character]{0,1}[right,east]
RightRemove | remove the right-side radical of the matched placeholder | (.)[字]{0,1}[左西] / (.)[character]{0,1}[left,west]

Table 2: The descriptions and examples of rules
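The four directional categories in Table 2 read naturally as regular expressions, so applying them can be sketched with Python's standard `re` module. The sketch below is an illustration under stated assumptions, not the authors' code: the `STRUCTURE` table is a hand-written toy stand-in for the stroke table, the function name `apply_rules` is hypothetical, and the Half and A-B categories are left out because they need a richer decomposition than this toy table provides.

```python
import re

# Toy structure table (assumption): character -> (layout, first part, second part).
# For "upper-lower" the parts are (upper, lower); for "left-right", (left, right).
STRUCTURE = {
    "岗": ("upper-lower", "山", "冈"),
    "好": ("left-right", "女", "子"),
    "思": ("upper-lower", "田", "心"),
}

# (category, pattern, layout it applies to, index of the part that is kept)
RULES = [
    ("LowerRemove", re.compile(r"[首前上北](.)"), "upper-lower", 0),
    ("UpperRemove", re.compile(r"(.)字?[下南]"), "upper-lower", 1),
    ("LeftRemove", re.compile(r"(.)字?[右东]"), "left-right", 1),
    ("RightRemove", re.compile(r"(.)字?[左西]"), "left-right", 0),
]


def apply_rules(phrase):
    """Return (category, radical) for every rule that matches the whole phrase."""
    results = []
    for category, pattern, layout, keep in RULES:
        match = pattern.fullmatch(phrase)
        if match is None:
            continue
        entry = STRUCTURE.get(match.group(1))  # the placeholder character
        if entry is not None and entry[0] == layout:
            results.append((category, entry[1 + keep]))
    return results
```

For instance, `apply_rules("上岗")` matches the LowerRemove pattern and keeps the upper part “山” of “岗”, which is the clue used in the solving example of Section 4.1.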

rule is often represented as an operation that applies to a character for obtaining parts of it as radicals. For example, the character “上” (up) is usually represented as an operation to get the upper radical of the corresponding character. We extract the rules from the phrase-radical alignments we have just obtained. In a phrase-radical alignment, if a radical appears as one part of a character, we support that this radical is derived from this character, which means the other words in the phrase may describe an operation on this character. We replace this radical with a placeholder and generate a candidate rule with the corresponding direction given by the radical position in this character. Thus, for each phrase-radical alignment $([w_1, w_n], r)$, we count $(w_1, \ldots, w_{i-1}, (.), w_{i+1}, \ldots, w_n)$ as a potential rule only if $r$ is a radical of $w_i$. We count all rules learned from data, and filter out the rules that occur fewer than 5 times. Some rules are shown in Table 2. The word or phrase in the rule “A-B” mostly has the analogous meaning of “removing”. The word or phrase in the rule “Half” mostly has the analogous meaning of “half”. As for the rules “LeftRemove”, “RightRemove”, “UpperRemove” and “LowerRemove”, there is usually a word or phrase that means “removing”, while the others mean the “position” and “direction”.

We mine 14,090 phrase-radical alignments in total. More than 1,000 Chinese characters have at least one alignment, and there are 27 characters with more than 100 alignments. Common radicals are almost all contained in our alignment set. Chinese characters are mostly composed of these common radicals, so these alignments are enough for our task. We extract 193 rules in total for all categories of rules; all of them are applied to riddle solving and riddle generation.

4 Riddle Solving and Generation

4.1 Solving Chinese Character Riddles

The process of solving riddles has two components. First, we identify as many metaphors in the riddle as possible by matching the phrase-radical alignments and rules, and integrate these metaphors to obtain a candidate set of solutions. Each candidate contains the corresponding parsing clues, which imply how and why it is generated, as its features. Second, we employ a ranking model to determine the best solution as output. Below we introduce our method to generate solution candidates, and we will
introduce the ranking model in Section 4.3.

It is common that two metaphors do not share a character and that a metaphor is composed of successive characters. Therefore, we utilize a dynamic programming algorithm based on the CYK algorithm (Kasami, 1965) to identify the metaphors with the help of the learned alignments and the predefined rules. We describe the algorithm in Algorithm 1.

Algorithm 1: Candidate generation for riddle solving
Input: Riddle q, Alignment, Rule
Output: Path[1, n]
Tokenize the input riddle q to w_1, w_2, ..., w_n;
for len ← 0 to n − 1 do
    for each (i, j) with j − i = len do
        if len = 0 then
            (a character can align to itself)
            Path[i, j].Add([w_i, w_i] → w_i);
        else if [w_i, w_j] in Alignment then
            Obtain the corresponding radical r in Alignment;
            Path[i, j].Add([w_i, w_j] → r);
        else if [w_i, w_j] matches Rule then
            Run the predefined operation of the Rule, obtain radical r;
            Path[i, j].Add([w_i, w_j] → r);
        end
        foreach k in [i, j − 1] do
            Path[i, j].Add(Path[i, k] ⊕ Path[k + 1, j]);
        end
    end
end

An example to illustrate our algorithm is “上岗必戴安全帽”, where the corresponding solution is “密”. As shown in Figure 4, “上岗” (on sentry) aligns with “山” by matching the rule “上 (up) (.)”, which means to take the upper part of the character “岗”. “必” and “戴” each align with themselves. The phrase “安全帽” (safety helmet) aligns with the radical “宀” by the alignments because of the analogical shape. Our ranking model will get the final solution “密” from these clues.

[Figure 4 diagram: the characters at positions 1 to 7 are 上岗必戴安全帽 (on sentry / must / wear / safety helmet). Path[1,2] → 山 (R); Path[3,3] → 必 (S); Path[4,4] → 戴 (S); Path[5,7] → 宀 (A); Path[1,7] → 密 (F), built from 宀, 必 and 山.]

Figure 4: The decoding process of “上岗必戴安全帽”. -R: Path[1,2] records the clue that “上岗” matches “山” by the rule. -S: Path[3,3] records the clue that “必” matches itself and Path[4,4] records “戴”. -A: Path[5,7] records the clue that “安全帽” matches “宀” by the alignment. -F: We get a final solution candidate in Path[1,7] by the above clues. In this example, the character “戴” from Path[4,4] is irrelevant to the solution.

4.2 Generating Chinese Character Riddles

Two major components are required in the process of riddle generation. The first step is to generate a list of candidates of riddle descriptions for a Chinese character as the solution. The second step is to rank the candidate riddle descriptions and select the top-N (e.g. 10) candidates as the output. Below we introduce our method to generate candidates of riddle descriptions, and we will introduce the ranking model in Section 4.3.

We propose two strategies to generate the candidate riddle descriptions for a given Chinese character, called the template-based method and the replacement-based method, respectively. First we show our template-based method to generate riddles. The most natural method is to connect the metaphor of each radical. For a character and its possible splitting $RD = \{rd_i\}$, we select a corresponding metaphor by the alignment or rule, and then we connect all metaphors without any other conjunction words to form a riddle. A further method is to add a few conjunction words between the metaphors, which can make the riddle more coherent. We remove the recognized metaphors in riddle sentences,
Feature | Description
Correct Radical | number of radicals matched
Missing Radical | number of radicals not matched
Disappearing Radical | number of radicals that disappear in all characters of riddle descriptions
Single Matching | number of clues derived from character itself
Alignment Matching | number of clues derived from alignments
Rule Matching | number of clues derived from rules
Length Rate | ratio of the length of clues
Frequency | prior probability of this character as a solution

Table 3: Features for riddle solving
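The clue-based features in Table 3 (Single, Alignment and Rule Matching) are counted over the clue sequences produced by Algorithm 1. A minimal Python sketch of that chart construction is given below. It follows the branching of Algorithm 1 as printed, but the names `generate_candidates` and `clue_type_counts`, the tuple representation of clues, the de-duplication step, and the toy alignment table and rule matcher are assumptions for illustration, not the authors' implementation.

```python
from itertools import product


def generate_candidates(riddle, alignments, match_rule):
    """CYK-style chart: path[i, j] holds the clue sequences covering riddle[i..j].

    alignments: dict mapping a phrase to a list of radicals.
    match_rule: callable returning the radicals a phrase yields through rules.
    Each clue is a tuple (kind, phrase, radical) with kind in {"S", "A", "R"}.
    """
    n = len(riddle)
    path = {}
    for length in range(n):              # length = j - i
        for i in range(n - length):
            j = i + length
            phrase = riddle[i:j + 1]
            cell = []
            if length == 0:              # a character can align to itself
                cell.append((("S", phrase, phrase),))
            elif phrase in alignments:
                cell.extend((("A", phrase, r),) for r in alignments[phrase])
            else:
                cell.extend((("R", phrase, r),) for r in match_rule(phrase))
            for k in range(i, j):        # combine two adjacent sub-spans
                for left, right in product(path[i, k], path[k + 1, j]):
                    cell.append(left + right)
            # Different split points can give the same sequence; keep one copy.
            path[i, j] = list(dict.fromkeys(cell))
    return path[0, n - 1]


def clue_type_counts(candidate):
    """Counts behind the Single / Alignment / Rule Matching features."""
    kinds = [kind for kind, _, _ in candidate]
    return {kind: kinds.count(kind) for kind in ("S", "A", "R")}


# Toy resources (assumption) covering only the example of Figure 4.
ALIGNMENTS = {"安全帽": ["宀"]}


def toy_rule(phrase):
    return ["山"] if phrase == "上岗" else []


candidates = generate_candidates("上岗必戴安全帽", ALIGNMENTS, toy_rule)
```

With these toy resources the chart contains the clue sequence of Figure 4 (rule clue “山”, the two self-aligned characters, and the alignment clue “宀”) alongside sequences that fall back to single characters. The number of sequences can grow quickly on real tables, so a practical version would likely prune each cell.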

Feature | Description
Riddle Length | length in characters of the candidate riddle
Riddle Relative Length | abs(Riddle Length - 5), because the length of common riddles is between 3 and 7
Number Radical | number of radicals that the character decomposes into
Avg Freq Character | average number of frequencies of characters in riddle
Max Freq Radical | maximized number of frequencies of characters in riddle
Number Alignment | number of alignments used for generating the candidate
Length Alignment | length of words from alignments
Number Rule | number of rules used for generating the candidate
Length Rule | length of words from rules
LM Score R | score of language model trained by Chinese riddles, poems and couplets
LM Score G | score of language model trained by web documents

Table 4: Features for riddle generation
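To make the role of these features concrete, the sketch below computes a few of the Table 4 features and combines them with the weighted sum used by the ranking model of Section 4.3. It is only an illustration: the weights are hypothetical (in the paper they are learned with Ranking SVM), the language model and frequency features are omitted, the function names are invented here, and the second candidate riddle is made up.

```python
def generation_features(riddle, num_alignments, num_rules):
    """A subset of the Table 4 features for one candidate riddle."""
    length = len(riddle)
    return {
        "riddle_length": length,
        # Common riddles are 3 to 7 characters long, so distance from 5 is used.
        "riddle_relative_length": abs(length - 5),
        "number_alignment": num_alignments,
        "number_rule": num_rules,
    }


def score(features, weights):
    """Weighted sum of feature values; features without a weight count as 0."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def rank(candidates, weights, top_n=10):
    """candidates: list of (riddle, num_alignments, num_rules) tuples."""
    scored = [(score(generation_features(*c), weights), c[0]) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [riddle for _, riddle in scored[:top_n]]


# Hypothetical weights, chosen by hand for this example only.
WEIGHTS = {
    "riddle_relative_length": -1.0,
    "number_alignment": 0.5,
    "number_rule": 0.25,
}

ranked = rank([("千里千金千里千金", 4, 0), ("千里会千金", 2, 0)], WEIGHTS)
```

Under these hand-picked weights the five-character riddle is ranked first because the longer candidate is penalised by its relative length.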

and count the unigram and bigram word frequency of the remaining words. These words are usually common conjunctions. We sample these words based on the frequency distribution and add them into the riddles to connect the metaphor of each radical.

Second, we use an alternative replacement-based method to generate the candidate riddle descriptions. Instead of generating the riddle descriptions totally from scratch, we try to replace part of an existing riddle to generate a new riddle description. Let $w = (w_1, w_2, \ldots, w_n)$ denote the word sequence of a riddle description in our dataset, where $n$ denotes the length of the riddle in characters. Let $[w_i, w_j]$ $(i, j \in [1, n])$ denote the word span that can be aligned to a radical $rd$, and let $X = (x_1, \ldots, x_m)$ denote the corresponding phrase descriptions of $rd$. We then replace $[w_i, w_j] \in X$ with the other alternative phrase descriptions of $rd$ in $X$. We try all the possible replacements to generate riddle candidates. This method can generate candidate riddles that are more natural and fluent.

4.3 Ranking Model

Above we introduce the algorithms to generate solution candidates and riddle candidates, respectively. Then, we develop a ranking model to determine the final output. Below we show the ranking model.

The ranking score is calculated as

$$\mathrm{Score}(c) = \sum_{i=1}^{m} \lambda_i \, g_i(c) \qquad (1)$$

where $c$ represents a candidate, $g_i(c)$ represents the $i$-th feature in the ranking model, $m$ represents the number of features in total, and $\lambda_i$ represents the weight of the feature. The features of riddle solving and riddle generation are in Table 3 and Table 4, respectively. We use Ranking SVM (Joachims, 2006)¹ to do the model training to get the feature weights.

¹ https://www.cs.cornell.edu/people/tj/svm_light/svm_rank.html

The weights of the features are trained with riddle-solution pairs. Specifically, in the riddle solving task, for the set of solution candidates, we hold that the original solution is the positive sample and the others are the negative samples. Using the dynamic programming algorithm to obtain a list of solution candidates, the training process tries to optimize the feature weights so that the ranking score of the original solution is greater than any of the ones from the
candidate list. In the riddle generation task, we select 100 characters on the basis of the frequency distribution of characters as a solution. For each character we use the riddle generation module to generate a list of riddle candidates. We label these candidates manually, where the better riddle descriptions get the higher score. Then the training process optimizes the feature weights.

5 Experimental Study

5.1 Dataset

We crawl 77,308 character riddles, each including a riddle description with its solution, from the Web. All of these riddle-solution pairs concentrate on the structure of characters.

A stroke table that contains 3,755 characters encoded in the first level of GB2312-80 is provided to describe how a Chinese character is decomposed into its corresponding radicals. Characters may have more than one splitting form and a character is typically composed of no more than 3 radicals.

The data for training the language model in riddle style include two parts: one is the corpus of riddles mentioned above, and the other is a corpus of Chinese poems and Chinese couplets, chosen because of the similar language style. We follow the method proposed by (He et al., 2012; Zhou et al., 2009) to download the “Tang Poems”, “Song Poems”, “Ming Poems”, “Qing Poems” and “Tai Poems” from the Internet, and use the method proposed by Fan et al. (2007) to recursively mine those data with the help of some seed poems and couplets. It amounts to more than 3,500,000 sentences and 670,000 couplets. Besides the language model trained in riddle style, we also train a general language model with web documents.

5.2 Evaluation on Riddle Solving

We randomly select 2,000 riddles from the riddle dataset as the test data and 500 riddles as the development data, and use the rest as training data.

Our system always returns a ranked list of candidate solutions, so we use Acc@k (k = 1, 5, 10) as the evaluation metric. Acc@k is the fraction of questions which obtain correct answers in their top-k results.

Giza++ (Och, 2001) is a common tool to extract the alignment between bilingual corpora. We use it as our baseline system that extracts the alignments automatically. We use the Jaccard similarity coefficient as the baseline ranking metric. The Jaccard similarity coefficient is defined as:

$$J(A, B) = \frac{|A \cap B|}{|A \cup B|} \qquad (2)$$

where $A$ means the radical set of the solution and $B$ means the radical set of the candidate.

Feature Set | Acc@1 | Acc@5 | Acc@10
G | 10.3 | 12.0 | 13.6
G+A | 17.0 | 19.2 | 19.9
A | 18.7 | 22.7 | 24.2
G+A+R | 28.4 | 31.0 | 31.4
A+R | 28.8 | 31.8 | 32.1

Table 5: Results of evaluation on test dataset with 2,000 riddles. -G: The alignments from GIZA++. -A: The alignments extracted following our method in Section 3. -R: Using the rules to identify the metaphors between the phrase and the radical following our method in Section 3. Our method (A+R) achieves better performances than the baseline methods from GIZA++.

Ranking Method | Acc@1 | Acc@5 | Acc@10
Jaccard Similarity | 26.2 | 30.2 | 31.2
Ranking SVM | 28.8 | 31.8 | 32.1

Table 6: Results of evaluation between ranking methods using the feature set (A+R). The Ranking SVM achieves better performances than the baseline metric from Jaccard similarity coefficient.

The results are reported in Table 5 and Table 6. The baseline method can only give about one-tenth correct solutions at Acc@1. Compared with the baseline model, by using the alignments extracted by our method, the system improves by 6.7% at Acc@1 and 6.3% at Acc@10. One phenomenon is that only using the alignments we extract gives better results than combining them with the alignments from Giza++, because metaphor matching between phrases and characters is particular in our riddle task. Small changes in the phrase can affect the character that it implies, and it may not be a metaphor at all even if only one character in the phrase is changed. Furthermore, by using rules to identify the metaphors in riddles, we get an improvement of 10.1% at Acc@1, which proves the validity of the
rule we define. The results prove that it is valid to use the alignments and rules that we extract to identify the metaphors in our character riddle task. The comparison between the Jaccard similarity coefficient and our Ranking SVM method shows that the Ranking SVM is better, with an improvement of 2.6% at Acc@1, which proves that, compared to the Jaccard similarity coefficient, the Ranking SVM determines the solution more correctly if we successfully identify all metaphors in riddle descriptions. Moreover, there is less improvement beyond Acc@5, which means the ranking model gets better results even if the system cannot identify all metaphors in riddle descriptions. We think that unlike the Jaccard similarity coefficient, which only uses the features between the candidate character and the correct solution, the ranking model uses extra features in the riddle descriptions, e.g. the number of disappearing radicals, which helps to exclude obviously wrong candidates.

5.3 Evaluation on Riddle Generation

Because there is no previous work on Chinese riddle generation, in order to prove its soundness, we conduct human evaluations on this task for the following two reasons. Firstly, the generated riddles, unlike the certain and unique solution in the riddle solving task, are varied. So it is hard to measure the quality of generated riddles with a well-defined answer set. Secondly, small differences in riddles have a great effect on the corresponding solution. A riddle may imply distinct radicals even if only one character in the metaphors is changed. Existing metrics, such as BLEU, are not suitable for our task. Based on the above analysis, each riddle that the system generates is evaluated by human annotators according to the 5-level criterion described in Table 7.

Score | Criterion
5 | Elegant metaphors, totally coherent
4 | Correct metaphors, mostly coherent
3 | Acceptable metaphors, more or less coherent
2 | Tolerable metaphors, little coherent
1 | Wrong metaphors, incoherent

Table 7: The criterion of riddle evaluation

We randomly sample 100 characters following the distribution of the character as a solution. The system generates riddle descriptions following the methods in Section 4.2 for each character. Sometimes the riddles we generate exist in our training data. We remove these riddles because we want to evaluate the ability to generate new riddles. In order to avoid the influence of annotators and compare the riddles generated by the system with the riddles written by human beings, the riddles are randomly shuffled so that the annotators do not know the generating method of each riddle. For each character, we select 5 riddles generated by the template-based method, 5 riddles generated by the replacement-based method, and 2 riddles from the riddle dataset written by human beings, which form a set of 12 riddles in total. The annotators score each riddle according to the above criterion.

Method | Avg(Score)
Template-based Method | 3.49
Replacement-based Method | 4.14
Riddle from dataset | 4.38

Table 8: Human evaluation of different methods

The result is shown in Table 8. The riddles written by human beings from the riddle dataset get a higher score than the riddles generated by the system. The riddles generated by the replacement-based method show a clear improvement over the basic template-based method. We consider that the replacement-based method retains some human information, which makes the generated riddles more coherent.

Another result is that a riddle whose solution is a common character or is composed of common radicals gets a higher score, which shows that we can get better results if we have more alternative metaphors for a radical.

Below we show two examples of the riddle descriptions generated with the solution “思” (miss), which is often decomposed into “田” (field) and “心” (heart) as shown in Figure 1(b).

• 三星伴月似画里 (Three stars with the moon, like in the picture): The radical “田” is the inside part of “画”. The shape of “心” is three points and a curved line, which looks like three stars around a crescent.

• 日日相系在心头 (Every day in my heart):
The radical “田” is composed of two “日”s, and Paul De Palma and E Judith Weiner. 1992. Riddles:
“心” occurs in the riddle description. The char- accessibility and knowledge representation. In Pro-
acter “头”(top) means the radical “田” is on the ceedings of the 14th conference on Computational
top position. linguistics-Volume 4, pages 1121–1125. Association
for Computational Linguistics.
6 Conclusion

We introduce a novel approach to solving and generating Chinese character riddles. We extract alignments and rules to capture the metaphors that link phrases in riddle descriptions to radicals in the solution characters. In total, we obtain 14,090 alignments that imply the metaphors between phrases and radicals, as well as 193 rules in 6 categories expressed as regular expressions. To solve riddles, we use a dynamic programming algorithm that combines the identified metaphors, based on the alignments and rules, to obtain candidate solutions. To generate riddles, we propose a template-based method and a replacement-based method to produce candidate riddle descriptions. We employ Ranking SVM to rank the candidates in both riddle solving and generation. Our method outperforms baseline methods in the solving task. We also obtain promising results in the generation task according to human evaluation.
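The combination step summarized above can be sketched as a chart-based dynamic program over the sequence of identified radicals. This is a minimal sketch under stated assumptions, not the algorithm as implemented: the table `COMPOSE` and the function `combine_metaphors` are hypothetical, the table holds only a few toy entries, and both orders of a pair are tried because this sketch ignores position hints such as “头” (top).

```python
from itertools import product

# Hypothetical composition table: (part, part) -> composed character.
COMPOSE = {
    ("田", "心"): "思",
    ("木", "木"): "林",
    ("林", "木"): "森",
}


def combine_metaphors(radicals):
    """Return the set of symbols derivable from the whole radical sequence.

    chart[(i, j)] holds every symbol that can be built from radicals[i:j].
    """
    n = len(radicals)
    # Base case: each single radical derives itself.
    chart = {(i, i + 1): {r} for i, r in enumerate(radicals)}
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            cell = set()
            # Try every split point and every pair of sub-results.
            for k in range(i + 1, j):
                for a, b in product(chart[(i, k)], chart[(k, j)]):
                    for pair in ((a, b), (b, a)):
                        if pair in COMPOSE:
                            cell.add(COMPOSE[pair])
            chart[(i, j)] = cell
    return chart.get((0, n), set())


print(combine_metaphors(["心", "田"]))
print(combine_metaphors(["木", "木", "木"]))
```

The candidates returned for the full span would then be passed to the ranking model. The sketch stops at candidate generation and does not model ranking.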
Acknowledgments

The first author and the fourth author are supported by the National Natural Science Foundation of China (Grant No. 61421003).

