Example-Based Problem Solving Support Using
Concept Analysis of Programming Content
Roya Hosseini¹ and Peter Brusilovsky¹,²
¹ Intelligent Systems Program (ISP)
² School of Information Sciences
University of Pittsburgh
12th International Conference on Intelligent Tutoring Systems
Young Researchers Track
June 8, 2014
University of Pittsburgh Intelligent Systems Program
Purpose
 Select relevant examples for:
 Supporting problem solving in Java Programming
 Introduce two concept-based approaches for:
 Finding similar examples to a question
Outline
 Literature Review
 Concept-based Similarity Approach
 Evaluation: Lab Study
Example-based Problem Solving
 Helps students in solving problems
 Finds most relevant examples for problems
ELM-ART
 ELM-ART: an ITS for LISP programming (Weber 1996)
 Requires advance analysis of:
 Task Description
 Domain Knowledge
 Learner Model
Concept-based Similarity
 Uses a Standard Ontology
 Extracts concepts in programming content (Hosseini & Brusilovsky 2013)
 Measures similarity of concepts in contents
[Figure: ontology fragment with is-a links among Abstraction, Inheritance, Encapsulation, Overriding, Method Inheritance, Field Inheritance, Overriding Equals, and Overriding Hash Code]
Example
Question:
public class Tester {
    public static void main(String[] args) {
        int x = 0;
        for (int i = 0; i < 10; i++) {
            for (int j = 0; j < 10; j++) {
                …
            }
        }
    }
}
Example 1:
for (int i = 0; i < 10; i++) {
    …
}
for (int j = 0; j < 10; j++) {
    …
}
Global
Example 2:
for (int i = 0; i < 10; i++) {
    for (int j = 0; j < 10; j++) {
        …
    }
}
Local
Global Similarity
Cosine Similarity
Question vector: weights W1, W2, ..., Wn-1, Wn for concepts C1, C2, ..., Cn-1, Cn
Example vector: weights W1, W2, ..., Wn-1, Wn for concepts C1, C2, ..., Cn-1, Cn
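The Global comparison can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class and method names are hypothetical, the concept labels are made up, and the smoothed IDF formula is just one common variant, since the slides do not say which weighting formula was used.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class GlobalSimilaritySketch {

    /** TF-IDF weights for one item's concepts, relative to a collection of items. */
    static Map<String, Double> tfIdf(List<String> concepts, List<List<String>> collection) {
        Map<String, Double> weights = new HashMap<>();
        for (String concept : concepts) {
            weights.merge(concept, 1.0, Double::sum); // term frequency: raw count
        }
        for (Map.Entry<String, Double> entry : weights.entrySet()) {
            String concept = entry.getKey();
            long itemsWithConcept = collection.stream()
                    .filter(item -> item.contains(concept))
                    .count();
            // Smoothed IDF (an assumption; one common variant among several).
            double idf = Math.log((1.0 + collection.size()) / (1.0 + itemsWithConcept)) + 1.0;
            entry.setValue(entry.getValue() * idf);
        }
        return weights;
    }

    /** Cosine of the angle between two sparse concept vectors. */
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        for (Map.Entry<String, Double> entry : a.entrySet()) {
            dot += entry.getValue() * b.getOrDefault(entry.getKey(), 0.0);
        }
        double normA = norm(a);
        double normB = norm(b);
        return (normA == 0.0 || normB == 0.0) ? 0.0 : dot / (normA * normB);
    }

    private static double norm(Map<String, Double> v) {
        double sum = 0.0;
        for (double w : v.values()) {
            sum += w * w;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        // Hypothetical concept lists: a nested-loop question, an example with the
        // same bag of concepts, and an unrelated example.
        List<String> question = List.of("for", "for", "<", "++");
        List<String> sameBag = List.of("for", "for", "<", "++");
        List<String> other = List.of("while", "<");
        List<List<String>> all = List.of(question, sameBag, other);
        System.out.println(cosine(tfIdf(question, all), tfIdf(sameBag, all)));
        System.out.println(cosine(tfIdf(question, all), tfIdf(other, all)));
    }
}
```

Because only the bag of concepts enters the vectors, two separate for loops and two nested for loops produce the same vector, which is the behavior the Example slide attributes to the Global approach.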
Local Similarity
 #1: Forms a subtree from concepts in the same block
if ( x < 2 ) {
    x++;
}
Subtree: root "if" with children "<" and "++"
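One way to picture the subtree-forming step is sketched below. It assumes, hypothetically, that each extracted concept comes with the first and last line it covers; the `Occurrence` record, the `Node` class, and `buildSubtrees` are illustrative names, not the parser's API. A concept is nested under the innermost concept whose line span contains it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class SubtreeSketch {

    /** One concept occurrence with a hypothetical start/end line from the index. */
    record Occurrence(String concept, int startLine, int endLine) {}

    static final class Node {
        final String concept;
        final int endLine;
        final List<Node> children = new ArrayList<>();

        Node(Occurrence o) {
            this.concept = o.concept();
            this.endLine = o.endLine();
        }
    }

    /** Nests each occurrence under the innermost occurrence whose span contains it. */
    static List<Node> buildSubtrees(List<Occurrence> occurrences) {
        List<Occurrence> sorted = new ArrayList<>(occurrences);
        // Outer spans first: earlier start line, then later end line.
        sorted.sort((x, y) -> x.startLine() != y.startLine()
                ? Integer.compare(x.startLine(), y.startLine())
                : Integer.compare(y.endLine(), x.endLine()));
        List<Node> roots = new ArrayList<>();
        Deque<Node> open = new ArrayDeque<>(); // stack of spans that may still contain more
        for (Occurrence o : sorted) {
            // Leave every span that ends before this occurrence ends.
            while (!open.isEmpty() && open.peek().endLine < o.endLine()) {
                open.pop();
            }
            Node node = new Node(o);
            if (open.isEmpty()) {
                roots.add(node);
            } else {
                open.peek().children.add(node);
            }
            open.push(node);
        }
        return roots;
    }

    public static void main(String[] args) {
        // The if block from the slide: "<" on line 1, "++" on line 2, "if" spanning 1 to 3.
        List<Node> roots = buildSubtrees(List.of(
                new Occurrence("++", 2, 2),
                new Occurrence("if", 1, 3),
                new Occurrence("<", 1, 1)));
        Node root = roots.get(0);
        System.out.println(root.concept + " has " + root.children.size() + " children");
    }
}
```

With line-level spans, two concepts that share exactly the same span end up nested in input order, so a real indexer would need finer positions to separate them.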
Local Similarity
 #2: Compares subtrees of question and example
 Tree Edit Distance (TED)
[Figure: the subtrees of the Question (roots e and a) are compared with the subtrees of the Example (roots a and e); distances 1 and 2 are marked, with TED: 3]
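The subtree comparison can be illustrated with a small tree edit distance sketch. It uses the textbook recursive definition for ordered trees with unit costs for insert, delete, and relabel; the slides do not say which TED algorithm or cost model the study used, so treat this as an assumption. The recursion is exponential and only suitable for tiny subtrees like those on the slide. All names are hypothetical, and `totalDistance` sums, for each question subtree, the smallest distance to any example subtree.

```java
import java.util.ArrayList;
import java.util.List;

final class TreeEditDistanceSketch {

    record Node(String label, List<Node> children) {
        static Node of(String label, Node... children) {
            return new Node(label, List.of(children));
        }
    }

    /** Unit-cost edit distance between two ordered trees. */
    static int distance(Node a, Node b) {
        return forestDistance(List.of(a), List.of(b));
    }

    /** Recursive forest distance, decomposing on the rightmost root of each forest. */
    static int forestDistance(List<Node> f1, List<Node> f2) {
        if (f1.isEmpty()) return size(f2); // insert everything in f2
        if (f2.isEmpty()) return size(f1); // delete everything in f1
        Node v = f1.get(f1.size() - 1);
        Node w = f2.get(f2.size() - 1);
        List<Node> rest1 = f1.subList(0, f1.size() - 1);
        List<Node> rest2 = f2.subList(0, f2.size() - 1);
        // Delete v: its children take its place in the forest.
        int delete = forestDistance(concat(rest1, v.children()), f2) + 1;
        // Insert w: symmetric to deletion.
        int insert = forestDistance(f1, concat(rest2, w.children())) + 1;
        // Match v with w, relabeling if the labels differ.
        int relabel = v.label().equals(w.label()) ? 0 : 1;
        int match = forestDistance(v.children(), w.children())
                + forestDistance(rest1, rest2) + relabel;
        return Math.min(match, Math.min(delete, insert));
    }

    static int size(List<Node> forest) {
        int n = 0;
        for (Node node : forest) {
            n += 1 + size(node.children());
        }
        return n;
    }

    private static List<Node> concat(List<Node> a, List<Node> b) {
        List<Node> out = new ArrayList<>(a);
        out.addAll(b);
        return out;
    }

    /** Sum over question subtrees of the minimum distance to any example subtree. */
    static int totalDistance(List<Node> question, List<Node> example) {
        int total = 0;
        for (Node q : question) {
            int best = size(List.of(q)); // cost against an empty example
            for (Node e : example) {
                best = Math.min(best, distance(q, e));
            }
            total += best;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Node> question = List.of(
                Node.of("e", Node.of("g"), Node.of("f")),
                Node.of("a", Node.of("c"), Node.of("b"), Node.of("d")));
        List<Node> example = List.of(
                Node.of("a", Node.of("c"), Node.of("b")),
                Node.of("e", Node.of("if")));
        System.out.println(totalDistance(question, example)); // prints 3
    }
}
```

Reading the figure's trees as a(c, b) and e(if) for the example and e(g, f) and a(c, b, d) for the question (an interpretation of the extracted figure), the per-subtree minimums are 2 and 1, for a total of 3.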
Local Similarity
 #3: Measures similarity based on TED
[Figure: the Question (subtrees with roots e and a) compared with two Examples, one with TED: 3 and the other with TED: 6]
Global vs. Local
We Are Globally
Similar!
Study Design
Time                 Rating     #Example  Approach
Failing in question  Optional   5         Random
End of question      Mandatory  4         Both
 Date: January, 2014
 12 students
 Java Contents:
 6 topics
 83 annotated examples
 24 parametric questions
Task
 Pretest
 Solving 4 questions in each of 3 Java topics
 Rating helpfulness of examples
 ‘Not helpful at all’(0) - ‘Very helpful’(3)
 Post-test
Solving question
Examples & Rating
Results
[Chart: Average User Ratings on the 0 to 3 scale: GLOBAL 1.95, LOCAL 1.49]
Rating Evaluation
Work in progress
 RMSE
 Precision 2+
 Precision 3
 MRR
Average value of each measure:
Measure       Global  Local
RMSE          0.29    0.32
Precision 2+  0.69    0.62
Precision 3   0.28    0.23
MRR           0.7     0.61
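The measures above can be sketched in a few lines. The slides name the measures but do not give formulas, so the definitions here are assumptions: RMSE compares predicted scores with observed ratings on a common scale, "Precision k" is the share of presented examples rated at least k, and MRR averages the reciprocal rank of the first example rated at least a threshold. All names are hypothetical.

```java
final class RatingMetricsSketch {

    /** Root-mean-square error between predicted and observed values on the same scale. */
    static double rmse(double[] predicted, double[] observed) {
        if (predicted.length == 0 || predicted.length != observed.length) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        double sum = 0.0;
        for (int i = 0; i < predicted.length; i++) {
            double diff = predicted[i] - observed[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum / predicted.length);
    }

    /** Share of ratings that are at least the threshold (2 for Precision 2+, 3 for Precision 3). */
    static double precisionAtLeast(int[] ratings, int threshold) {
        if (ratings.length == 0) {
            throw new IllegalArgumentException("no ratings");
        }
        int hits = 0;
        for (int rating : ratings) {
            if (rating >= threshold) {
                hits++;
            }
        }
        return (double) hits / ratings.length;
    }

    /**
     * Mean reciprocal rank. Each row holds the ratings of one question's examples
     * in the order they were ranked; a row with no hit contributes 0.
     */
    static double meanReciprocalRank(int[][] rankedRatingsPerQuestion, int threshold) {
        if (rankedRatingsPerQuestion.length == 0) {
            throw new IllegalArgumentException("no questions");
        }
        double sum = 0.0;
        for (int[] ranked : rankedRatingsPerQuestion) {
            for (int rank = 0; rank < ranked.length; rank++) {
                if (ranked[rank] >= threshold) {
                    sum += 1.0 / (rank + 1);
                    break;
                }
            }
        }
        return sum / rankedRatingsPerQuestion.length;
    }

    public static void main(String[] args) {
        int[] ratings = {3, 2, 1, 0}; // made-up ratings on the 0 to 3 scale
        System.out.println(precisionAtLeast(ratings, 2));
        System.out.println(precisionAtLeast(ratings, 3));
        System.out.println(meanReciprocalRank(new int[][] {{1, 3, 0}, {2, 0, 0}}, 2));
    }
}
```

The ratings in `main` are invented for illustration and are not data from the study.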
Ratings over Difficulty Level of Question
[Chart: RMSE, Precision 2+, Precision 3, and MRR on a 0 to 1 axis for four series: Global Easy, Local Easy, Global Moderate, Local Moderate]
HOW-TOs
 #1: Define structure of the content?
 #2: Do personalized example selection?
#1: Define structure of the content?
 Concepts that appear together in a:
 Block/line
if ( x < 2 )
{
    x++;
}
Subtree: root "if" with children "<" and "++"
#2: Do personalized example selection?
 Consider user knowledge information
 Select examples with:
 least unknown parts
 enough new parts
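The two selection ideas on this slide could be combined along the following lines. This is only a sketch of one possible reading: the set of known concepts, the `minNew` parameter, and all names are hypothetical, and the slide does not describe how user knowledge was actually modeled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class PersonalizedSelectionSketch {

    /** Number of concepts in the example that the student does not yet know. */
    static long unknownCount(Set<String> exampleConcepts, Set<String> knownConcepts) {
        return exampleConcepts.stream().filter(c -> !knownConcepts.contains(c)).count();
    }

    /**
     * Keeps examples with at least minNew unknown concepts ("enough new parts")
     * and orders them by fewest unknown concepts first ("least unknown parts").
     */
    static List<String> select(Map<String, Set<String>> examples,
                               Set<String> knownConcepts, int minNew) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : examples.entrySet()) {
            if (unknownCount(entry.getValue(), knownConcepts) >= minNew) {
                names.add(entry.getKey());
            }
        }
        names.sort((a, b) -> {
            int byUnknown = Long.compare(
                    unknownCount(examples.get(a), knownConcepts),
                    unknownCount(examples.get(b), knownConcepts));
            return byUnknown != 0 ? byUnknown : a.compareTo(b); // ties broken by name
        });
        return names;
    }

    public static void main(String[] args) {
        // Made-up examples and concepts, purely for illustration.
        Map<String, Set<String>> examples = Map.of(
                "A", Set.of("for", "<"),
                "B", Set.of("for", "<", "++"),
                "C", Set.of("while", "do", "switch", "for"));
        System.out.println(select(examples, Set.of("for", "<"), 1));
    }
}
```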
 Lab study: 30 subjects
 Personalized & non-personalized Global-Local approach
 Personalized example selection did not work out!
 Hard question: Local approach has the least RMSE
 Easy-Medium question: Global approach has the least RMSE
[Diagram labels: Concept-based Similarity, User Knowledge Level, Personalized Example-Selection]
Discussion
 Global & Local Concept-based approach:
 Generalizability across other programming domains
 Limitations:
 Few contents
 Few subjects
Next Steps
 Investigating:
 learning gains of students in the study
 other approaches for capturing content structure
 personalized example selection
○ user knowledge, …
 Adaptive visualization of problem-example space
Thank You!
Personalized Adaptive Web Systems
School of Information Sciences
University of Pittsburgh
Roya Hosseini (roh38@pitt.edu)
http://people.cs.pitt.edu/~hosseini/

Editor's Notes

  • #3 The main goal of this work is to select ….. In this talk, I will introduce …
  • #5 Example-based problem solving is one of the efficient approaches used by Intelligent Tutoring Systems (ITSs) in the programming domain. In this approach, when the student has trouble solving a problem, the system tries to find the most relevant examples that might help solve it.
  • #6 Example-based problem solving has been used, for example, in ELM-ART, an ITS for LISP programming. In this system, an episodic learner model (ELM) is used for individualized selection of the solutions to programming problems that are most similar to the expected solution of the new problem. ELM is a case-based learning model that stores knowledge about the learner as a collection of episodes. ELM-ART uses information from the task description, the domain knowledge, and the individual learner model to analyze program code produced by the learner. The result of this analysis is then used to find the programming solution most similar to the new problem. The system requires too much advance analysis of programming content, which makes the approach hard to generalize to other programming domains.
  • #8 The goal is to create a different version of example-based problem-solving support for Java programming, one that generalizes to other programming languages without the heavy advance analysis of content that a system like ELM-ART requires. The main innovation is in analyzing the domain concepts related to programming problems and examples and using the underlying concept structure to find similarity between examples and problems. To this end, we use a standard Java ontology that defines a hierarchy of programming concepts in Java. To extract those concepts from code, we used a specialized concept analysis tool, JavaParser, which extracts not only the list of concepts but also the concept structure. The parser indexes code at a fine-grained level, per line, which helps identify blocks of code that contain sets of adjacent concepts. The parser helped us index a considerable volume of Java programming problems and examples, and we could then start a study comparing the approaches. You should then create a nice transition to the problem. Your strategy would be to describe the problem with an example on the next slide. After the example, say which approach we chose in this work. So, here are the sentences for the transition to the next slide: Then, similarity between two content items is measured by comparing their corresponding concepts. The similarity can be obtained using a number of complicated approaches, but our first challenge was to choose between an approach that considers whether two sets of concepts are more or less similar as a whole and one that considers similarity at the structure level, where a detailed level of similarity can be identified from the structure of blocks and adjacent concepts.
  • #9 Here is an example that shows the difference between the two approaches. The code snippet on the left is the code in the question, and the two code snippets on the right are the code of two examples. As you see, the question has a nested for statement. So, both approaches deal with the for concept. The approach that considers similarity of concepts as a whole, however, does not distinguish between the example that has two for statements in separate blocks (the first example, at top right) and the one that has a nested for (the second example, at bottom right). We’ll call this approach Global since it only considers similarity of concepts as a whole. The other approach, which considers the structure of the code, instead prefers the second example since it has the same nested for structure. We’ll call this approach Local since it considers code structures (like blocks) when determining the similarity between the question and an example.
  • #10 For our work, we defined global concept-based similarity as cosine similarity over concept vectors with TF-IDF weighting. So, in this approach, we measured similarity by comparing the vectors of the question and the examples, as shown on this slide. Say that we can think of it as a bag-of-words approach!
  • #11 The local concept-based similarity approach selects the set of examples whose blocks of code are closest to those of the question. The main idea is to build subtrees of the concepts that appear together as blocks in each content item. As a result, each subset of concepts that are in the same line or in the same block is merged to create a subtree for the content. An example is shown here for an if statement that has a less-than expression in its condition and a post-increment expression in its body. The subtree for this block of code has if as its root and the less-than expression and the post-increment expression as its children.
  • #12 Having created the subtrees, we can find the similarity of a question and an example by comparing their corresponding subtrees. Several methods have been suggested for comparing trees, among which Tree Edit Distance (TED) is well known and has been widely used in other studies for similar purposes. For example, to determine the minimum TED for the first subtree in the question, it is compared with each of the two subtrees in the example. The minimum TED of 1 is obtained when comparing it with the first subtree in the example. Similarly, the minimum distance of 1 is obtained when comparing the second subtree in the question with the second subtree in the example. This way, the total distance between the question and the example is 2. This distance will be normalized based on the number.
  • #13 Finally, TED is used to find the similarity between the question and the examples. The example that has the least TED is the most similar one to the question. In this picture, we see that the question has TED 2 and 5 with the examples shown on the left and right, respectively. So, local similarity selects the one with TED 2 as the most similar example to the question.
  • #16 The main research question of the study is: Q1: Which of the two competing approaches, Global and Local, can generate better links from problems to relevant examples? We designed a lab study to investigate the effectiveness of the local and global concept-based similarity approaches. In this design: (describe the table) In total, the examples were presented to students in two contexts: Context 1, when the student failed the question; and Context 2, at the end of work with each question, before moving to the next question. In each context the examples were selected from the same pool, consisting of the two top examples generated by each approach for the current question (6 examples in Study 1 and 8 examples in Study 2). In Context 1, five examples were randomly selected from the pool and presented in order of similarity (the user gets recommended examples when she is supposed to need them, so they are in order). Example rating was optional. This context simulated the natural use of examples as problem-solving help. In Context 2, all examples in the pool were presented in a random order. Overlapping examples were removed in this condition. In this data collection context, example rating was mandatory.
  • #17 We conducted a lab study to investigate the effectiveness of the local and global concept-based similarity approaches.
  • #19 Each subject had to work with 12 questions belonging to 3 topics. Working with each question, the subject had to go through two stages. The first, the ‘Solving’ phase, is an attempt to solve the parametric question. The figure shows the solving phase for a question. During this phase, the student can repeat the question as many times as she wants; each time, the question is generated with a different parameter.
  • #20 If the student’s answer is incorrect, the system generates 5 ‘relevant’ examples for the question. The figure shows the list of examples presented to the student after a wrong answer to the ‘While’ question. The student can explore any subset of these examples. Finally, when she selects ‘Finish Solving Button’, the second phase, the ‘Rating’ phase, starts. In this phase, she has to rate 4 relevant examples (generated by the explored approaches) with respect to their helpfulness for the question that she was solving. We asked users to react to the statement ‘The above example is helpful for me to solve the exercise’. The 4-point rating scale included ‘Not helpful at all’, ‘Not helpful’, ‘Helpful’, and ‘Very helpful’, coded from 0 to 3, respectively.
  • #21 Here you should not say that Global performs better, because the statistical test showed no significant difference between the two. Instead, interpret the numbers: say what 1.95 and 1.49 mean in terms of the user ratings. Also, do not skip this slide quickly. Say why this has happened.
  • #22 To decide which of the two approaches generates better links, we used the user judgments collected in the study. Since every generated link was rated by the subjects in a realistic context, we can judge approach quality using traditional measures based on user feedback. In our study we used several measures. The first is Root-Mean-Square Error (RMSE), which measures the extent to which the approach generates examples close to user ratings.
  • #23 The second measure is the precision of hitting 2 or above (Precision 2+).
  • #24 The third measure is the precision of hitting 3 (Precision 3).
  • #25 The final measure is MRR.
  • #26 To obtain a more detailed picture, we considered the performance of the two approaches separately for different levels of question complexity (Easy, Moderate, and Complex, based on the number of involved concepts). Since the pretest showed that no student was ready for hard topics, no students were assigned to hard topics, and thus there are no ratings for hard questions. For example, the RMSE of the Global approach is less than that of the Local approach for both moderate and easy questions. Besides, we can see that Global has less RMSE on moderate questions than on easy questions. The rest of the measures show a similar trend: Global has higher performance. However, there is no significant difference.
  • #27 to #30 I suggest focusing on 2 challenges in your task: 1. How to take the structure of the problem and example into account? 2. How to use information about user knowledge to do personalized example selection? Discuss possible ideas (so far!) for 1 and also for 2, with a special slide for each. Suggest simple ideas like examples with the least unknown parts, examples with enough new parts for students to handle, etc. Ask DC attendees for more suggestions to explore, and show that you are in the process of searching for solutions. And then present your lab study as just one step towards the goal.
  • #32 Learning gains (using post-test & pre-test) can be used to see if the proposed approaches had implicit effects on students’ learning. Ratings are subjective. It could be that some students think the examples were not helpful when in fact they were. Adaptive visualization of problem-example space: help in selecting examples same source mined from structure and wisdom, but different interface. This work is currently in progress and I’m still working on approaches to improve it. So, I really appreciate your feedback on this work. And we have one dataset so far, but are glad to work on others; ask for collaborators with datasets and ideas!
  • #34 Problem solving is often guided or facilitated by examples of how a similar problem was solved or could be solved. This is well documented for experts, who use previous solutions to solve new problems or planning tasks in different domains. Especially in the domain of programming, experienced programmers are able to retrieve examples of how they solved previous problems and adapt these solutions to similar new programming tasks. However, not only experienced programmers use examples; beginners also try to write code by analogy to examples that they have seen before. Studies done in the 1990s showed the real need for example-based programming support in ITSs. Carefully chosen examples have a high impact on learning to solve problems (Reed & Bolstad, 1991), and the intelligent tutor should retrieve and select the best example or examples (Weber & Bögelsack 1995).