Assignment 3
Text Generation
Let's have some fun. In this assignment, you'll implement Markov Text Generation using a List of
Lists.
Before you begin this assignment, check Part 2 of the setup guide to confirm that the starter code has not changed since you downloaded it. If you are an active learner, you will also have received an email about any starter code changes. If there have been any changes, follow the instructions in the setup guide for updating your code base before you begin.
To verify everything is set up okay, you can run the MarkovTextGeneratorLoL.java file; you will see output (including two "null" outputs, which we'll be fixing soon!).
Markov Text Generation depends on an initial source text to mimic. The train method takes in a String (the source text) and uses it to train the Markov Text Generator.
The goal of Markov Text Generation is to be able to generate text which resembles the source in a
reasonable way. The method generateText returns such text as a String of words, the length of
which is determined by numWords.
You may wish to use a Markov Text Generator multiple times with different source text. The method
retrain acts just like train, except it removes any existing training that was done previously and trains
from scratch on the new sourceText.
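Taken together, the three methods described above have roughly the following shape (a sketch inferred from the descriptions here; check the starter code for the exact declarations):

    public void train(String sourceText)      // build the generator's state from sourceText
    public String generateText(int numWords)  // return numWords words of generated text
    public void retrain(String sourceText)    // discard previous training, then train on sourceText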
Assignment and Submission Details
Your submission of this assignment will be the code which implements Markov Text Generation as a
List of Lists. As always, we recommend testing as you write your implementation.
Step 1: Implement the train method
You'll notice our MarkovTextGeneratorLoL constructor creates a List of ListNode objects. The
ListNode class is authored at the bottom of the MarkovTextGeneratorLoL.java file. Each ListNode
contains a word and a list of words which follow that word in the source text. We'll be using
ListNodes to help us generate text (in the next step).
The idea for the train method is to build your list of lists:

    set "starter" to be the first word in the source text
    set "prevWord" to be starter
    for each word "w" in the source text, starting at the second word
        check to see if "prevWord" is already a node in the list
            if "prevWord" is a node in the list
                add "w" as a next word for the "prevWord" node
            else
                add a node to the list with "prevWord" as the node's word
                add "w" as a next word for that node
        set "prevWord" to be "w"
    add starter to be a next word for the last word in the source text.
First, check to see if you understand how this algorithm works. You'll be going through the text
keeping track of the previous word you saw ("prevWord") and the current word "w". You'll then want
to add this current word to the list of words which follow the previous word. It's okay to add a word
multiple times to the list of words which follow - we'll use this to help with our text generation (words
appearing more than once should be more likely to occur). Notice that there is extra care taken to set up the "starter" word and to make sure the last word also points back to the starter word.
As an example of how the train method should work, consider training on the string "hi there hi Leo". After training, starter will point to "hi"; the node for "hi" will have "there" and "Leo" as next words, and the nodes for "there" and "Leo" will each have "hi" as a next word.
You'll find that the list itself has already been made for you and the methods you need in a ListNode
also already exist. Your code then will be searching the list of ListNodes, calling ListNode
constructors to add previous words to the list, and adding following words using the addNextWord
method.
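To make this concrete, here is one possible shape for train. It is only a sketch: it assumes the generator stores a List<ListNode> called wordList and a String called starter, that ListNode has a constructor taking the word plus the addNextWord method mentioned above, and it uses a hypothetical findNode helper and a getWord accessor that may be named differently in your own code.

    public void train(String sourceText) {
        String[] words = sourceText.trim().split("\\s+");
        if (words.length == 0 || words[0].isEmpty()) {
            return;  // nothing to train on
        }
        starter = words[0];
        String prevWord = starter;
        // Loop one position past the last word so that starter is added
        // as a next word for the last word in the source text.
        for (int i = 1; i <= words.length; i++) {
            String w = (i < words.length) ? words[i] : starter;
            ListNode node = findNode(prevWord);   // hypothetical helper, defined below
            if (node == null) {
                node = new ListNode(prevWord);
                wordList.add(node);
            }
            node.addNextWord(w);
            prevWord = w;
        }
    }

    // Hypothetical helper (not part of the starter code): linear search of wordList.
    private ListNode findNode(String word) {
        for (ListNode node : wordList) {
            if (node.getWord().equals(word)) {   // getWord is assumed to exist on ListNode
                return node;
            }
        }
        return null;
    }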
After completing your train method, we recommend you use the toString methods in both
MarkovTextGeneratorLoL and ListNode to verify you are producing a reasonable list. For example, if
you train your generator on the string above ("hi there hi Leo"), calling toString on the
MarkovTextGeneratorLoL should produce:
hi: there->Leo->
there: hi->
Leo: hi->
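If the provided toString methods don't already produce this format, a ListNode.toString along these lines would (a sketch, assuming the node's fields are named word and nextWords):

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder(word + ": ");
        for (String next : nextWords) {
            sb.append(next).append("->");
        }
        return sb.toString();
    }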
Step 2: Implement the generateText method
Now that you've trained on text, your next step will be producing text based on the input set. To do this, you'll be implementing the algorithm below:

    set "currWord" to be the starter word
    set "output" to be ""
    add "currWord" to "output"
    while fewer than numWords words have been added to "output"
        find the "node" corresponding to "currWord" in the list
        select a random word "w" from that node's list of next words
        add "w" to "output"
        set "currWord" to be "w"

To help with implementing the method, you should author the ListNode getRandomNextWord method. It will take care of the step above where a random word "w" is selected from the node's list of next words.
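A minimal sketch of getRandomNextWord, assuming ListNode keeps its following words in a List<String> named nextWords and that a java.util.Random is passed in (adapt to the actual signature in your code):

    // Pick a uniformly random entry from nextWords. Words that appear several
    // times in the list are automatically more likely to be returned.
    public String getRandomNextWord(Random generator) {
        int index = generator.nextInt(nextWords.size());  // bounded: 0..size()-1, inclusive
        return nextWords.get(index);
    }

Note that nextInt(nextWords.size()) stays in bounds while still being able to return the last index, which avoids the off-by-one mistakes mentioned in the hints below.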
To clarify how this algorithm behaves, let's continue with the example above (trained on "hi there hi Leo") and generate 4 words.
We start with the starter word, "hi", and add it to our output.
We then find the node corresponding to the word "hi". "hi" could generate either "there" or "Leo". Let's suppose the random value selects "Leo", so we add "Leo" to our output.
Again, "hi" could generate either "there" or "Leo". Let's suppose the random value selects "there". So
we add "there" to our output.
We've added 4 words to the output ("hi", "Leo", "hi", "there") so we're done (our while loop would
terminate). Our final output would be:
"hi Leo hi there"
NOTE: If the generator has not yet been trained, the generateText method should simply return an empty String. Normally you would also print a warning message, but our autograder might get confused, so just return the empty String without printing anything.
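Putting the pieces together, generateText might look roughly like the sketch below. It reuses the assumed wordList and starter fields and the hypothetical findNode helper from the train sketch above, and assumes the generator stores a java.util.Random (called rnGenerator here):

    public String generateText(int numWords) {
        if (wordList.isEmpty() || numWords <= 0) {
            return "";  // not yet trained (or no words requested): return an empty String
        }
        StringBuilder output = new StringBuilder();
        String currWord = starter;
        output.append(currWord);
        int wordsAdded = 1;
        while (wordsAdded < numWords) {
            ListNode node = findNode(currWord);              // hypothetical helper from above
            String w = node.getRandomNextWord(rnGenerator);  // assumed stored Random
            output.append(" ").append(w);
            currWord = w;
            wordsAdded++;
        }
        return output.toString();
    }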
Step 3: Implement the retrain method
Retrain will behave just like train, only you'll need to re-initialize the instance variables first, effectively discarding the prior training.
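Since retrain is just "forget everything, then train again", a sketch can be very short (assuming wordList and starter are the only training state in your implementation):

    public void retrain(String sourceText) {
        wordList = new java.util.LinkedList<ListNode>();  // or wordList.clear()
        starter = "";
        train(sourceText);
    }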
Hints:
You will likely find the "toString" method for the list and/or a "toString" method for each node helpful
when testing and debugging.
Train on small inputs and draw the expected list of lists by hand. Then check to see if your code is
producing what you'd expect.
When picking a random next word, be sure you are able to produce all of the possible words. For
example, if you could produce "hi", "there", or "this", be sure your getRandomNextWord method
can produce all possible words. A common mistake is to not bound your random number properly
and go off the end of the list or omit the last word in the list.
Also, when picking a random next word, be sure to have a test case where a word is repeated. For
example, the nextWord list could contain "hi", "hi", "hi", "hello". Your generate next word method
should produce "hi" far more often than it produces "hello" if you have it produce a reasonable
(10+) number of words. The grader will test cases where one word occurs much more often than the others, and expect to see that word generated more often; a quick way to check this is sketched after these hints.
Punctuation counts, so it's okay if you end up with the same word in your data structure punctuated
and not punctuated. For example, if you trained on the following: "Hi there. Up there is the sky."
You would have a node for "there." and a node for "there" in the resultant List of Lists.
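As a quick check for the repeated-word hint above, something like the following can be run from main. It is a sketch and assumes the constructor takes a java.util.Random, as in the starter code's main method:

    // Train on text where "ran" follows "she" three times and "hid" follows once,
    // then eyeball the generated output.
    MarkovTextGeneratorLoL gen = new MarkovTextGeneratorLoL(new java.util.Random(42));
    gen.train("she ran she ran she ran she hid");
    System.out.println(gen);                   // "she" should list ran->ran->ran->hid->
    System.out.println(gen.generateText(20));  // expect "ran" roughly three times as often as "hid"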