Elements of Computing
19AIE113
VM translator REPORT
Team members:-
1. CB.EN.U4AIE22151 YERRA SUDHEER KUMAR CHOWDARY
2. CB.EN.U4AIE22130 M C DHANUSH
3. CB.EN.U4AIE22111 AYUSH KUMAR RAI
4. CB.EN.U4AIE22110 ANERUD THIYAGARAJAN
DECLARATION
Place: Ettimadai
Date: 19-4-2023
Acknowledgement
This project has been possible due to the sincere and dedicated efforts of many.
First, we would like to thank the dean of our college for giving us the opportunity
to get involved in a project and express our skills. We thank our Elements of
Computing teacher, Ms. Sreelakshmi, for her guidance and support, without which
this project would have been impossible. Finally, we thank our parents and our
classmates, who encouraged us throughout the project.
Group: 9
CB.EN.U4AIE22130 M C DHANUSH
CB.EN.U4AIE22151 YERRA SUDHEER KUMAR CHOWDARY
CB.EN.U4AIE22111 Ayush Kumar Rai
CB.EN.U4AIE22110 Anerud Thiyagarajan
ABSTRACT:
The VM Translator report presents the design and implementation of a software tool that
translates VM language programs to Hack assembly language. It covers the architecture,
algorithms, and data structures utilized in the translation process. The report highlights the
importance of translating VM code for execution on the Hack platform. It introduces the Parser
and CodeWriter modules, responsible for parsing VM instructions and generating assembly
code, respectively. The implementation details cover handling various command types, such as
arithmetic, memory access, and flow control. Examples demonstrate the translation process, and
the report concludes with performance considerations and potential extensions.
Overall, the VM Translator report provides a comprehensive understanding of the software tool's
design, implementation, and functionality, serving as a valuable resource for learning about VM
translation and its role in executing high-level programs on the Hack computer platform.
INDEX
1)What is VM LANGUAGE
2)What are VM Instructions
3)What is a VM translator
4)Code and Explanation
5)Conclusion
6)References
WHAT IS VM LANGUAGE ?
The term "VM language" can refer to two different concepts: virtual machine languages and virtual
machine instruction set languages. Let's explore both of them:
1. Virtual Machine Languages: A virtual machine language is a programming language designed to
run on a virtual machine rather than directly on the hardware.
These languages are typically platform-independent, meaning they can run on any system that has a
compatible virtual machine implementation. Examples of virtual machine languages include Java, C#,
Python (when using the Python Virtual Machine, PVM), and Ruby (when using YARV, Yet Another
Ruby VM). Their key characteristics are:
• Platform independence: Programs written in a virtual machine language can run on different
operating systems and hardware platforms, as long as there is a compatible virtual machine
available.
• Portability: Virtual machine languages provide a level of abstraction that allows developers to
write code once and run it on various platforms, without the need for extensive modifications or
recompilation.
• Security: Virtual machine languages often implement security features, such as sandboxing, that
enhance the overall security of the executed code by preventing unauthorized access to system
resources.
• Performance optimization: Virtual machine languages employ techniques like just-in-time (JIT)
compilation and runtime optimizations to improve the execution speed of programs.
2. Virtual Machine Instruction Set Languages: A virtual machine instruction set language refers to
the assembly-level language used to write programs for a specific virtual machine's instruction
set architecture (ISA). Each virtual machine has its own set of instructions that define the
operations it can perform.
Virtual machine instruction set languages are typically low-level and closely tied to the internal workings
of the virtual machine. They are used by developers who want to write programs directly targeting the
virtual machine, bypassing higher-level languages. Examples of instruction set languages include Java
bytecode, Microsoft Intermediate Language (MSIL), and the assembly language specific to the Lua virtual
machine.
These languages provide a way to directly manipulate the virtual machine's instructions, allowing for
fine-grained control and optimization. They are often used by virtual machine implementers, language
runtime developers, or those seeking to extend the capabilities of a specific virtual machine.
In summary, the term "VM language" can refer to either a programming language designed to run on a
virtual machine or an assembly-level language specific to a virtual machine's instruction set. Both types
of VM languages offer unique benefits and play a significant role in enabling platform independence,
portability, and performance optimization in software development and execution.
WHAT ARE VM INSTRUCTIONS?
In virtual machine languages, "push" and "pop" are common instructions used to manipulate the stack
data structure within the virtual machine's runtime environment. The stack is a region of memory used
for temporary storage and is commonly utilized for parameter passing, local variable storage, and
managing program flow.
Here's an explanation of "push" and "pop" instructions and some other common operations in virtual
machine languages:
1. Push: The "push" instruction places a value onto the top of the stack. It typically involves moving
the value from a register or memory location into the stack's memory space. After a "push"
operation, the stack pointer is typically incremented to point to the new top of the stack.
2. Pop: The "pop" instruction retrieves the value from the top of the stack and stores it in a register
or memory location. It usually involves moving the value from the stack memory into the
destination location and decrementing the stack pointer to reflect the new top of the stack.
3. Arithmetic Operations: Virtual machine languages often provide instructions for basic arithmetic
operations such as addition, subtraction, multiplication, and division. These instructions operate
on values stored in registers or memory locations, allowing computations to be performed
within the virtual machine environment.
4. Control Flow: Virtual machine languages include instructions for controlling program flow, such
as conditional branching and subroutine calls. Branching instructions allow for conditional
execution based on certain conditions, while subroutine calls enable the execution of specific
code segments and the return to the calling point.
5. Memory Access: Virtual machine languages offer instructions to read from and write to memory
locations. These instructions allow data to be stored in variables, arrays, and other data
structures within the virtual machine's memory space. Memory access instructions typically
involve specifying the memory address and the source or destination register for the data
transfer.
6. Logical Operations: Virtual machine languages often provide logical operations such as bitwise
AND, OR, XOR, and logical negation. These operations manipulate binary values at the bit level,
allowing for boolean logic and bitwise manipulations.
7. Stack Manipulation: In addition to "push" and "pop," virtual machine languages may have other
stack-related instructions, such as "dup" (duplicate the value at the top of the stack), "swap"
(exchange the top two values on the stack), or "drop" (remove the top value from the stack).
8. Input and Output: Virtual machine languages typically include instructions for interacting with
input and output devices. These instructions allow reading from and writing to files, console
input and output, and communication with external devices.
9. Type Conversion: Virtual machine languages often provide instructions for converting data
between different types, such as integer to floating-point conversion or character to integer
conversion. These instructions enable data transformations to suit the requirements of different
operations and data representations.
These are some common operations and instructions found in virtual machine languages. However, the
specific set of instructions and their functionalities may vary depending on the design and purpose of
the virtual machine and its associated language.
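The stack operations described above can be illustrated with a very small interpreter. This is only a sketch for illustration: the function name run, the tuple format of the program, and the chosen opcodes are assumptions made here, not part of the translator presented in this report.

```python
# Minimal stack-machine sketch (hypothetical names, for illustration only).
def run(program):
    """Execute a list of (opcode, operand) tuples and return the final stack."""
    stack = []
    for op, arg in program:
        if op == "push":
            stack.append(arg)              # place a value on top of the stack
        elif op == "pop":
            stack.pop()                    # discard the top value
        elif op == "dup":
            stack.append(stack[-1])        # duplicate the top value
        elif op == "swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op in ("add", "sub"):
            y = stack.pop()                # the top of the stack is the second operand
            x = stack.pop()
            stack.append(x + y if op == "add" else x - y)
        else:
            raise ValueError("unknown opcode: {}".format(op))
    return stack


program = [("push", 7), ("push", 3), ("sub", None), ("dup", None), ("add", None)]
print(run(program))  # [8]
```

Here 7 and 3 are pushed, sub leaves 4, dup copies it, and add leaves 8 as the only value on the stack.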
WHAT IS A VM TRANSLATOR
A VM translator is a tool or program used to translate code written in a high-level virtual machine
language into a lower-level representation that can be executed directly by the underlying hardware or
an emulator. The purpose of a VM translator is to bridge the gap between the virtual machine language
and the target execution environment.
In the context of virtual machine languages like Java or C#, the VM translator is responsible for
translating the platform-independent bytecode instructions into the machine code instructions
specific to the target hardware or software environment. This translation is typically performed
at runtime or as a pre-compilation step, and it proceeds through the following stages:
1. Input: The VM translator takes as input the code written in the high-level virtual machine
language, such as Java bytecode or CIL (Common Intermediate Language).
2. Parsing and Analysis: The VM translator parses and analyzes the input code to understand its
structure, extract relevant information, and build an internal representation of the code. This
step involves lexical analysis, syntax parsing, and semantic analysis.
3. Translation: Based on the internal representation of the code, the VM translator translates the
high-level virtual machine instructions into the target machine code instructions or instructions
understood by an emulator. This translation process may involve optimizations, such as
bytecode-to-native code compilation, to improve performance.
4. Output: The translated code is generated as the output of the VM translator. The output can be
in various forms, depending on the execution environment. It could be machine code for a
specific hardware platform, an intermediate representation for an interpreter or just-in-time
compiler, or even a modified bytecode that enhances performance or compatibility.
5. Execution: The translated code can then be executed by the target hardware or the designated
runtime environment, which understands the machine code or modified bytecode format. This
allows the code written in the high-level virtual machine language to be executed natively or in a
simulated environment.
The VM translator plays a crucial role in enabling the execution of code written in a virtual machine
language on different platforms or runtime environments. It abstracts the complexities of the target
environment, allowing developers to write code once and run it on multiple platforms without worrying
about low-level details.
It's important to note that the specifics of the VM translator can vary depending on the virtual machine
and the associated language being used. For example, in the case of Java, the Java Virtual Machine
(JVM) performs the translation process on-the-fly during runtime, whereas in other scenarios, a
separate translator tool may be used as a pre-processing step before execution.
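For the Hack platform targeted in this report, the stages above (input, parsing, translation, output) can be sketched in a few lines. The function name translate and the restriction to two commands are assumptions made for this illustration; the full translator is described in the following sections.

```python
def translate(vm_source):
    """Translate a tiny VM subset ('push constant n' and 'add') into Hack assembly lines."""
    asm = []
    for raw in vm_source.splitlines():                 # input: one VM command per line
        command = raw.split("//")[0].strip()           # parsing: drop comments and blanks
        if not command:
            continue
        parts = command.split()
        if parts[0] == "push" and len(parts) == 3 and parts[1] == "constant":
            # translation: load the constant into D, then push D onto the stack
            asm += ["@" + parts[2], "D=A", "@SP", "A=M", "M=D", "@SP", "M=M+1"]
        elif parts[0] == "add":
            asm += ["@SP", "AM=M-1", "D=M", "A=A-1", "M=D+M"]
        else:
            raise ValueError("unsupported command: " + command)
    return asm                                         # output: list of assembly lines


print("\n".join(translate("push constant 2 // two\npush constant 3\nadd")))
```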
CODE AND EXPLANATION :
CODE:
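class Parser:
    def __init__(self, file_path):
        self.file = open(file_path, 'r')  # file containing the VM code
        self.current_command = None

    def has_more_commands(self):
        position = self.file.tell()    # remember the current read position
        lines = self.file.readlines()  # read everything that is left
        self.file.seek(position)       # restore the read position
        return bool(lines)  # Return True if a line was read, indicating there are more commands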
EXPLANATION:
The code defines a class called Parser. Let's break down its functionality:
• def __init__(self, file_path): This is the constructor method of the Parser class. It is called when
a new instance of the class is created. It takes a file_path parameter, which is the path to the file
containing VM code to be parsed. The method initializes two attributes:
• self.file: It opens the file specified by file_path in read mode and assigns the file object
to the file attribute. This attribute will be used to read the contents of the file.
• self.current_command: It is set to None and will later hold the command most
recently read by the parser.
• def has_more_commands(self): This method checks whether any lines are left in the file:
• position = self.file.tell(): It retrieves the current position in the file by calling the tell()
method on the file object. This is necessary because reading lines from the file moves
the file position.
• lines = self.file.readlines(): It reads all the remaining lines from the file using the
readlines() method. This will return a list of lines.
• self.file.seek(position): It resets the file position back to where it was before reading the
lines, so that the check itself does not consume any of the input.
• return bool(lines): It returns True if there are lines in the lines list, indicating that there
are more commands to be processed. If lines is an empty list, it returns False, indicating
that there are no more commands.
In summary, the Parser class is initialized with a file path, opens the file for reading, and provides a
method to check if there are more commands to be processed in the file.
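The tell/readlines/seek technique can be tried out without a file on disk. The sketch below uses io.StringIO as a stand-in for the opened VM file; the function name has_more is an assumption made for this illustration.

```python
import io


def has_more(stream):
    """Report whether any lines remain, without consuming them."""
    position = stream.tell()        # remember where we are
    remaining = stream.readlines()  # read everything that is left
    stream.seek(position)           # go back to the remembered position
    return bool(remaining)


stream = io.StringIO("push constant 7\nadd\n")
print(has_more(stream))            # True
print(stream.readline().strip())   # push constant 7
```

The first line is still available after the check, which shows that the read position was restored. One trade-off of this approach is that every call reads the whole remainder of the file, and a remainder consisting only of blank lines or comments still counts as "more".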
CODE:
    def advance(self):
        while True:
            raw_line = self.file.readline()
            if not raw_line:  # end of file reached: no command left
                self.current_command = ''  # classified as 'C_UNKNOWN' by command_type()
                break
            lines = raw_line.strip()
            if not lines or lines.startswith('//'):  # skip empty lines and comment lines
                continue
            command = lines.split('//')[0].strip()  # drop any trailing comment
            self.current_command = command
            break

    def command_type(self):
        if self.current_command.startswith('push'):
            return 'C_PUSH'
        elif self.current_command.startswith('pop'):
            return 'C_POP'
        elif self.current_command.startswith(('add', 'sub', 'neg', 'eq', 'gt', 'lt', 'and', 'or', 'not')):
            return 'C_ARITHMETIC'
        elif self.current_command.startswith('label'):
            return 'C_LABEL'
        elif self.current_command.startswith('goto'):
            return 'C_GOTO'
        elif self.current_command.startswith('if-goto'):
            return 'C_IF'
        elif self.current_command.startswith('function'):
            return 'C_FUNCTION'
        elif self.current_command.startswith('return'):
            return 'C_RETURN'
        elif self.current_command.startswith('call'):
            return 'C_CALL'
        else:
            return 'C_UNKNOWN'
EXPLANATION:
The code contains two methods within the Parser class: advance() and command_type(). Let's explain
each of these methods:
1. advance(self):
• This method is used to advance to the next command in the file. It reads lines from the
file until a valid command is found or the end of the file is reached.
• The method uses a while loop that runs indefinitely until a valid command is found and
breaks out of the loop.
2. command_type(self):
• This method is used to determine the type of the current command (stored in
self.current_command) and returns a corresponding string.
• It uses a series of if and elif statements to check the starting keyword of the current
command and determine its type.
• Here are the different command types and their corresponding keyword checks:
• Arithmetic commands: These are checked using a tuple of keywords that start
with 'add', 'sub', 'neg', 'eq', 'gt', 'lt', 'and', 'or', 'not'.
• Push and pop commands: These start with 'push' and 'pop' respectively.
• Label, goto, if-goto, function, return, and call commands: These start with their
respective keywords.
• If the current command doesn't match any of the known types, it is classified as
'C_UNKNOWN'.
Overall, the advance() method is responsible for reading lines from the file, ignoring empty lines and
comments, and storing the current valid command. The command_type() method uses the stored
command to determine its type and returns the corresponding string.
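As a variation on the same idea, the command type can also be looked up from the first word of the cleaned line. This is a sketch under assumed names (clean, classify, KEYWORDS), not the Parser code itself; it uses an exact match on the first word instead of startswith().

```python
ARITHMETIC = {"add", "sub", "neg", "eq", "gt", "lt", "and", "or", "not"}
KEYWORDS = {
    "push": "C_PUSH", "pop": "C_POP", "label": "C_LABEL", "goto": "C_GOTO",
    "if-goto": "C_IF", "function": "C_FUNCTION", "return": "C_RETURN", "call": "C_CALL",
}


def clean(line):
    """Remove a trailing '//' comment and surrounding whitespace."""
    return line.split("//")[0].strip()


def classify(command):
    """Map a cleaned VM command to its command type string."""
    words = command.split()
    if not words:
        return "C_UNKNOWN"
    first = words[0]
    if first in ARITHMETIC:
        return "C_ARITHMETIC"
    return KEYWORDS.get(first, "C_UNKNOWN")


for line in ["push constant 7  // a constant", "if-goto LOOP", "add", "// comment only"]:
    print(repr(clean(line)), classify(clean(line)))
```

Matching the whole first word means that a mistyped word such as "address" is reported as C_UNKNOWN, whereas a startswith('add') test would accept it as an arithmetic command.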
CODE:
    def arg1(self):
        split_command = self.current_command.split()
        if self.command_type() == 'C_ARITHMETIC':
            return split_command[0]
        elif len(split_command) > 1:  # assumed condition: the command has at least one argument
            return split_command[1]
        else:
            return None

    def arg2(self):
        split_command = self.current_command.split()
        if len(split_command) > 2:
            return int(split_command[2])
        else:
            return None
EXPLANATION:
The code contains two additional methods within the Parser class: arg1() and arg2(). Let's explain each
of these methods:
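1. arg1(self):
• This method is used to extract the first argument from the current command
(self.current_command).
• It starts by splitting the command into a list of individual words using the split() method
and assigning it to the split_command variable.
• If the command is an arithmetic command, it returns the command word itself
(split_command[0]).
• Otherwise, if the command has an argument, it returns the second word
(split_command[1]), such as a segment, label, or function name.
• If neither of the above conditions is met, it returns None to indicate that there is
no argument.
2. arg2(self):
• This method is used to extract the second argument from the current command
(self.current_command).
• Similar to arg1(), it starts by splitting the command into a list of words and assigning it
to split_command.
• If the length of split_command is greater than 2, it returns the third word
(split_command[2]) converted to an integer.
• If the length is not greater than 2, it returns None to indicate that there is no
second argument.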
In summary, arg1() extracts the first argument from the current command, considering the command
type, while arg2() extracts the second argument if it exists. These methods are used to retrieve specific
arguments from different types of commands for further processing or analysis.
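The behaviour of the two methods can be checked on a few sample commands. The helper below is a stand-alone sketch with an assumed name (split_args) that returns both arguments at once; it mirrors the rules described above rather than reproducing the Parser class.

```python
ARITHMETIC = {"add", "sub", "neg", "eq", "gt", "lt", "and", "or", "not"}


def split_args(command):
    """Return (arg1, arg2) for a cleaned, non-empty VM command string."""
    words = command.split()
    if words[0] in ARITHMETIC:
        arg1 = words[0]            # arithmetic: the command itself
    elif len(words) > 1:
        arg1 = words[1]            # segment, label, or function name
    else:
        arg1 = None                # e.g. 'return' has no argument
    arg2 = int(words[2]) if len(words) > 2 else None
    return arg1, arg2


print(split_args("push local 2"))   # ('local', 2)
print(split_args("add"))            # ('add', None)
print(split_args("label LOOP"))     # ('LOOP', None)
```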
CODE:
class CodeWriter:
    def __init__(self, output_file):
        print(output_file)  # debugging output
        self.output_file = open(output_file, 'w')
        self.label_count = 0
        self.current_file_name = None
EXPLANATION:
The code defines a class CodeWriter with an __init__() method and initializes several instance
variables. Here's an explanation of the code:
1. __init__(self, output_file):
• This is the constructor method of the CodeWriter class, called when creating a new
instance of the class.
• It takes an output_file parameter, which represents the file path or name of the output
file.
• Inside the method, it prints the output_file value (presumably for debugging purposes).
• It then opens the output_file in write mode using the open() function and assigns the
resulting file object to the self.output_file instance variable.
• It also initializes the label_count variable to 0, representing the count of labels used in
the code.
• Lastly, it sets the current_file_name variable to None, which presumably represents the
current file being processed by the CodeWriter object.
Overall, this code sets up the CodeWriter object by opening the output file, initializing counters, and
setting the current file name to None. The output_file will be used to write generated code instructions,
and the label_count and current_file_name variables will be used for tracking and managing code
generation.
CODE:
    def write_arithmetic(self, command):
        if command == 'add':
            self.output_file.write('// add\n')
            self.output_file.write('@SP\n')
            self.output_file.write('AM=M-1\n')
            self.output_file.write('D=M\n')
            self.output_file.write('A=A-1\n')
            self.output_file.write('M=D+M\n')
            self.output_file.write('\n')
        elif command == 'sub':
            self.output_file.write('// sub\n')
            self.output_file.write('@SP\n')
            self.output_file.write('AM=M-1\n')
            self.output_file.write('D=M\n')
            self.output_file.write('A=A-1\n')
            self.output_file.write('M=M-D\n')
            self.output_file.write('\n')
        elif command == 'neg':
            self.output_file.write('// neg\n')
            self.output_file.write('@SP\n')
            self.output_file.write('A=M-1\n')
            self.output_file.write('M=-M\n')
            self.output_file.write('\n')
        elif command == 'eq':
            self.output_file.write('// eq\n')
            self.output_file.write('@SP\n')
            self.output_file.write('M=M-1\n')
            self.output_file.write('A=M\n')
            self.output_file.write('D=M\n')
            self.output_file.write('A=A-1\n')
            self.output_file.write('D=M-D\n')
            self.output_file.write('@TRUE{}\n'.format(self.label_count))
            self.output_file.write('D;JEQ\n')
            self.output_file.write('@SP\n')
            self.output_file.write('A=M-1\n')
            self.output_file.write('M=0\n')
            self.output_file.write('@CONTINUE{}\n'.format(self.label_count))
            self.output_file.write('0;JMP\n')
            self.output_file.write('(TRUE{})\n'.format(self.label_count))
            self.output_file.write('@SP\n')
            self.output_file.write('A=M-1\n')
            self.output_file.write('M=-1\n')
            self.output_file.write('(CONTINUE{})\n'.format(self.label_count))
            self.label_count += 1
            self.output_file.write('\n')
        elif command == 'gt':
            self.output_file.write('// gt\n')
            self.output_file.write('@SP\n')
            self.output_file.write('M=M-1\n')
            self.output_file.write('A=M\n')
            self.output_file.write('D=M\n')
            self.output_file.write('A=A-1\n')
            self.output_file.write('D=M-D\n')
            self.output_file.write('@TRUE{}\n'.format(self.label_count))
            self.output_file.write('D;JGT\n')
            self.output_file.write('@SP\n')
            self.output_file.write('A=M-1\n')
            self.output_file.write('M=0\n')
            self.output_file.write('@CONTINUE{}\n'.format(self.label_count))
            self.output_file.write('0;JMP\n')
            self.output_file.write('(TRUE{})\n'.format(self.label_count))
            self.output_file.write('@SP\n')
            self.output_file.write('A=M-1\n')
            self.output_file.write('M=-1\n')
            self.output_file.write('(CONTINUE{})\n'.format(self.label_count))
            self.label_count += 1
            self.output_file.write('\n')
        elif command == 'lt':
            self.output_file.write('// lt\n')
            self.output_file.write('@SP\n')
            self.output_file.write('M=M-1\n')
            self.output_file.write('A=M\n')
            self.output_file.write('D=M\n')
            self.output_file.write('A=A-1\n')
            self.output_file.write('D=M-D\n')
            self.output_file.write('@TRUE{}\n'.format(self.label_count))
            self.output_file.write('D;JLT\n')
            self.output_file.write('@SP\n')
            self.output_file.write('A=M-1\n')
            self.output_file.write('M=0\n')
            self.output_file.write('@CONTINUE{}\n'.format(self.label_count))
            self.output_file.write('0;JMP\n')
            self.output_file.write('(TRUE{})\n'.format(self.label_count))
            self.output_file.write('@SP\n')
            self.output_file.write('A=M-1\n')
            self.output_file.write('M=-1\n')
            self.output_file.write('(CONTINUE{})\n'.format(self.label_count))
            self.label_count += 1
            self.output_file.write('\n')
        elif command == 'and':
            self.output_file.write('// and\n')
            self.output_file.write("@SP\n")
            self.output_file.write("AM=M-1\n")
            self.output_file.write("D=M\n")
            self.output_file.write('A=A-1\n')
            self.output_file.write('M=D&M\n')
            self.output_file.write('\n')
        elif command == 'or':
            self.output_file.write('// or\n')
            self.output_file.write("@SP\n")
            self.output_file.write("AM=M-1\n")
            self.output_file.write("D=M\n")
            self.output_file.write('A=A-1\n')
            self.output_file.write('M=D|M\n')
            self.output_file.write('\n')
        elif command == 'not':
            self.output_file.write('// not\n')
            self.output_file.write('@SP\n')
            self.output_file.write('A=M-1\n')
            self.output_file.write('M=!M\n')
            self.output_file.write('\n')
        else:
            # the exact wording of the error message is an assumption
            raise ValueError('Invalid arithmetic command: {}'.format(command))
EXPLANATION:
The code above defines a method write_arithmetic(self, command) within the CodeWriter class.
This method is responsible for writing the assembly code instructions corresponding to different
arithmetic operations based on the command parameter. Here's an explanation of the code:
1. The method starts with a series of conditional statements (if, elif, else) to handle different
arithmetic commands.
2. Each if block corresponds to a specific arithmetic command, such as add, sub, neg, eq, gt, lt,
and, or, and not.
3. Inside each block, the method writes the corresponding assembly instructions to the
self.output_file.
4. The assembly instructions are written line by line using the write() method of the
self.output_file file object.
5. The assembly code instructions are specific to the Hack assembly language, manipulating the
stack (SP) and performing arithmetic operations.
6. For conditional commands (eq, gt, lt), the code also includes labels (TRUE{} and CONTINUE{}) to
handle the branching logic based on the comparison result.
7. After writing the instructions, the label_count variable is incremented to ensure unique labels
for each conditional command.
8. If the command parameter does not match any valid arithmetic command, a ValueError is
raised with an appropriate error message.
Overall, this write_arithmetic() method allows the CodeWriter object to generate the corresponding
assembly code for various arithmetic operations based on the given command. The generated assembly
code can be written to the output file for further processing or execution.
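To see what the generated instructions do to the stack, they can be run through a very small simulator. The sketch below is an assumption-laden illustration: run_hack is a hypothetical helper that supports only straight-line code (no jumps or labels), only the computations used by add, sub, neg, and, or and not, and it ignores 16-bit overflow.

```python
def run_hack(lines, ram):
    """Execute straight-line Hack instructions (no jumps, no labels) on a RAM dict."""
    symbols = {"SP": 0, "LCL": 1, "ARG": 2, "THIS": 3, "THAT": 4}
    a = d = 0
    for line in lines:
        if line.startswith("@"):                 # A-instruction: load a symbol or number
            name = line[1:]
            a = symbols[name] if name in symbols else int(name)
            continue
        dest, comp = line.split("=")             # C-instruction of the form dest=comp
        m = ram.get(a, 0)
        value = {
            "M": m, "D": d, "A": a,
            "M-1": m - 1, "A-1": a - 1, "M+1": m + 1,
            "D+M": d + m, "M-D": m - d, "D&M": d & m, "D|M": d | m,
            "-M": -m, "!M": ~m,
        }[comp]
        if "M" in dest:
            ram[a] = value                       # M is written at the address A held before
        if "A" in dest:
            a = value
        if "D" in dest:
            d = value
    return ram


ADD = ["@SP", "AM=M-1", "D=M", "A=A-1", "M=D+M"]   # the instructions written for 'add'
ram = {0: 258, 256: 7, 257: 3}                     # SP = 258, the stack holds 7 and 3
run_hack(ADD, ram)
print(ram[0], ram[256])  # 257 10
```

After the run the stack pointer has moved down by one and the two operands have been replaced by their sum.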
CODE:
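    def write_push(self, segment, index):
        self.output_file.write('// push {} {}\n'.format(segment, index))
        if segment == 'constant':
            self.output_file.write('@{}\n'.format(index))
            self.output_file.write('D=A\n')
        elif segment == 'local':
            self.output_file.write('@LCL\n')
            self.output_file.write('D=M\n')
            self.output_file.write('@{}\n'.format(index))
            self.output_file.write('A=D+A\n')
            self.output_file.write('D=M\n')
        elif segment == 'argument':
            self.output_file.write('@ARG\n')
            self.output_file.write('D=M\n')
            self.output_file.write('@{}\n'.format(index))
            self.output_file.write('A=D+A\n')
            self.output_file.write('D=M\n')
        elif segment == 'this':
            self.output_file.write('@THIS\n')
            self.output_file.write('D=M\n')
            self.output_file.write('@{}\n'.format(index))
            self.output_file.write('A=D+A\n')
            self.output_file.write('D=M\n')
        elif segment == 'that':
            self.output_file.write('@THAT\n')
            self.output_file.write('D=M\n')
            self.output_file.write('@{}\n'.format(index))
            self.output_file.write('A=D+A\n')
            self.output_file.write('D=M\n')
        elif segment == 'pointer':
            self.output_file.write('@{}\n'.format(3 + index))  # pointer segment starts at RAM[3]
            self.output_file.write('D=M\n')
        elif segment == 'temp':
            self.output_file.write('@{}\n'.format(5 + index))  # temp segment starts at RAM[5]
            self.output_file.write('D=M\n')
        elif segment == 'static':
            self.output_file.write('@{}.{}\n'.format(self.current_file_name, index))
            self.output_file.write('D=M\n')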
EXPLANATION:
The code above defines a method write_push(self, segment, index) within the CodeWriter class.
This method is responsible for writing the assembly code instructions to push a value onto the stack
based on the given segment and index parameters. Here's an explanation of the code:
1. The method starts by writing a comment indicating the push operation being performed, using
the format '// push {} {}\n'.format(segment, index).
2. The method then uses conditional statements (if, elif, else) to handle different segments.
3. Each if block corresponds to a specific segment, such as 'constant', 'local', 'argument', 'this',
'that', 'pointer', 'temp', and 'static'.
4. Inside each block, the method writes the corresponding assembly instructions to the
self.output_file.
5. The assembly instructions are written line by line using the write() method of the
self.output_file file object.
6. The assembly code instructions are specific to the Hack assembly language and follow the
convention of accessing memory segments and storing the values in the D register.
7. The index parameter is used to determine the offset within the segment.
8. For certain segments ('local', 'argument', 'this', 'that'), the segment base address is loaded into
the D register (D=M), and then the offset (index) is added to it (A=D+A). Finally, the value at that
memory location (D=M) is loaded into the D register.
9. For the 'pointer' and 'temp' segments, the base address is computed based on the segment's
specific location in the CPU's memory map. The offset (index) is added to the base address, and
the value at that memory location (D=M) is loaded into the D register.
10. For the 'static' segment, a unique label is created by combining the self.current_file_name (the
current file's name) and the index. The value at that memory location (D=M) is loaded into the D
register.
Overall, this write_push() method allows the CodeWriter object to generate the corresponding
assembly code for pushing values onto the stack based on the given segment and index. The generated
assembly code can be written to the output file for further processing or execution.
CODE:
        # Push value on the stack
        self.output_file.write('@SP\n')
        self.output_file.write('A=M\n')
        self.output_file.write('M=D\n')
        self.output_file.write('@SP\n')
        self.output_file.write('M=M+1\n')
EXPLANATION:
The code above is the continuation of the write_push() method in the CodeWriter class. After
loading the desired value into the D register based on the segment and index, the following instructions
are written to complete the push operation:
1. self.output_file.write('@SP\n'): This instruction loads the address of the stack pointer (SP) into
the A register, preparing to store the value onto the stack.
2. self.output_file.write('A=M\n'): This instruction sets the A register to the value held in SP,
that is, the address of the first free slot at the top of the stack.
3. self.output_file.write('M=D\n'): This instruction stores the value in the D register into the
memory location pointed to by the A register. In other words, it saves the value onto the stack.
4. self.output_file.write('@SP\n'): This instruction again loads the address of the stack pointer (SP)
into the A register.
5. self.output_file.write('M=M+1\n'): This instruction increments the stack pointer so that it
points to the next free slot.
These instructions ensure that the value in the D register is pushed onto the stack, and the stack pointer
is correctly updated to reflect the addition of the new value.
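The two parts of the push operation (load the value into D, then push D) can be summarised in a compact form that returns the instructions as a list instead of writing them to a file. The names push_lines, BASES and PUSH_D, and the default file name "Main", are assumptions made for this sketch.

```python
BASES = {"local": "LCL", "argument": "ARG", "this": "THIS", "that": "THAT"}
PUSH_D = ["@SP", "A=M", "M=D", "@SP", "M=M+1"]   # store D on the stack, then SP++


def push_lines(segment, index, file_name="Main"):
    """Return the Hack instructions for 'push segment index' as a list of strings."""
    if segment == "constant":
        load = ["@{}".format(index), "D=A"]
    elif segment in BASES:
        load = ["@" + BASES[segment], "D=M", "@{}".format(index), "A=D+A", "D=M"]
    elif segment == "pointer":
        load = ["@{}".format(3 + index), "D=M"]      # pointer segment starts at RAM[3]
    elif segment == "temp":
        load = ["@{}".format(5 + index), "D=M"]      # temp segment starts at RAM[5]
    elif segment == "static":
        load = ["@{}.{}".format(file_name, index), "D=M"]
    else:
        raise ValueError("unknown segment: {}".format(segment))
    return load + PUSH_D


print("\n".join(push_lines("constant", 7)))
```

For push constant 7 this prints @7, D=A, @SP, A=M, M=D, @SP, M=M+1, one instruction per line.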
CODE:
    def write_pop(self, segment, index):
        if segment == 'local':
            self.output_file.write('@LCL\n')
            self.output_file.write('D=M\n')
            self.output_file.write('@{}\n'.format(index))
            self.output_file.write('D=D+A\n')
            self.output_file.write('@R13\n')
            self.output_file.write('M=D\n')
        elif segment == 'argument':
            self.output_file.write('@ARG\n')
            self.output_file.write('D=M\n')
            self.output_file.write('@{}\n'.format(index))
            self.output_file.write('D=D+A\n')
            self.output_file.write('@R13\n')
            self.output_file.write('M=D\n')
        elif segment == 'this':
            self.output_file.write('@THIS\n')
            self.output_file.write('D=M\n')
            self.output_file.write('@{}\n'.format(index))
            self.output_file.write('D=D+A\n')
            self.output_file.write('@R13\n')
            self.output_file.write('M=D\n')
        elif segment == 'that':
            self.output_file.write('@THAT\n')
            self.output_file.write('D=M\n')
            self.output_file.write('@{}\n'.format(index))
            self.output_file.write('D=D+A\n')
            self.output_file.write('@R13\n')
            self.output_file.write('M=D\n')
        elif segment == 'pointer':
            self.output_file.write('@{}\n'.format(3 + index))
            self.output_file.write('D=A\n')
            self.output_file.write('@R13\n')
            self.output_file.write('M=D\n')
        elif segment == 'temp':
            self.output_file.write('@{}\n'.format(5 + index))
            self.output_file.write('D=A\n')
            self.output_file.write('@R13\n')
            self.output_file.write('M=D\n')
        elif segment == 'static':
            self.output_file.write('@{}.{}\n'.format(self.current_file_name, index))
            self.output_file.write('D=A\n')
            self.output_file.write('@R13\n')
            self.output_file.write('M=D\n')
EXPLANATION:
The code is the write_pop() method in the CodeWriter class. This method is responsible for generating
assembly code to perform the pop operation, which removes a value from the stack and stores it in the
specified segment and index.
1. The method receives the segment and index parameters, which identify where the popped
value should be stored.
2. The code then checks the segment parameter to determine the appropriate assembly
instructions for each segment.
• For the 'local', 'argument', 'this', and 'that' segments:
• The corresponding segment base address (LCL, ARG, THIS, or THAT) is loaded
into the D register.
• The index is added to the base address using the D=D+A instruction to calculate
the target memory location.
• The target memory location is stored in the R13 register as a temporary storage
location.
• For the 'pointer' and 'temp' segments:
• The target memory location is the fixed base address of the segment (3 or 5)
plus the index, loaded into the D register with D=A.
• For the 'static' segment:
• The target memory location is computed by combining the current file name
and index.
3. After executing the corresponding segment-specific instructions, the target memory location is
stored in the R13 register, which acts as a temporary storage location.
4. The generated assembly code is then ready to perform the actual pop operation, which involves
transferring the value from the stack to the specified memory location. This step is
implemented in the next code snippet.
The code above demonstrates the initial steps of the write_pop() method, setting up the target
memory location to perform the pop operation based on the provided segment and index.
CODE:
        # Pop value from the stack to the destination address
        self.output_file.write('@SP\n')
        self.output_file.write('AM=M-1\n')
        self.output_file.write('D=M\n')
        self.output_file.write('@R13\n')
        self.output_file.write('A=M\n')
        self.output_file.write('M=D\n')
EXPLANATION:
The code is the continuation of the write_pop() method in the CodeWriter class. After setting up the
target memory location, the code performs the actual pop operation by transferring the value from the
stack to the specified memory location.
1. self.output_file.write('@SP\n'): This line loads the address of the stack pointer (SP) into the A register.
2. self.output_file.write('AM=M-1\n'): This line decrements the stack pointer and, in the same
instruction, sets the A register to the new top of the stack.
3. self.output_file.write('D=M\n'): This line reads the value from the stack and stores it in the D
register.
4. self.output_file.write('@R13\n'): This line loads the address of R13, which holds the target
memory location, into the A register.
5. self.output_file.write('A=M\n'): This line sets the A register to the target memory location
stored in R13.
6. self.output_file.write('M=D\n'): This line stores the value from the D register into the target
memory location.
By performing these steps, the code effectively pops the value from the stack and stores it in the
destination address specified by the segment and index.
Note that the stack pointer is already updated by the AM=M-1 instruction, so no further cleanup is
needed after the value has been stored.
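The complete instruction sequence for a pop can be assembled in the same list-returning style. This is a sketch under assumed names (pop_lines, BASES, POP_TO_R13_TARGET, and the default file name "Main"), not the CodeWriter method itself.

```python
BASES = {"local": "LCL", "argument": "ARG", "this": "THIS", "that": "THAT"}
# Save the target address in R13, pop the top of the stack into D, store D at the target.
POP_TO_R13_TARGET = ["@R13", "M=D", "@SP", "AM=M-1", "D=M", "@R13", "A=M", "M=D"]


def pop_lines(segment, index, file_name="Main"):
    """Return the Hack instructions for 'pop segment index' as a list of strings."""
    if segment in BASES:
        address = ["@" + BASES[segment], "D=M", "@{}".format(index), "D=D+A"]
    elif segment == "pointer":
        address = ["@{}".format(3 + index), "D=A"]
    elif segment == "temp":
        address = ["@{}".format(5 + index), "D=A"]
    elif segment == "static":
        address = ["@{}.{}".format(file_name, index), "D=A"]
    else:
        raise ValueError("cannot pop to segment: {}".format(segment))
    return address + POP_TO_R13_TARGET


print("\n".join(pop_lines("local", 2)))
```

R13 is needed because computing the target address uses the D register, which is also needed to carry the popped value. For 'pointer', 'temp' and 'static' the address is already known when the code is generated, so a shorter sequence without R13 would also be possible.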
CODE:
    def write_label(self, label, function):
        self.output_file.write('({}${})\n'.format(function, label))

    def write_goto(self, label, function_name):
        # function_name is not used here; label must match the name written by write_label
        self.output_file.write('@{}\n'.format(label))
        self.output_file.write('0;JMP\n')

    def write_if_goto(self, label, condition):
        # Pop the top of the stack into D
        self.output_file.write('@SP\n')
        self.output_file.write('M=M-1\n')
        self.output_file.write('A=M\n')
        self.output_file.write('D=M\n')
        if condition == "EQ":
            self.output_file.write('@{}\n'.format(label))
            self.output_file.write('D;JEQ\n')
        elif condition == "GT":
            self.output_file.write('@{}\n'.format(label))
            self.output_file.write('D;JGT\n')
        elif condition == "LT":
            self.output_file.write('@{}\n'.format(label))
            self.output_file.write('D;JLT\n')

    def write_function(self, function_name, num_locals):
        self.output_file.write('({})\n'.format(function_name))
        for _ in range(num_locals):
            self.write_push('constant', 0)
EXPLANATION:
The code above contains the CodeWriter methods that generate assembly instructions for labels,
unconditional jumps, conditional jumps, and function declarations. Here is a breakdown of each method:
1. write_label(self, label, function): This method writes the assembly code for a label declaration.
It takes the label and function as parameters and generates the label declaration using the
format ({function}${label}). The resulting assembly code marks the target location for
jumping or branching.
2. write_goto(self, label, function_name): This method writes the assembly code for an
unconditional jump (goto). It loads the address of the target label into the A register with an @
instruction and then emits 0;JMP, which always jumps.
3. write_if_goto(self, label, condition): This method writes the assembly code for a conditional
jump (if-goto) statement. It takes the label and condition as parameters and generates the
assembly instructions for performing a conditional jump based on the condition. It first pops the
value from the stack and stores it in the D register. Then, depending on the specified condition,
it uses the appropriate jump instruction (D;JEQ, D;JGT, or D;JLT) to perform the conditional
jump.
4. write_function(self, function_name, num_locals): This method writes the assembly code for a
function declaration. It takes the function_name and num_locals as parameters and generates
the assembly instructions for declaring a function. It starts with a label declaration for the
function name (({function_name})) and then uses a loop to initialize the local variables by
pushing the value 0 onto the stack using the write_push() method with the segment set to
'constant' and the index set to 0.
These methods provide the necessary functionality for generating assembly code related to labels,
jumps, and function declarations in the Hack computer's assembly language.
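The sketch below shows one possible arrangement of the same four generators. It is not the CodeWriter of this report: the helper names are hypothetical and the functions return lists of lines instead of writing to a file. It assumes every label is qualified with the enclosing function name so that jump targets match the declaration, and it uses D;JNE for if-goto, which is the convention of the nand2tetris VM specification (jump when the popped value is non-zero) and differs from the condition-based variant described above.

```python
def label_asm(function_name, label):
    """Return the Hack lines declaring a label scoped to one function."""
    return ["({}${})".format(function_name, label)]


def goto_asm(function_name, label):
    """Return the Hack lines for an unconditional jump to a scoped label."""
    return ["@{}${}".format(function_name, label), "0;JMP"]


def if_goto_asm(function_name, label):
    """Pop the top of the stack and jump when the popped value is non-zero."""
    return [
        "@SP",
        "AM=M-1",   # decrement SP and point A at the popped value
        "D=M",      # D = popped value
        "@{}${}".format(function_name, label),
        "D;JNE",    # jump if D != 0
    ]


def function_asm(function_name, num_locals):
    """Declare a function entry label and push num_locals zeros."""
    lines = ["({})".format(function_name)]
    for _ in range(num_locals):
        lines += ["@SP", "A=M", "M=0", "@SP", "M=M+1"]  # push constant 0
    return lines


print("\n".join(goto_asm("Main.loop", "END")))
```

Returning lists keeps each generator easy to check on its own; a writer class could join the lines and write them to the output file.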
CODE:
def write_return(self):
    self.output_file.write('// return\n')
    # FRAME = LCL (kept in R13)
    self.output_file.write('@LCL\n')
    self.output_file.write('D=M\n')
    self.output_file.write('@R13\n')
    self.output_file.write('M=D\n')
    # RET = *(FRAME-5) (kept in R14)
    self.output_file.write('@5\n')
    self.output_file.write('A=D-A\n')
    self.output_file.write('D=M\n')
    self.output_file.write('@R14\n')
    self.output_file.write('M=D\n')
    # *ARG = pop()
    self.write_pop('argument', 0)
    # write_pop() uses R13 for the destination address, so reload FRAME from LCL
    self.output_file.write('@LCL\n')
    self.output_file.write('D=M\n')
    self.output_file.write('@R13\n')
    self.output_file.write('M=D\n')
    # SP = ARG + 1
    self.output_file.write('@ARG\n')
    self.output_file.write('D=M+1\n')
    self.output_file.write('@SP\n')
    self.output_file.write('M=D\n')
    # THAT = *(FRAME-1): decrement the frame pointer first, then read through it
    self.output_file.write('@R13\n')
    self.output_file.write('AM=M-1\n')
    self.output_file.write('D=M\n')
    self.output_file.write('@THAT\n')
    self.output_file.write('M=D\n')
    # THIS = *(FRAME-2)
    self.output_file.write('@R13\n')
    self.output_file.write('AM=M-1\n')
    self.output_file.write('D=M\n')
    self.output_file.write('@THIS\n')
    self.output_file.write('M=D\n')
    # ARG = *(FRAME-3)
    self.output_file.write('@R13\n')
    self.output_file.write('AM=M-1\n')
    self.output_file.write('D=M\n')
    self.output_file.write('@ARG\n')
    self.output_file.write('M=D\n')
    # LCL = *(FRAME-4)
    self.output_file.write('@R13\n')
    self.output_file.write('AM=M-1\n')
    self.output_file.write('D=M\n')
    self.output_file.write('@LCL\n')
    self.output_file.write('M=D\n')
    # goto RET
    self.output_file.write('@R14\n')
    self.output_file.write('A=M\n')
    self.output_file.write('0;JMP\n')
EXPLANATION:
The write_return() method generates the Hack assembly code for the VM return command. Here is a
breakdown of the code:
1. @LCL\nD=M\n@R13\nM=D: This code saves the value of the current LCL (frame) register to the
R13 register.
2. @5\nA=D-A\nD=M\n@R14\nM=D: This code reads the return address stored at FRAME - 5 and
saves it in the R14 register. It has to be saved before the return value is written to *ARG,
because the two locations coincide when the function was called with no arguments.
3. self.write_pop('argument', 0): This code pops a value from the stack and stores it in the location
pointed to by the ARG register. It uses the write_pop() method with the segment set to
'argument' and the index set to 0.
4. @ARG\nD=M+1\n@SP\nM=D: This code sets the stack pointer (SP) to ARG + 1, so that the return
value sits on top of the caller's stack in place of the arguments.
5. THAT = *(FRAME-1): This code reads the caller's saved THAT value from the address FRAME - 1,
using R13 as the frame pointer, and stores it in the THAT register.
6. Similar steps are repeated for the THIS, ARG, and LCL registers, which are restored from
FRAME - 2, FRAME - 3, and FRAME - 4 respectively.
7. @R14\nA=M\n0;JMP: This code jumps to the return address stored in the R14 register,
effectively returning to the caller function.
The write_return() method handles the necessary steps to properly return from a function, including
restoring the caller's execution state and jumping back to the caller's code.
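The order of these steps can be checked with a short Python model that applies them to a list standing in for the Hack RAM. This is a sketch for illustration only: simulate_return and the sample frame addresses are assumptions, and the model works on Python integers rather than generated assembly.

```python
SP, LCL, ARG, THIS, THAT = 0, 1, 2, 3, 4  # fixed Hack pointer addresses


def simulate_return(ram):
    """Apply the VM return steps to a list-based RAM; return the jump target."""
    frame = ram[LCL]                 # FRAME = LCL
    ret = ram[frame - 5]             # RET = *(FRAME-5), read before *ARG changes
    ram[SP] -= 1
    ram[ram[ARG]] = ram[ram[SP]]     # *ARG = pop(): return value replaces argument 0
    ram[SP] = ram[ARG] + 1           # SP = ARG + 1
    ram[THAT] = ram[frame - 1]       # restore the caller's pointers
    ram[THIS] = ram[frame - 2]
    ram[ARG] = ram[frame - 3]
    ram[LCL] = ram[frame - 4]
    return ret


ram = [0] * 512
# Saved frame at 312..316: return address, then the caller's LCL, ARG, THIS, THAT
ram[312:317] = [1000, 261, 256, 3000, 4000]
ram[ARG], ram[LCL] = 310, 317        # callee had two arguments at 310 and 311
ram[317] = 99                        # return value on top of the stack
ram[SP] = 318
target = simulate_return(ram)
print(target, ram[SP], ram[310])
```

In this example the return value ends up at address 310, where the first argument used to be, and the stack pointer is left just above it.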
CODE:
def write_call(self, num_args):
    # Build a unique return label for this call site
    return_address = 'RETURN_ADDRESS_{}'.format(self.return_address_count)
    self.return_address_count += 1
    # Push the return address
    self.output_file.write('@{}\n'.format(return_address))
    self.output_file.write('D=A\n')
    self.write_push('D', 'A')
    # Push LCL, ARG, THIS, THAT
    self.write_push('LCL', 'M')
    self.write_push('ARG', 'M')
    self.write_push('THIS', 'M')
    self.write_push('THAT', 'M')
EXPLANATION:
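The write_call() method generates the Hack assembly code for the VM call command. Here is a
breakdown of the code:
1. return_address = 'RETURN_ADDRESS_{}'.format(self.return_address_count): This line builds a
unique label name for the return address of this call.
2. self.return_address_count += 1: This line increments the counter so that the next call
receives a different label.
3. self.output_file.write('@{}\n'.format(return_address)): This line loads the address of the
return label into the A register.
4. self.output_file.write('D=A\n'): This line sets the D register to the value of the return address.
5. self.write_push('D', 'A'): This line calls the write_push() method to push the value of the return
address onto the stack. It uses the 'D' segment and 'A' index.
6. The following lines call the write_push() method four times to push the values of LCL, ARG,
THIS, and THAT onto the stack. Each call uses the respective segment and 'M' index to push the
value stored in the corresponding register onto the stack.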
The write_call() method handles the necessary steps to perform a function call, including pushing the
return address and the values of LCL, ARG, THIS, and THAT onto the stack. This sets up the stack frame
for the called function and prepares for its execution.
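The difference between pushing the return address and pushing the four pointers is small but easy to get wrong: the return address is a constant, so it is loaded with D=A, whereas LCL, ARG, THIS, and THAT are memory cells whose contents are loaded with D=M. The following sketch spells this out. The helper names push_d and save_frame_asm are hypothetical and the functions return lists of lines rather than using the report's write_push() method.

```python
def push_d():
    """Hack lines that push the D register onto the stack."""
    return ["@SP", "A=M", "M=D", "@SP", "M=M+1"]


def save_frame_asm(return_label):
    """Push the return address, then the caller's LCL, ARG, THIS and THAT."""
    lines = ["@" + return_label, "D=A"] + push_d()   # address is a constant: D=A
    for pointer in ("LCL", "ARG", "THIS", "THAT"):
        lines += ["@" + pointer, "D=M"] + push_d()   # pointer contents: D=M
    return lines


frame_lines = save_frame_asm("RETURN_ADDRESS_0")
print(len(frame_lines))
```

Each of the five pushes costs seven instructions, so saving the frame takes 35 instructions per call in this arrangement.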
CODE:
    # ARG = SP - num_args - 5
    self.output_file.write('@SP\n')
    self.output_file.write('D=M\n')
    self.output_file.write('@{}\n'.format(num_args))
    self.output_file.write('D=D-A\n')
    self.output_file.write('@5\n')
    self.output_file.write('D=D-A\n')
    self.output_file.write('@ARG\n')
    self.output_file.write('M=D\n')
    # LCL = SP
    self.output_file.write('@SP\n')
    self.output_file.write('D=M\n')
    self.output_file.write('@LCL\n')
    self.output_file.write('M=D\n')
    # goto the called function
    self.output_file.write('@{}\n'.format(self.function_name))
    self.output_file.write('0;JMP\n')
    # Declare the return address label
    self.output_file.write('({})\n'.format(return_address))
EXPLANATION:
This code is the remainder of the write_call() method and handles the remaining steps of the
function call. Here is a breakdown of the code:
1. self.output_file.write('@SP\n'): This line loads the address of the stack pointer into the A
register.
2. self.output_file.write('D=M\n'): This line sets the D register to the value of the stack pointer.
3. self.output_file.write('@{}\n'.format(num_args)): This line loads the constant num_args into
the A register.
4. self.output_file.write('D=D-A\n'): This line subtracts num_args from the stack pointer value in
the D register.
5. self.output_file.write('@5\n'): This line loads the constant 5 into the A register.
6. self.output_file.write('D=D-A\n'): This line subtracts 5 from the value in the D register. This
accounts for the return address and the four additional values pushed onto the stack.
7. self.output_file.write('@ARG\n'): This line loads the address of the ARG register into the A
register.
8. self.output_file.write('M=D\n'): This line sets the value of the ARG register to the calculated
value.
9. self.output_file.write('@SP\n'): This line loads the stack pointer address into the A register.
10. self.output_file.write('D=M\n'): This line sets the D register to the value of the stack pointer.
11. self.output_file.write('@LCL\n'): This line loads the address of the LCL register into the A
register.
12. self.output_file.write('M=D\n'): This line sets the value of the LCL register to the value of the
stack pointer, effectively setting up the new stack frame.
13. self.output_file.write('@{}\n'.format(self.function_name)): This line loads the address of the
called function's label into the A register.
14. self.output_file.write('0;JMP\n'): This line jumps unconditionally to the called function.
15. self.output_file.write('({})\n'.format(return_address)): This line declares the return address
label directly after the jump, so execution resumes here when the called function returns.
This code sets up the ARG and LCL registers and transfers control to the called function by
jumping to its label. It also declares the return address label, which is used for returning from the called
function.
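The pointer arithmetic ARG = SP - num_args - 5 is easiest to verify with concrete numbers. The following Python model applies the whole call sequence to a list standing in for the Hack RAM. It is a sketch under stated assumptions: simulate_call and the sample addresses are illustrative, and the return address is passed in as a plain integer.

```python
SP, LCL, ARG, THIS, THAT = 0, 1, 2, 3, 4  # fixed Hack pointer addresses


def simulate_call(ram, return_address, num_args):
    """Mimic the VM call protocol on a list-based RAM."""
    # The tuple is built before the loop, so it captures the caller's values
    for value in (return_address, ram[LCL], ram[ARG], ram[THIS], ram[THAT]):
        ram[ram[SP]] = value            # push one saved value
        ram[SP] += 1
    ram[ARG] = ram[SP] - num_args - 5   # ARG points at the first pushed argument
    ram[LCL] = ram[SP]                  # the callee's locals start at the new SP
    return ram


ram = [0] * 512
ram[LCL], ram[ARG], ram[THIS], ram[THAT] = 261, 256, 3000, 4000
ram[310], ram[311] = 7, 8               # two arguments already pushed by the caller
ram[SP] = 312
simulate_call(ram, return_address=1000, num_args=2)
print(ram[ARG], ram[LCL], ram[312:317])
```

With the stack pointer at 312 and two arguments, the five saved values occupy 312 to 316, so SP becomes 317 and ARG = 317 - 2 - 5 = 310, the address of the first argument.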
CODE:
f.write(lines + "\n")
def close(self):
self.output_file.close()
EXPLANATION:
The close() method closes the output file object that was opened when the CodeWriter was created.
The write() method does not need to close self.output_file, because the file it writes to is opened
inside a with block and is therefore closed automatically when the block ends.
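One way to make sure close() is always called is to give the writer the context manager methods, so that it can be used in a with block itself. The class below is a hypothetical stand-in for CodeWriter written only to illustrate this; the name AsmWriter and the temporary output path are assumptions.

```python
import os
import tempfile


class AsmWriter:
    """Hypothetical stand-in for CodeWriter that owns a single output file."""

    def __init__(self, path):
        self.output_file = open(path, "w")

    def write(self, line):
        self.output_file.write(line + "\n")

    def close(self):
        self.output_file.close()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()      # runs even if translation raises an exception
        return False      # do not swallow the exception


path = os.path.join(tempfile.mkdtemp(), "SimpleAdd.asm")
with AsmWriter(path) as writer:
    writer.write("@SP")
    writer.write("M=M+1")

with open(path) as f:
    written = f.read()
print(written)
```

With this arrangement the caller no longer has to remember to call close() at the end of main().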
CODE:
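def main():
    file_path = "C:/Users/aneru/OneDrive/Desktop/AMRITA/SEM-2/Assignments/EOC-2/assignment5/SimpleAdd.vm"
    parser = Parser(file_path)
    line_index = 1
    code_writer = CodeWriter('SimpleAdd.asm')
    funcount = 1
    # Command-type names other than 'C_ARITHMETIC' follow the usual nand2tetris naming
    while parser.has_more_commands():
        parser.advance()
        command_type = parser.command_type()
        if command_type == 'C_ARITHMETIC':
            code_writer.write_arithmetic(parser.arg1())
        elif command_type == 'C_PUSH':
            code_writer.write_push(parser.arg1(), parser.arg2())
        elif command_type == 'C_POP':
            code_writer.write_pop(parser.arg1(), parser.arg2())
        elif command_type == 'C_LABEL':
            code_writer.write_label(parser.arg1(), parser.arg2())
        elif command_type == 'C_GOTO':
            code_writer.write_goto(parser.arg1(), parser.arg2())
        elif command_type == 'C_IF':
            code_writer.write_if_goto(parser.arg1(), parser.arg2())
        elif command_type == 'C_FUNCTION':
            code_writer.write_function(parser.arg1(), parser.arg2())
        elif command_type == 'C_RETURN':
            # write_return() is defined with no parameters other than self
            code_writer.write_return()
        elif command_type == 'C_CALL':
            # NOTE: this call and the write_call() definition must agree on parameters
            code_writer.write_call(parser.arg1(), parser.arg2())
        else:
            print('Invalid command type: {}'.format(command_type))
    code_writer.close()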
EXPLANATION:
The main() function ties the Parser and CodeWriter classes together. It performs the following steps:
1. It sets the file_path variable to the path of the input file (SimpleAdd.vm) containing the VM
commands to be translated.
2. It creates an instance of the Parser class, passing the file_path as an argument. The Parser class
is responsible for parsing the VM commands from the input file.
3. It initializes the line_index variable to 1. This variable can be used to track the current line
number being processed, although it is not used in this code.
4. It creates an instance of the CodeWriter class, passing the output file name (SimpleAdd.asm) as
an argument. The CodeWriter class is responsible for translating the parsed VM commands into
assembly code and writing it to the output file.
5. It initializes the funcount variable to 1. This variable can be used to keep track of function calls
or labels, although it is not used in this code.
6. It enters a loop that continues as long as there are more commands to process
(parser.has_more_commands()). Inside the loop, it performs the following steps:
• It calls parser.advance() to read the next command.
• It calls parser.command_type() to determine the type of the current command.
• Based on the command type, it calls the appropriate method of the CodeWriter
instance to translate and write the command to the output file.
• If the command type is not recognized, it prints an error message indicating an invalid
command type.
This code provides the structure for translating VM code to assembly code using the Parser and
CodeWriter classes. The arguments passed to each CodeWriter method must match that method's
definition: write_return(), for example, is defined with no parameters other than self, and
write_call() is defined with num_args only.
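The chain of command-type checks in the loop can also be expressed as a lookup table. The sketch below shows that alternative on a reduced set of commands. Everything in it is illustrative: ListParser and RecordingWriter are small stand-ins written for this example, not the report's Parser and CodeWriter, and the command-type names other than C_ARITHMETIC are assumed to follow the usual nand2tetris naming.

```python
class ListParser:
    """Minimal stand-in for Parser that reads (type, arg1, arg2) tuples."""

    def __init__(self, commands):
        self.commands = list(commands)
        self.current = None

    def has_more_commands(self):
        return bool(self.commands)

    def advance(self):
        self.current = self.commands.pop(0)

    def command_type(self):
        return self.current[0]

    def arg1(self):
        return self.current[1]

    def arg2(self):
        return self.current[2]


class RecordingWriter:
    """Stand-in for CodeWriter that records calls instead of writing assembly."""

    def __init__(self):
        self.calls = []

    def write_arithmetic(self, op):
        self.calls.append(("arithmetic", op))

    def write_push(self, segment, index):
        self.calls.append(("push", segment, index))

    def write_pop(self, segment, index):
        self.calls.append(("pop", segment, index))

    def write_return(self):
        self.calls.append(("return",))


def translate(parser, code_writer):
    """Drive the parser and forward each command to the matching writer method."""
    handlers = {
        "C_ARITHMETIC": lambda: code_writer.write_arithmetic(parser.arg1()),
        "C_PUSH": lambda: code_writer.write_push(parser.arg1(), parser.arg2()),
        "C_POP": lambda: code_writer.write_pop(parser.arg1(), parser.arg2()),
        "C_RETURN": lambda: code_writer.write_return(),
    }
    while parser.has_more_commands():
        parser.advance()
        kind = parser.command_type()
        if kind not in handlers:
            raise ValueError("Invalid command type: {}".format(kind))
        handlers[kind]()


writer = RecordingWriter()
simple_add = [("C_PUSH", "constant", 7), ("C_PUSH", "constant", 8),
              ("C_ARITHMETIC", "add", None)]
translate(ListParser(simple_add), writer)
print(writer.calls)
```

Keeping the argument lists inside the table makes it harder for a call to drift out of step with the method it targets, and raising an exception for an unknown type stops the translation instead of silently producing an incomplete file.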
Conclusion
The Python script presented in this report offers a straightforward approach to converting VM code
into Hack assembly language. It defines methods for handling the different types of VM commands:
push, pop, arithmetic and logical operations, branching, and function call and return. Executing the
script on a .vm file generates the corresponding .asm file, which expresses the same program in
low-level assembly language.
References
https://www.smartrek.io/kb/the-virtual-machine-code-structure-the-vm-scripting-language/
https://stackoverflow.com/questions/4640809/what-is-a-vm-and-why-do-dynamic-languages-need-one
https://cloud.google.com/learn/what-is-a-virtual-machine
https://pypi.org/project/vm-translator/
https://www.coursera.org/lecture/nand2tetris2/unit-1-8-vm-translator-proposed-implementation-qmJl3
https://www.tylercrosse.com/ideas/nand2tetris-vm-translator