CMPS 10 - Programming Assignments

Lab Assignment 3

Due: Monday November 27 10:00 pm

Purpose:
Write simple machine language and assembly language programs for a simulated processor.

Description:
Copy the files Translate.cpp, Run.cpp, EX1, EX2, EX3, and EX4 from the course locker (/afs/cats.ucsc.edu/courses/cmps010-pt/) to your cs10 directory. (See lab2 for instructions on how to do this.) The files Run.cpp and Translate.cpp are C++ source files. Compile them using g++ and name the object files Run and Translate, respectively. Type:

% g++ -o Translate Translate.cpp
% g++ -o Run Run.cpp

Since this is the third time we've done this, I'm sure you see the pattern. To compile any C++ source file type:

% g++ -o object_file_name source_file_name.cpp

If you leave out the name of the object file (i.e. % g++ source_file_name.cpp), then the name of the object file defaults to a.out. As usual, do not try to view the contents of the binary files Run and Translate.

Run simulates the behavior of a simple processor whose instruction set is given in Fig. 6.5 on page 248 of the text, and Translate simulates the behavior of an assembler for the same processor. The files EX1, EX2, EX3, and EX4 are the four assembly language examples which will be covered in class. However, the machine language and assembly language used in this assignment differ from those described in section 6.3 in several respects.

First, the machine language format for this assignment consists of an 8 bit op-code, followed by a single 8 bit address field; so there is still one address field, and 2 bytes per instruction, but the bits are distributed differently. With 8 bits for the op code we could have as many as 256 operations in our instruction set, but we do not. We still have only the 16 operations described on page 244. The smaller address field limits memory to 256 bytes. This different distribution of bits means that the address field of one instruction now fits in a single cell. Thus when an instruction is stored in memory, each cell contains either an op-code, an address, or a data value. In fact, a machine language program will always start at memory address 0, so that op-codes will always occupy cells with even addresses, and address fields will occupy cells with odd addresses. Data values can reside in either even or odd addressed cells. We will write our machine code in decimal (i.e. base 10) rather than binary. Of course if this were a real machine language instead of a simulation, everything would be done in binary.

With this information you can now translate the source files EX1, EX2, EX3, and EX4 into machine language by hand. Create a file called ex1 which contains the machine language translation of EX1. (I will use upper case for source file names, and lower case for the corresponding object file names. I suggest you follow the same convention.) It is helpful to see EX1 and ex1 side by side:

EX1: Assembly code               ex1: Machine code
      .BEGIN
      IN        A                13  17
      IN        B                13  18
      IN        C                13  19
      LOAD      A                0   17
      ADD       B                3   18
      SUBTRACT  C                5   19
      STORE     D                1   20
      OUT       D                14  20
      HALT                       15
A:    .DATA     0                0
B:    .DATA     0                0
C:    .DATA     0                0
D:    .DATA     0                0
      .END

The machine code for ex1 is loaded into memory starting at address 0. In terms of cell addresses ex1 looks like

address:	0	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20
ex1:	13	17	13	18	13	19	0	17	3	18	5	19	1	20	14	20	15	0	0	0	0

Observe that the op codes are in the even cells numbered 0 through 16, the address fields are in odd cells numbered 1 through 15, and cells 17 through 20 contain data. If you translate EX2 in the same way you should get

address:	0	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20	21	22	23
ex2:	13	21	0	23	7	21	11	14	0	21	1	22	8	18	5	21	1	22	14	22	15	0	0	0
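
The cell layout described above can be checked with a short sketch. The Python below is not part of the assignment; it walks the ex1 memory image and splits it into op-code/address pairs followed by data cells. The mnemonic table lists only the op-codes that appear in the EX1 listing, and the function name decode is made up for illustration.

```python
# Op-code numbers taken from the EX1 listing only; the rest of the
# 16-operation instruction set is left out of this sketch.
MNEMONICS = {0: "LOAD", 1: "STORE", 3: "ADD", 5: "SUBTRACT",
             13: "IN", 14: "OUT", 15: "HALT"}

# The ex1 memory image, cell 0 first.
EX1_CELLS = [13, 17, 13, 18, 13, 19, 0, 17, 3, 18, 5, 19,
             1, 20, 14, 20, 15, 0, 0, 0, 0]

def decode(cells):
    """Return (instructions, data): instructions are (cell, mnemonic, address) triples."""
    instructions = []
    pc = 0
    while True:
        op = cells[pc]
        if op == 15:
            # HALT occupies one cell and has no address field.
            instructions.append((pc, "HALT", None))
            pc += 1
            break
        instructions.append((pc, MNEMONICS[op], cells[pc + 1]))
        pc += 2
    # Everything after HALT is data.
    return instructions, cells[pc:]

instructions, data = decode(EX1_CELLS)
for cell, name, address in instructions:
    print(cell, name, "" if address is None else address)
print("data cells:", data)
```

Every instruction starts in an even-numbered cell, and the four cells after HALT hold the data values A through D.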

Translate EX3 and EX4 into machine language yourself. Create four text files called ex1, ex2, ex3, and ex4 containing the machine code translations of EX1, EX2, EX3, and EX4 respectively. These object files should contain non-negative integers, white space characters (spaces, tabs, newlines), and nothing else.

You can now run these programs by typing:

% Run object_file_name

For instance, you can type:

% Run ex1

You will see a listing of the object file ex1, its length in bytes, the message "Executing program:", and then the program just hangs. Recall that EX1 takes three integers a, b, c, as input and prints out the value a+b-c. The program is waiting for you to enter three values. You may enter these all on a line separated by spaces, followed by a return; or you may enter them on three separate lines, each with its own return. Try entering the numbers 5 7 3. You will see:

Executing program:
5 7 3
9
Execution complete.
%
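
To see why 5 7 3 produces 9, it can help to trace the program by hand. The Python sketch below does that for ex1. It is not how Run.cpp is implemented; it assumes a single register and the usual meaning of each op-code (LOAD copies a cell into the register, STORE copies the register into a cell, and so on), and it handles only the op-codes that appear in EX1. The function name run is made up for illustration.

```python
# The ex1 memory image from the handout, cell 0 first.
EX1_CELLS = [13, 17, 13, 18, 13, 19, 0, 17, 3, 18, 5, 19,
             1, 20, 14, 20, 15, 0, 0, 0, 0]

def run(cells, inputs):
    """Execute a memory image and return the list of values it outputs."""
    mem = list(cells)        # work on a copy so the image is not modified
    pending = list(inputs)   # values the user would type, in order
    outputs = []
    register = 0             # assumed single register
    pc = 0                   # programs always start at address 0
    while True:
        op = mem[pc]
        if op == 15:         # HALT
            break
        address = mem[pc + 1]
        pc += 2
        if op == 0:          # LOAD
            register = mem[address]
        elif op == 1:        # STORE
            mem[address] = register
        elif op == 3:        # ADD
            register += mem[address]
        elif op == 5:        # SUBTRACT
            register -= mem[address]
        elif op == 13:       # IN
            mem[address] = pending.pop(0)
        elif op == 14:       # OUT
            outputs.append(mem[address])
        else:
            raise ValueError(f"op-code {op} is not covered by this sketch")
    return outputs

print(run(EX1_CELLS, [5, 7, 3]))  # [9], matching the session shown above
```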

Our simulated processor is not capable of sending or receiving any character data, so no user-friendly prompting for input is possible. Try running ex2, ex3, and ex4 in the same way. Of course you should first recall what each of those programs is supposed to do and act accordingly. (A description of each program is included in its source file.) If ex3 and ex4 do not behave as expected, then you did not translate EX3 and EX4 correctly. Check your translation carefully for errors.

As I'm sure you've noticed, translating assembly code into machine code by hand is a hassle. Imagine doing this for a real processor with anywhere from 40 to 300 op-codes, up to six address fields, and all in binary of course. If only we could automate this translation process. But we can! Type:

% Translate EX1 ex1a

You should see a message that EX1 was successfully translated, and the output was placed in file ex1a. Compare the object files ex1 and ex1a. They are identical. Try translating EX2, EX3, and EX4 in the same way and call the object files ex2a, ex3a, and ex4a respectively. Comparison should show that exN is identical to exNa, for N=1, 2, 3, and 4. Observe that the syntax for Translate is reversed from the syntax for the g++ compiler. The proper form is:

% Translate source_file object_file

Be warned that if object_file already exists in your current directory, it will be overwritten. If ex3a and ex4a are different from your translations ex3 and ex4, then you made a mistake in translation. If so, Run ex3a and ex4a; they should now execute properly.
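
For the curious, one common way an assembler does this job is in two passes: the first pass assigns an address to every label, and the second emits op-codes with the labels replaced by those addresses. The Python sketch below illustrates the idea on EX1. It is an assumption about the general technique, not a description of Translate.cpp, and it knows only the op-codes that appear in EX1. The names assemble and EX1_SOURCE are made up for illustration.

```python
# Op-code numbers taken from the EX1 listing only.
OPCODES = {"LOAD": 0, "STORE": 1, "ADD": 3, "SUBTRACT": 5,
           "IN": 13, "OUT": 14, "HALT": 15}

EX1_SOURCE = """
      .BEGIN
      IN        A
      IN        B
      IN        C
      LOAD      A
      ADD       B
      SUBTRACT  C
      STORE     D
      OUT       D
      HALT
A:    .DATA     0
B:    .DATA     0
C:    .DATA     0
D:    .DATA     0
      .END
"""

def assemble(source):
    """Translate assembly text into a list of decimal memory cells."""
    lines = [ln.split() for ln in source.splitlines() if ln.split()]
    body = lines[lines.index([".BEGIN"]) + 1:lines.index([".END"])]

    # Pass 1: record the address of each label.
    symbols = {}
    parsed = []
    address = 0
    for tokens in body:
        if tokens[0].endswith(":"):
            # A repeated label keeps its first address.
            symbols.setdefault(tokens[0][:-1], address)
            tokens = tokens[1:]
        parsed.append(tokens)
        # HALT and .DATA occupy one cell; everything else occupies two.
        address += 1 if tokens[0] in ("HALT", ".DATA") else 2

    # Pass 2: emit cells, replacing labels with their addresses.
    cells = []
    for tokens in parsed:
        if tokens[0] == ".DATA":
            cells.append(int(tokens[1]))
        elif tokens[0] == "HALT":
            cells.append(OPCODES["HALT"])
        else:
            cells.extend([OPCODES[tokens[0]], symbols[tokens[1]]])
    return cells

print(assemble(EX1_SOURCE))
```

The printed list should match the ex1 machine code given earlier, cell for cell.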

Your days of tediously translating assembly language programs into machine language by hand are now at an end. Write assembly language programs to do the following:

(1) Integer multiplication. Take as input two non-negative integers a and b, and print out their product. Recall that a homework problem back in chapter two (I believe) asked that you write an algorithm for this, assuming that addition is a primitive operation and multiplication is not, which is exactly the position you are in with this simulated processor. That algorithm should come in handy now. If you feel ambitious, design your program so that it works with negative integers also. (Think about our absolute value program, EX2.)
(2) Integer division. Take as input two positive integers a and b. Print out the quotient and remainder of a divided by b. The quotient and remainder are defined to be the unique non-negative integers q and r satisfying: a = bq + r and 0 <= r < b. (Hint: consider the ideas used in EX4.)
(3) Find extreme values. Take a list of integers as input and print out the maximum and minimum elements in the list. The program should continue to take input values from the user until the sentinel value 0 is entered, then print the maximum and minimum entries (not including the zero). This is very similar to homework problem #13 on page 283.
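
Before writing any assembly, it may help to state the first two algorithms in a high-level language. The Python sketch below uses only addition, subtraction, and comparison, which is the restriction the simulated processor imposes. It is one possible formulation (repeated addition and repeated subtraction), not necessarily the one from the homework problem, and it is not the assembly you are asked to submit. The function names are made up for illustration.

```python
def multiply(a, b):
    """Product of non-negative integers a and b by repeated addition."""
    product = 0
    count = 0
    while count < b:
        product += a   # add a to the running total, b times in all
        count += 1
    return product

def divide(a, b):
    """Quotient q and remainder r with a = b*q + r and 0 <= r < b, for positive a and b."""
    q = 0
    r = a
    while r >= b:
        r -= b         # take away one more copy of b
        q += 1
    return q, r

print(multiply(6, 7))
print(divide(17, 5))
```

For example, dividing 17 by 5 gives q = 3 and r = 2, since 17 = 5*3 + 2 and 0 <= 2 < 5.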

As mentioned above, there are a few minor differences between the assembly language described in section 6.3 of the text, and that used in this assignment. The most important difference regards comments. Translate regards anything before .BEGIN and after .END as a comment, and ignores it. However, comments which appear between .BEGIN and .END must be bracketed by the characters /* and */. (This is the comment syntax used by the C language.) Look at EX2 for an example. Do not use two dashes to signify a comment as in the text. Another difference, which could be considered to be a bug in Translate, is that it will not complain if you label two different instructions or data values with the same symbolic name. It will simply ignore the second label. Just make sure all of your labels are unique and you should have no problem. Translate does generate an error message when it encounters certain simple syntax errors. For instance, if you leave out HALT, .BEGIN, or .END you will get an error message and no object file will be created. See the program notes below.
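
The comment rules can be summarized with a small sketch. The Python below keeps only the text from .BEGIN through .END and blanks out anything between /* and */. It is an illustration of the rules as stated here, not the code in Translate.cpp. It assumes comments may span several lines and that the words .BEGIN and .END do not themselves appear inside a comment. The sample program is made up and is not the actual contents of EX2.

```python
import re

def strip_comments(source):
    """Return the program text from .BEGIN through .END with /* */ comments removed."""
    start = source.index(".BEGIN")
    end = source.index(".END", start) + len(".END")
    body = source[start:end]
    # Non-greedy match so each comment ends at the first closing */.
    return re.sub(r"/\*.*?\*/", " ", body, flags=re.DOTALL)

SAMPLE = """This heading is ignored because it comes before the program.
.BEGIN
IN X        /* read a value */
OUT X       /* echo it back */
HALT
X: .DATA 0
.END
Anything down here is ignored as well.
"""

print(strip_comments(SAMPLE).split())
```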

Translate your assembly code for (1), (2), and (3) into machine language, and Run the object files. Test your executables (i.e. object files) on several sets of data to make sure they work as intended.

What to Turn In:
Name your source files MULT, DIV, and EXT respectively. Submit only these source files (i.e. do not submit the object files) to the assignment name "lab3". Note that points will be deducted if you submit extra files, or if you use names other than MULT, DIV, and EXT for your source files. Be sure that each source file contains a heading with your:

name
CMPS 10
Lab Assignment 3
date submitted
program name
short description of program operation

Program Notes:
In the unlikely event that your source file contains a string (i.e. a word) which is longer than 80 characters, you will get a segmentation fault when you try to Translate. (This includes words in comments.) Either shorten the offending word, or go to Translate.cpp and change the line

#define MAX_STR_LEN 80

by replacing 80 with something bigger, then re-compile. Translate expects to see at minimum a HALT command, bracketed by .BEGIN and .END. In fact

.BEGIN
HALT
.END

is a valid program. All .DATA commands must come between HALT and .END. Labels must end in a colon, with no space between the label and the colon. As mentioned above, duplicate labels will be accepted, but this should be considered a bug, so each label should refer to a unique location in the program. Any label which comes after HALT must be followed by the .DATA command, which must then be followed by an integer. If a .DATA command is not preceded by a label you will not receive an error, but that data will not be accessible by the program.

The assembly language described in the book would seem to allow literal addresses (i.e. numbers) as well as symbolic addresses. Translate does not allow this. Only symbolic labels may appear in the address field.

Another anomaly is that Translate accepts a valid program of any length, while Run will only execute an object file of length at most 256 bytes. If you wish to Run longer object files, go to Run.cpp and change the line

#define MAX_PRG_LEN 256

by replacing 256 by something bigger, then re-compile.

If you find any errors, please report them to: ptantalo@soe.ucsc.edu

