CSC 220 Program #3 - The Sophocles 7000

CSC 220

Program #3 - The S-7000

The SOPHOCLES 7000 assembler, linker and virtual machine

The bottom line:

Worth 10 points, or 10% of your grade
Due June 2, 2006... I think that's a Friday
Why? Maybe Sophocles said it best:

"One must learn by doing the thing;

for though you think you know it, you have no certainty, until you try."

- Sophocles

Overview

Our Goal: Assemble, link and simulate assembly language programs on our own virtual machine.

Cough.

OK, the high-level view of what needs to be done is pretty straightforward. Here's the model from page 531:

This is a group project with 3 teams, each doing a piece of the diagram above:

The assembler team - Andrew & Collin
The linker team - Christy & James
The virtual machine team - Dave, Jong & Wendy

#1. Team Assembler

The assembler parses an assembly code file and translates it into an object code file. Details:

Members:	Andrew & Collin
Input:	Assembly code file
Output:	Object code file
Processing:	Assemble the file in two passes
Format defined:	Assembly language format

Here are the text references for Team Assembler:

Section 4.2 - You guys must know this section intimately. It defines the IJVM instruction set architecture, gives assembly code and object code examples, and describes how the stack works, etc.
1. Page 250 - These are your instructions, 20 of them. Notice the hex opcodes in the first column.
2. Page 254 - This is an example of the assembly language format that we want: ILOAD, ISTORE, etc. I also think this may be a great first example to try to assemble. Some of the object code for this snippet is given in the rightmost column.
Section 7.3 - This is the money section for Team Assembler. It describes the two-pass assembly process. I strongly expect that each team member will implement one pass... makes sense.
1. Page 526 - Here, pseudo-code is listed for pass 1. Now, I don't think I agree with all of this, but it's a good starting point.
2. Page 528 - More pseudo-code, but this time for pass 2.
3. Page 529 - The symbol table is one of your most important data structures, and it is discussed here. Are you guys familiar with the C++ Standard Template Library (STL)? If not, then I'll show you. You'll want to steal a hash table or something from that library to implement your symbol table.
Page 535 - This is a nice hint of what you guys will be writing, an object file. This page shows the 6 parts of the object module and what they do. Even though Team Linker gets to define this format, you know it will contain information like this.
Page 515 - You can use assembly pseudo-instructions to make your job easier. Things like EXTERN and PUBLIC may help you in building your symbol table. You don't need to do macros or conditional assembly or anything goofy like that.
In general, you are certainly free (or better, encouraged) to use any ideas from your IA-32 experience in designing the assembly code format.
I'd like you guys to support comments in your assembly code. Also, try to carry line numbers throughout the process, so that you can report the line number with any error you run into.
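To make pass 1 concrete, here is a minimal C++ sketch, not the required design. It assumes a hypothetical syntax (';' starts a comment, "name:" defines a label) and a tiny made-up size table for a handful of IJVM mnemonics; the real assembly format is Team Assembler's to define. It shows the three ideas from the list above working together: an STL hash table as the symbol table, comment support, and line numbers carried into error messages.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Assumed instruction lengths in bytes (opcode plus operands) for a small
// IJVM subset. This table is illustrative, not the full instruction set.
static const std::unordered_map<std::string, int> kInstructionSize = {
    {"BIPUSH", 2}, {"ILOAD", 2}, {"ISTORE", 2},
    {"IADD", 1},   {"ISUB", 1},  {"GOTO", 3},
};

// The symbol table maps a label name to its address within the module.
using SymbolTable = std::unordered_map<std::string, int>;

// Pass 1: walk the source once, advance a location counter, and record the
// address of every label. Errors carry the 1-based source line number.
SymbolTable buildSymbolTable(const std::vector<std::string>& lines) {
    SymbolTable symbols;
    int locationCounter = 0;
    for (std::size_t i = 0; i < lines.size(); ++i) {
        const std::string where = "line " + std::to_string(i + 1) + ": ";
        // Hypothetical comment syntax: everything from ';' on is ignored.
        std::istringstream in(lines[i].substr(0, lines[i].find(';')));
        std::string token;
        if (!(in >> token)) continue;        // blank or comment-only line
        assert(!token.empty());              // operator>> never yields ""
        if (token.back() == ':') {           // hypothetical label syntax
            token.pop_back();
            if (token.empty())
                throw std::runtime_error(where + "empty label");
            if (!symbols.emplace(token, locationCounter).second)
                throw std::runtime_error(where + "duplicate label " + token);
            if (!(in >> token)) continue;    // label alone on its line
        }
        auto size = kInstructionSize.find(token);
        if (size == kInstructionSize.end())
            throw std::runtime_error(where + "unknown mnemonic " + token);
        locationCounter += size->second;
    }
    return symbols;
}
```

Pass 2 would then walk the same lines again, look each operand label up in the table, and emit the object code.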

#2. Team Linker

The linker combines 1 or more object files into a single executable file of machine code. Details:

Members:	Christy & James
Input:	Any number of object files
Output:	One executable file
Processing:	Resolve all references to create a flat executable file in machine code format.
Format defined:	Object code file format

Here are your text references, Team Linker:

Section 7.4 - Duh. This is your section.
1. You must deal with the dreaded relocation problem. OK, it's not really dreaded, but each object module thinks that it starts at address 0, and only one module can. So, you have to adjust all branching statements accordingly.
2. You'll also handle the external reference problem. When someone calls a method, they just use the name. When linking everything together, you must turn that name into a specified address in the program where that method's code resides.
3. Page 524 - Figure 7-14 is outstanding. It shows the "before" picture, a bunch of object modules. Everyone thinks they start at address 0 and they all just call methods by name. Figure 7-15 a) shows all the object modules just glommed together, and then 7-15 b) shows a linked executable where all branches are correct and all method calls are resolved to the correct place in memory.
4. Page 536 - "Most linkers require two passes." We'll see about that; I'm not sure yet. We may be able to simplify things so that a second pass is not necessary.
5. We won't do any of the fancy linking described at the end of 7.4... dynamic linking, delayed binding, etc.
6. So, while your output format will be defined by Team Machine, it should look a lot like 7-15 b)... a tiny prelude section with some information (to be determined) and then your object modules flattened into one file.
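Relocation and external references can be sketched in a few lines of C++. Everything about the layout below is an assumption for illustration: the ObjectModule struct, the 16-bit big-endian absolute address fields, and the linkModules function are all hypothetical, since the object file format is Team Linker's to define. The sketch uses two passes simply because a module may call a method that lives in a later module.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical object module layout (an assumption, not the real format).
struct ObjectModule {
    std::vector<std::uint8_t> code;                      // assembled as if loaded at 0
    std::unordered_map<std::string, int> publicSymbols;  // name -> offset in module
    std::vector<int> relocations;                        // offsets of local address fields
    std::vector<std::pair<int, std::string>> externals;  // (field offset, symbol name)
};

// Address fields are assumed to be 16 bits, big-endian.
static int readAddress(const std::vector<std::uint8_t>& image, int at) {
    return (image[at] << 8) | image[at + 1];
}
static void writeAddress(std::vector<std::uint8_t>& image, int at, int value) {
    assert(value >= 0 && value <= 0xFFFF);
    image[at] = static_cast<std::uint8_t>(value >> 8);
    image[at + 1] = static_cast<std::uint8_t>(value & 0xFF);
}

std::vector<std::uint8_t> linkModules(const std::vector<ObjectModule>& modules) {
    // Pass 1: lay the modules end to end and build a global symbol table.
    std::vector<int> base;
    std::unordered_map<std::string, int> globals;
    int next = 0;
    for (const ObjectModule& m : modules) {
        base.push_back(next);
        for (const auto& symbol : m.publicSymbols)
            if (!globals.emplace(symbol.first, next + symbol.second).second)
                throw std::runtime_error("duplicate symbol: " + symbol.first);
        next += static_cast<int>(m.code.size());
    }
    // Pass 2: copy each module's code, then patch its address fields.
    std::vector<std::uint8_t> image;
    for (std::size_t i = 0; i < modules.size(); ++i) {
        const ObjectModule& m = modules[i];
        image.insert(image.end(), m.code.begin(), m.code.end());
        for (int offset : m.relocations) {       // relocation problem
            int at = base[i] + offset;
            writeAddress(image, at, readAddress(image, at) + base[i]);
        }
        for (const auto& ext : m.externals) {    // external reference problem
            auto target = globals.find(ext.second);
            if (target == globals.end())
                throw std::runtime_error("undefined symbol: " + ext.second);
            writeAddress(image, base[i] + ext.first, target->second);
        }
    }
    return image;
}
```

If the assembly format ends up using branch offsets relative to the current instruction rather than absolute addresses, branches within a module would not need the base added, and only the cross-module references would need patching.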

#3. Team Machine

This team bears a heavy, heavy burden... to implement our entire machine in software. Details:

Members:	Dave, Jong, & Wendy
Input:	Executable file and any user input required
Output:	Whatever the program is supposed to output
Processing:	Simulate the parts of the Mic2 architecture as closely as possible in software.
Format defined:	Executable file format and machine code format

The text references for Team Machine are:

Chapter 4. You guys will want to know all the ins and outs of our Mic1/2 architecture intimately. The thing to do will be to write code that mimics how you think the machine works as closely as possible.
I have a zillion handouts on Chapter 4 as well that you'll get as a part of lecture.
One of you will probably want to be our microcode expert. We'll see.
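As a starting point for discussion, here is a minimal C++ sketch of a fetch-decode-execute loop for five stack instructions. It works at the instruction level only and does not model the Mic2 datapath or microcode, so treat it as scaffolding rather than the design. The opcode values are the ones I recall for these instructions; check them against the table on page 250 before relying on them. The run function and the "stop when the program counter runs off the end" rule are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Opcodes for a small subset (verify against the instruction table).
enum Opcode : std::uint8_t {
    BIPUSH = 0x10, POP = 0x57, DUP = 0x59, IADD = 0x60, ISUB = 0x64,
};

// 32-bit add and subtract that wrap around instead of overflowing.
static std::int32_t add32(std::int32_t a, std::int32_t b) {
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(a) +
                                     static_cast<std::uint32_t>(b));
}
static std::int32_t sub32(std::int32_t a, std::int32_t b) {
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(a) -
                                     static_cast<std::uint32_t>(b));
}

// Runs until the program counter passes the last byte (an assumption made
// for this sketch) and returns the operand stack, top of stack last.
std::vector<std::int32_t> run(const std::vector<std::uint8_t>& program) {
    std::vector<std::int32_t> stack;
    std::size_t pc = 0;
    auto popValue = [&stack]() {
        if (stack.empty()) throw std::runtime_error("stack underflow");
        std::int32_t top = stack.back();
        stack.pop_back();
        return top;
    };
    while (pc < program.size()) {
        const std::uint8_t opcode = program[pc++];   // fetch
        switch (opcode) {                            // decode and execute
        case BIPUSH: {
            if (pc >= program.size())
                throw std::runtime_error("BIPUSH is missing its operand");
            // The operand is one signed byte, sign-extended to 32 bits.
            stack.push_back(static_cast<std::int8_t>(program[pc++]));
            break;
        }
        case IADD: {
            std::int32_t b = popValue();
            std::int32_t a = popValue();
            stack.push_back(add32(a, b));
            break;
        }
        case ISUB: {
            std::int32_t b = popValue();
            std::int32_t a = popValue();
            stack.push_back(sub32(a, b));            // next-to-top minus top
            break;
        }
        case DUP: {
            std::int32_t top = popValue();
            stack.push_back(top);
            stack.push_back(top);
            break;
        }
        case POP:
            popValue();
            break;
        default:
            throw std::runtime_error("unknown opcode at address " +
                                     std::to_string(pc - 1));
        }
        assert(pc <= program.size());
    }
    return stack;
}
```

Feeding it the bytes 10 07 10 05 64 59 60 (push 7, push 5, subtract, duplicate, add) leaves a single value, 4, on the stack.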

The CEO

I get to be the CEO. Surprise. My duties will be:

Interfacing with our Board of Directors, the venture capital community and, of course, our shareholders. This will usually happen on a golf course.
I expect to meet with each team outside of class about once each week to get a status update. We'll work on the times. Email me to set things up.
I will take the responsibility of assigning tasks to individual team members.

Getting the S-7000 running is a key part of our company getting its next round of financing. We must complete this prototype by the end of the term!

Here are some company-wide issues/ideas that apply to each team:

Quality code

I will not accept crappy code. The grading penalty for poorly crafted, undocumented code will be severe.

What is quality code, then? Here are my top ten attributes of quality code that you must follow for CSC 220 Program #3:

Organization: 1 class per file please
Header documentation for each file (author, description), class (description) and function/method (parameters, return value or pre/post is fine too)... data fields as well if you're using Javadoc
Inline comments detailing the workings of difficult code sections
Use a consistent and proper indentation style
Use a consistent and proper naming style
Proper use of object-oriented principles like data encapsulation (meaning private data fields and public methods to access or modify them) and code reuse (meaning well-formed classes and function/method signatures).
No global variables. You may use static variables in a class for special situations.
eight
Have test drivers for individual classes or components of your design. Individuals must test their pieces before gluing them together.
Please follow the standards and conventions dictated by your programming language. For example, in Java, implement the toString() method to print/debug classes. A C++ example: free memory that you have dynamically allocated.

Cutting corners

After admitting my rigidity on the quality code issue, I recognize that this is an ambitious project. Here are some corner-cutting ideas:

This is "extreme programming"; we'll skimp on documentation and process to get our prototype running... but we will get a prototype running!
As of this minute our goal is to get one small example (the one in the book?) through each part of the project... from there it's gravy.
Skimp on nice error messages and thorough error-checking
Execution speed is not an issue at all
If it makes things easier, then go ahead and set limits on things like table size or whatever.

Integration

Everyone will help in the integration of our computer simulation. I'll use the common area in the k: drive to let teams share programs and source code. This is not set up yet, however.

Writing some examples

Finally, each team and the CEO will be responsible for writing an assembly language program in our format for our machine. Some ideas:

Sum of squares program (Bill calls this one)
Fibonacci... our program #1
Median... our program #2
Factorial
Sort an array of integers
Recursively reverse the characters in a string

A couple technical issues

Let's see:

We'll use "fake" binary where we'll use 2 chars to represent each byte in hex. For example, 0011 1011 will be written to a file as "3B". This will be done to make debugging easier since the files will be readable and editable using NotePad or WordPad.
We need to figure out a way to read from the keyboard and write to the console. These are stdin and stdout in C++... System.in and System.out in Java. We'll figure something out. This can be done later as we will be able to hardcode values into assembly code and view registers in our machine.
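The "fake" binary idea is easy to pin down in code. Here is a minimal C++ sketch of both directions; the function names toFakeBinary and fromFakeBinary are made up for the example, and skipping whitespace on input is an assumption (so that files can be broken into lines when edited by hand).

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Write each byte as two uppercase hex characters, e.g. 0011 1011 -> "3B".
std::string toFakeBinary(const std::vector<std::uint8_t>& bytes) {
    static const char kDigits[] = "0123456789ABCDEF";
    std::string text;
    for (std::uint8_t b : bytes) {
        text += kDigits[b >> 4];      // high four bits
        text += kDigits[b & 0x0F];    // low four bits
    }
    return text;
}

// Value of one hex digit; accepts upper or lower case.
static int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    throw std::runtime_error(std::string("not a hex digit: ") + c);
}

// Read pairs of hex characters back into bytes. Whitespace is skipped
// (an assumption of this sketch) so hand-edited files may contain line breaks.
std::vector<std::uint8_t> fromFakeBinary(const std::string& text) {
    std::vector<std::uint8_t> bytes;
    int high = -1;                    // pending high digit, or -1 if none
    for (char c : text) {
        if (std::isspace(static_cast<unsigned char>(c))) continue;
        int value = hexValue(c);
        assert(value >= 0 && value <= 15);
        if (high < 0) {
            high = value;
        } else {
            bytes.push_back(static_cast<std::uint8_t>((high << 4) | value));
            high = -1;
        }
    }
    if (high >= 0) throw std::runtime_error("odd number of hex digits");
    return bytes;
}
```

Each team can use the same pair of routines for its file I/O, so the object file and the executable file stay readable in a text editor.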

Notes

I'll post goodies here from time to time.

thanks... yow, bill

...

My site: william.krieger.faculty.noctrl.edu

My email: wtkrieger@noctrl.edu