Lab Assignment 2
Due: Thursday October 26
10:00 pm
Purpose:
Observe the behavior of some sorting algorithms, and draw
conclusions
as to their efficiency.
Description:
If you haven't already done so, read the Unix
Tutorial section on copying and moving files. To make a copy
of a file in Unix one uses the cp command:
% cp source_file destination_file
This makes a copy of "source_file"
called
"destination_file". This assumes of course that source_file
exists in your current directory. You can also copy files from
distant
directories by using the full path name of the file. In this
exercise
you will need to copy a file called Csort.cpp from the
course
directory (also called the course locker, /afs/cats.ucsc.edu/courses/cmps010-pt)
to your cs10 directory. This can be done (from within your cs10
directory)
by typing
% cp
/afs/cats.ucsc.edu/courses/cmps010-pt/Csort.cpp
Csort.cpp
(If you are in your home directory, the destination file should be
called
cs10/Csort.cpp.)
If you then list the contents of cs10, you will see a new file
called
Csort.cpp, which is a C++ source file. Compile it by typing
% g++ -o Csort Csort.cpp
(Be sure to type the "C" in "Csort" as a capital letter, remember Unix
is case sensitive.) You should now have in your cs10
directory
a new executable file called "Csort". (As in the last
exercise, don't try to view the contents of this file.) Before
you
can run this program you must create at least one auxiliary data
file.
A data file for this program can have any name, and should contain only
a list of numbers to be sorted, separated by spaces and carriage
returns,
and nothing else. To start with, create a text file called
"data" using Pico (or any other text editor you like) containing the
numbers
from 1 to 20 in reverse order:
20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
Run the Csort program by typing
% Csort
(Again, be sure to type this correctly. There is a built in Unix
command called "sort" which we are not using.) You
will
be prompted for the name of a file to sort. Respond by typing the
name "data".
At this point you will be presented with the name of the file you
just
typed, a listing of its contents, and the number of numbers in the
list,
followed by a menu of choices. Choose (1) to animate sorting
algorithms.
You will see another menu, giving you a choice of four algorithms to
animate.
Choose (1) and step through the changes that Selection Sort makes
to this list by hitting return repeatedly. (You can also just
lean
on the return key to see the changes go by very fast.) The final
line will be the modified list in increasing order. Choose menu
items
(2) Bubble Sort, (3) Insertion Sort, and (4) Quicksort in turn.
Now choose (5) to go back to the main menu, then choose (2) to count
the number of comparisons performed by each of the same
algorithms.
Choose (1), for instance, to count comparisons done by Selection
sort.
You will see the list before and after sorting, followed by a short
summary
of current parameters. For this particular list the algorithm
should
have done 190 comparisons. (Could you have predicted that?)
Choose (2), (3), and (4) to check the performance of the other
algorithms
on the same list.
Go back to the main menu (5), and note that there is a selection (3)
which would allow you to open a new data file if you had one.
Quit the program (4) so we can remedy that
situation.
Create a series of five text files containing randomly chosen
numbers.
You may call them whatever you like, but I'll refer to them here as
data1,
data2, data3, data4, and data5. File data1 should contain 100
numbers,
and each successive file should contain twice as many as the last, so
that data5 will contain 1600 numbers. These numbers should be
chosen at random in the range 1 to 5000. This would be a huge
amount of work if you had to make up the numbers one by one and type
them in. Fortunately that's not necessary. Go to http://www.random.org/nform.html
and you will find a random number generator which will provide random
numbers by the truckload. (Spend some time wandering around that
site and read some of the interesting things there.) Just cut and
past as many numbers as you need from your web browser into the data
files. These files should contain only numbers
and separating white space, which is defined to be: spaces,
tabs, and newlines (i.e. carriage returns).
What to Turn In:
Count comparisons for your random lists of length 100, 200, 400,
800,
and 1600 using all four algorithms, and record your results in a
table.
What do you notice about the results for Selection Sort and Bubble
Sort?
Recall we deduced a formula in class for the number of comparisons
performed
by these algorithms on a list of length n. This
formula can be used to predict your results for Selection Sort and
Bubble
Sort. From the data you've
collected, form a conjecture as to the
(average case) asymptotic run time of both (1)
Insertion Sort
and (2) Quicksort. (Hint: consider the following
orders:
log(n),
n, nlog(n), n2, n2log(n), where the log functiion
can have any base.) Bear in mind that
the formulas
derived
for Selection Sort and Bubble Sort give the exact number of comparisons
in any case (i.e. best, worst, and average.) The data you
are collecting for Insertion Sort and Quicksort serve as approximations
to the average number of comparisons performed. Here are
a few additional remarks on how to arrive
at a reasonable conjecture for the run time of Insertion Sort and
Quicksort. Type up your answer to questions 1 and 2 and
place them in a
text
file called "answers". Be
sure
to include your recorded observations in the form of a table, along
with
your conclusions and supporting reasoning. Your "answers" file
should of course
begin
with the usual identification information:
Your Name
user_name@cats.ucsc.edu
The name of the assignment (lab2)
Submit the file "answers" by using the submit command:
% submit cmps010-pt.f06 lab2 answers
Then use peek to confirm.
Notes:
The Csort program can also be given a command line argument
which is the name of the file to be sorted. For example
% Csort data
has the same effect as typing %Csort, then
entering
"data" as the name of the file to be sorted. The data file can
always
be changed by choosing (3) from the main menu. If a data file
contains
any character which is not part of a number, and which is not
whitespace,
such as a letter or punctuation mark, nothing will be read beyond that
point. In any case only the first 2000 numbers in a data file
will
be read in. If you want to reset the maximum list size, say to
3000,
edit the file Csort.cpp so that line 11 reads:
#define MAX_LST_LEN 3000
then recompile, and run Csort on a larger list. A data
file
can have at most 20 characters in its name. This too can be
adjusted
at line 10 of the source code.
If you find any errors, please report them to: ptantalo@soe.ucsc.edu
webmaster@soe.ucsc.edu
Back to the SOE Class
Home Pages
Back to the SOE Home Page