Lab Assignment 2

Due:  Thursday October 26  10:00 pm


Purpose:

Observe the behavior of some sorting algorithms, and draw conclusions as to their efficiency.

Description:

If you haven't already done so, read the Unix Tutorial section on copying and moving files.  To make a copy of a file in Unix one uses the  cp command:
% cp source_file destination_file
This makes a copy of   "source_file"  called   "destination_file".  This assumes of course that  source_file  exists in your current directory.  You can also copy files from distant directories by using the full path name of the file.  In this exercise you will need to copy a file called  Csort.cpp  from the course directory (also called the course locker, /afs/cats.ucsc.edu/courses/cmps010-pt) to your cs10 directory.  This can be done (from within your cs10 directory)  by typing
% cp /afs/cats.ucsc.edu/courses/cmps010-pt/Csort.cpp  Csort.cpp
(If you are in your home directory, the destination file should be called cs10/Csort.cpp.)
If you then list the contents of cs10, you will see a new file called Csort.cpp,  which is a C++ source file.  Compile it by typing
% g++ -o Csort Csort.cpp
(Be sure to type the "C" in "Csort" as a capital letter, remember Unix is case sensitive.)  You should now have in your cs10 directory a new executable file called   "Csort".  (As in the last exercise, don't try to view the contents of this file.)  Before you can run this program you must create at least one auxiliary data file.  A data file for this program can have any name, and should contain only a list of numbers to be sorted, separated by spaces and carriage returns, and nothing else.  To start with, create a text file called  "data" using Pico (or any other text editor you like) containing the numbers from 1 to 20 in reverse order:
20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
Run the  Csort  program by typing
% Csort
(Again, be sure to type this correctly.  There is a built in Unix command called  "sort"  which we are not using.)  You will be prompted for the name of a file to sort.  Respond by typing the name  "data".
At this point you will be presented with the name of the file you just typed, a listing of its contents, and the number of numbers in the list, followed by a menu of choices.  Choose (1) to animate sorting algorithms.  You will see another menu, giving you a choice of four algorithms to animate.  Choose (1)  and step through the changes that Selection Sort makes to this list by hitting return repeatedly.  (You can also just lean on the return key to see the changes go by very fast.)  The final line will be the modified list in increasing order.  Choose menu items (2) Bubble Sort, (3) Insertion Sort, and (4) Quicksort in turn.
Now choose (5) to go back to the main menu, then choose (2) to count the number of comparisons performed by each of the same algorithms.  Choose (1), for instance, to count comparisons done by Selection sort.  You will see the list before and after sorting, followed by a short summary of current parameters.  For this particular list the algorithm should have done 190 comparisons.  (Could you have predicted that?)  Choose (2), (3), and (4) to check the performance of the other algorithms on the same list.
Go back to the main menu (5), and note that there is a selection (3) which would allow you to open a new data file if you had one.  Quit the program (4) so we can remedy that situation.  Create a series of five text files containing randomly chosen numbers.  You may call them whatever you like, but I'll refer to them here as data1, data2, data3, data4, and data5.  File data1 should contain 100 numbers, and each successive file should contain twice as many as the last, so that data5 will contain 1600 numbers.  These numbers should be chosen at random in the range 1 to 5000.  This would be a huge amount of work if you had to make up the numbers one by one and type them in.  Fortunately that's not necessary.  Go to http://www.random.org/nform.html  and you will find a random number generator which will provide random numbers by the truckload.  (Spend some time wandering around that site and read some of the interesting things there.)  Just cut and past as many numbers as you need from your web browser into the data files.  These files should contain only numbers and separating white space, which is defined to be: spaces, tabs, and newlines (i.e. carriage returns).
What to Turn In:
Count comparisons for your random lists of length 100, 200, 400, 800, and 1600 using all four algorithms, and record your results in a table.  What do you notice about the results for Selection Sort and Bubble Sort?  Recall we deduced a formula in class for the number of comparisons performed by these algorithms on a list of length n.  This formula can be used to predict your results for Selection Sort and Bubble Sort.  From the data you've collected, form a conjecture as to the (average case) asymptotic run time of both  (1) Insertion Sort and (2) Quicksort.  (Hint: consider the following orders:  log(n), n, nlog(n), n2, n2log(n), where the log functiion can have any base.)  Bear in mind that the formulas derived for Selection Sort and Bubble Sort give the exact number of comparisons in any case (i.e. best, worst, and average.)  The data you are collecting for Insertion Sort and Quicksort serve as approximations to the average number of comparisons performed.  Here are a few additional remarks on how to arrive at a reasonable conjecture for the run time of Insertion Sort and Quicksort.  Type up your answer to questions 1 and 2 and  place them in a text file called  "answers".  Be sure to include your recorded observations in the form of a table, along with your conclusions and supporting reasoning.  Your "answers" file should of course begin with the usual identification information:
Your Name
user_name@cats.ucsc.edu
The name of the assignment (lab2)
Submit the file "answers" by using the submit command:
% submit cmps010-pt.f06 lab2 answers
Then use peek to confirm.
 
Notes:
The  Csort  program can also be given a command line argument which is the name of the file to be sorted.  For example
% Csort data
has the same effect as  typing %Csort, then entering  "data" as the name of the file to be sorted.  The data file can always be changed by choosing (3) from the main menu.  If a data file contains any character which is not part of a number, and which is not whitespace, such as a letter or punctuation mark, nothing will be read beyond that point.  In any case only the first 2000 numbers in a data file will be read in.  If you want to reset the maximum list size, say to 3000, edit the file  Csort.cpp  so that line 11 reads:
#define  MAX_LST_LEN  3000
then recompile, and run Csort on a larger list.  A data file can have at most 20 characters in its name.  This too can be adjusted at line 10 of the source code.
 
If you find any errors, please report them to: ptantalo@soe.ucsc.edu

webmaster@soe.ucsc.edu

Back to the SOE Class Home Pages
Back to the SOE Home Page