CS180 - Database Systems

Winter 2003

Project Part II

Due Friday, February 7, 2003

(This document is adapted from the projects for CS180 Winter ’02 by Arthur Keller)

LOGISTICS AND LATE POLICY

Turning in your work: The project for CS 180, Winter 2003 will consist of a sequence of lab assignments involving database programming (including such tasks as creating databases, populating them with data, modifying data, and writing SQL queries). Unless otherwise specified, this lab assignment and all subsequent ones will be turned in electronically using the submit program, which is accessed from your cats account. For information on using submit, enter "submit -m" from the console.
Please keep in mind that the deadlines for submitting lab assignments are separate from the deadlines for written homework assignments. The following late policy applies to this and subsequent lab assignments:
Lab assignment work must be submitted electronically by midnight at the end of the day that it is due. Programming work submitted after the deadline but before the "late deadline" -- midnight 48 hours after the deadline -- will be accepted but penalized 50%. No lab assignment work will be accepted after the late deadline.
One caveat: students must use ssh (both versions 1 and 2 will work), as telnet is not enabled. SSH is an encrypted (safer) method of connecting to a remote host, as opposed to telnet which transmits and receives data unencrypted. Free and legitimate SSH software is available for almost all computing platforms.
Once connected to linux1.ic.ucsc.edu, students will have full access to their CATS home directories via AFS, much the same way as when logged onto teach.ic.ucsc.edu and hawking.ic.ucsc.edu. The major difference is that the binaries (including those for PostgreSQL) are Linux binaries, and thus will not work on the Sun hosts such as teach/learn/hawking/curie.

THE PROJECT PART II

The database to work on for this part of the project is the MOVIES database. It stores the details of various movies, their stars, studios and movie executives.

The schema has five relations as given below:


Assumptions:
Studio names are unique.
The producerCert is the certificate number of the movie executive who directed the movie.
The length of the movie represents the length in minutes.
The networth of movie executives is assumed to be in millions.

This assignment is divided into two parts. Part A helps you gain familiarity with PostgreSQL, creating tables and populating them with data. Part B helps you gain experience with SQL programming by writing SQL queries for retrieving answers to questions about a database that is provided.
 

PART A

(a) Familiarize yourself with the PostgreSQL relational DBMS by reading the document Managing a Database in the Interactive PostgreSQL documentation, logging into PostgreSQL, trying some of the examples in the document, and experimenting with the various commands. You don't need to turn anything in for this part. The PostgreSQL web page has links to documentation and other information. See PostgreSQL: Introduction and Concepts, by Bruce Momjian, which is available in printed form at the bookstore and on reserve at the library. There's also a PostgreSQL At A Glance document describing PostgreSQL's features.

Create your first database using the shell command:

createdb username

where "username" is your cats login username.

Then you can access your database with the command:

psql -a
which will start a command line interface to PostgreSQL connected to the database with your username. (The -a is so input is echoed to the console for saving a script log. If you don't want your input echoed, you can omit the -a, but please do include it when you are running an execution script to hand in.)
You may create other database names, but they must all start with your username. (By the way, username "wan" should not create databases starting with "wang"!)

To start a command line interface to PostgreSQL connecting to a different database, use the command:

psql -a databasename

where "databasename" is the name of your database (which should begin with your cats username).

(b) Create relations for your database based on the relational database schema given to you. Use the CREATE TABLE command to specify each relation, its attributes, and the attribute types; see CREATE TABLE in the PostgreSQL interactive documentation. If you have an attribute that represents a date and/or time, you may want to look at the page FAQ: Working with Dates and Times in PostgreSQL.
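As a rough sketch of what such a statement could look like, the following defines a hypothetical relation with a date attribute. The table name screenings and all of its attributes and values are invented for illustration and are not part of the MOVIES schema; the declared primary key is likewise an assumption, not a requirement stated above.

```sql
-- Hypothetical relation, for illustration only; not part of the MOVIES schema.
CREATE TABLE screenings (
        screeningID INT PRIMARY KEY,   -- declared key: duplicate IDs are rejected
        theater     VARCHAR(50),
        showDate    DATE,              -- calendar date, e.g. '2003-02-07'
        ticketPrice NUMERIC(6,2)
);

-- Date literals are written as quoted strings; ISO format (YYYY-MM-DD) is unambiguous.
INSERT INTO screenings VALUES (1, 'Example Theater', '2003-02-07', 7.50);

-- Dates compare chronologically.
SELECT theater, showDate
FROM screenings
WHERE showDate >= '2003-01-01';
```

Remember to drop any such practice table before recording the session you turn in.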

Turn in a script log showing a PostgreSQL session in which your relations are created successfully. Please see Recording Your Session below for details.

(c) For each relation in your database, create an execution script file containing a COPY command and a few (approximately 5-10) records of "realistic" data. Then execute the script file from the psql command line using the \i command. Please see Creating and running script files from psql below for more details.

Turn in a listing showing the contents of the files you created, the successful loading of the data into PostgreSQL, and the execution of "SELECT *" commands to show the contents of each relation.

Note:
        1. Make sure not to generate tuples that violate the key constraints.
        2. The database almost certainly includes relations that are expected to join with each other.

For example, the producerCert in the Movie relation should join with the certnum of the MovieExec relation. When generating data, be sure to generate values that actually do join. There are a couple of ways to properly generate joining values. One way is to generate records for multiple relations (e.g., Movie and MovieExec) at the same time. Another way is to generate the records for one relation first, and then use the joining values for the other relation. For example, you could generate records for relation Movie first, then use the Movie.producerCert values when creating values for MovieExec.certnum.
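One possible way to check that the generated values really do join is a query along the following lines. This is only a sketch: it assumes Movie has a title attribute (as the example queries later in this document suggest) and that both relations have already been loaded.

```sql
-- List movies whose producerCert has no matching MovieExec.certnum.
-- An empty result means every producerCert value joins.
-- Rows with a NULL producerCert are listed too, since NULL matches nothing.
SELECT title, producerCert
FROM Movie
WHERE NOT EXISTS (
        SELECT *
        FROM MovieExec
        WHERE MovieExec.certnum = Movie.producerCert
);
```

NOT EXISTS is used here rather than NOT IN because NOT IN returns no rows at all when the subquery produces a NULL.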

What to turn in

Please see Recording Your Session below for a guide to preparing output to be submitted for this and subsequent project parts.
Components (b) and (c) of this project part each tell you what should be recorded in the script log that you turn in. In this and all subsequent project parts, the material you turn in should be clearly formatted and delineated, and should include comments for any aspects that are not crystal clear. Poorly assembled or documented material will not receive full credit, even if it is correct. You also will not receive full credit if you turn in your entire large data files (or large query results in later assignments) when we ask for small samples. Other than comments, truncation, and simple formatting, it is Academic Dishonesty to edit scripts before turning them in.

For this part of the assignment the following files should be turned in electronically using your cats account and the submit program:
 
README
Please give your name, cats account, project part number, course number, date, a list and description of the files you are submitting, and any other information that will be useful for the grader.
create.script
The execution script file you used to create tables for the MOVIES database.
create.log
A record of your session creating your tables.
data.script
The execution script file you used to populate your small database.
data.log
A script log (i.e., a recording) of your session using data.script to populate your DB and your execution of SELECT queries on each table.

To submit these files use the syntax:

  submit cmps180-wt.w03 proj2-part1 README create.script create.log data.script data.log 

or proj2-late if your project part is late.

Maintaining your databases

For the duration of the project, we suggest that you establish some kind of routine that includes reloading your database from the files created in this project part each time you want to get a "fresh" start with PostgreSQL. Remember to delete the contents of each relation (or destroy and recreate the relations) before reloading. Otherwise, unless there is a declared key, PostgreSQL will happily append the new data to your old relation, causing your relation size to double, triple, quadruple, etc. To get rid of a table called T, issue the command:
  drop table T;
If you want to get rid of all tuples in T without deleting the table itself, issue the command:
  delete from T;
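A reload routine of this kind could be kept in a small script file of its own. The sketch below is only an illustration: the file name reload.script is hypothetical, T stands for one of your relations, and data.script is assumed to contain the COPY command that loads T.

```sql
-- reload.script (hypothetical file name)
-- Empty the relation, load it again, and confirm the row count.
DELETE FROM T;
\i data.script
SELECT count(*) FROM T;
```

Running \i reload.script from the psql prompt should then leave T with exactly one copy of the data, however many times it is run.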

Recording Your Session

There are several methods for creating a typescript to turn in for your programming assignments. The most primitive way is to cut and paste your terminal output and save it in a file (if you have windowing capabilities). Another method is to use the Unix command script to record the terminal interaction. The script command records everything printed on your screen. The syntax for the command is
    script [ -a ] [ filename ]
The record is written to filename. If no file name is given, the record is saved in the file typescript. The -a option allows you to append the session record to filename, rather than overwrite it. To end the recording, type
    exit
For more information about script, check out its man page. 

Creating and running script files from psql

You will be using the psql command line interface to interact with your database. See Managing a Database for information on getting started with psql. Script files are text files of psql commands which can be executed like a batch file using the \i command in psql. The syntax is \i filename entered following the psql prompt, where filename is the complete (case sensitive) name of the script file you want to run.
To run your execution script data.script and save the script log in the file data.log, do the following:
 
script data.log          to start saving the script log
psql -a databasename     to run PostgreSQL's command line interface using your database
\i data.script           to run an execution script
\q                       to exit psql (PostgreSQL's command line interface)
exit                     to stop saving the script log

The execution script file you create can consist of almost any series of commands that you could enter following the psql prompt. This includes all of the SQL commands which you will be using to create, modify and test your database. Examples include the CREATE TABLE and SELECT commands. Just as when using the psql command line interface, you must terminate each SQL command in your script file with a semicolon. \i and \q do not need semicolons.

If you are recording your session using the script command described above into a script log then it is useful to start psql using the -a option so that all commands included in your execution script file will be echoed to the console and thus to your script log file. Here is an example script file that creates a table (relation) named products:

CREATE TABLE products (
        productID INT,
        name VARCHAR(80),
        price NUMERIC(10,2),
        retailPrice NUMERIC(10,2)
);
Here is an example script file that loads four tuples into the table (relation) named products created using the previous script file:

COPY products FROM stdin USING DELIMITERS '|';
1419|American Greetings CreataCard Gold V4.0|21.49|25.24
1424|Barbie(R) Nail Designer(TM)|20.74|25.99
1427|Panzer Commander|21.99|30.24
1431|Riven: The Sequel to Myst|31.99|40.24
\.

This is the format you will use for the files that load data into your tables. The USING DELIMITERS '|' clause, and with it the use of '|' as a delimiter, is optional. The default delimiter is the tab character. The delimited data on each line must match the attributes and their types in your table one-to-one, in the order they were defined in your CREATE TABLE command. The COPY data must be terminated with a line containing only '\.'.
For testing, some students have found it convenient to have a separate script file to populate each of the tables. I also find it convenient to have a single script file to create all of my tables and another one to drop all of my tables. Look for examples at the end of this document.
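To make the one-to-one mapping between COPY data fields and table attributes concrete, the first two data lines of the products example correspond to the INSERT statements sketched below. This is for illustration only; the assignment asks for COPY, and the explicit column list simply spells out the order that COPY assumes.

```sql
-- Same tuples as the first two COPY data lines, written as INSERT statements.
-- Field 1 -> productID, field 2 -> name, field 3 -> price, field 4 -> retailPrice.
INSERT INTO products (productID, name, price, retailPrice)
VALUES (1419, 'American Greetings CreataCard Gold V4.0', 21.49, 25.24);

INSERT INTO products (productID, name, price, retailPrice)
VALUES (1424, 'Barbie(R) Nail Designer(TM)', 20.74, 25.99);
```

Note that string values are quoted in INSERT statements but are written without quotes in COPY data lines.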

PART B

For this part of the assignment, you will first need to drop the tables that you have created. If needed, you can download the drop-tables script.

Important:  Be sure to complete Part A of the assignment and save all the files you need to submit for that part before dropping your tables.
Also, you need to create and populate the database with the data provided (as stated below); otherwise your queries will not produce the desired results.

Download the create-tables script and the populate-data script files.  Once you have created the tables and populated them with data provided, you need to write queries to answer the following questions about the data.

1. Find all movies directed by Garry Marshall.
2. Find the title and genre of movies that start with 'The' and were made after 1998.
3. Give the title, studio-name and studio# of those movies that star Richard Gere but not Julia Roberts.
4. Find the title and genre of the longest movie.
5. Find name and address of those female movie stars who are also movie executives with a net worth of over 10 million dollars.
6. Find the certificate number and average movie length for those executives who have a net worth of at least 10 million dollars and none of whose movies is shorter than 110 minutes.
7. Find the name and net worth of those movie executives who have worked with at least one movie star born before 1962.
8. Find the studio that made the maximum number of movies.

For this part of the assignment you will submit a file which contains each of your queries in the order of the questions. Also, submit a log file that contains the results of running your queries against the database. Last, a separate README file should be included which indicates the names and descriptions of the files you have turned in. An example submit statement is:

submit cmps180-wt.w03 proj2-part2  README query.script query.log 

Example Queries:

            SELECT title, genre
            FROM Movie, StarsIn
            WHERE movietitle = title AND starname = 'Julia Roberts';

            SELECT name
            FROM Movie, MovieExec
            WHERE producerCert = certnum AND year < 1995;

            SELECT genre, avg(length)
            FROM Movie
            GROUP BY genre;
 

Sample Script Files

createbeers.script
-- Sample Script file to Create and
-- Populate a BEERS DB  
 
-- print out the current time
SELECT timeofday();
 
CREATE TABLE Beers (
name VARCHAR(30),
manf VARCHAR(50)
);
COPY Beers FROM stdin USING DELIMITERS '|';
Coors|Adolph Coors
Coors Lite|Adolph Coors
Miller|Miller Brewing
Miller Lite|Miller Brewing
MGD|Miller Brewing
Bud|Anheuser-Busch
Bud Lite|Anheuser-Busch
Michelob|Anheuser-Busch
Anchor Steam|Anchor Brewing
\.
CREATE TABLE Bars (
name VARCHAR(30),
addr VARCHAR(50),
license VARCHAR(50)
);
COPY Bars FROM stdin USING DELIMITERS '|';
Joe's|123 Any Street|B7462A
Sue's|456 My Way|C5473S
\.
CREATE TABLE Sells (
bar VARCHAR(20),
beer VARCHAR(30),
price REAL
);
COPY Sells FROM stdin USING DELIMITERS '|';
Joe's|Coors|2.50
Joe's|Bud|2.50
Joe's|Bud Lite|2.50
Joe's|Michelob|2.50
Joe's|Anchor Steam|3.50
Sue's|Coors|2.00
Sue's|Miller|2.00
\.
CREATE TABLE Drinkers (
name VARCHAR(30),
addr VARCHAR(50),
phone CHAR(16)
);
COPY Drinkers FROM stdin USING DELIMITERS '|';
Bill Jones|180 Saint St.|831-459-1812
Kelly Arthur|180 Alto Pl.|650-856-2002
Fred|1234 Fifth St.|831-426-1956
\.
CREATE TABLE Likes (
drinker VARCHAR(30),
beer VARCHAR(30)
);
COPY Likes FROM stdin USING DELIMITERS '|';
Bill Jones|Miller
Bill Jones|Michelob
Kelly Arthur|Anchor Steam
Fred|MGD
\.
CREATE TABLE Frequents (
drinker VARCHAR(30),
bar VARCHAR(30)
);
COPY Frequents FROM stdin USING DELIMITERS '|';
Bill Jones|Joe's
Bill Jones|Sue's
Kelly Arthur|Joe's
\.
 
-- Execute some SELECT queries--
SELECT * FROM Bars;
SELECT * FROM Drinkers;
 
-- print out the current time
SELECT timeofday();
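
Once createbeers.script has been run, queries like the ones sketched below could be used to practice joins and grouping on the sample data before moving on to the MOVIES database. They are illustrations only and are not part of the original sample script.

```sql
-- Bars that sell more than two beers, with the number of beers each sells.
SELECT bar, count(*) AS beersSold
FROM Sells
GROUP BY bar
HAVING count(*) > 2;

-- Drinkers who frequent a bar that sells a beer they like.
SELECT DISTINCT Frequents.drinker, Frequents.bar, Likes.beer
FROM Frequents, Likes, Sells
WHERE Frequents.drinker = Likes.drinker
  AND Frequents.bar = Sells.bar
  AND Likes.beer = Sells.beer;
```

The second query uses the same comma-separated FROM list and WHERE-clause join conditions as the example queries given for the MOVIES database.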
 
dropbeers.script
DROP TABLE Beers;
DROP TABLE Bars;
DROP TABLE Sells;
DROP TABLE Likes;
DROP TABLE Frequents;
DROP TABLE Drinkers;