CSC323 2010S, Class 01: An Introduction to the Course Overview: * A bit about the course. * Introductions. * Group programming exercise: Exponentiation. * Thinking about object-oriented programming. * Another problem: Modeling candy machines. * Beautiful Code: Regular Expressions. Admin: * The Course Web is under development. Expect some updates over the next week or so. * Assignment for Thursday: Introductory Survey and Such. * Beautiful Code Reading for Thursday: Chapter 18 (Python's Dictionary Implementation). * Additional Reading for Thursday: Skim Python Pocket Reference (if you have it) Read/skim Python 101 (linked in outline) * EC for Gene Gaub's recital on Thursday at 11am A bit about the course * Software Design - "Everything" about writing programs (not marketing) * Methodologies * For thinking about the structure of the program: object-oriented analysis and design * Team methodology: Extreme Programming (XP) * Tools that support these methodologies * Version control - Subversion * Software construction - Make * Unit testing - (TBD) * Additional approaches * Design patterns * Reading and thinking about and criticizing code * Rely on beautiful code * Dichotomy in computer science: theory and practice * Try to ground everything in practice * At least three categories of work + Lots of "toy projects" + One contribution to an open source project + One large group project (each group is 3 or so people) * Doing it all in Python * "Quicker" OO language than Java * Supports functional approaches (anonymous functions, hop, etc.) * Puts people on mostly equal footing * A scripting language 300 level course * You are responsible adults * Fewer carrots and sticks * Long range planning * Support your colleagues * You know how to use the Web * You are proactive * You have significant background in CS * You know three dominant paradigms: Functional, Imperative, Object-Oriented * You know Grinnell's three favorite languages: Scheme, C, and Java Introductions * Quiz! No. Let's look at the construction of a simple library * Function that computes x^n, where x is a double and n is a non-negative integer. * In C * Algorithm: Multiply x by itself n times double expt (double x, int n) { double answer = 1; int i; for (i = 0; i < n; i++) answer = answer*x; return answer; } /* expt */ * This is an O(n) algorithm * Sam's favorite exponentiation algorithm: Divide and conquer expt (x, n): if (n == 0) return 1.0 if (n is even) return square (expt (x, n/2)); return x * expt (x, n-1) * Translate to C double square (double x) { return x*x; } double expt (double x, int n) { if (n == 0) return 1.0; if ((n%2) == 0) return square (expt (x, n/2)); return x * expt (x, n-1); } /* expt */ What's the running time? * I dunno * O(n/2) = O(n) * O(log_2(n)) - intuitive * You cut it in half and half again and half again * Well, half the time, at least O(2*log_2(n)) = O(log_2(n)) What next? * Document! * Put in a code file * Add a header file * Write a main * Extend the main to take command-line input * Update that main so that we do reasonable error checking * Write unit tests! Whoops, we were supposed to write those first. Oh well. * Unit tests: Instead of "let's try these numbers and see the results", do "try this and this and this and this and this and this and tell me how many tests worked or failed." * How would you test an expt procedure? * Do you think black box or white box is appropriate? The reading: Implementing Regular Expressions * About thirty lines to implement a subset of regular expressions That covers 95% of the typical cases used by grep And implemented recursively. * By Brian Kernighan. Who is Brian Kernighan? * Author of K&R. THE book on C.Z * Author of another book: Software Tools * Long-standing member of Unix community * What is the goal of the algorithm * Takes a regular expression and a string * Returns leftmost shortest matching string * How does the algorithm achieve its goal Three interrelated algorithms * match (pattern, string) - match anywhere * match_here (pattern, string) - match starting at front * match_star (char, pattern, string) - match char* followed by matching the pattern in the remainder of the string match algorithm: Check to see if the first part of the pattern is '^' If so, we must match the rest of the pattern at the beginning of the string If not, try each potential starting position in the string until either (a) we find a match starting at that position (return 1) (b) we run out of string (return 0) match_here algorithm match_star algorithm * Why does he think this code is beautiful? * It's much more concise than most other versions of grep * And much more readable * Traditionally grep is implemented with the RE -> NFA -> DFA translation * It may be that the algorithm is beautiful, even if we don't find the C code beautiful * Why might the code not be beautiful * Hard to understand. External documentation, rather than internal documentation. * Assumes that we understand the underlying problem: What is a regular expression or why are they useful?