CAAM 420: HW Assignments

See the grading guidelines for information about grading policies and turn-in procedures.

NOTE: there are two types of homeworks - assignments, which will be frequent and not graded but discussed in class, and projects, which will (mostly or entirely) be done as group work and will be graded. Projects will be clearly identified below, and will be described in individual project pages linked to this one.

Whether assignments or projects, late homeworks will not be accepted.


  1. Assigned 26 Aug, due 28 Aug.

  2. Assigned 28 Aug, due 30 Aug. Use the emacs editor to write a text file explaining your background in computing and what you hope to learn from the course. If inspection of the syllabus suggests that you already have some knowledge of the topics to be covered, please say so. If topics you'd like to see are missing, please say that too.

    What operating system runs on the laptop computer you will use for coursework in CAAM 420? What is its relation to Unix?

    One page (or less) should be sufficient; don't worry too much about formatting, but make sure it's readable. Name the file "hw1.txt" and Upload to your repository by the due date.

    Structure your file as follows:

    1. first line: your name
    2. second line: your email address
    3. blank
    4. fourth line etc: text per instructions, one page max (with reasonable line width!!!
  3. Assigned 30 Aug, due 4 Sep. Format hw1.txt in LaTeX. Add at the bottom a single math display expressing a correct statement of some version of the Fundamental Theorem of Calculus (no need to include detailed statement of hypotheses, just a formula). MODIFIED 4 Sep: Submit your homework in a subdirectory hw_2 of your repository - after submission, "svn list https://svn.rice.edu/r/CAAM420_(YourNetID)" should return "... hw_2 ...", i.e. hw_2 should show up in the top level.

    The hw_2 directory should contain both the source, as hw1.tex, and the pdflatex output, as hw1.pdf.

  4. Assigned 4 Sep, due 6 Sep (that's a joke - nothing to turn in!). Read K&R sections 1.1 and 1.2. Find a history of the C language on the net, read it. The first two Google hits are good - an article by Ritchie, the "R" in "K&R", and the Wikipedia piece.
  5. Assigned 6 Sep, due 11 Sep (NOTE CHANGE!): read K&R section 1.2 very carefully for the use of the while control structure and variable assignment (I'll also discuss these in class). Then write a program to use the rand random number utility to produce repeatedly, and display (only) the first "random" number it generates that is more than 1/2 of the maximum value RAND_MAX. You can consult the notes for week 2 and the example program src/ex2.c for more information.

    submit a directory named hw_3 under your main repo directory, with your solution of the problem as its contents (hw_3/hw3.c).

  6. Assigned 11 Sep, due 16 Sep: Read Notes through section 11 - this assignment uses material through section 8.

    The 2-vector function (cos t, sin t) obeys the system of differential equations

    d/dt (cos t) = -sin t,
    d/dt (sin t) = cos t

    with initial conditions cos 0 = 1, sin 0 = 0.

    Euler's method to approximate the solution of this system on the interval [0,pi] goes like this: choose a positive integer N, and divide the interval [0, pi] into subintervals of length pi/N, with N+1 endpoints t0, t1, ... tn: the ith endpoint is ti = i(pi/N). Compute approximate values (c[i], s[i]) for cos ti, sin ti, i=0,...,N, by following this rule:

    c[0]=1, s[0]=0

    for i = 1,...N,

    c[i] = c[i-1] - s[i-1](pi/N)
    s[i] = s[i-1] + c[i-1](pi/N)

    Write a program implementing Euler's method. Store (c[i], s[i], i=0,...N) in C arrays of length N+1. Write a "for" loop to implement the recursive rule above. Print the value of c[i] for each i.

    Choosing N: Note that cos pi = -1. Try N=2, N=3, ..., until you find an N for which c[N] is a 1-digit approximation to -1, i.e. c[N] is between -1.1 and -0.9.

    Submit your work in a directory hw_4 under your repository root, containing your code as hw4.c. The submitted version should have N set to a value satisfying the 1-digit criterion stated in the last paragraph. If you use cstd.h, include it in your hw_4 directory.

  7. Assigned 13 Sept, due 16 Sept. An exercise which does not need to be turned in:

    The IEEE standard decrees that floats should consist of 32 bits, as follows:

    What is the largest decimal exponent of an IEEE float, in standard scientific notation: (plus or minus)(number between 1 and 10)(10 to exponent)? How many non-garbage decimal digits does the mantissa (the factor between 1 and 10) have?

    Having worked this out, write a program which prints

  8. Project 1. Assigned 20 Sep, [due 27 Sep] REVISION: DUE 30 Sep.

    Condition: You may consult with the instructor and TAs, who may provide relevant advice, but will not directly assist you in doing the assignment. You may not discuss the assignment with your fellow students prior to turnin.

    Lab: TAs will be available in Sewall 305 on Friday 27 Sep during the usual class hour to provide advice.

    Objective: write a program printing the binary representation of a float, in the following format:

    SIGN=... MANTISSA=... EXPSIGN=... EXPONENT=...

    The sign (SIGN) and exponent sign (EXPSIGN) should be printed as 0 or 1, meaning positive or negative. The mantissa should be printed as a string of 23 characters, each either '0' or '1'. The most significan bit, i.e. the coefficient of the highest power of 2, should come first (this may not be the way the bits are stored in the machine, but it is the natural way they would be presented in text). The exponent should be printed as a string of seven characters, each either 0 or 1.

    Your program file should be called float2bin.c. It should generate a command (say a.out) which is called as follows:

    ./a.out [real number in floating point notation]

    and print the binary representation in the format described above.

    Hints:

    1. A good first step is to get hold of the bits stored in the machine. To do that, the simplest trick is (1) declare a pointer to an unsigned int and assign it to the address of your float - you will need to cast the address of the float to an unsigned int address (i.e. pointer variable) to make the compiler treat the former as the latter, i.e.

      float x;
      ...
      unsigned int * u = (unsigned int *)&x;

      An unsigned int is just a pile of bits, so now *u has your 0's and 1's.

    2. However, when you print *u, it will look like a decimal, so that doesn't help. You need to convert the int to its binary rep. So write a function to do that, and use it to get an array of ints, all either 0 or 1. You know what the length of the array should be (32), so you can declare it in the standard array fashion in your main program, and pass it to the function.
    3. Now you have to figure out which is the sign, which 23 bits are the mantissa, etc. I suggest an experimental approach: choose some floats for which you know the answers, and see what bit arrays the last step produces. Alternatively, you can dig into the literature on IEEE float representation.
    4. NOTE ADDED 24 Sep: a couple of other peculiar things about IEEE 754 floating point standard:
      1. The Shifted Exponent: The exponent is coded as a binary number in the range 1-254. The actual value of the exponent is the coded value - 127: thus the actual exponents range from -126 to 127. The reason for coding the exponent this way, rather than with a sign bit, is simple: with a sign bit (which I will write first, though it could be last) and 7 exponent bits, 0 could be represented as either 00000000 or 10000000, since -0=0. Which means that the coding represents one less exponent value than you might have thought. However with the exponent + 127 stored instead of the exponent, there is no duplicate coding: 10000000 represents 128-127=1, and 00000001 represents -126. The largest representable exponent is then 11111110 which represents 254-127=127.
      2. The Hidden Bit:: if a number is not too small (less than 2^(-126)) and not too large, then it can always be represented as +/-1.xxx...x times 2^p with p between -126 and 127. The leading digit (1....) is always there for these normalized floating point numbers, so there is no point in storing it. This is the hidden bit. For example, 1/16 is (binary) 1.000... times 2^-4, and stored in the machine as 000.....0000 (23 zeros) with exponent 127-4=123 = 01111011. 3/16 would be stored as 100...0000 (22 zeros) with exponent stored as 01111100. And so on.

    Turnin: Submit a directory hw_5 to your repository, containing float2bin.c, and any other files (for example cstd.h, if you use it) necessary to run the program. It should be possible to type "gcc float2bin.c" in your directory and obtain the command (as a.out) performing the task described above.

    Grading: I will test your program by running it on several examples, to be revealed later.

  9. Project 2. Assigned 4 Oct, due 11 Oct.

    Condition: This assignment is NOT pledged. You may work with other students, and consult the TAs and Instructor for advice. HOWEVER, all text, including code, in your turnin directory must be your own. If you've worked with others on this assignment, include the names of the group members in a README file in the turnin directory.

    Lab: TAs will be available in Sewall 305 on Wed 9 Oct during the usual class hour to provide advice.

    Objectives: Enhance the cvector package from the class examples to include file i/o, matrices, and matrix-vector multiplication. Provide files containing the necessary function definitions, and program files (containing appropriate "main" functions) to write, read and display vectors, matrices, and matrix-vector products. Build the project with make.

    Specifically,

    1. Add functions to cvector/cvector.c to read and write VECTOR data to/from files. For a VECTOR v, the file should record the integer length (v.len), and the data (v.data) on the subsequent lines, one per line. For example, a vector of length three would be recorded in a file looking like this:
      3
      1.0000000e+00
      2.0000000e+00
      3.0000000e+00

      for example. Use the stdio file functions (fprintf, fscanf) to implement functions for reading and writing VECTOR data in this form, with these signatures:

      int vec_fprint(VECTOR, char *);

      int vec_fscan(VECTOR *, char *);

      vec_fprint should write the contents of its VECTOR to the file named in the "char *" argument: i.e.

      vec_fprint(v, "foo.vec");

      should write v to foo.vec in the format described above. Conversely,

      vec_fscan(&w, "foo.vec");

      should read the contents of the file foo.vec into the VECTOR w.

      Use the suffix ".vec" on all of your VECTOR files - that way, it's obvious which they are, without looking into them.

      VERY IMPORTANT: "trap" all errors you can anticipate. In the event of an error condition occurring in one of your functions, print an informative error message and return from the function immediately - for vec_fprint and vec_fscan, the return value should be non-zero in the event of an error, zero for normal return. specifically, trap these errors:

      • failure to open a file with the appropriate permission ("r" or "w")
      • failure to read the file length
      • failure to read a data value
      Note that fscanf may return either 0 or EOF if it fails to read a value.
    2. Define another struct (typdef struct mat {...} MATRIX) to hold the data of a matrix. Follow the pattern of VECTOR: the struct should contain (a) nrows = the number of rows and ncols = the number of columns, and (b) a float **, say data, so that data[i][j] contains the ith row, jth column entry in the matrix. ex16_multid_array.c uses this idea, and except for using malloc instead of stack-based arrays you will proceed in the same way: first allocate nrows*ncols floats, then an nrows float *'s which you initialize to point to the beginnings of each row. Something like this would work:


      float * vdata = ... (allocate nrows*ncols floats)
      float ** data = ... (allocate nrows float *'s)
      ... data[i] = &(vdata[i*ncols]) ...

      A question: you will need to allocate the float * vdata, in order to guarantee (this is worth it!) that the matrix data occupies a single contiguous block of memory. Will you need to keep vdata as one of the data members of the MATRIX struct?

      Write a set of functions precisely analogous to the functions in cvector.c, stored in a file cmatrix.c. For the file storage, use a similar format to the VECTOR storage mode: nrows on first line, ncols on the second line, then the vector data, row by row, starting with the first row, one float per line (so your file will have 2 + nrows*ncols lines). Be very careful to trap all errors that you can anticipate.

      Finally, add to cmatrix.c a definition of matrix multiplication, implemented in a function with signature

      int matvec(MATRIX a, VECTOR x, VECTOR b);

      Calling this function should store the product of a matrix (first arg) and vector (second arg) in another vector (third arg). Trap all errors - especially, check that the dimensions are compatible. Note that the guts of this function are already available in ex16_multid_array.c.

    3. Write main functions (each in a separate file) to produce these commands:
      1. testvec: reads vector length (an integer, say n) from the command line, writes the VECTOR with data[i]=(float)i, i=0,1,2...n-1 to the file test.vec, in the format described above. Example:

        testvec.x 10

      2. vecreader: reads file name from command line, reads a VECTOR from the file of that name, displays it using vec_print. Example:

        vecreader.x test.vec

      3. testmat: reads nrows and ncols from the command line, writes the matrix with data[i][j]=(float)(i+j) to the file test.mat Example:
      4. testmat.x 11 21

      5. matmult: reads files names for matrix and two vectors from command line, multiplies the matrix and the first vector and stores the result in the second vector (file name). Example:

        matmult.x a.mat x.vec b.vec

      Again, make sure that your commands trap the obvious errors, such as the wrong number of command line arguments, in addition to detecting and handling error conditions trapped by your functions.
    4. Modify the Makefile from the class src directory to build your project. It should be possible to type "make" in the project directory and have all commands built. Note that the command files will end with ".x", per the Makefile's rules.

    Turnin: Create a project directory named hw_6. Upload it to your repository DIRECTLY UNDER THE REPOSITORY ROOT DIRECTORY, i.e. it should appear in the directory listing as

    CAAM420_[yournetid]/hw_6

    In hw_6, include the following:

    1. Makefile
    2. modified cvector.c, cvector.h as described above
    3. new cmatrix.c, cmatrix.h as described above
    4. source files for commands:
      1. testvec.c
      2. vecreader.c
      3. testmat.c
      4. matmult.c

    Grading:

    1. 30 pts: typing "make" in the source directory builds all four of the commands listed above
    2. 10 pts: testvec.x and testmat.x produce files with correct contents
    3. 30 pts: use testvec.x to produce test.vec of length 10, testmat.x to produce test.mat of size 8 x 10, and

      matmult.x test.mat test.vec outp.vec

      produces the correct matrix-vector product of length 8, recorded in outp.vec.

    4. 10 pts: vecreader correctly displays outp.vec, and responds with a usage message (rather than seg fault) when supplied with an incorrect number of arguments.
    5. 20 pts: matmult.x responds with a usage message when supplied with the wrong number of arguments, and traps incompatibility between matrix and vector sizes.
  10. Project 3. Assigned 16 Oct, due 25 Oct.

    Condition: This assignment is NOT pledged. You may work with other students, and consult the TAs and Instructor for advice. HOWEVER, all text, including code, in your turnin directory must be your own. If you've worked with others on this assignment, include the names of the group members in a README file in the turnin directory.

    Objective: An exercise in debugging: use various debugging techniques to fix Prof. Warburton's REALLY bad code

    Details:

    really_bad_code.c exhibits a wide variety of bugs. Some are fatal (i.e. cause the resulting program to crash), some are not. The non-fatal bugs are in some cases harder to catch: they correspond to a failure of intention, that is, the program "works" but does not do what the programmer (obviously) intended it to do.

    Tools: you may use

    Turnin: Create a project directory named hw_7. Upload it to your repository DIRECTLY UNDER THE REPOSITORY ROOT DIRECTORY, i.e. it should appear in the directory listing as

    CAAM420_[yournetid]/hw_7

    In hw_7, include the following:

    1. your fix of really_bad_code.c, in a file named much_better_code.c.
    2. LaTeX source (.tex) and output (.pdf) files documenting your fix. The title should be "Project 3", and the author should be you. The document should have three sections:
      1. Introduction: describe what the programmer of really_bad_code.c ``intended'' the program to accomplish (other than illustrate bad programming practices), and your evidence that your ``fix'' actually accomplishes the intended task;
      2. Bugs and Fixes: describe the programming errors you found, how you found them, and what you did to repair them.
      3. Lessons: propose programming practices that would tend to prevent programmers from committing the errors demonstrated in really_bad_code.c
      Include any references you found useful in this exercise.
    3. a README text file including the names of all people who worked on your solution with you.

    Grading:

    1. 30 pts: "gcc much_better_code.c; ./a.out" (followed by appropriate input, then return) in your hw_7 directory produces the correct output;
    2. 20 pts: "valgrind --leak-check=full ./a.out ..." reports no memory leaks or other memory errors
    3. 10 pts: your document correctly describes the ``intention'' of the programmer
    4. 20 pts: your document describes the steps you took to find and repair the bugs in the example code;
    5. 20 pts: the final section of your document gives useful advice for avoiding the errors presented in really_bad_code.c.
  11. Project 4: Assigned 21 Oct, due 1 Nov.

    Condition: This assignment is pledged. You must carry out the work yourself, without assistance from other students. You may consult the instructor and TAs who may give you general advice or information about C/C++ programming, Unix, Make, etc. but may not answer specific questions about the assignment. You may use the course website and other text and web resources. You must cite all resources used - title, author(s), publisher, date for books, URL for web sites. Your README text file turned in with the rest of the assignment should contain the standard Rice pledge.

    Objective: Create classes in C++ expressing the vector and matrix functions you developed in project 2 as class methods. Add the capability to read/write vector data to/from files. Secondary objectives: learn to manage a typical project directory structure using a tree of Makefiles.

  12. Details:

    1. Part of this exercise concerns working out how to add to VecPPDense some kind of initialization from a file. Since there is no post-construction initialization set up for VecPPDense, the initialization from a file has to happen in a constructor. Accordingly, add to VecPPDense a constructor to do this job. Your constructor will look something like

      VecPPDense(const char * filename) {...}

      In the body of the constructor function, you will open the file with name filename (there is no intent to change the contents of filename, so pass a const char * rather than a char *), read the length from the first line into the len data member, allocate the double array, and read the doubles (if you use scanf, remember that you need to use the "l" prefix, to get full length, for example %le) into the array. Trap all likely errors, eg. not having enough lines in the file.

      Note the big advantage to putting the read-from-file initialization in a constructor: since you will construct the vector precisely tailored to the contents of the file, there is no need to check eg. that the len data member is equal to the int on the first line of the file!

      Add a save method, with signature

      void save(const char * filename) const {...}

      whose function is analogous to vec_fprint from VECTOR - writes the vector data out to the file filename in the by-now familiar form.

      Note that neither of these methods (construct-from-file, save-to-file) appears in the base class VecPPBase as it exists in the course src directory - these will be your innovations.

      Also note that you will keep the constructor already present in VecPPDense, with the int argument = vector length. This function will construct vector workspace in some of the examples to come, whereas the constructor that reads from a file will construct vectors based on archival vector data.

    2. Create a matrix base class named MatPPBase, patterned on VecPPBase. The example in the src directory, in matbase.hh, gives a reasonable pattern to follow, for dimension and element access functions, and you may consult it. It's a bit different from what is required here so I don't advise copying it, though. In particular, your class should provide a pure virtual element indexing method: the (i,j) element of a MatPPBase object m will be retrieved by m(i,j). You should provide both const and non-const versions of this indexing operator. It should also provide a pure virtual reporting method named "write", analogous to VecPPBase::write, and all the usual constructors and destructor.

      In addition, provide a virtual method named apply with the signature

      virtual void apply(VecPPBase const & x, VecPPBase & y) const {...}

      Use the element access methods of VecPPBase and MatPPBase to implement this method, which has the same function as matmult in your Project 2. It is virtual (so could be replaced or overridden in a subclass) but implemented in the base class - subclasses can (but don't need to) use the base class implementation, as is the case with axpy in VecPPBase. Your implementation should use the access functions to define the loops to multiply the input vector x by the matrix represented by the MatPPBase object on which you call apply, and store the result in the output vector y.

    3. Write a simple MatPPDense class patterned after MATRIX, analogous to VecPPDense, that stores the the matrix data just as MATRIX did, and implement all of the pure virtual functions in the base class. Use the apply method of the base class - don't override it. Since you are just using it, you don't need to declare the implemented base class apply method - if you do, the compiler will think that you are overriding it.

      As for VecPPDense, implement a constructor that reads the dimensions and data of the matrix from a file, as your mat_fscan function did in Project 2. Also write a save method - you can figure out its signature - with the same function as mat_fprint.

      Be careful about trapping all the errors you can anticipate - not being able to open the file, not having the right number of lines, not being able to translate the file data into doubles, etc. etc.

    4. Test everything!! As your final test, write another matmult program that constructs a matrix A from a file (name = command line argument), constructs a vector from a file (name = command line argument), calls the apply method to multiply the vector by the matrix, storing the result in a second vector, and saves the second vector to a file (name = command line argument). This example has exactly the functionality of matmult.x in Project 2..

    Important Note: The class examples VecPPBase and VecPPDense are written entirely in header files (xxx.hh). This choice is fairly common in C++ programming, probably because one of two important modes of abstraction (generic programming, implemented by class templates) demands it. The function definisions in these classes, and yours, could be moved to definition files (xxx.cc), and there are advantages to doing so when it's possible. However I advise you to follow my example for the time being and write all of your code in header files, which you #include in your main program files.

    Turnin: Submit the code you have written in this project in the form of a typical Unix package, after the pattern of cvectorpkg in the course src directory. Call the package hw_8, and place it just below your main repository directory (i.e. CAAM420_xxx/hw_8). In the directory, include directories named include (header files) and main (main program files). You will not need a lib directory for this example (if you keep all of your code in header files, as I advise). Use Makefiles lightly modified from the makefiles in cvectorpkg to organize your directory.

    The header files VecPPBase.hh, VecPPDense.hh, MatPPBase.hh, and MatPPDense.hh should reside in include, the main program file matmult.cc (and any other test code) should reside in main. You will need makefiles at the top level and in main.

    Grading:

    1. 30 pts: it is possible to type "make" in the top level directory (hw_8) and build the project, especially the matmult.x command.
    2. 40 pts: calling matmult.x with various correct and incorrect data yields appropriate results - including correctly dimensioned data, incorrectly dimensioned data, missing filenames, etc.
    3. 30 pts: the classes you have created have the functions described in the assignment, with correct signatures.

  13. Project 5. Assigned 1 Nov, due 11 Nov.

    Condition: This assignment is NOT pledged. You may work with other students, and consult the TAs and Instructor for advice. HOWEVER, all text, including code, in your turnin directory must be your own. If you've worked with others on this assignment, include the names of the group members in a README file in the turnin directory.

    Objectives: 1. Explore recursive data structure and class templates by creating a doubly linked list class. 2. Use the doubly linked list as a container for command line arguments, implement associative array behaviour (access by keyword).

    Details: This project has two parts.

    In the first part, you will modify the linked list class template "slist < T >" from the lecture (src directory) to be create a doubly linked list class template "dlist < T >", based on a double link "dlink < T >". What this means: each dlink will refer to the previous link, as well as to the next one - slinks only referred to "next". Your class will include all of the methods of "slist" in "dlist", along with

    void operator--() const;

    void prepend(T const &);

    Applying operator-- will make the "finger" point at the previous link in "dlist", just as operator++ in "slist" makes it point at the next one. The function "prepend" is supposed to add a dlink to the front of the list, just as "append" added a slink to the back of the "slist" list.

    Just as for "slist", your doubly linked list "dlist" will be a class template - a doubly linked list of "anything".

    The second part of this assignment uses the doubly linked list data structure to implement an associative array - like an array, but with strings instead of integers as indices. This is a standard and familiar data structure - any sort of directory, eg. a telephone directory, works this way. You will use your associative array specifically to store command line arguments of the form "key=value", and will provide access to the values by key.

    Specifically, define a class (can be a struct, i.e. class with public data members) StringPair of pairs of C++ strings. The names of the two data members in the struct should be "key" and "value".

    Then define a class ParList, with the following properties:

    1. private data member of type dlist < StringPair > (caveat added 08.11 - see Hint, below)
    2. default construction - in fact its constructors are just the usual two, default and copy (what should these do with the data member?);

      HINT: It's not obvious how to build a default constructor with a dlist object as private data, because a dlist has to be initialized with some object of its template type (a string pair, in this case). Can be done - you could choose some kind of standard starting pair, and ignore it in all subsequent operations. However that's icky. A much easier way to proceed: instead of a dlist < StringPair > member, give ParList a dlist < StringPair > * data member, initialize it to NULL on construction of the ParList. Then in the addFirst and addLast methods, check to see if dlist is NULL, and construct it if necessary via operator new. Then use as usual. This is effectively post-construction initialization. If you go this route, you will need to be careful in the definitions of operator++, write(), etc. that you handle correctly the case when your dlist data member has not been allocated - but you will know, because the value of the dlist * will be NULL!!

    3. a method with signature

      bool addFirst(const char *);

      which (i) picks apart the character array, assumed to have the form "key=value", into the two subarrays on either side of the "=" signe, stores these as strings "key" and "value", then (ii) prepends the string pair (key,value) to the front of the dlist data member and returns true, or (iii) returns false if it can't accomplish this goal (for example because the char * arg does not have the form "key=value")

    4. a method with signature

      bool addLast(const char *);

      similar to addFirst, which appends a StringPair to the back of the dlist [Note: you may want to write a function that converts a "key=value" string to the StringPair ("key","value"), and use it in each of the add... methods - to avoid having to write the same code twice!]

    5. forward (operator++) and backward (operator--) increment methods, front() and back() methods, and access method (operator()), which you can implement by simply calling the methods of the same names on the dlist data member [Note - you'll only need the const version of operator()];
    6. a sensible reporting method

      void write(ostream & str) const;

      which writes out the keys and values stored in the dlist in a nice readable form.

    Finally, you will need to define several standalone function (not class members) of the form

    bool findFirst(ParList const & par, string key, T & val);

    bool findLast(ParList const & par, string key, T & val);

    which assign to the variable of type T a value obtained by looking up the first, resp. last, occurrence of a pair (key, something) in the ParList par, then converting (something) to type T. So for example if a ParList par contains StringPairs {...,(foo,1),...,(foo,2),...}, then

    string key="foo";
    int v1=0;
    int v2=0;
    findFirst(par,key,v1);
    findLast(par,key,v2);

    will return v1=1, v2=2.

    Define these functions for the cases

    1. T=int
    2. T=float
    3. T=double
    4. T=string

    NOTE: these functions are not (should not be) implemented as templates - just as overloaded functions of the same name. That is,

    bool findFirst(ParList const & par, string key, int & val);

    bool findFirst(ParList const & par, string key, float & val);

    ...

    This works because C++ identifies functions not just by their name but by their signature - function name and number and type of argument. Thus the C++ compiler views the two lines above as declaring different functions, which can be provided different implementations. When the functions are called, the compiler resolves which code to insert by the name, and if the names are the same by the argument types.

    NOTE: Overloading functions of the same name this way is a C++ feature - C does not allow you to declare functions of the same name but different arguments in the same program (try it, see what happens!) You can fake it in C using the variable argument list mechanism (such as is used to allow printf to accept any number of arguments of any type), but that's fairly complicated and error prone.

    Because you will have a different function for each type, you can implement the functions returning numerical types (int, float, double) by using sscanf - look up the value (string) associated to the key argument in the ParList, use the string method c_str() to extract the C string (const char *) from the value (C++) string, and pass that as the first arg of sscanf, with the second being an appropriate conversion string and the third being a pointer to the variable being updated.

    Turnin: Submit the code you have written in this project in the form of a typical Unix package, after the pattern of cvectorpkg in the course src directory. Call the package hw_9, and place it just below your main repository directory (i.e. CAAM420_xxx/hw_9). In the directory, include directories named include (header files), lib (source or implementation files) and main (main program files). Use Makefiles lightly modified from the makefiles in cvectorpkg to organize your directory.

    The include directory should contain a header files named "dlist.hh" "parlist.hh", that is, your implementation of dlist - since dlist is a class template, it must be included in every source file that uses it - and declarations for ParList. You can also include all of the implementations for ParList in the header file, if you like (simpler than breaking them out in a separate source file).

    Write a single source (implementation) file, called "parfun.cc", for the findFirst and findLast functions and include these in lib; create a library (archive file) named libpar.a, following the pattern of the cvectorpkg, to hold the object file parfun.o.

    The main directory may be empty, or contain your test files.

    Create makefiles at the top level and in lib and main, as in cvectorpkg, so that typing "make" in the top level builds all components of the project.

    Grading:

    Project will be graded by using your code with our "main"s:

    1. For part 1, we will use a test program file that looks like this:

      #include "cppstd.hh"
      #include "dlist.hh"

      int main() {
          dlist < int > lint;
          dlist < float > lfloat;
          dlist < string > lstring;
          /* ... a bunch of instructions that exercise the dlist
          methods, per above */
      }

      We will check that the list behaves as expected by periodically printing its contents.

    2. The main contemplated use of ParList is to store command line arguments for use in functions other than main. Accordingly, for part 2, we will compile a test program along these lines:

      #include "cppstd.hh"
      #include "parlist.hh"

      void partest(ParList const & par) {
          /* code using findFirst and findLast to extract values
          by key from the ParList argument, testing int, float,
          and string versions, also nonexistent keys and cases where
          the same key appears twice (so that findFirst and findLast
          will return different values)
          */
      }

      int main(int argc, char ** argv) {
          ParList par;
          for (int i=1;i < argc;i++) {
              if (par.addLast(argv[i]))
                  cout << "add " << argv[i] << "\n";
              else
                  cout <<"failed to add " << argv[i] << "\n";
          }
          // check that command line args read correctly
          partest.write(cout);
          // add some more key=value pairs using addFirst and addLast, check again
          // then run test function, produce more output
          partest(par);
      }

      A typical command line for this test might look something like

      ./a.out fruit=apple quantity=10 weight=12.4 fruitcake

      which would result in a ParList with three entries:

      key=fruit value=apple
      key=quantity value=10
      key=weight value=12.4

      and one failure to parse a command line arg (fruitcake). The partest function could be built to extract the three values using string, int, and float versions of findFirst, respectively. It will also test for correct behaviour when a key is not found, etc.

    Point distribution:

    1. entire project (library, any commands) builds by typing "make" at top level - 15 pts
    2. test programs compile and link with your header files and library - 15 pts
    3. test of doubly linked list class behaves as expected - 30 pts
    4. test of ParList successfully parses key=value pairs from the command line (10 pts), permits addition of additional pairs at either front or back of list, (10 pts) and passes the list information to the partest function where it can be successfully extracted using findFirst and findLast (20 pts).
  14. Project 6. Assigned 11 Nov, due 22 Nov. [NOTE NEW DUE DATE]

    Condition: This assignment is NOT pledged. You may work with other students, and consult the TAs and Instructor for advice. HOWEVER, all text, including code, in your turnin directory must be your own. If you've worked with others on this assignment, include the names of the group members in a README file in the turnin directory.

    Objectives: 1. More experience in use of class templates and exceptions via creation of matrix class template. 2. Coupling C with C++: use CLAPACK to solve linear systems, use template specialization to select float vs. double data representation.

    Details:

    1. Morph your VecPPDense class from Project 4 into a class template named VecTDense, a subclass of VecTBase from the class examples. Much of it can be modeled after a similar example in class, but you will need to somehow change your constructor-from-file and save method to deal with the template parameters. The simplest way to do this is with C++ i/o, specifically std::ifstream and std::ofstream. Good online references for these classes (and most anything else you want to know about C++) are found in

      www.cplusplus.com/reference

      It's also possible to solve this problem with C techniques (scanf, printf), but harder, as somehow you must determine the type and then choose format conversion strings based on that info.

      We will test only float and double cases, so if you do something that works in these cases that's fine. We will not test any other template parameter cases.

    2. Having got through that, do the same with your MatPPBase and MatPPDense classes from Project 4: call the class templates MatTBase and MatTDense.
    3. Convert all trapped errors to throw exceptions - no "exit(1)" stuff.
    4. Now add an a method named "solve" to your MatTDense class, which solves a linear system. Its signature should be similar to that of "apply":

      void solve(VecTDense < T > const & b, VecTDense < T > & x) const;

      This will solve the linear system Ax=b, where the matrix object on which you call the method stores A. Note that we might as well not bother adding this method to the base class, as you will explicitly use the CLAPACK functions which require access to arrays of values. That is, this method is an additional attribute of the derived class MatTDense, unlike "apply" which is a virtual method implemented in the base class MatTBase (and which could be overridden in the derived class, though such an override is not a required part of this assignment).

      See the example lapacktest.c in the course example directory for the usage of CLAPACK to solve linear systems.

      The "solve" method will need to be implemented with template specialization, because only double and float cases make sense. There are two ways to go: (1) specialize the entire class, or (2) delegate the CLAPACK interface to an external function, much as was done in the course example vectlapack.hh. I recommend the second method. Note that the general template case MUST throw an exception, as we have no implementation of the solve that works for general types.

    Turnin: Submit the code you have written in this project in the form of a typical Unix package, after the pattern of cvectorpkg in the course src directory. Call the package hw_10, and place it just below your main repository directory (i.e. CAAM420_xxx/hw_10). In the directory, include directories named include (header files) and main (main program files). You will not need a lib directory for this example (if you keep all of your code in header files, which is more or less necessary as templates cannot be compiled). You may also leave off the main directory in this case, or use it as the location for your test codes (in which case your job will be easier if you use a makefile).

    The header files vectbase.hh, vectdense.hh, mattbase.hh, and mattdense.hh should reside in include.

    Grading: We will supply the main program, incorporating your header files, and solve several linear systems, printing out matrices and vectors and testing for accurate solution of a problem with invertible matrix, and for reasonable exception handling when the "solve" method encounters a singular matrix.

    1. Vector base class template and derived class template for dense vectors, correctly loading both float and double data on construction from files of the format introduced in Proj. 2, write methods displaying the vector data: 25 pts
    2. Matrix base class template and derived class template for dense matrices, correctly loading both float and double data from files of the type introduced in Proj. 2, write method displays matrix data in readable form: 35 pts
    3. calling solve method correctly solves an invertible system for both float and double cases, also throws exception upon encountering singular matrix: 40 pts

    Additional note 20 Nov 2013: We will supply a test directory with makefile very similar to src/cppvectorpkg/main/Makefile. The test program, double case, will closely resemble the following code, (a similar program will test the float case):

    #include "mattdense.hh"

    int main(int argv, char ** argv) {
        try {

            MatTDense< double > m("matrix.txt");
            VecTDense< double > b("rhsvec.txt");

            cout<<"\nmatrix:\n";
            m.write(cout);
            cout<<"\nright-hand side vector\n";
            b.write(cout);

            VecTDense< double > x(m.getNCols());

            m.solve(b,x);

            cout<<"\nsolution:\n";
            x.write(cout);

            exit(0);
        }
        catch (CAAM420Exception & e) {
            e.write(cerr);
            exit(1);
        }
    }

    Important: note that your matrix class needs a getNCols() function that returns the number of columns. If I were writing it, I'd also implement a getNRows() function that returns the number of rows, but the number of columns is all you need here.

    Word to the wise: it might be a good idea to make sure that you can implement this or something very similar and get it to work properly!

  15. Project 7. Assigned 22 Nov, due 6 Dec.

    Condition: This assignment is pledged. You must carry out the work yourself, without assistance from other students. You may consult the instructor and TAs who may give you general advice or information about C/C++ programming, Unix, Make, etc. but may not answer specific questions about the assignment. You may use the course website and other text and web resources. You must cite all resources used - title, author(s), publisher, date for books, URL for web sites. Your README text file turned in with the rest of the assignment should contain the standard Rice pledge.

    Remark: It may be (or maybe not) that Brooks' law will make you feel better about a pledged project. Look it up.

    Objectives: 1. Synthesis of a complete application using classes, templates, exceptions, and other C++ techniques, also coupling to an external library, to solve a prototype scientific programming problem. 2. Present results in a report, including graphics, using LaTeX.

    References on Least Squares: The CAAM 335 notes, available online, are excellent in general and in particular have a very nice chapter (6) on least squares. Wikipedia (google "least squares") is not bad, and has an excellent historical review. Note that we are dealing only with linear least squares, so called, and are using what they call Tikhonov regularization.

    Details:

    1. modify your MatTDense::solve function to solve the least-squares problem

      minimize (Ax-b)^T (Ax-b).

      instead of the linear system Ax=b. Use the LAPACK subroutines dgels (double) and sgels (float). Read the documentation (at the top of the files in CLAPACK-3.2.1/SRC, or you can find it online) carefully to understand how to use these functions. You can also consult my sample code src/ls.c. An interesting feature, which the *gesv functions do not have: you can specify whether the matrix or its transpose is to be used in formulating the problem. Surprise consequence: you do not have to transpose the stored matrix from row-major (C style) to column-major (Fortran style, used by LAPACK) to use these functions.

      Another surprise consequence: now your solve(...) method will work for any shape of matrix, i.e. any positive nrows, ncols. The meaning of the solution is different in different cases, however, as explained in the documentation. One special case: if A is square and invertible, the least squares solution is the same as the linear system solution, except for possible loss of precision in the last few digits.

    2. Add a boolean "transp" argument to the "apply" method in MatTBase:

      void apply(VecTBase < T > const & x, VecTBase < T > y, bool transp=false);

      If transp=false (the default value), the apply method should multiply x by the matrix and store the result in y, as before. If trasp=true, it should multiply x by the transpose of the matrix. Note that this is very easy to implement: instead of (sum over j) a_ij x_j, the ith component of the output should be (sum over j) a_ji x_j. Of course the "sanity" test for vector lengths changes in an appropriate way in the transp=true case.

      This method has an obvious role in manipulating least squares problems - you need the transp=true case to compute the gradient of the least squares function.

    3. Devise small (like 2x3, 3x2, 5x10, etc.) test cases to verify that your code works properly. It is not quite as easy to create test problems, as it was for linear systems: if you choose the solution x first (the "backwards" method of test construction, usually the best), then it is the solution to any least squares problem with RHS vector b = Ax + y, where y is any vector perpindicular to the column space of A, i.e. in the null space of A^T. Since computing the perp of the column space of A is itself a least squares problem (or many least squares problems, depending on how you look at it), this program is not so simple to carry out. My suggestion: (a) choose x; (b) choose b (also arbitrarily); (c) compute the A^T A (by hand, or by applying A^T to the columns of A, for which you have to write and debug a program); (d) compute A^T b (now you see why you need the transp=true case of apply()); (e) solve the normal equation A^TAx=A^Tb using your code from project 6. If you are confident that your project 6 solution was right, this process will give you correct test data. You can also use Matlab to build test data, if you wish.
    4. Application problem: image deblurring.

      1. Use src/gaussblur.c to produce a Regularized Gaussian Blurring Matrix of size (nrows+ncols)xncols, nrows=201, ncols=101. Use shift=50, sig=0.3, and several values of lam (see below). This matrix stacks a (nrowsxncols) "blurring" matrix, whose columns are Gaussian distributions centered at row j+shift in column j, on top of a ncolsxncols regularization matrix = the ncolsxncols identity matrix scaled by the factor lam.
      2. Using your matrix class equipped with least-squares solver via LAPACK, solve the regularized least-squares problem with matrix produced by src/gaussblur.c as in the previous item, and right-hand side stored in the file src/proj7.vec (download this!). Note that src/proj7.vec has nrows=201 rows: to set up the least squares problem, you will need to append an additional ncols=101 rows of zeros to it.

        Construct solutions for five values of the regularization parameter lam = 10^p, with integer p ranging from -2 to 2. For each choice of lam, use matplotlib and the reader.py module to plot the solution and save a pdf of the plot (see below for instructions). Also record the mean-square residual, i.e. (Ax-b)^T(Ax-b) for each choice of lam. [If you read the documentation carefully, you will find that use of the LAPACK least squares solver has already computed Ax-b for you - you do not need to compute it again, just calculate the sum of squares!]

      3. The actual mean-square noise in the data is approximatly 2.65 (this data is dimensionless!), which is 1.3 times the noise-free data mean square (so the RMS noise level is about 115%). Interpret your results of in terms of the Morozov discrepancy principle: the regularization parameter lam should be chosen to as small as possible subject to making the data residual mean square approximately equal to the mean square noise. [Of course, you can assess this principle because the problem is fake, and I actually know the noise level!]
      4. Given the additional information that the data was produced by applying the Gaussian blur to a vector x with non-zero values only in two entries, estimate which entries these were, i.e. where were the non-zeros in the vector x you are attempting to recover by least squares?
      5. Your modified matrix class template allows you to carry out this task with both float and in double representations of the real numbers. Whichever you chose, now repeat the same set of experiments - does it make any appreciable difference to the results?

    5. Report: write a report describing your project. Include

      1. An Introduction section describing the goal of the project
      2. A Methods section describing the modifications you made to your matrix class template from Project 6, and the driver (main) code you created for this project.
      3. A Results section including plots of the data vector proj7.vec, and of all five of your solution vectors for various values of lam. Provide an informative caption for each figure. You can find a model in src/lex1.tex. Also, estimate the positions of the non-zeros in the "true" solution vector that I used to create the data proj7.vec. State your observations about the consequences of using single (float) vs. double precision arithmetic.
      4. A Dicussion section in which you interpret the result in terms of the Morozov discrepancy principle, stated above.

      Ordinarily, a report of this type would also include a literature review section and a final conclusion section, however these do not seem appropriate to this assignment.

    Plotting with MatPlotLib:

    Both ipython and matplotlib have excellent online manuals - I recommend them.

    Turnin:as usual

    Grading:

    1. 40 points for following the instructions above to modify the matrix class template from project 6 - solve now calls *gels, and apply has a transpose flag.
    2. 30 points for correct results, per plots in your writeup
    3. 30 points for an adequate paper - 10 for methods, 10 for results, and 10 for discussion