CompSciWeek8

Reading

Review previous chapters and complete your midterm projects by Friday!

  • Advertisement: Use the Slurm queue on USF Circe for an overall better user experience (sign up at the rc user portal)!

Classes 1 & 2

  • Example project walk-through
  • Demonstrates use of git, Python ctypes, class objects, sed, and batch job organization
  • Complexity of searching for a key
  • Design pattern: sorted list + bisection search vs. hashing (see the first sketch after this list)
  • Design pattern: sorting an index to recover a permutation (sketched below)
  • Working with permutations
    • Random permutation, conditional probability concepts (see the Fisher-Yates sketch below)
  • Complexity and solution of the Tower of Hanoi problem (recursion sketched below)
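A minimal sketch of the two key-search patterns above (the data and key names are invented for the example): bisection on a sorted list costs O(log n) per lookup, while a dict (hash table) lookup is O(1) on average.

    import bisect

    data = ["ala", "arg", "gly", "leu", "val"]   # already sorted
    key = "gly"

    # Pattern 1: sorted list + bisection search, O(log n) per lookup
    i = bisect.bisect_left(data, key)
    print("bisection:", i < len(data) and data[i] == key)

    # Pattern 2: hashing, O(1) average per lookup
    table = {name: pos for pos, name in enumerate(data)}
    print("hashing:", key in table)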
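Sorting an index rather than the data itself gives the permutation that orders the data, and inverting that permutation tells you where each original element landed. A short sketch with made-up numbers:

    x = [3.2, 1.5, 2.7, 0.9]

    # perm[k] = index of the k-th smallest element
    perm = sorted(range(len(x)), key=lambda i: x[i])
    print([x[i] for i in perm])          # the data in sorted order

    # invert the permutation: inv[i] = rank of element i after sorting
    inv = [0] * len(perm)
    for rank, i in enumerate(perm):
        inv[i] = rank
    print(perm, inv)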
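One standard way to generate a random permutation is the Fisher-Yates shuffle, sketched below (this is not the class code; random.shuffle does the same thing). The conditional-probability argument: each swap picks j uniformly from the positions not yet fixed, given the earlier swaps, so every arrangement is equally likely.

    import random

    def random_permutation(n):
        """Return a uniformly random permutation of range(n) (Fisher-Yates)."""
        p = list(range(n))
        for i in range(n - 1, 0, -1):
            j = random.randint(0, i)     # uniform over 0..i, given the earlier swaps
            p[i], p[j] = p[j], p[i]
        return p

    print(random_permutation(7))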
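The Tower of Hanoi recursion makes its complexity transparent: moving n disks means solving two (n-1)-disk problems plus one move, so T(n) = 2 T(n-1) + 1 = 2^n - 1. A sketch (not the class code):

    def hanoi(n, src="A", dst="C", via="B"):
        """Print the 2**n - 1 moves that transfer n disks from src to dst."""
        if n == 0:
            return
        hanoi(n - 1, src, via, dst)      # move the n-1 smaller disks out of the way
        print("move disk %d: %s -> %s" % (n, src, dst))
        hanoi(n - 1, via, dst, src)      # move them back on top of disk n

    hanoi(3)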

Midterm Project Outline

  • Motivation - explain where this problem comes from / where the solution applies
  • Problem Description - explain what makes your problem difficult
    • If more than one paragraph of reviewing previous work is needed, put here. (ex. the termemulator project by Siva Chandran provides a great example of using the subprocess module to run shell programs with interactive user input. It provides ..., but does not solve ... This work builds on its use of subprocess.)
  • Code - explain what new code / structure / algorithm design improvement you have made (ex. My job-server design queries a directory for new problem descriptions, then either uses stored data to directly compute the result or else creates a new SGE job to produce more stored data.)
  • Results and Discussion - present numerical results (if applicable) or discuss the trade-offs involved in your design choices (ex. This strategy automates stored data computation and management, saving more time than it required to write the server. Manual management took ... per data set, automatic management required ... to write. It will not scale beyond 100 Gb, or work for problem ..., etc.)

Note that the above sections are suitable for this project, whose goal is to show proficiency in going from a scientific problem to a structured piece of code. They are inadequate for a scientific paper, for which much more effort needs to be spent on defending a single claim, anticipating the prejudices of your audience (this means literature review), and making understandable figures to illustrate your defense.

Homework 5, Due Wed., Oct. 22

Refer to the Binary Tree Code for all except the last question. Email your homework answers with "SciComp HW5" in the subject line, using a text file format that can be input to text2pdf. Specifically, each top-level question (1-7) should start with = Question 1, = Question 2, etc. at the beginning of a line.

  1. Write a one-liner using these functions to print the binary encoding for 'd' in the prefix code for "abcdefg".
  2. What is the asymptotic worst-case complexity (in big-O notation) of each of the following operations:
    1. decode
    2. create_tree
    3. get_codes
  3. What is special about get_codes?
  4. Write an inductive proof for the following properties of get_codes:
    1. The iteration covers all leaf nodes under the current node. (hint: remember that an inductive proof has a base case and a recursive case, and spell out more precisely what "all leaf nodes under the current node" means)
    2. At each internal node, the set of addresses generated forms a prefix code (i.e. no code is a prefix of another code).
  5. A balanced binary tree has approximately equally sized left and right sub-trees. Why does the tree remain balanced when the input to create_tree is sorted?
  6. Compare the design choice made here of separate functions with the alternative of putting some of the functions (decode, create_tree, etc.) inside the class TreeNode. Which are the best candidates for turning into class methods? If create_tree is used as TreeNode's __init__ function, what would be lost?
  7. Calculate the average-case complexity of list.index(), which searches for an item X in a list of size n by checking each list element (0, ..., n-1) in order. Use the assumptions that (1) the probability that X occurs in the list is 2/3, and (2) if X occurs in the list, it is equally likely to appear at any position.