CompSciWeek4

From Predictive Chemistry
Jump to: navigation, search

Reading Assignment

  • Algorithms, Chapter 2-3

Class 1: Formal Proofs

  • Complexity Notation
  • Rules - state assumptions, use logical operations on assumptions
  • Modus Ponens - (a, a =>b) => b
  • Modus Tollens - (a => b, ~b) => ~a
  • Disjunctive Syllogism - ((a v b), ~a) => b
  • Contradiction - (a, ~a) => Yikes!
  ex. from 1+1 = 1, prove that I am the Pope.
  A: The Pope and I make 2, but 1+1 = 1, so I and the Pope are
     one and the same.
    • This is a famous argument from the Cambridge mathematician G. H. Hardy, written to satisfy J. E. McTaggart's doubt that anything can be proven from a false proposition. It was cited in the first chapter of Sir Harold Jeffreys' "Scientific Inference" and in a discussion discounting set theory in appendix B to Jaynes "Probability Theory, the Logic of Science."
    • Anything that proves a false statement is false.
    • If a then False => ~a
  • Induction

Class 2: Code Profiling Examples

  • Creating and loading a module
  • formatting the results as a flat text file
  • reading text files into arrays (numpy) and plotting (pylab)

Profiling and Plotting Code

lib/ltime.py <source lang="python">

  1. Functions to time

from random import random

def lappend(n): l = [1]*n l.append(random())

def lmod(n): l = [1]*n l[0] = random()

def make_list(n): l = [1]*n

def rand(n): random() </source>

Make sure that PYTHONPATH includes your lib directory (where ltime.py lives), e.g.

 PYTHONPATH=/usr/lib/python2.7/site-packages:$HOME/scicomp/lib

time.sh <source lang="bash">

  1. !/bin/bash

if [ $# -ne 1 ]; then

 echo "Usage: $0 <function name from ltime.py>"
 exit 1

fi

cmd=$1

k=1 for((n=1;n<8;n++)); do

 echo -n "$k "
 python -m timeit -s"import ltime as m;" "m.$cmd($k)" | sed 's/.*[^0-9]\([0-9]*\.\)/\1/; s/ usec.*//;'
 k=$(($k*2))

done </source>

analyze.py <source lang="python"> from numpy import * import sys, pylab

assert len(sys.argv) == 2, "Usage: %s <function name>"%sys.argv[0]

name = sys.argv[1] x = fromfile(name+".dat", sep=' ') x = reshape(x, (-1,2)) pylab.plot(x[:,0], x[:,1]) pylab.savefig(name+".png") </source>

Both the shell and python scripts above have been modified from the version in class to show how to read command-line arguments. In the shell, the number of arguments is in the variable $#, and the if statement checks for the right usage. $0 is the program name and is not counted in $#. In python, the arguments are in the list, sys.argv, sys.argv[0] is the program name, and is counted in len(sys.argv). The assert statement checks for the right usage, and immediately terminates with an error (raises an AssertionError exception) if the assertion fails.

Homework 3 - due Monday, Sept. 29

  • Coding Problems
    • Write a code to build a graph representation of what's inside the /usr/share/X11 directory on circe. Use (os.walk and/or os.listdir, os.path.join). see example graph representation
      1. How many directories are there total?
      2. What is the maximum depth, counting files and dirs so the depth of a dir with no files/subdirs = 0, and with any files/subdirs = 1?
      3. How many references are there to each directory -- node in-degree? (ignore symlinks)
      4. What is the maximum number of references from a directory (i.e. maximum out-

degree)?

  • Problems from Algorithms in Python:
    • 2-4 (read the text definition of big-O notation)
    • 2-7 (write and turn in a test code for each case mentioned -- see above) You should show the test function you used and write the answer to the question in big-O (or theta) notation, e.g. list append was tested with "def lappend(n): l = [1]*n; l.append(1);" and the append operation takes constant time, O(1).
    • 2-10
      • The question describes a binary tree. Internal nodes are nodes with two children. Leaf nodes have no children. Try drawing some trees to get a sense of the answer for this one.
    • 2-12 The idea of weighted edges is to add information to the graph, e.g. 42 steps between node u and node v.
      • Specifically, explain how looking up children of a given node would work in this representation (how complicated is finding all children of node 'u').