From Code Self Study Wiki

Notes about algorithms.


Week 1[edit]

The grade-school algorithm for integer multiplication requires approximately \( n^{2} \) operations to multiply two n-digit numbers.

E.g., if you double the size of the input, the number of operations goes up by a factor of four. Quadruple the input and the number of operations goes up by a factor of 16.
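A minimal sketch of the quadratic behavior (the function name and digit-counting approach are my own): multiply digit by digit, as on paper, and count single-digit products. Two n-digit inputs require exactly \( n^{2} \) of them.

```python
def grade_school_multiply(x: int, y: int):
    """Multiply x and y digit by digit, counting single-digit products.

    Returns (product, number_of_single_digit_multiplications)."""
    xs = [int(d) for d in str(x)][::-1]   # least-significant digit first
    ys = [int(d) for d in str(y)][::-1]
    total, mults = 0, 0
    for i, dy in enumerate(ys):           # every digit of y ...
        for j, dx in enumerate(xs):       # ... meets every digit of x
            total += dx * dy * 10 ** (i + j)
            mults += 1
    return total, mults

product, ops = grade_school_multiply(5678, 1234)
# 4-digit inputs -> 16 single-digit multiplications (n^2 with n = 4)
```

Doubling the number of digits quadruples `ops`, matching the factor-of-four claim above.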

Karatsuba Multiplication[edit]

x=5678 (a=56, b=78) y=1234 (c=12, d=34)

Step 1: a*c=672

Step 2: b*d=2652

Step 3: (a+b)(c+d)=134*46=6164

Step 4: step 3 - step 2 - step 1 = 6164 - 2652 - 672 = 2840

Step 5: pad and add:

  6720000 (step 1 padded with four zeros)
+  284000 (step 4 padded with two zeros)
+    2652 (step 2, no padding)
= 7006652 < the correct answer (5678 * 1234)
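The five steps above can be sketched as one level of Karatsuba in Python (the function name and the use of `divmod` to split the digits are my own choices):

```python
def karatsuba_one_level(x: int, y: int, n: int = 4) -> int:
    """One level of Karatsuba on two n-digit numbers (n even)."""
    half = 10 ** (n // 2)
    a, b = divmod(x, half)        # x = a*10^(n/2) + b, e.g. 5678 -> (56, 78)
    c, d = divmod(y, half)        # y = c*10^(n/2) + d, e.g. 1234 -> (12, 34)
    step1 = a * c                 # 56*12 = 672
    step2 = b * d                 # 78*34 = 2652
    step3 = (a + b) * (c + d)     # 134*46 = 6164
    step4 = step3 - step2 - step1 # 2840
    # pad step 1 with n zeros, step 4 with n/2 zeros, then add
    return step1 * 10 ** n + step4 * half + step2

assert karatsuba_one_level(5678, 1234) == 7006652
```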

Recursive Algorithm[edit]

\( x=10^{\frac{n}{2}}a+b \) and \( y=10^{\frac{n}{2}}c+d \) where a, b, c, d are \( \frac{n}{2} \)-digit numbers.

Example: \( x=5678=10^{\frac{4}{2}}56+78 \) -- it's a four digit number so the 56 part is multiplied by \( 10^{2} \).

\( \begin{align} x \cdot y & = (10^{\frac{n}{2}}a+b) \cdot (10^{\frac{n}{2}}c+d)\\& =10^{n}ac+10^{\frac{n}{2}}(ad+bc)+bd \end{align} \)

The lecturer refers to \( 10^{n}ac+10^{\frac{n}{2}}(ad+bc)+bd \) as "*" ("star").

Step 1: recursively compute \( ac \)
Step 2: recursively compute \( bd \)
Step 3: recursively compute \( \require{cancel} (a+b)(c+d)=\cancel{ac}+ad+bc+\cancel{bd} \)
Gauss's trick: step 3 - step 1 - step 2 = \( ad+bc \)
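A minimal recursive sketch of the three steps plus Gauss's trick (function name and base case are my own; splitting with `divmod` at \( 10^{\lfloor n/2 \rfloor} \) handles odd digit counts too):

```python
def karatsuba(x: int, y: int) -> int:
    """Recursive Karatsuba multiplication using Gauss's trick:
    three recursive products per level instead of four."""
    if x < 10 or y < 10:                   # base case: a single-digit factor
        return x * y
    n = max(len(str(x)), len(str(y)))
    half = 10 ** (n // 2)
    a, b = divmod(x, half)                 # x = a*half + b
    c, d = divmod(y, half)                 # y = c*half + d
    ac = karatsuba(a, c)                   # step 1
    bd = karatsuba(b, d)                   # step 2
    ad_plus_bc = karatsuba(a + b, c + d) - ac - bd   # step 3 minus 1 and 2
    return ac * half * half + ad_plus_bc * half + bd

assert karatsuba(5678, 1234) == 5678 * 1234
```

The `(a + b) * (c + d)` product is the whole point: it buys \( ad+bc \) with a single recursive multiplication, which is what drops the recursion from four subproblems to three.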

Course Topics[edit]

Divide and Conquer Algorithm... will apply to:

  • Integer multiplication
  • Sorting
  • Matrix multiplication
  • Closest pair

Also will cover general analysis methods ("Master Method/Theorem")

Randomization in algorithm design will apply to:

  • QuickSort
  • Primality testing
  • Graph partitioning
  • Hashing

Primitives for reasoning about graphs:

  • Connectivity information
  • Shortest paths
  • Structure of information
  • Social networks

Data structures:

  • Heaps
  • Balanced binary search trees
  • Hashing and variants like bloom filters, etc.

Merge Sort[edit]

"For every input array of n numbers, Merge Sort produces a sorted output array and uses at most \( 6n log_{2}n+6n \) operations."

Input: array of n numbers, unsorted (e.g., 5, 4, 1, 8, 7, 2, 6, 3)

Output: the array, sorted (e.g., 1, 2, 3, 4, 5, 6, 7, 8)

Recursion tree depth: \( \log_{2}n \) (or \( \log_{2}n+1 \) levels to be exact -- see Merge Sort: Analysis video)

At each level \( j=0,1,2,\ldots,\log_{2}n \), there are \( 2^{j} \) subproblems, each of size \( n/2^{j} \). E.g., n=8: at level 1 (j=1), there are 2 subproblems, each of size 4.

Guiding Principles for Algorithm Analysis[edit]

  1. Worst-case analysis vs. average-case analysis or benchmarks
  2. Won't pay much attention to constant factors, lower-order terms
  3. Asymptotic analysis: focus on running time for large input sizes n.

Asymptotic Analysis[edit]

High level idea: suppress constant factors (too system dependent) and lower order terms (irrelevant for large inputs). An example of a constant factor mentioned is searching through two arrays rather than one. The number of arrays is a constant factor and can be suppressed. So it's O(n) whether searching through two arrays or one.

Equate \( 6n \log_{2}n+6n \) with just \( n \log n \), written \( O(n \log n) \): the constant factor 6 and the lower-order term \( 6n \) are suppressed.

Big-Oh Formal Definition[edit]

The handwriting is too bad to read it, but it's at the Big-Oh Notation video at about 2 min in.
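For reference, the standard formal definition (which should match what is on that slide): \( T(n)=O(f(n)) \) if and only if there exist constants \( c>0 \) and \( n_{0}>0 \) such that \( T(n) \le c \cdot f(n) \) for all \( n \ge n_{0} \).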