Algorithms
Notes about algorithms.
Contents
Resources
- The Algorithm Design Manual Hardcover, 2nd ed., by Steven S Skiena
- Think Complexity
- http://interactivepython.org/runestone/static/pythonds/index.html
- Data Structures and Algorithms in Python Hardcover by Michael T. Goodrich
- mbostock Algorithm visualizations
- http://setosa.io/#/
- Essential Algorithms by Rod Stephens
- Introduction to Algorithms by By Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein
- Algorithm Design by Jon Kleinberg, Éva Tardos
Week 1
Grade-school algorithm for integer multiplication requires approx. \( n^{2} \) operations to solve. (?)
E.g., if you double the size of the input, the operations go up by factor of four. Quadruple the input and the number of operations go up to 16.
Karatsuba Multiplication
x=5678 (a=56, b=78) y=1234 (c=12, d=34)
Step 1: a*c=672
Step 2: b*d=2652
Step 3: (a+b)(c+d)=134*45=6164
Step 4: step 3-2-1=2840
Step 5:
6720000 (step 1 with four zero padding)
2652
284000
7006652 < the correct answer
Recursive Algorithm
\( x=10^{\frac{n}{2}}a+b \) and \( y=10^{\frac{n}{2}}c+d \) where a, b, c, d are \( \frac{n}{2} \)-digit numbers.
Example: \( x=5678=10^{\frac{4}{2}}56+78 \) -- it's a four digit number so the 56 part is multiplied by \( 10^{2} \).
\( \begin{align} x \cdot y & = (10^{\frac{n}{2}}a+b) \cdot (10^{\frac{n}{2}}c+d)\\& =10^{n}ac+10^{\frac{n}{2}}(ad+bc)+bd \end{align} \)
He refers to \( 10^{n}ac+10^{\frac{n}{2}}(ad+bc)+bd \) as "*" ("star").
Step 1: recursively compute \( ac \)
Step 2: recursively compute \( bd \)
Step 3: recursively compute \( \require{cancel} (a+b)(c+d)=\cancel{ac}+ad+bc+\cancel{bd} \)
Gauss's trick: 3-1-2 = \( ad+bc) \)
Course Topics
Divide and Conquer Algorithm... will apply to:
- Integer multiplication
- Sorting
- Matrix multiplication
- Closest pair
Also will cover general analysis methods ("Master Method/Theorem")
Randomization in algorithm design will apply to:
- QuickSort
- Primality testing
- Graph partitioning
- Hashing
Primitives for reasoning about graphs:
- Connectivity information
- Shortest paths
- Structure of information
- Social networks
Data structures:
- Heaps
- Balanced binary search trees
- Hashing and variants like bloom filters, etc.
Merge Sort
"For every input array of n numbers, Merge Sort produces a sorted output array and uses at most \( 6n log_{2}n+6n \) operations."
- intro to divide & conquer
- Merge Sort improves over Selection Sort, Insertion Sort, and Bubble Sort
Input: array of n numbers, unsorted (e.g., 5, 4, 1, 8, 7, 2, 6, 3)
Output: the array, sorted (e.g., 1, 2, 3, 4, 5, 6, 7, 8)
Recursion tree size: \( log_{2}n \) (or \( log_{2}n+1 \) to be exact -- see Merge Sort: Analysis video)
Each level j=0,1,2,...,log_{2}n, there are 2^{j} subproblems, each of size n/2^{j}. E.g., n=8. At level 1 (j=1), there are 2 subproblems each of size 4.
Guiding Principles for Algorithm Analysis
- Worst-case analysis vs. average-case analysis or benchmarks
- Won't pay much attention to constant factors, lower-order terms
- Asymptotic analysis: focus on running time for large input sizes n.
Asymptotic Analysis
High level idea: suppress constant factors (too system dependent) and lower order terms (irrelevant for large inputs). An example of a constant factor mentioned is searching through two arrays rather than one. The number of arrays is a constant factor and can be suppressed. So it's O(n) whether searching through two arrays or one.
Equate \( 6n \log_{2}n+6n \) with just \( n \log n\), written \( O(n \log n) \) ...?
Big-Oh Formal Definition
The handwriting is too bad to read it, but it's at the Big-Oh Notation video at about 2 min in.