Switch to O(n) implementation of sorted

Danila Fedorin 2019-09-30 21:04:48 -07:00
parent 022533dc4a
commit d9fc5c45ef
2 changed files with 29 additions and 19 deletions

View File

@@ -6,24 +6,32 @@ def qsort(xs):
     right = [x for x in xs[1:] if x >= pivot]
     return qsort(left) + [pivot] + qsort(right)
 
+def _sorted(tree, acc):
+    if tree == []: return
+    _sorted(tree[0], acc)
+    acc.append(tree[1])
+    _sorted(tree[2], acc)
+
 def sorted(tree):
-    if tree == []: return []
-    return sorted(tree[0]) + [tree[1]] + sorted(tree[2])
+    acc = []
+    _sorted(tree, acc)
+    return acc
 
 def search(tree, x):
-    return _sorted(tree, x) != []
+    return _search(tree, x) != []
 
 def insert(tree, x):
-    node = _sorted(tree, x)
+    node = _search(tree, x)
     if node == []:
         node.append([])
         node.append(x)
         node.append([])
 
-def _sorted(tree, i):
+def _search(tree, i):
     if tree == []: return tree
     pivot = tree[1]
     if pivot == i: return tree
-    elif i < pivot: return _sorted(tree[0], i)
-    else: return _sorted(tree[2], i)
+    elif i < pivot: return _search(tree[0], i)
+    else: return _search(tree[2], i)
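
For context, the post-commit functions assemble into the following self-contained, runnable sketch; the usage example at the bottom is illustrative and not part of the commit. Trees are nested lists of the form [left, value, right], with [] as the empty tree. Note that this sorted shadows Python's builtin of the same name.

def _sorted(tree, acc):
    # In-order traversal: left subtree, node value, right subtree.
    if tree == []: return
    _sorted(tree[0], acc)
    acc.append(tree[1])
    _sorted(tree[2], acc)

def sorted(tree):
    # Flatten the tree into a sorted list using an accumulator.
    acc = []
    _sorted(tree, acc)
    return acc

def _search(tree, i):
    # Return the node holding i, or the empty list where i would go.
    if tree == []: return tree
    pivot = tree[1]
    if pivot == i: return tree
    elif i < pivot: return _search(tree[0], i)
    else: return _search(tree[2], i)

def insert(tree, x):
    # Fill the empty slot found by _search in place.
    node = _search(tree, x)
    if node == []:
        node.append([])
        node.append(x)
        node.append([])

def search(tree, x):
    return _search(tree, x) != []

tree = []
for v in [5, 2, 8, 1, 9]:                # arbitrary test values
    insert(tree, v)
print(sorted(tree))                      # [1, 2, 5, 8, 9]
print(search(tree, 8), search(tree, 3))  # True False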

View File

@@ -12,7 +12,11 @@ A: Quicksort has the worst-case complexity of O(n^2). This is because in the wor
 On average, Quicksort is also O(n*log(n)). It's quite difficult to consistently pick
 a pivot that is either the smallest or the largest. I am unfamiliar with proof
-techniques that help formalize this.
+techniques that help formalize this, but we can think of a case in which
+some non-half fraction (say j/k) of the elements
+is on the left of the pivot. In this case, the depth ends up being
+log_(k/j)(n), a constant multiple of log(n), meaning that the depth is still
+logarithmic and the complexity is still O(n*log(n)).
 
 Q: What's the best-case, worst-case, and average-case time complexities? Briefly explain.
 A: For the same reason as quicksort, in the worst case, the complexity is O(n^2).
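
The depth argument above can be checked empirically; the following sketch (my own, not part of the commit) measures the recursion depth of the same first-element-pivot partitioning on sorted versus shuffled input:

import random

def qsort_depth(xs, depth=0):
    # Maximum recursion depth reached by qsort's partitioning scheme.
    if xs == []:
        return depth
    pivot = xs[0]
    left = [x for x in xs[1:] if x < pivot]
    right = [x for x in xs[1:] if x >= pivot]
    return max(qsort_depth(left, depth + 1), qsort_depth(right, depth + 1))

random.seed(0)                                  # arbitrary seed
n = 512
print(qsort_depth(list(range(n))))              # 512: sorted input gives linear depth
print(qsort_depth(random.sample(range(n), n)))  # ~20: a small multiple of log2(512) = 9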
@@ -25,19 +29,17 @@ A: For the same reason as quicksort, in the worst case, the complexity is O(n^2)
 is n(1-r^k)/(1-r). This simplifies to 2n(1-r^k). Since 1-2^(-k) < 1,
 n*(1+1/2+1/4+...) < 2n. This means the complexity is O(n).
-For similarly hand-wavey reasons to those in Q0, the average case complexity aligns
-with the best-case complexity rather than worst-case complexity.
+Similarly to quicksort, we can assume j/k elements are on the left
+of the pivot. Then, the longest possible computation will end up
+looking at nj/k elements, then n(j/k)^2, and so on. This is effectively
+n times the sum of the geometric series with r=j/k. This means
+the sum is n*c for a constant c = 1/(1-r), and thus, the complexity is O(n).
 
 Q: What are the time complexities for the operations implemented?
-A: The complexity of sorted is O(n*log(n)) in best, and O(n^2) in worst case.
-This is because of the way in which it implements
-"flattening" the binary search tree - it recursively calls itself, creating
-a new array from the results of the two recursive calls and the "pivot" between them.
-Since creating a new array from arrays of length m and n is an O(m+n) operation.
-Just like with qsort, in the best case, the tree is balanced with a depth of log(n).
-Since concatenation at each level will effectively take n steps, the best case complexity
-is O(n*log(n)). On the other hand, in the case of a tree with only right children,
-the concatenation will take 1+2+...+n steps, which is in the order O(n^2).
+A: The complexity of sorted is O(n).
+Since I use an accumulator array, array append is amortized O(1). Then, all
+that's done is an in-order traversal of the tree, which is O(n),
+since it visits every element of the tree.
 Since insert and search both use _search, and perform no steps above O(1), they are
 of the same complexity as _search. _search itself is O(log n) in the average case,
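
A quick numeric check of the geometric-series bound above (n and r are arbitrary illustrative choices, not from the commit):

# n + n*r + n*r^2 + ... approaches n/(1-r), a constant multiple of n.
n, r = 1000, 2/3               # r = j/k, some non-half fraction
total = sum(n * r**i for i in range(200))
print(total, n / (1 - r))      # both ~3000, so the total work is O(n)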
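Likewise, a small instrumented sketch (my own; the counters are invented for illustration) confirms that the accumulator-based sorted does exactly one append per element while _search only descends one root-to-leaf path:

import random

steps = {"append": 0, "compare": 0}

def _sorted(tree, acc):
    if tree == []: return
    _sorted(tree[0], acc)
    acc.append(tree[1])
    steps["append"] += 1         # one amortized-O(1) append per element
    _sorted(tree[2], acc)

def _search(tree, i):
    if tree == []: return tree
    steps["compare"] += 1        # one comparison per level descended
    pivot = tree[1]
    if pivot == i: return tree
    elif i < pivot: return _search(tree[0], i)
    else: return _search(tree[2], i)

def insert(tree, x):
    node = _search(tree, x)
    if node == []:
        node.append([])
        node.append(x)
        node.append([])

random.seed(1)                   # arbitrary seed
tree = []
values = random.sample(range(10000), 1000)
for v in values:
    insert(tree, v)

acc = []
_sorted(tree, acc)
print(steps["append"])           # exactly 1000: one append per element, O(n)

steps["compare"] = 0
_search(tree, values[-1])
print(steps["compare"])          # small, roughly logarithmic on a random tree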