Add homework 2 solution.

2020-12-07 15:50:31 -08:00
commit 7bd7036b7f
1 changed files with 216 additions and 0 deletions
--- a/hw2.tex
+++ b/hw2.tex
@@ -0,0 +1,216 @@
 \documentclass{article}
 \usepackage{amsmath}
 \usepackage{tikz}
 \usepackage{mathtools}
 \usepackage{geometry}[margin=0.5in]
 \begin{document}
 \section*{Problem 1}
 Going step by step:
 \begin{itemize}
    \item The first instruction, $R_A$, causes a cache miss,
        which is serviced by the next level cache, since
        the MSI model doesn't have ownership support.
    \item The second instruction also causes a cache miss,
        which is serviced by the next level cache.
    \item The same is the case for the third instruction.
        Each CPU's private cache needs to bring the data into
        memory separately.
    \item The write in the fourth instructions prompts a
        bus upgrade request to bring Processor 1 into
        ``modified" state. This invalidates the private caches
        of the second and third processors.
    \item The read in the fifth instruction requires another
        cache request, since the private cache was invalidated
        by the previous instruction. This read is
        again serviced by the next level cache. However,
        processor 3 reads a part of memory that was not modified
        by another processor, and thus, this is a false sharing miss.
    \item The read in the sixth instruction causes another
        cache miss like the previous instruction. However, unlike
        the previous instruction, here the CPU reads a part of the
        memory that was modified, so this is a true sharing miss.
    \item The write in the seventh instruction causes
        another bus upgrade request, and again invalidates the
        other private caches.
    \item The read in the eigth instruction causes a cache miss,
        again serviced by the next level cache. Althoguh
        the variable $R_A$ was not modified since the previous
        state of the cache, the subsequent use of the $R_B$
        variable makes this a true sharing miss.
    \item The read in the ninth instruction is a cache hit, since
        the previous instruction already caused a cache miss.
 \end{itemize}
 \subsection*{Part a}
 Given the above, the first three misses (1,2,3) are cold misses,
 the miss at 5 is a false sharing miss, and 6 and 8 are true sharing misses.
 \subsection*{Part b}
 The false sharing miss at 5 can be ignored, since no variables accessed
 after the miss (and before others) were changed by another processor.
 \subsection*{Part c}
 Each miss requires $64+6=70$ bytes of traffic. There are 6 misses,
 one of which is not essential. In addition, there are two bus
 upgrade requests, which require 10 bytes each. Thus, we have
 a total of 440 bytes of data sent ($6 \times 70 + 20$).
 Of these, 70 are non-essential, and thus, the
 fraction of essential traffic is $370/440$ = 84\%.
 \pagebreak
 \section*{Problem 2}
 In these instruction, there are two ``major" indications
 of order: the assignment to A in P1 and the subsequent use of A
 in P2, and the assignment of B in P2 and the subsequent use of B
 in P3. It would be inconsistent if P3 observed P1 not to have run,
 but P2 to have run, while P2 observed P1 to have run.
 If this happened, R3 would be set to A's old value (0),
 but R1 would be set to 1, and R2 would be set to 1.
 Thus, the assignment of R1 = 1, R2 = 1, R3 = 0 would be inconsistent.
 \pagebreak
 \section*{Problem 3}
 In one dimension, we have two nodes, with one channel between them.
 Thus,
 \begin{equation*}
  C_1 = 1
 \end{equation*}
 When extending to 2 dimensions, we copy the 1-dimensional
 version of the network 2 more times, and add links between all of the
 2 nodes 2 times. Thus,
 \begin{equation*}
  C_2 = 1 + [2(1) + 2*2] = 7
 \end{equation*}
 For three dimensions, we copy the 2-dimensional version of the network 3 more
 times, and add links between all of the $6 = 3 * 2 = 3!$ nodes
 $3$ times. Thus,
 \begin{equation*}
  C_3 = 1 + [2(1) + 2*2] + [3*7 + 3*3!] = 46
 \end{equation*}
 Observing the pattern, we rewrite:
 \begin{equation*}
    C_3 = [1(0) + 1*1!] + [2(1) + 2*2!] + [3*7 + 3*3!]
 \end{equation*}
 Further, we notice:
 \begin{equation*}
    \begin{aligned}
        C_3 &= \overbrace{\underbrace{[1C_0 + 1*1!]}_{C_1} + [2C_1 + 2*2!]}^{C_2} + [3*C_2 + 3*3!] \\
    \end{aligned}
 \end{equation*}
 Then, we have:
 \begin{equation*}
    C_n = (n+1)C_{n-1} + n*n! = (n+1)(C_{n-1} + n!) - n!
 \end{equation*}
 Telescoping a little bit:
 \begin{equation*}
    \begin{aligned}
        C_n &= (n+1)(C_{n-1} + n!) - n! \\
            &= (n+1)(\left[n(C_{n-2} + (n-1)!) - (n-1)!\right] + n!) - n! \\
            &= (n+1)(\left[n(\left[(n-1)(C_{n-2} + (n-2)!) - (n-2)!\right] + (n-1)!) - (n-1)!\right] + n!) - n! \\
    \end{aligned}
 \end{equation*}
 Observe that for every $-k!$, there's a corresponding $(k+1)k!$, and for each, there are surrounding factors:
 \begin{equation*}
    ... (k+3)\left[(k+2)\left[(k+1)k! - k!\right]\right] ...
 \end{equation*}
 The $(k+1)k!$ term then is multiplied by all numbers until $n+1$, and is thus equal to $(n+1)!$.
 The negative term, though, misses the $(k+1)$ factor, and is thus equal to $-(n+1)!/(k+1)$.
 Subtracting the two gives:
 \begin{equation*}
    \frac{k}{k+1}(n+1)!
 \end{equation*}
 If we expand $C_n$ in such a way until $C_0 = 0$, we will have $0 \leq k \leq n$,
 and thus:
 \begin{equation*}
    \begin{aligned}
        C_n &= \sum_{k=0}^{n} \frac{k}{k+1}(n+1)! \\
        \Aboxed{C_n &= (n+1)! \sum_{k=0}^{n} \frac{k}{k+1}}
    \end{aligned}
 \end{equation*}
 \pagebreak
 \section*{Problem 4}
 I'll take the problem's advice, and go for dependency graphs.
 To form a deadlock, we will need to be able to start at a node
 of the graph and visit every other node, ending where you started.
 \subsection*{Part a}
 \begin{tikzpicture}
    \def \radius {1.5}
    \node[draw, circle, minimum size = 1.2cm] (east) at (0:\radius) {East};
    \node[draw, circle, minimum size = 1.2cm] (north) at (90:\radius) {North};
    \node[draw, circle, minimum size = 1.2cm] (west) at (180:\radius) {West};
    \node[draw, circle, minimum size = 1.2cm] (south) at (270:\radius) {South};
    \draw[->, thick] (south.east) .. controls +(right:4mm) and +(down:4mm) .. (east.south);
    \draw[->, thick] (south.west) .. controls +(left:4mm) and +(down:4mm) .. (west.south);
    \draw[->, thick] (west.north) .. controls +(up:4mm) and +(left:4mm) .. (north.west);
    \draw[->, thick] (east.north) .. controls +(up:4mm) and +(right:4mm) .. (north.east);
 \end{tikzpicture} \\
 The above graph for this part is a Directed Acyclic Graph (DAG), and thus there's no way to
 create a cycle. This, in turn, means \textbf{deadlocks are not possible}. Intuitively,
 once you turn north, you can't turn away, and thus, can't form a loop.
 \subsection*{Part b}
 \begin{tikzpicture}
    \def \radius {1.5}
    \node[draw, circle, minimum size = 1.2cm] (east) at (0:\radius) {East};
    \node[draw, circle, minimum size = 1.2cm] (north) at (90:\radius) {North};
    \node[draw, circle, minimum size = 1.2cm] (west) at (180:\radius) {West};
    \node[draw, circle, minimum size = 1.2cm] (south) at (270:\radius) {South};
    \draw[->, thick] (west.south) .. controls +(down:4mm) and +(left:4mm) .. (south.west);
    \draw[->, thick] (east.south) .. controls +(down:4mm) and +(right:4mm) .. (south.east);
    \draw[->, thick] (north.east) .. controls +(right:4mm) and +(up:4mm) .. (east.north);
    \draw[->, thick] (south.north) .. controls +(up:4mm) and +(left:4mm) .. (east.west);
 \end{tikzpicture} \\
 The above graph for this part does have a cycle, but it's not possible to visit
 every node and return to the original position. Thus, \textbf{no deadlock is possible}.
 Intuitively, you can't turn north or west, and thus, can't make a loop. After turning
 east or south, it's possible to only ``ladder" in a diagonal.
 \pagebreak
 \subsection*{Part c}
 \begin{tikzpicture}
    \def \radius {1.5}
    \node[draw, circle, minimum size = 1.2cm] (east) at (0:\radius) {East};
    \node[draw, circle, minimum size = 1.2cm] (north) at (90:\radius) {North};
    \node[draw, circle, minimum size = 1.2cm] (west) at (180:\radius) {West};
    \node[draw, circle, minimum size = 1.2cm] (south) at (270:\radius) {South};
    \draw[->, thick] (west.south) .. controls +(down:4mm) and +(left:4mm) .. (south.west);
    \draw[->, thick] (east.south) .. controls +(down:4mm) and +(right:4mm) .. (south.east);
    \draw[->, thick] (north.west) .. controls +(left:4mm) and +(up:4mm) .. (west.north);
    \draw[->, thick] (south.north) .. controls +(up:4mm) and +(left:4mm) .. (east.west);
    \draw[->, thick] (east.south) .. controls +(down:4mm) and +(right:4mm) .. (south.east);
    \draw[->, thick, color=green] (west.east) .. controls +(right:4mm) and +(down:4mm) .. (north.south);
    \draw[->, thick, color=blue] (south.north) .. controls +(up:4mm) and +(right:4mm) .. (west.east);
 \end{tikzpicture} \\
 The above graph for this part has a cycle which includes every node (green and blue were used to disambiguate
 arrow directions). Because this cycle exists, this configuration \textbf{can have deadlocks}.
 Intuitively, it's possible to make a loop by starting north, turning west, then south, then
 east, then south, then west, then back north.
 \subsection*{Part d}
 \begin{tikzpicture}
    \def \radius {1.5}
    \node[draw, circle, minimum size = 1.2cm] (east) at (0:\radius) {East};
    \node[draw, circle, minimum size = 1.2cm] (north) at (90:\radius) {North};
    \node[draw, circle, minimum size = 1.2cm] (west) at (180:\radius) {West};
    \node[draw, circle, minimum size = 1.2cm] (south) at (270:\radius) {South};
    \draw[->, thick,color=blue] (south.north) .. controls +(up:4mm) and +(left:4mm) .. (east.west);
    \draw[->, thick] (east.south) .. controls +(down:4mm) and +(right:4mm) .. (south.east);
    \draw[->, thick] (south.west) .. controls +(left:4mm) and +(down:4mm) .. (west.south);
    \draw[->, thick] (north.west) .. controls +(left:4mm) and +(up:4mm) .. (west.north);
    \draw[->, thick] (north.east) .. controls +(right:4mm) and +(up:4mm) .. (east.north);
    \draw[->, thick,color=green] (east.west) .. controls +(left:4mm) and +(down:4mm) .. (north.south);
 \end{tikzpicture} \\
 The above graph has a cycle, but it does not include every node; \textbf{no deadlocks are possible}.
 Intuitively, once you turn west, you can't turn off anymore, and thus, can't form a loop.
 \end{document}