\documentclass{article} \usepackage{amsmath} \usepackage{tikz} \usepackage{mathtools} \usepackage{geometry}[margin=0.5in] \begin{document} \section*{Problem 1} Going step by step: \begin{itemize} \item The first instruction, $R_A$, causes a cache miss, which is serviced by the next level cache, since the MSI model doesn't have ownership support. \item The second instruction also causes a cache miss, which is serviced by the next level cache. \item The same is the case for the third instruction. Each CPU's private cache needs to bring the data into memory separately. \item The write in the fourth instructions prompts a bus upgrade request to bring Processor 1 into ``modified" state. This invalidates the private caches of the second and third processors. \item The read in the fifth instruction requires another cache request, since the private cache was invalidated by the previous instruction. This read is again serviced by the next level cache. However, processor 3 reads a part of memory that was not modified by another processor, and thus, this is a false sharing miss. \item The read in the sixth instruction causes another cache miss like the previous instruction. However, unlike the previous instruction, here the CPU reads a part of the memory that was modified, so this is a true sharing miss. \item The write in the seventh instruction causes another bus upgrade request, and again invalidates the other private caches. \item The read in the eigth instruction causes a cache miss, again serviced by the next level cache. Althoguh the variable $R_A$ was not modified since the previous state of the cache, the subsequent use of the $R_B$ variable makes this a true sharing miss. \item The read in the ninth instruction is a cache hit, since the previous instruction already caused a cache miss. \end{itemize} \subsection*{Part a} Given the above, the first three misses (1,2,3) are cold misses, the miss at 5 is a false sharing miss, and 6 and 8 are true sharing misses. \subsection*{Part b} The false sharing miss at 5 can be ignored, since no variables accessed after the miss (and before others) were changed by another processor. \subsection*{Part c} Each miss requires $64+6=70$ bytes of traffic. There are 6 misses, one of which is not essential. In addition, there are two bus upgrade requests, which require 10 bytes each. Thus, we have a total of 440 bytes of data sent ($6 \times 70 + 20$). Of these, 70 are non-essential, and thus, the fraction of essential traffic is $370/440$ = 84\%. \pagebreak \section*{Problem 2} In these instruction, there are two ``major" indications of order: the assignment to A in P1 and the subsequent use of A in P2, and the assignment of B in P2 and the subsequent use of B in P3. It would be inconsistent if P3 observed P1 not to have run, but P2 to have run, while P2 observed P1 to have run. If this happened, R3 would be set to A's old value (0), but R1 would be set to 1, and R2 would be set to 1. Thus, the assignment of R1 = 1, R2 = 1, R3 = 0 would be inconsistent. \pagebreak \section*{Problem 3} In one dimension, we have two nodes, with one channel between them. Thus, \begin{equation*} C_1 = 1 \end{equation*} When extending to 2 dimensions, we copy the 1-dimensional version of the network 2 more times, and add links between all of the 2 nodes 2 times. Thus, \begin{equation*} C_2 = 1 + [2(1) + 2*2] = 7 \end{equation*} For three dimensions, we copy the 2-dimensional version of the network 3 more times, and add links between all of the $6 = 3 * 2 = 3!$ nodes $3$ times. Thus, \begin{equation*} C_3 = 1 + [2(1) + 2*2] + [3*7 + 3*3!] = 46 \end{equation*} Observing the pattern, we rewrite: \begin{equation*} C_3 = [1(0) + 1*1!] + [2(1) + 2*2!] + [3*7 + 3*3!] \end{equation*} Further, we notice: \begin{equation*} \begin{aligned} C_3 &= \overbrace{\underbrace{[1C_0 + 1*1!]}_{C_1} + [2C_1 + 2*2!]}^{C_2} + [3*C_2 + 3*3!] \\ \end{aligned} \end{equation*} Then, we have: \begin{equation*} C_n = (n+1)C_{n-1} + n*n! = (n+1)(C_{n-1} + n!) - n! \end{equation*} Telescoping a little bit: \begin{equation*} \begin{aligned} C_n &= (n+1)(C_{n-1} + n!) - n! \\ &= (n+1)(\left[n(C_{n-2} + (n-1)!) - (n-1)!\right] + n!) - n! \\ &= (n+1)(\left[n(\left[(n-1)(C_{n-2} + (n-2)!) - (n-2)!\right] + (n-1)!) - (n-1)!\right] + n!) - n! \\ \end{aligned} \end{equation*} Observe that for every $-k!$, there's a corresponding $(k+1)k!$, and for each, there are surrounding factors: \begin{equation*} ... (k+3)\left[(k+2)\left[(k+1)k! - k!\right]\right] ... \end{equation*} The $(k+1)k!$ term then is multiplied by all numbers until $n+1$, and is thus equal to $(n+1)!$. The negative term, though, misses the $(k+1)$ factor, and is thus equal to $-(n+1)!/(k+1)$. Subtracting the two gives: \begin{equation*} \frac{k}{k+1}(n+1)! \end{equation*} If we expand $C_n$ in such a way until $C_0 = 0$, we will have $0 \leq k \leq n$, and thus: \begin{equation*} \begin{aligned} C_n &= \sum_{k=0}^{n} \frac{k}{k+1}(n+1)! \\ \Aboxed{C_n &= (n+1)! \sum_{k=0}^{n} \frac{k}{k+1}} \end{aligned} \end{equation*} \pagebreak \section*{Problem 4} I'll take the problem's advice, and go for dependency graphs. To form a deadlock, we will need to be able to start at a node of the graph and visit every other node, ending where you started. \subsection*{Part a} \begin{tikzpicture} \def \radius {1.5} \node[draw, circle, minimum size = 1.2cm] (east) at (0:\radius) {East}; \node[draw, circle, minimum size = 1.2cm] (north) at (90:\radius) {North}; \node[draw, circle, minimum size = 1.2cm] (west) at (180:\radius) {West}; \node[draw, circle, minimum size = 1.2cm] (south) at (270:\radius) {South}; \draw[->, thick] (south.east) .. controls +(right:4mm) and +(down:4mm) .. (east.south); \draw[->, thick] (south.west) .. controls +(left:4mm) and +(down:4mm) .. (west.south); \draw[->, thick] (west.north) .. controls +(up:4mm) and +(left:4mm) .. (north.west); \draw[->, thick] (east.north) .. controls +(up:4mm) and +(right:4mm) .. (north.east); \end{tikzpicture} \\ The above graph for this part is a Directed Acyclic Graph (DAG), and thus there's no way to create a cycle. This, in turn, means \textbf{deadlocks are not possible}. Intuitively, once you turn north, you can't turn away, and thus, can't form a loop. \subsection*{Part b} \begin{tikzpicture} \def \radius {1.5} \node[draw, circle, minimum size = 1.2cm] (east) at (0:\radius) {East}; \node[draw, circle, minimum size = 1.2cm] (north) at (90:\radius) {North}; \node[draw, circle, minimum size = 1.2cm] (west) at (180:\radius) {West}; \node[draw, circle, minimum size = 1.2cm] (south) at (270:\radius) {South}; \draw[->, thick] (west.south) .. controls +(down:4mm) and +(left:4mm) .. (south.west); \draw[->, thick] (east.south) .. controls +(down:4mm) and +(right:4mm) .. (south.east); \draw[->, thick] (north.east) .. controls +(right:4mm) and +(up:4mm) .. (east.north); \draw[->, thick] (south.north) .. controls +(up:4mm) and +(left:4mm) .. (east.west); \end{tikzpicture} \\ The above graph for this part does have a cycle, but it's not possible to visit every node and return to the original position. Thus, \textbf{no deadlock is possible}. Intuitively, you can't turn north or west, and thus, can't make a loop. After turning east or south, it's possible to only ``ladder" in a diagonal. \pagebreak \subsection*{Part c} \begin{tikzpicture} \def \radius {1.5} \node[draw, circle, minimum size = 1.2cm] (east) at (0:\radius) {East}; \node[draw, circle, minimum size = 1.2cm] (north) at (90:\radius) {North}; \node[draw, circle, minimum size = 1.2cm] (west) at (180:\radius) {West}; \node[draw, circle, minimum size = 1.2cm] (south) at (270:\radius) {South}; \draw[->, thick] (west.south) .. controls +(down:4mm) and +(left:4mm) .. (south.west); \draw[->, thick] (east.south) .. controls +(down:4mm) and +(right:4mm) .. (south.east); \draw[->, thick] (north.west) .. controls +(left:4mm) and +(up:4mm) .. (west.north); \draw[->, thick] (south.north) .. controls +(up:4mm) and +(left:4mm) .. (east.west); \draw[->, thick] (east.south) .. controls +(down:4mm) and +(right:4mm) .. (south.east); \draw[->, thick, color=green] (west.east) .. controls +(right:4mm) and +(down:4mm) .. (north.south); \draw[->, thick, color=blue] (south.north) .. controls +(up:4mm) and +(right:4mm) .. (west.east); \end{tikzpicture} \\ The above graph for this part has a cycle which includes every node (green and blue were used to disambiguate arrow directions). Because this cycle exists, this configuration \textbf{can have deadlocks}. Intuitively, it's possible to make a loop by starting north, turning west, then south, then east, then south, then west, then back north. \subsection*{Part d} \begin{tikzpicture} \def \radius {1.5} \node[draw, circle, minimum size = 1.2cm] (east) at (0:\radius) {East}; \node[draw, circle, minimum size = 1.2cm] (north) at (90:\radius) {North}; \node[draw, circle, minimum size = 1.2cm] (west) at (180:\radius) {West}; \node[draw, circle, minimum size = 1.2cm] (south) at (270:\radius) {South}; \draw[->, thick,color=blue] (south.north) .. controls +(up:4mm) and +(left:4mm) .. (east.west); \draw[->, thick] (east.south) .. controls +(down:4mm) and +(right:4mm) .. (south.east); \draw[->, thick] (south.west) .. controls +(left:4mm) and +(down:4mm) .. (west.south); \draw[->, thick] (north.west) .. controls +(left:4mm) and +(up:4mm) .. (west.north); \draw[->, thick] (north.east) .. controls +(right:4mm) and +(up:4mm) .. (east.north); \draw[->, thick,color=green] (east.west) .. controls +(left:4mm) and +(down:4mm) .. (north.south); \end{tikzpicture} \\ The above graph has a cycle, but it does not include every node; \textbf{no deadlocks are possible}. Intuitively, once you turn west, you can't turn off anymore, and thus, can't form a loop. \end{document}