Add homework 2 solution.

This commit is contained in:
Danila Fedorin 2020-12-07 15:50:31 -08:00
commit 7bd7036b7f
1 changed files with 216 additions and 0 deletions

216
hw2.tex Normal file
View File

@ -0,0 +1,216 @@
\documentclass{article}
\usepackage{amsmath}
\usepackage{tikz}
\usepackage{mathtools}
\usepackage{geometry}[margin=0.5in]
\begin{document}
\section*{Problem 1}
Going step by step:
\begin{itemize}
\item The first instruction, $R_A$, causes a cache miss,
which is serviced by the next level cache, since
the MSI model doesn't have ownership support.
\item The second instruction also causes a cache miss,
which is serviced by the next level cache.
\item The same is the case for the third instruction.
Each CPU's private cache needs to bring the data into
memory separately.
\item The write in the fourth instructions prompts a
bus upgrade request to bring Processor 1 into
``modified" state. This invalidates the private caches
of the second and third processors.
\item The read in the fifth instruction requires another
cache request, since the private cache was invalidated
by the previous instruction. This read is
again serviced by the next level cache. However,
processor 3 reads a part of memory that was not modified
by another processor, and thus, this is a false sharing miss.
\item The read in the sixth instruction causes another
cache miss like the previous instruction. However, unlike
the previous instruction, here the CPU reads a part of the
memory that was modified, so this is a true sharing miss.
\item The write in the seventh instruction causes
another bus upgrade request, and again invalidates the
other private caches.
\item The read in the eigth instruction causes a cache miss,
again serviced by the next level cache. Althoguh
the variable $R_A$ was not modified since the previous
state of the cache, the subsequent use of the $R_B$
variable makes this a true sharing miss.
\item The read in the ninth instruction is a cache hit, since
the previous instruction already caused a cache miss.
\end{itemize}
\subsection*{Part a}
Given the above, the first three misses (1,2,3) are cold misses,
the miss at 5 is a false sharing miss, and 6 and 8 are true sharing misses.
\subsection*{Part b}
The false sharing miss at 5 can be ignored, since no variables accessed
after the miss (and before others) were changed by another processor.
\subsection*{Part c}
Each miss requires $64+6=70$ bytes of traffic. There are 6 misses,
one of which is not essential. In addition, there are two bus
upgrade requests, which require 10 bytes each. Thus, we have
a total of 440 bytes of data sent ($6 \times 70 + 20$).
Of these, 70 are non-essential, and thus, the
fraction of essential traffic is $370/440$ = 84\%.
\pagebreak
\section*{Problem 2}
In these instruction, there are two ``major" indications
of order: the assignment to A in P1 and the subsequent use of A
in P2, and the assignment of B in P2 and the subsequent use of B
in P3. It would be inconsistent if P3 observed P1 not to have run,
but P2 to have run, while P2 observed P1 to have run.
If this happened, R3 would be set to A's old value (0),
but R1 would be set to 1, and R2 would be set to 1.
Thus, the assignment of R1 = 1, R2 = 1, R3 = 0 would be inconsistent.
\pagebreak
\section*{Problem 3}
In one dimension, we have two nodes, with one channel between them.
Thus,
\begin{equation*}
C_1 = 1
\end{equation*}
When extending to 2 dimensions, we copy the 1-dimensional
version of the network 2 more times, and add links between all of the
2 nodes 2 times. Thus,
\begin{equation*}
C_2 = 1 + [2(1) + 2*2] = 7
\end{equation*}
For three dimensions, we copy the 2-dimensional version of the network 3 more
times, and add links between all of the $6 = 3 * 2 = 3!$ nodes
$3$ times. Thus,
\begin{equation*}
C_3 = 1 + [2(1) + 2*2] + [3*7 + 3*3!] = 46
\end{equation*}
Observing the pattern, we rewrite:
\begin{equation*}
C_3 = [1(0) + 1*1!] + [2(1) + 2*2!] + [3*7 + 3*3!]
\end{equation*}
Further, we notice:
\begin{equation*}
\begin{aligned}
C_3 &= \overbrace{\underbrace{[1C_0 + 1*1!]}_{C_1} + [2C_1 + 2*2!]}^{C_2} + [3*C_2 + 3*3!] \\
\end{aligned}
\end{equation*}
Then, we have:
\begin{equation*}
C_n = (n+1)C_{n-1} + n*n! = (n+1)(C_{n-1} + n!) - n!
\end{equation*}
Telescoping a little bit:
\begin{equation*}
\begin{aligned}
C_n &= (n+1)(C_{n-1} + n!) - n! \\
&= (n+1)(\left[n(C_{n-2} + (n-1)!) - (n-1)!\right] + n!) - n! \\
&= (n+1)(\left[n(\left[(n-1)(C_{n-2} + (n-2)!) - (n-2)!\right] + (n-1)!) - (n-1)!\right] + n!) - n! \\
\end{aligned}
\end{equation*}
Observe that for every $-k!$, there's a corresponding $(k+1)k!$, and for each, there are surrounding factors:
\begin{equation*}
... (k+3)\left[(k+2)\left[(k+1)k! - k!\right]\right] ...
\end{equation*}
The $(k+1)k!$ term then is multiplied by all numbers until $n+1$, and is thus equal to $(n+1)!$.
The negative term, though, misses the $(k+1)$ factor, and is thus equal to $-(n+1)!/(k+1)$.
Subtracting the two gives:
\begin{equation*}
\frac{k}{k+1}(n+1)!
\end{equation*}
If we expand $C_n$ in such a way until $C_0 = 0$, we will have $0 \leq k \leq n$,
and thus:
\begin{equation*}
\begin{aligned}
C_n &= \sum_{k=0}^{n} \frac{k}{k+1}(n+1)! \\
\Aboxed{C_n &= (n+1)! \sum_{k=0}^{n} \frac{k}{k+1}}
\end{aligned}
\end{equation*}
\pagebreak
\section*{Problem 4}
I'll take the problem's advice, and go for dependency graphs.
To form a deadlock, we will need to be able to start at a node
of the graph and visit every other node, ending where you started.
\subsection*{Part a}
\begin{tikzpicture}
\def \radius {1.5}
\node[draw, circle, minimum size = 1.2cm] (east) at (0:\radius) {East};
\node[draw, circle, minimum size = 1.2cm] (north) at (90:\radius) {North};
\node[draw, circle, minimum size = 1.2cm] (west) at (180:\radius) {West};
\node[draw, circle, minimum size = 1.2cm] (south) at (270:\radius) {South};
\draw[->, thick] (south.east) .. controls +(right:4mm) and +(down:4mm) .. (east.south);
\draw[->, thick] (south.west) .. controls +(left:4mm) and +(down:4mm) .. (west.south);
\draw[->, thick] (west.north) .. controls +(up:4mm) and +(left:4mm) .. (north.west);
\draw[->, thick] (east.north) .. controls +(up:4mm) and +(right:4mm) .. (north.east);
\end{tikzpicture} \\
The above graph for this part is a Directed Acyclic Graph (DAG), and thus there's no way to
create a cycle. This, in turn, means \textbf{deadlocks are not possible}. Intuitively,
once you turn north, you can't turn away, and thus, can't form a loop.
\subsection*{Part b}
\begin{tikzpicture}
\def \radius {1.5}
\node[draw, circle, minimum size = 1.2cm] (east) at (0:\radius) {East};
\node[draw, circle, minimum size = 1.2cm] (north) at (90:\radius) {North};
\node[draw, circle, minimum size = 1.2cm] (west) at (180:\radius) {West};
\node[draw, circle, minimum size = 1.2cm] (south) at (270:\radius) {South};
\draw[->, thick] (west.south) .. controls +(down:4mm) and +(left:4mm) .. (south.west);
\draw[->, thick] (east.south) .. controls +(down:4mm) and +(right:4mm) .. (south.east);
\draw[->, thick] (north.east) .. controls +(right:4mm) and +(up:4mm) .. (east.north);
\draw[->, thick] (south.north) .. controls +(up:4mm) and +(left:4mm) .. (east.west);
\end{tikzpicture} \\
The above graph for this part does have a cycle, but it's not possible to visit
every node and return to the original position. Thus, \textbf{no deadlock is possible}.
Intuitively, you can't turn north or west, and thus, can't make a loop. After turning
east or south, it's possible to only ``ladder" in a diagonal.
\pagebreak
\subsection*{Part c}
\begin{tikzpicture}
\def \radius {1.5}
\node[draw, circle, minimum size = 1.2cm] (east) at (0:\radius) {East};
\node[draw, circle, minimum size = 1.2cm] (north) at (90:\radius) {North};
\node[draw, circle, minimum size = 1.2cm] (west) at (180:\radius) {West};
\node[draw, circle, minimum size = 1.2cm] (south) at (270:\radius) {South};
\draw[->, thick] (west.south) .. controls +(down:4mm) and +(left:4mm) .. (south.west);
\draw[->, thick] (east.south) .. controls +(down:4mm) and +(right:4mm) .. (south.east);
\draw[->, thick] (north.west) .. controls +(left:4mm) and +(up:4mm) .. (west.north);
\draw[->, thick] (south.north) .. controls +(up:4mm) and +(left:4mm) .. (east.west);
\draw[->, thick] (east.south) .. controls +(down:4mm) and +(right:4mm) .. (south.east);
\draw[->, thick, color=green] (west.east) .. controls +(right:4mm) and +(down:4mm) .. (north.south);
\draw[->, thick, color=blue] (south.north) .. controls +(up:4mm) and +(right:4mm) .. (west.east);
\end{tikzpicture} \\
The above graph for this part has a cycle which includes every node (green and blue were used to disambiguate
arrow directions). Because this cycle exists, this configuration \textbf{can have deadlocks}.
Intuitively, it's possible to make a loop by starting north, turning west, then south, then
east, then south, then west, then back north.
\subsection*{Part d}
\begin{tikzpicture}
\def \radius {1.5}
\node[draw, circle, minimum size = 1.2cm] (east) at (0:\radius) {East};
\node[draw, circle, minimum size = 1.2cm] (north) at (90:\radius) {North};
\node[draw, circle, minimum size = 1.2cm] (west) at (180:\radius) {West};
\node[draw, circle, minimum size = 1.2cm] (south) at (270:\radius) {South};
\draw[->, thick,color=blue] (south.north) .. controls +(up:4mm) and +(left:4mm) .. (east.west);
\draw[->, thick] (east.south) .. controls +(down:4mm) and +(right:4mm) .. (south.east);
\draw[->, thick] (south.west) .. controls +(left:4mm) and +(down:4mm) .. (west.south);
\draw[->, thick] (north.west) .. controls +(left:4mm) and +(up:4mm) .. (west.north);
\draw[->, thick] (north.east) .. controls +(right:4mm) and +(up:4mm) .. (east.north);
\draw[->, thick,color=green] (east.west) .. controls +(left:4mm) and +(down:4mm) .. (north.south);
\end{tikzpicture} \\
The above graph has a cycle, but it does not include every node; \textbf{no deadlocks are possible}.
Intuitively, once you turn west, you can't turn off anymore, and thus, can't form a loop.
\end{document}