Add homework 2 solution.
This commit is contained in:
commit
7bd7036b7f
216
hw2.tex
Normal file
216
hw2.tex
Normal file
@ -0,0 +1,216 @@
|
|||||||
|
\documentclass{article}
|
||||||
|
\usepackage{amsmath}
|
||||||
|
\usepackage{tikz}
|
||||||
|
\usepackage{mathtools}
|
||||||
|
\usepackage{geometry}[margin=0.5in]
|
||||||
|
\begin{document}
|
||||||
|
\section*{Problem 1}
|
||||||
|
Going step by step:
|
||||||
|
|
||||||
|
\begin{itemize}
|
||||||
|
\item The first instruction, $R_A$, causes a cache miss,
|
||||||
|
which is serviced by the next level cache, since
|
||||||
|
the MSI model doesn't have ownership support.
|
||||||
|
\item The second instruction also causes a cache miss,
|
||||||
|
which is serviced by the next level cache.
|
||||||
|
\item The same is the case for the third instruction.
|
||||||
|
Each CPU's private cache needs to bring the data into
|
||||||
|
memory separately.
|
||||||
|
\item The write in the fourth instructions prompts a
|
||||||
|
bus upgrade request to bring Processor 1 into
|
||||||
|
``modified" state. This invalidates the private caches
|
||||||
|
of the second and third processors.
|
||||||
|
\item The read in the fifth instruction requires another
|
||||||
|
cache request, since the private cache was invalidated
|
||||||
|
by the previous instruction. This read is
|
||||||
|
again serviced by the next level cache. However,
|
||||||
|
processor 3 reads a part of memory that was not modified
|
||||||
|
by another processor, and thus, this is a false sharing miss.
|
||||||
|
\item The read in the sixth instruction causes another
|
||||||
|
cache miss like the previous instruction. However, unlike
|
||||||
|
the previous instruction, here the CPU reads a part of the
|
||||||
|
memory that was modified, so this is a true sharing miss.
|
||||||
|
\item The write in the seventh instruction causes
|
||||||
|
another bus upgrade request, and again invalidates the
|
||||||
|
other private caches.
|
||||||
|
\item The read in the eigth instruction causes a cache miss,
|
||||||
|
again serviced by the next level cache. Althoguh
|
||||||
|
the variable $R_A$ was not modified since the previous
|
||||||
|
state of the cache, the subsequent use of the $R_B$
|
||||||
|
variable makes this a true sharing miss.
|
||||||
|
\item The read in the ninth instruction is a cache hit, since
|
||||||
|
the previous instruction already caused a cache miss.
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
\subsection*{Part a}
|
||||||
|
Given the above, the first three misses (1,2,3) are cold misses,
|
||||||
|
the miss at 5 is a false sharing miss, and 6 and 8 are true sharing misses.
|
||||||
|
|
||||||
|
\subsection*{Part b}
|
||||||
|
The false sharing miss at 5 can be ignored, since no variables accessed
|
||||||
|
after the miss (and before others) were changed by another processor.
|
||||||
|
|
||||||
|
\subsection*{Part c}
|
||||||
|
Each miss requires $64+6=70$ bytes of traffic. There are 6 misses,
|
||||||
|
one of which is not essential. In addition, there are two bus
|
||||||
|
upgrade requests, which require 10 bytes each. Thus, we have
|
||||||
|
a total of 440 bytes of data sent ($6 \times 70 + 20$).
|
||||||
|
Of these, 70 are non-essential, and thus, the
|
||||||
|
fraction of essential traffic is $370/440$ = 84\%.
|
||||||
|
|
||||||
|
\pagebreak
|
||||||
|
\section*{Problem 2}
|
||||||
|
In these instruction, there are two ``major" indications
|
||||||
|
of order: the assignment to A in P1 and the subsequent use of A
|
||||||
|
in P2, and the assignment of B in P2 and the subsequent use of B
|
||||||
|
in P3. It would be inconsistent if P3 observed P1 not to have run,
|
||||||
|
but P2 to have run, while P2 observed P1 to have run.
|
||||||
|
If this happened, R3 would be set to A's old value (0),
|
||||||
|
but R1 would be set to 1, and R2 would be set to 1.
|
||||||
|
|
||||||
|
Thus, the assignment of R1 = 1, R2 = 1, R3 = 0 would be inconsistent.
|
||||||
|
|
||||||
|
\pagebreak
|
||||||
|
|
||||||
|
\section*{Problem 3}
|
||||||
|
In one dimension, we have two nodes, with one channel between them.
|
||||||
|
Thus,
|
||||||
|
\begin{equation*}
|
||||||
|
C_1 = 1
|
||||||
|
\end{equation*}
|
||||||
|
When extending to 2 dimensions, we copy the 1-dimensional
|
||||||
|
version of the network 2 more times, and add links between all of the
|
||||||
|
2 nodes 2 times. Thus,
|
||||||
|
\begin{equation*}
|
||||||
|
C_2 = 1 + [2(1) + 2*2] = 7
|
||||||
|
\end{equation*}
|
||||||
|
For three dimensions, we copy the 2-dimensional version of the network 3 more
|
||||||
|
times, and add links between all of the $6 = 3 * 2 = 3!$ nodes
|
||||||
|
$3$ times. Thus,
|
||||||
|
\begin{equation*}
|
||||||
|
C_3 = 1 + [2(1) + 2*2] + [3*7 + 3*3!] = 46
|
||||||
|
\end{equation*}
|
||||||
|
Observing the pattern, we rewrite:
|
||||||
|
\begin{equation*}
|
||||||
|
C_3 = [1(0) + 1*1!] + [2(1) + 2*2!] + [3*7 + 3*3!]
|
||||||
|
\end{equation*}
|
||||||
|
Further, we notice:
|
||||||
|
\begin{equation*}
|
||||||
|
\begin{aligned}
|
||||||
|
C_3 &= \overbrace{\underbrace{[1C_0 + 1*1!]}_{C_1} + [2C_1 + 2*2!]}^{C_2} + [3*C_2 + 3*3!] \\
|
||||||
|
\end{aligned}
|
||||||
|
\end{equation*}
|
||||||
|
Then, we have:
|
||||||
|
\begin{equation*}
|
||||||
|
C_n = (n+1)C_{n-1} + n*n! = (n+1)(C_{n-1} + n!) - n!
|
||||||
|
\end{equation*}
|
||||||
|
Telescoping a little bit:
|
||||||
|
\begin{equation*}
|
||||||
|
\begin{aligned}
|
||||||
|
C_n &= (n+1)(C_{n-1} + n!) - n! \\
|
||||||
|
&= (n+1)(\left[n(C_{n-2} + (n-1)!) - (n-1)!\right] + n!) - n! \\
|
||||||
|
&= (n+1)(\left[n(\left[(n-1)(C_{n-2} + (n-2)!) - (n-2)!\right] + (n-1)!) - (n-1)!\right] + n!) - n! \\
|
||||||
|
\end{aligned}
|
||||||
|
\end{equation*}
|
||||||
|
Observe that for every $-k!$, there's a corresponding $(k+1)k!$, and for each, there are surrounding factors:
|
||||||
|
\begin{equation*}
|
||||||
|
... (k+3)\left[(k+2)\left[(k+1)k! - k!\right]\right] ...
|
||||||
|
\end{equation*}
|
||||||
|
The $(k+1)k!$ term then is multiplied by all numbers until $n+1$, and is thus equal to $(n+1)!$.
|
||||||
|
The negative term, though, misses the $(k+1)$ factor, and is thus equal to $-(n+1)!/(k+1)$.
|
||||||
|
Subtracting the two gives:
|
||||||
|
\begin{equation*}
|
||||||
|
\frac{k}{k+1}(n+1)!
|
||||||
|
\end{equation*}
|
||||||
|
If we expand $C_n$ in such a way until $C_0 = 0$, we will have $0 \leq k \leq n$,
|
||||||
|
and thus:
|
||||||
|
\begin{equation*}
|
||||||
|
\begin{aligned}
|
||||||
|
C_n &= \sum_{k=0}^{n} \frac{k}{k+1}(n+1)! \\
|
||||||
|
\Aboxed{C_n &= (n+1)! \sum_{k=0}^{n} \frac{k}{k+1}}
|
||||||
|
\end{aligned}
|
||||||
|
\end{equation*}
|
||||||
|
|
||||||
|
\pagebreak
|
||||||
|
\section*{Problem 4}
|
||||||
|
I'll take the problem's advice, and go for dependency graphs.
|
||||||
|
To form a deadlock, we will need to be able to start at a node
|
||||||
|
of the graph and visit every other node, ending where you started.
|
||||||
|
|
||||||
|
\subsection*{Part a}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\def \radius {1.5}
|
||||||
|
\node[draw, circle, minimum size = 1.2cm] (east) at (0:\radius) {East};
|
||||||
|
\node[draw, circle, minimum size = 1.2cm] (north) at (90:\radius) {North};
|
||||||
|
\node[draw, circle, minimum size = 1.2cm] (west) at (180:\radius) {West};
|
||||||
|
\node[draw, circle, minimum size = 1.2cm] (south) at (270:\radius) {South};
|
||||||
|
|
||||||
|
\draw[->, thick] (south.east) .. controls +(right:4mm) and +(down:4mm) .. (east.south);
|
||||||
|
\draw[->, thick] (south.west) .. controls +(left:4mm) and +(down:4mm) .. (west.south);
|
||||||
|
\draw[->, thick] (west.north) .. controls +(up:4mm) and +(left:4mm) .. (north.west);
|
||||||
|
\draw[->, thick] (east.north) .. controls +(up:4mm) and +(right:4mm) .. (north.east);
|
||||||
|
\end{tikzpicture} \\
|
||||||
|
The above graph for this part is a Directed Acyclic Graph (DAG), and thus there's no way to
|
||||||
|
create a cycle. This, in turn, means \textbf{deadlocks are not possible}. Intuitively,
|
||||||
|
once you turn north, you can't turn away, and thus, can't form a loop.
|
||||||
|
|
||||||
|
\subsection*{Part b}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\def \radius {1.5}
|
||||||
|
\node[draw, circle, minimum size = 1.2cm] (east) at (0:\radius) {East};
|
||||||
|
\node[draw, circle, minimum size = 1.2cm] (north) at (90:\radius) {North};
|
||||||
|
\node[draw, circle, minimum size = 1.2cm] (west) at (180:\radius) {West};
|
||||||
|
\node[draw, circle, minimum size = 1.2cm] (south) at (270:\radius) {South};
|
||||||
|
|
||||||
|
\draw[->, thick] (west.south) .. controls +(down:4mm) and +(left:4mm) .. (south.west);
|
||||||
|
\draw[->, thick] (east.south) .. controls +(down:4mm) and +(right:4mm) .. (south.east);
|
||||||
|
\draw[->, thick] (north.east) .. controls +(right:4mm) and +(up:4mm) .. (east.north);
|
||||||
|
\draw[->, thick] (south.north) .. controls +(up:4mm) and +(left:4mm) .. (east.west);
|
||||||
|
\end{tikzpicture} \\
|
||||||
|
The above graph for this part does have a cycle, but it's not possible to visit
|
||||||
|
every node and return to the original position. Thus, \textbf{no deadlock is possible}.
|
||||||
|
Intuitively, you can't turn north or west, and thus, can't make a loop. After turning
|
||||||
|
east or south, it's possible to only ``ladder" in a diagonal.
|
||||||
|
|
||||||
|
\pagebreak
|
||||||
|
|
||||||
|
\subsection*{Part c}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\def \radius {1.5}
|
||||||
|
\node[draw, circle, minimum size = 1.2cm] (east) at (0:\radius) {East};
|
||||||
|
\node[draw, circle, minimum size = 1.2cm] (north) at (90:\radius) {North};
|
||||||
|
\node[draw, circle, minimum size = 1.2cm] (west) at (180:\radius) {West};
|
||||||
|
\node[draw, circle, minimum size = 1.2cm] (south) at (270:\radius) {South};
|
||||||
|
|
||||||
|
\draw[->, thick] (west.south) .. controls +(down:4mm) and +(left:4mm) .. (south.west);
|
||||||
|
\draw[->, thick] (east.south) .. controls +(down:4mm) and +(right:4mm) .. (south.east);
|
||||||
|
\draw[->, thick] (north.west) .. controls +(left:4mm) and +(up:4mm) .. (west.north);
|
||||||
|
\draw[->, thick] (south.north) .. controls +(up:4mm) and +(left:4mm) .. (east.west);
|
||||||
|
\draw[->, thick] (east.south) .. controls +(down:4mm) and +(right:4mm) .. (south.east);
|
||||||
|
\draw[->, thick, color=green] (west.east) .. controls +(right:4mm) and +(down:4mm) .. (north.south);
|
||||||
|
\draw[->, thick, color=blue] (south.north) .. controls +(up:4mm) and +(right:4mm) .. (west.east);
|
||||||
|
\end{tikzpicture} \\
|
||||||
|
The above graph for this part has a cycle which includes every node (green and blue were used to disambiguate
|
||||||
|
arrow directions). Because this cycle exists, this configuration \textbf{can have deadlocks}.
|
||||||
|
Intuitively, it's possible to make a loop by starting north, turning west, then south, then
|
||||||
|
east, then south, then west, then back north.
|
||||||
|
|
||||||
|
\subsection*{Part d}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\def \radius {1.5}
|
||||||
|
\node[draw, circle, minimum size = 1.2cm] (east) at (0:\radius) {East};
|
||||||
|
\node[draw, circle, minimum size = 1.2cm] (north) at (90:\radius) {North};
|
||||||
|
\node[draw, circle, minimum size = 1.2cm] (west) at (180:\radius) {West};
|
||||||
|
\node[draw, circle, minimum size = 1.2cm] (south) at (270:\radius) {South};
|
||||||
|
|
||||||
|
\draw[->, thick,color=blue] (south.north) .. controls +(up:4mm) and +(left:4mm) .. (east.west);
|
||||||
|
\draw[->, thick] (east.south) .. controls +(down:4mm) and +(right:4mm) .. (south.east);
|
||||||
|
\draw[->, thick] (south.west) .. controls +(left:4mm) and +(down:4mm) .. (west.south);
|
||||||
|
\draw[->, thick] (north.west) .. controls +(left:4mm) and +(up:4mm) .. (west.north);
|
||||||
|
\draw[->, thick] (north.east) .. controls +(right:4mm) and +(up:4mm) .. (east.north);
|
||||||
|
\draw[->, thick,color=green] (east.west) .. controls +(left:4mm) and +(down:4mm) .. (north.south);
|
||||||
|
\end{tikzpicture} \\
|
||||||
|
The above graph has a cycle, but it does not include every node; \textbf{no deadlocks are possible}.
|
||||||
|
Intuitively, once you turn west, you can't turn off anymore, and thus, can't form a loop.
|
||||||
|
|
||||||
|
\end{document}
|
Loading…
Reference in New Issue
Block a user