diff --git a/final/report.tex b/final/report.tex index 2e3a777..d6167bb 100644 --- a/final/report.tex +++ b/final/report.tex @@ -4,6 +4,8 @@ \usepackage{amsmath} \usepackage{hyperref} \usepackage{xcolor} +\usepackage{caption} +\usepackage{subcaption} \definecolor{link}{HTML}{006275} \hypersetup{ colorlinks, @@ -45,7 +47,7 @@ changes: \begin{itemize} \item I added \textbf{additional precharge transistors} along the column, a total of 4. - Each was sized at $5\lambda$, much like the SRAM transistors themselves. When the clock + Each was sized at $10\lambda$, much like the SRAM transistors themselves. When the clock was low, these PMOS transistors became transparent, and helped precharge the bitlines faster. Doing so helped avoid hysteresis. However, this did not help with writing during high clock, so... @@ -57,11 +59,21 @@ changes: configuration, I did not place it in the middle of the column, as that would needlessly increase the length of the wires. \end{itemize} - -This led to the configuration shown in Figure \ref{fig:top-design}. To simulate this design, I placed -a memory cell at the very top of my column, which is the furthest spot from both the read and write -circuit. I also split the wire into 4 equally-sized fragments, each with resistance $\frac{R}{4}$ and -capacitance $\frac{C}{4}$. Between each fragment, I added the aforementioned $5\lambda$ precharge +% +This led to the configuration shown in Figure \ref{fig:top-design}. To simulate this design, I \textbf{tested three configurations}: +\begin{enumerate} + \item A memory cell at the very top of my column, which is the furthest spot from both the read and write circuits. + This is the simulation in the figure. + \item A memory cell in the middle of my column, in the same place as the write block. Since the write block + has brief ``false starts'', this test was to ensure that the read block can still pick up data + despite the write block's misfires. + \item A memory cell at the very bottom of my column.
This area has additional capacitance from the read block; + it thus takes longer to charge up, and tends to be the first spot where writes fail. +% +\end{enumerate} +I also split the wire into 4 equally-sized fragments, each with resistance $\frac{R}{4}$ and +capacitance $\frac{C}{4}$. Between each fragment, I added the aforementioned $10\lambda$ precharge transistors, as well as 16 always-off $5\lambda$ transistors, which simulated the remaining memory cells. I also placed \textsc{Din}, \textsc{Ad0}, and \textsc{Rwt} behind the default-sized flip-flops attached to the clock to simulate something like a pipeline stage. My overall design is shown @@ -89,12 +101,7 @@ of length to this number, to a total of roughly $2200\lambda$. \pagebreak \section{Performance Results} -I was able to clock my design at 1.38ns. There is a caveat to this clock speed: my \textsc{Bt} and -\textsc{Bf} lines are not pulled all the way to \textsc{Gnd} when they are written low. This -doesn't seem to be a problem - it's sufficient to flip the furthest cell in the design in -every situation I've tested. However, from what I hear, this was discouraged during one of -the office hours (which I was unable to attend). With the constraint of pulling the wires -all the way down, my design can operate at around 2.1ns. +I was able to clock my design at $1.9\textit{ns}$. % Two factors lead to these upper limits. % @@ -116,7 +123,7 @@ Two factors lead to these upper limits. \section{Components} \subsection{Decoder} \subsubsection{In My Own Words} -The decoder in this design is exact same one as we were given in lecture. +The decoder in this design is \textit{almost} the exact same one as we were given in lecture. It computes all combinations of two consecutive bits using a \textsc{Nand} gate; for each combination, there are 4 adjacent two-bit combinations, leading to 4 \textsc{Nor} gates connected to each \textsc{Nand}.
There are now @@ -127,6 +134,20 @@ results in 256 unique \textsc{Wl} wires. Finally, these need to be attached to the clock, so that cells aren't open randomly. This is done using an \textsc{And} gate (a \textsc{Nand} followed by an inverter). +I adjusted this design to account for the address signals that need to be fed +into the write blocks. Which of the read/write columns is triggered +depends on the upper two bits of the address (since we have 4 columns). I modeled +this by increasing the fanout on the first \textsc{Nand} gate from 1 to 4. +This is pessimistic; each 2-bit combination would only feed into one write block, +whose trigger gate is normally sized. + +\begin{figure}[h] + \centering + \includegraphics[width=\linewidth]{decoder.png} + \caption{Decoder model used in project.} + \label{fig:decoder} +\end{figure} + % TODO: Domino logic % TODO: More inverters? @@ -151,7 +172,7 @@ on the two \textsc{Nand3} gates was easy to understand and build, but was less sensitive, and tended to behave strangely under pressure. This led to difficulties with debugging (the output would, for instance, flip completely at certain wire widths), and was seemingly random. Instead, I used -an \textbf{improved latch-based sense amplifier design} from . % TODO: cite +an \textbf{improved latch-based sense amplifier design} from \cite{210039}. % TODO: cite The design I used is shown in Figure \ref{fig:latch-amp}. I left it sized at $40\lambda$, since larger amplifiers seem to take longer to trigger and exit metastability. @@ -163,9 +184,32 @@ the initial clock. Thus, if a write occurred during a previous cycle, the write activate for a short period of time before the read block does. The memory cell will overpower this initial misfire\footnote{According to my additional simulations, this is true even when the memory cell is close to the write block.}, but in this case, both \textsc{Bt} and \textsc{Bf} will be below \textsc{Vdd}. 
The ``improved sense amplifier'' seems to handle this -case better than the one based on two \textsc{Nand} gates. I think that both Reed and -Graham experienced this occurrence -- they seemed to post very similar waveforms -to the community Discord group chat. +case better than the one based on two \textsc{Nand} gates. + +The latch-induced delay in \textsc{Rwt} also causes a strange \textsc{Trigger} signal during write operations +directly following read operations. The trigger signal initially activates, putting the sense +amplifier into metastability; however, the correct \textsc{Rwt} value arrives before the +sense amp's outputs are compromised. If this became a problem, I would add an additional, +delayed clock signal \emph{after} the sense amplifier, and use an \textsc{And} gate +to delay the read block's output. + +\begin{figure}[h] +\centering +\begin{subfigure}{.5\textwidth} + \centering + \includegraphics[width=.7\linewidth]{amp.png} + \caption{The latch-based sense amplifier from \cite{210039}.} + \label{fig:latch-amp} +\end{subfigure}% +\begin{subfigure}{.5\textwidth} + \centering + \includegraphics[width=.8\linewidth]{read_select.png} + \caption{The block gathering signals from the four columns.} + \label{fig:read-collect} +\end{subfigure} +\caption{Read block schematics} +\label{fig:read} +\end{figure} \pagebreak \subsection{Write Block} @@ -173,8 +217,8 @@ to the community Discord group chat. The write block converts a ``data in'', or \textsc{Din}, signal into a one-hot representation. It does so by pulling one of the bitlines high, and the other low. Once the memory cell connects to the bitlines, it takes on the charge provided by the -write block, and is therefore overwritten. +write block, and is therefore overwritten.
In my design, two PMOS transistors for each bitline +are used to pull down; one of the transistors is triggered by the \textsc{Din} signal (which wire we pull down depends on the signal itself!), and the other by a combination of the clock and \textsc{Rwt} (we don't want to touch the wires when reading!). @@ -194,7 +238,9 @@ time is spent reading the wires, the memory cell in question is able to graduall of charge on one of these wires. Since the original, \textsc{Nand}-based sense amplifier required all inputs to be high to properly function, this led to it eventually ``flipping'' and producing the wrong output. This was only an issue above $5\textit{ns}$, and only with the original sense amplifier -design, though. +design, though. I think that both Reed and +Graham experienced this occurrence -- they seemed to post very similar waveforms +to the community Discord group chat. One thing to note about the write block is that its \textbf{clock input is deliberately delayed} compared to the ``actual'' clock. This is because of an issue with \textsc{Din}. Since this @@ -202,12 +248,19 @@ input is behind a latch, it takes around $300\textit{ps}$ to arrive after the ri edge. If the previous value of \textsc{Din} was different than its current one, the write block will start writing the wrong value. This will typically mean that the block cannot properly perform the write. The delay on the clock input serves to mitigate this issue, by giving more -time for \textbf{Din} to settle before starting to write. To compensate for this delay, I sized +time for \textsc{Din} to settle before starting to write. To compensate for this delay, I sized the write block's pull down transistors quite large ($100\lambda$), so that they can pull the wire down, even starting $300\textit{ps}$ into the cycle. This is why the ``clock'' input in my diagrams is colored black, unlike every other clocked component. 
The delay is achieved by 6 sequenced inverters, two of which are sized 10x larger than the rest. +\begin{figure}[h] + \centering + \includegraphics[width=0.65\linewidth]{write.png} + \caption{Write block used in this project.} + \label{fig:write} +\end{figure} + \pagebreak \subsection{Memory Cell} \subsubsection{In My Own Words} @@ -250,4 +303,58 @@ above - it becomes nigh impossible to wire further \textsc{Wl} lines through eac unless the decoder is split into bits, in which case the width of the entire assembly drastically increases, slowing down all signals. +\begin{figure}[h] + \centering + \includegraphics[width=0.5\linewidth]{layout_single.png} + \caption{Electric layout for a single cell.} + \label{fig:layout-cell} +\end{figure} + +\pagebreak +My basic cell is shown in Figure \ref{fig:layout-cell}. The arrayed version (in Figure \ref{fig:layout-arrayed}) +merits additional explanation. In my earlier description of the overall design, I mentioned +that I have precharge PMOS transistors. I have integrated these into my layout to accurately model +my design. I also made them $10\lambda$ wide, since this is, at the time of writing, +the size of my 4 precharge transistors. In the bird's eye view (Figure \ref{fig:layout-arrayed-far}), +three things can be observed: +\begin{itemize} + \item \textit{Additional vertical line:} This line represents the clock signal, + which must be fed to the precharge transistors. In the full design, there would + be 5 clock lines (3 shared, and 2 on either side). + \item \textit{``Empty'' space between nodes:} I left this space because I was not sure + how wide I would end up making my \textsc{Bt} and \textsc{Bf} wires. I have measured + the distance to ensure that the design will remain DRC clean with up to \textbf{$8\lambda$-wide bitlines}. + This appears to be a sweet spot for my design, anyway. + \item \textit{Moved well contacts:} I have moved my well contacts to the region between + two columns. 
By extending the N- and P-wells to this area, I was able to + share a single contact between two cells, leaving room for precharge transistors + on both sides of the cell. This was partially inspired by Reed's compact cell design, + which shared a single contact between two cells\footnote{I am operating based on your + comment that well contacts for every cell are significant overkill.}. +\end{itemize} +Figure \ref{fig:layout-arrayed-close} shows a closer view of the design. Due to the additional +space incurred, an entire column is approximately $100\lambda$ wide. + +\begin{figure}[h] +\centering +\begin{subfigure}{.5\textwidth} + \centering + \includegraphics[width=.7\linewidth]{layout_arrayed.png} + \caption{Bird's-eye view of the arrayed SRAM cells.} + \label{fig:layout-arrayed-far} +\end{subfigure}% +\begin{subfigure}{.5\textwidth} + \centering + \includegraphics[width=.8\linewidth]{layout_arrayed_closeup.png} + \caption{Close-up of the arrayed SRAM cells.} + \label{fig:layout-arrayed-close} +\end{subfigure} +\caption{Layout of the arrayed SRAM cells} +\label{fig:layout-arrayed} +\end{figure} + +\pagebreak +\bibliographystyle{unsrt} +\bibliography{bibliography} + \end{document}