Compare commits

...

22 Commits

Author SHA1 Message Date
f317a7e8da Add some minor additional sections to report. 2021-03-17 13:34:04 -07:00
90a5aed6ec Size sense amp back to 40 2021-03-17 13:12:55 -07:00
c6d7795074 Update report with new performance characteristics. 2021-03-17 13:09:15 -07:00
d8f4a272e3 Update to use new wire model's characteristics. 2021-03-17 13:07:33 -07:00
ff0edb93bb Update with new wire model 2021-03-17 12:58:36 -07:00
39ec744562 Update report. 2021-03-17 12:21:35 -07:00
6530e7ef8c 1.9ns everywhere. 2021-03-17 12:19:33 -07:00
71195df7c9 Add final SRAM design. 2021-03-17 12:19:17 -07:00
0c1d8611b1 Add missing images. 2021-03-17 12:19:02 -07:00
b99403a4ff Update TODOs. 2021-03-17 12:16:02 -07:00
9afa839bff Add TODO. 2021-03-17 11:51:50 -07:00
6f99879b8f Update electric files. 2021-03-17 09:00:29 -07:00
fe52f689f9 Upload files to 'final' 2021-03-17 00:32:18 -07:00
c171b0374b Upload files to 'final' 2021-03-17 00:02:13 -07:00
eb8d068519 Update 'final/todo.md' 2021-03-17 00:00:34 -07:00
6db76f4fd3 Update todo 2021-03-16 23:25:36 -07:00
64ee80be63 Update report. 2021-03-16 23:25:30 -07:00
6b963c967b Merge branch 'master' of dev.danilafe.com:ECE-571/Labs 2021-03-16 19:02:58 -07:00
4d4ceddcc6 Update todos 2021-03-16 19:02:06 -07:00
75381749d7 Add initial designs. 2021-03-16 18:32:56 -07:00
d2f53a9a4f Update 'final/todo.md' 2021-03-16 18:19:45 -07:00
0f6426958e Update 'final/todo.md' 2021-03-16 16:47:53 -07:00
14 changed files with 1563 additions and 573 deletions

File diff suppressed because it is too large Load Diff

View File

@@ -26,8 +26,8 @@
.subckt wire iot iof len=10 wid=10 .subckt wire iot iof len=10 wid=10
.param rr=0.8 .param rr=0.4
.param cc = '200e-15' .param cc = '100e-15'
rt iot iof 'rr*len*50/(wid)' rt iot iof 'rr*len*50/(wid)'
cf iof 0 'cc*len*wid*50/1e6' cf iof 0 'cc*len*wid*50/1e6'
@@ -179,7 +179,7 @@ Xpct btt clk vdd pp ww='100'
Xpcf bff clk vdd pp ww='100' Xpcf bff clk vdd pp ww='100'
.ends write1 .ends write1
.subckt iWrite1 btt bff dii rwt clk .subckt iWrite1 btt bff dii rwt en clk
* TODO: sizes * TODO: sizes
Xclk clkb clk inv size='40' Xclk clkb clk inv size='40'
Xdii diib dii inv size='40' Xdii diib dii inv size='40'
@@ -268,7 +268,7 @@ Xh2 nn1 dot inv size='60'
.subckt decModel choose din clk size='20' .subckt decModel choose din clk size='20'
Xi1 nn1 din inv size='size' Xi1 nn1 din inv size='size'
* Here: stopped using i1 and just used din * Here: stopped using i1 and just used din
Xnal ww1 gnd din nnd2 size='size' Xnal ww1 gnd din nnd2 size='size*4'
Xnar nn2 vdd din nnd2 size='size' Xnar nn2 vdd din nnd2 size='size'
Xnrl ww2 nn2 vdd nor2 size='size*3' Xnrl ww2 nn2 vdd nor2 size='size*3'
Xnrr nn3 nn2 gnd nor2 size='size' Xnrr nn3 nn2 gnd nor2 size='size'

BIN
final/amp.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 53 KiB

BIN
final/decoder.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 11 KiB

BIN
final/layout_arrayed.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 394 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 100 KiB

BIN
final/layout_single.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 210 KiB

BIN
final/read_select.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 24 KiB

View File

@@ -2,11 +2,25 @@
\usepackage[margin=1in]{geometry} \usepackage[margin=1in]{geometry}
\usepackage{graphicx} \usepackage{graphicx}
\usepackage{amsmath} \usepackage{amsmath}
\usepackage{hyperref}
\usepackage{xcolor}
\usepackage{caption}
\usepackage{subcaption}
\definecolor{link}{HTML}{006275}
\hypersetup{
colorlinks,
citecolor=black,
filecolor=black,
linkcolor=link,
urlcolor=black
}
\title{Final Project Report} \title{Final Project Report}
\author{Danila Fedorin} \author{Danila Fedorin}
\begin{document} \begin{document}
\maketitle \maketitle
\section*{General Design and Considerations} \tableofcontents
\pagebreak
\section{General Design and Considerations}
The goal of this assignment was to create a 256-byte SRAM memory unit. In order The goal of this assignment was to create a 256-byte SRAM memory unit. In order
to minimize wire delays, I chose to split each bit into \textbf{4 columns of 64 SRAM cells to minimize wire delays, I chose to split each bit into \textbf{4 columns of 64 SRAM cells
each}. This was motivated by the following factors: each}. This was motivated by the following factors:
@@ -18,7 +32,7 @@ each}. This was motivated by the following factors:
the decision to shrink the columns as much as possible. However... the decision to shrink the columns as much as possible. However...
\item \emph{Smaller} columns became a routing challenge. Even with a 4-column split, \item \emph{Smaller} columns became a routing challenge. Even with a 4-column split,
to properly connect each cell of the SRAM column, the SRAM cells themselves need to properly connect each cell of the SRAM column, the SRAM cells themselves need
to accomodate an additional three \textsc{Wl} lines. Due to the pitch requirements to accommodate an additional three \textsc{Wl} lines. Due to the pitch requirements
on metals three and four, this is the upper limit (for reasonably sized cells). on metals three and four, this is the upper limit (for reasonably sized cells).
Alternatives included splitting the decoder into pieces, but for large numbers Alternatives included splitting the decoder into pieces, but for large numbers
of columns, this meant that the decoder signal traveled through significant amounts of columns, this meant that the decoder signal traveled through significant amounts
@@ -33,7 +47,7 @@ changes:
\begin{itemize} \begin{itemize}
\item I added \textbf{additional precharge transistors} along the column, a total of 4. \item I added \textbf{additional precharge transistors} along the column, a total of 4.
Each was sized at $5\lambda$, much like the SRAM transistors themselves. When the clock Each was sized at $10\lambda$, much like the SRAM transistors themselves. When the clock
was low, these PMOS transistors became transparent, and helped precharge the bitlines faster. was low, these PMOS transistors became transparent, and helped precharge the bitlines faster.
Doing so helped avid hysteresis. However, this did not help with writing during high clock, Doing so helped avid hysteresis. However, this did not help with writing during high clock,
so... so...
@@ -45,43 +59,329 @@ changes:
configuration, I did not place it in the middle of the column, as that would needlessly configuration, I did not place it in the middle of the column, as that would needlessly
increase the length of the wires. increase the length of the wires.
\end{itemize} \end{itemize}
%
This led to the configuration shown in Figure \ref{fig:top-design}. To simulate this design, I placed This led to the configuration shown in Figure \ref{fig:top-design}. To simulate this design, I \textbf{tested three configurations}:
a memory cell at the very top of my column, which is the furthest spot from both the read and write \begin{enumerate}
circuit. I also split the wire into 4 equally-sized fragments, each with resistance $\frac{R}{4}$ and \item A memory cell at the very top of my column, which is the furthest spot from both the read and write.
capacitance $\frac{C}{4}$. Between each fragment, I added the aforementioned $5\lambda$ precharge This is the simulation in the figure.
\item A memory cell in the middle of my column, in the same place as the write block. Since the write block
has brief ``false starts'', this test was to ensure that the read block can still pick up data
despite the write block's misfires.
\item A memory cell at the very bottom of my column. This area has additional capacitance from the read block;
it thus takes longer to charge up, and tends to be the first spot where writes fail.
circuit.
%
\end{enumerate}
I also split the wire into 4 equally-sized fragments, each with resistance $\frac{R}{4}$ and
capacitance $\frac{C}{4}$. Between each fragment, I added the aforementioned $10\lambda$ precharge
transistors, as well as 16 always-off $5\lambda$ transistors, which simulated the remaining memory cells. transistors, as well as 16 always-off $5\lambda$ transistors, which simulated the remaining memory cells.
I also placed \textsc{Din}, \textsc{Ad0}, and \textsc{Rwt} behind the default-sized flip-flops I also placed \textsc{Din}, \textsc{Ad0}, and \textsc{Rwt} behind the default-sized flip-flops
attached to the clock to simulate something like a pipeline stage. My overall design is shown attached to the clock to simulate something like a pipeline stage. My overall design is shown
in figure \ref{fig:top-design-sim}. in Figure \ref{fig:top-design-sim}.
\pagebreak \pagebreak
\section*{Performance Results} \begin{figure}[h]
I made three measurements of my performance. \centering
\includegraphics[width=\linewidth]{toplevel_design.png}
\caption{Top-level design for a single bit.}
\label{fig:top-design}
\end{figure}
\begin{itemize} My SRAM cell ended up being $30\lambda$ units tall when arrayed. With
\item Without flip-flopping my inputs and outputs, I was able to clock my design around a total of 64 cells in a single column, this led to a wire length of $1920\lambda$.
950\textit{ps}. However, since my write block was now included in the column, I added another $300\lambda$
\item With flip-flops on my inputs (but not on my output), I was able to clock my design of length to this number, to a total of roughly $2200\lambda$.
around 1.24\textit{ns}. However, at this delay, the output of the gate came in very close to
the falling edge of the clock. \begin{figure}
\item With flip-flops on my inputs and my outputs, I was able to clock my design at 2.6\textit{ns}. \centering
This significant delay was to allow enough setup time for the flip flop. \includegraphics[width=0.6\linewidth]{toplevel.png}
\end{itemize} \caption{Architecture of top-level simulation.}
\label{fig:top-design-sim}
\end{figure}
\pagebreak
\section{Performance Results}
I was able to clock my design at \textbf{$1.3\textit{ns}$}.
% %
Two factors lead to these upper limits. I realize that this isn't as fast as everyone else, but I ask that you take
into consideration the fact that \textbf{I was working with the old wire model}
until about an hour before the final due date (since I didn't know the wire model changed).
If I knew earlier, I'd have more time to optimize my design for the timings associated
with the new model.
%
Two factors lead to this upper limit.
% %
\begin{itemize} \begin{itemize}
\item \textit{Write capacitance} makes it increasingly difficult to overwrite the value \item \textit{Write capacitance} makes it increasingly difficult to overwrite the value
in the cell. Clocking my design any faster than 950\textit{ps} or 1.24\textit{ns} in the cell. Clocking my design any faster leads my cell to \textit{almost} flip, but not resolve correctly.
(depending on the case) leads my cell to \textit{almost} flip, but not resolve correctly.
I have found no way to work around these limits once my wire was properly sized, and my I have found no way to work around these limits once my wire was properly sized, and my
write block was placed in the middle of the column. write block was placed in the middle of the column.
\item \textit{Flop, decoder, and read delays} are the major limitation when both the inputs \item \textit{Flop, decoder, and read delays} are the major limitation when both the inputs
and the outputs of the circuit are connected to flip flops. Even though the output and the outputs of the circuit are connected to flip flops. The most significant
of the read block is correct, it doesn't arrive fast enough to be captured by the next cycle. instance of this issue is my write block: both \textsc{Din} and \textsc{Rwt} arrive
Furthermore, in some cases, the signal to open a memory cell arrives later than the around $300\textit{ps}$ into the cycle. This means two things: a) if the previous
\textsc{Trig} signal for the senseamp, making it read too early and thus output the incorrect value. operation was ``read'', then the block does not start writing until halfway into
the positive phase of the clock and b) if the data being written is different
from the data in the previous cycle, for half the time, the write block will write
the old data (until the flip flop switches).
\end{itemize} \end{itemize}
\section{Components}
\subsection{Decoder}
\subsubsection{In My Own Words}
The decoder in this design is \textit{almost} the exact same one as we were given in lecture.
It computes all combinations of two consecutive bits using a \textsc{Nand} gate; for
each combination, there are 4 adjacent two-bit combinations,
leading to a 4 \textsc{Nor} gates connected to each \textsc{Nand}. There are now
16 combinations of 4 adjacent bits; each combination of the lower 4 bits
needs to be compared with each of the 16 combinations of the upper 4 bits,
leading to 16 \textsc{Nand} gates connected to each \textsc{Nor}. This
results in 256 unique \textsc{Wl} wires. Finally, these need to be attached
to the clock, so that cells aren't open randomly. This is done using an \textsc{And}
gate (a \textsc{Nand} followed by an inverter).
I adjusted this design to account for the address signals that need to be fed
into the write blocks. Which of the read/write columns is triggered
depends on the upper two bits of the address (since we have 4 columns). I modeled
this by increasing the fanout on the first \textsc{Nand} gate from 1 to 4.
This is pessimistic; each 2-bit combination would only feed into one write block,
whose trigger gate is normally sized.
\begin{figure}[h]
\centering
\includegraphics[width=\linewidth]{decoder.png}
\caption{Decoder model used in project.}
\label{fig:decoder}
\end{figure}
% TODO: Domino logic
% TODO: More inverters?
\pagebreak
\subsection{Read Block}
\subsubsection{In My Own Words}
The read block uses a \emph{sense amplifier} to detect small changes on the bitlines,
which it then translates into a zero-or-one output. The changes in the wires are below
the threshold of what could be considered digital logic; all the sense amplifier
designs I've come across rely on metastability, a state in which even tiny fluctuations
can significantly alter the outcome\footnote{My favorite analogy is a pencil balanced on its tip.
Technically, it's stable; however, even a small air current -- one you can't feel -- can knock it over.}.
The \textsc{Trigger} signal, which depends on the clock and \textsc{Rwt}, puts the amplifier
into a metastable state. From there, the connected bitlines cause it to resolve one way
or another. Finally, if one of the wires resolves, a value is written into the keeper circuit
at the end, which ensures that the value that was read continues to be expressed until
the next read operation.
\subsubsection{Details}
For my read block, I used a different sense amplifier design. The design based
on the two \textsc{Nand3} gates was easy to understand and build, but was less
sensitive, and tended to behave strangely under pressure. This led to difficulties
with debugging (the output would, for instance, flip completely at certain
wire widths), and was seemingly random. Instead, I used
an \textbf{improved latch-based sense amplifier design} from \cite{210039}. % TODO: cite
The design I used is shown in Figure \ref{fig:latch-amp}.
I left it sized at $40\lambda$, since larger amplifiers seem to take longer
to trigger and exit metastability.
The read block is not a particular bottleneck in this design. The main concern
was to handle the \textbf{``false start'' activation of the write block}. Because the \textsc{Rwt}
input is behind a latch, it takes nearly $300\textit{ps}$ to pull up or down after
the initial clock. Thus, if a write occurred during a previous cycle, the write block will
activate for a short period of time before the read block does. The memory cell
will overpower this initial misfire\footnote{According to my additional simulations, this is true even when the memory cell is close to the write block.}, but in this case, both \textsc{Bt} and \textsc{Bf}
will be below \textsc{Vdd}. The ``improved sense amplifier'' seems to handle this
case better than the one based on two \textsc{Nand} gates.
The latch-induced delay in \textsc{Rwt} also causes a strange \textsc{Trigger} signal during write operations
directly following read operations. The trigger signal initialy activates, putting the sense
amplifier into metastability; however, the correct \textsc{Rwt} value arrives before the
sense amp's outputs are compromised. If this became a problem, I would add an additional,
delayed clock signal \emph{after} the sense amplifier, and use an \textsc{And} gate
to delay the read block's output.
\begin{figure}[h]
\centering
\begin{subfigure}{.5\textwidth}
\centering
\includegraphics[width=.7\linewidth]{amp.png}
\caption{The latch-based sense amplifier from \cite{210039}.}
\label{fig:latch-amp}
\end{subfigure}%
\begin{subfigure}{.5\textwidth}
\centering
\includegraphics[width=.8\linewidth]{read_select.png}
\caption{The block gathering signals from the four columns.}
\label{fig:read-collect}
\end{subfigure}
\caption{Read block schematics}
\label{fig:read}
\end{figure}
\pagebreak
\subsection{Write Block}
\subsubsection{In My Own Words}
The write block converts a ``data in'', or \textsc{Din}, signal
into a one-hot representation. It does so by pulling one of the bitlines high, and the other
low. Once the memory cell connects to the bitlines, it takes on the charge provided by the
write block, and is therefore overwritten. In my design, two PMOS transistors for each bitline
are used to pull down; one of the transistors is triggered by the \textsc{Din} signal (which wire
we pull down depends on the signal itself!), and the other by a combination of the clock
and \textsc{Rwt} (we don't want to touch the wires when reading!).
\subsubsection{Details}
My write block was not significantly different from the original design. Under the assumption
that data arrives first, I placed the transistors attached to \textsc{Din} and $\overline{\textsc{Din}}$
close to \textsc{Gnd}, each followed by a transistor attached to the ``write'' signal.
I also configured the write block to only precharge when the clock is low.
I experimented with making the write block pull wires up when writing (during high clock). However,
I did not find this to be of significant use. Since the wires are initially precharged,
there is no more time spent on charging them up; furthermore, the memory cell being written to
does not have enough ``strength'' to pull the wire down enough.
A curiosity of this design is that reads didn't seem to work with hich clock speeds. When enough
time is spent reading the wires, the memory cell in question is able to gradually exhaust the amount
of charge on one of these wires. Since the original, \textsc{Nand}-based sense amplifier required
all inputs to be high to properly function, this led to it eventually ``flipping'' and producing
the wrong output. This was only an issue above $5\textit{ns}$, and only with the original sense amplifier
design, though. I think that both Reed and
Graham experienced this occurrence -- they seemed to post very similar waveforms
to the community Discord group chat.
One thing to note about the write block is that its \textbf{clock input is deliberately delayed} compared
to the ``actual'' clock. This is because of an issue with \textsc{Din}. Since this
input is behind a latch, it takes around $300\textit{ps}$ to arrive after the rising clock
edge. If the previous value of \textsc{Din} was different than its current one, the write
block will start writing the wrong value. This will typically mean that the block cannot properly
perform the write. The delay on the clock input serves to mitigate this issue, by giving more
time for \textsc{Din} to settle before starting to write. To compensate for this delay, I sized
the write block's pull down transistors quite large ($100\lambda$), so that they can pull
the wire down, even starting $300\textit{ps}$ into the cycle. This is why the ``clock'' input
in my diagrams is colored black, unlike every other clocked component. The delay is achieved
by 6 sequenced inverters, two of which are sized 10x larger than the rest.
\begin{figure}[h]
\centering
\includegraphics[width=0.65\linewidth]{write.png}
\caption{Write block used in this project.}
\label{fig:write}
\end{figure}
\pagebreak
\subsection{Memory Cell}
\subsubsection{In My Own Words}
The memory cell consists of two cross coupled inverters whose outputs
are disconnected from the bitlines by two additional nMOS transistors. When disconnected,
this cell reliably holds its value; one inverter's output turns off the other, and symmetrically,
the ``off'' output of that other inverter keeps the first one on. However, this cell is pretty
small; all of its transistors have size $5\lambda$ is the smallest size that can be properly
connected with a standard $2\lambda\times2\lambda$ via. Thus, when the ``write line'' (signal
connected to the gates of the two outside transistors) is asserted, the charge from the
surrounding bitlines can easily overpower the cell, causing it to switch to a different value.
\subsubsection{Details}
There are few notable things about my cell design. Even though it was recommended that we only
use metals one and two for the internal wiring, I went up to metal three for cross-connecting
the two internal inverters. This was the only way I found to keep the height of the cell to
minimum. This limited my routing options somewhat; to compensate, I also used metal three for
the vertical wires, \textsc{Bt} and \textsc{Bf}. This allowed me to use metal four for the
\textsc{Wl} (access) signal. Since this was the only use of metal four, I had enough free
room to route thee additional \textsc{Wl} signals to the remaining three columns.
My general principle for designing the layout was that, in an 12-bit, 4-column design, \textbf{a single
unit of height costs as much as 64 units of width}. Thus, I was fairly liberal with my layout's
width, but made sure to minimize the height of the design. The most significant bottleneck
was the gate oxide ``poking out'' of the ends of the design. In total, I was able to achieve
a height of $30\lambda$ when arrayed.
Other designs with smaller height were possible, but I found them undesirable. For instance,
Reed's now-famous design used a significant amount of high-level metals to achieve its tiny,
almost square area. This, however, makes routing \textsc{Wl} signals fairly complicated. They either
need to go to yet another layer of metal, or the decoder needs to be split into 4 pieces. The former
is undesirable as per the requirements for this assignment; the latter incurs the cost of additional
decoder hardware between columns, thereby significantly increasing the wire length and signal
delays. Since delays incurred by the flip flops and other signals are already becoming
a significant factor in my design, I thought it would be best to avoid such delays.
Other ideas I am aware of include putting \textit{all} the transistors in a single, horizontal line.
While this certainly succeeds at reducing the height, it incurs all the same issues described
above - it becomes nigh impossible to wire further \textsc{Wl} lines through each column,
unless the decoder is split into bits, in which case the width of the entire assembly drastically increases,
slowing down all signals.
\begin{figure}[h]
\centering
\includegraphics[width=0.5\linewidth]{layout_single.png}
\caption{Electric layout for a single cell.}
\label{fig:layout-cell}
\end{figure}
\pagebreak
My basic cell is shown in Figure \ref{fig:layout-cell}. The arrayed version (in Figure \ref{fig:layout-arrayed})
merits additional explanation. In my earlier description of the overall design, I mentioned
that I have precharge PMOS transistors. I have integrated these into my layout to accurately model
my design. I also made them $10\lambda$ wide, since this is, at the time of writing,
the size of my 4 precharge transistors. In the bird's eye view (Figure \ref{fig:layout-arrayed-far}),
three things can be observed:
\begin{itemize}
\item \textit{Additional vertical line:} This line represents the clock signal,
which must be fed to the precharge transistors. In the full design, there would
be 5 clock lines (3 shared, and 2 on either side).
\item \textit{``Empty'' space between nodes:} I left this space because I was not sure
how wide I would end up making my \textsc{Bt} and \textsc{Bf} wires. I have measured
the distance to ensure that the design will remain DRC clean with up to \textbf{$8\lambda$-wide bitlines}.
This appears to be a sweet spot for my design, anyway.
\item \textit{Moved well contacts:} I have moved my well contacts to the region between
two columns. By extending the N- and P-wells to this area, I was able to
share a single contact between two cells, leaving room for prechare transistors
on both sides of the cell. This was partially inspired by Reed's compact cell design,
which shared a single contact between two cells\footnote{I am operating based on your
comment that well contacts for every cell are significantly overkill.}.
\end{itemize}
Figure \ref{fig:layout-arrayed-close} shows a closer view of the design. Due to the additional
space incurred, an entire column is approximately $100\lambda$ wide.
\begin{figure}[h]
\centering
\begin{subfigure}{.5\textwidth}
\centering
\includegraphics[width=.7\linewidth]{layout_arrayed.png}
\caption{Bird's eye view of the arrayed SRAM cells.}
\label{fig:layout-arrayed-far}
\end{subfigure}%
\begin{subfigure}{.5\textwidth}
\centering
\includegraphics[width=.8\linewidth]{layout_arrayed_closeup.png}
\caption{Close up from arrayed SRAM cells.}
\label{fig:layout-arrayed-close}
\end{subfigure}
\caption{Read block schematics}
\label{fig:layout-arrayed}
\end{figure}
\pagebreak
\section{Further Design Ideas}
I discovered -- from other people in the class -- that an 8-column design was plausible.
Unfortunately, I was only convinced a day or so before the project was due, which did not give me
enough time to redesign my SRAM. I have seen students successfully using
the 8-column design by sharing \textsc{Wl} wires for each 'row', and using
the remaining 3 bits to enable and disable the write block. Since reading does
not change the cell value, this is a viable approach; all 8 columns would ``read''
(except during writing, in which 7 columns would read and 1 would write). As
long as a proper address selection mechanism is implemented into the read collector
circuit (which at present cannot handle concurrent reads), this would work just
fine, albeit at the expense of added power consumption (from draining and re-charging
7 extra wires). This design, combined with my idea of placing the write block
in the middle of the column, can lead to very short effective wire lengths. If
I was to approach this project again, that's what I would try.
\section{Acknowledgements}
Reed's aforementioned idea of sharing well contacts between adjacent cells
played a part in my design. Also, without the other students in the class
Discord, I would not have known to use the ``better'' wire model at all.
\pagebreak
\bibliographystyle{unsrt}
\bibliography{bibliography}
\end{document} \end{document}

View File

@@ -21,17 +21,17 @@ Xnf fff gnd dead nn ww='number*5'
*********begin: topLevel***** *********begin: topLevel*****
.param per = 1.33ns .param per = 1.3ns
.param dataLead=per*0.1 .param dataLead=per*0.1
.param lw=2000 .param lw=2200
.param wirew=14 .param wirew=14
vdd vdd 0 'supply' vdd vdd 0 'supply'
Xclok clk dat1 period='per' start='per+dataLead' total=1 duty=0.5 sz=300 Xclok clk dat1 period='per' start='per+dataLead' total=1 duty=0.5 sz=300
Xad ad dat1 period='per' start='per' total=1 duty=0.5 sz=300 Xad ad dat1 period='per' start='per' total=1 duty=0.5 sz=300
Xrdwr rdw dat1 period='3*per' start='2*per' total=2 duty=1 sz=300 Xrdwr rdw dat1 period='per' start='2*per' total=2 duty=1 sz=300
Xdii din dat1 period='3*per' start='per' total=4 duty=2 sz=300 Xdii din dat1 period='per' start='per' total=4 duty=2 sz=300
Xinv1 clkb1 clk inv Xinv1 clkb1 clk inv
Xinv2 clkb2 clkb1 inv Xinv2 clkb2 clkb1 inv
@@ -46,7 +46,7 @@ Xrdwff rdwf rdw clk flop
Xrotff dotf dot clk flop Xrotff dotf dot clk flop
Xdec choose adf clk decModel Xdec choose adf clk decModel
Xwr bt3 bf3 dinf rdwf clkb6 iWrite1 Xwr bt3 bf3 dinf rdwf adf clkb6 iWrite1
Xw1 bt1 bt2 bf1 bf2 clk wire_precharge len='lw/4' wid='wirew' Xw1 bt1 bt2 bf1 bf2 clk wire_precharge len='lw/4' wid='wirew'
Xmd1 bt2 bf2 memLoad number=15 Xmd1 bt2 bf2 memLoad number=15
Xw2 bt2 bt3 bf2 bf3 clk wire_precharge len='lw/4' wid='wirew' Xw2 bt2 bt3 bf2 bf3 clk wire_precharge len='lw/4' wid='wirew'
@@ -54,8 +54,10 @@ Xmd2 bt3 bf3 memLoad number=16
Xw3 bt3 bt4 bf3 bf4 clk wire_precharge len='lw/4' wid='wirew' Xw3 bt3 bt4 bf3 bf4 clk wire_precharge len='lw/4' wid='wirew'
Xmd3 bt4 bf4 memLoad number=16 Xmd3 bt4 bf4 memLoad number=16
Xw4 bt4 btt bf4 bff clk wire_precharge len='lw/4' wid='wirew' Xw4 bt4 btt bf4 bff clk wire_precharge len='lw/4' wid='wirew'
Xmd4 btt bff memLoad number =16 Xmd4 bt3 bf3 memLoad number =16
Xla bt1 bf1 choose mem1 * Xla bt1 bf1 choose mem1
* Xla bt3 bf3 choose mem1
Xla btt bff choose mem1
Xrd btt bff set rst rdwf clk choose iReadSub Xrd btt bff set rst rdwf clk choose iReadSub
Xrc dot set rst vdd vdd vdd vdd vdd vdd readCollect Xrc dot set rst vdd vdd vdd vdd vdd vdd readCollect

View File

@@ -1,5 +1,13 @@
* [x] Figure out the weird opAmp behavior * [x] Figure out the weird opAmp behavior
* [ ] Design cell with strict metal policies * [x] Design cell with strict metal policies
* [ ] Add precharger version of memory cell (or explain how they compose) * [x] Add precharger version of memory cell (or explain how they compose)
* [ ] Test cell in the _middle_. * [x] Test cell in the _middle_.
* [ ] Walk through the consequences of the read/write block being in the middle. * [x] Walk through the consequences of the read/write block being in the middle.
* [x] Figure out what to do with flopped write block.
* [x] Test data close to write block (it pulls up past clock low!)
* [ ] Drive wires to zero?
* [x] Add missing well connection in layout
* [x] Make sure width isn't too horrible
* [ ] Model additional delay for read read/write block select?
* [x] Model worst case of decoder
* [x] Cite [this](https://ieeexplore.ieee.org/document/210039)

BIN
final/toplevel.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 149 KiB

BIN
final/toplevel_design.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 192 KiB

BIN
final/write.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 13 KiB