Grammar pass

Signed-off-by: Danila Fedorin <danila.fedorin@gmail.com>
2026-04-05 16:09:27 -07:00
parent 767545dda4
commit aabbc66bb2
2 changed files with 13 additions and 13 deletions

@@ -18,12 +18,12 @@ and, viewed from this perspective, I think the experience has been a
colossal success. colossal success.
As someone who works on software, I am always reminded that end-users rarely As someone who works on software, I am always reminded that end-users rarely
care about the technology as much as us technologists; they care about care about the technology as much as we technologists; they care about
having their problems solved. I find taking that perspective to be challenging having their problems solved. I find taking that perspective to be challenging
(though valuable) because software is my craft, and because in thinking (though valuable) because software is my craft, and because in thinking
about the solution, I have to think about the elements that bring it to life. about the solution, I have to think about the elements that bring it to life.
With LLMs, I was able --- allowed? --- to view things more so from the With LLMs, I was able --- allowed? --- to view things more from the
end-user perspective. I didn't know, and didn't need to know, the API end-user perspective. I didn't know, and didn't need to know, the API
for `PyMuPDF`, `argostranslate`, or `spaCy`. I didn't need to understand for `PyMuPDF`, `argostranslate`, or `spaCy`. I didn't need to understand
the PDF format. I could move one step away from the nitty-gritty and focus the PDF format. I could move one step away from the nitty-gritty and focus
@@ -45,7 +45,7 @@ on code as a medium.
There are two perspectives through which one may view software: There are two perspectives through which one may view software:
as a craft in and of itself, and as a means to some end. as a craft in and of itself, and as a means to some end.
My flashcard extractor can be viewed in vastly different ways when faced My flashcard extractor can be viewed in vastly different ways when faced
from these two perspective. In terms of craft, I think that it is at best from these two perspectives. In terms of craft, I think that it is at best
mediocre; most of the code is generated, slightly verbose and somewhat mediocre; most of the code is generated, slightly verbose and somewhat
tedious. The codebase is far from inspiring, and if I had written it by hand, tedious. The codebase is far from inspiring, and if I had written it by hand,
I would not be particularly proud of it. In terms of product, though, I would not be particularly proud of it. In terms of product, though,
@@ -57,7 +57,7 @@ The truth is, the "builder vs. craftsman" distinction is a simplifying one,
another in the long line of "us vs. them" classifications. Any one person is another in the long line of "us vs. them" classifications. Any one person is
capable of being any combination of these two camps at any given time. Indeed, capable of being any combination of these two camps at any given time. Indeed,
different sorts of software demand to be viewed through different lenses. different sorts of software demand to be viewed through different lenses.
I will _still_ treat work on my long-term projects as craft, because I will _still_ treat work on my long-term projects as a craft, because
I will come back to it again and again, and because our craft has evolved I will come back to it again and again, and because our craft has evolved
to engender stability and maintainability. to engender stability and maintainability.
@@ -93,11 +93,11 @@ I think that my flashcard generator is an early instance of such software.
It doesn't worry about various book formats, or various languages, or It doesn't worry about various book formats, or various languages, or
various page layouts. The heuristic was tweaked to fit my use case, and various page layouts. The heuristic was tweaked to fit my use case, and
now works 100% of the time. I understand the software in its entirety. now works 100% of the time. I understand the software in its entirety.
I thought about sharing it --- and, in way, I did, since it's I thought about sharing it --- and, in a way, I did, since it's
[open source](https://dev.danilafe.com/DanilaFe/vocab-builder) --- but realized [open source](https://dev.danilafe.com/DanilaFe/vocab-builder) --- but realized
that outside of the constraints of my own problem, it likely will not be that outside of the constraints of my own problem, it likely will not be
of that much use. I _could_ experiment with more varied constraints, but of that much use. I _could_ experiment with more varied constraints, but
that would turn in back into the sort of software I discussed above: that would turn it back into the sort of software I discussed above:
general, robust, and complex. general, robust, and complex.
Today, I think that there is a whole class of software that is amenable to Today, I think that there is a whole class of software that is amenable to
@@ -112,7 +112,7 @@ if I had to give a rough heuristic, it would be problems that:
etc. significantly raises the bar for quality. etc. significantly raises the bar for quality.
* e.g., I collect flashcards once every two weeks; * e.g., I collect flashcards once every two weeks;
I organize my filesystem once a month; I don't spend nearly enough money I organize my filesystem once a month; I don't spend nearly enough money
to want to re-generate cash flow charts very often to want to regenerate cash flow charts very often
* __have an "answer" that's relatively easy to assess__, because * __have an "answer" that's relatively easy to assess__, because
LLMs are not perfect and iteration must be possible and easy. LLMs are not perfect and iteration must be possible and easy.
* e.g., I can see that all the underlined words are listed in my web app; * e.g., I can see that all the underlined words are listed in my web app;
@@ -137,7 +137,7 @@ with others --- that last one because they can just ask as well.
#### The Unfair Advantage of Being Technical #### The Unfair Advantage of Being Technical
I recognize that my success described here did not come for free. There I recognize that my success described here did not come for free. There
were numerous parts of the process where my software background helped were numerous parts of the process where my software background helped
get the most out of Codex. me get the most out of Codex.
For one thing, writing software trains us to think precisely about problems. For one thing, writing software trains us to think precisely about problems.
We learn to state exactly what we want, to decompose tasks into steps, We learn to state exactly what we want, to decompose tasks into steps,

@@ -26,14 +26,14 @@ the latter was unpleasant, making me constantly break from the prose
In the end, I decided to underline the words, and come back to them later. In the end, I decided to underline the words, and come back to them later.
However, even then, the task is fairly arduous. For one, words I don't recognize However, even then, the task is fairly arduous. For one, words I don't recognize
aren't always in their canonical form (they can conjugated, plural, compound, aren't always in their canonical form (they can be conjugated, plural, compound,
and more): I have to spend some time deciphering what I should add to a and more): I have to spend some time deciphering what I should add to a
flashcard. For another, I had to bounce between a PDF of my book flashcard. For another, I had to bounce between a PDF of my book
(from where, fortunately, I can copy-paste) and my computer. Often, a word (from where, fortunately, I can copy-paste) and my computer. Often, a word
confused the translation software out of context, so I had to copy more of the confused the translation software out of context, so I had to copy more of the
surrounding text. Finally, I learned that given these limitations, the pace of surrounding text. Finally, I learned that given these limitations, the pace of
my reading far exceeds the rate of my translation. This led me to underline my reading far exceeds the rate of my translation. This led me to underline
less words. fewer words.
I thought, I thought,
@@ -60,10 +60,10 @@ interleaved with the technical details.
### The Core Solution ### The Core Solution
The core idea has always been: The core idea has always been:
1. Find thing that look like underlines 1. Find things that look like underlines
2. See which words they correspond to 2. See which words they correspond to
3. Perform {{< sidenote "right" "lemmatization-node" "lemmatization" >}} 3. Perform {{< sidenote "right" "lemmatization-node" "lemmatization" >}}
Lemmatization (<a href="https://en.wikipedia.org/wiki/Lemmatization">wikipedia</a>) is the Lemmatization (<a href="https://en.wikipedia.org/wiki/Lemmatization">Wikipedia</a>) is the
process of turning non-canonical forms of words (like <code>am</code> (eng) / process of turning non-canonical forms of words (like <code>am</code> (eng) /
<code>suis</code> (fr)) into their canonical form which might be found in the <code>suis</code> (fr)) into their canonical form which might be found in the
dictionary (<code>to be</code> / <code>être</code>). dictionary (<code>to be</code> / <code>être</code>).
@@ -182,7 +182,7 @@ plenty of Flask applications in Codex's training dataset. In one shot,
it generated a little web application that enabled me to tweak the source word it generated a little web application that enabled me to tweak the source word
and final translation. It also enabled me to throw away certain underlines. and final translation. It also enabled me to throw away certain underlines.
This was useful when, across different sessions, I forgot and underlined This was useful when, across different sessions, I forgot and underlined
the same word, or when I underlined a word but later decided it not worth the same word, or when I underlined a word but later decided it was not worth
including in my studying. This application produced an Anki deck, using including in my studying. This application produced an Anki deck, using
the Python library [`genanki`](https://github.com/kerrickstaley/genanki). the Python library [`genanki`](https://github.com/kerrickstaley/genanki).
Anki has a nice mechanism to de-duplicate decks, which meant that every Anki has a nice mechanism to de-duplicate decks, which meant that every