---
title: "Personal Software with the Help of LLMs"
date: 2026-04-05T16:03:00-07:00
tags: ["LLMs"]
series: ["LLM-Assisted Flashcard Generator"]
description: "In this post, I describe an inherently individual and outcome-focused class of LLM-generated software"
---

In [the previous post in this series]({{< relref "pdf_flashcards_llm" >}}),
I wrote about a little utility I created for detecting underlined words
in a book and creating vocabulary study material for them.
As I mentioned earlier, this was one of my earliest experiences with
LLM-driven development, and I think it shaped my outlook on the technology
quite a bit. For me, the bottom line is this: _with LLMs, I was able to
rapidly solve a problem that was holding me back in another area of my life_.
My goal was never to "produce software", but to "acquire vocabulary",
and, viewed from this perspective, I think the experience has been a
colossal success.

As someone who works on software, I am always reminded that end-users rarely
care about the technology as much as we technologists do; they care about
having their problems solved. I find taking that perspective to be challenging
(though valuable) because software is my craft, and because in thinking
about the solution, I have to think about the elements that bring it to life.

With LLMs, I was able --- allowed? --- to view things more from the
end-user perspective. I didn't know, and didn't need to know, the API
for `PyMuPDF`, `argostranslate`, or `spaCy`. I didn't need to understand
the PDF format. I could move one step away from the nitty-gritty and focus
on the 'why' and the 'what', on the challenge of what I wanted to accomplish.
I wrestled with the inherent complexity and
avoided altogether the unrelated difficulties that merely happened to be
there (downloading language modules; learning translation APIs; etc.).

By enabling me to do this, the LLM let me make rapid progress and produce
solutions to problems I would've previously deemed "too hard" or "too tedious".
This did, however, markedly reduce the care with which I was examining
the output. I don't think I've _ever_ read the code that produces the
pretty colored boxes in my program's debug output. This shift, I think,
has been a divisive element of AI discourse in technical communities.
I think that this has to do, at least in part, with different views
on code as a medium.

#### The Builders and the Craftsmen
There are two perspectives through which one may view software:
as a craft in and of itself, and as a means to some end.
My flashcard extractor looks vastly different when viewed
from these two perspectives. In terms of craft, I think that it is at best
mediocre; most of the code is generated, slightly verbose and somewhat
tedious. The codebase is far from inspiring, and if I had written it by hand,
I would not be particularly proud of it. In terms of product, though,
I think it tells an exciting story: here I am, reading Camus again, because
I was able to improve the workflow around said reading. In a day, I was able
to achieve what I couldn't muster in a year or two on my own.

The truth is, the "builder vs. craftsman" distinction is a simplifying one,
another in the long line of "us vs. them" classifications. Any one person is
capable of belonging to either camp, or to both, at any given time. Indeed,
different sorts of software demand to be viewed through different lenses.
I will _still_ treat work on my long-term projects as a craft, because
I will come back to it again and again, and because our craft has evolved
to engender stability and maintainability.

However, I am more than happy to settle for 'underwhelming' when it means an
individual need of mine can be addressed in record time. I think this
gives rise to a new sort of software: highly individual, explicitly
non-robust, and treated differently from software crafted with
deliberate thought and foresight.

#### Personal Software

I think as time goes on, I am becoming more and more convinced by the idea
of "personal software". One might argue that much of the complexity in many
pieces of software is driven by the need of that software to accommodate
the diverse needs of many users. Still, software remains somewhat inflexible and
unable to accommodate individual needs. Features or uses that demand
changes at the software level move at a slower pace: finite developer time
needs to be spent analyzing what users need, determining the costs of this new
functionality, choosing which of the many possible requests to fulfill.
On the other hand, software that enables the users to build their customizations
for themselves, by exposing numerous configuration options and abstractions,
becomes, over time, very complicated to grasp.

Now, suppose that the complexity of such software scales superlinearly with
the number of features it provides. Suppose also that individual users
leverage only a small subset of the software's functionality. From these
assumptions it would follow that individual programs, made to serve a single
user's need, would be significantly less complicated than the "whole".
By definition, these programs would also be better tailored to the users'
needs. With LLMs, we're getting to a future where this might be possible.
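
To make the arithmetic behind this argument concrete, here is a small sketch.
It assumes, purely for illustration, that complexity follows a power law in the
number of features, with an exponent above 1. The `complexity` function, the
exponent of 1.5, and the feature counts are all made-up values, not
measurements of any real program.

```python
def complexity(features: int, exponent: float = 1.5) -> float:
    """Hypothetical complexity model: superlinear in the feature count."""
    return features ** exponent


# Illustrative numbers only: a general tool with 100 features,
# of which a single user relies on 10.
general = complexity(100)
personal = complexity(10)

# Under this model, the personal program's share of the complexity is
# smaller than its share of the features.
print(f"general: {general:.0f}, personal: {personal:.1f}")
print(f"complexity share: {personal / general:.1%}")
print(f"feature share: {10 / 100:.0%}")
```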

I think that my flashcard generator is an early instance of such software.
It doesn't worry about various book formats, or various languages, or
various page layouts. The heuristic was tweaked to fit my use case, and
now works 100% of the time. I understand the software in its entirety.
I thought about sharing it --- and, in a way, I did, since it's
[open source](https://dev.danilafe.com/DanilaFe/vocab-builder) --- but realized
that outside of the constraints of my own problem, it likely will not be
of that much use. I _could_ experiment with more varied constraints, but
that would turn it back into the sort of software I discussed above:
general, robust, and complex.

Today, I think that there is a whole class of software that is amenable to
being "personal". My flashcard generator is one such piece of software;
I imagine file-organization (as served by many "bulk rename and move" pieces
of software out there), video wrangling (possible today with `ffmpeg`'s
myriad of flags and switches), and data visualization to be other
instances of problems in that class. I am merely intuiting here, but
if I had to give a rough heuristic, it would be problems that:

* __fulfill a low-frequency need__, because needs that come up often bring
  availability, deployment, etc. into play, which significantly raises the
  bar for quality.
    * e.g., I collect flashcards once every two weeks;
      I organize my filesystem once a month; I don't spend nearly enough money
      to want to regenerate cash flow charts very often
* __have an "answer" that's relatively easy to assess__, because
  LLMs are not perfect and iteration must be possible and easy.
    * e.g., I can see that all the underlined words are listed in my web app;
      I know that my files are in the right folders, named appropriately,
      by inspection; my charts seem to track with reality
* __have a relatively complex technical implementation__, because
  why would you bother invoking an LLM if you can "just" click a button somewhere?
    * e.g., extracting data from PDFs requires some wrangling;
      bulk-renaming files requires some tedious and possibly case-specific
      pattern matching; cash flow between N accounts requires some graph
      analysis
* __have relatively low stakes__, again, because LLMs are not perfect,
  and nor is (necessarily) one's understanding of the problem.
    * e.g., it's OK if I miss some words I underlined; my cash flow
      charts only give me an impression of my spending
    * I recognize that moving files is a potentially destructive operation.

I dream of a world in which, to make use of my hardware, I just _ask_,
and don't worry much about languages, frameworks, or sharing my solution
with others --- that last one because they can just ask as well.
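
As an illustration of the "bulk rename" case from the list above, here is a
minimal sketch of what such a one-off script might look like. The filename
pattern (`IMG_YYYYMMDD_NNNN.jpg`), the target layout, and the function names
are hypothetical, chosen only for this example. The script defaults to a dry
run, so the plan can be checked by inspection before any file is moved.

```python
import re
from pathlib import Path

# Hypothetical, case-specific pattern: IMG_20240131_0042.jpg
PATTERN = re.compile(r"IMG_(\d{4})(\d{2})(\d{2})_(\d+)\.jpg")


def plan_renames(names: list[str]) -> list[tuple[str, str]]:
    """Return (old, new) pairs for names that match; skip everything else."""
    plan = []
    for name in names:
        match = PATTERN.fullmatch(name)
        if match:
            year, month, day, seq = match.groups()
            plan.append((name, f"{year}/{month}/{year}-{month}-{day}_{seq}.jpg"))
    return plan


def apply_plan(root: Path, plan: list[tuple[str, str]], dry_run: bool = True) -> None:
    """Print the plan; only touch the filesystem when dry_run is False."""
    for old, new in plan:
        print(f"{old} -> {new}")
        if not dry_run:
            target = root / new
            if target.exists():
                raise FileExistsError(target)  # never overwrite silently
            target.parent.mkdir(parents=True, exist_ok=True)
            (root / old).rename(target)
```

Calling `apply_plan(root, plan)` only prints the proposed moves; passing
`dry_run=False` performs them.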

#### The Unfair Advantage of Being Technical
I recognize that my success described here did not come for free. There
were numerous parts of the process where my software background helped
me get the most out of Codex.

For one thing, writing software trains us to think precisely about problems.
We learn to state exactly what we want, to decompose tasks into steps,
and to intuit the exact size of these steps; to know what's hard and what's
easy for the machine. When working with an LLM, these skills make it possible
to hit the ground running, to know what to ask and to help pluck out a particular
solution from the space of various approaches. I think that this makes people
with a software background far more effective with LLMs than experts from
non-technical fields.

For another, the boundary between 'manual' and 'automatic' is not always consistent.
Though I didn't touch any of the `PyMuPDF` code, I did need to look fairly
closely at the logic that classified my squiggles as "underlines" and found
associated words. It was not enough to treat LLM-generated code as a black box.
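
To give a sense of the kind of logic involved, here is a minimal,
geometry-only sketch of an underline heuristic. It is not the code from the
vocab-builder repository: the `Box` type, the function names, and both
thresholds are assumptions made for this example, and it presumes that word
and stroke bounding boxes have already been extracted from the page, with y
growing downward.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle; y grows downward (an assumption of this sketch)."""
    x0: float
    y0: float
    x1: float
    y1: float


def is_underlined(word: Box, stroke: Box,
                  max_gap: float = 4.0, min_overlap: float = 0.5) -> bool:
    """Guess whether a hand-drawn stroke underlines a word.

    Both thresholds are made-up values that would need tuning per book.
    """
    # The stroke should be wide and flat, not a tall scribble or margin note.
    if (stroke.y1 - stroke.y0) > (stroke.x1 - stroke.x0):
        return False
    # Its top edge should sit near the word's bottom edge.
    gap = stroke.y0 - word.y1
    if not (-max_gap <= gap <= max_gap):
        return False
    # It should cover at least `min_overlap` of the word's width.
    overlap = min(word.x1, stroke.x1) - max(word.x0, stroke.x0)
    return overlap >= min_overlap * (word.x1 - word.x0)


def underlined_words(words: dict[str, Box], strokes: list[Box]) -> list[str]:
    """Return the words matched by at least one stroke, in input order."""
    return [text for text, box in words.items()
            if any(is_underlined(box, s) for s in strokes)]
```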

Another advantage software folks have when leveraging LLMs is the established
rigor of software development. LLMs can and do make mistakes, but so do people.
Our field has been built around reducing these mistakes' impact and frequency.
Knowing to use version control helps turn the pathological downward spiral
of accumulating incorrect tweaks into monotonic, step-wise improvements.
Knowing how to construct a test suite and thinking about edge cases can
provide an LLM agent with the grounding it needs to iterate rapidly and safely.

In this way, I think the dream of personal software is far from being realized
for the general public. Without the foundation of experience and rigor,
LLM-driven development can easily devolve into a frustrating and endless
back-and-forth, or worse, successfully build software that is subtly and
convincingly wrong.

#### The Shoulders of Giants

The only reason all of this was possible is that the authors of `PyMuPDF`,
`genanki`, `spaCy`, and `argos-translate` made them available for me to use from
my code. These libraries provided the bulk of the functionality that Codex and I
were able to glue into a final product. It would be a mistake to forget this,
and to confuse the sustained, thoughtful efforts of the people behind these
projects for the one-off, hyper-specific software I've been talking about.

We need these packages, and others like them, to provide a foundation for the
things we build. They bring stability, reuse, and the sort of cohesion that
is not possible through an amalgamation of home-grown personal scripts.
In my view, something like `spaCy` is to my flashcard script as a brick is to
grout. There is a fundamental difference.

I don't know how LLMs will integrate into the future of large-scale software
development. The discipline becomes something else entirely when the
constraints of "personal software" I floated above cease to apply. Though
LLMs can still enable doing what was previously too difficult, tedious,
or time-consuming (like my little 'underline visualizer'), it remains
to be seen how to integrate this new ease into the software lifecycle
without threatening its future.