From d696183690a4f3a46b9330bf3e431886962b9935 Mon Sep 17 00:00:00 2001
From: Danila Fedorin
Date: Sun, 5 Apr 2026 16:00:31 -0700
Subject: [PATCH] Split into two files

Signed-off-by: Danila Fedorin
---
 content/blog/llm_personal_software/index.md   | 190 ++++++++++++++++++
 content/blog/pdf_flashcards_llm/index.md      | 184 +----------------
 .../_index.md                                 |  10 +
 3 files changed, 203 insertions(+), 181 deletions(-)
 create mode 100644 content/blog/llm_personal_software/index.md
 create mode 100644 content/series/llm-assisted-flashcard-generator/_index.md

diff --git a/content/blog/llm_personal_software/index.md b/content/blog/llm_personal_software/index.md
new file mode 100644
index 0000000..7882a63
--- /dev/null
+++ b/content/blog/llm_personal_software/index.md
@@ -0,0 +1,190 @@
+---
+title: "Personal Software with the Help of LLMs"
+date: 2026-04-05T15:43:26-07:00
+tags: ["LLMs"]
+draft: true
+series: ["LLM-Assisted Flashcard Generator"]
+---
+
+In [the previous post in this series]({{< relref "pdf_flashcards_llm" >}}),
+I wrote about a little utility I created for detecting underlined words
+in a book and creating vocabulary study material for them.
+Like I mentioned earlier, this was one of my earliest experiences with
+LLM-driven development, and I think it shaped my outlook on the technology
+quite a bit. For me, the bottom line is this: _with LLMs, I was able to
+rapidly solve a problem that was holding me back in another area of my life_.
+My goal was never to "produce software", but to "acquire vocabulary",
+and, viewed from this perspective, I think the experience has been a
+colossal success.
+
+As someone who works on software, I am always reminded that end-users rarely
+care about the technology as much as us technologists; they care about
+having their problems solved. I find taking that perspective to be challenging
+(though valuable) because software is my craft, and because in thinking
+about the solution, I have to think about the elements that bring it to life.
+
+With LLMs, I was able --- allowed? --- to view things more so from the
+end-user perspective. I didn't know, and didn't need to know, the API
+for `PyMuPDF`, `argostranslate`, or `spaCy`. I didn't need to understand
+the PDF format. I could move one step away from the nitty-gritty and focus
+on the 'why' and the 'what', on the challenge of what I wanted to accomplish.
+I wrestled with the inherent complexity and
+avoided altogether the unrelated difficulties that merely happened to be
+there (downloading language modules; learning translation APIs; etc.)
+
+By enabling me to do this, the LLM let me make rapid progress, and produce
+solutions to problems I would've previously deemed "too hard" or "too tedious".
+This did, however, markedly reduce the care with which I was examining
+the output. I don't think I've _ever_ read the code that produces the
+pretty colored boxes in my program's debug output. This shift, I think,
+has been a divisive element of AI discourse in technical communities.
+I think that this has to do, at least in part, with different views
+on code as a medium.
+
+#### The Builders and the Craftsmen
+There are two perspectives through which one may view software:
+as a craft in and of itself, and as a means to some end.
+My flashcard extractor can be viewed in vastly different ways when approached
+from these two perspectives. In terms of craft, I think that it is at best
+mediocre; most of the code is generated, slightly verbose and somewhat
+tedious. The codebase is far from inspiring, and if I had written it by hand,
+I would not be particularly proud of it. In terms of product, though,
+I think it tells an exciting story: here I am, reading Camus again, because
+I was able to improve the workflow around said reading. In a day, I was able
+to achieve what I couldn't muster in a year or two on my own.
+
+The truth is, the "builder vs. craftsman" distinction is a simplifying one,
+another in the long line of "us vs. them" classifications. Any one person is
+capable of being any combination of these two camps at any given time. Indeed,
+different sorts of software demand to be viewed through different lenses.
+I will _still_ treat work on my long-term projects as craft, because
+I will come back to it again and again, and because our craft has evolved
+to engender stability and maintainability.
+
+However, I am more than happy to settle for 'underwhelming' when it means an
+individual need of mine can be addressed in record time. I think this
+gives rise to a new sort of software: highly individual, explicitly
+non-robust, and treated differently from software crafted with
+deliberate thought and foresight.
+
+#### Personal Software
+
+I think as time goes on, I am becoming more and more convinced by the idea
+of "personal software". One might argue that much of the complexity in many
+pieces of software is driven by the need of that software to accommodate
+the diverse needs of many users. Still, software remains somewhat inflexible and
+unable to accommodate individual needs. Features or uses that demand
+changes at the software level move at a slower pace: finite developer time
+needs to be spent analyzing what users need, determining the costs of this new
+functionality, choosing which of the many possible requests to fulfill.
+On the other hand, software that enables the users to build their customizations
+for themselves, by exposing numerous configuration options and abstractions,
+becomes, over time, very complicated to grasp.
+
+Now, suppose that the complexity of such software scales superlinearly with
+the number of features it provides. Suppose also that individual users
+leverage only a small subset of the software's functionality. From these
+assumptions it would follow that individual programs, made to serve a single
+user's need, would be significantly less complicated than the "whole".
+By definition, these programs would also be better tailored to the users'
+needs. With LLMs, we're getting to a future where this might be possible.
+
+I think that my flashcard generator is an early instance of such software.
+It doesn't worry about various book formats, or various languages, or
+various page layouts. The heuristic was tweaked to fit my use case, and
+now works 100% of the time. I understand the software in its entirety.
+I thought about sharing it --- and, in a way, I did, since it's
+[open source](https://dev.danilafe.com/DanilaFe/vocab-builder) --- but realized
+that outside of the constraints of my own problem, it likely will not be
+of that much use. I _could_ experiment with more varied constraints, but
+that would turn it back into the sort of software I discussed above:
+general, robust, and complex.
+
+Today, I think that there is a whole class of software that is amenable to
+being "personal". My flashcard generator is one such piece of software;
+I imagine file-organization (as served by many "bulk rename and move" pieces
+of software out there), video wrangling (possible today with `ffmpeg`'s
+myriad of flags and switches), and data visualization to be other
+instances of problems in that class. I am merely intuiting here, but
+if I had to give a rough heuristic, it would be problems that:
+
+* __fulfill an infrequent need__, because frequent use brings concerns like
+  availability, deployment, etc., which significantly raise the bar for quality.
+  * e.g., I collect flashcards once every two weeks;
+    I organize my filesystem once a month; I don't spend nearly enough money
+    to want to re-generate cash flow charts very often
+* __have an "answer" that's relatively easy to assess__, because
+  LLMs are not perfect and iteration must be possible and easy.
+  * e.g., I can see that all the underlined words are listed in my web app;
+    I know that my files are in the right folders, named appropriately,
+    by inspection; my charts seem to track with reality
+* __have a relatively complex technical implementation__, because
+  why would you bother invoking an LLM if you can "just" click a button somewhere?
+  * e.g., extracting data from PDFs requires some wrangling;
+    bulk-renaming files requires some tedious and possibly case-specific
+    pattern matching; cash flow between N accounts requires some graph
+    analysis
+* __have relatively low stakes__, again, because LLMs are not perfect,
+  and nor is (necessarily) one's understanding of the problem.
+  * e.g., it's OK if I miss some words I underlined; my cash flow
+    charts only give me an impression of my spending;
+  * I recognize that moving files is a potentially destructive operation.
+
+I dream of a world in which, to make use of my hardware, I just _ask_,
+and don't worry much about languages, frameworks, or sharing my solution
+with others --- that last one because they can just ask as well.
+
+#### The Unfair Advantage of Being Technical
+I recognize that my success described here did not come for free. There
+were numerous parts of the process where my software background helped
+get the most out of Codex.
+
+For one thing, writing software trains us to think precisely about problems.
+We learn to state exactly what we want, to decompose tasks into steps,
+and to intuit the exact size of these steps; to know what's hard and what's
+easy for the machine. When working with an LLM, these skills make it possible
+to hit the ground running, to know what to ask and to help pluck out a particular
+solution from the space of various approaches. I think that this makes LLMs
+far more effective for us than for experts in non-technical fields.
+
+For another, the boundary between 'manual' and 'automatic' is not always consistent.
+Though I didn't touch any of the `PyMuPDF` code, I did need to look fairly
+closely at the logic that classified my squiggles as "underlines" and found
+associated words. It was not enough to treat LLM-generated code as a black box.
+
+Another advantage software folks have when leveraging LLMs is the established
+rigor of software development. LLMs can and do make mistakes, but so do people.
+Our field has been built around reducing these mistakes' impact and frequency.
+Knowing to use version control helps turn the pathological downward spiral
+of accumulating incorrect tweaks into monotonic, step-wise improvements.
+Knowing how to construct a test suite and thinking about edge cases can
+provide an LLM agent with the grounding it needs to iterate rapidly and safely.
+
+In this way, I think the dream of personal software is far from being realized
+for the general public. Without the foundation of experience and rigor,
+LLM-driven development can easily devolve into a frustrating and endless
+back-and-forth, or worse, successfully build software that is subtly and
+convincingly wrong.
+
+#### The Shoulders of Giants
+
+The only reason all of this was possible is that the authors of `PyMuPDF`,
+`genanki`, `spaCy`, and `argos-translate` made them available for me to use from
+my code. These libraries provided the bulk of the functionality that Codex and I
+were able to glue into a final product. It would be a mistake to forget this,
+and to confuse the sustained, thoughtful efforts of the people behind these
+projects for the one-off, hyper-specific software I've been talking about.
+
+We need these packages, and others like them, to provide a foundation for the
+things we build. They bring stability, reuse, and the sort of cohesion that
+is not possible through an amalgamation of home-grown personal scripts.
+In my view, something like `spaCy` is to my flashcard script as a brick is to
+grout. There is a fundamental difference.
+
+I don't know how LLMs will integrate into the future of large-scale software
+development. The discipline becomes something else entirely when the
+constraints of "personal software" I floated above cease to apply. Though
+LLMs can still enable doing what was previously too difficult, tedious,
+or time-consuming (like my little 'underline visualizer'), it remains
+to be seen how to integrate this new ease into the software lifecycle
+without threatening its future.
diff --git a/content/blog/pdf_flashcards_llm/index.md b/content/blog/pdf_flashcards_llm/index.md
index e7fcef4..9676cc5 100644
--- a/content/blog/pdf_flashcards_llm/index.md
+++ b/content/blog/pdf_flashcards_llm/index.md
@@ -3,6 +3,7 @@ title: "Generating Flashcards from PDF Underlines"
 date: 2026-04-04T12:25:14-07:00
 tags: ["LLMs", "Python"]
 draft: true
+series: ["LLM-Assisted Flashcard Generator"]
 ---
 
 __TL;DR__: I, with the help of ChatGPT, wrote a program that helps me
@@ -234,184 +235,5 @@ That said, I think that those features are way beyond the 80:20 transition:
 it would be much harder for me to get to that point, and the benefit would be
 relatively small. Today, I'm happy to stick with what I already have.
 
-### Personal Software with the Help of LLMs
-
-Like I mentioned earlier, this was one of my earliest experiences with
-LLM-driven development, and I think it shaped my outlook on the technology
-quite a bit. For me, the bottom line is this: _with LLMs, I was able to
-rapidly solve a problem that was holding me back in another area of my life_.
-My goal was never to "produce software", but to "acquire vocabulary",
-and, viewed from this perspective, I think the experience has been a
-colossal success.
-
-As someone who works on software, I am always reminded that end-users rarely
-care about the technology as much as us technologists; they care about
-having their problems solved. I find taking that perspective to be challenging
-(though valuable) because software is my craft, and because in thinking
-about the solution, I have to think about the elements that bring it to life.
-
-With LLMs, I was able --- allowed? --- to view things more so from the
-end-user perspective. I didn't know, and didn't need to know, the API
-for `PyMuPDF`, `argostranslate`, or `spaCy`. I didn't need to understand
-the PDF format. I could move one step away from the nitty-gritty and focus
-on the 'why' and the 'what', on the challenge of what I wanted to accomplish.
-I wrestled with the inherent complexity and
-avoided altogether the unrelated difficulties that merely happened to be
-there (downloading language modules; learning translation APIs; etc.)
-
-By enabling me to do this, the LLM let me make rapid progress, and to produce
-solutions to problems I would've previously deemed "too hard" or "too tedious".
-This did, however, markedly reduce the care with which I was examining
-the output. I don't think I've _ever_ read the code that produces the
-pretty colored boxes in my program's debug output. This shift, I think,
-has been a divisive element of AI discourse in technical communities.
-I think that this has to do, at least in part, with different views
-on code as a medium.
-
-#### The Builders and the Craftsmen
-There are two perspectives through which one may view software:
-as a craft in and of itself, and as a means to some end.
-My flashcard extractor can be viewed in vastly different ways when faced
-from these two perspective. In terms of craft, I think that it is at best
-mediocre; most of the code is generated, slightly verbose and somewhat
-tedious. The codebase is far from inspiring, and if I had written it by hand,
-I would not be particularly proud of it. In terms of product, though,
-I think it tells an exciting story: here I am, reading Camus again, because
-I was able to improve the workflow around said reading. In a day, I was able
-to achieve what I couldn't muster in a year or two on my own.
-
-The truth is, the "builder vs. craftsman" distinction is a simplifying one,
-another in the long line of "us vs. them" classifications. Any one person is
-capable of being any combination of these two camps at any given time. Indeed,
-different sorts of software demand to be viewed through different lenses.
-I will _still_ treat work on my long-term projects as craft, because
-I will come back to it again and again, and because our craft has evolved
-to engender stability and maintainability.
-
-However, I am more than happy to settle for 'underwhelming' when it means an
-individual need of mine can be addressed in record time. I think this
-gives rise to a new sort of software: highly individual, explicitly
-non-robust, and treated differently from software crafted with
-deliberate thought and foresight.
-
-#### Personal Software
-
-I think as time goes on, I am becoming more and more convinced by the idea
-of "personal software". One might argue that much of the complexity in many
-pieces of software is driven by the need of that software to accommodate
-the diverse needs of many users. Still, software remains somewhat inflexible and
-unable to accommodate individual needs. Features or uses that demand
-changes at the software level move at a slower pace: finite developer time
-needs to be spent analyzing what users need, determining the costs of this new
-functionality, choosing which of the many possible requests to fulfill.
-On the other hand, software that enables the users to build their customizations
-for themselves, by exposing numerous configuration options and abstractions,
-becomes, over time, very complicated to grasp.
-
-Now, suppose that the complexity of such software scales superlinearly with
-the number of features it provides. Suppose also that individual users
-leverage only a small subset of the software's functionality. From these
-assumptions it would follow that individual programs, made to serve a single
-user's need, would be significantly less complicated than the "whole".
-By definition, these programs would also be better tailored to the users'
-needs. With LLMs, we're getting to a future where this might be possible.
-
-I think that my flashcard generator is an early instance of such software.
-It doesn't worry about various book formats, or various languages, or
-various page layouts. The heuristic was tweaked to fit my use case, and
-now works 100% of the time. I understand the software in its entirety.
-I thought about sharing it --- and, in way, I did, since it's
-[open source](https://dev.danilafe.com/DanilaFe/vocab-builder) --- but realized
-that outside of the constraints of my own problem, it likely will not be
-of that much use. I _could_ experiment with more varied constraints, but
-that would turn in back into the sort of software I discussed above:
-general, robust, and complex.
-
-Today, I think that there is a whole class of software that is amenable to
-being "personal". My flashcard generator is one such piece of software;
-I imagine file-organization (as served by many "bulk rename and move" pieces
-of software out there), video wrangling (possible today with `ffmpeg`'s
-myriad of flags and switches), and data visualization to be other
-instances of problems in that class. I am merely intuiting here, but
-if I had to give a rough heuristic, it would be problems that:
-
-* __fulfill a short-frequency need__, because availability, deployment,
-  etc. significantly raises the bar for quality.
-  * e.g., I collect flashcards once every two weeks;
-    I organize my filesystem once a month; I don't spend nearly enough money
-    to want to re-generate cash flow charts very often
-* __have an "answer" that's relatively easy to assess__, because
-  LLMs are not perfect and iteration must be possible and easy.
-  * e.g., I can see that all the underlined words are listed in my web app;
-    I know that my files are in the right folders, named appropriately,
-    by inspection; my charts seem to track with reality
-* __have a relatively complex technical implementation__, because
-  why would you bother invoking an LLM if you can "just" click a button somewhere?
-  * e.g., extracting data from PDFs requires some wrangling;
-    bulk-renaming files requires some tedious and possibly case-specific
-    pattern matching; cash flow between N accounts requires some graph
-    analysis
-* __have relatively low stakes__, again, because LLMs are not perfect,
-  and nor is (necessarily) one's understanding of the problem.
-  * e.g., it's OK if I miss some words I underlined; my cash flow
-    charts only give me an impression of my spending;
-  * I recognize that moving files is a potentially destructive operation.
-
-I dream of a world in which, to make use of my hardware, I just _ask_,
-and don't worry much about languages, frameworks, or sharing my solution
-with others --- that last one because they can just ask as well.
-
-#### The Unfair Advantage of Being Technical
-I recognize that my success described here did not come for free. There
-were numerous parts of the process where my software background helped
-get the most out of Codex.
-
-For one thing, writing software trains us to think precisely about problems.
-We learn to state exactly what we want, to decompose tasks into steps,
-and to intuit the exact size of these steps; to know what's hard and what's
-easy for the machine. When working with an LLM, these skills make it possible
-to hit the ground running, to know what to ask and to help pluck out a particular
-solution from the space of various approaches. I think that this greatly
-accelerates the effectiveness of using LLMs compared to non-technical experts.
-
-For another, the boundary between 'manual' and 'automatic' is not always consistent.
-Though I didn't touch any of the `PyMuPDF` code, I did need to look fairly
-closely at the logic that classified my squiggles as "underlines" and found
-associated words. It was not enough to treat LLM-generated code as a black box.
-
-Another advantage software folks have when leveraging LLMs is the established
-rigor of software development. LLMs can and do make mistakes, but so do people.
-Our field has been built around reducing these mistakes' impact and frequency.
-Knowing to use version control helps turn the pathological downward spiral
-of accumulating incorrect tweaks into monotonic, step-wise improvements.
-Knowing how to construct a test suite and thinking about edge cases can
-provide an agent LLM the grounding it needs to iterate rapidly and safely.
-
-In this way, I think the dream of personal software is far from being realized
-for the general public. Without the foundation of experience and rigor,
-LLM-driven development can easily devolve into a frustrating and endless
-back-and-forth, or worse, successfully build software that is subtly and
-convincingly wrong.
-
-#### The Shoulders of Giants
-
-The only reason all of this was possible is that the authors of `PyMuPDF`,
-`genanki`, `spaCy`, and `argos-translate` made them available for me to use from
-my code. These libraries provided the bulk of the functionality that Codex and I
-were able to glue into a final product. It would be a mistake to forget this,
-and to confuse the sustained, thoughtful efforts of the people behind these
-projects for the one-off, hyper-specific software I've been talking about.
-
-We need these packages, and others like them, to provide a foundation for the
-things we build. They bring stability, reuse, and the sort of cohesion that
-is not possible through an amalgamation of home-grown personal scripts.
-In my view, something like `spaCy` is to my flashcard script as a brick is to
-grout. There is a fundamental difference.
-
-I don't know how LLMs will integrate into the future of large-scale software
-development. The discipline becomes something else entirely when the
-constraints of "personal software" I floated above cease to apply. Though
-LLMs can still enable doing what was previously too difficult, tedious,
-or time consuming (like my little 'underline visualizer'), it remains
-to be seen how to integrate this new ease into the software lifecycle
-without threatening its future.
+In the [next part of this series]({{< relref "llm_personal_software" >}}),
+I will talk more about how this project influenced my views on LLMs.
diff --git a/content/series/llm-assisted-flashcard-generator/_index.md b/content/series/llm-assisted-flashcard-generator/_index.md
new file mode 100644
index 0000000..27e5549
--- /dev/null
+++ b/content/series/llm-assisted-flashcard-generator/_index.md
@@ -0,0 +1,10 @@
++++
+title = "LLM-Assisted Flashcard Generator"
+summary = """
+  In this series, I write up a little program I wrote for myself,
+  which detects vocabulary words I underline in a book and turns them
+  into flashcards. I view this through the lens of a first foray into
+  development that heavily relies on LLMs.
+  """
+status = "complete"
++++