Bring day 7 up to date with the blog version
parent
268b6b24b4
commit
53172351df
507
day7.chpl
|
@ -14,7 +14,7 @@
|
||||||
/*
|
/*
|
||||||
### The Task at Hand and My Approach
|
### The Task at Hand and My Approach
|
||||||
|
|
||||||
In today's puzzle, we are given a list of terminal-like commands (
|
In [today's puzzle](https://adventofcode.com/2022/day/7), we are given a list of terminal-like commands (
|
||||||
[`ls`](https://man7.org/linux/man-pages/man1/ls.1.html) and [`cd`](https://man7.org/linux/man-pages/man1/cd.1p.html)
|
[`ls`](https://man7.org/linux/man-pages/man1/ls.1.html) and [`cd`](https://man7.org/linux/man-pages/man1/cd.1p.html)
|
||||||
), as well as output corresponding to running these commands. The commands
|
), as well as output corresponding to running these commands. The commands
|
||||||
explore a fictional file system, which can have files (objects with size)
|
explore a fictional file system, which can have files (objects with size)
|
||||||
|
@ -23,35 +23,420 @@
|
||||||
sizes of all folders that are smaller than a particular threshold.
|
sizes of all folders that are smaller than a particular threshold.
|
||||||
|
|
||||||
The tree-like nature of the file system does not make it amenable to
|
The tree-like nature of the file system does not make it amenable to
|
||||||
representations based on arrays, lists, and maps alone. The trouble with
|
representations based on arrays, lists, or maps alone. The trouble with
|
||||||
these data types is that they're flat. Our input could -- and will -- have arbitrary
|
these data types is that they're flat. Our input could have arbitrary
|
||||||
levels of nested directories. However, arrays, lists, and maps cannot have
|
levels of nested directories. However, arrays, lists, and maps cannot have
|
||||||
such arbitrary nesting -- we'd need something like a list of lists of lists...
|
such arbitrary nesting --- we'd need something like a list of lists of lists of...
|
||||||
We could, of course, use the `map` and `list` data types to represent the
|
We could, of course, use the `map` and `list` data types to represent the
|
||||||
file system with some sort of [adjacency list](https://en.wikipedia.org/wiki/Adjacency_list).
|
file system with some sort of [adjacency list](https://en.wikipedia.org/wiki/Adjacency_list).
|
||||||
However, such an implementation would be somewhat clunky and hard to use.
|
However, such an implementation would be somewhat clunky and hard to use.
|
||||||
|
|
||||||
Instead, we'll use a different tool from the repertoire of Chapel language
|
Instead, in this article I use a different tool from the repertoire of Chapel language
|
||||||
features, one we haven't seen so far: classes. Much like in most languages,
|
features, one we haven't seen so far: classes. Specifically, I use a class, `Dir`, to represent
|
||||||
classes are a way to group together related pieces of data. up until now,
|
directories in the file system, and build up a tree of these directories
|
||||||
we've used tuples for this purpose.
|
while reading the input. I then create an iterator over this tree that
|
||||||
|
computes and yields the sizes of the folders. From there, it's easy to
|
||||||
|
pick out all directory sizes smaller than the threshold and sum them up.
|
||||||
|
|
||||||
|
**If you skip right to your favorite parts of a movie, here's a full solution for the day:**
|
||||||
|
{{< whole_file_min >}}
|
||||||
|
|
||||||
|
And now, on to the explanation train. Before the train departs, let's import
|
||||||
|
a few of the modules we'll use today. `IO` is a permanent fixture in our
|
||||||
|
solutions (we always need to read input!), and `List` is a familiar face.
|
||||||
|
The only newcomer here is `Map`, which helps us associate keys with values,
|
||||||
|
much like a dictionary in Python, a hash in Ruby, or a map in C++.
|
||||||
|
We'll use maps and lists for storing the various files and directories
|
||||||
|
on the file system.
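
As a quick warm-up, here is a minimal sketch of how a `map` behaves. The
variable names and file sizes are made up for illustration and are not part
of the solution.

```Chapel
use Map;

// Hypothetical file names and sizes, used only for illustration.
var fileSizes = new map(string, int);
fileSizes.add("a.txt", 100);
fileSizes.add("b.txt", 250);

// Look up one value by its key, then sum every value stored in the map.
const aSize = fileSizes["a.txt"];
const totalSize = + reduce fileSizes.values();
writeln(aSize, " ", totalSize);
```

The `+ reduce` over `values()` is the same pattern the solution later uses to
add up the files in a directory.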
|
||||||
*/
|
*/
|
||||||
|
|
||||||
use IO, Map, List;
|
use IO, Map, List;
|
||||||
|
|
||||||
class TreeNode {
|
/*
|
||||||
|
With that, our train's first stop: classes!
|
||||||
|
|
||||||
|
### Classes in Chapel
|
||||||
|
|
||||||
|
Like in most languages, classes in Chapel are a way to group together related
|
||||||
|
pieces of data. Up until now, we've used tuples for this purpose. Tuples,
|
||||||
|
however, have a couple of limitations when it comes to solving today's
|
||||||
|
Advent of Code problem:
|
||||||
|
|
||||||
|
* We can't name a tuple's elements. Whenever you make and use a tuple,
|
||||||
|
it is up to _you_ to remember the order of the elements within it, and
|
||||||
|
what each element represents.
|
||||||
|
* Tuples can be nested, but their precise element types, including
|
||||||
|
nesting depth, must be known at compile-time. As a result, tuples aren’t
|
||||||
|
flexible enough to support the arbitrary levels of nesting that would be
|
||||||
|
required by a program that didn’t know the directory structure _a priori_
|
||||||
|
(e.g. one that was reading it from disk). We simply don’t have the
|
||||||
|
information at compile-time to describe the tuple’s types and “shape”.
|
||||||
|
|
||||||
|
Classes have neither of these limitations. They do, however, need to be
|
||||||
|
explicitly created within Chapel code. For example, one might create a
|
||||||
|
class to store information about a person:
|
||||||
|
|
||||||
|
```Chapel
|
||||||
|
class person {
|
||||||
|
var firstName, lastName: string;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
We've seen plenty of `var` statements used to create variables; when used
|
||||||
|
within a class, `var` declares a _member variable_ (also known as a _field_)
|
||||||
|
for the class. Our `person` contains two pieces of data in its fields: the
|
||||||
|
person's first name (`firstName`) and last name (`lastName`).
|
||||||
|
|
||||||
|
With that class definition in hand, we can create instances of the `person` class
|
||||||
|
using the `new` keyword.
|
||||||
|
|
||||||
|
```Chapel
|
||||||
|
var biggestCandyFan = new person("Daniel", "Fedorin");
|
||||||
|
```
|
||||||
|
|
||||||
|
As usual, we can rely on type inference to only write the type `person` once;
|
||||||
|
Chapel figures out that `biggestCandyFan` is a `person`. Now, it's easy to get
|
||||||
|
the various fields back out of a class:
|
||||||
|
|
||||||
|
```Chapel
|
||||||
|
writeln("The biggest fan of candy is ", biggestCandyFan.firstName);
|
||||||
|
```
|
||||||
|
|
||||||
|
Believe it or not, we've already seen enough of classes to see how to represent
|
||||||
|
nested data structures. The key observation is that classes have names, which
|
||||||
|
means that we can create fields that refer back to instances of the same class. Here's
|
||||||
|
an example of what I mean, in the form of a modified `person` class:
|
||||||
|
|
||||||
|
```Chapel {hl_lines=3}
|
||||||
|
class person {
|
||||||
|
var firstName, lastName: string;
|
||||||
|
var children: list(owned person);
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The highlighted line is new. We've added a list of children to our person.
|
||||||
|
These children are themselves instances of `person`, which means they too
|
||||||
|
can have children of their own. _Et voilà_ - we've got a nested data structure!
|
||||||
|
|
||||||
|
#### Memory Management Strategies
|
||||||
|
You probably noticed that `children`'s type is `list(owned person)` ---
|
||||||
|
note the `owned`. This keyword is an indication of the way that memory is
|
||||||
|
allocated and maintained for classes: their _memory management_. To create
|
||||||
|
a class, a Chapel program asks for some memory from the computer (_allocates_ it).
|
||||||
|
This memory is kept by the program until the instance of a class is no longer
|
||||||
|
needed, at which point it's _deallocated_/_freed_. The challenge is knowing when
|
||||||
|
a class is no longer needed! This is where _memory management strategies_,
|
||||||
|
like `owned`, come in.
|
||||||
|
|
||||||
|
We don't need to get too deep into the various memory management strategies
|
||||||
|
in today's post.
|
||||||
|
|
||||||
|
{{< details summary="**(If you're curious, here's a brief description of each strategy...)**" >}}
|
||||||
|
* When using the `owned` strategy, a class instance has one "owner" variable.
|
||||||
|
The instance is only around as long as this owner exists.
|
||||||
|
As soon as the owner disappears, the class instance is deallocated.
|
||||||
|
In some cases --- though we won't be covering them today --- ownership can
|
||||||
|
be transferred from one variable to another, but no two values can
|
||||||
|
own the same class instance at the same time.
|
||||||
|
|
||||||
|
Other variables can still refer to an `owned` class instance, but they must _borrow_ it,
|
||||||
|
creating, for example, a `borrowed person`. Borrows do not affect the
|
||||||
|
lifetime of the class instance or when it is deallocated.
|
||||||
|
* When using the `shared` strategy, Chapel keeps track of how many places
|
||||||
|
still have variables that refer to a particular instance of a class. This
|
||||||
|
is typically called a _reference count_. Each time a variable is created
|
||||||
|
or changed to refer to a class instance, the instance's reference count
|
||||||
|
increases. When that variable goes out of scope and disappears, the
|
||||||
|
reference count decreases. Finally, when the reference count reaches
|
||||||
|
zero (no more variables refer to the class instance), there's no point
|
||||||
|
in keeping it around anymore, and its memory is deallocated.
|
||||||
|
|
||||||
|
As is the case with `owned`, other variables can borrow `shared` class instances.
|
||||||
|
Such borrows do not affect the reference count at all, and therefore don't
|
||||||
|
influence when the instance is freed.
|
||||||
|
* When using the `unmanaged` strategy, you're promising to manually free
|
||||||
|
the memory later, using the `delete` keyword. This is very similar to
|
||||||
|
how `new`/`delete` work in classic C++.
|
||||||
|
{{< /details >}}
|
||||||
|
|
||||||
|
So, the `owned` keyword in our `children` list means we've opted for the
|
||||||
|
`owned` memory management strategy. The implication of this is that
|
||||||
|
when a "parent" person is deallocated, so are all of its children
|
||||||
|
(since the parent `person` instance, through its `children` list, owns each child).
|
||||||
|
If we aren't planning on sharing our data, `owned` is the preferred strategy. This is because
|
||||||
|
it precludes the need for some bookkeeping, which
|
||||||
|
often makes a difference in terms of performance. The added benefit to using
|
||||||
|
`owned`, in my personal view, is that it's easier to figure out when something
|
||||||
|
will be deleted --- there's no chance of some other variable, elsewhere in my program,
|
||||||
|
preventing a class instance's deallocation.
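
For a rough feel of how the strategies look in code, here is a small sketch.
It redefines a two-field `person` so that it stands on its own; the names
`visitor`, `first`, `second`, and `manual`, along with the sample people, are
assumptions made for this illustration.

```Chapel
class person {
  var firstName, lastName: string;
}

// `new` defaults to `owned`, so `owner` is an `owned person`.
var owner = new person("Daniel", "Fedorin");

// A borrow refers to the same instance without owning it.
var visitor: borrowed person = owner.borrow();
writeln(visitor.firstName);

// `shared` instances are reference counted: both variables below refer to
// one instance, which is freed once neither of them is around.
var first = new shared person("Ada", "Lovelace");
var second = first;

// `unmanaged` instances must be freed by hand.
var manual = new unmanaged person("Grace", "Hopper");
delete manual;
```

Only the `owned` and `borrowed` parts matter for today's solution; the rest is
there for comparison.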
|
||||||
|
|
||||||
|
#### Methods
|
||||||
|
Remember how I said that classes can be used to group together pieces
|
||||||
|
of related data? Well, they can do more than that. They can also group
|
||||||
|
together operations on this data, in the form of _methods_. For instance,
|
||||||
|
we could add the following definition **inside** the `class` declaration
|
||||||
|
for our `person`:
|
||||||
|
|
||||||
|
```Chapel
|
||||||
|
class person {
|
||||||
|
// ... as before
|
||||||
|
|
||||||
|
proc getGreeting() {
|
||||||
|
return "Hello, " + this.firstName + "!";
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Just like fields can be thought of as `var`s that are associated with a particular
|
||||||
|
class instance, methods can be thought of as _procedures_ associated with
|
||||||
|
a particular class instance. Thus, methods behave pretty much exactly
|
||||||
|
like the `proc`s we've seen so far, with the notable difference of being able to
|
||||||
|
access that class instance through the `this` keyword.
|
||||||
|
For example, inside the body of a method like `getGreeting` above,
|
||||||
|
`this.firstName` gets us the person's first name, and `this.lastName` would
|
||||||
|
get us their last name.
|
||||||
|
|
||||||
|
We can call methods using the dot syntax:
|
||||||
|
|
||||||
|
```Chapel
|
||||||
|
// Prints "Hello, Daniel!"
|
||||||
|
writeln(biggestCandyFan.getGreeting());
|
||||||
|
```
|
||||||
|
|
||||||
|
Methods are a powerful tool for abstraction; rather than writing external code
|
||||||
|
that refers to the various fields of a class, we can put that logic
|
||||||
|
inside of methods, and avoid exposing it to the rest of the world. A person
|
||||||
|
writing `.getGreeting()` will not need to know how a name is represented
|
||||||
|
in the `person` class.
|
||||||
|
|
||||||
|
Another sort of method is a _type method_ (sometimes referred to as
|
||||||
|
a _static method_ in other languages). Rather than being called on
|
||||||
|
an instance of a person, like `biggestCandyFan` or `daniel`, it's called
|
||||||
|
on the class itself. For instance:
|
||||||
|
|
||||||
|
```Chapel
|
||||||
|
class person {
|
||||||
|
// ... as before
|
||||||
|
|
||||||
|
proc type createBiggestCandyFan() {
|
||||||
|
return new person("Daniel", "Fedorin");
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
var biggestCandyFan = person.createBiggestCandyFan();
|
||||||
|
```
|
||||||
|
|
||||||
|
Methods like this have the benefit of being associated with a particular class.
|
||||||
|
This means that another class can have its own `createBiggestCandyFan()`
|
||||||
|
method, and there won't be any confusion or problems arising from trying
|
||||||
|
to figure out which is which. Perhaps dogs (represented by a hypothetical
|
||||||
|
`dog` class) have a biggest candy fan, too!
|
||||||
|
|
||||||
|
```Chapel
|
||||||
|
var biggestCandyFan = person.createBiggestCandyFan();
|
||||||
|
var biggestCandyFanDog = dog.createBiggestCandyFan();
|
||||||
|
```
|
||||||
|
|
||||||
|
### A `Dir` Class to Represent Directories
|
||||||
|
Back to the solution. The class I use for tracking directories is actually not too different
|
||||||
|
from our modified `person` class above. Each directory
|
||||||
|
{{< sidenote right "dir-firstname-note" "will have a name" >}}
|
||||||
|
Despite the recent media noise about ChatGPT, directories have not yet
|
||||||
|
been granted personhood, and do not have both first and last names.
|
||||||
|
{{< /sidenote >}}
|
||||||
|
as well as a collection of files and directories it contains.
|
||||||
|
*/
|
||||||
|
|
||||||
|
class Dir {
|
||||||
var name: string;
|
var name: string;
|
||||||
|
|
||||||
var files = new map(string, int);
|
var files = new map(string, int);
|
||||||
var dirs = new list(owned TreeNode);
|
var dirs = new list(owned Dir);
|
||||||
|
|
||||||
proc init(name: string) {
|
/*
|
||||||
this.name = name;
|
Since files have no
|
||||||
|
additional information to them besides their size, I decided to represent
|
||||||
|
them as a map --- a directory's `files` field associates each file's name
|
||||||
|
with that file's size. The subdirectories are represented just like
|
||||||
|
the `children` field from our `person` class, as a list of owned `Dir`s.
|
||||||
|
|
||||||
|
There are a few more things I want to add to `Dir`;
|
||||||
|
the first is a way to read our directory from our puzzle input.
|
||||||
|
|
||||||
|
#### Reading the File System with the `fromInput` Type Method
|
||||||
|
For reasons of abstraction and avoiding conflicts, I put
|
||||||
|
the code for creating a directory from user input into a type method on `Dir`. Within
|
||||||
|
this method, I include the now-familiar code for reading from the
|
||||||
|
input using `readLine`, until we run out of lines.
|
||||||
|
*/
|
||||||
|
|
||||||
|
proc type fromInput(name: string): owned Dir {
|
||||||
|
var line: string;
|
||||||
|
var newDir = new Dir(name);
|
||||||
|
|
||||||
|
while readLine(line, stripNewline = true) {
|
||||||
|
/*
|
||||||
|
Notice that I'm accepting the name for the
|
||||||
|
directory as a string formal and initializing a new variable `newDir` with that name.
|
||||||
|
Notice also that I don't need to provide the `files` and `dirs`
|
||||||
|
as arguments to `new Dir` --- they have default values in the
|
||||||
|
class definition. By default, `new` uses the `owned` memory management
|
||||||
|
strategy. For the time being, the `newDir` variable owns our
|
||||||
|
directory-under-construction.
|
||||||
|
|
||||||
|
We're reading lines now; all that's left is to figure out what to do
|
||||||
|
with them. The first case is that of `$ cd ..`. When we see that line,
|
||||||
|
it means that we're done looking at the current directory; none
|
||||||
|
of the subsequent `ls` lines will be meant for us. Thus, we break
|
||||||
|
out of the input `while`-loop.
|
||||||
|
*/
|
||||||
|
if line == "$ cd .." {
|
||||||
|
break;
|
||||||
|
/*
|
||||||
|
If the `cd` command is used, but its argument isn't `..`, we're being
|
||||||
|
asked to descend into a sub-directory of our current `newDir`.
|
||||||
|
In this case, we call the `fromInput` method again, recursively,
|
||||||
|
to create a subdirectory of the current one. This
|
||||||
|
call will keep consuming lines from the input until the sub-directory
|
||||||
|
has been processed, at which point it will return it to us. We'll
|
||||||
|
immediately append this sub-directory to the `newDir.dirs` list,
|
||||||
|
which becomes the sub-directory's new owner.
|
||||||
|
|
||||||
|
Recall that we need to give `fromInput` the name of the new
|
||||||
|
sub-directory. We can figure out the name by slicing the string
|
||||||
|
starting after the `$ cd` prefix. Since I want to get the rest of the
|
||||||
|
characters after the prefix, I leave the end of my range unbounded, which
|
||||||
|
makes the slice go until the characters run out at the end of the string.
|
||||||
|
If you're feeling shaky on lists and `append`, check out our [day 5 article]({{< relref "aoc2022-day05-cratestacks" >}}#moving-crates-within-an-array-of-lists).
|
||||||
|
If you want a little refresher on slicing, we first covered it on [day 3]({{< relref "aoc2022-day03-rucksacks" >}}#ranges-and-slicing).
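
Here is the slicing step on its own, as a small sketch. The input line and
the directory name `documents` are hypothetical.

```Chapel
param cdPrefix = "$ cd ";
const line = "$ cd documents";   // hypothetical input line

// Strings are indexed from 0, so `cdPrefix.size` (5) is the position of the
// first character after the prefix. The open-ended range runs to the end.
const dirName = line[cdPrefix.size..];
writeln(dirName);
```

With this input, `dirName` holds `documents`.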
|
||||||
|
|
||||||
|
*/
|
||||||
|
} else if line.startsWith("$ cd ") {
|
||||||
|
param cdPrefix = "$ cd ";
|
||||||
|
const dirName = line[cdPrefix.size..];
|
||||||
|
newDir.dirs.append(Dir.fromInput(dirName));
|
||||||
|
/*
|
||||||
|
As it turns out, all that's left is to handle files. We already get
|
||||||
|
directory names from `cd`, so there's no reason to worry about
|
||||||
|
lines starting with `dir`. The `ls` command itself always precedes
|
||||||
|
the list of files and directories; by itself, it provides us no
|
||||||
|
additional information. Thus, our last case is a line that's neither
|
||||||
|
`dir` nor `ls`. Such a line is a file, so its format will be a number
|
||||||
|
followed by the file's name.
|
||||||
|
|
||||||
|
I use the `partition` method on the line to split it into three
|
||||||
|
pieces: the part before the space, the space itself, and the part
|
||||||
|
after the space. After that, I can just update the `newDir.files` map,
|
||||||
|
associating the file called `name` with its size. I use an integer cast
|
||||||
|
to convert `size` (a string) to a number.
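
To make that concrete, here is a sketch of `partition` and the cast applied
to a single line in the puzzle's format. The variable `bytes` is a name I
picked for the illustration.

```Chapel
const line = "14848514 b.txt";   // a file line in the puzzle's format

// `partition` splits around the first occurrence of the separator and
// returns a 3-tuple: (before, separator, after).
const (size, _, name) = line.partition(" ");

// The cast turns the string "14848514" into the integer 14848514.
const bytes = size : int;
writeln(name, " is ", bytes, " bytes");
```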
|
||||||
|
*/
|
||||||
|
} else if !line.startsWith("$ ls") && !line.startsWith("dir") {
|
||||||
|
const (size, _, name) = line.partition(" ");
|
||||||
|
newDir.files[name] = size : int;
|
||||||
|
/*
|
||||||
|
That's it for the loop! Once the loop stops running, we know we're done
|
||||||
|
processing the directory. All that remains is to return it. Returning
|
||||||
|
an `owned` value from a function or method transfers ownership to whatever
|
||||||
|
code calls the function or method.
|
||||||
|
*/
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return newDir;
|
||||||
}
|
}
|
||||||
|
/*
|
||||||
|
One more thing: I have explicitly annotated the
|
||||||
|
return type of `fromInput` to be `owned Dir` to let Chapel know
|
||||||
|
that I'm using the `owned` memory management strategy. This might just
|
||||||
|
be the first return type annotation we've written so far. Up until now,
|
||||||
|
Chapel has been able to deduce the return types of our procedures
|
||||||
|
and iterators automatically. However, here, because we are using
|
||||||
|
recursion, it needs just a little bit of help: determining the types
|
||||||
|
in the body of `fromInput` requires knowing the type of `fromInput`!
|
||||||
|
The manual type annotation helps break that loop.
|
||||||
|
*/
|
||||||
|
|
||||||
|
/*
|
||||||
|
#### An Iterator Method for Listing Directory Sizes
|
||||||
|
Let's recap. What we have now is a data structure, `Dir`, which represents
|
||||||
|
the directory tree. We also have a type method, `Dir.fromInput`, that
|
||||||
|
converts our puzzle input into this data structure. What's left?
|
||||||
|
|
||||||
|
The way I see it, the problem is composed of three pieces:
|
||||||
|
|
||||||
|
1. Go through all of the directory sizes...
|
||||||
|
2. ... ignoring those that are above a certain threshold ...
|
||||||
|
3. ... and sum them.
|
||||||
|
|
||||||
|
Over the past week, we've gotten really good at summing things! In
|
||||||
|
Chapel, we can just use `+reduce` to compute the sum of something
|
||||||
|
iterable, so there's point number three. For point two, it turns out that
|
||||||
|
those [loop expressions]({{< relref "aoc2022-day06-packets" >}}#parallel-loop-expressions)
|
||||||
|
from yesterday can be used to filter out elements like so:
|
||||||
|
|
||||||
|
```Chapel
|
||||||
|
[i in iterable] if someCondition then i
|
||||||
|
```
|
||||||
|
|
||||||
|
Putting these two pieces together, we might write something like:
|
||||||
|
```Chapel
|
||||||
|
+ reduce [size in directorySizes] if size < 100000 then size
|
||||||
|
```
|
||||||
|
|
||||||
|
That `directorySizes` is the only "fictional" piece of the solution.
|
||||||
|
Perhaps we can make our `Dir` tree support an iterator of directory sizes?
|
||||||
|
Then, we'd have our answer.
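
To see the filter-and-sum pattern on concrete numbers, here is a small
sketch. It assumes `directorySizes` is a plain array of made-up sizes rather
than an iterator over a `Dir`, and it uses the bracketed loop form that the
final solution uses.

```Chapel
// Hypothetical directory sizes; in the real solution these come from an iterator.
const directorySizes = [584, 94853, 24933642, 48381165];

// Keep only the sizes below the threshold, then add them up.
const smallTotal = + reduce [size in directorySizes] if size < 100000 then size;
writeln(smallTotal);   // 584 + 94853 = 95437
```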
|
||||||
|
|
||||||
|
In my solution, I do just that. Methods on classes don't have to be procedures ---
|
||||||
|
they can also be iterators. There's only one complication. We want our
|
||||||
|
iterator method to yield the sizes of _all_ of the various sub-directories
|
||||||
|
within a `Dir`, including sub-directories of sub-directories. That's because
|
||||||
|
we have to sum them all up as per the problem statement. However, when
|
||||||
|
_computing_ the size of a directory, we don't want to include sub-sub-directories
|
||||||
|
in our counting: the direct sub-directories already include the sizes of
|
||||||
|
their own contents. To make this work, I added a `parentSize` formal to
|
||||||
|
the iterator method, which represents a reference to the parent directory's
|
||||||
|
size. When it's done yielding its own size, as well as the sizes of the
|
||||||
|
sub-directories, the iterator method will add its own size to its parent's.
|
||||||
|
|
||||||
|
Here's the implementation of the iterator method; I'll talk about it in
|
||||||
|
more detail below.
|
||||||
|
*/
|
||||||
|
iter dirSizes(ref parentSize = 0): int {
|
||||||
|
// Compute sizes from files only.
|
||||||
|
var size = + reduce files.values();
|
||||||
|
for subDir in dirs {
|
||||||
|
// Yield directory sizes from the dir.
|
||||||
|
for subSize in subDir.dirSizes(size) do yield subSize;
|
||||||
|
}
|
||||||
|
yield size;
|
||||||
|
parentSize += size;
|
||||||
|
}
|
||||||
|
/*
|
||||||
|
The first thing this method does is create a new variable, `size`,
|
||||||
|
representing the current directory's size. It's initialized to the sum
|
||||||
|
of all the file sizes. However, at this point, that's not the whole size ---
|
||||||
|
we also need to figure out how much data is stored in the subdirectories.
|
||||||
|
|
||||||
|
I use a `for` loop over the `dirs` list to examine each sub-directory
|
||||||
|
of the current folder in turn. Each of these sub-directories is its
|
||||||
|
own full-fledged `Dir`, so we can call its `dirSizes`
|
||||||
|
method. This gives us an iterator of all directory sizes from `subDir`.
|
||||||
|
I simply yield them from the parent iterator, making it yield
|
||||||
|
the sizes of all directories, including nested ones. Notice that I also
|
||||||
|
provide `size` as the argument to the recursive call to `dirSizes`:
|
||||||
|
the inner for-loop serves the double purpose of yielding directory sizes
|
||||||
|
and finishing computing the current folder's size.
|
||||||
|
|
||||||
|
Once all of the sub-directory sizes have been yielded, the `size` variable
|
||||||
|
includes all the files in the folder, including nested ones. Thus, I use it to yield
|
||||||
|
the size of the current folder. I also add `size` to `parentSize`.
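
The `ref` formal is what lets the recursive call update its caller's `size`.
Here is a minimal sketch of that mechanism on its own, using a hypothetical
`addTo` procedure rather than the iterator itself.

```Chapel
// `total` is passed by reference, so the update is visible to the caller.
proc addTo(ref total: int, amount: int) {
  total += amount;
}

var size = 10;     // say, the files directly inside a directory
addTo(size, 5);    // a sub-directory reports its own size
writeln(size);     // 15
```

In `dirSizes`, each sub-directory plays the role of `addTo`, adding its
finished size to the parent's running total.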
|
||||||
|
|
||||||
|
That concludes our `Dir` class!
|
||||||
|
*/
|
||||||
|
|
||||||
/*
|
/*
|
||||||
|
|
||||||
|
{{< skip >}}
|
||||||
```Chapel
|
```Chapel
|
||||||
iter these(param tag: iterKind): (string, int) where tag == iterKind.standalone {
|
iter these(param tag: iterKind): (string, int) where tag == iterKind.standalone {
|
||||||
var size = + reduce files.values();
|
var size = + reduce files.values();
|
||||||
|
@ -65,46 +450,74 @@ class TreeNode {
|
||||||
this.size = size;
|
this.size = size;
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
{{< /skip >}}
|
||||||
|
|
||||||
*/
|
*/
|
||||||
|
|
||||||
iter dirSizes(ref parentSize = 0): (string, int) {
|
|
||||||
var size = + reduce files.values();
|
|
||||||
for dir in dirs {
|
|
||||||
// Yield directory sizes from the dir.
|
|
||||||
for subSize in dir.dirSizes(size) do yield subSize;
|
|
||||||
}
|
|
||||||
yield (name, size);
|
|
||||||
parentSize += size;
|
|
||||||
}
|
|
||||||
|
|
||||||
proc type fromInput(name: string, readFrom): owned TreeNode {
|
|
||||||
var line: string;
|
|
||||||
var newDir = new TreeNode(name);
|
|
||||||
|
|
||||||
while readFrom.readLine(line, stripNewline = true) {
|
|
||||||
if line == "$ cd .." {
|
|
||||||
break;
|
|
||||||
} else if line.startsWith("$ cd ") {
|
|
||||||
const dirName = line["$ cd ".size..];
|
|
||||||
newDir.dirs.append(TreeNode.fromInput(dirName, readFrom));
|
|
||||||
} else if !line.startsWith("$ ls") {
|
|
||||||
const (sizeOrDir, _, name) = line.partition(" ");
|
|
||||||
if sizeOrDir == "dir" {
|
|
||||||
// Ignore directories, we'll `cd` into them.
|
|
||||||
} else {
|
|
||||||
newDir.files[name] = sizeOrDir : int;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
return newDir;
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|
||||||
var rootFolder = TreeNode.fromInput("", stdin);
|
/*
|
||||||
|
### Putting It All Together
|
||||||
|
With our `Dir` class complete, we can finally make use of it in our code.
|
||||||
|
The first thing we need to do is read our file system from the input;
|
||||||
|
this is accomplished using the `fromInput` method.
|
||||||
|
*/
|
||||||
|
|
||||||
|
var rootFolder = Dir.fromInput("/");
|
||||||
|
|
||||||
|
/*
|
||||||
|
Next up, we can use that `+reduce` expression I described above. I use
|
||||||
|
a new variable, `rootSize`, to represent the size of the top-level directory.
|
||||||
|
After the call to `dirSizes` completes, it will be set to the total size of
|
||||||
|
the root directory, i.e., the total disk usage. */
|
||||||
var rootSize = 0;
|
var rootSize = 0;
|
||||||
writeln(+ reduce [(_, size) in rootFolder.dirSizes(rootSize)] if size < 100000 then size);
|
writeln(+ reduce [size in rootFolder.dirSizes(rootSize)] if size < 100000 then size);
|
||||||
|
|
||||||
const toDelete = rootSize - 40000000;
|
/*
|
||||||
writeln(min reduce [(_, size) in rootFolder.dirSizes()] if size >= toDelete then size);
|
I could've omitted the argument to `dirSizes` --- notice from the method's
|
||||||
|
signature that I provide a default value for `parentSize`.
|
||||||
|
|
||||||
|
```Chapel
|
||||||
|
iter dirSizes(ref parentSize = 0): int {
|
||||||
|
```
|
||||||
|
|
||||||
|
However, knowing `rootSize` lets us easily compute the amount of space we need
|
||||||
|
to free up (for part 2 of today's problem).
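
The arithmetic behind that amount can be spelled out step by step. In this
sketch, the names `totalDisk`, `needed`, and `unused` are my own, and the
`rootSize` value is a sample chosen for illustration.

```Chapel
const totalDisk = 70000000;   // capacity of the whole disk
const needed    = 30000000;   // free space the puzzle asks for
const rootSize  = 48381165;   // sample value, assumed for illustration

const unused   = totalDisk - rootSize;   // 21618835
const toDelete = needed - unused;        // 8381165
writeln(toDelete);
```

Expanding `needed - (totalDisk - rootSize)` gives `rootSize - 40000000`, which
is the shorter form used in the solution.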
|
||||||
|
*/
|
||||||
|
const toDelete = rootSize - 40000000; // = 30000000 - (70000000 - rootSize)
|
||||||
|
|
||||||
|
/*
|
||||||
|
We can now re-use our `dirSizes` stream to check every directory size again,
|
||||||
|
this time looking for the smallest folder that meets a certain threshold.
|
||||||
|
A `min` reduction takes care of this:
|
||||||
|
*/
|
||||||
|
writeln(min reduce [size in rootFolder.dirSizes()] if size >= toDelete then size);
|
||||||
|
|
||||||
|
/* And there's the solution to part 2, as well! */
|
||||||
|
|
||||||
|
/*
|
||||||
|
### Summary
|
||||||
|
This concludes today's description of my solution. This time, I introduced
|
||||||
|
Chapel's classes --- defining them, creating fields and adding methods. We got
|
||||||
|
a little taste of memory management strategies and ownership, though I deliberately
|
||||||
|
kept it light to avoid introducing too many new concepts.
|
||||||
|
|
||||||
|
Admittedly, today's solution is (for the most part) serial. Although the
|
||||||
|
`+reduce` expression that computes the initial `size` of a directory from
|
||||||
|
its `files` is eligible for parallelization, the `dirSizes` iterator is not. The main
|
||||||
|
reason for this is that the interaction between recursive parallel iterators and
|
||||||
|
reductions is, at the time of writing, unimplemented.
|
||||||
|
Nevertheless, I think that using even a serial iterator has _yielded_ an elegant
|
||||||
|
solution (pun intended).
|
||||||
|
|
||||||
|
If you wanted to write a parallel version, I'd advise creating a new,
|
||||||
|
non-iterator method on `Dir` that solves just part 1 of today's puzzle.
|
||||||
|
This method could return a tuple of two elements, perhaps `sumSmallSizes`
|
||||||
|
and `dirSize`; then, a simple `forall` loop over `dirs` (and judicious use of reduce intents,
|
||||||
|
which are described in our [day 4 article]({{< relref "aoc2022-day04-ranges" >}}#third-solution-parallel-approach))
|
||||||
|
will let you compute the answer in parallel.
|
||||||
|
|
||||||
|
Thanks for reading! Please feel free
|
||||||
|
to ask any questions or post any comments you have in the new [Blog
|
||||||
|
Category](https://chapel.discourse.group/c/blog/21) of Chapel's
|
||||||
|
Discourse Page. */
|
||||||
|
|