Explain graph code

Danila Fedorin 2020-03-14 21:04:13 -07:00
parent 1bdb4a650e
commit e3834ed6ea
2 changed files with 82 additions and 2 deletions


@@ -26,8 +26,8 @@ class function_graph {
size_t indegree;
};
using edge = std::pair<function, function>;
using data_ptr = std::shared_ptr<group_data>;
using edge = std::pair<function, function>;
using group_edge = std::pair<group_id, group_id>;
std::map<function, std::set<function>> adjacency_lists;


@@ -203,7 +203,7 @@ defn testOne = { if True False True }
defn testTwo = { if True 0 1 }
```
If we go through and type check them top-to-bottom, the following happens:
If we go through and typecheck them top-to-bottom, the following happens:
1. We start by assuming \\(\\text{if} : a \\rightarrow b \\rightarrow c \\rightarrow d\\),
\\(\\text{testOne} : e\\) and \\(\\text{testTwo} : f\\).
@@ -343,3 +343,83 @@ includes the group's adjacency list and
(both used for Kahn's topological sorting algorithm), as well as the set
of functions in the group (which we will eventually return).
The `adjacency_lists` and `edges` fields are the meat of the graph representation.
The two fields provide different views of the same graph: `adjacency_lists`
maps every function to the set of functions it depends on, while
`edges` holds a set of pairs describing edges in the graph. Having
more than one representation makes it more convenient for us to perform
different operations on our graphs.
Next up are some internal methods that perform the various steps we described
above:
* `compute_transitive_edges` applies Warshall's algorithm to find the graph's
transitive closure.
* `create_groups` creates two mappings, one from functions to their respective
groups' IDs, and one from group IDs to information about the corresponding groups.
This step is largely used to determine which functions belong to the same group,
and as such, uses the set of transitive edges generated by `compute_transitive_edges`.
* `create_edges` creates edges __between groups__. During this step, the indegree
of each group is computed, as well as its adjacency list.
* `generate_order` uses the indegrees and adjacency lists produced in the prior step
to establish a topological order.
Finally, the `add_edge` method is used to add a new dependency between two functions,
while the `compute_order` method uses the internal methods described above to convert
the function dependency graph into a properly ordered list of groups.
Let's start by looking at how to implement the internal methods. `compute_transitive_edges`
is a straightforward implementation of Warshall's algorithm:
{{< codelines "C++" "compiler/10/graph.hpp" 53 71 >}}
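To make the idea concrete outside of the class, here is a minimal standalone sketch of Warshall's algorithm over a set of edges. It is not the code from `graph.hpp`: the free function `transitive_closure` and its `vertices` parameter are hypothetical, and `function` is assumed to be an alias for `std::string`.

```cpp
#include <set>
#include <string>
#include <utility>

// Assumption: functions are identified by name.
using function = std::string;
using edge = std::pair<function, function>;

// Hypothetical helper: returns the transitive closure of `edges`.
// For every intermediate vertex k, connect i to j whenever
// i already reaches k and k already reaches j.
std::set<edge> transitive_closure(const std::set<function>& vertices,
                                  const std::set<edge>& edges) {
    std::set<edge> closure = edges;
    for (const auto& k : vertices) {
        for (const auto& i : vertices) {
            if (closure.count(edge(i, k)) == 0) continue;
            for (const auto& j : vertices) {
                if (closure.count(edge(k, j)) != 0) {
                    closure.emplace(i, j);
                }
            }
        }
    }
    return closure;
}
```

Given the edges `a -> b` and `b -> c`, the sketch adds `a -> c` and nothing else. Keeping the intermediate vertex `k` in the outermost loop is what makes this Warshall's algorithm rather than a single round of path extension.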
Next is `create_groups`. For each function, we iterate over all other functions.
If the other function is mutually dependent with the first function, we add
it to the same group. In the outer loop, we skip over functions that have
already been added to a group. This is because
{{< sidenote "right" "equivalence-note" "mutual dependence" >}}
There is actually a slight difference between "mutual dependence"
the way we defined it and "being in the same group", and
it lies in the reflexive property of an equivalence relation.
We defined a function to depend on another function if it calls
that other function. Then, a recursive function depends on itself,
but a non-recursive function does not, and therefore mutual dependence
does not satisfy the reflexive property. However, as far as we're concerned,
a function should be in a group with itself even if it's not recursive. Thus, the
real equivalence relation we use is "in the same group as", and
consists of "mutual dependence" extended with reflexivity.
{{< /sidenote >}}
is an [equivalence relation](https://en.wikipedia.org/wiki/Equivalence_relation),
which means that if we already added a function to a group, all its
group members were also already visited and added.
{{< codelines "C++" "compiler/10/graph.hpp" 73 94 >}}
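The grouping step can be sketched in isolation as follows. This is an illustrative version, not the one in `graph.hpp`: the free function `assign_groups` is hypothetical, it only builds the function-to-group mapping, and it assumes `function` is `std::string` and `group_id` is `std::size_t`.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>

// Assumptions: functions are identified by name, group IDs are integers.
using function = std::string;
using edge = std::pair<function, function>;
using group_id = std::size_t;

// Hypothetical helper: `closure` is expected to be the set of transitive edges.
std::map<function, group_id> assign_groups(const std::set<function>& vertices,
                                           const std::set<edge>& closure) {
    std::map<function, group_id> group_of;
    group_id next_id = 0;
    for (const auto& f : vertices) {
        // Already grouped together with an earlier function.
        if (group_of.count(f) != 0) continue;
        group_id id = next_id++;
        // A function always belongs to its own group, recursive or not.
        group_of[f] = id;
        for (const auto& g : vertices) {
            bool mutual = closure.count(edge(f, g)) != 0 &&
                          closure.count(edge(g, f)) != 0;
            if (mutual) group_of[g] = id;
        }
    }
    return group_of;
}
```

If `a` and `b` reach each other and both reach `c`, then `a` and `b` share one group ID while `c` gets its own.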
Once groups have been created, we use their functions' edges
to create edges for the groups themselves, using `create_edges`.
We skip edges from a group to itself, since a self-loop is a cycle,
and Kahn's algorithm only works on acyclic graphs. While constructing
the edges, we also increment the relevant indegree counter.
{{< codelines "C++" "compiler/10/graph.hpp" 96 113 >}}
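As a rough standalone illustration of this step, the sketch below lifts function edges to group edges. The `group_graph` struct and `build_group_graph` function are hypothetical names, and the types are assumed as before. It also counts each distinct group edge only once, which is an assumption about the desired behavior, so that several calls between the same two groups do not inflate the indegree.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>

// Assumptions: functions are identified by name, group IDs are integers.
using function = std::string;
using edge = std::pair<function, function>;
using group_id = std::size_t;

// Hypothetical container for the per-group results of this step.
struct group_graph {
    std::map<group_id, std::set<group_id>> adjacency;
    std::map<group_id, std::size_t> indegree;
};

// Every function mentioned in `edges` must have an entry in `group_of`.
group_graph build_group_graph(const std::map<function, group_id>& group_of,
                              const std::set<edge>& edges) {
    group_graph result;
    // Give every group an entry, even if it ends up with no edges.
    for (const auto& entry : group_of) {
        result.adjacency[entry.second];
        result.indegree[entry.second];
    }
    for (const auto& e : edges) {
        group_id from = group_of.at(e.first);
        group_id to = group_of.at(e.second);
        if (from == to) continue; // no self-loops between groups
        // Only count the edge the first time it is seen.
        if (result.adjacency[from].insert(to).second) {
            ++result.indegree[to];
        }
    }
    return result;
}
```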
Finally, we apply Kahn's algorithm to create a topological order
in `generate_order`:
{{< codelines "C++" "compiler/10/graph.hpp" 115 140 >}}
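For reference, a minimal standalone version of Kahn's algorithm might look like the sketch below. The function `kahn_order` and its parameters are hypothetical, and it assumes that every group has an entry in `indegree`. An edge `u -> v` places `u` before `v` in the result; whether that is the order we want for typechecking depends on which way the dependency edges point.

```cpp
#include <cstddef>
#include <map>
#include <queue>
#include <set>
#include <vector>

// Assumption: group IDs are integers.
using group_id = std::size_t;

// Hypothetical helper: `indegree` is taken by value because it is consumed.
std::vector<group_id> kahn_order(
        const std::map<group_id, std::set<group_id>>& adjacency,
        std::map<group_id, std::size_t> indegree) {
    // Start with every group that has no incoming edges.
    std::queue<group_id> ready;
    for (const auto& entry : indegree) {
        if (entry.second == 0) ready.push(entry.first);
    }
    std::vector<group_id> order;
    while (!ready.empty()) {
        group_id current = ready.front();
        ready.pop();
        order.push_back(current);
        auto it = adjacency.find(current);
        if (it == adjacency.end()) continue;
        // "Remove" the outgoing edges; newly free groups become ready.
        for (group_id next : it->second) {
            if (--indegree.at(next) == 0) ready.push(next);
        }
    }
    return order;
}
```

If the input contains a cycle, the groups on that cycle never reach an indegree of zero and are missing from the result, which is one reason self-edges between groups have to be left out.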
These four steps are used in `compute_order`:
{{< codelines "C++" "compiler/10/graph.hpp" 152 160 >}}
The remaining public method, `add_edge`, simply adds an edge
to the graph:
{{< codelines "C++" "compiler/10/graph.hpp" 142 150 >}}
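Since the graph is stored in two forms, adding an edge means updating both. The sketch below shows one way this could look. The `dependency_graph` struct is a hypothetical stand-in for `function_graph`, with `function` assumed to be `std::string`, and registering the target as a vertex is an extra assumption rather than something the original code is known to do.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>

// Assumption: functions are identified by name.
using function = std::string;
using edge = std::pair<function, function>;

// Hypothetical stand-in for the two fields of function_graph.
struct dependency_graph {
    std::map<function, std::set<function>> adjacency_lists;
    std::set<edge> edges;

    // Record that `from` depends on `to` in both representations.
    void add_edge(const function& from, const function& to) {
        adjacency_lists[from].insert(to);
        // Assumption: the target should also show up as a vertex.
        adjacency_lists[to];
        edges.emplace(from, to);
    }
};
```

Because both containers are sets, adding the same dependency twice leaves the graph unchanged.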
With this, we can now properly order our typechecking.
However, there are a few pieces of the puzzle missing.
First of all, we need to actually insert function
dependencies into the graph. Second, we need to think
about how our existing language features and implementation
will interact with polymorphism. Third, we have to come up
with an implementation of polymorphic data types.