From e3834ed6ea6ef32e537321093265505093a7e078 Mon Sep 17 00:00:00 2001
From: Danila Fedorin
Date: Sat, 14 Mar 2020 21:04:13 -0700
Subject: [PATCH] Explain graph code

---
 code/compiler/10/graph.hpp               |  2 +-
 content/blog/10_compiler_polymorphism.md | 82 +++++++++++++++++++++++-
 2 files changed, 82 insertions(+), 2 deletions(-)

diff --git a/code/compiler/10/graph.hpp b/code/compiler/10/graph.hpp
index a15b2ac..66dc392 100644
--- a/code/compiler/10/graph.hpp
+++ b/code/compiler/10/graph.hpp
@@ -26,8 +26,8 @@ class function_graph {
             size_t indegree;
         };
 
-        using edge = std::pair;
         using data_ptr = std::shared_ptr;
+        using edge = std::pair;
         using group_edge = std::pair;
 
         std::map> adjacency_lists;
diff --git a/content/blog/10_compiler_polymorphism.md b/content/blog/10_compiler_polymorphism.md
index bb428e3..633f925 100644
--- a/content/blog/10_compiler_polymorphism.md
+++ b/content/blog/10_compiler_polymorphism.md
@@ -203,7 +203,7 @@ defn testOne = { if True False True }
 defn testTwo = { if True 0 1 }
 ```
 
-If we go through and type check them top-to-bottom, the following happens:
+If we go through and typecheck them top-to-bottom, the following happens:
 
 1. We start by assuming \\(\\text{if} : a \\rightarrow b \\rightarrow c \\rightarrow d\\),
 \\(\\text{testOne} : e\\) and \\(\\text{testTwo} : f\\).
@@ -343,3 +343,83 @@ includes the group's adjacency list and
 (both used for Kahn's topological sorting algorithm),
 as well as the set of functions in the group (which we will eventually return).
 
+The `adjacency_lists` and `edges` fields are the meat of the graph representation.
+The two fields provide different views of the same graph: `adjacency_lists`
+associates with every function a list of functions it depends on, while
+`edges` holds a set of tuples describing edges in the graph. Having
+more than one representation makes it more convenient for us to perform
+different operations on our graphs.
+
+Next up are some internal methods that perform the various steps we described
+above:
+* `compute_transitive_edges` applies Warshall's algorithm to find the graph's
+transitive closure.
+* `create_groups` creates two mappings, one from functions to their respective
+groups' IDs, and one from group IDs to information about the corresponding groups.
+This step is largely used to determine which functions belong to the same group,
+and as such, uses the set of transitive edges generated by `compute_transitive_edges`.
+* `create_edges` creates edges __between groups__. During this step, the indegree
+of each group is computed, as well as its adjacency list.
+* `generate_order` uses the indegrees and adjacency lists produced in the prior step
+to establish a topological order.
+
+Finally, the `add_edge` method is used to add a new dependency between two functions,
+while the `compute_order` method uses the internal methods described above to convert
+the function dependency graph into a properly ordered list of groups.
+
+Let's start by looking at how to implement the internal methods. `compute_transitive_edges`
+is a very straightforward implementation of Warshall's algorithm:
+
+{{< codelines "C++" "compiler/10/graph.hpp" 53 71 >}}
+
+Next is `create_groups`. For each function, we iterate over all other functions.
+If the other function is mutually dependent with the first function, we add
+it to the same group. In the outer loop, we skip over functions that have
+already been added to a group. This is because
+{{< sidenote "right" "equivalence-note" "mutual dependence" >}}
+There is actually a slight difference between "mutual dependence"
+the way we defined it and "being in the same group", and
+it lies in the reflexive property of an equivalence relation.
+We defined a function to depend on another function if it calls
+that other function. Then, a recursive function depends on itself,
+but a non-recursive function does not, so mutual dependence does not
+satisfy the reflexive property. However, as far as we're concerned,
+a function should be in a group with itself even if it's not recursive. Thus, the
+real equivalence relation we use is "in the same group as", and
+consists of "mutual dependence" extended with reflexivity.
+{{< /sidenote >}}
+is an [equivalence relation](https://en.wikipedia.org/wiki/Equivalence_relation),
+which means that if we already added a function to a group, all its
+group members were also already visited and added.
+
+{{< codelines "C++" "compiler/10/graph.hpp" 73 94 >}}
+
+Once groups have been created, we use their functions' edges
+to create edges for the groups themselves, using `create_edges`.
+We do not create edges from a group to itself, since that would
+introduce unnecessary cycles. While constructing the edges, we also
+increment the relevant indegree counter.
+
+{{< codelines "C++" "compiler/10/graph.hpp" 96 113 >}}
+
+Finally, we apply Kahn's algorithm to create a topological order
+in `generate_order`:
+
+{{< codelines "C++" "compiler/10/graph.hpp" 115 140 >}}
+
+These four steps are used in `compute_order`:
+
+{{< codelines "C++" "compiler/10/graph.hpp" 152 160 >}}
+
+Lastly, `add_edge` straightforwardly adds an edge
+to the graph:
+
+{{< codelines "C++" "compiler/10/graph.hpp" 142 150 >}}
+
+With this, we can now properly order our typechecking.
+However, there are a few pieces of the puzzle missing.
+First of all, we need to actually insert function
+dependencies into the graph. Second, we need to think
+about how our existing language features and implementation
+will interact with polymorphism. Third, we have to come up
+with an implementation of polymorphic data types.
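
To make the four steps described above (transitive closure, grouping, edges between groups, topological order) easier to see at a glance, here is a minimal, self-contained sketch that condenses them into one free function. This is not the code from `graph.hpp`: the name `order_groups`, the use of `std::string` for function names, and the choice to point group edges from a callee's group to its caller's group (so that dependencies come out first) are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <map>
#include <queue>
#include <set>
#include <string>
#include <utility>
#include <vector>

using function = std::string;                // assumed: functions identified by name
using edge = std::pair<function, function>;  // (caller, callee): caller depends on callee
using group = std::set<function>;

// Hypothetical helper, not part of graph.hpp. Assumes both endpoints of
// every edge appear in `functions`.
std::vector<group> order_groups(const std::set<function>& functions,
                                std::set<edge> edges) {
    // Step 1: Warshall's algorithm. If i reaches k and k reaches j,
    // then i reaches j.
    for (const auto& k : functions)
        for (const auto& i : functions)
            for (const auto& j : functions)
                if (edges.count({i, k}) && edges.count({k, j}))
                    edges.insert({i, j});

    // Step 2: group mutually dependent functions. Every function starts
    // a group containing itself, even if it is not recursive.
    std::map<function, std::size_t> group_of;
    std::vector<group> groups;
    for (const auto& f : functions) {
        if (group_of.count(f)) continue;  // already placed by an earlier member
        std::size_t id = groups.size();
        groups.push_back(group{f});
        group_of[f] = id;
        for (const auto& g : functions) {
            if (edges.count({f, g}) && edges.count({g, f})) {
                groups[id].insert(g);
                group_of[g] = id;
            }
        }
    }

    // Step 3: edges between groups (callee's group -> caller's group),
    // skipping self-edges and counting each distinct edge once.
    std::vector<std::set<std::size_t>> adjacency(groups.size());
    std::vector<std::size_t> indegree(groups.size(), 0);
    for (const auto& e : edges) {
        std::size_t from = group_of.at(e.second);
        std::size_t to = group_of.at(e.first);
        if (from != to && adjacency[from].insert(to).second) ++indegree[to];
    }

    // Step 4: Kahn's algorithm. Repeatedly emit a group with no
    // unprocessed dependencies.
    std::queue<std::size_t> ready;
    for (std::size_t id = 0; id < groups.size(); ++id)
        if (indegree[id] == 0) ready.push(id);
    std::vector<group> order;
    while (!ready.empty()) {
        std::size_t id = ready.front();
        ready.pop();
        order.push_back(groups[id]);
        for (std::size_t next : adjacency[id])
            if (--indegree[next] == 0) ready.push(next);
    }
    return order;
}
```

For instance, with functions `f`, `g` and `h`, where `f` and `g` call each other and `f` also calls `h`, this sketch places `h` in its own group ahead of the group containing `f` and `g`. The triple loop over a `std::set` of edges is chosen for brevity rather than speed.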