Compare commits
846d85bb7a...master (304 commits)
| SHA1 |
|---|
| 5e117e3f48 | |||
| a6f3cd3f9a | |||
| ccc8d6f0eb | |||
| 7a088d6739 | |||
| 626baefd76 | |||
| 4602d02720 | |||
| 7fbd4ea9f8 | |||
| 6fd1e1962b | |||
| 62c338e382 | |||
| 40ea9ec637 | |||
| 787e194d41 | |||
| 71162e2db9 | |||
| 43debc65e4 | |||
| b07ea85b70 | |||
| 06e8b8e022 | |||
| 647f47a5f3 | |||
| 36e4feb668 | |||
| 11be991946 | |||
| a8c2b1d05a | |||
| fb46142e9d | |||
| 0b33d03b73 | |||
| 804147caef | |||
| d847d20666 | |||
| 07408d01a9 | |||
| 816a473913 | |||
| ce8f8fb872 | |||
| 2f60004241 | |||
| 7130c6bd11 | |||
| c5aacc060a | |||
| 6048dc0b9c | |||
| 1f01c3caff | |||
| bca44343eb | |||
| 3b9c2edcdd | |||
| fa180ee24e | |||
| 5846dd5d04 | |||
| f6b347eb05 | |||
| c1b27a13ae | |||
| 147658ee89 | |||
| 4017c52fee | |||
| 65d290556f | |||
| 854dccd4d2 | |||
| 07859cd1af | |||
| 19be7eb1f5 | |||
| 5de2ae1203 | |||
| 595a1ad99b | |||
| 32ccfc76ae | |||
| ae0d54e567 | |||
| b774f0e088 | |||
| e294bcb2d0 | |||
| 1322e0249c | |||
| e40a05633c | |||
| e5fb0a2929 | |||
| 3bd1f0c8a0 | |||
| 948759b7d4 | |||
| f2e424944e | |||
| 8a471c6b45 | |||
| 9b93582d18 | |||
| 9fc2d16fb8 | |||
| f00c69f02c | |||
| 4fc1191d13 | |||
| 951aafc90a | |||
| ee13409b33 | |||
| 615aeb72da | |||
| 3be67ca4c8 | |||
| b216581f2b | |||
| 1060198b01 | |||
| 0485917a20 | |||
| 37dd9ad6d4 | |||
| 77dade1d1d | |||
| d140422225 | |||
| 2defe99c73 | |||
| 0766211d79 | |||
| b854eade3c | |||
| c6f34f8eaa | |||
| 0ba5799c75 | |||
| 324fa948e6 | |||
| 3e46ff8be6 | |||
| 18eb1da5ba | |||
| dde7df4604 | |||
| e4101f1396 | |||
| d94ceeab2e | |||
| 697f083237 | |||
| 3e97fdcfea | |||
| 40007c427e | |||
| 5ab0d0d40e | |||
| 7c65afbc93 | |||
| fb071e55aa | |||
| 406c934b7a | |||
| 292cf009e6 | |||
| 6bf7659b19 | |||
| 0a90e8da29 | |||
| 32fe8e5ee6 | |||
| 7ccbaa7829 | |||
| 7817c9a4ce | |||
| bf9b0aedf9 | |||
| 21b2ff208e | |||
| ecad4541f6 | |||
| 26a29865e7 | |||
| ba287130d3 | |||
| 864276ea72 | |||
| 23eb36b911 | |||
| 51bcd116b3 | |||
| 928adbd594 | |||
| e91b6f692f | |||
| 861dafef70 | |||
| 388c23c376 | |||
| 82d9196c90 | |||
| 826a16eb66 | |||
| 0b97eb85a1 | |||
| a2132001e8 | |||
| 4fd6dd5606 | |||
| 7b71ec4402 | |||
| 07771117ac | |||
| 7fbc884e94 | |||
| b40be72590 | |||
| b3db25c470 | |||
| 6a7b6ffc1f | |||
| c87bb90c48 | |||
| a6225191d0 | |||
| afb904a64c | |||
| cbad391dcd | |||
| 0bf3facf6c | |||
| a9da39b987 | |||
| 1071bdd35a | |||
| 8656985885 | |||
| 3be523b79e | |||
| 1fb7e5ff85 | |||
| df75d6e017 | |||
| 29c9af4902 | |||
| 7b03183e75 | |||
| 4c70e61a14 | |||
| b2b225f4ae | |||
| c17142e648 | |||
| 8e4759bd2b | |||
| d2807917d2 | |||
| 71c030b947 | |||
| 1f3ab5349a | |||
| daaccb9b2f | |||
| b66c58b68e | |||
| 13636a0d29 | |||
| 5232f0a6e2 | |||
| 6a168f2fe1 | |||
| 04f12b545d | |||
| 711b01175d | |||
| 272c2666c5 | |||
| 60ba43378a | |||
| 1da60b3b28 | |||
| ee118b07e5 | |||
| 06ee998d54 | |||
| c197a45540 | |||
| 4d23b45633 | |||
| c027efa931 | |||
| 96c4d6fecc | |||
| 4e1fd54c58 | |||
| d6db020e1c | |||
| c0f4fe017f | |||
| 11d726a6ad | |||
| 37399afc68 | |||
| 5c19fc4208 | |||
| 00f0f13b93 | |||
| 54844fb954 | |||
| 6ffd3afeaa | |||
| 9d0dcd98bd | |||
| f78f877e21 | |||
| 5d0b903c03 | |||
| 032411fe9c | |||
| 76b061d699 | |||
| 08d37c839c | |||
| 0a26230fe1 | |||
| 2e59beda0c | |||
| ee8b1f5dc0 | |||
| 4938cdaecd | |||
| 84f28ae5ce | |||
| 58e7a1f2dc | |||
| 78bba7a0e9 | |||
| 9d31764073 | |||
| d787548915 | |||
| a29bca499f | |||
| 60d3b3025a | |||
| c036041339 | |||
| 1df315612a | |||
| 15beddf96b | |||
| 20d8b18a9b | |||
| 53ff0c39e4 | |||
| 357a3bef09 | |||
| 81eef51e88 | |||
| 10dfb2fe49 | |||
| ef76149112 | |||
| a6a330a78e | |||
| 7f4d0df366 | |||
| 3eddac0a89 | |||
| 6e7ac1c1ca | |||
| 68d9cf1274 | |||
| 5eb0d1548c | |||
| e543904995 | |||
| ffda1d3235 | |||
| b705aa217c | |||
| 6f20b17948 | |||
| 2fde7e5cf8 | |||
| bee06b6731 | |||
| d3fa7336a2 | |||
| 96545a899f | |||
| 6ef5ae2394 | |||
| 05a31dd4d4 | |||
| d9d5c8bf14 | |||
| 291a1f0178 | |||
| 2547b53aa2 | |||
| 409f8b7186 | |||
| 189422bf1e | |||
| 74daeee140 | |||
| befcd3cf98 | |||
| e063ff6aa5 | |||
| 6179c86919 | |||
| a20fe07a56 | |||
| 2b5dcf12d7 | |||
| 5873c1ca96 | |||
| c6e2ecb996 | |||
| 2130b00752 | |||
| 474c3a8348 | |||
| 29a18b8b37 | |||
| 21ab4f0a8b | |||
| 7d2842fd64 | |||
| cd4c121e07 | |||
| d0570f876e | |||
| 8bf99a1ab0 | |||
| fdcd4ddd60 | |||
| c09fff455f | |||
| 0b3755c69a | |||
| e90fb64946 | |||
| d542a4790e | |||
| bc5cb4009c | |||
| 6724533d0e | |||
| 0f0668b77b | |||
| 266bf9b4cf | |||
| a6f3bccf64 | |||
| 2d640f2e6a | |||
| d7d99205a1 | |||
| 9f437d5b9f | |||
| 0d3100ba33 | |||
| 72fb69d87b | |||
| ed4fcf5e9d | |||
| 8742c6e7b9 | |||
| d7d7254a7b | |||
| 7e8870de6c | |||
| 8f2b2addc2 | |||
| e4743bbdef | |||
| 80cdea6932 | |||
| 645f2c5c9c | |||
| 85b81ffc98 | |||
| fa5536f504 | |||
| 16086e79b0 | |||
| b001bba3b8 | |||
| 0c895a2662 | |||
| 6f0641f315 | |||
| dc9dbe8a0f | |||
| 0b8096f973 | |||
| d58a2a9975 | |||
| a83268a6e3 | |||
| 5c83f234c6 | |||
| 24abec4045 | |||
| 56ff56281e | |||
| c25f9ad9ae | |||
| 5041c90ac0 | |||
| 2855675fa5 | |||
| 209689c5f4 | |||
| 3d64b0aa28 | |||
| 3bceab0606 | |||
| c189da3671 | |||
| dd232cedb5 | |||
| 88c5daa561 | |||
| 4f281ef108 | |||
| 12aca7ca58 | |||
| 77ec1aa969 | |||
| 8710a5554c | |||
| 6b24d67409 | |||
| 48c3105f42 | |||
| 032453c4d0 | |||
| f093868da1 | |||
| 1f5e38190d | |||
| 250884c7bc | |||
| 8a2e91e65e | |||
| 5910ce7980 | |||
| 00bec06012 | |||
| 54dccdbc7d | |||
| 2bd776ec55 | |||
| 23cf7c9e8b | |||
| 384f5de765 | |||
| 9ae4798d80 | |||
| 850ccbdcee | |||
| d8ab3f2226 | |||
| 9ddd2dd3bc | |||
| f579641866 | |||
| a71c0c4e74 | |||
| d3921f9e20 | |||
| e0d7332dea | |||
| d6b8eb8548 | |||
| 2964b6c6fa | |||
| a0cd1074e1 | |||
| cc2b5ef918 | |||
| d003fdf357 | |||
| 5384faf3ec | |||
| a833cd84f3 | |||
| 7f1b9d31ea | |||
| 5bd8c11a86 |
`.gitmodules` (vendored, 3 added lines):

```diff
@@ -13,3 +13,6 @@
 [submodule "code/compiler"]
 	path = code/compiler
 	url = https://dev.danilafe.com/DanilaFe/bloglang.git
+[submodule "code/agda-spa"]
+	path = code/agda-spa
+	url = https://dev.danilafe.com/DanilaFe/agda-spa.git
```
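For reference, an entry like the one added above is normally produced by `git submodule add https://dev.danilafe.com/DanilaFe/agda-spa.git code/agda-spa`, which writes the `.gitmodules` stanza and clones the repository in one step.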
`Gemfile` (new file, 9 lines):

```ruby
# frozen_string_literal: true

source "https://rubygems.org"

git_source(:github) {|repo_name| "https://github.com/#{repo_name}" }

gem 'nokogiri'
gem 'execjs'
gem 'duktape'
```
`Gemfile.lock` (new file, 21 lines):

```
GEM
  remote: https://rubygems.org/
  specs:
    duktape (2.7.0.0)
    execjs (2.9.1)
    mini_portile2 (2.8.8)
    nokogiri (1.18.3)
      mini_portile2 (~> 2.8.2)
      racc (~> 1.4)
    racc (1.8.1)

PLATFORMS
  ruby

DEPENDENCIES
  duktape
  execjs
  nokogiri

BUNDLED WITH
   2.1.4
```
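With these two files in place, `bundle install` fetches the pinned versions. Presumably `duktape` is here to give `execjs` an embedded JavaScript runtime for build-time rendering, and `nokogiri` is the HTML parser used by the Ruby scripts added below.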
`agda.rb` (new file, 361 lines):

```ruby
require "nokogiri"
require "pathname"

files = ARGV[0..-1]

class LineInfo
  attr_accessor :links

  def initialize
    @links = []
  end
end

class AgdaContext
  def initialize
    @file_infos = {}
  end

  # Traverse the preformatted Agda block in the given Agda HTML file
  # and find which textual ranges have IDs and links to other ranges.
  # Store this information in a hash, line => LineInfo.
  def process_agda_html_file(file)
    return @file_infos[file] if @file_infos.include? file

    @file_infos[file] = line_infos = {}
    unless File.exist?(file)
      return line_infos
    end

    document = Nokogiri::HTML.parse(File.open(file))
    pre_code = document.css("pre.Agda")[0]

    # The traversal is postorder; we always visit children before their
    # parents, and we visit leaves in sequence.
    offset = 0
    line = 1
    pre_code.traverse do |at|
      # Text nodes are always leaves; visiting a new leaf means we've advanced
      # in the text by the length of that text. However, if there are newlines
      # -- since this is a preformatted block -- we've also advanced by a line.
      # At this time, we do not support links that span multiple lines, but
      # Agda doesn't produce those either.
      if at.text?
        if at.content.include? "\n"
          # This textual leaf is at least part whitespace. The logic
          # assumes that links can't span multiple lines, and that links
          # aren't nested, so ensure that the parent of the textual node
          # is the preformatted block itself.
          if at.parent.name != "pre"
            # Cosmetic highlight warnings are sometimes applied to newlines.
            # If they don't have content, treat them as normal newlines at the
            # top level.
            #
            # This is an <a class="CosmeticProblem">\n</a> node.
            unless at.parent.name == "a" and at.parent['class'] == "CosmeticProblem" and at.content.strip.empty?
              raise "unsupported Agda HTML output in file #{file} at line #{line} (content #{at.content.inspect})"
            end
          end

          # Increase the line and track the final offset. Written as a loop
          # in case we eventually want to add some handling for the pieces
          # sandwiched between newlines.
          at.content.split("\n", -1).each_with_index do |bit, idx|
            line += 1 unless idx == 0
            offset = bit.length
          end
        else
          # It's not a newline node, so it could be anywhere. All we need to
          # do is adjust the offset within the full pre block's text.
          offset += at.content.length
        end
      elsif at.name == "a"
        # We found a link. Agda emits both links and "things to link to" as
        # 'a' nodes, so check for either, and record them. Even if
        # the link is nested, the .content.length accessor will only
        # retrieve the textual content, and thus -- assuming the link
        # isn't split across lines -- will find the proper from-to range.

        line_info = line_infos.fetch(line) { line_infos[line] = LineInfo.new }
        href = at.attribute("href")
        id = at.attribute("id")
        if href or id
          new_node = { :from => offset - at.content.length, :to => offset }
          new_node[:href] = href if href
          new_node[:id] = id if id

          line_info.links << new_node
        end
      end
    end
    return line_infos
  end
end

class FileGroup
  def initialize(agda_context)
    @agda_context = agda_context
    # Agda HTML href -> list of (file, Hugo HTML node that links to it)
    @nodes_referencing_href = {}
    # Agda HTML href -> (its new ID in Hugo-land, file in which it's defined)
    # This supports cross-post linking within a series.
    @global_seen_hrefs = {}
    # file name -> Agda HTML href -> its new ID in Hugo-land
    # This supports linking within a particular post.
    @local_seen_hrefs = Hash.new { {} }
    # Global counter to generate fresh IDs. There's no reason for it to
    # be global within a series (since IDs are namespaced by the file they're in),
    # but it's just more convenient this way.
    @id_counter = 0
  end

  def note_defined_href(file, href)
    file_hrefs = @local_seen_hrefs.fetch(file) do
      @local_seen_hrefs[file] = {}
    end

    uniq_id = file_hrefs.fetch(href) do
      new_id = "agda-unique-ident-#{@id_counter}"
      @id_counter += 1
      file_hrefs[href] = new_id
    end

    unless @global_seen_hrefs.include? href
      @global_seen_hrefs[href] = { :file => file, :id => uniq_id }
    end

    return uniq_id
  end

  def note_used_href(file, node, href)
    ref_list = @nodes_referencing_href.fetch(href) { @nodes_referencing_href[href] = [] }
    ref_list << { :file => file, :node => node }
    return href
  end

  # Given a Hugo HTML file which references potentially several Agda modules
  # in code blocks, insert links into the code blocks.
  #
  # There are several things we need to take care of:
  # 1. Finding the HTML files associated with each referenced Agda module.
  #    For this, we make use of the data-base-path etc. attributes that
  #    the vanilla theme inserts.
  # 2. "Zipping together" the Agda and Hugo HTML representations. Each of
  #    them encodes the code, but they use different HTML elements and structures.
  #    So, given a Hugo HTML code block, traverse its textual contents
  #    and find any that are covered by links in the related Agda HTML file.
  # 3. Fixing up links: the Agda HTML links assume each module has its own HTML
  #    file. This isn't true for us: multiple modules are stitched into
  #    one Hugo HTML file. Also, we don't include the entire Agda HTML
  #    file in the Hugo HTML, so some links may be broken. So, find IDs
  #    that are visible in the Hugo file, rename them to be globally unique,
  #    and re-write cross-file links that reference these IDs to point
  #    inside the Hugo file.
  def process_source_file(file, document)
    # Process each highlight group that's been marked as an Agda file.
    document.css('div[data-agda-block]').each do |t|
      first_line, last_line = nil, nil

      if first_line_attr = t.attribute("data-first-line")
        first_line = first_line_attr.to_s.to_i
      end
      if last_line_attr = t.attribute("data-last-line")
        last_line = last_line_attr.to_s.to_i
      end

      if first_line and last_line
        line_range = first_line..last_line
      else
        line_range = 1..
      end

      # Sometimes, code is deeply nested in the source file, but we don't
      # want to show the leading space. In that case, the generator sets
      # data-source-offset with how much leading space was stripped off.
      initial_offset = 0
      if source_offset_attr = t.attribute("data-source-offset")
        initial_offset = source_offset_attr.to_s.to_i
      end

      full_path = t.attribute("data-file-path").to_s
      full_path_dirs = Pathname(full_path).each_filename.to_a

      # The name of an Agda module is determined from its directory
      # structure: A/B/C.agda creates A.B.C.html. Depending on where
      # the code is included, there might be some additional folders
      # that precede A that we want to ignore.
      base_path = t.attribute("data-base-path").to_s
      base_dir_depth = 0
      if base_path.empty?
        # No submodules were used. Assume code/<X> is the root, since
        # that's the code layout of the blog right now.
        base_dir_depth = 1
        base_path = full_path_dirs[0]
      else
        # The code is in a submodule. Assume that the base path / submodule
        # root is the Agda module root, ignore all folders before that.
        base_path_dirs = Pathname(base_path).each_filename.to_a
        base_dir_depth = base_path_dirs.length
      end

      dirs_in_base = full_path_dirs[base_dir_depth..-1]
      html_file = dirs_in_base.join(".").gsub(/\.agda$/, ".html")
      html_path = File.join(["code", base_path, "html", html_file])

      agda_info = @agda_context.process_agda_html_file(html_path)

      # Hugo conveniently generates a bunch of spans, each encoding a line
      # of code output. We can iterate over these and match them up with
      # the line numbers we got from reading the Agda HTML output.
      lines = t.css("pre.chroma code[data-lang] .line")
      lines.zip(line_range).each do |line, line_no|
        line_info = agda_info[line_no]
        next unless line_info

        offset = initial_offset
        line.traverse do |lt|
          if lt.text?
            content = lt.content
            new_offset = offset + content.length

            # The span/a/etc. structure of the Agda and Hugo HTML files
            # need not line up; it's possible for there to be a single link
            # in the Agda file that's broken up across multiple HTML nodes
            # in the Hugo output. For now, just don't link those, since inserting
            # such overlapping links is relatively complicated. Instead,
            # require links to fit fully within a current text node (and thus,
            # not overlap the boundaries of any HTML).
            matching_links = line_info.links.filter do |link|
              link[:from] >= offset and link[:to] <= new_offset
            end

            # A given text node can be broken into any number of sub-nodes,
            # where some sub-nodes are still text, and others have been turned
            # into links. Store the new pieces in replace_with. E.g.,
            #
            # Original:
            #   abc
            #
            # New:
            #   a<a href="..">b</a>c
            #
            # replace_with:
            #   ["a", <Nokogiri::XML::Node...>, "c"]
            #
            # replace_offset represents how much of the original text we've
            # already converted. The below iteration assumes that matching
            # links are in order, and don't overlap.
            replace_with = []
            replace_offset = 0
            matching_links.each do |match|
              # The link's range is an offset from the beginning of the line,
              # but the text piece we're splitting up might be partway into
              # the line. Convert the link coordinates to piece-relative ones.
              relative_from = match[:from] - offset
              relative_to = match[:to] - offset

              # If the previous link ended some time before the new link
              # began (or if the current link is the first one, and is not
              # at the beginning), ensure that the plain text "in between"
              # is kept.
              replace_with << content[replace_offset...relative_from]

              tag = (match.include? :href) ? 'a' : 'span'
              new_node = Nokogiri::XML::Node.new(tag, document)
              if match.include? :href
                # For nodes with links, note what they're referring to, so
                # we can adjust their hrefs when we assign global IDs.
                href = match[:href].to_s
                new_node['href'] = note_used_href file, new_node, href
              end
              if match.include? :id
                # For nodes with IDs visible in the current Hugo file, we'll
                # want to redirect links that previously go to other Agda
                # module HTML files. So, note the ID that we want to redirect,
                # and pick a new unique ID to replace it with.
                id = match[:id].to_s
                new_node['id'] = note_defined_href file, "#{html_file}##{id}"
              end
              new_node.content = content[relative_from...relative_to]

              replace_with << new_node
              replace_offset = relative_to
            end
            replace_with << content[replace_offset..-1]

            # Finally, replace the node under consideration with the new
            # pieces.
            replace_with.each do |replacement|
              lt.add_previous_sibling replacement
            end
            lt.remove

            offset = new_offset
          end
        end
      end
    end
  end

  def cross_link_files
    # Now, we have a complete list of all the IDs visible in scope.
    # Redirect relevant links to these IDs. This achieves within-post
    # links.
    @nodes_referencing_href.each do |href, references|
      references.each do |reference|
        file = reference[:file]
        node = reference[:node]

        local_targets = @local_seen_hrefs[file]
        if local_targets.include? href
          # A code block in this file provides this href; create a local link.
          node['href'] = "##{local_targets[href]}"
        elsif @global_seen_hrefs.include? href
          # A code block in this series, but not in this file, defines
          # this href. Create a cross-file link.
          target = @global_seen_hrefs[href]
          other_file = target[:file]
          id = target[:id]

          relpath = Pathname.new(other_file).dirname.relative_path_from(Pathname.new(file).dirname)
          node['href'] = "#{relpath}##{id}"
        else
          # No definitions in any blog page. For now, just delete the anchor.
          node.replace node.content
        end
      end
    end
  end
end

agda_context = AgdaContext.new

file_documents = {}
series_groups = files.group_by do |file|
  file_documents[file] = document = Nokogiri::HTML.parse(File.open(file))
  document.css("meta[name=blog-series]")&.attribute("content")&.to_s
end

# For the 'nil' group, process individually.
if files_with_no_series = series_groups.delete(nil)
  files_with_no_series.each do |file|
    file_group = FileGroup.new agda_context
    file_group.process_source_file file, file_documents[file]
    file_group.cross_link_files
  end
end

# For groups, process them together to allow cross-linking.
series_groups.each do |_, files_in_series|
  file_group = FileGroup.new agda_context
  files_in_series.each do |file|
    file_group.process_source_file file, file_documents[file]
  end
  file_group.cross_link_files
end

# Having modified all the HTML files, save them.
file_documents.each do |file, document|
  File.write(file, document.to_html(encoding: 'UTF-8'))
end
```
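The script takes the generated Hugo pages as its arguments (`files = ARGV[0..-1]` above) and rewrites them in place. A minimal driver sketch, assuming Hugo's output lives under `public/` (the glob and path are assumptions, not part of this diff):

```ruby
# Hypothetical driver: post-process every generated page in one run, so that
# posts in the same series land in the same FileGroup and can cross-link.
html_files = Dir.glob("public/**/*.html")
system("ruby", "agda.rb", *html_files) or abort "agda.rb failed"
```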
`analyze.rb` (29 changed lines):

```diff
@@ -25,8 +25,9 @@ Dir['content/blog/**/*.md'].each do |file|
   file = file.chomp
   files << file
   arr = refs[file] || (refs[file] = [])
-  File.open(file).read.scan(/< relref "([^"]+)" >/) do |ref|
-    ref = resolve_path(File.dirname(file), ref[0])
+  pattern = Regexp.union(/< relref "([^"]+)" >/, /< draftlink "[^"]+" "([^"]+)" >/)
+  File.open(file).read.scan(pattern) do |ref|
+    ref = resolve_path(File.dirname(file), ref[0] || ref[1])
     arr << ref
     files << ref
   end
@@ -35,13 +36,14 @@ end

 data = {}
 id = 0
+series = {}
 files.each do |file|
   id += 1
   name = file
   tags = []
   group = 1
   draft = false
-  next unless File.exists?(file)
+  next unless File.exist?(file)
   value = File.size(file)
   url = file.gsub(/^content/, "https://danilafe.com").delete_suffix("/index.md").delete_suffix(".md")
   File.readlines(file).each do |l|
@@ -49,6 +51,12 @@ files.each do |file|
       name = $~[1].delete_prefix('"').delete_suffix('"')
     elsif l =~ /^draft: true$/
       draft = true
+    elsif l =~ /^series: (.+)$/
+      this_series = $~[1]
+      series_list = series.fetch(this_series) do
+        series[this_series] = []
+      end
+      series_list << file
     elsif l =~ /^tags: (.+)$/
       tags = $~[1].delete_prefix("[").delete_suffix("]").split(/,\s?/).map { |it| it.gsub('"', '') }
       if tags.include? "Compilers"
@@ -61,6 +69,10 @@ files.each do |file|
         group = 5
       elsif tags.include? "Crystal"
         group = 6
+      elsif tags.include? "Agda"
+        group = 7
+      elsif tags.include? "Hugo"
+        group = 8
       end
     end
   end
@@ -82,8 +94,17 @@ files.each do |file1|
     edges << { :from => data[file1][:id], :to => data[ref][:id] }
   end
 end
-edges.uniq
+series.each do |series, files|
+  files.sort.each_cons(2) do |file1, file2|
+    next unless data[file1]
+    next unless data[file2]
+    edges << { :from => data[file1][:id], :to => data[file2][:id] }
+    edges << { :from => data[file2][:id], :to => data[file1][:id] }
+  end
+end
+edges.uniq!
 # edges.filter! { |e| e[:from] < e[:to] }
+edges.map! { |e| { :from => [e[:from], e[:to]].min, :to => [e[:from], e[:to]].max } }.uniq!

 puts ("export const nodes = " + JSON.pretty_unparse(data.values) + ";")
 puts ("export const edges = " + JSON.pretty_unparse(edges) + ";")
```
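The `ref[0] || ref[1]` above relies on a detail of `Regexp.union`: each alternative keeps its own capture group, and the groups belonging to the alternative that did not match come back as `nil` from `scan`. A quick sketch of that behavior:

```ruby
pattern = Regexp.union(/< relref "([^"]+)" >/, /< draftlink "[^"]+" "([^"]+)" >/)

'< relref "foo.md" >'.scan(pattern)            #=> [["foo.md", nil]]
'< draftlink "Title" "bar.md" >'.scan(pattern) #=> [[nil, "bar.md"]]
```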
`assets/bergamot/rendering/imp.bergamot` (new file, 65 lines):

```
LatexListNil @ latexlist(nil, nil) <-;
LatexListCons @ latexlist(cons(?x, ?xs), cons(?l_x, ?l_s)) <- latex(?x, ?l_x), latexlist(?xs, ?l_s);

IntercalateNil @ intercalate(?sep, nil, nil) <-;
IntercalateConsCons @ intercalate(?sep, cons(?x_1, cons(?x_2, ?xs)), cons(?x_1, cons(?sep, ?ys))) <- intercalate(?sep, cons(?x_2, ?xs), ?ys);
IntercalateConsNil @ intercalate(?sep, cons(?x, nil), cons(?x, nil)) <-;

NonEmpty @ nonempty(cons(?x, ?xs)) <-;

InListHere @ inlist(?e, cons(?e, ?es)) <-;
InListThere @ inlist(?e_1, cons(?e_2, ?es)) <- inlist(?e_1, ?es);

BasicParenLit @ paren(lit(?v), ?l) <- latex(lit(?v), ?l);
BasicParenVar @ paren(var(?x), ?l) <- latex(var(?x), ?l);
BasicParenVar @ paren(metavariable(?x), ?l) <- latex(metavariable(?x), ?l);
BasicParenOther @ paren(?t, ?l) <- latex(?t, ?l_t), join(["(", ?l_t, ")"], ?l);

LatexInt @ latex(?i, ?l) <- int(?i), tostring(?i, ?l);
LatexFloat @ latex(?f, ?l) <- float(?f), tostring(?f, ?l);
LatexStr @ latex(?s, ?l) <- str(?s), escapestring(?s, ?l_1), latexifystring(?s, ?l_2), join(["\\texttt{\"", ?l_2, "\"}"], ?l);
LatexMeta @ latex(metavariable(?l), ?l) <-;
LatexLit @ latex(lit(?i), ?l) <- latex(?i, ?l);
LatexVar @ latex(var(metavariable(?s)), ?l) <- latex(metavariable(?s), ?l);
LatexVar @ latex(var(?s), ?l) <- latex(?s, ?l_v), join(["\\texttt{", ?l_v, "}"], ?l);
LatexPlus @ latex(plus(?e_1, ?e_2), ?l) <-
    paren(?e_1, ?l_1), paren(?e_2, ?l_2),
    join([?l_1, " + ", ?l_2], ?l);
LatexMinus @ latex(minus(?e_1, ?e_2), ?l) <-
    paren(?e_1, ?l_1), paren(?e_2, ?l_2),
    join([?l_1, " - ", ?l_2], ?l);

EnvLiteralNil @ envlitrec(empty, "\\{", "", ?seen) <-;
EnvLiteralSingle @ envlitsingle(?pre, ?e, ?v, "", ?pre, ?seen) <- inlist(?e, ?seen);
EnvLiteralSingle @ envlitsingle(?pre, ?e, ?v, ?l, ", ", ?seen) <- latex(?e, ?l_e), latex(?v, ?l_v), join([?pre, "\\texttt{", ?l_e, "} \\mapsto", ?l_v], ?l);
EnvLiteralCons @ envlitrec(extend(empty, ?e, ?v), ?l, ?newnext, ?seen) <- envlitrec(?rho, ?l_rho, ?next, cons(?e, ?seen)), envlitsingle(?next, ?e, ?v, ?l_v, ?newnext, ?seen), join([?l_rho, ?l_v], ?l);
EnvLiteralCons @ envlitrec(extend(?rho, ?e, ?v), ?l, ?newnext, ?seen) <- envlitrec(?rho, ?l_rho, ?next, cons(?e, ?seen)), envlitsingle(?next, ?e, ?v, ?l_v, ?newnext, ?seen), join([?l_rho, ?l_v], ?l);
EnvLiteralOuter @ envlit(?rho, ?l) <- envlitrec(?rho, ?l_rho, ?rest, []), join([?l_rho, "\\}"], ?l);

LatexEnvLit @ latex(?rho, ?l) <- envlit(?rho, ?l);
LatexTypeEmpty @ latex(empty, "\\{\\}") <-;
LatexExtend @ latex(extend(?a, ?b, ?c), ?l) <- latex(?a, ?l_a), latex(?b, ?l_b), latex(?c, ?l_c), join([?l_a, "[", ?l_b, " \\mapsto ", ?l_c, "]"], ?l);
LatexInenv @ latex(inenv(?x, ?v, ?G), ?l) <- latex(?x, ?l_x), latex(?v, ?l_v), latex(?G, ?l_G), join([?l_G, "(", ?l_x, ") = ", ?l_v], ?l);
LatexEvalTer @ latex(eval(?G, ?e, ?t), ?l) <- latex(?G, ?l_G), latex(?e, ?l_e), latex(?t, ?l_t), join([?l_G, ",\\ ", ?l_e, " \\Downarrow ", ?l_t], ?l);

LatexAdd @ latex(add(?a, ?b, ?c), ?l) <- latex(?a, ?l_a), latex(?b, ?l_b), latex(?c, ?l_c), join([?l_a, "+", ?l_b, "=", ?l_c], ?l);
LatexSubtract @ latex(subtract(?a, ?b, ?c), ?l) <- latex(?a, ?l_a), latex(?b, ?l_b), latex(?c, ?l_c), join([?l_a, "-", ?l_b, "=", ?l_c], ?l);
LatexEvalTer @ latex(stepbasic(?G, ?e, ?H), ?l) <- latex(?G, ?l_G), latex(?e, ?l_e), latex(?H, ?l_H), join([?l_G, ",\\ ", ?l_e, " \\Rightarrow ", ?l_H], ?l);
LatexEvalTer @ latex(step(?G, ?e, ?H), ?l) <- latex(?G, ?l_G), latex(?e, ?l_e), latex(?H, ?l_H), join([?l_G, ",\\ ", ?l_e, " \\Rightarrow ", ?l_H], ?l);

LatexNoop @ latex(noop, "\\texttt{noop}") <-;
LatexAssign @ latex(assign(?x, ?e), ?l) <- latex(?x, ?l_x), latex(?e, ?l_e), join([?l_x, " = ", ?l_e], ?l);
LatexAssign @ latex(if(?e, ?s_1, ?s_2), ?l) <- latex(?e, ?l_e), latex(?s_1, ?l_1), latex(?s_2, ?l_2), join(["\\textbf{if}\\ ", ?l_e, "\\ \\{\\ ", ?l_1, "\\ \\}\\ \\textbf{else}\\ \\{\\ ", ?l_2, "\\ \\}"], ?l);
LatexAssign @ latex(while(?e, ?s), ?l) <- latex(?e, ?l_e), latex(?s, ?l_s), join(["\\textbf{while}\\ ", ?l_e, "\\ \\{\\ ", ?l_s, "\\ \\}"], ?l);
LatexAssign @ latex(seq(?s_1, ?s_2), ?l) <- latex(?s_1, ?l_1), latex(?s_2, ?l_2), join([?l_1, "; ", ?l_2], ?l);

LatexNumNeq @ latex(not(eq(?e_1, ?e_2)), ?l) <- latex(?e_1, ?l_1), latex(?e_2, ?l_2), join([?l_1, " \\neq ", ?l_2], ?l);
LatexNot @ latex(not(?e), ?l) <- latex(?e, ?l_e), join(["\\neg (", ?l_e, ")"], ?l);
LatexNumEq @ latex(eq(?e_1, ?e_2), ?l) <- latex(?e_1, ?l_1), latex(?e_2, ?l_2), join([?l_1, " = ", ?l_2], ?l);

LatexIsInt @ latex(int(?e), ?l) <- latex(?e, ?l_e), join([?l_e, " \\in \\texttt{Int}"], ?l);
LatexIsFloat @ latex(float(?e), ?l) <- latex(?e, ?l_e), join([?l_e, " \\in \\texttt{Float}"], ?l);
LatexIsNum @ latex(num(?e), ?l) <- latex(?e, ?l_e), join([?l_e, " \\in \\texttt{Num}"], ?l);
LatexIsStr @ latex(str(?e), ?l) <- latex(?e, ?l_e), join([?l_e, " \\in \\texttt{Str}"], ?l);
LatexSym @ latex(?s, ?l) <- sym(?s), tostring(?s, ?l_1), join(["\\text{", ?l_1,"}"], ?l);
LatexCall @ latex(?c, ?l) <- call(?c, ?n, ?ts), nonempty(?ts), latexlist(?ts, ?lts_1), intercalate(", ", ?lts_1, ?lts_2), join(?lts_2, ?lts_3), join(["\\text{", ?n, "}", "(", ?lts_3, ")"], ?l);
```
`assets/bergamot/rendering/lc.bergamot` (new file, 74 lines):

```
PrecApp @ prec(app(?l, ?r), 100, left) <-;
PrecPlus @ prec(plus(?l, ?r), 80, either) <-;
PrecAbs @ prec(abs(?x, ?t, ?e), 0, right) <-;
PrecArr @ prec(tarr(?l, ?r), 0, right) <-;

SelectHead @ select(cons([?t, ?v], ?rest), ?default, ?v) <- ?t;
SelectTail @ select(cons([?t, ?v], ?rest), ?default, ?found) <- not(?t), select(?rest, ?default, ?found);
SelectEmpty @ select(nil, ?default, ?default) <-;

Eq @ eq(?x, ?x) <-;

ParenthAssocLeft @ parenthassoc(?a_i, left, right) <-;
ParenthAssocRight @ parenthassoc(?a_i, right, left) <-;
ParenthAssocNone @ parenthassoc(?a_i, none, ?pos) <-;
ParenthAssocNeq @ parenthassoc(?a_i, ?a_o, ?pos) <- not(eq(?a_i, ?a_o));

Parenth @ parenth(?inner, ?outer, ?pos, ?strin, ?strout) <-
    prec(?inner, ?p_i, ?a_i), prec(?outer, ?p_o, ?a_o),
    join(["(", ?strin, ")"], ?strinparen),
    select([ [less(?p_i, ?p_o), strinparen], [less(?p_o, ?p_i), ?strin], [ parenthassoc(?a_i, ?a_o, ?pos), ?strinparen ] ], ?strin, ?strout);
ParenthFallback @ parenth(?inner, ?outer, ?pos, ?strin, ?strin) <-;

LatexListNil @ latexlist(nil, nil) <-;
LatexListCons @ latexlist(cons(?x, ?xs), cons(?l_x, ?l_s)) <- latex(?x, ?l_x), latexlist(?xs, ?l_s);

IntercalateNil @ intercalate(?sep, nil, nil) <-;
IntercalateConsCons @ intercalate(?sep, cons(?x_1, cons(?x_2, ?xs)), cons(?x_1, cons(?sep, ?ys))) <- intercalate(?sep, cons(?x_2, ?xs), ?ys);
IntercalateConsNil @ intercalate(?sep, cons(?x, nil), cons(?x, nil)) <-;

NonEmpty @ nonempty(cons(?x, ?xs)) <-;

LatexInt @ latex(?i, ?l) <- int(?i), tostring(?i, ?l);
LatexFloat @ latex(?f, ?l) <- float(?f), tostring(?f, ?l);
LatexStr @ latex(?s, ?l) <- str(?s), escapestring(?s, ?l_1), latexifystring(?s, ?l_2), join(["\\texttt{\"", ?l_2, "\"}"], ?l);
LatexMeta @ latex(metavariable(?l), ?l) <-;
LatexLit @ latex(lit(?i), ?l) <- latex(?i, ?l);
LatexVar @ latex(var(?s), ?l) <- latex(?s, ?l);
LatexPlus @ latex(plus(?e_1, ?e_2), ?l) <-
    latex(?e_1, ?l_1), latex(?e_2, ?l_2),
    parenth(?e_1, plus(?e_1, ?e_2), left, ?l_1, ?lp_1),
    parenth(?e_2, plus(?e_1, ?e_2), right, ?l_2, ?lp_2),
    join([?lp_1, " + ", ?lp_2], ?l);
LatexPair @ latex(pair(?e_1, ?e_2), ?l) <- latex(?e_1, ?l_1), latex(?e_2, ?l_2), join(["(", ?l_1, ", ", ?l_2, ")"], ?l);
LatexAbs @ latex(abs(?x, ?t, ?e), ?l) <- latex(?e, ?l_e), latex(?t, ?l_t), latex(?x, ?l_x), join(["\\lambda ", ?l_x, " : ", ?l_t, " . ", ?l_e], ?l);
LatexApp @ latex(app(?e_1, ?e_2), ?l) <-
    latex(?e_1, ?l_1), latex(?e_2, ?l_2),
    parenth(?e_1, app(?e_1, ?e_2), left, ?l_1, ?lp_1),
    parenth(?e_2, app(?e_1, ?e_2), right, ?l_2, ?lp_2),
    join([?lp_1, " \\enspace ", ?lp_2], ?l);

LatexTInt @ latex(tint, "\\text{tint}") <-;
LatexTStr @ latex(tstr, "\\text{tstr}") <-;
LatexTArr @ latex(tarr(?t_1, ?t_2), ?l) <-
    latex(?t_1, ?l_1), latex(?t_2, ?l_2),
    parenth(?t_1, tarr(?t_1, ?t_2), left, ?l_1, ?lp_1),
    parenth(?t_2, tarr(?t_1, ?t_2), right, ?l_2, ?lp_2),
    join([?lp_1, " \\to ", ?lp_2], ?l);
LatexTPair @ latex(tpair(?t_1, ?t_2), ?l) <- latex(?t_1, ?l_1), latex(?t_2, ?l_2), join(["(", ?l_1, ", ", ?l_2, ")"], ?l);

LatexTypeEmpty @ latex(empty, "\\varnothing") <-;
LatexTypeExtend @ latex(extend(?a, ?b, ?c), ?l) <- latex(?a, ?l_a), latex(?b, ?l_b), latex(?c, ?l_c), join([?l_a, " , ", ?l_b, " : ", ?l_c], ?l);
LatexTypeInenv @ latex(inenv(?x, ?t, ?G), ?l) <- latex(?x, ?l_x), latex(?t, ?l_t), latex(?G, ?l_G), join([?l_x, " : ", ?l_t, " \\in ", ?l_G], ?l);

LatexTypeBin @ latex(type(?e, ?t), ?l) <- latex(?e, ?l_e), latex(?t, ?l_t), join([?l_e, " : ", ?l_t], ?l);
LatexTypeTer @ latex(type(?G, ?e, ?t), ?l) <- latex(?G, ?l_G), latex(?e, ?l_e), latex(?t, ?l_t), join([?l_G, " \\vdash ", ?l_e, " : ", ?l_t], ?l);

LatexConverts @ latex(converts(?f, ?t), ?l) <- latex(?f, ?l_f), latex(?t, ?l_t), join([?l_f, " \\preceq ", ?l_t], ?l);

LatexIsInt @ latex(int(?e), ?l) <- latex(?e, ?l_e), join([?l_e, " \\in \\texttt{Int}"], ?l);
LatexIsFloat @ latex(float(?e), ?l) <- latex(?e, ?l_e), join([?l_e, " \\in \\texttt{Float}"], ?l);
LatexIsNum @ latex(num(?e), ?l) <- latex(?e, ?l_e), join([?l_e, " \\in \\texttt{Num}"], ?l);
LatexIsStr @ latex(str(?e), ?l) <- latex(?e, ?l_e), join([?l_e, " \\in \\texttt{Str}"], ?l);
LatexSym @ latex(?s, ?l) <- sym(?s), tostring(?s, ?l_1), join(["\\text{", ?l_1,"}"], ?l);
LatexCall @ latex(?c, ?l) <- call(?c, ?n, ?ts), nonempty(?ts), latexlist(?ts, ?lts_1), intercalate(", ", ?lts_1, ?lts_2), join(?lts_2, ?lts_3), join(["\\text{", ?n, "}", "(", ?lts_3, ")"], ?l);
```
`assets/scss/bergamot.scss` (new file, 174 lines):

```scss
@import "variables.scss";
@import "mixins.scss";

.bergamot-exercise {
  counter-increment: bergamot-exercise;

  .bergamot-root {
    border: none;
    padding: 0;
    margin-top: 1em;
  }

  .bergamot-exercise-label {
    .bergamot-exercise-number::after {
      content: "Exercise " counter(bergamot-exercise);
      font-weight: bold;
      text-decoration: underline;
    }
  }

  .bergamot-button {
    @include bordered-block;
    padding: 0.25em;
    padding-left: 1em;
    padding-right: 1em;
    background-color: inherit;
    display: inline-flex;
    align-items: center;
    justify-content: center;
    transition: 0.25s;
    font-family: $font-body;
    @include var(color, text-color);

    &.bergamot-hidden {
      display: none;
    }

    .feather {
      margin-right: 0.5em;
    }
  }

  .bergamot-play {
    .feather { color: $primary-color; }
    &:hover, &:focus {
      .feather { color: lighten($primary-color, 20%); }
    }
  }

  .bergamot-reset {
    .feather { color: #0099CC; }
    &:hover, &:focus {
      .feather { color: lighten(#0099CC, 20%); }
    }

    svg {
      fill: none;
    }
  }

  .bergamot-close {
    .feather { color: tomato; }
    &:hover, &:focus {
      .feather { color: lighten(tomato, 20%); }
    }
  }

  .bergamot-button-group {
    margin-top: 1em;
  }
}

.bergamot-root {
  @include bordered-block;
  padding: 1em;

  .bergamot-section-heading {
    margin-bottom: 0.5em;
    font-family: $font-body;
    font-style: normal;
    font-weight: bold;
    font-size: 1.25em;
  }

  .bergamot-section {
    margin-bottom: 1em;
  }

  textarea {
    display: block;
    width: 100%;
    height: 10em;
    resize: none;
  }

  input[type="text"] {
    width: 100%;
    @include textual-input;
  }

  .bergamot-rule-list {
    display: flex;
    flex-direction: row;
    flex-wrap: wrap;
    justify-content: center;
  }

  .bergamot-rule-list katex-expression {
    margin-left: .5em;
    margin-right: .5em;
    flex-grow: 1;
    flex-basis: 0;
  }

  .bergamot-rule-section {
    .bergamot-rule-section-name {
      text-align: center;
      margin: 0.25em;
      font-weight: bold;
    }
  }

  .bergamot-proof-tree {
    overflow: auto;
  }

  .bergamot-error {
    @include bordered-block;
    padding: 0.5rem;
    border-color: tomato;
    background-color: rgba(tomato, 0.25);
    margin-top: 1rem;
  }

  .bergamot-selector {
    button {
      @include var(background-color, background-color);
      @include var(color, text-color);
      @include bordered-block;
      padding: 0.5rem;
      font-family: $font-body;
      border-style: dotted;

      &.active {
        border-color: $primary-color;
        border-style: solid;
        font-weight: bold;
      }

      &:not(:first-child) {
        border-bottom-left-radius: 0;
        border-top-left-radius: 0;
      }

      &:not(:last-child) {
        border-bottom-right-radius: 0;
        border-top-right-radius: 0;
        border-right-width: 0;
      }
    }

    button.active + button {
      border-left-color: $primary-color;
      border-left-style: solid;
    }

    margin-bottom: 1rem;
  }

  .bergamot-no-proofs {
    text-align: center;
  }
}
```
`assets/scss/thevoid.scss` (new file, 430 lines):

```scss
@import "variables.scss";

body {
  background-color: #1c1e26;
  --text-color: white;
  font-family: $font-code;
}

h1, h2, h3, h4, h5, h6 {
  text-align: left;
  font-family: $font-code;
}

h1::after {
  content: "(writing)";
  font-size: 1rem;
  margin-left: 0.5em;
  position: relative;
  bottom: -0.5em;
  color: $primary-color;
}

nav .container {
  justify-content: flex-start;

  a {
    padding-left: 0;
    margin-right: 1em;
  }
}

.header-divider {
  visibility: hidden;
}

hr {
  height: auto;
  border: none;

  &::after {
    content: "* * *";
    color: $primary-color;
    font-size: 2rem;
    display: block;
    text-align: center;
  }
}

/* Code for the CSS glitch effect. Originally from: https://codepen.io/mattgrosswork/pen/VwprebG */

.glitch {
  position: relative;

  span {
    animation: paths 5s step-end infinite;
    font-weight: bold;
  }

  &::before, &::after {
    content: attr(data-text);
    position: absolute;
    width: 110%;
    z-index: -1;
  }

  &::before {
    top: 10px;
    left: 15px;
    color: #e0287d;

    animation: paths 5s step-end infinite, opacity 5s step-end infinite,
      font 8s step-end infinite, movement 10s step-end infinite;
  }

  &::after {
    top: 5px;
    left: -10px;
    color: #1bc7fb;

    animation: paths 5s step-end infinite, opacity 5s step-end infinite,
      font 7s step-end infinite, movement 8s step-end infinite;
  }
}

@keyframes paths {
  0% {
    clip-path: polygon(
      0% 43%,
      83% 43%,
      83% 22%,
      23% 22%,
      23% 24%,
      91% 24%,
      91% 26%,
      18% 26%,
      18% 83%,
      29% 83%,
      29% 17%,
      41% 17%,
      41% 39%,
      18% 39%,
      18% 82%,
      54% 82%,
      54% 88%,
      19% 88%,
      19% 4%,
      39% 4%,
      39% 14%,
      76% 14%,
      76% 52%,
      23% 52%,
      23% 35%,
      19% 35%,
      19% 8%,
      36% 8%,
      36% 31%,
      73% 31%,
      73% 16%,
      1% 16%,
      1% 56%,
      50% 56%,
      50% 8%
    );
  }

  5% {
    clip-path: polygon(
      0% 29%,
      44% 29%,
      44% 83%,
      94% 83%,
      94% 56%,
      11% 56%,
      11% 64%,
      94% 64%,
      94% 70%,
      88% 70%,
      88% 32%,
      18% 32%,
      18% 96%,
      10% 96%,
      10% 62%,
      9% 62%,
      9% 84%,
      68% 84%,
      68% 50%,
      52% 50%,
      52% 55%,
      35% 55%,
      35% 87%,
      25% 87%,
      25% 39%,
      15% 39%,
      15% 88%,
      52% 88%
    );
  }

  30% {
    clip-path: polygon(
      0% 53%,
      93% 53%,
      93% 62%,
      68% 62%,
      68% 37%,
      97% 37%,
      97% 89%,
      13% 89%,
      13% 45%,
      51% 45%,
      51% 88%,
      17% 88%,
      17% 54%,
      81% 54%,
      81% 75%,
      79% 75%,
      79% 76%,
      38% 76%,
      38% 28%,
      61% 28%,
      61% 12%,
      55% 12%,
      55% 62%,
      68% 62%,
      68% 51%,
      0% 51%,
      0% 92%,
      63% 92%,
      63% 4%,
      65% 4%
    );
  }

  45% {
    clip-path: polygon(
      0% 33%,
      2% 33%,
      2% 69%,
      58% 69%,
      58% 94%,
      55% 94%,
      55% 25%,
      33% 25%,
      33% 85%,
      16% 85%,
      16% 19%,
      5% 19%,
      5% 20%,
      79% 20%,
      79% 96%,
      93% 96%,
      93% 50%,
      5% 50%,
      5% 74%,
      55% 74%,
      55% 57%,
      96% 57%,
      96% 59%,
      87% 59%,
      87% 65%,
      82% 65%,
      82% 39%,
      63% 39%,
      63% 92%,
      4% 92%,
      4% 36%,
      24% 36%,
      24% 70%,
      1% 70%,
      1% 43%,
      15% 43%,
      15% 28%,
      23% 28%,
      23% 71%,
      90% 71%,
      90% 86%,
      97% 86%,
      97% 1%,
      60% 1%,
      60% 67%,
      71% 67%,
      71% 91%,
      17% 91%,
      17% 14%,
      39% 14%,
      39% 30%,
      58% 30%,
      58% 11%,
      52% 11%,
      52% 83%,
      68% 83%
    );
  }

  76% {
    clip-path: polygon(
      0% 26%,
      15% 26%,
      15% 73%,
      72% 73%,
      72% 70%,
      77% 70%,
      77% 75%,
      8% 75%,
      8% 42%,
      4% 42%,
      4% 61%,
      17% 61%,
      17% 12%,
      26% 12%,
      26% 63%,
      73% 63%,
      73% 43%,
      90% 43%,
      90% 67%,
      50% 67%,
      50% 41%,
      42% 41%,
      42% 46%,
      50% 46%,
      50% 84%,
      96% 84%,
      96% 78%,
      49% 78%,
      49% 25%,
      63% 25%,
      63% 14%
    );
  }

  90% {
    clip-path: polygon(
      0% 41%,
      13% 41%,
      13% 6%,
      87% 6%,
      87% 93%,
      10% 93%,
      10% 13%,
      89% 13%,
      89% 6%,
      3% 6%,
      3% 8%,
      16% 8%,
      16% 79%,
      0% 79%,
      0% 99%,
      92% 99%,
      92% 90%,
      5% 90%,
      5% 60%,
      0% 60%,
      0% 48%,
      89% 48%,
      89% 13%,
      80% 13%,
      80% 43%,
      95% 43%,
      95% 19%,
      80% 19%,
      80% 85%,
      38% 85%,
      38% 62%
    );
  }

  1%,
  7%,
  33%,
  47%,
  78%,
  93% {
    clip-path: none;
  }
}

@keyframes movement {
  0% {
    top: 0px;
    left: -20px;
  }

  15% {
    top: 10px;
    left: 10px;
  }

  60% {
    top: 5px;
    left: -10px;
  }

  75% {
    top: -5px;
    left: 20px;
  }

  100% {
    top: 10px;
    left: 5px;
  }
}

@keyframes opacity {
  0% {
    opacity: 0.1;
  }

  5% {
    opacity: 0.7;
  }

  30% {
    opacity: 0.4;
  }

  45% {
    opacity: 0.6;
  }

  76% {
    opacity: 0.4;
  }

  90% {
    opacity: 0.8;
  }

  1%,
  7%,
  33%,
  47%,
  78%,
  93% {
    opacity: 0;
  }
}

@keyframes font {
  0% {
    font-weight: 100;
    color: #e0287d;
    filter: blur(3px);
  }

  20% {
    font-weight: 500;
    color: #fff;
    filter: blur(0);
  }

  50% {
    font-weight: 300;
    color: #1bc7fb;
    filter: blur(2px);
  }

  60% {
    font-weight: 700;
    color: #fff;
    filter: blur(0);
  }

  90% {
    font-weight: 500;
    color: #e0287d;
    filter: blur(6px);
  }
}
```
`build-agda-html.rb` (new file, 70 lines):

```ruby
require "json"
require "set"
require "optparse"
require "fileutils"

# For target_dir, use absolute paths because when invoking Agda, we'll be
# using chdir.
root_path = "code"
target_dir = File.expand_path "code"
data_file = "data/submodules.json"
OptionParser.new do |opts|
  opts.banner = "Usage: build-agda-html.rb [options]"

  opts.on("--root-path=PATH", "Search for Agda project folders in the given location") do |f|
    root_path = f
  end
  opts.on("--target-dir=PATH", "Generate HTML files into the given directory") do |f|
    target_dir = File.expand_path f
  end
  opts.on("--data-file=FILE", "Specify the submodules.json that encodes nested submodule structure") do |f|
    data_file = f
  end
end.parse!
files = ARGV

code_paths = Dir.entries(root_path).select do |f|
  File.directory?(File.join(root_path, f)) and f != '.' and f != '..'
end.to_set
code_paths += JSON.parse(File.read(data_file)).keys if File.exist? data_file
# Extending code_paths from submodules.json means that nested Agda modules
# have their root dir correctly set.

max_path = ->(path) {
  code_paths.max_by do |code_path|
    count = 0
    path.chars.zip(code_path.chars) do |c1, c2|
      break unless c1 == c2
      count += 1
    end

    next count
  end
}

files_for_paths = {}
Dir.glob("**/*.agda", base: root_path) do |agda_file|
  best_path = max_path.call(agda_file)
  files_for_path = files_for_paths.fetch(best_path) do
    files_for_paths[best_path] = []
  end

  files_for_path << agda_file[best_path.length + File::SEPARATOR.length..-1]
end

original_wd = Dir.getwd
files_for_paths.each do |path, files|
  Dir.chdir(original_wd)
  Dir.chdir(File.join(root_path, path))
  html_dir = File.join [target_dir, path, "html"]
  FileUtils.mkdir_p html_dir

  files.each do |file|
    command = "#{ARGV[0]} #{file} --html --html-dir=#{html_dir}"
    puts command
    puts `#{command}`

    # Allow some programs to fail (e.g., IO.agda in SPA without --guardedness)
    # fail unless $? == 0
  end
end
```
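After `OptionParser#parse!` strips the flags, the first remaining argument becomes the Agda executable used in the loop above (`ARGV[0]`). A sketch of an invocation, where the flag values simply repeat the script's own defaults:

```ruby
# Hypothetical invocation; "agda" is whatever Agda binary is on PATH.
system("ruby", "build-agda-html.rb",
       "--root-path=code",
       "--data-file=data/submodules.json",
       "agda")
```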
`chatgpt-subset-feather-icon.rb` (new file, 49 lines):

```ruby
#!/usr/bin/env ruby
# frozen_string_literal: true

require 'nokogiri'
require 'set'

# 1) Process all files passed in from the command line
svgpath = ARGV[0]
files = ARGV[1..]

# 2) Extract used Feather icons
used_icons = Set.new

files.each do |file|
  # Parse each HTML file
  doc = File.open(file, "r:UTF-8") { |f| Nokogiri::HTML(f) }

  # Look for <use xlink:href="/feather-sprite.svg#iconName">
  doc.css("use").each do |use_tag|
    href = use_tag["xlink:href"] || use_tag["href"]
    if href && href.start_with?("/feather-sprite.svg#")
      icon_name = href.split("#").last
      used_icons << icon_name
    end
  end
end

puts "Found #{used_icons.size} unique icons: #{used_icons.to_a.join(', ')}"

# 3) Load the full feather-sprite.svg as XML
sprite_doc = File.open(svgpath, "r:UTF-8") { |f| Nokogiri::XML(f) }

# 4) Create a new SVG with only the required symbols
new_svg = Nokogiri::XML::Document.new
svg_tag = Nokogiri::XML::Node.new("svg", new_svg)
svg_tag["xmlns"] = "http://www.w3.org/2000/svg"
new_svg.add_child(svg_tag)

sprite_doc.css("symbol").each do |symbol_node|
  if used_icons.include?(symbol_node["id"])
    # Duplicate the symbol node (so it can be inserted in the new document)
    svg_tag.add_child(symbol_node.dup)
  end
end

# 5) Save the subset sprite
File.open(svgpath, "w:UTF-8") do |f|
  f.write(new_svg.to_xml)
end
```
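The script expects the sprite path first and the HTML files to scan after it, and subsets the sprite in place. A sketch of an invocation, where the sprite location is an assumption based on the `/feather-sprite.svg#` prefix the script matches:

```ruby
# Hypothetical invocation: subset the deployed sprite after the site is built.
system("ruby", "chatgpt-subset-feather-icon.rb",
       "public/feather-sprite.svg", *Dir.glob("public/**/*.html"))
```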
`chatgpt-subset-one-go.py` (new file, 69 lines):

```python
import os
import sys
from bs4 import BeautifulSoup
from fontTools.subset import Subsetter, Options
from fontTools.ttLib import TTFont

FONT_EXTENSIONS = (".ttf", ".woff", ".woff2", ".otf")  # Font file types

def extract_text_from_html(file_path):
    """Extract text content from a single HTML file."""
    with open(file_path, "r", encoding="utf-8") as f:
        soup = BeautifulSoup(f.read(), "html.parser")
    return soup.get_text()

def get_used_characters(files):
    """Collect unique characters from the given HTML files."""
    char_set = set()
    for file in files:
        text = extract_text_from_html(file)
        char_set.update(text)
    return "".join(sorted(char_set))

def find_font_files(directory):
    """Find all font files in the given directory, recursively."""
    font_files = []
    for root, _, files in os.walk(directory):
        for file in files:
            if file.endswith(FONT_EXTENSIONS):
                font_files.append(os.path.join(root, file))
    return font_files

def subset_font_in_place(font_path, characters):
    """Subsets the given font file to include only the specified characters."""
    # Convert characters to their integer code points
    unicode_set = {ord(c) for c in characters}

    font = TTFont(font_path)
    options = Options()
    options.drop_tables += ["DSIG"]
    options.drop_tables += ["LTSH", "VDMX", "hdmx", "gasp"]
    options.unicodes = unicode_set
    options.variations = False
    options.drop_variations = True
    options.layout_features = ["*"]  # keep all OT features
    options.hinting = False

    # Preserve original format if it was WOFF/WOFF2
    if font_path.endswith(".woff2"):
        options.flavor = "woff2"
    elif font_path.endswith(".woff"):
        options.flavor = "woff"

    subsetter = Subsetter(options)
    subsetter.populate(unicodes=unicode_set)
    subsetter.subset(font)

    # Overwrite the original font file
    font.save(font_path)
    print(f"Subsetted font in place: {font_path}")

if __name__ == "__main__":
    used_chars = get_used_characters(sys.argv[2:])
    print(f"Extracted {len(used_chars)} unique characters from {len(sys.argv[2:])} HTML files.")

    font_files = find_font_files(sys.argv[1])
    print(f"Found {len(font_files)} font files to subset.")

    for font_file in font_files:
        subset_font_in_place(font_file, used_chars)
```
code/agda-issomething/example.agda (new file, 87 lines)
@@ -0,0 +1,87 @@
open import Agda.Primitive using (Level; lsuc)
open import Relation.Binary.PropositionalEquality using (_≡_)

variable
  a : Level
  A : Set a

module FirstAttempt where
  record Semigroup (A : Set a) : Set a where
    field
      _∙_ : A → A → A

      isAssociative : ∀ (a₁ a₂ a₃ : A) → a₁ ∙ (a₂ ∙ a₃) ≡ (a₁ ∙ a₂) ∙ a₃

  record Monoid (A : Set a) : Set a where
    field semigroup : Semigroup A

    open Semigroup semigroup public

    field
      zero : A

      isIdentityLeft : ∀ (a : A) → zero ∙ a ≡ a
      isIdentityRight : ∀ (a : A) → a ∙ zero ≡ a

  record ContrivedExample (A : Set a) : Set a where
    field
      -- first property
      monoid : Monoid A

      -- second property; Semigroup is a stand-in.
      semigroup : Semigroup A

      operationsEqual : Monoid._∙_ monoid ≡ Semigroup._∙_ semigroup

module SecondAttempt where
  record IsSemigroup {A : Set a} (_∙_ : A → A → A) : Set a where
    field isAssociative : ∀ (a₁ a₂ a₃ : A) → a₁ ∙ (a₂ ∙ a₃) ≡ (a₁ ∙ a₂) ∙ a₃

  record IsMonoid {A : Set a} (zero : A) (_∙_ : A → A → A) : Set a where
    field
      isSemigroup : IsSemigroup _∙_

      isIdentityLeft : ∀ (a : A) → zero ∙ a ≡ a
      isIdentityRight : ∀ (a : A) → a ∙ zero ≡ a

    open IsSemigroup isSemigroup public

  record IsContrivedExample {A : Set a} (zero : A) (_∙_ : A → A → A) : Set a where
    field
      -- first property
      monoid : IsMonoid zero _∙_

      -- second property; Semigroup is a stand-in.
      semigroup : IsSemigroup _∙_

  record Semigroup (A : Set a) : Set a where
    field
      _∙_ : A → A → A
      isSemigroup : IsSemigroup _∙_

  record Monoid (A : Set a) : Set a where
    field
      zero : A
      _∙_ : A → A → A
      isMonoid : IsMonoid zero _∙_

module ThirdAttempt {A : Set a} (_∙_ : A → A → A) where
  record IsSemigroup : Set a where
    field isAssociative : ∀ (a₁ a₂ a₃ : A) → a₁ ∙ (a₂ ∙ a₃) ≡ (a₁ ∙ a₂) ∙ a₃

  record IsMonoid (zero : A) : Set a where
    field
      isSemigroup : IsSemigroup

      isIdentityLeft : ∀ (a : A) → zero ∙ a ≡ a
      isIdentityRight : ∀ (a : A) → a ∙ zero ≡ a

    open IsSemigroup isSemigroup public

  record IsContrivedExample (zero : A) : Set a where
    field
      -- first property
      monoid : IsMonoid zero

      -- second property; Semigroup is a stand-in.
      semigroup : IsSemigroup
code/agda-spa (submodule updated)
code/dyno-alloy/DynoAlloy.als (new file, 202 lines)
@@ -0,0 +1,202 @@
enum Flag {Method, MethodOrField, Public}

/* There is a negative version for each flag (METHOD and NOT_METHOD).
   Model this as two sets, one of positive flags, and one of negative flags,
   and interpret the bitfield to be a conjunction of both flags. */
sig Bitfield {
  , positiveFlags: set Flag
  , negativeFlags: set Flag
}

/* A filter state has filterFlags and excludeFlags, both represented as conjunctions. */
sig FilterState {
  , curFilter: Bitfield
}

/* Initially, no search has happened for a scope, so its 'found' is not set to anything. */
one sig NotSet {}

/* Finally, there's a search state (whether or not a particular scope has already been
   searched with a particular configuration). */
one sig SearchState {
  , var found: Bitfield + NotSet
}

pred bitfieldEmpty[b: Bitfield] {
  #b.positiveFlags = 0 and #b.negativeFlags = 0
}

pred bitfieldEqual[b1: Bitfield, b2: Bitfield] {
  b1.positiveFlags = b2.positiveFlags and b1.negativeFlags = b2.negativeFlags
}

pred bitfieldIntersection[b1: Bitfield, b2: Bitfield, b3: Bitfield] {
  b3.positiveFlags = b1.positiveFlags & b2.positiveFlags
  b3.negativeFlags = b1.negativeFlags & b2.negativeFlags
}

pred bitfieldSubset[b1: Bitfield, b2: Bitfield] {
  b1.positiveFlags in b2.positiveFlags
  b1.negativeFlags in b2.negativeFlags
}

pred bitfieldIncomparable[b1: Bitfield, b2: Bitfield] {
  not bitfieldSubset[b1, b2]
  not bitfieldSubset[b2, b1]
}

pred addBitfieldFlag[b1: Bitfield, b2: Bitfield, flag: Flag] {
  b2.positiveFlags = b1.positiveFlags + flag
  b2.negativeFlags = b1.negativeFlags
}

pred addBitfieldFlagNeg[b1: Bitfield, b2: Bitfield, flag: Flag] {
  b2.negativeFlags = b1.negativeFlags + flag
  b2.positiveFlags = b1.positiveFlags
}

enum Property { PMethod, PField, PPublic }

sig Symbol {
  properties: set Property
}

pred flagMatchesProperty[flag: Flag, property: Property] {
  (flag = Method and property = PMethod) or
  (flag = MethodOrField and (property = PMethod or property = PField)) or
  (flag = Public and property = PPublic)
}

pred bitfieldMatchesProperties[bitfield: Bitfield, symbol: Symbol] {
  all flag: bitfield.positiveFlags | some property: symbol.properties | flagMatchesProperty[flag, property]
  all flag: bitfield.negativeFlags | no property: symbol.properties | flagMatchesProperty[flag, property]
}

bitfieldExists: run {
  some Bitfield
}

matchingBitfieldExists: run {
  some bitfield : Bitfield, symbol : Symbol | bitfieldMatchesProperties[bitfield, symbol]
}

matchingBitfieldExists2: run {
  some bitfield : Bitfield, symbol : Symbol {
    #bitfield.positiveFlags = 1
    #bitfield.negativeFlags = 1
    #symbol.properties = 2
    bitfieldMatchesProperties[bitfield, symbol]
  }
}

fact "method and field are incompatible" {
  always no symbol: Symbol | {
    PMethod in symbol.properties and PField in symbol.properties
  }
}

fact "public and field are incompatible" {
  always no symbol: Symbol | {
    PPublic in symbol.properties and PField in symbol.properties
  }
}

matchingBitfieldExists3: run {
  some bitfield : Bitfield, symbol : Symbol {
    #bitfield.positiveFlags = 2
    #symbol.properties = 2
    bitfieldMatchesProperties[bitfield, symbol]
  }
}

pred possibleState[filterState: FilterState] {
  some initialState: FilterState {
    // Each lookup in scope starts with empty filter flags
    bitfieldEmpty[initialState.curFilter]

    // The intermediate states (bitfieldMiddle) are used for sequencing of operations.
    some bitfieldMiddle : Bitfield {
      // Add "Public" depending on skipPrivateVisibilities
      addBitfieldFlag[initialState.curFilter, bitfieldMiddle, Public] or
      bitfieldEqual[initialState.curFilter, bitfieldMiddle]

      // If it's a method receiver, add method or field restriction
      addBitfieldFlag[bitfieldMiddle, filterState.curFilter, MethodOrField] or
      // if it's not a receiver, filter to non-methods (could be overridden)
      // addBitfieldFlagNeg[bitfieldMiddle, filterState.curFilter, Method] or
      // Maybe methods are not being filtered but it's not a receiver, so no change.
      bitfieldEqual[bitfieldMiddle, filterState.curFilter]
    }
  }
}

possibleStateExists: run {
  some filterState : FilterState | possibleState[filterState] and #filterState.curFilter.positiveFlags = 1
}

pred update[toSet: Bitfield + NotSet, setTo: FilterState] {
  toSet' in Bitfield and bitfieldIntersection[toSet, setTo.curFilter, toSet']
}

pred newUpdate[toSet: Bitfield + NotSet, setTo: FilterState] {
  (not bitfieldIncomparable[toSet, setTo.curFilter] and update[toSet, setTo]) or
  (bitfieldIncomparable[toSet, setTo.curFilter] and toSet = toSet')
}

pred updateOrSet[toSet: Bitfield + NotSet, setTo: FilterState] {
  (toSet in NotSet and toSet' = setTo.curFilter) or
  (toSet not in NotSet and update[toSet, setTo])
}

pred excludeBitfield[found: Bitfield + NotSet, exclude: Bitfield] {
  (found != NotSet and bitfieldEqual[found, exclude]) or
  (found = NotSet and bitfieldEmpty[exclude])
}

fact init {
  all searchState: SearchState | searchState.found = NotSet
}

fact step {
  always {
    // Model that a new doLookupInScope could've occurred, with any combination of flags.
    all searchState: SearchState {
      some fs: FilterState {
        // This is a possible combination of lookup flags
        possibleState[fs]

        // If a search has been performed before, take the intersection; otherwise,
        // just insert the current filter flags.
        updateOrSet[searchState.found, fs]
      }
    }
  }
}

counterexampleNotFound: run {
  all searchState: SearchState {
    // We're looking for a way that subsequent results of searching will miss things.
    eventually some symbol: Symbol,
      fs: FilterState, fsBroken: FilterState,
      exclude1: Bitfield, exclude2: Bitfield {
      // Some search (fs) will cause a transition / modification of the search state...
      possibleState[fs]
      updateOrSet[searchState.found, fs]
      excludeBitfield[searchState.found, exclude1]
      // Such that a later, valid search... (fsBroken)
      possibleState[fsBroken]
      excludeBitfield[searchState.found', exclude2]

      // Will allow for a symbol ...
      // ... that is left out of the original search...
      not bitfieldMatchesProperties[searchState.found, symbol]
      // ... and out of the current search
      not (bitfieldMatchesProperties[fs.curFilter, symbol] and not bitfieldMatchesProperties[exclude1, symbol])
      // But would be matched by the broken search...
      bitfieldMatchesProperties[fsBroken.curFilter, symbol]
      // ... to not be matched by a search with the new state:
      not (bitfieldMatchesProperties[fsBroken.curFilter, symbol] and not bitfieldMatchesProperties[exclude2, symbol])
    }
  }
}
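For intuition, the matching discipline that `bitfieldMatchesProperties` encodes can be sketched in ordinary Python, with sets standing in for Alloy's relations and `FLAG_MATCHES` mirroring `flagMatchesProperty`. This is an illustration of the predicate's meaning, not part of the model:

```python
# Python sketch of bitfieldMatchesProperties: every positive flag must match
# some property of the symbol, and no negative flag may match any of them.
FLAG_MATCHES = {
    "Method": {"PMethod"},
    "MethodOrField": {"PMethod", "PField"},
    "Public": {"PPublic"},
}

def matches(positive, negative, properties):
    pos_ok = all(FLAG_MATCHES[f] & properties for f in positive)
    neg_ok = all(not (FLAG_MATCHES[f] & properties) for f in negative)
    return pos_ok and neg_ok

print(matches({"Method"}, set(), {"PMethod"}))  # True: a method passes a Method filter
print(matches(set(), {"Method"}, {"PMethod"}))  # False: a NOT_METHOD filter excludes it
```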
code/patterns/patterns_genbase.rb (new file, 62 lines)
@@ -0,0 +1,62 @@
require 'victor'

BASE = 4
DIRS = 7

def sum_digits(n)
  x = n % BASE
  x == 0 ? BASE : x
end

def step(x, y, n, dir)
  return [n*Math.cos(2*Math::PI/DIRS*dir), n*Math.sin(2*Math::PI/DIRS*dir), (dir+1) % DIRS]
end

def run_number(number)
  counter = 1
  x, y, dir = 0.0, 0.0, 0
  line_stack = [[0,0]]

  (BASE/BASE.gcd(number) * DIRS).times do |i|
    dx, dy, dir = step(x, y, sum_digits(i*number), dir)
    x += dx
    y += dy
    line_stack << [x,y]
  end

  puts line_stack.to_s
  return make_svg(line_stack)
end

def make_svg(line_stack)
  line_length = 20
  xs = line_stack.map { |c| c[0] }
  ys = line_stack.map { |c| c[1] }

  x_offset = -xs.min
  y_offset = -ys.min
  svg_coords = ->(p) {
    nx, ny = p
    [(nx+x_offset)*line_length + line_length/2, (ny+y_offset)*line_length + line_length/2]
  }

  max_width = (xs.max - xs.min).abs * line_length + line_length
  max_height = (ys.max - ys.min).abs * line_length + line_length
  svg = Victor::SVG.new width: max_width, height: max_height

  style = { stroke: 'black', stroke_width: 5 }
  svg.build do
    line_stack.each_cons(2) do |pair|
      p1, p2 = pair
      x1, y1 = svg_coords.call(p1)
      x2, y2 = svg_coords.call(p2)
      line x1: x1, y1: y1, x2: x2, y2: y2, style: style
      circle cx: x2, cy: y2, r: line_length/6, style: style, fill: 'black'
    end
  end
  return svg
end

(1..10).each do |i|
  run_number(i).save "pattern_#{i}"
end
@@ -1,14 +0,0 @@
-[params]
-[params.submoduleLinks]
-[params.submoduleLinks.aoc2020]
-url = "https://dev.danilafe.com/Advent-of-Code/AdventOfCode-2020/src/commit/7a8503c3fe1aa7e624e4d8672aa9b56d24b4ba82"
-path = "aoc-2020"
-[params.submoduleLinks.blogstaticflake]
-url = "https://dev.danilafe.com/Nix-Configs/blog-static-flake/src/commit/67b47d9c298e7476c2ca211aac5c5fd961637b7b"
-path = "blog-static-flake"
-[params.submoduleLinks.compiler]
-url = "https://dev.danilafe.com/DanilaFe/bloglang/src/commit/137455b0f4365ba3fd11c45ce49781cdbe829ec3"
-path = "compiler"
-[params.submoduleLinks.serverconfig]
-url = "https://dev.danilafe.com/Nix-Configs/server-config/src/commit/98cffe09546aee1678f7baebdea5eb5fef288935"
-path = "server-config"
config.toml (23 lines changed)
@@ -6,6 +6,12 @@ summaryLength = 20
 defaultContentLanguage = 'en'
 
+[taxonomies]
+tag = 'tags'
+series = "series"
+
 [outputFormats]
 [outputFormats.Toml]
 name = "toml"
@@ -20,8 +26,25 @@ defaultContentLanguage = 'en'
 endLevel = 4
 ordered = false
 startLevel = 3
+[markup.goldmark]
+[markup.goldmark.extensions]
+[markup.goldmark.extensions.passthrough]
+enable = true
+[markup.goldmark.extensions.passthrough.delimiters]
+block = [['\[', '\]'], ['$$', '$$']]
+inline = [['\(', '\)']]
+[markup.goldmark.parser]
+[markup.goldmark.parser.attribute]
+block = true
+title = true
 
 [languages]
 [languages.en]
 title = "Daniel's Blog"
 languageCode = "en-us"
+
+[params]
+plausibleAnalyticsDomain = "danilafe.com"
+githubUsername = "DanilaFe"
+siteSourceUrl = "https://dev.danilafe.com/Web-Projects/blog-static/src/branch/master"
+externalLinksInNewTab = false
@@ -2,6 +2,7 @@
 title: "Advent of Code in Coq - Day 1"
 date: 2020-12-02T18:44:56-08:00
 tags: ["Advent of Code", "Coq"]
+series: "Advent of Code in Coq"
 favorite: true
 ---
 
@@ -1,7 +1,8 @@
 ---
 title: Compiling a Functional Language Using C++, Part 0 - Intro
 date: 2019-08-03T01:02:30-07:00
-tags: ["C and C++", "Functional Languages", "Compilers"]
+tags: ["C", "C++", "Functional Languages", "Compilers"]
+series: "Compiling a Functional Language using C++"
 description: "In this first post of a larger series, we embark on a journey of developing a compiler for a lazily evaluated functional language."
 ---
 During my last academic term, I was enrolled in a compilers course.
@@ -2,6 +2,7 @@
 title: A Language for an Assignment - Homework 1
 date: 2019-12-27T23:27:09-08:00
 tags: ["Haskell", "Python", "Algorithms", "Programming Languages"]
+series: "A Language for an Assignment"
 ---
 
 On a rainy Oregon day, I was walking between classes with a group of friends.
content/blog/00_spa_agda_intro.md (new file, 110 lines)
@@ -0,0 +1,110 @@
---
title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 0: Intro"
series: "Static Program Analysis in Agda"
description: "In this post, I give a top-level overview of my work on formally verified static analyses"
date: 2024-07-06T17:37:42-07:00
tags: ["Agda", "Programming Languages"]
---

Some years ago, when the Programming Languages research group at Oregon State
University was discussing what to read, the [_Static Program Analysis_](https://cs.au.dk/~amoeller/spa/)
lecture notes came up. The group didn't end up reading the lecture notes,
but I did. As I was going through them, I noticed that they were quite rigorous:
the first several chapters cover a little bit of [lattice theory](https://en.wikipedia.org/wiki/Lattice_(order)),
and the subsequent analyses -- and the descriptions thereof -- are quite precise.
When I went to implement the algorithms in the textbook, I realized that just
writing them down would not be enough. After all, the textbook also proves
several properties of the lattice-based analyses, which would be lost in
translation if I were to just write C++ or Haskell.

At the same time, I noticed that lots of recent papers in programming language
theory were formalizing their results in
[Agda](https://agda.readthedocs.io/en/latest/getting-started/what-is-agda.html).
Having [played]({{< relref "meaningfully-typechecking-a-language-in-idris" >}})
[with]({{< relref "advent-of-code-in-coq" >}}) [dependent]({{< relref "coq_dawn_eval" >}})
[types]({{< relref "coq_palindrome" >}}) before, I was excited to try it out.
Thus began my journey to formalize (the first few chapters of) _Static Program Analysis_
in Agda.

In all, I built a framework for static analyses, based on a tool
called _monotone functions_. This framework can be used to implement and
reason about many different analyses (currently only a certain class called
_forward analyses_, but that's not a hard limitation). Recently, I've proven
the _correctness_ of the algorithms my framework produces. Having reached
this milestone, I'd like to pause and talk about what I've done.

In subsequent posts in this series, I will describe what I have so far.
It's not perfect, and some work is yet to be done; however, getting to
this point was no joke, and I think it's worth discussing. In all,
I'd like to cover the following major topics, spending a couple of posts on each:

* __Lattices__: the analyses I'm reasoning about use an algebraic structure
  called a _lattice_. This structure has certain properties that make it
  amenable to describing degrees of "knowledge" about a program. In
  lattice-based static program analysis, the various elements of the
  lattice represent different facts or properties that we know about the
  program in question; operations on the lattice help us combine these facts
  and reason about them. I write about this in {{< draftlink "Part 1: Lattices" "01_spa_agda_lattices" >}}.

  Interestingly, lattices can be made by combining other lattices in certain
  ways. We can therefore use simpler lattices as building blocks to create
  more complex ones, all while preserving the algebraic structure that
  we need for program analysis. I write about this in
  {{< draftlink "Part 2: Combining Lattices" "02_spa_agda_combining_lattices" >}}.

* __The Fixed-Point Algorithm__: to analyze a program, we use information
  that we already know to compute additional information. For instance,
  we might use the fact that `1` is positive to compute the fact that
  `1+1` is positive as well. Using that information, we can determine the
  sign of `(1+1)+1`, and so on. In practice, this is often done by calling
  some kind of "analyze" function over and over, each time getting closer to an
  accurate characterization of the program's behavior. When the output of "analyze"
  stops changing, we know we've found as much as we can find, and stop.

  What does it mean for the output to stop changing? Roughly, that's when
  the following equation holds: `knownInfo = analyze(knownInfo)`. In mathematics,
  this is known as a [fixed point](https://mathworld.wolfram.com/FixedPoint.html).
  To enable computing fixed points, we focus on a specific kind of lattice:
  those with a _finite height_. I talk about what this means in
  {{< draftlink "Part 3: Lattices of Finite Height" "03_spa_agda_fixed_height" >}}.

  Even if we restrict our attention to lattices of finite height,
  not all functions have fixed points; however, certain types of functions on
  lattices always do. The _fixed-point algorithm_ is a way to compute these
  fixed points, and we will use it to drive our analyses (a rough sketch of
  the iteration appears right after this list). I talk
  about this in {{< draftlink "Part 4: The Fixed-Point Algorithm" "04_spa_agda_fixedpoint" >}}.

* __Correctness__: putting together the work on lattices and the fixed-point
  algorithm, we can implement a static program analyzer in Agda. However,
  it's not hard to write an "analyze" function that has a fixed point but
  produces an incorrect result. Thus, the next step is to prove that the results
  of our analyzer accurately describe the program in question.

  The interesting aspect of this step is that our program analyzer works
  on [control-flow graphs](https://en.wikipedia.org/wiki/Control-flow_graph) (CFGs),
  which are a relatively compiler-centric representation of programs. On the
  other hand, what the language _actually does_ is defined by its
  [semantics](https://en.wikipedia.org/wiki/Semantics_(computer_science)),
  which is not at all compiler-centric. We need to connect these two, showing
  that the CFGs we produce "make sense" for our language, and that given
  CFGs that make sense, our analysis produces results that match the language's
  execution. To do so, I write about the language and its semantics
  in {{< draftlink "Part 5: Our Programming Language" "05_spa_agda_semantics" >}},
  then about building control flow graphs for the language in
  {{< draftlink "Part 6: Control Flow Graphs" "06_spa_agda_cfg" >}}.
  I then write about combining these two representations in
  {{< draftlink "Part 7: Connecting Semantics and Control Flow Graphs" "07_spa_agda_semantics_and_cfg" >}}.

### Navigation
Here are the posts that I've written so far for this series:

* {{< draftlink "Lattices" "01_spa_agda_lattices" >}}
* {{< draftlink "Combining Lattices" "02_spa_agda_combining_lattices" >}}
* {{< draftlink "Lattices of Finite Height" "03_spa_agda_fixed_height" >}}
* {{< draftlink "The Fixed-Point Algorithm" "04_spa_agda_fixedpoint" >}}
* {{< draftlink "Our Programming Language" "05_spa_agda_semantics" >}}
* {{< draftlink "Control Flow Graphs" "06_spa_agda_cfg" >}}
* {{< draftlink "Connecting Semantics and Control Flow Graphs" "07_spa_agda_semantics_and_cfg" >}}
* {{< draftlink "Forward Analysis" "08_spa_agda_forward" >}}
* {{< draftlink "Verifying the Forward Analysis" "09_spa_agda_verified_forward" >}}
@@ -2,7 +2,16 @@
 title: "Everything I Know About Types: Introduction"
 date: 2022-06-26T18:36:01-07:00
 tags: ["Type Systems", "Programming Languages"]
+series: "Everything I Know About Types"
 draft: true
+bergamot:
+  render_presets:
+    default: "bergamot/rendering/lc.bergamot"
+  presets:
+    intro:
+      prompt: "type(TERM, ?t)"
+      query: ""
+      file: "intro.bergamot"
 ---
 
 I am in love with types and type systems. They are, quite probably,
@@ -140,3 +149,33 @@ understanding how to use a crate; check out the
 [documentation page for `Vec`](https://doc.rust-lang.org/std/vec/struct.Vec.html), for instance.
 Documentation of Elm packages always lists functions' types (see [documentation for `Parser`](https://package.elm-lang.org/packages/elm/parser/latest/Parser), for example). Even C++ type signatures
 listed by Doxygen can be quite useful; I, for one, got a lot out of the [LLVM documentation](https://llvm.org/doxygen/classllvm_1_1IRBuilderBase.html).
+
+### Exercises, Bergamot and You
+One of the reasons I love the [Software Foundations](https://softwarefoundations.cis.upenn.edu/)
+series is the exercises. They are within the text, and they are machine-checked:
+you can use a computer tool to work through the tasks and verify that you did
+them correctly. I hope to do something similar. Exercises will look something
+like this:
+
+{{< bergamot_exercise label="sample exercise" id="exercise-1" >}}
+Here I tell you to do something that I believe would be instructive.
+{{< /bergamot_exercise >}}
+
+To achieve my ideal of interactive exercises, I developed a tool called Bergamot.
+It's a tiny little programming language for writing _inference rules_, which are an invaluable tool
+in a type theorist's toolbox. I introduced the tool in [a separate post on this
+site]({{< relref "bergamot" >}}). Throughout this series, I'll be using Bergamot
+for exercises and mild amounts of interactive content. This is completely optional:
+my aim is to make everything I write self-contained and useful without various
+tools. However, I think that having a way to interactively play with inference
+rules is conducive to learning the concepts.
+
+The unfortunate problem with making a tool for exercises is that you also need
+to teach others how to use the tool. Some exercises will be more specific
+to the tool than to type theory itself; I will denote these exercises as such where
+possible. Also, whenever the context of the exercise can be loaded into
+Bergamot, I will denote this with a play button.
+
+{{< bergamot_exercise label="bergamot; sample exercise" preset="intro" id="exercise-2" >}}
+Try typing `1+1` into the input field!
+{{< /bergamot_exercise >}}
||||||
3
content/blog/00_types_intro/intro.bergamot
Normal file
@@ -0,0 +1,3 @@
|
|||||||
|
TNumber @ type(lit(?n), number) <- num(?n);
|
||||||
|
TPlusI @ type(plus(?e_1, ?e_2), number) <-
|
||||||
|
type(?e_1, number), type(?e_2, number);
|
||||||
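For a rough sense of what these two rules compute, here is a small Python stand-in (not Bergamot itself, whose engine is a general unification-based evaluator): it derives `type(e, number)` exactly when `TNumber` or `TPlusI` applies.

```python
# A toy shadow of the TNumber and TPlusI rules above, for illustration only.
def has_number_type(expr):
    """True when type(expr, number) is derivable from the two rules."""
    if isinstance(expr, (int, float)):  # TNumber: numeric literals are numbers
        return True
    if isinstance(expr, tuple) and expr[0] == "plus":  # TPlusI: both operands
        _, e1, e2 = expr                               # must be numbers
        return has_number_type(e1) and has_number_type(e2)
    return False

print(has_number_type(("plus", 1, ("plus", 2, 3))))  # True: 1 + (2 + 3)
print(has_number_type(("plus", 1, "x")))             # False: no rule types `x`
```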
@@ -2,6 +2,7 @@
 title: "Advent of Code in Coq - Day 8"
 date: 2021-01-10T22:48:39-08:00
 tags: ["Advent of Code", "Coq"]
+series: "Advent of Code in Coq"
 ---
 
 Huh? We're on day 8? What happened to days 2 through 7?
@@ -57,7 +58,7 @@ Here's another inference rule, this time with some mathematical notation instead
 {n + 1 < m + 1}
 {{< /latex >}}
 
-This one reads, "if \\(n\\) is less than \\(m\\), then \\(n+1\\) is less than \\(m+1\\)". We can use inference
+This one reads, "if \(n\) is less than \(m\), then \(n+1\) is less than \(m+1\)". We can use inference
 rules to define various constructs. As an example, let's define what it means for a natural number to be even.
 It takes two rules:
 
@@ -79,7 +80,7 @@ again we see that 4 is even, as well. We can continue this to determine that 6,
 are even too. Never in this process will we visit the numbers 1 or 3 or 5, and that's good - they're not even!
 
 Let's now extend this notion to programming languages, starting with a simple arithmetic language.
-This language is made up of natural numbers and the \\(\square\\) operation, which represents the addition
+This language is made up of natural numbers and the \(\square\) operation, which represents the addition
 of two numbers. Again, we need two rules:
 
 {{< latex >}}
@@ -92,14 +93,14 @@ of two numbers. Again, we need two rules:
 {e_1 \square e_2 \; \text{evaluates to} \; n_1 + n_2}
 {{< /latex >}}
 
-First, let me explain myself. I used \\(\square\\) to demonstrate two important points. First, languages can be made of
+First, let me explain myself. I used \(\square\) to demonstrate two important points. First, languages can be made of
 any kind of characters we want; it's the rules that we define that give these languages meaning.
-Second, while \\(\square\\) is the addition operation _in our language_, \\(+\\) is the _mathematical addition operator_.
+Second, while \(\square\) is the addition operation _in our language_, \(+\) is the _mathematical addition operator_.
 They are not the same - we use the latter to define how the former works.
 
 Finally, writing "evaluates to" gets quite tedious, especially for complex languages. Instead,
-PLT people use notation to make their semantics more concise. The symbol \\(\Downarrow\\) is commonly
+PLT people use notation to make their semantics more concise. The symbol \(\Downarrow\) is commonly
-used to mean "evaluates to"; thus, \\(e \Downarrow v\\) reads "the expression \\(e\\) evaluates to the value \\(v\\).
+used to mean "evaluates to"; thus, \(e \Downarrow v\) reads "the expression \(e\) evaluates to the value \(v\)".
 Using this notation, our rules start to look like the following:
 
 {{< latex >}}
@@ -146,8 +147,8 @@ Inductive tinylang : Type :=
 | box (e1 e2 : tinylang) : tinylang.
 ```
 
-This defines the two elements of our example language: `number n` corresponds to \\(n\\), and `box e1 e2` corresponds
+This defines the two elements of our example language: `number n` corresponds to \(n\), and `box e1 e2` corresponds
-to \\(e_1 \square e_2\\). Finally, we define the inference rules:
+to \(e_1 \square e_2\). Finally, we define the inference rules:
 
 ```Coq {linenos=true}
 Inductive tinylang_sem : tinylang -> nat -> Prop :=
@@ -157,7 +158,7 @@ Inductive tinylang_sem : tinylang -> nat -> Prop :=
 tinylang_sem (box e1 e2) (n1 + n2).
 ```
 
-When we wrote our rules earlier, by using arbitrary variables like \\(e_1\\) and \\(n_1\\), we implicitly meant
+When we wrote our rules earlier, by using arbitrary variables like \(e_1\) and \(n_1\), we implicitly meant
 that our rules work for _any_ number or expression. When writing Coq we have to make this assumption explicit
 by using `forall`. For instance, the rule on line 2 reads, "for any number `n`, the expression `n` evaluates to `n`".
 
@@ -166,8 +167,8 @@ by using `forall`. For instance, the rule on line 2 reads, "for any number `n`,
 We've now written some example big-step operational semantics, both "on paper" and in Coq. Now, it's time to take a look at
 the specific semantics of the language from Day 8! Our language consists of a few parts.
 
-First, there are three opcodes: \\(\texttt{jmp}\\), \\(\\texttt{nop}\\), and \\(\\texttt{add}\\). Opcodes, combined
+First, there are three opcodes: \(\texttt{jmp}\), \(\texttt{nop}\), and \(\texttt{add}\). Opcodes, combined
-with an integer, make up an instruction. For example, the instruction \\(\\texttt{add} \\; 3\\) will increase the
+with an integer, make up an instruction. For example, the instruction \(\texttt{add} \; 3\) will increase the
 content of the accumulator by three. Finally, a program consists of a sequence of instructions; they're separated
 by newlines in the puzzle input, but we'll instead separate them by semicolons. For example, here's a complete program.
 
@@ -180,17 +181,17 @@ it will add 0 to the accumulator (keeping it the same),
 do nothing, and finally jump back to the beginning. At this point, it will try to run the addition instruction again,
 which is not allowed; thus, the program will terminate.
 
-Did you catch that? The semantics of this language will require more information than just our program itself (which we'll denote by \\(p\\)).
+Did you catch that? The semantics of this language will require more information than just our program itself (which we'll denote by \(p\)).
-* First, to evaluate the program we will need a program counter, \\(\\textit{c}\\). This program counter
+* First, to evaluate the program we will need a program counter, \(\textit{c}\). This program counter
 will tell us the position of the instruction to be executed next. It can also point past the last instruction,
 which means our program terminated successfully.
-* Next, we'll need the accumulator \\(a\\). Addition instructions can change the accumulator, and we will be interested
+* Next, we'll need the accumulator \(a\). Addition instructions can change the accumulator, and we will be interested
 in the number that ends up in the accumulator when our program finishes executing.
 * Finally, and more subtly, we'll need to keep track of the states we visited. For instance,
-in the course of evaluating our program above, we encounter the \\((c, a)\\) pair of \\((0, 0)\\) twice: once
+in the course of evaluating our program above, we encounter the \((c, a)\) pair of \((0, 0)\) twice: once
 at the beginning, and once at the end. However, whereas at the beginning we have not yet encountered the addition
 instruction, at the end we have, so the evaluation behaves differently. To make the proofs work better in Coq,
-we'll use a set \\(v\\) of
+we'll use a set \(v\) of
 {{< sidenote "right" "allowed-note" "allowed (valid) program counters (as opposed to visited program counters)." >}}
 Whereas the set of "visited" program counters keeps growing as our evaluation continues,
 the set of "allowed" program counters keeps shrinking. Because the "allowed" set never stops shrinking,
@@ -205,10 +206,10 @@ never changes; only the state does. So I propose this (rather unorthodox) notation:
 (c, a, v) \Rightarrow_p (c', a', v')
 {{< /latex >}}
 
-This reads, "after starting at program counter \\(c\\), accumulator \\(a\\), and set of valid addresses \\(v\\),
+This reads, "after starting at program counter \(c\), accumulator \(a\), and set of valid addresses \(v\),
-the program \\(p\\) terminates with program counter \\(c'\\), accumulator \\(a'\\), and set of valid addresses \\(v'\\)".
+the program \(p\) terminates with program counter \(c'\), accumulator \(a'\), and set of valid addresses \(v'\)".
 Before creating the inference rules for this evaluation relation, let's define the effect of evaluating a single
-instruction, using notation \\((c, a) \rightarrow_i (c', a')\\). An addition instruction changes the accumulator,
+instruction, using notation \((c, a) \rightarrow_i (c', a')\). An addition instruction changes the accumulator,
 and increases the program counter by 1.
 
 {{< latex >}}
@@ -239,8 +240,8 @@ is done evaluating, and is in a "failed" state.
 {(c, a, v) \Rightarrow_{p} (c, a, v)}
 {{< /latex >}}
 
-We use \\(\\text{length}(p)\\) to represent the number of instructions in \\(p\\). Note the second premise:
+We use \(\text{length}(p)\) to represent the number of instructions in \(p\). Note the second premise:
-even if our program counter \\(c\\) is not included in the valid set, if it's "past the end of the program",
+even if our program counter \(c\) is not included in the valid set, if it's "past the end of the program",
 the program terminates in an "ok" state.
 {{< sidenote "left" "avoid-c-note" "Here's a rule for terminating in the \"ok\" state:" >}}
 In the presented rule, we don't use the variable <code>c</code> all that much, and we know its concrete
@@ -280,20 +281,20 @@ our program can take a step, and continue evaluating from there.
 This is quite a rule. A lot of things need to work out for a program to evaluate from a state that isn't
 currently the final state:
 
-* The current program counter \\(c\\) must be valid. That is, it must be an element of \\(v\\).
+* The current program counter \(c\) must be valid. That is, it must be an element of \(v\).
-* This program counter must correspond to an instruction \\(i\\) in \\(p\\), which we write as \\(p[c] = i\\).
+* This program counter must correspond to an instruction \(i\) in \(p\), which we write as \(p[c] = i\).
-* This instruction must be executed, changing our program counter from \\(c\\) to \\(c'\\) and our
+* This instruction must be executed, changing our program counter from \(c\) to \(c'\) and our
-accumulator from \\(a\\) to \\(a'\\). The set of valid instructions will no longer include \\(c\\),
+accumulator from \(a\) to \(a'\). The set of valid instructions will no longer include \(c\),
-and will become \\(v - \\{c\\}\\).
+and will become \(v - \{c\}\).
 * Our program must then finish executing, starting at state
-\\((c', a', v - \\{c\\})\\), and ending in some (unknown) state \\((c'', a'', v'')\\).
+\((c', a', v - \{c\})\), and ending in some (unknown) state \((c'', a'', v'')\).
 
-If all of these conditions are met, our program, starting at \\((c, a, v)\\), will terminate in the state \\((c'', a'', v'')\\). This third rule completes our semantics; a program being executed will keep running instructions using the third rule, until it finally
+If all of these conditions are met, our program, starting at \((c, a, v)\), will terminate in the state \((c'', a'', v'')\). This third rule completes our semantics; a program being executed will keep running instructions using the third rule, until it finally
 hits an invalid program counter (terminating with the first rule) or gets to the end of the program (terminating with the second rule).
 
-#### Aside: Vectors and Finite \\(\mathbb{N}\\)
+#### Aside: Vectors and Finite \\(\\mathbb{N}\\)
 We'll be getting to the Coq implementation of our semantics soon, but before we do:
-what type should \\(c\\) be? It's entirely possible for an instruction like \\(\\texttt{jmp} \\; -10000\\)
+what type should \(c\) be? It's entirely possible for an instruction like \(\texttt{jmp} \; -10000\)
 to throw our program counter way before the first instruction of our program, so at first, it seems
 as though we should use an integer. But the prompt doesn't even specify what should happen in this
 case - it only says an instruction shouldn't be run twice. The "valid set", although it may help resolve
@@ -301,7 +302,7 @@ this debate, is our invention, and isn't part of the original specification.
 
 There is, however, something we can infer from this problem. Since the problem of jumping "too far behind" or
 "too far ahead" is never mentioned, we can assume that _all jumps will lead either to an instruction,
-or right to the end of a program_. This means that \\(c\\) is a natural number, with
+or right to the end of a program_. This means that \(c\) is a natural number, with
 
 {{< latex >}}
 0 \leq c \leq \text{length}(p)
@@ -321,9 +322,9 @@ inference rules, let's present two rules that define such a number:
 {{< /latex >}}
 
 This is a variation of the [Peano encoding](https://wiki.haskell.org/Peano_numbers) of natural numbers.
-It reads as follows: zero (\\(Z\\)) is a finite natural number less than any positive natural number \\(n\\). Then, if a finite natural number
+It reads as follows: zero (\(Z\)) is a finite natural number less than any positive natural number \(n\). Then, if a finite natural number
-\\(f\\) is less than \\(n\\), then adding one to that number (using the successor function \\(S\\))
+\(f\) is less than \(n\), then adding one to that number (using the successor function \(S\))
-will create a natural number less than \\(n+1\\). We encode this in Coq as follows
+will create a natural number less than \(n+1\). We encode this in Coq as follows
 ([originally from here](https://coq.inria.fr/library/Coq.Vectors.Fin.html#t)):
 
 ```Coq
@@ -332,9 +333,9 @@ Inductive t : nat -> Set :=
 | FS : forall {n}, t n -> t (S n).
 ```
 
-The `F1` constructor here is equivalent to our \\(Z\\), and `FS` is equivalent to our \\(S\\).
+The `F1` constructor here is equivalent to our \(Z\), and `FS` is equivalent to our \(S\).
-To represent positive natural numbers \\(\\mathbb{N}^+\\), we simply take a regular natural
+To represent positive natural numbers \(\mathbb{N}^+\), we simply take a regular natural
-number from \\(\mathbb{N}\\) and find its successor using `S` (simply adding 1). Again, we have
+number from \(\mathbb{N}\) and find its successor using `S` (simply adding 1). Again, we have
 to explicitly use `forall` in our type signatures.
 
 We can use a similar technique to represent a list with a known number of elements, known
@@ -351,9 +352,9 @@ a vector:
 {(x::\textit{xs}) : \text{Vec} \; t \; (n+1)}
 {{< /latex >}}
 
-These rules read: the empty list \\([]\\) is zero-length vector of any type \\(t\\). Then,
+These rules read: the empty list \([]\) is a zero-length vector of any type \(t\). Then,
-if we take an element \\(x\\) of type \\(t\\), and an \\(n\\)-long vector \\(\textit{xs}\\) of \\(t\\),
+if we take an element \(x\) of type \(t\), and an \(n\)-long vector \(\textit{xs}\) of \(t\),
-then we can prepend \\(x\\) to \\(\textit{xs}\\) and get an \\((n+1)\\)-long vector of \\(t\\).
+then we can prepend \(x\) to \(\textit{xs}\) and get an \((n+1)\)-long vector of \(t\).
 In Coq, we write this as follows ([originally from here](https://coq.inria.fr/library/Coq.Vectors.VectorDef.html#t)):
 
 ```Coq
@@ -362,32 +363,32 @@ Inductive t A : nat -> Type :=
 | cons : forall (h:A) (n:nat), t A n -> t A (S n).
 ```
 
-The `nil` constructor represents the empty list \\([]\\), and `cons` represents
+The `nil` constructor represents the empty list \([]\), and `cons` represents
-the operation of prepending an element (called `h` in the code and \\(x\\) in our inference rules)
+the operation of prepending an element (called `h` in the code and \(x\) in our inference rules)
-to another vector of length \\(n\\), which remains unnamed in the code but is called \\(\\textit{xs}\\) in our rules.
+to another vector of length \(n\), which remains unnamed in the code but is called \(\textit{xs}\) in our rules.
 
-These two definitions work together quite well. For instance, suppose we have a vector of length \\(n\\).
+These two definitions work together quite well. For instance, suppose we have a vector of length \(n\).
-If we were to access its elements by indices starting at 0, we'd be allowed to access indices 0 through \\(n-1\\).
+If we were to access its elements by indices starting at 0, we'd be allowed to access indices 0 through \(n-1\).
-These are precisely the values of the finite natural numbers less than \\(n\\), \\(\\text{Fin} \\; n \\).
+These are precisely the values of the finite natural numbers less than \(n\), \(\text{Fin} \; n \).
-Thus, given such an index \\(\\text{Fin} \\; n\\) and a vector \\(\\text{Vec} \\; t \\; n\\), we are guaranteed
+Thus, given such an index \(\text{Fin} \; n\) and a vector \(\text{Vec} \; t \; n\), we are guaranteed
 to be able to retrieve the element at the given index! In our code, we will not have to worry about bounds checking.
 
-Of course, if our program has \\(n\\) elements, our program counter will be a finite number less than \\(n+1\\),
+Of course, if our program has \(n\) elements, our program counter will be a finite number less than \(n+1\),
 since there's always the possibility of it pointing past the instructions, indicating that we've finished
 running the program. This leads to some minor complications: we can't safely access the program instruction
-at index \\(\\text{Fin} \\; (n+1)\\). We can solve this problem by considering two cases:
+at index \(\text{Fin} \; (n+1)\). We can solve this problem by considering two cases:
 either our index points one past the end of the program (in which case its value is exactly the finite
-representation of \\(n\\)), or it's less than \\(n\\), in which case we can "tighten" the upper bound,
+representation of \(n\)), or it's less than \(n\), in which case we can "tighten" the upper bound,
-and convert that index into a \\(\\text{Fin} \\; n\\). We formalize it in a lemma:
+and convert that index into a \(\text{Fin} \; n\). We formalize it in a lemma:
 
 {{< codelines "Coq" "aoc-2020/day8.v" 80 82 >}}
 
 There's a little bit of a gotcha here. Instead of translating our above statement literally,
 and returning a value that's the result of "tightening" our input `f`, we return a value
 `f'` that can be "weakened" to `f`. This is because "tightening" is not a total function -
-it's not always possible to convert a \\(\\text{Fin} \\; (n+1)\\) into a \\(\\text{Fin} \\; n\\).
+it's not always possible to convert a \(\text{Fin} \; (n+1)\) into a \(\text{Fin} \; n\).
-However, "weakening" \\(\\text{Fin} \\; n\\) _is_ a total function, since a number less than \\(n\\)
+However, "weakening" \(\text{Fin} \; n\) _is_ a total function, since a number less than \(n\)
-is, by the transitive property of a total order, also less than \\(n+1\\).
+is, by the transitive property of a total order, also less than \(n+1\).
|
|
||||||
The Coq proof for this claim is as follows:
|
The Coq proof for this claim is as follows:
|
||||||
|
|
||||||
@@ -408,11 +409,11 @@ we assume is nonzero, since there isn't a natural number less than zero).
can itself be tightened.
* If it can't be tightened, then our smaller number is a finite representation of
`n-1`. This, in turn, means that adding one to it will be the finite representation
of `n` (if \(x\) is equal to \(n-1\), then \(x+1\) is equal to \(n\)).
* If it _can_ be tightened, then so can the successor (if \(x\) is less
than \(n-1\), then \(x+1\) is less than \(n\)).

Next, let's talk about addition, specifically the kind of addition done by the \(\texttt{jmp}\) instruction.
We can always add an integer to a natural number, but we can at best guarantee that the result
will be an integer. For instance, we can add `-1000` to `1`, and get `-999`, which is _not_ a natural
number. We implement this kind of addition in a function called `jump_t`:
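
The real definition is in the linked file; conceptually, though, it's just integer addition after converting the program counter. A hypothetical stand-in (illustrative names, not the ones from `day8.v`; `Z` is Coq's integer type) might read:

```Coq
Require Import Coq.ZArith.ZArith.
Require Import Coq.Vectors.Fin.

(* Hypothetical sketch: convert the in-bounds program counter to a
   plain integer, then add the signed offset. The result is only a Z,
   since it may be negative or point far past the program's end. *)
Definition jump_sketch {n : nat} (pc : Fin.t n) (off : Z) : Z :=
  (Z.of_nat (proj1_sig (Fin.to_nat pc)) + off)%Z.
```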
@@ -435,7 +436,7 @@ Now, suppose we wanted to write a function that _does_ return a valid program
counter after adding the offset to it. Since it's possible for this function to fail
(for instance, if the offset is very negative), it has to return `option (fin (S n))`.
That is, this function may either fail (returning `None`) or succeed, returning
`Some f`, where `f` is of type `fin (S n)`, aka \(\text{Fin} \; (n + 1)\). Here's
the function in Coq (again, don't worry too much about the definition):

{{< codelines "Coq" "aoc-2020/day8.v" 61 61 >}}
@@ -463,7 +464,7 @@ The star `*` is used here to represent a [product type](https://en.wikipedia.org
rather than arithmetic multiplication. Our state type accepts an argument,
`n`, much like a finite natural number or a vector. In fact, this `n` is passed on
to the state's program counter and set types. Rightly, a state for a program
of length \(n\) will not be of the same type as a state for a program of length \(n+1\).

An instruction is also a tuple, but this time containing only two elements: the opcode and
the number. We write this as follows:
@@ -478,11 +479,11 @@ definition:
{{< codelines "Coq" "aoc-2020/day8.v" 38 38 >}}

So far, so good! Finally, it's time to get started on the semantics themselves.
We begin with the inductive definition of \((\rightarrow_i)\).
I think this is fairly straightforward. However, we do use
`t` instead of \(n\) from the rules, and we use `FS`
instead of \(+1\). Also, we make the formerly implicit
assumption that \(c+n\) is valid explicit, by
providing a proof that `valid_jump_t pc t = Some pc'`.

{{< codelines "Coq" "aoc-2020/day8.v" 103 110 >}}
@@ -506,8 +507,8 @@ counter will be equal to the length of the program `n`,
so we use `nat_to_fin n`. On the other hand, if the program
terminates in a stuck state, it must be that it terminated
at a program counter that points to an instruction. Thus, this
program counter is actually a \(\text{Fin} \; n\), and not
a \(\text{Fin} \; (n+1)\), and is not in the set of allowed program counters.
We use the same "weakening" trick we saw earlier to represent
this.
||||||
@@ -517,8 +518,8 @@ Finally, we encode the three inference rules we came up with:
|
|||||||
|
|
||||||
Notice that we fused two of the premises in the last rule.
|
Notice that we fused two of the premises in the last rule.
|
||||||
Instead of naming the instruction at the current program
|
Instead of naming the instruction at the current program
|
||||||
counter (by writing \\(p[c] = i\\)) and using it in another premise, we simply use
|
counter (by writing \(p[c] = i\)) and using it in another premise, we simply use
|
||||||
`nth inp pc`, which corresponds to \\(p[c]\\) in our
|
`nth inp pc`, which corresponds to \(p[c]\) in our
|
||||||
"paper" semantics.
|
"paper" semantics.
|
||||||
|
|
||||||
Before we go on writing some actual proofs, we have
|
Before we go on writing some actual proofs, we have
|
||||||
@@ -531,12 +532,12 @@ start off, we'll define the notion of a "valid instruction", which is guaranteed
to keep the program counter in the correct range.
There are a couple of ways to do this, but we'll use yet another definition based
on inference rules. First, though, observe that the same instruction may be valid
for one program, and invalid for another. For instance, \(\texttt{jmp} \; 100\)
is perfectly valid for a program with thousands of instructions, but if it occurs
in a program with only 3 instructions, it will certainly lead to disaster. Specifically,
the validity of an instruction depends on the length of the program in which it resides,
and the program counter at which it's encountered.
Thus, we refine our idea of validity to "being valid for a program of length \(n\) at program counter \(f\)".
For this, we can use the following two inference rules:

{{< latex >}}
@@ -549,14 +550,14 @@ For this, we can use the following two inference rules:
{o \; t \; \text{valid for} \; n, c }
{{< /latex >}}

The first rule states that if a program has length \(n\), then \(\texttt{add}\) is valid
at any program counter whose value is less than \(n\). This is because running
\(\texttt{add}\) will increment the program counter \(c\) by 1,
and thus, create a new program counter that's less than \(n+1\),
which, as we discussed above, is perfectly valid.

The second rule works for the other two instructions. It has an extra premise:
the result of `jump_valid_t` (written as \(J_v\)) has to be \(\text{Some} \; c'\),
that is, `jump_valid_t` must succeed. Note that we require this even for no-ops,
since it later turns out that one of them may be a jump after all.
@@ -567,26 +568,26 @@ We encode the rules in Coq as follows:
{{< codelines "Coq" "aoc-2020/day8.v" 152 157 >}}

Note that we have three rules instead of two. This is because we "unfolded"
\(o\) from our second rule: rather than using set notation (or "or"), we
just generated two rules that vary in nothing but the operation involved.

Of course, we must have that every instruction in a program is valid.
We don't really need inference rules for this, as much as a "forall" quantifier.
A program \(p\) of length \(n\) is valid if the following holds:

{{< latex >}}
\forall (c : \text{Fin} \; n). p[c] \; \text{valid for} \; n, c
{{< /latex >}}

That is, for every possible in-bounds program counter \(c\),
the instruction at the program counter is valid. We can now
encode this in Coq, too:

{{< codelines "Coq" "aoc-2020/day8.v" 160 161 >}}

In the above, `n` is made implicit where possible.
Since \(c\) (called `pc` in the code) is of type \(\text{Fin} \; n\), there's no
need to write \(n\) _again_. The curly braces tell Coq to infer that
argument where possible.

### Proving Termination
@@ -612,7 +613,7 @@ The first proof is rather simple. The claim is:
> For our valid program, at any program counter `pc`
and accumulator `acc`, there must exist another program
counter `pc'` and accumulator `acc'` such that the
instruction evaluation relation \((\rightarrow_i)\)
connects the two. That is, valid addresses aside,
we can always make a step.
@@ -675,14 +676,14 @@ case analysis on `o`.
There are three possible cases we have to consider,
one for each type of instruction.

* If the instruction is \(\texttt{add}\), we know
that `pc' = pc + 1` and `acc' = acc + t0`. That is,
the program counter is simply incremented, and the accumulator
is modified with the number part of the instruction.
* If the instruction is \(\texttt{nop}\), the program
counter will again be incremented (`pc' = pc + 1`),
but the accumulator will stay the same, so `acc' = acc`.
* If the instruction is \(\texttt{jmp}\), things are
more complicated. We must rely on the assumption
that our input is valid, which tells us that adding
`t0` to our `pc` will result in `Some f`, and not `None`.
@@ -736,7 +737,7 @@ counter that is not included in the valid set.
to a "next" state.

Alternatively, we could say that one of the inference rules
for \((\Rightarrow_p)\) must apply. This is not the case if the input
is not valid, since, as I said
before, an arbitrary input program can lead us to jump
to a negative address (or to an address _way_ past the end of the program).
@@ -1,7 +1,8 @@
---
title: Compiling a Functional Language Using C++, Part 1 - Tokenizing
date: 2019-08-03T01:02:30-07:00
tags: ["C++", "Functional Languages", "Compilers"]
series: "Compiling a Functional Language using C++"
description: "In this post, we tackle the first component of our compiler: tokenizing."
---
It makes sense to build a compiler bit by bit, following the stages we outlined in
@@ -54,31 +55,31 @@ patterns that a string has to match. We define regular expressions
as follows:

* Any character is a regular expression that matches that character. Thus,
\(a\) is a regular expression (from now shortened to regex) that matches
the character 'a', and nothing else.
* \(r_1r_2\), or the concatenation of \(r_1\) and \(r_2\), is
a regular expression that matches anything matched by \(r_1\), followed
by anything that matches \(r_2\). For instance, \(ab\) matches
the character 'a' followed by the character 'b' (thus matching "ab").
* \(r_1|r_2\) matches anything that is either matched by \(r_1\) or
\(r_2\). Thus, \(a|b\) matches the character 'a' or the character 'b'.
* \(r_1?\) matches either an empty string, or anything matched by \(r_1\).
* \(r_1+\) matches one or more things matched by \(r_1\). So,
\(a+\) matches "a", "aa", "aaa", and so on.
* \((r_1)\) matches anything that matches \(r_1\). This is mostly used
to group things together in more complicated expressions.
* \(.\) matches any character.

More powerful variations of regex also include an "any of" operator, \([c_1c_2c_3]\),
which is equivalent to \(c_1|c_2|c_3\), and a "range" operator, \([c_1-c_n]\), which
matches all characters in the range between \(c_1\) and \(c_n\), inclusive.

Let's see some examples. An integer, such as 326, can be represented with \([0-9]+\).
This means one or more characters between 0 and 9. Some (most) regex implementations
have a special symbol for \([0-9]\), written as \(\setminus d\). A variable,
starting with a lowercase letter and containing lowercase or uppercase letters after it,
can be written as \([a-z]([a-zA-Z]+)?\). Again, most regex implementations provide
a special operator for \((r_1+)?\), written as \(r_1*\).
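
To get a feel for these patterns before building anything ourselves, it helps to try them in an existing engine. Here's a quick C++ sketch using the standard `<regex>` header (purely for experimentation; it's unrelated to the tokenizer we're about to write):

```cpp
#include <iostream>
#include <regex>

int main() {
    // [0-9]+ : one or more digits, like the integer example above.
    std::regex integer("[0-9]+");
    // [a-z]([a-zA-Z]+)? : a lowercase letter, optionally followed by letters.
    std::regex variable("[a-z]([a-zA-Z]+)?");

    std::cout << std::regex_match("326", integer) << "\n";    // 1 (matches)
    std::cout << std::regex_match("myVar", variable) << "\n"; // 1 (matches)
    std::cout << std::regex_match("Var", variable) << "\n";   // 0 (no match)
}
```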

So how does one go about checking if a regular expression matches a string? An efficient way is to
first construct a [state machine](https://en.wikipedia.org/wiki/Finite-state_machine). A type of state machine can be constructed from a regular expression
@@ -2,6 +2,7 @@
title: A Language for an Assignment - Homework 2
date: 2019-12-30T20:05:10-08:00
tags: ["Haskell", "Python", "Algorithms", "Programming Languages"]
series: "A Language for an Assignment"
---

After the madness of the
@@ -142,7 +143,7 @@ def prog(xs):
    return (state, total+left+right)
```

Honestly, that's pretty clean. As clean as `left.reverse()` to allow for \(O(1)\) pop is.
What's really clean, however, is the implementation of mergesort in our language.
It goes as follows:
@@ -1,7 +1,7 @@
---
title: Learning Emulation, Part 1
date: 2016-06-27
tags: ["Emulation"]
---
I've decided that the purpose of a blog is to actually use it every once in a while. So, to fill up this blank space, I'll be documenting my own experience of starting to learn how emulation works. I'd like to say right now that my main goal was not to learn emulation. Rather, I needed to emulate to refresh my skills for a different subject area. However, emulation turned out fun enough to write about.
content/blog/01_spa_agda_lattices.md (new file, 504 lines)
@@ -0,0 +1,504 @@
---
title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 1: Lattices"
series: "Static Program Analysis in Agda"
description: "In this post, I introduce an algebraic structure called a lattice, which underpins certain program analyses"
date: 2024-07-06T17:37:43-07:00
tags: ["Agda", "Programming Languages"]
---

This is the first post in a series on
[static program analysis in Agda]({{< relref "static-program-analysis-in-agda" >}}).
See the [introduction]({{< relref "00_spa_agda_intro" >}}) for a little bit
more context.

The goal of this post is to motivate the algebraic structure called a
[lattice](https://en.wikipedia.org/wiki/Lattice_(order)). Lattices have
{{< sidenote "right" "crdt-note" "broad applications" >}}
See, for instance, Lars Hupel's excellent
<a href="https://lars.hupel.info/topics/crdt/01-intro/">introduction to CRDTs</a>
which uses lattices for Conflict-Free Replicated Data Types. CRDTs can be
used to implement peer-to-peer distributed systems.
{{< /sidenote >}} beyond static program analysis, so the work in this post is
interesting in its own right. However, for the purposes of this series, I'm
most interested in lattices as an encoding of program information when performing
analysis. To start motivating lattices in that context, I'll need to start
with _monotone frameworks_.

### Monotone Frameworks

The key notion for monotone frameworks is the "specificity" of information.
Take, for instance, an analyzer that tries to figure out if a variable is
positive, negative, or equal to zero (this is called a _sign analysis_, and
we'll be using this example a lot). Of course, the variable could be "none
of the above" -- perhaps if it was initialized from user input, which would
allow both positive and negative numbers. Such an analyzer might return
`+`, `-`, `0`, or `unknown` for any given variable. These outputs are not
created equal: if a variable has sign `+`, we know more about it than if
the sign is `unknown`: we've ruled out negative numbers as possible values!

Specificity is important to us because we want our analyses to be as precise
as possible. It would be valid for a program analysis to just return
`unknown` for everything, but it wouldn't be very useful. Thus, we want to
rank possible outputs, and try to pick the most specific one. The
{{< sidenote "right" "convention-note" "convention" -12 >}}
I say convention, because it doesn't actually matter if we represent more
specific values as "larger" or "smaller". Given a lattice with a particular
order written as <code><</code>, we can flip the sign in all relations
(turning <code>a < b</code> into <code>a > b</code>), and get back another
lattice. This lattice will have the same properties (more precisely,
the properties will be
<a href="https://en.wikipedia.org/wiki/Duality_(mathematics)">dual</a>). So
we shouldn't fret about picking a direction for "what's less than what".
{{< /sidenote >}}
seems to be to make
{{< sidenote "right" "order-note" "more specific things \"smaller\"" 1 >}}
Admittedly, it's a little bit odd to say that something which is "more" than
something else is actually smaller. The intuition that I favor is that
something that's more specific describes fewer objects: there are fewer
white horses than horses, so "white horse" is more specific than "horse".
The direction of <code><</code> can be thought of as comparing the number
of objects.<br>
<br>
Note that this is only an intuition; there are equally many positive and
negative numbers, but we will <em>not</em> group them together
in our order.
{{< /sidenote >}},
and less specific things "larger". Coming back to our previous example, we'd
write `+ < unknown`, since `+` is more specific. Of course, the exact
things we're trying to rank depend on the sort of analysis we're trying to
perform. Since I introduced sign analysis, we're ranking signs like `+` and `-`.
For other analyses, the elements will be different. The _comparison_, however,
will be a permanent fixture.
{#specificity}

Suppose now that we have some program analysis, and we're feeding it some input
information. Perhaps we're giving it the signs of variables `x` and `y`, and
hoping for it to give us the sign of a third variable `z`. It would be very
unfortunate if, when given more specific information, the analysis returned
a less specific output! The more you know going in, the more you should know
coming out. Similarly, when given less specific / vaguer information, the
analysis shouldn't produce a more specific answer -- how could it do that?
This leads us to come up with the following rule:
{#define-monotonicity}

{{< latex >}}
\textbf{if}\ \text{input}_1 \le \text{input}_2,
\textbf{then}\ \text{analyze}(\text{input}_1) \le \text{analyze}(\text{input}_2)
{{< /latex >}}

In mathematics, such a property is called _monotonicity_. We say that
"analyze" is a [monotonic function](https://en.wikipedia.org/wiki/Monotonic_function).
This property gives its name to monotone frameworks. For our purposes, this
property means that being more specific "pays off": better information in
means better information out. In Agda, we can encode monotonicity as follows:

{{< codelines "Agda" "agda-spa/Lattice.agda" 17 21 >}}

Note that above, I defined `Monotonic` on an arbitrary function, whose
outputs might be of a different type than its inputs. This will come in handy
later.
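
For a tiny standalone example of a monotonic function (my own illustration, not code from this series), consider booleans ordered with `false` below `true`, mapped into the naturals:

```Agda
open import Data.Bool using (Bool; false; true)
open import Data.Nat using (ℕ; _≤_; z≤n; s≤s)

-- Booleans ordered with false below true.
data _⊑_ : Bool → Bool → Set where
  f⊑b : ∀ {b} → false ⊑ b
  t⊑t : true ⊑ true

-- Interpret booleans as the naturals 0 and 1.
toℕ : Bool → ℕ
toℕ false = 0
toℕ true  = 1

-- toℕ is monotonic: related booleans map to related naturals.
toℕ-mono : ∀ {a b} → a ⊑ b → toℕ a ≤ toℕ b
toℕ-mono f⊑b = z≤n
toℕ-mono t⊑t = s≤s z≤n
```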

The order `<` of our elements and the monotonicity of our analysis are useful
to us for another reason: they help gauge and limit, in a roundabout way, how much
work might be left for our analysis to do. This matters because we don't want
to allow analyses that can take forever to finish -- that's a little too long
for a pragmatic tool used by people.

The key observation -- which I will describe in detail in a later post --
is that a monotonic analysis, in a way, "climbs upwards" through an
order. As we continue using this analysis to refine information over and over,
its results get
{{< sidenote "right" "less-specific-note" "less and less specific." >}}
It is not a bad thing for our results to get less specific over time, because
our initial information is probably incomplete. If you've only seen German
shepherds in your life, that might be your picture of what a dog is like.
If you then come across a chihuahua, your initial definition of "dog" would
certainly not accommodate it. To allow for both German shepherds and chihuahuas,
you'd have to loosen the definition of "dog". This new definition would be less
specific, but it would be more accurate.
{{< /sidenote >}}
If we add an additional ingredient, and say that the order has a _fixed height_,
we can deduce that the analysis will eventually stop producing additional
information: either it will keep "climbing", and reach the top (thus having
to stop), or it will stop on its own before reaching the top. This is
the essence of the fixed-point algorithm, which in Agda-like pseudocode can
be stated as follows:

```Agda
module _ (IsFiniteHeight A ≺)
         (f : A → A)
         (Monotonicᶠ : Monotonic _≼_ _≼_ f) where
  -- There exists a point...
  aᶠ : A

  -- Such that applying the monotonic function doesn't change the result.
  aᶠ≈faᶠ : aᶠ ≈ f aᶠ
```

Moreover, the value we'll get out of the fixed point algorithm will be
the _least fixed point_. For us, this means that the result will be
"the most specific result possible".

{{< codelines "Agda" "agda-spa/Fixedpoint.agda" 86 86 >}}

The above explanation omits a lot of details, but it's a start. To get more
precise, we must drill down into several aspects of what I've said so far.
The first of them is, __how can we compare program information using an order?__

### Lattices

Let's start with a question: when it comes to our specificity-based order,
is `-` less than, greater than, or equal to `+`? Surely it's not less specific;
knowing that a number is negative doesn't give you less information than
knowing if that number is positive. Similarly, it's not any more specific, for
the same reason. You could consider it equally specific, but that doesn't
seem quite right either; the information is different, so comparing specificity
feels apples-to-oranges. On the other hand, both `+` and `-` are clearly
more specific than `unknown`.

The solution to this conundrum is to simply refuse to compare certain elements:
`+` is neither less than, greater than, nor equal to `-`, but `+ < unknown` and
`- < unknown`. Such an ordering is called a [partial order](https://en.wikipedia.org/wiki/Partially_ordered_set).
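
Spelled out in full, the only strict comparisons our four-element order admits are these three:

{{< latex >}}
- < \text{unknown}, \qquad 0 < \text{unknown}, \qquad + < \text{unknown}
{{< /latex >}}

Every other pair of distinct elements is simply incomparable.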

Next, another question. Suppose that the user writes code like this:

```
if someCondition {
    x = exprA;
} else {
    x = exprB;
}
y = x;
```

If `exprA` has sign `s1`, and `exprB` has sign `s2`, what's the sign of `y`?
It's not necessarily `s1` nor `s2`, since they might not match: `s1` could be `+`,
and `s2` could be `-`, and using either `+` or `-` for `y` would be incorrect.
We're looking for something that can encompass _both_ `s1` and `s2`.
Necessarily, it would be either equally specific or less specific than
either `s1` or `s2`: there isn't any new information coming in about `x`,
and since we don't know which branch is taken, we stand to lose a little
bit of info. However, our goal is always to maximize specificity, since
more specific signs give us more information about our program.

This gives us the following constraints. Since the combined sign `s` has to
be equally or less specific than either `s1` or `s2`, we have `s1 <= s` and
`s2 <= s`. However, we want to pick `s` such that it's more specific than
any other "combined sign" candidate. Thus, if there's another sign `t`,
with `s1 <= t` and `s2 <= t`, then it must be less specific than `s`: `s <= t`.

At first, the above constraints might seem quite complicated. We can interpret
them in more familiar territory by looking at numbers instead of signs.
If we have two numbers `n1` and `n2`, what number is the smallest number
that's bigger than either `n1` or `n2`? Why, the maximum of the two, of course!

There is a reason why I used the constraints above instead of just saying
"maximum". For numbers, `max(a,b)` is either `a` or `b`. However, we saw earlier
that neither `+` nor `-` works as the sign for `y` in our program. Moreover,
we agreed above that our order is _partial_: how can we pick "the bigger of two
elements" if neither is bigger than the other? `max` itself doesn't quite work,
but what we're looking for is something similar. Instead, we simply require a
similar function for our signs. We call this function "[least upper bound](https://en.wikipedia.org/wiki/Least-upper-bound_property)",
since it is the "least (most specific)
element that's greater (less specific) than either `s1` or `s2`". Conventionally,
this function is written as \(a \sqcup b\) (or in our case, \(s_1 \sqcup s_2\)).
The \((\sqcup)\) symbol is also called the _join_ of \(a\) and \(b\).
We can define it for our signs so far using the following [Cayley table](https://en.wikipedia.org/wiki/Cayley_table).
{#least-upper-bound}

{{< latex >}}
\begin{array}{c|cccc}
\sqcup & - & 0 & + & ? \\
\hline
- & - & ? & ? & ? \\
0 & ? & 0 & ? & ? \\
+ & ? & ? & + & ? \\
? & ? & ? & ? & ? \\
\end{array}
{{< /latex >}}

By using the above table, we can see that \((+\ \sqcup\ -)\ =\ ?\) (aka `unknown`).
This is correct; given the four signs we're working with, that's the most we can say.
Let's explore the analogy to the `max` function a little bit more, by observing
that this function has certain properties:

* `max(a, a) = a`. The maximum of one number is just that number.
Mathematically, this property is called _idempotence_. Note that
by inspecting the diagonal of the above table, we can confirm that our
\((\sqcup)\) function is idempotent.
* `max(a, b) = max(b, a)`. If you're taking the maximum of two numbers,
it doesn't matter which one you consider first. This property is called
_commutativity_. Note that if you mirror the table along the diagonal,
it doesn't change; this shows that our \((\sqcup)\) function is
commutative.
* `max(a, max(b, c)) = max(max(a, b), c)`. When you have three numbers,
and you're determining the maximum value, it doesn't matter which pair of
numbers you compare first. This property is called _associativity_. You
can use the table above to verify that \((\sqcup)\) is associative, too.

A set that has a binary operation (like `max` or \((\sqcup)\)) that
satisfies the above properties is called a [semilattice](https://en.wikipedia.org/wiki/Semilattice). In Agda, we can write this definition roughly as follows:

```Agda
record IsSemilattice {a} (A : Set a) (_⊔_ : A → A → A) : Set a where
  field
    ⊔-assoc : (x y z : A) → ((x ⊔ y) ⊔ z) ≡ (x ⊔ (y ⊔ z))
    ⊔-comm : (x y : A) → (x ⊔ y) ≡ (y ⊔ x)
    ⊔-idemp : (x : A) → (x ⊔ x) ≡ x
```
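
As a quick sanity check of this rough definition, here's how the natural numbers with `max` would inhabit it, reusing proofs from the standard library's `Data.Nat.Properties` (an illustration only; the real, more general definitions appear below):

```Agda
open import Data.Nat using (ℕ; _⊔_)
open import Data.Nat.Properties using (⊔-assoc; ⊔-comm; ⊔-idem)

-- ℕ with max forms a semilattice; all three laws are
-- already proven in the standard library.
ℕ-max-semilattice : IsSemilattice ℕ _⊔_
ℕ-max-semilattice = record
  { ⊔-assoc = ⊔-assoc
  ; ⊔-comm  = ⊔-comm
  ; ⊔-idemp = ⊔-idem
  }
```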

Note that this record is an example of the ["Is Something" pattern]({{< relref "agda_is_pattern" >}}).
It turns out to be convenient, however, to not require definitional equality
(`≡`). For instance, we might model sets as lists. Definitional equality
would force us to consider lists with the same elements but a different
order to be unequal. Instead, we parameterize our definition of `IsSemilattice`
by a binary relation `_≈_`, which we ask to be an [equivalence relation](https://en.wikipedia.org/wiki/Equivalence_relation).
{#definitional-equality}

{{< codelines "Agda" "agda-spa/Lattice.agda" 23 39 >}}

Notice that the above code also provides -- but doesn't require -- `_≼_` and
`_≺_`. That's because a least-upper-bound operation encodes an order:
intuitively, if `max(a, b) = b`, then `b` must be larger than `a`.
Lars Hupel's CRDT series includes [an explanation](https://lars.hupel.info/topics/crdt/03-lattices/#there-) of how the ordering operator
and the "least upper bound" function can be constructed from one another.
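
The construction fits in one line; up to the exact symbols used in the linked code, the derived order is:

{{< latex >}}
a \preceq b \iff a \sqcup b \approx b
{{< /latex >}}

For instance, \(+ \preceq \text{unknown}\) holds precisely because \((+ \sqcup \text{unknown}) = \text{unknown}\).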

As it turns out, the `min` function has very similar properties to `max`:
it's idempotent, commutative, and associative. For a partial order like
ours, the analog to `min` is "greatest lower bound", or "the largest value
that's smaller than both inputs". Such a function is denoted as \(a \sqcap b\),
and often called the "meet" of \(a\) and \(b\).
As for what it means, where \(s_1 \sqcup s_2\) means "combine two signs where
you don't know which one will be used" (like in an `if`/`else`),
\(s_1 \sqcap s_2\) means "combine two signs where you know
{{< sidenote "right" "or-join-note" "both of them to be true" -7 >}}
If you're familiar with <a href="https://en.wikipedia.org/wiki/Boolean_algebra">
Boolean algebra</a>, this might look a little bit familiar to you. In fact,
the symbol for "and" on booleans is \(\land\). Similarly, the symbol
for "or" is \(\lor\). So, \(s_1 \sqcup s_2\) means "the sign is \(s_1\) or \(s_2\)",
or "(the sign is \(s_1\)) \(\lor\) (the sign is \(s_2\))". Similarly,
\(s_1 \sqcap s_2\) means "(the sign is \(s_1\)) \(\land\) (the sign is \(s_2\))".
Don't these symbols look similar?<br>
<br>
In fact, booleans with \((\lor)\) and \((\land)\) satisfy the semilattice
laws we've been discussing, and together form a lattice (which I'm building
up to in the main body of the text). The same is true for the set union and
intersection operations, \((\cup)\) and \((\cap)\).
{{< /sidenote >}}". For example, \((+\ \sqcap\ ?)\ =\ +\), because a variable
that's both "any sign" and "positive" must be positive.
{#lub-glub-or-and}

There's just one hiccup: what's the greatest lower bound of `+` and `-`?
It needs to be a value that's less than both of them, but so far, we don't have
such a value. Intuitively, this value should be called something like `impossible`,
because a number that's both positive and negative doesn't exist. So, let's
extend our analyzer to have a new `impossible` value. In fact, it turns
out that this "impossible" value is the least element of our set (we added
it to be the lower bound of `+` and co., which in turn are less than `unknown`).
Similarly, `unknown` is the largest element of our set, since it's greater
than `+` and co, and transitively greater than `impossible`. In mathematics,
it's not uncommon to define the least element as \(\bot\) (read "bottom"), and the
greatest element as \(\top\) (read "top"). With that in mind, the
following are the updated Cayley tables for our operations.

{{< latex >}}
\begin{array}{c|ccccc}
\sqcup & - & 0 & + & \top & \bot \\
\hline
- & - & \top & \top & \top & - \\
0 & \top & 0 & \top & \top & 0 \\
+ & \top & \top & + & \top & + \\
\top & \top & \top & \top & \top & \top \\
\bot & - & 0 & + & \top & \bot \\
\end{array}

\qquad

\begin{array}{c|ccccc}
\sqcap & - & 0 & + & \top & \bot \\
\hline
- & - & \bot & \bot & - & \bot \\
0 & \bot & 0 & \bot & 0 & \bot \\
+ & \bot & \bot & + & + & \bot \\
\top & - & 0 & + & \top & \bot \\
\bot & \bot & \bot & \bot & \bot & \bot \\
\end{array}
{{< /latex >}}
{#sign-lattice}

So, it turns out that our set of possible signs is a semilattice in two
ways. And if "semi" means "half", does two "semi"s make a whole? Indeed it does!

A lattice is made up of two semilattices. The operations of these two lattices,
however, must satisfy some additional properties. Let's examine the properties
in the context of `min` and `max` as we have before. They are usually called
the _absorption laws_:
{#absorption-laws}

* `max(a, min(a, b)) = a`. `a` is either less than or bigger than `b`;
so if you try to find the maximum __and__ the minimum of `a` and
`b`, one of the operations will return `a`.
* `min(a, max(a, b)) = a`. The reason for this one is the same as
the reason above. A concrete instance with our signs follows below.
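
As promised, here is a concrete instance of both laws with our sign operations, with each step read off the tables above:

{{< latex >}}
+ \sqcup (+ \sqcap \top) = + \sqcup + = +
\qquad
+ \sqcap (+ \sqcup \top) = + \sqcap \top = +
{{< /latex >}}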

In Agda, we can therefore write a lattice as follows:

{{< codelines "Agda" "agda-spa/Lattice.agda" 183 193 >}}

### Concrete Examples
#### Natural Numbers

Since we've been talking about `min` and `max` as motivators for properties
of \((\sqcap)\) and \((\sqcup)\), it might not be all that surprising
that natural numbers form a lattice with `min` and `max` as the two binary
operations. In fact, the Agda standard library writes `min` as `_⊓_` and
`max` as `_⊔_`! We can make use of the already-proven properties of these
operators to easily define `IsLattice` for natural numbers. Notice that
since we're not doing anything clever, like considering lists up to reordering,
there's no reason not to use definitional equality `≡` for our equivalence relation.

{{< codelines "Agda" "agda-spa/Lattice/Nat.agda" 1 45 >}}

The definition for the lattice instance itself is pretty similar; I'll omit
it here to avoid taking up a lot of vertical space, but you can find it
on lines 47 through 83 of [my `Lattice.Nat` module]({{< codeurl "agda-spa/Lattice/Nat.agda" >}}).

#### The "Above-Below" Lattice

It's not too hard to implement our sign lattice in Agda. However, we can
do it in a somewhat general way. As it turns out, extending an existing set,
such as \(\{+, -, 0\}\), with a "bottom" and "top" element (to be used
when taking the least upper bound and greatest lower bound) is quite common
and useful. For instance, if we were to do constant propagation (simplifying
`7+4` to `11`), we would probably do something similar, using the set
of integers \(\mathbb{Z}\) instead of the plus-zero-minus set.

The general definition is as follows. Take some original set \(S\) (like our 3-element
set of signs), and extend it with new "top" and "bottom" elements (\(\top\) and
\(\bot\)). Then, define \((\sqcup)\) as follows:

{{< latex >}}
x_1 \sqcup x_2 =
\begin{cases}
\top & x_1 = \top\ \text{or}\ x_2 = \top \\
\top & x_1, x_2 \in S, x_1 \neq x_2 \\
x_1 & x_1, x_2 \in S, x_1 = x_2 \\
x_1 & x_2 = \bot \\
x_2 & x_1 = \bot
\end{cases}
{{< /latex >}}

In other words, \(\top\) overrules anything that it's combined with. In
math terms, it's the __absorbing element__ of the lattice. On the other hand,
\(\bot\) gets overruled by anything it's combined with. In math terms, that's
an __identity element__. Finally, when combining two elements that _aren't_
\(\top\) or \(\bot\) (which would otherwise be covered by the prior sentences),
combining an element with itself leaves it unchanged (upholding idempotence),
while combining two unequal elements results in \(\top\). That last part
matches the way we defined "least upper bound" earlier.

The intuition is as follows: the \((\sqcup)\) operator is like an "or". Then,
"anything or positive" means "anything"; same with "anything or negative", etc.
On the other hand, "impossible or positive" means positive, since one of those
cases will never happen. Finally, in the absence of additional elements, the
most we can say about "positive or negative" is "any sign"; of course,
"positive or positive" is the same as "positive".


The "greatest lower bound" operator is defined by effectively swapping top
and bottom.

{{< latex >}}
x_1 \sqcap x_2 =
\begin{cases}
\bot & x_1 = \bot\ \text{or}\ x_2 = \bot \\
\bot & x_1, x_2 \in S, x_1 \neq x_2 \\
x_1 & x_1, x_2 \in S, x_1 = x_2 \\
x_1 & x_2 = \top \\
x_2 & x_1 = \top
\end{cases}
{{< /latex >}}

For this operator, \(\bot\) is the absorbing element, and \(\top\) is the
identity element. The intuition here is not too different: if
\((\sqcap)\) is like an "and", then "impossible and positive" can't happen;
same with "impossible and negative", and so on. On the other hand,
"anything and positive" clearly means positive. Finally, "negative and positive"
can't happen (again, there is no number that's both positive and negative),
and "positive and positive" is just "positive".

What properties of the underlying set did we use to get this to work? The
only thing we needed is to be able to check and see if two elements are
equal or not; this is called _decidable equality_. Since that's the only
thing we used, this means that we can define an "above/below" lattice like this
for any type for which we can check if two elements are equal. In Agda, I encoded
this using a [parameterized module](https://agda.readthedocs.io/en/latest/language/module-system.html#parameterised-modules):

{{< codelines "Agda" "agda-spa/Lattice/AboveBelow.agda" 5 8 >}}

From there, I defined the actual data type as follows:

{{< codelines "Agda" "agda-spa/Lattice/AboveBelow.agda" 23 26 >}}

Then, I defined the \((\sqcup)\) and \((\sqcap)\) operations almost
exactly following the mathematical equations above (the cases were re-ordered to
improve Agda's reduction behavior). Here's the former:

{{< codelines "Agda" "agda-spa/Lattice/AboveBelow.agda" 86 93 >}}

And here's the latter:

{{< codelines "Agda" "agda-spa/Lattice/AboveBelow.agda" 181 188 >}}

The proofs of the lattice properties are straightforward and proceed
by simple case analysis. Unfortunately, Agda doesn't quite seem to
evaluate the binary operator in every context that I would expect it to,
which has led me to define some helper lemmas such as the following:

{{< codelines "Agda" "agda-spa/Lattice/AboveBelow.agda" 95 96 >}}

As a sample, here's a proof of commutativity of \((\sqcup)\):

{{< codelines "Agda" "agda-spa/Lattice/AboveBelow.agda" 158 165 >}}

The details of the rest of the proofs can be found in the
[`AboveBelow.agda` file]({{< codeurl "agda-spa/Lattice/AboveBelow.agda" >}}).

To recover the sign lattice we've been talking about all along, it's sufficient
to define a sign data type:

{{< codelines "Agda" "agda-spa/Analysis/Sign.agda" 19 22 >}}

Then, prove decidable equality on it (effectively defining a comparison function),
and instantiate the `AboveBelow` module:

{{< codelines "Agda" "agda-spa/Analysis/Sign.agda" 34 47 >}}
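
If you haven't seen decidable equality before, such a proof is just an exhaustive case analysis. A self-contained sketch (with hypothetical constructor names, not necessarily the ones my code uses) looks like this:

```Agda
open import Relation.Binary.PropositionalEquality using (_≡_; refl)
open import Relation.Nullary using (Dec; yes; no)

data Sign : Set where
  pos zer neg : Sign

-- Equality of signs is decidable: for any two signs, we can
-- produce either a proof that they're equal or a refutation.
_≟_ : (x y : Sign) → Dec (x ≡ y)
pos ≟ pos = yes refl
pos ≟ zer = no (λ ())
pos ≟ neg = no (λ ())
zer ≟ pos = no (λ ())
zer ≟ zer = yes refl
zer ≟ neg = no (λ ())
neg ≟ pos = no (λ ())
neg ≟ zer = no (λ ())
neg ≟ neg = yes refl
```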

### From Simple Lattices to Complex Ones

Natural numbers and signs alone are cool enough, but they will not be sufficient
to write program analyzers. That's because when we're writing an analyzer,
we don't just care about one variable: we care about all of them! An
initial guess might be to say that when analyzing a program, we really need
_several_ signs: one for each variable. This might be reminiscent of a
[map](https://en.wikipedia.org/wiki/Associative_array). So, when we compare
specificity, we'll really be comparing the specificity of maps. Even that,
though, is not enough. The reason is that variables might have different
signs at different points in the program! A single map would not be able to
capture that sort of nuance, so what we really need is a map associating
states with another map, which in turn associates variables with their signs.

Mathematically, we might write this as:

{{< latex >}}
\text{Info} \triangleq \text{ProgramStates} \to (\text{Variables} \to \text{Sign})
{{< /latex >}}

That's a big step up in complexity. We now have a doubly-nested map structure
instead of just a sign, and we need to compare such maps in order to gauge
their specificity and advance our analyses. But where do we even start with
maps, and how do we define the \((\sqcup)\) and \((\sqcap)\) operations?

The solution turns out to be to define ways in which simpler lattices
(like our sign) can be combined and transformed to define more complex lattices.
We'll move on to that in the next post of this series.
content/blog/01_types_basics/conversion.bergamot (new file, 19 lines)
@@ -0,0 +1,19 @@
section "Conversion rules" {
    ConvertsIS @ converts(integer, string) <-;
    ConvertsIF @ converts(integer, float) <-;
    ConvertsFS @ converts(float, string) <-;
}

section "Rules for literals" {
    TInt @ type(lit(?n), integer) <- int(?n);
    TFloat @ type(lit(?f), float) <- float(?f);
    TString @ type(lit(?s), string) <- str(?s);
}

section "" {
    TPlusInt @ type(plus(?e_1, ?e_2), integer) <- type(?e_1, integer), type(?e_2, integer);
    TPlusFloat @ type(plus(?e_1, ?e_2), float) <- type(?e_1, float), type(?e_2, float);
    TPlusString @ type(plus(?e_1, ?e_2), string) <- type(?e_1, string), type(?e_2, string);
}

TConverts @ type(?e, ?tau_2) <- type(?e, ?tau_1), converts(?tau_1, ?tau_2);
@@ -2,7 +2,25 @@
|
|||||||
title: "Everything I Know About Types: Basics"
|
title: "Everything I Know About Types: Basics"
|
||||||
date: 2022-06-30T19:08:50-07:00
|
date: 2022-06-30T19:08:50-07:00
|
||||||
tags: ["Type Systems", "Programming Languages"]
|
tags: ["Type Systems", "Programming Languages"]
|
||||||
|
series: "Everything I Know About Types"
|
||||||
draft: true
|
draft: true
|
||||||
|
bergamot:
|
||||||
|
render_presets:
|
||||||
|
default: "bergamot/rendering/lc.bergamot"
|
||||||
|
presets:
|
||||||
|
notation:
|
||||||
|
prompt: "type(TERM, ?t)"
|
||||||
|
query: ""
|
||||||
|
file: "notation.bergamot"
|
||||||
|
string:
|
||||||
|
prompt: "type(TERM, ?t)"
|
||||||
|
query: "\"hello\"+\"world\""
|
||||||
|
file: "string.bergamot"
|
||||||
|
conversion:
|
||||||
|
prompt: "type(TERM, ?t)"
|
||||||
|
query: ""
|
||||||
|
file: "conversion.bergamot"
|
||||||
|
|
||||||
---
|
---
|
||||||

It's finally time to start looking at types. As I mentioned, I want
@@ -48,7 +66,7 @@ int x = 0;
```

Things in C++, C#, and many other languages look very similar.
In Rust, we have to make an even finer distinction: we have to
distinguish between integers represented using 32 bits and those
represented by 64 bits. Focusing on the former, we
could write:
@@ -66,7 +84,7 @@ assign it to a variable; the following suffices.

That should be enough examples of integers for now. I'm sure you've seen
them in your programming or computer science career. What you
may not have seen, though, is the formal, mathematical way of
stating that some expression or value has a particular type.
In the mathematical notation, too, there's no need to assign a value to
a variable to state its type. The notation is actually very similar
@@ -78,14 +96,14 @@ to that of Haskell; here's how one might write the claim that 1 is a number.

There's one more difference between mathematical notation and the
code we've seen so far. If you wrote `num`, or `aNumber`, or anything
other than just `number` in the TypeScript example (or if you similarly
deviated from the "correct" name in other languages), you'd be greeted with
an error. The compilers or interpreters of these languages only understand a
fixed set of types, and we are required to stick to names in that set. We have no such
duty when using mathematical notation. The main goal of a mathematical definition
is not to run the code, or check if it's correct; it's to communicate something
to others. As long as others understand what you mean, you can do whatever you want.
I _chose_ to use the word \(\text{number}\) to represent the type
of numbers, mainly because it's _very_ clear what that means. A theorist writing
a paper might cringe at the verbosity of such a convention. My goal, however, is
to communicate things to _you_, dear reader, and I think it's best to settle for
@@ -110,6 +128,20 @@ Another consequence of this is that not everyone agrees on notation; according
to [this paper](https://labs.oracle.com/pls/apex/f?p=LABS:0::APPLICATION_PROCESS%3DGETDOC_INLINE:::DOC_ID:959),
27 different ways of writing down substitutions were observed in the POPL conference alone.

{{< bergamot_exercise label="bergamot; tweaking notation" preset="notation" id="exercise-1" >}}
Bergamot, the interactive tool I've developed for doing exercises, supports
customizing the notation for rules. Try changing the \(:\) symbol to
the \(\sim\) symbol (denoted in LaTeX as `\sim`).

To change the way that rules are rendered, click the "Presentation Rules"
tab in the "Rules" section. There will be a lot there: I've added rules for
pretty-printing a fair amount of the standard programming languages notation.
Scroll down to `LatexTypeBin`, and change `:` to
`\\sim` on that line (the extra backslash is to handle string
escaping). Now try typing numbers into the input box; you should see
something like \(1 \sim \text{number}\).
{{< /bergamot_exercise >}}

One more thing. So far, we've only written down one claim: the value 1 is a number.
What about the other numbers? To make sure they're accounted for, we need similar
rules for 2, 3, and so on.
@@ -126,8 +158,8 @@ This is exactly what is done in PL. We'd write the following.
n:\text{number}
{{< /latex >}}

What's this \(n\)? First, recall that notation is up to us. I'm choosing to use the letter
\(n\) to stand for "any value that is a number". We write a symbol, say what we want it to mean,
and we're done.

{{< dialog >}}
@@ -149,8 +181,22 @@ by \(n\)) the type \(\text{number}\).
{{< /message >}}
{{< /dialog >}}

Actually, to be extra precise, we might want to be explicit about our claim
that \(n\) is a number, rather than resorting to notational conventions.
To do so, we'd need to write something like the following:

{{< latex >}}
\cfrac{n \in \texttt{Num}}{n : \text{number}}
{{< /latex >}}

Where \(\texttt{Num}\) denotes the set of numbers in our syntax (`1`, `3.14`, etc.).
The stuff above the line is called a premise, and it's simply a condition
required for the rule to hold. The rule then says that \(n\) has type number --
but only if \(n\) is a numeric symbol in our language. We'll talk about premises
in more detail later on.
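
For example, instantiating \(n\) with the literal `1` gives one concrete rule:

{{< latex >}}
\cfrac{1 \in \texttt{Num}}{1 : \text{number}}
{{< /latex >}}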

Having introduced this variable-like thing \(n\), we need to be careful.
It's important to note that the letter \(n\) is
not a variable like `x` in our code snippets above. In fact, it's not at all part of the programming
language we're discussing. Rather, it's kind of like a variable in our _rules_.

@@ -158,31 +204,38 @@ This distinction comes up a lot. The thing is, the notation we're building up to
kind of language. It's not meant for a computer to execute, mind you, but that's not a requirement
for something to be a language (ever heard of English?). The bottom line is, we have symbols with
particular meanings, and there are rules to how they have to be written. The statement "1 is a number"
must be written by first writing 1, then a colon, then \(\text{number}\). It's a language.

Really, then, we have two languages to think about:
* The _object language_ is the programming language we're trying to describe and mathematically
formalize. This is the language that has variables like `x`, keywords like `let` and `const`, and so on.

Some examples of our object language that we've seen so far are `1` and `2+3`.
In our mathematical notation, they look like \(1\) and \(2+3\).

* The _meta language_ is the notation we use to talk about our object language. It consists of
the various symbols we define, and is really just a system for communicating various things
(like type rules) to others.

Expressions like \(n \in \texttt{Num}\) and \(1 : \text{number}\)
are examples of our meta language.

Using this terminology, \(n\) is a variable in our meta language; this is commonly called
a _metavariable_. A rule such as \(n:\text{number}\) that contains metavariables isn't
really a rule by itself; rather, it stands for a whole bunch of rules, one for each possible
number that \(n\) can be. We call this a _rule schema_.

Alright, that's enough theory for now. Let's go back to the real world. Working with
plain old values like `1` gets boring quickly. There aren't many programs you can write
with them! Numbers can be added, though, so why don't we look at that? All mainstream
languages can do this quite easily. Here's TypeScript:

```TypeScript
const y = 1+1;
```

When it comes to adding whole numbers, every other language is pretty much the same.
Throwing other types of numbers into the mix, we
can arrive at our first type error. Here it is in Rust:

```Rust
@@ -230,9 +283,9 @@ const x: number = 1.1 + 1; // just fine!

That concludes the second round of real-world examples. Let's take a look at formalizing
all of this mathematically. As a starting point, we can look at a rule that matches the TypeScript
view of having only a single number type, \(\text{number}\). This rule needs a little
bit "more" than the ones we've seen so far; we can't just blindly give things in the
form \(a+b\) the type \(\text{number}\) (what if we're adding strings?). For our
rule to behave in the way we have in mind, it's necessary for us to add _premises_.
Before I explain any further, let me show you the rule.

@@ -240,7 +293,7 @@ Before I explain any further, let me show you the rule.
\frac{e_1:\text{number}\quad e_2:\text{number}}{e_1+e_2:\text{number}}
{{< /latex >}}

In the above (and elsewhere) we will use the metavariable \(e\) as a stand-in for
any _expression_ in our source language. In general, expressions are things such as `1`,
`x`, `1.0+someFunction(y)`, and so on. In other words, they're things we can evaluate
to a value. For the purposes of this article, though, we're only looking at basic
@@ -250,12 +303,12 @@ For the moment, we will avoid rules for checking _statements_ (like `let x = 5;`

Rules like the above consist of premises (above the line) and conclusions (below the line).
The conclusion is the claim / fact that we can determine from the rule. In this specific case,
the conclusion is that \(e_1+e_2\) has type \(\text{number}\).
For this to be true, however, some conditions must be met; specifically, the sub-expressions
\(e_1\) and \(e_2\) must themselves be of type \(\text{number}\). These are the premises.
Reading in plain English, we could pronounce this rule as:

> If \(e_1\) and \(e_2\) have type \(\text{number}\), then \(e_1+e_2\) has type \(\text{number}\).

Notice that we don't care what the left and right operands are (we say they can be any expression).
We need not concern ourselves with how to compute _their_ type in this specific rule. Thus, the rule
@@ -272,11 +325,40 @@ Just to get some more practice, let's take a look at a rule for adding strings.

This rule is read as follows:

> If \(e_1\) and \(e_2\) have type \(\text{string}\), then \(e_1+e_2\) has type \(\text{string}\).

{{< bergamot_exercise label="bergamot; adding rules for strings" preset="string" id="exercise-2" >}}
Try writing the Bergamot rules that correspond to the inference rule for strings
above. I've provided the rules for numbers; the rules for strings should be quite
similar.

In Bergamot, the claim that an expression `e` has type `t`
is written as `type(e, t)`. A rule looks like `RuleName @ conclusion <- premise1, premise2;`.
Thus, the rule

```
TNumber @ type(lit(?n), number) <- num(?n);
```

has one premise, that the term \(n\) is a number, and the conclusion is that
a number literal has type \(\text{number}\). The `num` condition
is a Bergamot builtin, corresponding to our earlier notation of \(n \in \texttt{Num}\).
It holds for all numbers: it's always true that `num(1)`, `num(2)`,
etc. The equivalent builtin for something being a string is `str`.

To edit the rules in Bergamot, click the "Editor" button in the "Rules"
section. You will need to add two rules, just like we did for numbers:
a rule for string literals (something like \(\texttt{"Hello"} : \text{string}\),
but more general) and for adding two strings together. I suggest naming
these two rules `TString` and `TPlusS` respectively.

When you're done, you should be able to properly determine the types of
expressions such as `"Hello"` and `"Hello" + "World"`.
{{< /bergamot_exercise >}}
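
If it helps to see these rules outside of Bergamot, here is a rough Python sketch (my own illustration, using a made-up tuple encoding of expressions) of how the literal and addition rules classify expressions:

```Python
# Toy expressions: ("lit", value) or ("plus", left, right).
def type_of(expr):
    if expr[0] == "lit":
        # Like TNumber/TString: a literal's type comes from its kind of value.
        if isinstance(expr[1], (int, float)):
            return "number"
        if isinstance(expr[1], str):
            return "string"
    elif expr[0] == "plus":
        # Like the addition rules: both premises must yield the same type.
        left, right = type_of(expr[1]), type_of(expr[2])
        if left is not None and left == right:
            return left
    return None  # no rule matches: a type error

print(type_of(("plus", ("lit", "Hello"), ("lit", "World"))))  # string
print(type_of(("plus", ("lit", 1), ("lit", "two"))))          # None
```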

These rules generally work in other languages. Things get more complicated in languages like Java and Rust,
where types for numbers are more precise (\(\text{int}\) and \(\text{float}\) instead of
\(\text{number}\)). In these languages, we need rules for both.

{{< latex >}}
\frac{e_1:\text{int}\quad e_2:\text{int}}{e_1+e_2:\text{int}}
@@ -306,13 +388,40 @@ from the conversion rules. Chapter 15 of _Types and Programming Languages_
by Benjamin Pierce is a nice explanation, but the [Wikipedia page](https://en.wikipedia.org/wiki/Subtyping)
ain't bad, either.

{{< bergamot_exercise label="advanced; a taste of conversions" preset="conversion" id="exercise-3" >}}
This exercise is simply an early taste of formalizing conversions, which
allow users to (for example) write numbers where the language expects strings, with the
understanding that the number will be automatically turned into a string.

To avoid having an explosion of various rules, we instead define the "converts to"
relation, \(\tau_1 \preceq \tau_2\), where \(\tau_1\) and \(\tau_2\)
are types. To say that an integer can be automatically converted to a floating-point
number, we can write \(\text{integer} \preceq \text{float}\).
Then, we add only a single additional rule to our language: `TConverts`.
This rule says that we can treat an expression of type \(\tau_1\) as
an expression of type \(\tau_2\), if the former can be converted to the
latter.

I have written some rules using these concepts. Input some expressions into
the box below that would require a conversion: some examples might be
`1 + 3.14` (adding an integer to a float), `1 + "hello"` (adding
an integer to a string), and `1.0 + "hello"` (adding a float to a string).
Click the "Proof Tree" tab to see how the various rules combine to make
the expression well-typed.

Now, remove the `ConvertsIS` rule that allows integers to be converted to
strings. Do all of the expressions from the previous paragraph still typecheck?
Can you see why?

{{< /bergamot_exercise >}}
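
Continuing the Python aside from the previous exercise, here is one rough way to layer the conversion facts on top of the toy checker (a sketch mirroring the `ConvertsIS`/`ConvertsIF`/`ConvertsFS` facts and the `TConverts` rule, not Bergamot's actual proof search):

```Python
# "Converts to" facts, mirroring ConvertsIS, ConvertsIF, and ConvertsFS.
CONVERTS = {("integer", "string"), ("integer", "float"), ("float", "string")}

def close_under_conversion(types):
    # TConverts: an expression of type t1 can also be treated as having
    # type t2 whenever converts(t1, t2) holds; iterate to a fixpoint.
    closure, changed = set(types), True
    while changed:
        changed = False
        for (t1, t2) in CONVERTS:
            if t1 in closure and t2 not in closure:
                closure.add(t2)
                changed = True
    return closure

def types_of(expr):
    if expr[0] == "lit":
        base = {int: "integer", float: "float", str: "string"}[type(expr[1])]
        return close_under_conversion({base})
    # ("plus", left, right): the operands must share a type (TPlus* rules).
    return close_under_conversion(types_of(expr[1]) & types_of(expr[2]))

print(types_of(("plus", ("lit", 1), ("lit", 3.14))))  # {'float', 'string'}
```

Note that in this sketch, even with the integer-to-string fact removed, an integer can still reach `string` via `float` by applying the conversion step twice; that chain is what the exercise's final question is nudging you toward.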

Subtyping, however, is quite a bit beyond the scope of a "basics"
post. For the moment, we shall content ourselves with the tedious approach.

Another thing to note is that we haven't yet seen rules for what programs are _incorrect_,
and we never will. When formalizing type systems we rarely (if ever) explicitly enumerate
cases that produce errors. Rather, we interpret the absence of matching rules to indicate
that something is wrong. Since no rule has premises that match \(e_1:\text{float}\) and \(e_2:\text{string}\),
we can infer that
{{< sidenote "right" "float-string-note" "given the rules so far," >}}
I'm trying to be careful here, since adding a float to a string
@@ -367,16 +476,16 @@ Here's a quick summary of what we've covered:
or TypeScript. The _meta language_ is the language that we use
to reason and talk about the object language. Typically, this is
the language we use for writing down our rules.
3. The common type-theoretic notation for "expression \(x\)
has type \(\tau\)" is \(x : \tau\).
4. In writing more complicated rules, we will frequently make use
of the inference rule notation, which looks something like
the following.
{{< latex >}}
\frac{P_1 \quad P_2 \quad ... \quad P_n}{P}
{{< /latex >}}
The above is read as "if \(P_1\) through \(P_n\) are
true, then \(P\) is also true."
5. To support operators like `+` that can work on, say, both numbers
and strings, we provide inference rules for each such case. If this
gets cumbersome, we can introduce a system of _subtypes_ into our
@@ -410,11 +519,11 @@ and already be up-to-speed on a big chunk of the content.
{{< /dialog >}}

#### Metavariables
| Symbol | Meaning | Syntactic Category |
|---------|--------------|-----------------------|
| \(n\) | Numbers | \(\texttt{Num}\) |
| \(s\) | Strings | \(\texttt{Str}\) |
| \(e\) | Expressions | \(\texttt{Expr}\) |

#### Grammar
{{< block >}}
@@ -431,7 +540,22 @@ and already be up-to-speed on a big chunk of the content.
{{< foldtable >}}
| Rule | Description |
|--------------|-------------|
| {{< latex >}}\frac{n \in \texttt{Num}}{n : \text{number}} {{< /latex >}}| Number literals have type \(\text{number}\) |
| {{< latex >}}\frac{s \in \texttt{Str}}{s : \text{string}} {{< /latex >}}| String literals have type \(\text{string}\) |
| {{< latex >}}\frac{e_1 : \text{string}\quad e_2 : \text{string}}{e_1+e_2 : \text{string}} {{< /latex >}}| Adding strings gives a string |
| {{< latex >}}\frac{e_1 : \text{number}\quad e_2 : \text{number}}{e_1+e_2 : \text{number}} {{< /latex >}}| Adding numbers gives a number |

#### Playground
{{< bergamot_widget id="widget" query="" prompt="type(TERM, ?t)" >}}
section "" {
  TNumber @ type(lit(?n), number) <- num(?n);
  TString @ type(lit(?s), string) <- str(?s);
}
section "" {
  TPlusI @ type(plus(?e_1, ?e_2), number) <-
    type(?e_1, number), type(?e_2, number);
  TPlusS @ type(plus(?e_1, ?e_2), string) <-
    type(?e_1, string), type(?e_2, string);
}
{{< /bergamot_widget >}}

content/blog/01_types_basics/notation.bergamot
@@ -0,0 +1 @@
TNumber @ type(lit(?n), number) <- num(?n);

content/blog/01_types_basics/string.bergamot
@@ -0,0 +1,3 @@
TNumber @ type(lit(?n), number) <- num(?n);
TPlusI @ type(plus(?e_1, ?e_2), number) <-
  type(?e_1, number), type(?e_2, number);

@@ -1,7 +1,8 @@
---
title: Compiling a Functional Language Using C++, Part 2 - Parsing
date: 2019-08-03T01:02:30-07:00
tags: ["C++", "Functional Languages", "Compilers"]
series: "Compiling a Functional Language using C++"
description: "In this post, we combine our compiler's tokenizer with a parser, allowing us to extract structure from input source code."
---
In the previous post, we covered tokenizing. We learned how to convert an input string into logical segments, and even wrote up a tokenizer to do it according to the rules of our language. Now, it's time to make sense of the tokens, and parse our language.
@@ -12,9 +13,9 @@ recognizing tokens. For instance, consider a simple language of a matching
number of open and closed parentheses, like `()` and `((()))`. You can't
write a regular expression for it! We resort to a wider class of languages, called
__context free languages__. These languages are ones that are matched by __context free grammars__.
A context free grammar is a list of rules in the form of \(S \rightarrow \alpha\), where
\(S\) is a __nonterminal__ (conceptually, a thing that expands into other things), and
\(\alpha\) is a sequence of nonterminals and terminals (a terminal is a thing that doesn't
expand into other things; for us, this is a token).

Let's write a context free grammar (CFG from now on) to match our parenthesis language:
@@ -26,9 +27,9 @@ S & \rightarrow ()
\end{aligned}
{{< /latex >}}
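
A grammar like this is just data; as a quick aside (my own sketch, not code from this series), it could be encoded in Python as a map from each nonterminal to the bodies of its rules:

```Python
# S -> (S) | (): "S" is the only nonterminal; "(" and ")" are terminals.
GRAMMAR = {"S": [["(", "S", ")"], ["(", ")"]]}
```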

So, how does this work? We start with a "start symbol" nonterminal, which we usually denote as \(S\). Then, to get a desired string,
we replace a nonterminal with the sequence of terminals and nonterminals on the right of one of its rules. For instance, to get `()`,
we start with \(S\) and replace it with the body of the second one of its rules. This gives us `()` right away. To get `((()))`, we
have to do a little more work:

{{< latex >}}
@@ -37,8 +38,8 @@ S \rightarrow (S) \rightarrow ((S)) \rightarrow ((()))

In practice, there are many ways of using a CFG to parse a programming language. Various parsing algorithms support various subsets
of context free languages. For instance, top down parsers follow nearly exactly the structure that we had. They try to parse
a nonterminal by trying to match each symbol in its body. In the rule \(S \rightarrow \alpha \beta \gamma\), it will
first try to match \(\alpha\), then \(\beta\), and so on. If one of the three contains a nonterminal, it will attempt to parse
that nonterminal following the same strategy. However, this leaves a flaw. For instance, consider the grammar

{{< latex >}}
@@ -48,8 +49,8 @@ S & \rightarrow a
\end{aligned}
{{< /latex >}}

A top down parser will start with \(S\). It will then try the first rule, which starts with \(S\). So, dutifully, it will
try to parse __that__ \(S\). And to do that, it will once again try the first rule, and find that it starts with another \(S\)...
This will never end, and the parser will get stuck. A grammar in which a nonterminal can appear in the beginning of one of its rules
is called __left recursive__, and top-down parsers aren't able to handle those grammars.
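
To make the top-down strategy concrete, here's a small recognizer (sketched in Python rather than the C++ used in this series) for the parenthesis grammar with rules \(S \rightarrow (S)\) and \(S \rightarrow ()\); it works precisely because that grammar is not left recursive:

```Python
# Recursive-descent recognizer for S -> (S) | ().
def parse_s(s, i=0):
    """Try to match S starting at index i; return the index after the match."""
    if i < len(s) and s[i] == "(":
        if i + 1 < len(s) and s[i + 1] == ")":  # rule S -> ()
            return i + 2
        j = parse_s(s, i + 1)                   # rule S -> (S)
        if j is not None and j < len(s) and s[j] == ")":
            return j + 1
    return None  # neither rule matched

def matches(s):
    return parse_s(s) == len(s)

print(matches("((()))"))  # True
print(matches("(()"))     # False
```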

@@ -67,8 +68,8 @@ We see nothing interesting on the left side of the dot, so we move (or __shift__
a.aa
{{< /latex >}}

Now, on the left side of the dot, we see something! In particular, we see the body of one of the rules for \(S\) (the second one).
So we __reduce__ the thing on the left side of the dot, by replacing it with the left hand side of the rule (\(S\)):

{{< latex >}}
S.aa
@@ -93,7 +94,7 @@ start symbol, and nothing on the right of the dot, so we're done!
In practice, we don't want to just match a grammar. That would be like saying "yup, this is our language".
Instead, we want to create something called an __abstract syntax tree__, or AST for short. This tree
captures the structure of our language, and is easier to work with than its textual representation. The structure
of the tree we build will often mimic the structure of our grammar: a rule in the form \(S \rightarrow A B\)
will result in a tree named "S", with two children corresponding to the trees built for A and B. Since
an AST captures the structure of the language, we'll be able to toss away some punctuation
like `,` and `(`. These tokens will appear in our grammar, but we will tweak our parser to simply throw them away. Additionally,
@@ -108,8 +109,8 @@ For instance, for `3+2*6`, we want our tree to have `+` as the root, `3` as the
Why? Because this tree represents "the addition of 3 and the result of multiplying 2 by 6". If we had `*` be the root, we'd have
a tree representing "the multiplication of the result of adding 3 to 2 and 6", which is __not__ what our expression means.

So, with this in mind, we want our rule for __addition__ (represented with the nonterminal \(A_{add}\)) to be matched first, and
for its children to be trees created by the multiplication rule, \(A_{mult}\). So we write the following rules:

{{< latex >}}
\begin{aligned}
@@ -119,7 +120,7 @@ A_{add} & \rightarrow A_{mult}
\end{aligned}
{{< /latex >}}

The first rule matches another addition, added to the result of a multiplication. Similarly, the second rule matches another addition, from which the result of a multiplication is then subtracted. We use the \(A_{add}\) on the left side of \(+\) and \(-\) in the body
because we want to be able to parse strings like `1+2+3+4`, which we want to view as `((1+2)+3)+4` (mostly because
subtraction is [left-associative](https://en.wikipedia.org/wiki/Operator_associativity)). So, we want the top level
of the tree to be the rightmost `+` or `-`, since that means it will be the "last" operation. You may be asking,
@@ -138,7 +139,7 @@ A_{mult} & \rightarrow P
{{< /latex >}}

P, in this case, is an application (remember, application has higher precedence than any binary operator).
Once again, if there's no `*` or `/`, we simply fall through to a \(P\) nonterminal, representing application.

Application is refreshingly simple:

@@ -149,7 +150,7 @@ P & \rightarrow B
\end{aligned}
{{< /latex >}}

An application is either only one "thing" (represented with \(B\), for base), such as a number or an identifier,
or another application followed by a thing.

We now need to define what a "thing" is. As we said before, it's a number, or an identifier. We also make a parenthesized
@@ -165,7 +166,7 @@ B & \rightarrow C
\end{aligned}
{{< /latex >}}

What's the last \(C\)? We also want a "thing" to be a case expression. Here are the rules for that:

{{< latex >}}
\begin{aligned}
@@ -180,15 +181,15 @@ L_L & \rightarrow \epsilon
\end{aligned}
{{< /latex >}}

\(L_B\) is the list of branches in our case expression. \(R\) is a single branch, which is in the
form `Pattern -> Expression`. \(N\) is a pattern, which we will for now define to be either a variable name
(\(\text{lowerVar}\)), or a constructor with some arguments. The arguments of a constructor will be
lowercase names, and a list of those arguments will be represented with \(L_L\). One of the bodies
of this nonterminal is just the character \(\epsilon\), which just means "nothing".
We use this because a constructor can have no arguments (like Nil).

We can use these grammar rules to represent any expression we want. For instance, let's try `3+(multiply 2 6)`,
where multiply is a function that, well, multiplies. We start with \(A_{add}\):

{{< latex >}}
\begin{aligned}
@@ -226,15 +227,15 @@ L_U & \rightarrow \epsilon
\end{aligned}
{{< /latex >}}

That's a lot of rules! \(T\) is the "top-level declaration" rule. It matches either
a function or a data definition. A function definition consists of the keyword "defn",
followed by a function name (starting with a lowercase letter), followed by a list of
parameters, represented by \(L_P\).

A data type definition consists of the name of the data type (starting with an uppercase letter),
and a list \(L_D\) of data constructors \(D\). There must be at least one data constructor in this list,
so we don't use the empty string rule here. A data constructor is simply an uppercase variable representing
a constructor of the data type, followed by a list \(L_U\) of zero or more uppercase variables (representing
the types of the arguments of the constructor).

Finally, we want one or more of these declarations in a valid program:
@@ -265,7 +266,7 @@ Next, observe that there's
a certain symmetry between our parser and our scanner. In our scanner, we mixed the theoretical idea of a regular expression
with an __action__, a C++ code snippet to be executed when a regular expression is matched. This same idea is present
in the parser, too. Each rule can produce a value, which we call a __semantic value__. For type safety, we allow
each nonterminal and terminal to produce only one type of semantic value. For instance, all rules for \(A_{add}\) must
produce an expression. We specify the type of each nonterminal using `%type` directives. The types of terminals
are specified when they're declared.
@@ -2,6 +2,7 @@
title: A Language for an Assignment - Homework 3
date: 2020-01-02T22:17:43-08:00
tags: ["Haskell", "Python", "Algorithms", "Programming Languages"]
series: "A Language for an Assignment"
---

It rained in Sunriver on New Year's Eve, and it continued to rain
@@ -1,7 +1,7 @@
---
title: Learning Emulation, Part 2
date: 2016-06-29
tags: ["C", "Emulation"]
---
_This is the second post in a series I'm writing about Chip-8 emulation. If you want to see the first one, head [here]({{< relref "/blog/01_learning_emulation.md" >}})._

content/blog/02_spa_agda_combining_lattices.md
@@ -0,0 +1,513 @@
---
title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 2: Combining Lattices"
series: "Static Program Analysis in Agda"
description: "In this post, I describe how lattices can be combined to create other, more complex lattices"
date: 2024-08-08T16:40:00-07:00
tags: ["Agda", "Programming Languages"]
---

In the previous post, I wrote about how lattices arise when tracking, comparing
and combining static information about programs. I then showed two simple lattices:
the natural numbers, and the (parameterized) "above-below" lattice, which
modified an arbitrary set with "bottom" and "top" elements (\(\bot\) and \(\top\)
respectively). One instance of the "above-below" lattice was the sign lattice,
which could be used to reason about the signs (positive, negative, or zero)
of variables in a program.

At the end of that post, I introduced a source of complexity: the "full"
lattices that we want to use for the program analysis aren't signs or numbers,
but maps of states and variables to lattice-based descriptions. The full lattice
for sign analysis might be something of the form:

{{< latex >}}
\text{Info} \triangleq \text{ProgramStates} \to (\text{Variables} \to \text{Sign})
{{< /latex >}}

Thus, we have to compare and find least upper bounds (among other operations) of not just
signs, but maps! Proving the various lattice laws for signs was not too
challenging, but for a two-level map like \(\text{Info}\) above, we'd
need to do a lot more work. We need tools to build up such complicated lattices.

The way to do this, it turns out, is by using simpler lattices as building blocks.
To start with, let's take a look at a very simple way of combining lattices
into a new one: taking the [Cartesian product](https://mathworld.wolfram.com/CartesianProduct.html).

### The Cartesian Product Lattice

Suppose you have two lattices \(L_1\) and \(L_2\). As I covered in the previous
post, each lattice comes equipped with a "least upper bound" operator \((\sqcup)\)
and a "greatest lower bound" operator \((\sqcap)\). Since we now have two lattices,
let's use numerical suffixes to disambiguate between the operators
of the first and second lattice: \((\sqcup_1)\) will be the LUB operator of
the first lattice \(L_1\), and \((\sqcup_2)\) of the second lattice \(L_2\),
and so on.

Then, let's take the Cartesian product of the elements of \(L_1\) and \(L_2\);
mathematically, we'll write this as \(L_1 \times L_2\), and in Agda, we can
just use the standard [`Data.Product`](https://agda.github.io/agda-stdlib/master/Data.Product.html)
module. Then, I'll define the lattice as another [parameterized module](https://agda.readthedocs.io/en/latest/language/module-system.html#parameterised-modules). Since both \(L_1\) and \(L_2\)
are lattices, this parameterized module will require `IsLattice` instances
for both types:

{{< codelines "Agda" "agda-spa/Lattice/Prod.agda" 1 7 "hl_lines=7" >}}

Elements of \(L_1 \times L_2\) are in the form \((l_1, l_2)\), where
\(l_1 \in L_1\) and \(l_2 \in L_2\). Knowing that, let's define what it means
for two such elements to be equal. Recall that
we opted for a [custom equivalence relation]({{< relref "01_spa_agda_lattices#definitional-equality" >}})
instead of definitional equality to allow similar elements to be considered
equal; we'll have to define a similar relation for our new product lattice.
That's easy enough: we have an equality predicate `_≈₁_` that checks if an element
of \(L_1\) is equal to another, and we have `_≈₂_` that does the same for \(L_2\).
It's reasonable to say that _pairs_ of elements are equal if their respective
first and second elements are equal:

{{< latex >}}
(l_1, l_2) \approx (j_1, j_2) \iff l_1 \approx_1 j_1 \land l_2 \approx_2 j_2
{{< /latex >}}

In Agda:

{{< codelines "Agda" "agda-spa/Lattice/Prod.agda" 39 40 >}}

Verifying that this relation has the properties of an equivalence relation
boils down to the fact that `_≈₁_` and `_≈₂_` are themselves equivalence
relations.

{{< codelines "Agda" "agda-spa/Lattice/Prod.agda" 42 48 >}}

Defining \((\sqcup)\) and \((\sqcap)\) by simply applying the
corresponding operators from \(L_1\) and \(L_2\) seems quite natural as well.

{{< latex >}}
(l_1, l_2) \sqcup (j_1, j_2) \triangleq (l_1 \sqcup_1 j_1, l_2 \sqcup_2 j_2) \\
(l_1, l_2) \sqcap (j_1, j_2) \triangleq (l_1 \sqcap_1 j_1, l_2 \sqcap_2 j_2)
{{< /latex >}}

As an example, consider the product lattice \(\text{Sign}\times\text{Sign}\),
which is made up of pairs of signs that we talked about in the previous
post. Two elements of this lattice are \((+, +)\) and \((+, -)\). Here's
how the \((\sqcup)\) operation is evaluated on them:

{{< latex >}}
(+, +) \sqcup (+, -) = (+ \sqcup + , + \sqcup -) = (+ , \top)
{{< /latex >}}

In Agda, the definition is written very similarly to its mathematical form:

{{< codelines "Agda" "agda-spa/Lattice/Prod.agda" 50 54 >}}
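
For comparison outside of Agda, here's a rough Python rendering of the same componentwise idea (my own sketch with a simplified sign join, not code from the post's repository):

```Python
def sign_join(a, b):
    # Simplified sign LUB: ⊥ is the identity, equal signs stay, otherwise ⊤.
    if a == "⊥":
        return b
    if b == "⊥":
        return a
    return a if a == b else "⊤"

def product_join(join1, join2):
    """Build the product's (⊔) out of the components' (⊔₁) and (⊔₂)."""
    return lambda a, b: (join1(a[0], b[0]), join2(a[1], b[1]))

sign_pair_join = product_join(sign_join, sign_join)
print(sign_pair_join(("+", "+"), ("+", "-")))  # ('+', '⊤'), as in the example
```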

All that's left is to prove the various (semi)lattice properties. Intuitively,
we can see that since the "combined" operator `_⊔_` just independently applies
the element operators `_⊔₁_` and `_⊔₂_`, as long as they are idempotent,
commutative, and associative, so is the "combined" operator itself.
Moreover, the proofs that `_⊔_` and `_⊓_` form semilattices are identical
up to replacing \((\sqcup)\) with \((\sqcap)\). Thus, in Agda, we can write
the code once, parameterizing it by the binary operators involved (and proofs
that these operators obey the semilattice laws).

{{< codelines "Agda" "agda-spa/Lattice/Prod.agda" 56 82 >}}

Above, I used `f₁` to stand for "either `_⊔₁_` or `_⊓₁_`", and similarly
`f₂` for "either `_⊔₂_` or `_⊓₂_`". Much like the semilattice properties,
proving lattice properties boils down to applying the lattice properties of
\(L_1\) and \(L_2\) to individual components.

{{< codelines "Agda" "agda-spa/Lattice/Prod.agda" 84 96 >}}

This concludes the definition of the product lattice, which is made up of
two other lattices. If we have a type of analysis that can be expressed as
{{< sidenote "right" "pair-note" "a pair of two signs," >}}
Perhaps the signs are the smallest and largest possible values of a variable.
{{< /sidenote >}} for example, we won't have to do all the work of
proving the (semi)lattice properties of those pairs. In fact, we can build up
even bigger data structures. By taking a product twice, like
\(L_1 \times (L_2 \times L_3)\), we can construct a lattice of 3-tuples. Any
of the lattices involved in that product can itself be a product; we can
therefore create lattices out of arbitrary bundles of data, so long as
the smallest pieces that make up the bundles are themselves lattices.
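
In terms of the Python sketch above, this nesting costs only one more line; reusing the hypothetical `product_join` and `sign_join` from earlier:

```Python
# L1 × (L2 × L3) as nested pairs: a join for 3-tuples, for free.
triple_join = product_join(sign_join, product_join(sign_join, sign_join))
print(triple_join(("+", ("-", "0")), ("+", ("+", "0"))))  # ('+', ('⊤', '0'))
```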

Products will come in very handy a bit later in this series. For now though,
our goal is to create another type of lattice: the map lattice. We will
take the same approach we did with products: assuming the elements of the
map are lattices, we'll prove that the map itself is a lattice. Then, just
like we could put products inside products when building up lattices, we'll
be able to put a map inside a map. This will allow us to represent the
\(\text{Info}\) lattice, which is a map of maps.

### The Map Lattice
#### The Theory

When I say "map", what I really mean is something that associates keys with
values, like [dictionaries in Python](https://docs.python.org/3/tutorial/datastructures.html#dictionaries).
This data structure need not have a value for every possible key; a very precise
author might call such a map a "partial map". We might have a map
whose value (in Python-ish notation) is `{ "x": +, "y": - }`. Such a map states
that the sign of the variable `x` is `+`, and the sign of variable `y` is
`-`. Another possible map is `{ "y": +, "z": - }`; this one states that
the sign of `y` is `+`, and the sign of another variable `z` is `-`.
|
||||||
|
|
||||||
|
Let's start thinking about what sorts of lattices our maps will be.
The thing that [motivated our introduction]({{< relref "01_spa_agda_lattices#specificity" >}})
of lattices was comparing them by "specificity", so let's try to figure out how
to compare maps. For that, we can begin small, by looking at singleton maps.
If we have `{"x": +}` and `{"x": ⊤}`, which one of them is smaller? Well, we have
previously established that `+` is more specific than (and thus less than) `⊤`. Thus,
it shouldn't be too much of a stretch to say that for singleton maps of the same
key, the one with the smaller value is smaller.

Now, what about a pair of singleton maps like `{"x": +}` and `{"y": ⊤}`? Among
these two, each contains some information that the other does not. Although the
value of `y` is larger than the value of `x`, it describes a different key, so
it seems wrong to use that to call the `y`-singleton "larger". Let's call
these maps incomparable, then. More generally, if we have two maps and each one
has a key that the other doesn't, we can't compare them.

If only one map has a unique key, though, things are different. Take for
instance `{"x": +}` and `{"x": +, "y": +}`. Are they really incomparable?
The keys that the two maps do share can be compared (`+ <= +`, because they're
equal). And the map with the extra key simply carries strictly more
information, so it seems reasonable to call `{"x": +}` the smaller of the two.

All of the above leads to the following conventional definition, which I find
easier to further motivate using \((\sqcup)\) and \((\sqcap)\)
(and [do so below]({{< relref "#union-as-or" >}})).

> A map `m1` is less than or equal to another map `m2` (`m1 <= m2`) if for
> every key `k` that has a value in `m1`, the key also has a value in `m2`, and
> `m1[k] <= m2[k]`.

That definition matches our intuitions so far. The only key in `{"x": +}` is `x`;
this key is also in `{"x": ⊤}` (check) and `+ < ⊤` (check). On the other hand,
both `{"x": +}` and `{"y": ⊤}` have a key that the other doesn't, so the
definition above is not satisfied. Finally, for `{"x": +}` and
`{"x": +, "y": +}`, the only key in the former is also present in the latter,
and `+ <= +`; the definition is satisfied.

Next, we need to define the \((\sqcup)\) and \((\sqcap)\) operators that match
our definition of "less than or equal". Let's start with \((\sqcup)\). For two
maps \(m_1\) and \(m_2\), the join of those two maps, \(m_1 \sqcup m_2\), should
be greater than or equal to both; in other words, both sub-maps should be less
than or equal to the join.

Our newly-introduced condition for "less than or equal"
requires that each key in the smaller map be present in the larger one; as
a result, \(m_1 \sqcup m_2\) should contain all the keys in \(m_1\) __and__
all the keys in \(m_2\). So, we could just take the union of the two maps:
copy values from both into the result. Only, what happens if both \(m_1\)
and \(m_2\) have a value mapped to a particular key \(k\)? The values in the two
maps could be distinct, and they might even be incomparable. This is where the
second part of the condition kicks in: the value in the combination of the
maps needs to be bigger than the value in either sub-map. We already know how
to get a value that's bigger than two other values: we use a join on the
values!

Thus, define \(m_1 \sqcup m_2\) as a map that has all the keys
from \(m_1\) and \(m_2\), where the value at a particular key is given
as follows:

{{< latex >}}
(m_1 \sqcup m_2)[k] =
\begin{cases}
m_1[k] \sqcup m_2[k] & k \in m_1, k \in m_2 \\
m_1[k] & k \in m_1, k \notin m_2 \\
m_2[k] & k \notin m_1, k \in m_2
\end{cases}
{{< /latex >}}

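As a quick sanity check, here is the definition evaluated on the two example
maps from earlier; the shared key `y` gets the join of its two values (recall
that \(+ \sqcup - = \top\) for signs):

{{< latex >}}
\{ x: +, y: - \} \sqcup \{ y: +, z: - \} = \{ x: +, y: \top, z: - \}
{{< /latex >}}
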
If you're familiar with set theory, this operation is like
{{< sidenote "right" "map-union-note" "an extension of the union operator \((\cup)\)" >}}
There are, of course, other ways to extend the "union" operation to maps.
Haskell, for instance, defines it in a "left-biased" way (preferring the
elements from the left operand of the operation when duplicates are encountered).<br>
<br>
However, a "join" operation \((\sqcup)\) that's defined on the values
stored in the map gives us an extra tool to work with. As a result, I would
argue that our extension, given such an operator, is the most natural.
{{< /sidenote >}} to maps. In fact, this begins to motivate
the choice to use \((\sqcup)\) to denote this operation. A further bit of
motivation is this:
[we've already seen]({{< relref "01_spa_agda_lattices#lub-glub-or-and" >}})
that the \((\sqcup)\) and \((\sqcap)\) operators correspond to "or"
and "and". The elements in the union of two sets are precisely
those that are in one set __or__ the other. Thus, using union here fits our
notion of how the \((\sqcup)\) operator behaves.
{#union-as-or}

Now, let's take a look at the \((\sqcap)\) operator. For two maps \(m_1\) and
\(m_2\), the meet of those two maps, \(m_1 \sqcap m_2\), should be less than
or equal to both. Our definition above requires that each key of the smaller
map is present in the larger map; for the combination of two maps to be
smaller than both, we must ensure that it only has keys present in both maps.
To combine the elements from the two maps, we can use the \((\sqcap)\) operator
on values.

{{< latex >}}
(m_1 \sqcap m_2)[k] = m_1[k] \sqcap m_2[k]
{{< /latex >}}

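Running the same two example maps through this definition, only the shared
key `y` survives, and its values are combined with the meet (for signs,
\(+ \sqcap - = \bot\)):

{{< latex >}}
\{ x: +, y: - \} \sqcap \{ y: +, z: - \} = \{ y: \bot \}
{{< /latex >}}
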
Turning once again to set theory, we can think of this operation like the
extension of the intersection operator \((\cap)\) to maps. This can be
motivated in the same way as the union operation above; the \((\sqcap)\)
operator combines lattice elements in such a way that the result represents
both of them, and intersections of sets contain elements that are in __both__
sets.

Now we have the two binary operators and the comparison function in hand.
There's just one detail we're missing: what it means for two maps to be
equivalent. Here, once again we take our cue from set theory: two sets are
said to be equal when each one is a subset of the other. Mathematically, we can
write this as follows:

{{< latex >}}
m_1 \approx m_2 \triangleq m_1 \subseteq m_2 \land m_1 \supseteq m_2
{{< /latex >}}

I might as well show you the Agda definition of this, since it's a word-for-word
transliteration:

{{< codelines "Agda" "agda-spa/Lattice/Map.agda" 530 531 >}}

Defining equivalence more abstractly this way helps avoid concerns about the
precise implementation of our maps.

Okay, but we haven't actually defined what it means for one map to be a subset
of another. My definition is as follows: if \(m_1 \subseteq m_2\), that is,
if \(m_1\) is a subset of \(m_2\), then every key in \(m_1\) is also present
in \(m_2\), and they are mapped to the same value. My first stab at
a mathematical definition of this is the following:

{{< latex >}}
m_1 \subseteq m_2 \triangleq \forall k, v.\ (k, v) \in m_1 \Rightarrow (k, v) \in m_2
{{< /latex >}}

Only there's a slight complication; remember that our values themselves come
from a lattice, and that this lattice might use its own equivalence operator
\((\approx)\) to group similar elements. One example where this is important
is our now-familiar "map of maps" scenario: the values stored in the "outer"
map are themselves maps, and we don't want the order of the keys or other
menial details of the inner maps to influence whether the outer maps are equal.
Thus, we settle for a more robust definition of \(m_1 \subseteq m_2\)
that allows \(m_1\) to have different-but-equivalent values from those
in \(m_2\).

{{< latex >}}
m_1 \subseteq m_2 \triangleq \forall k, v.\ (k, v) \in m_1 \Rightarrow \exists v'.\ v \approx v' \land (k, v') \in m_2
{{< /latex >}}

In Agda, the core of my definition is once again very close:

{{< codelines "Agda" "agda-spa/Lattice/Map.agda" 98 99 >}}

#### The Implementation

Now it's time to show you how I implemented the Map lattice. I chose to
represent maps using a list of key-value pairs, along with a condition
that the keys are unique (non-repeating). I chose this definition because
it was simple to implement, and because it makes it possible to iterate
over the keys of a map. That last property is useful if we use the maps
to later represent sets (which I did). Moreover, lists of key-value pairs are
easy to serialize and write to disk. This isn't hugely important for my
immediate static program analysis needs, but it might be nice in the future.
The requirement that the keys are unique prevents the map from being a multi-map
(which might have several values associated with a particular key).

My `Map` module is parameterized by the key and value types (`A` and `B`
respectively), and additionally requires some properties to
be satisfied by these types.

{{< codelines "Agda" "agda-spa/Lattice/Map.agda" 6 10 >}}

For `A`, the key property is the
[decidability](https://en.wikipedia.org/wiki/Decidability_(logic)) of
equality: there should be a way to compare keys for equality. This is
important for all sorts of map operations. For example, when inserting a new
value into a map, we need to decide if its key is already present (so that
we know to override the old value), but if we can't check if two keys are
equal, we can't see if it's already there.

The values of the map (represented by `B`) are expected to be lattices, so
we require them to provide the lattice operations \((\sqcup)\) and \((\sqcap)\),
as well as the equivalence relation \((\approx)\) and the proof of the lattice
properties in `isLattice`. To distinguish the lattice operations on `B`
from the ones we'll be defining on the map itself -- you might've
noticed that there's a bit of overloading going on in this post -- I've
suffixed them with the subscript `2`. My convention is to use the subscript
corresponding to the number of the type parameter. Here, `A` is "first" and `B`
is "second", so the operators on `B` get `2`.

From there, I define the map as a pair; the first component is the list of
key-value pairs, and the second is the proof that all the keys in the
list occur only once.

{{< codelines "Agda" "agda-spa/Lattice/Map.agda" 480 481 >}}

Now, to implement union and intersection; for the most part, the work deals
just with the first component of the map -- the key-value pairs. For union,
the key operation is "insert-or-combine". We can think of merging two maps
as inserting all the keys from one map (arbitrarily, the "left") into the
other. If a key is not in the "left" map, insertion won't do anything to its
prior value in the right map; similarly, if a key is not in the "right" map,
then it should appear unchanged in the final result after insertion. Finally,
if a key is inserted into the "right" map, but already has a value there, then
the two values need to be combined using `_⊔₂_`. This leads to the following
definition of `insert` on key-value pair lists:

{{< codelines "Agda" "agda-spa/Lattice/Map.agda" 114 118 >}}

Above, `f` is just a stand-in for `_⊔₂_` (making the definition a tiny bit more general).
For each element in the "right" key-value list, we check if its key matches
the one we're inserting; if it does, we have to combine the values, and
there's no need to recurse into the rest of the list. If on the other hand
the key doesn't match, we move on to the next element of the list. If we
run out of elements, we know that the key we're inserting wasn't in the "right"
map, so we insert it as-is.

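In case you're reading along without the repository open, here's a minimal,
self-contained sketch of this "insert-or-combine" idea. The module shape, the
`String` keys, and the helper names are mine for illustration; the real
`insert` in `Map.agda` (linked above) differs in its details:

```
-- A sketch of "insert-or-combine" on key-value lists.
-- `f` combines an existing value with the inserted one (e.g. _⊔₂_ for union).
open import Agda.Builtin.Bool using (Bool; true; false)
open import Agda.Builtin.List using (List; []; _∷_)
open import Agda.Builtin.Sigma using (Σ; _,_)
open import Agda.Builtin.String using (String; primStringEquality)

module _ {B : Set} (f : B → B → B) where
  KVs : Set
  KVs = List (Σ String (λ _ → B))

  insert : String → B → KVs → KVs
  -- ran out of entries: the key was new, so it ends up at the end
  insert k v [] = (k , v) ∷ []
  insert k v ((k' , v') ∷ rest) with primStringEquality k k'
  -- key already present: combine the two values and stop recursing
  ... | true  = (k' , f v v') ∷ rest
  -- different key: keep scanning
  ... | false = (k' , v') ∷ insert k v rest
```
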
The union operation is just about inserting every pair from one map into another.

{{< codelines "Agda" "agda-spa/Lattice/Map.agda" 120 121 >}}

Here, I defined my own version of `foldr` which unpacks the pairs, for
convenience:

{{< codelines "Agda" "agda-spa/Lattice/Map.agda" 110 112 "" "**(Click here to see the definition of my `foldr`)**" >}}

For intersection, we do something similar; however, since only elements in
_both_ maps should be in the final output, if our "insertion" doesn't find
an existing key, it should just fall through; this can be achieved by defining
a version of `insert` whose base case simply throws away the input. Of course,
this function should also use `_⊓₂_` instead of `_⊔₂_`; below, though, I again
use a stand-in function `f` to keep the definition general. I called this
version of the function `update`.

{{< codelines "Agda" "agda-spa/Lattice/Map.agda" 295 299 >}}

Just changing `insert` to `update` is not enough. While it's true that calling
`update` with all keys from `m1` on `m2` would drop all the keys unique to `m1`,
it would still leave behind the only-in-`m2` keys. To get rid of these, I
defined another function, `restrict`, that drops all keys in its second
argument that aren't present in its first argument.

{{< codelines "Agda" "agda-spa/Lattice/Map.agda" 304 308 >}}

Altogether, intersection is defined as follows, where `updates` just
calls `update` for every key-value pair in its first argument.

{{< codelines "Agda" "agda-spa/Lattice/Map.agda" 310 311 >}}

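To see both pieces at work, take `m1 = { "x": + }` and `m2 = { "x": -, "y": + }`.
The `updates` pass combines the shared key, giving `{ "x": ⊥, "y": + }`
(since `+ ⊓ - = ⊥` for signs); `restrict` then drops `y`, which is missing
from `m1`, leaving `{ "x": ⊥ }` as the intersection.
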
The next hurdle is all the proofs about these implementations. I will
leave the details of the proofs either as appendices or as links to
other posts on this site.

The first key property is that the insertion, union, update, and intersection operations all
preserve uniqueness of keys; the [proofs for this are here](#appendix-proof-of-uniqueness-of-keys).
The second set of properties is the lattice laws for union and intersection.
The proofs of those proceed by cases; to prove that \((\sqcup)\) is
commutative, we reason that if \((k , v) \in m_1 \sqcup m_2\), then it must be
either in \(m_1\), in \(m_2\), or in both; for each of these three possible
cases, we can show that \((k , v)\) must be the same in \(m_2 \sqcup m_1\).
Things get even more tedious for proofs of associativity, since there are
7 cases to consider; I describe the strategy I used for such proofs
in my [article about the "Expression" pattern]({{< relref "agda_expr_pattern" >}})
in Agda.

### Additional Properties of Lattices

The product and map lattices are the two pulling the most weight in my
implementation of program analyses. However, there's an additional property
that they have: if the lattices they are made of have a _finite height_,
then so do the product and map lattices themselves. A lattice having a finite
height means that we can only line up so many elements using the less-than
operator `<`. For instance, the natural numbers are _not_ a finite-height lattice;
we can create the infinite chain:

{{< latex >}}
0 < 1 < 2 < ...
{{< /latex >}}

On the other hand, our sign lattice _is_ of finite height; the longest chains
we can make have three elements and two `<` signs. Here's one:
{#sign-three-elements}

{{< latex >}}
\bot < + < \top
{{< /latex >}}

As a result of this, _pairs_ of signs also have a finite height; the longest
chains we can make have five elements and four `<` signs.
{{< sidenote "right" "example-note" "An example:" >}}
Notice that the elements in the example progress the same way as the ones
in the single-sign chain. This is no accident; the longest chains in the
pair lattice can be constructed from longest chains of its element
lattices. The length of the product lattice chain, counted by the number of
"less than" signs, is the sum of the lengths of the element chains.
{{< /sidenote >}}

{{< latex >}}
(\bot, \bot) < (\bot, +) < (\bot, \top) < (+, \top) < (\top, \top)
{{< /latex >}}

The same is true for maps, under certain conditions.

The finite-height property is crucial to lattice-based static program analysis;
we'll talk about it in more detail in the next post of this series.

{{< seriesnav >}}

### Appendix: Proof of Uniqueness of Keys

I will provide sketches of the proofs here, and omit the implementations
of my lemmas. Click on the link in the code block headers to jump to their
implementation on my Git server.

First, note that if we're inserting a key that's already in a list, then the keys of that list are unchanged.

{{< codelines "Agda" "agda-spa/Lattice/Map.agda" 123 124 >}}

On the other hand, if we're inserting a new key, it ends up at the end, and
the rest of the keys are unchanged.

{{< codelines "Agda" "agda-spa/Lattice/Map.agda" 134 135 >}}

Then, for any given key-value pair, the key either is or isn't in the list we're
inserting it into. If it is, then the list ends up unchanged, and remains
unique if it was already unique. On the other hand, if it's not in the list,
then it ends up at the end; adding a new element to the end of a unique
list produces another unique list. Thus, in either case, the final keys
are unique.

{{< codelines "Agda" "agda-spa/Lattice/Map.agda" 143 148 >}}

By induction, we can then prove that calling `insert` many times as we do
in `union` preserves uniqueness too. Here, `insert-preserves-Unique` serves
as the inductive step.

{{< codelines "Agda" "agda-spa/Lattice/Map.agda" 164 168 >}}

For `update`, things are simple; it doesn't change the keys of the argument
list at all, since it only modifies existing pairs, and doesn't add new ones.
This is captured by the `update-keys` property:

{{< codelines "Agda" "agda-spa/Lattice/Map.agda" 313 314 >}}

If the keys don't change, they obviously remain unique.

{{< codelines "Agda" "agda-spa/Lattice/Map.agda" 328 330 >}}

For `restrict`, we note that it only ever removes keys; as a result, if
a key was not in the input to `restrict`, then it won't be in its output,
either.

{{< codelines "Agda" "agda-spa/Lattice/Map.agda" 337 338 >}}

As a result, for each key of the list being restricted, we either drop it
(which does not damage uniqueness) or we keep it; since we only remove
keys, and since the keys were originally unique, the key we kept won't
conflict with any of the other final keys.

{{< codelines "Agda" "agda-spa/Lattice/Map.agda" 345 351 >}}

Since both `update` and `restrict` preserve uniqueness, so does
`intersect`:

{{< codelines "Agda" "agda-spa/Lattice/Map.agda" 353 355 >}}

@@ -2,6 +2,7 @@
 title: "Everything I Know About Types: Variables"
 date: 2022-08-28T19:05:31-07:00
 tags: ["Type Systems", "Programming Languages"]
+series: "Everything I Know About Types"
 draft: true
 ---
 
@@ -114,13 +115,15 @@ aspects of variables that we can gather from the preceding examples:
 Rust, Crystal).
 
 To get started with type rules for variables, let's introduce
-another metavariable, \\(x\\) (along with \\(n\\) from before).
-Whereas \\(n\\) ranges over any number in our language, \\(x\\) ranges
+another metavariable, \(x\) (along with \(n\) from before).
+Whereas \(n\) ranges over any number in our language, \(x\) ranges
 over any variable. It can be used as a stand-in for `x`, `y`, `myVar`, and so on.
 
-The first property prevents us from writing type rules like the
+Now, let's start by looking at versions of formal rules that are
+__incorrect__. The first property listed above prevents us from writing type rules like the
 following, since we cannot always assume that a variable has type
-\\(\\text{number}\\) or \\(\\text{string}\\).
+\(\text{number}\) or \(\text{string}\) (it might have either,
+depending on where in the code it's used!).
 {{< latex >}}
 x : \text{number}
 {{< /latex >}}
@@ -140,10 +143,10 @@ for a variable to take on different types in different places.
 
 With these constraints in mind, we have enough to start creating
 rules for expressions (but not statements yet; we'll get to that).
-The solution to our problem is to add a third "thing" to our rules:
+The way to work around the four constraints is to add a third "thing" to our rules:
 the _environment_, typically denoted using the Greek uppercase gamma,
-\\(\\Gamma\\). Much like we avoided writing similar rules for every possible
-number by using \\(n\\) as a metavariable for _any_ number, we will use \\(\\Gamma\\)
+\(\Gamma\). Much like we avoided writing similar rules for every possible
+number by using \(n\) as a metavariable for _any_ number, we will use \(\Gamma\)
 as a metavariable to stand in for _any_ environment. What is an environment,
 though? It's basically a list of pairs associating
 variables with their types. For instance, if in some situation
@@ -200,8 +203,8 @@ e : \tau
 
 This reads,
 
-> The expression \\(e\\) [another metavariable, this one is used for
-all expressions] has type \\(\\tau\\) [also a metavariable, for
+> The expression \(e\) [another metavariable, this one is used for
+all expressions] has type \(\tau\) [also a metavariable, for
 types].
 
 However, as we've seen, we can't make global claims like this when variables are
@@ -214,16 +217,16 @@ on the situation. Now, we instead write:
 
 This version reads,
 
-> In the environment \\(\\Gamma\\), the expression \\(e\\) has type \\(\\tau\\).
+> In the environment \(\Gamma\), the expression \(e\) has type \(\tau\).
 
-And here's the difference. The new \\(\\Gamma\\) of ours captures this
+And here's the difference. The new \(\Gamma\) of ours captures this
 "depending on the situation" aspect of expressions with variables. It
 provides us with
 {{< sidenote "right" "context-note" "much-needed context." >}}
 In fact, \(\Gamma\) is sometimes called the typing context.
-{{< /sidenote >}} This version makes it clear that \\(x\\)
-isn't _always_ of type \\(\\tau\\), but only in the specific situation
-described by \\(\\Gamma\\). Using our first two-`number` environment,
+{{< /sidenote >}} This version makes it clear that \(e\)
+isn't _always_ of type \(\tau\), but only in the specific situation
+described by \(\Gamma\). Using our first two-`number` environment,
 we can make the following (true) claim:
 
 {{< latex >}}
@@ -242,7 +245,7 @@ also results in a string".
 
 Okay, so now we've seen a couple of examples, but these examples are _not_ rules!
 They capture only specific situations (which we've "hard-coded" by specifying
-what \\(\\Gamma\\) is). Here's what a general rule __should not look like__:
+what \(\Gamma\) is). Here's what a general rule __should not look like__:
 
 {{< latex >}}
 \{ x_1 : \text{string}, x_2 : \text{string} \} \vdash x_1+x_2 : \text{string}
@@ -265,8 +268,9 @@ This rule is bad, and it should feel bad. Here are two reasons:
 
 1. It only works for expressions like `x+y` or `a+b`, but not for
 more complicated things like `(a+b)+(c+d)`. This is because
-by using \\(x\_1\\) and \\(x\_2\\), the metavariables for
-variables, it rules out additions that _don't_ add variables.
+by using \(x_1\) and \(x_2\), the metavariables for
+variables, it rules out additions that _don't_ add variables
+(like the middle `+` in the example).
 2. It doesn't play well with other rules; it can't be the _only_
 rule for addition of numbers, since it doesn't work for
 number literals (i.e., `1+1` is out).
@@ -275,7 +279,7 @@ The trouble is that this rule is trying to do too much; it's trying
 to check the environment for variables, but it's _also_ trying to
 specify the results of adding two numbers. That's not how we
 did it last time! In fact, when it came to numbers, we had two
-rules. The first said that any number symbol had the \\(\\text{number}\\)
+rules. The first said that any number symbol had the \(\text{number}\)
 type. Previously, we wrote it as follows:
 
 {{< latex >}}
@@ -283,17 +287,27 @@ n : \text{number}
 {{< /latex >}}
 
 Another rule specified the type of addition, without caring how the
-sub-expressions \\(e\_1\\) and \\(e\_2\\) were given _their_ types.
-As long as they had type \\(\\text{number}\\), all was well.
+sub-expressions \(e_1\) and \(e_2\) were given _their_ types.
+As long as they had type \(\text{number}\), all was well.
 
 {{< latex >}}
 \frac{e_1 : \text{number} \quad e_2 : \text{number}}{e_1 + e_2 : \text{number}}
 {{< /latex >}}
 
-These rules are good, and we should keep them. Now, though, environments
+So, instead of having a rule for "adding two number symbols", we had
+a rule for "adding" and a rule for "number symbols". That approach
+worked well because the rule for "adding" could be used to figure out
+the types of compount addition expressions, like `(1+1)+(2+2)`, which
+are _not_ "additions of number symbols". Taking inspiration from this
+past success, we want to similarly separate "adding two variables"
+into "variables" and "adding". We already have the latter, though,
+so all that's left is the former.
+
+Before we get to that, though, we need to update the two rules we
+just saw above. These rules are good, and we should keep them. Now, though, environments
 are in play. Fortunately, the environment doesn't matter at all when it
 comes to figuring out what the type of a symbol like `1` is -- it's always
-a number! We can thus write the updated rule as follows. Leaving \\(\\Gamma\\)
+a number! We can thus write the updated rule as follows. Leaving \(\Gamma\)
 unspecified means it can
 stand for any environment.
 
@@ -302,7 +316,7 @@ stand for any environment.
 {{< /latex >}}
 
 We can also translate the addition rule in a pretty straightforward
-manner, by tacking on \\(\\Gamma\\) for every typing claim.
+manner, by tacking on \(\Gamma\) for every typing claim.
 
 {{< latex >}}
 \frac{\Gamma \vdash e_1 : \text{number} \quad \Gamma \vdash e_2 : \text{number}}{\Gamma \vdash e_1 + e_2 : \text{number}}
 {{< /latex >}}
@@ -312,33 +326,32 @@ So we have a rule for number symbols like `1` or `2`, and we have
 a rule for addition. All that's left is a rule for variables, like `x`
 and `y`. This rule needs to make sure that a variable is defined,
 and that it has a particular type. A variable is defined, and has a type,
-if a pair \\(x : \\tau\\) is present in the environment \\(\\Gamma\\).
+if a pair \(x : \tau\) is present in the environment \(\Gamma\).
 Thus, we can write the variable rule like this:
 
 {{< latex >}}
 \frac{x : \tau \in \Gamma}{\Gamma \vdash x : \tau}
 {{< /latex >}}
 
-Note that we're using the \\(\\tau\\) metavariable to range over any type;
+Note that we're using the \(\tau\) metavariable to range over any type;
 this means the rule applies to (object) variables declared to have type
-\\(\\text{number}\\), \\(\\text{string}\\), or anything else present in
+\(\text{number}\), \(\text{string}\), or anything else present in
 our system. A single rule takes care of figuring the types of _all_
 variables.
 
-{{< todo >}}
-The rest of this, but mostly statements.
-{{< /todo >}}
+> [!TODO]
+> The rest of this, but mostly statements.
 
 ### This Page at a Glance
 #### Metavariables
 | Symbol | Meaning |
 |---------|--------------|
-| \\(n\\) | Numbers |
-| \\(s\\) | Strings |
-| \\(e\\) | Expressions |
-| \\(x\\) | Variables |
-| \\(\\tau\\) | Types |
-| \\(\\Gamma\\) | Typing environments |
+| \(n\) | Numbers |
+| \(s\) | Strings |
+| \(e\) | Expressions |
+| \(x\) | Variables |
+| \(\tau\) | Types |
+| \(\Gamma\) | Typing environments |
 
 #### Grammar
 {{< block >}}
@@ -356,8 +369,8 @@ The rest of this, but mostly statements.
 {{< foldtable >}}
 | Rule | Description |
 |--------------|-------------|
-| {{< latex >}}\Gamma \vdash n : \text{number} {{< /latex >}}| Number literals have type \\(\\text{number}\\) |
-| {{< latex >}}\Gamma \vdash s : \text{string} {{< /latex >}}| String literals have type \\(\\text{string}\\) |
+| {{< latex >}}\Gamma \vdash n : \text{number} {{< /latex >}}| Number literals have type \(\text{number}\) |
+| {{< latex >}}\Gamma \vdash s : \text{string} {{< /latex >}}| String literals have type \(\text{string}\) |
 | {{< latex >}}\frac{x:\tau \in \Gamma}{\Gamma\vdash x : \tau}{{< /latex >}}| Variables have whatever type is given to them by the environment |
 | {{< latex >}}\frac{\Gamma \vdash e_1 : \text{string}\quad \Gamma \vdash e_2 : \text{string}}{\Gamma \vdash e_1+e_2 : \text{string}} {{< /latex >}}| Adding strings gives a string |
 | {{< latex >}}\frac{\Gamma \vdash e_1 : \text{number}\quad \Gamma \vdash e_2 : \text{number}}{\Gamma \vdash e_1+e_2 : \text{number}} {{< /latex >}}| Adding numbers gives a number |

@@ -1,7 +1,8 @@
 ---
 title: Compiling a Functional Language Using C++, Part 3 - Type Checking
 date: 2019-08-06T14:26:38-07:00
-tags: ["C and C++", "Functional Languages", "Compilers"]
+tags: ["C++", "Functional Languages", "Compilers"]
+series: "Compiling a Functional Language using C++"
 description: "In this post, we allow our compiler to throw away invalid programs, detected using a monomorphic typechecking algorithm."
 ---
 I think tokenizing and parsing are boring. The thing is, looking at syntax
@@ -158,8 +159,8 @@ the form:
 \frac{A_1 \ldots A_n} {B_1 \ldots B_m}
 {{< /latex >}}
 
-This reads, "given that the premises \\(A\_1\\) through \\(A\_n\\) are true,
-it holds that the conclusions \\(B\_1\\) through \\(B\_m\\) are true".
+This reads, "given that the premises \(A_1\) through \(A_n\) are true,
+it holds that the conclusions \(B_1\) through \(B_m\) are true".
 
 For example, we can have the following inference rule:
 
@@ -172,18 +173,18 @@ For example, we can have the following inference rule:
 Since you wear a jacket when it's cold, and it's cold, we can conclude
 that you will wear a jacket.
 
-When talking about type systems, it's common to represent a type with \\(\\tau\\).
+When talking about type systems, it's common to represent a type with \(\tau\).
 The letter, which is the greek character "tau", is used as a placeholder for
 some __concrete type__. It's kind of like a template, to be filled in
 with an actual value. When we plug in an actual value into a rule containing
-\\(\\tau\\), we say we are __instantiating__ it. Similarly, we will use
-\\(e\\) to serve as a placeholder for an expression (matched by our
-\\(A\_{add}\\) grammar rule from part 2). Next, we have the typing relation,
-written as \\(e:\\tau\\). This says that "expression \\(e\\) has the type
-\\(\\tau\\)".
+\(\tau\), we say we are __instantiating__ it. Similarly, we will use
+\(e\) to serve as a placeholder for an expression (matched by our
+\(A_{add}\) grammar rule from part 2). Next, we have the typing relation,
+written as \(e:\tau\). This says that "expression \(e\) has the type
+\(\tau\)".
 
 Alright, this is enough to get us started with some typing rules.
-Let's start with one for numbers. If we define \\(n\\) to mean
+Let's start with one for numbers. If we define \(n\) to mean
 "any expression that is a just a number, like 3, 2, 6, etc.",
 we can write the typing rule as follows:
 
@@ -205,30 +206,30 @@ Now, let's move on to the rule for function application:
 
 This rule includes everything we've said before:
 the thing being applied has to have a function type
-(\\(\\tau\_1 \\rightarrow \\tau\_2\\)), and
+(\(\tau_1 \rightarrow \tau_2\)), and
 the expression the function is applied to
-has to have the same type \\(\\tau\_1\\) as the
+has to have the same type \(\tau_1\) as the
 left type of the function.
 
 It's the variable rule that forces us to adjust our notation.
 Our rules don't take into account the context that we've
 already discussed, and thus, we can't bring
 in any outside information. Let's fix that! It's convention
-to use the symbol \\(\\Gamma\\) for the context. We then
-add notation to say, "using the context \\(\\Gamma\\),
-we can deduce that \\(e\\) has type \\(\\tau\\)". We will
-write this as \\(\\Gamma \\vdash e : \\tau\\).
+to use the symbol \(\Gamma\) for the context. We then
+add notation to say, "using the context \(\Gamma\),
+we can deduce that \(e\) has type \(\tau\)". We will
+write this as \(\Gamma \vdash e : \tau\).
 
 But what __is__ our context? We can think of it
 as a mapping from variable names to their known types. We
 can represent such a mapping using a set of pairs
-in the form \\(x : \\tau\\), where \\(x\\) represents
+in the form \(x : \tau\), where \(x\) represents
 a variable name.
 
-Since \\(\\Gamma\\) is just a regular set, we can
-write \\(x : \\tau \\in \\Gamma\\), meaning that
-in the current context, it is known that \\(x\\)
-has the type \\(\\tau\\).
+Since \(\Gamma\) is just a regular set, we can
+write \(x : \tau \in \Gamma\), meaning that
+in the current context, it is known that \(x\)
+has the type \(\tau\).
 
 Let's update our rules with this new addition.
 
@@ -282,19 +283,19 @@ Let's first take a look at the whole case expression rule:
 This is a lot more complicated than the other rules we've seen, and we've used some notation
 that we haven't seen before. Let's take this step by step:
 
-1. \\(e : \\tau\\), in this case, means that the expression between `case` and `of`, is of type \\(\\tau\\).
-2. \\(\\text{matcht}(\\tau, p\_i) = b\_i\\) means that the pattern \\(p\_i\\) can match a value of type
-\\(\\tau\\), producing additional type pairs \\(b\_i\\). We need \\(b\_i\\) because a pattern
-such as `Cons x xs` will introduce new type information, namely \\(\text{x} : \text{Int}\\) and \\(\text{xs} : \text{List}\\).
-3. \\(\\Gamma,b\_i \\vdash e\_i : \\tau\_c\\) means that each individual branch can be deduced to have the type
-\\(\\tau\_c\\), using the previously existing context \\(\\Gamma\\), with the addition of \\(b\_i\\), the new type information.
-4. Finally, the conclusion is that the case expression, if all the premises are met, is of type \\(\\tau\_c\\).
+1. \(e : \tau\), in this case, means that the expression between `case` and `of`, is of type \(\tau\).
+2. \(\text{matcht}(\tau, p_i) = b_i\) means that the pattern \(p_i\) can match a value of type
+\(\tau\), producing additional type pairs \(b_i\). We need \(b_i\) because a pattern
+such as `Cons x xs` will introduce new type information, namely \(\text{x} : \text{Int}\) and \(\text{xs} : \text{List}\).
+3. \(\Gamma,b_i \vdash e_i : \tau_c\) means that each individual branch can be deduced to have the type
+\(\tau_c\), using the previously existing context \(\Gamma\), with the addition of \(b_i\), the new type information.
+4. Finally, the conclusion is that the case expression, if all the premises are met, is of type \(\tau_c\).
 
-For completeness, let's add rules for \\(\\text{matcht}(\\tau, p\_i) = b\_i\\). We'll need two: one for
+For completeness, let's add rules for \(\text{matcht}(\tau, p_i) = b_i\). We'll need two: one for
 the "basic" pattern, which always matches the value and binds a variable to it, and one
 for a constructor pattern, that matches a constructor and its parameters.
 
-Let's define \\(v\\) to be a variable name in the context of a pattern. For the basic pattern:
+Let's define \(v\) to be a variable name in the context of a pattern. For the basic pattern:
 
 {{< latex >}}
 \frac
@@ -302,7 +303,7 @@ Let's define \\(v\\) to be a variable name in the context of a pattern. For the
 {\text{matcht}(\tau, v) = \{v : \tau \}}
 {{< /latex >}}
 
-For the next rule, let's define \\(c\\) to be a constructor name. The rule for the constructor pattern, then, is:
+For the next rule, let's define \(c\) to be a constructor name. The rule for the constructor pattern, then, is:
 
 {{< latex >}}
 \frac
@@ -311,8 +312,8 @@ For the next rule, let's define \\(c\\) to be a constructor name. The rule for t
 {{< /latex >}}
 
 This rule means that whenever we have a pattern in the form of a constructor applied to
-\\(n\\) variable names, if the constructor takes \\(n\\) arguments of types \\(\\tau\_1\\)
-through \\(\\tau\_n\\), then the each variable will have a corresponding type.
+\(n\) variable names, if the constructor takes \(n\) arguments of types \(\tau_1\)
+through \(\tau_n\), then the each variable will have a corresponding type.
 
 We didn't include lambda expressions in our syntax, and thus we won't need typing rules for them,
 so it actually seems like we're done with the first draft of our type rules.
@@ -398,8 +399,8 @@ we're binding is the same as the string we're binding it to
 We now have a unification algorithm, but we still
 need to implement our rules. Our rules
 usually include three things: an environment
-\\(\\Gamma\\), an expression \\(e\\),
-and a type \\(\\tau\\). We will
+\(\Gamma\), an expression \(e\),
+and a type \(\tau\). We will
 represent this as a method on `ast`, our struct
 for an expression tree. This
 method will take an environment and return
@@ -412,7 +413,7 @@ to a type. So naively, we can implement this simply
 using an `std::map`. But observe
 that we only extend the environment in one case so far:
 a case expression. In a case expression, we have the base
-envrionment \\(\\Gamma\\), and for each branch,
+envrionment \(\Gamma\), and for each branch,
 we extend it with the bindings produced by
 the pattern match. Each branch receives a modified
 copy of the original environment, one that
@@ -458,7 +459,7 @@ We start with with a signature inside `ast`:
 ```
 virtual type_ptr typecheck(type_mgr& mgr, const type_env& env) const;
 ```
-We also implement the \\(\\text{matchp}\\) function
+We also implement the \(\text{matchp}\) function
 as a method `match` on `pattern` with the following signature:
 ```
 virtual void match(type_ptr t, type_mgr& mgr, type_env& env) const;

@@ -1,7 +1,7 @@
 ---
 title: Learning Emulation, Part 2.5 - Implementation
-date: 2016-11-23 23:23:56.633942
-tags: ["C and C++", "Emulation"]
+date: 2016-06-30
+tags: ["C", "Emulation"]
 ---
 _This is the third post in a series I'm writing about Chip-8 emulation. If you want to see the first one, head [here]({{< relref "/blog/01_learning_emulation.md" >}})._
 

content/blog/03_spa_agda_fixed_height.md (new file, 555 lines)
@@ -0,0 +1,555 @@

---
title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 3: Lattices of Finite Height"
series: "Static Program Analysis in Agda"
description: "In this post, I describe the class of finite-height lattices, and prove that lattices we've already seen are in that class"
date: 2024-08-08T17:29:00-07:00
tags: ["Agda", "Programming Languages"]
---

In the previous post, I introduced the class of finite-height lattices:
lattices where chains made from elements and the less-than operator `<`
can only be so long. As a first example,
[natural numbers form a lattice]({{< relref "01_spa_agda_lattices#natural-numbers" >}}),
but they __are not a finite-height lattice__; the following chain can be made
infinitely long:

{{< latex >}}
0 < 1 < 2 < ...
{{< /latex >}}

There isn't a "biggest natural number"! On the other hand, we've seen that our
[sign lattice]({{< relref "01_spa_agda_lattices#sign-lattice" >}}) has a finite
height; the longest chain we can make is three elements long. I showed one
such chain (there are many chains of three elements) in
[the previous post]({{< relref "02_spa_agda_combining_lattices#sign-three-elements" >}}),
but here it is again:

{{< latex >}}
\bot < + < \top
{{< /latex >}}

It's also true that the [Cartesian product lattice \(L_1 \times L_2\)]({{< relref "02_spa_agda_combining_lattices#the-cartesian-product-lattice" >}})
has a finite height, as long as \(L_1\) and \(L_2\) are themselves finite-height
lattices. In the specific case where both \(L_1\) and \(L_2\) are the sign
lattice (\(L_1 = L_2 = \text{Sign}\)), we can observe that the longest
chains have five elements. The following is one example:

{{< latex >}}
(\bot, \bot) < (\bot, +) < (\bot, \top) < (+, \top) < (\top, \top)
{{< /latex >}}
{#sign-prod-chain}

The fact that \(L_1\) and \(L_2\) are themselves finite-height lattices is
important; if either one of them is not, we can easily construct an infinite
chain of the products. If we allowed \(L_2\) to be natural numbers, we'd
end up with infinite chains like this one:
{#product-both-finite-height}

{{< latex >}}
(\bot, 0) < (\bot, 1) < (\bot, 2) < ...
{{< /latex >}}

Another lattice that has a finite height under certain conditions is
[the map lattice]({{< relref "02_spa_agda_combining_lattices#the-map-lattice" >}}).
The "under certain conditions" part is important; we can easily construct
an infinite chain of map lattice elements in general:

{{< latex >}}
\{ a : 1 \} < \{ a : 1, b : 1 \} < \{ a : 1, b : 1, c : 1 \} < ...
{{< /latex >}}

As long as we have an infinite supply of keys to choose from, we can always keep
adding new keys to make bigger and bigger maps. But if we fix the keys in
the map --- say, we use only `a` and `b` --- then suddenly our heights are once
again finite. In fact, for the two keys I just picked, one longest chain
is remarkably similar to the product chain above.
{#fin-keys}

{{< latex >}}
\{a: \bot, b: \bot\} < \{a: \bot, b: +\} < \{a: \bot, b: \top\} < \{a: +, b: \top\} < \{a: \top, b: \top\}
{{< /latex >}}

The class of finite-height lattices is important for static program analysis,
because it ensures that our analyses don't take infinite time. Though
there's an intuitive connection ("finite lattices mean finite execution"),
the details of why the former is needed for the latter are nuanced. We'll
talk about them in a subsequent post.

In the meantime, let's dig deeper into the notion of finite height, and
the Agda proofs of the properties I've introduced thus far.

### Formalizing Finite Height

The formalization I settled on is quite similar to the informal description:
a lattice has a finite height of length \(h\) if the longest chain
of elements compared by \((<)\) is exactly \(h\). There's only a slight
complication: we allow for equivalent-but-not-equal elements in lattices.
For instance, for a map lattice, we don't care about the order of the keys:
so long as two maps relate the same set of keys to the same respective values,
we will consider them equal. This, however, is beyond the notion of Agda's
propositional equality (`_≡_`). Thus, we need to generalize the definition
of a chain to support equivalences. I parameterize the `Chain` module
in my code by an equivalence relation, as well as the comparison relation `R`,
which we will set to `<` for our chains. The equivalence relation `_≈_` and the
ordering relation `R`/`<` are expected to play together nicely (if `a < b`, and
`a` is equivalent to `c`, then it should be the case that `c < b`).

{{< codelines "agda" "agda-spa/Chain.agda" 3 7 >}}

From there, the definition of the `Chain` data type is much like the definition
of [a vector from `Data.Vec`](https://agda.github.io/agda-stdlib/v2.0/Data.Vec.Base.html#1111),
but indexed by the endpoints, and containing witnesses of `R`/`<`
between its elements. The indexing allows for representing
the type of chains between particular lattice elements, and serves to ensure
that concatenation and other operations don't merge disparate chains.

{{< codelines "agda" "agda-spa/Chain.agda" 19 21 >}}

In the `done` case, we create a single-element chain, which has no comparisons.
In this case, the chain starts and stops at the same element (where "the same"
is modulo our equivalence). The `step` case prepends a new comparison `a1 < a2`
to an existing chain; once again, we allow for the existing chain to start
with a different-but-equivalent element `a2'`.
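To make the two constructors concrete, here is a minimal sketch of what such a data type can look like (a sketch of mine, assuming a carrier `A` with relations `_≈_` and `_<_` in scope; names and argument order are illustrative, not necessarily the repository's):

```
-- Sketch: a value of type Chain a b n is a chain from a to b
-- containing exactly n comparisons.
data Chain : A → A → ℕ → Set where
  done : ∀ {a a'} → a ≈ a' → Chain a a' zero
  step : ∀ {a₁ a₂ a₂' a₃ n} →
         a₁ < a₂ → a₂ ≈ a₂' → Chain a₂' a₃ n → Chain a₁ a₃ (suc n)
```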
With that definition in hand, I define what it means for a type of
chains between elements of the lattice `A` to be bounded by a certain height;
simply put, all chains must have length less than or equal to the bound.

{{< codelines "agda" "agda-spa/Chain.agda" 38 39 >}}

Though `Bounded` specifies _a_ bound on the length of chains, it doesn't
specify _the_ (lowest) bound. Specifically, if chains can only have
length three, they are bounded by 3, by 30, and by 300. To claim a lowest
bound (which would be the maximum chain length in the lattice), we need to show that
a chain of that length actually exists (otherwise,
we could take the previous natural number, and it would be a bound as well).
Thus, I define the `Height` predicate to require that a chain of the desired
height exists, and that this height bounds the length of all other chains.

{{< codelines "agda" "agda-spa/Chain.agda" 47 53 >}}
Finally, for a lattice to have a finite height, the type of chains formed by using
its less-than operator needs to have that height (satisfy the `Height h` predicate).
To avoid having to thread through the equivalence relation, congruence proof,
and more, I define a specialized predicate for lattices specifically.
I do so as a "method" in my `IsLattice` record.

{{< codelines "agda" "agda-spa/Lattice.agda" 183 210 "hl_lines = 27 28">}}

Thus, bringing the operators and other definitions of `IsLattice` into scope
will also bring in the `FixedHeight` predicate.
### Fixed Height of the "Above-Below" Lattice

We've already seen intuitive evidence that the sign lattice --- which is an instance of
the ["above-below" lattice]({{< relref "01_spa_agda_lattices#the-above-below-lattice" >}}) ---
has a fixed height. The reason is simple: we extended a set of incomparable
elements with a single element that's greater, and a single element that's lower.
We can't make chains out of incomparable elements (since we can't compare them
using `<`); thus, we can only have one `<` from the new least element, and
one `<` from the new greatest element.

The proof is a bit tedious, but not all that complicated.
First, a few auxiliary helpers; feel free to read only the type signatures.
They specify, respectively:

1. That the bottom element \(\bot\) of the above-below lattice is less than any
   concrete value from the underlying set. For instance, in the sign lattice case, \(\bot < +\).
2. That \(\bot\) is the only element satisfying the first property; that is,
   any value strictly less than an element of the underlying set must be \(\bot\).
3. That the top element \(\top\) of the above-below lattice is greater than
   any concrete value of the underlying set. This is the dual of the first property.
4. That, much like the bottom element is the only value strictly less than elements
   of the underlying set, the top element is the only value strictly greater.

{{< codelines "agda" "agda-spa/Lattice/AboveBelow.agda" 315 335 >}}
From there, we can construct an instance of the longest chain. Actually,
there's a bit of a hang-up: what if the underlying set is empty? Concretely,
what if there were no signs? Then we could only construct a chain with
one comparison: \(\bot < \top\). Instead of adding logic to conditionally
specify the length, I simply ensure that the set is populated by requiring
a witness.

{{< codelines "agda" "agda-spa/Lattice/AboveBelow.agda" 85 85 >}}

I use this witness to construct the two-`<` chain.

{{< codelines "agda" "agda-spa/Lattice/AboveBelow.agda" 339 340 >}}
The proof that this length of two --- in terms of comparisons --- is the
bound of all chains of `AboveBelow` elements requires systematically
rejecting all longer chains. Informally, suppose you have a chain of
three or more comparisons.

1. If it starts with \(\top\), you can't add any more elements, since that's the
   greatest element (contradiction).
2. If you start with an element of the underlying set, you could add another
   element, but it has to be the top element; after that, you can't add any
   more (contradiction).
3. If you start with \(\bot\), you could arrive at a chain of two comparisons,
   but you can't go beyond that (in three cases, each leading to a contradiction).

{{< codelines "agda" "agda-spa/Lattice/AboveBelow.agda" 342 355 "hl_lines=8-14">}}

Thus, the above-below lattice has a height of two comparisons (or alternatively,
three elements).

{{< codelines "agda" "agda-spa/Lattice/AboveBelow.agda" 357 363 >}}
And that's it.

### Fixed Height of the Product Lattice

Now, for something less tedious. We saw above that for a product lattice
to have a finite height,
[its constituent lattices must have a finite height](#product-both-finite-height).
The proof was by contradiction (by constructing an infinitely long product
chain given a single infinite lattice). As a result, we'll focus this
section on products of two finite-height lattices `A` and `B`. Additionally, for the
proofs in this section, I require element equivalence to be decidable.

{{< codelines "agda" "agda-spa/Lattice/Prod.agda" 115 117 >}}
Let's think about how we might go about constructing the longest chain in
a product lattice. Let's start with some arbitrary element \(p_1 = (a_1, b_1)\).
We need to find another value that isn't equal to \(p_1\), because we're building
chains of the less-than operator \((<)\), and not the less-than-or-equal operator
\((\leq)\). As a result, we need to change either the first component, the second
component, or both. If we're building "to the right" (adding bigger elements),
the new components would need to be bigger. Suppose then that we came up
with \(a_2\) and \(b_2\), with \(a_1 < a_2\) and \(b_1 < b_2\). We could then
create a length-one chain:

{{< latex >}}
(a_1, b_1) < (a_2, b_2)
{{< /latex >}}

That works, but we can construct an even longer chain by increasing only one
component at a time:

{{< latex >}}
(a_1, b_1) < (a_1, b_2) < (a_2, b_2)
{{< /latex >}}
We can apply this logic every time; the conclusion is that when building
up a chain, we should increase one component at a time. Then, how many times
can we increase a component? Well, if lattice `A` has a height of two (comparisons),
then we can take its lowest element, and increase it twice. Similarly, if
lattice `B` has a height of three, starting at its lowest element, we can
increase it three times. In all, when building a chain of `A × B`, we can
increase a component five times. Generally, the number of `<` in the product chain
is the sum of the numbers of `<` in the chains of `A` and `B`.

This gives us a recipe for constructing
the longest chain in the product lattice: take the longest chains of `A` and
`B`, and start with the product of their lowest elements. Then, increase
the components one at a time according to the chains. The simplest way to do
that might be to increase by all elements of the `A` chain, and then
by all of the elements of the `B` chain (or the other way around). That's the
strategy I took when [constructing the \(\text{Sign} \times \text{Sign}\)
chain above](#sign-prod-chain).
To formalize this notion, a few lemmas. First, given two chains, where
the first ends with the same element the second starts with, we can combine
them into one long chain.

{{< codelines "agda" "agda-spa/Chain.agda" 31 33 >}}
More interestingly, given a chain of comparisons in one lattice, we are
able to lift it into a chain in another lattice by applying a function
to each element. This function must be monotonic, because it must not
reverse \(a < b\) into \(f(b) < f(a)\). Moreover, this function
should be injective, because if \(f(a) = f(b)\), then a chain \(a < b\) might
be collapsed into \(f(a) \not< f(a)\), changing its length. Finally,
the function needs to produce equivalent outputs when given equivalent inputs.
The result is the following lemma:

{{< codelines "agda" "agda-spa/Lattice.agda" 226 247 >}}
Given this, and two lattices of finite height, we construct the full product
chain by lifting the `A` chain into the product via \(a \mapsto (a, \bot_2)\),
lifting the `B` chain into the product via \(b \mapsto (\top_1, b)\), and
concatenating the results. This works because the first chain ends with
\((\top_1, \bot_2)\), and the second starts with it.

{{< codelines "agda" "agda-spa/Lattice/Prod.agda" 169 171 >}}
This gets us the longest chain; what remains is to prove that this chain's
length is the bound of all other chains. To do so, we need to work in
the opposite direction; given a chain in the product lattice, we need to
somehow reduce it to chains in lattices `A` and `B`, and leverage their
finite height to complete the proof.

The key idea is that for every two consecutive elements in the product lattice
chain, we know that at least one of their components must've increased.
This increase had to come either from elements in lattice `A` or in lattice `B`.
We can thus stick this increase into an `A`-chain or a `B`-chain, increasing
its length. Since one of the chains grows with every consecutive pair, the
number of consecutive pairs can't exceed the combined lengths of the `A` and `B` chains.
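For a concrete illustration (my own example, using the sign lattice for both components): the product chain

{{< latex >}}
(\bot, \bot) < (+, \bot) < (+, \top)
{{< /latex >}}

decomposes into the `A`-chain \(\bot < +\) (from the first step, which increased the first component) and the `B`-chain \(\bot < \top\) (from the second step, which increased the second). Its two comparisons are indeed bounded by the combined length \(1 + 1 = 2\) of the constituent chains.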
I implement this idea as an `unzip` function, which takes a product chain
and produces two chains made from its increases. By the logic we've described,
the combined length of the two chains has to bound the main one's. I give the signature below,
and will put the implementation in a collapsible detail block. One last
detail is that the need to decide which chain to grow --- and thus which component
has increased --- is what introduces the need for decidable equality.

{{< codelines "agda" "agda-spa/Lattice/Prod.agda" 149 149 >}}

{{< codelines "agda" "agda-spa/Lattice/Prod.agda" 149 163 "" "**(Click here for the implementation of `unzip`)**" >}}
Having decomposed the product chain into constituent chains, we simply combine
the fact that they have to be bounded by the heights of the `A` and `B` lattices
with the fact that they bound the combined chain.

{{< codelines "agda" "agda-spa/Lattice/Prod.agda" 165 175 "hl_lines = 8-10" >}}

This completes the proof!

### Iterated Products

The product lattice allows us to combine finite-height lattices into a
new finite-height lattice. From there, we can use this newly created lattice
as a component of yet another product lattice. For instance, if we had
\(L_1 \times L_2\), we can take a product of that with \(L_1\) again,
and get \(L_1 \times (L_1 \times L_2)\). Since this also creates a
finite-height lattice, we can repeat this process, and keep
taking a product with \(L_1\), creating:

{{< latex >}}
\overbrace{L_1 \times ... \times L_1}^{n\ \text{times}} \times L_2.
{{< /latex >}}

I call this the _iterated product lattice_. Its significance will become
clear shortly; in the meantime, let's prove that it is indeed a lattice
(of finite height).
To create an iterated product lattice, we still need two constituent
lattices as input.

{{< codelines "agda" "agda-spa/Lattice/IterProd.agda" 7 11 >}}

{{< codelines "agda" "agda-spa/Lattice/IterProd.agda" 23 24 >}}
At a high level, the proof goes by induction on the number of applications
of the product. There's just one trick. I'd like to build up an `isLattice`
instance even if `A` and `B` are not finite-height. That's because in
that case, the iterated product is still a lattice, just not one with
a finite height. On the other hand, the `isFiniteHeightLattice` proof
requires the `isLattice` proof. Since we're building up by induction,
that means that at every recursive invocation of the function, we need
to get the "partial" lattice instance and give it to the "partial" finite
height lattice instance. When I implemented the inductive proof for
`isLattice` independently from the (more specific) inductive proof
of `isFiniteHeightLattice`, Agda could not unify the two `isLattice`
instances (the "actual" one and the one that serves as witness
for `isFiniteHeightLattice`). This led to some trouble and inconvenience,
and so, I thought it best to build the two up together.
To build up the lattice instance and --- if possible --- the finite height
instance, I needed to allow for the constituent lattices to be either finite
or infinite. I supported this by defining a helper type:

{{< codelines "agda" "agda-spa/Lattice/IterProd.agda" 40 55 >}}

Then, I defined the "everything at once" type, which, instead of
a field for the proof of finite height, has a field that constructs
this proof _if the necessary additional information is present_.

{{< codelines "agda" "agda-spa/Lattice/IterProd.agda" 57 76 >}}

Finally, the proof by induction. It's actually relatively long, so I'll
include it as a collapsible block.

{{< codelines "agda" "agda-spa/Lattice/IterProd.agda" 78 120 "" "**(Click here to expand the inductive proof)**" >}}
### Fixed Height of the Map Lattice

We saw above that [we can make a map lattice have a finite height if
we fix its keys](#fin-keys). How does this work? Well, if the keys
are always the same, we can think of such a map as just a tuple, with
as many elements as there are keys.

{{< latex >}}
\begin{array}{cccccc}
\{ & a: 1, & b: 2, & c: 3 & \} \\
& & \iff & & \\
( & 1, & 2, & 3 & )
\end{array}
{{< /latex >}}
This is why I introduced [iterated products](#iterated-products) earlier;
we can use them to construct the second lattice in the example above.
I'll take one departure from that example, though: I'll "pad" the tuples
with an additional unit element at the end. The unit type (denoted \(\top\))
--- which has only a single element --- forms a finite height lattice trivially;
I prove this in [an appendix below](#appendix-the-unit-lattice).
Using this padding helps reduce the number of special cases; without the
padding, the tuple definition might be something like the following:

{{< latex >}}
\text{tup}(A, k) =
\begin{cases}
\top & k = 0 \\
A & k = 1 \\
A \times \text{tup}(A, k - 1) & k > 1
\end{cases}
{{< /latex >}}
On the other hand, if we were to allow the extra padding, we could drop
the definition down to:

{{< latex >}}
\text{tup}(A, k) = \text{iterate}(t \mapsto A \times t, k, \top) =
\begin{cases}
\top & k = 0 \\
A \times \text{tup}(A, k - 1) & k > 0
\end{cases}
{{< /latex >}}

And so, we drop from three cases to two, which means less proof work for us.
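Transcribed into Agda, the padded definition is a small, self-contained recursion (a standalone sketch of mine, not the repository's code):

```
open import Data.Nat using (ℕ; zero; suc)
open import Data.Product using (_×_)
open import Data.Unit using (⊤)

-- Sketch: a k-fold product of A, padded with the trivially
-- finite-height unit lattice at the end.
tup : Set → ℕ → Set
tup A zero    = ⊤
tup A (suc k) = A × tup A k
```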
The tough part is to prove that the two representations of maps --- the
key-value list and the iterated product --- are equivalent. We will not
have much trouble proving that they're both lattices (we did that last time,
for both [products]({{< relref "02_spa_agda_combining_lattices#the-cartesian-product-lattice" >}}) and [maps]({{< relref "02_spa_agda_combining_lattices#the-map-lattice" >}})). Instead, what we need to do is prove that
the height of one lattice is the same as the height of the other. We prove
this by providing something like an [isomorphism](https://mathworld.wolfram.com/Isomorphism.html):
a pair of functions that convert between the two representations, and
preserve the properties and relationships (such as \((\sqcup)\)) of lattice
elements. In fact, the list of the conversion functions' properties is quite
extensive:

{{< codelines "agda" "agda-spa/Isomorphism.agda" 22 33 "hl_lines=8-12">}}
1. First, the functions must preserve our definition of equivalence. Thus,
   if we convert two equivalent elements from the list representation to
   the tuple representation, the resulting tuples should be equivalent as well.
   The reverse must be true, too.
2. Second, the functions must preserve the binary operations --- see also the definition
   of a [homomorphism](https://en.wikipedia.org/wiki/Homomorphism#Definition).
   Specifically, if \(f\) is a conversion function, then the following
   should hold:

   {{< latex >}}
   f(a \sqcup b) \approx f(a) \sqcup f(b)
   {{< /latex >}}

   For the purposes of proving that equivalent maps have finite heights, it
   turns out that this property need only hold for the join operator \((\sqcup)\).
3. Finally, the functions must be inverses of each other. If you convert a
   list to a tuple, and then the tuple back into a list, the resulting
   value should be equivalent to what we started with. In fact, they
   need to be both "left" and "right" inverses, so that both \(f(g(x))\approx x\)
   and \(g(f(x)) \approx x\).
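Bundled into a record, these requirements might be sketched like this (field names and the precise indexing are mine; the carriers `A` and `B`, their equivalences \(\approx_1, \approx_2\), and joins \(\sqcup_1, \sqcup_2\) are assumed in scope):

```
-- Sketch: the structure-preserving conversion pair.
record LatticeIso : Set where
  field
    to   : A → B
    from : B → A
    to-cong   : ∀ {a b} → a ≈₁ b → to a ≈₂ to b
    from-cong : ∀ {a b} → a ≈₂ b → from a ≈₁ from b
    to-⊔      : ∀ a b → to (a ⊔₁ b) ≈₂ (to a ⊔₂ to b)
    from-⊔    : ∀ a b → from (a ⊔₂ b) ≈₁ (from a ⊔₁ from b)
    from-to   : ∀ a → from (to a) ≈₁ a
    to-from   : ∀ b → to (from b) ≈₂ b
```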
Given this, the high-level proof is in two parts:

1. __Proving that a chain of the same height exists in the second (e.g., tuple)
   lattice:__ To do this, we want to take the longest chain in the first
   (e.g., key-value list) lattice, and convert it into a chain in the second.
   The mechanism for this is not too hard to imagine: we just take the original
   chain, and apply the conversion function to each element.

   Intuitively, this works because of the structure-preserving properties
   we required above. For instance (recall the
   [definition of \((\leq)\) given by Lars Hupel](https://lars.hupel.info/topics/crdt/03-lattices/#there-), which in brief is \(a \leq b \triangleq a \sqcup b = b\)):

   {{< latex >}}
   \begin{array}{rcr}
   a \leq b & \iff & (\text{definition of less than})\\
   a \sqcup b \approx b & \implies & (\text{conversions preserve equivalence}) \\
   f(a \sqcup b) \approx f(b) & \implies & (\text{conversions distribute over binary operations}) \\
   f(a) \sqcup f(b) \approx f(b) & \iff & (\text{definition of less than}) \\
   f(a) \leq f(b)
   \end{array}
   {{< /latex >}}
2. __Proving that longer chains can't exist in the second (e.g., tuple) lattice:__
   We've already seen the mechanism to port a chain from one lattice to
   another lattice, and we can use this same mechanism (but switching directions)
   to go in reverse. If we do that, we can take a chain of questionable length
   in the tuple lattice, port it back to the key-value map, and use the
   (already known) fact that its chains are bounded to conclude the same
   thing about the tuple chain.

As you can tell, the chain porting mechanism is doing the heavy lifting here.
It's relatively easy to implement given the conditions we've set on
conversion functions, in both directions:

{{< codelines "agda" "agda-spa/Isomorphism.agda" 52 64 >}}

With that, we can prove the second lattice's finite height:

{{< codelines "agda" "agda-spa/Isomorphism.agda" 66 80 >}}
The conversion functions are also not too difficult to define. I give
them below, but I refrain from showing proofs of the more involved
properties (such as the fact that `from` and `to` are inverses, preserve
equivalence, and distribute over join) here. You can view them by clicking
the link at the top of the code block below.

{{< codelines "agda" "agda-spa/Lattice/FiniteValueMap.agda" 68 85 >}}

Above, `FiniteValueMap ks` is the type of maps whose keys are fixed to
`ks`, defined as follows:

{{< codelines "agda" "agda-spa/Lattice/FiniteMap.agda" 58 60 >}}

Proving the remaining properties (which, as I mentioned, I omit from
the main body of the post) is sufficient to apply the isomorphism,
proving that maps with finite keys are of a finite height.
### Using the Finite Height Property

Lattices having a finite height is a crucial property for the sorts of
static program analyses I've been working to implement.
We can create functions that traverse "up" through the lattice,
creating larger values each time. If these lattices are of a finite height,
then the static analysis functions can only traverse "so high".
Under certain conditions, this
guarantees that our static analysis will eventually terminate with
a [fixed point](https://mathworld.wolfram.com/FixedPoint.html). Pragmatically,
this is a state in which running our analysis does not yield any more information.

The way that the fixed point is found is called the _fixed point algorithm_.
We'll talk more about this in the next post.

{{< seriesnav >}}
### Appendix: The Unit Lattice

The unit lattice is a relatively boring one. I use the built-in unit type
in Agda, which (perhaps a bit confusingly) is represented using the symbol `⊤`.
It only has a single constructor, `tt`.

{{< codelines "agda" "agda-spa/Lattice/Unit.agda" 6 7 >}}

The equivalence for the unit type is just propositional equality (we have
no need to identify unequal values of `⊤`, since there is only one value).

{{< codelines "agda" "agda-spa/Lattice/Unit.agda" 17 25 >}}

Both the join \((\sqcup)\) and meet \((\sqcap)\) operations are trivially defined;
in both cases, they simply take two `tt`s and produce a new `tt`.
Mathematically, one might write this as \((\text{tt}, \text{tt}) \mapsto \text{tt}\).
In Agda:

{{< codelines "agda" "agda-spa/Lattice/Unit.agda" 30 34 >}}

These operations are trivially associative, commutative, and idempotent.

{{< codelines "agda" "agda-spa/Lattice/Unit.agda" 39 46 >}}

That's sufficient for them to be semilattices:

{{< codelines "agda" "agda-spa/Lattice/Unit.agda" 48 54 >}}

The [absorption laws]({{< relref "01_spa_agda_lattices#absorption-laws" >}})
are also trivially satisfied, which means that the unit type forms a lattice.

{{< codelines "agda" "agda-spa/Lattice/Unit.agda" 78 90 >}}

Since there's only one element, it's not really possible to have chains
that contain any more than one value. As a result, the height (in comparisons)
of the unit lattice is zero.

{{< codelines "agda" "agda-spa/Lattice/Unit.agda" 102 117 >}}
@@ -1,7 +1,8 @@
 ---
 title: Compiling a Functional Language Using C++, Part 4 - Small Improvements
 date: 2019-08-06T14:26:38-07:00
-tags: ["C and C++", "Functional Languages", "Compilers"]
+tags: ["C++", "Functional Languages", "Compilers"]
+series: "Compiling a Functional Language using C++"
 description: "In this post, we take a little break from pushing our compiler forward to make some improvements to the code we've written so far."
 ---
 We've done quite a big push in the previous post. We defined
content/blog/04_spa_agda_fixedpoint.md (new file)
@@ -0,0 +1,240 @@
---
title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 4: The Fixed-Point Algorithm"
series: "Static Program Analysis in Agda"
description: "In this post, I show how to find the least fixed point of a finite-height lattice"
date: 2024-11-03T17:50:26-08:00
tags: ["Agda", "Programming Languages"]
---

In the previous post we looked at lattices of finite height, which are a crucial
ingredient to our static analyses. In this post, I will describe the specific
algorithm that makes use of these lattices; this algorithm will be at the core
of this series.

Lattice-based static analyses tend to operate by iteratively combining facts
from the program into new ones. For instance, when analyzing `y = 1 + 2`, we
take the (trivial) facts that the numbers one and two are positive, and combine
them into the knowledge that `y` is positive as well. If another line of code
reads `x = y + 1`, we then apply our new knowledge of `y` to determine the sign
of `x`, too. Combining facts in this manner gives us more information, which we
can then continue to apply to learn more about the program.
A static program analyzer, however, is a very practical thing. Although in
mathematics we may allow ourselves to delve into infinite algorithms, we have
no such luxury while trying to, say, compile some code. As a result, after
a certain point, we need to stop our iterative (re)combination of facts. In
an ideal world, that point would be when we know we have found out everything we
could about the program. A corollary to that would be that this point must
be guaranteed to eventually occur, lest we keep looking for it indefinitely.

The fixed-point algorithm does this for us. If we describe our analysis as
a monotonic function over a finite-height lattice, this algorithm gives
us a surefire way to find facts about our program that constitute "complete"
information, which can't be re-inspected to find out more. The algorithm is guaranteed
to terminate, which means that we will not get stuck in an infinite loop.
### The Algorithm
Take a lattice \(L\) and a monotonic function \(f\). We've
[talked about monotonicity before]({{< relref "01_spa_agda_lattices#define-monotonicity" >}}),
but it's easy to re-state. Specifically, a function is monotonic if the following
rule holds true:

{{< latex >}}
\textbf{if}\ a \le b\ \textbf{then}\ f(a) \le f(b)
{{< /latex >}}

Recall that the less-than relation on lattices in our case
[encodes specificity]({{< relref "01_spa_agda_lattices#specificity" >}}).
In particular, if elements of our lattice describe our program, then smaller
elements should provide more precise descriptions (where "`x` is positive"
is more precise than "`x` has any sign", for example). Viewed through this
lens, monotonicity means that more specific inputs produce more specific outputs.
That seems reasonable.
Now, let's start with the least element of our lattice, denoted \(\bot\).
A lattice of finite height is guaranteed to have such an element. If it didn't,
we could always extend chains by tacking on a smaller element to their bottom,
and then the lattice wouldn't have a finite height anymore.
{#start-least}

Now, apply \(f\) to \(\bot\) to get \(f(\bot)\). Since \(\bot\) is the least
element, it must be true that \(\bot \le f(\bot)\). Now, if it's "less than or equal",
is it "less than", or is it "equal"? If it's the latter, we have \(\bot = f(\bot)\).
This means we've found a fixed point: given the input \(\bot\), our analysis \(f\)
produced no new information, and we're done. Otherwise, we are not done, but we
know that \(\bot < f(\bot)\), which will be helpful shortly.

Continuing the "less than" case, we can apply \(f\) again, this time to \(f(\bot)\).
This gives us \(f(f(\bot))\). Since \(f\) is monotonic and \(\bot \le f(\bot)\), we know
also that \(f(\bot) \le f(f(\bot))\). Again, ask "which is it?", and as before, if
\(f(\bot) = f(f(\bot))\), we have found a fixed point. Otherwise, we know that
\(f(\bot) < f(f(\bot))\).
We can keep doing this. Notice that with each step, we are either done
(having found a fixed point) or we have a new inequality in our hands. We can
arrange the ones we've seen so far into a chain:

{{< latex >}}
\bot < f(\bot) < f(f(\bot)) < ...
{{< /latex >}}

Each time we fail to find a fixed point, we add one element to our chain, growing
it. But if our lattice \(L\) has a finite height, that means eventually this
process will have to stop; the chain can't grow forever. Eventually, we will
have to find a value such that \(v = f(v)\). Thus, our algorithm is guaranteed
to terminate, and give a fixed point.
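As a tiny worked instance (my own example, not from the lecture notes): over the sign lattice, take the monotonic function \(f(x) = x \sqcup +\). Then:

{{< latex >}}
\bot \quad < \quad f(\bot) = \bot \sqcup + = +, \qquad f(+) = + \sqcup + = +
{{< /latex >}}

The first application grows the chain; the second produces nothing new, so the algorithm stops after two applications of \(f\) and reports \(+\) as the fixed point.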
I implemented the iterative process of applying \(f\) using a recursive function.
Agda has a termination checker, to which the logic above --- which proves that
the iteration will eventually finish --- is not at all obvious. The trick to
getting it to work was to use a notion of "gas": an always-decreasing value
that serves as one of the function's arguments. Since the value is always decreasing
in size, the termination checker is satisfied.

This works by observing that we already have a rough idea of the maximum number
of times our function will recurse; that would be the height of the lattice.
After that, we would be building an impossibly long chain. So, we'll give the
function a "budget" of that many iterations, plus one more. Since the chain
grows once each time the budget shrinks (indicating recursion), running
out of our "gas" would mean that we had built an impossibly long chain --- it
will provably never happen.

In all, the recursive function is as follows:

{{< codelines "Agda" "agda-spa/Fixedpoint.agda" 53 64 >}}
The first case handles running out of gas, arguing by bottom-elimination (contradiction).
The second case follows the algorithm I've described pretty closely; it
applies \(f\) to an existing value, checks if the result is equal (equivalent)
to the original, and if it isn't, it grows the existing chain of elements
and invokes the step function recursively with the grown chain and less gas.

The recursive function implements a single "step" of the process (applying `f`,
comparing for equality, returning the fixed point if one was found). All that's
left is to kick off the process using \(\bot\). This is what `fix` does:

{{< codelines "Agda" "agda-spa/Fixedpoint.agda" 66 67 >}}

This function is responsible for providing gas to `doStep`; as I mentioned above,
it provides just a bit more gas than the maximum-length chain, which means that
if the gas is exhausted, we've certainly arrived at a contradiction. It also
provides an initial chain onto which `doStep` will keep tacking on new inequalities
as it finds them. Since we haven't found any yet, this is the single-element
chain of \(\bot\). The last thing it does is set up the recursion invariant
(that the sum of the gas and the chain length is constant), and provide
a proof that \(\bot \le f(\bot)\). This function always returns a fixed point.
### Least Fixed Point
Functions can have many fixed points. Take the identity function, which simply
returns its argument unchanged; this function has a fixed point for every
element in its domain, since, for example, \(\text{id}(1) = 1\), \(\text{id}(2) = 2\), etc.
The fixed point found by our algorithm above is somewhat special among the
possible fixed points of \(f\): it is the _least fixed point_ of the function.
Call our fixed point \(a\); if there's another point \(b\) such that \(b=f(b)\),
then the fixed point we found must be less than or equal to \(b\) (that is,
\(a \le b\)). This is important given our interpretation of "less than" as "more specific":
the fixed-point algorithm produces the most specific possible information about
our program given the rules of our analysis.

The proof is simple; suppose that it took \(k\) iterations of calling \(f\)
to arrive at our fixed point. This gives us:

{{< latex >}}
a = \underbrace{f(f(...(f(}_{k\ \text{times}}\bot)))) = f^k(\bot)
{{< /latex >}}
Now, take our other fixed point \(b\). Since \(\bot\) is the least element of
the lattice, we have \(\bot \le b\).

{{< latex >}}
\begin{array}{ccccccccr}
& & \bot & \le & & & b & \quad \implies & \text{(monotonicity of}\ f \text{)}\\
& & f(\bot) & \le & f(b) & = & b & \quad \implies & \text{(} b\ \text{is a fixed point, monotonicity of}\ f \text{)}\\
& & f^2(\bot) & \le & f(b) & = & b & \quad \implies & \text{(} b\ \text{is a fixed point, monotonicity of}\ f \text{)}\\
\\
& & \vdots & & \vdots & & \vdots & & \\
\\
a & = & f^k(\bot) & \le & f(b) & = & b &
\end{array}
{{< /latex >}}

Because of the monotonicity of \(f\), each time we apply it, it preserves the
less-than relationship that started with \(\bot \le b\). Doing that \(k\) times,
we verify that \(a\) is indeed the least fixed point.
To convince Agda of this proof, we once again get into an argument with the termination
checker, which ends the same way it did last time: with us using the notion of "gas"
to ensure that the repeated application of \(f\) eventually ends. Since we're
interested in verifying that `doStep` produces the least fixed point, we formulate
the proof in terms of `doStep` applied to various arguments.

{{< codelines "Agda" "agda-spa/Fixedpoint.agda" 76 84 >}}

As with `doStep`, this function takes as arguments the amount of gas `g` and
a partially-built chain `c`, which gets appended to for each failed equality
comparison. In addition, however, this function takes another arbitrary fixed
point `b`, which is greater than the current input to `doStep` (which is
a value \(f^i(\bot)\) for some \(i\)). It then proves that when `doStep` terminates
(which will be with a value of the form \(f^k(\bot)\)), this value will still
be smaller than `b`. Since it is a proof about `doStep`, `stepPreservesLess`
proceeds by the same case analysis as its subject, and has a very similar (albeit
simpler) structure. In short, though, it encodes the relatively informal proof
I gave above.

Just like with `doStep`, I define a helper function for `stepPreservesLess` that
kicks off its recursive invocations.

{{< codelines "Agda" "agda-spa/Fixedpoint.agda" 86 87 >}}

Above, `aᶠ` is the output of `fix`:

{{< codelines "Agda" "agda-spa/Fixedpoint.agda" 69 70 >}}
### What is a Program?
With the fixed-point algorithm in hand, we have all the tools we need to define
static program analyses:

1. We've created a collection of "lattice builders", which allow us to combine
   various lattice building blocks into more complicated structures; these
   structures are advanced enough to represent the information about our programs.
2. We've figured out a way (our fixed-point algorithm) to repeatedly apply
   an inference function to our programs and eventually produce results.
   This algorithm requires some additional properties from our lattices.
3. We've proven that our lattice builders create lattices with these properties,
   making it possible to use them to construct functions fit for our fixed-point
   algorithm.
All that's left is to start defining monotonic functions over lattices! Except,
what are we analyzing? We've focused a fair bit on the theory of lattices,
but we haven't yet defined even a tiny piece of the language of the programs
we will be analyzing. We will start with programs like this:

```
x = 42
y = 1
if y {
    x = -1;
} else {
    x = -2;
}
```
We will need to model these programs in Agda by describing them as trees
([Abstract Syntax Trees](https://en.wikipedia.org/wiki/Abstract_syntax_tree), to be
precise). We will also need to specify how to evaluate these programs (provide
the [semantics](https://en.wikipedia.org/wiki/Semantics_(computer_science)) of
our language). We will use big-step (also known as "natural") operational semantics
to do so; here's an example rule:

{{< latex >}}
\frac{\rho_1, e \Downarrow z \quad \neg (z = 0) \quad \rho_1,s_1 \Downarrow \rho_2}
     {\rho_1, \textbf{if}\ e\ \textbf{then}\ s_1\ \textbf{else}\ s_2\ \Downarrow\ \rho_2}
{{< /latex >}}

The above reads:

> If the condition of an `if`-`else` statement evaluates to a nonzero value,
> then to evaluate the statement, you evaluate its `then` branch.

In the next post, we'll talk more about how these rules work, and define
the remainder of them to give our language life. See you then!
@@ -1,7 +1,8 @@
 ---
 title: Compiling a Functional Language Using C++, Part 5 - Execution
 date: 2019-08-06T14:26:38-07:00
-tags: ["C and C++", "Functional Languages", "Compilers"]
+tags: ["Functional Languages", "Compilers"]
+series: "Compiling a Functional Language using C++"
 description: "In this post, we define the rules for a G-machine, the abstract machine that we will target with our compiler."
 ---
 {{< gmachine_css >}}
@@ -146,19 +147,19 @@ We will follow the same notation as Simon Peyton Jones in
 [his book](https://www.microsoft.com/en-us/research/wp-content/uploads/1992/01/student.pdf)
 , which was my source of truth when implementing my compiler. The machine
 will be executing instructions that we give it, and as such, it must have
-an instruction queue, which we will reference as \\(i\\). We will write
-\\(x:i\\) to mean "an instruction queue that starts with
-an instruction x and ends with instructions \\(i\\)". A stack machine
-obviously needs to have a stack - we will call it \\(s\\), and will
-adopt a similar notation to the instruction queue: \\(a\_1, a\_2, a\_3 : s\\)
-will mean "a stack with the top values \\(a\_1\\), \\(a\_2\\), and \\(a\_3\\),
-and remaining instructions \\(s\\)". Finally, as we said, our stack
-machine has a dump, which we will write as \\(d\\). On this dump,
+an instruction queue, which we will reference as \(i\). We will write
+\(x:i\) to mean "an instruction queue that starts with
+an instruction x and ends with instructions \(i\)". A stack machine
+obviously needs to have a stack - we will call it \(s\), and will
+adopt a similar notation to the instruction queue: \(a_1, a_2, a_3 : s\)
+will mean "a stack with the top values \(a_1\), \(a_2\), and \(a_3\),
+and remaining instructions \(s\)". Finally, as we said, our stack
+machine has a dump, which we will write as \(d\). On this dump,
 we will push not only the current stack, but also the current
 instructions that we are executing, so we may resume execution
-later. We will write \\(\\langle i, s \\rangle : d\\) to mean
-"a dump with instructions \\(i\\) and stack \\(s\\) on top,
-followed by instructions and stacks in \\(d\\)".
+later. We will write \(\langle i, s \rangle : d\) to mean
+"a dump with instructions \(i\) and stack \(s\) on top,
+followed by instructions and stacks in \(d\)".
 
 There's one more thing the G-machine will have that we've not yet discussed at all,
 and it's needed because of the following quip earlier in the post:
@@ -178,14 +179,14 @@ its address the same, but change the value on the heap.
 This way, all trees that reference the node we change become updated,
 without us having to change them - their child address remains the same,
 but the child has now been updated. We represent the heap
-using \\(h\\). We write \\(h[a : v]\\) to say "the address \\(a\\) points
-to value \\(v\\) in the heap \\(h\\)". Now you also know why we used
-the letter \\(a\\) when describing values on the stack - the stack contains
+using \(h\). We write \(h[a : v]\) to say "the address \(a\) points
+to value \(v\) in the heap \(h\)". Now you also know why we used
+the letter \(a\) when describing values on the stack - the stack contains
 addresses of (or pointers to) tree nodes.
 
 _Compiling Functional Languages: a tutorial_ also keeps another component
 of the G-machine, the __global map__, which maps function names to addresses of nodes
-that represent them. We'll stick with this, and call this global map \\(m\\).
+that represent them. We'll stick with this, and call this global map \(m\).
 
 Finally, let's talk about what kind of nodes our trees will be made of.
 We don't have to include every node that we've defined as a subclass of
@@ -217,7 +218,7 @@ First up is __PushInt__:
 
 Let's go through this. We start with an instruction queue
 with `PushInt n` on top. We allocate a new `NInt` with the
-number `n` on the heap at address \\(a\\). We then push
+number `n` on the heap at address \(a\). We then push
 the address of the `NInt` node on top of the stack. Next,
 __PushGlobal__:
 
@@ -250,7 +251,7 @@ __Push__:
 {{< /gmachine >}}
 
 We define this instruction to work if and only if there exists an address
-on the stack at offset \\(n\\). We take the value at that offset, and
+on the stack at offset \(n\). We take the value at that offset, and
 push it onto the stack again. This can be helpful for something like
 `f x x`, where we use the same tree twice. Speaking of that - let's
 define an instruction to combine two nodes into an application:
@@ -273,11 +274,11 @@ that is an `NApp` node, with its two children being the nodes we popped off.
 Finally, we push it onto the stack.
 
 Let's try use these instructions to get a feel for it. In
-order to conserve space, let's use \\(\\text{G}\\) for PushGlobal,
-\\(\\text{I}\\) for PushInt, and \\(\\text{A}\\) for PushApp.
+order to conserve space, let's use \(\text{G}\) for PushGlobal,
+\(\text{I}\) for PushInt, and \(\text{A}\) for PushApp.
 Let's say we want to construct a graph for `double 326`. We'll
-use the instructions \\(\\text{I} \; 326\\), \\(\\text{G} \; \\text{double}\\),
-and \\(\\text{A}\\). Let's watch these instructions play out:
+use the instructions \(\text{I} \; 326\), \(\text{G} \; \text{double}\),
+and \(\text{A}\). Let's watch these instructions play out:
 {{< latex >}}
 \begin{aligned}
 [\text{I} \; 326, \text{G} \; \text{double}, \text{A}] & \quad s \quad & d \quad & h \quad & m[\text{double} : a_d] \\
@@ -345,13 +346,13 @@ code for the global function:
 {{< /gmachine_inner >}}
 {{< /gmachine >}}
 
-In this rule, we used a general rule for \\(a\_k\\), in which \\(k\\) is any number
-between 1 and \\(n-1\\). We also expect the `NGlobal` node to contain two parameters,
-\\(n\\) and \\(c\\). \\(n\\) is the arity of the function (the number of arguments
-it expects), and \\(c\\) are the instructions to construct the function's tree.
+In this rule, we used a general rule for \(a_k\), in which \(k\) is any number
+between 1 and \(n-1\). We also expect the `NGlobal` node to contain two parameters,
+\(n\) and \(c\). \(n\) is the arity of the function (the number of arguments
+it expects), and \(c\) are the instructions to construct the function's tree.
 
-The attentive reader will have noticed a catch: we kept \\(a\_{n-1}\\) on the stack!
-This once again goes back to replacing a node in-place. \\(a\_{n-1}\\) is the address of the "root" of the
+The attentive reader will have noticed a catch: we kept \(a_{n-1}\) on the stack!
+This once again goes back to replacing a node in-place. \(a_{n-1}\) is the address of the "root" of the
 whole expression we're simplifying. Thus, to replace the value at this address, we need to keep
 the address until we have something to replace it with.
 
@@ -450,7 +451,7 @@ and define it to contain a mapping from tags to instructions
 to be executed for a value of that tag. For instance,
 if the constructor `Nil` has tag 0, and `Cons` has tag 1,
 the mapping for the case expression of a length function
-could be written as \\([0 \\rightarrow [\\text{PushInt} \; 0], 1 \\rightarrow [\\text{PushGlobal} \; \\text{length}, ...] ]\\).
+could be written as \([0 \rightarrow [\text{PushInt} \; 0], 1 \rightarrow [\text{PushGlobal} \; \text{length}, ...] ]\).
 Let's define the rule for it:
 
 {{< gmachine "Jump" >}}
@@ -474,7 +475,7 @@ creating a final graph. We then continue to reduce this final
 graph. But we've left the function parameters on the stack!
 This is untidy. We define a __Slide__ instruction,
 which keeps the address at the top of the stack, but gets
-rid of the next \\(n\\) addresses:
+rid of the next \(n\) addresses:
 
 {{< gmachine "Slide" >}}
 {{< gmachine_inner "Before">}}
content/blog/05_spa_agda_semantics/index.md (new file)
@@ -0,0 +1,551 @@
---
title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 5: Our Programming Language"
series: "Static Program Analysis in Agda"
description: "In this post, I define the language that will serve as the object of our various analyses"
date: 2024-11-03T17:50:27-08:00
tags: ["Agda", "Programming Languages"]
custom_js: ["parser.js"]
bergamot:
  render_presets:
    default: "bergamot/rendering/imp.bergamot"
  input_modes:
    - name: "Expression"
      fn: "window.parseExpr"
    - name: "Basic Statement"
      fn: "window.parseBasicStmt"
    - name: "Statement"
      fn: "window.parseStmt"
---
|
In the previous several posts, I've formalized the notion of lattices, which
|
||||||
|
are an essential ingredient to formalizing the analyses in Anders Møller's
|
||||||
|
lecture notes. However, there can be no program analysis without a program
|
||||||
|
to analyze! In this post, I will define the (very simple) language that we
|
||||||
|
will be analyzing. An essential aspect of the language is its
|
||||||
|
[semantics](https://en.wikipedia.org/wiki/Semantics_(computer_science)), which
|
||||||
|
simply speaking explains what each feature of the language does. At the end
|
||||||
|
of the previous article, I gave the following _inference rule_ which defined
|
||||||
|
(partially) how the `if`-`else` statement in the language works.
|
||||||
|
|
||||||
|
{{< latex >}}
|
||||||
|
\frac{\rho_1, e \Downarrow z \quad \neg (z = 0) \quad \rho_1,s_1 \Downarrow \rho_2}
|
||||||
|
{\rho_1, \textbf{if}\ e\ \textbf{then}\ s_1\ \textbf{else}\ s_2\ \Downarrow\ \rho_2}
|
||||||
|
{{< /latex >}}
|
||||||
|
|
||||||
|
Like I mentioned then, this rule reads as follows:

> If the condition of an `if`-`else` statement evaluates to a nonzero value,
> then to evaluate the statement, you evaluate its `then` branch.

Another similar --- but crucially, not equivalent --- rule is the following:

{{< latex >}}
\frac{\rho_1, e \Downarrow z \quad z = 1 \quad \rho_1,s_1 \Downarrow \rho_2}
     {\rho_1, \textbf{if}\ e\ \textbf{then}\ s_1\ \textbf{else}\ s_2\ \Downarrow\ \rho_2}
{{< /latex >}}
This time, the English interpretation of the rule is as follows:
|
||||||
|
|
||||||
|
> If the condition of an `if`-`else` statement evaluates to one,
|
||||||
|
> then to evaluate the statement, you evaluate its `then` branch.
|
||||||
|
|
||||||
|
These rules are certainly not equivalent. For instance, the former allows
|
||||||
|
the "then" branch to be executed when the condition is `2`; however, in
|
||||||
|
the latter, the value of the conditional must be `1`. If our analysis were
|
||||||
|
intelligent (our first few will not be), then this difference would change
|
||||||
|
its output when determining the signs of the following program:
|
||||||
|
|
||||||
|
```
x = 2
if x {
    y = - 1
} else {
    y = 1
}
```

Using the first, more "relaxed" rule, the condition would be considered "true",
and the sign of `y` would be `-`. On the other hand, using the second,
"stricter" rule, the sign of `y` would be `+`. I stress that in this case,
I am showing a flow-sensitive analysis (one that can understand control flow
and make more specific predictions); for our simplest analyses, we will not
be aiming for flow-sensitivity. There is plenty of work to do even then.

The point of showing these two distinct rules is that we need to be very precise
about how the language will behave, because our analyses depend on that behavior.

Let's not get ahead of ourselves, though. I've motivated the need for semantics,
but there is much groundwork to be laid before we delve into the precise rules
of our language. After all, to define the language's semantics, we need to
have a language.

### The Syntax of Our Simple Language

I've shown a couple of examples of our language now, and there won't be that
much more to it. We can start with _expressions_: things that evaluate to
something. Some examples of expressions are `1`, `x`, and `2-(x+y)`. For our
specific language, the precise set of possible expressions can be given
by the following [Context-Free Grammar](https://en.wikipedia.org/wiki/Context-free_grammar):

{{< latex >}}
\begin{array}{rcll}
e & ::= & x & \text{(variables)} \\
  & | & z & \text{(integer literals)} \\
  & | & e + e & \text{(addition)} \\
  & | & e - e & \text{(subtraction)}
\end{array}
{{< /latex >}}

The above can be read as follows:

> An expression \(e\) is one of the following things:
> 1. Some variable \(x\) [importantly \(x\) is a placeholder for _any_ variable,
>    which could be `x` or `y` in our program code; specifically, \(x\) is
>    a [_metavariable_](https://en.wikipedia.org/wiki/Metavariable).]
> 2. Some integer \(z\) [once again, \(z\) can be any integer, like 1, -42, etc.].
> 3. The addition of two other expressions [which could themselves be additions etc.].
> 4. The subtraction of two other expressions [which could also themselves be additions, subtractions, etc.].

Since expressions can be nested within other expressions --- which is necessary
to allow complicated code like `2-(x+y)` above --- they form a tree. Each node
is one of the elements of the grammar above (variable, addition, etc.). If
a node contains sub-expressions (like addition and subtraction do), then
these sub-expressions form sub-trees of the given node. This data structure
is called an [Abstract Syntax Tree](https://en.wikipedia.org/wiki/Abstract_syntax_tree).

Notably, though `2-(x+y)` has parentheses, our grammar above does not
include them as a case. The reason for this is that the structure of an
abstract syntax tree is sufficient to encode the order in which the operations
should be evaluated. Since I lack a nice way of drawing ASTs, I will use
an ASCII drawing to show an example.

```
Expression: 2 - (x+y)

      (-)
     /   \
    2    (+)
        /   \
       x     y


Expression: (2-x) + y

      (+)
     /   \
   (-)    y
  /   \
 2     x
```

Above, in the first AST, `(+)` is a child of the `(-)` node, which means
that it's a sub-expression. As a result, that subexpression is evaluated first,
before evaluating `(-)`, and so, the AST represents `2-(x+y)`. In the other
example, `(-)` is a child of `(+)`, and is therefore evaluated first. The resulting
association encoded by that AST is `(2-x)+y`.

To an Agda programmer, the one-of-four-things definition above should read
quite similarly to the definition of an algebraic data type. Indeed, this
is how we can encode the abstract syntax tree of expressions:

{{< codelines "Agda" "agda-spa/Language/Base.agda" 12 16 >}}

The only departure from the grammar above is that I had to invent constructors
for the variable and integer cases, since Agda doesn't support implicit coercions.
This adds a little bit of extra overhead, requiring, for example, that we write
numbers as `# 42` instead of `42`.
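
Since the snippet above is included from the repository, here is a sketch of
what such an encoding can look like. Only `#_` is confirmed by the prose; the
other constructor names are stand-ins, not necessarily the repository's.

```Agda
open import Data.String using (String)
open import Data.Integer using (ℤ; +_)

-- A sketch, not the repository's exact code.
data Expr : Set where
  _+_ : Expr → Expr → Expr   -- addition
  _-_ : Expr → Expr → Expr   -- subtraction
  `_  : String → Expr        -- variables
  #_  : ℤ → Expr             -- integer literals

-- The first AST drawn above, `2 - (x+y)`, under this sketch;
-- `+ 2` is the standard library's non-negative ℤ constructor.
twoMinusXY : Expr
twoMinusXY = (# (+ 2)) - ((` "x") + (` "y"))
```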

Having defined expressions, the next thing on the menu is _statements_. Unlike
expressions, which just produce values, statements "do something"; an example
of a statement might be the following Python line:

```Python
print("Hello, world!")
```

The `print` function doesn't produce any value, but it does perform an action;
it prints its argument to the console!

For the formalization, it turns out to be convenient to separate "simple"
statements from "complex" ones. Pragmatically speaking, the difference
between the "simple" and the "complex" is control flow; simple statements
will be guaranteed to always execute without any decisions or jumps.
The reason for this will become clearer in subsequent posts; I will foreshadow
a bit by saying that consecutive simple statements can be placed into a single
[basic block](https://en.wikipedia.org/wiki/Basic_block).
{#introduce-simple-statements}

The following is a group of three simple statements:

```
x = 1
y = x + 2
noop
```

These will always be executed in the same order, exactly once. Here, `noop`
is a convenient type of statement that simply does nothing.

On the other hand, the following statement is not simple:

```
while x {
    x = x - 1
}
```

It's not simple because it makes decisions about how the code should be executed;
if `x` is nonzero, it will try executing the statement in the body of the loop
(`x = x - 1`). Otherwise, it would skip evaluating that statement, and carry on
with subsequent code.

I first define simple statements using the `BasicStmt` type:

{{< codelines "Agda" "agda-spa/Language/Base.agda" 18 20 >}}

Complex statements are just called `Stmt`; they include loops, conditionals and
sequences ---
{{< sidenote "right" "then-note" "\(s_1\ \text{then}\ s_2\)" >}}
The standard notation for sequencing in imperative languages is
\(s_1; s_2\). However, Agda gives special meaning to the semicolon,
and I couldn't find any passable symbolic alternatives.
{{< /sidenote >}} is a sequence where \(s_2\) is evaluated after \(s_1\).
Complex statements subsume simple statements, which I model using the constructor
`⟨_⟩`.

{{< codelines "Agda" "agda-spa/Language/Base.agda" 25 29 >}}
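
Again, a hedged sketch of the two statement types as the prose describes them:
`⟨_⟩` and `_then_` come straight from the text, while the assignment, `if`,
and `while` constructor spellings are my placeholders.

```Agda
-- A sketch with assumed names for everything except ⟨_⟩ and _then_.
data BasicStmt : Set where
  _←_  : String → Expr → BasicStmt   -- assignment
  noop : BasicStmt                   -- do nothing

data Stmt : Set where
  ⟨_⟩           : BasicStmt → Stmt           -- a lifted simple statement
  _then_        : Stmt → Stmt → Stmt         -- sequencing
  if_then_else_ : Expr → Stmt → Stmt → Stmt  -- conditionals
  while_repeat_ : Expr → Stmt → Stmt         -- loops
```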

For an example of using this encoding, take the following simple program:

```
var = 1
if var {
    x = 1
}
```

The Agda version is:

{{< codelines "Agda" "agda-spa/Main.agda" 27 34 >}}
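
Under the sketched constructors from earlier, those referenced lines plausibly
look something like the following reconstruction (the repository's actual
`Main.agda` may differ):

```Agda
-- A reconstruction, not the repository's exact code.
example : Stmt
example =
  ⟨ "var" ← # (+ 1) ⟩ then
  (if (` "var") then ⟨ "x" ← # (+ 1) ⟩ else ⟨ noop ⟩)
```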

Notice how we used `noop` to express the fact that the `else` branch of the
conditional does nothing.

### The Semantics of Our Language

We now have all the language constructs that I'll be showing off --- because
those are all the concepts that I've formalized. What's left is to define
how they behave. We will do this using a logical tool called
[_inference rules_](https://en.wikipedia.org/wiki/Rule_of_inference). I've
written about them a number of times; they're ubiquitous, particularly in the
sorts of things I like to explore on this site. The [section on inference rules]({{< relref "01_aoc_coq#inference-rules" >}})
from my Advent of Code series is pretty relevant, and [the notation section from
a post in my compiler series]({{< relref "03_compiler_typechecking#some-notation" >}}) says
much the same thing; I won't be re-describing them here.

There are three pieces which demand semantics: expressions, simple statements,
and non-simple statements. The semantics of each of the three requires the
semantics of the items that precede it. We will therefore start with expressions.

#### Expressions

The trickiest thing about expressions is that the value of an expression depends
on the "context": `x+1` can evaluate to `43` if `x` is `42`, or it can evaluate
to `0` if `x` is `-1`. To evaluate an expression, we will therefore need to
assign values to all of the variables in that expression. A mapping that
assigns values to variables is typically called an _environment_. We will write
\(\varnothing\) for "empty environment", and \(\{\texttt{x} \mapsto 42, \texttt{y} \mapsto -1 \}\) for
an environment that maps the variable \(\texttt{x}\) to 42, and the variable \(\texttt{y}\) to -1.

Now, a bit of notation. We will use the letter \(\rho\) to represent environments
(and if several environments are involved, we will occasionally number them
as \(\rho_1\), \(\rho_2\), etc.) We will use the letter \(e\) to stand for
expressions, and the letter \(v\) to stand for values. Finally, we'll write
\(\rho, e \Downarrow v\) to say that "in an environment \(\rho\), expression \(e\)
evaluates to value \(v\)". Our two previous examples of evaluating `x+1` can
thus be written as follows:
{#notation-for-environments}

{{< latex >}}
\{ \texttt{x} \mapsto 42 \}, \texttt{x}+1 \Downarrow 43 \\
\{ \texttt{x} \mapsto -1 \}, \texttt{x}+1 \Downarrow 0
{{< /latex >}}

Now, on to the actual rules for how to evaluate expressions. Most simply,
integer literals like `1` just evaluate to themselves.

{{< latex >}}
\frac{n \in \text{Int}}{\rho, n \Downarrow n}
{{< /latex >}}

Note that the letter \(\rho\) is completely unused in the above rule. That's
because no matter what values _variables_ have, a number still evaluates to
the same value. As we've already established, the same is not true for a
variable like \(\texttt{x}\). To evaluate such a variable, we need to retrieve
the value it's mapped to in the current environment, which we will write as
\(\rho(\texttt{x})\). This gives the following inference rule:

{{< latex >}}
\frac{\rho(x) = v}{\rho, x \Downarrow v}
{{< /latex >}}

All that's left is to define addition and subtraction. For an expression in the
form \(e_1+e_2\), we first need to evaluate the two subexpressions \(e_1\)
and \(e_2\), and then add the two resulting numbers. As a result, the addition
rule includes two additional premises, one for evaluating each summand.

{{< latex >}}
\frac
{\rho, e_1 \Downarrow v_1 \quad \rho, e_2 \Downarrow v_2 \quad v_1 + v_2 = v}
{\rho, e_1+e_2 \Downarrow v}
{{< /latex >}}
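
Putting these rules together, here is the full derivation tree for the first
of the earlier example judgements, writing \(\rho\) for \(\{ \texttt{x} \mapsto 42 \}\):

{{< latex >}}
\frac
{\dfrac{\rho(\texttt{x}) = 42}{\rho, \texttt{x} \Downarrow 42} \quad \dfrac{1 \in \text{Int}}{\rho, 1 \Downarrow 1} \quad 42 + 1 = 43}
{\rho, \texttt{x}+1 \Downarrow 43}
{{< /latex >}}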

The subtraction rule is similar. Below, I've configured an instance of
[Bergamot]({{< relref "bergamot" >}}) to interpret these exact rules. Try
typing various expressions like `1`, `1+1`, etc. into the input box below
to see them evaluate. If you click the "Full Proof Tree" button, you can also view
the exact rules that were used in computing a particular value. The variables
`x`, `y`, and `z` are pre-defined for your convenience.

{{< bergamot_widget id="expr-widget" query="" prompt="eval(extend(extend(extend(empty, x, 17), y, 42), z, 0), TERM, ?v)" modes="Expression:Expression" >}}
hidden section "" {
  Eq @ eq(?x, ?x) <-;
}
section "" {
  EvalNum @ eval(?rho, lit(?n), ?n) <- int(?n);
  EvalVar @ eval(?rho, var(?x), ?v) <- inenv(?x, ?v, ?rho);
}
section "" {
  EvalPlus @ eval(?rho, plus(?e_1, ?e_2), ?v) <- eval(?rho, ?e_1, ?v_1), eval(?rho, ?e_2, ?v_2), add(?v_1, ?v_2, ?v);
  EvalMinus @ eval(?rho, minus(?e_1, ?e_2), ?v) <- eval(?rho, ?e_1, ?v_1), eval(?rho, ?e_2, ?v_2), subtract(?v_1, ?v_2, ?v);
}
hidden section "" {
  EnvTake @ inenv(?x, ?v, extend(?rho, ?x, ?v)) <-;
  EnvSkip @ inenv(?x, ?v_1, extend(?rho, ?y, ?v_2)) <- inenv(?x, ?v_1, ?rho), not(eq(?x, ?y));
}
{{< /bergamot_widget >}}

The Agda equivalent of this looks very similar to the rules themselves. I use
`⇒ᵉ` instead of \(\Downarrow\), and there's a little bit of tedium with
wrapping integers into a new `Value` type. I also used a (partial) relation
`(x, v) ∈ ρ` instead of explicitly defining environment access, since
it is conceivable for a user to attempt to access a variable that has not
been assigned to. Aside from these notational changes, the structure of each
of the constructors of the evaluation data type matches the inference rules
I showed above.

{{< codelines "Agda" "agda-spa/Language/Semantics.agda" 27 35 >}}
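
The `(x, v) ∈ ρ` relation itself isn't shown in the excerpt; below is a
minimal sketch of how it could be defined, mirroring the `EnvTake` and
`EnvSkip` rules from the widget above. The association-list representation
and all names here are my assumptions.

```Agda
open import Data.List using (List; _∷_)
open import Data.Product using (_×_; _,_)
open import Data.String using (String)
open import Relation.Binary.PropositionalEquality using (_≢_)

-- `Value` is the wrapper type mentioned in the prose; its definition
-- lives in the repository and is assumed to be in scope here.
Env : Set
Env = List (String × Value)

data _∈ᵉ_ : String × Value → Env → Set where
  -- the most recent binding for x wins (EnvTake)
  here  : ∀ {x v ρ} → (x , v) ∈ᵉ ((x , v) ∷ ρ)
  -- skip bindings for other variables (EnvSkip)
  there : ∀ {x v y w ρ} → x ≢ y → (x , v) ∈ᵉ ρ → (x , v) ∈ᵉ ((y , w) ∷ ρ)
```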

#### Simple Statements

The main difference when formalizing statements (simple and "normal" alike) is
that they modify the environment. If `x` has one value, writing `x = x + 1` will
certainly change that value. On the other hand, statements don't produce values.
So, we will be writing claims like \(\rho_1 , \textit{bs} \Rightarrow \rho_2\)
to say that the basic statement \(\textit{bs}\), when starting in environment
\(\rho_1\), will produce environment \(\rho_2\). Here's an example:

{{< latex >}}
\{ \texttt{x} \mapsto 42, \texttt{y} \mapsto 17 \}, \ \texttt{x = x - \text{1}} \Rightarrow \{ \texttt{x} \mapsto 41, \texttt{y} \mapsto 17 \}
{{< /latex >}}

Here, we subtracted one from a variable with value `42`, leaving it with a new
value of `41`.

There are two basic statements, and one of them quite literally does nothing.
The inference rule for `noop` is very simple:

{{< latex >}}
\rho,\ \texttt{noop} \Rightarrow \rho
{{< /latex >}}

For the assignment rule, we need to know how to evaluate the expression on the
right side of the equal sign. This is why we needed to define the semantics
of expressions first. Given those, the evaluation rule for assignment is as
follows, with \(\rho[x \mapsto v]\) meaning "the environment \(\rho\) but
mapping the variable \(x\) to value \(v\)".

{{< latex >}}
\frac
{\rho, e \Downarrow v}
{\rho, x = e \Rightarrow \rho[x \mapsto v]}
{{< /latex >}}

Those are actually all the rules we need, and below, I am once again configuring
a Bergamot instance, this time with simple statements. Try out `noop` or some
sort of variable assignment, like `x = x + 1`.

{{< bergamot_widget id="basic-stmt-widget" query="" prompt="stepbasic(extend(extend(extend(empty, x, 17), y, 42), z, 0), TERM, ?env)" modes="Basic Statement:Basic Statement" >}}
hidden section "" {
  Eq @ eq(?x, ?x) <-;
}
hidden section "" {
  EvalNum @ eval(?rho, lit(?n), ?n) <- int(?n);
  EvalVar @ eval(?rho, var(?x), ?v) <- inenv(?x, ?v, ?rho);
}
hidden section "" {
  EvalPlus @ eval(?rho, plus(?e_1, ?e_2), ?v) <- eval(?rho, ?e_1, ?v_1), eval(?rho, ?e_2, ?v_2), add(?v_1, ?v_2, ?v);
  EvalMinus @ eval(?rho, minus(?e_1, ?e_2), ?v) <- eval(?rho, ?e_1, ?v_1), eval(?rho, ?e_2, ?v_2), subtract(?v_1, ?v_2, ?v);
}
section "" {
  StepNoop @ stepbasic(?rho, noop, ?rho) <-;
  StepAssign @ stepbasic(?rho, assign(?x, ?e), extend(?rho, ?x, ?v)) <- eval(?rho, ?e, ?v);
}
hidden section "" {
  EnvTake @ inenv(?x, ?v, extend(?rho, ?x, ?v)) <-;
  EnvSkip @ inenv(?x, ?v_1, extend(?rho, ?y, ?v_2)) <- inenv(?x, ?v_1, ?rho), not(eq(?x, ?y));
}
{{< /bergamot_widget >}}

The Agda implementation is once again just a data type with constructors-for-rules.
This time they also look quite similar to the rules I've shown up until now,
though I continue to explicitly quantify over variables like `ρ`.

{{< codelines "Agda" "agda-spa/Language/Semantics.agda" 37 40 >}}
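
For comparison, here is a sketch of what such a data type could look like,
transcribing the `StepNoop` and `StepAssign` rules above; the arrow name
`⇒ᵇ`, the constructor names, and the cons-based environment extension (reusing
the `Env` sketch from earlier) are all my assumptions.

```Agda
-- A sketch, not the repository's exact code.
data _,_⇒ᵇ_ : Env → BasicStmt → Env → Set where
  -- noop leaves the environment untouched (StepNoop)
  ⇒ᵇ-noop   : ∀ {ρ} → ρ , noop ⇒ᵇ ρ
  -- x = e evaluates e and binds the result to x (StepAssign)
  ⇒ᵇ-assign : ∀ {ρ x e v} → ρ , e ⇒ᵉ v → ρ , (x ← e) ⇒ᵇ ((x , v) ∷ ρ)
```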

#### Statements

Let's work on non-simple statements next. The easiest rule to define is probably
sequencing. When we use `then` (or `;`) to combine two statements, what we
actually want is to execute the first statement, which may change variables,
and then execute the second statement while keeping the changes from the first.
This means there are three environments: \(\rho_1\) for the initial state before
either statement is executed, \(\rho_2\) for the state between executing the
first and second statement, and \(\rho_3\) for the final state after both
are done executing. This leads to the following rule:

{{< latex >}}
\frac
{ \rho_1, s_1 \Rightarrow \rho_2 \quad \rho_2, s_2 \Rightarrow \rho_3 }
{ \rho_1, s_1; s_2 \Rightarrow \rho_3 }
{{< /latex >}}
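
As a worked instance of this rule, take the first two simple statements from
the earlier example, starting in the empty environment; each premise is one
assignment step, and the conclusion chains them:

{{< latex >}}
\frac
{ \varnothing, \texttt{x = 1} \Rightarrow \{ \texttt{x} \mapsto 1 \} \quad \{ \texttt{x} \mapsto 1 \}, \texttt{y = x + 2} \Rightarrow \{ \texttt{x} \mapsto 1, \texttt{y} \mapsto 3 \} }
{ \varnothing, \texttt{x = 1}\ \text{then}\ \texttt{y = x + 2} \Rightarrow \{ \texttt{x} \mapsto 1, \texttt{y} \mapsto 3 \} }
{{< /latex >}}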

We will actually need two rules to evaluate the conditional statement: one
for when the condition evaluates to "true", and one for when the condition
evaluates to "false". Only, I never specified booleans as being part of
the language, which means that we will need to come up with what "false" and "true"
are. I will take my cue from C++ and use zero as "false", and any other number
as "true".

If the condition of an `if`-`else` statement is "true" (nonzero), then the
effect of executing the `if`-`else` should be the same as executing the "then"
part of the statement, while completely ignoring the "else" part.

{{< latex >}}
\frac
{ \rho_1 , e \Downarrow v \quad v \neq 0 \quad \rho_1, s_1 \Rightarrow \rho_2}
{ \rho_1, \textbf{if}\ e\ \{ s_1 \}\ \textbf{else}\ \{ s_2 \}\ \Rightarrow \rho_2 }
{{< /latex >}}

Notice that in the above rule, we used the evaluation judgement \(\rho_1, e \Downarrow v\)
to evaluate the _expression_ that serves as the condition. We then had an
additional premise that requires the truthiness of the resulting value \(v\).
The rule for evaluating a conditional with a "false" condition is very similar.

{{< latex >}}
\frac
{ \rho_1 , e \Downarrow v \quad v = 0 \quad \rho_1, s_2 \Rightarrow \rho_2}
{ \rho_1, \textbf{if}\ e\ \{ s_1 \}\ \textbf{else}\ \{ s_2 \}\ \Rightarrow \rho_2 }
{{< /latex >}}

Now that we have rules for conditional statements, it will be surprisingly easy
to define the rules for `while` loops. A `while` loop will also have two rules,
one for when its condition is truthy and one for when it's falsey. However,
unlike an `if`-`else` statement, a `while` loop whose condition is "false"
will do nothing, leaving the environment unchanged:

{{< latex >}}
\frac
{ \rho_1 , e \Downarrow v \quad v = 0 }
{ \rho_1 , \textbf{while}\ e\ \{ s \}\ \Rightarrow \rho_1 }
{{< /latex >}}

The trickiest rule is for when the condition of a `while` loop is true.
We evaluate the body once, starting in environment \(\rho_1\) and finishing
in \(\rho_2\), but then we're not done. In fact, we have to go back to the top,
and check the condition again, starting over. As a result, we include another
premise, that tells us that evaluating the loop starting at \(\rho_2\), we
eventually end in state \(\rho_3\). This encodes the "rest of the iterations"
in addition to the one we just performed. The environment \(\rho_3\) is our
final state, so that's what we use in the rule's conclusion.

{{< latex >}}
\frac
{ \rho_1 , e \Downarrow v \quad v \neq 0 \quad \rho_1 , s \Rightarrow \rho_2 \quad \rho_2 , \textbf{while}\ e\ \{ s \}\ \Rightarrow \rho_3 }
{ \rho_1 , \textbf{while}\ e\ \{ s \}\ \Rightarrow \rho_3 }
{{< /latex >}}

And that's it! We have now seen every rule that defines the little object language
I've been using for my Agda work. Below is a Bergamot widget that implements
these rules. Try the following program, which computes the `x`th power of two,
and stores it in `y`:

```
x = 5; y = 1; while (x) { y = y + y; x = x - 1 }
```
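
As a sanity check, here's the loop traced by hand: each iteration doubles `y`
and decrements `x`, so the pair \((x, y)\) evolves as

{{< latex >}}
(5, 1) \rightarrow (4, 2) \rightarrow (3, 4) \rightarrow (2, 8) \rightarrow (1, 16) \rightarrow (0, 32)
{{< /latex >}}

after which the condition evaluates to \(0\), and the program finishes with
\(y = 32 = 2^5\).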

{{< bergamot_widget id="stmt-widget" query="" prompt="step(extend(extend(extend(empty, x, 17), y, 42), z, 0), TERM, ?env)" modes="Statement:Statement" >}}
hidden section "" {
  Eq @ eq(?x, ?x) <-;
}
hidden section "" {
  EvalNum @ eval(?rho, lit(?n), ?n) <- int(?n);
  EvalVar @ eval(?rho, var(?x), ?v) <- inenv(?x, ?v, ?rho);
}
hidden section "" {
  EvalPlus @ eval(?rho, plus(?e_1, ?e_2), ?v) <- eval(?rho, ?e_1, ?v_1), eval(?rho, ?e_2, ?v_2), add(?v_1, ?v_2, ?v);
  EvalMinus @ eval(?rho, minus(?e_1, ?e_2), ?v) <- eval(?rho, ?e_1, ?v_1), eval(?rho, ?e_2, ?v_2), subtract(?v_1, ?v_2, ?v);
}
hidden section "" {
  StepNoop @ stepbasic(?rho, noop, ?rho) <-;
  StepAssign @ stepbasic(?rho, assign(?x, ?e), extend(?rho, ?x, ?v)) <- eval(?rho, ?e, ?v);
}
hidden section "" {
  StepLiftBasic @ step(?rho_1, ?s, ?rho_2) <- stepbasic(?rho_1, ?s, ?rho_2);
}
section "" {
  StepIfTrue @ step(?rho_1, if(?e, ?s_1, ?s_2), ?rho_2) <- eval(?rho_1, ?e, ?v), not(eq(?v, 0)), step(?rho_1, ?s_1, ?rho_2);
  StepIfFalse @ step(?rho_1, if(?e, ?s_1, ?s_2), ?rho_2) <- eval(?rho_1, ?e, ?v), eq(?v, 0), step(?rho_1, ?s_2, ?rho_2);
  StepWhileTrue @ step(?rho_1, while(?e, ?s), ?rho_3) <- eval(?rho_1, ?e, ?v), not(eq(?v, 0)), step(?rho_1, ?s, ?rho_2), step(?rho_2, while(?e, ?s), ?rho_3);
  StepWhileFalse @ step(?rho_1, while(?e, ?s), ?rho_1) <- eval(?rho_1, ?e, ?v), eq(?v, 0);
  StepSeq @ step(?rho_1, seq(?s_1, ?s_2), ?rho_3) <- step(?rho_1, ?s_1, ?rho_2), step(?rho_2, ?s_2, ?rho_3);
}
hidden section "" {
  EnvTake @ inenv(?x, ?v, extend(?rho, ?x, ?v)) <-;
  EnvSkip @ inenv(?x, ?v_1, extend(?rho, ?y, ?v_2)) <- inenv(?x, ?v_1, ?rho), not(eq(?x, ?y));
}
{{< /bergamot_widget >}}

As with all the other rules we've seen, the mathematical notation above can
be directly translated into Agda:

{{< codelines "Agda" "agda-spa/Language/Semantics.agda" 47 64 >}}

### Semantics as Ground Truth

Prior to this post, we had been talking about using lattices and monotone
functions for program analysis. The key problem with using this framework to
define analyses is that there are many monotone functions that produce complete
nonsense; their output is, at best, unrelated to the program they're supposed
to analyze. We don't want to write such functions, since having incorrect
information about the programs in question is unhelpful.

What does it mean for a function to produce correct information, though?
In the context of sign analysis, it would mean that if we say a variable `x` is `+`,
then evaluating the program will leave us in a state in which `x` is positive.
The semantics we defined in this post give us the "evaluating the program" piece.
They establish what the programs _actually_ do, and we can use this ground
truth when checking that our analyses are correct. In subsequent posts, I will
prove the exact property I informally stated above: __for the program analyses
we define, the things they "claim" about our program will match what actually happens
when executing the program using our semantics__.

A piece of the puzzle still remains: how are we going to use the monotone
functions we've been talking so much about? We need to figure out what to feed
to our analyses before we can prove their correctness.

I have an answer to that question: we will be using _control flow graphs_ (CFGs).
These are another program representation, one that's more commonly found in
compilers. I will show what they look like in the next post. I hope to see you
there!
content/blog/05_spa_agda_semantics/parser.js (new file, 128 lines)
@@ -0,0 +1,128 @@
// Parser combinators for the widgets on this page. A parser is a function
// from an input string to a list of [value, rest] pairs; an empty list
// means the parse failed.
const match = str => input => {
    if (input.startsWith(str)) {
        return [[str, input.slice(str.length)]]
    }
    return [];
};

const map = (f, m) => input => {
    return m(input).map(([v, rest]) => [f(v), rest]);
};

const apply = (m1, m2) => input => {
    return m1(input).flatMap(([f, rest]) => m2(rest).map(([v, rest]) => [f(v), rest]));
};

const pure = v => input => [[v, input]];

// Run the given parsers in sequence, combining their results with f.
const liftA = (f, ...ms) => input => {
    if (ms.length <= 0) return []

    let results = map(v => [v], ms[0])(input);
    for (let i = 1; i < ms.length; i++) {
        results = results.flatMap(([vals, rest]) =>
            ms[i](rest).map(([val, rest]) => [[...vals, val], rest])
        );
    }
    return results.map(([vals, rest]) => [f(...vals), rest]);
};

const many1 = (m) => liftA((x, xs) => [x].concat(xs), m, oneOf([
    lazy(() => many1(m)),
    pure([])
]));

const many = (m) => oneOf([ pure([]), many1(m) ]);

const oneOf = ms => input => {
    return ms.flatMap(m => m(input));
};

const takeWhileRegex0 = regex => input => {
    let idx = 0;
    while (idx < input.length && regex.test(input[idx])) {
        idx++;
    }
    return [[input.slice(0, idx), input.slice(idx)]];
};

const takeWhileRegex = regex => input => {
    const result = takeWhileRegex0(regex)(input);
    if (result[0][0].length > 0) return result;
    return [];
};

const spaces = takeWhileRegex0(/\s/);

const digits = takeWhileRegex(/\d/);

const alphas = takeWhileRegex(/[a-zA-Z]/);

const left = (m1, m2) => liftA((a, _) => a, m1, m2);

const right = (m1, m2) => liftA((_, b) => b, m1, m2);

const word = s => left(match(s), spaces);

const end = s => s.length == 0 ? [['', '']] : [];

const lazy = deferred => input => deferred()(input);

const ident = left(alphas, spaces);

const number = oneOf([
    liftA((a, b) => a + b, word("-"), left(digits, spaces)),
    left(digits, spaces),
]);

// The parsers below produce strings in the term notation the Bergamot
// widgets expect, e.g. "plus(var(x), lit(1))" for "x + 1".
const basicExpr = oneOf([
    map(n => `lit(${n})`, number),
    map(x => `var(${x})`, ident),
    liftA((lp, v, rp) => v, word("("), lazy(() => expr), word(")")),
]);

const opExpr = oneOf([
    liftA((_a, _b, e) => ["plus", e], word("+"), spaces, lazy(() => expr)),
    liftA((_a, _b, e) => ["minus", e], word("-"), spaces, lazy(() => expr)),
]);

const flatten = (e, es) => {
    return es.reduce((e1, [op, e2]) => `${op}(${e1}, ${e2})`, e);
}

const expr = oneOf([
    basicExpr,
    liftA(flatten, basicExpr, many(opExpr)),
]);

const basicStmt = oneOf([
    liftA((x, _, e) => `assign(${x}, ${e})`, ident, word("="), expr),
    word("noop"),
]);

const stmt = oneOf([
    basicStmt,
    liftA((_if, _lp_, cond, _rp, _lbr1_, s1, _rbr1, _else, _lbr2, s2, _rbr2) => `if(${cond}, ${s1}, ${s2})`,
        word("if"), word("("), expr, word(")"),
        word("{"), lazy(() => stmtSeq), word("}"),
        word("else"), word("{"), lazy(() => stmtSeq), word("}")),
    liftA((_while, _lp_, cond, _rp, _lbr_, s1, _rbr) => `while(${cond}, ${s1})`,
        word("while"), word("("), expr, word(")"),
        word("{"), lazy(() => stmtSeq), word("}")),
]);

const stmtSeq = oneOf([
    liftA((s1, _semi, rest) => `seq(${s1}, ${rest})`, stmt, word(";"), lazy(() => stmtSeq)),
    stmt,
]);

// Accept only parses that consume the entire input string.
const parseWhole = m => string => {
    const result = left(m, end)(string);
    console.log(result);
    if (result.length > 0) return result[0][0];
    return null;
}

window.parseExpr = parseWhole(expr);
window.parseBasicStmt = parseWhole(basicStmt);
window.parseStmt = parseWhole(stmtSeq);
@@ -1,7 +1,8 @@
---
title: Compiling a Functional Language Using C++, Part 6 - Compilation
date: 2019-08-06T14:26:38-07:00
tags: ["C++", "Functional Languages", "Compilers"]
series: "Compiling a Functional Language using C++"
description: "In this post, we enable our compiler to convert programs written in our functional language to G-machine instructions."
---
In the previous post, we defined a machine for graph reduction,
@@ -12,9 +13,9 @@ this G-machine. We will define a __compilation scheme__,
which will be a set of rules that tell us how to
translate programs in our language into G-machine instructions.
To mirror _Implementing Functional Languages: a tutorial_, we'll
call this compilation scheme \(\mathcal{C}\), and write it
as \(\mathcal{C} ⟦e⟧ = i\), meaning "the expression \(e\)
compiles to the instructions \(i\)".

To follow our route from the typechecking, let's start
with compiling expressions that are numbers. It's pretty easy:
@@ -36,7 +37,7 @@ we want to compile it last:
\mathcal{C} ⟦e_1 \; e_2⟧ = \mathcal{C} ⟦e_2⟧ ⧺ \mathcal{C} ⟦e_1⟧ ⧺ [\text{MkApp}]
{{< /latex >}}

Here, we used the \(⧺\) operator to represent the concatenation of two
lists. Otherwise, this should be pretty intuitive - we first run the instructions
to create the parameter, then we run the instructions to create the function,
and finally, we combine them using MkApp.
@@ -48,17 +49,17 @@ define our case expression compilation scheme to match. However,
we still need to know __where__ on the stack each variable is,
and this changes as the stack is modified.

To accommodate this, we define an environment, \(\rho\),
to be a partial function mapping variable names to their
offsets on the stack. We write \(\rho = [x \rightarrow n, y \rightarrow m]\)
to say "the environment \(\rho\) maps variable \(x\) to stack offset \(n\),
and variable \(y\) to stack offset \(m\)". We also write \(\rho \; x\) to
say "look up \(x\) in \(\rho\)", since \(\rho\) is a function. Finally,
to help with the ever-changing stack, we define an augmented environment
\(\rho^{+n}\), such that \(\rho^{+n} \; x = \rho \; x + n\). In words,
this basically means "\(\rho^{+n}\) has all the variables from \(\rho\),
but their addresses are incremented by \(n\)". We now pass \(\rho\)
in to \(\mathcal{C}\) together with the expression \(e\). Let's
rewrite our first two rules. For numbers:

{{< latex >}}
@@ -71,8 +72,8 @@ For function application:
\mathcal{C} ⟦e_1 \; e_2⟧ \; \rho = \mathcal{C} ⟦e_2⟧ \; \rho \; ⧺ \;\mathcal{C} ⟦e_1⟧ \; \rho^{+1} \; ⧺ \; [\text{MkApp}]
{{< /latex >}}

Notice how in that last rule, we passed in \(\rho^{+1}\) when compiling the function's expression. This is because
the result of running the instructions for \(e_2\) will have left on the stack the function's parameter. Whatever
was at the top of the stack (and thus, had index 0), is now the second element from the top (address 1). The
same is true for all other things that were on the stack. So, we increment the environment accordingly.

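For a concrete (made-up) instance of this shifting:

{{< latex >}}
\rho = [x \rightarrow 0, y \rightarrow 1] \quad \text{gives} \quad \rho^{+1} = [x \rightarrow 1, y \rightarrow 2]
{{< /latex >}}
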
@@ -83,7 +84,7 @@ With the environment, the variable rule is simple:
{{< /latex >}}

One more thing. If we run across a function name, we want to
use PushGlobal rather than Push. Defining \(f\) to be a name
of a global function, we capture this using the following rule:

{{< latex >}}
@@ -92,8 +93,8 @@ of a global function, we capture this using the following rule:

Now it's time for us to compile case expressions, but there's a bit of
an issue - our case expression branches don't map one-to-one with
the \(t \rightarrow i_t\) format of the Jump instruction.
This is because we allow for name patterns in the form \(x\),
which can possibly match more than one tag. Consider this
rather useless example:

@@ -119,8 +120,8 @@ Effectively, we'll be performing [desugaring](https://en.wikipedia.org/wiki/Synt

Now, on to defining the compilation rules for case expressions.
It's helpful to define compiling a single branch of a case expression
separately. For a branch in the form \(t \; x_1 \; x_2 \; ... \; x_n \rightarrow \text{body}\),
we define a compilation scheme \(\mathcal{A}\) as follows:

{{< latex >}}
\begin{aligned}
@@ -132,15 +133,15 @@ t \rightarrow [\text{Split} \; n] \; ⧺ \; \mathcal{C}⟦\text{body}⟧ \; \rho

First, we run Split - the node on the top of the stack is a packed constructor,
and we want access to its member variables, since they can be referenced by
the branch's body via \(x_i\). For the same reason, we must make sure to include
\(x_1\) through \(x_n\) in our environment. Furthermore, since the split values now occupy the stack,
we have to offset our environment by \(n\) before adding bindings to our new variables.
Doing all these things gives us \(\rho'\), which we use to compile the body, placing
the resulting instructions after Split. This leaves us with the desired graph on top of
the stack - the only thing left to do is to clean up the stack of the unpacked values,
which we do using Slide.

Notice that we didn't just create instructions - we created a mapping from the tag \(t\)
to the instructions that correspond to it.

Now, it's time for compiling the whole case expression. We first want
@@ -154,7 +155,7 @@ is captured by the following rule:
\mathcal{C} ⟦e⟧ \; \rho \; ⧺ [\text{Eval}, \text{Jump} \; [\mathcal{A} ⟦\text{alt}_1⟧ \; \rho, ..., \mathcal{A} ⟦\text{alt}_n⟧ \; \rho]]
{{< /latex >}}

This works because \(\mathcal{A}\) creates not only instructions,
but also a tag mapping. We simply populate our Jump instruction with the
mappings resulting from compiling each branch.

@@ -176,7 +177,7 @@ as always, you can look at the full project source code, which is
freely available for each post in the series.

We can now envision a method on the `ast` struct that takes an environment
(just like our compilation scheme takes the environment \(\rho\)),
and compiles the `ast`. Rather than returning a vector
of instructions (which involves copying, unless we get some optimization kicking in),
we'll pass a reference to a vector to our method. The method will then place the generated
@@ -187,7 +188,7 @@ from a variable? A naive solution would be to take a list or map of
global functions as a third parameter to our `compile` method.
But there's an easier way! We know that the program passed type checking.
This means that every referenced variable exists. From there, the situation is easy -
if actual variable names are kept in the environment, \(\rho\), then whenever
we see a variable that __isn't__ in the current environment, it must be a function name.

Having finished contemplating our method, it's time to define a signature:

content/blog/06_spa_agda_cfg/if-cfg.dot (new file, 15 lines)
@@ -0,0 +1,15 @@
digraph G {
    graph[dpi=300 fontsize=14 fontname="Courier New"];
    node[shape=rectangle style="filled" fillcolor="#fafafa" penwidth=0.5 color="#aaaaaa"];
    edge[arrowsize=0.3 color="#444444"]

    node_begin [label="x = ...;\lx\l"]
    node_then [label="x = 1\l"]
    node_else [label="x = 0\l"]
    node_end [label="y = x\l"]

    node_begin -> node_then
    node_begin -> node_else
    node_then -> node_end
    node_else -> node_end
}
BIN content/blog/06_spa_agda_cfg/if-cfg.png (new binary image, 21 KiB)
content/blog/06_spa_agda_cfg/index.md (new file, 377 lines)
@@ -0,0 +1,377 @@
---
title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 6: Control Flow Graphs"
series: "Static Program Analysis in Agda"
description: "In this post, I show an Agda definition of Control Flow Graphs and their construction"
date: 2024-11-27T16:26:42-07:00
tags: ["Agda", "Programming Languages"]
---

In the previous section, I've given a formal definition of the programming
language that I've been trying to analyze. This formal definition serves
as the "ground truth" for how our little imperative programs are executed;
however, program analyses (especially in practice) seldom take the formal
semantics as input. Instead, they focus on more pragmatic program
representations from the world of compilers. One such representation is
_Control Flow Graphs (CFGs)_. That's what I want to discuss in this post.

Let's start by building some informal intuition. CFGs are pretty much what
their name suggests: they are a type of [graph](https://en.wikipedia.org/wiki/Graph_(discrete_mathematics));
their edges show how execution might jump from one piece of code to
another (how control might flow).

For example, take the below program.

```
x = ...;
if x {
    x = 1;
} else {
    x = 0;
}
y = x;
```

The CFG might look like this:

{{< figure src="if-cfg.png" label="CFG for simple `if`-`else` code." class="small" >}}

Here, the initialization of `x` with `...`, as well as the `if` condition (just `x`),
are guaranteed to execute one after another, so they occupy a single node. From there,
depending on the condition, the control flow can jump to one of the
branches of the `if` statement: the "then" branch if the condition is truthy,
and the "else" branch if the condition is falsy. As a result, there are two
arrows coming out of the initial node. Once either branch is executed, control
always jumps to the code right after the `if` statement (the `y = x`). Thus,
both the `x = 1` and `x = 0` nodes have a single arrow to the `y = x` node.

As another example, if you had a loop:

```
x = ...;
while x {
    x = x - 1;
}
y = x;
```

The CFG would look like this:

{{< figure src="while-cfg.png" label="CFG for simple `while` code." class="small" >}}

Here, the condition of the loop (`x`) is not always guaranteed to execute together
with the code that initializes `x`. That's because the condition of the loop
is checked after every iteration, whereas the code before the loop is executed
only once. As a result, `x = ...` and `x` occupy distinct CFG nodes. From there,
the control flow can proceed in two different ways, depending on the value
of `x`. If `x` is truthy, the program will proceed to the loop body (decrementing `x`).
If `x` is falsy, the program will skip the loop body altogether, and go to the
code right after the loop (`y = x`). This is indicated by the two arrows
going out of the `x` node. After executing the body, we return to the condition
of the loop to see if we need to run another iteration. Because of this, the
decrementing node has an arrow back to the loop condition.

Now, let's be a bit more precise. Control Flow Graphs are defined as follows:

* __The nodes__ are [_basic blocks_](https://en.wikipedia.org/wiki/Basic_block).
  Paraphrasing Wikipedia's definition, a basic block is a piece of code that
  has only one entry point and one exit point.

  The one-entry-point rule means that it's not possible to jump into the middle
  of the basic block, executing only half of its instructions. The execution of
  a basic block always begins at the top. Symmetrically, the one-exit-point
  rule means that you can't jump away to other code, skipping some instructions.
  The execution of a basic block always ends at the bottom.

  As a result of these constraints, when running a basic block, you are
  guaranteed to execute every instruction in exactly the order they occur in,
  and execute each instruction exactly once.
* __The edges__ are jumps between basic blocks. We've already seen how
  `if` and `while` statements introduce these jumps.

Basic blocks can only be made of code that doesn't jump (otherwise,
we violate the single-exit-point policy). In the previous post,
we defined exactly this kind of code as [simple statements]({{< relref "05_spa_agda_semantics#introduce-simple-statements" >}}).
So, in our control flow graph, nodes will be sequences of simple statements.
{#list-basic-stmts}

### Control Flow Graphs in Agda

#### Basic Definition
At an abstract level, it's easy to say "it's just a graph where X is Y" about
anything. It's much harder to give a precise definition of such a graph,
particularly if you want to rule out invalid graphs (e.g., ones with edges
pointing nowhere). In Agda, I chose to represent a CFG with two lists: one of nodes,
and one of edges. Each node is simply a list of `BasicStmt`s, as
I described in a preceding paragraph. An edge is simply a pair of numbers,
each number encoding the index of the node connected by the edge.

Here's where it gets a little complicated. I don't want to use plain natural
numbers for indices, because that means you can easily introduce "broken" edges.
For example, what if you have 4 nodes, and you have an edge `(5, 5)`? To avoid
this, I picked the finite natural numbers represented by
[`Fin`](https://agda.github.io/agda-stdlib/v2.0/Data.Fin.Base.html#1154)
as endpoints for edges.

```Agda
data Fin : ℕ → Set where
  zero : Fin (suc n)
  suc  : (i : Fin n) → Fin (suc n)
```

Specifically, `Fin n` is the type of natural numbers less than `n`. Following
this definition, `Fin 3` represents the numbers `0`, `1` and `2`. These are
represented using the same constructors as `Nat`: `zero` and `suc`. The type
of `zero` is `Fin (suc n)` for any `n`; this makes sense because zero is less
than any number plus one. For `suc`, the bound `n` of the input `i` is incremented
by one, leading to another `suc n` in the final type. This makes sense because if
`i < n`, then `i + 1 < n + 1`. I've previously explained this data type
[in another post on this site]({{< relref "01_aoc_coq#aside-vectors-and-finite-mathbbn" >}}).
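
For intuition, here are the three inhabitants of `Fin 3` spelled out, using
the standard-library constructors just described:

```Agda
open import Data.Fin.Base using (Fin; zero; suc)

fin0 : Fin 3   -- the number 0
fin0 = zero

fin1 : Fin 3   -- the number 1
fin1 = suc zero

fin2 : Fin 3   -- the number 2
fin2 = suc (suc zero)
```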

Here's my definition of `Graph`s written using `Fin`:

{{< codelines "Agda" "agda-spa/Language/Graphs.agda" 24 39 >}}

I explicitly used a `size` field, which determines how many nodes are in the
graph, and serves as the upper bound for the edge indices. From there, an
index `Index` into the node list is
{{< sidenote "right" "size-note" "just a natural number less than `size`," >}}
There are <code>size</code> natural numbers less than <code>size</code>:<br>
<code>0, 1, ..., size - 1</code>.
{{< /sidenote >}}
and an edge is just a pair of indices. The graph then contains a vector
(exact-length list) `nodes` of all the basic blocks, and then a list of
edges `edges`.

There are two fields here that I have not yet said anything about: `inputs`
and `outputs`. When we have a complete CFG for our programs, these fields are
totally unnecessary. However, as we are _building_ the CFG, they will come
in handy by telling us how to stitch together smaller sub-graphs that we've
already built. Let's talk about that next.
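
Since the record itself lives in the repository, here is a sketch of what
those lines plausibly declare, with the fields the prose names (`size`,
`nodes`, `edges`, `inputs`, `outputs`); the exact layout is my assumption,
and imports are elided.

```Agda
-- A sketch, not the repository's exact code.
record Graph : Set where
  field
    size : ℕ

  Index : Set
  Index = Fin size            -- a valid node index

  Edge : Set
  Edge = Index × Index        -- an edge between two existing nodes

  field
    nodes   : Vec (List BasicStmt) size
    edges   : List Edge
    inputs  : List Index      -- where control can enter this (sub-)graph
    outputs : List Index      -- where control ends up when it's done
```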
#### Combining Graphs
|
||||||
|
Suppose you're building a CFG for a program in the following form:
|
||||||
|
|
||||||
|
```
|
||||||
|
code1;
|
||||||
|
code2;
|
||||||
|
```
|
||||||
|
|
||||||
|
Where `code1` and `code2` are arbitrary pieces of code, which could include
|
||||||
|
statements, loops, and pretty much anything else. Besides the fact that they
|
||||||
|
occur one after another, these pieces of code are unrelated, and we can
|
||||||
|
build CFGs for each one them independently. However, the fact that `code1` and
|
||||||
|
`code2` are in sequence means that the full control flow graph for the above
|
||||||
|
program should have edges going from the nodes in `code1` to the nodes in `code2`.
|
||||||
|
Of course, not _every_ node in `code1` should have such edges: that would
|
||||||
|
mean that after executing any "basic" sequence of instructions, you could suddenly
|
||||||
|
decide to skip the rest of `code1` and move on to executing `code2`.
|
||||||
|
|
||||||
|
Thus, we need to be more precise about what edges we need to insert; we want
|
||||||
|
to insert edges between the "final" nodes in `code1` (where control ends up
|
||||||
|
after `code1` is finished executing) and the "initial" nodes in `code2` (where
|
||||||
|
control would begin once we started executing `code2`). Those are the `outputs`
|
||||||
|
and `inputs`, respectively. When stitching together sequenced control graphs,
|
||||||
|
we will connect each of the outputs of one to each of the inputs of the other.
|
||||||
|
|
||||||
|
This is defined by the operation `g₁ ↦ g₂`, which sequences two graphs `g₁` and `g₂`:
|
||||||
|
|
||||||
|
{{< codelines "Agda" "agda-spa/Language/Graphs.agda" 72 83 >}}
|
||||||
|
|
||||||
|
The definition starts out pretty innocuous, but gets a bit complicated by the
|
||||||
|
end. The sum of the numbers of nodes in the two operands becomes the new graph
|
||||||
|
size, and the nodes from the two graphs are all included in the result. Then,
|
||||||
|
the definitions start making use of various operators like `↑ˡᵉ` and `↑ʳᵉ`;
|
||||||
|
these deserve an explanation.
|
||||||
|
|
||||||
|
The tricky thing is that when we're concatenating lists of nodes, we are changing
|
||||||
|
some of the indices of the elements within. For instance, in the lists
|
||||||
|
`[x]` and `[y]`, the indices of both `x` and `y` are `0`; however, in the
|
||||||
|
concatenated list `[x, y]`, the index of `x` is still `0`, but the index of `y`
|
||||||
|
is `1`. More generally, when we concatenate two lists `l1` and `l2`, the indices
|
||||||
|
into `l1` remain unchanged, whereas the indices `l2` are shifted by `length l1`.
|
||||||
|
{#fin-reindexing}
|
||||||
|
|
||||||
|
Actually, that's not all there is to it. The _values_ of the indices into
|
||||||
|
the left list don't change, but their types do! They start as `Fin (length l1)`,
|
||||||
|
but for the whole list, these same indices will have type `Fin (length l1 + length l2))`.
|
||||||
|
|
||||||
|
To help deal with this, Agda provides the operators
|
||||||
|
[`↑ˡ`](https://agda.github.io/agda-stdlib/v2.0/Data.Fin.Base.html#2355)
|
||||||
|
and [`↑ʳ`](https://agda.github.io/agda-stdlib/v2.0/Data.Fin.Base.html#2522)
|
||||||
|
that implement this re-indexing and re-typing. The former implements "re-indexing
|
||||||
|
on the left" -- given an index into the left list `l1`, it changes its type
|
||||||
|
by adding the other list's length to it, but keeps the index value itself
|
||||||
|
unchanged. The latter implements "re-indexing on the right" -- given an index
|
||||||
|
into the right list `l2`, it adds the length of the first list to it (shifting it),
|
||||||
|
and does the same to its type.
|
||||||
|
|
||||||
|
The definition leads to the following equations:
|
||||||
|
|
||||||
|
```Agda
|
||||||
|
l1 : Vec A n
|
||||||
|
l2 : Vec A m
|
||||||
|
|
||||||
|
idx1 : Fin n -- index into l1
|
||||||
|
idx2 : Fin m -- index into l2
|
||||||
|
|
||||||
|
l1 [ idx1 ] ≡ (l1 ++ l2) [ idx1 ↑ˡ m ]
|
||||||
|
l2 [ idx2 ] ≡ (l1 ++ l2) [ n ↑ʳ idx2 ]
|
||||||
|
```
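
For a concrete (hypothetical) instance with `n = 2` and `m = 3`, both operators land in `Fin 5`:

```Agda
open import Data.Fin using (Fin; zero; suc; _↑ˡ_; _↑ʳ_)

idx1 : Fin 2
idx1 = suc zero      -- the index 1 into the left list

idx2 : Fin 3
idx2 = zero          -- the index 0 into the right list

-- Re-typed from Fin 2 to Fin (2 + 3); the value is still 1.
left : Fin 5
left = idx1 ↑ˡ 3

-- Shifted by the left list's length; the value becomes 2.
right : Fin 5
right = 2 ↑ʳ idx2
```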

The operators used in the definition above are just versions of the same
re-indexing operators. The `↑ˡᵉ` operator applies `↑ˡ` to all the (__e__)dges
in a graph, and the `↑ˡⁱ` operator applies it to all the (__i__)ndices in a list
(like `inputs` and `outputs`).

Given these definitions, hopefully the intent of the rest of the definition
is not too hard to see. The edges in the new graph come from three places:
the graphs `g₁` and `g₂`, and from creating a new edge from each of the outputs
of `g₁` to each of the inputs of `g₂`. We keep the inputs of `g₁` as the
inputs of the whole graph (since `g₁` comes first), and symmetrically we keep
the outputs of `g₂`. Of course, we do have to re-index them to keep them
pointing at the right nodes.

Another operation we will need is "overlaying" two graphs: this will be like
placing them in parallel, without adding jumps between the two. We use this
operation when combining the sub-CFGs of the "if" and "else" branches of an
`if`/`else`, which both follow the condition, and both proceed to the code after
the conditional.

{{< codelines "Agda" "agda-spa/Language/Graphs.agda" 59 70 >}}

Everything here is just concatenation; we pool together the nodes, edges,
inputs, and outputs, and the main source of complexity is the re-indexing.

The last operation, which we will use for `while` loops, is looping. This
operation simply connects the outputs of a graph back to its inputs (allowing
looping), and also allows the body to be skipped. This is slightly different
from the graph for `while` loops I showed above; the reason for that is that
I currently don't include the conditional expressions in my CFG. This is a
limitation that I will address in future work.

{{< codelines "Agda" "agda-spa/Language/Graphs.agda" 85 95 >}}

Given these three operations, I construct Control Flow Graphs as follows, where
`singleton` creates a new CFG node with the given list of simple statements:

{{< codelines "Agda" "agda-spa/Language/Graphs.agda" 122 126 >}}

Throughout this, I've been liberal about including empty CFG nodes where convenient.
This is a departure from the formal definition I gave above, but it makes
things much simpler.

### Additional Functions

To integrate Control Flow Graphs into our lattice-based program analyses, we'll
need to do a couple of things. First, upon reading the
[reference _Static Program Analysis_ text](https://cs.au.dk/~amoeller/spa/),
one sees a lot of quantification over the predecessors or successors of a
given CFG node. For example, the following equation is from Chapter 5:

{{< latex >}}
\textit{JOIN}(v) = \bigsqcup_{w \in \textit{pred}(v)} \llbracket w \rrbracket
{{< /latex >}}

To compute the \(\textit{JOIN}\) function (which we have not covered yet) for
a given CFG node, we need to iterate over all of its predecessors, and
combine their static information using \(\sqcup\), which I first
[explained several posts ago]({{< relref "01_spa_agda_lattices#least-upper-bound" >}}).
To be able to iterate over them, we need to be able to retrieve the predecessors
of a node from a graph!

Our encoding does not make computing the predecessors particularly easy; to
check if two nodes are connected, we need to check if an `Index`-`Index` pair
corresponding to the nodes is present in the `edges` list. To this end, we need
to be able to compare edges for equality. Fortunately, it's relatively
straightforward to show that our edges can be compared in such a way;
after all, they are just pairs of `Fin`s, and `Fin`s and products support
these comparisons.

{{< codelines "Agda" "agda-spa/Language/Graphs.agda" 149 152 >}}

Next, if we can compare edges for equality, we can check if an edge is in
a list. Agda provides a built-in function for this:

{{< codelines "Agda" "agda-spa/Language/Graphs.agda" 154 154 >}}

To find the predecessors of a particular node, we go through all other nodes
in the graph and see if there's an edge between those nodes and the
current one. This is preferable to simply iterating over the edges because
we may have duplicates in that list (why not?).

{{< codelines "Agda" "agda-spa/Language/Graphs.agda" 165 166 >}}

Above, `indices` is a list of all the node identifiers in the graph. Since the
graph has `size` nodes, the indices of all these nodes are simply the values
`0`, `1`, ..., `size - 1`. I defined a special function `finValues` to compute
this list, together with a proof that this list is unique.

{{< codelines "Agda" "agda-spa/Language/Graphs.agda" 127 143 >}}

Another important property of `finValues` is that each node identifier is
present in the list, so that computations written by traversing the node
list do not "miss" nodes.

{{< codelines "Agda" "agda-spa/Language/Graphs.agda" 145 147 >}}

We can specialize these definitions for a particular graph `g`:

{{< codelines "Agda" "agda-spa/Language/Graphs.agda" 156 163 >}}
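
To get a feel for the shape of these functions, here is a hedged, standalone miniature of `predecessors` over a hard-coded three-node edge list, using only the standard library (all the names here are mine, and the real definitions in `Language/Graphs.agda` differ in their details):

```Agda
open import Data.Fin using (Fin; zero; suc)
open import Data.Fin.Properties using (_≟_)
open import Data.List using (List; _∷_; []; filter; tabulate)
open import Data.Product using (_×_; _,_)
open import Data.Product.Properties using (≡-dec)
open import Relation.Binary using (DecidableEquality)
import Data.List.Membership.DecPropositional as DecMembership

Index : Set
Index = Fin 3

-- Edges can be compared for equality: they are just pairs of Fins.
_≟ᵉ_ : DecidableEquality (Index × Index)
_≟ᵉ_ = ≡-dec _≟_ _≟_

open DecMembership _≟ᵉ_ using (_∈?_)

-- Edges 0 -> 1 and 1 -> 2.
edges : List (Index × Index)
edges = (zero , suc zero) ∷ (suc zero , suc (suc zero)) ∷ []

-- All node identifiers, 0 ∷ 1 ∷ 2 ∷ [], playing the role of finValues.
indices : List Index
indices = tabulate (λ i → i)

-- Keep exactly those nodes that have an edge into idx.
predecessors : Index → List Index
predecessors idx = filter (λ idx′ → (idx′ , idx) ∈? edges) indices
```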

To recap, we now have:
* A way to build control flow graphs from programs
* A list (unique'd and complete) of all nodes in the control flow graph so that
  we can iterate over them when the algorithm demands.
* A 'predecessors' function, which will be used by our static program analyses,
  implemented as an iteration over the list of nodes.

All that's left is to connect our `predecessors` function to edges in the graph.
The following definitions say that when an edge is in the graph, the starting
node is listed as a predecessor of the ending node, and vice versa.

{{< codelines "Agda" "agda-spa/Language/Graphs.agda" 168 177 >}}

### Connecting Two Distinct Representations

I've described Control Flow Graphs as a compiler-centric representation of the
program. Unlike the formal semantics from the previous post, CFGs do not reason
about the dynamic behavior of the code. Instead, they capture the possible
paths that execution can take through the instructions. In that
sense, they are more of an approximation of what the program will do. This is
good: because of [Rice's theorem](https://en.wikipedia.org/wiki/Rice%27s_theorem),
we can't do anything other than approximating without running the program.

However, an incorrect approximation is of no use at all. Since the CFGs we build
will be the core data type used by our program analyses, it's important that they
are an accurate, if incomplete, representation. Specifically, because most
of our analyses reason about possible outcomes --- we report what sign each
variable __could__ have, for instance --- it's important that we don't accidentally
omit cases that can happen in practice from our CFGs. Formally, this means
that for each possible execution of a program according to its semantics,
{{< sidenote "right" "converse-note" "there exists a corresponding path through the graph." >}}
The converse is desirable too: that the graph has only paths that correspond
to possible executions of the program. One graph that violates this property is
the strongly-connected graph of all basic blocks in a program. Analyzing
such a graph would give us an overly-conservative estimation; since anything
can happen, most of our answers will likely be too general to be of any use. If,
on the other hand, only the necessary graph connections exist, we can be more
precise.<br>
<br>
However, proving this converse property (or even stating it precisely) is much
harder, because our graphs are somewhat conservative already. There exist
programs in which the condition of an <code>if</code>-statement always
evaluates to <code>false</code>, but our graphs always have edges for both
the "then" and "else" cases. Determining whether a condition is always false
(e.g.) is undecidable thanks to Rice's theorem (again), so we can't rule it out.
Instead, we could broaden "all possible executions"
to "all possible executions where branching conditions can produce arbitrary
results", but this is something else entirely.<br>
<br>
For the time being, I will leave this converse property aside. As a result,
our approximations might be "too careful". However, they will at the very least
be sound.
{{< /sidenote >}}

In the next post, I will prove that this property holds for the graphs shown
here and the formal semantics I defined earlier. I hope to see you there!

content/blog/06_spa_agda_cfg/while-cfg.dot (new file, 15 lines)
@@ -0,0 +1,15 @@
digraph G {
    graph[dpi=300 fontsize=14 fontname="Courier New"];
    node[shape=rectangle style="filled" fillcolor="#fafafa" penwidth=0.5 color="#aaaaaa"];
    edge[arrowsize=0.3 color="#444444"]

    node_begin [label="x = ...;\l"]
    node_cond [label="x\l"]
    node_body [label="x = x - 1\l"]
    node_end [label="y = x\l"]

    node_begin -> node_cond
    node_cond -> node_body
    node_cond -> node_end
    node_body -> node_cond
}

content/blog/06_spa_agda_cfg/while-cfg.png (new binary file; size 14 KiB)

@@ -1,7 +1,8 @@
 ---
 title: Compiling a Functional Language Using C++, Part 7 - Runtime
 date: 2019-08-06T14:26:38-07:00
-tags: ["C and C++", "Functional Languages", "Compilers"]
+tags: ["C", "C++", "Functional Languages", "Compilers"]
+series: "Compiling a Functional Language using C++"
 description: "In this post, we implement the supporting code that will be shared between all executables our compiler will create."
 ---
 Wikipedia has the following definition for a __runtime__:

content/blog/07_spa_agda_semantics_and_cfg/index.md (new file, 376 lines)
@@ -0,0 +1,376 @@
---
title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 7: Connecting Semantics and Control Flow Graphs"
series: "Static Program Analysis in Agda"
description: "In this post, I prove that the Control Flow Graphs from the previous sections are sound according to our language's semantics"
date: 2024-11-28T20:32:00-07:00
tags: ["Agda", "Programming Languages"]
---

In the previous two posts, I covered two ways of looking at programs in
my little toy language:

* [In part 5]({{< relref "05_spa_agda_semantics" >}}), I covered the
  __formal semantics__ of the programming language. These are precise rules
  that describe how programs are executed. These serve as the source of truth
  for what each statement and expression does.

  Because they are the source of truth, they capture all information about
  how programs are executed. To determine that a program starts in one
  environment and ends in another (getting a judgement \(\rho_1, s \Rightarrow \rho_2\)),
  we need to actually run the program. In fact, our Agda definitions
  encoding the semantics actually produce proof trees, which contain every
  single step of the program's execution.
* [In part 6]({{< relref "06_spa_agda_cfg" >}}), I covered
  __Control Flow Graphs__ (CFGs), which in short arranged code into a structure
  that represents how execution moves from one statement or expression to the
  next.

  Unlike the semantics, CFGs do not capture a program's entire execution;
  they merely contain the possible orders in which statements can be evaluated.
  Instead of capturing the exact number of iterations performed by a `while`
  loop, they encode repetition as cycles in the graph. Because they are
  missing some information, they're more of an approximation of a program's
  behavior.

Our analyses operate on CFGs, but it is our semantics that actually determine
how a program behaves. In order for our analyses to be able to produce correct
results, we need to make sure that there isn't a disconnect between the
approximation and the truth. In the previous post, I stated the property I will
use to establish the connection between the two perspectives:

> For each possible execution of a program according to its semantics,
> there exists a corresponding path through the graph.

By ensuring this property, we will guarantee that our Control Flow Graphs
account for anything that might happen. Thus, a correct analysis built on top
of the graphs will produce results that match reality.

### Traces: Paths Through a Graph

A CFG contains each "basic" statement in our program, by definition; when we're
executing the program, we are therefore running code in one of the CFG's nodes.
When we switch from one node to another, there ought to be an edge between the
two, since edges in the CFG encode possible control flow. We keep doing this
until the program terminates (if ever).

Now, I said that there "ought to be edges" in the graph that correspond to
our program's execution. Moreover, the endpoints of these edges have to line
up, since we can only switch which basic block / node we're executing by following
an edge. As a result, if our CFG is correct, then for every program execution,
there is a path between the CFG's nodes that matches the statements that we
were executing.

Take the following program and CFG from the previous post as an example.

{{< sidebyside >}}
{{% sidebysideitem weight="0.55" %}}
```
x = 2;
while x {
  x = x - 1;
}
y = x;
```
{{% /sidebysideitem %}}
{{< sidebysideitem weight="0.5" >}}
{{< figure src="while-cfg.png" label="CFG for simple `while` code." class="small" >}}
{{< /sidebysideitem >}}
{{< /sidebyside >}}

We start by executing `x = 2`, which is the top node in the CFG. Then, we execute
the condition of the loop, `x`. This condition is in the second node from
the top; fortunately, there exists an edge between `x = 2` and `x` that
allows for this possibility. Once we've computed `x`, we know that it's nonzero,
and therefore we proceed to the loop body. This is the statement `x = x - 1`,
contained in the bottom left node in the CFG. There is once again an edge
between `x` and that node; so far, so good. Once we're done executing the
statement, we go back to the top of the loop again, following the edge back to
the middle node. We then execute the condition, loop body, and condition again.
At that point we have reduced `x` to zero, so the condition produces a falsey
value. We exit the loop and execute `y = x`, which is allowed by the edge from
the middle node to the bottom right node.

We will want to show that every possible execution of the program (e.g.,
with different variable assignments) corresponds to a path in the CFG. If one
doesn't, then our program can do something that our CFG doesn't account for,
which means that our analyses will not be correct.

I will define a `Trace` datatype, which will be an embellished
path through the graph. At its core, a path is simply a list of indices
together with edges that connect them. Viewed another way, it's a list of edges,
where each edge's endpoint is the next edge's starting point. We want to make
illegal states unrepresentable, and therefore use the type system to assert
that the edges are compatible. The easiest way to do this is by making
our `Trace` indexed by its start and end points. An empty trace, containing
no edges, will start and end in the same node; the `::` equivalent for the trace
will allow prepending one edge, starting at node `i1` and ending in `i2`, to
another trace which starts in `i2` and ends in some arbitrary `i3`. Here's
an initial stab at that:

```Agda
module _ {g : Graph} where
  open Graph g using (Index; edges; inputs; outputs)

  data Trace : Index → Index → Set where
    Trace-single : ∀ {idx : Index} → Trace idx idx
    Trace-edge : ∀ {idx₁ idx₂ idx₃ : Index} →
      (idx₁ , idx₂) ∈ edges →
      Trace idx₂ idx₃ → Trace idx₁ idx₃
```
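
Before refining this type, here's a hedged, standalone miniature of the idea (a hard-coded edge list instead of the `Graph` record; the names are mine), showing a two-edge trace from node `0` to node `2`:

```Agda
open import Data.Fin using (Fin; zero; suc)
open import Data.List using (List; _∷_; [])
open import Data.Product using (_×_; _,_)
open import Data.List.Membership.Propositional using (_∈_)
open import Data.List.Relation.Unary.Any using (here; there)
open import Relation.Binary.PropositionalEquality using (refl)

Index : Set
Index = Fin 3

-- Edges 0 -> 1 and 1 -> 2.
edges : List (Index × Index)
edges = (zero , suc zero) ∷ (suc zero , suc (suc zero)) ∷ []

data Trace : Index → Index → Set where
  Trace-single : ∀ {idx} → Trace idx idx
  Trace-edge   : ∀ {idx₁ idx₂ idx₃} →
    (idx₁ , idx₂) ∈ edges →
    Trace idx₂ idx₃ → Trace idx₁ idx₃

-- Follow 0 -> 1 (the first edge), then 1 -> 2 (the second edge).
trace02 : Trace zero (suc (suc zero))
trace02 = Trace-edge (here refl) (Trace-edge (there (here refl)) Trace-single)
```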

This isn't enough, though. Suppose you had a function that takes an evaluation
judgement and produces a trace, resulting in a signature like this:

```Agda
buildCfg-sufficient : ∀ {s : Stmt} {ρ₁ ρ₂ : Env} → ρ₁ , s ⇒ˢ ρ₂ →
  let g = buildCfg s
   in Σ (Index g × Index g) (λ (idx₁ , idx₂) → Trace {g} idx₁ idx₂)
```

What's stopping this function from returning _any_ trace through the graph,
including one that doesn't even include the statements in our program `s`?
We need to narrow the type somewhat to require that the nodes it visits have
some relation to the program execution in question.

We could do this by indexing the `Trace` data type by a list of statements
that we expect it to match, and requiring that for each constructor, the
statements of the starting node be at the front of that list. We could compute
the list of executed statements in order using
{{< sidenote "right" "full-execution-note" "a recursive function on the `_,_⇒ˢ_` data type." >}}
I mentioned earlier that our encoding of the semantics is actually defining
a proof tree, which includes every step of the computation. That's why we can
write a function that takes the proof tree and extracts the executed statements.
{{< /sidenote >}}

That would work, but it loses a bit of information. The execution judgement
contains not only each statement that was evaluated, but also the environments
before and after evaluating it. Keeping those around will be useful: eventually,
we'd like to state the invariant that at every CFG node, the results of our
analysis match the current program environment. Thus, instead of indexing simply
by the statements of code, I chose to index my `Trace` by the
starting and ending environment, and to require it to contain evaluation judgements
for each node's code. The judgements include the statements that were evaluated,
which we can match against the code in the CFG node. However, they also assert
that the environments before and after are connected by that code in the
language's formal semantics. The resulting definition is as follows:

{{< codelines "Agda" "agda-spa/Language/Traces.agda" 10 18 >}}

The `g [ idx ]` and `g [ idx₁ ]` represent accessing the basic block code at
indices `idx` and `idx₁` in graph `g`.

### Trace Preservation by Graph Operations

Our proofs of trace existence will have the same "shape" as the functions that
build the graph. To prove the trace property, we'll assume that evaluations of
sub-statements correspond to traces in the sub-graphs, and use that to prove
that the full statements have corresponding traces in the full graph. We built
up graphs by combining sub-graphs for sub-statements, using `_∙_` (overlaying
two graphs), `_↦_` (sequencing two graphs) and `loop` (creating a zero-or-more
loop in the graph). Thus, to make the jump from sub-graphs to full graphs,
we'll need to prove that traces persist through overlaying, sequencing,
and looping.

Take `_∙_`, for instance; we want to show that if a trace exists in the left
operand of overlaying, it also exists in the final graph. This leads to
the following statement and proof:

{{< codelines "Agda" "agda-spa/Language/Properties.agda" 88 97 >}}

There are some details there to discuss.
* First, we have to change the
  indices of the returned `Trace`. That's because they start out as indices
  into the graph `g₁`, but become indices into the graph `g₁ ∙ g₂`. To take
  care of this re-indexing, we have to make use of the `↑ˡ` operators,
  which I described in [this section of the previous post]({{< relref "06_spa_agda_cfg#fin-reindexing" >}}).
* Next, in either case, we need to show that the new index acquired via `↑ˡ`
  returns the same basic block in the new graph as the old index returned in
  the original graph. Fortunately, the Agda standard library provides a proof
  of this, `lookup-++ˡ`. The resulting equality is the following:

  ```Agda
  g₁ [ idx₁ ] ≡ (g₁ ∙ g₂) [ idx₁ ↑ˡ Graph.size g₂ ]
  ```

  This allows us to use the evaluation judgement in each constructor for
  traces in the output of the function.
* Lastly, in the `Trace-edge` case, we have to additionally return a proof that the
  edge used by the trace still exists in the output graph. This follows
  from the fact that we include the edges from `g₁` after re-indexing them.

  ```Agda
  ; edges = (Graph.edges g₁ ↑ˡᵉ Graph.size g₂) List.++
            (Graph.size g₁ ↑ʳᵉ Graph.edges g₂)
  ```

  The `↑ˡᵉ` function is just a list `map` with `↑ˡ`. Thus, if a pair of edges
  is in the original list (`Graph.edges g₁`), as is evidenced by `idx₁→idx`,
  then its re-indexing is in the mapped list. To show this, I use the utility
  lemma `x∈xs⇒fx∈fxs`. The mapped list is the left-hand-side of a `List.++`
  operator, so I additionally use the lemma `∈-++⁺ˡ` that shows membership is
  preserved by list concatenation.

The proof of `Trace-∙ʳ`, the same property but for the right-hand operand `g₂`,
is very similar, as are the proofs for sequencing. I give their statements,
but not their proofs, below.

{{< codelines "Agda" "agda-spa/Language/Properties.agda" 99 101 >}}
{{< codelines "Agda" "agda-spa/Language/Properties.agda" 139 141 >}}
{{< codelines "Agda" "agda-spa/Language/Properties.agda" 150 152 >}}
{{< codelines "Agda" "agda-spa/Language/Properties.agda" 175 176 >}}

Preserving traces is unfortunately not quite enough. The thing that we're missing
is looping: the same sub-graph can be re-traversed several times as part of
execution, which suggests that we ought to be able to combine multiple traces
through a loop graph into one. Using our earlier concrete example, we might
have traces for evaluating `x` then `x = x - 1`, with the variable `x` being
mapped first to `2` and then to `1`. These traces occur back-to-back, so we
will put them together into a single trace. To prove some properties about this,
I'll define a more precise type of trace.
### End-To-End Traces

The key way that traces through a loop graph are combined is through the
back-edges. Specifically, our `loop` graphs have edges from each of the `output`
nodes to each of the `input` nodes. Thus, if we have two paths, both
starting at the beginning of the graph and ending at the end, we know that
the first path's end has an edge to the second path's beginning. This is
enough to combine them.

This logic doesn't work if one of the paths ends in the middle of the graph,
and not on one of the `output`s. That's because there is no guarantee that there
is a connecting edge.

To make things easier, I defined a new data type of "end-to-end" traces, whose
first nodes are one of the graph's `input`s, and whose last nodes are one
of the graph's `output`s.

{{< codelines "Agda" "agda-spa/Language/Traces.agda" 27 36 >}}

We can trivially lift the proofs from the previous section to end-to-end traces.
For example, here's the lifted version of the first property we proved:

{{< codelines "Agda" "agda-spa/Language/Properties.agda" 110 121 >}}

The other lifted properties are similar.

For looping, the proofs get far more tedious, because of just how many
sources of edges there are in the output graph --- they span four lines:

{{< codelines "Agda" "agda-spa/Language/Graphs.agda" 84 94 "hl_lines=5-8" >}}

I therefore made use of two helper lemmas. The first is about list membership
under concatenation. Simply put, if you concatenate a bunch of lists, and
one of them (`l`) contains some element `x`, then the concatenation contains
`x` too.

{{< codelines "Agda" "agda-spa/Utils.agda" 82 85 >}}

I then specialized this lemma for concatenated groups of edges.

{{< codelines "Agda" "agda-spa/Language/Properties.agda" 162 172 "hl_lines=9-11" >}}

Now we can finally prove end-to-end properties of loop graphs. The simplest one is
that they allow the code within them to be entirely bypassed
(as when the loop body is evaluated zero times). I called this
`EndToEndTrace-loop⁰`. The "input" node of the loop graph is index `zero`,
while the "output" node of the loop graph is index `suc zero`. Thus, the key
step is to show that an edge between these two indices exists:

{{< codelines "Agda" "agda-spa/Language/Properties.agda" 227 240 "hl_lines=5-6" >}}

The only remaining novelty is the `trace` field of the returned `EndToEndTrace`.
It uses the trace concatenation operation `++⟨_⟩`. This operator allows concatenating
two traces, which start and end at distinct nodes, as long as there's an edge
that connects them:

{{< codelines "Agda" "agda-spa/Language/Traces.agda" 21 25 >}}

The expression on line 239 of `Properties.agda` is simply the single-edge trace
constructed from the edge `0 -> 1` that connects the start and end nodes of the
loop graph. Both of those nodes are empty, so no code is evaluated in that case.

The proof for combining several traces through a loop follows a very similar
pattern. However, instead of constructing a single-edge trace as we did above,
it concatenates two traces from its arguments. Also, instead of using
the edge from the first node to the last, it instead uses an edge from the
last to the first, as I described at the very beginning of this section.

{{< codelines "Agda" "agda-spa/Language/Properties.agda" 209 225 "hl_lines=8-9" >}}

### Proof of Sufficiency

We now have all the pieces to show that each execution of our program has a corresponding
trace through a graph. Here is the whole proof:

{{< codelines "Agda" "agda-spa/Language/Properties.agda" 281 296 >}}

We proceed by
{{< sidenote "right" "derivation-note" "checking what inference rule was used to execute a particular statement," >}}
Precisely, we proceed by induction on the derivation of \(\rho_1, s \Rightarrow \rho_2\).
{{< /sidenote >}}
because that's what tells us what the program did at that particular moment.

* When executing a basic statement, we know that we constructed a singleton
  graph that contains one node with that statement. Thus, we can trivially
  construct a single-step trace without any edges.
* When executing a sequence of statements, we have two induction hypotheses.
  These state that the sub-graphs we construct for the first and second statement
  have the trace property. We also have two evaluation judgements (one for each
  statement), which means that we can apply that property to get traces. The
  `buildCfg` function sequences the two graphs, and we can sequence
  the two traces through them, resulting in a trace through the final output.
* For both the `then` and `else` cases of evaluating an `if` statement,
  we observe that `buildCfg` overlays the sub-graphs of the two branches using
  `_∙_`. We also know that the two sub-graphs have the trace property.
  * In the `then` case, since we have an evaluation judgement for
    `s₁` (in variable `ρ₁,s₁⇒ρ₂`), we conclude that there's a correct trace
    through the `then` sub-graph. Since that graph is the left operand of
    `_∙_`, we use `EndToEndTrace-∙ˡ` to show that the trace is preserved in the full graph.
  * In the `else` case things are symmetric. We are evaluating `s₂`, with
    a judgement given by `ρ₁,s₂⇒ρ₂`. We use that to conclude that there's a
    trace through the graph built from `s₂`. Since this sub-graph is the right
    operand of `_∙_`, we use `EndToEndTrace-∙ʳ` to show that it's preserved in
    the full graph.
* For the `true` case of `while`, we have two evaluation judgements: one
  for the body and one for the loop again, this time
  in a new environment. They are stored in `ρ₁,s⇒ρ₂` and `ρ₂,ws⇒ρ₃`, respectively.
  The statement being evaluated by `ρ₂,ws⇒ρ₃` is actually the exact same statement
  that's being evaluated at the top level of the proof. Thus, we can use
  `EndToEndTrace-loop²`, which sequences two traces through the same graph.

  We also use `EndToEndTrace-loop` to lift the trace through `buildCfg s` into
  a trace through `buildCfg (while e s)`.
* For the `false` case of the `while`, we don't execute any instructions,
  and finish evaluating right away. This corresponds to the do-nothing trace,
  which we have established exists using `EndToEndTrace-loop⁰`.

That's it! We have now validated that the Control Flow Graphs we construct
match the semantics of the programming language, which makes them a good
input to our static program analyses. We can finally start writing those!

### Defining and Verifying Static Program Analyses

We have all the pieces we need to define a formally-verified forward analysis:

* We have used the framework of lattices to encode the precision of program
  analysis outputs. Smaller elements in a lattice are more specific,
  meaning more useful information.
* We have [implemented the fixed-point algorithm]({{< relref "04_spa_agda_fixedpoint" >}}),
  which finds the smallest solutions to equations in the form \(f(x) = x\)
  for monotonic functions over lattices. By defining our analysis as such a function,
  we can apply the algorithm to find the most precise steady-state description
  of our program.
* We have defined how our programs are executed, which is crucial for defining
  "correctness".

Here's how these pieces will fit together. We will construct a
finite-height lattice. Every single element of this lattice will contain
information about each variable at each node in the Control Flow Graph. We will
then define a monotonic function that updates this information using the
structure encoded in the CFG's edges and nodes. Then, using the fixed-point algorithm,
we will find the least element of the lattice, which will give us a precise
description of all program variables at all points in the program. Because
we have just validated our CFGs to be faithful to the language's semantics,
we'll be able to prove that our algorithm produces accurate results.

The next post or two will be the last stretch; I hope to see you there!

content/blog/07_spa_agda_semantics_and_cfg/while-cfg.dot (new file, 15 lines)
@@ -0,0 +1,15 @@
digraph G {
    graph[dpi=300 fontsize=14 fontname="Courier New"];
    node[shape=rectangle style="filled" fillcolor="#fafafa" penwidth=0.5 color="#aaaaaa"];
    edge[arrowsize=0.3 color="#444444"]

    node_begin [label="x = 2;\l"]
    node_cond [label="x\l"]
    node_body [label="x = x - 1\l"]
    node_end [label="y = x\l"]

    node_begin -> node_cond
    node_cond -> node_body
    node_cond -> node_end
    node_body -> node_cond
}

content/blog/07_spa_agda_semantics_and_cfg/while-cfg.png (new binary file; size 14 KiB)

@@ -1,7 +1,8 @@
 ---
 title: Compiling a Functional Language Using C++, Part 8 - LLVM
 date: 2019-10-30T22:16:22-07:00
-tags: ["C and C++", "Functional Languages", "Compilers"]
+tags: ["C++", "Functional Languages", "Compilers"]
+series: "Compiling a Functional Language using C++"
 description: "In this post, we enable our compiler to convert G-machine instructions to LLVM IR, which finally allows us to generate working executables."
 ---

content/blog/08_spa_agda_forward/index.md (new file, 445 lines)
@@ -0,0 +1,445 @@
---
title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 8: Forward Analysis"
series: "Static Program Analysis in Agda"
description: "In this post, I use the monotone lattice framework and verified CFGs to define a sign analysis"
date: 2024-12-01T15:09:07-08:00
tags: ["Agda", "Programming Languages"]
---

In the previous post, I showed that the Control Flow Graphs we build of our
programs match how they are really executed. This means that we can rely
on these graphs to compute program information. In this post, we finally
get to compute that information. Here's a quick bit of paraphrasing from last time
that provides a summary of our approach:

1. We will construct a finite-height lattice. Every single element of this
   lattice will contain information about each variable at each node in the
   Control Flow Graph.
2. We will then define a monotonic function that updates this information using
   the structure encoded in the CFG's edges and nodes.
3. Then, using the fixed-point algorithm, we will find the least element of the
   lattice, which will give us a precise description of all program variables at
   all points in the program.
4. Because we have just validated our CFGs to be faithful to the language's
   semantics, we'll be able to prove that our algorithm produces accurate results.

Let's jump right into it!

### Choosing a Lattice

For a lot of this series, we have been [talking about lattices]({{< relref "01_spa_agda_lattices" >}}),
particularly [lattices of finite height]({{< relref "03_spa_agda_fixed_height" >}}).
These structures represent things we know about the program, and provide operators
like \((\sqcup)\) and \((\sqcap)\) that help us combine such knowledge.

The forward analysis code I present here will work with any finite-height
lattice, with the additional constraint that equivalence of lattice elements
is decidable. This constraint comes from [the implementation of the fixed-point algorithm]({{< relref "04_spa_agda_fixedpoint" >}}),
in which we routinely check if a function's output is the same as its input.

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 4 8 >}}

The finite-height lattice `L` is intended to describe the state of a single
variable.
One example of a lattice that can be used as `L` is our
sign lattice. We've been using the sign lattice in our examples [from the very beginning]({{< relref "01_spa_agda_lattices#lattices" >}}),
and we will stick with it for the purposes of this explanation. However, this
lattice alone does not describe our program, since it only talks about a single
sign; programs have lots of variables, all of which can have different signs!
So, we might go one step further and define a map lattice from variables to
their signs:

{{< latex >}}
\text{Variable} \to \text{Sign}
{{< /latex >}}

We [have seen]({{< relref "02_spa_agda_combining_lattices#the-map-lattice" >}})
that we can turn any lattice \(L\) into a map lattice \(A \to L\), for any
type of keys \(A\). In this case, we will define \(A \triangleq \text{Variable}\),
and \(L \triangleq \text{Sign}\). The
[sign lattice has a finite height]({{< relref "02_spa_agda_combining_lattices#the-map-lattice" >}}),
and I've proven that, as long as we pick a finite set of keys, [map lattices
\(A \to L\) have a finite height if \(L\) has a finite height]({{< relref "03_spa_agda_fixed_height#fixed-height-of-the-map-lattice" >}}).
Since a program's text is finite, \(\text{Variable}\) is a finite set, and
we have ourselves a finite-height lattice \(\text{Variable} \to \text{Sign}\).

We're on the right track, but even the lattice we have so far is not sufficient.
That's because variables have different signs at different points in the program!
You might initialize a variable with `x = 1`, making it positive, and then
go on to compute some arbitrary function using loops and conditionals. For
each variable, we need to keep track of its sign at various points in the code.
When we [defined Control Flow Graphs]({{< relref "06_spa_agda_cfg" >}}), we
split our programs into sequences of statements that are guaranteed to execute
together --- basic blocks. For our analysis, we'll keep per-variable information for
each basic block in the program. Since basic blocks are nodes in the Control Flow
Graph of our program, our whole lattice will be as follows:
{#whole-lattice}

{{< latex >}}
\text{Info} \triangleq \text{NodeId} \to (\text{Variable} \to \text{Sign})
{{< /latex >}}

We follow the same logic we just did for the variable-sign lattice; since
\(\text{Variable} \to \text{Sign}\) is a lattice of finite height, and since
\(\text{NodeId}\) is a finite set, the whole \(\text{Info}\) map will be
a lattice with a finite height.

Notice that both the sets \(\text{Variable}\) and \(\text{NodeId}\) depend
on the program in question. The lattice we use is slightly different for
each input program! We can use Agda's parameterized modules to automatically
parameterize all our functions over programs:

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 36 37 >}}

Now, let's make the informal descriptions above into code, by instantiating
our map lattice modules. First, I invoked the code for the smaller variable-sign
lattice. This ended up being quite long, so that I could rename variables I
brought into scope. I will collapse the relevant code block; suffice to say
that I used the suffix `v` (e.g., renaming `_⊔_` to `_⊔ᵛ_`) for properties
and operators to do with variable-sign maps (in Agda: `VariableValuesFiniteMap`).

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 41 82 "" "**(Click here to expand the module uses for variable-sign maps)**" >}}

I then used this lattice as an argument to the map module again, to
construct the top-level \(\text{Info}\) lattice (in Agda: `StateVariablesFiniteMap`).
This also required a fair bit of code, most of it to do with renaming.

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 85 112 "" "**(Click here to expand the module uses for the top-level lattice)**" >}}

### Constructing a Monotone Function

We now have a lattice in hand; the next step is to define a function over
this lattice. For us to be able to use the fixed-point algorithm on this
function, it will need to be [monotonic]({{< relref "01_spa_agda_lattices#define-monotonicity" >}}).

Our goal with static analysis is to compute information about our program; that's
what we want the function to do. When the lattice we're using is the sign lattice,
we're trying to determine the signs of each of the variables in various parts
of the program. How do we go about this?

Each piece of code in the program might change a variable's sign. For instance,
if `x` has sign \(0\), and we run the statement `x = x - 1`, the sign of
`x` will be \(-\). If we have an expression `y + z`, we can use the signs of
`y` and `z` to compute the sign of the whole thing. This is a form
of [abstract interpretation](https://en.wikipedia.org/wiki/Abstract_interpretation),
in which we almost-run the program, but forget some details (e.g., the
exact values of `x`, `y`, and `z`, leaving only their signs). The exact details
of how this partial evaluation is done are analysis-specific; in general, we
simply require an analysis to provide an evaluator. We will define
[an evaluator for the sign lattice below](#instantiating-with-the-sign-lattice).
{#general-evaluator}

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 166 167 >}}
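
To make "abstract interpretation" concrete before we get there, here is a hedged miniature of such an evaluator for subtraction on signs. The five-point `Sign` type and the clauses below are my own sketch of the textbook table; the real definition lives in `Analysis/Sign.agda` and may use different names:

```Agda
-- bot plays the role of ⊥ ("impossible"), top the role of ⊤ ("any sign").
data Sign : Set where
  bot zer pos neg top : Sign

-- Abstract subtraction: e.g., a zero value minus a positive one is negative.
_-ˢ_ : Sign → Sign → Sign
bot -ˢ _   = bot   -- anything involving "impossible" is impossible
_   -ˢ bot = bot
zer -ˢ zer = zer
zer -ˢ pos = neg
zer -ˢ neg = pos
pos -ˢ zer = pos
pos -ˢ neg = pos
neg -ˢ zer = neg
neg -ˢ pos = neg
_   -ˢ _   = top   -- e.g., pos -ˢ pos could be any sign
```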

From this, we know how each statement and basic block will change variables
in the function. But we have described the process as "if a variable has
sign X, it becomes sign Y" -- how do we know what sign a variable has _before_
the code runs? Fortunately, the Control Flow Graph tells us exactly
what code could be executed before any given basic block. Recall that edges
in the graph describe all possible jumps that could occur; thus, for any
node, the incoming edges describe all possible blocks that can precede it.
This is why we spent all that time [defining the `predecessors` function]({{< relref "06_spa_agda_cfg#additional-functions" >}}).

We proceed as follows: for any given node, find its predecessors. By accessing
our \(\text{Info}\) map for each predecessor, we can determine our current
best guess of variable signs at that point, in the form of a \(\text{Variable} \to \text{Sign}\)
map (more generally, a \(\text{Variable} \to L\) map in an arbitrary analysis).
We know that any of these predecessors could've been the previous point of
execution; if a variable `x` has sign \(+\) in one predecessor and \(-\)
in another, it can be either one or the other when we start executing the
current block. Early on, we saw that [the \((\sqcup)\) operator models disjunction
("A or B")]({{< relref "01_spa_agda_lattices#lub-glub-or-and" >}}). So, we apply
\((\sqcup)\) to the variable-sign maps of all predecessors. The
[reference _Static Program Analysis_ text](https://cs.au.dk/~amoeller/spa/)
calls this operation \(\text{JOIN}\):
{#join-preds}

{{< latex >}}
\textit{JOIN}(v) = \bigsqcup_{w \in \textit{pred}(v)} \llbracket w \rrbracket
{{< /latex >}}

The Agda implementation uses a `foldr`:

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 139 140 >}}
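
In miniature, and specialized to single signs rather than variable-sign maps, the idea is just a fold of the least upper bound over the predecessors' states (this is a hedged sketch; the `Sign` type repeats the one from the earlier block so this one stands alone):

```Agda
open import Data.List using (List; _∷_; []; foldr)

data Sign : Set where
  bot zer pos neg top : Sign

-- Least upper bound on signs; bot is the identity element.
_⊔ˢ_ : Sign → Sign → Sign
bot ⊔ˢ s   = s
s   ⊔ˢ bot = s
zer ⊔ˢ zer = zer
pos ⊔ˢ pos = pos
neg ⊔ˢ neg = neg
_   ⊔ˢ _   = top

-- JOIN, in miniature: fold ⊔ˢ over the predecessors' states.
joinAll : List Sign → Sign
joinAll = foldr _⊔ˢ_ bot

-- joinAll (pos ∷ pos ∷ []) evaluates to pos;
-- joinAll (pos ∷ neg ∷ []) evaluates to top.
```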

Computing the "combined incoming states" for any node is a monotonic function.
This follows from the monotonicity of \((\sqcup)\) --- in both arguments ---
and the definition of `foldr`.

{{< codelines "agda" "agda-spa/Lattice.agda" 143 151 "" "**(Click here to expand the general proof)**" >}}

From this, we can formally state that \(\text{JOIN}\) is monotonic. Note that
the input and output lattices are different: the input lattice is the lattice
of variable states at each block, and the output lattice is a single variable-sign
map, representing the combined preceding state at a given node.

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 145 149 >}}

Above, the `m₁≼m₂⇒m₁[ks]≼m₂[ks]` lemma states that for two maps with the same
keys, where one map is less than another, all the values for any subset of keys
`ks` are pairwise less than each other (i.e. `m₁[k]≼m₂[k]`, and `m₁[l]≼m₂[l]`, etc.).
This follows from the definition of "less than" for maps.
{#less-than-lemma}

So those are the two pieces: first, join all the preceding states, then use
the abstract interpretation function. I opted to do both of these in bulk:

1. Take an initial \(\text{Info}\) map, and update every basic block's entry
   to be the join of its predecessors.
2. In the new joined map, each key now contains the variable state at
   the beginning of the block; so, apply the abstract interpretation function
   via `eval` to each key, computing the state at the end of the block.

I chose to do these in bulk because this way, after each application of
the function, we have updated each block with exactly one round of information.
The alternative --- which is specified in the reference text --- is to update
one key at a time. The difference there is that updates to later keys might be
"tainted" by updates to keys that came before them. This is probably fine
(and perhaps more efficient, in that it "moves faster"), but it's harder to
reason about.

#### Generalized Update

To implement bulk assignment, I needed to implement the source text's
Exercise 4.26:

> __Exercise 4.26__: Recall that \(f[a \leftarrow x]\) denotes the function that is identical to
> \(f\) except that it maps \(a\) to \(x\). Assume \(f : L_1 \to (A \to L_2)\)
> and \(g : L_1 \to L_2\) are monotone functions where \(L_1\) and \(L_2\) are
> lattices and \(A\) is a set, and let \(a \in A\). (Note that the codomain of
> \(f\) is a map lattice.)
>
> Show that the function \(h : L_1 \to (A \to L_2)\)
> defined by \(h(x) = f(x)[a \leftarrow g(x)]\) is monotone.

In fact, I generalized this statement to update several keys at once, as follows:

{{< latex >}}
h(x) = f(x)[a_1 \leftarrow g(a_1, x),\ ...,\ a_n \leftarrow g(a_n, x)]
{{< /latex >}}

I called this operation "generalized update".

At first, the exercise may not obviously correspond to the bulk operation
I've described. Particularly confusing is the fact that it has two lattices,
\(L_1\) and \(L_2\). In fact, the exercise results in a very general theorem;
we can exploit a more concrete version of the theorem by setting
\(L_1 \triangleq A \to L_2\), resulting in an overall signature for \(f\) and \(h\):

{{< latex >}}
f : (A \to L_2) \to (A \to L_2)
{{< /latex >}}

In other words, if we give the entire operation in Exercise 4.26 a type,
it would look like this:

{{< latex >}}
\text{ex}_{4.26} : \underbrace{K}_{\text{value of}\ a} \to \underbrace{(\text{Map} \to V)}_{\text{updater}} \to \underbrace{\text{Map} \to \text{Map}}_{f} \to \underbrace{\text{Map} \to \text{Map}}_{h}
{{< /latex >}}

That's still more general than we need. This version allows us to modify
any map-to-map function by updating a certain key in that function. If we
_just_ want to update keys (as we do for the purposes of static analysis),
we can recover a simpler version by setting \(f \triangleq id\), which
results in an updater \(h(x) = x[a \leftarrow g(x)]\), and a signature for
the exercise:

{{< latex >}}
\text{ex}_{4.26} : \underbrace{K}_{\text{value of}\ a} \to \underbrace{(\text{Map} \to V)}_{\text{updater}\ g} \to \underbrace{\text{Map}}_{\text{old map}} \to \underbrace{\text{Map}}_{\text{updated map}}
{{< /latex >}}

This looks just like Haskell's [`Data.Map.adjust` function](https://hackage.haskell.org/package/containers-0.4.0.0/docs/src/Data-Map.html#adjust), except that it
can take the entire map into consideration when updating a key.
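
As a hedged, function-space sketch of the exercise's operation (maps represented as literal Agda functions, decidable key equality assumed; the project's real `Map` type is different):

```Agda
module Ex426Sketch where

open import Relation.Binary.PropositionalEquality using (_≡_)
open import Relation.Nullary using (Dec; yes; no)

module _ {A L₁ L₂ : Set} (_≟_ : (x y : A) → Dec (x ≡ y)) where

  -- m [ a ← v ] is m, except that it maps a to v.
  _[_←_] : (A → L₂) → A → L₂ → (A → L₂)
  (m [ a ← v ]) k with k ≟ a
  ... | yes _ = v
  ... | no  _ = m k

  -- Exercise 4.26's h, built from f and g.
  h : (L₁ → (A → L₂)) → (L₁ → L₂) → A → L₁ → (A → L₂)
  h f g a x = f x [ a ← g x ]

  -- The f ≜ id special case: update key a using the whole map.
  update : A → ((A → L₂) → L₂) → (A → L₂) → (A → L₂)
  update a g m = m [ a ← g m ]
```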
|
||||||
|
|
||||||
|
My generalized version takes in a list of keys to update, and makes the updater
|
||||||
|
accept a key so that its behavior can be specialized for each entry it changes.
|
||||||
|
The sketch of the implementation is in the `_updating_via_` function from
|
||||||
|
the `Map` module, and its helper `transform`. Here, I collapse its definition,
|
||||||
|
since it's not particularly important.

{{< codelines "agda" "agda-spa/Lattice/Map.agda" 926 931 "" "**(Click here to see the definition of `transform`)**" >}}

The proof of monotonicity --- which is the solution to the exercise --- is
actually quite complicated. I will omit its description, and show it here
in another collapsed block.

{{< codelines "agda" "agda-spa/Lattice/Map.agda" 1042 1105 "" "**(Click here to see the proof of monotonicity of \(h\))**" >}}

Given a proof of the exercise, all that's left is to instantiate the theorem
with the arguments I described. Specifically:

* \(L_1 \triangleq \text{Info} \triangleq \text{NodeId} \to (\text{Variable} \to \text{Sign})\)
* \(L_2 \triangleq \text{Variable} \to \text{Sign} \)
* \(A \triangleq \text{NodeId}\)
* \(f \triangleq \text{id} \triangleq x \mapsto x\)
* \(g(k, m) = \text{JOIN}(k, m)\)

In the equation for \(g\), I explicitly insert the map \(m\) instead of leaving
it implicit as the textbook does. In Agda, this instantiation for joining
all predecessors looks like this (using `states` as the list of keys to update,
indicating that we should update _every_ key):

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 152 157 >}}

And the one for evaluating all programs looks like this:

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 215 220 >}}

We haven't yet seen `updateVariablesFromStmt`. This is
a function that we can define using the user-provided abstract interpretation
`eval`. Specifically, it handles the job of updating the sign of a variable
once it has been assigned to (or doing nothing if the statement is a no-op).
{#define-updateVariablesFromStmt}

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 191 193 >}}

The `updateVariablesFromExpression` function is new; it is yet another map update,
which changes the sign of a variable `k` to be the one we get from running
`eval` on it. Map updates are instances of the generalized update; this
time, the updater \(g\) is `eval`. The exercise requires the updater to be
monotonic, which constrains the user-provided evaluation function to be
monotonic too.

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 173 181 >}}

We finally write the `analyze` function as the composition of the two bulk updates:

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 226 232 >}}

### Instantiating with the Sign Lattice
Thus far, I've been talking about the sign lattice throughout, but implementing
the Agda code in terms of a general lattice `L` and evaluation function `eval`.
In order to actually run the Agda code, we do need to provide an `eval` function,
which implements the logic we used above, in which a zero-sign variable \(x\)
minus one was determined to be negative. For binary operators specifically,
I've used the tables provided in the textbook; here they are:

{{< figure src="plusminus.png" caption="Cayley tables for abstract interpretation of plus and minus" >}}

These are pretty much common sense (a textual sketch of part of the plus table
follows this list):
* A positive plus a positive is still positive, so \(+\ \hat{+}\ + = +\)
* A positive plus any sign could be any sign still, so \(+\ \hat{+}\ \top = \top\)
* Any sign plus "impossible" is impossible, so \(\top\ \hat{+}\ \bot = \bot\).
* etc.
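
For reference, here is a textual sketch of the plus table restricted to the
three "proper" signs (the figure above also includes the \(\top\) and \(\bot\)
rows and columns, which behave as described in the bullets):

| \(\hat{+}\) | \(-\) | \(0\) | \(+\) |
|-------------|-------|-------|-------|
| \(-\) | \(-\) | \(-\) | \(\top\) |
| \(0\) | \(-\) | \(0\) | \(+\) |
| \(+\) | \(\top\) | \(+\) | \(+\) |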

The Agda encoding for the plus function is as follows, and the one for minus
is similar.

{{< codelines "agda" "agda-spa/Analysis/Sign.agda" 76 94 >}}

As the comment in the block says, it would be incredibly tedious to verify
the monotonicity of these tables, since you would have to consider roughly
125 cases _per argument_: for each (fixed) sign \(s\) and two other signs
\(s_1 \le s_2\), we'd need to show that \(s\ \hat{+}\ s_1 \le s\ \hat{+}\ s_2\).
I therefore commit the _faux pas_ of using `postulate`. Fortunately, the proof
of monotonicity is not used for the execution of the program, so we will
get away with this, barring any meddling kids.
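
For illustration, the postulates have roughly the following shape. This is a
hypothetical sketch with made-up names (not the exact statements from the
project), abstracting the sign type, its order, and the table as module
parameters:

```Agda
module MonotonicitySketch
    {Sign : Set} (_≼_ : Sign → Sign → Set)
    (plus : Sign → Sign → Sign) where

  postulate
    -- Monotonicity in the left and right argument, respectively.
    plus-monoˡ : ∀ s {s₁ s₂} → s₁ ≼ s₂ → plus s₁ s ≼ plus s₂ s
    plus-monoʳ : ∀ s {s₁ s₂} → s₁ ≼ s₂ → plus s s₁ ≼ plus s s₂
```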

From this, all that's left is to show that for any expression `e`, the
evaluation function:

{{< latex >}}
\text{eval} : \text{Expr} \to (\text{Variable} \to \text{Sign}) \to \text{Sign}
{{< /latex >}}

is monotonic. It's defined straightforwardly and very much like an evaluator /
interpreter, suggesting that "abstract interpretation" is the correct term here.

{{< codelines "agda" "agda-spa/Analysis/Sign.agda" 176 184 >}}

Though it won't happen, it was easier to just handle the case where there's
an undefined variable; I give it "any sign". Otherwise, the function simply
consults the sign tables for `+` or `-`, as well as the known signs of the
variables. For natural number literals, it assigns `0` the "zero" sign, and
any other natural number the sign "\(+\)".

To prove monotonicity, we need to consider two variable maps (one less than
the other), and show that the abstract interpretation respects that ordering.
This boils down to the fact that the `plus` and `minus` tables are monotonic
in both arguments (thus, if their sub-expressions are evaluated monotonically
given an environment, then so is the whole addition or subtraction), and
to the fact that for two maps `m₁ ≼ m₂`, the values at corresponding keys
are similarly ordered: `m₁[k] ≼ m₂[k]`. We [saw that above](#less-than-lemma).

{{< codelines "agda" "agda-spa/Analysis/Sign.agda" 186 223 "" "**(Click to expand the proof that the evaluation function for signs is monotonic)**" >}}

That's all we need. With this, I just instantiate the `Forward` module we have
been working with, and make use of the `result`. I also used a `show`
function (which I defined) to stringify that output.

{{< codelines "agda" "agda-spa/Analysis/Sign.agda" 225 229 >}}

But wait, `result`? We haven't seen a result just yet. That's the last piece,
and it involves finally making use of the fixed-point algorithm.

### Invoking the Fixed Point Algorithm
Our \(\text{Info}\) lattice is of finite height, and the function we have defined
is monotonic (by virtue of being constructed only from map updates, which
are monotonic by Exercise 4.26, and from function composition, which preserves
monotonicity). We can therefore apply the fixed-point algorithm, and compute
the least fixed point:

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 235 238 >}}

With this, `analyze` is the result of our forward analysis!

In a `Main.agda` file, I invoked this analysis on a sample program:

```Agda
testCode : Stmt
testCode =
  ⟨ "zero" ← (# 0) ⟩ then
  ⟨ "pos" ← ((` "zero") Expr.+ (# 1)) ⟩ then
  ⟨ "neg" ← ((` "zero") Expr.- (# 1)) ⟩ then
  ⟨ "unknown" ← ((` "pos") Expr.+ (` "neg")) ⟩

testProgram : Program
testProgram = record
  { rootStmt = testCode
  }

open WithProg testProgram using (output; analyze-correct)

main = run {0ℓ} (putStrLn output)
```
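
Assuming a standard Agda installation with the GHC backend available, a file
like this can typically be compiled with `agda --compile Main.agda` and then
run as `./Main` to print the output discussed below.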

The result is verbose, since it shows variable signs for each statement
in the program. However, the key is the last basic block, which shows
the variables at the end of the program. It reads:

```
{"neg" ↦ -, "pos" ↦ +, "unknown" ↦ ⊤, "zero" ↦ 0, }
```

### Verifying the Analysis
We now have a general framework for running forward analyses: you provide
an abstract interpretation function for expressions, as well as a proof
that this function is monotonic, and you get an Agda function that takes
a program and tells you the variable states at every point. If your abstract
interpretation function is for determining the signs of expressions, the
final result is an analysis that determines all possible signs for all variables,
anywhere in the code. It's pretty easy to instantiate this framework with
another type of forward analysis --- in fact, by switching the
`plus` function to one that uses `AboveBelow ℤ`, rather than `AboveBelow Sign`:

```Agda
plus : ConstLattice → ConstLattice → ConstLattice
plus ⊥ᶜ _ = ⊥ᶜ
plus _ ⊥ᶜ = ⊥ᶜ
plus ⊤ᶜ _ = ⊤ᶜ
plus _ ⊤ᶜ = ⊤ᶜ
plus [ z₁ ]ᶜ [ z₂ ]ᶜ = [ z₁ Int.+ z₂ ]ᶜ
```

we can define a constant-propagation analysis. On the sample program above,
its final basic block reads:

```
{"neg" ↦ -1, "pos" ↦ 1, "unknown" ↦ 0, "zero" ↦ 0, }
```

However, we haven't proved our analysis correct, and we haven't yet made use of
the CFG-semantics equivalence that we
[proved in the previous section]({{< relref "07_spa_agda_semantics_and_cfg" >}}).
I was hoping to get to it in this post, but there was just too much to
cover. So, I will get to that in the next post, where we will make use
of the remaining machinery to demonstrate that the output of our analyzer
matches reality.
BIN content/blog/08_spa_agda_forward/plusminus.png (new binary file, 42 KiB)

@@ -1,7 +1,8 @@
 ---
 title: Compiling a Functional Language Using C++, Part 9 - Garbage Collection
 date: 2020-02-10T19:22:41-08:00
-tags: ["C and C++", "Functional Languages", "Compilers"]
+tags: ["C", "C++", "Functional Languages", "Compilers"]
+series: "Compiling a Functional Language using C++"
 description: "In this post, we implement a garbage collector that frees memory no longer used by the executables our compiler creates."
 ---

content/blog/09_spa_agda_verified_forward/index.md (new file, 547 lines)
@@ -0,0 +1,547 @@
---
title: "Implementing and Verifying \"Static Program Analysis\" in Agda, Part 9: Verifying the Forward Analysis"
series: "Static Program Analysis in Agda"
description: "In this post, I prove that the sign analysis from the previous post is correct"
date: 2024-12-25T19:00:00-08:00
tags: ["Agda", "Programming Languages"]
left_align_code: true
---

In the previous post, we put together a number of powerful pieces of machinery
to construct a sign analysis. However, we still haven't verified that this
analysis produces correct results. For the most part, we already have the
tools required to demonstrate correctness; the most important one
is the [validity of our CFGs]({{< relref "07_spa_agda_semantics_and_cfg" >}})
relative to [the semantics of the little language]({{< relref "05_spa_agda_semantics" >}}).

### High-Level Algorithm
We'll keep working with the sign lattice as an example, keeping in mind
how what we do generalizes to any lattice \(L\) describing a variable's
state. The general shape of our argument will be as follows, where I've underlined and
numbered assumptions or aspects that we have yet to provide.

1. Our fixed-point analysis from the previous section gave us a result \(r\)
   that satisfies the following equation:

   {{< latex >}}
   r = \text{update}(\text{join}(r))
   {{< /latex >}}

   Above, \(\text{join}\) applies the [predecessor-combining function]({{< relref "08_spa_agda_forward#join-preds" >}})
   from the previous post to each state (corresponding to `joinAll` in Agda)
   and \(\text{update}\) performs one round of abstract interpretation.

2. Because of the [correspondence of our semantics and CFGs]({{< relref "07_spa_agda_semantics_and_cfg" >}}),
   each program evaluation in the form \(\rho, s \Rightarrow \rho'\)
   corresponds to a path through the Control Flow Graph. Along the path,
   each node contains simple statements, which correspond to intermediate steps
   in evaluating the program. These will also be in the form
   \(\rho_1, b \Rightarrow \rho_2\).

3. We will proceed iteratively, stepping through the trace one basic block at
   a time. At each node in the graph:
   * We will assume that the beginning state (the variables in \(\rho_1\)) is
     {{< internal "correctly-described" >}}correctly described{{< /internal >}}
     by one of the predecessors of the current node. Since
     {{< internal "disjunction" >}}joining represents "or"{{< /internal >}},
     that is the same as saying that \(\text{join}(r)\)
     contains an accurate description of \(\rho_1\).

   * Because
     {{< internal "abstract-interpretation" >}}the abstract interpretation function preserves accurate descriptions{{< /internal >}},
     if \(\text{join}(r)\) contains an accurate description of \(\rho_1\), then applying our
     abstract interpretation function via \(\text{update}\) should result in
     a map that contains an accurate description of \(\rho_2\). In other words, \(\text{update}(\text{join}(r))\)
     describes \(\rho_2\) at the current block.
     {{< internal "equivalence" >}}By the equation above{{< /internal >}}, that's the same as saying
     \(r\) describes \(\rho_2\) at the current block.

   * Since the trace is a path through a graph, there must be an edge from
     the current basic block to the next. This means that the current basic
     block is a predecessor of the next one. From the previous point, we know
     that \(\rho_2\) is accurately described by this predecessor, fulfilling
     our earlier assumption and allowing us to continue iteration.

So, what are the missing pieces?

1. We need to define what it means for a lattice (like our sign lattice)
   to "correctly describe" what happens when evaluating a program for real.
   For example, the \(+\) in sign analysis describes values that are bigger than zero,
   and a map like `{x:+}` states that `x` can only take on positive values.
2. We've seen before that [the \((\sqcup)\) operator models disjunction
   ("A or B")]({{< relref "01_spa_agda_lattices#lub-glub-or-and" >}}), but
   that was only an informal observation; we'll need to specify it precisely.
3. Each analysis [provides an abstract interpretation `eval` function]({{< relref "08_spa_agda_forward#general-evaluator" >}}).
   However, until now, nothing has formally constrained this function; we could
   return \(+\) in every case, even though that would not be accurate. We will
   need, for each analysis, a proof that its `eval` preserves accurate descriptions.
4. The equalities between our lattice elements [are actually equivalences]({{< relref "01_spa_agda_lattices#definitional-equality" >}}),
   which helps us use simpler representations of data structures. Thus, even
   in statements of the fixed point algorithm, our final result is a value \(a\)
   such that \(a \approx f(a)\). We need to prove that our notion of equivalent
   lattice elements plays nicely with correctness.

Let's start with the first bullet point.

### A Formal Definition of Correctness

When a variable is mapped to a particular sign (like `{ "x": + }`),
what that really says is that the value of `x` is greater than zero. Recalling
from [the post about our language's semantics]({{< relref "05_spa_agda_semantics#notation-for-environments" >}})
that we use the symbol \(\rho\) to represent mappings of variables to
their values, we might write this claim as:

{{< latex >}}
\rho(\texttt{x}) > 0
{{< /latex >}}

This is a good start, but it's a little awkward defining the meaning of "plus"
by referring to the context in which it's used (the `{ "x": ... }` portion
of the expression above). Instead, let's associate with each sign (like \(+\)) a
predicate: a function that takes a value, and makes a claim about that value
("this is positive"):

{{< latex >}}
\llbracket + \rrbracket\ v = v > 0
{{< /latex >}}

The notation above is a little weird unless you, like me, have a background in
Programming Language Theory (❤️). This comes from [denotational semantics](https://en.wikipedia.org/wiki/Denotational_semantics);
generally, one writes:

{{< latex >}}
\llbracket \text{thing} \rrbracket = \text{the meaning of the thing}
{{< /latex >}}

Where \(\llbracket \cdot \rrbracket\) is really a function (we call it
the _semantic function_) that maps things to
their meaning. Then, the above equation is similar to the more familiar
\(f(x) = x+1\): function and arguments on the left, definition on the right. When
the "meaning of the thing" is itself a function, we could write it explicitly
using lambda-notation:

{{< latex >}}
\llbracket \text{thing} \rrbracket = \lambda x.\ \text{body of the function}
{{< /latex >}}

Or, we could use the Haskell style and write the new variable on the left of
the equality:

{{< latex >}}
\llbracket \text{thing} \rrbracket\ x = \text{body of the function}
{{< /latex >}}

That is precisely what I'm doing above with \(\llbracket + \rrbracket\).
With this in mind, we could define the entire semantic function for the
sign lattice as follows:

{{< latex >}}
\llbracket + \rrbracket\ v = v\ \texttt{>}\ 0 \\
\llbracket 0 \rrbracket\ v = v\ \texttt{=}\ 0 \\
\llbracket - \rrbracket\ v = v\ \texttt{<}\ 0 \\
\llbracket \top \rrbracket\ v = \text{true} \\
\llbracket \bot \rrbracket\ v = \text{false}
{{< /latex >}}

In Agda, the integer type already distinguishes between the "negative natural"
and "positive natural" cases, which made it possible to define the semantic function
{{< sidenote "right" "without-note" "without using inequalities." >}}
Reasoning about inequalities is painful, sometimes requiring a number of
lemmas to arrive at a result that is intuitively obvious. Coq has a powerful
tactic called <a href="https://coq.inria.fr/doc/v8.11/refman/addendum/micromega.html#coq:tacn.lia"><code>lia</code></a>
that automatically solves systems of inequalities, and I use it liberally.
However, lacking such a tactic in Agda, I would like to avoid inequalities
if they are not needed.
{{< /sidenote >}}

{{< codelines "agda" "agda-spa/Analysis/Sign.agda" 114 119 >}}

Notably, \(\llbracket \top \rrbracket\ v\) always holds, and
\(\llbracket \bot \rrbracket\ v\) never does. __In general__, we will always
need to define a semantic function for whatever lattice we are choosing for
our analysis.
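
For instance, for the constant-propagation lattice `AboveBelow ℤ` from
earlier, a plausible semantic function would look like the sketch below. The
name `⟦_⟧ᶜ` is hypothetical, and `ConstLattice` with its constructors `⊤ᶜ`,
`⊥ᶜ`, and `[_]ᶜ` are assumed to be in scope from that code:

```Agda
open import Data.Integer using (ℤ)
open import Data.Unit using (⊤)
open import Data.Empty using (⊥)
open import Relation.Binary.PropositionalEquality using (_≡_)

-- ⊤ᶜ describes every value, ⊥ᶜ describes none, and [ z ]ᶜ describes
-- exactly the integer z.
⟦_⟧ᶜ : ConstLattice → ℤ → Set
⟦ ⊤ᶜ ⟧ᶜ v = ⊤
⟦ ⊥ᶜ ⟧ᶜ v = ⊥
⟦ [ z ]ᶜ ⟧ᶜ v = v ≡ z
```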

It's important to remember from the previous post that the sign lattice
(or, more generally, our lattice \(L\)) is only a component of the
[lattice we use to instantiate the analysis]({{< relref "08_spa_agda_forward#whole-lattice" >}}).
We at least need to define what it means for the \(\text{Variable} \to \text{Sign}\)
portion of that lattice to be correct. This way, we'll have correctness
criteria for each key (CFG node) in the top-level \(\text{Info}\) lattice.
Since a map from variables to their sign characterizes not a single value \(v\)
but a whole environment \(\rho\), something like this is a good start:

{{< latex >}}
\llbracket \texttt{\{} x_1: s_1, ..., x_n: s_n \texttt{\}} \rrbracket\ \rho = \llbracket s_1 \rrbracket\ \rho(x_1)\ \text{and}\ ...\ \text{and}\ \llbracket s_n \rrbracket\ \rho(x_n)
{{< /latex >}}

As a concrete example, we might get:

{{< latex >}}
\llbracket \texttt{\{} \texttt{x}: +, \texttt{y}: - \texttt{\}} \rrbracket\ \rho = \rho(\texttt{x})\ \texttt{>}\ 0\ \text{and}\ \rho(\texttt{y})\ \texttt{<}\ 0
{{< /latex >}}

This is pretty good, but not quite right. For instance, the initial state of
the program --- before running the analysis --- assigns \(\bot\) to each
element. This is true because our fixed-point algorithm [starts with the least
element of the lattice]({{< relref "04_spa_agda_fixedpoint#start-least" >}}).
But even for a single-variable map `{x: ⊥ }`, the semantic function above would
give:

{{< latex >}}
\llbracket \texttt{\{} \texttt{x}: \bot \texttt{\}} \rrbracket\ \rho = \text{false}
{{< /latex >}}

That's clearly not right: our initial state should be possible, lest
the entire proof be just a convoluted [_ex falso_](https://en.wikipedia.org/wiki/Principle_of_explosion)!

There is another tricky aspect of our analysis, which is primarily defined
[using the join (\(\sqcup\)) operator]({{< relref "08_spa_agda_forward#join-preds" >}}).
Observe the following example:

```C
// initial state: { x: ⊥ }
if b {
    x = 1; // state: { x: + }
} else {
    // state unchanged: { x: ⊥ }
}
// state: { x: + } ⊔ { x: ⊥ } = { x: + }
```

Notice that in the final state, the sign of `x` is `+`, even though when
`b` is `false`, the variable is never set. In a simple language like ours,
without variable declaration points, this is probably the best we could hope
for. The crucial observation, though, is that the oddness only comes into
play with variables that are not set. In the "initial state" case, none
of the variables have been modified; in the `else` case of the conditional,
`x` was never assigned to. We can thus relax our condition to an if-then:
if a variable is in our environment \(\rho\), then the variable-sign lattice's
interpretation accurately describes it.

{{< latex >}}
\begin{array}{ccc}
\llbracket \texttt{\{} x_1: s_1, ..., x_n: s_n \texttt{\}} \rrbracket\ \rho & = & & \textbf{if}\ x_1 \in \rho\ \textbf{then}\ \llbracket s_1 \rrbracket\ \rho(x_1)\ \\ & & \text{and} & ... \\ & & \text{and} & \textbf{if}\ x_n \in \rho\ \textbf{then}\ \llbracket s_n \rrbracket\ \rho(x_n)
\end{array}
{{< /latex >}}

The first "weird" case now results in the following:

{{< latex >}}
\llbracket \texttt{\{} \texttt{x}: \bot \texttt{\}} \rrbracket\ \rho = \textbf{if}\ \texttt{x} \in \rho\ \textbf{then}\ \text{false}
{{< /latex >}}

Which is just another way of saying:

{{< latex >}}
\llbracket \texttt{\{} \texttt{x}: \bot \texttt{\}} \rrbracket\ \rho = \texttt{x} \notin \rho
{{< /latex >}}

In the second case, the interpretation also results in a true statement:

{{< latex >}}
\llbracket \texttt{\{} \texttt{x}: + \texttt{\}} \rrbracket\ \rho = \textbf{if}\ \texttt{x} \in \rho\ \textbf{then}\ \rho(\texttt{x})\ \texttt{>}\ 0
{{< /latex >}}

In Agda, I encode the fact that a verified analysis needs a semantic function
\(\llbracket\cdot\rrbracket\) for its element lattice \(L\) by taking such
a function as an argument called `⟦_⟧ˡ`:

{{< codelines "Agda" "agda-spa/Analysis/Forward.agda" 246 253 "hl_lines=5" >}}

I then define the semantic function for the variable-sign lattice in the following
way, which eschews the "..." notation in favor of a more Agda-compatible (and
equivalent) form:

{{< codelines "Agda" "agda-spa/Analysis/Forward.agda" 255 256 >}}

The above reads roughly as follows:

> For every variable `k` and sign [or, more generally, lattice element] `l` in
> the variable map lattice, if `k` is in the environment `ρ`, then it satisfies
> the predicate given by the semantic function applied to `l`.
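
Symbolically, that Agda definition corresponds to something like the following
(writing \((k, l) \in m\) informally for "the map \(m\) contains key \(k\)
with lattice element \(l\)"):

{{< latex >}}
\llbracket m \rrbracket\ \rho = \forall (k, l) \in m.\ \textbf{if}\ k \in \rho\ \textbf{then}\ \llbracket l \rrbracket\ \rho(k)
{{< /latex >}}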

Let's recap: we have defined a semantic function for our sign lattice, and
noted that to define a verified analysis, we always need such a semantic function.
We then showed how to construct a semantic function for a whole variable map
(of type \(\text{Variable} \to \text{Sign}\), or \(\text{Variable}\to L\)
in general). We also wrote some Agda code doing all this. As a result, we
have filled in the missing piece for {{< internalref "correctly-described" >}}property{{< /internalref >}}.

However, the way that we brought in the semantic function in the Agda code
above hints that there's more to be discussed. What's `latticeInterpretationˡ`?
In answering that question, we'll provide evidence for
{{< internalref "disjunction" >}}property{{< /internalref >}}
and
{{< internalref "equivalence" >}}property{{< /internalref >}}.

### Properties of the Semantic Function

As we briefly saw earlier, we loosened the notion of equality to that of equivalences,
which made it possible to ignore things like the ordering of key-value pairs
in maps. That's great and all, but nothing is stopping us from defining semantic functions that violate our equivalence!
Suppose \(a \approx f(a)\); as far
as Agda is concerned, even though \(a\) and \(f(a)\) are "equivalent",
\(\llbracket a \rrbracket\) and \(\llbracket f(a) \rrbracket\) may be
totally different. For a semantic function to be correct, it must produce
the same predicate for equivalent elements of lattice \(L\). That's
{{< internalref "equivalence" >}}missing piece{{< /internalref >}}.

Another property of semantic functions that we will need to formalize
is that \((\sqcup)\) represents disjunction.
This comes into play when we reason about the correctness of predecessors in
a Control Flow Graph. Recall that during the last step of processing a given node,
when we are trying to move on to the next node in the trace, we have knowledge
that the current node's variable map accurately describes the intermediate
environment. In other words, \(\llbracket \textit{vs}_i \rrbracket\ \rho_2\) holds, where
\(\textit{vs}_i\) is the variable map for the current node. We can generalize this
knowledge a little, and get:

{{< latex >}}
\llbracket \textit{vs}_1 \rrbracket\ \rho_2\ \text{or}\ ...\ \text{or}\ \llbracket \textit{vs}_n \rrbracket\ \rho_2
{{< /latex >}}

However, the assumption that we _need_ when moving on to a new node
is in terms of \(\textit{JOIN}\), which combines all the predecessors' maps
\(\textit{vs}_1, ..., \textit{vs}_n\) using \((\sqcup)\). Thus, we will need to be in a world where
the following claim is true:

{{< latex >}}
\llbracket \textit{vs}_1 \sqcup ... \sqcup \textit{vs}_n \rrbracket\ \rho
{{< /latex >}}

To get from one to the other, we will need to rely explicitly on the fact
that \((\sqcup)\) encodes "or". It's not necessary for the forward analysis,
but a similar property ought to hold for \((\sqcap)\) and "and". This
constraint provides {{< internalref "disjunction" >}}missing piece{{< /internalref >}}.
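
Concretely, the direction of this law that the proof relies on can be stated
informally as follows (the Agda record below packages the precise version):

{{< latex >}}
\llbracket a \rrbracket\ v\ \text{or}\ \llbracket b \rrbracket\ v\ \implies\ \llbracket a \sqcup b \rrbracket\ v
{{< /latex >}}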

I defined a new data type that bundles a semantic function with proofs of
the properties in this section; that's precisely what `latticeInterpretationˡ`
is:

{{< codelines "Agda" "agda-spa/Language/Semantics.agda" 66 73 >}}

In short, to leverage the framework for verified analysis, you would need to
provide a semantic function that interacts properly with `≈` and `∨`.

### Correctness of the Evaluator

All that's left is {{< internalref "abstract-interpretation" >}}the last missing piece{{< /internalref >}},
which requires that `eval` matches the semantics of our language. Recall
the signature of `eval`:

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 166 166 >}}

It operates on expressions and variable maps, which themselves associate a
sign (or, generally, an element of lattice \(L\)) with each variable. The
"real" evaluation judgement, on the other hand, is in the form
\(\rho, e \Downarrow v\), and reads "expression \(e\) in environment \(\rho\)
evaluates to value \(v\)". In Agda:

{{< codelines "agda" "agda-spa/Language/Semantics.agda" 27 27 >}}

Let's line up the types of `eval` and the judgement. I'll swap the order of arguments
for `eval` to make the correspondence easier to see:

{{< latex >}}
\begin{array}{ccccccc}
\text{eval} & : & (\text{Variable} \to \text{Sign}) & \to & \text{Expr} & \to & \text{Sign} \\
\cdot,\cdot\Downarrow\cdot & : & (\text{Variable} \to \text{Value}) & \to & \text{Expr} & \to & \text{Value} & \to & \text{Set} \\
& & \underbrace{\phantom{(\text{Variable} \to \text{Value})}}_{\text{environment-like inputs}} & & & & \underbrace{\phantom{Value}}_{\text{value-like outputs}}
\end{array}
{{< /latex >}}

Squinting a little, it's almost like the signature of `eval` is the signature
for the evaluation judgement, but it forgets a few details (the exact values
of the variables) in favor of abstractions (their signs). To show that `eval`
behaves correctly, we'll want to prove that this forgetful correspondence holds.

Concretely, for any expression \(e\), take some environment \(\rho\), and "forget"
the exact values, getting a sign map \(\textit{vs}\). Now, evaluate the expression
to some value \(v\) using the semantics, and also, compute the expression's
expected sign \(s\) using `eval`. The sign should be the same as forgetting
\(v\)'s exact value. Mathematically,

{{< latex >}}
\forall e, \rho, v, \textit{vs}.\ \textbf{if}\ \llbracket\textit{vs}\rrbracket \rho\ \text{and}\ \rho, e \Downarrow v\ \textbf{then}\ \llbracket \text{eval}\ \textit{vs}\ e\rrbracket v
{{< /latex >}}
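
As a concrete instance, suppose \(\textit{vs} = \texttt{\{x: +\}}\) and
\(\rho = \{\texttt{x} \mapsto 3\}\), so that \(\llbracket\textit{vs}\rrbracket\ \rho\)
holds. Evaluating gives \(\rho, \texttt{x} - 1 \Downarrow 2\), while
\(\text{eval}\ \textit{vs}\ (\texttt{x} - 1) = +\ \hat{-}\ + = \top\), since a
positive minus a positive could have any sign. Indeed,
\(\llbracket \top \rrbracket\ 2\) holds, because \(\top\) describes every value.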

In Agda:

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 286 287 >}}

For a concrete analysis, we need to prove the above claim. In the case of
sign analysis, this boils down to a rather cumbersome proof by cases. I will collapse
the proofs to save some space and avoid overwhelming the reader.

{{< codelines "agda" "agda-spa/Analysis/Sign.agda" 237 258 "" "**(Click here to expand the proof of correctness for plus)**" >}}
{{< codelines "agda" "agda-spa/Analysis/Sign.agda" 261 282 "" "**(Click here to expand the proof of correctness for minus)**" >}}

{{< codelines "agda" "agda-spa/Analysis/Sign.agda" 284 294 "" >}}

This completes {{< internalref "abstract-interpretation" >}}our last missing piece{{< /internalref >}}.
All that's left is to put everything together.

### Proving The Analysis Correct

#### Lifting Expression Evaluation Correctness to Statements
The individual analyses (like the sign analysis) provide only an evaluation
function for _expressions_, and thus only have to prove correctness of
that function. However, our language is made up of statements, with judgements
in the form \(\rho, s \Rightarrow \rho'\). Now that we've shown (or assumed)
that `eval` behaves correctly when evaluating expressions, we should show
that this correctness extends to evaluating statements, which in the
forward analysis implementation is handled by the
[`updateVariablesFromStmt` function]({{< relref "08_spa_agda_forward#define-updateVariablesFromStmt" >}}).

The property we need to show looks very similar to the property for `eval`:

{{< latex >}}
\forall b, \rho, \rho', \textit{vs}.\ \textbf{if}\ \llbracket\textit{vs}\rrbracket \rho\ \text{and}\ \rho, b \Rightarrow \rho'\ \textbf{then}\ \llbracket \text{updateVariablesFromStmt}\ \textit{vs}\ b\rrbracket \rho'
{{< /latex >}}

In Agda:

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 291 291 >}}

The proof is straightforward, and relies on the semantics of the [map update]({{< relref "08_spa_agda_forward#generalized-update" >}}).
Specifically, in the case of an assignment statement \(x \leftarrow e\), all we
do is store the new sign computed from \(e\) into the map at \(x\). To
prove the correctness of the entire final environment \(\rho'\), there are
two cases to consider:

* The variable in question is the newly-updated \(x\). In this case, since
  `eval` produces correct signs, the variable clearly has the correct sign.
  This is the first highlighted chunk in the below code.
* The variable in question is different from \(x\). In this case, its value
  in the environment \(\rho'\) should be the same as it was prior, and
  its sign in the updated variable map is the same as it was in the original.
  Since the original map correctly described the original environment, we know
  the sign is correct. This is the second highlighted chunk in the below
  code.

The corresponding Agda proof is as follows:

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 291 305 "hl_lines=5-7 10-15" >}}

From this, it follows with relative ease that each basic block in the CFG,
when evaluated, produces an environment that matches the prediction of our
forward analysis.

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 318 318 >}}

#### Walking the Trace

Finally, we get to the meat of the proof, which follows the [outline](#high-level-algorithm). First,
let's take a look at `stepTrace`, which implements the second bullet in
our iterative procedure. I'll show the code, then we can discuss it
in detail.

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 324 342 >}}

The first `let`-bound variable, `⟦joinAll-result⟧ρ₁`, is kind of an intermediate
result, which I was forced to introduce because `rewrite` caused Agda to
allocate ~100GB of memory. It simply makes use of the fact that `joinAll`, the
function that performs predecessor joining for each node in the CFG, sets
every key of the map accordingly.

The second `let`-bound variable, `⟦analyze-result⟧`, steps through a given
node's basic block and leverages our proof of statement-correctness to validate
that the final environment `ρ₂` matches the prediction of the analyzer.

The last two `let`-bound variables apply the equation we wrote above:

{{< latex >}}
r = \text{update}(\text{join}(r))
{{< /latex >}}

Recall that `analyze` is the combination of `update` and `join`:

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 226 227 >}}

Finally, the `in` portion of the code uses `⟦⟧ᵛ-respects-≈ᵛ`, a proof
of {{< internalref "equivalence" >}}property{{< /internalref >}}, to produce
the final claim in terms of the `result` map.

Knowing how to step, we can finally walk the entire trace, implementing
the iterative process:

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 344 357 >}}

The first step --- assuming that one of the predecessors of the
current node satisfies the initial environment `ρ₁` --- is captured by
the presence of the argument `⟦joinForKey-s₁⟧ρ₁`. We expect the calling code
to provide a proof of that.

The second step, in both cases, is implemented using `stepTrace`,
as we saw above. That results in a proof that at the end of the current basic
block, the final environment `ρ₂` is accurately described.

From there, we move on to the third iterative step, if necessary. The
sub-expression `edge⇒incoming s₁→s₂` validates that, since we have an edge
from the current node to the next, we are listed as a predecessor. This,
in turn, means that we are included in the list of states-to-join for the
\(\textit{JOIN}\) function. That fact is stored in `s₁∈incomingStates`.
Finally, relying on
{{< internalref "disjunction" >}}property{{< /internalref >}},
we construct an assumption fit for a recursive invocation of `walkTrace`,
and move on to the next CFG node. The `foldr` here is motivated by the fact
that "summation" using \((\sqcup)\) is a fold.

When the function terminates, what we have is a proof that the final program
state is accurately described by the results of our program analysis. All
that's left is to kick off the walk. To do that, observe that the initial state
has no predecessors (how could it, if it's at the beginning of the program?).
That, in turn, means that this state maps every variable to the bottom element.
Such a variable configuration only permits the empty environment \(\rho = \varnothing\).
If the program evaluation starts in an empty environment, we have the assumption
needed to kick off the iteration.

{{< codelines "agda" "agda-spa/Analysis/Forward.agda" 359 366 "hl_lines=7" >}}

Take a look at the highlighted line in the above code block in particular.
It states precisely what we were hoping to see: that, when evaluating
a program, the final state when it terminates is accurately described by
the `result` of our static program analysis at the `finalState` in the CFG.
We have done it!

### Future Work

It took a lot of machinery to get where we are, but there are still lots of
things to do.

1. __Correctness beyond the final state__: the statement we've arrived at
   only shows that the final state of the program matches the results of
   the analysis. In fact, the property holds for all intermediate states, too.
   The only snag is that it's more difficult to _state_ such a claim.

   To do something like that, we probably need a notion of "incomplete evaluations"
   of our language, which run our program but stop at some point before the end.
   A full execution would be a special case of such an "incomplete evaluation"
   that stops in the final state. Then, we could restate `analyze-correct`
   in terms of partial evaluations, which would strengthen it.
2. __A more robust language and evaluation process__: we noted above that
   our join-based analysis is a little bit weird, particularly in the
   cases of uninitialized variables. There are ways to adjust our language
   (e.g., introducing variable declaration points) and analysis functions
   (e.g., only allowing assignment for declared variables) to reduce
   the weirdness somewhat. They just lead to a more complicated language.
3. __A more general correctness condition__: converting lattice elements into
   predicates on values gets us far. However, some types of analyses make claims
   about more than the _current_ values of variables. For instance, _live variable
   analysis_ checks if a variable's current value is going to be used in the
   future. Such an analysis can help guide register (re)allocation. To
   talk about future uses of a variable, the predicate will need to be formulated
   in terms of the entire evaluation proof tree. This opens a whole can
   of worms that I haven't begun to examine.

Now that I'm done writing up my code so far, I will start exploring these
various avenues of work. In the meantime, though, thanks for reading!
@@ -1,7 +1,8 @@
|
|||||||
---
|
---
|
||||||
title: Compiling a Functional Language Using C++, Part 10 - Polymorphism
|
title: Compiling a Functional Language Using C++, Part 10 - Polymorphism
|
||||||
date: 2020-03-25T17:14:20-07:00
|
date: 2020-03-25T17:14:20-07:00
|
||||||
tags: ["C and C++", "Functional Languages", "Compilers"]
|
tags: ["C++", "Functional Languages", "Compilers"]
|
||||||
|
series: "Compiling a Functional Language using C++"
|
||||||
description: "In this post, we extend our compiler's typechecking algorithm to implement the Hindley-Milner type system, allowing for polymorphic functions."
|
description: "In this post, we extend our compiler's typechecking algorithm to implement the Hindley-Milner type system, allowing for polymorphic functions."
|
||||||
favorite: true
|
favorite: true
|
||||||
---
|
---
|
||||||
@@ -46,7 +47,7 @@ some of our notation from the [typechecking]({{< relref "03_compiler_typecheckin
|
|||||||
{{< /latex >}}
|
{{< /latex >}}
|
||||||
|
|
||||||
But using our rules so far, such a thing is impossible, since there is no way for
|
But using our rules so far, such a thing is impossible, since there is no way for
|
||||||
\\(\text{Int}\\) to be unified with \\(\text{Bool}\\). We need a more powerful
|
\(\text{Int}\) to be unified with \(\text{Bool}\). We need a more powerful
|
||||||
set of rules to describe our program's types.
|
set of rules to describe our program's types.
|
||||||
|
|
||||||
|
|
||||||
@@ -71,23 +72,23 @@ Rule|Name and Description
|
|||||||
\frac
|
\frac
|
||||||
{x:\sigma \in \Gamma}
|
{x:\sigma \in \Gamma}
|
||||||
{\Gamma \vdash x:\sigma}
|
{\Gamma \vdash x:\sigma}
|
||||||
{{< /latex >}}| __Var__: If the variable \\(x\\) is known to have some polymorphic type \\(\\sigma\\) then an expression consisting only of that variable is of that type.
|
{{< /latex >}}| __Var__: If the variable \(x\) is known to have some polymorphic type \(\sigma\) then an expression consisting only of that variable is of that type.
|
||||||
{{< latex >}}
|
{{< latex >}}
|
||||||
\frac
|
\frac
|
||||||
{\Gamma \vdash e_1 : \tau_1 \rightarrow \tau_2 \quad \Gamma \vdash e_2 : \tau_1}
|
{\Gamma \vdash e_1 : \tau_1 \rightarrow \tau_2 \quad \Gamma \vdash e_2 : \tau_1}
|
||||||
{\Gamma \vdash e_1 \; e_2 : \tau_2}
|
{\Gamma \vdash e_1 \; e_2 : \tau_2}
|
||||||
{{< /latex >}}| __App__: If an expression \\(e\_1\\), which is a function from monomorphic type \\(\\tau\_1\\) to another monomorphic type \\(\\tau\_2\\), is applied to an argument \\(e\_2\\) of type \\(\\tau\_1\\), then the result is of type \\(\\tau\_2\\).
|
{{< /latex >}}| __App__: If an expression \(e_1\), which is a function from monomorphic type \(\tau_1\) to another monomorphic type \(\tau_2\), is applied to an argument \(e_2\) of type \(\tau_1\), then the result is of type \(\tau_2\).
|
||||||
{{< latex >}}
|
{{< latex >}}
|
||||||
\frac
|
\frac
|
||||||
{\Gamma \vdash e : \tau \quad \text{matcht}(\tau, p_i) = b_i
|
{\Gamma \vdash e : \tau \quad \text{matcht}(\tau, p_i) = b_i
|
||||||
\quad \Gamma,b_i \vdash e_i : \tau_c}
|
\quad \Gamma,b_i \vdash e_i : \tau_c}
|
||||||
{\Gamma \vdash \text{case} \; e \; \text{of} \;
|
{\Gamma \vdash \text{case} \; e \; \text{of} \;
|
||||||
\{ (p_1,e_1) \ldots (p_n, e_n) \} : \tau_c }
|
\{ (p_1,e_1) \ldots (p_n, e_n) \} : \tau_c }
|
||||||
{{< /latex >}}| __Case__: This rule is not part of Hindley-Milner, and is specific to our language. If the expression being case-analyzed is of type \\(\\tau\\) and each branch \\((p\_i, e\_i)\\) is of the same type \\(\\tau\_c\\) when the pattern \\(p\_i\\) works with type \\(\\tau\\) producing extra bindings \\(b\_i\\), the whole case expression is of type \\(\\tau\_c\\).
|
{{< /latex >}}| __Case__: This rule is not part of Hindley-Milner, and is specific to our language. If the expression being case-analyzed is of type \(\tau\) and each branch \((p_i, e_i)\) is of the same type \(\tau_c\) when the pattern \(p_i\) works with type \(\tau\) producing extra bindings \(b_i\), the whole case expression is of type \(\tau_c\).
|
||||||
{{< latex >}}
|
{{< latex >}}
|
||||||
\frac{\Gamma \vdash e : \sigma' \quad \sigma' \sqsubseteq \sigma}
|
\frac{\Gamma \vdash e : \sigma' \quad \sigma' \sqsubseteq \sigma}
|
||||||
{\Gamma \vdash e : \sigma}
|
{\Gamma \vdash e : \sigma}
|
||||||
{{< /latex >}}| __Inst (New)__: If type \\(\\sigma\\) is an instantiation (or specialization) of type \\(\\sigma\'\\) then an expression of type \\(\\sigma\'\\) is also an expression of type \\(\\sigma\\).
|
{{< /latex >}}| __Inst (New)__: If type \(\sigma\) is an instantiation (or specialization) of type \(\sigma'\) then an expression of type \(\sigma'\) is also an expression of type \(\sigma\).
|
||||||
{{< latex >}}
|
{{< latex >}}
|
||||||
\frac
|
\frac
|
||||||
{\Gamma \vdash e : \sigma \quad \alpha \not \in \text{free}(\Gamma)}
|
{\Gamma \vdash e : \sigma \quad \alpha \not \in \text{free}(\Gamma)}
|
||||||
@@ -95,29 +96,29 @@ Rule|Name and Description
|
|||||||
{{< /latex >}}| __Gen (New)__: If an expression has a type with free variables, this rule allows us generalize it to allow all possible types to be used for these free variables.
|
{{< /latex >}}| __Gen (New)__: If an expression has a type with free variables, this rule allows us generalize it to allow all possible types to be used for these free variables.
|
||||||
|
|
||||||
Here, there is a distinction between different forms of types. First, there are
|
Here, there is a distinction between different forms of types. First, there are
|
||||||
monomorphic types, or __monotypes__, \\(\\tau\\), which are types such as \\(\\text{Int}\\),
|
monomorphic types, or __monotypes__, \(\tau\), which are types such as \(\text{Int}\),
|
||||||
\\(\\text{Int} \\rightarrow \\text{Bool}\\), \\(a \\rightarrow b\\)
|
\(\text{Int} \rightarrow \text{Bool}\), \(a \rightarrow b\)
|
||||||
and so on. These types are what we've been working with so far. Each of them
|
and so on. These types are what we've been working with so far. Each of them
|
||||||
represents one (hence, "mono-"), concrete type. This is obvious in the case
|
represents one (hence, "mono-"), concrete type. This is obvious in the case
|
||||||
of \\(\\text{Int}\\) and \\(\\text{Int} \\rightarrow \\text{Bool}\\), but
|
of \(\text{Int}\) and \(\text{Int} \rightarrow \text{Bool}\), but
|
||||||
for \\(a \\rightarrow b\\) things are slightly less clear. Does it really
|
for \(a \rightarrow b\) things are slightly less clear. Does it really
|
||||||
represent a single type, if we can put an arbtirary thing for \\(a\\)?
|
represent a single type, if we can put an arbtirary thing for \(a\)?
|
||||||
The answer is "yes"! Although \\(a\\) is not currently known, it stands
|
The answer is "yes"! Although \(a\) is not currently known, it stands
|
||||||
in place of another monotype, which is yet to be determined.
|
in place of another monotype, which is yet to be determined.
|
||||||
|
|
||||||
So, how do we represent polymorphic types, like that of \\(\\text{if}\\)?
|
So, how do we represent polymorphic types, like that of \(\text{if}\)?
|
||||||
We borrow some more notation from mathematics, and use the "forall" quantifier:
|
We borrow some more notation from mathematics, and use the "forall" quantifier:
|
||||||
|
|
||||||
{{< latex >}}
|
{{< latex >}}
|
||||||
\text{if} : \forall a \; . \; \text{Bool} \rightarrow a \rightarrow a \rightarrow a
|
\text{if} : \forall a \; . \; \text{Bool} \rightarrow a \rightarrow a \rightarrow a
|
||||||
{{< /latex >}}
|
{{< /latex >}}
|
||||||
|
|
||||||
This basically says, "the type of \\(\\text{if}\\) is \\(\\text{Bool} \\rightarrow a \\rightarrow a \\rightarrow a\\)
|
This basically says, "the type of \(\text{if}\) is \(\text{Bool} \rightarrow a \rightarrow a \rightarrow a\)
|
||||||
for all possible \\(a\\)". This new expression using "forall" is what we call a type scheme, or a polytype \\(\\sigma\\).
|
for all possible \(a\)". This new expression using "forall" is what we call a type scheme, or a polytype \(\sigma\).
|
||||||
For simplicity, we only allow "forall" to be at the front of a polytype. That is, expressions like
|
For simplicity, we only allow "forall" to be at the front of a polytype. That is, expressions like
|
||||||
\\(a \\rightarrow \\forall b \\; . \\; b \\rightarrow b\\) are not valid polytypes as far as we're concerned.
|
\(a \rightarrow \forall b \; . \; b \rightarrow b\) are not valid polytypes as far as we're concerned.
|
||||||
|
|
||||||
It's key to observe that only some of the typing rules in the above table use polytypes (\\(\\sigma\\)). Whereas expressions
|
It's key to observe that only some of the typing rules in the above table use polytypes (\(\sigma\)). Whereas expressions
|
||||||
consisting of a single variable can be polymorphically typed, this is not true for function applications and case expressions.
|
consisting of a single variable can be polymorphically typed, this is not true for function applications and case expressions.
|
||||||
In fact, according to our rules, there is no way to introduce a polytype anywhere into our system!
|
In fact, according to our rules, there is no way to introduce a polytype anywhere into our system!
|
||||||
|
|
||||||
@@ -126,8 +127,8 @@ this is called __Let-Polymorphism__, which means that only in `let`/`in` express
|
|||||||
be given a polymorphic type. We, on the other hand, do not (yet) have `let`/`in` expressions, so our polymorphism
|
be given a polymorphic type. We, on the other hand, do not (yet) have `let`/`in` expressions, so our polymorphism
|
||||||
is further limited: only functions (and data type constructors) can be polymorphically typed.
|
is further limited: only functions (and data type constructors) can be polymorphically typed.
|
||||||
|
|
||||||
Let's talk about what __Inst__ does, and what "\\(\\sqsubseteq\\)" means.
|
Let's talk about what __Inst__ does, and what "\(\sqsubseteq\)" means.
|
||||||
First, why don't we go ahead and write the formal inference rule for \\(\\sqsubseteq\\):
|
First, why don't we go ahead and write the formal inference rule for \(\sqsubseteq\):
|
||||||
|
|
||||||
{{< latex >}}
|
{{< latex >}}
|
||||||
\frac
|
\frac
|
||||||
@@ -136,21 +137,21 @@ First, why don't we go ahead and write the formal inference rule for \\(\\sqsubs
{{< /latex >}}

In my opinion, this is one of the more confusing inference rules we have to deal with in Hindley-Milner.
In principle, though, it's not too hard to understand. \(\sigma' \sqsubseteq \sigma\) says "\(\sigma'\)
is more general than \(\sigma\)". Alternatively, we can interpret it as "\(\sigma\) is an instance of \(\sigma'\)".

What does it mean for one polytype to be more general than another? Intuitively, we might say that \(\forall a \; . \; a \rightarrow a\) is
more general than \(\text{Int} \rightarrow \text{Int}\), because the former type can represent the latter, and more. Formally,
we define this in terms of __substitutions__. A substitution \(\{\alpha \mapsto \tau \}\) replaces a variable
\(\alpha\) with a monotype \(\tau\). If we can use a substitution to convert one type into another, then the first
type (the one on which the substitution was performed) is more general than the resulting type. In our previous example,
since we can apply the substitution \(\{a \mapsto \text{Int}\}\) to get \(\text{Int} \rightarrow \text{Int}\)
from \(\forall a \; . \; a \rightarrow a\), the latter type is more general; using our notation,
\(\forall a \; . \; a \rightarrow a \sqsubseteq \text{Int} \rightarrow \text{Int}\).

That's pretty much all that the rule says, really. But what about the whole thing with \(\beta\) and \(\text{free}\)? The reason
for that part of the rule is that, in principle, we can substitute polytypes into polytypes. However, we can't
do this using \(\{ \alpha \mapsto \sigma \}\). Consider, for example:

{{< latex >}}
\{ a \mapsto \forall b \; . \; b \rightarrow b \} \; \text{Bool} \rightarrow a \rightarrow a \stackrel{?}{=}
@@ -158,9 +159,9 @@ do this using \\(\\{ \\alpha \\mapsto \\sigma \\}\\). Consider, for example:
{{< /latex >}}

Hmm, this is not good. Didn't we agree that we only want quantifiers at the front of our types? Instead, to make that substitution,
we only substitute the monotype \(b \rightarrow b\), and then add the \(\forall b\) at the front. But
to do this, we must make sure that \(b\) doesn't occur anywhere in the original type
\(\forall a \; . \; \text{Bool} \rightarrow a \rightarrow a\) (otherwise we can accidentally generalize
too much). So then, our concrete inference rule is as follows:

{{< latex >}}
@@ -175,9 +176,9 @@ too much). So then, our concrete inference rule is as follows:
{{< /latex >}}

In the above rule we:
1. Replaced \(a\) with \(b \rightarrow b\), getting rid of \(a\) in the quantifier.
2. Observed that \(b\) is not a free variable in the original type, and thus can be generalized.
3. Added the generalization for \(b\) to the front of the resulting type.

Now, __Inst__ just allows us to perform specialization / substitution as many times
as we need to get to the type we want.
@@ -187,12 +188,12 @@ as we need to get to the type we want.
Alright, now we have all these rules. How does this change our typechecking algorithm?
How about the following:

1. To every declared function, assign the type \(a \rightarrow ... \rightarrow y \rightarrow z\),
where
{{< sidenote "right" "arguments-note" "\(a\) through \(y\) are the types of the arguments to the function," >}}
Of course, there can be more or fewer than 25 arguments to any function. This is just a generalization;
we use as many input types as are needed.
{{< /sidenote >}} and \(z\) is the function's
return type.
2. We typecheck each declared function, using the __Var__, __Case__, __App__, and __Inst__ rules.
3. Whatever type variables we don't fill in, we assume can be filled in with any type,
@@ -213,15 +214,15 @@ defn testTwo = { if True 0 1 }

If we go through and typecheck them top-to-bottom, the following happens:

1. We start by assuming \(\text{if} : a \rightarrow b \rightarrow c \rightarrow d\),
\(\text{testOne} : e\) and \(\text{testTwo} : f\).
2. We look at `if`. We infer the type of `c` to be \(\text{Bool}\), and thus, \(a = \text{Bool}\).
We also infer that \(b = c\), since they occur in two branches of the same case expression.
Finally, we infer that \(c = d\), since whatever the case expression returns becomes the return
value of the function. Thus, we come out knowing that \(\text{if} : \text{Bool} \rightarrow b
\rightarrow b \rightarrow b\).
3. Now, since we never figured out \(b\), we use __Gen__: \(\text{if} : \forall b \; . \;
\text{Bool} \rightarrow b \rightarrow b \rightarrow b\). Like we'd want, `if` works with
all types, as long as both its inputs are of the same type.
4. When we typecheck the body of `testOne`, we use __Inst__ to turn the above type for `if`
into a single, monomorphic instance. Then, type inference proceeds as before, and all is well.
@@ -230,15 +231,15 @@ and all is well again.

So far, so good. But what if we started from the bottom, and went to the top?

1. We start by assuming \(\text{if} : a \rightarrow b \rightarrow c \rightarrow d\),
\(\text{testOne} : e\) and \(\text{testTwo} : f\).
2. We look at `testTwo`. We infer that \(a = \text{Bool}\) (since
we pass in `True` to `if`). We also infer that \(b = \text{Int}\), and that \(c = \text{Int}\).
Not yet sure of the return type of `if`, this is where we stop. We are left with
the knowledge that \(f = d\) (because the return type of `if` is the return type of `testTwo`),
but that's about it. Since \(f\) is no longer free, we don't generalize, and conclude that \(\text{testTwo} : f\).
3. We look at `testOne`. We infer that \(a = \text{Bool}\) (already known). We also infer
that \(b = \text{Bool}\), and that \(c = \text{Bool}\). But wait a minute! This is not right.
We are back to where we started, with a unification error!

What went wrong? I claim that we typechecked the functions that _used_ `if` before we typechecked `if` itself,
@@ -256,11 +257,11 @@ A transitive closure of a vertex in a graph is the list of all vertices reachabl

from that original vertex. Check out the <a href="https://en.wikipedia.org/wiki/Transitive_closure#In_graph_theory">
Wikipedia page on this</a>.
{{< /sidenote >}}
of the function dependencies. We define a function \(f\) to be dependent on another function \(g\)
if \(f\) calls \(g\). The transitive closure will help us find functions that are related indirectly.
For instance, if \(g\) also depends on \(h\), then the transitive closure of \(f\) will
include \(h\), even if \(f\) doesn't directly use \(h\).
2. We isolate groups of mutually dependent functions. If \(f\) depends on \(g\) and \(g\) depends on \(f\),
they are placed in one group. We then construct a dependency graph __of these groups__.
3. We compute a topological order of the group graph. This helps us typecheck the dependencies
of functions before checking the functions themselves. In our specific case, this would ensure
@@ -270,7 +271,7 @@ in a group.
4. We typecheck the function groups, and functions within them, following the above topological order.

To find the transitive closure of a graph, we can use [Warshall's Algorithm](https://cs.winona.edu/lin/cs440/ch08-2.pdf).
This algorithm, with complexity \(O(|V|^3)\), goes as follows:
{{< latex >}}
\begin{aligned}
& A, R^{(i)} \in \mathbb{B}^{n \times n} \\
@@ -284,12 +285,12 @@ This algorithm, with complexity \\(O(|V|^3)\\), goes as follows:
\end{aligned}
{{< /latex >}}

In the above notation, \(R^{(i)}\) is the \(i\)th matrix \(R\), and \(A\) is the adjacency
matrix of the graph in question. All matrices in the algorithm are from \(\mathbb{B}^{n \times n}\),
the set of \(n\) by \(n\) boolean matrices. Once this algorithm is complete, we get as output a
transitive closure adjacency matrix \(R^{(n)}\). Mutually dependent functions will be pretty easy to
isolate from this matrix. If \(R^{(n)}[i,j]\) and \(R^{(n)}[j,i]\), then the functions represented by vertices
\(i\) and \(j\) depend on each other.

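To make this concrete, here is a small, self-contained C++ sketch of Warshall's algorithm and the mutual-dependency test. This is my own illustration of the algorithm described above, not the compiler's actual code; the `BoolMatrix` alias and function names are invented for the example.

```cpp
#include <cstddef>
#include <vector>

using BoolMatrix = std::vector<std::vector<bool>>;

// Warshall's algorithm: given the adjacency matrix `a` of the dependency
// graph, return the adjacency matrix of its transitive closure. After
// iteration k, r[i][j] is true iff j is reachable from i using only
// intermediate vertices 0..k. The three nested loops over the n vertices
// give the O(|V|^3) bound mentioned above.
BoolMatrix transitiveClosure(const BoolMatrix& a) {
    std::size_t n = a.size();
    BoolMatrix r = a;
    for (std::size_t k = 0; k < n; k++)
        for (std::size_t i = 0; i < n; i++)
            for (std::size_t j = 0; j < n; j++)
                r[i][j] = r[i][j] || (r[i][k] && r[k][j]);
    return r;
}

// Functions i and j are mutually dependent -- and belong in one group --
// exactly when each is reachable from the other in the closure.
bool sameGroup(const BoolMatrix& r, std::size_t i, std::size_t j) {
    return r[i][j] && r[j][i];
}
```

Grouping every pair of vertices that passes this test yields exactly the mutually recursive groups described in step 2.
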
Once we've identified the groups, and
{{< sidenote "right" "group-graph-note" "constructed a group graph," >}}
@@ -595,7 +596,7 @@ a value of a non-existent type), but a mature compiler should prevent this from
On the other hand, here are the steps for function definitions:

1. Find the free variables of each function to create the ordered list of groups as described above.
2. Within each group, insert a general function type (like \(a \rightarrow b \rightarrow c\))
into the environment for each function.
3. Within each group (in the same pass) run typechecking
(including polymorphism, using the rules as described above); a sketch of this per-group driver follows the list.
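
Here is a rough C++ sketch of that driver. The types and names (`Group`, `TypeEnv`, `typecheckBody`) are hypothetical stand-ins rather than the compiler's real API; the point is only the shape of the loop: assume general types first, check bodies next, and generalize what remains.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for the compiler's actual data structures.
struct Function { std::string name; int arity = 0; /* body, etc. */ };
struct Group { std::vector<Function> functions; };
struct TypeEnv {
    // A real implementation would create fresh type variables and store
    // type schemes; these bodies are stubs for the sketch.
    void bindGeneralType(const std::string&, int) {} // a -> ... -> z
    void generalize(const std::string&) {}           // the Gen rule
};
inline void typecheckBody(TypeEnv&, const Function&) {} // Var/Case/App/Inst

// Visit groups in topological order, so that a function's dependencies are
// fully checked (and generalized) before the function itself is checked.
void typecheckProgram(std::vector<Group>& orderedGroups, TypeEnv& env) {
    for (Group& group : orderedGroups) {
        // Step 2: assign every function in the group a fresh, general type.
        for (const Function& f : group.functions)
            env.bindGeneralType(f.name, f.arity);
        // Step 3: typecheck each body against those mutual assumptions.
        for (const Function& f : group.functions)
            typecheckBody(env, f);
        // Whatever type variables were never constrained are generalized.
        for (const Function& f : group.functions)
            env.generalize(f.name);
    }
}
```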
@@ -706,8 +707,8 @@ is also updated to use topological ordering:

The above code uses the yet-unexplained `generalize` method. What's going on?

Observe that the __Var__ rule of the Hindley-Milner type system says that a variable \(x\)
can have a __polytype__ in the environment \(\Gamma\). Our `type_ptr` can only represent monotypes,
so we must change what `type_env` associates with names to a new struct for representing polytypes,
which we will call `type_scheme`. The `type_scheme` struct, just like the formal definition of
a polytype, contains zero or more "forall"-quantified type variables, followed by a monotype which
@@ -1,7 +1,8 @@
---
title: Compiling a Functional Language Using C++, Part 11 - Polymorphic Data Types
date: 2020-04-14T19:05:42-07:00
tags: ["C++", "Functional Languages", "Compilers"]
series: "Compiling a Functional Language using C++"
description: "In this post, we enable our compiler to understand polymorphic data types."
---
[In part 10]({{< relref "10_compiler_polymorphism.md" >}}), we managed to get our
@@ -41,11 +42,11 @@ empty.

Let's talk about `List` itself, now. I suggest that we ponder the following table:

\(\text{List}\)|\(\text{Cons}\)
----|----
\(\text{List}\) is not a type; it must be followed up with arguments, like \(\text{List} \; \text{Int}\).|\(\text{Cons}\) is not a list; it must be followed up with arguments, like \(\text{Cons} \; 3 \; \text{Nil}\).
\(\text{List} \; \text{Int}\) is in its simplest form.|\(\text{Cons} \; 3 \; \text{Nil}\) is in its simplest form.
\(\text{List} \; \text{Int}\) is a type.|\(\text{Cons} \; 3 \; \text{Nil}\) is a value of type \(\text{List} \; \text{Int}\).

I hope that the similarities are quite striking. I claim that
`List` is quite similar to a constructor `Cons`, except that it occurs
@@ -73,18 +74,18 @@ for functional programming) or <a href="https://coq.inria.fr/">Coq</a> (to see h
propositions and proofs can be encoded in a dependently typed language).
{{< /sidenote >}}
our type constructors will only take zero or more types as input, and produce
a type as output. In this case, writing \(\text{Type}\) becomes quite repetitive,
and we will adopt the convention of writing \(*\) instead. The types of such
constructors are called [kinds](https://en.wikipedia.org/wiki/Kind_(type_theory)).
Let's look at a few examples, just to make sure we're on the same page:

* The kind of \(\text{Bool}\) is \(*\): it does not accept any
type arguments, and is a type in its own right.
* The kind of \(\text{List}\) is \(*\rightarrow *\): it takes
one argument (the type of the things inside the list), and creates
a type from it.
* If we define a pair as `data Pair a b = { MkPair a b }`, then its
kind is \(* \rightarrow * \rightarrow *\), because it requires
two parameters.
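
For a C++-flavored intuition (a loose analogy of my own, not something from the type-theory literature): a type constructor of kind \(* \rightarrow *\) behaves much like a class template with one type parameter, which must be fully applied before it names an actual type.

```cpp
// Bool has kind *: it is a complete type on its own.
enum class Bool { False, True };

// List has kind * -> *: `List` by itself is not a type, but `List<Bool>` is,
// just as List Int is a type while bare List is not.
template <typename A>
struct List {
    A head;
    List<A>* tail;
};

// Pair has kind * -> * -> *: it needs two type arguments.
template <typename A, typename B>
struct Pair {
    A first;
    B second;
};

// Fully applied constructors have kind * and can be used as ordinary types.
List<Bool> example{Bool::True, nullptr};
```

The analogy is loose -- C++ templates can also take non-type parameters, and the language has no first-class notion of kinds -- but it captures the arity bookkeeping that kinds perform.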

As one final observation, we note that effectively, all we're doing is
@@ -93,24 +94,24 @@ type.

Let's now enumerate all the possible forms that (mono)types can take in our system:

1. A type can be a placeholder, like \(a\), \(b\), etc.
2. A type can be a type constructor, applied to
{{< sidenote "right" "zero-more-note" "zero or more arguments," >}}
It is convenient to treat regular types (like \(\text{Bool}\)) as
type constructors of arity 0 (that is, type constructors with kind \(*\)).
In effect, they take zero arguments and produce types (themselves).
{{< /sidenote >}} such as \(\text{List} \; \text{Int}\) or \(\text{Bool}\).
3. A function from one type to another, like \(\text{List} \; a \rightarrow \text{Int}\).

Polytypes (type schemes) in our system can be all of the above, but may also include a "forall"
quantifier at the front, generalizing the type (like \(\forall a \; . \; \text{List} \; a \rightarrow \text{Int}\)).

Let's start implementing all of this. Why don't we start with the change to the syntax of our language?
We have complicated the situation quite a bit. Let's take a look at the _old_ grammar
for data type declarations (this is going back as far as [part 2]({{< relref "02_compiler_parsing.md" >}})).
Here, \(L_D\) is the nonterminal for the things that go between the curly braces of a data type
declaration, \(D\) is the nonterminal representing a single constructor definition,
and \(L_U\) is a list of zero or more uppercase variable names:

{{< latex >}}
\begin{aligned}
@@ -126,7 +127,7 @@ This grammar was actually too simple even for our monomorphically typed language
Since functions are not represented using a single uppercase variable, it wasn't possible for us
to define constructors that accept as arguments anything other than integers and user-defined
data types. Now, we also need to modify this grammar to allow for constructor applications (which can be nested).
To do all of these things, we will define a new nonterminal, \(Y\), for types:

{{< latex >}}
\begin{aligned}
@@ -135,8 +136,8 @@ Y & \rightarrow N
\end{aligned}
{{< /latex >}}

We make it right-recursive (because the \(\rightarrow\) operator is right-associative). Next, we define
a nonterminal for all types _except_ those constructed with the arrow, \(N\).

{{< latex >}}
\begin{aligned}
@@ -147,15 +148,15 @@ N & \rightarrow ( Y )
{{< /latex >}}

The first of the above rules allows a type to be a constructor applied to zero or more arguments
(generated by \(L_Y\)). The second rule allows a type to be a placeholder type variable. Finally,
the third rule allows for any type (including functions, again) to occur between parentheses.
This is so that higher-order functions, like \((a \rightarrow b) \rightarrow a \rightarrow a\),
can be represented.

Unfortunately, the definition of \(L_Y\) is not as straightforward as we imagine. We could define
it as just a list of \(Y\) nonterminals, but this would make the grammar ambiguous: something
like `List Maybe Int` could be interpreted as "`List`, applied to types `Maybe` and `Int`", and as
"`List`, applied to type `Maybe Int`". To avoid this, we define a "type list element" \(Y'\),
which does not take arguments:

{{< latex >}}
@@ -166,7 +167,7 @@ Y' & \rightarrow ( Y )
\end{aligned}
{{< /latex >}}

We then make \(L_Y\) a list of \(Y'\):

{{< latex >}}
\begin{aligned}
@@ -176,7 +177,7 @@ L_Y & \rightarrow \epsilon
{{< /latex >}}

Finally, we update the rules for the data type declaration, as well as for a single
constructor. In these new rules, we use \(L_T\) to mean a list of type variables.
The rules are as follows:

{{< latex >}}
@@ -335,7 +336,7 @@ it will be once the type manager generates its first type variable, and things w
wanted type constructors to be monomorphic (but generic, with type variables) we'd need to internally
instantiate fresh type variables for every user-defined type variable, and substitute them appropriately.
{{< /sidenote >}}
as we have discussed above with \(\text{Nil}\) and \(\text{Cons}\).
To accommodate this, we also add all type variables to the "forall" quantifier
of a new type scheme, whose monotype is our newly assembled function type. This
type scheme is what we store as the type of the constructor.
@@ -1,7 +1,8 @@
---
title: Compiling a Functional Language Using C++, Part 12 - Let/In and Lambdas
date: 2020-06-21T00:50:07-07:00
tags: ["C++", "Functional Languages", "Compilers"]
series: "Compiling a Functional Language using C++"
description: "In this post, we extend our language with let/in expressions and lambda functions."
---

@@ -1,7 +1,8 @@
---
title: Compiling a Functional Language Using C++, Part 13 - Cleanup
date: 2020-09-19T16:14:13-07:00
tags: ["C++", "Functional Languages", "Compilers"]
series: "Compiling a Functional Language using C++"
description: "In this post, we clean up our compiler."
---

content/blog/agda_expr_pattern.md (new file, 230 lines)
@@ -0,0 +1,230 @@
---
title: "The \"Deeply Embedded Expression\" Trick in Agda"
date: 2024-03-11T14:25:52-07:00
tags: ["Agda"]
description: "In this post, I talk about a trick I developed to simplify certain Agda proofs."
---

I've been working on a relatively large Agda project for a few months now,
and I'd like to think that I've become quite proficient. Recently, I came
up with a little trick to help simplify some of my proofs, and it seems like
this trick might have broader applications.

In my head, I call this trick 'Deeply Embedded Expressions'. Before I introduce
it, let me explain the part of my work that motivated developing the trick.

### Proofs about Map Operations

A part of my Agda project is the formalization of simple key-value maps.
I model key-value maps as lists of key-value pairs. On top of this, I implement
two operations: `join` and `meet`, which in my code are denoted using `⊔` and `⊓`.
When "joining" two maps, you create a new map that has the keys from both input ones.
If a key is only present in one of the input maps, then the new "joined" map
has the same value for that key as the original. On the other hand, if the key
is present in both maps, then its value in the new map is the result of "joining"
the original values. The "meet" operation is similar, except instead of taking
keys from either map, the result only has keys that were present in both maps,
"meeting" their values. In a way, "join" and "meet" are similar to set union
and intersection --- but they also operate on the values in the map.
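
Before moving on to the proofs, here is a small, runnable sketch of "join" in C++, just to pin down the behavior described above. This is my own illustration -- the post's actual definitions are the Agda `⊔` and `⊓` on association lists, and the `join` and `combine` names here are invented:

```cpp
#include <algorithm>
#include <functional>
#include <iostream>
#include <map>
#include <string>

// "Join" two maps: keep keys from either side; when a key appears in both,
// combine the two values. "Meet" is analogous, but keeps only shared keys.
std::map<std::string, int> join(
        const std::map<std::string, int>& m1,
        const std::map<std::string, int>& m2,
        const std::function<int(int, int)>& combine) {
    std::map<std::string, int> result = m1;
    for (const auto& [key, value] : m2) {
        auto it = result.find(key);
        if (it == result.end())
            result.emplace(key, value);              // key only in m2
        else
            it->second = combine(it->second, value); // key in both: combine
    }
    return result;
}

int main() {
    std::map<std::string, int> m1 = {{"a", 1}, {"b", 2}};
    std::map<std::string, int> m2 = {{"b", 10}, {"c", 3}};
    // With max as the value-level "join", this prints a -> 1, b -> 10, c -> 3.
    for (const auto& [k, v] : join(m1, m2, [](int x, int y) { return std::max(x, y); }))
        std::cout << k << " -> " << v << "\n";
}
```

When the values come from a lattice and `combine` is their least upper bound, this matches the description above; the Agda version additionally produces the evidence that the proofs below manipulate.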

Given these operations, I need to prove certain properties of these operations.
The most inconvenient to prove is probably associativity:

{{< codelines "agda" "agda-spa/Lattice/Map.agda" 752 752 >}}

This property is, in turn, proven using two 'subset' relations on maps, defined
in the usual way.

{{< codelines "agda" "agda-spa/Lattice/Map.agda" 755 755 >}}
{{< codelines "agda" "agda-spa/Lattice/Map.agda" 774 774 >}}

The reason this property is so inconvenient to prove is that there are a
lot of cases to consider. That's because your claim, in words, is something
like:

> Suppose a key-value pair `k , v` is present in `(m₁ ⊔ m₂) ⊔ m₃`. Show
> that `k , v` is also in `m₁ ⊔ (m₂ ⊔ m₃)`.

The only thing you can really do with `k , v` is figure out how it got into
the three-way union map: did it come from `m₁`, `m₂`, or `m₃`, or perhaps
several of them? The essence of the proof boils down to repeated uses
of the fact that for a key to be in the union, it must be in at least one
of the two maps. You end up with witnesses, repeated application of the same
lemmas, lots of `let`-expressions or `where` clauses. It's relatively tedious
and, what's more frustrating, __driven entirely by the structure of the
map operations__. It seems like one shouldn't have to mimic that structure
using boilerplate lemmas. So I started looking at other ways.

### Case Analysis using GADTs

A "proof by cases" in a dependently typed language like Agda usually brings
to mind pattern matching. So, here's an idea: what if for each expression
involving `⊔` and `⊓`, we had some kind of data type, and that data type
had exactly as many inhabitants as there are cases to analyze? A data type
corresponding to `m₁ ⊔ m₂` might have three cases, and the one for
`(m₁ ⊔ m₂) ⊔ m₃` might have seven. Each case would contain the information
necessary to perform the proof.

A data type whose "shape" depends on an expression in the way I described above
is said to be _indexed by_ that expression. In Agda, GADTs are used to create
indexed types. My initial attempt was something like this:

```Agda
data Provenance (k : A) : B → Map → Set (a ⊔ℓ b) where
    single : ∀ {v : B} {m : Map} → (k , v) ∈ m → Provenance k v m
    in₁ : ∀ {v : B} {m₁ m₂ : Map} → Provenance k v m₁ → ¬ k ∈k m₂ → Provenance k v (m₁ ⊔ m₂)
    in₂ : ∀ {v : B} {m₁ m₂ : Map} → ¬ k ∈k m₁ → Provenance k v m₂ → Provenance k v (m₁ ⊔ m₂)
    bothᵘ : ∀ {v₁ v₂ : B} {m₁ m₂ : Map} → Provenance k v₁ m₁ → Provenance k v₂ m₂ → Provenance k (v₁ ⊔ v₂) (m₁ ⊔ m₂)
    bothⁱ : ∀ {v₁ v₂ : B} {m₁ m₂ : Map} → Provenance k v₁ m₁ → Provenance k v₂ m₂ → Provenance k (v₁ ⊓ v₂) (m₁ ⊓ m₂)
```

I was planning on a proof of associativity (in one direction) that looked
something like the following --- pattern matching on cases from the new
`Provenance` type.

```Agda
⊔-assoc₁ : ((m₁ ⊔ m₂) ⊔ m₃) ⊆ (m₁ ⊔ (m₂ ⊔ m₃))
⊔-assoc₁ k v k,v∈m₁₂m₃
    with get-Provenance k,v∈m₁₂m₃
... | in₂ k∉km₁₂ (single v∈m₃) = ...
... | in₁ (in₂ k∉km₁ (single v∈m₂)) k∉km₃ = ...
... | bothᵘ (in₂ k∉km₁ (single {v₂} v₂∈m₂)) (single {v₃} v₃∈m₃) = ...
... | in₁ (in₁ (single v₁∈m₁) k∉km₂) k∉km₃ = ...
... | bothᵘ (in₁ (single {v₁} v₁∈m₁) k∉km₂) (single {v₃} v₃∈m₃) = ...
... | in₁ (bothᵘ (single {v₁} v₁∈m₁) (single {v₂} v₂∈m₂)) k∉km₃ = ...
... | bothᵘ (bothᵘ (single {v₁} v₁∈m₁) (single {v₂} v₂∈m₂)) (single {v₃} v₃∈m₃) = ...
```

However, this doesn't work. Agda has trouble figuring out which cases of
the `Provenance` GADT are allowed, and which aren't. Is `m₁ ⊔ m` a single map,
fit for the `single` case, or should it be broken up into more cases like
`in₁` and `in₂`? In general, is some expression of type `Map` the "bottom"
of our recursion, or should it be analyzed further?

The above hints at what's wrong. The mistake here is requiring Agda to infer
the shape of our "join" and "meet" expressions from arbitrary terms.
The set of expressions that we want to reason about is much more restricted --
each expression is always built from three components: "meet", "join", and base-case maps being
combined using these operations.

### Defining an Expression Data Type
If you're like me, and have spent years of your life around programming language
theory and domain specific languages (DSLs), the last sentence of the previous
section may be ringing a bell. In fact, it's eerily similar to how we describe
recursive grammars:

> An expression of interest is either,
> * A map
> * The "join" of two expressions
> * The "meet" of two expressions

Mathematically, we might write this as follows:

{{< latex >}}
\begin{array}{rcll}
e & ::= & m & \text{(maps)} \\
  & | & e \sqcup e & \text{(join)} \\
  & | & e \sqcap e & \text{(meet)}
\end{array}
{{< /latex >}}

And in Agda,

{{< codelines "agda" "agda-spa/Lattice/Map.agda" 543 546 >}}

In the code, I used the set union and intersection operators
to avoid overloading the `⊔` and `⊓` more than they already are.

We have just defined a very small expression language. In computer science,
a language is called _deeply embedded_ if a data type (or class hierarchy, or
other 'explicit' representation) is defined for its syntax in the _host_
language (Agda, in our case). This is in contrast to a _shallow embedding_, in
which expressions in the (new) language are just expressions in the host
language.

In this sense, our `Expr` is deeply embedded --- we defined a new container for it,
and `_∪_` is a distinct entity from `_⊔_`. Our first attempt was a shallow
embedding. That fell through because the Agda language is much broader than
our expression language, which makes case analysis very difficult.

An obvious thing to do with an expression is to evaluate it. This will be
important for our proofs, because it will establish a connection between
expressions (created via `Expr`) and actual Agda objects that we need to
reason about at the end of the day. The notation \(\llbracket e \rrbracket\)
is commonly used in PL circles for evaluation (it comes from
[Denotational Semantics](https://en.wikipedia.org/wiki/Denotational_semantics)).
Thus, my Agda evaluation function is written as follows:

{{< codelines "agda" "agda-spa/Lattice/Map.agda" 586 589 >}}
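
Written out equationally -- a sketch in the denotational notation above; the authoritative definition is the Agda code just shown -- evaluation leaves maps alone and interprets the abstract operators by the real ones:

{{< latex >}}
\begin{aligned}
\llbracket m \rrbracket &= m \\
\llbracket e_1 \cup e_2 \rrbracket &= \llbracket e_1 \rrbracket \sqcup \llbracket e_2 \rrbracket \\
\llbracket e_1 \cap e_2 \rrbracket &= \llbracket e_1 \rrbracket \sqcap \llbracket e_2 \rrbracket
\end{aligned}
{{< /latex >}}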

On top of this, here is my actual implementation of the `Provenance` data type.
This time, it's indexed by expressions in `Expr`, which makes it much easier to
pattern match on instances:

{{< codelines "agda" "agda-spa/Lattice/Map.agda" 591 596 >}}

Note that we have to use the evaluation function to be able to use
operators such as `∈`. That's because these are still defined on maps,
and not expressions.

With this, I was able to write my proof in the way that I had hoped. It has
the exact form of my previous sketch-of-proof.

{{< codelines "agda" "agda-spa/Lattice/Map.agda" 755 773 "" "**(click here to see the full example, including each case's implementation)**" >}}

### The General Trick

So far, I've presented a problem I faced in my Agda proof and a solution for
that problem. However, it may not be clear how useful the trick is beyond
this narrow case that I've encountered. The way I see it, the "deeply embedded
expression" trick is applicable whenever you have data that is constructed
from some fixed set of cases, and when proofs about that data need to follow
the structure of these cases. Thus, examples include:

* **Proofs about the origin of keys in a map (this one):** the "data" is the
key-value map that is being analyzed. The enumeration of cases for this
map is driven by the structure of the "join" and "meet" operations used
to build the map.
* **Automatic derivation of function properties:** suppose you're interested
in working with continuous functions. You also know that the addition,
subtraction, and multiplication of two functions preserves continuity.
Of course, the constant function \(x \mapsto c\) and the identity function
\(x \mapsto x\) are continuous too. You may define an expression data type
that has cases for these operations. Then, your evaluation function could
transform the expression into a plain function, and a proof on the
structure of the expression can be used to verify the resulting function's
continuity.
* **Proof search for algebraic expressions:** suppose that you wanted to
automatically find solutions for certain algebraic (in)equalities. Instead
of using some sort of reflection mechanism to inspect terms and determine
how constraints should be solved, you might represent the set of operations
in your equation system as cases in a data type. You can then use regular
Agda code to manipulate terms; an evaluation function can then be used
to recover the equations in Agda, together with witnesses justifying the
solution.

There are some pretty clear commonalities among the examples above, which
are the ingredients to this trick:

* __The expression:__ you create a new expression data type that encodes all
the operations (and base cases) on your data. In my example, this is
the `Expr` data type.
* __The evaluation function__: you provide a way to lower the expression
you've defined back into a regular Agda term. This connects your (abstract)
operations to their interpretation in Agda. In my example, this is the
`⟦_⟧` function.
* __The proofs__: you write proofs that consider only the fixed set of cases
encoded by the data type (`Expr`), but state properties about the
_evaluated_ expression. In my example, this is `Provenance` and
the `Expr-Provenance` function. Specifically, the `Provenance` data type connects
expressions and the terms they evaluate to, because it is indexed by expressions,
but contains data in the form `k ∈k ⟦ e₂ ⟧`.

### Conclusion
I'll be the first to admit that this trick is quite situational, and may
not be as far-reaching as the ["Is Something" pattern]({{< relref "agda_is_pattern" >}})
I wrote about before, which seems to occur far more in the wild. However, there
have now been two times when I personally reached for this trick, which seems
to suggest that it may be useful to someone else.

I hope you've found this useful. Happy (dependently typed) programming!
content/blog/agda_hugo.md (new file, 559 lines)
@@ -0,0 +1,559 @@
---
title: "Integrating Agda's HTML Output with Hugo"
date: 2024-05-30T00:29:26-07:00
tags: ["Agda", "Hugo", "Ruby"]
---

One of my favorite things about Agda is its clickable HTML pages. If you don't
know what they are, they're pages like [`Data.List.Properties`](https://agda.github.io/agda-stdlib/master/Data.List.Properties.html);
they just give the code from a particular Agda file, but make every identifier
clickable. Then, if you see some variable or function that you don't know, you
can just click it and jump right to it! It makes exploring the documentation
a lot smoother. I've found that these HTML pages provide all the information
I need for writing proofs.

Recently, I've been writing a fair bit about Agda; mostly about the patterns
that I've learned about, such as the ["is something" pattern]({{< relref "agda_is_pattern" >}})
and the ["deeply embedded expression" trick]({{< relref "agda_expr_pattern" >}}).
I've found myself wanting to click on definitions in my own code blocks; recently,
I got this working, and I wanted to share how I did it, in case someone else
wants to integrate Agda into their own static website. Though my stack
is based on Hugo, the general idea should work with any other static site
generator.

### TL;DR and Demo

I wrote a script to transfer links from an Agda HTML file into Hugo's HTML
output, making it possible to embellish "plain" Hugo output with Agda's
'go-to-definition links'. It looks like this. Here's an Agda code block
defining an 'expression' data type, from a project of mine:

{{< codelines "Agda" "agda-spa/Lattice/Map.agda" 543 546 >}}

And here's the denotational semantics for that expression:

{{< codelines "Agda" "agda-spa/Lattice/Map.agda" 586 589 >}}

Notice that you can click `Expr`, `_∪_`, `⟦`, etc.! All of this integrates
with my existing Hugo site, and only required a little bit of additional
metadata to make it work. The conversion is implemented as
[a Ruby script](https://dev.danilafe.com/Web-Projects/blog-static/src/commit/04f12b545d5692a78b1a2f13ef968417c749e295/agda.rb);
this script transfers the link structure from an Agda-generated documentation
HTML file onto lightly-annotated Hugo code blocks.

To use the script, your Hugo theme (or your Markdown content) must
annotate the code blocks with several properties:
* `data-agda-block`, which marks code that needs to be processed.
* `data-file-path`, which tells the script what Agda file provided the
code in the block, and therefore what Agda HTML file should be searched
for links.
* `data-first-line` and `data-last-line`, which tell the script what
section of the Agda HTML file should be searched for said links.

Given this -- and a couple of other assumptions, such as that all Agda
projects are in a `code/<project>` folder -- the script post-processes
the HTML files automatically. Right now, the solution is pretty tailored to my
site and workflow, but the core of the script -- the piece that transfers links
from an Agda HTML file into a syntax-highlighted Hugo HTML block -- should
be fairly reusable.

Now, the details.

### The Constraints
The goal was simple: to allow the code blocks on my Hugo-generated site to
have links that take the user to the definition of a given symbol.
Specifically, if the symbol occurs somewhere on the same blog page, the link
should take the user there (and not to a regular `Module.html` file). That
way, the reader can not only get to the code that they want to see, but also
have a chance to read the surrounding prose in properly-rendered Markdown.

Next, unlike standard "literate Agda" files, my blog posts are not single
`.agda` files with Markdown in comments. Rather, I use regular Hugo
Markdown, and present portions of an existing project, weaving together many
files, and showing the fragments out of order. So, my tool needs to support
links that come from distinct modules, in any order.

Additionally, I've recently been writing a whole series about an Agda project
of mine; in this series, I gradually build up to the final product, explaining
one or two modules at a time. I would expect that links on pages in this series
could jump to other pages in the same series: if I cover module `A` in part 1,
then write `A.f` in part 2, clicking on `A` -- and maybe `f` -- should take
the reader back to the first part's page; once again, this would help provide
them with the surrounding explanation.

Finally, I wanted the Agda code to appear exactly the same as any other code
on my site, including the Hugo-provided syntax highlighting and theme. This
ruled out just copy-pasting pieces of the Agda-generated HTML in place of
code blocks on my page (and redirecting the links). Though it was not
a hard requirement, I also hoped to include Agda code in the same
manner that I include all other code: [my `codelines` shortcode]({{< relref "codelines" >}}).
In brief, the `codelines` shortcode creates a syntax-highlighted code block,
as well as a surrounding "context" that says what file the code is from,
which lines are listed, and where to find the full code (e.g., on my Git server).
It looks something like this:

{{< codelines "Agda" "agda-spa/Language/Base.agda" 12 20 >}}

In summary:

1. I want to create cross-links between symbols in Agda blocks in a blog post.
2. These code blocks could include code from disjoint files, and be out of order.
3. Code blocks among a whole series of posts should be cross-linked too.
4. The code blocks should be syntax highlighted the same way as the rest of the
code on the site.
5. Ideally, I should be able to use my regular method for referencing code.

I've hit all of these requirements; now it's time to dig into how I got there.

### Implementation

#### Processing Agda's HTML Output
It's pretty much a no-go to try to resolve Agda from Hugo, or perform some
sort of "heuristic" to detect cross-links. Agda is a very complex programming
language, and Hugo's templating engine, though powerful, is just not
up to this task. Fortunately, Agda has support for
[HTML output using the `--html` flag](https://agda.readthedocs.io/en/v2.6.4.3-r1/tools/generating-html.html).
As a build step, I can invoke Agda on files that are referenced by my blog,
and generate HTML. This would decidedly slow down the site build process,
but it would guarantee accurate link information.

On the other hand, to satisfy the 4th constraint, I need to somehow mimic --
or keep -- the format of Hugo's existing HTML output. The easiest way to
do this without worrying about breaking changes and version incompatibility
is to actually use the existing syntax-highlighted HTML, and annotate it
with links as I discover them. Effectively, what I need to do is a "link
transfer": I need to identify regions of code that are highlighted in Agda's HTML,
find those regions in Hugo's HTML output, and mark them with links. In addition,
I'll need to fix up the links themselves: the HTML output assumes that each
Agda file is its own HTML page, but this is ruled out by my second constraint.

As a little visualization, the overall problem looks something like this:

````Agda {linenos=table}
-- Agda's HTML output (blocks of 't' are links):
-- |tttttt| |tttt| |t| |t| |ttttt|
module ModX ( x : T ) where
-- |tttttt| |tt|t| |t| |t| |ttttt|
-- Hugo's HTML output (blocks of 't' are syntax highlighting spans)
````

Both Agda and Hugo output a preformatted code block, decorated with various
inline HTML that indicates information (token color for Hugo; symbol IDs and
links in Agda). However, Agda and Hugo do not use the same process to create
this decorated output; it's entirely possible -- and not uncommon -- for
Hugo and Agda to produce misaligned HTML nodes. In my diagram above,
this is reflected as `ModX` being considered a single token by Agda, but
two tokens (`Mod` and `X`) by the syntax highlighter. As a result, it's
difficult to naively iterate the two HTML formats in parallel.

What I ended up doing is translating Agda's HTML output into offsets and data
about the code block's _plain text_ -- the source code being decorated.
Both the Agda and Hugo HTML describe the same code; thus, the plain text
is the common denominator between the two.
{#plain-text}

I wrote a Ruby script to extract the decorations from the Agda output; here
it is in slightly abridged form. You can find the [original `agda.rb` file here](https://dev.danilafe.com/Web-Projects/blog-static/src/commit/04f12b545d5692a78b1a2f13ef968417c749e295/agda.rb).

```Ruby
# Traverse the preformatted Agda block in the given Agda HTML file
# and find which textual ranges have IDs and links to other ranges.
# Store this information in a hash, line => links[]
def process_agda_html_file(file)
  document = Nokogiri::HTML.parse(File.open(file))
  pre_code = document.css("pre.Agda")[0]

  # The traversal is postorder; we always visit children before their
  # parents, and we visit leaves in sequence.
  line_infos = []
  offset = 0 # Column index within the current Agda source code line
  line = 1
  pre_code.traverse do |at|
    # Text nodes are always leaves; visiting a new leaf means we've advanced
    # in the text by the length of that text. However, if there are newlines
    # -- since this is a preformatted block -- we also advanced by a line.
    # At this time, do not support links that span multiple lines, but
    # Agda doesn't produce those either.
    if at.text?
      if at.content.include? "\n"
        raise "no support for links with newlines inside" if at.parent.name != "pre"

        # Increase the line and track the final offset. Written as a loop
        # in case we eventually want to add some handling for the pieces
        # sandwiched between newlines.
        at.content.split("\n", -1).each_with_index do |bit, idx|
          line += 1 unless idx == 0
          offset = bit.length
        end
      else
        # It's not a newline node. Just adjust the offset within the plain text.
        offset += at.content.length
      end
    elsif at.name == "a"
      # Agda emits both links and things-to-link-to as 'a' nodes.

      line_info = line_infos.fetch(line) { line_infos[line] = [] }
      href = at.attribute("href")
      id = at.attribute("id")
      if href or id
        new_node = { :from => offset-at.content.length, :to => offset }
        new_node[:href] = href if href
        new_node[:id] = id if id

        line_info << new_node
      end
    end
  end
  return line_infos
end
```
|
|
||||||
|
This script takes an Agda HTML file and returns a map in which each line
of the Agda source code is associated with a list of ranges; the ranges
indicate links or places that can be linked to. For example, for the `ModX`
example above, the script might produce:

```Ruby
3 => [
  { :from => 3, :to => 9, :id => "..." },       # Agda creates <a> nodes even for keywords.
  { :from => 12, :to => 16, :id => "ModX-id" }, # The IDs Agda generates aren't usually this nice.
  { :from => 20, :to => 21, :id => "x-id" },
]
```
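
As a quick usage sketch (the file path here is hypothetical), the returned
structure can be walked like any nested array of hashes:

```Ruby
line_infos = process_agda_html_file("code/project1/html/ModX.html")

line_infos.each_with_index do |links, line|
  next unless links
  links.each do |l|
    puts "line #{line}, cols #{l[:from]}..#{l[:to]}: #{l[:href] || l[:id]}"
  end
end
```
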
#### Modifying Hugo's HTML

Given such line information, the next step is to transfer it onto existing
Hugo HTML files. Within a file, I've made my `codelines` shortcode emit
custom attributes that can be used to find syntax-highlighted Agda code.
The chief such attribute is `data-agda-block`; my script traverses all
elements with this attribute.

```Ruby
def process_source_file(file, document)
  # Process each highlight group that's been marked as an Agda file.
  document.css('div[data-agda-block]').each do |t|
    # ...
```

To figure out which Agda HTML file to use, and which lines to search for links,
the script also expects some additional attributes.

```Ruby
# ...
first_line, last_line = nil, nil

if first_line_attr = t.attribute("data-first-line")
  first_line = first_line_attr.to_s.to_i
end
if last_line_attr = t.attribute("data-last-line")
  last_line = last_line_attr.to_s.to_i
end

if first_line and last_line
  line_range = first_line..last_line
else
  # no line number attributes = the code block contains the whole file
  line_range = 1..
end

full_path = t.attribute("data-file-path").to_s
# ...
```

At this point, the Agda file could be in some nested directory, like
`A/B/C/File.agda`. However, the project root -- the place where Agda modules
are compiled from -- could be any one of the folders `A`, `B`, or `C`.
Thus, the fully qualified module name for `File.agda` could be `File`,
`C.File`, `B.C.File`, or `A.B.C.File`. Since Agda's HTML output produces
files named after the fully qualified module name, the script needs to guess
what the module file is. This is where some conventions come into play:
I keep my code in folders directly nested within a top-level `code` directory;
thus, I'll have folders `project1` or `project2` inside `code`, and those
will always be project roots. As a result,
I guess that the first directory relative to `code` should be discarded,
while the rest should be included in the path. The only exception to this is
Git submodules: if an Agda file is included using a submodule, the root
directory of the submodule is considered the Agda project root. My Hugo theme
indicates the submodule using an additional `data-base-path` attribute; in all,
that leads to the following logic:

```Ruby
# ...
full_path_dirs = Pathname(full_path).each_filename.to_a
base_path = t.attribute("data-base-path").to_s
base_dir_depth = 0
if base_path.empty?
  # No submodules were used. Assume code/<X> is the root.
  # The path of the file is given relative to `code`, so we need
  # to strip only the one outermost directory.
  base_dir_depth = 1
  base_path = full_path_dirs[0]
else
  # The code is in a submodule. Assume that the base path / submodule
  # root is the Agda module root, and ignore all folders before that.
  base_path_dirs = Pathname(base_path).each_filename.to_a
  base_dir_depth = base_path_dirs.length
end
# ...
```

With that, the script determines the actual HTML file path ---
by assuming that there's an `html` folder in the same place as the Agda
project root --- and runs the above `process_agda_html_file`:

```Ruby
# ...
dirs_in_base = full_path_dirs[base_dir_depth..-1]
html_file = dirs_in_base.join(".").gsub(/\.agda$/, ".html")
html_path = File.join(["code", base_path, "html", html_file])

agda_info = process_agda_html_file(html_path)
# ...
```
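
To make the path arithmetic concrete, here's a worked example with invented
directory names (a non-submodule file, so the first component is treated as
the project root):

```Ruby
require 'pathname'

full_path = "project1/A/B/File.agda" # from data-file-path, relative to `code`
full_path_dirs = Pathname(full_path).each_filename.to_a # ["project1", "A", "B", "File.agda"]
base_dir_depth = 1                    # no submodule: code/<X> is the root
base_path = full_path_dirs[0]         # "project1"

dirs_in_base = full_path_dirs[base_dir_depth..-1]           # ["A", "B", "File.agda"]
html_file = dirs_in_base.join(".").gsub(/\.agda$/, ".html") # "A.B.File.html"
File.join(["code", base_path, "html", html_file])           # => "code/project1/html/A.B.File.html"
```
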
The next step is specific to the output of Hugo's syntax highlighter,
[Chroma](https://github.com/alecthomas/chroma). When line numbers are enabled
-- and they are on my site -- Chroma generates a table that, at some point,
contains a bunch of `span` HTML nodes, each with the `line` class. Each
such `span` corresponds to a single line of output; naturally, the first
one contains the code from `first_line`, the second from `first_line + 1`,
and so on until `last_line`. This is quite convenient, because it saves the
headache of counting newlines the way that the Agda processing code above has to.

For each line of syntax-highlighted code, the script retrieves the corresponding
list of links that were collected from the Agda HTML file.

```Ruby
# ...
lines = t.css("pre.chroma code[data-lang] .line")
lines.zip(line_range).each do |line, line_no|
  line_info = agda_info[line_no]
  next unless line_info

  # ...
```

The subsequent traversal -- which picks out the plain text of the Agda file
as [reasoned above](#plain-text) -- is very similar to the previous
one. Here too there's an `offset` variable, which gets incremented by
the length of each new plain text piece. Since we know the lines match up
to `span`s, there's no need to count newlines.

```Ruby
# ...
offset = 0
line.traverse do |lt|
  if lt.text?
    content = lt.content
    new_offset = offset + content.length

    # ...
```

At this point, we have a line number, and an offset within that line
that describes the portion of the text under consideration. We can
traverse all the links for the line, and find ones that mark a piece of
text somewhere in this range. For the time being -- since inserting overlapping
spans is quite complicated -- I require the links to lie entirely within a
particular plain text region. As a result, if Chroma splits a single Agda
identifier into several tokens, it will not be linked. For now, this seems
like the most conservative and safe approach.

```Ruby
# ...
matching_links = line_info.filter do |link|
  link[:from] >= offset and link[:to] <= new_offset
end
# ...
```
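
As a quick sanity check of that containment filter, consider the `ModX` example
again, with a (made-up) link covering columns 12 through 16 of its line:

```Ruby
line_links = [{ :from => 12, :to => 16, :id => "ModX-id" }]

# A Chroma token covering only "Mod" (columns 12..15) doesn't contain the
# whole link, so nothing matches and the identifier is left unlinked:
line_links.filter { |l| l[:from] >= 12 and l[:to] <= 15 } # => []

# A token covering all of "ModX" (columns 12..16) contains the link entirely:
line_links.filter { |l| l[:from] >= 12 and l[:to] <= 16 } # => [{:from => 12, ...}]
```
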
All that's left is to slice up the plain text fragment into a bunch of HTML
pieces: the substrings that are links will turn into `a` HTML nodes, while
the substrings that are "in between" the links will be left over as plain
text nodes. The code to do so is relatively verbose, but not all that complicated.

```Ruby
replace_with = []
replace_offset = 0
matching_links.each do |match|
  # The link's range is an offset from the beginning of the line,
  # but the text piece we're splitting up might be partway into
  # the line. Convert the link coordinates to piece-relative ones.
  relative_from = match[:from] - offset
  relative_to = match[:to] - offset

  # If the previous link ended some time before the new link
  # began (or if the current link is the first one, and is not
  # at the beginning), ensure that the plain text "in between"
  # is kept.
  replace_with << content[replace_offset...relative_from]

  tag = (match.include? :href) ? 'a' : 'span'
  new_node = Nokogiri::XML::Node.new(tag, document)
  if match.include? :href
    # For nodes with links, note what they're referring to, so
    # we can adjust their hrefs when we assign global IDs.
    href = match[:href].to_s
    new_node['href'] = note_used_href file, new_node, href
  end
  if match.include? :id
    # For nodes with IDs visible in the current Hugo file, we'll
    # want to redirect links that previously went to other Agda
    # module HTML files. So, note the ID that we want to redirect,
    # and pick a new unique ID to replace it with.
    id = match[:id].to_s
    new_node['id'] = note_defined_href file, "#{html_file}##{id}"
  end
  new_node.content = content[relative_from...relative_to]

  replace_with << new_node
  replace_offset = relative_to
end
replace_with << content[replace_offset..-1]
```

There's a little bit of a subtlety in the above code: specifically, I use
the `note_used_href` and `note_defined_href` methods. These are important
for rewriting links. Like I mentioned earlier, Agda's HTML output assumes
that each source file should produce a single HTML file -- named after its
qualified module -- and creates links accordingly. However, my blog posts
interweave multiple source files. Some links that would've jumped to a different
file must now point to an internal identifier within the page. Another
important aspect of the transformation is that, since I'm pulling in HTML
from several distinct files, it's not guaranteed that every node will have a unique
`id` attribute. After all, Agda just assigns sequential numbers to each
node that it generates; it would only take, e.g., including the first line
from two distinct modules to end up with two nodes with `id="1"`.

The solution is then twofold:

1. Track all the nodes referencing a particular `href` (made up of an HTML
   file and a numerical identifier, like `File.html#123`). When we pick
   new IDs -- thus guaranteeing their uniqueness -- we'll visit all the
   nodes that refer to the old ID and HTML file, and update their `href`.
2. Track all existing Agda HTML IDs that we're inserting. If we transfer
   an `<a id="1234">` onto the Hugo content, we know we'll need to pick a new
   ID for it (since `1234` need not be unique), and that we'll need to redirect
   the other links to that new ID as the previous bullet describes.

Here's how these two methods work:

```Ruby
def note_defined_href(file, href)
  file_hrefs = @local_seen_hrefs.fetch(file) do
    @local_seen_hrefs[file] = {}
  end

  uniq_id = file_hrefs.fetch(href) do
    new_id = "agda-unique-ident-#{@id_counter}"
    @id_counter += 1
    file_hrefs[href] = new_id
  end

  unless @global_seen_hrefs.include? href
    @global_seen_hrefs[href] = { :file => file, :id => uniq_id }
  end

  return uniq_id
end

def note_used_href(file, node, href)
  ref_list = @nodes_referencing_href.fetch(href) { @nodes_referencing_href[href] = [] }
  ref_list << { :file => file, :node => node }
  return href
end
```

Note that these use instance variables: they are methods on a `FileGroup` class.
I've omitted the various classes I've declared from the above code for brevity,
but here it makes sense to show them. Like I mentioned earlier, you can
view the [complete code here](https://dev.danilafe.com/Web-Projects/blog-static/src/commit/6a168f2fe144850ed3a81b796e07266cbf80f382/agda.rb).

Interestingly, `note_defined_href` makes use of _two_ global maps:
`@local_seen_hrefs` and `@global_seen_hrefs`. This helps satisfy the third
constraint above, which is linking between code defined in the same series.
The logic is as follows: when rewriting a link to a new HTML file and ID,
if the code we're trying to link to exists on the current page, we should link
to that. Otherwise, if the code we're trying to link to was presented in
a different part of the series, then we should link to that other part.
So, we consult the "local" map for `href`s that will be rewritten to HTML
nodes in the current file, and as a fallback, consult the "global" map for
`href`s that were introduced in other parts. `note_defined_href` populates
both maps, and is "biased" towards the first occurrence of a piece of code:
if posts A and B define a function `f`, and post C only references `f`, then
that link will go to post A's definition, which came earlier.

The other method, `note_used_href`, is simpler. It just appends to a list
of Nokogiri HTML nodes that reference a given `href`. We keep track of the file
in which the reference occurred so we can be sure to consult the right sub-map
of `@local_seen_hrefs` when checking for in-page rewrites.

After running `process_source_file` on all Hugo HTML files within a particular
series, the following holds true:
* We have inserted `span` or `a` nodes wherever Agda's original output
  had nodes with `id` or `href` attributes. This is with the exception of the
  case where Hugo's inline HTML doesn't "line up" with Agda's inline HTML,
  which I've only found to happen when the leading character of an identifier is a digit.
* We have picked new IDs for each HTML node we inserted that had an ID,
  noting them both globally and for the current file. We noted their original
  `href` value (in the form `File.html#123`) and that it should be transformed
  into our globally-unique identifiers, in the form `agda-unique-ident-1234`.
* For each HTML node we inserted that links to another, we noted the `href`
  of the reference (also in the form `File.html#123`).

Now, all that's left is to redirect the `href`s of the nodes we inserted
from their old values to the new ones. I do this by iterating over `@nodes_referencing_href`,
which contains every link we inserted.

```Ruby
def cross_link_files
  @nodes_referencing_href.each do |href, references|
    references.each do |reference|
      file = reference[:file]
      node = reference[:node]

      local_targets = @local_seen_hrefs[file]
      if local_targets.include? href
        # A code block in this file provides this href; create a local link.
        node['href'] = "##{local_targets[href]}"
      elsif @global_seen_hrefs.include? href
        # A code block in this series, but not in this file, defines
        # this href. Create a cross-file link.
        target = @global_seen_hrefs[href]
        other_file = target[:file]
        id = target[:id]

        relpath = Pathname.new(other_file).dirname.relative_path_from(Pathname.new(file).dirname)
        node['href'] = "#{relpath}##{id}"
      else
        # No definitions in any blog page. For now, just delete the anchor.
        node.replace node.content
      end
    end
  end
end
```
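
The `relpath` computation at the heart of the cross-file case is plain
`Pathname` arithmetic; for two hypothetical pages in the same series (the
paths are invented for illustration), it behaves like this:

```Ruby
require 'pathname'

file       = "public/blog/series_part2/index.html" # page containing the link
other_file = "public/blog/series_part1/index.html" # page defining the target

relpath = Pathname.new(other_file).dirname.relative_path_from(Pathname.new(file).dirname)
"#{relpath}#agda-unique-ident-42" # => "../series_part1#agda-unique-ident-42"
```
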
Notice that for the time being, I simply remove links to Agda definitions that
didn't occur in the Hugo post. Ideally, these would link to the plain, non-blog
documentation pages generated by Agda; however, this requires either hosting
those documentation pages, or expecting the Agda standard library HTML pages
to remain stable and hosted at a fixed URL. Neither was simple enough to do,
so I opted for the conservative "just don't insert links" approach.

And that's all of the approach that I wanted to show off today! There
are other details, like finding posts in the same series (I achieve this
with a `meta` element) and invoking `agda --html` on the necessary source files
(my [`build-agda-html.rb`](https://dev.danilafe.com/Web-Projects/blog-static/src/branch/master/build-agda-html.rb)
script is how I personally do this), but I don't think it's all that valuable
to describe them here.

Unfortunately, the additional metadata I had my theme insert makes it
harder for others to use this approach out of the box. However, I hope that by
sharing my experience, others who write Agda and post about it might be able
to get a similar solution working. And of course, it's always fun to write
about a recent project or endeavor.

Happy (dependently typed) programming and blogging!

content/blog/agda_is_pattern.md (new file, 169 lines)
@@ -0,0 +1,169 @@

---
title: "The \"Is Something\" Pattern in Agda"
date: 2023-08-31T22:15:34-07:00
tags: ["Agda"]
description: "In this post, I talk about a pattern I've observed in the Agda standard library."
---

Agda is a functional programming language with a relatively Haskell-like syntax
and feature set, so coming into it, I relied on my past experiences with Haskell
to get things done. However, the languages are sufficiently different to leave
room for useful design patterns in Agda that can't be brought over from Haskell,
because they don't exist there. One such pattern will be the focus of this post;
it's relatively simple, but I came across it by reading the standard library code.
My hope is that by writing it down here, I can save someone the trouble of
recognizing it and understanding its purpose. The pattern is "unique" to Agda
(in the sense that it isn't present in Haskell) because it relies on dependent types.

In my head, I call this the `IsSomething` pattern. Before I introduce it, let
me try to provide some motivation. I should say that this may not be the
only motivation for this pattern; it's just how I arrived at seeing its value.

### Type Classes for Related Operations
Suppose you wanted to define a type class for "a type that has an associative
binary operation". In Haskell, this is the famous `Semigroup` class. Here's
a definition I lifted from the [Haskell docs](https://hackage.haskell.org/package/base-4.18.0.0/docs/src/GHC.Base.html#Semigroup):

```Haskell
class Semigroup a where
  (<>) :: a -> a -> a
  a <> b = sconcat (a :| [ b ])
```
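
(For reference, the associativity law that Haskell leaves implicit is the
usual one; writing the operation abstractly as \(\diamond\):

{{< latex >}}
\forall a\, b\, c.\ (a \diamond b) \diamond c = a \diamond (b \diamond c)
{{< /latex >}}

This is the property that the Agda version below encodes explicitly.)
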
It says that a type `a` is a semigroup if it has a binary operation, which Haskell
calls `(<>)`. The language isn't expressive enough to encode the associative
property of this binary operation, but we won't hold it against Haskell: not
every language needs dependent types or SMT-backed refinement types. If
we translated this definition into Agda (and encoded the associativity constraint),
we'd end up with something like this:

{{< codelines "Agda" "agda-issomething/example.agda" 9 13 >}}

So far, so good. Now, let's also encode a more specific sort of type-with-binary-operation:
one where the operation is associative as before, but also has an identity element.
In Haskell, we can write this as:

```Haskell
class Semigroup a => Monoid a where
  mempty :: a
```

This brings in all the requirements of `Semigroup`, with one additional one:
an element `mempty`, which is intended to be the aforementioned identity element for `(<>)`.
Once again, we can't encode the "identity element" property; I say this only
to explain the lack of any additional code in the preceding snippet.

In Agda, there isn't really a special syntax for "superclass"; we just use a field.
The "transliterated" implementation is as follows:

{{< codelines "Agda" "agda-issomething/example.agda" 15 24 >}}

This code might require a little bit of explanation. Like I said, the base class
is brought in as a field, `semigroup`. Then, every field of `semigroup`
is also made available within `Monoid`, as well as to users of `Monoid`, by
using an `open public` directive. The subsequent fields mimic the Haskell
definition, amended with proofs of identity.

We get our first sign of awkwardness here. We can't refer to the binary operation
very easily; it's nested inside of `semigroup`, and we have to access its fields
to get ahold of `(∙)`. It's not too bad at all -- it just costs us an extra line.
However, the bookkeeping of what-operation-is-where gets frustrating quickly.

I will demonstrate the frustrations in one final example. I will admit to it
being contrived: I am trying to avoid introducing too many definitions and concepts
just for the sake of a motivating case. Suppose you are trying to specify
a type in which the binary operation has _two_ properties (e.g. it's a monoid
_and_ something else). Since the only two type classes I have so far are
`Monoid` and `Semigroup`, I will use those; note that in this particular instance,
using both is a contrivance, since the former contains the latter.

{{< codelines "Agda" "agda-issomething/example.agda" 26 32 >}}

However, there's a problem: nothing in the above definition ensures that the
binary operations of the two fields are the same! As far as Agda is concerned
(as one would quickly come to realize by trying a few proofs with the code),
the two operations are completely separate. One could perhaps add an equality
constraint:

{{< codelines "Agda" "agda-issomething/example.agda" 26 34 >}}

However, this will get tedious quickly. Proofs will need to leverage rewrites
(via the `rewrite` keyword, or via `cong`) to change one of the binary operations
into the other. As you build up more and more complex algebraic structures,
in which the various operations are related in nontrivial ways, you start to
look for other approaches. That's where the `IsSomething` pattern comes in.

### The `IsSomething` Pattern: Parameterizing By Operations
The pain point of the original approach is data flow. The way it's written,
data (operations, elements, etc.) flows from the fields of a record to the record
itself: `Monoid` has to _read_ the `(∙)` operation from `Semigroup`.
The more fields you add, the more reading and reconciliation you have to do.
It would be better if the data flowed in the other direction: from `Monoid` to
`Semigroup`. `Monoid` could say, "here's a binary operation; it must satisfy
these constraints, in addition to having an identity element". To _provide_
the binary operation to a field, we use type application; this would look
something like this:

{{< codelines "Agda" "agda-issomething/example.agda" 42 42 >}}

Here's the part that's not possible in Haskell: we have a `record`, called `IsSemigroup`,
that's parameterized by a _value_ -- the binary operation! This new record
is quite similar to our original `Semigroup`, except that it doesn't need a field
for `(∙)`: it gets that from outside. Note the additional parameter in the
`record` header:

{{< codelines "Agda" "agda-issomething/example.agda" 37 38 >}}

We can define an `IsMonoid` similarly:

{{< codelines "Agda" "agda-issomething/example.agda" 40 47 >}}

We want to make an "is" version for each algebraic property; this way,
if we want to use "monoid" as part of some other structure, we can pass it
the required binary operation the same way we passed it to `IsSemigroup`.
Finally, the contrived motivating example from above becomes:

{{< codelines "Agda" "agda-issomething/example.agda" 49 55 >}}

Since we passed the same operation to both `IsMonoid` and `IsSemigroup`, we
know that we really do have a _single_ operation with _both_ properties;
no strange equality witnesses or anything of the sort are necessary.

Of course, these new records are not quite equivalent to our original ones. They
need to be passed a binary operation; a "complete" package should include the
binary operation _in addition_ to its properties encoded as `IsSemigroup` or
`IsMonoid`. Such a complete package would be more-or-less equivalent to our
original `Semigroup` and `Monoid` instances. Here's what that would look like:

{{< codelines "Agda" "agda-issomething/example.agda" 57 66 >}}

Agda calls records that include both the operation and its `IsSomething` record
_bundles_ (see [`Algebra.Bundles`](https://agda.github.io/agda-stdlib/Algebra.Bundles.html), for example).
Notice that the bundles don't rely on other bundles to define properties; that
would lead right back to the "bottom-up" data flow in which a parent record has
to access the operations and values stored in its fields. However, bundles do
sometimes "contain" (via a definition, not a field) smaller bundles, in case,
for example, you need _only_ a semigroup, but you have a monoid.

### Bonus: Using Parameterized Modules to Avoid Repetitive Arguments

One annoying thing about our definitions above is that we had to accept our
binary operation, and sometimes the identity element, as an argument to each one,
and to thread it through to all the fields that require it. Agda has a nice
mechanism to help alleviate some of this repetition: [parameterized modules](https://agda.readthedocs.io/en/latest/language/module-system.html#parameterised-modules).
We can define a _whole module_ that accepts the binary operation as an argument;
it will be implicitly passed as an argument to all of the definitions within.
Thus, our entire `IsMonoid`, `IsSemigroup`, and `IsContrivedExample` code could
look like this:

{{< codelines "Agda" "agda-issomething/example.agda" 68 87 >}}

The more `IsSomething` records you declare, the more effective this trick becomes.

### Conclusion
That's all I have! The pattern I've described shows up all over the Agda
standard library; the example that made me come across it was
the [`Algebra.Structures` module](https://agda.github.io/agda-stdlib/Algebra.Structures.html).
I hope you find it useful.

Happy (dependently typed) programming!

@@ -98,7 +98,7 @@ After my first post complaining about the state of mathematics on the web, I rec
 the following email (which the author allowed me to share):
 
 > Sorry for having a random stranger email you, but in your blog post
-[(link)](https://danilafe.com/blog/math_rendering_is_wrong) you seem to focus on MathJax's
+[(link)]({{< relref "math_rendering_is_wrong" >}}) you seem to focus on MathJax's
 difficulty in rendering things server-side, while quietly ignoring that KaTeX's front
 page advertises server-side rendering. Their documentation [(link)](https://katex.org/docs/options.html)
 even shows (at least as of the time this email was sent) that it renders both HTML
BIN content/blog/bergamot/CIC.png (new file, 46 KiB)
BIN content/blog/bergamot/badrule.png (new file, 70 KiB)
BIN content/blog/bergamot/bergamot.png (new file, 232 KiB)
BIN content/blog/bergamot/goodrule.png (new file, 86 KiB)
BIN content/blog/bergamot/goodrulerec.png (new file, 134 KiB)
content/blog/bergamot/index.md (new file, 353 lines)
@@ -0,0 +1,353 @@

---
title: "Bergamot: Exploring Programming Language Inference Rules"
date: 2023-12-22T18:16:44-08:00
tags: ["Project", "Programming Languages"]
description: "In this post, I show off Bergamot, a tiny logic programming language and an idea for teaching inference rules."
bergamot:
  render_presets:
    default: "bergamot/rendering/lc.bergamot"
---

### Inference Rules and the Study of Programming Languages
In this post, I will talk about _inference rules_, particularly in the field
of programming language theory. The first question to get out of the way is
"what on earth is an inference rule?". The answer is simple: an inference rule
is just a way of writing "if ... then ...". When writing an inference rule,
we write the "if" stuff above a line, and the "then" stuff below the line. Really,
that's all there is to it. I'll steal an example from another one of my posts on the blog -- here's an inference rule:

{{< latex >}}
\frac
{\text{I'm allergic to cats} \quad \text{My friend has a cat}}
{\text{I will not visit my friend very much}}
{{</ latex >}}

We can read this as "__if__ I'm allergic to cats, and my friend has a cat, __then__
I will not visit my friend very much".

In the field of programming languages, inference rules are everywhere.
Practically any paper I read has a table that looks something like this:

{{< figure src="rules.png" caption="Inference rules from [Logarithm and program testing](https://dl.acm.org/doi/abs/10.1145/3498726) by Kuen-Bang Hou (Favonia) and Zhuyang Wang" class="fullwide" >}}
|
||||||
|
|
||||||
|
And I, for one, love it! They're a precise and concise way to describe static
and dynamic behavior of programs. I might've written this elsewhere on the blog,
but whenever I read a paper, my eyes search for the rules first and foremost.

But to those just starting their PL journey, inference rules can be quite cryptic
-- I know they were to me! The first level of difficulty is the symbols: we have
lots of Greek (\(\Gamma\) and \(\Delta\) for environments, \(\tau\) and perhaps \(\sigma\)
for types), and the occasional mathematical symbol (the "entails" symbol \(\vdash\) is the most
common, but for operational semantics we can have \(\leadsto\) and \(\Downarrow\)).
If you don't know what they mean, or if you're still getting used to them, symbols
in judgements are difficult enough to parse.

The second level of difficulty is making sense of the individual rules:
although they tend to not be too bad, for some languages even making
sense of one rule can be challenging. The following rule from the Calculus of
Inductive Constructions is a doozy, for instance.

{{< figure src="CIC.png" caption="The `match` inference rule from [Introduction to the Calculus of Inductive Constructions](https://inria.hal.science/hal-01094195/document) by Christine Paulin-Mohring" class="fullwide" >}}
|
||||||
|
|
||||||
|
Just look at the metavariables! We have \(\textit{pars}\), \(t_1\) through \(t_p\),
\(x_1\) through \(x_n\), plain \(x\), and at least two other sets of variables. Not
only this, but the rule requires at least some familiarity with [GADTs](https://en.wikipedia.org/wiki/Generalized_algebraic_data_type) to understand
completely.

The third level is making sense of how the rules work, _together_. In my
programming languages class in college, a familiar question was:

> the [Hindley-Milner type system](https://en.wikipedia.org/wiki/Hindley%E2%80%93Milner_type_system)
> supports let-polymorphism only. What is it about the rules that implies let-polymorphism,
> and not any other kind of polymorphism?

If you don't know the answer, or the question doesn't make sense to you, don't
worry about it -- suffice to say that whole systems of inference rules exhibit
certain behaviors, and it takes familiarity with several rules to spot these
behaviors.

### Seeing What Works and What Doesn't

Maybe I'm just a tinker-y sort of person, but for me, teaching inference rules
just by showing them is not really enough. For instance, let me show you two
ways of writing the following (informal) rule:

> When adding two operands, if both operands are strings, then the result
> of adding them is also a string.

There's a right way to write this inference rule, and there is a wrong way.
Let me show you both, and try to explain the two. First, here's the wrong way:

{{< latex >}}
\cfrac
{x : \text{string} \in \Gamma \quad y : \text{string} \in \Gamma}
{\Gamma \vdash x + y : \text{string}}
{{< /latex >}}

This says that the type of adding two _variables_ of type `string` is still `string`.
Here, \(\Gamma\) is a _context_, which keeps track of which variable has what
type. Writing \(x : \text{string} \in \Gamma\) is the same as saying
"we know the variable `x` has type `string`". The whole rule reads,

> If the variables `x` and `y` both have type `string`,
> then the result of adding these two variables, `x+y`, also has type `string`.

The trouble with this rule is that it only works when adding two variables.
But `x+x` is not itself a variable, so the rule wouldn't work for an expression
like `(x+x)+(y+y)`. The proper way of writing the rule is, then, something like
this:

{{< latex >}}
\cfrac
{\Gamma \vdash e_1 : \text{string} \quad \Gamma \vdash e_2 : \text{string}}
{\Gamma \vdash e_1 + e_2 : \text{string}}
{{< /latex >}}

This rule says:

> If the two subexpressions `e1` and `e2` both have type `string`,
> then the result of adding these two subexpressions, `e1+e2`, also has type `string`.

Much better! We can apply this rule recursively: to get the type of `(x+x)+(y+y)`,
we consider `(x+x)` and `(y+y)` as two subexpressions, and go on to compute
their types first. We can then break `(x+x)` into two subexpressions (`x` and `x`),
and determine their type separately. Supposing that the variables `x` and `y`
indeed have the type `string`, this tells us that `(x+x)` and `(y+y)` are both
strings, and therefore that the whole of `(x+x)+(y+y)` is a string.

What I'd really like to do is type the program in question and have the
computer figure out whether my rules accept or reject this program.
With my new rules, perhaps I'd get something like this:

{{< figure src="goodrule.png" caption="Verifying the `(x+x)+(y+y)` expression using the good rule" class="fullwide" >}}
|
||||||
|
|
||||||
|
To fully understand how the rule works to check a big expression like the above
sum, I'd need to see the recursion we applied a couple of paragraphs ago. My
ideal tool would display this too. For simplicity, I'll just show the output
for `(1+1)+(1+1)`, sidestepping variables and using numbers instead. This just
saves a little bit of space and visual noise.

{{< figure src="goodrulerec.png" caption="Verifying the `(1+1)+(1+1)` expression using the good rule" class="fullwide" >}}
|
||||||
|
|
||||||
|
On the other hand, since the sum of two `x`s and two `y`s doesn't work with my old rules, maybe
I wouldn't get a valid type at all:

{{< figure src="badrule.png" caption="Verifying (unsuccessfully) the `(x+x)+(y+y)` expression using the old rule" class="fullwide" >}}
|
||||||
|
|
||||||
|
More generally, I want to be able to write down some inference rules, and apply
them to some programs. This way, I can see what works and what doesn't, and when
it works, which rules were used for what purposes. I also want to be able to
try tweaking, removing, or adding inference rules, to see what breaks.

This brings me to the project that I'm trying to show off in this post:
Bergamot!

### Introducing Bergamot

A certain class of programming languages lends itself particularly well to
writing inference rules and applying them to programs: logic programming.
The most famous example of a logic programming language is [Prolog](https://www.swi-prolog.org/).
In logic programming languages like Prolog, we can write rules describing when
certain statements should hold. The simplest rule I could write is a fact. Perhaps
I'd like to say that the number 42 is a "good" number:

```Prolog
good(42).
```

Perhaps I'd then like to say that adding two good numbers together creates
another good number.

```Prolog
good(N) :- good(A), good(B), N is A+B.
```

The above can be read as:

> the number `N` is good if the numbers `A` and `B` are good, and `N` is the sum of `A` and `B`.
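
Rendered in the inference-rule notation from earlier in this post, that same
Prolog rule would look something like this:

{{< latex >}}
\cfrac
{\text{good}(A) \quad \text{good}(B) \quad N = A + B}
{\text{good}(N)}
{{< /latex >}}
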
I can then ask Prolog to give me some good numbers:

```Prolog
?- good(X).
```

Prompting Prolog a few times, I get:

```
X = 42
X = 84
X = 126
X = 168
```

It's not a huge leap from this to programming language type rules. Perhaps instead
of something being "good", we can say that it has type `string`. Of course, adding
two strings together, as we've established, creates another string. In Prolog:

```Prolog
/* A string literal like "hello" has type string */
type(_, strlit(_), string).
/* Adding two string expressions has type string */
type(Env, plus(X, Y), string) :- type(Env, X, string), type(Env, Y, string).
```

That's almost identical to our inference rules above, except that it's written
using code instead of mathematical notation! If we could just take these
Prolog rules and display them as inference rules, we'd be able to "have our cake"
(draw pretty inference rules like in the papers) and "eat it too" (run our rules
on the computer against various programs).

This is where it gets a teensy bit hairy. It's not that easy to embed a Prolog
engine into the browser; alternatives that I've surveyed are either poorly documented,
hard to extend, or both. Furthermore, for studying _what each rule was used for_,
it's nice to be able to see a _proof tree_: a tree made up from the rules that
we used to arrive at a particular answer. Prolog engines are excellent at
applying rules and finding answers, but they don't usually provide a way
to get all the rules that were used, making it harder to get proof trees.

Thus, [Bergamot](https://dev.danilafe.com/Everything-I-Know-About-Types/bergamot-elm)
is a new, tiny programming language that I threw together in a couple of days.
It comes as a JavaScript-based widget, and can be embedded into web pages like
this one to provide an interactive way to write and explore proof trees. Here's
a screenshot of what all of that looks like:

{{< figure src="bergamot.png" caption="A screenshot of a Bergamot widget with some type rules" class="fullwide" >}}

The components of Bergamot are:

* The programming language, as stated above. This language is a very simple,
  unification-based logic programming language.
* A rule rendering system, which takes Prolog-like rules written in Bergamot
  and converts them into pretty LaTeX inference rules.
* An Elm-based widget that you can embed into your web page, which accepts
  Bergamot rules and an input expression, and applies the rules to produce
  a result (or a whole proof tree!).

Much like in Prolog, we can write Bergamot rules that describe when certain
things are true. Unlike Prolog, Bergamot requires each rule to have a name.
This is common practice in programming languages literature (when we talk
about rules in papers, we like to be able to refer to them by name). Below
are some sample Bergamot rules, corresponding to the first few inference rules
in the above screenshot.

```
TNumber @ type(?Gamma, intlit(?n), number) <-;
TString @ type(?Gamma, strlit(?s), string) <-;
TVar @ type(?Gamma, var(?x), ?tau) <- inenv(?x, ?tau, ?Gamma);
TPlusI @ type(?Gamma, plus(?e_1, ?e_2), number) <-
  type(?Gamma, ?e_1, number), type(?Gamma, ?e_2, number);
TPlusS @ type(?Gamma, plus(?e_1, ?e_2), string) <-
  type(?Gamma, ?e_1, string), type(?Gamma, ?e_2, string);
```

Unlike Prolog, where "variables" are anything that starts with a capital letter,
|
||||||
|
in Bergamot, variables are things that start with the special `?` symbol. Also,
|
||||||
|
Prolog's `:-` has been replaced with an arrow symbol `<-`, for reverse implication.
|
||||||
|
These are both purely syntactic differences.
|
||||||
|
|
||||||
|
#### Demo
If you want to play around with it, here's an embedded Bergamot widget with
{{< sidenote "right" "wrong-rule-note" "some rules pre-programmed in." >}}
Actually, one of the rules is incorrect to my knowledge. Can you spot it?
Hint: is <code>\x : number. \x: string. x+1</code> well-typed? What does
Bergamot report? Can you see why?
{{< /sidenote >}}
It has two modes:

1. __Language Term__: accepts a rather simple programming language
   to typecheck. Try `1+1`, `fst((1,2))`, or maybe `(\x : number. x) 42`.
2. __Query__: accepts Bergamot expressions to query, similarly to Prolog; try
   `type(empty, ?e, tpair(number, string))` to search for expressions that have
   the type "a pair of a number and a string".

{{< bergamot_widget id="widget" query="" prompt="type(empty, TERM, ?t)" >}}
section "" {
  TNumber @ type(?Gamma, lit(?n), number) <- num(?n);
  TString @ type(?Gamma, lit(?s), string) <- str(?s);
  TVar @ type(?Gamma, var(?x), ?tau) <- inenv(?x, ?tau, ?Gamma);
  TPlusI @ type(?Gamma, plus(?e_1, ?e_2), number) <-
    type(?Gamma, ?e_1, number), type(?Gamma, ?e_2, number);
  TPlusS @ type(?Gamma, plus(?e_1, ?e_2), string) <-
    type(?Gamma, ?e_1, string), type(?Gamma, ?e_2, string);
}
section "" {
  TPair @ type(?Gamma, pair(?e_1, ?e_2), tpair(?tau_1, ?tau_2)) <-
    type(?Gamma, ?e_1, ?tau_1), type(?Gamma, ?e_2, ?tau_2);
  TFst @ type(?Gamma, fst(?e), ?tau_1) <-
    type(?Gamma, ?e, tpair(?tau_1, ?tau_2));
  TSnd @ type(?Gamma, snd(?e), ?tau_2) <-
    type(?Gamma, ?e, tpair(?tau_1, ?tau_2));
}
section "" {
  TAbs @ type(?Gamma, abs(?x, ?tau_1, ?e), tarr(?tau_1, ?tau_2)) <-
    type(extend(?Gamma, ?x, ?tau_1), ?e, ?tau_2);
  TApp @ type(?Gamma, app(?e_1, ?e_2), ?tau_2) <-
    type(?Gamma, ?e_1, tarr(?tau_1, ?tau_2)), type(?Gamma, ?e_2, ?tau_1);
}

section "" {
  GammaTake @ inenv(?x, ?tau_1, extend(?Gamma, ?x, ?tau_1)) <-;
  GammaSkip @ inenv(?x, ?tau_1, extend(?Gamma, ?y, ?tau_2)) <- inenv(?x, ?tau_1, ?Gamma);
}
{{< /bergamot_widget >}}

#### Rendering Bergamot with Bergamot

There's something to be said about the conversion between Bergamot's rules,
encoded as plain text, and pretty LaTeX-based inference rules that the users
see. Crucially, __we don't want to hardcode how any particular Bergamot
expression is rendered__. For one, this is a losing battle: we can't possibly
keep up with all the notation that people use in PL literature, and even if
we focused ourselves on only "beginner" notation, there wouldn't be one way to do it!
Different PL papers and texts use slightly different variations of notation.
For instance, I render my pairs as \((a, b)\), but the very first screenshot
in this post demonstrates a PL paper that writes pairs as \(\langle a, b \rangle\).
Neither way (as far as I know!) is right or wrong. But if we hardcode one,
we lose the ability to draw the other.

More broadly, one aspect about writing PL rules is that _we control the notation_.
We are free to define shorthands, symbols, and anything else that would make
reading our rules easier for others. As an example, a paper from POPL22 about
programming language semantics with garbage collection used a literal trash
symbol in their rules:

{{< figure src="trashrule.png" caption="A rule that uses a trashcan icon as notation, from [A separation logic for heap space under garbage collection](https://dl.acm.org/doi/10.1145/3498672) by Jean-Marie Madiot and François Pottier" >}}

Thus, what I want to do is __encourage the (responsible) introduction of new notation__.
This can only be done if Bergamot itself supports custom notation.

When thinking about how I'd like to implement this custom notation, I was imagining
some sort of templated rule engine that would define how each term in a Bergamot
program can be converted to its LaTeX variant. But then I realized: Bergamot
is already a rule engine! Instead of inventing yet another language or format
for defining LaTeX pretty printing, I could just use Bergamot. This turned
out to work quite nicely -- the "Presentation Rules" tab in the demo above
should open a text editor with Bergamot rules that handle the conversion
of Bergamot notation into LaTeX. Here are some example rules:

```
LatexPlus @ latex(plus(?e_1, ?e_2), ?l) <- latex(?e_1, ?l_1), latex(?e_2, ?l_2), join([?l_1, " + ", ?l_2], ?l);
LatexPair @ latex(pair(?e_1, ?e_2), ?l) <- latex(?e_1, ?l_1), latex(?e_2, ?l_2), join(["(", ?l_1, ", ", ?l_2, ")"], ?l);
```

If we change the `LatexPair` rule to the following, we can make all pairs render
using angle brackets:

```
LatexPair @ latex(pair(?e_1, ?e_2), ?l) <- latex(?e_1, ?l_1), latex(?e_2, ?l_2), join(["\\langle", ?l_1, ", ", ?l_2, "\\rangle"], ?l);
```

{{< figure src="rangle.png" caption="The LaTeX output when angle brackets are used in the rule instead of parentheses." >}}
|
||||||
|
|
||||||
|
You can write rules about arbitrary Bergamot terms for rendering; thus, you can
invent completely new notation for absolutely anything.

### Next Steps

I hope to use Bergamot to write a series of articles about type systems. By
providing an interactive widget, I hope to make it possible for users to do
exercises: writing variations of inference rules, or even tweaking the notation,
and checking them against sets of programs to make sure that they work. Of course,
I also hope that Bergamot can be used to explore _why_ an existing set of inference
rules (such as Hindley-Milner) works. Stay tuned for those!

BIN content/blog/bergamot/rangle.png (new file, 33 KiB)
BIN content/blog/bergamot/rules.png (new file, 199 KiB)
BIN content/blog/bergamot/trashrule.png (new file, 30 KiB)
BIN content/blog/blog_microfeatures/chapel-seriesnav.png (new file, 30 KiB)
BIN content/blog/blog_microfeatures/csstricks-toc.gif (new file, 494 KiB)
BIN content/blog/blog_microfeatures/drew-openring.png (new file, 203 KiB)
BIN content/blog/blog_microfeatures/fasterthanlime-dialogue.png (new file, 276 KiB)
BIN content/blog/blog_microfeatures/fasterthanlime-toc.png (new file, 183 KiB)
BIN content/blog/blog_microfeatures/feedbin.png (new file, 101 KiB)
BIN content/blog/blog_microfeatures/gwern-hover.gif (new file, 2.2 MiB)
BIN content/blog/blog_microfeatures/gwern-linkicons-haskell.png (new file, 57 KiB)
BIN content/blog/blog_microfeatures/gwern-linkicons-wiki.png (new file, 57 KiB)
BIN content/blog/blog_microfeatures/gwern-linkicons-zip.png (new file, 58 KiB)
BIN content/blog/blog_microfeatures/gwern-sidenotes.png (new file, 975 KiB)
BIN content/blog/blog_microfeatures/hugo-gist.png (new file, 73 KiB)
BIN content/blog/blog_microfeatures/hugo-titlelink.png (new file, 49 KiB)
content/blog/blog_microfeatures/index.md (new file, 382 lines)
@@ -0,0 +1,382 @@

---
title: "Microfeatures I Love in Blogs and Personal Websites"
date: 2024-06-23T11:03:10-07:00
tags: ["Website"]
favorite: true
description: "In this post, I talk about pleasant but seemingly minor features in personal sites"
---

Some time ago, Hillel Wayne published an article titled
[_Microfeatures I'd like to see in more languages_](https://buttondown.email/hillelwayne/archive/microfeatures-id-like-to-see-in-more-languages/).
In this article, he described three kinds of features in _programming languages_:
fundamental features, deeply engrained features, and nice-to-have convenience features.
Hillel's premise was that language designers tend to focus on the first two;
however, because the convenience features are relatively low-overhead, it's
easier for them to jump between projects, and they provide a quality-of-life
increase.

I've been running a blog for a while --- some of the oldest posts I've found
(which are no longer reflected on this site due to their low quality) were from
2015. In this time, I've been on the lookout for ways to improve the site,
and I've seen quite a few little things that are nice to use, but relatively
easy to implement. They don't really make or break a website; the absence of
such features might be noticed, but will not cause any disruption for the reader.
On the other hand, their presence serves as a QoL enhancement. I find these to be
analogous to Hillel's notion of "microfeatures". If you're interested in adding
something to your site, consider browsing this menu to see if anything resonates!

One last thing is that this post is not necessarily about microfeatures
I'd like _every_ blog or personal website to have. Some ideas I present
here are only well-suited to certain types of content and certain written
voices. They need not be applied indiscriminately.

With that, let's get started!

### Sidenotes

[Gwern](https://gwern.net/me) is, in my view, the king of sidenotes.
Gwern's writing makes very heavy use of them (at least based on the articles
that I've read). This is where I originally got inspiration for
[my own implementation in Hugo]({{< relref "sidenotes" >}}). Check out the page
on [hydrocephalus](https://gwern.net/hydrocephalus) for an example; here's what
a piece of that page looks like on my end at the time of writing:

{{< figure src="gwern-sidenotes.png" class="fullwide" caption="A screenshot of Gwern's page on hydrocephalus" alt="A screenshot of Gwern's page on hydrocephalus. The main article text is accompanied by notes in both the left and right margin." >}}
|
||||||
|
|
||||||
|
Sidenotes are nice because they allow for diversions without interrupting
|
||||||
|
the main article's flow. You can provide additional details for the curious
|
||||||
|
reader, or --- [as Gwern does](https://gwern.net/hydrocephalus#sn4) ---
|
||||||
|
use the sidenotes for citing studies or sources. In either case, the reading
|
||||||
|
experience is significantly more pleasant that footnotes, for which you typically
|
||||||
|
have to go to the bottom of the page, and then return to the top.
|
||||||
|
|
||||||
|
Another reason I called Gwern the "king of sidenotes" is
|
||||||
|
[this page on sidenotes](https://gwern.net/sidenote). There, Gwern documents
|
||||||
|
numerous approaches to this feature, mostly inspired by [Tufte CSS](https://edwardtufte.github.io/tufte-css/).
|
||||||
|
The page is very thorough --- it even includes a link to my own work, as unknown
|
||||||
|
as it may be! I would recommend checking it out if you are interested in
|
||||||
|
enhancing your site with sidenotes.
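
To give a flavor of the simplest approach, here is a hedged sketch of a
margin note: a floated `<span>` pushed into the right margin. The class
name and measurements are my own, not from Gwern's or Tufte CSS's actual
stylesheets.

```html
<!-- A bare-bones margin note; class name and measurements are illustrative. -->
<style>
  .sidenote {
    float: right;
    clear: right;
    width: 12rem;
    margin-right: -14rem; /* push the note out into the right margin */
    font-size: 0.8em;
  }
</style>

<p>
  The main argument continues uninterrupted here.
  <span class="sidenote">While this aside sits in the margin.</span>
</p>
```

The implementations linked above handle the hard parts this sketch ignores:
overlapping notes, numbering, and falling back to footnotes on narrow screens.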

### Tables of Contents

Not all personal sites include tables of contents (TOCs), but they are nice.
They serve two purposes:

1. Seeing at a glance what the post will be about, in the form of headings.
2. Being able to navigate to an interesting part of the page without
   having to scroll.

Static site generators (I myself use [Hugo](https://gohugo.io/)) are
typically able to generate TOCs automatically, since they are already generating
the HTML and know what headings they are inserting into the page. For instance,
Hugo has [`TableOfContents`](https://gohugo.io/methods/page/tableofcontents/).
I suspect the same is true for other existing website technologies.
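
In Hugo's case, a minimal sketch of wiring this up looks something like the
following. The surrounding markup and file path are my own; only
`.TableOfContents` and `.Content` are Hugo's.

```html
<!-- In a page template, e.g. layouts/_default/single.html (a sketch) -->
<aside class="toc">
  {{ .TableOfContents }}
</aside>

<article>
  {{ .Content }}
</article>
```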

Despite this, it actually took me a while to find sites I frequent that have
TOCs to show off as examples here. The first one I came across --- after Gwern's,
whose site will be mentioned plenty in this post anyway --- is [Faster than Lime](https://fasterthanli.me).
Take this post on [Rust's Futures](https://fasterthanli.me/articles/understanding-rust-futures-by-going-way-too-deep);
this is what the top of it looks like at the time of writing:

{{< figure src="fasterthanlime-toc.png" class="small" caption="A screenshot of the table of contents on Faster than Lime" alt="A screenshot of the table of contents on Faster than Lime. A box with the word \"Contents\" contains links to several sections on the page below (off-screen)" >}}

The quality and value of TOCs certainly depend on the
sections within the page itself --- and whether or not the page has sections
at all! --- but in my opinion, the benefits to navigation become apparent even
for relatively simple pages.

As an honorable mention, I'd like to show [Lars Hupel's site](https://lars.hupel.info/).
The pages on the site don't --- as far as I can tell --- have _internal_ tables
of contents. However, pages that are part of a series --- such as
the [introduction to CRDTs](https://lars.hupel.info/topics/crdt/01-intro/) ---
have tables of contents that span the entire series.

{{< figure src="lars-toc.png" class="small" caption="A screenshot of the table of contents on Lars Hupel's site" alt="A screenshot of the table of contents on Lars Hupel's site. A box with the words \"Series Navigation\" contains links to several other pages in the series." >}}

I also find this very nice, though it does miss out on headings within a page.

#### Bonus: Showing Page Progress

I've mentioned that tables of contents can communicate the structure of the
page. However, they do so from the outset, before you've started reading.
In their "base form", the reader stops benefiting from tables of contents
{{< sidenote "right" "jump-top-note" "once they've started reading." >}}
That is, of course, unless they jump back to the top of the post and
find the table of contents again.
{{< /sidenote >}}

If you want to show progress while the reader is somewhere in the middle
of a page, you could use a page progress bar. I've noticed one while
reading [Quanta Magazine](https://www.quantamagazine.org); it looks like
this (a recording of my scrolling through
the [most recent article at the time of writing](https://www.quantamagazine.org/how-the-square-root-of-2-became-a-number-20240621/)).

{{< figure src="quanta-scroll.gif" class="fullwide" caption="The progress bar on a Quanta Magazine article" alt="The progress bar on a Quanta Magazine article. As the page scrolls, an orange bar at the top gradually fills up from left to right." >}}

One immediate thought is that this is completely superseded by the regular
browser scroll bar that's ever-present at the side of the page. However,
the scroll bar can be deceiving. If your page has a comments section,
the comments can make the page look dauntingly long. Similarly, references
to other pages and general "footer material" count towards the scroll bar,
but would not count towards the progress bar.

Combining the two, you could imagine an always-visible table of contents
that highlights the current section you're in. With such a feature, you
can always see where you are (including a rough estimate of how far into
the page you've scrolled), and at the same time see how the current section
integrates into the broader structure. I've seen this done before, but could
not find a site off the top of my head that implements the feature; as a
fallback, here's the [CSS-Tricks tutorial](https://css-tricks.com/sticky-table-of-contents-with-scrolling-active-states/)
that shows how to implement a dynamic table of contents, and a recording
of me scrolling through it:

{{< figure src="csstricks-toc.gif" class="fullwide" caption="The table of contents from a CSS Tricks demo" alt="The table of contents from a CSS Tricks demo. As the page scrolls, the current section in the table of contents becomes bold." >}}

### Easily Linkable Headings

How can you send a friend a link to a particular section of a page? There's a
well-defined mechanism for this in HTML: you can use the ID of a particular
HTML element, and add it as `#some-id` to the end of a link to the page. The
link then takes the user to that particular HTML element. I can do this,
for instance, to link to the [sidenotes section above](#sidenotes).

How does one discover the ID of the part of the page that they want to
link to? The ID is not a "visual" property; it's not displayed to the user,
and is rather a detail of HTML itself. Thus, on any given page, even
if every element has a unique, linkable ID, I can't make use of it without
going into __Inspect Element__ and trying to find the ID in the HTML tree.

The simple solution is to make the elements that you want to be easily "linkable"
into links to themselves! Then, the user can right-click
the element in question (probably the heading) and click __Copy Link__.
Much easier! To demonstrate a similar idea, [here is a link to this paragraph itself](#linked-paragraph).
You can now use the context menu to __Copy Link__, paste the result into your browser,
and voilà --- you're right back here!
{#linked-paragraph}
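
In raw HTML, the whole trick is just a heading wrapping an anchor that points
at the heading's own ID. A minimal sketch:

```html
<h3 id="easily-linkable-headings">
  <a href="#easily-linkable-headings">Easily Linkable Headings</a>
</h3>
```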

As with [tables of contents](#tables-of-contents), many website technologies
provide most of the tooling to add support for this feature. Relatively
often I come across pages that have unique IDs for each heading, but no clickable
links! I end up having to use __Inspect Element__ to find the anchor points.

A variation on this idea --- if you don't want to make the entire heading or title
a link --- is to include alongside it (before or after) a clickable element
that links to that title. You can use that element to copy the link
instead (and the icon additionally tells you that this is possible).
Hugo's documentation does this: here's a screenshot of
[an arbitrary page](https://gohugo.io/content-management/markdown-attributes/#overview).

{{< figure src="hugo-titlelink.png" class="small" caption="A title and paragraph from the Hugo documentation" alt="A title and paragraph from the Hugo documentation. Next to the title there is a blue link symbol." >}}

### Grouping Series of Posts

Some authors like to write at length on a particular topic; to get the
content out to readers faster (and to make the resulting pages less daunting),
it makes sense to break a single topic up into a series. The easiest way
to do this is to just... publish several articles, possibly with related
names, and link them to each other. Done!

With a little more effort, though, the series-reading and series-writing
experience could be nicer. Instead of manually inserting links, you
could configure your website to automatically add a "next" and "previous"
button to pages in a given series. You could also give an overview of a particular
series and create a "navigation hub" for it.

As an example, the [Chapel language blog](https://chapel-lang.org/blog/) has navigation
buttons. Here's a screenshot from [a post in the Advent of Code series](https://chapel-lang.org/blog/posts/aoc2022-day09-elvish-string-theory/):

{{< figure src="chapel-seriesnav.png" class="fullwide" caption="Series navigation buttons on a Chapel blog post" alt="Series navigation buttons on a Chapel blog post. There are two buttons; one links to a previous page in the series, another links to the next." >}}

I've mentioned this in the section on [tables of contents](#tables-of-contents),
but [Lars Hupel's site](https://lars.hupel.info/) has tables of contents
that link between posts in a series. I'm not sure if they're automatically generated
or hand-written, but they're definitely nice.

{{< figure src="lars-toc.png" class="small" caption="A screenshot of the table of contents on Lars Hupel's site" alt="A screenshot of the table of contents on Lars Hupel's site. A box with the words \"Series Navigation\" contains links to several other pages in the series." >}}

### Dialogues

I first came across dialogues on [Xe Iaso's site](https://xeiaso.net/),
but I think I see them used most often in posts on [Faster than Lime](https://fasterthanli.me/).
As an example, here's a little dialogue on [a post about Rust's futures](https://fasterthanli.me/articles/understanding-rust-futures-by-going-way-too-deep#it-s-waiting-for-the-first-one-to-finish).
At the time of writing, it looks like this:

{{< figure src="fasterthanlime-dialogue.png" class="medium" caption="A dialogue with \"cool bear\" on Faster than Lime" alt="A dialogue with \"cool bear\" on Faster than Lime. The page contains chat bubbles that alternate between a bear character and the author." >}}

Using dialogues --- even for technical writing --- is not a particularly novel
idea. I know I've seen it in a textbook before; probably
this part of [Operating Systems: Three Easy Pieces](https://pages.cs.wisc.edu/~remzi/OSTEP/dialogue-virtualization.pdf).
It can help ask questions from a less-experienced point of view, and therefore
voice concerns that a reader might themselves have. And of course
--- as with "cool bear" and Xe Iaso's [many characters](https://xeiaso.net/characters)
--- it can change the tone and make the page a bit more fun.

### Code Blocks with Origin

This one was recommended to me by a reader, and so I'll be talking about
my page specifically!

When I was [writing about making a compiler]({{< relref "series/compiling-a-functional-language-using-c++" >}}),
a reader emailed me and pointed out that they were getting lost in the various
code blocks. My page displayed the code that I was writing about, but the
project had grown beyond a single file. As a result, I'd be making changes
midway through one file at one moment, and to another file the next. This
prompted me to add decorators to my code blocks that look
something like this:

{{< codelines "Ruby" "patterns/patterns.rb" 3 8 >}}

The decorator says what file the code is from, as well as what lines
are being presented. If you click the file name, the decorator links
to my Gitea instance, allowing you to read the code in context.

Though it's not quite the same (in particular, it's unfortunately missing
links), the Crafting Interpreters online book does something similar. It
describes changes to the code in words next to the changed code itself,
like "added after `MyStruct`". Here's a screenshot of the page on
[local variables](https://craftinginterpreters.com/local-variables.html)
at the time of writing.

{{< figure src="craftinginterpreters-codenotes.png" class="fullwide" caption="Location notes on code in Crafting Interpreters" alt="Location notes on code in Crafting Interpreters. On the right of code blocks, a margin note describes the file and nature of the change." >}}

I think it looks quite elegant, and in some ways --- specifically in
the verbal descriptions of what each change does --- might be superior to my
approach.

It's not quite the same thing, but [GitHub Gists](https://gist.github.com/)
can help approximate this feature. A Gist can contain multiple files,
and each file can be individually embedded into your page. Hugo in particular has
[built-in support](https://gohugo.io/content-management/shortcodes/#gist) for
Gists (and I've snagged that link using the docs' [easily linkable headings](#easily-linkable-headings));
I suspect that other website engines have some form of support
as well. At the time of writing, an embedded Gist looks something like this:

{{< figure src="hugo-gist.png" class="small" caption="Code embedded in Hugo documentation using a GitHub Gist" alt="Code embedded in Hugo documentation using a GitHub Gist." >}}

Clicking `list.html` takes you to the source code of the file.

#### Bonus: Code Blocks with Clickable Links

If we're going for fancy code blocks, another fancy feature is provided
by the [Agda programming language](https://agda.readthedocs.io/en/latest/getting-started/what-is-agda.html).
Agda can generate HTML code blocks in which every symbol (like a variable,
record name, or function name) is linked to where it is defined. So if
you're reading the code, and wonder "what the heck is `x`?", you can just
click it to see how it's defined.

It's not simple to integrate Agda's plain HTML output into an existing
webpage, but some projects do so. I took a stab at it in
my [post about integrating it with Hugo]({{< relref "agda_hugo" >}}).
I wager this would be even harder for other languages. However, it leads
to nice results; my go-to example is [Programming Language Foundations in Agda](https://plfa.github.io/).
The online book introduces various concepts from programming language theory,
and each code block that it shows is fully linked. This makes it possible
to jump around the page like so:

{{< figure src="plfa-goto.gif" class="fullwide" caption="Navigating code blocks on a page from PLFA" alt="Navigating code blocks on a page from PLFA. I hover over then click a plus sign to see how addition is defined. I then do the same to see how natural numbers are defined." >}}

### Markers for External Links

Some sites I've seen mark links that go to a different domain with
a little icon. If you've read this far, you've likely noticed that my
site does the same. Another good example of this --- even though the CSS
is a little rough at the time of writing --- is [James' Coffee Blog ☕](https://jamesg.blog/).
I've taken the (small) liberty of adjusting the color of the icon, which
I suspect is buggy in my browser.

{{< figure src="jamesg-external.png" class="fullwide" caption="An external link on James' blog" alt="An external link on James' blog. The link is displayed as normal, and an additional diagonal arrow aiming up and to the right and surrounded by a square is displayed to the right of the link text." >}}

Some websites (~~this one included~~) also make such links open in a new tab
automatically. That way, you tend to not lose the original article by clicking
through one of its references.

#### Bonus: Different Markers for Different Destinations

[Gwern's website](https://gwern.net) takes this idea further by changing
the icon for external links depending on the destination. For instance,
links to Wikipedia articles are stylized with a little "W", links to
Haskell.org are stylized using a lambda (\(\lambda\)), and links to
`.zip` files have a little archive icon. There are more; ~~I've found
the [link processing code on GitHub](https://github.com/gwern/gwern.net/blob/959ba9c50d327a960e07241b2c7f13630bf8b80c/js/old/links.js),
and even the [list of websites that get their own icons](https://github.com/gwern/gwern.net/blob/959ba9c50d327a960e07241b2c7f13630bf8b80c/js/old/links.js#L380-L387).~~
I could not find a verbal description, though.

__Edit:__ Gwern has pointed out that the links I provided go to obsolete code.
The link processing functionality is [documented in comments here](https://github.com/gwern/gwern.net/blob/959ba9c50d327a960e07241b2c7f13630bf8b80c/build/LinkIcon.hs#L15)
and the [link icon rules are here](https://github.com/gwern/gwern.net/blob/959ba9c50d327a960e07241b2c7f13630bf8b80c/build/Config/LinkIcon.hs#L83).
A [non-code list of icons](https://gwern.net/lorem-link#link-icons) exists too.

Now for some pictures. Here are a ton of links from
the ["About"](https://gwern.net/about) page!

{{< figure src="gwern-linkicons-wiki.png" class="fullwide" caption="Links to Wikipedia on Gwern's site" alt="Links to Wikipedia on Gwern's blog. Each link is followed by a superscript \"W\"." >}}

{{< figure src="gwern-linkicons-haskell.png" class="fullwide" caption="A link to Haskell.org on Gwern's site" alt="A link to Haskell.org on Gwern's blog. The link is followed by a superscript lambda." >}}

{{< figure src="gwern-linkicons-zip.png" class="fullwide" caption="Links to zip files on Gwern's site" alt="Links to zip files on Gwern's site. Each link is followed by an archive icon." >}}

#### Bonus: Link Preview

[Gwern's website](https://gwern.net) has no shortage of cool ideas. Among
them is showing link previews on hover. When hovering over a link, the site
displays a popup window that contains a view into that page. I suspect that
this view is also archived somehow, so that it shows the linked page as it
appeared at the time of writing.

To be perfectly honest, I found this feature a little jarring at first.
As I would try to click links, I would get surprised by an additional overlay.
However, as I spent more time browsing the site, I grew quite accustomed to
the previews. I would hover over a link to see the first paragraph and
thus get a short synopsis. This worked really well in tandem with
[per-destination marker icons](#bonus-different-markers-for-different-destinations);
I could tell at a glance whether a link was worth hovering over.

Here's what it looks like:

{{< figure src="gwern-hover.gif" class="medium" caption="Hovering over a link on Gwern's site" alt="Hovering over a link on Gwern's site. After the link is hovered over, a rectangular popup displays a section of the Wikipedia page the link goes to. I scroll through the section to the table of contents." >}}

### RSS Feeds

RSS is a feed standard that allows sites to publish updates. Blogs in
particular can make use of RSS to notify readers of new posts.
RSS feeds are processed by a feed reader, which is a program that polls
a website's `index.xml` file (or other similar files) and reads it to
detect new content. If you opt in to publishing full-text RSS feeds, readers
can read entire posts without ever leaving their feed reader.
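
For reference, the `index.xml` that a feed reader polls is just an XML
document. A trimmed-down sketch, with placeholder titles and URLs:

```xml
<rss version="2.0">
  <channel>
    <title>My Blog</title>
    <link>https://example.com/</link>
    <description>Posts from my blog</description>
    <item>
      <title>A New Post</title>
      <link>https://example.com/blog/a-new-post/</link>
      <pubDate>Sun, 23 Jun 2024 11:03:10 -0700</pubDate>
      <description>A synopsis, or the whole post for full-text feeds.</description>
    </item>
  </channel>
</rss>
```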

RSS makes it easier to keep up with your site. Rather than having
to check in on every author whose content I enjoy on the internet, I can
add their feed URL to my list, and have my feed reader automatically aggregate
all updates for me to read. It's kind of like a social media or news feed,
except that I control what's shown to me, and authors of the blogs I follow
don't need to create accounts and explicitly share their work on social media!

I don't have any particular website to show off in this section; instead,
I'll show you a list of websites that I'm following in my feed reader of choice.
You might notice that a lot of these websites appear elsewhere in this post
as inspiration for other microfeatures.

{{< figure src="feedbin.png" class="small" caption="A screenshot of my Feedbin list" alt="A screenshot of my Feedbin list. Some sites include Hillel Wayne's, Faster than Lime, Drew DeVault, and the Chapel Language Blog" >}}

### Links to Other Sites

I first noticed this feature on Drew DeVault's blog. Every page on Drew's
blog, at the bottom, has a section titled "Articles from blogs I read".
For instance, on [a sample post](https://drewdevault.com/2024/05/24/2024-05-24-Bunnix.html),
at the time of writing, I'm seeing the following footer:

{{< figure src="drew-openring.png" class="fullwide" caption="Links to other blogs from Drew DeVault's blog" alt="Links to other blogs from Drew DeVault's blog. The links consist of three side-by-side boxes, each with a title and brief excerpt." >}}

As indicated in the image, Drew's site in particular uses a program
called [`openring`](https://git.sr.ht/~sircmpwn/openring), which is based on
RSS feeds (another [microfeature I love](#rss-feeds)). However,
_how_ the site finds such articles (statically like `openring`, or
on page load using some JavaScript) isn't hugely important to me. What's
important is that you're promoting other content creators whose work
you enjoy, which is the ethos of my favorite slice of the internet.

### Conclusion + Anything Else?

Those are all the microfeatures that I could think of in a single sitting.
I hope that you have been inspired to integrate features like these into
your own site, or at the very least that you think doing so would be a good idea.

This list isn't exhaustive. I've probably missed some good microfeatures!
If you can think of such a feature, let me know; my email address is linked
in the footer of this article.

Thank you for reading, and cheers!
BIN
content/blog/blog_microfeatures/jamesg-external.png
Normal file
After Width: | Height: | Size: 61 KiB |
BIN
content/blog/blog_microfeatures/lars-toc.png
Normal file
After Width: | Height: | Size: 92 KiB |
BIN
content/blog/blog_microfeatures/plfa-goto.gif
Normal file
After Width: | Height: | Size: 1.3 MiB |
BIN
content/blog/blog_microfeatures/quanta-scroll.gif
Normal file
After Width: | Height: | Size: 6.8 MiB |
@@ -1,7 +1,7 @@
 ---
 title: "How Many Values Does a Boolean Have?"
 date: 2020-08-21T23:05:55-07:00
-tags: ["Java", "Haskell", "C and C++"]
+tags: ["Java", "Haskell", "C", "C++"]
 favorite: true
 ---

@@ -48,7 +48,7 @@ _expression_ in a programming language (like those in the form `fact(n)`)
 or a value in that same programming language (like `5`).

 Dealing with values is rather simple. Most languages have finite numbers,
-usually with \\(2^{32}\\) values, which have type `int`,
+usually with \(2^{32}\) values, which have type `int`,
 `i32`, or something in a similar vein. Most languages also have
 strings, of which there are as many as you have memory to contain,
 and which have the type `string`, `String`, or occasionally
@@ -129,20 +129,20 @@ terminate; that is the [halting problem](https://en.wikipedia.org/wiki/Halting_p
 So, what do we do?

 It turns out to be convenient -- formally -- to treat the result of a diverging computation
-as its own value. This value is usually called 'bottom', and written as \\(\\bot\\).
+as its own value. This value is usually called 'bottom', and written as \(\bot\).
 Since in most programming languages, you can write a nonterminating expression or
 function of any type, this 'bottom' is included in _all_ types. So in fact, the
-possible values of `unsigned int` are \\(\\bot, 0, 1, 2, ...\\) and so on.
-As you may have by now guessed, the same is true for a boolean: we have \\(\\bot\\), `true`, and `false`.
+possible values of `unsigned int` are \(\bot, 0, 1, 2, ...\) and so on.
+As you may have by now guessed, the same is true for a boolean: we have \(\bot\), `true`, and `false`.

 ### Haskell and Bottom
 You may be thinking:

 > Now he's done it; he's gone off the deep end with all that programming language
-theory. Tell me, Daniel, where the heck have you ever encountered \\(\\bot\\) in
+theory. Tell me, Daniel, where the heck have you ever encountered \(\bot\) in
 code? This question was for a software engineering interview, after all!

-You're right; I haven't _specifically_ seen the symbol \\(\\bot\\) in my time
+You're right; I haven't _specifically_ seen the symbol \(\bot\) in my time
 programming. But I have frequently used an equivalent notation for the same idea:
 `undefined`. In fact, here's a possible definition of `undefined` in Haskell:

@@ -152,8 +152,8 @@ undefined = undefined

 Just like `meaningOfLife`, this is a divergent computation! What's more is that
 the type of this computation is, in Haskell, `a`. More explicitly -- and retreating
-to more mathematical notation -- we can write this type as: \\(\\forall \\alpha . \\alpha\\).
-That is, for any type \\(\\alpha\\), `undefined` has that type! This means
+to more mathematical notation -- we can write this type as: \(\forall \alpha . \alpha\).
+That is, for any type \(\alpha\), `undefined` has that type! This means
 `undefined` can take on _any_ type, and so, we can write:

 ```Haskell
@@ -187,7 +187,7 @@ expression. What you're doing is a kind of academic autofellatio.

 Alright, I can accept this criticism. Perhaps just calling a nonterminating
 function a value _is_ far-fetched (even though in [denotational semantics](https://en.wikipedia.org/wiki/Denotational_semantics)
-we _do_ extend types with \\(\\bot\\)). But denotational semantics are not
+we _do_ extend types with \(\bot\)). But denotational semantics are not
 the only place where types are implicitly extend with an extra value;
 let's look at Java.

@@ -294,7 +294,7 @@ question. Its purpose can be one of two things:

 * The interviewer expected a long-form response such as this one.
 This is a weird expectation for a software engineering candidate -
-how does knowing about \\(\\bot\\), `undefined`, or `null` help in
+how does knowing about \(\bot\), `undefined`, or `null` help in
 creating software, especially if this information is irrelevant
 to the company's language of choice?
 * The interviewer expected the simple answer. In that case,
591
content/blog/chapel_runtime_types.md
Normal file
@@ -0,0 +1,591 @@
---
title: "Chapel's Runtime Types as an Interesting Alternative to Dependent Types"
date: 2025-03-02T22:52:01-08:00
tags: ["Chapel", "C++", "Idris", "Programming Languages"]
description: "In this post, I discuss Chapel's runtime types as a limited alternative to dependent types."
---

One day, when I was in graduate school, the Programming Languages research
group was in a pub for a little gathering. Amidst beers, fries, and overpriced
sandwiches, the professor and I were talking about [dependent types](https://en.wikipedia.org/wiki/Dependent_type). Speaking
loosely and imprecisely, these are types that are somehow constructed from
_values_ in a language, like numbers.

For example, in C++, [`std::array`](https://en.cppreference.com/w/cpp/container/array)
is a dependent type. An instantiation of the _type_ `array`, like `array<string, 3>`,
is constructed from the type of its elements (here, `string`) and a value
representing the number of elements (here, `3`). This is in contrast with types
like `std::vector`, which depend only on a type (e.g., `vector<string>` would
be a dynamically-sized collection of strings).

I was extolling the virtues of general dependent types, like you might find
in [Idris](https://www.idris-lang.org/) or [Agda](https://agda.readthedocs.io/en/latest/getting-started/what-is-agda.html):
more precise function signatures! The
{{< sidenote "right" "curry-howard-note" "Curry-Howard isomorphism!" >}}
The Curry-Howard isomorphism is a common theme on this blog. I've
<a href="{{< relref "typesafe_interpreter_revisited#curry-howard-correspondence" >}}">
written about it myself</a>, but you can also take a look at the
<a href="https://en.wikipedia.org/wiki/Curry%E2%80%93Howard_correspondence">
Wikipedia page</a>.
{{< /sidenote >}} The professor was skeptical. He had been excited about
dependent types in the past, but nowadays he was over them. They were cool, he
said, but they had few practical uses. In fact, he posed a challenge:

> Give me one good reason to use dependent types in practice that doesn't
> involve keeping track of bounds for lists and matrices!
{#bounds-quote}

This challenge alludes to fixed-length lists -- [vectors](https://agda.github.io/agda-stdlib/master/Data.Vec.html)
-- which are one of the first dependently-typed data structures one learns about.
Matrices are effectively vectors-of-vectors. In fact, even in giving my introductory
example above, I demonstrated the C++ equivalent of a fixed-length list, retroactively
supporting the professor's point.

It's not particularly important to write down how I addressed the challenge;
suffice it to say that the notion resonated with some of the other
students present in the pub. In the midst of practical development, how much
of dependent types' power can you leverage, and how much power do you pay
for but never use?

A second round of beers arrived. The argument was left largely unresolved,
and conversation flowed to other topics. Eventually, I graduated, and started
working on the [Chapel language](https://chapel-lang.org/) team (I also
[write on the team's blog](https://chapel-lang.org/blog/authors/daniel-fedorin/)).

When I started looking at Chapel programs, I could not believe my eyes...

### A Taste of Chapel's Array Types

Here's a simple Chapel program that creates an array of 10 integers.

```Chapel
var A: [0..9] int;
```

Do you see the similarity to the `std::array` example above? Of course, the
syntax is quite different, but in _essence_ I think the resemblance is
uncanny. Let's mangle the type a bit --- producing invalid Chapel programs ---
just for the sake of demonstration.

```Chapel
var B: array(0..9, int); // first, strip the syntax sugar
var C: array(int, 0..9); // swap the order of the arguments to match C++
```

Only one difference remains: in C++, arrays are always indexed from zero. Thus,
writing `array<int, 10>` would implicitly create an array whose indices start
with `0` and end with `9`. In Chapel, array indices can start at values other
than zero (it happens to be useful for elegantly writing numerical programs),
so the type explicitly specifies a lower and an upper bound. Other than that,
though, the two types look very similar.

In general, Chapel arrays have a _domain_, typically stored in variables like `D`.
The domain of `A` above is `{0..9}`. This domain is part of the array's type.
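
To make that concrete, here is a minimal sketch (the variable names are
mine) of declaring a domain explicitly and sharing it between arrays:

```Chapel
// A named domain; note that `D` is an ordinary runtime value.
var D = {0..9};

// Two arrays declared over that domain; it becomes part of their types.
var A: [D] int;
var B: [D] real;

writeln(A.domain); // prints {0..9}
```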

Before I move on, I'd like to pause and state a premise that is crucial
for the rest of this post: __I think knowing the size of a data structure,
like `std::array` or Chapel's `[0..9] int`, is valuable__. If this premise
were not true, there'd be no reason to prefer `std::array` to `std::vector`, or
to care that Chapel has indexed arrays. However, having this information
can help in numerous ways, such as:

* __Enforcing compatible array shapes.__ For instance, the following Chapel
  code would require the two arrays passed to function `doSomething` to have
  the same size (the `?D` syntax is explained in the sketch after this list).

  ```Chapel
  proc doSomething(people: [?D] person, data: [D] personInfo) {}
  ```

  Similarly, we can enforce that a function's input has the same shape
  as its output:

  ```Chapel
  proc transform(input: [?D] int): [D] string;
  ```
* __Consistency in generics__. Suppose you have a generic function that declares
  a new variable of a given type, and just returns it:

  ```Chapel
  proc defaultValue(type argType) {
    var x: argType;
    return x;
  }
  ```

  Code like this exists in "real" Chapel software, by the way --- the example
  is not contrived. By including the bounds in the array type, we can
  ensure that `x` is appropriately allocated. Then, `defaultValue([1,2,3].type)`
  would return an array of three default-initialized integers.
* __Eliding bounds checking__. Bounds checking is useful for safety,
  since it ensures that programs don't read or write past the end of allocated
  memory. However, bounds checking is also slow. Consider the following function that
  sums two arrays:

  ```Chapel
  proc sumElementwise(A: [?D] int, B: [D] int) {
    var C: [D] int;
    for idx in D do
      C[idx] = A[idx] + B[idx];
  }
  ```

  Since arrays `A`, `B`, and `C` have the same domain `D`, we don't need
  to do bounds checking when accessing any of their elements. I don't believe
  this is currently an optimization in Chapel, but it's certainly on the
  table.
* __Documentation__. Including the size of the array as part of the type
  signature clarifies the intent of the code being written. For instance,
  in the following function:

  ```Chapel
  proc sendEmails(numEmails: int, destinationAddrs: [1..numEmails] address) { /* ... */ }
  ```

  It's clear from the type of `destinationAddrs` that there ought to
  be exactly as many addresses as the number of emails that should
  be sent.
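
A note on the `?D` syntax that the first bullet leans on: it queries the
domain of the actual argument and binds it to `D`, which the rest of the
signature and body can reuse. A minimal sketch of the round trip (the
function and values are my own):

```Chapel
proc double(A: [?D] int): [D] int {
  var result: [D] int;        // same domain as the argument
  for idx in D do
    result[idx] = 2 * A[idx];
  return result;
}

var A: [1..4] int = 5;        // scalar promotion: every element is 5
var B = double(A);
writeln(B);                   // 10 10 10 10
writeln(B.domain);            // {1..4}
```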

Okay, recap: C++ has `std::array`, which is a dependently-typed container
that represents an array with a fixed number of elements. Chapel has something
similar. I think these types are valuable.

At this point, it sort of looks like I'm impressed with Chapel for copying a C++
feature from 2011. Not so! As I played with Chapel programs more and more,
I found that its arrays supported patterns that I knew I couldn't write in C++.
The underlying foundation of Chapel's array types is quite unlike any other.
Before we get to that, though, let's take a look at how dependent types
are normally used (by us mere mortal software engineers).

### Difficulties with Dependent Types

Let's start by looking at a simple operation on fixed-length lists: reversing them.
One might write a reverse function for "regular" lists, ignoring details
like ownership and copying, that looks like this:

```C++
std::vector<int> reverse(std::vector<int>);
```

This function is not general: it won't help us reverse lists of
strings, for instance. The "easy fix" is to replace `int` with some kind
of placeholder that can be replaced with any type.

```C++
std::vector<T> reverse(std::vector<T>);
```

You can try compiling this code, but you will immediately run into an error.
What the heck is `T`? Normally,
when we name a variable, function, or type (e.g., by writing `vector`, `reverse`),
we are referring to its declaration somewhere else. At this time, `T` is not
declared anywhere. It just "appears" in our function's type. To fix this,
we add a declaration for `T` by turning `reverse` into a template:

```C++
template <typename T>
std::vector<T> reverse(std::vector<T>);
```

The new `reverse` above takes two arguments: a type and a list of values of
that type. So, to _really_ call this `reverse`, we need to feed the type
of our list's elements into it. This is normally done automatically
(in C++ and otherwise) but under the hood, invocations might look like this:

```C++
reverse<int>({1,2,3}); // produces 3, 2, 1
reverse<string>({"world", "hello"}); // produces "hello", "world"
```

This is basically what we have to do to write `reverse` on `std::array`, which
includes an additional parameter that encodes its length. We might start with
the following (using `n` as a placeholder for length, and observing that
reversing an array doesn't change its length):

```C++
std::array<T, n> reverse(std::array<T, n>);
```

Once again, to make this compile, we need to add template parameters for `T` and `n`.

```C++
template <typename T, size_t n>
std::array<T, n> reverse(std::array<T, n>);
```

Now, you might be asking...

{{< dialog >}}
{{< message "question" "reader" >}}
This section is titled "Difficulties with Dependent Types". What's the difficulty?
{{< /message >}}
{{< /dialog >}}

Well, here's the kicker. C++ templates are a __compile-time mechanism__. As
a result, template arguments (like `T` and `n`) must be known when the
program is being compiled. This, in turn, means
{{< sidenote "right" "deptype-note" "the following program doesn't work:" >}}
The observant reader might have noticed that one of the Chapel programs we
saw above, <code>sendEmails</code>, does something similar. The
<code>numEmails</code> argument is used in the type of the
<code>destinationAddrs</code> parameter. That program is valid Chapel.
{{< /sidenote >}}

```C++
void buildArray(size_t len) {
  std::array<int, len> myArray; // error: `len` is not a compile-time constant
  // do something with myArray
}
```

You can't use these known-length types like `std::array` with any length
that is not known at compile-time. But that's a lot of things! If you're reading
from an input file, chances are, you don't know how big that file is. If you're
writing a web server, you likely don't know the length of the HTTP requests.
With every setting a user can tweak when running your code, you sacrifice the
ability to use templated types.
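
For contrast, and as a teaser for where this post is headed, the same
pattern is unremarkable in Chapel, since array bounds needn't be
compile-time constants. A sketch, reading the length from standard input:

```Chapel
use IO;

proc buildArray(len: int) {
  var myArray: [0..<len] int; // bounds come from a runtime value
  // do something with myArray
  writeln(myArray.size);
}

var len = read(int);
buildArray(len);
```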

Also, how do you _return_ a `std::array`? If the size of the returned array is
known in advance, you just list that size:

```C++
std::array<int, 10> createArray();
```

If the size is not known at compile-time, you might want to do something like
the following --- using an argument `n` in the type of the returned array ---
but it would not compile:

```C++
auto computeNNumbers(size_t n) -> std::array<int, n>; // not valid C++
```

Moreover, you can't have `createArray` figure out the required array size,
and _then_ return an array that big, even if the body of `createArray` used
only compile-time computations. What you would need is a "bundle" of a value
and a type that is somehow built from that value.

```C++
// magic_pair is invented syntax, will not even remotely work
auto createArray() -> magic_pair<size_t size, std::array<int, size>>;
```

This pair contains a `size` (suppose it's known at compilation time for
the purposes of appeasing C++) as well as an array that uses that `size`
as its template argument. This is not real C++ -- not even close -- but
such pairs are a well-known concept. They are known as
[dependent pairs](https://unimath.github.io/agda-unimath/foundation.dependent-pair-types.html),
or, if you're trying to impress people, \(\Sigma\)-types. In Idris, you
could write `createArray` like this:

```Idris
createArray : () -> (n : Nat ** Vec n Int)
```

There are languages out there -- that are not C++, alas -- that support
dependent pairs, and as a result make it more convenient to use types that
depend on values. Not only that, but a lot of these languages do not force
dependent types to be determined at compile-time. You could write that
coveted `readArrayFromFile` function:

```Idris
readArrayFromFile : String -> IO (n : Nat ** Vec n String)
```

Don't mind `IO`; in pure languages like Idris, this type is a necessity when
interacting with the outside world, such as reading data in and sending it
out. The key is that
`readArrayFromFile` produces, at runtime, a pair of `n`, which is the size
of the resulting array, and a `Vec` of that many `String`s (e.g., one string
per line of the file).

Dependent pairs are cool and very general. However, the end result of using
types whose bounds are not determined at compile-time is that you're
_required_ to use dependent pairs. Thus, you must always carry the array's length
together with the array itself.

The bottom line is this:

* In true dependently typed languages, a type that depends on a value (like `Vec`
  in Idris) lists that value explicitly. When this value is listed by
  referring to an identifier --- like `n` in `Vec n String` above --- this
  identifier has to be defined somewhere, too. This necessitates dependent pairs,
  in which the first element is used syntactically as the "definition point"
  of a type-level value. For example, in the following piece of code:

  ```Idris
  (n : Nat ** Vec n String)
  ```

  The `n : Nat` part of the pair serves both to say that the first element
  is a natural number, and to introduce a variable `n` that refers to
  this number so that the second type (`Vec n String`) can refer to it.

  A lot of the time, you end up carrying this extra value (bound to `n` above)
  with your type.
* In more mainstream languages, things are even more restricted: dependently
  typed values are a compile-time property, and thus cannot be used with
  runtime values like data read from a file, arguments passed in to a function,
  etc.

### Hiding Runtime Values from the Type

Let's try to think of ways to make things more convenient. First of all, as
we saw, in Idris, it's possible to use runtime values in types. Not only that,
but Idris is a compiled language, so presumably we can compile programs whose
dependent types mention runtime values. The trick is to forget some information:
turn a vector `Vec n String` into two values (the size of the vector and the
vector itself), and forget -- for the purposes of generating code -- that they're
related. Whenever you pass in a `Vec n String`, you can compile that similarly
to how you'd compile passing in a `Nat` and `List String`. Since the program has
already been type checked, you can be assured that you won't encounter cases
in which the size and the actual vector are mismatched, or anything else of that
nature.

Additionally, you don't always need the length of the vector at all. In a
good chunk of Idris code, the size arguments are only used to ensure type
correctness and rule out impossible cases; they are never accessed at runtime.
As a result, you can _erase_ the size of the vector altogether. In fact,
[Idris 2](https://github.com/idris-lang/Idris2/) leans on [Quantitative Type Theory](https://bentnib.org/quantitative-type-theory.html)
to make erasure easier.

At this point, one way or another, we've "entangled" the vector with a value
representing its size:

* When a vector of some (unknown, but fixed) length needs to be produced from
  a function, we use dependent pairs.
* Even in other cases, when compiling, we end up treating a vector as a
  length value and the vector itself.

Generally speaking, a good language design practice is to hide extraneous
complexity, and to remove as much boilerplate as possible. If the size
value of a vector is always joined at the hip with the vector, can we
avoid having to explicitly write it?

This is pretty much exactly what Chapel does. It _allows_ explicitly writing
the domain of an array as part of its type, but doesn't _require_ it. When
you do write it (re-using my original snippet above):

```Chapel
var A: [0..9] int;
```

What you are really doing is creating a value (the [range](https://chapel-lang.org/docs/primers/ranges.html) `0..9`),
and entangling it with the type of `A`. This is very similar to what a language
like Idris would do under the hood to compile a `Vec`, though it's not quite
the same.

At the same time, you can write code that omits the bounds altogether:

```Chapel
proc processArray(A: [] int): int;
proc createArray(): [] int;
```

In all of these examples, there is an implicit runtime value (the bounds)
that is associated with the array's type. However, we are never forced to
explicitly thread through or spell out a size. Where reasoning about domains is not
necessary, Chapel's domains are hidden away. Chapel refers to the implicitly
present value associated with an array type as its _runtime type_.
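
Here is a sketch of what that buys you (my own example): a function can
return an array whose bounds are computed at runtime, and the caller can
still query them, with no dependent pair in sight:

```Chapel
proc createArray(n: int): [] int {
  var A: [1..n] int;
  return A;
}

var A = createArray(5);
writeln(A.domain); // {1..5}, carried along implicitly by the runtime type
```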

I hinted earlier that things are not quite the same in this representation
as they are in my simplified model of Idris. In Idris, as I mentioned earlier,
the values corresponding to vectors' indices can be erased if they are not used.
In Chapel, this is not the case --- a domain always exists at runtime. At the
surface level, this means that you may pay for more than what you use. However,
domains enable a number of interesting patterns of array code. We'll get
to that in a moment; first, I want to address a question that may be on
your mind:

{{< dialog >}}
{{< message "question" "reader" >}}
At this point, this looks just like keeping a <code>.length</code> field as
part of the array value. Most languages do this. What's the difference
between this and Chapel's approach?
{{< /message >}}
{{< /dialog >}}

This is a fair question. The key difference is that the length exists even if an array
does not. The following is valid Chapel code (re-using the `defaultValue`
snippet above):

```Chapel
proc defaultValue(type argType) {
  var x: argType;
  return x;
}

proc doSomething() {
  type MyArray = [1..10] int;
  var A = defaultValue(MyArray);
}
```

Here, we created an array `A` with the right size (10 integer elements)
without having another existing array as a reference. This might seem like
a contrived example (I could've just as well written `var A: [1..10] int`),
but the distinction is incredibly helpful for generic programming. Here's
a piece of code from the Chapel standard library, which implements
a part of Chapel's [reduction](https://chapel-lang.org/docs/primers/reductions.html) support:

{{< githubsnippet "chapel-lang/chapel" "e8ff8ee9a67950408cc6d4c3220ac647817ddae3" "modules/internal/ChapelReduce.chpl" "Chapel" 146 >}}
inline proc identity {
  var x: chpl__sumType(eltType); return x;
}
{{< /githubsnippet >}}

Identity elements are important when performing operations like sums and products,
for many reasons. For one, they tell you what the sum (e.g.) should be when there
are no elements at all. For another, they can be used as an initial value for
an accumulator. In Chapel, when you are performing a reduction, there is a
good chance you will need several accumulators --- one for each thread performing
a part of the reduction.

That `identity` function looks almost like `defaultValue`! Since it builds the
identity element from the type, and since the type includes the array's dimensions,
summing an array-of-arrays, even if it's empty, will produce the correct output.

```Chapel
type Coordinate = [1..3] real;

var Empty: [0..<0] Coordinate;
writeln(+ reduce Empty); // sum up an empty list of coordinates
```

As I mentioned before, having the domain be part of the type can also enable
indexing optimizations --- without any need for [interprocedural analysis](https://en.wikipedia.org/wiki/Interprocedural_optimization) ---
in functions like `sumElementwise`:

```Chapel
proc sumElementwise(A: [?D] int, B: [D] int) {
  var C: [D] int;
  for idx in D do
    C[idx] = A[idx] + B[idx];
}
```

The C++ equivalent of this function --- using `std::vector` to allow
arbitrarily-sized lists of numbers read from user input, and `.at` to enable
bounds checks --- does not include enough information for this optimization
to be possible.

```C++
void sumElementwise(std::vector<int> A, std::vector<int> B) {
  std::vector<int> C(A.size());

  for (size_t i = 0; i < A.size(); i++) {
    C.at(i) = A.at(i) + B.at(i);
  }
}
```

All in all, this makes for a very interesting mix of features:

* __Chapel arrays have their bounds as part of their types__, like `std::array` in C++
  and `Vec` in Idris. This enables all the benefits I've described above.
* __The bounds don't have to be known at compile-time__, like all dependent
  types in Idris. This means you can read arrays from files (e.g.) and still
  reason about their bounds as part of the type system.
* __Domain information can be hidden when it's not used__, and does not require
  explicit additional work like template parameters or dependent pairs.

Most curiously, runtime types only extend to arrays and domains. In that sense,
they are not a general-purpose replacement for dependent types. Rather,
they make arrays and domains special, and single out the exact case my
professor was [talking about in the introduction](#bounds-quote). Although
at times I've [twisted Chapel's type system in unconventional ways](https://chapel-lang.org/blog/posts/linear-multistep/)
to simulate dependent types, rarely have I felt a need for them while
programming in Chapel. In that sense --- and in the "practical software engineering"
domain --- I may have been proven wrong.

### Pitfalls of Runtime Types

Should all languages do things the way Chapel does? I don't think so. Like
most features, runtime types like Chapel's are a language design
tradeoff. Though I've covered their motivation and semantics, perhaps
I should mention the downsides.

The greatest downside is that, generally speaking, _types are not always a
compile-time property_. We saw this earlier with `MyArray`:

```Chapel
type MyArray = [1..10] int;
```

Here, the domain of `MyArray` (one-dimensional with bounds `1..10`) is a runtime
|
||||||
|
value. It has an
|
||||||
|
{{< sidenote "right" "dce-note" "execution-time cost." >}}
|
||||||
|
The execution-time cost is, of course, modulo <a href="https://en.wikipedia.org/wiki/Dead-code_elimination">dead code elimination</a> etc.. If
|
||||||
|
my snippet made up the entire program being compiled, the end result would
|
||||||
|
likely do nothing, since <code>MyArray</code> isn't used anywhere.
|
||||||
|
{{< /sidenote >}}
|
||||||
|
Moreover, types that serve as arguments to functions (like `argType` for
|
||||||
|
`defaultValue`), or as their return values (like the result of `chpl__sumType`)
|
||||||
|
also have an execution-time backing. This is quite different from most
|
||||||
|
compiled languages. For instance, in C++, templates are "stamped out" when
|
||||||
|
the program is compiled. A function with a `typename T` template parameter
|
||||||
|
called with type `int`, in terms of generated code, is always the same as
|
||||||
|
a function where you search-and-replaced `T` with `int`. This is called
|
||||||
|
[monomorphization](https://en.wikipedia.org/wiki/Monomorphization), by the
|
||||||
|
way. In Chapel, however, if the function is instantiated with an array type,
|
||||||
|
it will have an additional parameter, which represents the runtime component
|
||||||
|
of the array's type.
|
||||||
|
|
||||||
|
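To make the contrast concrete, here's a minimal C++ sketch of monomorphization. This is my own toy example (this C++ `defaultValue` is a hypothetical counterpart of the Chapel one above), not code from either project:

```C++
#include <iostream>

// A C++ analogue of Chapel's defaultValue: the template is "stamped out"
// per instantiating type at compile time, and no type information survives
// to runtime.
template <typename T>
T defaultValue() {
  T x{}; // value-initialization: 0 for int, and so on
  return x;
}

int main() {
  // defaultValue<int> compiles to the same code as a hand-written
  // function that returns a default int.
  std::cout << defaultValue<int>() << "\n"; // prints 0
}
```

A Chapel instantiation of `defaultValue` with an array type, by contrast, would also receive the domain value behind the scenes.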
The fact that types are runtime entities means that compile-time type checking
is insufficient. Take, for instance, the above `sendEmails` function:

```Chapel
proc sendEmails(numEmails: int, destinationAddrs: [1..numEmails] address) { /* ... */ }
```

Since `numEmails` is a runtime value (it's a regular argument!), we can't ensure
at compile-time that an array value matches the `[1..numEmails] address`
type. As a result, Chapel defers bounds checking to when the `sendEmails`
function is invoked.
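In C++ terms, the effect is roughly as if the callee began with an assertion. The following is a caricature I wrote for illustration (`Address` is a stand-in type of my own), not what the Chapel compiler literally emits:

```C++
#include <cassert>
#include <string>
#include <vector>

using Address = std::string; // stand-in for the post's 'address' type

// A hand-written approximation of the check Chapel performs when an argument
// is bound to the formal 'destinationAddrs: [1..numEmails] address'.
void sendEmails(int numEmails, const std::vector<Address>& destinationAddrs) {
  assert(destinationAddrs.size() == (size_t) numEmails &&
         "array argument does not match its declared bounds");
  /* ... */
}
```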
This leads to some interesting performance considerations. Take two Chapel records
(similar to `struct`s in C++) that simply wrap a value. In one of them,
we provide an explicit type for the field, and in the other, we leave the field
type generic.

```Chapel
record R1 { var field: [1..10] int; }
record R2 { var field; }

var A = [1,2,3,4,5,6,7,8,9,10];
var r1 = new R1(A);
var r2 = new R2(A);
```

In a conversation with a coworker, I learned that these are not the same.
That's because the record `R1` explicitly specifies a type
for `field`. Since the type has a runtime component, the constructor
of `R1` will actually perform a runtime check to ensure that the argument
has 10 elements. `R2` will not do this, since there isn't any other type
to check against.

Of course, the mere existence of an additional runtime component is a performance
consideration. To ensure that Chapel programs perform as well as possible,
the Chapel standard library attempts to avoid using runtime components
wherever possible. This leads to a distinction between a "static type"
(known at compile-time) and a "dynamic type" (requiring a runtime value).
The `chpl__sumType` function we saw above uses static components of
types, because we don't want each call to `+ reduce` to attempt to run a number
of extraneous runtime queries.

### Conclusion

Though runtime types are not a silver bullet, I find them to be an elegant
middle-ground solution to the problem of tracking array bounds. They enable
optimizations, generic programming, and more, without the complexity of
a fully dependently-typed language. They are also quite unlike anything I've
seen in any other language.

What's more, this post only scratches the surface of what's possible using
arrays and domains. Besides encoding array bounds, domains include information
about how an array is distributed across several nodes (see the
[distributions primer](https://chapel-lang.org/docs/primers/distributions.html)),
and how it's stored in memory (see the [sparse computations](https://chapel-lang.org/blog/posts/announcing-chapel-2.3/#sparse-computations)
section of the recent 2.3 release announcement). In general, they are a very
flavorful component of Chapel's "special sauce" as a language for parallel
computing.

You can read more about arrays and domains in the [corresponding primer](https://chapel-lang.org/docs/primers/arrays.html).

content/blog/chapel_x_macros.md (new file, 700 lines)

---
title: "My Favorite C++ Pattern: X Macros"
date: 2023-10-14T15:38:17-07:00
tags: ["C++", "Chapel", "Compilers"]
description: "In this post, I talk about my favorite C/C++ pattern involving macros."
---

When I first joined the [Chapel](https://github.com/chapel-lang/chapel/) team,
one pattern used in its C++-based compiler made a strong impression on me. Since
then, I've used the pattern many more times, and have been very satisfied with
how it turned out. However, it feels like the pattern is relatively unknown, so
I thought I'd show it off, along with some of its applications in the
[Chapel compiler](https://github.com/chapel-lang/chapel/). I've slightly tweaked
a lot of the snippets I directly show in this article for the sake of simpler
presentation; I've included links to the original code (available on GitHub)
if you want to see the unabridged version.

Broadly speaking, the "X Macros" pattern is about generating code. If you have a _lot_
of repetitive code to write (declaring many variables or classes, performing
many very similar actions, etc.), this pattern can save a lot of time, lead
to much more maintainable code, and reduce the effort required to add _more_
code.

I will introduce the pattern in its simplest form with my first example:
[interning strings](https://en.wikipedia.org/wiki/String_interning).

### Application 1: String Interning

The Chapel compiler interns a lot of its strings. This way, it can reduce the
memory footprint of keeping identifiers in memory (every string `"x"` is
actually the _same_ string) and make for much faster equality comparisons
(you can just perform a pointer comparison!). Generally, a `Context` class
is used to manage interning state. A new interned string can be constructed
using the context object in the following manner:

```C++
UniqueString::get(ctxPtr, "the string");
```

Effectively, this performs a search of the currently existing unique strings.
If one with the content (`"the string"` in this case) doesn't exist, it's
created and registered with the `Context`. Otherwise, the existing string is
returned. Some strings, however, occur a lot in the compiler, to the point that
it would be inefficient to perform the whole "find-or-create" operation every
time. One example is the `"this"` string, which is an identifier with a lot of
special behavior in the language (much like `this` in languages such as Java).
To support such frequent-flier strings, the compiler initializes them once,
and creates a variable per-string that can be accessed to get that string's value.
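If you haven't seen interning before, the find-or-create idea fits in a dozen lines of C++. This is a toy of my own, not Chapel's `Context`:

```C++
#include <string>
#include <unordered_set>

// A toy interner: each distinct string is stored exactly once, so two
// interned strings are equal precisely when their pointers are equal.
class Interner {
  std::unordered_set<std::string> strings_;
 public:
  const std::string* get(const std::string& s) {
    return &*strings_.insert(s).first; // find-or-create
  }
};

// usage: ctx.get("this") == ctx.get("this") holds, and the equality test
// is a single pointer comparison rather than a character-by-character one.
```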
There's that repetitive code. Defining a brand new variable for each string,
of which there are around 100 at the time of writing, is a lot of boilerplate.
There are also at least two places where code needs to be added:
{{< sidenote "right" "template-note" "once in the declaration of the variables, once in the code that initializes them." >}}
A third use in the compiler is actually a variadic template defined over
character arrays. The template is defined and specialized in such a way that
you can refer to a variable by its string contents (i.e., you can write
<code>USTR("the string")</code> instead of
<code>theStringVariable</code>).
{{< /sidenote >}}
It would be very easy to accidentally modify the former but not the latter,
especially for developers not familiar with how these "common strings" are
implemented.

This is where the X Macros come in. If you look around the compiler source code,
there's a header file that looks something like the following:

{{< githubsnippet "chapel-lang/chapel" "cd108338d321d0b3edf6258e0b2a58459d88a348" "frontend/include/chpl/framework/all-global-strings.h" "C++" 31 >}}
X(align    , "align")
X(atomic   , "atomic")
X(bool_    , "bool")
X(borrow   , "borrow")
X(borrowed , "borrowed")
X(by       , "by")
X(bytes    , "bytes")
// A lot more of these...
{{< /githubsnippet >}}

What's this `X` thing? That right there is the essence of the pattern: the macro
`X` _isn't defined in the header!_ Effectively, `all-global-strings.h` is just
a list, and we can "iterate" over this list to generate some code for each
one of its elements, in as many places as we want. What I mean by this is
that we can then write code like the following:

{{< githubsnippet "chapel-lang/chapel" "cd108338d321d0b3edf6258e0b2a58459d88a348" "frontend/include/chpl/framework/global-strings.h" "C++" 76 >}}
struct GlobalStrings {
#define X(field, str) UniqueString field;
#include "all-global-strings.h"
#undef X
};
{{< /githubsnippet >}}

In this case, we define the macro `X` to ignore the value of the string (we're
just declaring it here), and create a new `UniqueString` variable declaration.
Since the declaration is inside the `GlobalStrings` struct, this ends up
creating a field. Just like that, we've declared a class with over 100
fields. Initialization is equally simple:

{{< githubsnippet "chapel-lang/chapel" "cd108338d321d0b3edf6258e0b2a58459d88a348" "frontend/lib/framework/Context.cpp" "C++" 49 >}}
GlobalStrings globalStrings;
Context rootContext;

static void initGlobalStrings() {
#define X(field, str) globalStrings.field = UniqueString::get(&rootContext, str);
#include "chpl/framework/all-global-strings.h"
#undef X
}
{{< /githubsnippet >}}

With this, we've completely automated the code for both declaring and
initializing all 100 of our unique strings. Adding a new string doesn't require
a developer to know all of the places where this is implemented: just by
modifying the `all-global-strings.h` header with a new call to `X`, they can
add both a new variable and code to initialize it. Pretty robust!
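To see exactly what we've automated, here is (approximately) what the preprocessor produces for the first two entries of `all-global-strings.h`, expanded by hand:

```C++
// From '#define X(field, str) UniqueString field;':
struct GlobalStrings {
  UniqueString align;
  UniqueString atomic;
  // ... one field per X(...) entry ...
};

// From the initializing definition of X:
static void initGlobalStrings() {
  globalStrings.align = UniqueString::get(&rootContext, "align");
  globalStrings.atomic = UniqueString::get(&rootContext, "atomic");
  // ... one assignment per X(...) entry ...
}
```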
### Application 2: AST Class Hierarchy

Although the interned strings are an excellent first example, they weren't the
first usage of X Macros that I encountered in the Chapel compiler. Beyond
strings, the compiler uses X Macros to represent its whole class hierarchy
of [abstract syntax tree (AST)](https://en.wikipedia.org/wiki/Abstract_syntax_tree)
nodes. Here, the code is actually a bit more complicated: the
class hierarchy isn't a _list_ like the strings were; it is itself a tree.
To represent such a structure, we need more than a single `X` macro; the
compiler went with `AST_NODE`, `AST_BEGIN_SUBCLASSES`, and `AST_END_SUBCLASSES`.
Here's what that looks like:

{{< githubsnippet "chapel-lang/chapel" "cd108338d321d0b3edf6258e0b2a58459d88a348" "frontend/include/chpl/uast/uast-classes-list.h" "C++" 96 >}}
// Other AST nodes above...

AST_BEGIN_SUBCLASSES(Loop)
  AST_NODE(DoWhile)
  AST_NODE(While)

  AST_BEGIN_SUBCLASSES(IndexableLoop)
    AST_NODE(BracketLoop)
    AST_NODE(Coforall)
    AST_NODE(For)
    AST_NODE(Forall)
    AST_NODE(Foreach)
  AST_END_SUBCLASSES(IndexableLoop)

AST_END_SUBCLASSES(Loop)

// Other AST nodes below...
{{< /githubsnippet >}}

The class hierarchy defined in this header, called `uast-classes-list.h`, is
used for a lot of things, both in the compiler itself and in some libraries
that _use_ the compiler. I'll go through the use cases in turn.

#### Tags and Dynamic Casting

First, to deal with a general absence of
[RTTI](https://en.wikipedia.org/wiki/Run-time_type_information), the hierarchy header
is used to declare a "tag" enum. Each AST node has a tag matching its class;
this allows us to inspect the AST and perform safe casts similar to `dynamic_cast`.
Note that for parent classes (defined via `BEGIN_SUBCLASSES`), we actually
end up creating _two_ tags: one `START_...` and one `END_...`. The reason
for this will become clear in a moment.

{{< githubsnippet "chapel-lang/chapel" "cd108338d321d0b3edf6258e0b2a58459d88a348" "frontend/include/chpl/uast/AstTag.h" "C++" 36 >}}
enum AstTag {
#define AST_NODE(NAME) NAME ,
#define AST_BEGIN_SUBCLASSES(NAME) START_##NAME ,
#define AST_END_SUBCLASSES(NAME) END_##NAME ,
#include "chpl/uast/uast-classes-list.h"
#undef AST_NODE
#undef AST_BEGIN_SUBCLASSES
#undef AST_END_SUBCLASSES
  NUM_AST_TAGS,
  AST_TAG_UNKNOWN
};
{{< /githubsnippet >}}

The above snippet makes `AstTag` contain elements such as `DoWhile`,
`While`, `START_Loop`, and `END_Loop`. For convenience, we also add a couple
of other elements: `NUM_AST_TAGS`, which is
{{< sidenote "right" "numbering-node" "automatically assigned the number of tags we generated," >}}
This is because C++ assigns integer values to enum elements sequentially, starting
at zero.
{{< /sidenote >}}
and a generic "unknown tag" value.
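Expanded by hand over the loop-related portion of the list from earlier, the generated enum looks something like this (abridged, with indentation added to show the nesting):

```C++
enum AstTag {
  // ... tags for the AST nodes above ...
  START_Loop,
    DoWhile,
    While,
    START_IndexableLoop,
      BracketLoop,
      Coforall,
      For,
      Forall,
      Foreach,
    END_IndexableLoop,
  END_Loop,
  // ... tags for the AST nodes below ...
  NUM_AST_TAGS,
  AST_TAG_UNKNOWN
};
```

Notice that every loop's tag lands strictly between `START_Loop` and `END_Loop`; that ordering is about to come in handy.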
Having generated the enum elements in this way, we can write query functions.
This way, the API consumer can write `isLoop(tag)` instead of manually performing
a comparison. Code generation here is actually split into two distinct forms
of `is___` methods: those for concrete AST nodes (`DoWhile`, `While`) and
those for abstract base classes (`Loop`). The reason for this is simple:
only an `AstTag::DoWhile` represents a do-while loop, but both `DoWhile`
and `While` are instances of `Loop`. So, `isLoop` should return true for both.

This is where the `START_...` and `END_...` enum elements come in. Reading
the header file top-to-bottom, we first end up generating `START_Loop`,
then `DoWhile` and `While`, and then `END_Loop`. Since C++ assigns integer
values to enum elements sequentially, to check if a tag "extends" a base class, it's
sufficient to check if its value is greater than the `START` token, and
smaller than the `END` token --- this means it was declared within the
matching pair of `BEGIN_SUBCLASSES` and `END_SUBCLASSES`.

{{< githubsnippet "chapel-lang/chapel" "cd108338d321d0b3edf6258e0b2a58459d88a348" "frontend/include/chpl/uast/AstTag.h" "C++" 59 >}}
// define is___ for leaf and regular nodes
// (not yet for abstract parent classes)
#define AST_NODE(NAME) \
  static inline bool is##NAME(AstTag tag) { \
    return tag == NAME; \
  }
#define AST_BEGIN_SUBCLASSES(NAME)
#define AST_END_SUBCLASSES(NAME)
// Apply the above macros to uast-classes-list.h
#include "chpl/uast/uast-classes-list.h"
// clear the macros
#undef AST_NODE
#undef AST_BEGIN_SUBCLASSES
#undef AST_END_SUBCLASSES

// define is___ for abstract parent classes
#define AST_NODE(NAME)
#define AST_BEGIN_SUBCLASSES(NAME) \
  static inline bool is##NAME(AstTag tag) { \
    return START_##NAME < tag && tag < END_##NAME; \
  }
#define AST_END_SUBCLASSES(NAME)
// Apply the above macros to uast-classes-list.h
#include "chpl/uast/uast-classes-list.h"
// clear the macros
#undef AST_NODE
#undef AST_BEGIN_SUBCLASSES
#undef AST_END_SUBCLASSES
{{< /githubsnippet >}}

These helpers are quite convenient. Here are a few examples of what we end up
with:

```C++
isFor(AstTag::For)             // Returns true; a 'for' loop is indeed a 'for' loop.
isIndexableLoop(AstTag::For)   // Returns true; a 'for' loop is "indexable" ('for i in ...')
isLoop(AstTag::For)            // Returns true; a 'for' loop is a loop.
isFor(AstTag::While)           // Returns false; a 'while' loop is not a 'for' loop.
isIndexableLoop(AstTag::While) // Returns false; a 'while' loop uses a boolean condition, not an index
isLoop(AstTag::While)          // Returns true; a 'while' loop is a loop.
```

On the top-level AST node class, we generate `isWhateverNode` and
`toWhateverNode` for each AST subclass. Thus, user code is able to inspect the
AST and perform (checked) casts using plain methods. I omit `isWhateverNode`
here for brevity (its definition is very simple), and include only
`toWhateverNode`.

{{< githubsnippet "chapel-lang/chapel" "cd108338d321d0b3edf6258e0b2a58459d88a348" "frontend/include/chpl/uast/AstNode.h" "C++" 313 >}}
#define AST_TO(NAME) \
  const NAME * to##NAME() const { \
    return this->is##NAME() ? (const NAME *)this : nullptr; \
  } \
  NAME * to##NAME() { \
    return this->is##NAME() ? (NAME *)this : nullptr; \
  }
#define AST_NODE(NAME) AST_TO(NAME)
#define AST_LEAF(NAME) AST_TO(NAME)
#define AST_BEGIN_SUBCLASSES(NAME) AST_TO(NAME)
#define AST_END_SUBCLASSES(NAME)
// Apply the above macros to uast-classes-list.h
#include "chpl/uast/uast-classes-list.h"
// clear the macros
#undef AST_NODE
#undef AST_LEAF
#undef AST_BEGIN_SUBCLASSES
#undef AST_END_SUBCLASSES
#undef AST_TO
{{< /githubsnippet >}}

These methods are used heavily in the compiler. For example, here's a completely
random snippet of code I pulled out:

{{< githubsnippet "chapel-lang/chapel" "cd108338d321d0b3edf6258e0b2a58459d88a348" "frontend/lib/resolution/Resolver.cpp" "C++" 1161 >}}
if (auto var = decl->toVarLikeDecl()) {
  // Figure out variable type based upon:
  // * the type in the variable declaration
  // * the initialization expression in the variable declaration
  // * the initialization expression from split-init

  auto typeExpr = var->typeExpression();
  auto initExpr = var->initExpression();

  if (auto var = decl->toVariable())
    if (var->isField())
      isField = true;
{{< /githubsnippet >}}

Thus, developers adding new AST nodes are not required to manually implement
the `isWhatever`, `toWhatever`, and other functions. This and a fair bit
of other AST functionality (which I will cover in the next subsection) is
automatically generated using X Macros.

{{< dialog >}}
{{< message "question" "reader" >}}
You haven't actually shown how the AST node classes are declared, only the
tags. It seems implausible that they'd be generated using this same strategy ---
doesn't each AST node have its own different methods and implementation code?
{{< /message >}}
{{< message "answer" "Daniel" >}}
You're right. The AST node classes are defined "as usual", and their constructors
must explicitly set their <code>tag</code> field to the corresponding
<code>AstTag</code> value. It's also on the person defining the new class to
extend the node that they promise to extend in <code>uast-classes-list.h</code>.
{{< /message >}}
{{< message "question" "reader" >}}
This seems like an opportunity for bugs. Nothing is stopping a developer
from returning the wrong tag, which would break the auto-casting behavior.
{{< /message >}}
{{< message "answer" "Daniel" >}}
Yes, it's not bulletproof. Just recently, a team member found
<a href="https://github.com/chapel-lang/chapel/pull/23508">a bug</a> in which
a node was listed to inherit from <code>AstNode</code>, but actually inherited
from <code>NamedDecl</code>. The <code>toNamedDecl</code> method would not
have worked on it, even though it inherited from the class.<br>
<br>
Still, this pattern provides the Chapel compiler with a lot of value; I will
show more use cases in the next subsection, as promised.
{{< /message >}}
{{< /dialog >}}

#### The Visitor Pattern without Double Dispatch

The Visitor Pattern is very important in general, but it's beyond ubiquitous
for us compiler developers. It helps avoid bloating AST node classes with methods
and state required for the various operations we perform on them. It also often
saves us from writing AST traversal code.

Essentially, rather than adding each new operation (e.g. convert to string,
compute the type, assign IDs) as methods on each AST node class, we extract
this code into a per-operation _visitor_. This visitor is a class that has methods
implementing the custom behavior on the AST nodes. A `visit(WhileLoop*)` method
might be used to perform the operation on 'while' loops, and `visit(ForLoop*)` might
do the same for 'for' loops. The AST nodes themselves only have a `traverse`
method that accepts a visitor, whatever it may be, and calls the appropriate
visit methods. This way, the AST node implementations remain simple and relatively
stable.

As a very simple example, suppose you wanted to count the number of loops used
in a program for an unspecified reason. You could add a `countLoops` method,
but then you've introduced a method to the AST node API for what might be a
one-time, throwaway operation. With the visitor pattern, you don't need to do
that; you can just create a new class:

```C++
struct MyVisitor {
  int count = 0;

  void visit(const Loop*) { count += 1; }
  void visit(const AstNode*) { /* do nothing for other nodes */ }
};

int countLoops(const AstNode* root) {
  MyVisitor visitor;
  root->traverse(visitor);
  return visitor.count;
}
```

The `traverse` method is a nice API, isn't it? It's very easy to add operations
that work on your syntax trees, without modifying them. There is still an important
open question, though: how does `traverse` know to call the right `visit` function?

If `traverse` were only defined on `AstNode*`, and it simply called `visit(this)`,
we'd always end up calling the `AstNode` version of the `visit` function. This
is because C++ doesn't perform dynamic dispatch
{{< sidenote "right" "vtable-note" "based on the types of method arguments." >}}
Obviously, C++ has the ability to pick the right method based on the runtime
type of the <em>receiver</em>: that's just <code>virtual</code> functions
and <code>vtable</code>s.
{{< /sidenote >}}
Statically, the call clearly accepts an `AstNode`, and nothing more specific.
The compiler therefore picks that version of the `visit` method.

The "traditional" way to solve this problem in a language like C++ or Java
is called _double dispatch_. Using our example as reference, this involves
making _each_ AST node class have its own `traverse` method. This way,
calls to `visit(this)` have more specific type information, and are resolved
to the appropriate overload. But that's more boilerplate code: each new AST
node will need to have a virtual `traverse` method that looks something like this:

```C++
void MyNode::traverse(Visitor& v) {
  v.visit(this);
}
```

It would also require all visitors to extend from `Visitor`. So now you have:

* Boilerplate code on every AST node that looks the same but needs to be duplicated
* A parent `Visitor` class that must have a `visit` method for each AST node in
  the language (so that children can override it).
* To make it easier to write code like our `MyVisitor` above, the `visit`
  methods in the `Visitor` must be written such that `visit(ChildNode*)` calls
  `visit(ParentNode*)` by default. Otherwise, the `Loop` overload wouldn't
  have been called by the `DoWhile` overload (e.g.).

So there's a fair bit of tedious boilerplate, and more code to manually modify
when adding an AST node: you have to go and adjust the `Visitor` class with a
new `visit` stub (sketched below).
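Concretely, the scaffolding that the list above describes looks something like the following sketch. The class names are toys of my own; this shows the shape of the boilerplate, not the Chapel compiler's real code:

```C++
struct Visitor;

struct AstNode { virtual void traverse(Visitor& v); };
struct Loop : AstNode { void traverse(Visitor& v) override; };
struct ForLoop : Loop { void traverse(Visitor& v) override; };

struct Visitor {
  // One stub per AST node; each forwards to its parent's overload by
  // default, so overriding only visit(Loop*) still catches every loop.
  virtual void visit(AstNode* node) {}
  virtual void visit(Loop* node) { visit((AstNode*) node); }
  virtual void visit(ForLoop* node) { visit((Loop*) node); }
  // ... one stub for every other AST node in the language ...
  virtual ~Visitor() = default;
};

// Each node needs its own traverse so that 'this' has a precise static type.
void AstNode::traverse(Visitor& v) { v.visit(this); }
void Loop::traverse(Visitor& v) { v.visit(this); }
void ForLoop::traverse(Visitor& v) { v.visit(this); }
```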
The reason all of this is necessary is that everyone (myself included) generally
agrees that code like the following is a bad idea:

```C++
struct AstNode {
  void traverse(Visitor& visitor) {
    if (auto forLoop = toForLoop()) {
      visitor.visit(forLoop);
    } else if (auto whileLoop = toWhileLoop()) {
      visitor.visit(whileLoop);
    } else {
      // 100 more lines like this...
    }
  }
};
```

After all, what happens when you add a new AST node? You'd still have to modify
this list, and since everything still extends `Visitor`, you'd still need to
add a new `visit` stub there. But what if there were no base class? Instead,
what if `traverse` were a template?

```C++
struct AstNode {
  template <typename VisitorType>
  void traverse(VisitorType& visitor) {
    if (auto forLoop = toForLoop()) {
      visitor.visit(forLoop);
    } else if (auto whileLoop = toWhileLoop()) {
      visitor.visit(whileLoop);
    } else {
      // 100 more lines like this...
    }
  }
};
```

Note that this wouldn't be possible to write in C++ if `visit` were a virtual
method; have you ever heard of a virtual template? With code like this, the
`VisitorType` wouldn't need to define _every_ overload, as long as it had
a version for `AstNode`. Furthermore, C++'s regular overload resolution rules
would take care of calling the `Loop` overload if a more specific one for
`DoWhile` didn't exist.
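Here is a tiny, self-contained demonstration of that overload-resolution behavior, using toy classes of my own rather than the compiler's:

```C++
#include <iostream>

struct AstNode { virtual ~AstNode() = default; };
struct Loop : AstNode {};
struct DoWhile : Loop {};
struct Conditional : AstNode {};

// No base class required: the visitor defines only the overloads it wants.
struct LoopCounter {
  int count = 0;
  void visit(const Loop*) { count += 1; } // catches DoWhile, While, ...
  void visit(const AstNode*) {}           // fallback for everything else
};

int main() {
  LoopCounter v;
  DoWhile dw;
  Conditional c;
  v.visit(&dw); // no DoWhile overload, so the Loop one is the best match
  v.visit(&c);  // falls back to the AstNode overload
  std::cout << v.count << "\n"; // prints 1
}
```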
The only problem that remains is that of having a 100-line if-else (which could
be a `switch` to little aesthetic benefit). But this is exactly where the
X Macro pattern shines again! We already have a list of all AST node classes,
and the code for invoking them is nearly identical. Thus, the Chapel compiler
has a `doDispatch` function (used by `traverse`) that looks like this:

{{< githubsnippet "chapel-lang/chapel" "cd108338d321d0b3edf6258e0b2a58459d88a348" "frontend/include/chpl/uast/AstNode.h" "C++" 377 >}}
static void doDispatch(const AstNode* ast, Visitor& v) {

  switch (ast->tag()) {
    #define CONVERT(NAME) \
      case chpl::uast::asttags::NAME: \
      { \
        v.visit((const chpl::uast::NAME*) ast); \
        return; \
      }

    #define IGNORE(NAME) \
      case chpl::uast::asttags::NAME: \
      { \
        CHPL_ASSERT(false && "this code should never be run"); \
      }

    #define AST_NODE(NAME) CONVERT(NAME)
    #define AST_BEGIN_SUBCLASSES(NAME) IGNORE(START_##NAME)
    #define AST_END_SUBCLASSES(NAME) IGNORE(END_##NAME)

    #include "chpl/uast/uast-classes-list.h"

    IGNORE(NUM_AST_TAGS)
    IGNORE(AST_TAG_UNKNOWN)

    #undef AST_NODE
    #undef AST_BEGIN_SUBCLASSES
    #undef AST_END_SUBCLASSES
    #undef CONVERT
    #undef IGNORE
  }

  CHPL_ASSERT(false && "this code should never be run");
}
{{< /githubsnippet >}}

And that's it. We have automatically generated the traversal code, allowing
us to use the visitor pattern in what I think is a very elegant way. Assuming
a developer adding a new AST node updates the `uast-classes-list.h` header,
the traversal logic will be auto-modified to properly handle the new node.
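For instance, expanding `CONVERT(DoWhile)` by hand yields just another case in the switch:

```C++
case chpl::uast::asttags::DoWhile:
  {
    v.visit((const chpl::uast::DoWhile*) ast);
    return;
  }
```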
#### Generating a Python Class Hierarchy

This is a fun one. For a while, in my spare time, I was working on
[Python bindings for Chapel](https://github.com/chapel-lang/chapel/tree/main/tools/chapel-py).
These bindings are oriented towards developing language tooling: it feels much
easier to write a language linter, auto-formatter, or maybe even a language
server in Python rather than in C++. It's definitely much easier to use Python to
develop throwaway scripts that work with Chapel programs, which is something
that developers on the Chapel team tend to do quite often.

I decided I wanted the Python AST node class hierarchy to match the C++ version.
This is convenient for many reasons, including being able to wrap methods on
parent AST nodes and have them be available through child AST nodes, and having
`isinstance` work properly. It's also advantageous from the point of view
of conceptual simplicity. However, I very much did not want to write CPython
API code to define the many AST node classes that are available in the Chapel
language.

Once again, the `uast-classes-list.h` header came into play here. With little
effort, I was able to auto-generate `PyTypeObject`s for each AST node in the
class hierarchy:

{{< githubsnippet "chapel-lang/chapel" "31a296e80cfb69bfc0c79a48d5cc9e8891f54818" "tools/chapel-py/chapel.cpp" "C++" 563 >}}
#define DEFINE_PY_TYPE_FOR(NAME, TAG, FLAGS)\
  PyTypeObject NAME##Type = { \
    PyVarObject_HEAD_INIT(NULL, 0) \
    .tp_name = #NAME, \
    .tp_basicsize = sizeof(NAME##Object), \
    .tp_itemsize = 0, \
    .tp_flags = FLAGS, \
    .tp_doc = PyDoc_STR("A Chapel " #NAME " AST node"), \
    .tp_methods = (PyMethodDef*) PerNodeInfo<TAG>::methods, \
    .tp_base = parentTypeFor(TAG), \
    .tp_init = (initproc) NAME##Object_init, \
    .tp_new = PyType_GenericNew, \
  };

#define AST_NODE(NAME) DEFINE_PY_TYPE_FOR(NAME, chpl::uast::asttags::NAME, Py_TPFLAGS_DEFAULT)
#define AST_BEGIN_SUBCLASSES(NAME) DEFINE_PY_TYPE_FOR(NAME, chpl::uast::asttags::START_##NAME, Py_TPFLAGS_BASETYPE)
#define AST_END_SUBCLASSES(NAME)
#include "chpl/uast/uast-classes-list.h"
#undef AST_NODE
#undef AST_BEGIN_SUBCLASSES
#undef AST_END_SUBCLASSES
{{< /githubsnippet >}}

You may have noticed that I snuck templates into the code above. The motivation there
is to avoid writing out the (usually empty) Python method table for every single
AST node. In particular, I have a template that, by default, provides an empty
method table, which can be specialized per node to add methods when necessary.
This detail is useful for Application 3 below, but not necessary to understand
the use of X Macros here.

I used the same `<` and `>` trick to implement `parentTypeFor`, which finds the
parent type for each tag:

{{< githubsnippet "chapel-lang/chapel" "31a296e80cfb69bfc0c79a48d5cc9e8891f54818" "tools/chapel-py/chapel.cpp" "C++" 157 >}}
static PyTypeObject* parentTypeFor(chpl::uast::asttags::AstTag tag) {
#define AST_NODE(NAME)
#define AST_LEAF(NAME)
#define AST_BEGIN_SUBCLASSES(NAME)
#define AST_END_SUBCLASSES(NAME) \
  if (tag > chpl::uast::asttags::START_##NAME && tag < chpl::uast::asttags::END_##NAME) { \
    return &NAME##Type; \
  }
#include "chpl/uast/uast-classes-list.h"
#undef AST_NODE
#undef AST_LEAF
#undef AST_BEGIN_SUBCLASSES
#undef AST_END_SUBCLASSES
  return &AstNodeType;
}
{{< /githubsnippet >}}

A few more invocations of the macros in `uast-classes-list.h`, and I had a working
class hierarchy. I didn't explicitly mention any AST node at all; everything was derived
from the Chapel compiler header. This also meant that as the language changed
and the AST class hierarchy developed, the Python bindings' code would not need
to be updated. As long as it was compiled with an up-to-date version of the
header, the hierarchy would match that present within the language.

This allows for code like the following to be written in Python:

```Python
def print_decls(mod):
    """
    Print all the things declared in this Chapel module.
    """
    for child in mod:
        if isinstance(child, NamedDecl):
            print(child.name())
```

### Application 3: CPython Method Tables and Getters

The Chapel Python bindings use the X Macro pattern another time, actually.
As I mentioned earlier, I use [template specialization](https://en.cppreference.com/w/cpp/language/template_specialization)
to reduce the amount of boilerplate code required for declaring Python objects.
In particular, there's a general method table declared as follows:

{{< githubsnippet "chapel-lang/chapel" "31a296e80cfb69bfc0c79a48d5cc9e8891f54818" "tools/chapel-py/chapel.cpp" "C++" 541 >}}
template <chpl::uast::asttags::AstTag tag>
struct PerNodeInfo {
  static constexpr PyMethodDef methods[] = {
    {NULL, NULL, 0, NULL} /* Sentinel */
  };
};
{{< /githubsnippet >}}

Then, when I need to add methods, I use template specialization by writing
something like the following:

```C++
template <>
struct PerNodeInfo<TheAstTag> {
  static constexpr PyMethodDef methods[] = {
    {"method_name", TheNode_method_name, METH_NOARGS, "Documentation string"},
    // ... more like the above ...
    {NULL, NULL, 0, NULL} /* Sentinel */
  };
};
```

When reviewing a PR that adds more methods to the Python bindings (by
defining new `TheNode_methodname` functions and then including them in the
method table), I noticed that the developer added some methods
but forgot to put them into the respective table, leaving them unusable by
the Python client code. This came with the additional observation that there
was a moderate amount of duplication when declaring the C++ functions and then
listing them in the table. The name (`method_name` in the code) occurred many
times.

The developer who opened the PR suggested using X Macros to combine the
information (the declaration of a function and its use in the corresponding
method table) into a single list. This led to the following header file:

{{< githubsnippet "chapel-lang/chapel" "31a296e80cfb69bfc0c79a48d5cc9e8891f54818" "tools/chapel-py/method-tables.h" "C++" 323 >}}
CLASS_BEGIN(FnCall)
  METHOD_PROTOTYPE(FnCall, actuals, "Get the actuals of this FnCall node")
  PLAIN_GETTER(FnCall, used_square_brackets, "Check if this FnCall was made using square brackets",
               "b", return node->callUsedSquareBrackets())
CLASS_END(FnCall)
{{< /githubsnippet >}}

The `PLAIN_GETTER` macro in this case is used to define trivial getters
(precluding the need for handling the Python-object-to-AST-node conversion,
and other CPython-specific things), whereas `METHOD_PROTOTYPE` is used
to refer to methods that need explicit implementations. With
this, the method tables are generated as follows:

{{< githubsnippet "chapel-lang/chapel" "31a296e80cfb69bfc0c79a48d5cc9e8891f54818" "tools/chapel-py/chapel.cpp" "C++" 548 >}}
#define CLASS_BEGIN(TAG) \
  template <> \
  struct PerNodeInfo<chpl::uast::asttags::TAG> { \
    static constexpr PyMethodDef methods[] = {
#define CLASS_END(TAG) \
      {NULL, NULL, 0, NULL} /* Sentinel */ \
    }; \
  };
#define PLAIN_GETTER(NODE, NAME, DOCSTR, TYPESTR, BODY) \
  {#NAME, NODE##Object_##NAME, METH_NOARGS, DOCSTR},
#define METHOD_PROTOTYPE(NODE, NAME, DOCSTR) \
  {#NAME, NODE##Object_##NAME, METH_NOARGS, DOCSTR},
#include "method-tables.h"
{{< /githubsnippet >}}

`CLASS_BEGIN` generates the initial `template <>` header and the code up
to the opening curly brace of the table definition. Then, for each method,
`PLAIN_GETTER` and `METHOD_PROTOTYPE` generate the relevant entries. Finally,
`CLASS_END` inserts the sentinel and the closing curly brace.
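Expanding the macros by hand over the `FnCall` entry above yields roughly the following specialization (note that `PLAIN_GETTER`'s type string and body are unused in the table itself; they matter only for the getter implementations mentioned below):

```C++
template <>
struct PerNodeInfo<chpl::uast::asttags::FnCall> {
  static constexpr PyMethodDef methods[] = {
    {"actuals", FnCallObject_actuals, METH_NOARGS,
     "Get the actuals of this FnCall node"},
    {"used_square_brackets", FnCallObject_used_square_brackets, METH_NOARGS,
     "Check if this FnCall was made using square brackets"},
    {NULL, NULL, 0, NULL} /* Sentinel */
  };
};
```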
Another invocation of the macros in `method-tables.h` is used to generate the
implementations of "plain getters", which is boilerplate that I won't get into
here, since it's pretty CPython-specific.

### Discussion

I've presented three applications of the pattern, in an order that happens
to be from least to most "extreme". It's possible that some of these are
over the line for using macros, especially for those who think of macros as
unfortunate remnants of C++'s past. However, I think that what I've shown
demonstrates the versatility of the X Macro pattern --- feel free to apply it to
the degree that you find appropriate.

The thing I like the most about this pattern is that the header files read quite nicely:
you end up with a very declarative "scaffold" of what's going on. The
`uast-classes-list.h` header makes for an excellent and fairly readable reference of
all the AST nodes in the Chapel compiler. The `method-tables.h` header provides
a fairly concise summary of which methods are available on which (Python) AST
node.

Of course, this approach is not without its drawbacks. Drawback zero is
the heavy use of macros: to the best of my knowledge, modern C++ tends to
discourage the usage of macros in favor of C++-specific features. That said,
this "pure C++" preference is applicable to varying degrees in different use
cases and code bases; because of this, I won't count macros as (too much of)
a drawback.

The more significant downside is that this approach introduces a lot of dependencies
between source files. Any time the header changes, anything that uses any part
of the code generated by the header must be recompiled. Thus, if you're generating
classes, changing any one class will "taint" any code that uses _any_ of the
generated classes. In the Chapel compiler, touching the AST class hierarchy
requires a recompilation of all the AST nodes, and of any compiler code that uses
the AST nodes (a lot). This is because each AST node needs access to the
`AstTag` enum, and that enum is generated from the hierarchy header.

That's all I have for today! Thanks for reading. I hope you got something useful
for your day-to-day programming out of this.