Problem Spotlight: Beacon Scanner (Advent of Code 2021 #19)

Introduction

Advent of Code is an annual compilation of very easy programming challenges. It's designed to promote interest in coding and the problems are supposed to be solved every day.

To make things more interesting, I decided to solve the problems on my phone (within termux). While most were very easy, there are a few that were interesting.

Problem 24 focused on static analysis. This problem was best solved by inspection: it asked to find the smallest and largest number accepted by a program in a fictitious assembly language. The program was formulaic and made of many very similar blocks, and the blocks involved division and multiplication, which combined to make analyzing it by hand the best approach.

Problem 19 was the most complicated and fun to work through. It presented us with a list of "scanners", and for each a list of "beacons" or "points" it sees. Each scanner sees all points within a 2000 by 2000 by 2000 cube centered on itself. However, the area over which the points seen by all scanners are spread is not known. The location and orientation of each scanner is also unknown, but the orientation of each scanner is fixed to be orthogonal.

The task is to discover the position of each scanner and point in global coordinates. For convenience, we can use the local coordinates of the first scanner as global coordinates.

The problem specifies that overlapping scanners must see at least 12 of the same points. However, finding the overlapping scanners directly would be very difficult and inefficient. Projecting all of the points onto one axis, finding all possible overlaps, then extending to two then three axes or repeating this for all axes and combining the results might work. But these would be complicated and probably slow, so I immediatly made an insight that I thought would be very helpful in solving the problem.

Analysis: Identifying Potential Overlaps

If we consider the distance between points, this is invariant under rotations. Therefore if we can find some distance that only occurs in two scanners, this corresponds to a possible overlap between those two scanners.

If we create a table for each scanner mapping distances to sets of pairs of points which are that distance apart. Then we can create a second table mapping distances to sets of scanners which contain points that distance apart. This allows us to find the distances which occur in exactly two scanners, and note down these two scanners as potentially overlapping.

The naive approach is cubic in the number of points in the range of each scanner, but binning pairs by distance first allows us to do it in closer to quadratic time. Both methods can probably be pushed to somewhere in the middle, depending on how pathological or well behaved the points are. For binning pairs, there's quadratically many pairs and for each one we either do a constant amount of work (if using a hashtable) or logarithmic amount of work (if using a tree). But then we have to do a variable amount of work for every pair of scanners that shares a distance, which is based on how many distances actually occur in the data set. It's actually more difficult to reason about the naive approach, because finding matching pairs and larger "constellations" of points is so fundamental. There are quadratically many offsets between any two scanners which will produce an overlap containing a fixed number of points from each, since once the first two coordinate offsets are chosen, the third is forced. Each offset will be very unlikely to correspond to a valid overlap, and while this is also true for each pair in the binning approach, most bins can be discarded immediately or with little work but this is much less true for offsets.

What's really cool about "constellations" of more than two points is that enumerating all of them could conceivably only take quadratic time. This and efficiently implementing the naive approach would both require more complicated data structures: kd trees in both cases and something more for large constellations. Organizing the points into a kd tree lets us iterate over all points and all radii. Go down to the "furthur investigation possibility" section for a slightly more in depth discussion of larger constellations.

The previously discussed table gives us a graph of potential connections that we can run a breadth first search on to lock in the position and orientation of each scanner. For each potential connection, we consider each pair of points in each of the two scanners that has the common distance. One scanner will have a known position and orientation and the other won't (because we start by using the frame of reference of the first scanner as global coordinates and then always consider potential connections from scanners with known frames of reference to ones with unknown).

The following code implements building the table mapping distances to pairs of points. We don't just bin the pairs by distance though, we bin them by a "distance profile". Distance is just the square root of the sum of the squared distance over all coordinates. However, it's common to use squared distance instead of distance because it skips computation, preserves order, and keeps us within the integers. We can also use other p norms, which, while not invariant under general rotations, are symmetric functions (under coordinate permutation and negation), making them invariant under the orthogonal rotations we care about.

We can simplify how we define the "distance profile". Instead of considering, say, the 1, 2, and 3 norms, we can just consider the minimum, median, and maximum coordinatewise displacements. Both of these triples can be converted uniquely back and forth.

class Scanner:
	def __init__(self):
		self.pos = (0, 0, 0)
		self.rotation = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
		self.points = []
		self.is_transformed = False

	def add_point(self, point):
		self.points.append(point)
	
	def compute_dists(self):
		self.dist_pairs = defaultdict(list)
		for i, j in itt.combinations(range(len(self.points)), 2):
			dist_profile = tuple(sorted([abs(self.points[i][k] - self.points[j][k]) for k in range(3)]))
			self.dist_pairs[dist_profile].append((i, j))
				

Analysis: Rotations

We could find all rotations that line up the displacements, but it is easier to just test all 24 rotations. Finding all these rotations is a little interesting in and of itself. A rotation can be represented by an orthonormal matrix with determinant 1. These characteristics ensure that the transformation of the matrix is conformal (preserves angles), isometric (preserves distances), and not a reflection (as it would be if the determinant were -1 instead). This lets us write the last column of the matrix as the cross product of the first two.

From another point of view, because the rotations are constrained to be orthonormal, we know the matrix sends each standard basis vector to a standard basis vector or its negation.

The below snippet generates a list of all 24 orthonormal rotation matrices by looping over all pairs of standard basis vectors "ei" and "ej". For each pair, the cross product "ek" is found. This would produce only the 6 rotations where the first two standard basis vectors are mapped to standard basis vectors, so we also include these rotations with two coordinates negated to get all rotations.

rotations = []
for i in range(3):
	ei = [int(l == i) for l in range(3)]
	for j in range(3):
		if j == i:
			continue
		ej = [int(l == j) for l in range(3)]
		ek = [0]*3
		ek[3 - i - j] = 1 if (i + 1)%3 == j else -1
		rotations.append(tuple((ei[l], ej[l], ek[l]) for l in range(3)))
		rotations.append(tuple((-ei[l], -ej[l], ek[l]) for l in range(3)))
		rotations.append(tuple((-ei[l], ej[l], -ek[l]) for l in range(3)))
		rotations.append(tuple((ei[l], -ej[l], -ek[l]) for l in range(3)))

def rotate(point, rot):
	return tuple(sum(a*b for (a, b) in zip(row, point)) for row in rot)

def have_same_octant(ai, bi, aj, bj):
	return all((bi[k] - ai[k])*(bj[k] - aj[k]) >= 0 for k in range(3))
				

Finalizing an Algorithm

Once we have a pair of points in a known scanner with a distance matching a pair of points in an unknown scanner in hand and have found a rotation that lines them up, all we need to do is find the displacement between the scanners, compute the location of the second scanner's points in global coordinates, and finally verify that the overlap of their detection volumes contains at least 12 matching points.

This code is a first draft, with some compromises to performance that are discussed later.

dist_scanners = defaultdict(list)

# For each scanner, bin pairs of points according to distance profile
# (not just the (L2) distance, but the L1 and L3 distances as well)
# (in fact the distance profile is just stored as a sorted list of coordinatewise distances)
for i, scanner in enumerate(scanners):
	scanner.compute_dists()
	for dist_profile in scanner.dist_pairs:
		dist_scanners[dist_profile].append(i)

potential_neighbors = defaultdict(lambda:defaultdict(list))
num_potential_overlaps = 0

for dist_profile, ids in dist_scanners.items():
	if len(ids) == 2:
		potential_neighbors[ids[0]][ids[1]].append(dist_profile)
		potential_neighbors[ids[1]][ids[0]].append(dist_profile)
		num_potential_overlaps += 1

num_untransformed = len(scanners) - 1
scanners[0].is_transformed = True
frontier = {0}

while frontier and num_untransformed:
	i = frontier.pop()
	pos_i = scanners[i].pos
	for j, dist_profiles in potential_neighbors[i].items():
		del potential_neighbors[j][i]
		if scanners[j].is_transformed:
			continue
		# scanners[i] has points in global coordinates, so try all rotations of scanners[j]
		for dist_profile in dist_profiles:
			for ai, bi in scanners[i].dist_pairs[dist_profile]:
				ai = scanners[i].points[ai]
				bi = scanners[i].points[bi]
				for rot in rotations:
					for aj, bj in scanners[j].dist_pairs[dist_profile]:
						aj = rotate(scanners[j].points[aj], rot)
						bj = rotate(scanners[j].points[bj], rot)
						# first we confirm that this rotation makes the coordinatewise distances line up
						if any(abs(bj[k] - aj[k]) != abs(bi[k] - ai[k]) for k in range(3)):
							continue
						# next we need to confirm the pair of points seen by j is actually a rotation of those seen by i
						# note that this is complicated by the existence of 0 coordinates
						if not have_same_octant(ai, bi, aj, bj):
							aj, bj = bj, aj
							if not have_same_octant(ai, bi, aj, bj):
								continue
						displacement = tuple(ai[k] - aj[k] for k in range(3))
						pos_j = displacement
						points1 = [rotate(point, rot) for point in scanners[j].points]
						points1 = [tuple(point[k] + displacement[k] for k in range(3)) for point in points1]
						# now we need to check that the set of points in the overlap is consistent
						bl = tuple(max(pos_i[k], pos_j[k]) - 1000 for k in range(3))
						tr = tuple(min(pos_i[k], pos_j[k]) + 1000 for k in range(3))
						overlap_a = {point for point in scanners[i].points if all(bl[i] <= point[i] <= tr[i] for i in range(3))}
						overlap_b = {point for point in points1 if all(bl[i] <= point[i] <= tr[i] for i in range(3))}
						if overlap_a != overlap_b:
							continue
						# the rules state there must be at least 12 points overlapping
						if len(overlap_a) < 12:
							continue
						# finally we have an overlap !
						scanners[j].points = points1
						scanners[j].pos = pos_j
						scanners[j].is_transformed = True
						frontier.add(j)
						num_untransformed -= 1
			break
	del potential_neighbors[i]

points = set()
for scanner in scanners:
	if not scanner.is_transformed:
		raise ValueError("Not all scanners ranges could be combined!")
	for point in scanner.points:
		points.add(point)
				

Further Investigation Possibilities

This problem has a lot of interesting generalizations and real world applications. For example, it's similar to finding the position of objects seen by multiple cameras, multiple radar detectors, or so on. Most real world applications would have more a priori information about the position of the scanners (making the problem simpler).

On the other hand, real world applications would have general (non-orthogonal) rotations, possibly moving detectors, and almost surely scanner error. General rotations and scanner error can be dealt with by using only distance squared as the "distance profile", and by using a binary search tree to store the table of points with a given distance rather than a hash table. This would enable finding corresponding pairs of points that have slightly unequal distances due to scanner error.

The algorithm for stitching the detection data from multiple scanners together could be improved in a couple ways. Instead of finding the appropriate rotation to try by cycling through all 24 possible orthogonal rotations, we could do some vector algebra to compute two one-dimensional families of rotations which send the unknown pair of points onto the known pair of points. In particular, these families consist of one which flips the points, sending the first in the unknown pair to the second in the known pair and visa versa, and one which doesn't, with both families being parameterized by how far they rotate around the line through the pair of points.

Another potential improvement would be parallel stitching of different overlapping scanners. Instead of building up the stitched region using breadth first search starting at the first scanner, overlapping scanners could be stitched together in multiple connected components at once, eventually being combined into one unified region. This would require storing a transformation matrix for each scanner's local reference frame relative to the reference frame of the first scanner it its connected component, since the relative orientation of connected components would not be known beforehand. This would also eliminate the need to actually transform each scanner's list of points.

Finally, considering constellations larger than two points could improve performance by increasing how well we can identify potential overlaps. Building a kd tree lets us find all constellations in quadratic time, but storing them becomes extremely complicated. The mapping from center, radius pairs to constellations contained within is simple enough, the challenge is in storing constellations and especially indexing constellations by shape.

Storing the largest constellation within concentric spheres is simple enough, by splitting up points into concentric shells and introducing graph edges between points in adjacent shells. However, we absolutely do not have concentric spheres, and efficiently storing constellations in general probably requires constructing a Delaunay triangulation on the points. Then developing a data structure indexed by constellations is a whole other monster. Not just the Delaunay graph and associated distances matter, but, obviously, the exact relative angles of the edges as well.

The most practical way to build an algorithm given a Delaunay graph for each scanner's points is probably to start with matching single segments as done in the code above, but then extend each match using Delaunay edges until a maximal matching subgraph is found, and only then lock in the rotation and ensure no extra points exist in either scanner (again using the Delaunay graph).