Scala 3: “Erased” Definitions

Dean Wampler
Scala 3
Published in
6 min readMar 26, 2022

--

Martin Odersky and the EPFL Scala Center continue experimenting with potential new features in Scala 3. Let’s explore one of them, erased definitions.

Red Clouds at Sunrise, Miami Beach © 2022, Dean Wampler

This experimental feature is described in the documentation. Here, I’ll summarize its purpose, how it works, and provide an example of how it might be used.

A new keyword erased is introduced that can be applied to method arguments, including in using argument lists, method and value defintions, including givens, and even classes. The primary motivation for this feature is to provide a mechanism for constraining allowed “states” such that errors are caught at compile time, not runtime, but without extra overhead in the generated byte code. No byte code is generated for an “erased” definition. Hence, it can’t be used at runtime, e.g., referencing an erased value. It can only be used at compile time.

The documentation has a nice, simple example, where a type representing a machine can either be turned on or off, but the compiler will reject calling the method to turn off the machine if it is already off and similarly for the on state.

To see this in action, I decided to tackle a slightly more challenging problem. It’s always bugged me that Seq[T] has head, tail, and some other methods that should never be called on Nil. When you do that, an exception is thrown.

A good goal in API design is to make it impossible for the user to make a mistake. Ideally, you should only be able to call head or tail on a non-empty sequence and the compiler should enforce this. The library could define these methods only for non-empty sequences, but then you would have the usual problem that passing around a Seq[T] instance is practically useless, as it would not have many of the methods you need defined for it. Hence, defining methods like head and tail on Seq[T] is a pragmatic compromise. We avoid the potential error by using idioms like pattern matching or calling isEmpty to first determine if the sequence is empty or not, or use methods like headOption.

I explored a possible solution using dependent types in Scala 3: Dependent Types, Part II. (I guess I’m obsessive about this…) It worked, but the code was quite challenging to understand, at least for me, and I didn’t even attempt to define more than a few methods.

Instead, can we have the benefits of methods defined in Seq[T] that the compiler will not allow us to call on Nil. Oh, and no extra runtime overhead, please! Let’s see if erased meets these goals with a simpler implementation than my previous attempt.

The compiler won’t let you use experimental features at all unless you are using a nightly or snapshot build. So, I first cloned the dotty GitHub repo locally, then executed sbt publishLocal to build it and put the artifacts in my local ivy2 repo. The version string was 3.1.3-RC1-bin-SNAPSHOT , which might be different for you when you try this.

Next, I created a new directory for the example. I used the following build.sbt, adapted from my Programming Scala, Third Edition examples repo. Change the definition of scala3 as required for your local build of Dotty:

I also created a file project/build.properties to specify the sbt version. It contains this line:

sbt.version=1.6.2

Next, I created a directory for the source files, src/main/scala/progscala3/erased, but you can certainly use any path you want.

Finally, here is the implementation:

Let’s work through this. The first import is an annotation that we’ll use to define the compiler error message. You can see it used for IsEmpty and IsNotEmpty. We also need to import the experimental feature.

The sealed trait Emptiness defines types that will be used as type parameters for IsEmpty and IsNotEmpty. Note the actual given instances for those two types: IsEmpty[Empty] and IsNotEmpty[NotEmpty] . Note they are declared with the new erased keyword. We don’t define given instances for the other two combinations: IsEmpty[NotEmpty] and IsNotEmpty[Empty], because they are disallowed states. These definitions follow the example in the erased documentation.

The defined given instances constrain what’s allowed by the compiler. This has been a common idiom, to limit what’s allowed to compile, implemented with implicits in Scala 2 and given/using constructs in Scala 3. Now we’re adding the ability to reduce the byte code footprint.

Note how ESeq is declared. I found it necessary to have two type parameters, one for the head element, H, and one for the tail sequence T. The latter is not needed for the library Seq, but I could not figure out how to avoid it, because it’s now necessary to carry some type information about whether or not the tail is empty, i.e., EmptySeq, as I call it here.

I found that using a type alias E works well, because each concrete subtype will either define E as NotEmpty or Empty.

The “magic” is in the methods head and tail. Each takes an erased using argument of type IsNotEmpty[E], which effectively means that calls to these methods will be rejected at compile time unless a suitable given instance is found in scope. For EmptySeq, no given instance will exist, specifically, IsNotEmpty[Empty] is not defined. We require a valid IsNotEmpty for the argument for both methods, but in the context of EmptySeq, the E type is Empty. The net effect is that calling head and tail on EmptySeq will fail to compile. I hope this makes sense…

Note that definitions for hd and tl are required in EmptySeq , but we can just use ???. They will never be called!

Carrying a type for the tail ESeq made the +: methods tricky. In essence, the type of the returned value is ESeq[new_head_type, whatever_this_is].

Finally, I found it convenient to only allow construction using the carefully defined NotEmptySeq.apply methods or the +: methods.

Let’s try it! Here is my ErasedSeqMain.scala file:

I won’t show the long output, but it is what you would expect. If you uncomment any of the lines with COMPILATION ERROR, you get compilation errors, not runtime errors, like the following for lines 64 and 65 (edited to reduce wrapping):

[error] -- Error: .../src/main/scala/progscala3/erased/ErasedSeqMain.scala:64:67
[error] 64 | println(s"... = ${nes3b.tail.tail.tail.head}") // COMPILATION ERROR
[error] | ^
[error] | The seq must be not empty
[error] -- Error: .../src/main/scala/progscala3/erased/ErasedSeqMain.scala:65:67
[error] 65 | println(s"... = ${nes3b.tail.tail.tail.tail}") // COMPILATION ERROR
[error] | ^
[error] | The seq must be not empty

How cool is that?!

Okay, I should point out that the byte code will carry more type information than for the library’s Seq[T], but I don’ t think that overhead can be eliminated. Try using the sbt console with some of those definitions and notice the types printed, for example (with no wrapping removed):

scala> import progscala3.erased.*
|
| val nes1 = "one" +: EmptySeq
| val nes2 = "two" +: nes1
| val nes3 = "three" +: nes2
|
| val nes1Head = nes1.head
| val nes1Tail = nes1.tail
|
| val nes2Head = nes2.head
| val nes2Tail = nes2.tail
| val nes2TailHead = nes2.tail.head
| val nes2TailTail = nes2.tail.tail
|
| val nes3Head = nes3.head
| val nes3Tail = nes3.tail
| val nes3TailHead = nes3.tail.head
| val nes3TailTail = nes3.tail.tail
|
val nes1: progscala3.erased.NotEmptySeq[String, progscala3.erased.EmptySeq.type] = one +: EmptySeq
val nes2: progscala3.erased.NotEmptySeq[String, nes1.type] = two +: one +: EmptySeq
val nes3: progscala3.erased.NotEmptySeq[String, nes2.type] = three +: two +: one +: EmptySeq
val nes1Head: String = one
val nes1Tail: progscala3.erased.EmptySeq.type = EmptySeq
val nes2Head: String = two
val nes2Tail:
progscala3.erased.NotEmptySeq[String, progscala3.erased.EmptySeq.type] = one +: EmptySeq
val nes2TailHead: String = one
val nes2TailTail: progscala3.erased.EmptySeq.type = EmptySeq
val nes3Head: String = three
val nes3Tail: progscala3.erased.NotEmptySeq[String, nes1.type] = two +: one +: EmptySeq
val nes3TailHead: String = two
val nes3TailTail:
progscala3.erased.NotEmptySeq[String, progscala3.erased.EmptySeq.type] = one +: EmptySeq

It took me quite a bit of experimentation to make this work. As we saw in Scala 3: Dependent Types, Part II, some of the newer features in Scala are quite powerful, but nontrivial to use. However, this erased sequence ESeq is easier to understand than the DTList implementation in the previous post, which is nice. It remains to be seen how much adoption these advanced features actually experience, going forward.

For a concise summary of the more “mainstream” notable changes in Scala 3, see my new Scala 3 Highlights page.

See Programming Scala, Third Edition for comprehensive introduction to Scala 3, including details on how to migrate from Scala 2.

--

--

Dean Wampler
Scala 3

The person who is wrong on the Internet. ML/AI and FP enthusiast. Lurks at the AI Alliance and IBM Research. Speaker, author, pretend photographer.