The Value of Parametric Polymorphism

Dean Wampler
Scala 3
Published in
7 min readOct 10, 2021

--

Blue Angels in Chicago, © 2021, Dean Wampler

I’m giving a talk at Scale by the Bay, October 28th, on Lessons Learned from 15 Years of Scala in the Wild.

Programming Scala, Third Edition is now available. It provides a comprehensive introduction to Scala 3 for experienced Scala developers, as well as a complete introduction to Scala for new Scala developers.

The other day, I tweeted the following:

A friend replied that I should provide an example, so here you go… I’ll use Scala 3 syntax, but the arguments apply equally well to Scala 2, to Java, and most other statically-typed languages. This repo has the Scala 3 and Java examples. For brevity, I’ll just discuss the Scala 3 example.

While I’ll focus on the boilerplate reduction aspect, this post ends with thoughts about the real “superpower” of parametric polymorphism, which might be less familiar to many of you.

First, what is parametric polymorphism? You probably already understand subtype polymorphism, the object-oriented notion of polymorphism, where an abstraction is defined, say with Java interface or Scala trait, then multiple concrete implementations provide the specific details that vary for each of those types. This is one of several kinds of polymorphism that I discuss in Programming Scala, Third Edition. Parametric polymorphism (also discussed at length in the book) is today’s subject, where a type parameter is the abstraction and implementations are created (by you or by the compiler deities) for specific implementations.

For example, suppose you didn’t trust the size method for Scala collections 🤓. Here is a trivial example of a parametrically-polymorphic method alternative:

$ scala
Starting scala3 REPL...
scala> def len[T](xs: Seq[T]): Int = xs.foldLeft(0)((count, _) => count + 1)
|
def len[T](xs: Seq[T]): Int
scala> len(Seq(1,2,3,4,5))
val res0: Int = 5

We obviously don’t care about the type of elements. You could even rewrite the method signature as def len2(xs: Seq[?]): Int and the same implementation would still work. I’ll show a more interesting example at the end, but the point here is that I specify a type parameter T and the method works for any Seq[T] I provide. I don’t need separate methods for Seq[Int], Seq[Double], etc.

Okay, now suppose a define some messages that can be sent between subsystems, where the receiver of a message needs to handle it properly. First, here are the “incoming” and “outgoing” message types:

(repo link)

Why didn’t I use the new fancy enum feature in Scala 3? For details I won’t go into here, there’s a slight disadvantage in the way typing works for enums vs. the “classic” approach shown here. (Also, we wouldn’t be able to support adding new IncomingMessage types with enums). See the repo branch with-enums and the code comments there for the details of the typing difference and the additional code required if you use enums.

Okay, here is the first version (v1) of how we might work with these messages. First, I need message handlers:

(repo link)

I start with a simple trait that defines the handler abstraction, then I define instances for each kind of IncomingMessage. Yes, the apply method bodies are all identical and I could have just provided a concrete MessageHandler.apply body. Realistically, the bodies won’t be identical. I’ll come back to this below.

Now let’s use the implementation:

(repo link)

The Processor.apply has to pattern match on the message and call the write one. It’s a bit tedious, but it is straightforward to implement and to read when you come back to the code later. That’s a virtue we shouldn’t ignore, even though we’ll eliminate the need for this pattern matching in a moment.

If you start sbt and run it, entering 1 at the prompt, here’s what you get:

sbt:parametric-polymorphism> runMultiple main classes detected. Select one to run:
[1] polyglotprogramming.messaging.v1.Messaging
[2] polyglotprogramming.messaging.v2.Messaging
[3] polyglotprogramming.messaging.v3.Messaging
[4] polyglotprogramming.messaging.v4.Messaging
[5] polyglotprogramming.messaging.v4java.Main
Enter number: 1
[info] running polyglotprogramming.messaging.v1.Messaging
Received message: Start(start)
Succeeded(Successfully processed Start(start))
Received message: DoWork1(dowork1)
Succeeded(Successfully processed DoWork1(dowork1))
Received message: DoWork2(dowork2)
Succeeded(Successfully processed DoWork2(dowork2))
Received message: Stop(stop)
Succeeded(Successfully processed Stop(stop))

Now let’s use parametric polymorphism to reduce the size of this code, then discover other benefits. This is v2. The messages are the same. For the moment, I’ll go ahead and use a single implementation of MessageHandler.apply, then return to the general case where that won’t work (in v3):

(repo link)

It’s drastically smaller. We still need concrete instances for each IncomingMessage type, but now we can use concise given instances.

Here is the updated Processor and an identical @main method:

Processor.apply works for any type IM that is a subtype of IncomingMessage. The message argument is typed specifically to be IM, too. The “secret sauce” is the addition of a using parameter list where a corresponding MessageHandler[IM] must be in scope. We imported the given instances we need using the second import statement on line 3. Not only have we eliminated some boilerplate (it would be even more noticable in a less trivial example…), but we have also eliminated the need to match in IncomingMessages.

How can we handle different MessageHandler.apply implementations? The Template Method Pattern is our friend. Here’s our v3 iteration:

(repo link)

MessageHandler.apply is still concrete and I’ve now declared it to be final, to emphasize that users shouldn’t override a concrete method, because it’s very easy to break its “contract”. Instead, the protected process method is declared for variations in handling. There are also two protected helper methods, success and failure.

Each given instance only needs to define what’s unique about the way it handles messages, by implementing process. Now, I’ll assume that all handling of DoWork2 messages fail. For all four implementations of process, a unique “details” string is returned.

I won’t show the corresponding v3/Main.scala, as it identical to the v2 implementation, except for the package name and import statement.

This code uses parametric polymorphism to define uniform handling for all allowed subtypes of IncomingMessage, while providing the hooks required for implementing variant behavior.

Finally, let’s see how a new IncomingMessage type can be supported (v4). v4/MessageHandlers.scalais identical to the v3 implementation, except for the package name. Here’s the new Main.scala:

Processor is unchanged. It continues to work as long as the input message and handler have the correct types. A new message type DoWork3 is declared along with a corresponding given instance for it. Messaging() adds an instance of DoWork3 and everything works as before! Note the comment; if you forget to define a matching given instance, you’ll get a compile time (vs. runtime) error.

Recap

I eliminated all the original boilerplate code using parametric polymorphism, while preserving the flexibility necessary for variant behavior, using the Template Method Pattern (perhaps my favorite Gang of Four pattern). Yes, the simple example wasn’t that verbose to begin with, but hopefully you can see how this could make a real difference in many real-world problems.

I also eliminated the pattern matching that was required in v1, along with the need to modify that code if and when a new message type is added! In v4, I exploited the “pluggable” nature of Producer. All I needed besides the new IncomingMessage type was a given instance of a corresponding MessageHandler. So, it is easy to add new message types, but it’s also robust against an easy mistake to make. A compile-time error is thrown if I forget to add a given instance of a MessageHandler!

It’s not quite as easy to do parametric polymorphism in Java and you don’t have given instances and using clauses to keep the code in Messaging() as clean and concise, but you can still leverage these ideas. See the Java implementation I provided in the repo. (My Java is rusty; PRs for improvements are welcome!)

The Real “Superpower” of Parametric Polymorphism

Okay, this post was inspired by the need to keep code as concise and still robust as possible and how parametric polymorphism gives you powerful tools for this purpose. However, there is really an even more important “superpower” that parametric polymorphism gives us.

Remember the “trivial” len method I showed at the beginning? Here’s the signature again, this time with a generic name and omitting the body:

def foo1[T](xs: Seq[T]): Int

Hmm. What if I didn’t have parametric polymorphism and I wrote something like this instead:

def foo2(xs: SeqOfInt): Int  // SeqOfInt is a sequence of integers

Forget the fact we started with len and ask yourself, what are the possible, “legal” implementations for each of these methods?

foo2 could be len, but also sum, multiple, median, and others, whereas there is really only one possible implementation for foo1, the length of the collection.

By being more abstract in the types of the elements, we constrain the allowed implementations.

The real power of parametric polymorphism is the way it makes our type and method signatures much more precise, even while they become more flexible for use with different types. It becomes much harder to write giant hairball methods that mix all kinds of functionality, creating a morass that’s hard to understand, test, debug, and reuse. It brings greater discipline and precision to our designs.

All that said, len (a.k.a foo1) is kind of trivial. Here’s a more sophisticated case that still leverages the same power. What if the collection elements are “numeric”, i.e., Numeric[T] exists for the element type T? Here’s how we can multiple any such sequence of elements:

scala> def multiply[T : Numeric](xs: Seq[T]): T =
| xs.reduceLeft((x1,x2) => summon[Numeric[T]].times(x1,x2))
|
def multiply[T](xs: Seq[T])(using evidence$1: Numeric[T]): T
scala> multiply(Seq(1,2,3,4,5))
val res0: Int = 120
scala> multiply(Seq(1.1,2.2,3.3,4.4,5.5))
val res1: Double = 193.26120000000003

Now the type parameter is constrained by a context bound. (I discussed context bounds and what summon does here.) There must exist a Numeric[T] for any concrete type T we use. These are predefined in the Scala library for Int and Double . You can define your own Numeric[T] for your own types, like Matrix, Rational, GeoCoordinates, etc.

Change the method name to foo again and ask yourself, what are the possible implementations allowed by the method signature? In this case, it’s not just one implementation, but the defined operations that Numeric[T] provides where the corresponding methods return T: addition, subtraction, multiplication, etc.

This is just the tip of the Iceberg. You can find plenty of references on the web that expound upon this idea. See for example, the great talk by Rúnar Bjornasson called Constraints Liberate, Liberties Contrain and the book he cowrote with Paul Chiusano called Functional Programming in Scala. (A second edition is under development.)

For more on parametric and subtype polymorphisms in Scala, see Programming Scala, Third Edition and join me at Scale by the Bay, October 28th!

--

--

Dean Wampler
Scala 3

The person who is wrong on the Internet. ML/AI and FP enthusiast. Lurks at the AI Alliance and IBM Research. Speaker, author, pretend photographer.