How to Get Started with Scala

Scala is a key language in the data space. While Python is the lingua franca of data science and machine learning, Scala frequently pops up in data engineering and backend systems. It provides a type-safe, functional layer on top of the battle-hardened JVM, which means it benefits from a rich ecosystem that’s available without all of Java’s boilerplate. Scala comes with its own REPL, so it is as easy and fast as Python to experiment with code. But what are the best ways to learn Scala? Here are a few of my suggestions.

The official resources list a few options, but it can be hard to choose between them, so here are a few I can recommend based on my own experience.

General

My absolute favourite book for when you already know a bit of programming but want to learn Scala is Cay Horstmann’s Scala for the Impatient. It does not beat about the bush and the book is clearly written. The author does not shy away from more advanced features that may scare beginners, such as implicits. These are however crucial to using the language efficiently, so it’s good they are treated with respect. The book assumes you know how to write code and it sometimes refers to Java or JVM quirks, but that does not mean you need to be a Java guru to understand it. I studied it to learn Scala, and back then my knowledge of Java was considerably worse than it is today.
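To give a flavour of what implicits look like, here is a minimal sketch of my own (not taken from the book) in Scala 2 syntax. The names ImplicitsSketch, largest, and IntOps are made up for illustration. Scala 3 expresses the same ideas with given/using clauses and extension methods.

```scala
object ImplicitsSketch {
  // Implicit parameter: the compiler looks up an Ordering[A] in implicit
  // scope, so callers do not have to pass one explicitly.
  // Note: reduce throws on an empty list; this sketch assumes non-empty input.
  def largest[A](xs: List[A])(implicit ordering: Ordering[A]): A =
    xs.reduce((a, b) => if (ordering.gteq(a, b)) a else b)

  // Implicit class: adds a `squared` method to Int wherever IntOps is in scope.
  implicit class IntOps(n: Int) {
    def squared: Int = n * n
  }

  def main(args: Array[String]): Unit = {
    println(largest(List(3, 9, 4)))         // prints 9
    println(largest(List("pear", "apple"))) // prints pear
    println(7.squared)                      // prints 49
  }
}
```

The standard library already provides Ordering instances for Int and String, which is why both calls to largest compile without any extra arguments.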

Hello, Scala is a free alternative that is pretty decent and covers the basics.

Programming in Scala, co-authored by Scala’s creator Martin Odersky, is surprisingly readable, but it’s not really suited for picking up the language from scratch; it’s best kept as a reference. Once you know how to solve problems with Scala, it’s great to have in your library to learn why certain bits of the language work the way they do.

Twitter’s Scala School is definitely worth a visit. It’s best combined with trying code out in a REPL, for which there are also online alternatives in case you do not want to install anything. If you must use notebooks, there is Polynote, but I cannot imagine why anyone would prefer notebooks over an IDE, which comes with proper support for refactoring. I prefer IntelliJ IDEA as an IDE, but of course Eclipse is possible too, or even Metals, which works with most popular text editors. Most IDEs, including IntelliJ IDEA, have REPL-like worksheets, so you don’t have to leave the development environment to try out some code.

Scala Exercises have an online tutorial on Scala’s basic functionality. At the time of publication, the content is in alphabetical order, so it starts with ScalaTest assertions rather than language basics, which is unnecessarily confusing.

For those who prefer videos over books, there is Lightbend’s Scala 101, which is a decent introductory course. Underscore offer hands-on workshops too, but you can also grab their free e-book Essential Scala, which teaches the very basics, including Scala’s powerful pattern matching capabilities.
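As a taste of the pattern matching mentioned above, here is a small sketch of my own (it is not taken from Essential Scala). The Shape hierarchy and the describe function are hypothetical names, chosen only to show matching on case classes and on lists.

```scala
object PatternMatchingSketch {
  // A sealed trait lets the compiler warn when a match misses a case.
  sealed trait Shape
  final case class Circle(radius: Double) extends Shape
  final case class Rectangle(width: Double, height: Double) extends Shape

  // Matching on case classes destructures their fields.
  def area(shape: Shape): Double = shape match {
    case Circle(r)       => math.Pi * r * r
    case Rectangle(w, h) => w * h
  }

  // Matching on list structure, with a guard on the third case.
  // Cases are tried top to bottom, so List(-1) hits the second case.
  def describe(xs: List[Int]): String = xs match {
    case Nil             => "empty"
    case x :: Nil        => s"one element: $x"
    case x :: _ if x < 0 => "starts with a negative number"
    case _               => s"${xs.length} elements"
  }

  def main(args: Array[String]): Unit = {
    println(area(Rectangle(2.0, 3.0))) // prints 6.0
    println(describe(List(1, 2, 3)))   // prints 3 elements
    println(describe(List(-1, 2)))     // prints starts with a negative number
  }
}
```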

Once you’re familiar with the basics and need to look up stuff, check out the API, which is pretty readable, although you need to be comfortable with types to make sense of it. And that brings us to functional programming.

Functional Programming

Functional programming is one of Scala’s main advantages. For data engineering it is a natural way of coding as (referentially transparent) functions operate on immutable data structures. It’s a core part of the language rather than crowbarred in after the fact, as with Java. The most accessible book I have found that does not require you to master Haskell first is Alvin Alexander’s Functional Programming Simplified. It goes well with Scala Exercises’ Scala Tutorial, which is about functional loops, tail recursion, higher-order functions, and so on.
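For readers who have not seen these terms before, the following is a rough sketch of my own (not lifted from either resource) of a functional loop written with tail recursion, a higher-order function, and a transformation of an immutable list. The names FunctionalSketch, sumTo, and twice are invented for the example.

```scala
import scala.annotation.tailrec

object FunctionalSketch {
  // A "functional loop": no mutable counter, just a tail-recursive helper.
  // @tailrec makes the compiler reject the code if the call is not in tail position.
  def sumTo(n: Int): Long = {
    @tailrec
    def loop(i: Int, acc: Long): Long =
      if (i > n) acc else loop(i + 1, acc + i)
    loop(1, 0L)
  }

  // A higher-order function: takes a function and returns a new function
  // that applies it twice.
  def twice[A](f: A => A): A => A = f.andThen(f)

  // Immutable data: filter and map return new lists and leave the original alone.
  val numbers: List[Int] = List(1, 2, 3, 4, 5, 6)
  val evensSquared: List[Int] = numbers.filter(_ % 2 == 0).map(x => x * x)

  def main(args: Array[String]): Unit = {
    println(sumTo(100))                      // prints 5050
    println(twice((x: Int) => x + 3)(10))    // prints 16
    println(evensSquared)                    // prints List(4, 16, 36)
  }
}
```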

If that’s too simple, there is always Functional Programming in Scala, also known as ‘the red book’. The leitmotif of the book is re-implementing classic functional patterns, which may or may not be up your alley. The companion booklet is a worthwhile investment though.

Again, if you enjoy listening more than reading, Coursera have a specialization that may be of interest to you. It’s taught by Martin Odersky.

Category theory and functional programming go somewhat hand in hand, although many resources are not accessible unless you already know category theory and functional programming. Scalaz is a library that in my opinion follows that pattern: unless you’re already familiar with the material, you won’t be able to understand much of it. Cats is another such library, but it’s easier to understand than Scalaz. And with Scala Exercises you can learn most of the basic concepts by trying stuff out in an online REPL and seeing what it looks like in actual code.
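To make that a little less abstract, here is a hand-rolled sketch of a monoid, one of the simplest abstractions these libraries are built around. This is my own stand-in using only the standard library, not the API of Cats or Scalaz; both ship their own, much richer versions. All names in the block are assumptions made for the example.

```scala
object MonoidSketch {
  // A monoid: an associative `combine` plus an `empty` value that is its identity.
  trait Monoid[A] {
    def empty: A
    def combine(x: A, y: A): A
  }

  // Two instances, made available implicitly.
  implicit val intAddition: Monoid[Int] = new Monoid[Int] {
    def empty: Int = 0
    def combine(x: Int, y: Int): Int = x + y
  }

  implicit val stringConcat: Monoid[String] = new Monoid[String] {
    def empty: String = ""
    def combine(x: String, y: String): String = x + y
  }

  // One generic function that works for any type with a Monoid instance.
  def combineAll[A](xs: List[A])(implicit m: Monoid[A]): A =
    xs.foldLeft(m.empty)(m.combine)

  def main(args: Array[String]): Unit = {
    println(combineAll(List(1, 2, 3)))      // prints 6
    println(combineAll(List("a", "b")))     // prints ab
    println(combineAll(List.empty[Int]))    // prints 0
  }
}
```

The point of the abstraction is that combineAll is written once and reused for every type that has an instance, which is the style of code these libraries encourage.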

Underscore’s Creative Scala is mostly about fundamentals: lists, recursion, and abstract data types. But it’s a quick read at under 200 pages. They also offer more advanced e-books on Cats and Shapeless.

Testing

There are a couple of frameworks for unit testing in Scala: ScalaTest and Specs2. Testing in Scala covers both frameworks, and it also talks about ScalaCheck and EasyMock.

My preference goes out to ScalaTest because I think its DSL is nowhere near as messy as Specs2’s. ScalaCheck is great for property-based checks, and the user guide is by far the best resource. It’s concise, detailed, and contains pretty much all you’ll need to get started with property-based testing in Scala.
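If property-based testing is new to you, the idea is to state a property that should hold for all inputs and let the tool throw random data at it. The sketch below illustrates that idea with the standard library only. It is deliberately not ScalaCheck’s API: the check helper is a hypothetical, much cruder stand-in, and ScalaCheck adds proper generators, shrinking of counterexamples, and reporting on top.

```scala
import scala.util.Random

object PropertySketch {
  // Hypothetical helper: runs `property` against `runs` random lists of Ints
  // (length 0 to 19) and returns the first counterexample found, if any.
  def check(runs: Int, seed: Long)(property: List[Int] => Boolean): Option[List[Int]] = {
    val rng = new Random(seed)
    val inputs = Iterator.fill(runs)(List.fill(rng.nextInt(20))(rng.nextInt()))
    inputs.find(xs => !property(xs))
  }

  def main(args: Array[String]): Unit = {
    // Holds for every list, so no counterexample is found.
    println(check(100, 42L)(xs => xs.reverse.reverse == xs)) // prints None

    // A false claim ("every list is already sorted"): random input will
    // almost certainly turn up a counterexample.
    println(check(100, 42L)(xs => xs.sorted == xs))
  }
}
```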

If you are curious about mutation testing in Scala, there is stryker4s, but the last time I checked, it did not perform all that well on production code.

sbt

While dependency management with Gradle or Maven is possible, sbt is more or less the standard in the Scala community. It has the best support, although its DSL takes some getting used to.

The reference manual talks about what you need to know: directory structure, basic commands, and syntax. If that does not suit you, I suggest Getting Started with sbt for Scala.
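To show what the DSL looks like, here is a minimal build.sbt sketch for a single-module project. The project name and all version numbers are assumptions for illustration, so check for current releases before copying them. By convention sbt expects sources in src/main/scala and tests in src/test/scala.

```scala
// build.sbt (a minimal sketch; name and versions are placeholders)
ThisBuild / scalaVersion := "2.13.12"

lazy val root = (project in file("."))
  .settings(
    name := "hello-scala",
    // %% appends the Scala binary version to the artifact name;
    // Test restricts the dependency to the test classpath.
    libraryDependencies ++= Seq(
      "org.scalatest"  %% "scalatest"  % "3.2.17" % Test,
      "org.scalacheck" %% "scalacheck" % "1.17.0" % Test
    )
  )
```

With a file like this in the project root, sbt compile, sbt test, and sbt run do what their names suggest, and sbt console starts a REPL with the project on the classpath.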

Have fun with Scala!