Authors: Martin Odersky, Lex Spoon, and Bill Venners
Category: Technology
Format: Kindle
Language: English
Pages: 837
Buy link: Amazon

Review

Programming in Scala manages to introduce the language in a deep way. It is a comprehensive, step-by-step guide, with each chapter building on what the previous chapters introduced.

The language itself is a powerful blend of functional and object-oriented programming concepts, which are well explored in the book. The code examples are good and the book does not just keep the focus on the language but also explores themes like good code design. Although not always mentioned explicitly, it is possible to identify concerns with object-oriented design, testable code, and rich domain models.

Particularly interesting is the exploration of functional programming concepts. They receive the spotlight throughout the book, which will satisfy people looking to learn them well. Pure functions, referential transparency, higher-order functions, first-class functions, and partially applied functions are not just explained but also blended into object-oriented code.

However, some chapters would benefit from an editorial review (perhaps this was done in the fourth edition). The explanation of variance and bounds in Chapter 19 (Type Parameterization) could be simpler; the concept is easier to understand after reading Chapter 22 (Implementing Lists). Chapter 24 (Collections in Depth) was mostly lifted from the official documentation, with little adaptation.

Also, some code listings purport to explain the inner workings of Scala in purely functional terms. The problem is that, when you look at Scala’s source code, the correspondence is not there. Although the code listings provide an interesting, functional, and sound explanation of the computations, the actual code is optimized (and mostly procedural). Take, for example, the combinator parser method ~ (listing 33.6), which is presented as a pattern-matching application: in the source code it is just an object instantiation. Even a method like TraversableLike’s filter (Scala pre-2.13, listing 25.2), which is listed in a form closer to the actual code, can confuse the more curious readers.

Mostly well written, with good code examples and exploration of good coding practices, Programming in Scala is a great programming book that succeeds in introducing the language in depth. It would benefit from practical examples of tasks like managing dependencies with sbt and packaging the project’s source code. But it delivers on its promise. Recommended.

Some quotes

When you apply parentheses surrounding one or more values to a variable, Scala will transform the code into an invocation of a method named apply on that variable. So greetStrings(i) gets transformed into greetStrings.apply(i). Thus accessing an element of an array in Scala is simply a method call like any other. This principle is not restricted to arrays: any application of an object to some arguments in parentheses will be transformed to an apply method call. Of course this will compile only if that type of object actually defines an apply method. So it’s not a special case; it’s a general rule. Similarly, when an assignment is made to a variable to which parentheses and one or more arguments have been applied, the compiler will transform that into an invocation of an update method that takes the arguments in parentheses as well as the object to the right of the equals sign.
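To illustrate the quote above, here is a minimal sketch of the apply/update rewriting. It is not taken from the book: the Scores class and its members are hypothetical names chosen only for this example.

```scala
// Hypothetical example class; not from the book.
object ApplyUpdateSketch {
  final class Scores {
    private val data = scala.collection.mutable.Map.empty[String, Int]

    // scores(name) is rewritten by the compiler to scores.apply(name)
    def apply(name: String): Int = data.getOrElse(name, 0)

    // scores(name) = points is rewritten to scores.update(name, points)
    def update(name: String, points: Int): Unit = data(name) = points
  }

  val scores = new Scores
  scores("ana") = 10                   // same as scores.update("ana", 10)
  val viaSugar: Int = scores("ana")    // same as scores.apply("ana")
  val explicit: Int = scores.apply("ana")
}
```

Both forms read the same stored value, which is the point of the general rule: the parentheses are only syntax for ordinary method calls.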

One of the big ideas of the functional style of programming is that methods should not have side effects. A method’s only act should be to compute and return a value. Some benefits gained when you take this approach are that methods become less entangled, and therefore more reliable and reusable. Another benefit (in a statically typed language) is that everything that goes into and out of a method is checked by a type checker, so logic errors are more likely to manifest themselves as type errors. Applying this functional philosophy to the world of objects means making objects immutable.

Thus, when you say 1 -> "Go to island.", you are actually calling a method named -> on an integer with the value 1, passing in a string with the value "Go to island." This -> method, which you can invoke on any object in a Scala program, returns a two-element tuple containing the key and value.
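A small sketch of what this means in practice, assuming nothing beyond the standard library (the object and value names are illustrative):

```scala
object ArrowSketch {
  // The -> method builds a two-element tuple...
  val viaArrow: (Int, String) = 1 -> "Go to island."

  // ...which is the same value as the tuple literal.
  val viaTuple: (Int, String) = (1, "Go to island.")

  // Map's factory method simply takes such tuples as key/value pairs.
  val treasureMap: Map[Int, String] =
    Map(1 -> "Go to island.", 2 -> "Find big X on ground.")
}
```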

Defining a singleton object doesn’t define a type (at the Scala level of abstraction). Given just a definition of object ChecksumAccumulator, you can’t make a variable of type ChecksumAccumulator. Rather, the type named ChecksumAccumulator is defined by the singleton object’s companion class. However, singleton objects extend a superclass and can mix in traits. Given each singleton object is an instance of its superclasses and mixed-in traits, you can invoke its methods via these types, refer to it from variables of these types, and pass it to methods expecting these types.

In Scala operators are not special language syntax; any method can be an operator. What makes a method an operator is how you use it. When you write "s.indexOf('o')", indexOf is not an operator. But when you write "s indexOf 'o'", indexOf is an operator, because you’re using it in operator notation.
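The two notations from the quote can be put side by side in a short sketch (the object and value names are illustrative, not from the book):

```scala
object OperatorSketch {
  val s = "Hello, world!"

  // Ordinary method-call notation: indexOf is not used as an operator here.
  val dotNotation: Int = s.indexOf('o')

  // Operator (infix) notation: the same method, used as an operator.
  val operatorNotation: Int = s indexOf 'o'

  // The reverse also holds: + is just a method on Int.
  val sum: Int = (1).+(2)
}
```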

One of the benefits of object-oriented programming is that it allows you to encapsulate data inside objects so that you can ensure the data is valid throughout its lifetime.

Scala has only a handful of built-in control structures. The only control structures are if, while, for, try, match, and function calls. The reason Scala has so few is that it has included function literals since its inception. Instead of accumulating one higher-level control structure after another in the base syntax, Scala accumulates them in libraries.

A second advantage to using a val instead of a var is that it better supports equational reasoning. The introduced variable is equal to the expression that computes it, assuming that expression has no side effects.

(…) an important design principle of the functional programming style: programs should be decomposed into many small functions that each do a well-defined task. Individual functions are often quite small. The advantage of this style is that it gives a programmer many building blocks that can be flexibly composed to do more difficult things. Each building block should be simple enough to be understood individually. One problem with this approach is that all the helper function names can pollute the program namespace. In the interpreter this is not so much of a problem, but once functions are packaged in reusable classes and objects, it’s desirable to hide the helper functions from clients of a class. They often do not make sense individually, and you often want to keep enough flexibility to delete the helper functions if you later rewrite the class a different way.

A function literal is compiled into a class that when instantiated at runtime is a function value. Thus the distinction between function literals and values is that function literals exist in the source code, whereas function values exist as objects at runtime. The distinction is much like that between classes (source code) and objects (runtime).

A partially applied function is an expression in which you don’t supply all of the arguments needed by the function. Instead, you supply some, or none, of the needed arguments.
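A minimal sketch of supplying some or none of the arguments. The sum method and the value names are illustrative; the explicit type annotations are there so the snippet should compile under both Scala 2 and Scala 3.

```scala
object PartialApplicationSketch {
  def sum(a: Int, b: Int, c: Int): Int = a + b + c

  // Supplying some arguments: only the middle one is left open.
  val middleMissing: Int => Int = sum(1, _: Int, 3)

  // Supplying none: every argument is left open.
  val noneSupplied: (Int, Int, Int) => Int = sum(_, _, _)
}
```

Calling middleMissing(2) or noneSupplied(1, 2, 3) then invokes sum with the remaining arguments filled in.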

In languages with first-class functions, you can effectively make new control structures even though the syntax of the language is fixed. All you need to do is create methods that take functions as arguments.
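As a sketch of the idea, the hypothetical repeat method below (not part of the standard library or the book) takes a function as its last argument, so a call to it reads like a built-in loop:

```scala
object ControlSketch {
  // A user-defined "control structure": runs body once per index.
  def repeat(times: Int)(body: Int => Unit): Unit = {
    var i = 0
    while (i < times) {
      body(i)
      i += 1
    }
  }

  val seen = scala.collection.mutable.ListBuffer.empty[Int]

  // Reads like language syntax, but it is an ordinary method call.
  repeat(3) { i =>
    seen += i
  }
}
```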

Such parameterless methods are quite common in Scala. By contrast, methods defined with empty parentheses, such as def height(): Int, are called empty-paren methods. The recommended convention is to use a parameterless method whenever there are no parameters and the method accesses mutable state only by reading fields of the containing object (in particular, it does not change mutable state). This convention supports the uniform access principle, which says that client code should not be affected by a decision to implement an attribute as a field or method.

Generally, Scala has just two namespaces for definitions in place of Java’s four. Java’s four namespaces are fields, methods, types, and packages. By contrast, Scala’s two namespaces are: values (fields, methods, packages, and singleton objects) and types (class and trait names). The reason Scala places fields and methods into the same namespace is precisely so you can override a parameterless method with a val, something you can’t do with Java.
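A minimal sketch of overriding a parameterless method with a val; Shape and Triangle are hypothetical classes used only for illustration:

```scala
object NamespaceSketch {
  class Shape {
    // A parameterless method (no parentheses).
    def sides: Int = 0
  }

  class Triangle extends Shape {
    // Allowed because fields and methods share one namespace.
    override val sides: Int = 3
  }

  // Client code is the same whether sides is a def or a val.
  val shape: Shape = new Triangle
}
```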

A factory object contains methods that construct other objects. Clients would then use these factory methods to construct objects, rather than constructing the objects directly with new. An advantage of this approach is that object creation can be centralized and the details of how objects are represented with classes can be hidden. This hiding will both make your library simpler for clients to understand, because less detail is exposed, and provide you with more opportunities to change your library’s implementation later without breaking client code.
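One way this could look, as a sketch under assumed names (Shape, circle, and rect are hypothetical and not the book's own example): the concrete classes are private to the factory object, so clients only ever see the trait.

```scala
object FactorySketch {
  trait Shape {
    def area: Double
  }

  object Shape {
    // Implementation classes are hidden from clients.
    private final class Circle(r: Double) extends Shape {
      def area: Double = math.Pi * r * r
    }
    private final class Rect(w: Double, h: Double) extends Shape {
      def area: Double = w * h
    }

    // Factory methods are the only way to obtain instances.
    def circle(r: Double): Shape = new Circle(r)
    def rect(w: Double, h: Double): Shape = new Rect(w, h)
  }
}
```

Because clients never name Circle or Rect, those classes can later be merged, renamed, or replaced without breaking calling code.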

Type inference in Scala is flow based. In a method application m(args), the inferencer first checks whether the method m has a known type. If it does, that type is used to infer the expected type of the arguments. For instance, in abcde.sortWith(_ > _), the type of abcde is List[Char]. Hence, sortWith is known to be a method that takes an argument of type (Char, Char) => Boolean and produces a result of type List[Char]. Since the parameter types of the function arguments are known, they need not be written explicitly. With what it knows about sortWith, the inferencer can deduce that (_ > _) should expand to ((x: Char, y: Char) => x > y) where x and y are some arbitrary fresh names.

This inference scheme suggests the following library design principle: When designing a polymorphic method that takes some non-function arguments and a function argument, place the function argument last in a curried parameter list on its own. That way, the method’s correct instance type can be inferred from the non-function arguments, and that type can in turn be used to type check the function argument. The net effect is that users of the method will be able to give less type information and write function literals in more compact ways.
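The design principle can be sketched with two hypothetical methods that differ only in how their parameters are grouped. The comments describe Scala 2 behaviour as the reviewer understands it; Scala 3's inference is generally more lenient here.

```scala
object InferenceSketch {
  // Function argument in the same parameter list as the collection.
  def existsUncurried[A](xs: List[A], p: A => Boolean): Boolean = xs.exists(p)

  // Function argument last, in a curried parameter list of its own.
  def existsCurried[A](xs: List[A])(p: A => Boolean): Boolean = xs.exists(p)

  // A is fixed to Int by the first list, so the short literal type checks.
  val curried: Boolean = existsCurried(List(1, 2, 3))(_ > 2)

  // Written with an explicit parameter type, which Scala 2 typically needs
  // here because A is not yet known when the function literal is checked.
  val uncurried: Boolean = existsUncurried(List(1, 2, 3), (x: Int) => x > 2)
}
```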

Besides being potentially easier to reason about, immutable collections can usually be stored more compactly than mutable ones if the number of elements stored in the collection is small. For instance an empty mutable map in its default representation of HashMap takes up about 80 bytes, and about 16 more are added for each entry that’s added to it. An empty immutable Map is a single object that’s shared between all references, so referring to it essentially costs just a single pointer field. What’s more, the Scala collections library currently stores immutable maps and sets with up to four entries in a single object, which typically takes up between 16 and 40 bytes, depending on the number of entries stored in the collection. So for small maps and sets, the immutable versions are much more compact than the mutable ones. Given that many collections are small, switching them to be immutable can bring important space savings and performance advantages.

More precisely, an initializer = _ of a field assigns a zero value to that field. The zero value depends on the field’s type. It is 0 for numeric types, false for booleans, and null for reference types. This is the same as if the same variable was defined in Java without an initializer.

Private constructors and private members are one way to hide the initialization and representation of a class. Another more radical way is to hide the class itself and only export a trait that reveals the public interface of the class.

Implicit conversions are often helpful for working with two bodies of software that were developed without each other in mind. Each library has its own way to encode a concept that is essentially the same thing. Implicit conversions help by reducing the number of explicit conversions that are needed from one type to another.

Arrays are a special kind of collection in Scala. On the one hand, Scala arrays correspond one-to-one to Java arrays. That is, a Scala array Array[Int] is represented as a Java int[], an Array[Double] is represented as a Java double[] and an Array[String] is represented as a Java String[]. But at the same time, Scala arrays offer much more than their Java analogues. First, Scala arrays can be generic. That is, you can have an Array[T], where T is a type parameter or abstract type. Second, Scala arrays are compatible with Scala sequences—you can pass an Array[T] where a Seq[T] is required.
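A short sketch of both properties, with illustrative method names. It uses scala.collection.Seq as the parameter type, on the assumption that the code may be compiled with Scala 2.13 or later, where the default Seq alias is the immutable one.

```scala
object ArraySketch {
  // Accepts any sequence; an array is wrapped implicitly when passed in.
  def total(xs: scala.collection.Seq[Int]): Int = xs.sum

  // Generic over the element type T.
  def firstOf[T](xs: Array[T]): T = xs(0)

  val numbers: Array[Int] = Array(1, 2, 3)
  val viaSeq: Int = total(numbers)
  val firstName: String = firstOf(Array("ana", "bob"))
}
```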

An extractor in Scala is an object that has a method called unapply as one of its members. The purpose of that unapply method is to match a value and take it apart. Often, the extractor object also defines a dual method apply for building values, but this is not required.
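A minimal sketch of an extractor with a dual apply method. The EMail object and the describe method are illustrative; the splitting logic is deliberately naive and only meant to show the mechanics.

```scala
object ExtractorSketch {
  object EMail {
    // Optional dual of unapply: builds a value from its parts.
    def apply(user: String, domain: String): String = user + "@" + domain

    // Takes a value apart; None means "does not match".
    def unapply(str: String): Option[(String, String)] = {
      val parts = str.split("@")
      if (parts.length == 2) Some((parts(0), parts(1))) else None
    }
  }

  def describe(s: String): String = s match {
    case EMail(user, domain) => s"user=$user domain=$domain"
    case _                   => "not an email address"
  }
}
```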

Representation independence is an important advantage of extractors over case classes. On the other hand, case classes also have some advantages of their own over extractors. First, they are much easier to set up and to define, and they require less code. Second, they usually lead to more efficient pattern matches than extractors, because the Scala compiler can optimize patterns over case classes much better than patterns over extractors. This is because the mechanisms of case classes are fixed, whereas an unapply or unapplySeq method in an extractor could do almost anything. Third, if your case classes inherit from a sealed base class, the Scala compiler will check your pattern matches for exhaustiveness and will complain if some combination of possible values is not covered by a pattern. No such exhaustiveness checks are available for extractors. So which of the two methods should you prefer for your pattern matches? It depends. If you write code for a closed application, case classes are usually preferable because of their advantages in conciseness, speed and static checking. If you decide to change your class hierarchy later, the application needs to be refactored, but this is usually not a problem. On the other hand, if you need to expose a type to unknown clients, extractors might be preferable because they maintain representation independence.

What’s more, every regular expression in Scala defines an extractor. The extractor is used to identify substrings that are matched by the groups of the regular expression.
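For instance, a regular expression with three groups can be used directly as a three-argument pattern. This is a sketch with an illustrative IsoDate pattern and parse method, not an example from the book:

```scala
object RegexSketch {
  // Three capture groups: year, month, day.
  val IsoDate = """(\d{4})-(\d{2})-(\d{2})""".r

  def parse(s: String): Option[(Int, Int, Int)] = s match {
    // Each pattern variable binds to the substring matched by one group.
    case IsoDate(y, m, d) => Some((y.toInt, m.toInt, d.toInt))
    case _                => None
  }
}
```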

Such tools are called meta-programming tools, because they are programs that take other programs as input. Annotations support these tools by letting the programmer sprinkle directives to the tool throughout their source code. Such directives let the tools be more effective than if they could have no user input.

(…) the private keyword, so useful for implementing classes, is also useful for implementing modules. Items marked private are part of the implementation of a module, and thus are particularly easy to change without affecting other modules. At this point, many more facilities could be added, but you get the idea. Programs can be divided into singleton objects, which you can think of as modules.

The == equality is reserved in Scala for the “natural” equality of each type. For value types, == is value comparison, just like in Java. For reference types, == is the same as equals in Scala. You can redefine the behavior of == for new types by overriding the equals method, which is always inherited from class Any. The inherited equals, which takes effect unless overridden, is object identity, as is the case in Java. So equals (and with it, ==) is by default the same as eq, but you can change its behavior by overriding the equals method in the classes you define. It is not possible to override == directly, as it is defined as a final method in class Any.
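The difference between == and eq can be seen in a small sketch. Point is a hypothetical class, and this equals is intentionally simple; it does not address the subclassing pitfalls the book goes on to discuss.

```scala
object EqualitySketch {
  class Point(val x: Int, val y: Int) {
    // Overriding equals changes what == means for Point.
    override def equals(other: Any): Boolean = other match {
      case that: Point => x == that.x && y == that.y
      case _           => false
    }
    // Keep hashCode consistent with equals.
    override def hashCode: Int = (x, y).##
  }

  val p = new Point(1, 2)
  val q = new Point(1, 2)

  val sameValue: Boolean = p == q      // uses the overridden equals
  val sameReference: Boolean = p eq q  // reference identity only
}
```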

It turns out that writing a correct equality method is surprisingly difficult in object-oriented languages. In fact, after studying a large body of Java code, the authors of a 2007 paper concluded that almost all implementations of equals methods are faulty.

Lastly, if you find that a particular hash code calculation is harming the performance of your program, consider caching the hash code. If the object is immutable, you can calculate the hash code when the object is created and store it in a field. You can do this by simply overriding hashCode with a val instead of a def, like this: override val hashCode: Int = (numer, denom).## This approach trades off memory for computation time, because each instance of the immutable class will have one more field to hold the cached hash code value.

Scala does not check that thrown exceptions are caught. That is, Scala has no equivalent to Java’s throws declarations on methods. All Scala methods are translated to Java methods that declare no thrown exceptions.

With shared data and locks, you must get the program correct through reason alone. Moreover, you can’t solve the problem by over-synchronizing either. It can be just as problematic to synchronize everything as it is to synchronize nothing. Although new lock operations may remove possibilities for race conditions, they simultaneously add possibilities for deadlocks.

Although not a silver bullet, Scala’s Future offers one way to deal with concurrency that can reduce, and often eliminate, the need to reason about shared data and locks. When you invoke a Scala method, it performs a computation “while you wait” and returns a result. If that result is a Future, the Future represents another computation to be performed asynchronously, often by a completely different thread. As a result, many operations on Future require an implicit execution context that provides a strategy for executing functions asynchronously.

The async testing use case illustrates a general principle for working with futures: Once in “future space,” try to stay in future space. Don’t block on a future then continue the computation with the result. Stay asynchronous by performing a series of transformations, each of which returns a new future to transform. To get results out of future space, register side effects to be performed asynchronously once futures complete. This approach will help you make maximum use of your threads.
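A minimal sketch of staying in future space, assuming the global execution context is acceptable for the example. The fetchQuantity, fetchUnitPrice, and total methods are hypothetical stand-ins for real asynchronous work.

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object FutureSketch {
  // Stand-ins for asynchronous computations.
  def fetchQuantity(): Future[Int] = Future(3)
  def fetchUnitPrice(): Future[Int] = Future(7)

  // Combine results by transforming futures, never by blocking on them.
  def total(): Future[Int] =
    for {
      quantity <- fetchQuantity()
      price    <- fetchUnitPrice()
    } yield quantity * price
}
```

To leave future space, a side effect can be registered instead of blocking, for example FutureSketch.total().foreach(println).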