
Helper code to mimic Clojure fns in Scala

I’ve finished my 3.5-year stint writing Scala, and I haven’t stopped missing Clojure. My knowledge of Clojure continues to heighten and inform my programmer sensibilities. One thing that I appreciated about Scala is that it was as good a medium as you might practically find for writing Clojure without writing Clojure. I liked to think of Scala as the canvas on which I painted my Clojure ideas. Because Scala is amenable to several styles of programming at once (at least FP and OOP), it was possible to write code by imagining what the Clojure code would look like and then writing that in Scala syntax. Interestingly, the more I did this, and the more faithfully I did so, the more people implicitly (no pun intended!) acknowledged the code as “good Scala code”. Because, you know, most Scala programmers agree that good Scala code puts “val”s at the top of a function body, uses immutable collections exclusively, prefers functions over (object) methods, and makes functions small, stateless, and composable. More on that later. Here, I simply want to release some of the code that I wrote in Scala to fill in a few perceived gaps in Scala’s Seq abstraction, where the perception is based on what I was accustomed to using in Clojure.

Code snippets for implementing Clojure fns / functionality in Scala

Note: the following code snippets are licensed under Apache 2.0, so go ahead and use them wherever you see fit!

The first code snippet is perhaps the more interesting of the two. I provide my implementations in Scala of Clojure’s merge-with and partition-by:


/**
Copyright 2018 Google LLC.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
/**
 * Seq util functions
 */
object SeqUtil
{
  /**
   * Implements Clojure's merge-with for two maps: a key present in both maps
   * has its values combined with f; every other entry is copied unchanged.
   *
   * @param m1 first input map
   * @param m2 second input map
   * @param f combines the value from m1 with the value from m2 for a shared key
   * @tparam K key type
   * @tparam V value type
   * @return the merged map
   */
  def mergeWith[K,V](m1: Map[K,V], m2: Map[K,V], f: (V,V) => V): Map[K,V] = {
    val keys1 = m1.keySet
    val keys2 = m2.keySet
    val allKeys = keys1.union(keys2)
    def foldFn(m: Map[K,V], k: K) = {
      if (keys1.contains(k) && keys2.contains(k)) {
        m + (k -> f(m1(k), m2(k)))
      } else if (keys1.contains(k)) {
        m + (k -> m1(k))
      } else {
        m + (k -> m2(k))
      }
    }
    val init = Map[K,V]()
    val result = allKeys.foldLeft(init)(foldFn)
    result
  }
  /**
   * This is loosely based on Clojure's merge-with, but allows for
   * the return map type to be different from the input maps' type since
   * this is Scala == statically typed.
   * It probably really makes sense to use this as an argument to a
   * foldLeft type of reduction because an initial value of a different type
   * needs to be guaranteed.
   *
   * @param m1 map to merge into to create output map (akin to init val for a foldLeft)
   * @param m2 input map
   * @param initW a fn to initialize the output map value for a new key
   * @param f how to merge a value in the input map with a value in the output map
   * @tparam K key type for both input and output map
   * @tparam V value type for input map
   * @tparam W value type for output map
   * @return the output map with every entry of m2 merged in
   */
  def mergeLeftWith[K,V,W](m1: Map[K,W], m2: Map[K,V], initW: K => W, f: (W,V) => W): Map[K,W] = {
    val keys1 = m1.keySet
    val keys2 = m2.keySet
    val allKeys = keys1.union(keys2)
    def foldFn(m: Map[K,W], k: K) = {
      if (keys1.contains(k) && keys2.contains(k)) {
        m + (k -> f(m1(k), m2(k)))
      } else if (keys1.contains(k)) {
        m + (k -> m1(k))
      } else {
        // key only exists in m2: seed the output value first, then merge
        val newVal = f(initW(k), m2(k))
        m + (k -> newVal)
      }
    }
    val init = Map[K,W]()
    val result = allKeys.foldLeft(init)(foldFn)
    result
  }
  /**
   * Implements Clojure's partition-by fn for Scala seqs: given a seq and a fn,
   * segment the entire seq, in order, into a seq-of-seqs, where each inner
   * seq's values all produce the same output value (when f applied) as each other
   * Note: implementation does not use lazy seqs nor lazy vals
   * Note: for an empty input this returns a seq containing one empty seq,
   * whereas Clojure's partition-by returns an empty seq
   *
   * @param seq the seq to segment
   * @param f fn whose output values are compared with ==
   * @tparam T element type
   * @return the runs of consecutive elements that share the same output of f
   */
  def partitionBy[T](seq: Seq[T], f: (T) => Any): Seq[Seq[T]] = {
    if (seq.isEmpty) {
      val emptySeq: Seq[T] = Seq.empty
      Seq(emptySeq)
    } else if (seq.tail.isEmpty) {
      val onlyElem = seq.head
      val oneElemSeq: Seq[T] = Seq(onlyElem)
      Seq(oneElemSeq)
    } else {
      val firstElem = seq.head
      var currFnVal: Any = f(firstElem)
      var currSeq: Seq[T] = Seq()
      var result: Seq[Seq[T]] = Seq()
      for (e <- seq: Seq[T]) {
        val fnVal = f(e)
        if (fnVal == currFnVal) {
          currSeq = currSeq :+ e
        } else {
          // output of f changed: close the current run and start a new one
          currFnVal = fnVal
          result = result :+ currSeq
          currSeq = Seq(e)
        }
      }
      result = result :+ currSeq
      result
    }
  }
}
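
For comparison, here is a rough sketch of how the same idea could be written without vars, using span and a tail-recursive helper. This is an alternative I am assuming for illustration, not the code above: the name PartitionByDemo, the curried signature, and the extra type parameter K are all hypothetical. Unlike the version above, it returns an empty seq for empty input, which is what Clojure's partition-by does.

```scala
object PartitionByDemo {
  // Hypothetical var-free variant; not the SeqUtil implementation from the post.
  // The second parameter list lets Scala infer the lambda's argument type.
  def partitionBy[T, K](seq: Seq[T])(f: T => K): Seq[Seq[T]] = {
    @annotation.tailrec
    def loop(rest: Seq[T], acc: Vector[Seq[T]]): Vector[Seq[T]] =
      if (rest.isEmpty) acc
      else {
        val key = f(rest.head)
        // span splits off the longest prefix whose elements map to the same key
        // (note that f is applied to the head a second time here)
        val (run, remaining) = rest.span(e => f(e) == key)
        loop(remaining, acc :+ run)
      }
    loop(seq, Vector.empty)
  }

  // Consecutive odd and even numbers end up in separate runs, in order.
  val byParity: Seq[Seq[Int]] = partitionBy(Seq(1, 3, 5, 2, 4, 7))(_ % 2 == 0)

  // Equal neighbours are grouped; a value that reappears later starts a new run.
  val byIdentity: Seq[Seq[Int]] = partitionBy(Seq(1, 1, 2, 3, 3, 1))(identity)
}
```

The tradeoff is that span re-applies f to the first element of each run, so this sketch assumes f is cheap and free of side effects.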


The code originated from the fact that something as simple as partition-by didn’t exist in Scala, and there was really no way to cleanly finish the task I was working on without going off and implementing it. Soon after, merge-with followed, and then “mergeLeftWith” was created to offer a version that starts with an initial value. The analogy: if merge-with is like Clojure’s reduce with no initial value argument, then “mergeLeftWith” is like Clojure’s reduce with an initial value argument (aka Scala’s foldLeft).
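
To make the reduce/foldLeft analogy concrete, here is a minimal usage sketch. To keep it self-contained, it restates mergeWith and mergeLeftWith in condensed form with the same signatures; the condensed bodies, the object name MergeDemo, and the sample maps are my assumptions for illustration, not the original code.

```scala
object MergeDemo {
  // Condensed restatement of mergeWith (same signature as in SeqUtil):
  // start from m1 and fold each entry of m2 into it.
  def mergeWith[K, V](m1: Map[K, V], m2: Map[K, V], f: (V, V) => V): Map[K, V] =
    m2.foldLeft(m1) { case (acc, (k, v)) =>
      acc + (k -> acc.get(k).map(f(_, v)).getOrElse(v))
    }

  // Condensed restatement of mergeLeftWith: a key missing from the
  // accumulator is first seeded with initW(k), then merged with f.
  def mergeLeftWith[K, V, W](m1: Map[K, W], m2: Map[K, V], initW: K => W, f: (W, V) => W): Map[K, W] =
    m2.foldLeft(m1) { case (acc, (k, v)) =>
      acc + (k -> f(acc.getOrElse(k, initW(k)), v))
    }

  val a = Map("x" -> 1, "y" -> 2)
  val b = Map("y" -> 10, "z" -> 20)

  // Roughly Clojure's (merge-with + a b): input and output value types match.
  val summed: Map[String, Int] = mergeWith(a, b, (p: Int, q: Int) => p + q)

  // mergeLeftWith as the step of a foldLeft: the accumulator's value type
  // (List[Int]) differs from the input maps' value type (Int).
  val collected: Map[String, List[Int]] =
    Seq(a, b).foldLeft(Map.empty[String, List[Int]]) { (acc, m) =>
      mergeLeftWith(acc, m, (_: String) => List.empty[Int], (ws: List[Int], v: Int) => ws :+ v)
    }
}
```

Here summed adds the values for the shared key "y", while collected gathers every value seen per key into a list, which is the case where the plain merge-with signature would not type-check.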

The second code snippet was useful for reducing all the boilerplate that inevitably surrounds the use of Options in Scala. I also added some pretty-printing functions that I used in testing:


/*
Copyright 2018 Google LLC.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
object LangUtil
{
  /**
   * Given two operands that are the same concrete Option type, and a binary
   * function, return the result of the operation as an option in a default
   * way.
   *
   * @param val1 first operand
   * @param val2 second operand
   * @param op binary function applied only when both operands are defined
   * @tparam U operand type
   * @tparam V result type
   * @return the wrapped result when both operands are defined (None if op
   *         returns null), otherwise None
   */
  def optionGenericOp[U,V](val1: Option[U], val2: Option[U], op: (U,U) => V): Option[V] = {
    (val1, val2) match {
      case (Some(a), Some(b)) => Option(op(a, b))
      case _ => None // at least one operand is None
    }
  }
  /**
   * Provide the logical equivalent of && for Option[Boolean]
   *
   * @param b1 left operand
   * @param b2 right operand
   * @return Some(b1 && b2) when both are defined, otherwise None
   */
  def optionAnd(b1: Option[Boolean], b2: Option[Boolean]): Option[Boolean] = {
    optionGenericOp(b1, b2, (x: Boolean, y: Boolean) => x && y)
  }
  /**
   * Provide the logical equivalent of || for Option[Boolean]
   *
   * @param b1 left operand
   * @param b2 right operand
   * @return Some(b1 || b2) when both are defined, otherwise None
   */
  def optionOr(b1: Option[Boolean], b2: Option[Boolean]): Option[Boolean] = {
    optionGenericOp(b1, b2, (x: Boolean, y: Boolean) => x || y)
  }
  //
  // pretty-printing util fns
  //
  // One "key -> value" line per entry, in the map's own iteration order
  def prettyPrintableMap[K,V](m: Map[K,V]): String = {
    val keys = m.keys.toSeq
    val kvLines = keys.map(k => s"\t$k -> \t${m(k)}")
    kvLines.mkString("\n")
  }
  // Same output, with keys sorted by the value that keySortBy extracts
  def prettyPrintableSortableMap[K,V,B](m: Map[K,V], keySortBy: K => B)(implicit ord: math.Ordering[B]): String = {
    val keys = m.keys.toSeq.sortBy(keySortBy)
    val kvLines = keys.map(k => s"\t$k -> \t${m(k)}")
    kvLines.mkString("\n")
  }
  // Same output, for key types that extend Ordered[K], sorted by their natural order
  def prettyPrintableSortableMap[K <: Ordered[K],V](m: Map[K,V]): String = {
    val keys = m.keys.toSeq.sorted
    val kvLines = keys.map(k => s"\t$k -> \t${m(k)}")
    kvLines.mkString("\n")
  }
}
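
As a point of comparison, the same "both operands must be defined" logic can be sketched with a plain for-comprehension over Option, which is standard Scala. The object name OptionOpDemo and the curried optionOp signature are hypothetical. One difference from optionGenericOp above: that version wraps the result in Option(...), so a null result becomes None, whereas this sketch would yield Some(null).

```scala
object OptionOpDemo {
  // Hypothetical for-comprehension variant of optionGenericOp.
  // Yields Some(op(x, y)) only when both operands are defined.
  def optionOp[U, V](a: Option[U], b: Option[U])(op: (U, U) => V): Option[V] =
    for {
      x <- a
      y <- b
    } yield op(x, y)

  def optionAnd(a: Option[Boolean], b: Option[Boolean]): Option[Boolean] =
    optionOp(a, b)(_ && _)

  def optionOr(a: Option[Boolean], b: Option[Boolean]): Option[Boolean] =
    optionOp(a, b)(_ || _)
}
```

As in the original helpers, a missing operand always wins: optionOr(Some(true), None) is None rather than Some(true).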


On the topic of whether all the ceremonial code required for Scala’s liberal use of Option (now partially present in Java 8+ due to Java’s careful embrace of FP) is worth it, you should really see Effective Programs – 10 Years of Clojure by Rich Hickey. It articulates well the inherent tradeoffs that we make in our choice of programming languages, which are merely means to an end. And it brilliantly articulates a practical perspective that speaks to my sensibilities about why I found the boilerplate code in Scala slowing me down more than I would like for the amount of perceived benefit I got in return (not much). Most of my confidence in my code came from my various unit and integration tests.

And speaking of tests, don’t underestimate the utility of the pretty-printing functions. I created them because I had to convert Clojure code I had written that used the expectations testing library. That library is amazing, especially when your logic requires data structures. The library isn’t radically different from other “fluent” testing libraries, nor is that where most of the benefit lies. The real benefit occurs when you spend the most time using it: when your tests fail! At that point you don’t necessarily look back at your test code; rather, you look at the test output to gather clues about what failed and how. Expectations does the following in its error output:

  • re-prints the test code causing the failure, with the provided values plugged into the code if necessary
  • instead of printing “actual value [A] is not equal to expected value [E]”, it neatly prints (using line breaks and horizontal spacing) the values so that they line up. (I can’t tell you how many times I’ve seen test error output that reproduces the default Java object printing of 2 large, detailed objects side by side without even a line break.)
  • more importantly and awesome-ly, it only shows you the portion of expected and actual values you need to see
  • and in a terminal with colors, you get different colors for the re-printing of the original test code, the expected value, and the actual value

I wasn’t about to do all that because I couldn’t possibly do so. I don’t think Scala has data structure diff’ing libraries because it doesn’t share Clojure’s proud focus on data-oriented programming. So the best I could do to recreate expectations in Scala was to create helper test functions (one for comparing sets, one for comparing maps, one for comparing seqs, etc.) where the differing values would be printed on separate lines so that they would line up and you could more easily see where they diverge. For seqs, my testing fns would additionally iterate through the seqs and print the indices of the last congruent & first divergent elements in the seqs. Ultimately, the less time you spend sifting through the error messages, the faster your unit/integration test cycle is, and the faster you get back to doing the interesting, productive work that you intended to do.
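
My actual test helpers aren’t included in this post, so the following is only a rough sketch of the idea under my own assumptions: the names DiffDemo, firstDivergentIndex, and mapDiffLines are hypothetical, as is the exact output layout. It finds the first index at which two seqs diverge and renders the differing entries of two maps on aligned, separate lines.

```scala
object DiffDemo {
  // Index of the first position where the seqs differ, or None if they are equal.
  // When one seq is a strict prefix of the other, the divergence is the
  // length of the shorter seq.
  def firstDivergentIndex[T](expected: Seq[T], actual: Seq[T]): Option[Int] = {
    val idx = expected.zip(actual).indexWhere { case (e, a) => e != a }
    if (idx >= 0) Some(idx)
    else if (expected.length != actual.length) Some(math.min(expected.length, actual.length))
    else None
  }

  // One block of lines per key whose value differs (or is missing on one side),
  // with the expected and actual values lined up under each other.
  def mapDiffLines[K, V](expected: Map[K, V], actual: Map[K, V]): Seq[String] = {
    def show(v: Option[V]): String = v.fold("<missing>")(_.toString)
    val keys = (expected.keySet ++ actual.keySet).toSeq.sortBy(_.toString)
    keys
      .filter(k => expected.get(k) != actual.get(k))
      .flatMap { k =>
        Seq(
          s"key $k",
          s"  expected: ${show(expected.get(k))}",
          s"  actual:   ${show(actual.get(k))}"
        )
      }
  }
}
```

The index of the last congruent element, when there is one, is simply the divergent index minus one, so a test helper can print both from the single result.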

Parting thoughts about Scala

I found it strange that when I stepped into my previous role, which used Scala heavily, I basically just started writing my Scala code as if it were Clojure and didn’t really get punished for it. I created a couple of semi-stateful OOP-y classes in my very first Scala program just to make it less obvious that I was doing so, but then I got critiqued for how they made my code confusing. After that, I decided to just write Clojure code in Scala syntax from then on. As a result, at a high level, my code:

  • used Scala objects (singletons) in lieu of Clojure namespaces
  • wrote functions statelessly, with all the vals and block-local function definitions occurring before any other code in the function
  • used the plain data structures (Map, Vector, Set) wherever possible
  • used Scala case classes to represent Clojure heterogeneous maps / Clojure records, because a Scala Map doesn’t support that
  • avoided any typical OOP “plain” Java classes
  • used Java interfaces as substitutes for Clojure protocols
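
Put together, the conventions in this list might look something like the sketch below. Everything in it (Order, Describable, Orders, and the field names) is a hypothetical example rather than code from my old projects, and I use a Scala trait with only abstract methods as the stand-in for an interface.

```scala
// Case class standing in for a Clojure record / heterogeneous map.
final case class Order(id: String, items: Vector[String], totalCents: Long)

// Abstract-only trait playing the role of a protocol.
trait Describable[A] {
  def describe(a: A): String
}

// Singleton object used like a Clojure namespace: only stateless functions.
object Orders {
  // vals first, then the single expression that builds the result
  def addItem(order: Order, item: String, priceCents: Long): Order = {
    val newItems = order.items :+ item
    val newTotal = order.totalCents + priceCents
    order.copy(items = newItems, totalCents = newTotal)
  }

  // "Extending the protocol" to Order without putting methods on Order itself.
  val describableOrder: Describable[Order] = new Describable[Order] {
    def describe(o: Order): String = {
      val itemList = o.items.mkString(", ")
      s"${o.id}: $itemList (${o.totalCents} cents)"
    }
  }
}
```

addItem never mutates its argument; it returns a new Order via copy, much like assoc on a Clojure map.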

With all that, combined with my helper fns/code in main code and test code, I felt pretty comfortable in Scala. And arguably, I may have had a quicker pace in getting code written than many Scala programmers around me, who inevitably got caught in some compiler error or type representation riddle because they were trying to do Scala the “right way” (for whatever value of “right way” they imbibed).

“Hey, (at least) it’s better than Java!” — that’s the most popular selling point that I’ve heard for Scala through the years. And I agree. There’s nothing more that I wish to say on the topic that Rich doesn’t say much more insightfully in his talk Effective Programs – 10 Years of Clojure. I don’t know what the future holds for programming languages. But I’m still optimistic in declaring that Clojure is a language for the ages.

2 replies on “Helper code to mimic Clojure fns in Scala”

No, I don’t, although I did definitely hear of scalaz and cats sounds very vaguely familiar. I remember that teammates who knew more about scalaz and shapeless said to steer clear because their benefit was not worth the complexity that they introduced. I could understand their perspective because it is a more localized “instance” of the point about language tradeoffs that Rich explained in his talk. So I never bothered to try for that reason, for better or for worse.

If you’re alluding to the idea that maybe those optionAnd and optionOr functions are duplicates of logic elsewhere, then I totally agree. I had the feeling that it was true as I wrote them, but I was going for fast and simple over the complexity of finding & learning a new dep for those 2 functions. Kind of like how lots of Clojure programmers probably re-invent mapping over map values instead of using the pre-existing fmap function (https://github.com/clojure/algo.generic/blob/master/src/main/clojure/clojure/algo/generic/functor.clj#L33).
