Scala Collections Series, Part 3 - Operations on Scala Collections, File I/O and Trivia

This blog post is the last of a three-part series. It covers:

  1. Operations on Scala Collections
  2. File I/O
  3. Trivia

Operations on Scala Collections

Operations on collections can be composed, so that the result of one feeds directly into the next. Here are a few examples.

scala> val listOfSentences = List("'Tis better to have loved and lost than never to have loved at all. #lovedthanlost", "It ain't what you don't know that gets you into trouble, it's what you know for sure that just ain't so. #marktwain", "#moon #onesmallstepforman That's one small step for man, one giant leap for mankind.")
listOfSentences: List[String] = List('Tis better to have loved and lost than never to have loved at all. #lovedthanlost, It ain't what you don't know that gets you into trouble, it's what you know for sure that just ain't so. #marktwain, #moon #onesmallstepforman That's one small step for man, one giant leap for mankind.)

scala> val numberOfWords = listOfSentences.flatMap(_.split(" ")).size
numberOfWords: Int = 50

scala> val distinctWords = listOfSentences.map(_.filter(c => !Set(',', '.', '!').contains(c))).flatMap(_.split(" ")).toSet
distinctWords: scala.collection.immutable.Set[String] = Set(it's, trouble, for, That's, giant, have, #lovedthanlost, step, than, mankind, sure, what, so, #moon, all, just, #marktwain, ain't, man, don't, lost, that, to, you, know, small, 'Tis, at, leap, #onesmallstepforman, loved, gets, It, into, better, and, one, never)

scala> distinctWords.size
res102: Int = 38

scala> val hashtags = listOfSentences.map(_.filter(c => !Set(',', '.', '!').contains(c))).flatMap(_.split(" ")).filter(_.startsWith("#"))
hashtags: List[String] = List(#lovedthanlost, #marktwain, #moon, #onesmallstepforman)
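
The same building blocks compose into slightly larger pipelines. As a minimal sketch (the sample sentences and the names words, wordCounts and repeated are made up for this example), counting how often each word occurs could look like this:

```scala
// Hypothetical sample data, shorter than the list used above
val sentences = List("one small step for man", "one giant leap for mankind")

// Split every sentence into words and flatten the result into a single list
val words = sentences.flatMap(_.split(" "))

// Group equal words together, then replace each group by its size
val wordCounts = words.groupBy(identity).map { case (word, occurrences) => (word, occurrences.size) }

// Keep only the words that occur more than once, in alphabetical order
val repeated = wordCounts.filter { case (_, count) => count > 1 }.keys.toList.sorted
```

Here repeated ends up as List(for, one), since those are the only two words that appear twice.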

File I/O

Input file:

    chevy
    ford
    tesla
    ferrari

To read the above input file, named input.dat, into a list:

import scala.io.Source
import java.io.{File, PrintWriter}

// get makes as a List of Strings, closing the file once it has been read
val source = Source.fromFile("input.dat")
val makes = try source.getLines().toList finally source.close()

// get makes as a comma separated String (reusing the list avoids reading the file twice)
val makesCsv = makes.mkString(",")

// Write the makes back to a file, separated by newline
val writer = new PrintWriter(new File("output.dat"))
try makes.foreach(make => writer.write(make + "\n")) finally writer.close()
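
On Scala 2.13 or later, scala.util.Using can close the files automatically. The following is a sketch under that assumption, not the only way to do it. The file names are the ones used above, makesFromFile is a name chosen for this example, and the input file is created first only so that the snippet runs on its own:

```scala
import java.io.{File, PrintWriter}
import scala.io.Source
import scala.util.Using

// Create the sample input so the snippet is self-contained
Using.resource(new PrintWriter(new File("input.dat"))) { writer =>
  List("chevy", "ford", "tesla", "ferrari").foreach(make => writer.write(make + "\n"))
}

// Read all lines; the Source is closed when the block finishes, even on failure
val makesFromFile = Using.resource(Source.fromFile("input.dat")) { source =>
  source.getLines().toList
}

// Write the lines back out; the PrintWriter is closed by Using as well
Using.resource(new PrintWriter(new File("output.dat"))) { writer =>
  makesFromFile.foreach(make => writer.write(make + "\n"))
}
```

Using.resource rethrows any exception after closing the resource. Using(...) { ... } is the variant that wraps the outcome in a Try instead.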

Trivia

A few bits of trivia…

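Immutable sets with up to four elements get dedicated classes (Set1 to Set4). From five elements on, a hash-based implementation takes over, shown here as HashTrieSet: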

scala> Set(1).getClass
res3: Class[_ <: scala.collection.immutable.Set[Int]] = class scala.collection.immutable.Set$Set1

scala> Set(1,2).getClass
res4: Class[_ <: scala.collection.immutable.Set[Int]] = class scala.collection.immutable.Set$Set2

scala> Set(1,2,3).getClass
res5: Class[_ <: scala.collection.immutable.Set[Int]] = class scala.collection.immutable.Set$Set3

scala> Set(1,2,3,4).getClass
res6: Class[_ <: scala.collection.immutable.Set[Int]] = class scala.collection.immutable.Set$Set4

scala> Set(1,2,3,4,5).getClass
res7: Class[_ <: scala.collection.immutable.Set[Int]] = class scala.collection.immutable.HashSet$HashTrieSet
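
To check this on your own installation, the class names can be collected in one go. This is only a sketch: setClasses is a name made up here, and the class reported for five elements depends on the Scala version (newer releases no longer have HashTrieSet), so no output is shown.

```scala
// Build immutable sets of size 0 to 5 and record the class behind each one
val setClasses = (0 to 5).map { n =>
  val set = (1 to n).toSet
  n -> set.getClass.getName
}

// Print one line per size
setClasses.foreach { case (size, className) => println(s"$size elements: $className") }
```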

Instead of checking emptiness using length, use the built-in methods:

scala> "Bob".length > 0
res103: Boolean = true

scala> "Bob".nonEmpty
res104: Boolean = true

scala> "Bob".isEmpty
res105: Boolean = false
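
The same methods exist on collections and on Option. For a List this is more than a matter of style, since length walks the whole list while isEmpty only looks at the first cell. A small sketch with made-up values:

```scala
val names = List("Bob", "Tom")
val noNames = List.empty[String]
val maybeName: Option[String] = None

val hasNames = names.nonEmpty          // true
val nothingThere = noNames.isEmpty     // true
val nameMissing = maybeName.isEmpty    // true
val namePresent = maybeName.nonEmpty   // false

// lengthCompare stops early when the question is only "more than n elements?"
val moreThanOne = names.lengthCompare(1) > 0  // true, the list has two elements
```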

Don’t nest a match inside a map. Pass the case clauses directly as a pattern-matching function literal (often loosely called a partial function) instead:

scala> List("Bob", "Tom").map { _ match { case s: String if s == "Tom" => true; case _ => false } }
res109: List[Boolean] = List(false, true)

scala> List("Bob", "Tom").map { case s: String if s == "Tom" => true; case _ => false }
res110: List[Boolean] = List(false, true)
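
When the case clauses do not cover every input, collect accepts a real PartialFunction and skips the elements that do not match. A sketch with a made-up mixed list (the names mixed, strings, isTom and flags are chosen for this example):

```scala
val mixed: List[Any] = List("Bob", 42, "Tom", 3.14)

// collect keeps only the elements for which a case matches
val strings = mixed.collect { case s: String => s }   // List(Bob, Tom)

// A partial function is a value and can report where it is defined
val isTom: PartialFunction[String, Boolean] = { case "Tom" => true }
val definedForBob = isTom.isDefinedAt("Bob")           // false

// For a plain equality test no pattern match is needed at all
val flags = List("Bob", "Tom").map(_ == "Tom")         // List(false, true)
```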

min and max can replace a hand-written reduceLeft when all you need is the smallest or largest value:

scala> List(1, 4, 5, 100).min
res111: Int = 1

scala> List(1, 4, 5, 100).max
res112: Int = 100
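
For comparison, here is a sketch of the reduceLeft versions that min and max replace, plus two related helpers. The value names are made up for this example. Note that min, max and reduceLeft all throw on an empty collection, which is why the Option-returning variant is shown as well:

```scala
val numbers = List(1, 4, 5, 100)

// The hand-written versions that min and max replace
val smallest = numbers.reduceLeft((a, b) => if (a < b) a else b)  // 1
val largest  = numbers.reduceLeft((a, b) => if (a > b) a else b)  // 100

// reduceLeftOption returns None for an empty list instead of throwing
val noNumbers = List.empty[Int]
val safeMax = noNumbers.reduceLeftOption(_ max _)                 // None

// maxBy picks the element with the largest derived value
val longestName = List("Bob", "Thomas", "Al").maxBy(_.length)     // Thomas
```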