Meet object-csv, a Strongly Typed CSV Helper for Scala

May 16, 2014


Yeah, I know this is a .NET blog, but recently my team at Ginger Software ventured into some Android coding. We didn’t want to use Java and preferred something with more functional capabilities, so Scala was the natural choice. One thing we use a lot in our C# projects is CSV files: they are much easier to read and write programmatically than Excel files, and our analysts can still work with them as if they were Excel files. Sadly, Scala was missing a library to read and write CSV files from and to objects, which was something we sorely missed, so I set out to build our own. It is not a fully fledged framework; it is tailored to our specific needs, but it can easily be modified to suit yours. It uses the scala-csv project as a dependency, and we call it object-csv.

Let’s say you defined this case class:

case class Person(name: String, age: Int, salary: Double, isNice: Boolean = false)

 

You can write a collection of Person to a .csv file this way:

import com.gingersoftware.csv.ObjectCSV._
//...
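// Case classes don't need `new`; isNice falls back to its default (false).
val person1 = Person("Doron,y", 10, 5.5)
val person2 = Person("David", 20, 6.5)
// fileName identifies the target .csv file.
writeCSV(IndexedSeq(person1, person2), fileName)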

 

This will generate the following CSV file:
#Name,Age,Salary,IsNice
"Doron,y",10,5.5,false
David,20,6.5,false
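To make the output above easier to reason about, here is a minimal sketch of how such lines could be produced. It is not the object-csv code (which delegates to scala-csv); the names `CsvWriteSketch`, `escape`, `toLine` and `headerLine` are hypothetical, and the column names are passed in by hand rather than discovered through reflection.

```scala
// Hypothetical sketch, not the object-csv implementation.
case class Person(name: String, age: Int, salary: Double, isNice: Boolean = false)

object CsvWriteSketch {
  // Quote a field only when it contains a comma, a double quote, or a line break;
  // embedded double quotes are doubled.
  def escape(field: String): String =
    if (field.exists(c => c == ',' || c == '"' || c == '\n' || c == '\r'))
      "\"" + field.replace("\"", "\"\"") + "\""
    else field

  // Render one case class instance as a CSV line by calling toString on every field.
  def toLine(row: Product): String =
    row.productIterator.map(value => escape(value.toString)).mkString(",")

  // Header line with the leading comment mark (#).
  def headerLine(columns: Seq[String]): String =
    "#" + columns.map(escape).mkString(",")
}
```

With this sketch, `CsvWriteSketch.toLine(Person("Doron,y", 10, 5.5))` quotes only the name field, because it is the only one containing a comma.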
In a similar manner, you can also read this CSV file as a collection of Person. Note that the order of the columns in the CSV file doesn’t matter: we use the header to match each value to the correct constructor argument. This makes your files more flexible: you can add columns or change their order, and your code won’t break.
val peopleFromCSV = readCSV[Person](fileName)
assert(peopleFromCSV === IndexedSeq(Person("Doron,y",10,5.5),Person("David",20,6.5)))
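The header matching described above can be sketched without reflection for a single known class. This is only an illustration under stated assumptions: `HeaderMatchSketch`, `columnIndex` and `parsePerson` are hypothetical names, header names are assumed to match constructor arguments case-insensitively, and lines are split on plain commas, so quoted fields such as "Doron,y" are not handled here.

```scala
// Hypothetical sketch, not the object-csv implementation.
case class Person(name: String, age: Int, salary: Double, isNice: Boolean = false)

object HeaderMatchSketch {
  // Map each header name (lower-cased, without the leading '#') to its column index.
  def columnIndex(headerLine: String): Map[String, Int] =
    headerLine.stripPrefix("#").split(",").map(_.trim.toLowerCase).zipWithIndex.toMap

  // Build a Person by looking up each constructor argument by column name,
  // so the physical column order in the file is irrelevant.
  def parsePerson(index: Map[String, Int], line: String): Person = {
    val cells = line.split(",", -1) // simplification: no support for quoted commas
    Person(
      name = cells(index("name")),
      age = cells(index("age")).toInt,
      salary = cells(index("salary")).toDouble,
      isNice = cells(index("isnice")).toBoolean
    )
  }
}
```

Because every value is fetched through the index map, a file whose header reads `#Age,IsNice,Name,Salary` yields the same Person as one in the original column order, and extra columns are simply ignored.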

 

Nice and simple, right? It serves our needs nicely, but it comes with some caveats:

1) It only works with Scala 2.11, as it uses scala.reflect, which wasn’t really stable on 2.10. So make sure you have set scalaVersion := "2.11.0" in your build.sbt.

2) It only works with case classes, because all the reflection is based on the case class’s primary constructor.

3) For reading, we currently support only the following data types: Int, Double, Boolean and String. We’ll probably add more as we need them. Writing works with any type, since we just call .toString on every value.

4) We can only read/write CSV files with headers, and the header must begin with the comment mark (#).

5) The API currently doesn’t expose ways to control the type of separator used in the CSV file, but it is very easy to add (the scala-csv project does support it).

6) We didn’t test it for speed; reading is likely to be slow, as it uses reflection heavily.
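To illustrate caveat 3, here is a rough sketch of converting a raw cell into one of the four supported types. It is an assumption about how such a conversion could look, not the library's code: `CellConversionSketch`, `CellType` and `convert` are hypothetical names, and returning an `Either` instead of throwing is a design choice made only for this example.

```scala
// Hypothetical sketch, not the object-csv implementation.
object CellConversionSketch {
  // The four types listed as supported for reading.
  sealed trait CellType
  case object IntCell extends CellType
  case object DoubleCell extends CellType
  case object BooleanCell extends CellType
  case object StringCell extends CellType

  // Convert one raw cell; Left carries a readable error instead of an exception.
  def convert(raw: String, cellType: CellType): Either[String, Any] =
    try {
      cellType match {
        case IntCell     => Right(raw.trim.toInt)
        case DoubleCell  => Right(raw.trim.toDouble)
        case BooleanCell => Right(raw.trim.toBoolean)
        case StringCell  => Right(raw)
      }
    } catch {
      // NumberFormatException is a subclass of IllegalArgumentException,
      // which is also what toBoolean throws on invalid input.
      case _: IllegalArgumentException => Left(s"Cannot read '$raw' as $cellType")
    }
}
```

Supporting another type in a scheme like this would mean adding one more case object and one more branch in the match.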

Still wanna use it? Great! Just add the following to your sbt file:
libraryDependencies += "com.gingersoftware" % "object-csv_2.11" % "0.1"

Or you can just browse the code and copy the classes (there are only 2) to your project. We’d appreciate any feedback; we’re pretty new to Scala development.


5 comments

  1. Eran, May 19, 2014 at 22:35

    Loved it!

    This is really cool. What is the license, by the way? Can I use this in a commercial project? 🙂

    Thanks

    1. Doron Yaacoby, May 25, 2014 at 10:58

      Thanks! You can use it for whatever you want. Thanks for reminding me to add the Apache 2.0 license to my GitHub readme 🙂

  2. Mark Lister, April 7, 2015 at 14:13

    I’ve written a similar framework for csv that differs from yours in a few ways:

    * No reflection.
    * No header requirement.
    * Columns can’t be reordered.
    * Works on case classes or tuples.

    Lack of reflection is good for performance and compatibility with scala-js. Read performance is not bad.

    product-collections

    1. Doron Yaacoby, April 8, 2015 at 19:00

      Thanks Mark, always good to have options. Just one question: if you’re not using reflection, how do you instantiate the case classes?

      1. Mark Lister, April 14, 2015 at 13:07

        Hi Doron, the companion objects of case classes extend FunctionN (well, at least for case classes with 22 or fewer arguments).
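As a small illustration of the point in this reply: a case class's `apply` method can be used as an ordinary function value, so a parsed row can be turned into an instance without reflection. The two-field `Person` and the names `FunctionSketch`, `build` and `fromTuple` are hypothetical, chosen only for this sketch; it does not show how product-collections is implemented.

```scala
// Hypothetical sketch; a reduced Person with two fields keeps the example short.
case class Person(name: String, age: Int)

object FunctionSketch {
  // The constructor as a plain function value, no reflection involved.
  val build: (String, Int) => Person = Person.apply

  // The same function taking a tuple, handy once a row has been parsed into one.
  val fromTuple: ((String, Int)) => Person = build.tupled
}
```

For example, `FunctionSketch.fromTuple(("David", 20))` builds the same value as `Person("David", 20)`.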
