Recent Articles

Beginner's Guide to Columnar File Formats in Spark and Hadoop

File formats can be confusing, so let's delve into columnar file formats (like Parquet) and explain why they're different from regular formats (like CSV, JSON, or Avro).

A Quick Guide to Concurrency in Scala

I'll talk through the basics of Threads, Akka, Futures, and Timers in this quick overview of concurrency for Scala. Great for those building apps in Scala.

4 Fun and Useful Things to Know about Scala's apply() functions

Scala's apply functions are commonly seen alongside case classes, but they can do so much more. Here are 4 fun ways they are used in Scala.

10+ Great Books and Resources for Learning and Perfecting Scala

While Scala is amazing, it has an overwhelming number of features. These books and online resources will help you learn and perfect Scala, whether you're coming from Java, Python, Ruby, or any other language.

10+ Great Books for Apache Spark

Apache Spark is a powerful technology with some fantastic books. I'll help you choose which book to buy with my guide to the top 10+ Spark books on the market.

A Beginner's Guide to Hadoop Storage Formats (or File Formats)

I'll walk through what we mean when we talk about 'storage formats' or 'file formats' for Hadoop and give you some initial advice on what format to use and how.