Spark 46
Spark Streaming with Kafka Example
Spark Streaming with Kafka is becoming so common in data pipelines these days that it’s difficult to find one without the other. This tutorial presents an example of streaming from Kafka with Spark. In this example, we’ll be feeding weather data into Kafka and then processing this data from Spark Streaming in Scala. As the data […]
Spark Command Line Arguments in Scala Example
The primary reason to use Spark command line arguments is to avoid hard-coding values into our code. As we know, hard-coding should be avoided because it makes our application rigid and harder to change. For example, let’s assume we want to run our Spark job in both test and production environments. Let’s […]
Spark Performance Monitoring with Metrics, Graphite and Grafana
Spark is distributed with the Metrics Java library, which can greatly enhance your ability to diagnose issues with your Spark jobs. In this post, we’ll cover how to configure Metrics to report to a Graphite backend and view the results with Grafana. Optional, 20 Second Background: If you already know about Metrics, Graphite and Grafana, […]
Spark Broadcast and Accumulator Examples in Scala
Spark Broadcast and Accumulator Overview: So far, we’ve learned about distributing processing tasks across a Spark cluster. But let’s go a bit deeper into a couple of approaches you may need when designing distributed tasks. I’d like to start with a question. What do we do when we need each Spark worker task to coordinate certain […]
IntelliJ Scala and Apache Spark – Well, Now You Know
IntelliJ Scala and Spark Setup Overview: In this post, we’re going to review one way to set up IntelliJ for Scala and Spark development. The IntelliJ Scala combination is the best free setup for Scala and Spark development. And I have nothing against ScalaIDE (Eclipse for Scala) or using editors such as Sublime. I switched from […]
Spark Streaming Testing with Scala Example
Spark Streaming Testing: How do you create and automate tests of Spark Streaming applications? In this post, we’ll show an example of one way in Scala. This post is heavy on code examples and has the added bonus of using a code coverage plugin. Are the example tests in this tutorial unit tests? Or, are […]
Learning Spark PDF
So, I’ve noticed “Learning Spark PDF” is a search term that comes up on this site. Can someone help me understand what people are looking for when using this phrase? Are readers looking for the Learning Spark: Lightning-Fast Big Data Analysis book from O’Reilly? Perhaps looking for the new Apache Spark with Scala Tutorial book? It’s […]