Spark Machine Learning – Chapter 11 Machine Learning with MLlib

Categories: Spark

Spark machine learning is contained within Spark MLlib, Spark’s library of machine learning (ML) functions designed to run in parallel on clusters. MLlib contains a variety of learning algorithms. The topic of machine learning itself could fill many books, so this chapter instead explains ML in Apache Spark. This post is an excerpt…

The Mythical Man Month Summary Chapter 2

Categories: Summary Series

Why do software projects go awry? There are five common elements in the answer to this question: Techniques for estimating are poorly developed. Effort does not always equal progress. Our estimates are not stubborn enough. The overall progress of the schedule is often poorly monitored. The traditional response of adding more manpower to late projects…

Data Science From Scratch Summary: New Book

Categories: Summary Series

Chapter 11: Machine Learning. This post is an excerpt from our book, Data Science From Scratch Summary. Many people believe data science is machine learning and that data scientists mostly build, train, and tweak machine-learning models. In reality, data science is mostly about addressing business problems by collecting, understanding, cleaning, and formatting data. But, once…

Spark Streaming from Learning Spark Chapter 10

Categories: Spark, Summary Series

Spark Streaming. Applications based on Spark Streaming can track statistics about page views in real time, train a machine learning model, or automatically detect anomalies. The abstraction in Spark Streaming is called a DStream, or discretized stream. A DStream is a sequence of data that arrives over time. Internally, each DStream is represented as a sequence of…

Clean Code Summary – Sample Chapter

Categories: Summary Series

The following is Chapter 3 of the Clean Code Summary book, available from Amazon. It was written and designed for experienced software engineers and managers looking to save time and learn key concepts from the critically acclaimed software engineering book, Clean Code: A Handbook of Agile Software Craftsmanship. Chapter 3: Functions. A function is a type of procedure…

The Pragmatic Programmer Summary – Chapter 1 – A Pragmatic Philosophy

Categories: Summary Series

The following is Chapter 1 of The Pragmatic Programmer Summary book, available now on Amazon. It was written and designed for people looking to save time and learn key concepts from the classic software engineering book, The Pragmatic Programmer: From Journeyman to Master. Chapter 1: A Pragmatic Philosophy. Pragmatic programmers are distinguished by their attitude, style…

Learning Spark: Lightning-Fast Big Data Analysis – Developer Deconstructed – Chapter 7 – Running on a Cluster

Categories: Spark

This is the seventh post in the Learning Spark book summary series[1]. Chapter 7: Running on a Cluster. A feature of Spark is the ability to run computations in parallel across many machines in cluster mode. Even better, parallelized applications are written with the same API as the previously shown examples. Spark can run on…

Learning Spark: Lightning-Fast Big Data Analysis – Developer Deconstructed – Chapter 6

Categories: Spark, Summary Series

This is the sixth post in the Learning Spark book summary series[1]. Chapter 6: Advanced Spark Programming. Overview: Two types of shared variables are introduced: accumulators, which aggregate information, and broadcast variables, which efficiently distribute large values. The examples use ham radio operators’ call logs as the input. Dividing work on a per-partition basis allows us…
