Finding a Spark at Yahoo!

Recently I had an opportunity to learn a little more aboutĀ Apache Spark, a new in-memory cluster computing system originally developed at theĀ UC Berekeley AMPlab. By moving data into memory, Spark improves performance for tasks like interactive data analysis and iterative machine learning. These improvements are especially pronounced when comparing them to a batch oriented, disk-bound system like Apache Hadoop. While Spark has seen rapid adoption at a number of companies, I learned how Yahoo! has started integrating Spark into its analytics.

Read More…

Copyright © Nick Heudecker

Built on Notes Blog Core
Powered by WordPress