Showing posts with label Storm. Show all posts

Friday, 12 December 2014

Storm Blueprints: Patterns for Distributed Real-time Computation

A very useful book for anyone interested in Storm for real-time processing. I particularly benefited from the practical use cases of Storm, especially those on Storm Trident. Compared to the other books on Storm that I have read so far, this one seems to offer very good tutorials on several aspects of Trident, which answered some of my puzzles over Trident, such as:


  • How to implement a Trident State
  • How to implement a Trident State Factory
  • How to implement a Trident State Updater
  • How to effectively use combiners, reducers, and aggregators
  • How to implement a Trident non-transactional topology, repeat transactional topology, and opaque Trident map state
  • How to implement a Trident spout, coordinator, emitter
  • How to implement recursive functions in Trident


While the later chapters involving Druid and Hadoop were a bit difficult for me to assimilate at this stage due to time constraints, I would definitely like to read those chapters again.

https://www.packtpub.com/big-data-and-business-intelligence/storm-blueprints-patterns-distributed-real-time-computation

Tuesday, 25 November 2014

Learning Storm

Just completed the book "Learning Storm". A very nice read; interested readers can buy it at the following link:

https://www.packtpub.com/big-data-and-business-intelligence/learning-storm

The book covers a lot of ground, showing in an easy-to-understand way how other technologies work with Storm, including ZooKeeper, Kafka, Hadoop, YARN, Ganglia, JMX, HBase, Redis, and MySQL. I especially like the way it teaches Trident, which makes the concept much easier to grasp, and the last chapter on machine learning is extremely useful.

Most readers can go through the book chapter by chapter for a slow and full exposure. For someone like me, who always likes to delve directly into practice, the best approach is actually to read the book three times, each time skipping some chapters.

The first time, the reader should go through chapters 1 to 4, skipping the Thrift library introduction in chapter 3, and then jump directly to chapter 8, which gives an example of log processing in Storm. After practicing the simple cases in these chapters, the reader will have built up a level of confidence.

The second time, the reader should go through chapters 5 and 9 to get a good idea of what Trident is and how it works, as well as how to do machine learning using Trident.

The third time, the reader can optionally go through the Thrift library in chapter 3, then go to chapter 7, which shows rich tools for interacting with Storm, such as JMX and Ganglia. Finally, if there is a need for integration with Hadoop, go to chapter 6 and some other parts of chapter 7.

Wednesday, 12 November 2014

Getting Started with Storm

Just finished reading the "Getting Started with Storm" book. I would say this is one of the easiest-to-follow books I have read, yet it provides a good basic understanding of working with Storm, a distributed system for processing streaming data. A good read, totally recommended for anyone interested in learning Storm.

http://www.amazon.com/Getting-Started-Storm-Jonathan-Leibiusky/dp/1449324010