In my last post, I described the basics of processing Adobe Analytics Click Stream Data Feeds using Hadoop. While the solutions outlined there will scale remarkably well, there is a more memory-efficient way to do it. Having this flexibility is nice if you have lots of CPU cores available but not as much RAM. […]
Tag: hadoop
Introduction to Processing Click Stream Data Feeds with Hadoop and Map/Reduce
In an earlier post, Matt Moss showed how to process data feeds using an SQL database. This can be useful in a pinch when you have a smaller amount of data and need an answer quickly. What happens, though, when you need to process the data at a large scale? For example, you […]