hadoop Archives - Trevor With Data

Using Secondary Sort to Enhance Adobe Data Feed Processing in Hadoop

June 13, 2017 | April 30, 2022 | Jared Stevens

In my last post, I described the basics of processing Adobe Analytics Click Stream Data Feeds using Hadoop. While the solutions outlined there scale remarkably well, there is a more memory-efficient way to do it. Having this flexibility is nice if you have lots of CPU cores available but not as much RAM. […]

Introduction to Processing Click Stream Data Feeds with Hadoop and Map/Reduce

June 6, 2017 | April 30, 2022 | Jared Stevens

In an earlier post, Matt Moss showed how to process data feed data using an SQL database. This can be useful in a pinch when you have a smaller amount of data and need an answer quickly. What happens, though, when you need to process the data at a large scale? For example, you […]