As of February 16 of this year, you can set up and manage Adobe Analytics data feeds right in the UI. This is super helpful, because you no longer have to rely on Adobe Customer Care to set them up for you! If you don’t know what data feeds are, they’re basically the log files of every hit sent to a particular Adobe Analytics report suite.
Data feeds are useful when you want to do advanced types of analysis that aren’t currently possible in the Adobe Analytics interface, or when you need to join your Analytics data with other datasets in a backend BI system. Be prepared, though: dealing with data feeds is not for the faint of heart. It can be fairly difficult to manage the sheer volume of data, and it’s tricky to reproduce the reports that Adobe Analytics generates so quickly (part of the reason Jared and I created this blog!).
So, given you’re brave enough to continue, here’s how you set it up. First, you’ll want to navigate to the new data feed component manager:
From there, you’ll click the little plus symbol to create a new data feed:
When setting up a new data feed, you’ll notice a few important options.
- Feed Name – This is the name of the data feed you’ll set up, for example “Dataset for cluster analysis March 2017”. I really like to include the purpose of the dataset and the date range covered in case I need to reuse the data another time.
- Report Suite – This is the report suite you want to pull the log data from.
- Email Address – Self-explanatory; put your email in there to be notified when a file is delivered or if there are failures.
- Feed Interval – This one sets how often a data feed file is added to your destination. If your application or analysis requires frequent data updates (which is not common for me), you can select “Hourly”; otherwise, I prefer “Daily”.
- Delay Processing – I typically set this to “No Delay”, but if you have a ton of data feed requests, it’s sometimes a good idea to add a delay so that you’re not hammering the data feed servers with a bajillion requests all at once every day.
- Start & End Dates – These are the dates for which you want data feed data. If you check “Continuous Feed”, it’ll keep sending you data feed files in perpetuity every day (or hour) into the future. Typically, you only want to do that if you have a BI process to pick those files up and do something with them automatically. Otherwise, I’d leave this unchecked.
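To get a feel for how the interval and date range interact before you commit to a request, here’s a rough sketch that counts how many deliveries a fixed (non-continuous) feed would produce. The function name `expected_deliveries` is made up for illustration, and it assumes the start and end dates are both inclusive and that each interval produces one delivery; check a real request to confirm how your feed behaves.

```python
from datetime import date


def expected_deliveries(start: date, end: date, interval: str = "daily") -> int:
    """Count deliveries for a fixed date range.

    Assumption: start and end dates are both inclusive, and each interval
    (day or hour) yields exactly one delivery.
    """
    if end < start:
        raise ValueError("end date must not precede start date")
    per_day = {"daily": 1, "hourly": 24}[interval]
    days = (end - start).days + 1  # +1 because both ends are counted
    return days * per_day


march_start, march_end = date(2017, 3, 1), date(2017, 3, 31)
print(expected_deliveries(march_start, march_end, "daily"))   # 31
print(expected_deliveries(march_start, march_end, "hourly"))  # 744
```

With the “Multiple Files” packaging option, a single delivery can contain several files, so treat these numbers as a lower bound on the file count.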
Finally, you can configure your destination settings: FTP, SFTP, or S3.
When configuring your data feed, you’ll have to select the data feed columns you want to include. There are a ton of options here, so it’s really easy to get overwhelmed (explaining each of these is probably outside the scope of this post). Be careful about just selecting all of them – that can dramatically increase the disk space and processing times you’ll need to do what you want.
That said, if disk space and processing times are less of a concern for you (the cost of both is dropping every day), it is certainly more convenient and flexible to select all columns. If you want to use all columns, I recommend selecting the latest “All Standard” columns if you’re an Analytics Standard customer, or the latest “All Premium” columns if you’re an Analytics Premium customer.
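If you do end up with more columns than you need, you can still trim them when you read the data. Here’s a minimal sketch, assuming the hit data arrives as a gzipped, tab-delimited file with no header row and the column names live in a separate one-line, tab-delimited headers file. The function name `read_hits` and the file names in the usage comment are placeholders of mine, not something the feed guarantees.

```python
import csv
import gzip
from typing import Iterator


def read_hits(data_path: str, headers_path: str, keep: list[str]) -> Iterator[dict[str, str]]:
    """Yield one dict per hit, containing only the columns named in `keep`."""
    # Assumption: the headers file holds a single tab-delimited row of column names.
    with open(headers_path, newline="", encoding="utf-8") as f:
        headers = next(csv.reader(f, delimiter="\t"))

    # Resolve column positions once, so each row only needs index lookups.
    positions = [(name, headers.index(name)) for name in keep]

    # QUOTE_NONE treats quote characters inside values as ordinary text.
    with gzip.open(data_path, "rt", newline="", encoding="utf-8", errors="replace") as f:
        for row in csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE):
            yield {name: row[i] for name, i in positions}


# Usage (hypothetical paths and illustrative column names):
# for hit in read_hits("hit_data.tsv.gz", "column_headers.tsv",
#                      ["post_visid_high", "post_visid_low", "post_pagename"]):
#     print(hit)
```

Because it’s a generator, this streams the file one row at a time instead of loading everything into memory, which matters once the files get big.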
To check out a comprehensive list of columns and what they all mean, you can always grab them from Adobe’s documentation. To select the columns you want, just multi-select or search them from the left menu and add them to the right:
I typically set compression format to Gzip (but you can use zip if you want), the packaging type to “Multiple Files” (that’s definitely best if you’re going to be loading them into Spark or Hadoop), and I also recommend including the manifest file to be sure you received the entire file. A mistake people sometimes make is to download a file from FTP or SFTP while Adobe is still uploading it. This seems unlikely, but it happens! The manifest file only appears after the data has finished transferring and contains the MD5 hashes for the data so you can be sure you have it all.
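Here’s a rough sketch of how you might use the manifest to check a delivery once it lands. It assumes the manifest is a plain-text file in the same folder as the data, where each `Data-File:` or `Lookup-File:` line is followed by an `MD5-Digest:` line; that layout is my assumption, so open one of your own manifests and adjust the keys if yours looks different. The function names are made up for illustration.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute a file's MD5 in chunks so large files don't fill memory."""
    digest = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_manifest(text: str) -> dict[str, str]:
    """Map file name -> expected MD5 (assumed 'Key: value' manifest layout)."""
    expected: dict[str, str] = {}
    current = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank or non key/value line
        key, value = key.strip(), value.strip()
        if key in ("Data-File", "Lookup-File"):
            current = value
        elif key == "MD5-Digest" and current is not None:
            expected[current] = value.lower()
            current = None
    return expected


def verify_delivery(manifest_path: Path) -> list[str]:
    """Return the names of files that are missing or fail the MD5 check."""
    expected = parse_manifest(manifest_path.read_text(encoding="utf-8"))
    folder = manifest_path.parent
    failed = []
    for name, md5 in expected.items():
        candidate = folder / name
        if not candidate.is_file() or md5_of(candidate) != md5:
            failed.append(name)
    return failed
```

An empty list from `verify_delivery` means every file named in the manifest is present and matches its hash. Since the manifest only shows up after the transfer finishes, it also makes a handy “safe to start processing” signal.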
Once you’ve completed that, you should see the data feed request appear in your component list and you can check the status under the “Jobs” tab. When it’s done, it’ll say “Complete”, and you’re all set.
Happy data-feeding!