Persist DataLayer in Google Tag Manager

A script that lets you persist dataLayer variables from one page to the next when using Google Tag Manager.

(Update 19 November 2018: See this article for a more elegant solution.)

If you know your JavaScript, you know that all variables, functions, objects, resources, and data in the document get rewritten with every page load. In other words, every single page refresh builds the page from scratch, and the state of the document before the page refresh is left drifting in the ocean of oblivion.

Google Tag Manager’s dataLayer is also one such entity. It gets rewritten with every page load, so it’s not possible to have a dataLayer variable persist from one page to the other without using cookies or, as I’m going to show in this guide, the HTML5 Web Storage API. For web analytics, this is a bit of a shame. It forces us to think of things in the scope of a single page, when rarely anything worth mentioning has a lifespan that short.

Before we begin, take a look at what Shay Sharon expertly wrote two years ago about the persistent dataLayer. In his solution, cookies are used to carry the dataLayer across page refreshes. It’s a good patch, and it performs the task admirably. However, the thing with cookies is that they can be deleted (and usually are). Also, the length of the cookie string is limited (though 4KB is still plenty), and there’s always the fact that cookies are sent to the server with every single HTTP request (they travel in the request headers).

Data in browser storage, on the other hand, is actually pretty difficult to delete on a granular level, it has a huge quota, and it’s only retrieved and stored on demand. Also, using localStorage (more about this soon), you have the data stored indefinitely, so user-level storage is very easy to implement.

(UPDATE: Remember that browser local storage should be treated the same as cookies when observing the cookie laws of your country. Thanks to Martijn Visser for pointing this out (he’s from the Netherlands, and they have pretty much the strictest interpretation of the cookie law in place.))

In this guide, I’ll give you an API of sorts to save, load, replace, and delete items in localStorage. I use a 30-minute expiration for the stored data to mimic session-length storage.

Introducing localStorage

localStorage stores data indefinitely in your browser’s own storage compartment. It has a larger-than-needed size limit (around 5MB), and unlike cookies, the data never leaves the browser unless a script sends it somewhere. Cookies, you see, are sent along with every HTTP request to the site, and this introduces potential security risks.

Items in localStorage, on the other hand, are retrieved only when called. This way all the data you store will be picked up on demand, and all you need to do is make sure your script handles the data correctly (which is what I’ll do for you in this guide).

In localStorage, data is written as key-value pairs, but unlike with dataLayer, both the key and the value need to be of String type. This means that in order to save stuff from dataLayer, which can hold values of any type, to localStorage, some object serialization and type conversion need to take place (Object => String). Similarly, the reverse needs to take place when loading data from localStorage and pushing it to dataLayer.
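Here is a minimal sketch of that conversion as a round trip through a string-only store. The memoryStorage object is a hypothetical stand-in for localStorage (an assumption, so that the snippet also runs outside a browser); it coerces values to strings the same way the Storage API does.

```javascript
// Hypothetical in-memory stand-in for localStorage (an assumption for this sketch).
// Like the real Storage API, it coerces every value to a string.
var memoryStorage = {
  data: {},
  setItem: function(key, value) { this.data[key] = String(value); },
  getItem: function(key) {
    return this.data.hasOwnProperty(key) ? this.data[key] : null;
  }
};

var vars = {'pageCount': 5, 'tags': ['gtm', 'storage'], 'loggedIn': true};

// Storing the object directly loses everything: it is coerced to a string
memoryStorage.setItem('naive', vars);
var naive = memoryStorage.getItem('naive'); // '[object Object]'

// Serializing first (Object => String) keeps the structure...
memoryStorage.setItem('persistDL', JSON.stringify(vars));

// ...and parsing (String => Object) restores the original types
var restored = JSON.parse(memoryStorage.getItem('persistDL'));
```

In a browser you would swap memoryStorage for localStorage; the two calls to JSON stay the same.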

The Persistent dataLayer API

Without further ado, allow me to introduce the Persistent dataLayer API. Copy the following code into a Custom HTML Tag, and read on.

<script>
  (function() {
    // Declare some utility variables
    var retrievedDL = '',
        getDL = {},
        saveDL = {},
        persistEvent = /^persist(Save|Replace)/,
        timeNow = new Date().getTime(),
        timeStorage = '',
        persistTime = 1000*60*30; // Expiration in milliseconds; set to null to never expire
    
    // Only works if browser supports Storage API
    if(typeof(Storage)!=='undefined') {
        
      retrievedDL = localStorage.getItem('persistDL');
      timeStorage = localStorage.getItem('persistTime');
      
      // Append current dL with objects from storage
      var loadDL = function() {
        if(retrievedDL) {
          dataLayer.push(JSON.parse(retrievedDL));
          // dataLayer.push({'event': 'DLLoaded'});
        }
      }
      
      // Save specified object in storage
      var storeDL = function() {
        for (var i = 0; i < dataLayer.length; i++) {
          if (persistEvent.test(dataLayer[i].event)) {
            saveDL = dataLayer[i];
            delete saveDL.event;
            getDL = JSON.parse(retrievedDL) || {};
            for (var key in saveDL) {
              if (saveDL.hasOwnProperty(key)) {
                getDL[key] = saveDL[key];
              }
            }
            localStorage.setItem('persistDL', JSON.stringify(getDL));
          }
        }
      }
      // Delete all stored variables
      var deleteDL = function() {
        localStorage.removeItem('persistDL');
      }
      // Choose the action based on the event that fired the tag
      switch ({{event}}) {
        case 'gtm.js':
          if (retrievedDL && timeStorage) {
            if (persistTime && timeNow > Number(timeStorage) + persistTime) {
              deleteDL();
            } else {
              loadDL();
            }
          }
          break;
        // Delete dataLayer variables
        case 'persistDelete':
          deleteDL();
          break;
        // Replace dataLayer variables
        case 'persistReplace':
          retrievedDL = null;
          // No break here: fall through to persistSave on purpose
        // Save dataLayer variables
        case 'persistSave':
          storeDL();
          break;
      }
      
      localStorage.setItem('persistTime', JSON.stringify(timeNow));
    }
  })();
</script>

The tag will need the following firing rule:

{{event}} matches RegEx ^(gtm\.js|persist(Save|Replace|Delete))$

Here’s a rundown of how the API works, before I go into the technical stuff:

  1. With every page load, dataLayer variables stored in localStorage are retrieved and pushed into dataLayer.

  2. When pushing an object with ‘event’: ‘persistSave’, all variables in the same push (e.g. ‘pageCount’ and ‘author’ in dataLayer.push({'event': 'persistSave', 'pageCount': '5', 'author': 'Simo-Ahava'});) are saved to localStorage. If a variable with the same name already exists in localStorage, it is updated with the new value.

  3. When pushing an object with ‘event’: ‘persistDelete’, all dataLayer variables in localStorage are deleted.

  4. When pushing an object with ‘event’: ‘persistReplace’, all existing dataLayer variables in localStorage are deleted, and all variables in the dataLayer.push() are saved.

So remember the following commands:

dataLayer.push({'var1': 'value1', 'var2': 'value2', 'event': 'persistSave'}); stores ‘var1’ and ‘var2’ (with values) to localStorage.

dataLayer.push({'var1': 'value1', 'var2': 'value2', 'event': 'persistDelete'}); deletes all saved dataLayer variables in localStorage; doesn’t store anything.

dataLayer.push({'var1': 'value1', 'var2': 'value2', 'event': 'persistReplace'}); deletes all saved dataLayer variables in localStorage; stores ‘var1’ and ‘var2’ (with values) to localStorage.
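To make the effect of the three commands concrete, here is a rough sketch that models only the stored object. The function applyPersistEvent is a hypothetical helper of mine, not part of the tag; it returns what would sit in storage after a given push, without touching localStorage or dataLayer.

```javascript
// Hypothetical helper (not part of the tag): models the stored object only.
var applyPersistEvent = function(stored, pushed) {
  var result = {}, key;
  if (pushed.event === 'persistDelete') {
    return null; // everything is removed, nothing is stored
  }
  if (pushed.event !== 'persistSave' && pushed.event !== 'persistReplace') {
    return stored; // not a persist command: nothing changes
  }
  if (pushed.event === 'persistSave') {
    // Start from what is already stored
    for (key in stored) {
      if (stored.hasOwnProperty(key)) { result[key] = stored[key]; }
    }
  }
  // persistSave and persistReplace both store the pushed variables
  for (key in pushed) {
    if (pushed.hasOwnProperty(key) && key !== 'event') { result[key] = pushed[key]; }
  }
  return result;
};

var step1 = applyPersistEvent(null, {'var1': 'value1', 'event': 'persistSave'});
// step1: {var1: 'value1'}
var step2 = applyPersistEvent(step1, {'var2': 'value2', 'event': 'persistSave'});
// step2: {var1: 'value1', var2: 'value2'}
var step3 = applyPersistEvent(step2, {'var3': 'value3', 'event': 'persistReplace'});
// step3: {var3: 'value3'}
var step4 = applyPersistEvent(step3, {'var4': 'value4', 'event': 'persistDelete'});
// step4: null
```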

A few additional details:

  • With every page load, variables are loaded from localStorage and pushed to dataLayer. Even though the tag itself fires on {{event}} equals gtm.js, there’s a slight delay when processing data through the Storage API. This means that usually the variables appear in dataLayer after gtm.dom but before gtm.load.

  • Every interaction with localStorage resets the 30-minute expiration timer. If a page load occurs more than 30 minutes after the last interaction, all saved dataLayer variables in localStorage are deleted.

  • Before saving the data, dataLayer variables are serialized into a JSON String. When loading from localStorage, they are parsed from JSON back to their original types. Thus Arrays, objects, and primitive values are restored to their original types before being pushed back into dataLayer. (Functions and undefined values are not valid JSON, so they are dropped when saving.)

Technical stuff

Let’s go over the code (almost) line-by-line.

(function() {
...
})();

The function is wrapped in an IIFE (immediately invoked function expression). This is because I want to avoid using the global scope when utilizing so many different variables. Scoping the variables to this function ensures that I don’t mess with global variables of some other library, for example.
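A tiny illustration of the scoping, in case the pattern is new to you. The variable name is borrowed from the tag; the leaked variable exists only for this example.

```javascript
(function() {
  // Local to this function: invisible to other scripts on the page
  var persistTime = 1000 * 60 * 30;
})();

// Outside the IIFE the variable does not exist
var leaked = typeof persistTime; // 'undefined'
```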

var retrievedDL = '',
getDL = {},
saveDL = {},
persistEvent = /^persist(Save|Replace)/,
timeNow = new Date().getTime(),
timeStorage = '',
persistTime = 1000*60*30; // Expiration in milliseconds; set to null to never expire

Here I introduce a bunch of utility variables. If you want to have the storage persist for ever and ever, set persistTime = null;.

if(typeof(Storage)!=='undefined') {
  retrievedDL = localStorage.getItem('persistDL');
  timeStorage = localStorage.getItem('persistTime');
  ...
}

Only run the API if the browser supports HTML5 Web Storage. It’s basically only a problem with IE versions older than 8. I saw no reason to provide an alternative for them, since if you’re still using IE7 or older, you deserve a horrible browsing experience.

Check Shay Sharon’s excellent post I linked to in the beginning for a cookie-solution to the persistent dataLayer. That should work with your crappy, out-of-date browser.
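One caveat about the typeof check: a browser can support the Storage API and still refuse to use it, for example when the user has blocked site data or the quota is full. In those cases reading or writing localStorage throws an exception. If you want to guard against that too, a stricter check could look like the sketch below. The helper storageAvailable and its test key are my own inventions, not part of the tag.

```javascript
// Hypothetical helper: true only if the storage object actually accepts writes.
// getStorage is a function because merely reading window.localStorage may throw.
var storageAvailable = function(getStorage) {
  var testKey = '__persistTest__';
  try {
    var storage = getStorage();
    storage.setItem(testKey, '1');
    storage.removeItem(testKey);
    return true;
  } catch (e) {
    return false;
  }
};

// Stand-ins so the sketch runs anywhere (assumptions for this example)
var working = {setItem: function() {}, removeItem: function() {}};
var blocked = function() { throw new Error('SecurityError'); };

var ok = storageAvailable(function() { return working; }); // true
var notOk = storageAvailable(blocked);                      // false
```

In the tag you would call it as storageAvailable(function() { return window.localStorage; }) in place of the typeof test.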

var loadDL = function() {
  if(retrievedDL) {
    dataLayer.push(JSON.parse(retrievedDL));
    // dataLayer.push({'event': 'DLLoaded'});
  }
}
...
case 'gtm.js':
  if(retrievedDL && timeStorage) {
    if(persistTime && timeNow > Number(timeStorage)+persistTime) {
      deleteDL();
    } else {
      loadDL();
    }
  }
  break;

With every page load, parse the dataLayer variables in localStorage back to their original types, and push them into dataLayer as a single object. If you want to set a trigger event to fire your tags after the variables have been loaded from localStorage, uncomment the line with ‘event’: ‘DLLoaded’. This way you can have your dependent tags fire on {{event}} equals DLLoaded to ensure that they have access to the stored variables.
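Note that JSON.parse throws if the stored string is not valid JSON (say, someone edited it by hand in the browser’s developer tools). Here is a rough sketch of a more defensive loadDL with the trigger event enabled. It is a variant of mine, not the code in the tag: it takes the raw string as an argument, and the dataLayer array is a stand-in so that the snippet runs anywhere.

```javascript
// Stand-in for the page's dataLayer (an assumption for this sketch)
var dataLayer = [];

// Hypothetical variant of loadDL: ignores unusable data instead of throwing
var loadDL = function(raw) {
  var parsed;
  if (!raw) { return false; }
  try {
    parsed = JSON.parse(raw);
  } catch (e) {
    return false; // corrupted entry: push nothing
  }
  if (!parsed || typeof parsed !== 'object') { return false; }
  dataLayer.push(parsed);
  dataLayer.push({'event': 'DLLoaded'}); // trigger for dependent tags
  return true;
};

var first = loadDL('{"pageCount":"5","author":"Simo-Ahava"}');
// dataLayer now holds the restored object followed by the DLLoaded event
var second = loadDL('{broken');
// second is false and dataLayer is unchanged
```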

Also, if the variables in storage expire (default is 30 minutes since last interaction), no data is loaded and the variables are deleted from localStorage. If you’ve set persistTime = null;, then there’s no expiration for the data in storage, and the variables are stored until they are manually deleted.
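The expiration test itself is a single comparison. Pulled out into a function (isExpired is a hypothetical name, the tag does this inline), it can be sketched and checked like this:

```javascript
// Hypothetical helper mirroring the expiration check in the tag.
// persistTime is in milliseconds; null means "never expire".
var isExpired = function(timeStorage, timeNow, persistTime) {
  return !!persistTime && timeNow > Number(timeStorage) + persistTime;
};

var THIRTY_MINUTES = 1000 * 60 * 30;
var lastInteraction = '1000000'; // localStorage always returns strings

// Exactly 30 minutes later: not expired yet
var atLimit = isExpired(lastInteraction, 1000000 + THIRTY_MINUTES, THIRTY_MINUTES);
// One millisecond over: expired
var overLimit = isExpired(lastInteraction, 1000000 + THIRTY_MINUTES + 1, THIRTY_MINUTES);
// persistTime = null: never expires, no matter how much time has passed
var neverExpires = isExpired(lastInteraction, 9999999999999, null);
```

The Number() conversion matters: without it, the + operator would concatenate the stored string and persistTime instead of adding them.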

When an object containing the property ‘event’: ‘persistSave’ is pushed into dataLayer, the function storeDL() is run.

First, the function removes the ‘event’ property from the object that was pushed into dataLayer. This is done because you don’t want to store the trigger event itself in localStorage. (Another possibility is to just skip the ‘event’ property when storing properties into localStorage.)

Next, each property in this dataLayer object is pushed into the object that was found in localStorage. Thus, all variables that were already stored are updated with new values, and new variables are appended.

Finally, the object, now serialized into a string of keys and values, is stored in localStorage, waiting to be loaded with a new page refresh.

If the ‘event’ was ‘persistReplace’, then storeDL() is run as well, but all dataLayer variables in storage are replaced with the new variables.
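As for the alternative mentioned above, skipping the ‘event’ property instead of deleting it, here is a minimal sketch. The helper withoutEvent is hypothetical; the point is that the object sitting in dataLayer is left exactly as it was pushed.

```javascript
// Hypothetical helper: copy every own property except 'event'
var withoutEvent = function(obj) {
  var copy = {};
  for (var key in obj) {
    if (obj.hasOwnProperty(key) && key !== 'event') {
      copy[key] = obj[key];
    }
  }
  return copy;
};

var pushed = {'event': 'persistSave', 'pageCount': '5'};
var toStore = withoutEvent(pushed);
// toStore is {pageCount: '5'}; pushed.event is still 'persistSave'
```

One thing to keep in mind if you go this way: storeDL() loops through the whole dataLayer, so with the ‘event’ property left in place, older persistSave pushes on the same page would match again and be re-saved on every run. You would need another way to pick out only the latest push.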

var deleteDL = function() {
  localStorage.removeItem('persistDL');
}
...
case 'persistDelete':
  deleteDL();
  break;

If the trigger event was ‘persistDelete’, all dataLayer variables in localStorage are deleted.

localStorage.setItem('persistTime', JSON.stringify(timeNow));

Finally, every time the script is run, the current time is saved as a timestamp in localStorage.

Conclusions

I’ve noticed that many people have been aching for a persistent dataLayer. I’m pretty sure this post will become obsolete as soon as the GTM team choose to deploy such a feature into the product itself, but until then, this should serve you well.

And if nothing else, at least you got to learn about yet another really cool JavaScript API!

Do you have suggestions for the API? Or maybe you have a sweet use case for persistent variables that you might want to share with others?