#GTMTips: Use localStorage For Client ID Persistence In Google Analytics
Updated 1 October 2019 - With ITP 2.3 it looks like Safari is reducing the usefulness of localStorage as well, so this solution should not be considered future-proof. The only stable way to persist client-side data at the moment seems to be HTTP cookies.
Updated 7 March 2019 - Added some extra caveats to this solution. Also, be sure to read my article on ITP 2.1, which has far more detail on what Intelligent Tracking Prevention is and how to work with it.
Looks like Safari is tightening the noose around browser cookies with the introduction of ITP 2.1 (Intelligent Tracking Prevention). Among other things, ITP 2.1 caps the expiration of client-side cookies to 7 days. Client-side cookies in this context refer to cookies set with the document.cookie API, which would cover pretty much all cookies set by vendored JavaScript libraries like Google Analytics’ analytics.js.
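To make the cap concrete, here is a minimal sketch of the arithmetic. It assumes the browser simply clamps the requested lifetime to seven days from the moment the cookie is written. The name effectiveCookieExpiry is made up for this illustration and is not a browser API.

```javascript
var DAY_MS = 24 * 60 * 60 * 1000;
var ITP_CAP_MS = 7 * DAY_MS; // assumed cap for cookies written via document.cookie

// Hypothetical helper: timestamp at which a cookie written at `nowMs` with a
// requested lifetime of `requestedMs` would expire once the cap is applied.
function effectiveCookieExpiry(nowMs, requestedMs) {
  return nowMs + Math.min(requestedMs, ITP_CAP_MS);
}

var now = Date.UTC(2019, 2, 7); // 7 March 2019, 00:00 UTC
var twoYears = 2 * 365 * DAY_MS; // the default lifetime of the _ga cookie
console.log(new Date(effectiveCookieExpiry(now, twoYears)).toISOString());
```

With a two-year request, the cookie ends up expiring one week after it was written unless the user returns and the cookie is refreshed.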
Personally, I think anything that forces a rethink of the cookie model is welcome. It’s awkward and twisted that we still rely on such flimsy and fragile browser storage for persisting important data such as anonymous analytics identifiers, authentication flags, and basically anything that needs to persist from one page to the next.
Having said that, this can make potentially benevolent tracking, such as that done with Google Analytics on a website, quite difficult for users with Safari browsers. Since GA relies on a browser cookie that expires after 2 years (by default), this 7-day maximum can really hurt data quality.
In this article, I want to show you how to use customTask with localStorage to serve as a backup of sorts for your cookie-based Client ID persistence.
It’s not perfect. Check out the caveats chapter for details. Please treat this as a technology demo rather than a be-all, end-all solution to cookie woes. We’ll need to rely on the vendors and browsers finding consensus for how to make life harder on the bad agents without compromising the work we “regular folks” need to do to improve our websites with first-party analytics data.
Tip 96: Persist the Client ID in localStorage
It’s not a novel trick. In fact, Google themselves show you how to do this in the developer documentation. The thing is, though, that Google only shows you how to replace cookie storage with localStorage. I still want to leverage cookies, because they make things like cross-domain tracking far easier to do.
The process is basically this:
- When a Google Analytics tracker is created, check if Client ID is persisted in localStorage.
- If this is the case, check if its expiration is in the future.
- If this is the case, build the tracker with the Client ID from localStorage (the tracker will generate the _ga cookie with this data, too).
- If there is no Client ID in localStorage, use GA’s default Client ID creation and storage mechanisms.
- When the tag fires, write the Client ID in localStorage.
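The steps above can be sketched roughly as follows. This is an illustration only: readClientId, writeClientId and createMemoryStorage are names made up for this sketch, and the in-memory storage merely stands in for window.localStorage so that the code runs outside a browser. The stored object uses the clientId and expires keys that the Custom JavaScript variable in this article reads.

```javascript
// In-memory stand-in for window.localStorage (assumption, for running outside a browser).
function createMemoryStorage() {
  var data = {};
  return {
    getItem: function (key) {
      return Object.prototype.hasOwnProperty.call(data, key) ? data[key] : null;
    },
    setItem: function (key, value) { data[key] = String(value); }
  };
}

// Read path: return the stored Client ID if it exists and has not expired,
// otherwise undefined so that GA uses its default mechanism.
function readClientId(storage, objectName, nowMs) {
  var obj = JSON.parse(storage.getItem(objectName) || '{}');
  if (obj.clientId && obj.expires && nowMs <= obj.expires) {
    return obj.clientId;
  }
  return undefined;
}

// Write path: when the tag fires, store the Client ID with a refreshed expiration.
function writeClientId(storage, objectName, clientId, nowMs, lifetimeMs) {
  storage.setItem(objectName, JSON.stringify({
    clientId: clientId,
    expires: nowMs + lifetimeMs
  }));
}

var storage = createMemoryStorage();
var twoYears = 1000 * 60 * 60 * 24 * 365 * 2;
console.log(readClientId(storage, 'ga_client_id', 0)); // nothing stored yet
writeClientId(storage, 'ga_client_id', '1234567890.1551916800', 0, twoYears);
console.log(readClientId(storage, 'ga_client_id', 1000)); // still valid
console.log(readClientId(storage, 'ga_client_id', twoYears + 1)); // expired
```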
It should be pretty smooth sailing. Since the _ga cookie and localStorage are kept in sync, this should work across origins, which is something that localStorage by default doesn’t do.
NOTE! Since you’re always checking localStorage first, this means that if the user deletes their cookies, they don’t actually delete the Client ID, because that would require localStorage to be flushed, too. You might want to discuss this solution with your legal team before going forward with it.
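If you do go ahead, one way to keep things fair is to offer a way to remove the stored identifier, for example from an opt-out page. Here is a minimal sketch, assuming the default 'ga_client_id' object name used in this article. purgeStoredClientId is a hypothetical helper, and it does not touch the _ga cookie, which would still need to be cleared separately.

```javascript
// Hypothetical helper: remove the persisted Client ID from the given storage.
// In a browser you would pass window.localStorage.
function purgeStoredClientId(storage, objectName) {
  try {
    storage.removeItem(objectName);
    return true;
  } catch (e) {
    // Storage may be unavailable or blocked; nothing more to do here.
    return false;
  }
}

// Usage with a tiny stand-in for localStorage.
var items = { ga_client_id: '{"clientId":"1234567890.1551916800","expires":0}' };
var fakeStorage = { removeItem: function (key) { delete items[key]; } };
console.log(purgeStoredClientId(fakeStorage, 'ga_client_id'), 'ga_client_id' in items);
```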
How to set the Client ID
You’ll need two Custom JavaScript variables. One to set the Client ID, and one to get it.
To set the Client ID, go to the customTask Builder tool, select the option named Use localStorage To Persist ClientId, and then click Copy to clipboard.
In Google Tag Manager, create a new Custom JavaScript variable, and paste the clipboard content within. Then, do the following two maintenance operations:
- Delete the following text from the start of the code block: var customTask = . Just that, nothing else.
- Delete the last character of the entire code block, which should be a semi-colon.
Then, modify the configuration object, if you wish.
var localStorageCid = {
  // Key under which the object is written to localStorage
  objectName: 'ga_client_id',
  // Lifetime in milliseconds (two years)
  expires: 1000*60*60*24*365*2
};
If you want the name of the object written to localStorage to be something other than 'ga_client_id', change the respective string value. The expires value is a number in milliseconds denoting how long the object should be in storage. It’s updated every time a tag fires with this customTask script. If you want the storage to expire sooner or later than two years’ time, change the value of expires accordingly.
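As a quick illustration of the arithmetic, the sketch below computes the expires value for a given number of days. daysToMs is a helper invented for this example. In the actual variable you would most likely just write the product inline, as in the default configuration.

```javascript
// Hypothetical helper: convert days into the milliseconds expected by `expires`.
function daysToMs(days) {
  return 1000 * 60 * 60 * 24 * days;
}

var localStorageCid = {
  objectName: 'ga_client_id',
  expires: daysToMs(30) // 30 days instead of the default two years
};

console.log(localStorageCid.expires); // 30 days in milliseconds
console.log(daysToMs(365 * 2) === 1000 * 60 * 60 * 24 * 365 * 2); // same as the default
```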
NOTE! There is no automatic expiration mechanism with localStorage. “Expires” here simply means that if the expiration of the item is in the past, it will be overwritten with the Client ID generated by GA.
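The code generated by the customTask Builder is not reproduced in this article. Purely as a sketch of the idea, and assuming the stored object carries the clientId and expires keys that the getter variable reads, a customTask that writes the Client ID could look something like this. buildCustomTask and its parameters are made up for the example. model.get('clientId') is the analytics.js model call for reading the tracker’s Client ID.

```javascript
// Sketch only: NOT the code produced by the customTask Builder.
// `storage` and `nowFn` are passed in so the sketch can run outside a browser;
// in GTM you would use window.localStorage and the current time.
function buildCustomTask(storage, config, nowFn) {
  return function (model) {
    var clientId = model.get('clientId');
    try {
      storage.setItem(config.objectName, JSON.stringify({
        clientId: clientId,
        expires: nowFn() + config.expires // refreshed on every hit
      }));
    } catch (e) {
      // Writing can fail (for example when storage is blocked); the hit is unaffected.
    }
  };
}

// Usage with stand-ins for localStorage and for the tracker model.
var written = {};
var fakeStorage = { setItem: function (key, value) { written[key] = value; } };
var fakeModel = {
  get: function (field) {
    return field === 'clientId' ? '1234567890.1551916800' : undefined;
  }
};
var task = buildCustomTask(
  fakeStorage,
  { objectName: 'ga_client_id', expires: 1000 },
  function () { return 5000; }
);
task(fakeModel);
console.log(written.ga_client_id);
```

Nothing ever deletes the item. An expired entry is simply ignored on read and overwritten on the next write.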
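Finally, add this customTask to all your Google Analytics tags. The easiest way to do it is to use a Google Analytics Settings variable. Under More Settings > Fields to Set, add a field named customTask and set its value to the Custom JavaScript variable you just created.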
Remember that you can only add one customTask per tag or Google Analytics Settings variable. Use the customTask Builder tool to compile a customTask that incorporates multiple different functions.
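The builder tool does the merging for you. Purely to illustrate the idea of one customTask wrapping several pieces of logic, a hand-rolled version might look like the sketch below. combineCustomTasks and the two example tasks are hypothetical.

```javascript
// Hypothetical helper: wrap several task functions into a single customTask.
function combineCustomTasks(tasks) {
  return function (model) {
    for (var i = 0; i < tasks.length; i++) {
      tasks[i](model); // each task receives the same tracker model
    }
  };
}

// Usage with a stand-in model and two illustrative tasks.
var log = [];
var customTask = combineCustomTasks([
  function (model) { log.push('persist ' + model.get('clientId')); },
  function (model) { log.push('other task'); }
]);
customTask({ get: function () { return '1234567890.1551916800'; } });
console.log(log.join(', '));
```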
How to get the Client ID
The second Custom JavaScript variable you’ll need is something that will pull the Client ID from localStorage and use that instead of the value stored in the _ga cookie (if any).
This is what the Custom JavaScript variable code looks like:
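function() {
  // Must match the objectName configured in the customTask
  var objectName = 'ga_client_id';
  if (window.localStorage) {
    var jsonObj = window.localStorage.getItem(objectName) || '{}';
    var obj = JSON.parse(jsonObj);
    var now = new Date().getTime();
    // Use the stored Client ID only if it exists and has not expired
    if (obj.clientId && obj.expires) {
      if (now <= obj.expires) {
        return obj.clientId;
      }
    }
  }
  // Returning undefined lets GA fall back to its default Client ID handling
  return;
}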
Change the value of objectName to match the name of the localStorage object you set in the customTask above. By default, it’s 'ga_client_id'.
This script checks whether a Google Analytics Client ID is found in localStorage and, if so, whether its expiration time is still in the future. If both of these conditions pass, the Client ID is returned from localStorage.
If no item is found, or if the object has expired, or if the browser doesn’t support localStorage, the variable returns undefined, which basically means that GA falls back to its default method of Client ID generation, retrieval, and storage.
You need to add this variable to every single tag to which you added the customTask from above. Again, it’s imperative that every single tag uses this new localStorage method consistently, or you might end up with tags firing with the wrong Client ID, thus skewing your data.
You add this variable under Fields to Set, with clientId as the field name and the variable as the value.
That’s it for this simple solution. There is one big problem with this approach, though. It will always take the Client ID from localStorage if it’s available and hasn’t expired. This is problematic in one specific scenario: cross-domain tracking.
How to respect GA’s cross-domain linker parameter
Turns out, this isn’t totally trivial to do. Even if you set the allowLinker field to true in the tag, the Client ID from localStorage will always overwrite the respective field in the tag, no matter how valid the linker parameter was.
So, you need to replicate how allowLinker works, checking if the URL has a valid linker parameter. If it does, then you need to bypass the localStorage fetch, so that analytics.js can build the Client ID with the linker parameter, and then write the updated ID into localStorage in the customTask.
Unfortunately, analytics.js doesn’t expose the allowLinker functionality as an API you could simply query to know whether the URL has a valid linker parameter or not.
This leaves us with very few options. Your best bet is to actually reproduce what allowLinker does. It’s not trivial, so I did the legwork for you (thanks to David Vallejo’s generous help). You can find the source code in this Gist.
To add it to your setup, open the Custom JavaScript variable you created in the previous chapter, and edit it to this:
function() {
  // Must match the objectName configured in the customTask
  var objectName = 'ga_client_id';
  var checkLinker=function(t){var n,e,i=function(t,n){for(var e=new Date,i=window.navigator,r=i.plugins||[],a=[t,i.userAgent,e.getTimezoneOffset(),e.getYear(),e.getDate(),e.getHours(),e.getMinutes()+n],s=0;s<r.length;++s)a.push(r[s].description);return o(a.join("."))},r=function(t,n){var e=new Date,i=window.navigator,r=e.getHours()+Math.floor((e.getMinutes()+n)/60);return o([t,i.userAgent,i.language||"",e.getTimezoneOffset(),e.getYear(),e.getDate()+Math.floor(r/24),(24+r)%24,(60+e.getMinutes()+n)%60].join("."))},o=function(t){var n,e=1;if(t)for(e=0,n=t.length-1;0<=n;n--){var i=t.charCodeAt(n);e=0!=(i=266338304&(e=(e<<6&268435455)+i+(i<<14)))?e^i>>21:e}return e.toString()};if("string"==typeof t&&t.length){if(!/_ga=/.test(t))return"Invalid linker format in string argument!";e=t.split("&").filter(function(t){return"_ga"===t.split("=")[0]}).shift()}else e=(n=/[?&]_ga=/.test(window.location.search)?"search":/[#&]_ga=/.test(window.location.hash)?"hash":void 0)&&window.location[n].substring(1).split("&").filter(function(t){return"_ga"===t.split("=")[0]}).shift();if(void 0===e||!e.length)return"Invalid linker format in URL!";var a,s,g,u,f=e.indexOf(".");return f>-1&&(e.substring(0,f),s=(a=e.substring(f+1)).indexOf("."),g=a.substring(0,s),u=a.substring(s+1)),void 0!==u?g===i(u=u.split("-").join(""),0)||g===i(u,-1)||g===i(u,-2)||g===r(u,0)||g===r(u,-1)||g===r(u,-2):void 0};
  // checkLinker() returns true only for a valid linker parameter. It returns
  // a (truthy) error string when none is found, so compare strictly with true.
  if (checkLinker() === true) {
    // Valid linker parameter: let analytics.js build the Client ID from it
    return;
  }
  if (window.localStorage) {
    var jsonObj = window.localStorage.getItem(objectName) || '{}';
    var obj = JSON.parse(jsonObj);
    var now = new Date().getTime();
    // Use the stored Client ID only if it exists and has not expired
    if (obj.clientId && obj.expires) {
      if (now <= obj.expires) {
        return obj.clientId;
      }
    }
  }
  // Returning undefined lets GA fall back to its default Client ID handling
  return;
}
That big block of code starting with var checkLinker= contains the code from the GitHub Gist, just minified to reduce clutter.
The checkLinker() method parses the URL for a valid linker parameter. If one is found, then the variable ignores the Client ID stored in localStorage and allows the GA tag to build the Client ID from the linker parameter instead.
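For reference, here is an unminified sketch of just the first part of that job: locating the raw _ga linker value in the query string or the hash. It deliberately skips the fingerprint validation that checkLinker() performs, so it is not a replacement for it. getLinkerParam is a name invented for this example, and the sample values are made up.

```javascript
// Returns the raw value of the _ga parameter from a query string or hash,
// or undefined if there is none. It does NOT validate the linker fingerprint.
function getLinkerParam(search, hash) {
  var source = /[?&]_ga=/.test(search) ? search
    : /[#&]_ga=/.test(hash) ? hash
    : '';
  // Drop the leading "?" or "#" and scan the key=value pairs
  var pairs = source.substring(1).split('&');
  for (var i = 0; i < pairs.length; i++) {
    var parts = pairs[i].split('=');
    if (parts[0] === '_ga' && parts[1]) {
      return parts[1];
    }
  }
  return undefined;
}

// In a browser you would pass window.location.search and window.location.hash.
console.log(getLinkerParam('?foo=bar&_ga=2.111.222.333-444', ''));
console.log(getLinkerParam('', '#_ga=2.111.222.333-444'));
console.log(getLinkerParam('?foo=bar', ''));
```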
Read on for one major caveat in this approach.
Caveats
There are, naturally, some caveats to this workaround.
- There’s no mechanism involved to handle multiple _ga cookies. So if you have a set of tags that need to have their Client ID handled separately (e.g. due to roll-up tracking), simply use a different localStorage object name for that set of tags.
- Unlike typically with localStorage, you don’t have to worry about cross-origin tracking, since the _ga cookie would persist across subdomains (assuming it has the cookieDomain field set to auto), and thus when the user lands on a subdomain without the localStorage object, the _ga cookie is used instead, and this is then written into localStorage in the customTask.
- Tracking across subdomains is nevertheless a problem, because the _ga cookie only survives 7 days without returning visits. Thus if the user visits one subdomain on day 1 and another subdomain on day 8, the localStorage solution described here will be of little help.
- Cross-domain tracking, even with the linker trick mentioned in the previous chapter, is still a pain. Google is experimenting with different linker parameters, so the solution above might become outdated soon. I’ll update the code when the linker parameter format changes.
- There are potential legal implications of disregarding cookie purges as described in this article (thanks to Brian Clifton for pointing this out in the comments). I seriously recommend only treating this solution as a technical demo of what could be done, not what should be done.
- Safari is already blocking localStorage.setItem() in private browsing mode, and other browsers might follow suit.
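Given the last caveat in the list, it can make sense to check that storage is actually writable before relying on it. The following is a minimal sketch of such a check. storageIsWritable and the test key name are assumptions made for this example.

```javascript
// Returns true only if an item can actually be written to the given storage.
function storageIsWritable(storage) {
  var testKey = '__storage_test__'; // hypothetical key, removed right away
  try {
    storage.setItem(testKey, '1');
    storage.removeItem(testKey);
    return true;
  } catch (e) {
    // setItem threw, for example because writes are blocked
    return false;
  }
}

// Usage with stand-ins; in a browser you would pass window.localStorage.
var workingStorage = { setItem: function () {}, removeItem: function () {} };
var blockedStorage = {
  setItem: function () { throw new Error('write blocked'); },
  removeItem: function () {}
};
console.log(storageIsWritable(workingStorage), storageIsWritable(blockedStorage));
```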
And then, of course, the biggest caveat:
This isn’t the last we’ve seen of the concerted attack against browser cookie storage. Other browsers will likely follow suit, after they let Safari take the blow for being the first one audacious enough to restrict first-party cookies in this way.
Final thoughts
I want you to consider this a tech demo first and foremost. It’s not robust enough to carry your entire enterprise web analytics setup, but it should be usable in situations where you want to minimize data loss especially with Safari users.
Please do note that you are potentially making it more difficult for your visitors to clear their tracking identifiers. Make it obvious in your privacy statement that this type of persistence is happening, even if it’s not necessary per the regulations or laws of your region. It’s just good behavior. Users should be allowed to purge their tracking identifiers without having to be rocket scientists to do so.
I’m also expecting this solution to be temporary. It’s still unclear how Intelligent Tracking Prevention will evolve. One of the purposes of the 7-day cap on client-side cookies is to prevent third-party JavaScript libraries from writing cookies on one domain and accessing them on another subdomain. If Apple considers this localStorage trick to be a hack around this limitation, it’s possible they’ll invest in measures to prevent this from working, too.
Also, I’m expecting big vendors who suffer most from this (such as Google and Facebook) to introduce their own solutions that help avoid the potential loss of data quality that’s at stake here. As such, I do hope that this article becomes outdated soon, when the platforms suggest an officially supported way of persisting data reliably.
Finally, many might now be even more tempted to move towards server-side tracking due to the increased fragility of client-side persistence. You don’t need a full server-side proxy to cope with ITP 2.1. Since it only targets cookies written with document.cookie, you could simply generate anonymous identifiers like GA’s Client ID server-side, and set them in the Set-Cookie header of the HTTP response. Be sure to check out my more extensive article on ITP 2.1 and web analytics for more details.
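As a rough sketch of what that could look like on a Node.js server, the functions below generate an identifier and build the value of a Set-Cookie header. Everything here is an assumption made for illustration: the cookie name _cid, the two-year Max-Age, the attribute choices, and the "<random>.<timestamp>" shape borrowed from how GA Client IDs typically look. How you attach the header depends on your server framework.

```javascript
// Hypothetical: generate an identifier shaped like "<random>.<unix seconds>".
// Math.random() is not cryptographically strong; it is used here for brevity.
function generateClientId(nowMs) {
  var random = Math.floor(Math.random() * 0x7fffffff);
  return random + '.' + Math.floor(nowMs / 1000);
}

// Build the value of a Set-Cookie response header.
function buildSetCookieHeader(name, value, maxAgeSeconds) {
  return name + '=' + encodeURIComponent(value) +
    '; Max-Age=' + maxAgeSeconds +
    '; Path=/; Secure; SameSite=Lax';
}

var twoYearsInSeconds = 60 * 60 * 24 * 365 * 2;
var header = buildSetCookieHeader('_cid', generateClientId(Date.now()), twoYearsInSeconds);
console.log('Set-Cookie: ' + header);
```

HttpOnly is left out on the assumption that client-side code still needs to read the value, for example to pass it to the tracker as the clientId field.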
What do you think of this latest ITP 2.1 release? Feel free to join the discussion in the comments!