One of the big problems in Google Analytics’ data model is the immutability of historical data. Once a row of data is written into the data table, it is there practically for good. This is especially annoying in two cases: spam and bogus ecommerce hits. The first is a recognized issue with an open and public data collection protocol; the second is an annoyance that can explode into full-blown sabotage (you can use the Measurement Protocol to send hundreds of huge transactions to your competitor’s GA property, for example).
In this article, I’ll tackle a different symptom. I’ve covered this before, and David Vallejo’s written an excellent article about this topic as well.
The problem is duplicate transactions. It’s very common if your site has a receipt page that can be revisited via the browser cache or history. When the receipt page is revisited, the transaction details are often written into dataLayer (or sent with the ga() method) again, which inflates your ecommerce data.
A simple way to confirm whether this is an issue is to create a custom report where you inspect Transaction IDs against the Transactions metric. Optimally, you’d have one transaction per ID, unless you’ve decided to enable transaction updates/upgrades in your data collection.
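If you export that custom report, the check is easy to automate. The snippet below is only a rough sketch: the row shape (transactionId, transactions) and the sample data are assumptions made up for illustration, not the format of any actual GA export.

```javascript
// Hypothetical helper: rows are assumed to look like
// { transactionId: 'T-1001', transactions: 1 }.
function findDuplicateTransactions(rows) {
  // Object.create(null) avoids clashes with inherited keys such as "constructor".
  var totals = Object.create(null);
  rows.forEach(function(row) {
    totals[row.transactionId] = (totals[row.transactionId] || 0) + Number(row.transactions);
  });
  // Keep only the IDs that were recorded more than once.
  return Object.keys(totals).filter(function(id) {
    return totals[id] > 1;
  });
}

// Made-up sample data for illustration only.
var sampleRows = [
  { transactionId: 'T-1001', transactions: 1 },
  { transactionId: 'T-1002', transactions: 3 },
  { transactionId: 'T-1003', transactions: 1 }
];
var duplicates = findDuplicateTransactions(sampleRows);
```

With the sample rows above, only T-1002 ends up in duplicates, since it is the only ID with more than one transaction.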
I’ve decided to upgrade my solution to utilize customTask, since it lets you do without extra triggers or variable logic. We’ll set things up with Google Tag Manager, but there’s nothing stopping you from doing this with on-page analytics.js, too.
How it works
When you add the customTask variable to your tags, it activates any time the tag tries to send a hit to Google Analytics.
During this activation, the method looks for the key &ti in the hit model. This key corresponds to the Transaction ID value.
Next, it looks into your browser storage for any transaction IDs already sent from the current browser.
If the transaction ID in the hit is found in browser storage, this customTask blocks the hit from ever being fired, thus preventing the duplicated information from reaching Google Analytics.
If the transaction ID in the hit is not found in browser storage, the customTask sends the hit to GA normally, but it also stores the transaction ID in the list of transactions that have already been recorded. Thus, it blocks any future hits with this ID from being sent.
NOTE! This auto-blocking feature only works with Enhanced Ecommerce. With Standard Ecommerce, the customTask will only update browser storage but it won’t block anything. You’ll need to use triggers instead (read on for instructions on how to do this).
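The decision described above can be sketched as a small pure function. This is a simplified illustration rather than the actual customTask: the name decideHit and its return shape are made up here, and it assumes the hit already carries a transaction ID (hits without &ti are not touched at all).

```javascript
// Simplified sketch of the decision logic (hypothetical helper).
// storedIds: array of transaction IDs already recorded in this browser.
// Returns whether the hit should be sent, plus the list to write back to storage.
function decideHit(transactionId, hitType, storedIds) {
  var alreadySent = storedIds.indexOf(transactionId) > -1;
  var isStandardEcommerce = hitType === 'transaction' || hitType === 'item';
  if (alreadySent) {
    // Known ID: block the hit, except for Standard Ecommerce hit types,
    // which are never blocked automatically.
    return { send: isStandardEcommerce, storedIds: storedIds };
  }
  // New ID: send the hit and remember the ID for next time.
  return { send: true, storedIds: storedIds.concat(transactionId) };
}

var first = decideHit('T-1001', 'event', []);
var second = decideHit('T-1001', 'event', first.storedIds);
var standard = decideHit('T-1001', 'transaction', first.storedIds);
```

Here first.send is true and the ID is stored, second.send is false because the ID is already known, and standard.send is true because transaction and item hits are let through.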
The customTask variable
You can use the customTask Builder tool to generate the necessary variable, or you can simply copy-paste the code below into a Custom JavaScript variable, if you wish. The benefit of using the builder tool is that you can combine this customTask with all the other setups I’ve added to the tool.
Anyway, this is what the Custom JavaScript variable should end up looking like:
function() {
  // customTask Builder by Simo Ahava
  //
  // More information about customTask: https://www.simoahava.com/analytics/customtask-the-guide/
  //
  // Change the default values for the settings below.
  // transactionDeduper: Configuration object for preventing duplicate transactions from being recorded.
  // https://bit.ly/2AvSZ2Y
  var transactionDeduper = {
    keyName: '_transaction_ids',
    cookieExpiresDays: 365
  };
  // DO NOT EDIT ANYTHING BELOW THIS LINE
  var readFromStorage = function(key) {
    if (!window.Storage) {
      // From: https://stackoverflow.com/a/15724300/2367037
      var value = '; ' + document.cookie;
      var parts = value.split('; ' + key + '=');
      if (parts.length === 2) return parts.pop().split(';').shift();
    } else {
      return window.localStorage.getItem(key);
    }
  };
  var writeToStorage = function(key, value, expireDays) {
    if (!window.Storage) {
      var expiresDate = new Date();
      expiresDate.setDate(expiresDate.getDate() + expireDays);
      document.cookie = key + '=' + value + ';expires=' + expiresDate.toUTCString();
    } else {
      window.localStorage.setItem(key, value);
    }
  };
  var globalSendHitTaskName = '_ga_originalSendHitTask';
  return function(customTaskModel) {
    window[globalSendHitTaskName] = window[globalSendHitTaskName] || customTaskModel.get('sendHitTask');
    var tempFieldObject, dimensionIndex, count, ga, tracker, decorateTimer, decorateIframe, iframe;
    customTaskModel.set('sendHitTask', function(sendHitTaskModel) {
      var originalSendHitTaskModel = sendHitTaskModel,
          originalSendHitTask = window[globalSendHitTaskName],
          canSendHit = true;
      var hitPayload, hitPayloadParts, param, val, regexI, trackingId, snowplowVendor, snowplowVersion, snowplowPath, request, originalTrackingId, hitType, nonInteraction, d, transactionId, storedIds;
      try {
        // transactionDeduper
        if (typeof transactionDeduper === 'object' && transactionDeduper.hasOwnProperty('keyName') && transactionDeduper.hasOwnProperty('cookieExpiresDays') && typeof sendHitTaskModel.get('&ti') !== 'undefined') {
          transactionId = sendHitTaskModel.get('&ti');
          storedIds = JSON.parse(readFromStorage(transactionDeduper.keyName) || '[]');
          if (storedIds.indexOf(transactionId) > -1 && ['transaction', 'item'].indexOf(sendHitTaskModel.get('hitType')) === -1) {
            canSendHit = false;
          } else if (storedIds.indexOf(transactionId) === -1) {
            storedIds.push(transactionId);
            writeToStorage(transactionDeduper.keyName, JSON.stringify(storedIds), transactionDeduper.cookieExpiresDays);
          }
        }
        // /transactionDeduper
        if (canSendHit) {
          originalSendHitTask(sendHitTaskModel);
        }
      } catch(e) {
        originalSendHitTask(originalSendHitTaskModel);
      }
    });
  };
}
The variable contains a configuration object, transactionDeduper, that you can modify, if you wish. The configuration object must have both keyName and cookieExpiresDays.
Set keyName to what you want the name of the cookie or the localStorage key to be. The default value is _transaction_ids.
Set cookieExpiresDays to the number of days the cookie should exist. A cookie is only used if the user’s browser doesn’t support localStorage. If localStorage is used, no expiration is set.
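As an illustration, a customized configuration might look like the sketch below. The key name and the 90-day expiry are arbitrary example values, and isValidDeduperConfig is a hypothetical helper that roughly mirrors the check the customTask runs before it does anything.

```javascript
// Example values only; pick a name and duration that suit your site.
var transactionDeduper = {
  keyName: '_my_shop_transaction_ids', // hypothetical storage key / cookie name
  cookieExpiresDays: 90                // only applies to the cookie fallback
};

// Hypothetical helper: both keys must be present, otherwise the
// deduplication logic is skipped and hits are sent as usual.
function isValidDeduperConfig(config) {
  return typeof config === 'object' && config !== null &&
    config.hasOwnProperty('keyName') &&
    config.hasOwnProperty('cookieExpiresDays');
}
```

If you change keyName, remember to use the same name in the cookie variable and the Custom JavaScript variable of the Standard Ecommerce setup.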
The main logic happens near the end of the code block, where I’ve boxed the solution with // transactionDeduper and // /transactionDeduper.
It really is very simple, and it follows the process I outlined in the previous section.
If you’re using Enhanced Ecommerce, then this customTask will take care of everything for you. In case the transaction ID is found in the list of stored IDs, it will simply block the hit from departing to Google Analytics.
If you’re using Standard Ecommerce, the customTask will only write the ID into storage, and you’ll need to handle the blocking logic yourself.
In any case, you need to add this customTask variable to all the Enhanced Ecommerce and/or Transaction tags that have the power to send purchase information to Google Analytics. For instructions on how to add the customTask to your tags, see this guide or follow the instructions in the customTask Builder tool.
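If you run on-page analytics.js instead of Google Tag Manager, the same function can be set as the customTask field on the tracker. The following is a minimal sketch under a few assumptions: createTrackerWithDeduper is a made-up wrapper, buildCustomTask stands for the function from the Custom JavaScript variable above after you have given it a name, and UA-XXXXX-Y is a placeholder property ID.

```javascript
// Hypothetical wrapper. `ga` is the analytics.js command queue function and
// `buildCustomTask` is the (named) function from the variable above, which
// returns the actual customTask when called.
function createTrackerWithDeduper(ga, trackingId, buildCustomTask) {
  ga('create', trackingId, 'auto');
  // Set customTask before sending hits so that it applies to them.
  ga('set', 'customTask', buildCustomTask());
  ga('send', 'pageview');
}

// Usage in the browser, after the analytics.js snippet has loaded:
// createTrackerWithDeduper(window.ga, 'UA-XXXXX-Y', buildCustomTask);
```

The wrapper takes ga as a parameter only to keep the example self-contained; calling the three ga() commands directly on the page works the same way.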
Triggers and variables for Standard Ecommerce
Due to how Standard Ecommerce is split into transaction and item hits, the blocking logic would be extremely difficult to automate in customTask. This is why you’ll need to create an exception trigger for your Transaction tag, which blocks the tag from firing if the transaction ID is found in the list of stored IDs.
Data Layer variable for the Transaction ID
Create a Data Layer variable for transactionId. The Custom JavaScript variable further down refers to it as {{DLV - transactionId}}.
1st Party Cookie variable for the transaction ID list
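Create a 1st Party Cookie variable for the transaction ID list, using the keyName you configured in the customTask as the cookie name. In a default setup, the cookie name is _transaction_ids, and the Custom JavaScript variable further down refers to the variable as {{Cookie - _transaction_ids}}.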
Custom JavaScript variable to check if the ID is within the list
Finally, create a Custom JavaScript variable which returns true if the transaction ID is found in the list.
function() {
  // Change this to match the keyName you added to customTask:
  var keyName = '_transaction_ids';
  var ids = JSON.parse((!!window.Storage ? window.localStorage.getItem(keyName) : {{Cookie - _transaction_ids}}) || '[]');
  return ids.indexOf({{DLV - transactionId}}) > -1;
}
Name the variable something like {{JS - transactionId sent}}.
The exception trigger
The last step is to create a trigger which blocks your transaction tag from firing. It must use the same event as the transaction tag. So, if the transaction tag fires on a Page View trigger, the exception trigger must also be a Page View trigger (read more about exceptions here).
In the trigger conditions, check if {{JS - transactionId sent}} equals true. Thus, the exception will block the tag it is attached to if the transaction ID is found in the list of IDs already recorded.
Here’s an example of what the exception looks like with a Custom Event trigger, and how the exception is added to the tag.
Final thoughts
My quest for improving Google Analytics’ data quality using customTask continues.
Preventing hits from being sent if certain conditions arise is nice and elegant to run through customTask, since you don’t have to mess with complicated triggers cluttering your GTM interface.
However, every customTask used does increase the opaqueness of your setup, so the trade-off is that you won’t know what individual tags do at a glance without drilling into their set fields (and understanding what customTask does in the first place).
As always, let me know what you think about this solution in the comments, and let me know also if you have suggestions for improving it! Thank you.