How to use document language as a content grouping in Google Analytics

Very often, I use content language as a custom dimension or as a content group in Google Analytics. It is very easy to do as long as your HTML code defines the language correctly.

For me, the content language is always one of the most important content groupings to implement. Many websites here in Finland are bilingual (or trilingual) and it is always interesting to analyse the content consumption based on the language. 

Very often, it is possible to identify the language using the page path. For example, if pages in Finnish always include /fi-fi/ in the path, you can identify the language using regex extraction.
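As a minimal sketch of that kind of regex extraction, the function below pulls a segment such as fi-fi out of a page path. The function name and the xx-xx pattern are assumptions made for illustration; adjust the pattern to the paths your site really uses.

```javascript
// Hypothetical helper: extract a locale segment such as "fi-fi" from a page path.
function languageFromPath(pagePath) {
  // Look for a path segment made of two letters, a hyphen and two letters,
  // surrounded by slashes, e.g. "/fi-fi/".
  var match = /\/([a-z]{2}-[a-z]{2})\//i.exec(pagePath);
  // Return the captured segment in lower case, or undefined if nothing matched.
  return match ? match[1].toLowerCase() : undefined;
}
```

For a path like /fi-fi/tuotteet/index.html this returns fi-fi. A path without such a segment returns undefined, which is the case the rest of this article deals with.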

Sometimes, this is not possible. What should you do in these cases?

How to read the HTML lang attribute with JavaScript

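In HTML, the lang attribute specifies the language of an element's content. Its value is a language tag that usually starts with a two-letter ISO 639-1 code, such as en or fi. Most of the time, it is set on the html element, before the document head, to specify the language of the whole document.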

Reading the lang attribute is especially useful when the language can’t be identified by substrings in the path. When the path does contain the language, you can simply define rules for identifying languages using Google Analytics’ user interface.

For example, the second line of this HTML snippet defines the document’s language as English.

<!doctype html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>My title here | My site</title>

In this case, you can read the value of the lang attribute with JavaScript:

var language=document.documentElement.lang;

This is all the JavaScript you need to know to implement a custom JavaScript variable in Google Tag Manager. This variable returns the value of the HTML lang attribute so that it can be used in the GA settings variable.

Custom js – content language
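As a rough sketch, the logic of such a variable could look like the function below. It is written as a named function that takes the document as a parameter so that it can be run and tested outside the browser; in a Google Tag Manager custom JavaScript variable the same logic would sit inside an anonymous function() { ... } and read the global document. The function name, and the choice to reduce a value like en-GB to en, are my assumptions and not requirements.

```javascript
// Hypothetical helper: read the document language from the <html> element.
// "doc" is expected to look like the browser's document object.
function getDocumentLanguage(doc) {
  var lang = doc && doc.documentElement ? doc.documentElement.lang : '';
  // A missing or empty lang attribute gives undefined, so nothing is sent.
  if (!lang) {
    return undefined;
  }
  // Assumption: keep only the primary language part, e.g. "en-GB" -> "en".
  return lang.toLowerCase().split('-')[0];
}
```

In the browser you would call it as getDocumentLanguage(document). If you want to keep regional variants such as en-GB and en-US apart, return the lower-cased value without splitting it.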

Next, you will use this custom JavaScript variable in the GA settings variable to send the language attribute to Google Analytics.

Use the document path for identifying language

In some cases, the document URL provides the easiest way to identify the language. Usually, content is published in language-specific directories or subdomains. Sometimes, the language can be identified by a URL parameter. Using a specific directory for each language is probably the most common strategy used by multilingual sites.

https://www.example.com/fi/etusivu.html
https://www.example.com/en/frontpage.html

In these cases, it is easy to see that /en/ and /fi/ are used for content in different languages. And of course, it is again easy to use some simple JavaScript for getting the directory name. 

Just create a custom JavaScript variable with this function:

function() {
   return(document.location.pathname.split("/")[1]);
}

In this function, document.location.pathname returns e.g. /fi/etusivu.html. It is then split into substrings using “/” as the separator. Because the path starts with a slash, the first substring is empty and the language is the second one, at index 1. Save it as Custom js – content language and use it later in the GA settings variable.
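One thing to watch out for is pages that live outside the language directories, because the function above would then report something like contact.html as the language. A slightly more defensive sketch checks the first directory against a list of known languages. The function name, the list of languages and the fallback value unknown are assumptions for illustration only.

```javascript
// Assumption: the languages this hypothetical site publishes content in.
var KNOWN_LANGUAGES = ['fi', 'sv', 'en'];

// Hypothetical helper: return the first directory of the path if it is a known language.
function languageFromFirstDirectory(pathname) {
  // "/fi/etusivu.html".split("/") gives ["", "fi", "etusivu.html"],
  // so the first directory is at index 1.
  var segment = (pathname.split('/')[1] || '').toLowerCase();
  // Fall back to a placeholder value for pages outside the language directories.
  return KNOWN_LANGUAGES.indexOf(segment) !== -1 ? segment : 'unknown';
}
```

In the browser you would pass in document.location.pathname. Sending a fixed fallback value keeps stray file names out of your language report; returning undefined instead would leave the value unset for those pages.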

Implement the content grouping

First, you need to decide how to store the content language in Google Analytics. There are two possibilities:

  • Custom dimension with hit scope
  • Content grouping for language

In both cases, you’ll need to modify the GA settings variable or tweak the page view and event tags. But of course, by now you should really be using a settings variable instead of HTML GA tags.

Before making any modifications, you need to add the content grouping to your Google Analytics view. Do this in Admin > View > Content Grouping. Add the content grouping and give it a name, e.g. Content language. Choose “Group by tracking code”, enable it and select the index.

Google Analytics settings for content groupings

Next, you will send the value of your JavaScript variable to Google Analytics using this content grouping. In order to do this, you will need to modify the GA settings variable. Remember to use the correct index here.
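To show why the index matters: in analytics.js, a content group is sent in a field named contentGroup followed by the index you selected. The sketch below only builds that field as a plain object. The function name is hypothetical, and the limit of five reflects the number of content groupings a view allows.

```javascript
// Hypothetical helper: build the analytics.js field for a content group.
function contentGroupField(index, language) {
  // A view has five content grouping slots, so the index must be 1-5.
  if (index !== Math.floor(index) || index < 1 || index > 5) {
    throw new RangeError('Content group index must be an integer from 1 to 5');
  }
  var fields = {};
  // Index 1 becomes the field name "contentGroup1", and so on.
  fields['contentGroup' + index] = language;
  return fields;
}
```

With index 1 and the language fi this gives an object with the single field contentGroup1 set to fi. Without Google Tag Manager, the equivalent would be to call ga('set', 'contentGroup1', 'fi') before sending the page view. If the index in the tag does not match the one selected in the view, the value ends up in the wrong grouping or nowhere at all.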

GA settings variable in Google Tag Manager

Now, all you need to do is to preview, debug and publish the container.

The same steps are needed if you want to use the document language as a custom dimension (hit scope). Define the custom dimension in Google Analytics (Admin > Property settings > Custom definitions > Custom dimensions). Note the index number and use it in the GA settings variable to send the language as a custom dimension.