Very often, I use content language as a custom dimension or as a content group in Google Analytics. It is very easy to do as long as your HTML code defines the language correctly.
For me, the content language is always one of the most important content groupings to implement. Many websites here in Finland are bilingual (or trilingual) and it is always interesting to analyse the content consumption based on the language.
Very often, it is possible to identify the language using the page path. For example, if pages in Finnish always include /fi-fi/ in the path, you can identify the language using regex extraction.
Sometimes, this is not possible. What should you do in these cases?
How to read the HTML lang attribute with JavaScript
In HTML, the lang attribute specifies the language of an element's content. Its value is a BCP 47 language tag, which usually starts with a two-letter ISO 639-1 code such as en or fi. Most of the time, it is set on the html element, before the document head, to specify the language of the whole document.
Using the lang attribute is especially useful when the language can’t be identified by substrings in the path.
For example, the second line of this HTML snippet defines the document’s language as English.
<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>My title here | My site</title>
In this case, you can read the value of the lang attribute with JavaScript:
var language=document.documentElement.lang;
This is all the JavaScript you need to know to implement a custom JavaScript variable in Google Tag Manager. This variable returns the value of the HTML lang attribute so that it can be used in the GA settings variable.
Next, you will use this custom JavaScript variable in the GA settings variable to send the language attribute to Google Analytics.
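In practice, the lang value can be missing or can carry a region suffix such as en-GB, which would split one language into several values in your reports. The following is a minimal sketch of a slightly more defensive reader. The helper name getContentLanguage and the "(not set)" fallback are my own assumptions for illustration, not something Google Analytics or Tag Manager requires.

```javascript
// Hypothetical helper (name and fallback value are assumptions):
// reads the lang attribute from a document-like object and returns
// the primary language subtag in lower case.
function getContentLanguage(doc, fallback) {
  var fallbackValue = fallback || '(not set)';
  var root = doc && doc.documentElement;
  var lang = root && typeof root.lang === 'string' ? root.lang.trim() : '';
  if (!lang) {
    // No lang attribute on the html element, or it is empty
    return fallbackValue;
  }
  // Keep only the part before the first hyphen: "en-GB" becomes "en"
  return lang.split('-')[0].toLowerCase();
}

// In a custom JavaScript variable, the helper would be defined inside
// the anonymous function and called with the real document:
// function() { /* helper here */ return getContentLanguage(document); }
```

Passing the document in as a parameter is only a design choice to make the helper easy to try outside the browser; reading the global document directly works just as well inside the variable.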
Use the document path for identifying language
In some cases, the document URL itself reveals the language:
https://www.example.com/fi/etusivu.html
https://www.example.com/en/frontpage.html
In these cases, it is easy to see that /en/ and /fi/ are used for content in different languages. And of course, it is again easy to use some simple JavaScript for getting the directory name.
Just create a custom JavaScript function with:
function() {
  // The path starts with "/", so index 0 is empty and index 1 is the first directory
  return document.location.pathname.split("/")[1];
}
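In this function, document.location.pathname returns e.g. /fi/etusivu.html. The path is then split into substrings using “/” as the separator. Because the path starts with a slash, the first substring is empty and the language directory is found at index 1.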
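One weakness of taking the first directory as-is is that pages outside the language directories (for example /about.html) would send their own file or directory name as the "language". The sketch below checks the first path segment against a list of expected language codes. The function name getLanguageFromPath, the list of codes and the fallback value are all assumptions for illustration.

```javascript
// Hypothetical helper (name, language list and fallback are assumptions):
// returns the first path segment only if it is a known language code.
function getLanguageFromPath(pathname, knownLanguages, fallback) {
  // "/fi/etusivu.html".split("/") gives ["", "fi", "etusivu.html"]
  var firstSegment = String(pathname || '').split('/')[1] || '';
  var candidate = firstSegment.toLowerCase();
  return knownLanguages.indexOf(candidate) !== -1 ? candidate : fallback;
}

// In a custom JavaScript variable this could be called as:
// getLanguageFromPath(document.location.pathname, ['fi', 'en'], '(not set)');
```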
Implement the content grouping
First, you need to decide how to store the content language in Google Analytics. There are two possibilities:
- Custom dimension with hit scope
- Content grouping for language
In both cases, you’ll need to modify the GA settings variable or tweak the page view and event tags. But of course, by now you should really be using a settings variable instead of HTML GA tags.
Before doing any modifications, you need to add the content grouping to your Google Analytics view. Do this in Admin > View > Content Grouping. Add the content grouping and give it a name, e.g. Content language. Choose “Group by tracking code”, enable it and select the index.
Next, you will send the value of your JavaScript variable to Google Analytics using this content grouping. In order to do this, you will need to modify the GA settings variable. Remember to use the correct index here.
Now, all you need to do is to preview, debug and publish the container.
The same steps are needed if you want to use the document language as a custom dimension (hit scope). Define the custom dimension in Google Analytics (Admin > Property settings > Custom definitions > Custom dimensions). Note the index number and use it in the GA settings variable to send the language as a custom dimension.
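To make the role of the index more concrete: in analytics.js, content groups and custom dimensions are sent as fields named contentGroup plus the index and dimension plus the index. The settings variable builds these for you, so the following is only a sketch of what the two options amount to. The helper name buildLanguageFields and the index values used in the comment are assumptions; use the indexes from your own view and property.

```javascript
// Hypothetical helper (name is an assumption): builds the analytics.js
// field names for a content group and/or a custom dimension.
function buildLanguageFields(language, contentGroupIndex, dimensionIndex) {
  var fields = {};
  if (contentGroupIndex) {
    // e.g. index 1 gives the field name "contentGroup1"
    fields['contentGroup' + contentGroupIndex] = language;
  }
  if (dimensionIndex) {
    // e.g. index 2 gives the field name "dimension2"
    fields['dimension' + dimensionIndex] = language;
  }
  return fields;
}

// With a plain analytics.js setup (no Tag Manager), the result could be
// applied before the page view is sent, assuming indexes 1 and 2:
// ga('set', buildLanguageFields('fi', 1, 2));
```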