JavaScript one-liner to get element’s text content without its child nodes
A testing engineer here at work asked me how he could get an element’s text content without the text inside any child elements.
The JavaScript DOM doesn’t give us a method to do that directly, but there’s a one-line solution that uses some interesting JavaScript tricks.
Examples (not real-world ones)
The problem with using parentElement.textContent directly:
<h1>Page title <em>Other stuff</em></h1>
-> "Page title Other stuff"
<p>My <blink>great</blink> website</p>
-> "My great website"
What we want to achieve:
<h1>Page title <em>Other stuff</em></h1>
-> "Page title "
<p>My <blink>great</blink> website</p>
-> "My  website" (two spaces, one from each text node)
The solution
// Get the parent element somehow, you can just as well use
// .getElementById() or any other DOM method
var parentElement = document.querySelector('#myDiv');
// Returns the text content as a string
[].reduce.call(parentElement.childNodes, function(a, b) { return a + (b.nodeType === 3 ? b.textContent : ''); }, '');
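If you need this in more than one place, the one-liner can be wrapped in a small helper. This is only a sketch: the name getOwnText is made up for illustration, and fakeParent is a plain object standing in for a real element so that the snippet also runs outside a browser.

```javascript
// Hypothetical helper name; it simply wraps the one-liner above.
function getOwnText(parentElement) {
  return [].reduce.call(parentElement.childNodes, function(a, b) {
    return a + (b.nodeType === 3 ? b.textContent : '');
  }, '');
}

// Stand-in for <h1>Page title <em>Other stuff</em></h1>.
// In a browser you would pass a real element instead.
var fakeParent = {
  childNodes: {
    0: { nodeType: 3, textContent: 'Page title ' }, // text node
    1: { nodeType: 1, textContent: 'Other stuff' }, // the <em> element
    length: 2
  }
};

var ownText = getOwnText(fakeParent); // "Page title "
```

In a real page the call would look like getOwnText(document.querySelector('#myDiv')).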
How does it work?
A DOM element’s childNodes property is not an array even though it looks like one. It’s actually an instance of NodeList, which lacks most of the usual array methods, such as .map() or .reduce() (modern browsers do give it .forEach()). Luckily, we can easily borrow them from Array by using the .call() method found in the Function prototype.
So, we’re calling Array.prototype.reduce with a NodeList by creating an empty array and using its method:
[].reduce.call(arrayLikeObject, callbackFunction, initialValue);
// same as this, but we saved some characters
Array.prototype.reduce.call(arrayLikeObject, callbackFunction, initialValue);
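To see the borrowing trick without any DOM involved, here is a minimal sketch that uses a hand-made array-like object (numeric keys plus a length property). The variable names are just for illustration.

```javascript
// A plain object that looks like an array: numeric keys plus length.
var arrayLike = { 0: 'a', 1: 'b', 2: 'c', length: 3 };

// The object has no reduce method of its own.
var hasOwnReduce = typeof arrayLike.reduce === 'function'; // false

// Borrowing the method from an empty array works anyway.
var joined = [].reduce.call(arrayLike, function(result, item) {
  return result + item;
}, ''); // "abc"

// Both spellings point at the very same function.
var sameFunction = [].reduce === Array.prototype.reduce; // true
```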
The .reduce() method takes one mandatory parameter – the callback function – and optionally the initial value. As stated by MDN:
The reduce() method applies a function against an accumulator and each value of the array (from left-to-right) to reduce it to a single value.
We want the end result to be a string, so we give an empty string as the initial value.
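The initial value matters more than it might seem. The sketch below uses plain strings instead of DOM nodes (an assumption made so it runs anywhere) to show that the initial value is also what you get back for an empty list, and that leaving it out makes reduce throw on an empty list.

```javascript
var concat = function(result, word) { return result + word; };

// With an initial value, the accumulation starts from ''.
var withInitial = ['My ', ' website'].reduce(concat, ''); // "My  website"

// An empty list simply returns the initial value.
var emptyWithInitial = [].reduce(concat, ''); // ""

// Without an initial value, an empty list is an error.
var threwTypeError = false;
try {
  [].reduce(concat);
} catch (e) {
  threwTypeError = e instanceof TypeError; // true
}
```

For our use case this means an element with no child nodes at all gives an empty string rather than an exception.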
Our callback function looks like this (replaced the one-character variable names with slightly more descriptive ones):
function(result, childNode) {
  return result + (childNode.nodeType === 3 ? childNode.textContent : '');
}
The function will be called once for each child node of our parent element. That includes the text nodes, not only the elements written as HTML tags.
The result parameter contains the string that has accumulated so far.
The childNode parameter contains the node currently being processed. First we’ll check the child node’s type. If it’s a text node (its nodeType is 3, also found in the constant Node.TEXT_NODE), we concatenate the node’s textContent to the result. Otherwise we concatenate an empty string, keeping the result intact.
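Here is a step-by-step trace of that callback for the <p>My <blink>great</blink> website</p> example. It is a sketch under a few assumptions: fakeChildNodes is a hand-made array standing in for the real childNodes, TEXT_NODE is a local constant with the same value as Node.TEXT_NODE, and the steps array exists only to record the accumulator after each call.

```javascript
var TEXT_NODE = 3; // same value as Node.TEXT_NODE in a browser

// Stand-ins for the three child nodes of the <p> element.
var fakeChildNodes = [
  { nodeType: 3, textContent: 'My ' },      // text node
  { nodeType: 1, textContent: 'great' },    // the <blink> element
  { nodeType: 3, textContent: ' website' }  // text node
];

var steps = [];
var finalText = fakeChildNodes.reduce(function(result, childNode) {
  var next = result + (childNode.nodeType === TEXT_NODE ? childNode.textContent : '');
  steps.push(next); // remember the accumulator after this node
  return next;
}, '');

// steps     -> ["My ", "My ", "My  website"]
// finalText -> "My  website"
```

The second step leaves the result unchanged because the <blink> element is not a text node.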
What about whitespace?
As you may have noticed, the textContent property contains all the whitespace between the elements, including line breaks. If you need it trimmed, I’ll leave that to you as homework.
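If you want a head start, one possible approach is to collapse every run of whitespace into a single space and then trim the ends. The helper name tidyWhitespace is made up for this sketch, and whether collapsing is what you want depends on your use case.

```javascript
// Hypothetical helper: collapse whitespace runs, then trim both ends.
function tidyWhitespace(text) {
  return text.replace(/\s+/g, ' ').trim();
}

var tidy = tidyWhitespace('My \n   website\n'); // "My website"
```

You would apply it to the string returned by the reduce call, not to the individual nodes, so that whitespace spanning two text nodes is collapsed too.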
Thanks to Javier Márquez for publishing the blog post Learning much javascript from one line of code which gave me a great starting point.
Originally posted at https://medium.com/@roxeteer/javascript-one-liner-to-get-elements-text-content-without-its-child-nodes-8e59269d1e71