The backbone of an HTML document are tags.
According to Document Object Model (DOM), every HTML-tag is an object. Nested tags are called “children” of the enclosing one.
The text inside a tag it is an object as well.
For instance, let’s explore the DOM for this document:
<!DOCTYPE HTML> <html> <head> <title>About elks</title> </head> <body> The truth about elks. </body> </html>
The DOM represents HTML as a tree structure of tags. Here’s how it looks:
On the picture above, you can click on element nodes and their children will open/collapse.
Tags are called element nodes (or just elements). Nested tags become children of the enclosing ones. As a result we have a tree of elements:
<html> is at the root, then
<body> are its children, etc.
The text inside elements forms text nodes, labelled as
#text. A text node contains only a string. It may not have children and is always a leaf of the tree.
For instance, the
<title> tag has the text
Please note the special characters in text nodes:
- a newline:
- a space:
Spaces and newlines – are totally valid characters, they form text nodes and become a part of the DOM. So, for instance, in the example above the
<head> tag contains some spaces before
<title>, and that text becomes a
#text node (it contains a newline and some spaces only).
There are only two top-level exclusions:
- Spaces and newlines before
<head>are ignored for historical reasons,
- If we put something after
</body>, then that is automatically moved inside the
body, at the end, as the HTML spec requires that all content must be inside
<body>. So there may be no spaces after
In other cases everything’s straightforward – if there are spaces (just like any character) in the document, then they become text nodes in DOM, and if we remove them, then there won’t be any.
Here are no space-only text nodes:
<!DOCTYPE HTML> <html><head><title>About elks</title></head><body>The truth about elks.</body></html>
Browser tools (to be covered soon) that work with DOM usually do not show spaces at the start/end of the text and empty text nodes (line-breaks) between tags.
That’s because they are mainly used to decorate HTML, and do not affect how it is shown (in most cases).
On further DOM pictures we’ll sometimes omit them where they are irrelevant, to keep things short.
If the browser encounters malformed HTML, it automatically corrects it when making DOM.
For instance, the top tag is always
<html>. Even if it doesn’t exist in the document – it will exist in the DOM, the browser will create it. The same goes for
As an example, if the HTML file is a single word
"Hello", the browser will wrap it into
<body>, add the required
<head>, and the DOM will be:
While generating the DOM, browsers automatically process errors in the document, close tags and so on.
Such an “invalid” document:
<p>Hello <li>Mom <li>and <li>Dad
…Will become a normal DOM, as the browser reads tags and restores the missing parts:
An interesting “special case” is tables. By the DOM specification they must have
<tbody>, but HTML text may (officially) omit it. Then the browser creates
<tbody> in DOM automatically.
For the HTML:
DOM-structure will be:
You see? The
<tbody> appeared out of nowhere. You should keep this in mind while working with tables to avoid surprises.
Let’s add more tags and a comment to the page:
<!DOCTYPE HTML> <html> <body> The truth about elks. <ol> <li>An elk is a smart</li> <!-- comment --> <li>...and cunning animal!</li> </ol> </body> </html>
Here we see a new tree node type – comment node, labeled as
We may think – why a comment is added to the DOM? It doesn’t affect the visual representation in any way. But there’s a rule – if something’s in HTML, then it also must be in the DOM tree.
Everything in HTML, even comments, becomes a part of the DOM.
<!DOCTYPE...> directive at the very beginning of HTML is also a DOM node. It’s in the DOM tree right before
<html>. We are not going to touch that node, we even don’t draw it on diagrams for that reason, but it’s there.
document object that represents the whole document is, formally, a DOM node as well.
There are 12 node types. In practice we usually work with 4 of them:
document– the “entry point” into DOM.
- element nodes – HTML-tags, the tree building blocks.
- text nodes – contain text.
- comments – sometimes we can put the information there, it won’t be shown, but JS can read it from the DOM.
To see the DOM structure in real-time, try Live DOM Viewer. Just type in the document, and it will show up DOM at an instant.
Another way to explore the DOM is to use the browser developer tools. Actually, that’s what we use when developing.
To do so, open the web-page elks.html, turn on the browser developer tools and switch to the Elements tab.
It should look like this:
You can see the DOM, click on elements, see their details and so on.
Please note that the DOM structure in developer tools is simplified. Text nodes are shown just as text. And there are no “blank” (space only) text nodes at all. That’s fine, because most of the time we are interested in element nodes.
Clicking the button in the left-upper corner allows to choose a node from the webpage using a mouse (or other pointer devices) and “inspect” it (scroll to it in the Elements tab). This works great when we have a huge HTML page (and corresponding huge DOM) and would like to see the place of a particular element in it.
Another way to do it would be just right-clicking on a webpage and selecting “Inspect” in the context menu.
At the right part of the tools there are the following subtabs:
- Styles – we can see CSS applied to the current element rule by rule, including built-in rules (gray). Almost everything can be edited in-place, including the dimensions/margins/paddings of the box below.
- Computed – to see CSS applied to the element by property: for each property we can see a rule that gives it (including CSS inheritance and such).
- Event Listeners – to see event listeners attached to DOM elements (we’ll cover them in the next part of the tutorial).
- …and so on.
The best way to study them is to click around. Most values are editable in-place.
- Select the first
<li>in the Elements tab.
- Press Esc – it will open console right below the Elements tab.
Now the last selected element is available as
$0, the previously selected is
We can run commands on them. For instance,
$0.style.background = 'red' makes the selected list item red, like this:
From the other side, if we’re in console and have a variable referencing a DOM node, then we can use the command
inspect(node) to see it in the Elements pane.
Or we can just output it in the console and explore “at-place”, like
The browser developer tools are a great help in development: we can explore the DOM, try things and see what goes wrong.
An HTML/XML document is represented inside the browser as the DOM tree.
- Tags become element nodes and form the structure.
- Text becomes text nodes.
- …etc, everything in HTML has its place in DOM, even comments.
We can use developer tools to inspect DOM and modify it manually.
Here we covered the basics, the most used and important actions to start with. There’s an extensive documentation about Chrome Developer Tools at https://developers.google.com/web/tools/chrome-devtools. The best way to learn the tools is to click here and there, read menus: most options are obvious. Later, when you know them in general, read the docs and pick up the rest.
DOM nodes have properties and methods that allow to travel between them, modify, move around the page and more. We’ll get down to them in the next chapters.