Sets and ranges [...]

Several characters or character classes inside square brackets […] mean “search for any one of the given characters”.

Sets

For instance, [eao] means any of the 3 characters: 'a', 'e', or 'o'.

That’s called a set. Sets can be used in a regexp along with regular characters:

// find [t or m], and then "op"
alert( "Mop top".match(/[tm]op/gi) ); // "Mop", "top"

Please note that although there are multiple characters in the set, they correspond to exactly one character in the match.

So the example below gives no matches:

// find "V", then [o or i], then "la"
alert( "Voila".match(/V[oi]la/) ); // null, no matches

The pattern looks for:

  • V,
  • then one of the letters [oi],
  • then la.

So there would be a match for Vola or Vila.
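To check this, here is a small sketch that runs the same pattern against three words. The names `words` and `setResults` are made up for this illustration, and console.log is used instead of alert so that the snippet also runs outside a browser.

```javascript
// A set stands for exactly one character in the match.
const words = ["Vola", "Vila", "Voila"];

// test() returns true if the pattern is found anywhere in the string
const setResults = words.map(word => /V[oi]la/.test(word));

console.log(setResults); // [true, true, false]
```

"Voila" fails because both o and i would have to fit into the single position that [oi] describes.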

Ranges

Square brackets may also contain character ranges.

For instance, [a-z] is a character in the range from a to z, and [0-5] is a digit from 0 to 5.

In the example below we’re searching for "x" followed by two digits or letters from A to F:

alert( "Exception 0xAF".match(/x[0-9A-F][0-9A-F]/g) ); // xAF

Please note that in the word Exception there’s a substring xce. It didn’t match the pattern, because the letters are lowercase, while in the set [0-9A-F] they are uppercase.

If we want to find it too, then we can add the range a-f: [0-9A-Fa-f]. Alternatively, the i flag makes the whole pattern case-insensitive, which allows lowercase too.
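Both options can be compared side by side. This is only a sketch: the variable names `hexText`, `byRanges` and `byFlag` are chosen for the example, and console.log stands in for alert.

```javascript
const hexText = "Exception 0xAF";

// Option 1: list both the uppercase and the lowercase range explicitly
const byRanges = hexText.match(/x[0-9A-Fa-f][0-9A-Fa-f]/g);

// Option 2: keep only the uppercase range and add the i flag
const byFlag = hexText.match(/x[0-9A-F][0-9A-F]/gi);

console.log(byRanges); // ["xce", "xAF"]
console.log(byFlag);   // ["xce", "xAF"]
```

The two patterns agree on this string, but they are not identical in general: with the i flag the literal x would also match an uppercase X.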

Character classes are shorthands for certain character sets.

For instance:

  • \d – is the same as [0-9],
  • \w – is the same as [a-zA-Z0-9_],
  • \s – is the same as [\t\n\v\f\r ] plus a few other Unicode space characters.

We can use character classes inside […] as well.

For instance, suppose we want to match all word characters or a dash, for words like “twenty-third”. We can’t do it with \w+, because the \w class does not include a dash. But we can use [\w-].
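A short sketch of the difference, using a sample phrase invented for this illustration (the names `phrase`, `withoutDash` and `withDash` are not from the tutorial):

```javascript
const phrase = "The twenty-third floor";

// \w+ stops at the dash, so the hyphenated word is split in two
const withoutDash = phrase.match(/\w+/g);

// [\w-]+ treats the dash as part of the word
const withDash = phrase.match(/[\w-]+/g);

console.log(withoutDash); // ["The", "twenty", "third", "floor"]
console.log(withDash);    // ["The", "twenty-third", "floor"]
```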

We can also use a combination of classes to cover every possible character, like [\s\S]. That matches spaces or non-spaces, that is, any character. That’s wider than a dot ".", because the dot matches any character except a newline.
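The difference shows up as soon as the text contains a newline. A minimal sketch, with the string `twoLines` made up for the example and no flags on either pattern:

```javascript
const twoLines = "A\nB";

// The dot does not match the newline between A and B
const dotMatch = twoLines.match(/A.B/);

// [\s\S] matches any character, including the newline
const anyMatch = twoLines.match(/A[\s\S]B/);

console.log(dotMatch);                // null
console.log(anyMatch[0] === "A\nB");  // true
```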

Excluding ranges

Besides normal ranges, there are “excluding” ranges that look like [^…].

They are denoted by a caret character ^ at the start and match any character except the given ones.

For instance:

  • [^aeyo] – any character except 'a', 'e', 'y' or 'o'.
  • [^0-9] – any character except a digit, the same as \D.
  • [^\s] – any non-space character, same as \S.
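The equivalences in the list can be checked directly. The sketch below uses a sample string invented for this purpose; all variable names are illustrative.

```javascript
const mixed = "Room 42b";

// [^0-9] and \D pick the same characters: everything except digits
const nonDigitsSet = mixed.match(/[^0-9]/g).join("");
const nonDigitsClass = mixed.match(/\D/g).join("");

// [^\s] and \S pick the same characters: everything except spaces
const nonSpacesSet = mixed.match(/[^\s]/g).join("");
const nonSpacesClass = mixed.match(/\S/g).join("");

console.log(nonDigitsSet, "|", nonDigitsClass); // Room b | Room b
console.log(nonSpacesSet, "|", nonSpacesClass); // Room42b | Room42b
```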

The example below looks for any characters except letters, digits and spaces:

alert( "alice15@gmail.com".match(/[^\d\sA-Z]/gi) ); // @ and .

No escaping in […]

Usually, when we want to find a literal dot, we need to escape it as \. and when we need a literal backslash, we use \\.

In square brackets the vast majority of special characters can be used without escaping:

  • A dot '.'.
  • A plus '+'.
  • Parentheses '( )'.
  • A dash '-' at the beginning or the end (where it does not define a range).
  • A caret '^' if not at the beginning (where it means exclusion).
  • And the opening square bracket '['.

In other words, all special characters are allowed except where they mean something for square brackets.

A dot "." inside square brackets means just a dot. The pattern [.,] would look for one of the characters: either a dot or a comma.

In the example below the regexp [-().^+] looks for one of the characters -().^+:

// No need to escape
let reg = /[-().^+]/g;

alert( "1 + 2 - 3".match(reg) ); // Matches +, -

…But if you decide to escape them “just in case”, then there would be no harm:

// Escaped everything
let reg = /[\-\(\)\.\^\+]/g;

alert( "1 + 2 - 3".match(reg) ); // also works: +, -
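As noted earlier in this section, the dash and the caret are special inside square brackets only in certain positions. The following sketch contrasts those positions on a sample string made up for the illustration; the variable names are likewise illustrative.

```javascript
const sample = "a-c ^ b";

// Dash between two characters: the range from a to c
const dashRange = sample.match(/[a-c]/g);

// Dash at the end: a literal dash
const dashLiteral = sample.match(/[ac-]/g);

// Caret at the start: exclusion (here: not a, b, c or a space)
const caretExclude = sample.match(/[^a-c ]/g);

// Caret anywhere else: a literal caret
const caretLiteral = sample.match(/[a^]/g);

console.log(dashRange);    // ["a", "c", "b"]
console.log(dashLiteral);  // ["a", "-", "c"]
console.log(caretExclude); // ["-", "^"]
console.log(caretLiteral); // ["a", "^"]
```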

Tasks

We have a regexp /Java[^script]/.

Does it match anything in the string Java? In the string JavaScript?

Answers: no, yes.

  • In the string Java it doesn’t match anything, because [^script] means “any character except the given ones”. So the regexp looks for "Java" followed by one such character, but the string ends there, with no characters after it.

    alert( "Java".match(/Java[^script]/) ); // null
  • Yes. The regexp is case-sensitive (there is no i flag), so the uppercase "S" is not among the excluded lowercase letters, and the [^script] part matches it.

    alert( "JavaScript".match(/Java[^script]/) ); // "JavaS"
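
To see how much the answer depends on case, the same pattern can be run with and without the i flag. This is a sketch added for comparison; the variable names are illustrative and console.log replaces alert.

```javascript
// Without the i flag, "S" differs from "s", so [^script] accepts it
const caseSensitive = "JavaScript".match(/Java[^script]/);

// With the i flag, "S" counts as "s" and is excluded by the set
const caseInsensitive = "JavaScript".match(/Java[^script]/i);

console.log(caseSensitive[0]); // "JavaS"
console.log(caseInsensitive);  // null
```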

The time can be in the format hours:minutes or hours-minutes. Both hours and minutes have 2 digits: 09:00 or 21-30.

Write a regexp to find time:

let reg = /your regexp/g;
alert( "Breakfast at 09:00. Dinner at 21-30".match(reg) ); // 09:00, 21-30

P.S. In this task we assume that the time is always correct, so there’s no need to filter out bad strings like “45:67”. Later we’ll deal with that too.

Answer: \d\d[-:]\d\d.

let reg = /\d\d[-:]\d\d/g;
alert( "Breakfast at 09:00. Dinner at 21-30".match(reg) ); // 09:00, 21-30

Please note that the dash '-' has a special meaning in square brackets, but only between other characters, not when it’s at the beginning or at the end, so we don’t need to escape it.
