userver: utils::regex Class Reference
utils::regex Class Reference [final]

#include <userver/utils/regex.hpp>

Detailed Description

A drop-in replacement for std::regex without huge includes and with better performance characteristics.

utils::regex is currently implemented using re2.

See also
utils::regex_match
utils::regex_search
utils::regex_replace

See the re2 documentation for the limitations of the re2 engine. Notably, it does not support:

  1. lookahead and lookbehind;
  2. repetition counts over 1000 (regexes with large repetition counts also consume more memory);
  3. spaces inside quantifiers, such as \w{1, 5};
  4. possessive quantifiers.
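
The limitations listed above can be screened for before a pattern reaches utils::regex. The following is a minimal sketch of such a pre-check, assuming a purely textual scan is good enough for the patterns at hand. FindUnsupportedConstruct is a hypothetical helper, not part of userver or re2; it is a heuristic rather than a regex parser, so it can report false positives (for example, a literal '}' followed by '+').

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper, not part of userver or re2. A rough textual pre-check
// for the constructs listed above: it is a heuristic, not a regex parser.
// Returns a description of the first suspicious construct, or an empty view.
constexpr std::string_view FindUnsupportedConstruct(std::string_view pattern) {
    constexpr std::string_view kLookarounds[] = {"(?=", "(?!", "(?<=", "(?<!"};
    bool in_class = false;  // true while inside a [...] character class

    for (std::size_t i = 0; i < pattern.size(); ++i) {
        const char c = pattern[i];
        if (c == '\\') {
            ++i;  // skip the escaped character
            continue;
        }
        if (in_class) {
            if (c == ']') in_class = false;
            continue;
        }
        if (c == '[') {
            in_class = true;
            // A ']' directly after '[' or '[^' is a literal, so step over it.
            if (i + 1 < pattern.size() && pattern[i + 1] == '^') ++i;
            if (i + 1 < pattern.size() && pattern[i + 1] == ']') ++i;
            continue;
        }

        const std::string_view rest = pattern.substr(i);
        for (const std::string_view lookaround : kLookarounds) {
            if (rest.substr(0, lookaround.size()) == lookaround) {
                return "lookahead or lookbehind";
            }
        }

        if (c == '{') {
            const std::size_t close = pattern.find('}', i);
            if (close == std::string_view::npos) continue;
            bool quantifier_like = true;
            bool has_digit = false;
            bool has_space = false;
            bool too_large = false;
            unsigned count = 0;
            // Inspect the text between the braces: digits, commas and spaces only.
            for (const char b : pattern.substr(i + 1, close - i - 1)) {
                if (b >= '0' && b <= '9') {
                    has_digit = true;
                    if (count <= 1000) count = count * 10 + static_cast<unsigned>(b - '0');
                    if (count > 1000) too_large = true;
                } else if (b == ',') {
                    count = 0;  // start reading the upper bound
                } else if (b == ' ') {
                    has_space = true;
                } else {
                    quantifier_like = false;
                    break;
                }
            }
            if (quantifier_like && has_digit) {
                if (has_space) return "space inside a quantifier";
                if (too_large) return "repetition count over 1000";
            }
            continue;
        }

        // A quantifier immediately followed by '+' looks possessive.
        const bool is_quantifier = c == '*' || c == '+' || c == '?' || c == '}';
        if (is_quantifier && i + 1 < pattern.size() && pattern[i + 1] == '+') {
            return "possessive quantifier";
        }
    }
    return {};
}
```

Because the function is constexpr, patterns that are compile-time constants can be checked with static_assert. Such a check complements, rather than replaces, the utils::InvalidRegex exception that the constructor throws for invalid patterns.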

An example of complex string parsing using utils::regex

// An example of complex regex parsing using 'prefix' and 'suffix' methods.
// Suppose that we want to split a text into words and also check that
// the first letter of each sentence is capitalized.
std::vector<std::string_view> SplitTextIntoWords(const std::string_view text) {
    static const utils::regex word_regex("[a-zA-Z]+");
    static const utils::regex punctuation_regex("[., ]*");
    static const utils::regex capitalized_word_start_regex("^[A-Z]");

    std::vector<std::string_view> words;
    utils::match_results word_match;
    auto remaining = text;

    while (utils::regex_search(remaining, word_match, word_regex)) {
        // Everything between the previous word and this one must be punctuation.
        const auto punctuation = word_match.prefix();
        if (!utils::regex_match(punctuation, punctuation_regex)) {
            throw std::invalid_argument(fmt::format("Invalid characters '{}'", punctuation));
        }

        const auto word = word_match[0];
        // The first word of the text and the first word after a '.' start a sentence.
        const bool should_be_capitalized = words.empty() || punctuation.find('.') != std::string_view::npos;
        if (should_be_capitalized && !utils::regex_search(word, capitalized_word_start_regex)) {
            throw std::invalid_argument(fmt::format("Word '{}' should be capitalized", word));
        }

        words.push_back(word);
        remaining = word_match.suffix();
    }

    // Whatever follows the last word must also be punctuation.
    if (!utils::regex_match(remaining, punctuation_regex)) {
        throw std::invalid_argument(fmt::format("Invalid characters '{}'", remaining));
    }
    return words;
}

TEST(Regex, SplitTextIntoWords) {
    EXPECT_THAT(
        SplitTextIntoWords("Foo bar. Baz, qux quux."), testing::ElementsAre("Foo", "bar", "Baz", "qux", "quux")
    );
    UEXPECT_THROW_MSG(SplitTextIntoWords("Foo + bar"), std::invalid_argument, "Invalid characters ' + '");
    UEXPECT_THROW_MSG(SplitTextIntoWords("Foo bar. baz."), std::invalid_argument, "Word 'baz' should be capitalized");
    UEXPECT_THROW_MSG(SplitTextIntoWords("Foo, bar% "), std::invalid_argument, "Invalid characters '% '");
}

Definition at line 44 of file regex.hpp.

Public Member Functions

 regex ()
 Constructs a null regex; any use other than copy or move is undefined behavior.
 
 regex (std::string_view pattern)
 Compiles regex from pattern, always valid on construction.
 
 regex (const regex &)
 
 regex (regex &&) noexcept
 
regex & operator= (const regex &)
 
regex & operator= (regex &&) noexcept
 
bool operator== (const regex &) const
 
std::string_view GetPatternView () const
 
std::string str () const
 

Constructor & Destructor Documentation

◆ regex()

utils::regex::regex ( std::string_view pattern)
explicit

Compiles regex from pattern, always valid on construction.

Exceptions
utils::InvalidRegex if pattern is invalid

Member Function Documentation

◆ GetPatternView()

std::string_view utils::regex::GetPatternView ( ) const
Returns
a view to the original pattern stored inside.

◆ operator==()

bool utils::regex::operator== ( const regex & ) const
Returns
true if the patterns are equal.
Note
May also return true if the patterns are not equal, but are equivalent.

◆ str()

std::string utils::regex::str ( ) const
Returns
the original pattern.

Friends And Related Symbol Documentation

◆ match_results

Definition at line 73 of file regex.hpp.

◆ regex_match

bool regex_match ( std::string_view str,
match_results & m,
const regex & pattern )
friend

Returns true if the specified regular expression matches the whole of the input. Fills in what matched in m.

Note
m may be clobbered on failure.

◆ regex_replace [1/2]

std::string regex_replace ( std::string_view str,
const regex & pattern,
Re2Replacement repl )
friend

This is an overloaded function, provided for convenience. It differs from the std::string_view overload of regex_replace only in the argument(s) it accepts.

See also
utils::Re2Replacement

◆ regex_replace [2/2]

std::string regex_replace ( std::string_view str,
const regex & pattern,
std::string_view repl )
friend

Creates a new string in which all matches of the regular expression are replaced with repl.

Interprets repl as a literal string; substitutions are not supported.

See also
utils::Re2Replacement

◆ regex_search

bool regex_search ( std::string_view str,
match_results & m,
const regex & pattern )
friend

Determines whether the regular expression matches anywhere in the target character sequence. Fills in what matched in m.

Note
m may be clobbered on failure.
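
Since m may be clobbered on failure (the same note applies to regex_match), code that wants to keep the result of an earlier successful search can run the new search into a scratch object and commit it only on success. The sketch below shows one way to do that. SearchPreserving is a hypothetical helper, not part of userver, and it assumes that the match type is default-constructible and assignable.

```cpp
#include <utility>

// Hypothetical helper, not part of userver. `Match` is assumed to be
// default-constructible and assignable (an assumption about
// utils::match_results); `search` is any callable that takes Match&
// and returns bool.
template <typename Match, typename Search>
bool SearchPreserving(Match& out, Search&& search) {
    Match scratch{};  // the search is free to clobber this object
    if (!std::forward<Search>(search)(scratch)) {
        return false;  // `out` still holds its previous value
    }
    out = std::move(scratch);
    return true;
}

// Intended use with userver (shown as a comment; not compiled here):
//   utils::match_results last_good;
//   SearchPreserving(last_good, [&](utils::match_results& m) {
//       return utils::regex_search(text, m, pattern);
//   });
```

The extra object costs one construction and one assignment per call, so this is only worth doing when the previous match is actually needed after a failed search.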

The documentation for this class was generated from the following file: