userver: Pre-caching data from HTTP remote
Pre-caching data from HTTP remote

Before you start

Make sure that you can compile and run the core tests, and that you have read the basic example Writing your first HTTP server.

Step by step guide

Consider the case when you need to write an HTTP handler that greets a user. However, the localized strings are stored on a remote HTTP server, and those localizations rarely change. To reduce network traffic and improve response times, it pays to retrieve the translations from the remote in bulk and keep them in a local cache.

In this sample we show how to write and test a cache, and how to make HTTP requests. If you wish to cache data from a database, prefer the cache specializations for DB.

Remote HTTP server description

For the purposes of this sample, assume there is an HTTP server with a handler at http://localhost:8090/v1/translations.

Handler returns all the available translations as JSON on GET:

bash
$ curl http://localhost:8090/v1/translations -s | jq
{
  "content": {
    "hello": {
      "ru": "Привет",
      "en": "Hello"
    },
    "welcome": {
      "ru": "Добро пожаловать",
      "en": "Welcome"
    }
  },
  "update_time": "2021-11-01T12:00:00Z"
}

When the "last_update" query parameter is passed, the handler returns only the translations that changed since the requested time:

bash
$ curl http://localhost:8090/v1/translations?last_update=2021-11-01T12:00:00Z -s | jq
{
  "content": {
    "hello": {
      "ru": "Приветище"
    }
  },
  "update_time": "2021-12-01T12:00:00Z"
}

We are planning to cache those translations in a std::unordered_map:

namespace samples::http_cache {

struct KeyLang {
  std::string key;
  std::string language;
};

struct KeyLangEq {
  bool operator()(const KeyLang& x, const KeyLang& y) const noexcept;
};

struct KeyLangHash {
  // A hasher returns std::size_t; a bool result would leave only two buckets.
  std::size_t operator()(const KeyLang& x) const noexcept;
};

using KeyLangToTranslation =
    std::unordered_map<KeyLang, std::string, KeyLangHash, KeyLangEq>;

}  // namespace samples::http_cache
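The comparison and hash operators are only declared above. A minimal standalone sketch of how they could be defined is shown below. It assumes a Boost-style hash combination, which is our choice for illustration and not necessarily what the sample uses. The `Lookup` helper is hypothetical and exists only for the usage example.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

struct KeyLang {
  std::string key;
  std::string language;
};

struct KeyLangEq {
  bool operator()(const KeyLang& x, const KeyLang& y) const noexcept {
    return x.key == y.key && x.language == y.language;
  }
};

struct KeyLangHash {
  // Returns std::size_t so that keys can spread over many buckets.
  std::size_t operator()(const KeyLang& x) const noexcept {
    const std::size_t h1 = std::hash<std::string>{}(x.key);
    const std::size_t h2 = std::hash<std::string>{}(x.language);
    // Boost-style hash_combine (an assumption of this sketch).
    return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));
  }
};

using KeyLangToTranslation =
    std::unordered_map<KeyLang, std::string, KeyLangHash, KeyLangEq>;

// Hypothetical helper: returns the translation or an empty string.
std::string Lookup(const KeyLangToTranslation& map, const std::string& key,
                   const std::string& lang) {
  const auto it = map.find(KeyLang{key, lang});
  return it == map.end() ? std::string{} : it->second;
}
```

With these definitions, `Lookup(map, "hello", "en")` finds an entry inserted as `KeyLang{"hello", "en"}`, while a lookup for a language that was never inserted yields an empty string.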

Cache component

Our cache component should have the following fields:

namespace samples::http_cache {

class HttpCachedTranslations final
    : public components::CachingComponentBase<KeyLangToTranslation> {
 public:
  static constexpr std::string_view kName = "cache-http-translations";

  HttpCachedTranslations(const components::ComponentConfig& config,
                         const components::ComponentContext& context);
  ~HttpCachedTranslations() override;

  void Update(
      cache::UpdateType type,
      [[maybe_unused]] const std::chrono::system_clock::time_point& last_update,
      [[maybe_unused]] const std::chrono::system_clock::time_point& now,
      cache::UpdateStatisticsScope& stats_scope) override;

  static yaml_config::Schema GetStaticConfigSchema();

 private:
  clients::http::Client& http_client_;
  const std::string translations_url_;
  std::string last_update_remote_;

  formats::json::Value GetAllData() const;
  formats::json::Value GetUpdatedData() const;
  void MergeAndSetData(KeyLangToTranslation&& content,
                       formats::json::Value json,
                       cache::UpdateStatisticsScope& stats_scope);
};

}  // namespace samples::http_cache

To create a non-LRU cache you have to derive from components::CachingComponentBase, call CacheUpdateTrait::StartPeriodicUpdates() in the component constructor and CacheUpdateTrait::StopPeriodicUpdates() in the destructor:

HttpCachedTranslations::HttpCachedTranslations(
    const components::ComponentConfig& config,
    const components::ComponentContext& context)
    : CachingComponentBase(config, context),
      http_client_(
          context.FindComponent<components::HttpClient>().GetHttpClient()),
      translations_url_(config["translations-url"].As<std::string>()) {
  CacheUpdateTrait::StartPeriodicUpdates();
}

HttpCachedTranslations::~HttpCachedTranslations() {
  CacheUpdateTrait::StopPeriodicUpdates();
}

The constructor initializes component fields with data from static configuration and with references to clients.

Update

Depending on the cache configuration settings, the overridden Update function is called periodically with different update types:

void HttpCachedTranslations::Update(
    cache::UpdateType type,
    [[maybe_unused]] const std::chrono::system_clock::time_point& last_update,
    [[maybe_unused]] const std::chrono::system_clock::time_point& now,
    cache::UpdateStatisticsScope& stats_scope) {
  formats::json::Value json;
  switch (type) {
    case cache::UpdateType::kFull:
      json = GetAllData();
      break;
    case cache::UpdateType::kIncremental:
      json = GetUpdatedData();
      break;
    default:
      UASSERT(false);
  }

  if (json.IsEmpty()) {
    stats_scope.FinishNoChanges();
    return;
  }

  KeyLangToTranslation content;
  if (type == cache::UpdateType::kIncremental) {
    const auto snapshot = Get();  // smart pointer to a shared cache data
    content = *snapshot;          // copying the shared data
  }

  MergeAndSetData(std::move(content), json, stats_scope);
}

In this sample full and incremental data retrieval is implemented in GetAllData() and GetUpdatedData() respectively.

Both functions return data in the same format and we parse both responses using MergeAndSetData:

void HttpCachedTranslations::MergeAndSetData(
    KeyLangToTranslation&& content, formats::json::Value json,
    cache::UpdateStatisticsScope& stats_scope) {
  for (const auto& [key, value] : Items(json["content"])) {
    for (const auto& [lang, text] : Items(value)) {
      content.insert_or_assign(KeyLang{key, lang}, text.As<std::string>());
    }
    stats_scope.IncreaseDocumentsReadCount(value.GetSize());
  }

  auto update_time = json["update_time"].As<std::string>();
  const auto size = content.size();
  Set(std::move(content));
  last_update_remote_ = std::move(update_time);
  stats_scope.Finish(size);
}

At the end of the MergeAndSetData function the components::CachingComponentBase::Set() invocation stores data as a new cache.
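To see the copy, merge and publish sequence in isolation, here is a minimal standalone sketch. It does not use userver: `TranslationsCacheSketch`, the `Content` alias (a stand-in for the parsed JSON "content" object) and the "key/lang" string keys are all assumptions made for brevity. The sketch is single-threaded, whereas the real component publishes the new data in a concurrency-safe way.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Stand-in for the parsed "content" object: key -> (language -> text).
using Content = std::map<std::string, std::map<std::string, std::string>>;
// Flattened cache data; "key/lang" strings replace KeyLang for brevity.
using Translations = std::map<std::string, std::string>;

enum class UpdateType { kFull, kIncremental };

class TranslationsCacheSketch {
 public:
  // Readers get an immutable snapshot that stays valid across later updates.
  std::shared_ptr<const Translations> Get() const { return data_; }

  // Returns the number of (key, language) pairs read from `content`.
  std::size_t Update(UpdateType type, const Content& content) {
    Translations merged;
    if (type == UpdateType::kIncremental) {
      merged = *data_;  // start from a copy of the current snapshot
    }
    std::size_t documents_read = 0;
    for (const auto& [key, by_lang] : content) {
      for (const auto& [lang, text] : by_lang) {
        merged.insert_or_assign(key + "/" + lang, text);
      }
      documents_read += by_lang.size();
    }
    // Publish the new data; snapshots already held by readers are untouched.
    // NOTE: not thread-safe, this sketch assumes a single thread.
    data_ = std::make_shared<const Translations>(std::move(merged));
    return documents_read;
  }

 private:
  std::shared_ptr<const Translations> data_ =
      std::make_shared<const Translations>();
};
```

A full update followed by an incremental one that carries only `hello/ru` leaves the other entries intact, and a snapshot taken before the incremental update still sees the old text.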

The update time reported by the remote is stored in last_update_remote_. Clocks on different hosts are usually out of sync, so it is better to store the remote time if possible, rather than using the local times from the last_update and now input parameters.
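The effect of clock skew can be shown with a small standalone sketch. The `Change` struct and the `ChangedSince` function are hypothetical stand-ins for the remote handler, and we assume the handler returns changes strictly newer than the requested time. Times are plain seconds on the remote clock.

```cpp
#include <string>
#include <vector>

// A change recorded by the remote, stamped with the REMOTE clock (seconds).
struct Change {
  long remote_time;
  std::string key;
};

// Stand-in for the handler: returns changes strictly newer than `since`.
std::vector<Change> ChangedSince(const std::vector<Change>& log, long since) {
  std::vector<Change> result;
  for (const auto& change : log) {
    if (change.remote_time > since) result.push_back(change);
  }
  return result;
}
```

Suppose the previous update happened when the remote clock showed 1000 and the local clock, running five minutes ahead, showed 1300. If "hello" then changes at remote time 1030, `ChangedSince(log, 1000)` returns that change, while `ChangedSince(log, 1300)` returns nothing and the change is missed until the next full update.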

Data retrieval

To make an HTTP request, call clients::http::Client::CreateRequest() to get an instance of the clients::http::Request builder. Working with the builder is straightforward:

formats::json::Value HttpCachedTranslations::GetAllData() const {
  auto response = http_client_.CreateRequest()
                      .get(translations_url_)  // HTTP GET translations_url_ URL
                      .retry(2)                // retry once in case of error
                      .timeout(std::chrono::milliseconds{500})
                      .perform();  // start performing the request
  response->raise_for_status();
  return formats::json::FromString(response->body_view());
}
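The comment next to retry(2) says "retry once", which suggests that the argument counts the total number of attempts rather than extra retries. A minimal standalone sketch of that counting convention follows. `PerformWithAttempts` is a hypothetical helper written for illustration; it is not how the HTTP client implements retries and it ignores timeouts and backoff.

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical helper: calls `request` up to `max_attempts` times and returns
// the first successful result; rethrows the last error if all attempts fail.
std::string PerformWithAttempts(int max_attempts,
                                const std::function<std::string()>& request) {
  if (max_attempts < 1) {
    throw std::invalid_argument("max_attempts must be at least 1");
  }
  for (int attempt = 1;; ++attempt) {
    try {
      return request();
    } catch (const std::exception&) {
      if (attempt == max_attempts) throw;  // out of attempts
    }
  }
}
```

With `max_attempts` equal to 2, a request that fails once and then succeeds is called exactly twice, and a request that always fails is also called exactly twice before the error propagates.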

The HTTP request for an incremental update differs only in the URL, which carries the last_update query parameter:

formats::json::Value HttpCachedTranslations::GetUpdatedData() const {
  const auto url =
      http::MakeUrl(translations_url_, {{"last_update", last_update_remote_}});
  auto response = http_client_.CreateRequest()
                      .get(url)
                      .retry(2)
                      .timeout(std::chrono::milliseconds{500})
                      .perform();
  response->raise_for_status();
  return formats::json::FromString(response->body_view());
}
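To illustrate what building such a URL involves, here is a standalone sketch that appends percent-encoded query parameters to a base URL. `UrlEncodeSketch` and `MakeUrlSketch` are hypothetical names for this illustration. The exact encoding performed by http::MakeUrl may differ, and the sketch assumes the base URL has no query string yet.

```cpp
#include <cctype>
#include <string>
#include <utility>
#include <vector>

// Percent-encodes every byte except RFC 3986 unreserved characters.
std::string UrlEncodeSketch(const std::string& value) {
  static constexpr char kHex[] = "0123456789ABCDEF";
  std::string out;
  for (const unsigned char c : value) {
    if (std::isalnum(c) || c == '-' || c == '_' || c == '.' || c == '~') {
      out += static_cast<char>(c);
    } else {
      out += '%';
      out += kHex[c >> 4];
      out += kHex[c & 0x0F];
    }
  }
  return out;
}

// Appends `args` to `base` as a query string: ?name=value&name2=value2
std::string MakeUrlSketch(
    const std::string& base,
    const std::vector<std::pair<std::string, std::string>>& args) {
  std::string url = base;
  char separator = '?';
  for (const auto& [name, value] : args) {
    url += separator;
    url += UrlEncodeSketch(name);
    url += '=';
    url += UrlEncodeSketch(value);
    separator = '&';
  }
  return url;
}
```

In this sketch the colons of a timestamp such as 2021-11-01T12:00:00Z are encoded as %3A, so the value cannot be confused with URL syntax.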

Static configuration

To configure the new cache component, provide its own options along with the options inherited from components::CachingComponentBase:

# yaml
components_manager:
    components:                       # Configuring components that were registered via component_list
        cache-http-translations:
            translations-url: 'mockserver/v1/translations'  # Some other microservice listens on this URL
            update-types: full-and-incremental
            full-update-interval: 1h
            update-interval: 15m
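How these intervals interact can be sketched with a simplified model: an update fires every update-interval, and it is a full one when no full update has happened yet or when full-update-interval has passed since the last full update. `UpdateTypeChooser` is a hypothetical class for illustration. The real scheduler also handles details such as jitter and failed updates, which are ignored here.

```cpp
#include <chrono>
#include <optional>

enum class UpdateType { kFull, kIncremental };

// Simplified model of `update-types: full-and-incremental`.
class UpdateTypeChooser {
 public:
  explicit UpdateTypeChooser(std::chrono::minutes full_update_interval)
      : full_update_interval_(full_update_interval) {}

  // `now` is the time of the current tick, measured from service start.
  UpdateType Next(std::chrono::minutes now) {
    if (!last_full_ || now - *last_full_ >= full_update_interval_) {
      last_full_ = now;  // remember when the last full update happened
      return UpdateType::kFull;
    }
    return UpdateType::kIncremental;
  }

 private:
  std::chrono::minutes full_update_interval_;
  std::optional<std::chrono::minutes> last_full_;
};
```

Under this model, with update-interval of 15m and full-update-interval of 1h, the ticks at 0 and 60 minutes are full updates and the ticks at 15, 30 and 45 minutes are incremental.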

Options for dependent components components::HttpClient, components::TestsuiteSupport and support handler server::handlers::TestsControl should be provided:

        http-client:
            fs-task-processor: fs-task-processor
        dns-client:                     # Asynchronous DNS component
            fs-task-processor: fs-task-processor
        testsuite-support:
        tests-control:
            # Some options from server::handlers::HttpHandlerBase
            path: /tests/{action}
            method: POST
            task_processor: main-task-processor
        server:
            # ...

Dynamic configuration

Dynamic configuration is close to the basic configuration from Writing your first HTTP server but should have additional options for HTTP client: samples/http_caching/dynamic_config_fallback.json

All the values are described in Dynamic configs.

A production ready service would dynamically retrieve the above options at runtime from a configuration service. See Writing your own configs server for insights on how to change the above options on the fly, without restarting the service.

Cache component usage

Now the cache can be used just like any other component. For example, a handler can get a reference to the cache and use it in HandleRequestThrow:

class GreetUser final : public server::handlers::HttpHandlerBase {
 public:
  static constexpr std::string_view kName = "handler-greet-user";

  GreetUser(const components::ComponentConfig& config,
            const components::ComponentContext& context)
      : HttpHandlerBase(config, context),
        cache_(context.FindComponent<HttpCachedTranslations>()) {}

  std::string HandleRequestThrow(
      const server::http::HttpRequest& request,
      server::request::RequestContext&) const override {
    const auto cache_snapshot = cache_.Get();
    using samples::http_cache::KeyLang;
    const auto& hello = cache_snapshot->at(KeyLang{"hello", "ru"});
    const auto& welcome = cache_snapshot->at(KeyLang{"welcome", "ru"});
    return fmt::format("{}, {}! {}", hello, request.GetArg("username"),
                       welcome);
  }

 private:
  samples::http_cache::HttpCachedTranslations& cache_;
};

Note that the cache is concurrency-safe, like all components.
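The handler above uses at(), which throws std::out_of_range when a translation is missing. If that is undesirable, a lookup with a fallback language is one possible alternative. The sketch below is standalone: `Translate`, the `Translations` alias (a stand-in for KeyLangToTranslation) and the default fallback "en" are assumptions for illustration, not part of the sample.

```cpp
#include <initializer_list>
#include <map>
#include <string>
#include <utility>

// Stand-in for KeyLangToTranslation: (key, language) -> text.
using Translations =
    std::map<std::pair<std::string, std::string>, std::string>;

// Hypothetical helper: tries `lang`, then `fallback_lang`, and finally returns
// `key` itself, so the caller never has to handle std::out_of_range.
std::string Translate(const Translations& translations, const std::string& key,
                      const std::string& lang,
                      const std::string& fallback_lang = "en") {
  for (const auto& candidate : {lang, fallback_lang}) {
    const auto it = translations.find({key, candidate});
    if (it != translations.end()) return it->second;
  }
  return key;
}
```

Returning the key itself keeps the response usable when a translation is absent in every language; whether that is preferable to an error depends on the service.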

int main()

Finally, after writing the dynamic configuration values to the file at dynamic-config-fallbacks.fallback-path, we add our components to the components::MinimalServerComponentList() and start the server with the static configuration file passed on the command line.

int main(int argc, char* argv[]) {
  const auto component_list =
      components::MinimalServerComponentList()
          .Append<samples::http_cache::HttpCachedTranslations>()
          .Append<samples::http_cache::GreetUser>()
          .Append<components::TestsuiteSupport>()
          .Append<server::handlers::TestsControl>()
          .Append<clients::dns::Component>()
          .Append<components::HttpClient>();
  return utils::DaemonMain(argc, argv, component_list);
}

Build and Run

To build the sample, execute the following build steps at the userver root directory:

mkdir build_release
cd build_release
cmake -DCMAKE_BUILD_TYPE=Release ..
make userver-samples-http_caching

Note that the sample needs a running translations service with bulk handlers in order to start. You can use the mongo service for that purpose.

The sample can be started by running make start-userver-samples-http_caching. The command invokes the testsuite start target, which sets proper paths in the configuration files and starts the service.

To start the service manually run ./samples/http_caching/userver-samples-http_caching -c </path/to/static_config.yaml> (do not forget to prepare the configuration files!).

Now you can send a request to your server from another terminal:

bash
$ curl -X POST http://localhost:8089/samples/greet?username=Dear+Developer -i
HTTP/1.1 200 OK
Date: Thu, 09 Dec 2021 17:01:44 UTC
Content-Type: text/html; charset=utf-8
X-YaRequestId: 94193f99ebf94eb58252139f2e9dace4
X-YaSpanId: 4e17e02dfa7b8322
X-YaTraceId: 306d2d54fd0543c09376a5c4bb120a88
Server: userver/1.0.0 (20211209085954; rv:d05d059a3)
Connection: keep-alive
Content-Length: 61

Привет, Dear Developer! Добро пожаловать

Auto testing

The server::handlers::TestsControl and components::TestsuiteSupport components among other things provide necessary control over the caches. You can stop/start periodic updates or even force updates of caches by HTTP requests to the server::handlers::TestsControl handler.

For example, the following JSON request body forces an incremental update of the cache-http-translations cache:

{"invalidate_caches": {
    "update_type": "incremental",
    "names": ["cache-http-translations"]
}}

Fortunately, the testsuite API provides all the required functionality via simpler-to-use Python functions.

Functional testing

Functional tests for the service can be implemented using the testsuite. To do that you have to:

  • Mock the translations service data:
    @pytest.fixture
    def translations():
        return {
            'hello': {'en': 'hello', 'ru': 'Привет'},
            'welcome': {'ru': 'Добро пожаловать', 'en': 'Welcome'},
        }
  • Mock the translations service API:
    @pytest.fixture(autouse=True)
    def mock_translations(mockserver, translations, mocked_time):
        @mockserver.json_handler('/v1/translations')
        def mock(request):
            return {
                'content': translations,
                'update_time': utils.timestring(mocked_time.now()),
            }
        return mock
  • Import the pytest_userver.plugins.core plugin and teach testsuite how to patch the service config to use the mocked URL:
    pytest_plugins = ['pytest_userver.plugins.core']
    USERVER_CONFIG_HOOKS = ['userver_config_translations']
    @pytest.fixture(scope='session')
    def userver_config_translations(mockserver_info):
        def do_patch(config_yaml, config_vars):
            components = config_yaml['components_manager']['components']
            components['cache-http-translations'][
                'translations-url'
            ] = mockserver_info.url('v1/translations')
        return do_patch
  • Write the test:
    async def test_http_caching(service_client, translations, mocked_time):
        response = await service_client.post(
            '/samples/greet', params={'username': 'дорогой разработчик'},
        )
        assert response.status == 200
        assert response.text == 'Привет, дорогой разработчик! Добро пожаловать'
        translations['hello']['ru'] = 'Приветище'
        mocked_time.sleep(10)
        await service_client.invalidate_caches()
        response = await service_client.post(
            '/samples/greet', params={'username': 'дорогой разработчик'},
        )
        assert response.status == 200
        assert response.text == 'Приветище, дорогой разработчик! Добро пожаловать'

Full sources

See the full example: