userver: Caching Component for PostgreSQL
Caching Component for PostgreSQL

A typical components::PostgreCache usage consists of a trait definition:

struct PostgresTrivialPolicy {
    static constexpr std::string_view kName = "my-pg-cache";
    using ValueType = MyStructure;
    static constexpr auto kKeyMember = &MyStructure::id;
    static constexpr const char* kQuery = "SELECT a, b, updated FROM test.data";
    static constexpr const char* kUpdatedField = "updated";
    using UpdatedFieldType = storages::postgres::TimePointTz;
};

and registration of the component in components::ComponentList:

return components::MinimalServerComponentList()
    .Append<components::PostgreCache<example::PostgresTrivialPolicy>>();

See Basics of Caches for an introduction to caches.

Configuration

The static configuration of components::PostgreCache must specify the name of a PostgreSQL component in the pgcomponent parameter.

Optionally, operation timeouts for cache loading can be specified.

To avoid "memory leaks", see the respective section in components::CachingComponentBase.

Name | Description | Default value
full-update-op-timeout | Timeout for a full update. | 1m
incremental-update-op-timeout | Timeout for an incremental update. | 1s
update-correction | Incremental update window adjustment. | 0 for caches with defined GetLastKnownUpdated
chunk-size | Number of rows to request from PostgreSQL, 0 to fetch all rows in one request. | 1000
sleep-between-chunks | Duration to wait between reading chunks from PostgreSQL. | 0ms
pgcomponent | PostgreSQL component name. |
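
To make chunk-size and sleep-between-chunks more concrete, here is a minimal sketch of a chunked read loop in plain C++. It is not the userver implementation: the helper name ReadInChunks is hypothetical, and an in-memory vector stands in for the PostgreSQL result set.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper (not part of userver): copies `rows` into `out` in
// chunks of `chunk_size` rows, sleeping between chunks. A chunk_size of 0
// means "fetch all rows in one request". Returns the number of requests.
std::size_t ReadInChunks(const std::vector<int>& rows, std::size_t chunk_size,
                         std::chrono::milliseconds sleep_between_chunks,
                         std::vector<int>& out) {
    if (chunk_size == 0) {
        out.insert(out.end(), rows.begin(), rows.end());
        return 1;
    }
    std::size_t requests = 0;
    for (std::size_t pos = 0; pos < rows.size(); pos += chunk_size) {
        const std::size_t end = std::min(rows.size(), pos + chunk_size);
        out.insert(out.end(),
                   rows.begin() + static_cast<std::ptrdiff_t>(pos),
                   rows.begin() + static_cast<std::ptrdiff_t>(end));
        ++requests;
        // Pause only if another chunk is still to be read.
        if (end < rows.size()) std::this_thread::sleep_for(sleep_between_chunks);
    }
    return requests;
}
```

In this sketch, five rows with a chunk size of 2 take three requests, and a chunk size of 0 takes one. A smaller chunk size bounds the size of each request at the cost of more round trips.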

Options inherited from components::CachingComponentBase:

Name | Description | Default value
update-types | Specifies whether incremental and/or full updates are used. Possible values: full-and-incremental, only-full, only-incremental. |
update-interval | (Required) Interval between Update invocations. |
update-jitter | Max. amount of time by which update-interval may be adjusted for requests dispersal. | update-interval / 10
updates-enabled | If false, cache updates are disabled (except for the first one if !first-update-fail-ok). | true
full-update-interval | Interval between full updates. |
full-update-jitter | Max. amount of time by which full-update-interval may be adjusted for requests dispersal. | full-update-interval / 10
exception-interval | Sleep interval after an unhandled exception. | update-interval
first-update-fail-ok | Whether first update failure is non-fatal. | false
task-processor | The name of the TaskProcessor for running DoWork. | main-task-processor
config-settings | Enables dynamic reconfiguration with CacheConfigSet. | true
additional-cleanup-interval | How often to run background RCU garbage collector. | 10 seconds
is-strong-period | Whether to include Update execution time in update-interval. | false
has-pre-assign-check | Enables the check before changing the value in the cache; by default, the check is that the new value is not empty. | false
testsuite-force-periodic-update | Override testsuite-periodic-update-enabled in TestsuiteSupport component config. |
failed-updates-before-expiration | The number of consecutive failed updates for data expiration. |
alert-on-failing-to-update-times | Fire an alert if the cache update failed the specified number of times in a row. If zero, alerts are disabled. Value from dynamic config takes priority over static. | 0
safe-data-lifetime | Enables awaiting data destructors in the component's destructor. Can be set to false if the stored data does not refer to the component and its dependencies. | true
dump | Manages cache behavior after dump load. |
dump.first-update-mode | Behavior of update after successful load from dump. skip: after successful load from dump, do nothing; required: make a synchronous update of type first-update-type, stop the service on failure; best-effort: make a synchronous update of type first-update-type, keep working and use data from dump on failure. Possible values: skip, required, best-effort. | skip
dump.first-update-type | Update type after successful load from dump. Possible values: full, incremental, incremental-then-async-full. | full
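
As an illustration of how update-jitter relates to update-interval, the following sketch picks the delay before the next update. It is a simplified model rather than the userver implementation: the helper names are hypothetical, and the symmetric uniform adjustment within [-jitter, +jitter] is an assumption of this sketch.

```cpp
#include <chrono>
#include <random>

using Ms = std::chrono::milliseconds;

// Default jitter as listed in the table: update-interval / 10.
constexpr Ms DefaultJitter(Ms update_interval) { return update_interval / 10; }

// Hypothetical helper (not part of userver): returns update_interval adjusted
// by a random amount in [-update_jitter, +update_jitter]. The symmetric
// uniform distribution is an assumption of this sketch.
template <typename Rng>
Ms NextUpdateDelay(Ms update_interval, Ms update_jitter, Rng& rng) {
    std::uniform_int_distribution<Ms::rep> dist(-update_jitter.count(),
                                                update_jitter.count());
    return update_interval + Ms{dist(rng)};
}
```

Under these assumptions, a 10 s update-interval with the default jitter yields delays between 9 s and 11 s, so that service instances started at the same moment do not all query the database at once.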

Options inherited from components::ComponentBase:

Name | Description | Default value
load-enabled | Set to false to disable loading of the component. | true

Cache policy

The cache policy is the template argument of the components::PostgreCache component. The comments in the following code snippet document each policy member.

namespace example {

struct MyStructure {
    int id = 0;
    std::string bar{};
    storages::postgres::TimePointWithoutTz updated;

    int get_id() const { return id; }
};

struct PostgresExamplePolicy {
    // Name of caching policy component.
    //
    // Required: **yes**
    static constexpr std::string_view kName = "my-pg-cache";

    // Object type.
    //
    // Required: **yes**
    using ValueType = MyStructure;

    // Key by which the object must be identified in cache.
    //
    // One of:
    // - A pointer-to-member in the object
    // - A pointer-to-member-function in the object that returns the key
    // - A pointer-to-function that takes the object and returns the key
    // - A lambda that takes the object and returns the key
    //
    // Required: **yes**
    static constexpr auto kKeyMember = &MyStructure::id;

    // Data retrieve query.
    //
    // The query should not contain any clauses after the `from` clause. Either
    // `kQuery` or `GetQuery` static member function must be defined.
    //
    // Required: **yes**
    static constexpr const char* kQuery = "select id, bar, updated from test.my_data";

    // Name of the field containing timestamp of an object.
    //
    // To turn off incremental updates, set the value to `nullptr`.
    //
    // Required: **yes**
    static constexpr const char* kUpdatedField = "updated";

    // Type of the field containing timestamp of an object.
    //
    // Specifies whether updated field should be treated as a timestamp
    // with or without timezone in database queries.
    //
    // Required: **yes** if incremental updates are used.
    using UpdatedFieldType = storages::postgres::TimePointTz;

    // Where clause of the query. Either `kWhere` or `GetWhere` can be defined.
    //
    // Required: no
    static constexpr const char* kWhere = "id > 10";

    // Cache container type.
    //
    // It can be of any map type. The default is `unordered_map`, it is not
    // necessary to declare the CacheContainer alias if you are OK with
    // `unordered_map`.
    // The key type must match the type of kKeyMember.
    //
    // Required: no
    using CacheContainer = std::unordered_map<int, MyStructure>;

    // Cluster host selection flags to use when retrieving data.
    //
    // Default value is storages::postgres::ClusterHostType::kSlave, at least one
    // cluster role must be present in flags.
    //
    // Required: no
    static constexpr auto kClusterHostType = storages::postgres::ClusterHostType::kSlave;

    // Whether Get() is expected to return nullptr.
    //
    // Default value is false, Get() will throw an exception instead of
    // returning nullptr.
    //
    // Required: no
    static constexpr bool kMayReturnNull = false;

    // Order by clause of the query.
    //
    // May be useful, for example, when a table stores all the events with some
    // timestamp, but in cache we wish to store only the last event and need
    // ordering and a `DISTINCT ON` expression in `kQuery`.
    //
    // Required: no
    static constexpr const char* kOrderBy = "updated asc";
};

}  // namespace example
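
The four accepted forms of kKeyMember listed in the policy comments can all be called through one mechanism. The sketch below shows this with std::invoke in standard C++17. It only illustrates the idea and is not how userver is known to extract keys: Item, FreeKey and ExtractKey are hypothetical names.

```cpp
#include <functional>
#include <string>

// Hypothetical value type, similar in shape to MyStructure.
struct Item {
    int id = 0;
    std::string bar;
    int get_id() const { return id; }
};

// A free function that takes the object and returns the key.
int FreeKey(const Item& item) { return item.id; }

// Hypothetical helper: std::invoke uniformly handles a pointer to data
// member, a pointer to member function, a function pointer and a lambda.
template <typename KeyGetter>
auto ExtractKey(const KeyGetter& getter, const Item& item) {
    return std::invoke(getter, item);
}
```

With an Item whose id is 7, ExtractKey(&Item::id, item), ExtractKey(&Item::get_id, item) and ExtractKey(&FreeKey, item) all yield 7. A lambda is useful when the key is derived from several fields rather than stored directly.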

The query can also be a std::string. Because the initialization order of static data members is not guaranteed, the std::string should be returned from a static member function, as in the following code snippet:

struct PostgresExamplePolicy4 {
    static constexpr std::string_view kName = "my-pg-cache";
    using ValueType = MyStructure;
    static constexpr auto kKeyMember = &MyStructure::id;
    static std::string GetQuery() { return "select id, bar, updated from test.my_data"; }
    static constexpr const char* kUpdatedField = "updated";
    // no time zone (should be avoided)
    using UpdatedFieldType = storages::postgres::TimePointWithoutTz;
};

A policy may define a static function GetLastKnownUpdated. Use it when new entries are selected from the database by revision, identifier, or anything other than the timestamp of the last update. If this function is supplied, new entries are fetched with the condition 'WHERE kUpdatedField > GetLastKnownUpdated(cache_container)'. Otherwise, the condition is 'WHERE kUpdatedField > last_update - correction_'. See the following code snippet for an example of usage:

struct MyStructureWithRevision {
    int id = 0;
    std::string bar{};
    storages::postgres::TimePointWithoutTz updated;
    int32_t revision = 0;

    int get_id() const { return id; }
};

class UserSpecificCache {
public:
    void insert_or_assign(int, MyStructureWithRevision&& item) {
        latest_revision_ = std::max(latest_revision_, item.revision);
    }
    static size_t size() { return 0; }

    int GetLatestRevision() const { return latest_revision_; }

private:
    int latest_revision_ = 0;
};

struct PostgresExamplePolicy3 {
    using ValueType = MyStructureWithRevision;
    static constexpr std::string_view kName = "my-pg-cache";
    static constexpr const char* kQuery = "select id, bar, revision from test.my_data";
    using CacheContainer = UserSpecificCache;
    static constexpr const char* kUpdatedField = "revision";
    using UpdatedFieldType = int32_t;
    static constexpr auto kKeyMember = &MyStructureWithRevision::get_id;

    // Function to get last known revision/time
    //
    // Optional
    // If one wants to get cache updates not based on updated time, but, for
    // example, based on revision > known_revision, this method should be used.
    static int32_t GetLastKnownUpdated(const UserSpecificCache& container) { return container.GetLatestRevision(); }
};
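
The two incremental-update conditions quoted above can be modeled in memory to see which rows each one selects. The following is a simplified sketch, not userver code: Row, SelectByTimestamp and SelectByRevision are hypothetical names, and a vector stands in for the database table.

```cpp
#include <chrono>
#include <vector>

using TimePoint = std::chrono::system_clock::time_point;

// Hypothetical row with both a timestamp and a revision column.
struct Row {
    int id = 0;
    TimePoint updated{};
    int revision = 0;
};

// Models `WHERE updated > last_update - correction`: the correction widens
// the window backwards, so rows slightly older than last_update are re-read.
std::vector<Row> SelectByTimestamp(const std::vector<Row>& table,
                                   TimePoint last_update,
                                   std::chrono::seconds correction) {
    std::vector<Row> result;
    for (const Row& row : table) {
        if (row.updated > last_update - correction) result.push_back(row);
    }
    return result;
}

// Models `WHERE revision > GetLastKnownUpdated(container)`: no correction is
// applied, only rows with a strictly greater revision are selected.
std::vector<Row> SelectByRevision(const std::vector<Row>& table,
                                  int last_known_revision) {
    std::vector<Row> result;
    for (const Row& row : table) {
        if (row.revision > last_known_revision) result.push_back(row);
    }
    return result;
}
```

In this model, a nonzero correction makes the timestamp-based variant re-read some rows it has already seen, trading a little extra work for tolerance of late-arriving timestamps. The revision-based variant relies on the revision itself to identify new rows.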

A cache can also store only a subset of the data. For example, for a table that is defined in the following way:

CREATE TABLE IF NOT EXISTS key_value_table (
    key VARCHAR,
    value VARCHAR,
    updated TIMESTAMPTZ NOT NULL DEFAULT NOW()
)

it is possible to create a cache that stores only the latest value for each key:

struct KeyValue {
    std::string key;
    std::string value;
};

struct LastCachePolicy {
    static constexpr std::string_view kName = "last-pg-cache";
    using ValueType = KeyValue;
    static constexpr auto kKeyMember = &KeyValue::key;
    static constexpr const char* kQuery = "SELECT DISTINCT ON (key) key, value FROM key_value_table";
    static constexpr const char* kUpdatedField = "updated";
    using UpdatedFieldType = storages::postgres::TimePointTz;
    static constexpr const char* kOrderBy = "key, updated DESC";
};

using LastCache = components::PostgreCache<LastCachePolicy>;
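
To see why `DISTINCT ON (key)` combined with the ordering `key, updated DESC` leaves only the latest value per key, the selection can be emulated in memory. This is a sketch of the SQL semantics only, with hypothetical names (Event, LatestValues) and an integer standing in for the TIMESTAMPTZ column.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical in-memory row of key_value_table.
struct Event {
    std::string key;
    std::string value;
    int updated = 0;  // stand-in for the TIMESTAMPTZ column
};

// Emulates `SELECT DISTINCT ON (key) key, value ... ORDER BY key, updated DESC`:
// sort by (key ascending, updated descending), then keep the first row of
// each key, which is the most recently updated one.
std::unordered_map<std::string, std::string> LatestValues(std::vector<Event> events) {
    std::sort(events.begin(), events.end(), [](const Event& a, const Event& b) {
        if (a.key != b.key) return a.key < b.key;
        return a.updated > b.updated;
    });
    std::unordered_map<std::string, std::string> result;
    for (const Event& event : events) {
        // emplace does nothing if the key is already present, so only the
        // first (latest) row per key is kept.
        result.emplace(event.key, event.value);
    }
    return result;
}
```

For the rows ("a", "old", 1), ("a", "new", 2) and ("b", "only", 1), the result maps "a" to "new" and "b" to "only".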

If the policy provides a custom CacheContainer, the container is notified of Update completion through its public member function OnWritesDone, if it has one. A custom CacheContainer must provide a size method and either an insert_or_assign method similar to that of std::unordered_map or a CacheInsertOrAssign function similar to the one defined in namespace utils::impl::projected_set (that is, the one used for utils::ProjectedUnorderedSet). See the following code snippet for an example of usage:

class UserSpecificCacheWithWriteNotification {
public:
    void insert_or_assign(int, MyStructure&&) {}
    static size_t size() { return 0; }
    void OnWritesDone() {}
};
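
An optional member function such as OnWritesDone can be detected at compile time. The sketch below shows one common C++17 way to do that, the detection idiom with std::void_t. It is an assumption about a possible technique, not a description of how userver does it, and all names in it are hypothetical.

```cpp
#include <type_traits>
#include <utility>

// Primary template: by default a type has no OnWritesDone().
template <typename T, typename = void>
struct HasOnWritesDone : std::false_type {};

// Specialization chosen when `container.OnWritesDone()` is a valid expression.
template <typename T>
struct HasOnWritesDone<T, std::void_t<decltype(std::declval<T&>().OnWritesDone())>>
    : std::true_type {};

// Hypothetical helper: notifies the container only if it supports it.
template <typename Container>
void NotifyWritesDone(Container& container) {
    if constexpr (HasOnWritesDone<Container>::value) {
        container.OnWritesDone();
    }
}

// Container without the notification hook.
struct PlainContainer {};

// Container that records that all writes have completed.
struct NotifiedContainer {
    bool finalized = false;
    void OnWritesDone() { finalized = true; }
};
```

Calling NotifyWritesDone on a PlainContainer compiles and does nothing, while calling it on a NotifiedContainer sets finalized to true. A container might use such a hook to build secondary indexes once all rows have been inserted.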

Forward Declaration

To forward declare a cache, forward declare its trait and include the userver/cache/base_postgres_cache_fwd.hpp header. It is also useful to forward declare the cache value type.

#pragma once

#include <memory>  // for std::shared_ptr

#include <userver/cache/base_postgres_cache_fwd.hpp>

USERVER_NAMESPACE_BEGIN

namespace example {  // replace with a namespace of your trait

struct PostgresExamplePolicy;
struct MyStructure;

}  // namespace example

namespace caches {

using MyCache1 = components::PostgreCache<example::PostgresExamplePolicy>;
using MyCache1Data = std::shared_ptr<const example::MyStructure>;

}  // namespace caches