Container¶
Container Auditor¶
Container Backend¶
Pluggable Back-ends for Container Server
- class swift.container.backend.ContainerBroker(db_file, timeout=25, logger=None, account=None, container=None, pending_timeout=None, stale_reads_ok=False, skip_commits=False, force_db_file=False)¶
Bases:
swift.common.db.DatabaseBroker
Encapsulates working with a container database.
Note that this may involve multiple on-disk DB files if the container becomes sharded:
_db_file is the path to the legacy container DB name, i.e. <hash>.db. This file should exist for an initialised broker that has never been sharded, but will not exist once a container has been sharded.
db_files is a list of existing db files for the broker. This list should have at least one entry for an initialised broker, and should have two entries while a broker is in SHARDING state.
db_file is the path to whichever db is currently authoritative for the container. Depending on the container’s state, this may not be the same as the db_file argument given to __init__(), unless force_db_file is True, in which case db_file is always equal to the db_file argument given to __init__().
pending_file is always equal to _db_file extended with .pending, i.e. <hash>.db.pending.
- classmethod create_broker(device_path, part, account, container, logger=None, epoch=None, put_timestamp=None, storage_policy_index=None)¶
Create a ContainerBroker instance. If the db doesn’t exist, initialize the db file.
- Parameters
device_path – device path
part – partition number
account – account name string
container – container name string
logger – a logger instance
epoch – a timestamp to include in the db filename
put_timestamp – initial timestamp if broker needs to be initialized
storage_policy_index – the storage policy index
- Returns
a tuple of (broker, initialized) where broker is an instance of swift.container.backend.ContainerBroker and initialized is True if the db file was initialized, False otherwise.
- create_container_info_table(conn, put_timestamp, storage_policy_index)¶
Create the container_info table which is specific to the container DB. Not a part of Pluggable Back-ends, internal to the baseline code. Also creates the container_stat view.
- Parameters
conn – DB connection object
put_timestamp – put timestamp
storage_policy_index – storage policy index
- create_object_table(conn)¶
Create the object table which is specific to the container DB. Not a part of Pluggable Back-ends, internal to the baseline code.
- Parameters
conn – DB connection object
- create_policy_stat_table(conn, storage_policy_index=0)¶
Create policy_stat table.
- Parameters
conn – DB connection object
storage_policy_index – the policy_index the container is being created with
- create_shard_range_table(conn)¶
Create the shard_range table which is specific to the container DB.
- Parameters
conn – DB connection object
- db_contains_type = 'object'¶
- property db_epoch¶
- property db_file¶
Get the path to the primary db file for this broker. This is typically the db file for the most recent sharding epoch. However, if no db files exist on disk, or if force_db_file was True when the broker was constructed, then the primary db file is the file passed to the broker constructor.
- Returns
A path to a db file; the file does not necessarily exist.
- property db_files¶
Gets the cached list of valid db files that exist on disk for this broker.
- The cached list may be refreshed by calling
- Returns
A list of paths to db files ordered by ascending epoch; the list may be empty.
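The relationship between db_files and db_file described above can be sketched as follows. This is a minimal illustration, not Swift's implementation: the helper names are hypothetical, and the <hash>_<epoch>.db filename pattern is an assumption made for the example.

```python
import re

# Assumed filename pattern for this example only: "<hash>.db" (legacy file,
# no epoch) or "<hash>_<epoch>.db" (file created for a sharding epoch).
DB_NAME = re.compile(r'^(?P<hash>[^_.]+)(?:_(?P<epoch>[\d.]+))?\.db$')


def epoch_of(filename):
    """Return the epoch embedded in a db filename, or 0.0 if there is none."""
    match = DB_NAME.match(filename)
    if match is None:
        raise ValueError('not a db filename: %r' % filename)
    return float(match.group('epoch') or 0.0)


def order_db_files(filenames):
    """Order db filenames by ascending epoch, as db_files is described."""
    return sorted(filenames, key=epoch_of)


def primary_db_file(init_db_file, existing, force_db_file=False):
    """Pick the primary db file as the db_file property is described."""
    if force_db_file or not existing:
        # Fall back to the file that was passed to the constructor.
        return init_db_file
    # Otherwise the file for the most recent epoch is the primary one.
    return order_db_files(existing)[-1]
```

With two files on disk, primary_db_file('abc.db', ['abc.db', 'abc_200.5.db']) selects the file with the later epoch unless force_db_file is set.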
- db_reclaim_timestamp = 'created_at'¶
- db_type = 'container'¶
- delete_meta_whitelist = ['x-container-sysmeta-shard-quoted-root', 'x-container-sysmeta-shard-root', 'x-container-sysmeta-sharding']¶
- delete_object(name, timestamp, storage_policy_index=0)¶
Mark an object deleted.
- Parameters
name – object name to be deleted
timestamp – timestamp when the object was marked as deleted
storage_policy_index – the storage policy index for the object
- empty()¶
Check if container DB is empty.
This method uses more stringent checks on object count than is_deleted(): this method checks that there are no objects in any policy; if the container is in the process of sharding then both fresh and retiring databases are checked to be empty; if a root container has shard ranges then they are checked to be empty.
- Returns
True if the database has no active objects, False otherwise
- enable_sharding(epoch)¶
Updates this broker’s own shard range with the given epoch, sets its state to SHARDING and persists it in the DB.
- Parameters
epoch – a Timestamp
- Returns
the broker’s updated own shard range.
- find_shard_ranges(shard_size, limit=-1, existing_ranges=None, minimum_shard_size=1)¶
Scans the container db for shard ranges. Scanning will start at the upper bound of any existing_ranges that are given, otherwise at ShardRange.MIN. Scanning will stop when limit shard ranges have been found or when no more shard ranges can be found. In the latter case, the upper bound of the final shard range will be equal to the upper bound of the container namespace.
This method does not modify the state of the db; callers are responsible for persisting any shard range data in the db.
- Parameters
shard_size – the size of each shard range
limit – the maximum number of shard points to be found; a negative value (default) implies no limit.
existing_ranges – an optional list of existing ShardRanges; if given, this list should be sorted in order of upper bounds; the scan for new shard ranges will start at the upper bound of the last existing ShardRange.
minimum_shard_size – minimum size of the final shard range. If this is greater than one then the final shard range may be extended to more than shard_size in order to avoid a further shard range with fewer than minimum_shard_size rows.
- Returns
a tuple; the first value in the tuple is a list of dicts each having keys {‘index’, ‘lower’, ‘upper’, ‘object_count’} in order of ascending ‘upper’; the second value in the tuple is a boolean which is True if the last shard range has been found, False otherwise.
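The interaction of shard_size, limit and minimum_shard_size can be illustrated with a toy scan over an in-memory list of object names. This is a sketch under stated assumptions, not the broker's code: it ignores existing_ranges, uses the empty string to stand in for both ShardRange.MIN and the upper bound of the container namespace, and returns an empty list for an empty input.

```python
def find_shard_ranges(names, shard_size, limit=-1, minimum_shard_size=1):
    """Toy scan of object names for shard ranges (existing_ranges omitted)."""
    names = sorted(names)
    found = []
    start = 0   # index of the first name not yet covered by a range
    lower = ''  # '' stands in for ShardRange.MIN in this sketch
    while names and (limit < 0 or len(found) < limit):
        remaining = len(names) - start
        if remaining < shard_size + minimum_shard_size:
            # Cutting another full range would leave fewer than
            # minimum_shard_size rows, so the final range absorbs the rest.
            found.append({'index': len(found), 'lower': lower, 'upper': '',
                          'object_count': remaining})
            return found, True
        upper = names[start + shard_size - 1]
        found.append({'index': len(found), 'lower': lower, 'upper': upper,
                      'object_count': shard_size})
        start += shard_size
        lower = upper
    # Either the limit was reached or there was nothing to scan.
    return found, False
```

For ten names and shard_size=4, the default minimum_shard_size=1 yields ranges of 4, 4 and 2 rows, whereas minimum_shard_size=3 yields ranges of 4 and 6 rows because the final range is extended.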
- get_all_shard_range_data()¶
Returns a list of all shard range data, including own shard range and deleted shard ranges.
- Returns
A list of dict representations of a ShardRange.
- get_brokers()¶
Return a list of brokers for component dbs. The list has two entries while the db state is sharding: the first entry is a broker for the retiring db with skip_commits set to True; the second entry is a broker for the fresh db with skip_commits set to False. For any other db state the list has one entry.
- Returns
a list of ContainerBroker
- get_db_state()¶
Returns the current state of on disk db files.
- get_db_version(conn)¶
- get_info()¶
Get global data for the container.
- Returns
dict with keys: account, container, created_at, put_timestamp, delete_timestamp, status, status_changed_at, object_count, bytes_used, reported_put_timestamp, reported_delete_timestamp, reported_object_count, reported_bytes_used, hash, id, x_container_sync_point1, x_container_sync_point2, storage_policy_index, and db_state.
- get_info_is_deleted()¶
Get the is_deleted status and info for the container.
- Returns
a tuple in the form (info, is_deleted), where info is a dict as returned by get_info and is_deleted is a boolean.
- get_misplaced_since(start, count)¶
Get a list of objects which are in a storage policy different from the container’s storage policy.
- Parameters
start – last reconciler sync point
count – maximum number of entries to get
- Returns
list of dicts with keys: name, created_at, size, content_type, etag, storage_policy_index
- get_objects(limit=None, marker='', end_marker='', include_deleted=None, since_row=None)¶
Returns a list of objects, including deleted objects, in all policies. Each object in the list is described by a dict with keys {‘name’, ‘created_at’, ‘size’, ‘content_type’, ‘etag’, ‘deleted’, ‘storage_policy_index’}.
- Parameters
limit – maximum number of entries to get
marker – if set, objects with names less than or equal to this value will not be included in the list.
end_marker – if set, objects with names greater than or equal to this value will not be included in the list.
include_deleted – if True, include only deleted objects; if False, include only undeleted objects; otherwise (default), include both deleted and undeleted objects.
since_row – include only items whose ROWID is greater than the given row id; by default all rows are included.
- Returns
a list of dicts, each describing an object.
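The marker, end_marker and include_deleted semantics described above can be sketched over a plain list of row dicts. The function name is hypothetical, since_row is omitted, and rows are returned in input order because the ordering of the real method is not stated here.

```python
def filter_objects(rows, limit=None, marker='', end_marker='',
                   include_deleted=None):
    """Toy filter mirroring the documented get_objects() parameters."""
    selected = []
    for row in rows:
        if marker and row['name'] <= marker:
            continue  # names <= marker are excluded
        if end_marker and row['name'] >= end_marker:
            continue  # names >= end_marker are excluded
        if include_deleted is not None and \
                bool(row['deleted']) != include_deleted:
            continue  # True keeps only deleted rows, False only undeleted
        selected.append(row)
    return selected if limit is None else selected[:limit]
```

Note the three-way behaviour of include_deleted: None (the default) returns both deleted and undeleted rows.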
- get_own_shard_range(no_default=False)¶
Returns a shard range representing this broker’s own shard range. If no such range has been persisted in the broker’s shard ranges table then a default shard range representing the entire namespace will be returned.
The object_count and bytes_used of the returned shard range are not guaranteed to be up-to-date with the current object stats for this broker. Callers that require up-to-date stats should use the get_info method.
- Parameters
no_default – if True and the broker’s own shard range is not found in the shard ranges table then None is returned, otherwise a default shard range is returned.
- Returns
an instance of ShardRange
- get_policy_stats()¶
- get_reconciler_sync()¶
- get_replication_info()¶
Get information about the DB required for replication.
- Returns
dict containing keys from get_info plus max_row and metadata
Note: get_info’s <db_contains_type>_count is translated to just “count” and metadata is the raw string.
- get_shard_ranges(marker=None, end_marker=None, includes=None, reverse=False, include_deleted=False, states=None, include_own=False, exclude_others=False, fill_gaps=False)¶
Returns a list of persisted shard ranges.
- Parameters
marker – restricts the returned list to shard ranges whose namespace includes or is greater than the marker value.
end_marker – restricts the returned list to shard ranges whose namespace includes or is less than the end_marker value.
includes – restricts the returned list to the shard range that includes the given value; if includes is specified then marker and end_marker are ignored.
reverse – reverse the result order.
include_deleted – include items that have the delete marker set
states – if specified, restricts the returned list to shard ranges that have the given state(s); can be a list of ints or a single int.
include_own – boolean that governs whether the row whose name matches the broker’s path is included in the returned list. If True, that row is included, otherwise it is not included. Default is False.
exclude_others – boolean that governs whether the rows whose names do not match the broker’s path are included in the returned list. If True, those rows are not included, otherwise they are included. Default is False.
fill_gaps – if True, insert a modified copy of own shard range to fill any gap between the end of any found shard ranges and the upper bound of own shard range. Gaps enclosed within the found shard ranges are not filled.
- Returns
a list of instances of swift.common.utils.ShardRange
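The marker, end_marker and includes filters can be sketched with shard ranges represented as plain dicts. This is an illustration only: the helper names are hypothetical, and it assumes that a shard range's namespace excludes its lower bound, includes its upper bound, and that an empty upper bound means "no upper bound".

```python
def covers(shard_range, name):
    """True if name falls in the assumed (lower, upper] namespace."""
    lower, upper = shard_range['lower'], shard_range['upper']
    return lower < name and (upper == '' or name <= upper)


def filter_shard_ranges(shard_ranges, marker=None, end_marker=None,
                        includes=None):
    """Toy version of the documented marker/end_marker/includes filters."""
    if includes:
        # includes takes priority; marker and end_marker are ignored.
        return [sr for sr in shard_ranges if covers(sr, includes)]
    result = []
    for sr in shard_ranges:
        if marker and sr['upper'] != '' and sr['upper'] < marker:
            continue  # namespace lies entirely below the marker
        if end_marker and sr['lower'] >= end_marker:
            continue  # namespace lies entirely above the end_marker
        result.append(sr)
    return result
```

A range is kept by marker when its namespace includes the marker or lies above it, and kept by end_marker when its namespace includes the end_marker or lies below it.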
- get_shard_usage()¶
Get the aggregate object stats for all shard ranges in states ACTIVE, SHARDING or SHRINKING.
- Returns
a dict with keys {bytes_used, object_count}
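The aggregation can be sketched over shard ranges held as plain dicts. For readability this example assumes each dict carries its state as a string name; the function name and the dict layout are hypothetical.

```python
# States whose stats contribute to the aggregate, per the description above.
USAGE_STATES = {'ACTIVE', 'SHARDING', 'SHRINKING'}


def shard_usage(shard_ranges):
    """Sum object_count and bytes_used over ranges in the usage states."""
    usage = {'bytes_used': 0, 'object_count': 0}
    for sr in shard_ranges:
        if sr['state'] in USAGE_STATES:
            usage['bytes_used'] += sr['bytes_used']
            usage['object_count'] += sr['object_count']
    return usage
```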
- get_sharding_sysmeta(key=None)¶
Returns sharding specific info from the broker’s metadata.
- Parameters
key – if given the value stored under key in the sharding info will be returned.
- Returns
either a dict of sharding info or the value stored under key in that dict.
- get_sharding_sysmeta_with_timestamps()¶
Returns sharding specific info from the broker’s metadata with timestamps.
- Returns
a dict of sharding info with their timestamps.
- has_multiple_policies()¶
- is_empty_enough_to_reclaim()¶
- is_old_enough_to_reclaim(now, reclaim_age)¶
- is_own_shard_range(shard_range)¶
- is_reclaimable(now, reclaim_age)¶
Check if the broker abstraction is empty, and has been marked deleted for at least a reclaim age.
- is_root_container()¶
Returns True if this container is a root container, False otherwise.
A root container is a container that is not a shard of another container.
- is_sharded()¶
- list_objects_iter(limit, marker, end_marker, prefix, delimiter, path=None, storage_policy_index=0, reverse=False, include_deleted=False, since_row=None, transform_func=None, all_policies=False, allow_reserved=False)¶
Get a list of objects sorted by name starting at marker onward, up to limit entries. Entries will begin with the prefix and will not have the delimiter after the prefix.
- Parameters
limit – maximum number of entries to get
marker – marker query
end_marker – end marker query
prefix – prefix query
delimiter – delimiter for query
path – if defined, will set the prefix and delimiter based on the path
storage_policy_index – storage policy index for query
reverse – reverse the result order.
include_deleted – if True, include only deleted objects; if False (default), include only undeleted objects; otherwise, include both deleted and undeleted objects.
since_row – include only items whose ROWID is greater than the given row id; by default all rows are included.
transform_func – an optional function that if given will be called for each object to get a transformed version of the object to include in the listing; should have same signature as _transform_record(); defaults to _transform_record().
all_policies – if True, include objects for all storage policies ignoring any value given for storage_policy_index
allow_reserved – exclude names with reserved-byte by default
- Returns
list of tuples of (name, created_at, size, content_type, etag, deleted)
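The effect of prefix and delimiter can be sketched on a plain list of names. This toy function follows the description above literally (entries begin with the prefix and do not contain the delimiter after it) and treats marker and end_marker as exclusive bounds; it omits storage policies, deleted rows, reverse ordering and transform_func, and its name is hypothetical.

```python
def list_names(names, limit, marker='', end_marker='', prefix='',
               delimiter=''):
    """Toy name listing honouring limit, markers, prefix and delimiter."""
    selected = []
    for name in sorted(names):
        if len(selected) >= limit:
            break
        if marker and name <= marker:
            continue
        if end_marker and name >= end_marker:
            continue
        if not name.startswith(prefix):
            continue
        if delimiter and delimiter in name[len(prefix):]:
            continue  # the delimiter appears after the prefix
        selected.append(name)
    return selected
```

For example, with prefix 'photos/' and delimiter '/', the name 'photos/2020/b.jpg' is skipped while 'photos/a.jpg' is listed.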
- make_tuple_for_pickle(record)¶
Turn this db record dict into the format this service uses for pending pickles.
- merge_items(item_list, source=None)¶
Merge items into the object table.
- Parameters
item_list – list of dictionaries of {‘name’, ‘created_at’, ‘size’, ‘content_type’, ‘etag’, ‘deleted’, ‘storage_policy_index’, ‘ctype_timestamp’, ‘meta_timestamp’}
source – if defined, update incoming_sync with the source
- merge_shard_ranges(shard_ranges)¶
Merge shard ranges into the shard range table.
- Parameters
shard_ranges – a shard range or a list of shard ranges; each shard range should be an instance of ShardRange or a dict representation of a shard range having SHARD_RANGE_KEYS.
- property path¶
- put_object(name, timestamp, size, content_type, etag, deleted=0, storage_policy_index=0, ctype_timestamp=None, meta_timestamp=None)¶
Creates an object in the DB with its metadata.
- Parameters
name – object name to be created
timestamp – timestamp of when the object was created
size – object size
content_type – object content-type
etag – object etag
deleted – if True, marks the object as deleted and sets the deleted_at timestamp to timestamp
storage_policy_index – the storage policy index for the object
ctype_timestamp – timestamp of when content_type was last updated
meta_timestamp – timestamp of when metadata was last updated
- reload_db_files()¶
Reloads the cached list of valid on disk db files for this broker.
- remove_objects(lower, upper, max_row=None)¶
Removes object records in the given namespace range from the object table.
Note that objects are removed regardless of their storage_policy_index.
- Parameters
lower – defines the lower bound of object names that will be removed; names greater than this value will be removed; names less than or equal to this value will not be removed.
upper – defines the upper bound of object names that will be removed; names less than or equal to this value will be removed; names greater than this value will not be removed. The empty string is interpreted as there being no upper bound.
max_row – if specified only rows less than or equal to max_row will be removed
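The bounds are asymmetric: lower is exclusive and upper is inclusive. A minimal sketch over in-memory rows, assuming each row is a dict with hypothetical 'ROWID' and 'name' keys, and returning the surviving rows rather than modifying a database:

```python
def remove_objects(rows, lower, upper, max_row=None):
    """Return the rows that would survive removal of the given range."""
    kept = []
    for row in rows:
        # lower is exclusive; upper is inclusive; '' means no upper bound.
        in_range = row['name'] > lower and (
            upper == '' or row['name'] <= upper)
        # Rows newer than max_row are never removed.
        row_ok = max_row is None or row['ROWID'] <= max_row
        if not (in_range and row_ok):
            kept.append(row)
    return kept
```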
- reported(put_timestamp, delete_timestamp, object_count, bytes_used)¶
Update reported stats, available with container’s get_info.
- Parameters
put_timestamp – put_timestamp to update
delete_timestamp – delete_timestamp to update
object_count – object_count to update
bytes_used – bytes_used to update
- classmethod resolve_shard_range_states(states)¶
Given a list of values each of which may be the name of a state, the number of a state, or an alias, return the set of state numbers described by the list.
The following alias values are supported: ‘listing’ maps to all states that are considered valid when listing objects; ‘updating’ maps to all states that are considered valid for redirecting an object update; ‘auditing’ maps to all states that are considered valid for a shard container that is updating its own shard range table from a root (this currently maps to all states except FOUND).
- Parameters
states – a list of values each of which may be the name of a state, the number of a state, or an alias
- Returns
a set of integer state numbers, or None if no states are given
- Raises
ValueError – if any value in the given list is neither a valid state nor a valid alias
- property root_account¶
- property root_container¶
- property root_path¶
- set_sharded_state()¶
Unlinks the broker’s retiring DB file.
- Returns
True if the retiring DB was successfully unlinked, False otherwise.
- set_sharding_state()¶
Creates and initializes a fresh DB file in preparation for sharding a retiring DB. The broker’s own shard range must have an epoch timestamp for this method to succeed.
- Returns
True if the fresh DB was successfully created, False otherwise.
- set_sharding_sysmeta(key, value)¶
Updates the broker’s metadata stored under the given key prefixed with a sharding specific namespace.
- Parameters
key – metadata key in the sharding metadata namespace.
value – metadata value
- set_storage_policy_index(policy_index, timestamp=None)¶
Update the container_stat policy_index and status_changed_at.
- set_x_container_sync_points(sync_point1, sync_point2)¶
- sharding_initiated()¶
Returns True if a broker has shard range state that would be necessary for sharding to have been initiated, False otherwise.
- sharding_required()¶
Returns True if a broker has shard range state that would be necessary for sharding to have been initiated but has not yet completed sharding, False otherwise.
- property storage_policy_index¶
- update_reconciler_sync(point)¶
- swift.container.backend.merge_shards(shard_data, existing)¶
Compares shard_data with existing and updates shard_data with any items of existing that take precedence over the corresponding item in shard_data.
- Parameters
shard_data – a dict representation of shard range that may be modified by this method.
existing – a dict representation of shard range.
- Returns
True if shard_data has any item(s) that are considered to take precedence over the corresponding item in existing
- swift.container.backend.sift_shard_ranges(new_shard_ranges, existing_shard_ranges)¶
Compares new and existing shard ranges, updating the new shard ranges with any more recent state from the existing, and returns shard ranges sorted into those that need adding because they contain new or updated state and those that need deleting because their state has been superseded.
- Parameters
new_shard_ranges – a list of dicts, each of which represents a shard range.
existing_shard_ranges – a dict mapping shard range names to dicts representing a shard range.
- Returns
a tuple (to_add, to_delete); to_add is a list of dicts, each of which represents a shard range that is to be added to the existing shard ranges; to_delete is a set of shard range names that are to be deleted.
- swift.container.backend.update_new_item_from_existing(new_item, existing)¶
Compare the data and meta related timestamps of a new object item with the timestamps of an existing object record, and update the new item with data and/or meta related attributes from the existing record if their timestamps are newer.
The multiple timestamps are encoded into a single string for storing in the ‘created_at’ column of the objects db table.
- Parameters
new_item – A dict of object update attributes
existing – A dict of existing object attributes
- Returns
True if any attributes of the new item dict were found to be newer than the existing and therefore not updated, otherwise False implying that the updated item is equal to the existing.
Container Replicator¶
- class swift.container.replicator.ContainerReplicator(conf, logger=None)¶
Bases:
swift.common.db_replicator.Replicator
- brokerclass¶
- cleanup_post_replicate(broker, orig_info, responses)¶
Cleanup non primary database from disk if needed.
- Parameters
broker – the broker for the database we’re replicating
orig_info – snapshot of the broker replication info dict taken before replication
responses – a list of boolean success values for each replication request to other nodes
- Returns
False if deletion of the database was attempted but unsuccessful, otherwise True.
- datadir = 'containers'¶
- default_port = 6201¶
- delete_db(broker)¶
Ensure that reconciler databases are only cleaned up at the end of the replication run.
- dump_to_reconciler(broker, point)¶
Look for object rows for object updates in the wrong storage policy in broker with a ROWID greater than the rowid given as point.
- Parameters
broker – the container broker with misplaced objects
point – the last verified reconciler_sync_point
- Returns
the last successfully enqueued rowid
- feed_reconciler(container, item_list)¶
Add queue entries for rows in item_list to the local reconciler container database.
- Parameters
container – the name of the reconciler container
item_list – the list of rows to enqueue
- Returns
True if successfully enqueued
- find_local_handoff_for_part(part)¶
Find a device in the ring that is on this node on which to place a partition. Preference is given to a device that is a primary location for the partition. If no such device is found then a local device with weight is chosen, and failing that any local device.
- Parameters
part – a partition
- Returns
a node entry from the ring
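The order of preference can be sketched as three passes over the local devices. This is only an illustration: devices are plain dicts with hypothetical 'id' and 'weight' keys, the function signature differs from the real method, and taking the first match in each pass is an assumption.

```python
def find_local_handoff(local_devs, primary_dev_ids):
    """Pick a local device: primary first, then weighted, then any."""
    # 1. Prefer a local device that is a primary location for the partition.
    for dev in local_devs:
        if dev['id'] in primary_dev_ids:
            return dev
    # 2. Otherwise choose a local device that has weight.
    for dev in local_devs:
        if dev['weight'] > 0:
            return dev
    # 3. Failing that, fall back to any local device (None if there are none).
    return local_devs[0] if local_devs else None
```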
- get_reconciler_broker(timestamp)¶
Get a local instance of the reconciler container broker that is appropriate to enqueue the given timestamp.
- Parameters
timestamp – the timestamp of the row to be enqueued
- Returns
a local reconciler broker
- replicate_reconcilers()¶
Ensure any items merged to reconciler containers during replication are pushed out to correct nodes and any reconciler containers that do not belong on this node are removed.
- report_up_to_date(full_info)¶
- run_once(*args, **kwargs)¶
Run a replication pass once.
- server_type = 'container'¶
- class swift.container.replicator.ContainerReplicatorRpc(root, datadir, broker_class, mount_check=True, logger=None)¶
Bases:
swift.common.db_replicator.ReplicatorRpc
- get_shard_ranges(broker, args)¶
- merge_shard_ranges(broker, args)¶
Container Server¶
- class swift.container.server.ContainerController(conf, logger=None)¶
Bases:
swift.common.base_storage_server.BaseStorageServer
WSGI Controller for the container server.
- DELETE(req)¶
Handle HTTP DELETE request.
- GET(req)¶
Handle HTTP GET request.
The body of the response to a successful GET request contains a listing of either objects or shard ranges. The exact content of the listing is determined by a combination of request headers and query string parameters, as follows:
The type of the listing is determined by the X-Backend-Record-Type header. If this header has value shard then the response body will be a list of shard ranges; if this header has value auto, and the container state is sharding or sharded, then the listing will be a list of shard ranges; otherwise the response body will be a list of objects.
Both shard range and object listings may be filtered according to the constraints described below. However, the X-Backend-Ignore-Shard-Name-Filter header may be used to override the application of the marker, end_marker, includes and reverse parameters to shard range listings. These parameters will be ignored if the header has the value ‘sharded’ and the current db sharding state is also ‘sharded’. Note that this header does not override the states constraint on shard range listings.
The order of both shard range and object listings may be reversed by using a reverse query string parameter with a value in swift.common.utils.TRUE_VALUES.
Both shard range and object listings may be constrained to a name range by the marker and end_marker query string parameters. Object listings will only contain objects whose names are greater than any marker value and less than any end_marker value. Shard range listings will only contain shard ranges whose namespace is greater than or includes any marker value and is less than or includes any end_marker value.
Shard range listings may also be constrained by an includes query string parameter. If this parameter is present the listing will only contain shard ranges whose namespace includes the value of the parameter; any marker or end_marker parameters are ignored.
The length of an object listing may be constrained by the limit parameter. Object listings may also be constrained by prefix, delimiter and path query string parameters.
Shard range listings will include deleted shard ranges if and only if the X-Backend-Include-Deleted header value is one of swift.common.utils.TRUE_VALUES. Object listings never include deleted objects.
Shard range listings may be constrained to include only shard ranges whose state is specified by a query string states parameter. If present, the states parameter should be a comma separated list of either the string or integer representation of STATES.
Two alias values may be used in a states parameter value: listing will cause the listing to include all shard ranges in a state suitable for contributing to an object listing; updating will cause the listing to include all shard ranges in a state suitable to accept an object update.
If either of these aliases is used then the shard range listing will if necessary be extended with a synthesised ‘filler’ range in order to satisfy the requested name range when insufficient actual shard ranges are found. Any ‘filler’ shard range will cover the otherwise uncovered tail of the requested name range and will point back to the same container.
Listings are not normally returned from a deleted container. However, the X-Backend-Override-Deleted header may be used with a value in swift.common.utils.TRUE_VALUES to force a shard range listing to be returned from a deleted container whose DB file still exists.
- Parameters
req – an instance of swift.common.swob.Request
- Returns
an instance of swift.common.swob.Response
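The header-driven choices in the GET description can be summarised in a small decision sketch. The function names are hypothetical, headers are modelled as a plain dict with exact-case keys, and db_state is assumed to be the lower-case state name; the real controller works on a swob Request.

```python
def listing_record_type(headers, db_state):
    """Decide whether a GET lists shard ranges ('shard') or objects."""
    record_type = headers.get('X-Backend-Record-Type', '')
    if record_type == 'shard':
        return 'shard'
    if record_type == 'auto' and db_state in ('sharding', 'sharded'):
        return 'shard'
    return 'object'


def ignore_shard_name_filter(headers, db_state):
    """True if marker, end_marker, includes and reverse should be ignored."""
    header = headers.get('X-Backend-Ignore-Shard-Name-Filter', '')
    # Both the header value and the db state must be 'sharded'.
    return header == 'sharded' and db_state == 'sharded'
```

Note that the second helper only concerns the name filters; the states constraint still applies to shard range listings.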
- HEAD(req)¶
Handle HTTP HEAD request.
- POST(req)¶
Handle HTTP POST request.
- PUT(req)¶
Handle HTTP PUT request.
- REPLICATE(req)¶
Handle HTTP REPLICATE request (json-encoded RPC calls for replication.)
- UPDATE(req)¶
Handle HTTP UPDATE request (merge_items RPCs coming from the proxy.)
- account_update(req, account, container, broker)¶
Update the account server(s) with latest container info.
- Parameters
req – swob.Request object
account – account name
container – container name
broker – container DB broker object
- Returns
an HTTPNotFound response object if all the account requests return a 404 error code; an HTTPBadRequest response object if the account cannot be updated due to a malformed header; otherwise None.
- allowed_sync_hosts¶
The list of hosts we’re allowed to send syncs to. This can be overridden by data in self.realms_conf
- check_free_space(drive)¶
- create_listing(req, out_content_type, info, resp_headers, metadata, container_list, container)¶
- get_and_validate_policy_index(req)¶
Validate that the index supplied maps to a policy.
- Returns
policy index from request, or None if not present
- Raises
HTTPBadRequest – if the supplied index is bogus
- realms_conf¶
ContainerSyncCluster instance for validating sync-to values.
- save_headers = ['x-container-read', 'x-container-write', 'x-container-sync-key', 'x-container-sync-to']¶
- server_type = 'container-server'¶
- update_data_record(record)¶
Perform any mutations to container listing records that are common to all serialization formats, and return the record as a dict.
Converts created time to iso timestamp. Replaces size with ‘swift_bytes’ content type parameter.
- Parameters
record – object entry record
- Returns
modified record
- swift.container.server.app_factory(global_conf, **local_conf)¶
paste.deploy app factory for creating WSGI container server apps
- swift.container.server.gen_resp_headers(info, is_deleted=False)¶
Convert container info dict to headers.
- swift.container.server.get_container_name_and_placement(req)¶
Split and validate path for a container.
- Parameters
req – a swob request
- Returns
a tuple of path parts as strings
- swift.container.server.get_obj_name_and_placement(req)¶
Split and validate path for an object.
- Parameters
req – a swob request
- Returns
a tuple of path parts as strings
Container Reconciler¶
- class swift.container.reconciler.ContainerReconciler(conf, logger=None, swift=None)¶
Bases:
swift.common.daemon.Daemon
Move objects that are in the wrong storage policy.
- can_reconcile_policy(policy_index)¶
- ensure_object_in_right_location(q_policy_index, account, container, obj, q_ts, path, container_policy_index, source_ts, source_obj_status, source_obj_info, source_obj_iter, **kwargs)¶
Validate source object will satisfy the misplaced object queue entry and move to destination.
- Parameters
q_policy_index – the policy_index for the source object
account – the account name of the misplaced object
container – the container name of the misplaced object
obj – the name of the misplaced object
q_ts – the timestamp of the misplaced object
path – the full path of the misplaced object for logging
container_policy_index – the policy_index of the destination
source_ts – the timestamp of the source object
source_obj_status – the HTTP status of the source object request
source_obj_info – the HTTP headers of the source object request
source_obj_iter – the body iter of the source object request
- ensure_tombstone_in_right_location(q_policy_index, account, container, obj, q_ts, path, container_policy_index, source_ts, **kwargs)¶
Issue a DELETE request against the destination to match the misplaced DELETE against the source.
- log_route = 'container-reconciler'¶
- log_stats(force=False)¶
Dump stats to logger; no-op when stats have already been logged in the last minute.
- pop_queue(container, obj, q_ts, q_record)¶
Issue a delete object request to the container for the misplaced object queue entry.
- Parameters
container – the misplaced objects container
obj – the name of the misplaced object
q_ts – the timestamp of the misplaced object
q_record – the timestamp of the queue entry
N.B. q_ts will normally be the same time as q_record except when an object was manually re-enqueued.
- process_queue_item(q_container, q_entry, queue_item)¶
Process an entry and remove from queue on success.
- Parameters
q_container – the queue container
q_entry – the raw_obj name from the q_container
queue_item – a parsed entry from the queue
- reconcile()¶
Main entry point for concurrent processing of misplaced objects.
Iterate over all queue entries and delegate processing to spawned workers in the pool.
- reconcile_object(info)¶
Process a possibly misplaced object write request. Determine correct destination storage policy by checking with primary containers. Check source and destination, copying or deleting into destination and cleaning up the source as needed.
This method wraps _reconcile_object for exception handling.
- Parameters
info – a queue entry dict
- Returns
True to indicate the request is fully processed successfully, otherwise False.
- run_forever(*args, **kwargs)¶
Override this to run forever
- run_once(*args, **kwargs)¶
Process every entry in the queue.
- should_process(queue_item)¶
Check if a given entry should be handled by this process.
- Parameters
queue_item – an entry from the queue
- stats_log(metric, msg, *args, **kwargs)¶
Update stats tracking for metric and emit log message.
- throw_tombstones(account, container, obj, timestamp, policy_index, path)¶
Issue a delete object request to the given storage_policy.
- Parameters
account – the account name
container – the container name
obj – the object name
timestamp – the timestamp of the object to delete
policy_index – the policy index to direct the request
path – the path to be used for logging
- swift.container.reconciler.add_to_reconciler_queue(container_ring, account, container, obj, obj_policy_index, obj_timestamp, op, force=False, conn_timeout=5, response_timeout=15)¶
Add an object to the container reconciler’s queue. This will cause the container reconciler to move it from its current storage policy index to the correct storage policy index.
- Parameters
container_ring – container ring
account – the misplaced object’s account
container – the misplaced object’s container
obj – the misplaced object
obj_policy_index – the policy index where the misplaced object currently is
obj_timestamp – the misplaced object’s X-Timestamp. We need this to ensure that the reconciler doesn’t overwrite a newer object with an older one.
op – the method of the operation (DELETE or PUT)
force – over-write queue entries newer than obj_timestamp
conn_timeout – max time to wait for connection to container server
response_timeout – max time to wait for response from container server
- Returns
.misplaced_object container name, False on failure. “Success” means a majority of containers got the update.
- swift.container.reconciler.best_policy_index(headers)¶
- swift.container.reconciler.cmp_policy_info(info, remote_info)¶
You have to squint to see it, but the general strategy is just:
- if either has been recreated:
return the newest (of the recreated)
- else
return the oldest
I tried cleaning it up for a while, but settled on just writing a bunch of tests instead. Once you get an intuitive sense for the nuance here you can try and see if there’s a better way to spell the boolean logic, but it all ends up looking sorta hairy.
- Returns
-1 if info is correct, 1 if remote_info is better
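The general strategy above can be written out as a toy comparison. This is only a sketch of the two-branch rule, not Swift's code: the `PolicyInfo` class and its `status_changed_at` and `recreated` fields are hypothetical simplifications of the real info dicts, and breaking ties in favour of the local info is an assumption.

```python
from dataclasses import dataclass


@dataclass
class PolicyInfo:
    """Hypothetical, simplified stand-in for a container info dict."""
    status_changed_at: float   # when this policy choice was made
    recreated: bool            # was the container deleted and re-created?


def cmp_policy_info_sketch(info, remote_info):
    """Return -1 if info should win, 1 if remote_info should win."""
    if info.recreated and remote_info.recreated:
        # Both recreated: the newest wins (ties favour the local info).
        newest_is_local = (info.status_changed_at
                           >= remote_info.status_changed_at)
        return -1 if newest_is_local else 1
    if info.recreated:
        return -1
    if remote_info.recreated:
        return 1
    # Neither recreated: the oldest wins (ties favour the local info).
    oldest_is_local = info.status_changed_at <= remote_info.status_changed_at
    return -1 if oldest_is_local else 1
```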
- swift.container.reconciler.direct_delete_container_entry(container_ring, account_name, container_name, object_name, headers=None)¶
Talk directly to the primary container servers to delete a particular object listing. Does not talk to object servers; use this only when a container entry does not actually have a corresponding object.
- swift.container.reconciler.get_reconciler_container_name(obj_timestamp)¶
Get the name of a container into which a misplaced object should be enqueued. The name is the object’s last modified time rounded down to the nearest hour.
- Parameters
obj_timestamp – a string representation of the object’s ‘created_at’ time from its container db row.
- Returns
a container name
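Rounding down to the hour can be illustrated with a short sketch. The function name `hour_bucket_name` and the `divisor` parameter are hypothetical, and the sketch assumes the timestamp is a plain decimal string of seconds; it does not handle any other timestamp encoding.

```python
def hour_bucket_name(obj_timestamp, divisor=3600):
    """Round a timestamp string down to a whole multiple of divisor seconds."""
    seconds = int(float(obj_timestamp))   # drop the fractional part
    return str(seconds // divisor * divisor)
```

All objects whose timestamps fall within the same hour map to the same name, so their queue entries land in the same container.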
- swift.container.reconciler.get_reconciler_content_type(op)¶
- swift.container.reconciler.get_reconciler_obj_name(policy_index, account, container, obj)¶
- swift.container.reconciler.get_row_to_q_entry_translator(broker)¶
- swift.container.reconciler.incorrect_policy_index(info, remote_info)¶
Compare remote_info to info and decide if the remote storage policy index should be used instead of ours.
- swift.container.reconciler.parse_raw_obj(obj_info)¶
Translate a reconciler container listing entry to a dictionary containing the parts of the misplaced object queue entry.
- Parameters
obj_info – an entry in a container listing with the required keys: name, content_type, and hash
- Returns
a queue entry dict with the keys: q_policy_index, account, container, obj, q_op, q_ts, q_record, and path
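A sketch of this translation follows. It rests on assumptions that this reference does not state: that the entry name has the form `<policy_index>:/<account>/<container>/<obj>`, that the content type is `application/x-put` or `application/x-delete`, and that `hash` carries the queued timestamp. The function name is hypothetical, and `q_record` is omitted because it depends on a key that is not in the required list.

```python
# ASSUMPTION: these content types encode the queued operation.
CONTENT_TYPE_TO_OP = {
    'application/x-put': 'PUT',
    'application/x-delete': 'DELETE',
}


def parse_queue_entry_sketch(obj_info):
    """Split a queue listing entry into the parts of a queue entry dict."""
    policy_index, obj_path = obj_info['name'].split(':', 1)
    # obj_path looks like /account/container/obj; obj may contain slashes.
    _empty, account, container, obj = obj_path.split('/', 3)
    return {
        'q_policy_index': int(policy_index),
        'account': account,
        'container': container,
        'obj': obj,
        'q_op': CONTENT_TYPE_TO_OP[obj_info['content_type']],
        'q_ts': obj_info['hash'],   # kept as a string in this sketch
        'path': '/%s/%s/%s' % (account, container, obj),
    }
```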
- swift.container.reconciler.slightly_later_timestamp(ts, offset=1)¶
- swift.container.reconciler.translate_container_headers_to_info(headers)¶
Container Sharder¶
- class swift.container.sharder.CleavingContext(ref, cursor='', max_row=None, cleave_to_row=None, last_cleave_to_row=None, cleaving_done=False, misplaced_done=False, ranges_done=0, ranges_todo=0)¶
Bases:
object
Encapsulates metadata associated with the process of cleaving a retiring DB. This metadata includes:
ref: The unique part of the key that is used when persisting a serialized CleavingContext as sysmeta in the DB. The unique part of the key is based off the DB id. This ensures that each context is associated with a specific DB file. The unique part of the key is included in the CleavingContext but should not be modified by any caller.
cursor: the upper bound of the last shard range to have been cleaved from the retiring DB.
max_row: the retiring DB’s max row; this is updated to the value of the retiring DB’s max_row every time a CleavingContext is loaded for that DB, and may change during the process of cleaving the DB.
cleave_to_row: the value of max_row at the moment when cleaving starts for the DB. When cleaving completes (i.e. the cleave cursor has reached the upper bound of the cleaving namespace), cleave_to_row is compared to the current max_row: if the two values are not equal then rows have been added to the DB which may not have been cleaved, in which case the CleavingContext is reset and cleaving is re-started.
last_cleave_to_row: the minimum DB row from which cleaving should select objects to cleave; this is initially set to None i.e. all rows should be cleaved. If the CleavingContext is reset then the last_cleave_to_row is set to the current value of cleave_to_row, which in turn is set to the current value of max_row by a subsequent call to start. The repeated cleaving therefore only selects objects in rows greater than the last_cleave_to_row, rather than cleaving the whole DB again.
ranges_done: the number of shard ranges that have been cleaved from the retiring DB.
ranges_todo: the number of shard ranges that are yet to be cleaved from the retiring DB.
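The interplay of max_row, cleave_to_row and last_cleave_to_row can be modelled with a toy class. This is a sketch of the bookkeeping described above only: the class name `CleavingProgress` and the helper methods `rows_added_during_cleave` and `rows_to_cleave` are hypothetical, and persistence, shard ranges and the range counters are left out.

```python
class CleavingProgress:
    """Toy model of the row bookkeeping described above (hypothetical)."""

    def __init__(self, max_row):
        self.cursor = ''
        self.max_row = max_row            # refreshed whenever the DB grows
        self.cleave_to_row = None
        self.last_cleave_to_row = None    # None means cleave all rows

    def start(self):
        """Begin a cleaving pass up to the current max_row."""
        self.cursor = ''
        self.cleave_to_row = self.max_row

    def reset(self):
        """Remember how far the last pass got, ready for a new start()."""
        self.cursor = ''
        self.last_cleave_to_row = self.cleave_to_row
        self.cleave_to_row = None

    def rows_added_during_cleave(self):
        """True if rows arrived after start(), so another pass is needed."""
        return self.cleave_to_row != self.max_row

    def rows_to_cleave(self, all_rows):
        """Select row ids greater than last_cleave_to_row (all if None)."""
        floor = self.last_cleave_to_row
        return [r for r in all_rows if floor is None or r > floor]
```

In this model, a DB that grows from 100 to 120 rows during the first pass is reset and restarted, and the second pass selects only rows 101 to 120.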
- property cursor¶
- delete(broker)¶
- done()¶
- classmethod load(broker)¶
Returns a CleavingContext tracking the cleaving progress of the given broker’s DB.
- Parameters
broker – an instance of ContainerBroker
- Returns
An instance of CleavingContext.
- classmethod load_all(broker)¶
Returns all cleaving contexts stored in the broker’s DB.
- Parameters
broker – an instance of ContainerBroker
- Returns
list of tuples of (CleavingContext, timestamp)
- property marker¶
- range_done(new_cursor)¶
- reset()¶
- start()¶
- store(broker)¶
Persists the serialized CleavingContext as sysmeta in the given broker’s DB.
- Parameters
broker – an instance of ContainerBroker
- class swift.container.sharder.ContainerSharder(conf, logger=None)¶
Bases:
swift.container.sharder.ContainerSharderConf, swift.container.replicator.ContainerReplicator
Shards containers.
- debug(broker, msg, *args, **kwargs)¶
- error(broker, msg, *args, **kwargs)¶
- exception(broker, msg, *args, **kwargs)¶
- info(broker, msg, *args, **kwargs)¶
- log_route = 'container-sharder'¶
- run_forever(*args, **kwargs)¶
Run the container sharder until stopped.
- run_once(*args, **kwargs)¶
Run the container sharder once.
- warning(broker, msg, *args, **kwargs)¶
- yield_objects(broker, src_shard_range, since_row=None, batch_size=None)¶
Iterates through all object rows in src_shard_range in name order yielding them in lists of up to batch_size in length. All batches of rows that are not marked deleted are yielded before all batches of rows that are marked deleted.
- Parameters
broker – A ContainerBroker.
src_shard_range – A ShardRange describing the source range.
since_row – include only object rows whose ROWID is greater than the given row id; by default all object rows are included.
batch_size – The maximum number of object rows to include in each yielded batch; defaults to cleave_row_batch_size.
- Returns
a generator of tuples of (list of rows, broker info dict)
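The batching order described for yield_objects can be sketched with a plain generator. This is an illustration under assumptions: rows are modelled as dicts with hypothetical `name` and `deleted` keys, the function name is invented, and the sketch yields bare lists rather than (rows, info) tuples and does no database access.

```python
def yield_row_batches(rows, batch_size):
    """Yield lists of up to batch_size rows: live rows first, then deleted."""
    ordered = sorted(rows, key=lambda row: row['name'])
    for want_deleted in (False, True):
        batch = []
        for row in ordered:
            if bool(row['deleted']) != want_deleted:
                continue
            batch.append(row)
            if len(batch) == batch_size:
                yield batch
                batch = []
        if batch:
            # Flush the final, possibly short, batch for this pass.
            yield batch
```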
- yield_objects_to_shard_range(broker, src_shard_range, dest_shard_ranges)¶
Iterates through all object rows in src_shard_range to place them in destination shard ranges provided by the dest_shard_ranges function. Yields tuples of (batch of object rows, destination shard range in which those object rows belong, broker info).
If no destination shard range exists for a batch of object rows then tuples are yielded of (batch of object rows, None, broker info). This indicates to the caller that there are a non-zero number of object rows for which no destination shard range was found.
Note that the same destination shard range may be referenced in more than one yielded tuple.
- Parameters
broker – A ContainerBroker.
src_shard_range – A ShardRange describing the source range.
dest_shard_ranges – A function which should return a list of destination shard ranges sorted in the order defined by sort_key().
- Returns
a generator of tuples of (object row list, shard range, broker info dict) where shard_range may be None.
- class swift.container.sharder.ContainerSharderConf(conf=None)¶
Bases:
object
- percent_of_threshold(val)¶
- classmethod validate_conf(namespace)¶
- swift.container.sharder.combine_shard_ranges(new_shard_ranges, existing_shard_ranges)¶
Combines new and existing shard ranges based on most recent state.
- Parameters
new_shard_ranges – a list of ShardRange instances.
existing_shard_ranges – a list of ShardRange instances.
- Returns
a list of ShardRange instances.
- swift.container.sharder.finalize_shrinking(broker, acceptor_ranges, donor_ranges, timestamp)¶
Update donor shard ranges to shrinking state and merge donors and acceptors to broker.
- Parameters
broker – A ContainerBroker.
acceptor_ranges – A list of ShardRange that are to be acceptors.
donor_ranges – A list of ShardRange that are to be donors; these will have their state and timestamp updated.
timestamp – timestamp to use when updating donor state
- swift.container.sharder.find_compactible_shard_sequences(broker, shrink_threshold, expansion_limit, max_shrinking, max_expanding, include_shrinking=False)¶
Find sequences of shard ranges that could be compacted into a single acceptor shard range.
This function does not modify shard ranges.
- Parameters
broker – A ContainerBroker.
shrink_threshold – the number of rows below which a shard may be considered for shrinking into another shard
expansion_limit – the maximum number of rows that an acceptor shard range should have after other shard ranges have been compacted into it
max_shrinking – the maximum number of shard ranges that should be compacted into each acceptor; -1 implies unlimited.
max_expanding – the maximum number of acceptors to be found (i.e. the maximum number of sequences to be returned); -1 implies unlimited.
include_shrinking – if True then existing compactible sequences are included in the results; default is False.
- Returns
A list of ShardRangeList each containing a sequence of neighbouring shard ranges that may be compacted; the final shard range in the list is the acceptor
- swift.container.sharder.find_overlapping_ranges(shard_ranges, exclude_parent_child=False, time_period=0)¶
Find all pairs of overlapping ranges in the given list.
- Parameters
shard_ranges – A list of ShardRange
exclude_parent_child – If True then overlapping pairs that have a parent-child relationship within the past time period time_period are excluded from the returned set. Default is False.
time_period – the specified past time period in seconds. Value of 0 means all time in the past.
- Returns
a set of tuples, each tuple containing ranges that overlap with each other.
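Pairwise overlap detection can be sketched on simplified ranges. The sketch makes several assumptions: each range is a hypothetical `(lower, upper)` tuple of non-empty strings with an exclusive lower bound and an inclusive upper bound, unbounded namespace ends are not modelled, and the parent-child exclusion is ignored. The function names are invented for the example.

```python
from itertools import combinations


def overlaps(a, b):
    """True if ranges (lower, upper] a and b share any part of the namespace."""
    return a[0] < b[1] and b[0] < a[1]


def find_overlapping_pairs(ranges):
    """Return the set of (a, b) pairs of ranges that overlap each other."""
    return {(a, b) for a, b in combinations(sorted(ranges), 2)
            if overlaps(a, b)}
```

Ranges that merely touch, such as `('a', 'f')` and `('f', 'm')`, are not reported, because the upper bound of one is excluded from the other.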
- swift.container.sharder.find_paths(shard_ranges)¶
Returns a list of all continuous paths through the shard ranges. An individual path may not necessarily span the entire namespace, but it will span a continuous namespace without gaps.
- Parameters
shard_ranges – A list of ShardRange.
- Returns
A list of ShardRangeList.
- swift.container.sharder.find_paths_with_gaps(shard_ranges, within_range=None)¶
Find gaps in the shard ranges and pairs of shard range paths that lead to and from those gaps. For each gap a single pair of adjacent paths is selected. The concatenation of all selected paths and gaps will span the entire namespace with no overlaps.
- Parameters
shard_ranges – a list of instances of ShardRange.
within_range – an optional ShardRange that constrains the search space; the method will only return gaps within this range. The default is the entire namespace.
- Returns
A list of tuples of (start_path, gap_range, end_path) where start_path is a list of ShardRanges leading to the gap, gap_range is a ShardRange synthesized to describe the namespace gap, and end_path is a list of ShardRanges leading from the gap. When gaps start or end at the namespace minimum or maximum bounds, start_path and end_path may be ‘null’ paths that contain a single ShardRange covering either the minimum or maximum of the namespace.
- swift.container.sharder.find_sharding_candidates(broker, threshold, shard_ranges=None)¶
- swift.container.sharder.find_shrinking_candidates(broker, shrink_threshold, expansion_limit)¶
- swift.container.sharder.is_sharding_candidate(shard_range, threshold)¶
- swift.container.sharder.is_shrinking_candidate(shard_range, shrink_threshold, expansion_limit, states=None)¶
- swift.container.sharder.make_shard_ranges(broker, shard_data, shards_account_prefix)¶
- swift.container.sharder.process_compactible_shard_sequences(broker, sequences)¶
Transform the given sequences of shard ranges into a list of acceptors and a list of shrinking donors. For each given sequence the final ShardRange in the sequence (the acceptor) is expanded to accommodate the other ShardRanges in the sequence (the donors). The donors and acceptors are then merged into the broker.
- Parameters
broker – A ContainerBroker.
sequences – A list of ShardRangeList
- swift.container.sharder.random()¶
Returns x in the interval [0, 1).
- swift.container.sharder.rank_paths(paths, shard_range_to_span)¶
Sorts the given list of paths such that the most preferred path is the first item in the list.
- Parameters
paths – A list of ShardRangeList.
shard_range_to_span – An instance of ShardRange that describes the namespace that would ideally be spanned by a path. Paths that include this namespace will be preferred over those that do not.
- Returns
A sorted list of ShardRangeList.
- swift.container.sharder.sharding_enabled(broker)¶
- swift.container.sharder.update_own_shard_range_stats(broker, own_shard_range)¶
Update the own_shard_range with the up-to-date object stats from the broker.
Note: this method does not persist the updated own_shard_range; callers should use broker.merge_shard_ranges if the updated stats need to be persisted.
- Parameters
broker – an instance of ContainerBroker.
own_shard_range – an instance of ShardRange.
- Returns
own_shard_range with up-to-date object_count and bytes_used.
Container Sync¶
- class swift.container.sync.ContainerSync(conf, container_ring=None, logger=None)¶
Bases:
swift.common.daemon.Daemon
Daemon to sync syncable containers.
This is done by scanning the local devices for container databases and checking for x-container-sync-to and x-container-sync-key metadata values. If they exist, newer rows since the last sync will trigger PUTs or DELETEs to the other container.
The actual syncing is slightly more complicated to make use of the three (or number-of-replicas) main nodes for a container without each trying to do the exact same work but also without missing work if one node happens to be down.
Two sync points are kept per container database. All rows between the two sync points trigger updates. Any rows newer than both sync points cause updates depending on the node’s position for the container (primary nodes do one third, etc. depending on the replica count of course). After a sync run, the first sync point is set to the newest ROWID known and the second sync point is set to newest ROWID for which all updates have been sent.
An example may help. Assume replica count is 3 and perfectly matching ROWIDs starting at 1.
First sync run, database has 6 rows:
SyncPoint1 starts as -1.
SyncPoint2 starts as -1.
No rows between points, so no “all updates” rows.
Six rows newer than SyncPoint1, so a third of the rows are sent by node 1, another third by node 2, remaining third by node 3.
SyncPoint1 is set as 6 (the newest ROWID known).
SyncPoint2 is left as -1 since no “all updates” rows were synced.
Next sync run, database has 12 rows:
SyncPoint1 starts as 6.
SyncPoint2 starts as -1.
The rows between -1 and 6 all trigger updates (most of which should short-circuit on the remote end as having already been done).
Six more rows newer than SyncPoint1, so a third of the rows are sent by node 1, another third by node 2, remaining third by node 3.
SyncPoint1 is set as 12 (the newest ROWID known).
SyncPoint2 is set as 6 (the newest “all updates” ROWID).
In this way, under normal circumstances each node sends its share of updates each run and just sends a batch of older updates to ensure nothing was missed.
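The two-sync-point bookkeeping in the example above can be simulated for a single node. This is a sketch, not the daemon's logic: the function name and arguments are hypothetical, and sharing out the newer rows by row id modulo the replica count is an assumed stand-in for however the real daemon assigns rows to nodes.

```python
def sync_run(row_ids, sync_point1, sync_point2, node_index, replica_count=3):
    """Return (rows_sent, new_sync_point1, new_sync_point2) for one node."""
    # Rows between the two sync points are sent by every node.
    catch_up = [r for r in row_ids if sync_point2 < r <= sync_point1]
    # Rows newer than both points are shared out between the nodes.
    # ASSUMPTION: row id modulo replica count stands in for the real split.
    newer = [r for r in row_ids if r > sync_point1]
    own_share = [r for r in newer if r % replica_count == node_index]
    # The first point moves to the newest row id known.
    new_sync_point1 = max(row_ids, default=sync_point1)
    # The second point moves only if "all updates" rows were sent.
    new_sync_point2 = sync_point1 if catch_up else sync_point2
    return catch_up + own_share, new_sync_point1, new_sync_point2
```

Running it twice, first with rows 1 to 6 and then with rows 1 to 12, reproduces the sync point values from the worked example: (6, -1) after the first run and (12, 6) after the second.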
- Parameters
conf – The dict of configuration values from the [container-sync] section of the container-server.conf
container_ring – If None, the <swift_dir>/container.ring.gz will be loaded. This is overridden by unit tests.
- allowed_sync_hosts¶
The list of hosts we’re allowed to send syncs to. This can be overridden by data in self.realms_conf
- conf¶
The dict of configuration values from the [container-sync] section of the container-server.conf.
- container_deletes¶
Number of successful DELETEs triggered.
- container_failures¶
Number of containers that had a failure of some type.
- container_puts¶
Number of successful PUTs triggered.
- container_report(start, end, sync_point1, sync_point2, info, max_row)¶
- container_ring¶
swift.common.ring.Ring for locating containers.
- container_skips¶
Number of containers whose sync has been turned off, but are not yet cleared from the sync store.
- container_stats¶
Per container stats. These are collected per container.
puts - the number of puts that were done for the container
deletes - the number of deletes that were done for the container
bytes - the total number of bytes transferred for the container
- container_sync(path)¶
Checks the given path for a container database, determines if syncing is turned on for that database and, if so, sends any updates to the other container.
- Parameters
path – the path to a container db
- container_sync_row(row, sync_to, user_key, broker, info, realm, realm_key)¶
Sends the update the row indicates to the sync_to container. Update can be either delete or put.
- Parameters
row – The updated row in the local database triggering the sync update.
sync_to – The URL to the remote container.
user_key – The X-Container-Sync-Key to use when sending requests to the other container.
broker – The local container database broker.
info – The get_info result from the local container database broker.
realm – The realm from self.realms_conf, if there is one. If None, fallback to using the older allowed_sync_hosts way of syncing.
realm_key – The realm key from self.realms_conf, if there is one. If None, fallback to using the older allowed_sync_hosts way of syncing.
- Returns
True on success
- container_syncs¶
Number of containers with sync turned on that were successfully synced.
- container_time¶
Maximum amount of time to spend syncing a container before moving on to the next one. If a container sync hasn’t finished in this time, it’ll just be resumed next scan.
- devices¶
Path to the local device mount points.
- interval¶
Minimum time between full scans. This is to keep the daemon from running wild on near empty systems.
- log_route = 'container-sync'¶
- logger¶
Logger to use for container-sync log lines.
- mount_check¶
Indicates whether mount points should be verified as actual mount points (normally true, false for tests and SAIO).
- realms_conf¶
ContainerSyncCluster instance for validating sync-to values.
- report()¶
Writes a report of the stats to the logger and resets the stats for the next report.
- reported¶
Time of last stats report.
- run_forever(*args, **kwargs)¶
Runs container sync scans until stopped.
- run_once(*args, **kwargs)¶
Runs a single container sync scan.
- select_http_proxy()¶
- sync_store¶
ContainerSyncStore instance for iterating over synced containers
- swift.container.sync.random()¶
Returns x in the interval [0, 1).
Container Updater¶
- class swift.container.updater.ContainerUpdater(conf, logger=None)¶
Bases:
swift.common.daemon.Daemon
Update container information in account listings.
- container_report(node, part, container, put_timestamp, delete_timestamp, count, bytes, storage_policy_index)¶
Report container info to an account server.
- Parameters
node – node dictionary from the account ring
part – partition the account is on
container – container name
put_timestamp – put timestamp
delete_timestamp – delete timestamp
count – object count in the container
bytes – bytes used in the container
storage_policy_index – the policy index for the container
- container_sweep(path)¶
Walk the path looking for container DBs and process them.
- Parameters
path – path to walk
- get_account_ring()¶
Get the account ring. Load it if it hasn’t been yet.
- get_paths()¶
Get paths to all of the partitions on each drive to be processed.
- Returns
a list of paths
- process_container(dbfile)¶
Process a container, and update the information in the account.
- Parameters
dbfile – container DB to process
- run_forever(*args, **kwargs)¶
Run the updater continuously.
- run_once(*args, **kwargs)¶
Run the updater once.
- swift.container.updater.random()¶
Returns x in the interval [0, 1).