Mooncake Store Deployment & Operations Guide
This page summarizes useful flags, environment variables, and HTTP endpoints to help advanced users tune Mooncake Master and observe metrics.
Master Startup Flags (with defaults)
RPC Related
- --rpc_port (int, default 50051): RPC listen port.
- --rpc_thread_num (int, default min(4, CPU cores)): RPC worker threads. If not set, uses --max_threads (default 4) capped by CPU cores.
- --rpc_address (str, default 0.0.0.0): RPC bind address.
- --rpc_conn_timeout_seconds (int, default 0): RPC idle connection timeout; 0 disables.
- --rpc_enable_tcp_no_delay (bool, default true): Enable TCP_NODELAY.
Metrics
- --enable_metric_reporting (bool, default true): Periodically log master metrics to INFO.
- --metrics_port (int, default 9003): HTTP port for /metrics endpoints.
HTTP Metadata Server For Mooncake Transfer Engine
- --enable_http_metadata_server (bool, default false): Enable embedded HTTP metadata server.
- --http_metadata_server_host (str, default 0.0.0.0): Metadata bind host.
- --http_metadata_server_port (int, default 8080): Metadata TCP port.
Allocation Strategy
- --allocation_strategy (str, default random): Memory allocation strategy for replica placement. Available options:
  - random: Pure random selection across segments (baseline, fastest).
  - free_ratio_first: Free-ratio-first strategy. Samples multiple candidates and selects those with the highest free space ratio for better load balancing.
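The difference between the two strategies can be illustrated with a minimal Python sketch. This is not the master's actual implementation: the `Segment` class, the `sample_size` parameter, and the choice of keeping a single best candidate are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Segment:
    name: str      # hypothetical segment identifier
    capacity: int  # total bytes in the segment
    used: int      # bytes currently allocated

    @property
    def free_ratio(self) -> float:
        return (self.capacity - self.used) / self.capacity


def pick_random(segments, rng):
    """Baseline: choose any segment uniformly at random."""
    return rng.choice(segments)


def pick_free_ratio_first(segments, rng, sample_size=3):
    """Sample a few candidates, then keep the one with the highest free ratio."""
    # sample_size is an assumed knob, not a documented flag.
    candidates = rng.sample(segments, min(sample_size, len(segments)))
    return max(candidates, key=lambda s: s.free_ratio)
```

Sampling only a few candidates keeps the selection cheap while still steering new replicas away from nearly full segments, which is the trade-off the option descriptions point at.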
Eviction and TTLs
- --default_kv_lease_ttl (uint64, default 5000 ms): Default lease TTL for KV objects.
- --default_kv_soft_pin_ttl (uint64, default 1800000 ms): Soft pin TTL (30 minutes).
- --allow_evict_soft_pinned_objects (bool, default true): Allow evicting soft-pinned objects.
- --eviction_ratio (double, default 0.05): Fraction evicted when hitting high watermark.
- --eviction_high_watermark_ratio (double, default 0.95): Usage ratio to trigger eviction.
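To get a feel for how the two eviction ratios interact, the following sketch works through the arithmetic. It assumes that usage is measured in bytes and that the eviction fraction applies to total capacity; the master may instead count objects or use a different base, so treat the numbers as an approximation only.

```python
def eviction_plan(capacity_bytes, used_bytes,
                  high_watermark_ratio=0.95, eviction_ratio=0.05):
    """Return (triggered, bytes_to_evict) under the assumed model."""
    usage = used_bytes / capacity_bytes
    triggered = usage >= high_watermark_ratio
    # Assumption: the eviction fraction is taken from total capacity.
    to_evict = round(capacity_bytes * eviction_ratio) if triggered else 0
    return triggered, to_evict


# 96% usage crosses the default 0.95 watermark, so 5% would be evicted.
triggered, to_evict = eviction_plan(capacity_bytes=1000, used_bytes=960)
print(triggered, to_evict)
```

Under this model, raising `--eviction_ratio` frees more space per eviction pass, while lowering `--eviction_high_watermark_ratio` makes eviction start earlier.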
High Availability (optional)
- --enable_ha (bool, default false): Enable HA (requires etcd).
- --etcd_endpoints (str, default empty unless HA config): etcd endpoints, semicolon separated.
- --client_ttl (int64, default 10 s): Client alive TTL after last ping (HA mode).
- --cluster_id (str, default mooncake_cluster): Cluster ID for persistence in HA mode.
Task Manager (optional)
- --max_total_finished_tasks (uint32, default 10000): Maximum number of finished tasks to keep in memory. When this limit is reached, the oldest finished tasks will be pruned from memory.
- --max_total_pending_tasks (uint32, default 10000): Maximum number of pending tasks that can be queued in memory. When this limit is reached, new task submissions will fail with TASK_PENDING_LIMIT_EXCEEDED error.
- --max_total_processing_tasks (uint32, default 10000): Maximum number of tasks that can be processing simultaneously. When this limit is reached, no new tasks will be popped from the pending queue until some processing tasks complete.
- --max_retry_attempts (uint32, default 10): Maximum number of retry attempts for failed tasks. Tasks that fail with NO_AVAILABLE_HANDLE error will be retried up to this many times before being marked as failed.
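The four limits describe a bounded task pipeline. The sketch below models that behavior in Python to make the interactions concrete. The class and method names are hypothetical, and details such as whether a retried task counts against the pending limit are assumptions rather than documented behavior.

```python
from collections import deque


class TaskQueueSketch:
    """Toy model of the pending -> processing -> finished pipeline."""

    def __init__(self, max_pending=10000, max_processing=10000,
                 max_finished=10000, max_retry_attempts=10):
        self.max_pending = max_pending
        self.max_processing = max_processing
        self.max_retry_attempts = max_retry_attempts
        self.pending = deque()
        self.processing = {}                        # task_id -> retries so far
        self.finished = deque(maxlen=max_finished)  # oldest entries drop first

    def submit(self, task_id):
        if len(self.pending) >= self.max_pending:
            return "TASK_PENDING_LIMIT_EXCEEDED"
        self.pending.append((task_id, 0))
        return "OK"

    def pop_next(self):
        # Nothing is popped while the processing set is full.
        if not self.pending or len(self.processing) >= self.max_processing:
            return None
        task_id, retries = self.pending.popleft()
        self.processing[task_id] = retries
        return task_id

    def complete(self, task_id, error=None):
        retries = self.processing.pop(task_id)
        if error == "NO_AVAILABLE_HANDLE" and retries < self.max_retry_attempts:
            # Assumption: retries are re-queued without a pending-limit check.
            self.pending.append((task_id, retries + 1))
            return "RETRY"
        status = "FAILED" if error else "SUCCESS"
        self.finished.append((task_id, status))
        return status
```

In this model a full pending queue pushes back on submitters, a full processing set stalls dispatch, and the finished history is the only place where data is silently dropped.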
DFS Storage (optional)
- --root_fs_dir (str, default empty): DFS mount directory for storage backend, used in Multi-layer Storage Support.
- --global_file_segment_size (int64, default int64_max): Maximum available space for DFS segments.
Snapshot / Restore (optional)
- --enable_snapshot (bool, default false): Enable periodic snapshots of master metadata (effective when using the offset memory allocator).
- --snapshot_interval_seconds (uint64, default 600): Interval in seconds between periodic snapshots of master data.
- --snapshot_child_timeout_seconds (uint64, default 300): Timeout in seconds for each snapshot child process.
- --snapshot_retention_count (uint32, default 2): Number of recent snapshots to keep. Older snapshots beyond this limit are deleted automatically.
- --snapshot_backend_type (str, required when snapshot enabled): Snapshot storage backend type: local for the local filesystem, s3 for S3 storage.
- --snapshot_backup_dir (str, default empty): Optional local directory for snapshot backup. If empty (default), local backup is disabled. When set, it serves two purposes: (1) during snapshot persistence, data is saved locally as a fallback if uploading to the backend fails; (2) during restore, downloaded metadata is also saved to this directory as a local backup.
- --enable_snapshot_restore (bool, default false): Enable restore from the latest snapshot at master startup.

Environment variable
- MOONCAKE_SNAPSHOT_LOCAL_PATH (required when --snapshot_backend_type=local): Persistent directory path for local snapshot storage. This variable must be set before starting the master; there is no default value. Example: export MOONCAKE_SNAPSHOT_LOCAL_PATH=/data/mooncake_snapshots.
Warning: Managed Directory
The snapshot storage path (MOONCAKE_SNAPSHOT_LOCAL_PATH for local backend, or S3 bucket for S3 backend) is a managed directory exclusively controlled by the Mooncake snapshot system. DO NOT store other files or data in this directory. Old snapshots exceeding --snapshot_retention_count will be automatically and permanently deleted during cleanup. Use a dedicated, isolated directory for snapshot storage to avoid accidental data loss.
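The snapshot requirements above can be checked before launching the master. The following is a hypothetical pre-flight helper, not part of Mooncake: it takes the intended flags and environment as plain dictionaries and reports the two documented preconditions (a valid backend type, and the local path variable when the backend is local).

```python
def check_snapshot_config(flags, env):
    """Return a list of problems found in an intended snapshot setup."""
    problems = []
    if not flags.get("enable_snapshot"):
        return problems  # nothing to validate when snapshots are off

    backend = flags.get("snapshot_backend_type")
    if backend not in ("local", "s3"):
        problems.append("snapshot_backend_type must be 'local' or 's3'")
    if backend == "local" and not env.get("MOONCAKE_SNAPSHOT_LOCAL_PATH"):
        problems.append("MOONCAKE_SNAPSHOT_LOCAL_PATH must be set")
    return problems
```

A deployment script could call `check_snapshot_config(flags, os.environ)` and refuse to start the master if the returned list is not empty.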
Example (enable embedded HTTP metadata and metrics):
mooncake_master \
--enable_http_metadata_server=true \
--http_metadata_server_host=0.0.0.0 \
--http_metadata_server_port=8080 \
--rpc_thread_num=64 \
--metrics_port=9003 \
--enable_metric_reporting=true
Example (use free-ratio-first allocation strategy for better load balancing):
mooncake_master \
--allocation_strategy=free_ratio_first \
--enable_http_metadata_server=true \
--http_metadata_server_port=8080
Tips:
In addition to command-line flags, the Master also supports configuration via JSON and YAML files. For example:
mooncake_master \
--config_path=mooncake-store/conf/master.yaml
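When the master is launched from automation rather than by hand, the flag-based command lines above can be assembled programmatically. This is a small sketch that assumes only the `--name=value` syntax shown in the examples; `build_master_argv` is a hypothetical helper, not a Mooncake API.

```python
def build_master_argv(flags, binary="mooncake_master"):
    """Turn a {flag_name: value} mapping into an argv list."""
    argv = [binary]
    for name, value in flags.items():
        if isinstance(value, bool):
            # Boolean flags appear as true/false in the examples above.
            value = "true" if value else "false"
        argv.append(f"--{name}={value}")
    return argv


argv = build_master_argv({
    "allocation_strategy": "free_ratio_first",
    "enable_http_metadata_server": True,
    "http_metadata_server_port": 8080,
})
print(" ".join(argv))
```

The resulting list can be passed to `subprocess.run(argv)` on a host where the `mooncake_master` binary is on the PATH.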
Metrics Endpoints
The master exposes Prometheus-style metrics over HTTP on --metrics_port:
- GET /metrics: Prometheus format (text/plain; version=0.0.4).
- GET /metrics/summary: Human-readable summary.
Examples:
curl -s http://<master_host>:9003/metrics
curl -s http://<master_host>:9003/metrics/summary
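For quick scripting without a full Prometheus setup, the text returned by /metrics can be parsed directly. The sketch below handles the common `name{labels} value [timestamp]` sample lines of the Prometheus text format and skips comment lines. It does not assume any particular Mooncake metric names, and `fetch_metrics` is a hypothetical helper.

```python
from urllib.request import urlopen


def parse_prometheus_text(text):
    """Parse sample lines into {"name{labels}": float}; ignores # lines."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Labels may contain spaces, so split after the closing brace.
            end = line.rindex("}") + 1
            key, rest = line[:end], line[end:].split()
        else:
            parts = line.split()
            key, rest = parts[0], parts[1:]
        samples[key] = float(rest[0])  # rest[1], if present, is a timestamp
    return samples


def fetch_metrics(host, port=9003, timeout=5.0):
    """Fetch and parse /metrics from a running master."""
    with urlopen(f"http://{host}:{port}/metrics", timeout=timeout) as resp:
        return parse_prometheus_text(resp.read().decode("utf-8"))
```

For production monitoring, a Prometheus scrape job pointed at the same port is the more robust choice; this parser is only meant for ad hoc checks.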
Client/Engine Tuning (Env Vars, with defaults)
Topology discovery (Store Client → Transfer Engine)
- MC_MS_AUTO_DISC (default 1): Auto-discover NIC/GPU topology. Set 0 to disable and provide rdma_devices manually.
- MC_MS_FILTERS (default empty): Optional comma-separated NIC whitelist when auto-discovery is enabled (e.g., mlx5_0,mlx5_2).

If MC_MS_AUTO_DISC=0, pass rdma_devices (comma-separated) to the Python setup(...) call.
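The two topology modes can be captured in a small helper that produces the environment variables and the `rdma_devices` string together. `topology_settings` is a hypothetical function written for this guide; it only encodes the rules stated above and does not call into the Mooncake client.

```python
def topology_settings(auto_discover=True, nic_filters=(), rdma_devices=()):
    """Return (env_vars, rdma_devices_arg) for the chosen discovery mode."""
    env = {"MC_MS_AUTO_DISC": "1" if auto_discover else "0"}
    if auto_discover:
        if nic_filters:
            # Optional whitelist, only meaningful with auto-discovery on.
            env["MC_MS_FILTERS"] = ",".join(nic_filters)
        return env, ""
    if not rdma_devices:
        raise ValueError("rdma_devices is required when auto-discovery is off")
    return env, ",".join(rdma_devices)


env, devices = topology_settings(auto_discover=False, rdma_devices=["mlx5_0"])
print(env, devices)
```

A client script would typically apply the result with `os.environ.update(env)` before creating the store client, then pass `devices` as the `rdma_devices` argument of the Python `setup(...)` call.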
Transfer Engine metrics (disabled by default)
- MC_TE_METRIC (default 0/unset): Set to 1 to enable periodic engine metrics logging. Note: Not supported when using Transfer Engine TENT.
- MC_TE_METRIC_INTERVAL_SECONDS (default 5): Positive integer seconds between reports (effective only if metrics enabled).
Client metrics (enabled by default)
- MC_STORE_CLIENT_METRIC (default 1): Client-side metrics on by default; set 0 to disable entirely.
- MC_STORE_CLIENT_METRIC_INTERVAL (default 0): Reporting interval in seconds; 0 collects but does not periodically report.
Local memcpy optimization (Store transfer path)
- MC_STORE_MEMCPY (default 0/false): Set to 1 to prefer local memcpy when source/destination are on the same client.
Set the Log Level for yalantinglibs coro_rpc and coro_http
By default, the log level is set to warning. You can customize it using the following environment variable:
export MC_YLT_LOG_LEVEL=info
This sets the log level for yalantinglibs (including coro_rpc and coro_http) to info.
Available log levels: trace, debug, info, warn (or warning), error, and critical.
Quick Tips
- Scale --rpc_thread_num with available CPU cores and workload.
- Start with default eviction settings; adjust --eviction_high_watermark_ratio and --eviction_ratio based on memory pressure and object churn.
- Use /metrics/summary during bring-up; integrate /metrics with Prometheus/Grafana for production.