Operating Cirata Symphony

This guide covers operational procedures for running and maintaining a Symphony deployment, including health checks, log management, backup and recovery, and scaling.

Using the `cirata` CLI

The cirata command-line tool provides quick access to common operational tasks. It authenticates using a stored session (from cirata login) or the SYMPHONY_TOKEN environment variable.

# Authenticate with your Symphony instance
cirata login --address symphony.example.com

Once logged in, the CLI stores your token in the OS keychain and uses it automatically for subsequent commands. In headless environments, use --insecure-storage to store the token in the config file instead. See the CLI Reference for the full command list.

Common Operational Commands

Task	CLI Command	REST API Equivalent
Instance information	`cirata info`	`GET /api/v1/symphonyinfo`
Account details	`cirata account`	`GET /api/v1/apikey`
List extensions	`cirata extension list`	`GET /api/v1/extensions/all`
Extension features	`cirata extension features`	`GET /api/v1/extensions/features`
NATS credentials	`cirata nats credentials`	`GET /api/v1/apikey`
List objects	`cirata object list [bucket]`	—
Version	`cirata version`	—

Extensions that register OpenAPI specifications also become available as dynamic CLI commands. Use cirata --help to see all available commands.

Health Checks

API Health

Verify Symphony is running and responsive:

# Using the CLI
cirata info

# Using curl (no authentication required)
curl https://your-symphony-instance.com/api/v1/ping

The /api/v1/ping endpoint returns 200 OK and is a simple connectivity test. For richer health information, use the structured health endpoints:

Endpoint	Purpose	Response
`GET /api/v1/ping`	Connectivity test	`200` with `{"message": "pong"}`
`GET /api/v1/health`	Component health and version	`200` when healthy, `503` when degraded
`GET /api/v1/ready`	Readiness for traffic	`200` when ready, `503` when not

The /health endpoint reports the status of NATS and messaging components. The /ready endpoint is suitable for Kubernetes readiness probes and load balancer health checks.
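
A minimal sketch of how these endpoints might be wrapped in a monitoring script. It relies only on the status codes listed in the table above. The function names `classify_status` and `probe` are illustrative and not part of the cirata CLI, and the 5-second timeout is an arbitrary choice.

```shell
#!/bin/sh
# Illustrative helpers (hypothetical names, not part of the cirata CLI).

# Map an HTTP status code from /api/v1/health or /api/v1/ready to a verdict.
classify_status() {
  case "$1" in
    200) echo "ok" ;;
    503) echo "unavailable" ;;   # degraded (/health) or not ready (/ready)
    *)   echo "unexpected" ;;    # includes 000, which curl prints when no response arrived
  esac
}

# Fetch only the status code of a URL and classify it.
probe() {
  # -s: silent, -o /dev/null: discard the body, -w: print the status code.
  classify_status "$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1")"
}
```

For example, `probe https://your-symphony-instance.com/api/v1/ready` prints `ok`, `unavailable`, or `unexpected`, which a cron job or external monitor can alert on.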

Metrics

Platform and extension metrics are collected by the Observability Extension over NATS. Use the CLI to query available metrics:

# List metric namespaces
cirata observability find_all_namespaces

# Query a specific metric
cirata observability query_scalar_metrics http.requests.total

See Monitoring for details on the Observability Extension and available platform metrics.

Extension Health

List all connected extensions and their status:

# Using the CLI
cirata extension list

# Using curl
curl -H "Authorization: Bearer <token>" https://your-symphony-instance.com/api/v1/extensions/all

Extensions publish a status heartbeat every 30 seconds. An extension that has not published within the TTL window will be removed from the registry.

Logs

Symphony emits structured JSON logs to stdout/stderr. No application log files are written — log retention is handled by the host environment (journald on Linux, container runtime on Docker/Kubernetes).

# Linux (systemd)
sudo journalctl -u symphony -f

# Docker Compose
docker compose logs -f symphony

# Kubernetes
kubectl logs -f deployment/symphony

On Linux, install the bundled journald drop-in configuration to ensure logs are stored persistently with bounded retention. See Linux Installation: Log Retention for details.

For more information on the log format, fields, and levels, see Status: Logs.

Centralised Log Aggregation

In addition to stdout, Symphony supports delivering logs from both the platform and extensions to an external collector via OpenTelemetry (OTLP). When the observability extension is deployed, it aggregates logs from Symphony and all connected extensions and forwards them to a configured OTLP collector endpoint. This enables centralised log management in tools such as Grafana Loki, Datadog, or Elastic.

Configure OTLP collectors from the observability extension's Collectors page or its API. See Monitoring for details.

Enabling NATS Debug Logging

For NATS debug logging, uncomment debug: true and trace: true in nats.config and restart Symphony. NATS debug output is written to log/nats.log in the configuration directory. Disable debug logging after investigation as it generates significant output. See Configuration for details.

Backup and Recovery

What to Back Up

The configuration directory contains all state needed to restore a Symphony deployment. These items must be backed up together; a backup strategy that captures them independently will eventually produce an instance that rejects its own credentials.

Path	Contents	Criticality
`symphony.config`	Application configuration and cryptographic identity seeds (operator, symphony account, signing keys)	Critical—the trust root for every JWT and API token
`nats.config`	Messaging server configuration	Important—regenerable from `symphony.config` via setup wizard, but included in backups for completeness
`accounts/`	NATS account resolver JWTs—one file per registered user and extension	Critical—rebuildable from `symphony_users` bucket only if `symphony.config` survives
`storage/`	JetStream data: Symphony-owned KV buckets (users, roles, API keys, tokens, licenses, usage, business units, attribution rules, settings) and every user's private buckets	Critical—all operational state
`log/`	NATS debug logs (when enabled)	Optional—not required for recovery

Warning: The identity seeds in symphony.config are the root of trust for the deployment. Without them, all existing user accounts, API keys, and extension tokens become permanently unverifiable. Treat this file with the same care as a TLS private key.

Backup Schedule

Choose a backup frequency based on the rate of change in your deployment:

Environment	Recommended frequency	Rationale
Production	Daily, with retention of at least 7 days	Balances recovery granularity with storage cost
High-change (frequent user/extension onboarding)	Every 6–12 hours	Minimises data loss for rapidly changing state
Development / staging	Before upgrades or configuration changes	Ad-hoc backups are sufficient

Keep at least one weekly backup offsite (different region, account, or cloud) to survive region or account compromise. Test the restore procedure on a non-production environment at least once per quarter. A backup that has never been restored is a hope, not a backup.
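
A retention window like the 7 days suggested above could be enforced on the local backup directory with a small pruning helper. This is a sketch under assumptions: the function name `prune_backups` is hypothetical, and the `symphony-*.tar*` pattern assumes archives are named as in the examples in this guide. It only prunes local copies and does not touch offsite storage.

```shell
#!/bin/sh
# Hypothetical helper: delete local archives older than a retention window.
# Arguments: backup directory, number of days to keep.
prune_backups() (
  backup_dir=$1
  keep_days=$2
  # -mtime +N matches files whose age in whole days exceeds N, so with
  # N=7 a file must be at least 8 days old before it is removed.
  # -print lists each file as it is deleted.
  find "$backup_dir" -maxdepth 1 -type f -name 'symphony-*.tar*' \
    -mtime +"$keep_days" -print -delete
)
```

Running `prune_backups /backup 7` after each successful backup keeps roughly the last week of archives. Run it only after the new archive has been written, so a failed backup never leaves the directory empty.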

Caution: The JetStream store under storage/ is not safe to copy with tar or cp while Symphony is running. JetStream rewrites metadata files in place; a hot file-level copy can capture a metadata file mid-rewrite and produce an archive that will not replay. Use one of the approaches below instead.

Backup Procedures

Three approaches are supported, listed in order of preference. Use the first one that fits your environment.

Approach A — `cirata symphony backup` (recommended)

The built-in cirata symphony backup command produces a single consistent archive of /config by briefly disabling JetStream on the running server, tarring everything, and re-enabling JetStream. It captures Symphony-owned buckets, every user's private buckets, and the identity files in one archive, without any external tooling.

The command requires the symphony-admin role and is issued against the running instance over HTTPS. During the pause (seconds to a few minutes, depending on store size), JetStream operations return errors that well-behaved clients retry; plain NATS pub/sub is unaffected.

# Log in once; the token is used for subsequent admin commands.
cirata login --address symphony.example.com

# Stream the archive to a local file. --output defaults to a
# server-suggested filename in the current directory.
cirata symphony backup --output ./backup.tar

# Pipe directly to compression or encryption without ever landing
# the plaintext on disk.
cirata symphony backup --output - | gzip > backup.tar.gz
cirata symphony backup --output - | gpg --symmetric --cipher-algo AES256 -o backup.tar.gpg

# Against a Symphony pod in Kubernetes, invoke inside the pod so the
# archive never has to traverse the PVC boundary.
kubectl -n symphony exec deploy/my-release -- \
    cirata symphony backup -o - > backup-$(date +%Y%m%d-%H%M).tar
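
For scheduled runs, the pipe-to-gzip form can be wrapped so that a failed backup never leaves a plausible-looking archive behind. The sketch below uses only the documented `cirata symphony backup --output -` invocation. The function name `run_backup`, the `symphony-<timestamp>.tar.gz` naming, and the `.partial` suffix are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper: write a compressed archive into a directory and
# print its path, or fail without leaving a partial file behind.
run_backup() (
  dest_dir=$1
  final="$dest_dir/symphony-$(date +%Y%m%d-%H%M).tar.gz"
  partial="$final.partial"
  status_file=$(mktemp) || exit 1

  # A pipeline reports only gzip's exit status, so the exit status of the
  # backup command is recorded separately and checked afterwards.
  if { cirata symphony backup --output -; echo "$?" > "$status_file"; } \
       | gzip > "$partial" \
     && [ "$(cat "$status_file")" -eq 0 ]; then
    rm -f "$status_file"
    mv "$partial" "$final" && echo "$final"
  else
    rm -f "$status_file" "$partial"
    exit 1
  fi
)
```

A scheduler entry could then call `run_backup /backup` and treat a non-zero exit status as an alert condition. Writing to a temporary name and renaming at the end means anything matching `symphony-*.tar.gz` is a completed archive.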

Flags:

Flag	Short	Default	Description
`--output`	`-o`	Server-suggested filename	Output path. `-` writes to stdout.
`--pause-timeout`		`5m`	Upper bound on the JetStream outage window. Clamped to 30 s – 30 min.
`--progress`		off	Print byte-count progress to stderr.

Limitations:

Single-node only. In a NATS cluster, run against the JetStream meta-leader; the command refuses to run on a follower.
When Keycloak is bundled, its separate PVC is not included—back it up with Keycloak's own export tooling.
The archive format includes a JSON manifest as the first entry, so cirata symphony restore can validate it before touching the target filesystem.
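
Because the manifest is the first entry, an archive can be inspected with standard tar tooling before a restore is attempted. This is a sketch only: the manifest's file name and schema are not described in this guide, so the hypothetical helpers below simply list and print whatever entry comes first. The `-O` (extract to stdout) option assumes GNU tar or bsdtar.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting a backup archive.

# Print the name of the first entry in a tar archive.
first_entry() {
  tar -tf "$1" | head -n 1
}

# Print the contents of the first entry to stdout without unpacking
# anything onto disk.
show_first_entry() {
  tar -xOf "$1" "$(first_entry "$1")"
}
```

For example, `show_first_entry ./backup.tar` prints the manifest of an uncompressed archive. This is a convenience for operators and is not a substitute for the validation that cirata symphony restore performs.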

Approach B — CSI volume snapshots (Kubernetes)

When the StorageClass backing Symphony has a CSI driver that supports snapshots, a VolumeSnapshot of the PVC captures a crash-consistent image at the block-device level without any server-side coordination. JetStream's recovery logic handles crash-consistent images correctly on restart.

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
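  # kubectl does not expand shell substitutions inside a manifest; replace
  # <timestamp> with a literal value such as the output of date +%Y%m%d-%H%M%S.
  name: symphony-<timestamp>
  namespace: symphony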
spec:
  volumeSnapshotClassName: <your-csi-snapshot-class>
  source:
    persistentVolumeClaimName: my-release-pvc

Confirm snapshot support with kubectl get volumesnapshotclass. Schedule snapshots with a CronJob or an operator such as Velero; copy the resulting snapshot out of the cluster to offsite storage. Repeat the same against the Keycloak PVC when Keycloak is bundled.

Approach C — Offline filesystem copy

The simplest and highest-fidelity approach when a short outage is acceptable: stop the server, copy /config, restart.

Linux (systemd):

sudo systemctl stop symphony
sudo tar czf /backup/symphony-$(date +%Y%m%d-%H%M).tar.gz /var/lib/symphony/
sudo systemctl start symphony

Docker Compose:

docker compose stop symphony
docker compose cp symphony:/config /backup/symphony-$(date +%Y%m%d-%H%M)/
docker compose start symphony

Kubernetes:

kubectl -n symphony scale deployment/my-release --replicas=0
kubectl -n symphony wait --for=delete pod \
    -l app.kubernetes.io/instance=my-release --timeout=60s

# Copy from a throwaway pod that mounts the PVC. Use -i without -t:
# a TTY would corrupt the binary tar stream written to stdout.
kubectl -n symphony run backup-helper --rm -i --restart=Never --image=busybox \
  --overrides='{"spec":{"volumes":[{"name":"data","persistentVolumeClaim":{"claimName":"my-release-pvc"}}],"containers":[{"name":"t","image":"busybox","command":["tar","czf","-","-C","/data","."],"volumeMounts":[{"name":"data","mountPath":"/data"}]}]}}' \
  > ./backup-$(date +%Y%m%d-%H%M%S).tar.gz

kubectl -n symphony scale deployment/my-release --replicas=1

Offsite Storage

Store at least one copy of your backups outside the host or cluster running Symphony:

Cloud object storage (S3, GCS, Azure Blob) for durable, geographically redundant copies
Network file share (NFS, SMB) for on-premises environments
Encrypted transfer—backups contain cryptographic seeds; encrypt before transferring over untrusted networks (see the gpg pipe example in Approach A above)
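
One way to confirm that an offsite copy matches the archive that was produced locally is to ship a checksum alongside it. The helpers below are a sketch, assuming GNU coreutils `sha256sum` is available. The function names and the `.sha256` sidecar naming are illustrative choices, not a Symphony convention.

```shell
#!/bin/sh
# Hypothetical helpers for checking archive integrity after a transfer.

# Write <file>.sha256 next to the archive. The sidecar records a relative
# file name, so both files can be moved to another directory together.
write_checksum() (
  cd "$(dirname "$1")" && sha256sum "$(basename "$1")" > "$(basename "$1").sha256"
)

# Succeed only if the archive still matches its recorded checksum.
verify_checksum() (
  cd "$(dirname "$1")" && sha256sum -c --quiet "$(basename "$1").sha256"
)
```

Run `write_checksum` on the encrypted archive before uploading, copy both files, and run `verify_checksum` on the destination or after downloading for a restore test.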

Recovery Procedures

Restore targets a stopped Symphony with a fresh (or about-to-be-overwritten) /config. Restoring replaces the identity chain, so any running server would be operating on stale keys.

Restore from a `cirata symphony backup` archive

# 1. Stop the target instance.
sudo systemctl stop symphony                # Linux
docker compose stop symphony                # Docker Compose
kubectl -n symphony scale deploy/my-release --replicas=0 # Kubernetes

# 2. Run the restore as a user that can write into the target directory
#    AND ends up with files the service user can read. Two equivalent
#    options on packaged installs (where /var/lib/symphony is owned
#    by the dedicated symphony user):
#      a) sudo to root — restore chowns the extracted tree to the
#         target directory's owner afterwards (recommended; matches
#         the surrounding stop/start commands).
#      b) sudo -u symphony — files are owned correctly to begin with,
#         so no chown is needed.
#    Against a non-empty /config, pass --force to move existing
#    contents aside to .pre-restore-<timestamp>/ rather than deleting
#    them.
sudo cirata symphony restore --input ./backup.tar --config-dir /var/lib/symphony

# In Kubernetes, run inside a helper pod that mounts the target PVC,
# and pipe the archive in from stdin. The pod can run as root (restore
# will chown to the target's owner) or as the service uid directly.
kubectl -n symphony run restore-helper --rm -i --restart=Never \
  --image=<your cirata image> \
  --overrides='{"spec":{"volumes":[{"name":"data","persistentVolumeClaim":{"claimName":"my-release-pvc"}}], ...}}' \
  -- cirata symphony restore --input - --config-dir /data --force < ./backup.tar

# 3. Start the target instance.
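sudo systemctl start symphony                # Linux
docker compose start symphony                # Docker Compose
kubectl -n symphony scale deploy/my-release --replicas=1 # Kubernetes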

Useful flags:

Flag	Description
`--force`	Required when `/config` is non-empty. Existing contents are moved aside to `.pre-restore-<timestamp>/`, never deleted.
`--skip-version-check`	Accept archives whose major Symphony version differs from that of the current binary.
`--override-hostname`	Accept archives whose hostname differs from the current `EXTERNAL_HOSTNAME`. OIDC settings and JWT audience will need updating after restore.

What restore handles automatically:

Rewrites absolute paths in nats.config—the source's store_dir, logfile, and account resolver dir are replaced with the target --config-dir, so you can restore into a different directory than the source was running in without manual edits.
Recreates log/, accounts/, and storage/ under the target directory if they are missing. log/ is excluded from the archive but NATS needs it to exist at startup.
Propagates ownership of the target directory to the restored tree when running as root—on packaged installs the service runs as a dedicated unprivileged user (symphony for the RPM), but operators usually invoke cirata symphony restore via sudo so it can write into /var/lib/symphony. Restore stats --config-dir and recursively chowns the extracted tree to that uid/gid, so the service can read its own files at startup. The walk is a no-op when restore is run directly as the service user (files are already owned correctly), or when the target itself is root-owned (containers running as root, no propagation needed).
Validates the manifest before touching the target—restore aborts on schema or major-version mismatch (bypass with --skip-version-check) and on hostname mismatch (bypass with --override-hostname).

Restore from a CSI snapshot

Create a new PVC with dataSource pointing at the snapshot, then reinstall the Helm chart against it (for example with --set persistence.existingClaim=<new-pvc>, or edit the PVC template to reuse the existing claim). When the pod starts it sees a fully populated /config.

Restore from an offline tarball

Unpack the archive into a fresh /config (Linux: tar xzf backup.tar.gz -C /; Kubernetes: helper pod as in Approach C in reverse), then start Symphony.

Recovering from lost identity keys

If symphony.config is lost and no backup exists, the deployment's root of trust is broken. Re-run the setup wizard to generate new identity keys—all existing user accounts, API keys, and extension tokens will be invalidated and extensions must be re-registered. This is a destructive recovery path; regular backups prevent it.

Verifying a Restored Instance

After any restore, verify:

1. The admin UI loads and admin login succeeds.
2. At least one existing user logs in successfully—this exercises the account JWT chain.
3. At least one extension reconnects with its previously issued credentials—this exercises the end-to-end signing chain.
4. Licensing, users, and roles pages render the expected data.

If the third check (extension reconnection) fails, symphony.config was most likely restored from a different point in time than accounts/ or storage/, so credentials signed by the old keys no longer validate. Re-issue the affected extension tokens to recover.
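
Before working through the checks above, it can help to wait until the restarted instance reports ready. The following sketch polls the /api/v1/ready endpoint described under Health Checks. The function name `wait_until_ready` and its attempt and delay parameters are hypothetical, and the 5-second per-request timeout is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical helper: poll a readiness URL until it returns 200.
# Arguments: URL, maximum attempts, seconds to sleep between attempts.
wait_until_ready() (
  url=$1
  attempts=$2
  delay=$3
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Print only the HTTP status code; curl prints 000 if no response arrived.
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url")
    if [ "$code" = "200" ]; then
      echo "ready after $i attempt(s)"
      exit 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not ready after $attempts attempt(s), last status $code" >&2
  exit 1
)
```

For example, `wait_until_ready https://your-symphony-instance.com/api/v1/ready 30 2` waits up to about a minute. A ready instance is only a precondition: the login and extension checks are still what exercise the restored identity chain.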

Backup Verification

Periodically verify that backups can be restored successfully. A backup that cannot be restored is not a backup. The lightest-weight check is to restore into a staging instance on a different host and complete the four-step verification above. Automating this quarterly is the single most valuable thing you can do for your recovery confidence.

Scaling

High Availability

Symphony supports multi-instance clustering for high availability. See the High Availability Guide for setup instructions.

Extension Failover

Extensions maintain their own connections to Symphony. When running multiple extension instances:

NATS automatically distributes service requests across instances using queue groups
If one instance disconnects, requests are routed to remaining instances
Extensions reconnect automatically when Symphony restarts

Resource Tuning

Adjust JetStream resource limits in nats.config:

jetstream {
    max_mem: 2G    # Increase for high-throughput workloads or larger CDN cache
    max_file: 10G  # Increase for large data volumes
}

The CDN cache (configurable in Admin → Settings) uses in-memory JetStream storage. If the cache fails to initialise with an "insufficient storage resources" error, increase max_mem or reduce the cache size limit in settings. The cache is only used when the dependency resolution mode is Proxy or Mixed; in Bundle-only mode the proxy is disabled and the cache is unused.

Security Operations

Rotating API Keys

1. Create a new API key with the required capabilities
2. Update the extension or client to use the new token
3. Verify the extension reconnects successfully
4. Delete the old API key from Account → API Keys
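
Before deleting the old key, the new token can be checked against the same `GET /api/v1/apikey` endpoint that the CLI's account command maps to. This is a sketch: the function name `token_is_recognised` is hypothetical, and it assumes that a recognised token yields HTTP 200 while any other status means the token was not accepted.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if Symphony accepts the given token.
# Arguments: base URL of the instance, token to test.
token_is_recognised() (
  base_url=$1
  token=$2
  # Discard the body and keep only the HTTP status code.
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 \
    -H "Authorization: Bearer $token" "$base_url/api/v1/apikey")
  [ "$code" = "200" ]
)
```

A rotation script might run `token_is_recognised https://symphony.example.com "$NEW_TOKEN"` and continue to the delete step only when it succeeds. Here `NEW_TOKEN` is a placeholder variable for the newly issued token.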

Reviewing User Accounts

Administrators can view all user accounts at Administration → Users, including their resolved roles and last login time.

Troubleshooting

Symphony Service Won't Start

Check the logs for error messages: sudo journalctl -u symphony --since "10 min ago" (Linux) or docker compose logs symphony --tail 50 (Docker)
Verify symphony.config is valid JSON: python3 -m json.tool /var/lib/symphony/symphony.config
Check that required ports (8080, 4222, 9222) are available: ss -tlnp | grep -E '8080|4222|9222'
Verify the identity keys in symphony.config are intact
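
The grep pattern in the port check above also matches unrelated ports such as 18080. A stricter variant can compare exact port numbers. The sketch below reads `ss -tln` output on stdin; the function name `check_required_ports` is hypothetical, and the port list is taken from the step above.

```shell
#!/bin/sh
# Hypothetical helper: read `ss -tln` output on stdin and report whether
# each required port already has a listener.
check_required_ports() (
  # Field 4 is the local address (for example 0.0.0.0:8080 or [::]:4222);
  # the port is whatever follows the last colon. NR > 1 skips the header.
  listening=$(awk 'NR > 1 { n = split($4, part, ":"); print part[n] }' | sort -un)
  for port in 8080 4222 9222; do
    # -x requires the whole line to match, so 18080 does not count as 8080.
    if printf '%s\n' "$listening" | grep -qx "$port"; then
      echo "$port in-use"
    else
      echo "$port free"
    fi
  done
)
```

Run it as `ss -tln | check_required_ports` while Symphony is stopped. Any port reported as `in-use` is held by another process and would prevent the service from binding to it.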

Extensions Not Connecting

Check that the extension's API token is valid:
```
curl -H "Authorization: Bearer <token>" https://your-symphony-instance.com/api/v1/apikey
```
Replace <token> with the extension's token. A successful response confirms the token is recognised.
Verify network connectivity from the extension to Symphony on port 9222 (WebSocket) or 4222 (NATS)
Check extension logs for connection errors
Confirm the API key has the required capabilities for the extension's subjects

Performance Degradation

Check resource utilization: CPU, memory, disk I/O

Query metrics for connection counts and message rates:

cirata observability query_scalar_metrics http.requests.total

Check JetStream storage usage against configured limits
Consider increasing jetstream.max_mem or jetstream.max_file in nats.config
