
Operating Cirata Symphony

This guide covers operational procedures for running and maintaining a Symphony deployment, including health checks, log management, backup and recovery, and scaling.

Using the cirata CLI

The cirata command-line tool provides quick access to common operational tasks. It authenticates using a stored session (from cirata login) or the SYMPHONY_TOKEN environment variable.

# Authenticate with your Symphony instance
cirata login --address symphony.example.com

Once logged in, the CLI stores your token in the OS keychain and uses it automatically for subsequent commands. In headless environments, use --insecure-storage to store the token in the config file instead. See the CLI Reference for the full command list.
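For scheduled or CI jobs that rely on the SYMPHONY_TOKEN environment variable rather than a stored session, it can help to fail fast when the variable is missing. The sketch below uses a hypothetical helper name (require_symphony_token is not part of the cirata CLI) and only checks that the variable is non-empty; it does not validate the token against the server.

```shell
# Hypothetical helper, not part of the cirata CLI: return non-zero when
# the SYMPHONY_TOKEN environment variable is unset or empty.
require_symphony_token() {
  if [ -z "${SYMPHONY_TOKEN:-}" ]; then
    echo "SYMPHONY_TOKEN is not set; export a token or run 'cirata login'" >&2
    return 1
  fi
}

# Example use in a job script:
#   require_symphony_token && cirata info
```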

Common Operational Commands

| Task | CLI Command | REST API Equivalent |
|---|---|---|
| Instance information | cirata info | GET /api/v1/symphonyinfo |
| Account details | cirata account | GET /api/v1/apikey |
| List extensions | cirata extension list | GET /api/v1/extensions/all |
| Extension features | cirata extension features | GET /api/v1/extensions/features |
| NATS credentials | cirata nats credentials | GET /api/v1/apikey |
| List objects | cirata object list [bucket] | |
| Version | cirata version | |

Extensions that register OpenAPI specifications also become available as dynamic CLI commands. Use cirata --help to see all available commands.

Health Checks

API Health

Verify Symphony is running and responsive:

# Using the CLI
cirata info

# Using curl (no authentication required)
curl https://your-symphony-instance.com/api/v1/ping

The /api/v1/ping endpoint returns 200 OK and is a simple connectivity test. For richer health information, use the structured health endpoints:

| Endpoint | Purpose | Response |
|---|---|---|
| GET /api/v1/ping | Connectivity test | 200 with {"message": "pong"} |
| GET /api/v1/health | Component health and version | 200 when healthy, 503 when degraded |
| GET /api/v1/ready | Readiness for traffic | 200 when ready, 503 when not |

The /health endpoint reports the status of NATS and messaging components. The /ready endpoint is suitable for Kubernetes readiness probes and load balancer health checks.
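Outside Kubernetes, the same readiness signal can gate a deployment script. The following is a minimal sketch, assuming only the 200/503 behaviour of /api/v1/ready described above; the function name, attempt count, and delay are illustrative choices, not part of Symphony.

```shell
# wait_for_ready BASE_URL [ATTEMPTS] [DELAY_SECONDS]
# Polls GET /api/v1/ready until it answers 200 or the attempts run out.
# Hypothetical helper; the defaults (30 attempts, 2 s apart) are arbitrary.
wait_for_ready() {
  base_url=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -s: no progress output, -o /dev/null: discard the body,
    # -w '%{http_code}': print only the HTTP status code.
    code=$(curl -s -o /dev/null -w '%{http_code}' "$base_url/api/v1/ready" || true)
    if [ "$code" = "200" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example:
#   wait_for_ready https://your-symphony-instance.com 30 2 || echo "not ready" >&2
```

The function returns 0 as soon as the instance reports ready and 1 if it never does, so it can be chained with && in a rollout script.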

Metrics

Platform and extension metrics are collected by the Observability Extension over NATS. Use the CLI to query available metrics:

# List metric namespaces
cirata observability find_all_namespaces

# Query a specific metric
cirata observability query_scalar_metrics http.requests.total

See Monitoring for details on the Observability Extension and available platform metrics.

Extension Health

List all connected extensions and their status:

# Using the CLI
cirata extension list

# Using curl
curl -H "Authorization: Bearer <token>" https://your-symphony-instance.com/api/v1/extensions/all

Extensions publish a status heartbeat every 30 seconds. An extension that has not published within the TTL window will be removed from the registry.

Logs

Symphony emits structured JSON logs to stdout/stderr. No application log files are written — log retention is handled by the host environment (journald on Linux, container runtime on Docker/Kubernetes).

# Linux (systemd)
sudo journalctl -u symphony -f

# Docker Compose
docker compose logs -f symphony

# Kubernetes
kubectl logs -f deployment/symphony

On Linux, install the bundled journald drop-in configuration to ensure logs are stored persistently with bounded retention. See Linux Installation: Log Retention for details.

For more information on the log format, fields, and levels, see Status: Logs.

Centralised Log Aggregation

In addition to stdout, Symphony supports delivering logs from both the platform and extensions to an external collector via OpenTelemetry (OTLP). When the observability extension is deployed, it aggregates logs from Symphony and all connected extensions and forwards them to a configured OTLP collector endpoint. This enables centralised log management in tools such as Grafana Loki, Datadog, or Elastic.

Configure OTLP collectors from the observability extension's Collectors page or its API. See Monitoring for details.

Enabling NATS Debug Logging

For NATS debug logging, uncomment debug: true and trace: true in nats.config and restart Symphony. NATS debug output is written to log/nats.log in the configuration directory. Disable debug logging after investigation as it generates significant output. See Configuration for details.
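Because debug logging is easy to leave on by accident, a periodic check of nats.config can flag it. This is a sketch under the assumption that the settings appear one per line in the `debug: true` / `trace: true` form described above; the function name is hypothetical, and other spellings that NATS may accept are not detected.

```shell
# nats_debug_enabled CONFIG_FILE
# Returns 0 when an uncommented "debug: true" or "trace: true" line exists.
# Lines starting with "#" do not match because the pattern is anchored.
nats_debug_enabled() {
  config_file=$1
  grep -Eq '^[[:space:]]*(debug|trace):[[:space:]]*true' "$config_file"
}

# Example (path as used by packaged Linux installs in this guide):
#   nats_debug_enabled /var/lib/symphony/nats.config \
#     && echo "NATS debug logging is still enabled" >&2
```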

Backup and Recovery

What to Back Up

The configuration directory contains all state needed to restore a Symphony deployment. These items must be backed up together; a backup strategy that captures them independently will eventually produce an instance that rejects its own credentials.

| Path | Contents | Criticality |
|---|---|---|
| symphony.config | Application configuration and cryptographic identity seeds (operator, symphony account, signing keys) | Critical: the trust root for every JWT and API token |
| nats.config | Messaging server configuration | Important: regenerable from symphony.config via setup wizard, but included in backups for completeness |
| accounts/ | NATS account resolver JWTs, one file per registered user and extension | Critical: rebuildable from symphony_users bucket only if symphony.config survives |
| storage/ | JetStream data: Symphony-owned KV buckets (users, roles, API keys, tokens, licenses, usage, business units, attribution rules, settings) and every user's private buckets | Critical: all operational state |
| log/ | NATS debug logs (when enabled) | Optional: not required for recovery |
Warning: The identity seeds in symphony.config are the root of trust for the deployment. Without them, all existing user accounts, API keys, and extension tokens become permanently unverifiable. Treat this file with the same care as a TLS private key.

Backup Schedule

Choose a backup frequency based on the rate of change in your deployment:

| Environment | Recommended frequency | Rationale |
|---|---|---|
| Production | Daily, with retention of at least 7 days | Balances recovery granularity with storage cost |
| High-change (frequent user/extension onboarding) | Every 6–12 hours | Minimises data loss for rapidly changing state |
| Development / staging | Before upgrades or configuration changes | Ad-hoc backups are sufficient |

Keep at least one weekly backup offsite (different region, account, or cloud) to survive region or account compromise. Test the restore procedure on a non-production environment at least once per quarter. A backup that has never been restored is a hope, not a backup.
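Retention can be enforced with a small pruning step after each backup run. The sketch below assumes archives live in a single directory and are named symphony-*.tar* (matching the naming used in the examples in this guide); the function name and the 7-day default are illustrative.

```shell
# prune_backups DIR [KEEP_DAYS]
# Deletes symphony-*.tar* files in DIR whose age in whole days exceeds
# KEEP_DAYS, printing each path it removes. Other files are left alone.
# Hypothetical helper; uses -maxdepth, available in GNU and BSD find.
prune_backups() {
  dir=$1
  keep_days=${2:-7}
  find "$dir" -maxdepth 1 -type f -name 'symphony-*.tar*' \
    -mtime +"$keep_days" -print -exec rm -f {} +
}

# Example:
#   prune_backups /backup 7
```

Run the pruning only after the new backup has been written and verified, so a failed backup never leaves the directory with fewer good archives than before.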

Caution: The JetStream store under storage/ is not safe to copy with tar or cp while Symphony is running. JetStream rewrites metadata files in place; a hot file-level copy can capture a metadata file mid-rewrite and produce an archive that will not replay. Use one of the approaches below instead.

Backup Procedures

Three approaches are supported. Prefer them in the order listed.

Approach A: Built-in backup command

The built-in cirata symphony backup command produces a single consistent archive of /config by briefly disabling JetStream on the running server, tarring everything, and re-enabling JetStream. It captures Symphony-owned buckets, every user's private buckets, and the identity files in one archive, without any external tooling.

The command requires the symphony-admin role and is issued against the running instance over HTTPS. During the pause (seconds to low-minutes depending on store size), JetStream operations return errors that well-behaved clients retry; plain NATS pub/sub is unaffected.

# Log in once; the token is used for subsequent admin commands.
cirata login --address symphony.example.com

# Stream the archive to a local file. --output defaults to a
# server-suggested filename in the current directory.
cirata symphony backup --output ./backup.tar

# Pipe directly to compression or encryption without ever landing
# the plaintext on disk.
cirata symphony backup --output - | gzip > backup.tar.gz
cirata symphony backup --output - | gpg --symmetric --cipher-algo AES256 -o backup.tar.gpg

# Against a Symphony pod in Kubernetes, invoke inside the pod so the
# archive never has to traverse the PVC boundary. Namespace names must
# be lowercase, so the namespace is written as "symphony" here.
kubectl -n symphony exec deploy/my-release -- \
  cirata symphony backup -o - > backup-$(date +%Y%m%d-%H%M).tar

Flags:

| Flag | Short | Default | Description |
|---|---|---|---|
| --output | -o | Server-suggested filename | Output path. - writes to stdout. |
| --pause-timeout | | 5m | Upper bound on the JetStream outage window. Clamped to 30 s – 30 min. |
| --progress | | off | Print byte-count progress to stderr. |

Limitations:

  • Single-node only. In a NATS cluster, run against the JetStream meta-leader; the command refuses to run on a follower.
  • When Keycloak is bundled, its separate PVC is not included—back it up with Keycloak's own export tooling.
  • The archive format includes a JSON manifest as the first entry, so cirata symphony restore can validate it before touching the target filesystem.
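Before shipping an archive offsite, a quick local sanity check can catch truncated or empty files. This is a minimal sketch using only standard tar and sha256sum; it does not inspect the JSON manifest (whose entry name is not documented here), and the function name is hypothetical.

```shell
# verify_archive ARCHIVE
# Returns 0 when ARCHIVE is non-empty and tar can list it, and writes a
# SHA-256 checksum to ARCHIVE.sha256. Assumes sha256sum (GNU coreutils).
verify_archive() {
  archive=$1
  # Reject missing or zero-length files, for example an interrupted stream.
  [ -s "$archive" ] || return 1
  # Listing walks the archive's entries without extracting anything.
  tar -tf "$archive" >/dev/null 2>&1 || return 1
  # Record a checksum next to the archive for later integrity checks.
  sha256sum "$archive" > "$archive.sha256"
}

# Example:
#   cirata symphony backup --output ./backup.tar && verify_archive ./backup.tar
```

A passing check only shows that the file is a readable tar archive; it is not a substitute for a test restore.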

Approach B — CSI volume snapshots (Kubernetes)

When the StorageClass backing Symphony has a CSI driver that supports snapshots, a VolumeSnapshot of the PVC captures a crash-consistent image at the block-device level without any server-side coordination. JetStream's recovery logic handles crash-consistent images correctly on restart.

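apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  # Substitute a real timestamp before applying; kubectl does not expand
  # shell expressions such as $(date ...) inside a manifest.
  name: symphony-<timestamp>
  namespace: symphony
spec:
  volumeSnapshotClassName: <your-csi-snapshot-class>
  source:
    persistentVolumeClaimName: my-release-pvc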

Confirm snapshot support with kubectl get volumesnapshotclass. Schedule snapshots with a CronJob or an operator such as Velero; copy the resulting snapshot out of the cluster to offsite storage. Repeat the same against the Keycloak PVC when Keycloak is bundled.
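A timestamped snapshot name has to be rendered by the shell, because kubectl applies manifests verbatim and does not evaluate command substitutions. The sketch below prints a VolumeSnapshot manifest to stdout so it can be piped into kubectl; the function name is hypothetical, and the namespace, PVC, and snapshot class are passed in by the caller.

```shell
# render_snapshot_manifest NAMESPACE PVC SNAPSHOT_CLASS
# Prints a VolumeSnapshot manifest whose name carries the current
# timestamp. Hypothetical helper; nothing is applied to the cluster.
render_snapshot_manifest() {
  namespace=$1
  pvc=$2
  snapshot_class=$3
  stamp=$(date +%Y%m%d-%H%M%S)
  printf '%s\n' \
    'apiVersion: snapshot.storage.k8s.io/v1' \
    'kind: VolumeSnapshot' \
    'metadata:' \
    "  name: symphony-$stamp" \
    "  namespace: $namespace" \
    'spec:' \
    "  volumeSnapshotClassName: $snapshot_class" \
    '  source:' \
    "    persistentVolumeClaimName: $pvc"
}

# Example:
#   render_snapshot_manifest symphony my-release-pvc <your-csi-snapshot-class> \
#     | kubectl apply -f -
```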

Approach C — Offline filesystem copy

The simplest and highest-fidelity approach when a short outage is acceptable: stop the server, copy /config, restart.

Linux (systemd):

sudo systemctl stop symphony
sudo tar czf /backup/symphony-$(date +%Y%m%d-%H%M).tar.gz /var/lib/symphony/
sudo systemctl start symphony
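When the three commands above run from a script, a failed tar should not leave the service stopped. The following sketch wraps them so the restart is always attempted; it assumes the systemd unit is named symphony as above, that the script runs as root, and the function name is hypothetical.

```shell
# offline_backup CONFIG_DIR ARCHIVE
# Stops Symphony, archives CONFIG_DIR, and always attempts a restart.
# Returns non-zero if the stop, the copy, or the restart failed.
offline_backup() {
  config_dir=$1
  archive=$2
  systemctl stop symphony || return 1
  status=0
  tar -czf "$archive" "$config_dir" || status=$?
  # Restart unconditionally so a failed copy does not leave the service down.
  systemctl start symphony || status=$?
  return "$status"
}

# Example (run as root):
#   offline_backup /var/lib/symphony/ "/backup/symphony-$(date +%Y%m%d-%H%M).tar.gz"
```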

Docker Compose:

docker compose stop symphony
docker compose cp symphony:/config /backup/symphony-$(date +%Y%m%d-%H%M)/
docker compose start symphony

Kubernetes:

kubectl -n symphony scale deployment/my-release --replicas=0
kubectl -n symphony wait --for=delete pod \
  -l app.kubernetes.io/instance=my-release --timeout=60s

# Copy from a throwaway pod mounted at the PVC. kubectl run requires --image;
# -i without -t avoids allocating a TTY, which would corrupt the binary tar stream.
kubectl -n symphony run backup-helper --rm -i --restart=Never --image=busybox \
  --overrides='{"spec":{"volumes":[{"name":"data","persistentVolumeClaim":{"claimName":"my-release-pvc"}}],"containers":[{"name":"t","image":"busybox","command":["tar","czf","-","-C","/data","."],"volumeMounts":[{"name":"data","mountPath":"/data"}]}]}}' \
  > ./backup-$(date +%Y%m%d-%H%M%S).tar.gz

kubectl -n symphony scale deployment/my-release --replicas=1

Offsite Storage

Store at least one copy of your backups outside the host or cluster running Symphony:

  • Cloud object storage (S3, GCS, Azure Blob) for durable, geographically redundant copies
  • Network file share (NFS, SMB) for on-premises environments
  • Encrypted transfer—backups contain cryptographic seeds; encrypt before transferring over untrusted networks (see the gpg pipe example in Approach A above)

Recovery Procedures

Restore targets a stopped Symphony with a fresh (or about-to-be-overwritten) /config. Restoring replaces the identity chain, so any running server would be operating on stale keys.

Restore from a cirata symphony backup archive

# 1. Stop the target instance.
sudo systemctl stop symphony # Linux
docker compose stop symphony # Docker Compose
kubectl scale deploy/my-release --replicas=0 # Kubernetes

# 2. Run the restore as a user that can write into the target directory
# AND ends up with files the service user can read. Two equivalent
# options on packaged installs (where /var/lib/symphony is owned
# by the dedicated symphony user):
# a) sudo to root — restore chowns the extracted tree to the
# target directory's owner afterwards (recommended; matches
# the surrounding stop/start commands).
# b) sudo -u symphony — files are owned correctly to begin with,
# so no chown is needed.
# Against a non-empty /config, pass --force to move existing
# contents aside to .pre-restore-<timestamp>/ rather than deleting
# them.
sudo cirata symphony restore --input ./backup.tar --config-dir /var/lib/symphony

# In Kubernetes, run inside a helper pod that mounts the target PVC,
# and pipe the archive in from stdin. The pod can run as root (restore
# will chown to the target's owner) or as the service uid directly.
kubectl run restore-helper --rm -i --restart=Never \
--image=<your cirata image> \
--overrides='{"spec":{"volumes":[{"name":"data","persistentVolumeClaim":{"claimName":"my-release-pvc"}}], ...}}' \
-- cirata symphony restore --input - --config-dir /data --force < ./backup.tar

# 3. Start the target instance.
sudo systemctl start symphony

Useful flags:

| Flag | Description |
|---|---|
| --force | Required when /config is non-empty. Existing contents are moved aside to .pre-restore-<timestamp>/, never deleted. |
| --skip-version-check | Accept archives whose Symphony major version differs from that of the current binary. |
| --override-hostname | Accept archives whose hostname differs from the current EXTERNAL_HOSTNAME. OIDC settings and JWT audience will need updating after restore. |

What restore handles automatically:

  • Rewrites absolute paths in nats.config—the source's store_dir, logfile, and account resolver dir are replaced with the target --config-dir, so you can restore into a different directory than the source was running in without manual edits.
  • Recreates log/, accounts/, and storage/ under the target directory if they are missing. log/ is excluded from the archive but NATS needs it to exist at startup.
  • Propagates ownership of the target directory to the restored tree when running as root—on packaged installs the service runs as a dedicated unprivileged user (symphony for the RPM), but operators usually invoke cirata symphony restore via sudo so it can write into /var/lib/symphony. Restore stats --config-dir and recursively chowns the extracted tree to that uid/gid, so the service can read its own files at startup. The walk is a no-op when restore is run directly as the service user (files are already owned correctly), or when the target itself is root-owned (containers running as root, no propagation needed).
  • Validates the manifest before touching the target—restore aborts on schema or major-version mismatch (bypass with --skip-version-check) and on hostname mismatch (bypass with --override-hostname).

Restore from a CSI snapshot

Create a new PVC with dataSource pointing at the snapshot, then reinstall the Helm chart against it (for example with --set persistence.existingClaim=<new-pvc>, or edit the PVC template to reuse the existing claim). When the pod starts it sees a fully populated /config.

Restore from an offline tarball

Unpack the archive into a fresh /config (Linux: tar xzf backup.tar.gz -C /; Kubernetes: helper pod as in Approach C in reverse), then start Symphony.

Recovering from lost identity keys

If symphony.config is lost and no backup exists, the deployment's root of trust is broken. Re-run the setup wizard to generate new identity keys—all existing user accounts, API keys, and extension tokens will be invalidated and extensions must be re-registered. This is a destructive recovery path; regular backups prevent it.

Verifying a Restored Instance

After any restore, verify:

  1. The admin UI loads and admin login succeeds.
  2. At least one existing user logs in successfully—this exercises the account JWT chain.
  3. At least one extension reconnects with its previously issued credentials—this exercises the end-to-end signing chain.
  4. Licensing, users, and roles pages render the expected data.

Failing (3) means symphony.config was restored from a different point in time than accounts/ or storage/; credentials signed by the old keys no longer validate. Re-issue affected extension tokens to recover.
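Part of this checklist can be scripted as a first-pass smoke test. The sketch below checks readiness and then presents a token that was issued before the backup was taken; it assumes /api/v1/apikey answers 200 for a recognised token and a non-200 status otherwise, and the function name and exit codes are hypothetical. It complements the manual steps rather than replacing them.

```shell
# post_restore_check BASE_URL TOKEN
# Returns 0 when the instance is ready and accepts a pre-restore token,
# 1 when it is not ready, 2 when the token is rejected.
post_restore_check() {
  base_url=$1
  token=$2
  ready=$(curl -s -o /dev/null -w '%{http_code}' "$base_url/api/v1/ready" || true)
  if [ "$ready" != "200" ]; then
    echo "instance not ready (HTTP $ready)" >&2
    return 1
  fi
  auth=$(curl -s -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer $token" "$base_url/api/v1/apikey" || true)
  if [ "$auth" != "200" ]; then
    echo "pre-restore token rejected (HTTP $auth)" >&2
    return 2
  fi
}

# Example:
#   post_restore_check https://your-symphony-instance.com "$OLD_TOKEN"
```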

Backup Verification

Periodically verify that backups can be restored successfully. A backup that cannot be restored is not a backup. The lightest-weight check is to restore into a staging instance on a different host and complete the four-step verification above. Automating this quarterly is the single most valuable thing you can do for your recovery confidence.

Scaling

High Availability

Symphony supports multi-instance clustering for high availability. See the High Availability Guide for setup instructions.

Extension Failover

Extensions maintain their own connections to Symphony. When running multiple extension instances:

  • NATS automatically distributes service requests across instances using queue groups
  • If one instance disconnects, requests are routed to remaining instances
  • Extensions reconnect automatically when Symphony restarts

Resource Tuning

Adjust JetStream resource limits in nats.config:

jetstream {
max_mem: 2G # Increase for high-throughput workloads or larger CDN cache
max_file: 10G # Increase for large data volumes
}

The CDN cache (configurable in Admin → Settings) uses in-memory JetStream storage. If the cache fails to initialise with an "insufficient storage resources" error, increase max_mem or reduce the cache size limit in settings. The cache is only used when the dependency resolution mode is Proxy or Mixed; in Bundle-only mode the proxy is disabled and the cache is unused.

Security Operations

Rotating API Keys

  1. Create a new API key with the required capabilities
  2. Update the extension or client to use the new token
  3. Verify the extension reconnects successfully
  4. Delete the old API key from Account → API Keys

Reviewing User Accounts

Administrators can view all user accounts at Administration → Users, including their resolved roles and last login time.

Troubleshooting

Symphony Service Won't Start

  1. Check the logs for error messages: sudo journalctl -u symphony --since "10 min ago" (Linux) or docker compose logs symphony --tail 50 (Docker)
  2. Verify symphony.config is valid JSON: python3 -m json.tool /var/lib/symphony/symphony.config
  3. Check that required ports (8080, 4222, 9222) are available: ss -tlnp | grep -E '8080|4222|9222'
  4. Verify the identity keys in symphony.config are intact
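The grep in step 3 also matches longer port numbers that merely contain one of the three values (for example 18080). A stricter per-port check can be sketched as follows, assuming the usual `ss -tln` layout where the fourth column is the local address and port; the function name is hypothetical.

```shell
# port_in_use PORT
# Returns 0 when a TCP listener is bound to exactly PORT, 1 otherwise.
port_in_use() {
  port=$1
  # Column 4 of `ss -tln` is the local address, e.g. 0.0.0.0:8080 or
  # [::]:8080; the port is whatever follows the last colon.
  ss -tln | awk -v p="$port" '
    NR > 1 { n = split($4, a, ":"); if (a[n] == p) found = 1 }
    END { exit (found ? 0 : 1) }'
}

# Example:
#   for p in 8080 4222 9222; do
#     port_in_use "$p" && echo "port $p already has a listener"
#   done
```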

Extensions Not Connecting

  1. Check that the extension's API token is valid:
    curl -H "Authorization: Bearer <token>" https://your-symphony-instance.com/api/v1/apikey
    Replace <token> with the extension's token. A successful response confirms the token is recognised.
  2. Verify network connectivity from the extension to Symphony on port 9222 (WebSocket) or 4222 (NATS)
  3. Check extension logs for connection errors
  4. Confirm the API key has the required capabilities for the extension's subjects

Performance Degradation

  1. Check resource utilization: CPU, memory, disk I/O
  2. Query metrics for connection counts and message rates:
    cirata observability query_scalar_metrics http.requests.total
  3. Check JetStream storage usage against configured limits
  4. Consider increasing jetstream.max_mem or jetstream.max_file in nats.config

See Also