Version: 2.3

View RPC diagnostics

You can use remote procedure call (RPC) diagnostics for your filesystem configuration to help monitor the health of your cluster. You can do this with the UI or the CLI.

View RPC diagnostics with the UI

The RPC Diagnostics graphs give you insight into RPC call times and the number of events returned for each RPC call.

In the UI, go to any of your filesystem configurations and select the Advanced Diagnostics link to view RPC diagnostics information.

You can monitor historical values for the following RPC metrics in two graphs in the UI:

Graph                  Values
RPC call time          Average call time, maximum call time
Events for each RPC    Average number of events, maximum number of events

To view values plotted across both graphs for a specific timeframe, select the data duration from the dropdown list in the UI:

  • One hour
  • One day
  • One week
  • 30 days

Average values should remain constant. Graphs that show varying maximum values may indicate issues with the Hadoop cluster.

Note: Maximum values are reset every minute.

View RPC diagnostics with the CLI

You can run the following commands to view a diagnostics summary with your shell CLI or the Data Migrator CLI tool:

Shell CLI with API endpoint

  • To get a diagnostics summary in text format, go to your shell CLI and run the command:

    curl http://127.0.0.1:18080/diagnostics/summary.txt
  • To view continually updated diagnostics, go to your shell CLI and run the command:

    watch curl -s http://127.0.0.1:18080/diagnostics/summary.txt

Tip: If you're using curl over a TLS connection with basic authentication, use https with the --insecure and --user curl options. For example:

    curl https://<host>:<port>/<end_point> --insecure --user <basic-auth-username>
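
The curl options in the tip can be wrapped in a small reusable shell function. This is a minimal sketch, not part of Data Migrator: the function names diag_url and fetch_diagnostics_tls are hypothetical, and the host, port, and username you pass in depend on your own deployment. It assumes the summary endpoint path is the same over https as in the http examples above.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of Data Migrator.

# Build the diagnostics summary URL from a scheme, host, and port.
diag_url() {
  printf '%s://%s:%s/diagnostics/summary.txt\n' "$1" "$2" "$3"
}

# Fetch the summary over TLS with basic auth.
# --insecure skips certificate verification, so use it only where that is acceptable.
# --user is given a username only, so curl prompts for the password.
fetch_diagnostics_tls() {
  host=$1
  port=$2
  user=$3
  curl --insecure --user "$user" "$(diag_url https "$host" "$port")"
}
```

For example, fetch_diagnostics_tls <host> <port> <basic-auth-username> requests the summary and prompts for the password, which keeps the password out of your shell history.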

CLI tool

  • To get a diagnostics summary with the CLI tool, run the command:

    status --diagnostics
  • To view continually updated diagnostics with the CLI tool, run the command:

    status --diagnostics --watch

You'll see a line in the summary for the RPC diagnostics similar to this example:

EventsBehind Current/Avg/Max: 0/0/0, RPC Time Avg/Max: 4/54

The current, average, and maximum values for the EventsBehind metric show how far Data Migrator is behind in retrieving events from the NameNode. If these values exceed the number of events on the NameNode, migrations will eventually stop automatically. Warnings are displayed in the UI and the CLI to let you know that you need to adjust the maximum number of events before migrations stop.

The average and maximum values for the RPC Time metric show how long RPC calls take.
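
If you want to feed these values into your own monitoring, the summary line can be split into named fields. The following is a minimal sketch, assuming the line has exactly the format of the example above and that all values are whole numbers. The function name parse_rpc_line and the output field names are hypothetical, not part of Data Migrator.

```shell
#!/bin/sh
# Hypothetical helper: read diagnostics summary text on stdin and print the
# EventsBehind and RPC Time values as name=value pairs.
# Assumes the line format shown in the example; lines that do not match print nothing.
parse_rpc_line() {
  sed -n 's|.*EventsBehind Current/Avg/Max: \([0-9][0-9]*\)/\([0-9][0-9]*\)/\([0-9][0-9]*\), RPC Time Avg/Max: \([0-9][0-9]*\)/\([0-9][0-9]*\).*|behind_current=\1 behind_avg=\2 behind_max=\3 rpc_avg=\4 rpc_max=\5|p'
}

# Example with the sample line from this page:
printf '%s\n' 'EventsBehind Current/Avg/Max: 0/0/0, RPC Time Avg/Max: 4/54' | parse_rpc_line
# prints: behind_current=0 behind_avg=0 behind_max=0 rpc_avg=4 rpc_max=54
```

To parse live output, you could pipe the summary into the function, for example: curl -s http://127.0.0.1:18080/diagnostics/summary.txt | parse_rpc_line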