NATS Surveyor
NATS Monitoring, Simplified.
NATS Surveyor polls the NATS server for Statz messages to generate data for Prometheus. This allows a single exporter to connect to any NATS server and get an entire picture of a NATS deployment without requiring extra monitoring components or sidecars. Surveyor has been used extensively by Synadia.
System accounts must be enabled to use surveyor.
Commercial Options
If you want a “batteries-included” approach to high-cardinality NATS monitoring and observability, try the standalone Synadia Insights.
Usage
System account credentials can be provided in 4 ways:
- --creds option to supply a chained credentials file (containing JWT and NKey seed)
- --jwt and --seed options to provide the user JWT and NKey seed directly
- --token-file flag to point to a file containing the token
- --user and --password flags
Config
Config Files
Surveyor uses Viper to read configs, so it supports all file types that Viper supports (JSON, TOML, YAML, HCL, envfile, and Java properties).
To use a config file, pass the --config flag. The defaults are /etc/nats-surveyor/nats-surveyor[.ext] and ./nats-surveyor[.ext] with one of the supported extensions.
The config is simple: set each flag in the config file. Example nats-surveyor.yaml:
Environment Variables
Environment variables are also taken into account. Any environment variable that is prefixed with NATS_SURVEYOR_ will be read.
Each flag has a matching environment variable: convert the flag name to uppercase and replace dashes with underscores. Example: the --creds flag corresponds to NATS_SURVEYOR_CREDS.
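The naming rule can be sketched as a small shell function. This only illustrates the rule as stated here and is not code taken from Surveyor; the helper name flag_to_env is hypothetical.

```shell
# Hypothetical helper: derive the environment variable name for a flag.
# Strips the leading "--", uppercases the name, replaces dashes with
# underscores, and adds the NATS_SURVEYOR_ prefix.
flag_to_env() {
  name="${1#--}"
  printf 'NATS_SURVEYOR_%s\n' "$(printf '%s' "$name" | tr 'a-z-' 'A-Z_')"
}

flag_to_env --creds             # prints NATS_SURVEYOR_CREDS
flag_to_env --jsz-leaders-only  # prints NATS_SURVEYOR_JSZ_LEADERS_ONLY
```

Under this rule, exporting NATS_SURVEYOR_CREDS=/path/to/SYS.creds should have the same effect as passing --creds /path/to/SYS.creds on the command line.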
Metrics
Scrape output is in the form nats_core_NNN_metric, where NNN is server, route, or gateway.
To aid filtering, each metric has labels. These include server_cluster, server_name, and server_id. Routes have the additional label server_route_name and gateways have the additional label server_gateway_name.
The info metric has a nats_server_version label with the current version.
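As a rough illustration of label-based filtering, the sketch below selects samples from scrape output by their server_cluster value. The function name filter_by_cluster, the metric name nats_core_example_metric, and all label values are placeholders chosen for the example, not real Surveyor output.

```shell
# Hypothetical helper: keep only samples whose server_cluster label equals $1.
# Reads Prometheus text exposition format on stdin.
filter_by_cluster() {
  grep -F "server_cluster=\"$1\""
}

# Made-up scrape fragment; the metric name and label values are placeholders.
sample='nats_core_example_metric{server_cluster="east",server_id="A",server_name="n1"} 1
nats_core_example_metric{server_cluster="west",server_id="B",server_name="n2"} 2'

# Prints only the line labelled server_cluster="east".
printf '%s\n' "$sample" | filter_by_cluster east
```

In a real setup the same selection would more likely be done with a label matcher in a Prometheus query; the shell version is only meant to show what the labels look like on the wire.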
Additionally, there is a nats_up metric that will normally return 1, but will return 0 and no additional NATS metrics when there is no connectivity to the NATS system. This allows users to differentiate between a problem with the exporter itself and a connectivity problem with the NATS system.
JSZ Metrics
Since v0.9.1, nats-surveyor supports collecting stream and consumer metrics. By default, surveyor collects all metrics from all replicas of streams and consumers, which, depending on the size of your deployment, can cause high-cardinality issues in the Prometheus setup. There are a few options to narrow down the list of exported metrics.
Using --jsz=streams to make sure that only the stream metrics are collected (if consumer metrics are not needed).
Using --jsz-leaders-only to skip data from the stream and consumer replicas.
Using --jsz-filter to decrease the number of consumer metrics:
The following list of metrics for consumers is available to be used as filters:
For example, the following will make surveyor collect metrics only from the leaders, picking up num_pending, num_ack_pending and num_waiting from the consumers.
Docker Compose
An easy way to start the NATS Surveyor stack (Grafana, Prometheus, and NATS Surveyor) is through docker compose.
Follow these links for installation instructions:
The included docker-compose setup supports authentication using either a creds file or username/password:
Using a credential file:
NATS_SURVEYOR_CREDS=/path/to/SYS.creds NATS_SURVEYOR_SERVERS=nats://host.docker.internal:4222 docker compose up --pull always
Using username/password:
NATS_SURVEYOR_USER=system NATS_SURVEYOR_PASSWORD=s3cret NATS_SURVEYOR_SERVERS=nats://host.docker.internal:4222 docker compose up --pull always
Using the survey.sh helper script:
Environment Variables
The following environment variables MUST be set, either in your environment or through the .env file that is automatically read by docker-compose. There is a survey.sh script that will set them for you as a convenience.
Note: For referencing files and paths, docker always expects volume mounts to be either a fully qualified directory, or a relative directory beginning with ./.
Server URLs
You only need to connect to a single NATS server to monitor your entire NATS deployment. In configuring NATS_SURVEYOR_SERVERS, only one server is required, but it’s recommended you provide a list so that there are backup servers to connect to, e.g. nats://host1:4222,nats://host2:5222. Valid URLs are formatted as hostname (defaulting to port 4222), hostname:port, or nats://hostname:port.
Starting Up
You can start the Surveyor stack two ways. The first is through docker compose. Ensure the environment variables are set and that you are working from the /docker-compose directory, then run docker compose up --pull always.
Alternatively, you can pass variables into the survey.sh script in the docker-compose directory, e.g.
./survey.sh nats://mydeployment:4222 24 /privatekeys/SYS.creds
If things aren’t working, look in the output for any lines that contain exited with code 1 and address the problem. They are usually docker volume mount problems or connectivity problems.
Next, with your browser, navigate to http://127.0.0.1:3000, or if you are running the Surveyor stack remotely, the hostname of the host running the NATS surveyor stack, e.g. http://yourremotehost:3000.
The first time you connect, you’ll need to log in:
After logging in, navigate to “Manage dashboards” and you’ll see a dashboard available named NATS Surveyor, where you’ll be able to monitor your entire NATS deployment.
Stopping (while keeping the containers)
To stop the surveyor stack but keep the containers, run:
docker compose stop
Restarting Surveyor
To restart the surveyor stack after being stopped, run:
docker compose up
Stopping and removing containers
To clean up your installation, run:
docker compose down
Running Surveyor as a service
For platforms that support systemd, surveyor.service is provided as a service definition template. Modify and save this file as /etc/systemd/system/surveyor.service. Running systemctl start surveyor will launch the service.
Errors
The logs should normally contain enough information about the cause of problems or errors.
If you encounter a Prometheus error of: panic: Unable to create mmap-ed active query log, set the UID of the container to match the UID of your user in the docker-compose file.
e.g.:
If the above doesn’t work, using root will work but may pose a security threat to the node it is running on.
More information can be found here.
Alternatively, on Linux you may need to manually set write permissions for the bind-mounted prometheus data directory (storage).
Service Observations
Services can be observed by creating JSON files in the observations directory. The file extension must be .json. Only one authentication method needs to be provided. Example file format:
Files are watched and updated using fsnotify.
JetStream
JetStream can be monitored on a per-account basis by creating JSON files in the jetstream directory. The file extension must be .json. Only one authentication method needs to be provided. Be sure that you give access to the $JS.EVENT.> subject to your user. Example file format:
Credentials
Files are watched and updated using fsnotify.
Development
The easiest way to test your changes is to build a local image:
You can then use the image against a local cluster from docker-compose:
TODO