Add multi-target support #1063

Open: wants to merge 25 commits into base: master

Commits (25):
aad3e72
Add multi-target support
pincher95 Jul 25, 2025
0a54b3f
Update example-prometheus.yml
pincher95 Jul 27, 2025
a5d0942
Make `es.uri` optional by setting default to empty string check if it…
pincher95 Jul 27, 2025
5957b0d
Update README.md
pincher95 Jul 27, 2025
90609cc
Add sanity target scheme validation
pincher95 Jul 28, 2025
1c08e68
Change yaml package to go.yaml.in/yaml/v3
pincher95 Jul 30, 2025
daecd52
Update yaml package to go.yaml.in/yaml/v3
pincher95 Jul 30, 2025
58d2965
Update CHANGELOG.md
pincher95 Jul 30, 2025
bd4e1c4
Remove whitespaces from README.md
pincher95 Jul 31, 2025
d77cf8e
Add testing for apikey authentication module
pincher95 Aug 2, 2025
d329fb3
Add Load-time validation for the auth module config file during startup
pincher95 Aug 2, 2025
5f754b7
Expose error in the logger
pincher95 Aug 3, 2025
4a660de
Add TLS config per target support
pincher95 Aug 3, 2025
5e32ad8
Indices and Shards collectors now fetch cluster_name once from GET / …
pincher95 Aug 3, 2025
d674f6d
Removed the special-case logic that redirected /metrics?target= reque…
pincher95 Aug 3, 2025
9fa0610
Add license headers to all new files
pincher95 Aug 5, 2025
d5f818b
Fixes for relative paths in multi-target mode
pincher95 Aug 5, 2025
750e0da
Bump github.com/prometheus/client_golang from 1.22.0 to 1.23.0 (#1065)
dependabot[bot] Aug 5, 2025
3ea24f0
Add target schema validation, http/https only
pincher95 Aug 10, 2025
03d1a70
Cleanup
pincher95 Aug 10, 2025
b096536
Fix tls auth type validation
pincher95 Aug 11, 2025
236586c
Remove aws.region validation
pincher95 Aug 11, 2025
bd1c0a8
Add temp file cleanup in config_test.go
pincher95 Aug 11, 2025
76186e8
Merge branch 'master' into add-multi-target-support
pincher95 Aug 11, 2025
c7ca445
Add copyright header to config_test.go
pincher95 Aug 11, 2025

5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -1,7 +1,12 @@
## master / unreleased

### Added
- Multi-target scraping via `/probe` endpoint with optional auth modules (compatible with postgres_exporter style) #1063

BREAKING CHANGES:

* [CHANGE] Set `--es.uri` by default to empty string #1063

The flag `--es.data_stream` has been renamed to `--collector.data-stream`.
The flag `--es.ilm` has been renamed to `--collector.ilm`.

64 changes: 63 additions & 1 deletion README.md
@@ -55,7 +55,7 @@ elasticsearch_exporter --help
| Argument | Introduced in Version | Description | Default |
| ----------------------- | --------------------- |---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| ----------- |
| collector.clustersettings| 1.6.0 | If true, query stats for cluster settings (As of v1.6.0, this flag has replaced "es.cluster_settings"). | false |
-| es.uri | 1.0.2 | Address (host and port) of the Elasticsearch node we should connect to. This could be a local node (`localhost:9200`, for instance), or the address of a remote Elasticsearch server. When basic auth is needed, specify as: `<proto>://<user>:<password>@<host>:<port>`. E.G., `http://admin:pass@localhost:9200`. Special characters in the user credentials need to be URL-encoded. | <http://localhost:9200> |
+| es.uri | 1.0.2 | Address (host and port) of the Elasticsearch node we should connect to **when running in single-target mode**. Leave empty (the default) when you want to run the exporter only as a multi-target `/probe` endpoint. When basic auth is needed, specify as: `<proto>://<user>:<password>@<host>:<port>`. E.G., `http://admin:pass@localhost:9200`. Special characters in the user credentials need to be URL-encoded. | "" |
| es.all | 1.0.2 | If true, query stats for all nodes in the cluster, rather than just the node we connect to. | false |
| es.indices | 1.0.2 | If true, query stats for all indices in the cluster. | false |
| es.indices_settings | 1.0.4rc1 | If true, query settings stats for all indices in the cluster. | false |
@@ -77,6 +77,7 @@ elasticsearch_exporter --help
| web.telemetry-path | 1.0.2 | Path under which to expose metrics. | /metrics |
| aws.region | 1.5.0 | Region for AWS elasticsearch | |
| aws.role-arn | 1.6.0 | Role ARN of an IAM role to assume. | |
| config.file | 2.0.0 | Path to a YAML configuration file that defines `auth_modules:` used by the `/probe` multi-target endpoint. Leave unset when not using multi-target mode. | |
| version | 1.0.2 | Show version info on stdout and exit. | |

Commandline parameters start with a single `-` for versions less than `1.1.0rc1`.
@@ -113,6 +114,67 @@ Further Information
- [Defining Roles](https://www.elastic.co/guide/en/elastic-stack-overview/7.3/defining-roles.html)
- [Privileges](https://www.elastic.co/guide/en/elastic-stack-overview/7.3/security-privileges.html)

### Multi-Target Scraping (beta)

From v2.X, the exporter exposes a `/probe` endpoint, which allows one running instance to scrape many clusters.

Supported `auth_module` types:

| type | YAML fields | Injected into request |
| ---------- | ----------------------------------------------------------------- | ------------------------------------------------------------------------------------- |
| `userpass` | `userpass.username`, `userpass.password`, optional `options:` map | Sets HTTP basic-auth header, appends `options` as query parameters |
| `apikey` | `apikey:` Base64 API-Key string, optional `options:` map | Adds `Authorization: ApiKey …` header, appends `options` |
| `aws` | `aws.region`, optional `aws.role_arn`, optional `options:` map | Uses AWS SigV4 signing transport for HTTP(S) requests, appends `options` |
| `tls` | `tls.ca_file`, `tls.cert_file`, `tls.key_file` | Uses client certificate authentication via TLS; cannot be mixed with other auth types |

Example config:

```yaml
# exporter-config.yml
auth_modules:
  prod_basic:
    type: userpass
    userpass:
      username: metrics
      password: s3cr3t

  staging_key:
    type: apikey
    apikey: "bXk6YXBpa2V5" # base64 id:key
    options:
      sslmode: disable
```

Run exporter:

```bash
./elasticsearch_exporter --config.file=exporter-config.yml
```

Prometheus scrape_config:

```yaml
- job_name: es
  metrics_path: /probe
  params:
    auth_module: [staging_key]
  static_configs:
    - targets: ["https://es-stage:9200"]
  relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
    - source_labels: [__param_target]
      target_label: instance
    - target_label: __address__
      replacement: exporter:9114
```

Notes:
- `/metrics` serves a single, process-wide registry and is intended for single-target mode.
- `/probe` creates a fresh registry per scrape for the given `target`, which allows multi-target scraping.
- Any `options:` under an auth module will be appended as URL query parameters to the target URL.
- The `tls` auth module (client certificate authentication) is intended for self‑managed Elasticsearch/OpenSearch deployments. Amazon OpenSearch Service typically authenticates at the domain edge with IAM/SigV4 and does not support client certificate authentication; use the `aws` auth module instead when scraping Amazon OpenSearch Service domains.
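
The note about `options:` and the http/https-only target check mentioned in the commit list can be sketched together as follows. This is an illustration under assumptions, not the PR's implementation: `targetWithOptions` is a hypothetical name, and the empty-host check is an extra assumption.

```go
package main

import (
	"fmt"
	"net/url"
)

// targetWithOptions is a hypothetical helper: it rejects targets that are not
// http or https and merges the auth module's options into the query string.
func targetWithOptions(target string, options map[string]string) (*url.URL, error) {
	u, err := url.Parse(target)
	if err != nil {
		return nil, fmt.Errorf("invalid target %q: %w", target, err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return nil, fmt.Errorf("unsupported target scheme %q", u.Scheme)
	}
	if u.Host == "" { // assumption: a target without a host is rejected
		return nil, fmt.Errorf("target %q has no host", target)
	}
	q := u.Query()
	for k, v := range options {
		q.Set(k, v)
	}
	u.RawQuery = q.Encode()
	return u, nil
}

func main() {
	u, err := targetWithOptions("https://es-stage:9200", map[string]string{"sslmode": "disable"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(u.String())

	_, err = targetWithOptions("ftp://es-stage:9200", nil)
	fmt.Println("error:", err)
}
```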

### Metrics

| Name | Type | Cardinality | Help |
24 changes: 20 additions & 4 deletions collector/indices.go
@@ -19,6 +19,7 @@ import (
 	"log/slog"
 	"net/http"
 	"net/url"
+	"path"
 	"sort"
 	"strconv"

@@ -620,13 +621,28 @@ func (i *Indices) fetchAndDecodeIndexStats(ctx context.Context) (indexStatsRespo
 	return isr, nil
 }

-// getCluserName returns the name of the cluster from the clusterinfo
-// if the clusterinfo is nil, it returns "unknown_cluster"
-// TODO(@sysadmind): this should be removed once we have a better way to handle clusterinfo
+// getClusterName returns the cluster name. If no clusterinfo retriever is
+// attached (e.g. /probe mode) it performs a lightweight call to the root
+// endpoint once and caches the result.
 func (i *Indices) getClusterName() string {
-	if i.lastClusterInfo != nil {
+	if i.lastClusterInfo != nil && i.lastClusterInfo.ClusterName != "unknown_cluster" {
 		return i.lastClusterInfo.ClusterName
 	}
+	u := *i.url
+	u.Path = path.Join(u.Path, "/")
+	resp, err := i.client.Get(u.String())
+	if err == nil {
+		defer resp.Body.Close()
+		if resp.StatusCode == http.StatusOK {
+			var root struct {
+				ClusterName string `json:"cluster_name"`
+			}
+			if err := json.NewDecoder(resp.Body).Decode(&root); err == nil && root.ClusterName != "" {
+				i.lastClusterInfo = &clusterinfo.Response{ClusterName: root.ClusterName}
+				return root.ClusterName
+			}
+		}
+	}
 	return "unknown_cluster"
 }

38 changes: 33 additions & 5 deletions collector/shards.go
Original file line number Diff line number Diff line change
Expand Up @@ -64,23 +64,50 @@ type nodeShardMetric struct {
Labels labels
}

// fetchClusterNameOnce performs a single request to the root endpoint to obtain the cluster name.
func fetchClusterNameOnce(s *Shards) string {
if s.lastClusterInfo != nil && s.lastClusterInfo.ClusterName != "unknown_cluster" {
return s.lastClusterInfo.ClusterName
}
u := *s.url
u.Path = path.Join(u.Path, "/")
resp, err := s.client.Get(u.String())
if err == nil {
defer resp.Body.Close()
if resp.StatusCode == http.StatusOK {
var root struct {
ClusterName string `json:"cluster_name"`
}
if err := json.NewDecoder(resp.Body).Decode(&root); err == nil && root.ClusterName != "" {
s.lastClusterInfo = &clusterinfo.Response{ClusterName: root.ClusterName}
return root.ClusterName
}
}
}
return "unknown_cluster"
}

// NewShards defines Shards Prometheus metrics
func NewShards(logger *slog.Logger, client *http.Client, url *url.URL) *Shards {
var shardPtr *Shards
nodeLabels := labels{
keys: func(...string) []string {
return []string{"node", "cluster"}
},
values: func(lastClusterinfo *clusterinfo.Response, s ...string) []string {
values: func(lastClusterinfo *clusterinfo.Response, base ...string) []string {
if lastClusterinfo != nil {
return append(s, lastClusterinfo.ClusterName)
return append(base, lastClusterinfo.ClusterName)
}
// this shouldn't happen, as the clusterinfo Retriever has a blocking
// Run method. It blocks until the first clusterinfo call has succeeded
return append(s, "unknown_cluster")
if shardPtr != nil {
return append(base, fetchClusterNameOnce(shardPtr))
}
return append(base, "unknown_cluster")
},
}

shards := &Shards{
// will assign later

logger: logger,
client: client,
url: url,
Expand Down Expand Up @@ -123,6 +150,7 @@ func NewShards(logger *slog.Logger, client *http.Client, url *url.URL) *Shards {
logger.Debug("exiting cluster info receive loop")
}()

shardPtr = shards
return shards
}
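
The `shardPtr` variable exists because the `values` closure is built before the `Shards` struct it needs to consult. A reduced sketch of that late-binding pattern follows; the `shards` type and `newShards` function are hypothetical stand-ins, not the collector's types.

```go
package main

import "fmt"

// shards is a hypothetical stand-in for the collector struct.
type shards struct {
	clusterName string
}

// newShards builds the label closure first and binds it to the struct
// afterwards by assigning the captured pointer just before returning.
func newShards() (*shards, func() string) {
	var ptr *shards // captured by the closure, still nil at this point
	label := func() string {
		if ptr != nil && ptr.clusterName != "" {
			return ptr.clusterName
		}
		return "unknown_cluster"
	}
	s := &shards{}
	ptr = s // late binding: the closure now sees the constructed struct
	return s, label
}

func main() {
	s, label := newShards()
	fmt.Println(label())
	s.clusterName = "demo"
	fmt.Println(label())
}
```

Go closures capture variables rather than values, which is why the assignment made after the closure is created is visible inside it.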

139 changes: 139 additions & 0 deletions config/config.go
@@ -0,0 +1,139 @@
// Copyright The Prometheus Authors
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package config

import (
	"fmt"
	"os"
	"strings"

	"go.yaml.in/yaml/v3"
)

// Config represents the YAML configuration file structure.
type Config struct {
	AuthModules map[string]AuthModule `yaml:"auth_modules"`
}

type AuthModule struct {
	Type     string            `yaml:"type"`
	UserPass *UserPassConfig   `yaml:"userpass,omitempty"`
	APIKey   string            `yaml:"apikey,omitempty"`
	AWS      *AWSConfig        `yaml:"aws,omitempty"`
	TLS      *TLSConfig        `yaml:"tls,omitempty"`
	Options  map[string]string `yaml:"options,omitempty"`
}

// AWSConfig contains settings for SigV4 authentication.
type AWSConfig struct {
	Region  string `yaml:"region"`
	RoleARN string `yaml:"role_arn,omitempty"`
}

// TLSConfig allows per-target TLS options.
type TLSConfig struct {
	CAFile             string `yaml:"ca_file,omitempty"`
	CertFile           string `yaml:"cert_file,omitempty"`
	KeyFile            string `yaml:"key_file,omitempty"`
	InsecureSkipVerify bool   `yaml:"insecure_skip_verify,omitempty"`
}

type UserPassConfig struct {
	Username string `yaml:"username"`
	Password string `yaml:"password"`
}

// validate ensures every auth module has the required fields according to its type.
func (c *Config) validate() error {
	for name, am := range c.AuthModules {
		// Validate fields based on auth type
		switch strings.ToLower(am.Type) {
		case "userpass":
			if am.UserPass == nil || am.UserPass.Username == "" || am.UserPass.Password == "" {
				return fmt.Errorf("auth_module %s type userpass requires username and password", name)
			}
		case "apikey":
			if am.APIKey == "" {
				return fmt.Errorf("auth_module %s type apikey requires apikey", name)
			}
		case "aws":
			if am.AWS == nil {
				return fmt.Errorf("auth_module %s type aws requires region", name)
			}
		case "tls":
			// TLS auth type means client certificate authentication only (no other auth)
			if am.TLS == nil {
				return fmt.Errorf("auth_module %s type tls requires tls configuration section", name)
			}
			if am.TLS.CertFile == "" || am.TLS.KeyFile == "" {
				return fmt.Errorf("auth_module %s type tls requires cert_file and key_file for client certificate authentication", name)
			}
			// Validate that other auth fields are not set when using TLS auth type
			if am.UserPass != nil {
				return fmt.Errorf("auth_module %s type tls cannot have userpass configuration", name)
			}
			if am.APIKey != "" {
				return fmt.Errorf("auth_module %s type tls cannot have apikey", name)
			}
			if am.AWS != nil {
				return fmt.Errorf("auth_module %s type tls cannot have aws configuration", name)
			}
		default:
			return fmt.Errorf("auth_module %s has unsupported type %s", name, am.Type)
		}

		// Validate TLS configuration (optional for all auth types, provides transport security)
		if am.TLS != nil {
			// For cert-based auth (type: tls), cert and key are required
			// For other auth types, TLS config is optional and used for transport security
			if strings.ToLower(am.Type) != "tls" {
				// For non-TLS auth types, if cert/key are provided, both must be present
				if (am.TLS.CertFile != "") != (am.TLS.KeyFile != "") {
					return fmt.Errorf("auth_module %s: if providing client certificate, both cert_file and key_file must be specified", name)
				}
			}

			// Validate file accessibility
			for fileType, path := range map[string]string{
				"ca_file":   am.TLS.CAFile,
				"cert_file": am.TLS.CertFile,
				"key_file":  am.TLS.KeyFile,
			} {
				if path == "" {
					continue
				}
				if _, err := os.Stat(path); err != nil {
					return fmt.Errorf("auth_module %s: %s '%s' not accessible: %w", name, fileType, path, err)
				}
			}
		}
	}
	return nil
}

// LoadConfig reads, parses, and validates the YAML config file.
func LoadConfig(path string) (*Config, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	var cfg Config
	if err := yaml.Unmarshal(data, &cfg); err != nil {
		return nil, err
	}
	if err := cfg.validate(); err != nil {
		return nil, err
	}
	return &cfg, nil
}
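
The client-certificate rules in `validate` are easy to misread, so here is a trimmed-down, standard-library-only sketch of just that part. `tlsFiles` and `checkClientCert` are hypothetical names; the error texts are shortened and the file-accessibility check is omitted.

```go
package main

import (
	"errors"
	"fmt"
)

// tlsFiles is a hypothetical stand-in for the cert/key part of TLSConfig.
type tlsFiles struct {
	CertFile string
	KeyFile  string
}

// checkClientCert mirrors the pairing rule: for type "tls" both files are
// mandatory; for any other type they are optional but must come together.
func checkClientCert(authType string, t tlsFiles) error {
	hasCert, hasKey := t.CertFile != "", t.KeyFile != ""
	if authType == "tls" {
		if !hasCert || !hasKey {
			return errors.New("type tls requires cert_file and key_file")
		}
		return nil
	}
	// Exactly one of the two being set is the only invalid combination.
	if hasCert != hasKey {
		return errors.New("cert_file and key_file must be specified together")
	}
	return nil
}

func main() {
	fmt.Println(checkClientCert("tls", tlsFiles{CertFile: "client.crt"}))
	fmt.Println(checkClientCert("apikey", tlsFiles{}))
	fmt.Println(checkClientCert("apikey", tlsFiles{KeyFile: "client.key"}))
}
```

The `hasCert != hasKey` comparison is the same exclusive-or test that the diff writes as `(am.TLS.CertFile != "") != (am.TLS.KeyFile != "")`.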