
Commit b2e2def

moved content
1 parent 54eaee1 commit b2e2def

3 files changed, +79 -49 lines changed

crowdsec-docs/docs/log_processor/data_sources/introduction.md

Lines changed: 75 additions & 22 deletions
````diff
@@ -6,26 +6,70 @@ sidebar_position: 1
 
 ## Datasources
 
-To be able to monitor applications, the Security Engine needs to access logs.
-DataSources are configured via the [acquisition](/configuration/crowdsec_configuration.md#acquisition_path) configuration, or specified via the command-line when performing cold logs analysis.
+To monitor applications, the Security Engine needs to read logs.
+DataSources define where to access them (either as files, or over the network from a centralized logging service).
 
+They can be defined:
+
+- in [Acquisition files](/configuration/crowdsec_configuration.md#acquisition_path). Each file can contain multiple DataSource definitions.
+- for cold log analysis, you can also specify acquisitions via the command line.
+
+
+### Service detection (automated setup)
+
+When CrowdSec is installed via a package manager on a fresh system, the package may run [`cscli setup`](/cscli/cscli_setup) in **unattended** mode.
+
+The `cscli setup` command will:
+
+- detect installed services and common log file locations
+- install the related Hub collections
+- generate acquisition files under `acquis.d/` as `setup.<service>.yaml` (e.g., `setup.linux.yaml`)
+
+Generated files are meant to be managed by CrowdSec; don’t edit them in place. If you need changes, delete the generated file and create your own.
+
+When upgrading or reinstalling CrowdSec, it detects non-generated or modified files and won’t overwrite your custom acquisitions.
+
+:::caution
+
+Make sure the same data sources are not ingested more than once: duplicating inputs can artificially increase scenario sensitivity.
+
+:::
+
+Examples:
+
+- If an application logs to both `journald` and `/var/log/*`, you usually only need one of them.
+- If an application writes to `/var/log/syslog` or `/var/log/messages`, it’s already acquired by `setup.linux.yaml` (since 1.7) or `acquis.yaml`. You don’t need to add a separate acquisition for the same logs.
+
+For config-managed deployments (e.g., Ansible), set the environment variable `CROWDSEC_SETUP_UNATTENDED_DISABLE` to any non-empty value to skip the automated setup.
+In that case, ensure you configure at least one data source and install the OS collection (e.g., `crowdsecurity/linux`).
+
+### Assisted service detection (semi-automated setup)
+
+If you installed new applications and want to run the service detection again, running [`cscli setup`](/cscli/cscli_setup) yourself will guide you through the
+automated setup, with confirmation prompts. You will receive a warning if you already configured some acquisitions yourself, but they won’t be
+modified by `cscli`.
+
+Note that `cscli setup` will not remove any collection or acquisition file in `acquis.d/setup.<service>.yaml`, even if the service has been uninstalled since the file was created.
+
+
+## Datasource modules
 
 Name | Type | Stream | One-shot
 -----|------|--------|----------
-[Appsec](/log_processor/data_sources/appsec.md) | expose HTTP service for the Appsec component | yes | no
-[AWS cloudwatch](/log_processor/data_sources/cloudwatch.md) | single stream or log group | yes | yes
-[AWS kinesis](/log_processor/data_sources/kinesis.md) | read logs from a kinesis strean | yes | no
-[AWS S3](/log_processor/data_sources/s3.md) | read logs from a S3 bucket | yes | yes
-[docker](/log_processor/data_sources/docker.md) | read logs from docker containers | yes | yes
-[file](/log_processor/data_sources/file.md) | single files, glob expressions and .gz files | yes | yes
-[HTTP](/log_processor/data_sources/http.md) | read logs from an HTTP endpoint | yes | no
-[journald](/log_processor/data_sources/journald.md) | journald via filter | yes | yes
-[Kafka](/log_processor/data_sources/kafka.md) | read logs from kafka topic | yes | no
-[Kubernetes Audit](/log_processor/data_sources/kubernetes_audit.md) | expose a webhook to receive audit logs from a Kubernetes cluster | yes | no
-[Loki](/log_processor/data_sources/loki.md) | read logs from loki | yes | yes
-[VictoriaLogs](/log_processor/data_sources/victorialogs.md) | read logs from VictoriaLogs | yes | yes
-[syslog service](/log_processor/data_sources/syslog_service.md) | read logs received via syslog protocol | yes | no
-[Windows Event](/log_processor/data_sources/windows_event_log.md) | read logs from windows event log | yes | yes
+[Appsec](/log_processor/data_sources/appsec) | expose an HTTP service for the Appsec component | yes | no
+[AWS cloudwatch](/log_processor/data_sources/cloudwatch) | single stream or log group | yes | yes
+[AWS kinesis](/log_processor/data_sources/kinesis) | read logs from a Kinesis stream | yes | no
+[AWS S3](/log_processor/data_sources/s3) | read logs from an S3 bucket | yes | yes
+[docker](/log_processor/data_sources/docker) | read logs from docker containers | yes | yes
+[file](/log_processor/data_sources/file) | single files, glob expressions and .gz files | yes | yes
+[HTTP](/log_processor/data_sources/http) | read logs from an HTTP endpoint | yes | no
+[journald](/log_processor/data_sources/journald) | journald via filter | yes | yes
+[Kafka](/log_processor/data_sources/kafka) | read logs from a Kafka topic | yes | no
+[Kubernetes Audit](/log_processor/data_sources/kubernetes_audit) | expose a webhook to receive audit logs from a Kubernetes cluster | yes | no
+[Loki](/log_processor/data_sources/loki) | read logs from Loki | yes | yes
+[VictoriaLogs](/log_processor/data_sources/victorialogs) | read logs from VictoriaLogs | yes | yes
+[syslog service](/log_processor/data_sources/syslog_service) | read logs received via the syslog protocol | yes | no
+[Windows Event](/log_processor/data_sources/windows_event_log) | read logs from the Windows event log | yes | yes
 
 ## Common configuration parameters
 
````
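For illustration, a generated acquisition file is an ordinary DataSource definition. A minimal sketch of what a `setup.linux.yaml` might contain, assuming the syslog-style layout shown later in this page (the exact generator output may differ):

```yaml
# /etc/crowdsec/acquis.d/setup.linux.yaml
# Hypothetical contents of a file generated by `cscli setup`;
# shown only to illustrate the shape of a generated acquisition.
filenames:
  - /var/log/auth.log
  - /var/log/syslog
labels:
  type: syslog
```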
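For the config-managed case, here is a sketch of how the variable could be set from Ansible. The task below is hypothetical (module choice, package name, and value are assumptions); the only documented contract is that `CROWDSEC_SETUP_UNATTENDED_DISABLE` must be non-empty:

```yaml
# Hypothetical Ansible task: skip the unattended service detection
# by exporting CROWDSEC_SETUP_UNATTENDED_DISABLE during installation.
- name: Install CrowdSec without automated acquisition setup
  ansible.builtin.apt:
    name: crowdsec
    state: present
  environment:
    CROWDSEC_SETUP_UNATTENDED_DISABLE: "1"  # any non-empty value works
```

Remember to then configure at least one data source and install the OS collection, as noted above.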
````diff
@@ -46,6 +90,7 @@ An expression that will run after the acquisition has read one line, and before
 It allows you to modify an event (or generate multiple events from one line) before parsing.
 
 For example, if you acquire logs from a file containing a JSON object on each line, and each object has a `Records` array with multiple events, you can use the following to generate one event per entry in the array:
+
 ```
 map(JsonExtractSlice(evt.Line.Raw, "Records"), ToJsonString(#))
 ```
````
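As a usage sketch, such an expression sits in the acquisition file alongside the datasource definition. The `transform` key name, the file path, and the `myapp` type below are assumptions for illustration:

```yaml
# Sketch: each log line is a JSON object with a "Records" array;
# the transform expression emits one event per array entry.
filenames:
  - /var/log/myapp/events.log   # hypothetical path
labels:
  type: myapp                   # hypothetical parser type
transform: |
  map(JsonExtractSlice(evt.Line.Raw, "Records"), ToJsonString(#))
```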
````diff
@@ -70,31 +115,39 @@ If not set, then crowdsec will think all logs happened at once, which can lead t
 A map of labels to add to the event.
 The `type` label is mandatory, and used by the Security Engine to choose which parser to use.
 
-## Acquisition configuration example
+## Acquisition configuration examples
 
-```yaml title="/etc/crowdsec/acquis.yaml"
+```yaml title="/etc/crowdsec/acquis.d/nginx.yaml"
 filenames:
   - /var/log/nginx/*.log
 labels:
   type: nginx
----
+```
+
+```yaml title="/etc/crowdsec/acquis.d/linux.yaml"
 filenames:
   - /var/log/auth.log
   - /var/log/syslog
 labels:
   type: syslog
----
+```
+
+```yaml title="/etc/crowdsec/acquis.d/docker.yaml"
 source: docker
 container_name_regexp:
   - .*caddy*
 labels:
   type: caddy
 ---
-...
+source: docker
+container_name_regexp:
+  - .*nginx*
+labels:
+  type: nginx
 ```
 
 :::warning
 The `labels` and `type` fields are necessary to dispatch the log lines to the right parser.
 
-Also note between each datasource is `---` this is needed to separate multiple YAML documents (each datasource) in a single file.
+In the last example we defined multiple datasources separated by a `---` line, the standard YAML document separator.
 :::
````

crowdsec-docs/docs/log_processor/intro.mdx

Lines changed: 0 additions & 24 deletions
````diff
@@ -50,32 +50,8 @@ labels:
   type: syslog
 ```
 
-When CrowdSec is installed via a package manager on a fresh system, the package manager may run `cscli setup` in **unattended** mode.
-It detects installed services and common log file locations, installs the related Hub collections, and generates acquisition files under `acquis.d/setup.<service>.yaml` (e.g. `setup.linux.yaml`).
-
-Generated files are meant to be managed by crowdsec; don’t edit them in place. If you need changes, delete the generated file and create your own.
-
-When upgrading or reinstalling crowdsec, it detects non-generated or modified files and won’t overwrite your custom acquisitions.
-
-:::caution
-
-Make sure the same data sources aren’t ingested more than once: duplicating inputs can artificially increase scenario sensitivity.
-
-:::
-
-Examples:
-
-- If an application logs to both `journald` and `/var/log/*`, you usually only need one of them.
-
-- If an application writes to `/var/log/syslog` or `/var/log/messages`, it’s already acquired by `setup.linux.yaml` (since 1.7) or `acquis.yaml`. You don’t need to add a separate acquisition for the same logs.
-
-For config-managed deployments (e.g., Ansible), set the environment variable `CROWDSEC_SETUP_UNATTENDED_DISABLE` to any non-empty value to skip the automated setup.
-In that case, ensure you configure at least one data source and install the OS collection (e.g., crowdsecurity/linux).
-
 For more information on Data Sources and Acquisitions, see the [Data Sources](log_processor/data_sources/introduction.md) documentation.
 
-For more information on the automated configuration, see the command `cscli setup`.
-
 ## Collections
 
 Collections are used to group together Parsers, Scenarios, and Enrichers that are related to a specific application. For example the `crowdsecurity/nginx` collection contains all the Parsers, Scenarios, and Enrichers that are needed to parse logs from an NGINX web server and detect patterns of interest.
````

crowdsec-docs/unversioned/getting_started/post_installation/acquisition.mdx

Lines changed: 4 additions & 3 deletions
````diff
@@ -5,13 +5,14 @@ title: Acquisition
 
 # Acquisition
 
-By default when CrowdSec is installed it will attempt to detect the running services and acquire the appropriate log sources and [Collections](https://docs.crowdsec.net/docs/next/collections/intro).
+By default, when CrowdSec is installed it will attempt to [detect the running services](/log_processor/data_sources#service-detection) and acquire the appropriate log sources and [Collections](https://docs.crowdsec.net/docs/next/collections/intro).
 
-However, we should check that this detection worked or you may want to manually acquire additional [Collections](https://docs.crowdsec.net/docs/next/collections/intro) for other services that are not detected.
+However, you should check that this detection worked and that the log locations are correct.
+You may want to manually acquire additional [Collections](https://docs.crowdsec.net/docs/next/collections/intro) for the services that were not detected.
 
 ## What log sources are already detected?
 
-To find what log sources are already detected, you can use the `cscli` command line tool.
+To find out which log sources are providing data to CrowdSec, you can query the CrowdSec metrics with the `cscli` command line tool.
 
 ```bash
 cscli metrics show acquisition
````
