Architecture Overview
Every integration built with this framework follows the same four-layer architecture. The layers are identical across integrations — only the vendor-specific API logic changes.
The four layers
Layer 1: Shell wrapper (run.sh)
The outermost layer. This is the command target referenced in ossec.conf. It does three things and nothing more:
- Sets environment variables (API endpoints, modes, feature flags)
- Resolves the path to the Python entry point
- Execs the Python process (replacing itself — no parent shell lingers)
The shell wrapper is the only file that changes between deployment environments. All environment-specific configuration lives here, keeping the Python code portable.
Layer 2: Entry point / orchestrator ({vendor}.py)
The main Python script. Responsibilities:
- Parse CLI arguments (overrides for source, debug, lookback, mode)
- Load and validate configuration from environment variables
- Load persisted state (cursors, timestamps) from the state file
- Call each domain module in sequence, passing credentials and state
- Save updated state atomically after successful processing
- Handle top-level exceptions and emit structured error events
The entry point never contains API-specific logic. It orchestrates — it does not fetch, parse, or transform vendor data.
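The orchestration responsibilities above can be sketched as a single loop. This is an illustrative outline, not the framework's actual entry point; `run_once` and the module-callable signature are hypothetical names.

```python
import json


def run_once(modules, state):
    """Call each domain module in sequence, collecting updated cursors.

    `modules` maps a module name to a callable that takes the module's
    previous cursor and returns the new one. A failure in one module is
    caught and reported as a structured error event, so the remaining
    modules still run and the failed module keeps its old cursor.
    """
    new_state = dict(state)
    for name, fetch in modules.items():
        try:
            new_state[name] = fetch(state.get(name))
        except Exception as exc:
            # Emit a structured error event instead of crashing the run.
            print(json.dumps({"module": name, "error": str(exc)}))
    return new_state
```

Because a failing module keeps its old cursor, its events are re-fetched on the next scheduled run rather than silently lost.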
Layer 3: Domain modules ({vendor}_events.py, {vendor}_siem.py, etc.)
One module per logical API surface or data type. Each module:
- Constructs API requests (URLs, headers, query parameters, POST bodies)
- Calls the shared HTTP function from utils
- Iterates through paginated responses
- Transforms each vendor event into the namespaced output format
- Calls `emit()` for each event
- Returns updated cursor/bookmark state to the orchestrator
Domain modules are where all vendor-specific logic lives. They import from utils but never from each other. This isolation ensures a bug in one module cannot affect another.
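A domain module's fetch, paginate, transform, emit cycle might look like the following sketch. The endpoint URL, field names, and pagination token are invented for illustration; `http_get` and `emit` stand in for the shared utils described below.

```python
def fetch_events(http_get, emit, cursor):
    """Pull paginated vendor events and emit them in namespaced form.

    Returns the updated cursor so the orchestrator can persist it.
    `http_get` and `emit` are injected so the module stays testable.
    """
    params = {"since": cursor or "1970-01-01T00:00:00Z", "limit": 100}
    while True:
        page = http_get("https://api.example.com/v1/events", params=params)
        for raw in page["events"]:
            # Transform vendor fields into the namespaced output format.
            emit({"vendor": {"event_id": raw["id"], "severity": raw["sev"]}})
            cursor = raw["timestamp"]
        next_token = page.get("next")
        if not next_token:
            break
        params["page_token"] = next_token  # advance to the next page
    return cursor
```

Passing the HTTP and emit helpers in as arguments (rather than importing them at call sites) is one way to keep the isolation property: the module depends only on utils, never on a sibling module.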
Layer 4: Shared utilities ({vendor}_utils.py)
The foundation layer. Provides all cross-cutting concerns:
- Credential loading — the three-tier priority chain (systemd > secrets file > env vars)
- HTTP functions — `http_get()`, `http_post()`, or `api_post()` with retry, timeout, and error handling
- State management — `load_state()`, `save_state()` with atomic writes
- Event emission — `emit()` writes a single JSON line to stdout
- Logging — `log()` writes diagnostic messages to stderr at configurable verbosity
- Secrets file parsing — `load_secrets_file()` for the `KEY=VALUE` format
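The secrets parser and the three-tier priority chain could be sketched as follows. The tier order matches the chain above (systemd, then secrets file, then environment variables); the function names and the `CREDENTIALS_DIRECTORY` lookup are illustrative assumptions, not the real utils API.

```python
import os


def load_secrets_file(path):
    """Parse a KEY=VALUE file, ignoring blank lines and # comments."""
    secrets = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            secrets[key.strip()] = value.strip()
    return secrets


def load_credential(name, secrets_path=None):
    """Resolve a credential via the three-tier priority chain."""
    # Tier 1: systemd LoadCredential= exposes files under $CREDENTIALS_DIRECTORY.
    cred_dir = os.environ.get("CREDENTIALS_DIRECTORY")
    if cred_dir and os.path.exists(os.path.join(cred_dir, name)):
        with open(os.path.join(cred_dir, name)) as fh:
            return fh.read().strip()
    # Tier 2: a secrets file in KEY=VALUE format.
    if secrets_path and os.path.exists(secrets_path):
        value = load_secrets_file(secrets_path).get(name)
        if value:
            return value
    # Tier 3: a plain environment variable.
    return os.environ.get(name)
```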
Utils never import from domain modules or the entry point. Dependencies flow strictly downward: entry point → domain modules → utils.
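A minimal sketch of the state and emission helpers, assuming the conventional write-temp-then-rename pattern for the atomic saves mentioned above; the exact signatures in the real utils may differ.

```python
import json
import os
import sys
import tempfile


def load_state(path):
    """Return persisted cursors, or an empty dict on first run."""
    try:
        with open(path) as fh:
            return json.load(fh)
    except (FileNotFoundError, json.JSONDecodeError):
        return {}


def save_state(path, state):
    """Atomically replace the state file so a crash never leaves it truncated."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    try:
        with os.fdopen(fd, "w") as fh:
            json.dump(state, fh)
            fh.flush()
            os.fsync(fh.fileno())  # force the data to disk before the rename
        os.replace(tmp, path)      # atomic on POSIX filesystems
    except BaseException:
        os.unlink(tmp)
        raise


def emit(event, stream=sys.stdout):
    """Write one event as a single JSON line for the SIEM to collect."""
    stream.write(json.dumps(event, separators=(",", ":")) + "\n")
    stream.flush()
```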
Component relationship
```
ossec.conf
  └─► run.sh                         [Layer 1: Shell wrapper]
        │  Sets env vars
        │  Execs Python
        └─► {vendor}.py              [Layer 2: Orchestrator]
              │  Parses args
              │  Loads state
              │  Calls modules
              ├─► {vendor}_events.py [Layer 3: Domain module]
              │     │  Builds requests
              │     │  Paginates
              │     │  Transforms
              │     └─► emit() ──► stdout ──► Wazuh
              │
              ├─► {vendor}_people.py [Layer 3: Domain module]
              │     └─► emit() ──► stdout ──► Wazuh
              │
              └─► {vendor}_utils.py  [Layer 4: Shared utilities]
                    ├── credential_chain()
                    ├── http_get() / http_post()
                    ├── load_state() / save_state()
                    ├── emit()
                    └── log()
```
How many domain modules?
The answer depends on the vendor API surface:
- One module — the API has a single endpoint or closely related endpoints that share auth, pagination, and error handling (e.g., a vendor whose alert and incident endpoints use the same request format and response structure)
- Two modules — the API has distinct surfaces with different auth, pagination, or data models (e.g., Cortex XDR separates alerts and incidents into different modules because they have different response schemas and query patterns; Proofpoint has a SIEM API and a People API with different rate limits and schedules)
- Three+ modules — rare, but justified when the API surfaces are truly independent
The rule: each module should correspond to one logical API surface that could, in principle, run independently. If two endpoints share request format, pagination, and error handling, they belong in the same module. If they differ on any of those dimensions, separate them.
What stays the same vs. what changes
| Component | Same across integrations | Changes per vendor |
|---|---|---|
| `run.sh` structure | Yes — env vars, exec pattern | Variable names, default values |
| Entry point flow | Yes — args, state, modules, save | Module names, config variables |
| Utils functions | ~90% identical | HTTP auth method, header construction |
| Domain modules | Pattern identical, logic differs | API URLs, pagination, field mapping |
| Decoder XML | Structural template identical | Program name, parent decoder |
| Rules XML | Pattern identical | Rule IDs, field names, descriptions |
| `artifacts/` layout | Identical directory structure | Content specific to vendor |
SIEM-agnostic notes
The architecture is Wazuh-native but portable. The SIEM-specific touchpoints are:
- stdout emission — Wazuh reads stdout from wodle commands. Splunk uses modular inputs (also stdout). Sentinel uses the Data Collector API (HTTP POST). The `emit()` function is the only place this changes.
- Decoder/rules — Wazuh-specific XML. Other SIEMs have their own parsing configuration (Splunk props.conf/transforms.conf, Sentinel KQL parsers, Elastic ingest pipelines).
- ossec.conf stanza — Wazuh-specific scheduling. Other SIEMs use their own scheduling mechanisms (Splunk inputs.conf, cron, systemd timers).
- File paths — `/var/ossec/wodles/` is Wazuh-specific. The integration itself is path-agnostic through environment variables.
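To make the single touchpoint concrete, here is a hypothetical sketch of swapping `emit()` between a stdout backend and an HTTP POST backend. The URL and the injected `post` callable are placeholders, not Sentinel's actual Data Collector API contract.

```python
import json
import sys


def emit_stdout(event):
    """Wazuh wodle / Splunk modular input: one JSON object per stdout line."""
    sys.stdout.write(json.dumps(event) + "\n")
    sys.stdout.flush()


def make_http_emitter(post, url):
    """Return an emit() that ships events via HTTP POST instead of stdout.

    `post` is any callable(url, body), e.g. a thin wrapper around
    urllib.request, so the domain modules stay unchanged either way.
    """
    def emit(event):
        post(url, json.dumps(event).encode())
    return emit
```

Because domain modules only ever call `emit()`, retargeting the integration at a different SIEM means constructing a different emitter, not touching the vendor logic.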
See Adapting to other SIEMs for detailed guidance.