Developing Custom Boefjes for OpenKAT: A Developer Guide

Posted 1 April 2026 in Practical Tips & Best Practices

One of OpenKAT’s greatest strengths is its modular architecture. Boefjes, the scanning plugins that collect data, can be extended with custom implementations for your specific needs. Being actively involved in OpenKAT development and maintenance, we build boefjes both for the community and for our clients. Here’s how the system works and how you can build your own.

Understanding the Boefje Architecture

OpenKAT’s scanning pipeline consists of three components that work together:

Boefjes: collect raw data by calling external tools or APIs, or by running custom scripts
Whiskers (Normalizers): parse the raw output and convert it into structured objects in the Octopoes data model
Bits (Business Rules): analyze the structured objects and generate findings based on security policies

When you build a “custom boefje”, you typically create both a boefje (data collection) and a whisker (normalization). The bits layer usually uses existing rules unless you have custom compliance requirements.
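
To make the division of labour concrete, here is a toy model of the three stages in plain Python. It is not OpenKAT code: the function names (collect, normalize, apply_rules), the dict-based objects and the port policy are all illustrative assumptions. Only the flow from raw data to structured objects to findings mirrors the description above.

```python
import json


def collect(hostname: str) -> bytes:
    """Boefje stage: return raw, unparsed output (hard-coded here instead of a real scan)."""
    return json.dumps({"host": hostname, "open_ports": [22, 80, 3389]}).encode()


def normalize(raw: bytes) -> list[dict]:
    """Whisker stage: turn raw bytes into structured objects."""
    data = json.loads(raw)
    return [{"type": "Port", "host": data["host"], "port": port} for port in data["open_ports"]]


def apply_rules(objects: list[dict]) -> list[str]:
    """Bit stage: derive findings from structured objects using a policy."""
    forbidden_ports = {3389}  # hypothetical policy: RDP must not be exposed
    return [
        f"{obj['host']}: port {obj['port']} should not be open"
        for obj in objects
        if obj["type"] == "Port" and obj["port"] in forbidden_ports
    ]


findings = apply_rules(normalize(collect("example.com")))
print(findings)  # ['example.com: port 3389 should not be open']
```

Keeping the stages separate means the stored raw output can be normalized again later, and rules can change without repeating the scan.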

Anatomy of a Boefje

Every boefje lives in the boefjes/boefjes/plugins/ directory and consists of:

kat_my_custom_scanner/
├── __init__.py
├── boefje.json        # Metadata: name, description, input/output types
├── main.py            # The actual scanning logic
├── normalizer.py      # Whisker: parses raw output into OOIs
└── requirements.txt   # Python dependencies (if any)

boefje.json: The manifest

This file tells the katalogus what your boefje does, what input it expects and what output it produces:

{
  "id": "kat_my_custom_scanner",
  "name": "My Custom Scanner",
  "description": "Checks for specific configuration issues",
  "consumes": ["Hostname"],
  "produces": ["boefje/kat_my_custom_scanner"],
  "scan_level": 1,
  "enabled": true
}

Key fields:

consumes: the OOI (Object of Interest) types this boefje needs as input. Common types: Hostname, IPAddressV4, URL, Network
produces: the MIME types of the raw output. The normalizer will look for these.
scan_level: the minimum clearance level needed (0–4). Use 0–1 for passive checks, 2+ for active scanning.
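
A quick sanity check on the manifest can catch typos before the katalogus ever sees the file. The sketch below is a hypothetical helper, not part of OpenKAT: the set of required keys is an assumption based on the example manifest above, and the 0 to 4 range comes from the scan_level description.

```python
import json

# Assumption: keys treated as required here are taken from the example manifest.
REQUIRED_KEYS = {"id", "name", "consumes", "produces", "scan_level"}


def check_manifest(manifest: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means nothing was flagged."""
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - set(manifest))]

    if "scan_level" in manifest:
        level = manifest["scan_level"]
        # type() instead of isinstance() so that True/False are not accepted as 1/0.
        if type(level) is not int or not 0 <= level <= 4:
            problems.append("scan_level must be an integer from 0 to 4")

    for key in ("consumes", "produces"):
        if key in manifest and not (isinstance(manifest[key], list) and manifest[key]):
            problems.append(f"{key} must be a non-empty list")

    return problems


manifest = json.loads("""
{
  "id": "kat_my_custom_scanner",
  "name": "My Custom Scanner",
  "description": "Checks for specific configuration issues",
  "consumes": ["Hostname"],
  "produces": ["boefje/kat_my_custom_scanner"],
  "scan_level": 1,
  "enabled": true
}
""")
print(check_manifest(manifest))  # []
```

In practice you would load the file with json.load() and run a check like this in CI next to your unit tests.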

main.py: The scanning logic

The main module must implement a run() function that receives the input OOI and returns raw results:

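import json

import requests


def run(boefje_meta: dict) -> list[tuple[set, bytes | str]]:
    """Main entry point for the boefje."""
    # The input OOI arrives serialized as a dict; a Hostname carries its name at the top level.
    hostname = boefje_meta["arguments"]["input"]["name"]

    # Your scanning logic here. Set a timeout so a slow API cannot stall the task,
    # and fail loudly on HTTP errors instead of storing an error page as raw data.
    response = requests.get(f"https://api.example.com/check/{hostname}", timeout=30)
    response.raise_for_status()

    return [
        ({"boefje/kat_my_custom_scanner"}, json.dumps(response.json()))
    ]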

The return value is a list of tuples: each tuple contains a set of MIME types and the raw data. This data gets stored in Bytes (OpenKAT’s object storage) and is then picked up by the normalizer.
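
The shape of the return value is easy to get subtly wrong, for example by passing a list instead of a set or a dict instead of serialized data. The helper below is a hypothetical checker you could call from your own tests. validate_output is not an OpenKAT function, and pairing text/plain with the boefje's MIME type is only an illustration.

```python
import json


def validate_output(results: list) -> None:
    """Raise TypeError if results is not a list of (set of MIME types, raw data) tuples."""
    if not isinstance(results, list):
        raise TypeError("run() must return a list")
    for mime_types, raw in results:
        if not isinstance(mime_types, set) or not all(isinstance(m, str) for m in mime_types):
            raise TypeError("first tuple element must be a set of MIME type strings")
        if not isinstance(raw, (bytes, str)):
            raise TypeError("second tuple element must be bytes or str, not " + type(raw).__name__)


# One raw file per tuple; a boefje may return several.
results = [
    ({"boefje/kat_my_custom_scanner"}, json.dumps({"vulnerable": True, "detail": "debug endpoint"})),
    ({"boefje/kat_my_custom_scanner", "text/plain"}, b"raw banner text"),
]
validate_output(results)  # passes silently

try:
    # A dict that was never serialized is rejected.
    validate_output([({"boefje/kat_my_custom_scanner"}, {"vulnerable": True})])
except TypeError as exc:
    print(exc)  # second tuple element must be bytes or str, not dict
```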

normalizer.py: Parsing raw output into OOIs

The normalizer transforms raw scanner output into structured objects that Octopoes can store and analyze:

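import json
from collections.abc import Iterable

from octopoes.models import OOI, Reference
from octopoes.models.ooi.findings import Finding, KATFindingType


def run(input_ooi: dict, raw: bytes) -> Iterable[OOI]:
    """Normalize raw boefje output into OOIs."""
    data = json.loads(raw)

    if data.get("vulnerable"):
        # Yield the finding type first, then the finding that points at it.
        finding_type = KATFindingType(id="KAT-MY-FINDING-001")
        yield finding_type
        yield Finding(
            finding_type=finding_type.reference,
            # Turn the input's primary key string into a proper OOI reference.
            ooi=Reference.from_str(input_ooi["primary_key"]),
            # Use .get() so a response without "detail" does not crash the normalizer.
            description=f"Issue found: {data.get('detail', 'no detail provided')}",
        )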

Real-World Examples

Here are some custom boefjes we’ve built for clients:

Internal API health checker: monitors internal microservices for configuration drift and exposed debug endpoints
EPD system scanner: healthcare-specific checks for Electronic Patient Dossier systems (via IP-Zorg)
Supply chain DNS monitor: tracks DNS changes across supplier domains to detect potential hijacking
Custom compliance checker: validates specific BIO/NEN7510 controls beyond the built-in checks

Testing Your Boefje

Before deploying a custom boefje to production:

Unit test the run() function with mocked inputs
Test the normalizer with sample raw output to verify OOIs are created correctly
Run in a Docker dev environment first; use make kat to spin up a local instance
Check the katalogus to verify your boefje appears and can be enabled
Run a scan on a test hostname and verify findings appear in the report
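
The first two steps can be sketched with the standard library alone. Everything below is a stand-in under stated assumptions. The run() function follows the same contract as main.py but uses urllib instead of requests, so the example has no third-party dependencies. The normalize() function returns plain dicts where a real whisker would yield OOIs. The input lookup should mirror whatever your own main.py uses.

```python
import json
import urllib.request
from unittest import mock

RAW = b'{"vulnerable": true, "detail": "debug on"}'


def run(boefje_meta: dict) -> list[tuple[set, bytes]]:
    """Stand-in boefje with the same contract as main.py."""
    hostname = boefje_meta["arguments"]["input"]["name"]  # mirror the lookup in your main.py
    with urllib.request.urlopen(f"https://api.example.com/check/{hostname}", timeout=10) as response:
        return [({"boefje/kat_my_custom_scanner"}, response.read())]


def normalize(raw: bytes) -> list[dict]:
    """Stand-in normalizer: plain dicts instead of Octopoes objects."""
    data = json.loads(raw)
    if not data.get("vulnerable"):
        return []
    return [{"finding_type": "KAT-MY-FINDING-001", "description": f"Issue found: {data['detail']}"}]


def test_run_returns_raw_output():
    # Fake the context manager returned by urlopen so no network call is made.
    fake_response = mock.MagicMock()
    fake_response.__enter__.return_value.read.return_value = RAW
    boefje_meta = {"arguments": {"input": {"name": "example.com"}}}

    with mock.patch("urllib.request.urlopen", return_value=fake_response) as fake_urlopen:
        results = run(boefje_meta)

    # The hostname from the input OOI must end up in the request URL.
    assert fake_urlopen.call_args.args[0].endswith("/check/example.com")
    assert results == [({"boefje/kat_my_custom_scanner"}, RAW)]


def test_normalizer_handles_both_outcomes():
    assert normalize(b'{"vulnerable": false}') == []
    findings = normalize(RAW)
    assert findings[0]["finding_type"] == "KAT-MY-FINDING-001"
    assert findings[0]["description"] == "Issue found: debug on"


# Plain test functions are discovered by pytest; calling them directly also works.
test_run_returns_raw_output()
test_normalizer_handles_both_outcomes()
print("all tests passed")
```

Saving a few real raw outputs from your dev instance as fixture files gives the normalizer tests realistic input without repeating the scan.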

Containerized Boefjes

Since release 1.17, boefjes can also run as separate containers. This is useful when your boefje has complex dependencies or needs to run isolated from the main system. We contributed the katalogus settings feature for containerized boefjes to the OpenKAT codebase, which makes it possible to pass configuration parameters to boefjes running in their own containers.

To run a boefje as a container, create a Dockerfile in your boefje directory and register it in the katalogus with the container image reference.

Need Custom Boefjes?

Building boefjes requires understanding of both the OpenKAT framework and the security domain you’re targeting. Our direct involvement in the OpenKAT codebase gives us a thorough understanding of how boefjes interact with the scheduler, katalogus and Octopoes. We develop custom boefjes for clients across government, healthcare and enterprise, from initial concept to production deployment.



Tags: Boefjes, Development, Docker, OpenKat
