DNS TAPIR Core

TAPIR Core is an ISP (carrier) independent data analysis system which receives aggregated, minimised and de-personified DNS data from TAPIR Edge devices. Core analyses this data and reports possible anomalies as “observations”. Individual ISPs can freely choose how to act upon the observations, if at all.

What is TAPIR Core?

The main purpose of DNS TAPIR is to make DNS data transparent and available to interested parties while addressing the challenge that such data is highly privacy-sensitive.

To meet the information management requirements for handling DNS data, DNS TAPIR has designed a system with clear boundaries of responsibility for stored data and data flows.

The critical division is between TAPIR Edge, which is the part of the system managed by a DNS resolver operator, and TAPIR Core, which is run by a Core operator that is frequently independent of the resolver operators.

Core Edge

Individual DNS messages carry very little information, but when they are aggregated across broad groups of clients and DNS providers, indications of criminal activity, misuse of DNS for tracking or information gathering, manipulation of the DNS system itself, and similar activities can be found.

DNS data tends to grow quickly into very large data sets, so the minimisation process that started in Edge also continues in Core. The goal is to retain as much of the information value as possible and continuously evaluate this value in relation to the amount of retained data, and then aggressively cull the data - partly as a further protection of privacy but mainly to avoid getting caught up in collecting data for the sake of collecting.

More information about the DNS TAPIR system, including TAPIR Edge and Core, can be found here: https://www.dnstapir.se/info_mgmt/tapir_info_mgmt.en

TAPIR Core Input and Output

Core Input Output

Output from TAPIR Core analysis is shared with partners immediately, and some of it is shared with the public after a delay.

Input aggregates and events originate from the Edge DNSTAP Minimiser (EDM). Core can also correlate this input with data from publicly available sources, such as malware lists.

More details on input and output formats, delays and examples can be found here: https://www.dnstapir.se/info_mgmt/tapir_info_mgmt.en

Data flow diagram: https://github.com/dnstapir/website/blob/main/docs/info_mgmt/tapirdataflow.md

Core Observations

DNS TAPIR system architecture overview

DNS TAPIR architecture

TAPIR Core architecture overview

TAPIR Core architecture

Node Support Functions

Node Manager

The node manager (Nodeman) is used to enroll new nodes and to renew certificates for existing nodes. It also serves node public keys to signature verifiers, e.g., aggrec and evrec.

Aggregate Receiver

The aggregate receiver (aggrec) receives aggregates submitted by EDM and stores the raw data in an S3-compatible object store and the associated metadata in MongoDB.

Received aggregates can be retrieved either directly from the S3-compatible object store or via the aggregate receiver HTTP API.
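The split between raw payloads and metadata can be sketched roughly as follows. This is a minimal illustration and not aggrec's actual implementation: the in-memory dictionary and list stand in for the S3-compatible object store and MongoDB, and the names `store_aggregate`, `fetch_aggregate`, the key layout and the metadata fields are assumptions made for the example.

```python
import hashlib
from datetime import datetime, timezone

# In-memory stand-ins (assumptions): a real deployment would use an
# S3-compatible object store and a MongoDB-compatible database.
object_store: dict[str, bytes] = {}
metadata_db: list[dict] = []


def store_aggregate(node_id: str, payload: bytes, content_type: str) -> dict:
    """Store the raw payload under a content-derived key and record its metadata."""
    digest = hashlib.sha256(payload).hexdigest()
    object_key = f"{node_id}/{digest}"  # hypothetical key layout
    object_store[object_key] = payload
    record = {
        "node_id": node_id,
        "object_key": object_key,
        "content_type": content_type,
        "size": len(payload),
        "sha256": digest,
        "received_at": datetime.now(timezone.utc).isoformat(),
    }
    metadata_db.append(record)
    return record


def fetch_aggregate(record: dict) -> bytes:
    """Retrieve the raw payload that a metadata record points to."""
    return object_store[record["object_key"]]


# Example: store one aggregate, then read it back via its metadata record.
record = store_aggregate("node-1", b"example aggregate bytes", "application/octet-stream")
payload = fetch_aggregate(record)
```

Keeping only a pointer (`object_key`) in the metadata record is what would allow consumers to read the payload either directly from the object store or through an API in front of it.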

Core Support Functions

Event Receiver

The event receiver (evrec) listens to events published on the message bus, verifies their signatures and payloads, and republishes the verified events on another topic.
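The verify-then-republish flow can be sketched as below. This is a simplified illustration under stated assumptions, not the event receiver's actual code: the message bus is an in-memory dictionary, the topic name and the required `type` field are hypothetical, and HMAC-SHA256 with a shared secret is used only as a stand-in for the public-key signature verification that a real deployment would perform with node keys served by Nodeman.

```python
import hashlib
import hmac
import json
from collections import defaultdict

# In-memory stand-in for the message bus (assumption): topic -> messages.
bus: dict[str, list[bytes]] = defaultdict(list)

# Hypothetical per-node keys. HMAC is a stand-in here; a real deployment
# would verify public-key signatures instead of shared secrets.
NODE_KEYS = {"node-1": b"demo-secret"}


def sign(node_id: str, payload: bytes) -> str:
    """Produce a demo signature for a payload (sender side of the sketch)."""
    return hmac.new(NODE_KEYS[node_id], payload, hashlib.sha256).hexdigest()


def handle_event(node_id: str, payload: bytes, signature: str, out_topic: str) -> bool:
    """Verify signature and payload; republish on out_topic only if both pass."""
    key = NODE_KEYS.get(node_id)
    if key is None:
        return False  # unknown sender
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        return False  # signature mismatch
    try:
        event = json.loads(payload)
    except ValueError:
        return False  # not valid JSON (or not valid UTF-8)
    if not isinstance(event, dict) or "type" not in event:
        return False  # hypothetical minimal payload check
    bus[out_topic].append(payload)
    return True


good = json.dumps({"type": "new_qname"}).encode()
accepted = handle_event("node-1", good, sign("node-1", good), "events/verified")
rejected = handle_event("node-1", good, "0" * 64, "events/verified")
```

Only the correctly signed event ends up on the output topic, so downstream consumers never need to repeat the verification.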

Slogger

Slogger (“status logger”) is a receiver service for status update reports from different Edge components. Status update reports are sent as packets on the message bus on a structured set of topics, which allows the sending EdgeId and Component (e.g. TAPIR-POP (Policy Processor) or TAPIR-EDM) to be identified from the topic used.
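As an illustration of identifying the sender from the topic, the sketch below parses a topic string into an Edge identifier and a component name. The topic layout `status/<edge_id>/<component>` is purely an assumption for the example; the actual topic structure is not specified here.

```python
from typing import NamedTuple, Optional

# Hypothetical topic layout: status/<edge_id>/<component>
TOPIC_PREFIX = "status"


class Sender(NamedTuple):
    edge_id: str
    component: str


def parse_status_topic(topic: str) -> Optional[Sender]:
    """Return the sender encoded in a status topic, or None if it does not match."""
    parts = topic.split("/")
    if len(parts) != 3 or parts[0] != TOPIC_PREFIX:
        return None
    _, edge_id, component = parts
    if not edge_id or not component:
        return None
    return Sender(edge_id, component)


sender = parse_status_topic("status/edge-42/tapir-pop")
```

Encoding the sender in the topic means the receiver can attribute a report before it has parsed the payload.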

Each Edge Component may define its own set of “Functions” for which status may be reported. A status update report contains a section for each Function that has something to report. For each reporting Function, the report contains a severity, the number of consecutive events and, optionally, a free-form text message.
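The shape of such a report could be modelled along these lines. This is a sketch only: the class and field names, the severity values and the JSON encoding are assumptions, and "consecutive events" is interpreted here as the number of successive reports with the same severity for a Function.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class FunctionStatus:
    severity: str                   # hypothetical values such as "ok" or "warning"
    consecutive_events: int         # successive reports with the same severity
    message: Optional[str] = None   # optional free-form text


@dataclass
class StatusReport:
    # One section per Function that has something to report.
    functions: dict[str, FunctionStatus] = field(default_factory=dict)

    def report(self, function: str, severity: str, message: Optional[str] = None) -> None:
        """Record a status for a Function, counting repeats of the same severity."""
        previous = self.functions.get(function)
        if previous is not None and previous.severity == severity:
            count = previous.consecutive_events + 1
        else:
            count = 1
        self.functions[function] = FunctionStatus(severity, count, message)

    def to_json(self) -> str:
        """Serialise the report; the wire format is an assumption of this sketch."""
        return json.dumps(asdict(self), sort_keys=True)


status = StatusReport()
status.report("mqtt-connection", "warning", "reconnecting to broker")
status.report("mqtt-connection", "warning", "still reconnecting")
encoded = status.to_json()
```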

Slogger stores current and historical reports in a database and provides a management API that allows consumers of the information (CLI tools, possible dashboards, etc.) to query this data.

Core Infrastructure Components

MQTT Broker

A generic MQTTv5 message broker. Requires mTLS authentication for all external connections.
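From a client's point of view, mTLS means verifying the broker's certificate and also presenting a client certificate. The sketch below builds such a TLS context with Python's standard `ssl` module. The function name and parameters are assumptions for illustration, and the file paths in the usage note are placeholders, not paths used by DNS TAPIR.

```python
import ssl
from typing import Optional


def build_mtls_context(
    ca_file: Optional[str] = None,
    cert_file: Optional[str] = None,
    key_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Build a client-side TLS context for mutual TLS.

    The broker certificate is always verified. If cert_file is given, the
    client certificate and key are loaded so the broker can authenticate us.
    """
    # With ca_file=None the system's default trust store is used.
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    if cert_file is not None:
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


# Without certificate files this still yields a verifying context;
# a real client would pass its CA bundle, certificate and key.
context = build_mtls_context()
```

A real client would call something like `build_mtls_context("ca.pem", "node.crt", "node.key")` (placeholder file names) and hand the resulting context to an MQTT client library that accepts an `ssl.SSLContext`.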

TODO: Document ACLs elsewhere.

MongoDB

Core requires a MongoDB-compatible database, e.g. MongoDB or Amazon DocumentDB.

Object Store

Core requires an Amazon S3-compatible object store, e.g. Amazon S3, Ceph or MinIO.

Certification Authority

Core requires a Certification Authority (CA), e.g. Step CA, to manage internal certificates and certificate issuers.

Data Analysis Functions

TBD

Cogito ergo sum

Data arriving from Edge

Core Continuous Analysis Engine

The Analysis Engine is preferably run as a common source of analysis in Sweden. The most valuable results are achieved when many Edge nodes send data to a single engine. Local engines may also be set up in parallel.

Core Analysis Engine