Ask HN: SIEM-like product with DNS as its data API?

Bender · on Aug 21, 2022

I don't know of a particular product, but I have seen people pull data feeds into both Splunk and ELK/Logstash instances in conjunction with DNS query logging. So the SIEMs were just Splunk and ELK/Logstash. Splunk is very expensive. The DNS data feeds were commercial; I can't recall which companies provided them, but the folks at Splunk could likely tell you. Just beware of salespeople.

If you want some publicly sourced data, try the FireHOL [1] datasets. They could probably be used in conjunction with Suricata or Snort distributed to each DNS resolver.

[1] - https://github.com/firehol/blocklist-ipsets

m3047 · on Aug 21, 2022

Author here. Thanks, but I'm using DNS-as-a-database, not DNS-as-given-data.

Distinguishing feature: The command line example is using a DNS proxy to query a redis database containing netflow information.

It also utilizes DNS to query synthesized PTR records representing DNS-as-given-data (just not the data they want to give you most of the time).

DNS is the database.
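
A minimal sketch of that idea, not the actual proxy: a query name is mapped onto a key lookup. The naming scheme (`<key>.get.<zone>`), the zone `redis.example.com`, and the function names are all hypothetical, and a plain dict stands in for the Redis database.

```python
# Hypothetical naming scheme: <key>.get.<zone>. Not taken from any real proxy.
ZONE = "redis.example.com"


def parse_query(qname, zone=ZONE):
    """Map a DNS query name onto a (command, key) pair, or None if it does not fit."""
    # DNS names are case-insensitive, so keys are assumed to be lowercase here.
    name = qname.rstrip(".").lower()
    suffix = "." + zone
    if not name.endswith(suffix):
        return None
    labels = name[: -len(suffix)].split(".")
    if len(labels) < 2:
        return None
    command, key = labels[-1], ".".join(labels[:-1])
    if command != "get":
        return None
    return command, key


def answer(qname, store):
    """Return the text a TXT answer might carry, or None (think NXDOMAIN)."""
    parsed = parse_query(qname)
    if parsed is None:
        return None
    _, key = parsed
    value = store.get(key)
    return None if value is None else str(value)
```

For example, `answer("flows_10_0_0_1.get.redis.example.com.", {"flows_10_0_0_1": 42})` returns the string `"42"`, which a real server would wrap in a DNS response.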

Bender · on Aug 21, 2022

Ah, so more like rbldnsd as used in the RBL/RSL servers. I've run rbldnsd [1] both as a lookup for spammers/malware and as an authoritative server. Different data format than Redis, but same idea.

[1] - https://github.com/spamhaus/rbldnsd
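
For anyone unfamiliar with the RBL convention, here is a rough sketch of how a lookup name is built: the IPv4 octets are reversed and prepended to the list's zone. The zone `bl.example.org` and the function name are placeholders, not a real list or API.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the name to query for an IPv4 address in a DNS blocklist zone."""
    addr = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    if addr.version != 4:
        raise ValueError("this sketch handles IPv4 only")
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"
```

So `dnsbl_query_name("192.0.2.99", "bl.example.org")` gives `99.2.0.192.bl.example.org`. By convention, an A record in 127.0.0.0/8 in the answer means "listed" and NXDOMAIN means "not listed".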

m3047 · on Aug 21, 2022

Yes, RBL is an example (I've worked for Vixie FWIW) of DNS-as-a-database. Another notorious example in the Indicator of Compromise sector is EDR (chiefly antivirus) software which uses the DNS to query the provenance of file hashes.

The popular take on Redis is that it's good for caching. Well, the DNS is good for caching; Redis is good for a kind of caching that the DNS is not good at: write caching (if there is such a thing).

Netflows are a good example: despite whatever measures you take, you're going to be writing the same flow artifact over and over (might as well increment a counter). It may never be queried (read). If it is read, it will probably be read a fair amount in a short period of time.

Architecturally, you can have agents updating whatever you want in a Redis database; then when you need it... wherever you need it... you make a DNS query, separating read and write concerns. Put the Redis db close to the event source(s) instead of shipping all of it to a cloud. Agents are already being managed as fleets, so that's just current state of the art.

howlett · on Aug 21, 2022

I'm not 100% certain if I'm understanding the requirement correctly - but would something like this help?

https://github.com/ctxis/SnitchDNS

m3047 · on Aug 21, 2022

Hello, author here. That's a database-driven DNS server alright. (Bonus: it's got a web admin interface.) There are DNS implementations with various backends; that's kind of the point.

I'm not sure that a mainstream SQL database is really a good target for network telemetry and e.g. access logging artifacts. Not talking about a time series database either. There is an architecture here, and it's predicated on not collecting "all the things" in a central place.

Example: In this model, a service / server you're monitoring might have a couple of Redis keys which get incremented every time there's a successful or unsuccessful login. Maybe there's a Redis hash with failure counts for individual accounts too.

There might be a graph somewhere of the login / attempt rates. It would query the summary Redis keys (via the DNS) once a minute (doing whatever it needs to keep historical datapoints for however long they're needed).

If the rate skyrockets, maybe the hash with account-level granularity is consulted, but most of the time it wouldn't be.
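
A rough sketch of the write side of that example. The key names (`login:ok`, `login:fail`, `login:fail:by_account`) are made up for illustration, and `CounterStore` is an in-memory stand-in that only mimics the semantics of the Redis INCR and HINCRBY commands; a real agent would talk to Redis instead.

```python
from collections import defaultdict


class CounterStore:
    """In-memory stand-in mimicking Redis INCR and HINCRBY semantics."""

    def __init__(self):
        self.strings = defaultdict(int)
        self.hashes = defaultdict(lambda: defaultdict(int))

    def incr(self, key):
        self.strings[key] += 1
        return self.strings[key]

    def hincrby(self, key, field, amount=1):
        self.hashes[key][field] += amount
        return self.hashes[key][field]


def record_login(store, account, success):
    """Bump the summary counter; keep per-account detail only for failures."""
    if success:
        store.incr("login:ok")
    else:
        store.incr("login:fail")
        store.hincrby("login:fail:by_account", account)
```

The summary keys are what gets polled routinely; the per-account hash only needs to be read when the summary looks abnormal.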

There might also be a Zabbix alarm somewhere querying the same keys, and if a threshold setting is exceeded, then an alarm is sent.

It's pull, not push. It's easy enough to write something to make the periodic queries and post them to e.g. Elasticsearch and graph it with Kibana.
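
A minimal sketch of the pull side, under stated assumptions: `fetch` is a hypothetical callable returning the current counter value (in this model it would make the DNS query), and the threshold and function names are illustrative only.

```python
def rate_per_minute(prev, curr, seconds):
    """Rate derived from two samples of an ever-increasing counter."""
    if seconds <= 0:
        raise ValueError("samples must be taken at increasing times")
    delta = curr - prev
    if delta < 0:
        # Counter went backwards: assume it was reset and restarted from zero.
        delta = curr
    return delta * 60.0 / seconds


def poll_once(fetch, state, now, threshold):
    """Take one sample; return (rate, alarm). Rate is None on the first sample."""
    value = fetch()
    rate, alarm = None, False
    if "value" in state:
        rate = rate_per_minute(state["value"], value, now - state["time"])
        alarm = rate > threshold
    state["value"], state["time"] = value, now
    return rate, alarm
```

Called once a minute from a scheduler, the returned rate is the datapoint to graph and the alarm flag is what a threshold check would act on.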

So the question concerns the SIEM part. Something like Splunk is married to its database (their pricing model is based on how much data you want to put into that database). Something like the Yeti Threat Intelligence Platform (TIP) (https://github.com/yeti-platform/yeti) comes with the ability to manage and orchestrate a large number of periodic or event-driven tasks, and therefore has the capability to generate the periodic DNS queries. It's been a few years, but its graphing capabilities didn't compare to ELK when I looked at it.

There's a lot of overlap with SCADA as well. All of the necessary features I've mentioned can be assembled from open source projects.

Is there some SIEM, TIP or Ops product out there, with an active userbase, which has the periodic task capability, alarming, and graphing?