Hacker News | dataloopio's comments

Broader comparison of the facts:

https://docs.google.com/spreadsheets/d/1sMQe9oOKhMhIVw9WmuCE...

Also a blog comparison:

https://blog.dataloop.io/top10-open-source-time-series-datab...

From reading benchmarks and various articles on how to set up Druid for real-time analytics, it isn't great. Druid was designed for batch workloads: feeding the database slowly and then, once loading is complete, performing analytics over a large data set. Real-time streaming analytics are dreadful to set up and use.


While Druid was initially built for batch loads, the architecture has evolved substantially as the project has matured. Today, Druid supports exactly-once streaming ingestion from Kafka, and large production deployments routinely stream millions of events per second into Druid.


Can you point me to a source for 'routinely stream millions of events per second into Druid'?

While it is true that Druid is great at querying billions of rows per second, it's not very good at ingress. Here is a mailing list discussion for some background.

https://groups.google.com/forum/#!searchin/druid-user/benchm...


What kind of ingestion numbers are you working with? The thread you link to shows that Druid can ingest ~27.5k events/sec per node, which is roughly 2.376bn events a day per node.

While you can claim bias here too, we have multiple clusters ingesting in the high hundreds of thousands of events/second and our largest cluster does close to 2m/s. That's definitely scaled horizontally across multiple nodes.

If you are suggesting there is a system out there that can ingest millions of messages a second on a single node, I'd love to hear about it :).

edit: Ah, I see from the spreadsheet that you linked that there are systems out there that claim 2.5-3.5m writes per second per node. That's really quite amazing; it would be awesome if you could provide the methodology used to collect those numbers. For example, if you are sending in 500 byte events (a rather common size for what we do), if my calculations are correct, you are now sustaining 14 Gbps, which means those benchmarks were done on some beefy hardware. Can you link to a blog post that details the methodology?
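A back-of-the-envelope check of both figures in this thread (a sketch: the 500-byte event size and the per-node rates are the ones quoted above, not new measurements):

```python
# Rough arithmetic for the claims under discussion: the per-node daily total
# from the linked thread, and the bandwidth implied by the spreadsheet's
# multi-million writes/sec claims at 500 bytes per event.

def sustained_gbps(events_per_sec: float, event_bytes: int) -> float:
    """Bits per second implied by an ingest rate, expressed in Gbps."""
    return events_per_sec * event_bytes * 8 / 1e9

per_node = 27_500  # events/sec per node, from the linked mailing list thread
print(f"{per_node:,} events/s ~= {per_node * 86_400 / 1e9:.3f}bn events/day per node")

for rate in (2.5e6, 3.5e6):  # claimed writes/sec per node from the spreadsheet
    print(f"{rate:,.0f} events/s x 500 B = {sustained_gbps(rate, 500):.0f} Gbps")
```

At the top end this works out to 14 Gbps sustained, which is the basis for the "beefy hardware" remark above.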


Most benchmarks are given a colour for reliability and a link for repeatability.


Ah, cool, I chased down what you are doing and figured out that you are doing an apples-to-oranges comparison.

As described in your benchmark description:

https://gist.github.com/sacreman/b77eb561270e19ca973dd505527...

You are running 200 agents emitting 6000 metrics apiece, using Haggar to generate load, which is at

https://github.com/dalmatinerdb/haggar

The specific thing of interest is how you are generating your data: it looks like you have a single set of dimensions with 6000 metrics dangling off of it. The loop that populates all of the "metrics" is at:

https://github.com/dalmatinerdb/haggar/blob/master/main.go#L...

And the thing that actually populates the bytes are at:

https://github.com/dalmatinerdb/haggar/blob/master/util.go#L...

So, if we take this to an apples-to-apples comparison, you have 200 agents sending a single event every second with 6000 metrics in it. That means that you are successfully ingesting 200 events per second in the way that we would measure event ingestion for Druid.
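To make the two counting conventions concrete, here is a small sketch (my reading of the setup; the agent and metric counts come from the linked gist):

```python
# Same workload, counted two different ways.
agents = 200                  # load-generating agents in the benchmark
metrics_per_event = 6000      # metrics bundled into each event
events_per_agent_per_sec = 1  # each agent sends one event per second

metrics_per_sec = agents * metrics_per_event * events_per_agent_per_sec
events_per_sec = agents * events_per_agent_per_sec

print(f"counted as metrics/sec: {metrics_per_sec:,}")  # 1,200,000
print(f"counted as events/sec:  {events_per_sec:,}")   # 200
```

The same traffic reads as 1.2m/sec under one convention and 200/sec under the other, which is the whole disagreement in this thread.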

Note, also, that the thread you link to is ingesting 17 independent dimensions with each and every event that flows in. From the Dalmatiner docs, it looks like you put all dimension data into Postgres and you don't expect any large-scale deployment to ever need more than a single Postgres node:

https://gist.github.com/sacreman/9015bf466b4fa2a654486cd79b7...

Look under "Setup Postgres".

We routinely have billions of unique combinations of dimension values per day flowing into our system. Delegating the finding of the right keys to a relational database for such operations is going to be very cost-prohibitive, not to mention that you are going to have to materialize hundreds of millions of keys in order to do a simple aggregate over the day.

So, I guess this is just another case where you should never trust benchmarks that you didn't do yourself or that don't follow a standard pattern like TPC-H. It's too easy for the same words to be used with different meanings.


DalmatinerDB, InfluxDB, Prometheus and Graphite each claim their numbers based on similar benchmark methods. The results range from 500k/sec to a couple of million metrics/sec. Druid, by comparison, would be closer to 30k/sec for the same benchmark. If that's factually wrong, please post some details and we can update the spreadsheet.

Expanding the benchmark to cover cardinality and other aspects would indeed be comparing apples to oranges.

In terms of benchmarking DalmatinerDB with billions of unique combinations indexed in Postgres... I think we know what will happen there :) That's what it's designed for. We can also shard in the query engine, or use any of the multi-master Postgres options, but I doubt that would even be necessary.


The databases listed above, to the best of my knowledge, are commonly used for dev ops metrics data and share similar terminology. Druid, on the other hand, draws much of its terminology from the OLAP world. As cheddar clarified in his post above, the benchmarks for Druid are misleading because it is not an apples-to-apples comparison (I suspect the benchmarks for ES also suffer from this problem). A single Druid event may consist of thousands of metrics.


Agree with your analysis. At Netsil, one of the big factors that we considered was average query latency and fast aggregations over high-cardinality, multi-dimensional data. A few of our early customers told us that when they deployed solutions from other vendors (with storage engines such as Cassandra) at scale (800+ monitored instances), they would have to wait several minutes for the data to render on their dashboards for a 1-day aggregation query. So it was not just scalable ingestion that was paramount; fast ad-hoc analytics functionality was equally important to us.


and that happens to be the system they built :-)


Here is a source from 2015: http://www.marketwired.com/press-release/metamarkets-clients...

You can also find additional information that folks have been willing to publicly share on scale and use cases here: http://druid.io/druid-powered.html


Those sources contain literally no technical detail. At 1.1 million metrics per second, is that a 40-node Druid cluster?


I think we're using very different terminology here. An event in our world may contain thousands of metrics as part of the same event.


Well, he is a Druid committer and CEO of a company built on top of it, so...


No bias there then


> millions of events per second

Source? That statement, as-is, is clearly overselling.


It's in the spreadsheet. I've not tried it and there wasn't much info available at the time of writing the blog.


Hi, I'm the original author. I think that GCE instance has 2 Gbps per core but I'd need to check. It wasn't bottlenecked on network bandwidth, though.

The other benchmarks are linked in the spreadsheet to their respective details. It's not an absolute, direct, fair comparison. However, we wanted to start somewhere with information available right now and try to collect better results over time as the respective interested parties benchmarked and blogged about their databases.

I'd like to spend more time benchmarking every database in the spreadsheet but it feels like something the project owners should do themselves. I'd probably only get the setup wrong.


Thanks for the clarification, good to know.


I had a look at Citus when evaluating time series databases, and the documentation said that work was still under way on a masterless setup for faster ingestion (up to 500k metrics/sec).

https://docs.citusdata.com/en/v5.2/performance/scaling_data_...

If that has changed I'd add it to my table.

https://docs.google.com/spreadsheets/d/1sMQe9oOKhMhIVw9WmuCE...


Hi, the wording could probably use some tidying up around that part and I'm open to suggestions. However, I do think it's a big problem with columnar time series databases.

When somebody wants to query for a few points matching certain dimensions in Cassandra there's no getting around the fact that you have to do a scan across potentially billions of data points.

Whereas if the index lives outside in something relational like Postgres the lookup becomes insanely cheap and you're not having to scan over a bunch of data.

There are quite a few databases that don't have an efficient external index. For those, running 10 times the number of nodes would certainly speed things up, but it's probably just a good idea to avoid databases like that if you want fast queries.
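A toy sketch of the difference being described (illustrative only, with made-up series names and dimensions; not DalmatinerDB's or Cassandra's actual code):

```python
# An external index maps dimension values to series IDs, so a query touches
# only the matching series. Without it, every stored series must be scanned.

# Hypothetical storage: series_id -> stored points
storage = {
    "s1": [1, 2, 3],
    "s2": [4, 5, 6],
    "s3": [7, 8, 9],
}

# External index (the role Postgres plays in the setup described above):
# dimension key/value -> series IDs
index = {
    ("host", "web-1"): ["s1"],
    ("host", "web-2"): ["s2", "s3"],
}

def query_with_index(dim):
    """Cheap lookup: read only the series the index points at."""
    return [p for sid in index.get(dim, []) for p in storage[sid]]

def query_full_scan(predicate):
    """No index: scan every series and filter each point."""
    return [p for points in storage.values() for p in points if predicate(p)]

print(query_with_index(("host", "web-2")))  # [4, 5, 6, 7, 8, 9]
```

The indexed path does work proportional to the matching series; the scan path does work proportional to everything stored, which is the "billions of data points" problem mentioned above.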


Sorry to keep being pedantic, but I think it's important to how we think about approaches to scalable and performant TSDBs, and I still disagree :)

Your example re: Cassandra is a problem with one particular columnar time series database, not inherently with using column-store backends for time series data.

At Kentik, our in-house backend deals with data 80+ columns wide (what would be tags in a TSDB), primarily network data, and querying across tens of billions of records (tens of devices' worth of data for 90 days) usually takes 0.5-2 seconds.

That's deployed on ~7 backend data nodes, running heavily multi-tenant with 300k-2m records/second ingested and averaging 450 queries/minute across a week (don't have a peak query # handy).

But there's also nothing that says that a columnar store database can't have indexes per column built-in (vs. external).


Hi, that is a good point. It was all written yesterday so is pretty up to date :) I've added some dates to the top of the blog post. Thanks!


The blog discusses storage sizes with very similar maths examples. DalmatinerDB uses 1 byte of storage per point, compared to Elasticsearch at the top end, which uses 22 bytes per point.
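As a sketch of what those per-point figures mean in practice (the workload here, 100k metrics at 1-second resolution for 30 days, is a hypothetical I've chosen for illustration; the bytes-per-point figures are the ones quoted above):

```python
# Storage implied by bytes-per-point for a hypothetical workload:
# 100k metrics sampled every second, retained for 30 days.
points = 100_000 * 86_400 * 30  # metrics * seconds/day * days

for name, bytes_per_point in [("DalmatinerDB", 1), ("Elasticsearch", 22)]:
    gib = points * bytes_per_point / 2**30
    print(f"{name} at {bytes_per_point} B/point: {gib:,.0f} GiB")
```

The 22x difference in per-point size translates directly into a 22x difference in disk footprint for the same retention.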

Build vs buy is an age-old discussion. You won't convince anyone to switch from one side to the other. There will continue to be people like you and me who would prefer to buy, and others who want to build and run it themselves. As you have found out, I don't need to be convinced, as I started a company to address the issue of there being no good options to buy at the time. In most cases, for monitoring microservices, I'd buy a SaaS solution. I founded Dataloop 3 years ago, so it's not really a new startup any more. We're past Series A and starting to grow.

It's true that we compete with Datadog and SignalFx in that area, although our real competition is open source, with 90% of the addressable market using older tools like Nagios etc. As the shift to the cloud and microservices happens, I'm sure it won't be a winner-takes-all market. Dataloop tends to focus on the enterprise end of the scale, whereas SignalFx is more developer-focused and Datadog is more operations and SME.

When you say 'best', I'd argue that's subjective. SignalFx charges by the metric, and that gets very expensive. Datadog limits you to 100 metrics per node with an agent-based pricing model. Dataloop uses per-node pricing that's much cheaper, with unlimited metric volume. We're aiming to keep costs extremely low by investing in highly efficient backend storage.

The reason people are moving away from Graphite to InfluxDB and Prometheus is the dimensional data model. Graphite simply isn't as powerful. Similarly, StatsD aggregates down to the service level and doesn't help pinpoint outliers. Prometheus collects all metrics in their raw form far more efficiently and lets you instantly drill down into what is causing the issue.

To answer your question about what's next after you outgrow open source solutions that don't scale.. well that was kind of the point of the blog! DalmatinerDB scales to millions of metrics per second on a single node and linearly as you add additional nodes. It isn't exactly hard to maintain either as it's based on Riak Core.

I guess the final thing to say was that this wasn't really an advert for Dataloop. Our business model doesn't depend on selling database features. Unlike other SaaS companies we're happy to release the work done on our time series database for free and available as open source.

Why would we do that? Mostly because it's fun to do open source stuff. Also because hiring Erlang developers is pretty hard and this gives me an excuse to talk at conferences where they hang out.

We've had a team of people working on this stuff for over a year now, and, as you've mentioned, no open source time series databases really scale. It's a problem we've solved and are giving away for free. I must be really bad at conveying that message in the blog.


A full metrics/monitoring/alerting solution and a metrics storage engine serve two different purposes. (The first is solved by the latest SaaS tools, including dataloop.io.)

I limited the previous message to the monitoring use case because it was already quite long and a topic of its own, but I'd like to address storage as well.

There are many reasons one would need a time series database for an application, and in that case this kind of comparison is exactly what's needed.

---

There are a few things which I'd like to see about storage systems:

- What kind of features does it have to compress and/or aggregate data? Does it have any?

Some systems take 4 bytes per int, others can take 50. Some store diffs, some do not... That makes a huge difference.

- Can it cluster horizontally? Also, does it scale writes horizontally?

We can have 50-CPU systems with 10TB SSD arrays nowadays, but we probably won't. It's actually rather challenging to scale vertically on AWS/GCE (not so much on Softlayer), not to mention the nightmare of having a single point of failure for maintenance and issues.

I suppose we get that with the read/write number per 1 node and per 5 nodes systems, which brings me to the next point.

- The performance numbers are somewhat misleading IMO.

You say yourself that you didn't do the benchmarking. You're just taking some facts you found on the internet and presenting them as data.

- You should include the versions of the database in the table. Features change over time.

- Are you backing and contributing to DalmatinerDB? For some reason, the link between Dataloop and DalmatinerDB wasn't clear to me on first read. (Not to mention, you're not even advertising your product or your company.)

- How much of DalmatinerDB magic is based on ZFS? Does it actually need ZFS to run?

As far as I remember, ZFS is still a BSD/Solaris-only citizen. (And don't tell me that it's coming in the next Ubuntu release; it's just a hypothetical future until actually done ;) )

Anyway. It's a welcome comparison. Good work =)

---

> our real competition is open source with 90% of the addressable market using older tools like Nagios etc.

An interesting point of view. I personally consider 90% of the Nagios market to not be a market at all. It belongs to people who only use it because it's free (as in no-money) and can be downloaded easily.

Free automatically brings in the students, the amateurs trying things in their garage/homelab, micro deployments where it's enough, many companies and people who simply don't value their time or the quality of what they deliver, and finally all who have no money whatsoever or can't face the hassle of the buying-stuff department.


The raw data in the spreadsheet, which I have continued to update and which now covers 15 databases with 30 characteristics each, is indeed applicable both to people who want to pick a database for their monitoring stack and to those who want a time series database for another purpose.

Some of the newer additions, like Warp10 and Akumuli, have comments noting they are great choices for sensor applications and for those who need highly performant local time series storage written in C++.

That's a good point about compression. I bundled it all into the row 'bytes per point after compression', which gives the end result for each database, but I haven't noted which ones actually compress. However, one database uses lossy compression, and that is noted. It's all in row 11 currently.

Someone else also raised the clustering question, and I added a row for 'dynamic cluster management' to address whether you can scale horizontally by adding a node without bringing the system down.

The performance numbers are a big problem regardless of how they are calculated. Firstly, I'd like to address whether having them at all is important. I believe it's incredibly important to have some kind of idea up front before investing time trying out a database, especially with the list now at 15 databases and growing; we need some method to narrow down which ones to trial.

If we agree that performance numbers are essential, then we're onto the next problem. Benchmarking 15 databases in a uniform way as a science experiment is about seven master's theses' worth of work (there are actually several on the topic of benchmarking a few of these databases together that make interesting reads). To put it plainly, such a benchmark won't ever happen, and if it ever does, someone will find a way to undermine it for their use case and setup. Therefore the usefulness of the numbers in the list is more as a practical ballpark estimate, as outlined in the blog.

We did, however, benchmark DalmatinerDB and released easy-to-recreate results: the exact box hardware, test method, code and a little graph. I used the same mechanism to benchmark InfluxDB and got reasonably close to the figures they released.

The version of each database is in row 29. It probably wasn't there when you first saw the sheet, but I added it soon after, along with a maturity field.

Dataloop is a SaaS company of which I'm a co-founder, and we needed a database a few years ago to build our SaaS monitoring product. None of the things that existed at the time were all that appealing, so we picked up an already open-sourced project (DalmatinerDB), have had about 5 people working for a year improving it under the direction of Heinz (the original author), and have contributed it all back.

The reason to contribute it back wasn't completely altruistic. I'm personally interested in time series databases; as a co-founder I need to go on the road and talk at conferences, and chatting about DalmatinerDB seemed like a good way to get my slides up, with a big 'we are hiring' one at the end. Also, hiring Erlang devs is hard. The more community contributors we have, the bigger the potential hiring pool of known-good people.

For the ZFS question, there's the choice of SmartOS (which is what Heinz built it on) or Linux. Ubuntu 16.04 has been out for several months and supports native ZFS. Dataloop runs DalmatinerDB on ZFS on Ubuntu. Heinz and several of his customers running Project FiFo (which DalmatinerDB came out of) usually run it on SmartOS. I'm guessing far more people would be interested in running it on Linux than SmartOS. The database is intrinsically linked to the filesystem for proper operation: it relies on the filesystem's compression to achieve its 1 byte per data point storage volume, as well as on the filesystem's atomic writes. You can run DalmatinerDB without ZFS, but you'd quickly run out of disk space and your data integrity guarantees wouldn't be as strong.

The bit I do agree with fully is the market sizing question: it's all a gamble based on opinion. We can probably agree it's a huge area for growth right now, with more people moving to cloud and spinning up far more infrastructure than ever before. All I personally know, having been a SysAdmin for most of my career, is that things are getting more complex and open source isn't scaling very well, or when it does you still need a team to manage it. If you had asked big companies 10 years ago who would want to outsource their email hosting, I think you'd get a much different answer than today. The long-term trend is towards SaaS, and I believe the 90% of companies who are using open source today will see the same shift. Will I get the amateurs? Probably never, but then the 90% market numbers come from our own research:

https://blog.dataloop.io/2014/01/30/what-we-learnt-talking-t...

And then validated time and time again by others:

https://kartar.net/2015/08/monitoring-survey-2015---tools/

I honestly don't think that even if the mega combo of Prometheus / DalmatinerDB / Grafana were polished today, it would eat into either Dataloop's or Datadog's business. Some people want to run their own; others want to buy SaaS. Over time we're going to see the shift to SaaS that has happened with most other products, and by having a foot in the door with DalmatinerDB, Dataloop should hopefully build some goodwill and credibility with the people who want to run a monitoring stack themselves. At least then, if they move on to an environment where they do want to buy SaaS-hosted monitoring, we're going in warm with a good reputation.



I know quite a few companies running ZFS on Linux in production. In terms of our own experiences, I blogged about them here: https://dataloopio.wordpress.com/2016/03/07/zfs-on-linux/


Sorry, blog corrected.

