DataSketches is an open source, high-performance library of stochastic streaming algorithms commonly called "sketches" in the data sciences. Sketches are small, stateful programs that process massive data as a stream and can provide approximate answers, with mathematical guarantees, to computationally difficult queries orders-of-magnitude faster than traditional, exact methods.
Started: 2019-03-30; Last Status Update: 2020-04-29
Reporting: February, May, August, November
All Committers are PPMC members
Mentors: Liang Chen (chenliang613), Kenneth Knowles (kenn), Furkan Kamaci (kamaci), Dave Fisher (wave), Evans Ye (evansye)
2019-03-30 DataSketches enters incubation.
Current Releases: <a href="https://datasketches.apache.org/docs/Community/Downloads.html">https://datasketches.apache.org/docs/Community/Downloads.html</a>
History of Apache Releases by component:
datasketches-java: <a href="https://github.com/apache/incubator-datasketches-java/releases">https://github.com/apache/incubator-datasketches-java/releases</a>
datasketches-cpp: <a href="https://github.com/apache/incubator-datasketches-cpp/releases">https://github.com/apache/incubator-datasketches-cpp/releases</a>
datasketches-hive: <a href="https://github.com/apache/incubator-datasketches-hive/releases">https://github.com/apache/incubator-datasketches-hive/releases</a>
datasketches-pig: <a href="https://github.com/apache/incubator-datasketches-pig/releases">https://github.com/apache/incubator-datasketches-pig/releases</a>
datasketches-postgresql: <a href="https://github.com/apache/incubator-datasketches-postgresql/releases">https://github.com/apache/incubator-datasketches-postgresql/releases</a>
datasketches-memory: <a href="https://github.com/apache/incubator-datasketches-memory/releases">https://github.com/apache/incubator-datasketches-memory/releases</a>
Integration efforts have started with Apache Flink and Apache Impala. There is also interest from Apache Beam.
Developer mailing list: http://mail-archives.apache.org/mod_mbox/datasketches-dev
Commits mailing list: http://mail-archives.apache.org/mod_mbox/datasketches-commits
It is essential that you verify the integrity of release downloads. See instructions here
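As a rough illustration of what a checksum check involves, the following minimal sketch computes the SHA-512 digest of a downloaded file and compares it with the digest published alongside the release. It assumes the published digest is available as a hex string (for example, copied from the accompanying checksum file); the function names are hypothetical and are not part of any DataSketches tooling. A checksum confirms only that the download is intact, so it does not replace checking the release signature against the project's published keys.

```python
import hashlib


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of the file at `path`.

    The file is read in chunks so large release archives
    do not have to fit in memory.
    """
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, published_hex):
    """Compare a file's SHA-512 digest with a published hex digest.

    Assumption: `published_hex` contains only the digest itself.
    Whitespace and letter case are normalized before comparing,
    since checksum files sometimes group or uppercase the hex digits.
    """
    normalized = "".join(published_hex.split()).lower()
    return sha512_of_file(path) == normalized
```

A call such as `matches_published_digest(archive_path, published_hex)`, where both arguments are placeholders for the downloaded archive and the digest from its checksum file, returns `True` only when the two digests agree.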