commit | 8b2aceeb97c8f89f2898c0b35f38cc36d3cdda63 | |
---|---|---|
author | Taewoo Kim <wangsaeu@yahoo.com> | Wed Jan 04 12:44:02 2017 -0800 |
committer | Taewoo Kim <wangsaeu@yahoo.com> | Wed Jan 04 15:45:58 2017 -0800 |
tree | cba219e154ed7760cc89d798777cc728eb9836f2 | |
parent | 1355c269f50e84087ed24cb0ec9f091d2ce19a5a | |
ASTERIXDB-1556, ASTERIXDB-1733: Hash Group By and Hash Join conform to the memory budget

- External Hash Group By and Hash Join now conform to the memory budget (compiler.groupmemory and compiler.joinmemory).
- For Optimized Hybrid Hash Join, we calculate the expected hash table size when the build phase is done and try to spill one or more partitions if the free space can't afford the hash table size.
- For External Hash Group By, the number of hash entries (hash table size) is calculated based on an estimation of the aggregated tuple size and possible hash values for the given field size in that tuple.
- A garbage collection feature has been added to SerializableHashTable. For external hash group-by, whenever we spill a data partition to the disk, we also check the ratio of garbage in the hash table. If it's greater than the given threshold, we conduct a GC on the hash table.

Change-Id: I2b323e9a2141b4c1dd1652a360d2d9354d3bc3f5
Reviewed-on: https://asterix-gerrit.ics.uci.edu/1056
Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>
BAD: Jenkins <jenkins@fulliautomatix.ics.uci.edu>
Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>
Reviewed-by: Yingyi Bu <buyingyi@gmail.com>
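
The two decisions described in the commit message (spilling partitions when the hash table will not fit, and collecting garbage once its ratio passes a threshold) can be illustrated with a small Python sketch. This is not the AsterixDB implementation, which is written in Java. The function names, the fixed `bytes_per_entry` size model, and the "spill the largest partition first" policy are all assumptions made for illustration only.

```python
def partitions_to_spill(partitions, budget_bytes, bytes_per_entry):
    """Pick partitions to spill so the expected hash table fits in memory.

    partitions: dict mapping a partition id to (data_bytes, tuple_count).
    bytes_per_entry: hypothetical fixed cost of one hash table entry.
    Returns the ids chosen for spilling, in the order they were chosen.
    """
    resident = dict(partitions)  # copy, so the caller's dict is untouched
    spilled = []
    while resident:
        used = sum(data for data, _ in resident.values())
        expected_table = bytes_per_entry * sum(n for _, n in resident.values())
        free = budget_bytes - used
        if expected_table <= free:
            break  # the hash table fits in the remaining budget
        # Assumed policy: evict the partition holding the most data.
        victim = max(resident, key=lambda pid: resident[pid][0])
        spilled.append(victim)
        del resident[victim]
    return spilled


def should_collect_garbage(garbage_slots, total_slots, threshold):
    """Return True when the garbage ratio is strictly above the threshold."""
    if total_slots == 0:
        return False  # an empty table has nothing to collect
    return garbage_slots / total_slots > threshold
```

With a budget of 1000 bytes, 8 bytes per entry, and partitions `{0: (400, 40), 1: (300, 30), 2: (200, 20)}`, the sketch spills only partition 0: the remaining 500 bytes of data leave 500 bytes free, which covers the 400 bytes expected for the 50 remaining entries.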
#AsterixDB
AsterixDB is a BDMS (Big Data Management System) with a rich feature set that sets it apart from other Big Data platforms. Its feature set makes it well-suited to modern needs such as web data warehousing and social data storage and analysis. AsterixDB has:
Learn more about AsterixDB at [http://asterixdb.apache.org](http://asterixdb.apache.org).
##Building AsterixDB
To build AsterixDB from source, you should have a platform with the following:
Instructions for building the master:
Check out AsterixDB master:

    $ git clone https://github.com/apache/asterixdb.git

Build AsterixDB master:

    $ cd asterixdb
    $ mvn clean package -DskipTests
##Running AsterixDB (on your machine from your build)
Here are the steps to get AsterixDB running on your local machine:
Start a single-machine AsterixDB instance:

    $ cd asterixdb/asterix-server/target/asterix-server-*-binary-assembly/
    $ ./samples/local/bin/start-sample-cluster.sh
You can now run queries in your browser at:
http://localhost:19001
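
If you prefer to confirm from a script that the sample cluster's web interface is answering before opening the browser, a minimal Python sketch could look like the following. The helper name `web_interface_is_up` is hypothetical and not part of AsterixDB. It only checks that an HTTP server responds successfully at the URL above; it does not verify that queries will run.

```python
import urllib.request


def web_interface_is_up(url="http://localhost:19001", timeout=5.0):
    """Return True if an HTTP server answers `url` with a success or redirect status."""
    # Skip any configured proxy, since the target is a local address.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:
        # Covers refused connections, timeouts, and HTTP error statuses
        # (urllib.error.URLError and HTTPError are OSError subclasses).
        return False


if __name__ == "__main__":
    print("up" if web_interface_is_up() else "not reachable")
```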
Read the documentation to learn more about the data model, the query language, and how to create a cluster instance: [https://ci.apache.org/projects/asterixdb/index.html](https://ci.apache.org/projects/asterixdb/index.html)
##Documentation
AsterixDB's official documentation resides at [https://ci.apache.org/projects/asterixdb/index.html](https://ci.apache.org/projects/asterixdb/index.html). It is built as a Maven site from the Maven project under asterix-doc/. The documentation on the official website refers to the most stable build version, so for pre-release versions one should refer to the compiled documentation.
##Support/Contact
If you have any questions, please feel free to ask on our mailing list, users@asterixdb.apache.org. Join the list by sending an email to users-subscribe@asterixdb.apache.org. If you are interested in the internals or development of AsterixDB, you are also welcome to subscribe to our developer mailing list, dev@asterixdb.apache.org, by sending an email to dev-subscribe@asterixdb.apache.org.