commit | c2545cc1d8d100ac1765641eecf154bb68285e72 | |
---|---|---|
author | Murtadha Hubail <mhubail@apache.org> | Tue Sep 10 04:44:09 2019 +0300 |
committer | Murtadha Hubail <mhubail@apache.org> | Wed Sep 11 03:08:47 2019 +0000 |
tree | c06e01e9d476e4092d1049998362b8b26f601d6d | |
parent | 309b96f8d7a266c0d9e986bb294f611c065c4909 |
[NO ISSUE][RT] Add Thread-Based Stats Collector

- user model changes: no
- storage format changes: no
- interface changes: yes

Details:
- Add infra to allow collecting thread-based stats during runtime for any thread that belongs to a task.
- Collect number of pinned pages per thread and report it in the TaskProfile.
- Aggregate pinned pages counters from all job tasks and report it as diskIoCount in the metrics field in the json response. The plan is to move this stats to the profile field when it is introduced.
- Collecting pinned pages stats is currently enabled by default for any job with IndexSearchOperatorNodePushable. The plan is to allow enabling/disabling as part of the job profiling change.
- Add test case for diskIoCount metric.
- Remove unused IndexSearchOperatorNodePushable constructor.

Change-Id: I44dfcedcadb3d0f48815b521e7d495e473b02e3d
Reviewed-on: https://asterix-gerrit.ics.uci.edu/3555
Contrib: Jenkins <jenkins@fulliautomatix.ics.uci.edu>
Integration-Tests: Jenkins <jenkins@fulliautomatix.ics.uci.edu>
Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>
Reviewed-by: Murtadha Hubail <mhubail@apache.org>
Reviewed-by: Till Westmann <tillw@apache.org>
AsterixDB is a BDMS (Big Data Management System) with a rich feature set that sets it apart from other Big Data platforms. Its feature set makes it well-suited to modern needs such as web data warehousing and social data storage and analysis. AsterixDB has:
Data model
A semistructured NoSQL style data model (ADM) resulting from extending JSON with object database ideas
Query languages
Two expressive and declarative query languages (SQL++ and AQL) that support a broad range of queries and analysis over semistructured data
Scalability
A parallel runtime query execution engine, Apache Hyracks, that has been scale-tested on up to 1000+ cores and 500+ disks
Native storage
Partitioned LSM-based data storage and indexing to support efficient ingestion and management of semistructured data
External storage
Support for query access to externally stored data (e.g., data in HDFS) as well as to data stored natively by AsterixDB
Data types
A rich set of primitive data types, including spatial and temporal data in addition to integer, floating point, and textual data
Indexing
Secondary indexing options that include B+ trees, R trees, and inverted keyword (exact and fuzzy) index types
Transactions
Basic transactional (concurrency and recovery) capabilities akin to those of a NoSQL store
Learn more about AsterixDB at its website.
To build AsterixDB from source, you should have a platform with the following:
Instructions for building the master:
Check out AsterixDB master:
$ git clone https://github.com/apache/asterixdb.git
Build AsterixDB master:
$ cd asterixdb
$ mvn clean package -DskipTests
Here are steps to get AsterixDB running on your local machine:
Start a single-machine AsterixDB instance:
$ cd asterixdb/asterix-server/target/asterix-server-*-binary-assembly/apache-asterixdb-*-SNAPSHOT
$ ./opt/local/bin/start-sample-cluster.sh
The instance is now ready. Run queries in your browser at:
http://localhost:19001
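Queries can also be sent from the command line instead of the browser. The following is a minimal sketch, not part of the original instructions. It assumes the sample cluster exposes the HTTP query service at port 19002 under `/query/service` and that `curl` is installed. The helper names `query_url` and `run_query` and the `ASTERIX_HOST` and `ASTERIX_QUERY_PORT` variables are hypothetical, introduced only for illustration.

```shell
# Assumption: the query service listens on localhost:19002 (not stated above);
# override these variables if your instance is configured differently.
ASTERIX_HOST="${ASTERIX_HOST:-localhost}"
ASTERIX_QUERY_PORT="${ASTERIX_QUERY_PORT:-19002}"

# Print the assumed query-service endpoint URL.
query_url() {
  printf 'http://%s:%s/query/service\n' "$ASTERIX_HOST" "$ASTERIX_QUERY_PORT"
}

# POST a single statement (passed as $1) to the endpoint and print the JSON response.
run_query() {
  curl -s --data-urlencode "statement=$1" "$(query_url)"
}
```

With the sample cluster running, `run_query 'SELECT VALUE 1;'` should return a JSON document. If the assumptions hold, that response would include the metrics field in which the commit above reports diskIoCount.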
Read more documentation to learn the data model, query language, and how to create a cluster instance.