e05df7be235057d706c79db6122e7e7235017f9b - asterixdb

commit	e05df7be235057d706c79db6122e7e7235017f9b
author	Yingyi Bu <buyingyi@gmail.com>	Tue Jun 16 11:03:39 2015 -0700
committer	Yingyi Bu <buyingyi@gmail.com>	Tue Jun 16 18:54:02 2015 -0700
tree	82d307ab11be5c027d816e4de110ade01bd44270
parent	1445153fda37f0d244bb1648c27ac5df4c47a852

AsterixDB changes for fixing issue 873.

For example, in the following query plan, the change lets the optimizer recognize that $$12 and $$20 are equivalent.
Therefore, HASH_PARTITION_EXCHANGE [$$12] can be replaced by ONE_TO_ONE_EXCHANGE.

-- COMMIT  |PARTITIONED|
  project ([$$12])
  -- STREAM_PROJECT  |PARTITIONED|
    exchange
    -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
      delete from TinySocial:TweetMessages from %0->$$4 partitioned by [%0->$$12]
      -- INSERT_DELETE  |PARTITIONED|
        exchange
        -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
          materialize
          -- MATERIALIZE  |PARTITIONED|
            exchange
            -- HASH_PARTITION_EXCHANGE [$$12]  |PARTITIONED|
              assign [$$12] <- [function-call: asterix:field-access-by-index, Args:[%0->$$4, AInt32: {0}]]
              -- ASSIGN  |PARTITIONED|
                project ([$$4])
                -- STREAM_PROJECT  |PARTITIONED|
                  assign [$$4] <- [function-call: asterix:open-record-constructor, Args:[AString: {tweetid}, %0->$$14, AString: {user}, function-call: asterix:field-access-by-index, Args:[%0->$$0, AInt32: {1}], AString: {sender-location}, function-call: asterix:field-access-by-index, Args:[%0->$$0, AInt32: {2}], AString: {send-time}, function-call: asterix:field-access-by-index, Args:[%0->$$0, AInt32: {3}], AString: {referred-topics}, function-call: asterix:field-access-by-index, Args:[%0->$$0, AInt32: {4}], AString: {message-text}, function-call: asterix:field-access-by-index, Args:[%0->$$0, AInt32: {5}]]]
                  -- ASSIGN  |PARTITIONED|
                    exchange
                    -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                      unnest-map [$$14, $$0] <- function-call: asterix:index-search, Args:[AString: {TweetMessages}, AInt32: {0}, AString: {TinySocial}, AString: {TweetMessages}, ABoolean: {false}, ABoolean: {false}, ABoolean: {false}, AInt32: {1}, %0->$$20, AInt32: {1}, %0->$$21, TRUE, TRUE, TRUE]
                      -- BTREE_SEARCH  |PARTITIONED|
                        exchange
                        -- ONE_TO_ONE_EXCHANGE  |PARTITIONED|
                          assign [$$20, $$21] <- [AString: {15}, AString: {15}]
                          -- ASSIGN  |PARTITIONED|
                            empty-tuple-source
                            -- EMPTY_TUPLE_SOURCE  |PARTITIONED|
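
The reasoning above can be sketched in a few lines of Java. This is a minimal illustration, not the code touched by this commit: the class `VariableEquivalence` and its methods are hypothetical names, not Algebricks or AsterixDB APIs. It tracks equivalence classes of plan variables with a union-find structure and assumes, for illustration only, that the input below the exchange is already partitioned on a variable equivalent to $$12.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from the AsterixDB code base.
final class VariableEquivalence {
    // Union-find parent pointers keyed by variable name, e.g. "$$12".
    private final Map<String, String> parent = new HashMap<String, String>();

    // Returns the representative of the equivalence class containing v.
    String find(String v) {
        if (!parent.containsKey(v)) {
            parent.put(v, v);
            return v;
        }
        String p = parent.get(v);
        if (!p.equals(v)) {
            p = find(p);
            parent.put(v, p); // path compression
        }
        return p;
    }

    // Records that a and b always carry the same value.
    void markEquivalent(String a, String b) {
        parent.put(find(a), find(b));
    }

    boolean equivalent(String a, String b) {
        return find(a).equals(find(b));
    }

    // Simplified check: the delivered hash partitioning satisfies the required
    // one when the columns match pairwise up to equivalence.
    boolean canSkipRepartition(List<String> required, List<String> delivered) {
        if (required.size() != delivered.size()) {
            return false;
        }
        for (int i = 0; i < required.size(); i++) {
            if (!equivalent(required.get(i), delivered.get(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        VariableEquivalence eq = new VariableEquivalence();
        eq.markEquivalent("$$12", "$$20");
        // Assumption: the input is already partitioned on $$20.
        boolean skip = eq.canSkipRepartition(
                Collections.singletonList("$$12"),
                Collections.singletonList("$$20"));
        System.out.println(skip ? "ONE_TO_ONE_EXCHANGE" : "HASH_PARTITION_EXCHANGE [$$12]");
    }
}
```

Without the `markEquivalent` call, the check fails and the sketch falls back to the hash repartitioning, which mirrors the plan shown above.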

Change-Id: Ife8c378a62cdbbcd8c19b521de246162f1f3d6ec
Reviewed-on: https://asterix-gerrit.ics.uci.edu/267
Tested-by: Jenkins <jenkins@fulliautomatix.ics.uci.edu>
Reviewed-by: Wenhai Li <lwhaymail@yahoo.com>
Reviewed-by: Ildar Absalyamov <ildar.absalyamov@gmail.com>

15 files changed

README.md

#AsterixDB

AsterixDB is a BDMS (Big Data Management System) with a rich feature set that sets it apart from other Big Data platforms. Its feature set makes it well-suited to modern needs such as web data warehousing and social data storage and analysis. AsterixDB has:

* A semistructured NoSQL style data model (ADM) resulting from extending JSON with object database ideas
* An expressive and declarative query language (AQL) that supports a broad range of queries and analysis over semistructured data
* A parallel runtime query execution engine, Hyracks, that has been scale-tested on up to 1000+ cores and 500+ disks
* Partitioned LSM-based data storage and indexing to support efficient ingestion and management of semistructured data
* Support for query access to externally stored data (e.g., data in HDFS) as well as to data stored natively by AsterixDB
* A rich set of primitive data types, including spatial and temporal data in addition to integer, floating point, and textual data
* Secondary indexing options that include B+ trees, R trees, and inverted keyword (exact and fuzzy) index types
* Support for fuzzy and spatial queries as well as for more traditional parametric queries
* Basic transactional (concurrency and recovery) capabilities akin to those of a NoSQL store

Learn more about AsterixDB at [http://asterixdb.ics.uci.edu/](http://asterixdb.ics.uci.edu/).

##Building AsterixDB

To build AsterixDB from source, you should have a platform with the following:

* A Unix-like environment (Linux or OS X will do)
* git
* Maven 3.1.1 or newer
* Java 7 or newer

Additionally, to run all the integration tests, you should be running sshd locally and have passwordless ssh logins enabled for the account that runs the tests.

##Documentation

AsterixDB's official documentation resides at [http://asterixdb.ics.uci.edu/documentation/index.html](http://asterixdb.ics.uci.edu/documentation/index.html). It is built as a Maven site from the Maven project under asterix-doc/. The documentation on the official website covers the most recent stable release, so for pre-release versions refer to the documentation compiled from source.

##Support/Contact

If you have any questions, please feel free to ask on our mailing list, users@asterixdb.incubator.apache.org. Join the list by sending an email to users-subscribe@asterixdb.incubator.apache.org. If you are interested in the internals or development of AsterixDB, please also feel free to subscribe to our developer mailing list, dev@asterixdb.incubator.apache.org, by sending an email to dev-subscribe@asterixdb.incubator.apache.org.