Hyracks is a data-parallel platform that allows users to run jobs on a cluster of shared-nothing computers.

QUICKSTART
__________

Hyracks consists of two components that must be running on a cluster before it can accept and run users' jobs. The Hyracks Cluster Controller runs on one machine designated as the
master node. This machine must be reachable from the worker nodes in the cluster as well as from the client machines that submit jobs to Hyracks.
Each worker node (machine) must run a Hyracks Node Controller.

1. Starting the Hyracks Cluster Controller

The simplest way to start the cluster controller is to run bin/hyrackscc.
By default, the cluster controller listens on port 1099 for connections from Node Controllers. To make it listen on a different port, pass the optional
-port parameter as shown below:

bin/hyrackscc -port <port>
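
Because the Node Controllers and the clients must be able to reach the Cluster Controller, it can help to confirm that its port accepts TCP connections before starting the workers. The following is a minimal sketch and not part of Hyracks: the class name CcPortCheck, the helper isReachable, the 2-second timeout and the "localhost" fallback are all assumptions made for illustration. It uses only JDK classes and avoids language features newer than JDK 1.6.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not shipped with Hyracks.
public class CcPortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        Socket socket = new Socket();
        try {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts and unresolvable host names.
            return false;
        } finally {
            try {
                socket.close();
            } catch (IOException ignored) {
                // Nothing useful can be done if close fails.
            }
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";           // assumed fallback
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 1099;   // default port from this README
        boolean ok = isReachable(host, port, 2000);
        System.out.println(host + ":" + port + (ok ? " is reachable" : " is NOT reachable"));
    }
}
```

Running it from a worker machine with the master's host name and port as arguments gives a quick connectivity check. A successful connection only shows that something is listening on that port, not that it is a Cluster Controller.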

2. Starting the Hyracks Node Controller

The node controller is started by running bin/hyracksnc. It requires at least the following two command-line arguments:

 -cc-host VAL         : Cluster Controller host name
 -data-ip-address VAL : IP Address to bind data listener

If the cluster controller was directed to listen on a port other than the default, you will need to pass one more argument to hyracksnc:

 -cc-port N           : Cluster Controller port (default: 1099)

The data-ip-address is the address of the interface on which the Node Controller listens. If the machine is multi-homed, it must listen on an IP address that is reachable from
other Node Controllers. Make sure that the value passed to -data-ip-address is a valid IPv4 address (four octets separated by .).
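
The argument rules above can be sketched in code. The example below is an illustration only, assuming a hypothetical launcher class named NodeControllerCommand. It checks that the data IP is four decimal octets, then assembles the hyracksnc command line and adds -cc-port only when the port differs from the default. The flag names come from this README; the class, its methods and the sample host and address are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not shipped with Hyracks.
public class NodeControllerCommand {
    static final int DEFAULT_CC_PORT = 1099; // default stated in this README

    /** True if s consists of four decimal octets (0-255) separated by '.'. */
    static boolean isValidIpv4(String s) {
        String[] parts = s.split("\\.", -1); // -1 keeps empty trailing parts such as in "1.2.3."
        if (parts.length != 4) {
            return false;
        }
        for (String p : parts) {
            if (p.isEmpty() || p.length() > 3) {
                return false;
            }
            for (int i = 0; i < p.length(); i++) {
                if (p.charAt(i) < '0' || p.charAt(i) > '9') {
                    return false;
                }
            }
            if (Integer.parseInt(p) > 255) {
                return false;
            }
        }
        return true;
    }

    /** Builds the hyracksnc argument list; rejects a data IP that is not a dotted quad. */
    static List<String> build(String ccHost, String dataIp, int ccPort) {
        if (!isValidIpv4(dataIp)) {
            throw new IllegalArgumentException("not an IPv4 address: " + dataIp);
        }
        List<String> cmd = new ArrayList<String>();
        cmd.add("bin/hyracksnc");
        cmd.add("-cc-host");
        cmd.add(ccHost);
        cmd.add("-data-ip-address");
        cmd.add(dataIp);
        if (ccPort != DEFAULT_CC_PORT) {
            cmd.add("-cc-port");
            cmd.add(String.valueOf(ccPort));
        }
        return cmd;
    }

    public static void main(String[] args) {
        // "master" and 10.0.0.5 are placeholder values.
        System.out.println(build("master", "10.0.0.5", 2099));
    }
}
```

The resulting list could be handed to java.lang.ProcessBuilder to launch the node controller, or simply printed and pasted into a shell.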

3. Running a job on Hyracks

The source distribution contains a few examples under src/test/integration that show how to construct a Job and submit it to the Hyracks Cluster Controller. The
basic steps that the client performs to execute a job are:

// Uses java.rmi.registry.LocateRegistry, java.rmi.registry.Registry and java.util.UUID
Registry registry = LocateRegistry.getRegistry(ccHost, ccPort);
// Get a handle to the Cluster Controller; lookup() returns java.rmi.Remote, so a cast is required
IClusterController cc = (IClusterController) registry.lookup(IClusterController.class.getName());

JobSpecification spec = createJob(); // User code to create a Job

UUID jobId = cc.createJob(spec); // Install the Job on the Cluster Controller

cc.start(jobId); // Start the job
cc.waitForCompletion(jobId); // Jobs run asynchronously -- Wait for this one to complete
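
The create, start and wait calls above imply that start returns before the job has finished. The sketch below illustrates that pattern in isolation so it can be run without a cluster. It does not use Hyracks at all: JobController and InMemoryController are hypothetical stand-ins that only borrow the three method names shown above, and a "job" here is just a Runnable. The real IClusterController may behave differently in ways this toy does not capture.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class AsyncJobSketch {
    /** Hypothetical stand-in mirroring the three calls above; NOT the real IClusterController. */
    interface JobController {
        UUID createJob(Runnable work);
        void start(UUID jobId);
        void waitForCompletion(UUID jobId) throws InterruptedException;
    }

    /** Toy in-memory controller: every started job runs on its own thread. */
    static class InMemoryController implements JobController {
        private final Map<UUID, Runnable> jobs = new ConcurrentHashMap<UUID, Runnable>();
        private final Map<UUID, CountDownLatch> finished = new ConcurrentHashMap<UUID, CountDownLatch>();

        public UUID createJob(Runnable work) {
            UUID jobId = UUID.randomUUID();
            jobs.put(jobId, work);
            finished.put(jobId, new CountDownLatch(1));
            return jobId;
        }

        public void start(UUID jobId) {
            final Runnable work = jobs.get(jobId);
            final CountDownLatch latch = finished.get(jobId);
            new Thread(new Runnable() {
                public void run() {
                    try {
                        work.run();
                    } finally {
                        latch.countDown(); // signal completion even if the job throws
                    }
                }
            }).start(); // returns immediately; the job runs in the background
        }

        public void waitForCompletion(UUID jobId) throws InterruptedException {
            finished.get(jobId).await(); // block until the job has signalled completion
        }
    }

    /** Creates and starts n jobs, waits for all of them, and returns how many ran. */
    static int runJobs(int n) {
        JobController cc = new InMemoryController();
        final AtomicInteger completed = new AtomicInteger();
        List<UUID> ids = new ArrayList<UUID>();
        for (int i = 0; i < n; i++) {
            UUID jobId = cc.createJob(new Runnable() {
                public void run() {
                    completed.incrementAndGet();
                }
            });
            cc.start(jobId); // all jobs are started before any wait
            ids.add(jobId);
        }
        try {
            for (UUID jobId : ids) {
                cc.waitForCompletion(jobId);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for jobs", e);
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("jobs completed: " + runJobs(3)); // prints "jobs completed: 3"
    }
}
```

Starting every job before waiting on any of them is what lets the jobs overlap. Waiting immediately after each start would serialize them.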



BUILDING FROM SOURCE
____________________

Prerequisites:

1. JDK 1.6
2. Maven2 (maven.apache.org)

Steps:

1. Download and unzip the Hyracks source assembly
2. Download dcache-client-0.0.1.jar
3. Run the following command, replacing /path/to/file with the directory that contains the dcache-client jar file:

 mvn install:install-file -DgroupId=edu.uci.ics.dcache -DartifactId=dcache-client -Dversion=0.0.1 -Dpackaging=jar -Dfile=/path/to/file/dcache-client-0.0.1.jar

4. cd into the hyracks-core folder where the source assembly was unzipped
5. Run

 mvn package

6. That's it!


IMPORTING INTO ECLIPSE
______________________

Prerequisites:

You will need to have the Eclipse Maven plugin (http://m2eclipse.sonatype.org/) installed.

Steps:

1. Open Eclipse
2. Right click in the Package Explorer pane
3. Click on Import...
4. Under "General" choose "Existing Projects into Workspace"
5. Pick the project root option and browse to the hyracks-core folder
6. Click on Finish