introduced new adm lexer into stabilization. Issue #215
git-svn-id: https://asterixdb.googlecode.com/svn/branches/asterix_stabilization@1205 eaa15691-b419-025a-1212-ee371bd00084
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/Asterix_ADM_Parser.md b/asterix-maven-plugins/lexer-generator-maven-plugin/Asterix_ADM_Parser.md
new file mode 100644
index 0000000..eeaffc9
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/Asterix_ADM_Parser.md
@@ -0,0 +1,53 @@
+The Asterix ADM Parser
+======================
+
+The ADM parser inside Asterix is composed of two components:
+
+* **The Parser** AdmTupleParser, which converts the ADM tokens into internal objects
+* **The Lexer** AdmLexer, which scans the ADM file and returns a list of ADM tokens
+
+These two classes belong to the package:
+
+    edu.uci.ics.asterix.runtime.operators.file
+
+The Parser is loaded through a factory (*AdmSchemafullRecordParserFactory*) by
+
+    edu.uci.ics.asterix.external.dataset.adapter.FileSystemBasedAdapter extends AbstractDatasourceAdapter
+
+
+How to add a new datatype
+-------------------------
+The ADM format allows two different kinds of datatypes:
+
+* primitive
+* with constructor
+
+A primitive datatype lets you write the actual value of the field without extra markup:
+
+    { name : "Diego", age : 23 }
+
+while a datatype with a constructor requires you to specify first the type of the value and then a string with the serialized value:
+
+    { center : point3d("P2.1,3,8.5") }
+
+To add a new datatype, the steps are:
+
+1. Add the new token to the **Lexer**
+    * **if the datatype is primitive**, it is necessary to create a TOKEN able to recognize **the format of the value**
+    * **if the datatype has a constructor**, it is necessary to create **only** a TOKEN able to recognize **the name of the constructor**
+
+2. Change the **Parser** so that it correctly converts the new token into internal objects
+    * This will require **adding new cases to the switch-case statements** and introducing **a serializer/deserializer object** for that datatype.
+
+
+The Lexer
+----------
+To add a new datatype or change the token definitions you have to change ONLY the file adm.grammar located in
+    asterix-runtime/src/main/resources/adm.grammar
+The lexer will be regenerated from that definition file during each Maven build.
+
+The Maven configuration is located in asterix-runtime/pom.xml
+
+
+> Author: Diego Giorgini - diegogiorgini@gmail.com
+> 6 December 2012
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/README.md b/asterix-maven-plugins/lexer-generator-maven-plugin/README.md
new file mode 100644
index 0000000..b3632e6
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/README.md
@@ -0,0 +1,111 @@
+Lexer Generator
+===============
+
+This tool automates the creation of Hand-Coded-Like Lexers.
+It was created to address the performance issues of other (more advanced) lexer generators like JavaCC that arise when you need to scan terabytes of data. In particular it is *~20x faster* than JavaCC and can typically parse data from a normal hard disk at *more than 70 MB/s*.
+
+
+Maven Plugin (to put inside pom.xml)
+-------------------------------------
+    <build>
+      <plugins>
+        <plugin>
+          <groupId>org.apache.maven.plugins</groupId>
+          <artifactId>maven-compiler-plugin</artifactId>
+          <version>2.0.2</version>
+          <configuration>
+            <source>1.6</source>
+            <target>1.6</target>
+          </configuration>
+        </plugin>
+        <plugin>
+          <groupId>edu.uci.ics.asterix</groupId>
+          <artifactId>lexer-generator-maven-plugin</artifactId>
+          <version>0.1-SNAPSHOT</version>
+          <configuration>
+            <grammarFile>src/main/java/edu/uci/ics/asterix/runtime/operators/file/adm/adm.grammar</grammarFile>
+            <outputDir>${project.build.directory}/generated-sources</outputDir>
+          </configuration>
+          <executions>
+            <execution>
+              <id>generate-lexer</id>
+              <phase>generate-sources</phase>
+              <goals>
+                <goal>generate-lexer</goal>
+              </goals>
+            </execution>
+          </executions>
+        </plugin>
+      </plugins>
+    </build>
+
+
+Command line
+-------------
+    LexerGenerator
+    usage: java LexerGenerator <configuration file>
+
+
+What Hand-Coded-Like means and why it is so fast
+------------------------------------------------
+Most lexers use a Finite State Machine encoded in a data structure called a [State Transition Table](http://en.wikipedia.org/wiki/State_transition_table).
+While elegant and practical, this approach requires some extra checks and operations to deal with the data structure at runtime. A different approach consists in encoding the State Machine as actual code, so that the operations performed are limited to the minimum needed to parse the grammar.
+A common problem with this kind of hand-coded lexer is that it is almost impossible to maintain and change. This is the reason for this Lexer Generator, which produces a Hand-Coded-Like lexer starting from a grammar specification.
+
+Another big difference from most lexer generators (especially the ones for Java) is that, since it is optimized for performance, we **don't return objects** and we **use as few objects as possible internally**.
+This is actually the main reason for the ~20x speedup when compared with JavaCC.
+
+
+Configuration File
+------------------
+It is a simple *key: value* configuration file plus the *specification of your grammar*.
+The configuration keys are listed below:
+
+    # LEXER GENERATOR configuration file
+    # ---------------------------------------
+    # Place *first* the generic configuration
+    # then list your grammar.
+
+    PACKAGE: edu.uci.ics.asterix.admfast.parser
+    LEXER_NAME: AdmLexer
+    OUTPUT_DIR: output/
+
+
+Specify The Grammar
+-------------------
+Your grammar has to be listed in the configuration file after the *TOKENS:* keyword.
+
+    TOKENS:
+
+    BOOLEAN_LIT = string(boolean)
+    COMMA = char(\,)
+    COLON = char(:)
+    STRING_LITERAL = char("), anythingUntil(")
+    INT_LITERAL = signOrNothing(), digitSequence()
+    INT8_LITERAL = token(INT_LITERAL), string(i8)
+    @EXPONENT = caseInsensitiveChar(e), signOrNothing(), digitSequence()
+    DOUBLE_LITERAL = signOrNothing(), digitSequence(), char(.), digitSequence(), token(@EXPONENT)
+    DOUBLE_LITERAL = signOrNothing(), digitSequence(), token(@EXPONENT)
+
+Each token is composed of a **name** and a sequence of **rules**.
+Each rule is written in the format: **constructor(parameter)**
+The list of available rules is coded inside *NodeChainFactory.java*
+
+You can write more than one sequence of rules for a token just by adding another line and repeating the token name.
+
+You can reuse the rules of a token inside another one with the special rule: **token(** *TOKEN_NAME* **)**
+
+Lastly, you can define *auxiliary* token definitions that will not be encoded in the final lexer (but that can be useful inside other token definitions) just by **starting the token name with @**.
+
+**Attention:** take care not to write rules that, once merged into the state machine, would lead to a *conflict between transitions*, such as a transition for a generic digit and one for the digit 0 from the same node.
+
+The result: MyLexer
+-------------------
+The result of running the LexerGenerator is the creation of the Lexer inside the directory *components*.
+The lexer is extremely simple and minimal, and can be used like an Iterator:
+
+    MyLexer myLexer = new MyLexer(new FileReader(file));
+    int token;
+    while ((token = myLexer.next()) != MyLexer.TOKEN_EOF) {
+        System.out.println(MyLexer.tokenKindToString(token));
+    }
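Note on the README above (not part of the patch): the "state machine encoded as actual code" idea can be illustrated with a minimal hand-written sketch for a rule like `INT_LITERAL = signOrNothing(), digitSequence()`. This is not the code the generator emits. The class name `HandCodedSketch`, the token constants and the whitespace handling are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class HandCodedSketch {
    // Hypothetical token kinds; the real generator assigns its own constants.
    public static final int TOKEN_EOF = 0;
    public static final int TOKEN_INT_LITERAL = 2;
    public static final int TOKEN_ERROR = -1;

    private final Reader in;
    private int current; // current character, or -1 at end of input

    public HandCodedSketch(Reader in) throws IOException {
        this.in = in;
        this.current = in.read();
    }

    // Each state of the machine is a position in this method, not a table row,
    // and the result is a plain int rather than a token object.
    public int next() throws IOException {
        while (current == ' ' || current == '\n') {
            current = in.read();
        }
        if (current == -1) {
            return TOKEN_EOF;
        }
        // signOrNothing(): consume an optional sign
        if (current == '+' || current == '-') {
            current = in.read();
        }
        // digitSequence(): at least one digit is required
        if (current < '0' || current > '9') {
            current = in.read(); // skip the offending character so the caller makes progress
            return TOKEN_ERROR;
        }
        while (current >= '0' && current <= '9') {
            current = in.read();
        }
        return TOKEN_INT_LITERAL;
    }

    public static void main(String[] args) throws IOException {
        HandCodedSketch lexer = new HandCodedSketch(new StringReader("12 -7 +300"));
        int count = 0;
        while (lexer.next() == TOKEN_INT_LITERAL) {
            count++;
        }
        System.out.println(count); // prints 3
    }
}
```

There is no transition table to index at runtime and no object is allocated per token, which is the property the README credits for the speedup.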
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/pom.xml b/asterix-maven-plugins/lexer-generator-maven-plugin/pom.xml
new file mode 100644
index 0000000..524727f
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/pom.xml
@@ -0,0 +1,36 @@
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+ <modelVersion>4.0.0</modelVersion>
+ <groupId>edu.uci.ics.asterix</groupId>
+ <artifactId>lexer-generator-maven-plugin</artifactId>
+ <version>0.1</version>
+ <packaging>maven-plugin</packaging>
+ <name>lexer-generator-maven-plugin</name>
+
+ <build>
+ <plugins>
+ <plugin>
+ <groupId>org.apache.maven.plugins</groupId>
+ <artifactId>maven-compiler-plugin</artifactId>
+ <version>2.0.2</version>
+ <configuration>
+ <source>1.6</source>
+ <target>1.6</target>
+ </configuration>
+ </plugin>
+ </plugins>
+ </build>
+
+ <dependencies>
+ <dependency>
+ <groupId>junit</groupId>
+ <artifactId>junit</artifactId>
+ <version>4.8.1</version>
+ <scope>test</scope>
+ </dependency>
+ <dependency>
+ <groupId>org.apache.maven</groupId>
+ <artifactId>maven-plugin-api</artifactId>
+ <version>2.0.2</version>
+ </dependency>
+ </dependencies>
+</project>
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/LexerGenerator.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/LexerGenerator.java
new file mode 100644
index 0000000..512f3d0
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/LexerGenerator.java
@@ -0,0 +1,202 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator;
+
+import java.io.BufferedReader;
+import java.io.FileNotFoundException;
+import java.io.FileReader;
+import java.io.Reader;
+import java.io.InputStream;
+import java.io.InputStreamReader;
+import java.io.File;
+import java.io.FileWriter;
+import java.io.IOException;
+import java.util.HashMap;
+import java.util.LinkedHashMap;
+import java.util.Map.Entry;
+import java.util.Set;
+import org.apache.maven.plugin.logging.Log;
+
+public class LexerGenerator {
+ private LinkedHashMap<String, Token> tokens = new LinkedHashMap<String, Token>();
+ private Log logger;
+
+ public LexerGenerator() {
+ }
+
+ public LexerGenerator(Log logger) {
+ this.logger = logger;
+ }
+
+ private void log(String info) {
+ if (logger == null) {
+ System.out.println(info);
+ } else {
+ logger.info(info);
+ }
+ }
+
+ public void addToken(String rule) throws Exception {
+ Token newToken;
+ if (rule.charAt(0) == '@') {
+ newToken = new TokenAux(rule, tokens);
+ } else {
+ newToken = new Token(rule, tokens);
+ }
+ Token existingToken = tokens.get(newToken.getName());
+ if (existingToken == null) {
+ tokens.put(newToken.getName(), newToken);
+ } else {
+ existingToken.merge(newToken);
+ }
+ }
+
+ public void generateLexer(HashMap<String, String> config) throws Exception {
+ LexerNode main = this.compile();
+ config.put("TOKENS_CONSTANTS", this.tokensConstants());
+ config.put("TOKENS_IMAGES", this.tokensImages());
+ config.put("LEXER_LOGIC", main.toJava());
+ config.put("LEXER_AUXFUNCTIONS", replaceParams(this.auxiliaryFunctions(main), config));
+ String[] files = { "/Lexer.java", "/LexerException.java" };
+ String outputDir = config.get("OUTPUT_DIR");
+ (new File(outputDir)).mkdirs();
+ for (String file : files) {
+ String input = readFile(LexerGenerator.class.getResourceAsStream(file));
+ String fileOut = file.replace("Lexer", config.get("LEXER_NAME"));
+ String output = replaceParams(input, config);
+ log("Generating: " + file + "\t>>\t" + fileOut);
+ FileWriter out = new FileWriter((new File(outputDir, fileOut)).toString());
+ out.write(output);
+ out.close();
+ log(" [done]\n");
+ }
+ }
+
+ public String printParsedGrammar() {
+ StringBuilder result = new StringBuilder();
+ for (Token token : tokens.values()) {
+ result.append(token.toString()).append("\n");
+ }
+ return result.toString();
+ }
+
+ private LexerNode compile() throws Exception {
+ LexerNode main = new LexerNode();
+ for (Token token : tokens.values()) {
+ if (token instanceof TokenAux)
+ continue;
+ main.merge(token.getNode());
+ }
+ return main;
+ }
+
+ private String tokensImages() {
+ StringBuilder result = new StringBuilder();
+ Set<String> uniqueTokens = tokens.keySet();
+ for (String token : uniqueTokens) {
+ result.append(", \"<").append(token).append(">\" ");
+ }
+ return result.toString();
+ }
+
+ private String tokensConstants() {
+ StringBuilder result = new StringBuilder();
+ Set<String> uniqueTokens = tokens.keySet();
+ int i = 2;
+ for (String token : uniqueTokens) {
+ result.append(", TOKEN_").append(token).append("=").append(i).append(" ");
+ i++;
+ }
+ return result.toString();
+ }
+
+ private String auxiliaryFunctions(LexerNode main) {
+ StringBuilder result = new StringBuilder();
+ Set<String> functions = main.neededAuxFunctions();
+ for (String token : functions) {
+ result.append("private int parse_" + token
+ + "(char currentChar) throws IOException, [LEXER_NAME]Exception{\n");
+ result.append(tokens.get(token).getNode().toJavaAuxFunction());
+ result.append("\n}\n\n");
+ }
+ return result.toString();
+ }
+
+ private static String readFile(Reader input) throws FileNotFoundException, IOException {
+ StringBuffer fileData = new StringBuffer(1000);
+ BufferedReader reader = new BufferedReader(input);
+ char[] buf = new char[1024];
+ int numRead = 0;
+ while ((numRead = reader.read(buf)) != -1) {
+ String readData = String.valueOf(buf, 0, numRead);
+ fileData.append(readData);
+ buf = new char[1024];
+ }
+ reader.close();
+ return fileData.toString();
+ }
+
+ private static String readFile(InputStream input) throws FileNotFoundException, IOException {
+ if (input == null) {
+ throw new FileNotFoundException();
+ }
+ return readFile(new InputStreamReader(input));
+ }
+
+ private static String readFile(String fileName) throws FileNotFoundException, IOException {
+ return readFile(new FileReader(fileName));
+ }
+
+ private static String replaceParams(String input, HashMap<String, String> config) {
+ for (Entry<String, String> param : config.entrySet()) {
+ String key = "\\[" + param.getKey() + "\\]";
+ String value = param.getValue();
+ input = input.replaceAll(key, value);
+ }
+ return input;
+ }
+
+ public static void main(String args[]) throws Exception {
+ if (args.length == 0 || "--help".equals(args[0]) || "-h".equals(args[0])) {
+ System.out.println("LexerGenerator\nusage: java LexerGenerator <configuration file>");
+ return;
+ }
+
+ LexerGenerator lexer = new LexerGenerator();
+ HashMap<String, String> config = new HashMap<String, String>();
+
+ System.out.println("Config file:\t" + args[0]);
+ String input = readFile(args[0]);
+ boolean tokens = false;
+ for (String line : input.split("\r?\n")) {
+ line = line.trim();
+ if (line.length() == 0 || line.charAt(0) == '#')
+ continue;
+ if (tokens == false && !line.equals("TOKENS:")) {
+ config.put(line.split("\\s*:\\s*")[0], line.split("\\s*:\\s*")[1]);
+ } else if (line.equals("TOKENS:")) {
+ tokens = true;
+ } else {
+ lexer.addToken(line);
+ }
+ }
+
+ String parsedGrammar = lexer.printParsedGrammar();
+ lexer.generateLexer(config);
+ System.out.println("\nGenerated grammar:");
+ System.out.println(parsedGrammar);
+ }
+
+}
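Note on `replaceParams` in LexerGenerator.java above (not part of the patch): `String.replaceAll` treats `$` and `\` in the replacement string specially, so a substituted value that contains them is not inserted literally. The sketch below shows a variant that quotes both the key and the value. The class name `TemplateSketch` and the sample values are hypothetical, and whether the generator's templates need literal substitution is not stated in the patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TemplateSketch {
    // Replaces every [KEY] in the template with its configured value, literally.
    public static String fill(String template, Map<String, String> config) {
        String result = template;
        for (Map.Entry<String, String> param : config.entrySet()) {
            // Pattern.quote makes the brackets literal instead of a character class.
            String key = Pattern.quote("[" + param.getKey() + "]");
            // quoteReplacement keeps '$' and '\' in the value from being interpreted.
            String value = Matcher.quoteReplacement(param.getValue());
            result = result.replaceAll(key, value);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> config = new LinkedHashMap<String, String>();
        config.put("LEXER_NAME", "AdmLexer");
        config.put("LEXER_LOGIC", "case '\\n': return TOKEN_$EOL;");
        System.out.println(fill("class [LEXER_NAME] { [LEXER_LOGIC] }", config));
    }
}
```

Placeholders with no matching key, such as an unset `[PACKAGE]`, are left untouched by this sketch.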
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/LexerGeneratorMojo.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/LexerGeneratorMojo.java
new file mode 100644
index 0000000..11ee1d5
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/LexerGeneratorMojo.java
@@ -0,0 +1,92 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator;
+
+import edu.uci.ics.asterix.lexergenerator.LexerGenerator;
+import java.io.BufferedReader;
+import java.io.FileNotFoundException;
+import java.io.FileReader;
+import java.io.IOException;
+import java.util.HashMap;
+import org.apache.maven.plugin.AbstractMojo;
+import org.apache.maven.plugin.MojoExecutionException;
+
+import java.io.File;
+
+/**
+ * @goal generate-lexer
+ * @phase generate-sources
+ * @requiresDependencyResolution compile
+ */
+public class LexerGeneratorMojo extends AbstractMojo {
+ /**
+ * parameter injected from pom.xml
+ *
+ * @parameter
+ * @required
+ */
+ private File grammarFile;
+
+ /**
+ * parameter injected from pom.xml
+ *
+ * @parameter
+ * @required
+ */
+ private File outputDir;
+
+ public void execute() throws MojoExecutionException {
+ LexerGenerator lexer = new LexerGenerator(getLog());
+ HashMap<String, String> config = new HashMap<String, String>();
+ getLog().info("--- Lexer Generator Maven Plugin - started with grammarFile: " + grammarFile.toString());
+ try {
+ String input = readFile(grammarFile);
+ config.put("OUTPUT_DIR", outputDir.toString());
+ boolean tokens = false;
+ for (String line : input.split("\r?\n")) {
+ line = line.trim();
+ if (line.length() == 0 || line.charAt(0) == '#')
+ continue;
+ if (tokens == false && !line.equals("TOKENS:")) {
+ config.put(line.split("\\s*:\\s*")[0], line.split("\\s*:\\s*")[1]);
+ } else if (line.equals("TOKENS:")) {
+ tokens = true;
+ } else {
+ lexer.addToken(line);
+ }
+ }
+ lexer.generateLexer(config);
+ } catch (Throwable e) {
+ throw new MojoExecutionException("Error while generating lexer", e);
+ }
+ String parsedGrammar = lexer.printParsedGrammar();
+ getLog().info("--- Generated grammar:\n" + parsedGrammar);
+ }
+
+ private String readFile(File file) throws FileNotFoundException, IOException {
+ StringBuffer fileData = new StringBuffer(1000);
+ BufferedReader reader = new BufferedReader(new FileReader(file));
+ char[] buf = new char[1024];
+ int numRead = 0;
+ while ((numRead = reader.read(buf)) != -1) {
+ String readData = String.valueOf(buf, 0, numRead);
+ fileData.append(readData);
+ buf = new char[1024];
+ }
+ reader.close();
+ return fileData.toString();
+ }
+
+}
\ No newline at end of file
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/LexerNode.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/LexerNode.java
new file mode 100644
index 0000000..7b8d059
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/LexerNode.java
@@ -0,0 +1,243 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator;
+
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.LinkedHashMap;
+import java.util.Map;
+import java.util.Set;
+
+import edu.uci.ics.asterix.lexergenerator.rules.*;
+
+public class LexerNode {
+ private static String TOKEN_PREFIX = "TOKEN_";
+ private LinkedHashMap<Rule, LexerNode> actions = new LinkedHashMap<Rule, LexerNode>();
+ private String finalTokenName;
+ private Set<String> ongoingParsing = new HashSet<String>();
+
+ public LexerNode clone() {
+ LexerNode node = new LexerNode();
+ node.finalTokenName = this.finalTokenName;
+ for (Map.Entry<Rule, LexerNode> entry : this.actions.entrySet()) {
+ node.actions.put(entry.getKey().clone(), entry.getValue().clone());
+ }
+ for (String ongoing : this.ongoingParsing) {
+ node.ongoingParsing.add(ongoing);
+ }
+ return node;
+ }
+
+ public void add(Rule newRule) {
+ if (actions.get(newRule) == null) {
+ actions.put(newRule, new LexerNode());
+ }
+ }
+
+ public void append(Rule newRule) {
+ if (actions.size() == 0) {
+ add(newRule);
+ } else {
+ for (Map.Entry<Rule, LexerNode> action : actions.entrySet()) {
+ action.getValue().append(newRule);
+ }
+ if (actions.containsKey(new RuleEpsilon())) {
+ actions.remove(new RuleEpsilon());
+ add(newRule);
+ }
+ }
+ }
+
+ public void merge(LexerNode newNode) throws Exception {
+ for (Map.Entry<Rule, LexerNode> action : newNode.actions.entrySet()) {
+ if (this.actions.get(action.getKey()) == null) {
+ this.actions.put(action.getKey(), action.getValue());
+ } else {
+ this.actions.get(action.getKey()).merge(action.getValue());
+ }
+ }
+ if (newNode.finalTokenName != null) {
+ if (this.finalTokenName == null) {
+ this.finalTokenName = newNode.finalTokenName;
+ } else {
+ throw new Exception("Rule conflict between: " + this.finalTokenName + " and " + newNode.finalTokenName);
+ }
+ }
+ for (String ongoing : newNode.ongoingParsing) {
+ this.ongoingParsing.add(ongoing);
+ }
+ }
+
+ public void append(LexerNode node) throws Exception {
+ for (Map.Entry<Rule, LexerNode> action : actions.entrySet()) {
+ if (action.getKey() instanceof RuleEpsilon)
+ continue;
+ action.getValue().append(node);
+ }
+ if (actions.containsKey(new RuleEpsilon())) {
+ actions.remove(new RuleEpsilon());
+ merge(node.clone());
+ }
+ if (actions.size() == 0 || finalTokenName != null) {
+ finalTokenName = null;
+ merge(node.clone());
+ }
+ }
+
+ public void appendTokenName(String name) {
+ if (actions.size() == 0) {
+ this.finalTokenName = name;
+ } else {
+ ongoingParsing.add(TOKEN_PREFIX + name);
+ for (Map.Entry<Rule, LexerNode> action : actions.entrySet()) {
+ action.getValue().appendTokenName(name);
+ }
+ }
+ }
+
+ public LexerNode removeTokensName() {
+ this.finalTokenName = null;
+ this.ongoingParsing.clear();
+ for (Map.Entry<Rule, LexerNode> action : actions.entrySet()) {
+ action.getValue().removeTokensName();
+ }
+ return this;
+ }
+
+ public String toString() {
+ StringBuilder result = new StringBuilder();
+ if (finalTokenName != null)
+ result.append("! ");
+ if (actions.size() == 1)
+ result.append(actions.keySet().toArray()[0].toString() + actions.values().toArray()[0].toString());
+ if (actions.size() > 1) {
+ result.append(" ( ");
+ for (Map.Entry<Rule, LexerNode> action : actions.entrySet()) {
+ if (result.length() != 3) {
+ result.append(" || ");
+ }
+ result.append(action.getKey().toString());
+ result.append(action.getValue().toString());
+ }
+ result.append(" ) ");
+ }
+ return result.toString();
+ }
+
+ public String toJava() {
+ StringBuffer result = new StringBuffer();
+ if (numberOfRuleChar() > 2) {
+ result.append(toJavaSingleCharRules());
+ result.append(toJavaComplexRules(false));
+ } else {
+ result.append(toJavaComplexRules(true));
+ }
+ if (this.finalTokenName != null) {
+ result.append("return " + TOKEN_PREFIX + finalTokenName + ";\n");
+ } else if (ongoingParsing != null) {
+ String ongoingParsingArgs = collectionJoin(ongoingParsing, ',');
+ result.append("return parseError(" + ongoingParsingArgs + ");\n");
+ }
+ return result.toString();
+ }
+
+ private int numberOfRuleChar() {
+ int singleCharRules = 0;
+ for (Map.Entry<Rule, LexerNode> action : actions.entrySet()) {
+ if (action.getKey() instanceof RuleChar)
+ singleCharRules++;
+ }
+ return singleCharRules;
+ }
+
+ private String toJavaSingleCharRules() {
+ StringBuffer result = new StringBuffer();
+ result.append("switch(currentChar){\n");
+ for (Map.Entry<Rule, LexerNode> action : actions.entrySet()) {
+ if (action.getKey() instanceof RuleChar) {
+ RuleChar rule = (RuleChar) action.getKey();
+ result.append("case '" + rule.expectedChar() + "':\n");
+ result.append(rule.javaAction()).append("\n");
+ result.append(action.getValue().toJava());
+ }
+ }
+ result.append("}\n");
+ return result.toString();
+ }
+
+ private String toJavaComplexRules(boolean all) {
+ StringBuffer result = new StringBuffer();
+ for (Map.Entry<Rule, LexerNode> action : actions.entrySet()) {
+ if (!all && action.getKey() instanceof RuleChar)
+ continue;
+ if (action.getKey() instanceof RuleEpsilon)
+ continue;
+ String act = action.getKey().javaAction();
+ if (act.length() > 0) {
+ act = "\n" + act;
+ }
+ result.append(action.getKey().javaMatch(act + "\n" + action.getValue().toJava()));
+ }
+ return result.toString();
+ }
+
+ public void expandFirstAction(LinkedHashMap<String, Token> tokens) throws Exception {
+ for (Map.Entry<Rule, LexerNode> action : actions.entrySet()) {
+ Rule first = action.getKey();
+ if (first instanceof RulePartial) {
+ if (tokens.get(((RulePartial) first).getPartial()) == null) {
+ throw new Exception("Cannot find a token used as part of another definition, missing token: "
+ + ((RulePartial) first).getPartial());
+ }
+ actions.remove(first);
+ LexerNode node = tokens.get(((RulePartial) first).getPartial()).getNode().clone();
+ merge(node);
+ }
+ }
+ }
+
+ public Set<String> neededAuxFunctions() {
+ HashSet<String> partials = new HashSet<String>();
+ for (Map.Entry<Rule, LexerNode> action : actions.entrySet()) {
+ Rule rule = action.getKey();
+ if (rule instanceof RulePartial) {
+ partials.add(((RulePartial) rule).getPartial());
+ }
+ partials.addAll(action.getValue().neededAuxFunctions());
+ }
+ return partials;
+ }
+
+ public String toJavaAuxFunction() {
+ String oldFinalTokenName = finalTokenName;
+ if (oldFinalTokenName == null)
+ finalTokenName = "AUX_NOT_FOUND";
+ String result = toJava();
+ finalTokenName = oldFinalTokenName;
+ return result;
+ }
+
+ private String collectionJoin(Collection<String> collection, char c) {
+ StringBuilder ongoingParsingArgs = new StringBuilder();
+ for (String token : collection) {
+ ongoingParsingArgs.append(token);
+ ongoingParsingArgs.append(c);
+ }
+ if (collection.size() > 0) {
+ ongoingParsingArgs.deleteCharAt(ongoingParsingArgs.length() - 1);
+ }
+ return ongoingParsingArgs.toString();
+ }
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/NodeChainFactory.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/NodeChainFactory.java
new file mode 100644
index 0000000..941f822
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/NodeChainFactory.java
@@ -0,0 +1,43 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator;
+
+import java.util.HashMap;
+
+import edu.uci.ics.asterix.lexergenerator.rulegenerators.*;
+
+public class NodeChainFactory {
+ static private HashMap<String, RuleGenerator> ruleGenerators = new HashMap<String, RuleGenerator>();
+
+ static {
+ ruleGenerators.put("char", new RuleGeneratorChar());
+ ruleGenerators.put("string", new RuleGeneratorString());
+ ruleGenerators.put("anythingUntil", new RuleGeneratorAnythingUntil());
+ ruleGenerators.put("signOrNothing", new RuleGeneratorSignOrNothing());
+ ruleGenerators.put("sign", new RuleGeneratorSign());
+ ruleGenerators.put("digitSequence", new RuleGeneratorDigitSequence());
+ ruleGenerators.put("caseInsensitiveChar", new RuleGeneratorCaseInsensitiveChar());
+ ruleGenerators.put("charOrNothing", new RuleGeneratorCharOrNothing());
+ ruleGenerators.put("token", new RuleGeneratorToken());
+ ruleGenerators.put("nothing", new RuleGeneratorNothing());
+ }
+
+ public static LexerNode create(String generator, String constructor) throws Exception {
+ constructor = constructor.replace("@", "aux_");
+ if (ruleGenerators.get(generator) == null)
+ throw new Exception("Rule Generator not found for '" + generator + "'");
+ return ruleGenerators.get(generator).generate(constructor);
+ }
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/Token.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/Token.java
new file mode 100644
index 0000000..bb122c2
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/Token.java
@@ -0,0 +1,70 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator;
+
+import java.util.LinkedHashMap;
+import java.util.regex.Matcher;
+import java.util.regex.Pattern;
+
+public class Token {
+ private String userDescription;
+ private String name;
+ private LexerNode node;
+
+ public Token(String str, LinkedHashMap<String, Token> tokens) throws Exception {
+ userDescription = str;
+ node = new LexerNode();
+ parse(userDescription, tokens);
+ }
+
+ public String getName() {
+ return name;
+ }
+
+ public LexerNode getNode() {
+ return node;
+ }
+
+ public String toString() {
+ return this.name + " => " + getNode().toString();
+ }
+
+ public void merge(Token newToken) throws Exception {
+ node.merge(newToken.getNode());
+ }
+
+ private void parse(String str, LinkedHashMap<String, Token> tokens) throws Exception {
+ Pattern p = Pattern.compile("^(@?\\w+)\\s*=\\s*(.+)");
+ Matcher m = p.matcher(str);
+ if (!m.find())
+ throw new Exception("Token definition not correct: " + str);
+ this.name = m.group(1).replaceAll("@", "aux_");
+ String[] textRules = m.group(2).split("(?<!\\\\),\\s*");
+ for (String textRule : textRules) {
+ Pattern pRule = Pattern.compile("^(\\w+)(\\((.*)\\))?");
+ Matcher mRule = pRule.matcher(textRule);
+ mRule.find();
+ String generator = mRule.group(1);
+ String constructor = mRule.group(3);
+ if (constructor == null)
+ throw new Exception("Error in rule format: " + "\n " + str + " = " + generator + " : " + constructor);
+ constructor = constructor.replace("\\", "");
+ node.append(NodeChainFactory.create(generator, constructor));
+ node.expandFirstAction(tokens);
+ }
+ node.appendTokenName(name);
+ }
+
+}
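Review note on Token.parse: the two regular expressions above define the token definition syntax, `NAME = generator(arg), generator(arg), ...`, where a leading `@` marks an auxiliary token and `\,` escapes a comma inside an argument. The sketch below isolates that splitting step using the same patterns. It is only an illustration: the class `TokenDefinitionDemo`, the method `describe`, and the sample definition are hypothetical and not part of this commit, and the real code passes the parts to NodeChainFactory instead of returning strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the commit: mirrors the parsing in Token.parse.
class TokenDefinitionDemo {
    // Same patterns as Token.parse.
    private static final Pattern DEFINITION = Pattern.compile("^(@?\\w+)\\s*=\\s*(.+)");
    private static final Pattern RULE = Pattern.compile("^(\\w+)(\\((.*)\\))?");

    // Returns the token name followed by one "generator:argument" entry per rule.
    static List<String> describe(String definition) {
        Matcher m = DEFINITION.matcher(definition);
        if (!m.find())
            throw new IllegalArgumentException("Token definition not correct: " + definition);
        List<String> parts = new ArrayList<>();
        // "@NAME" becomes "aux_NAME", as in Token.parse.
        parts.add("name=" + m.group(1).replaceAll("@", "aux_"));
        // Split on commas that are not preceded by a backslash.
        for (String textRule : m.group(2).split("(?<!\\\\),\\s*")) {
            Matcher r = RULE.matcher(textRule);
            if (!r.find() || r.group(3) == null)
                throw new IllegalArgumentException("Error in rule format: " + textRule);
            // Drop the escaping backslashes from the argument.
            parts.add(r.group(1) + ":" + r.group(3).replace("\\", ""));
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(describe("@EXP = char(e), charOrNothing(-), token(DIGITS)"));
    }
}
```

The generator names in the sample (`char`, `charOrNothing`, `token`) are taken from the error messages of the rule generators in this commit. Whether NodeChainFactory accepts exactly these spellings is an assumption.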
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/TokenAux.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/TokenAux.java
new file mode 100644
index 0000000..a9c7ffc
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/TokenAux.java
@@ -0,0 +1,25 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator;
+
+import java.util.LinkedHashMap;
+
+public class TokenAux extends Token {
+
+ public TokenAux(String str, LinkedHashMap<String, Token> tokens) throws Exception {
+ super(str, tokens);
+ }
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGenerator.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGenerator.java
new file mode 100644
index 0000000..3733746
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGenerator.java
@@ -0,0 +1,21 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rulegenerators;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+
+public interface RuleGenerator {
+ public LexerNode generate(String input) throws Exception;
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorAnythingUntil.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorAnythingUntil.java
new file mode 100644
index 0000000..b14eb3e
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorAnythingUntil.java
@@ -0,0 +1,31 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rulegenerators;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+import edu.uci.ics.asterix.lexergenerator.rules.RuleAnythingUntil;
+
+public class RuleGeneratorAnythingUntil implements RuleGenerator {
+
+ @Override
+ public LexerNode generate(String input) throws Exception {
+ LexerNode result = new LexerNode();
+ if (input == null || input.length() != 1)
+ throw new Exception("Wrong rule format for generator anythingUntil: " + input);
+ result.append(new RuleAnythingUntil(input.charAt(0)));
+ return result;
+ }
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorCaseInsensitiveChar.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorCaseInsensitiveChar.java
new file mode 100644
index 0000000..b789f59
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorCaseInsensitiveChar.java
@@ -0,0 +1,34 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rulegenerators;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+import edu.uci.ics.asterix.lexergenerator.rules.RuleChar;
+
+public class RuleGeneratorCaseInsensitiveChar implements RuleGenerator {
+
+ @Override
+ public LexerNode generate(String input) throws Exception {
+ LexerNode result = new LexerNode();
+ if (input == null || input.length() != 1)
+ throw new Exception("Wrong rule format for generator caseInsensitiveChar: " + input);
+ char cl = Character.toLowerCase(input.charAt(0));
+ char cu = Character.toUpperCase(cl);
+ result.add(new RuleChar(cl));
+ result.add(new RuleChar(cu));
+ return result;
+ }
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorChar.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorChar.java
new file mode 100644
index 0000000..0b830e6
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorChar.java
@@ -0,0 +1,31 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rulegenerators;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+import edu.uci.ics.asterix.lexergenerator.rules.RuleChar;
+
+public class RuleGeneratorChar implements RuleGenerator {
+
+ @Override
+ public LexerNode generate(String input) throws Exception {
+ LexerNode result = new LexerNode();
+ if (input == null || input.length() != 1)
+ throw new Exception("Wrong rule format for generator char: " + input);
+ result.append(new RuleChar(input.charAt(0)));
+ return result;
+ }
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorCharOrNothing.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorCharOrNothing.java
new file mode 100644
index 0000000..d01ff7d
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorCharOrNothing.java
@@ -0,0 +1,33 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rulegenerators;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+import edu.uci.ics.asterix.lexergenerator.rules.RuleChar;
+import edu.uci.ics.asterix.lexergenerator.rules.RuleEpsilon;
+
+public class RuleGeneratorCharOrNothing implements RuleGenerator {
+
+ @Override
+ public LexerNode generate(String input) throws Exception {
+ LexerNode result = new LexerNode();
+ if (input == null || input.length() != 1)
+ throw new Exception("Wrong rule format for generator charOrNothing: " + input);
+ result.add(new RuleChar(input.charAt(0)));
+ result.add(new RuleEpsilon());
+ return result;
+ }
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorDigitSequence.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorDigitSequence.java
new file mode 100644
index 0000000..d067ee7
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorDigitSequence.java
@@ -0,0 +1,29 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rulegenerators;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+import edu.uci.ics.asterix.lexergenerator.rules.RuleDigitSequence;
+
+public class RuleGeneratorDigitSequence implements RuleGenerator {
+
+ @Override
+ public LexerNode generate(String input) throws Exception {
+ LexerNode result = new LexerNode();
+ result.append(new RuleDigitSequence());
+ return result;
+ }
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorNothing.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorNothing.java
new file mode 100644
index 0000000..fec06a1
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorNothing.java
@@ -0,0 +1,29 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rulegenerators;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+import edu.uci.ics.asterix.lexergenerator.rules.RuleEpsilon;
+
+public class RuleGeneratorNothing implements RuleGenerator {
+
+ @Override
+ public LexerNode generate(String input) throws Exception {
+ LexerNode node = new LexerNode();
+ node.add(new RuleEpsilon());
+ return node;
+ }
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorSign.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorSign.java
new file mode 100644
index 0000000..0160f09
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorSign.java
@@ -0,0 +1,30 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rulegenerators;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+import edu.uci.ics.asterix.lexergenerator.rules.RuleChar;
+
+public class RuleGeneratorSign implements RuleGenerator {
+
+ @Override
+ public LexerNode generate(String input) throws Exception {
+ LexerNode result = new LexerNode();
+ result.add(new RuleChar('+'));
+ result.add(new RuleChar('-'));
+ return result;
+ }
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorSignOrNothing.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorSignOrNothing.java
new file mode 100644
index 0000000..7c4297d
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorSignOrNothing.java
@@ -0,0 +1,32 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rulegenerators;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+import edu.uci.ics.asterix.lexergenerator.rules.RuleChar;
+import edu.uci.ics.asterix.lexergenerator.rules.RuleEpsilon;
+
+public class RuleGeneratorSignOrNothing implements RuleGenerator {
+
+ @Override
+ public LexerNode generate(String input) throws Exception {
+ LexerNode result = new LexerNode();
+ result.add(new RuleChar('+'));
+ result.add(new RuleChar('-'));
+ result.add(new RuleEpsilon());
+ return result;
+ }
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorString.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorString.java
new file mode 100644
index 0000000..eb0471b
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorString.java
@@ -0,0 +1,33 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rulegenerators;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+import edu.uci.ics.asterix.lexergenerator.rules.RuleChar;
+
+public class RuleGeneratorString implements RuleGenerator {
+
+ @Override
+ public LexerNode generate(String input) {
+ LexerNode result = new LexerNode();
+ if (input == null)
+ return result;
+ for (int i = 0; i < input.length(); i++) {
+ result.append(new RuleChar(input.charAt(i)));
+ }
+ return result;
+ }
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorToken.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorToken.java
new file mode 100644
index 0000000..b4c23d8
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorToken.java
@@ -0,0 +1,31 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rulegenerators;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+import edu.uci.ics.asterix.lexergenerator.rules.RulePartial;
+
+public class RuleGeneratorToken implements RuleGenerator {
+
+ @Override
+ public LexerNode generate(String input) throws Exception {
+ if (input == null || input.length() == 0)
+ throw new Exception("Wrong rule format for generator token : " + input);
+ LexerNode node = new LexerNode();
+ node.add(new RulePartial(input));
+ return node;
+ }
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/Rule.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/Rule.java
new file mode 100644
index 0000000..01cd1d5
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/Rule.java
@@ -0,0 +1,29 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rules;
+
+public interface Rule {
+ public int hashCode();
+
+ public boolean equals(Object o);
+
+ public String toString();
+
+ public String javaAction();
+
+ public String javaMatch(String action);
+
+ public Rule clone();
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RuleAnythingUntil.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RuleAnythingUntil.java
new file mode 100644
index 0000000..8d45835
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RuleAnythingUntil.java
@@ -0,0 +1,68 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rules;
+
+public class RuleAnythingUntil implements Rule {
+
+ private char expected;
+
+ public RuleAnythingUntil clone() {
+ return new RuleAnythingUntil(expected);
+ }
+
+ public RuleAnythingUntil(char expected) {
+ this.expected = expected;
+ }
+
+ @Override
+ public String toString() {
+ return " .* " + String.valueOf(expected);
+ }
+
+ @Override
+ public int hashCode() {
+ return 10 * (int) expected;
+ }
+
+ @Override
+ public boolean equals(Object o) {
+ if (o == null)
+ return false;
+ if (o instanceof RuleAnythingUntil) {
+ if (((RuleAnythingUntil) o).expected == this.expected) {
+ return true;
+ }
+ }
+ return false;
+ }
+
+ @Override
+ public String javaAction() {
+ return "currentChar = readNextChar();";
+ }
+
+ @Override
+ public String javaMatch(String action) {
+ StringBuilder result = new StringBuilder();
+ result.append("boolean escaped = false;");
+ result.append("while (currentChar!='").append(expected).append("' || escaped)");
+ result.append("{\nif(!escaped && currentChar=='\\\\\\\\'){escaped=true;}\nelse {escaped=false;}\ncurrentChar = readNextChar();\n}");
+ result.append("\nif (currentChar=='").append(expected).append("'){");
+ result.append(action);
+ result.append("}\n");
+ return result.toString();
+ }
+
+}
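Review note on RuleAnythingUntil.javaMatch: the emitted loop consumes characters until it reaches the expected delimiter, skipping delimiters that are escaped with a backslash. The sketch below reproduces that escape handling over a plain String so the behaviour can be checked in isolation. The class `AnythingUntilDemo` and the method `scanUntil` are hypothetical. Unlike the generated code, which relies on readNextChar(), the sketch stops at the end of the input and returns -1.

```java
// Hypothetical stand-alone version of the loop emitted by RuleAnythingUntil.javaMatch.
class AnythingUntilDemo {

    // Returns the index of the first unescaped occurrence of `expected`
    // at or after `start`, or -1 if the input ends first.
    static int scanUntil(String input, int start, char expected) {
        boolean escaped = false;
        for (int pos = start; pos < input.length(); pos++) {
            char c = input.charAt(pos);
            if (c == expected && !escaped)
                return pos;
            // A backslash escapes the next character unless it is itself escaped.
            escaped = !escaped && c == '\\';
        }
        return -1;
    }

    public static void main(String[] args) {
        // Input is: ab\"c"d  (the first quote is escaped)
        System.out.println(scanUntil("ab\\\"c\"d", 0, '"'));
    }
}
```

As in the generated loop, an escaped backslash does not escape the character after it, so a delimiter that follows `\\` still ends the scan.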
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RuleChar.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RuleChar.java
new file mode 100644
index 0000000..0e53374
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RuleChar.java
@@ -0,0 +1,70 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rules;
+
+public class RuleChar implements Rule {
+
+ private char expected;
+
+ public RuleChar clone() {
+ return new RuleChar(expected);
+ }
+
+ public RuleChar(char expected) {
+ this.expected = expected;
+ }
+
+ @Override
+ public String toString() {
+ return String.valueOf(expected);
+ }
+
+ public char expectedChar() {
+ return expected;
+ }
+
+ @Override
+ public int hashCode() {
+ return (int) expected;
+ }
+
+ @Override
+ public boolean equals(Object o) {
+ if (o == null)
+ return false;
+ if (o instanceof RuleChar) {
+ if (((RuleChar) o).expected == this.expected) {
+ return true;
+ }
+ }
+ return false;
+ }
+
+ @Override
+ public String javaAction() {
+ return "currentChar = readNextChar();";
+ }
+
+ @Override
+ public String javaMatch(String action) {
+ StringBuilder result = new StringBuilder();
+ result.append("if (currentChar=='");
+ result.append(expected);
+ result.append("'){");
+ result.append(action);
+ result.append("}");
+ return result.toString();
+ }
+}
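Review note on RuleChar: javaMatch(action) wraps an action in an `if` on the current character, and javaAction() advances the input. A chain of RuleChar instances, such as the one RuleGeneratorString builds, would therefore plausibly expand into nested `if` blocks. The sketch below shows that expansion under the assumption that each rule's match wraps its own action followed by the code of the next rule. LexerNode is not shown in this part of the diff, so the composition order is a guess, and `NestedMatchDemo`, `matchChar` and `emit` are hypothetical names.

```java
// Hypothetical illustration of how a chain of RuleChar matches could nest.
class NestedMatchDemo {

    // Same text that RuleChar.javaMatch produces for one character.
    static String matchChar(char expected, String action) {
        return "if (currentChar=='" + expected + "'){" + action + "}";
    }

    // Builds the nested code for a literal, from the last character outwards.
    // Assumption: each level runs RuleChar.javaAction() before the inner code.
    static String emit(String literal, String finalAction) {
        String body = finalAction;
        for (int i = literal.length() - 1; i >= 0; i--) {
            body = matchChar(literal.charAt(i), "currentChar = readNextChar();" + body);
        }
        return body;
    }

    public static void main(String[] args) {
        System.out.println(emit("ab", "return TOKEN_AB;"));
    }
}
```

Note that matchChar, like RuleChar.javaMatch, inserts the character verbatim into a char literal, so a quote or backslash would need escaping before the emitted code could compile.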
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RuleDigitSequence.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RuleDigitSequence.java
new file mode 100644
index 0000000..13381e0
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RuleDigitSequence.java
@@ -0,0 +1,57 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rules;
+
+public class RuleDigitSequence implements Rule {
+
+ public RuleDigitSequence clone() {
+ return new RuleDigitSequence();
+ }
+
+ @Override
+ public String toString() {
+ return " [0-9]+ ";
+ }
+
+ @Override
+ public int hashCode() {
+ return 1;
+ }
+
+ @Override
+ public boolean equals(Object o) {
+ if (o == null)
+ return false;
+ if (o instanceof RuleDigitSequence) {
+ return true;
+ }
+ return false;
+ }
+
+ @Override
+ public String javaAction() {
+ return "";
+ }
+
+ @Override
+ public String javaMatch(String action) {
+ StringBuilder result = new StringBuilder();
+ result.append("if(currentChar >= '0' && currentChar<='9'){" + "\ncurrentChar = readNextChar();"
+ + "\nwhile(currentChar >= '0' && currentChar<='9'){" + "\ncurrentChar = readNextChar();" + "\n}\n");
+ result.append(action);
+ result.append("\n}");
+ return result.toString();
+ }
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RuleEpsilon.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RuleEpsilon.java
new file mode 100644
index 0000000..41b7535
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RuleEpsilon.java
@@ -0,0 +1,54 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rules;
+
+public class RuleEpsilon implements Rule {
+
+ public RuleEpsilon clone() {
+ return new RuleEpsilon();
+ }
+
+ @Override
+ public String toString() {
+ return "?";
+ }
+
+ @Override
+ public int hashCode() {
+ return 0;
+ }
+
+ @Override
+ public boolean equals(Object o) {
+ if (o == null)
+ return false;
+ if (o instanceof RuleEpsilon) {
+ return true;
+ }
+ return false;
+ }
+
+ @Override
+ public String javaAction() {
+ return "";
+ }
+
+ @Override
+ public String javaMatch(String action) {
+ StringBuilder result = new StringBuilder();
+ result.append("{").append(action).append("}");
+ return result.toString();
+ }
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RulePartial.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RulePartial.java
new file mode 100644
index 0000000..89caf4f
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RulePartial.java
@@ -0,0 +1,69 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rules;
+
+public class RulePartial implements Rule {
+
+ private String partialName;
+
+ public RulePartial clone() {
+ return new RulePartial(partialName);
+ }
+
+ public RulePartial(String expected) {
+ this.partialName = expected;
+ }
+
+ public String getPartial() {
+ return this.partialName;
+ }
+
+ @Override
+ public String toString() {
+ return partialName;
+ }
+
+ @Override
+ public int hashCode() {
+ return partialName.hashCode();
+ }
+
+ @Override
+ public boolean equals(Object o) {
+ if (o == null)
+ return false;
+ if (o instanceof RulePartial) {
+ if (((RulePartial) o).partialName.equals(this.partialName)) {
+ return true;
+ }
+ }
+ return false;
+ }
+
+ @Override
+ public String javaAction() {
+ return "";
+ }
+
+ @Override
+ public String javaMatch(String action) {
+ StringBuilder result = new StringBuilder();
+ result.append("if (parse_" + partialName + "(currentChar)==TOKEN_" + partialName + "){");
+ result.append(action);
+ result.append("}");
+ return result.toString();
+ }
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/resources/Lexer.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/resources/Lexer.java
new file mode 100644
index 0000000..8cee79d
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/resources/Lexer.java
@@ -0,0 +1,219 @@
+package [PACKAGE];
+
+import java.io.IOException;
+import [PACKAGE].[LEXER_NAME]Exception;
+
+public class [LEXER_NAME] {
+
+ public static final int
+ TOKEN_EOF = 0, TOKEN_AUX_NOT_FOUND = 1 [TOKENS_CONSTANTS];
+
+ // Human representation of tokens. Useful for debugging.
+ // It is possible to convert a TOKEN_CONSTANT to its image through
+ // [LEXER_NAME].tokenKindToString(TOKEN_CONSTANT);
+ private static final String[] tokenImage = {
+ "<EOF>", "<AUX_NOT_FOUND>" [TOKENS_IMAGES]
+ };
+
+ private static final char EOF_CHAR = 4;
+ protected java.io.Reader inputStream;
+ protected int column;
+ protected int line;
+ protected boolean prevCharIsCR;
+ protected boolean prevCharIsLF;
+ protected char[] buffer;
+ protected int bufsize;
+ protected int bufpos;
+ protected int tokenBegin;
+ protected int endOf_USED_Buffer;
+ protected int endOf_UNUSED_Buffer;
+ protected int maxUnusedBufferSize;
+
+// ================================================================================
+// Auxiliary functions. Can parse the tokens used in the grammar as partial/auxiliary
+// ================================================================================
+
+ [LEXER_AUXFUNCTIONS]
+
+// ================================================================================
+// Main method. Return a TOKEN_CONSTANT
+// ================================================================================
+
+ public int next() throws [LEXER_NAME]Exception, IOException{
+ char currentChar = buffer[bufpos];
+ while (currentChar == ' ' || currentChar=='\t' || currentChar == '\n' || currentChar=='\r')
+ currentChar = readNextChar();
+ tokenBegin = bufpos;
+ if (currentChar==EOF_CHAR) return TOKEN_EOF;
+
+ [LEXER_LOGIC]
+ }
+
+// ================================================================================
+// Public interface
+// ================================================================================
+
+ public [LEXER_NAME](java.io.Reader stream) throws IOException{
+ reInit(stream);
+ }
+
+ public void reInit(java.io.Reader stream) throws IOException{
+ done();
+ inputStream = stream;
+ bufsize = 4096;
+ line = 1;
+ column = 0;
+ bufpos = -1;
+ endOf_UNUSED_Buffer = bufsize;
+ endOf_USED_Buffer = 0;
+ prevCharIsCR = false;
+ prevCharIsLF = false;
+ buffer = new char[bufsize];
+ tokenBegin = -1;
+ maxUnusedBufferSize = 4096/2;
+ readNextChar();
+ }
+
+ public String getLastTokenImage() {
+ if (bufpos >= tokenBegin)
+ return new String(buffer, tokenBegin, bufpos - tokenBegin);
+ else
+ return new String(buffer, tokenBegin, bufsize - tokenBegin) +
+ new String(buffer, 0, bufpos);
+ }
+
+ public static String tokenKindToString(int token) {
+ return tokenImage[token];
+ }
+
+ public void done(){
+ buffer = null;
+ }
+
+// ================================================================================
+// Parse error management
+// ================================================================================
+
+ protected int parseError(String reason) throws [LEXER_NAME]Exception {
+ StringBuilder message = new StringBuilder();
+ message.append(reason).append("\n");
+ message.append("Line: ").append(line).append("\n");
+ message.append("Column: ").append(column).append("\n");
+ throw new [LEXER_NAME]Exception(message.toString());
+ }
+
+ protected int parseError(int ... tokens) throws [LEXER_NAME]Exception {
+ StringBuilder message = new StringBuilder();
+ message.append("Error while parsing. ");
+ message.append(" Line: ").append(line);
+ message.append(" Column: ").append(column);
+ message.append(" Expecting:");
+ for (int tokenId : tokens){
+ message.append(" ").append([LEXER_NAME].tokenKindToString(tokenId));
+ }
+ throw new [LEXER_NAME]Exception(message.toString());
+ }
+
+ protected void updateLineColumn(char c){
+ column++;
+
+ if (prevCharIsLF)
+ {
+ prevCharIsLF = false;
+ line += (column = 1);
+ }
+ else if (prevCharIsCR)
+ {
+ prevCharIsCR = false;
+ if (c == '\n')
+ {
+ prevCharIsLF = true;
+ }
+ else
+ {
+ line += (column = 1);
+ }
+ }
+
+ if (c=='\r') {
+ prevCharIsCR = true;
+ } else if(c == '\n') {
+ prevCharIsLF = true;
+ }
+ }
+
+// ================================================================================
+// Read data, buffer management. It uses a circular (and expandable) buffer
+// ================================================================================
+
+ protected char readNextChar() throws IOException {
+ if (++bufpos >= endOf_USED_Buffer)
+ fillBuff();
+ char c = buffer[bufpos];
+ updateLineColumn(c);
+ return c;
+ }
+
+ protected boolean fillBuff() throws IOException {
+ if (endOf_UNUSED_Buffer == endOf_USED_Buffer) // If no more unused buffer space
+ {
+ if (endOf_UNUSED_Buffer == bufsize) // -- If the previous unused space was
+ { // -- at the end of the buffer
+ if (tokenBegin > maxUnusedBufferSize) // -- -- If the first N bytes before
+ { // the current token are enough
+ bufpos = endOf_USED_Buffer = 0; // -- -- -- setup buffer to use that fragment
+ endOf_UNUSED_Buffer = tokenBegin;
+ }
+ else if (tokenBegin < 0) // -- -- If no token yet
+ bufpos = endOf_USED_Buffer = 0; // -- -- -- reuse the whole buffer
+ else
+ ExpandBuff(false); // -- -- Otherwise expand buffer after its end
+ }
+ else if (endOf_UNUSED_Buffer > tokenBegin) // If the endOf_UNUSED_Buffer is after the token
+ endOf_UNUSED_Buffer = bufsize; // -- set endOf_UNUSED_Buffer to the end of the buffer
+ else if ((tokenBegin - endOf_UNUSED_Buffer) < maxUnusedBufferSize)
+ { // If between endOf_UNUSED_Buffer and the token
+ ExpandBuff(true); // there is NOT enough space expand the buffer
+ } // reorganizing it
+ else
+ endOf_UNUSED_Buffer = tokenBegin; // Otherwise there is enough space at the start
+ } // so we set the buffer to use that fragment
+ int i;
+ if ((i = inputStream.read(buffer, endOf_USED_Buffer, endOf_UNUSED_Buffer - endOf_USED_Buffer)) == -1)
+ {
+ inputStream.close();
+ buffer[endOf_USED_Buffer]=(char)EOF_CHAR;
+ endOf_USED_Buffer++;
+ return false;
+ }
+ else
+ endOf_USED_Buffer += i;
+ return true;
+ }
+
+
+ protected void ExpandBuff(boolean wrapAround)
+ {
+ char[] newbuffer = new char[bufsize + maxUnusedBufferSize];
+
+ try {
+ if (wrapAround) {
+ System.arraycopy(buffer, tokenBegin, newbuffer, 0, bufsize - tokenBegin);
+ System.arraycopy(buffer, 0, newbuffer, bufsize - tokenBegin, bufpos);
+ buffer = newbuffer;
+ endOf_USED_Buffer = (bufpos += (bufsize - tokenBegin));
+ }
+ else {
+ System.arraycopy(buffer, tokenBegin, newbuffer, 0, bufsize - tokenBegin);
+ buffer = newbuffer;
+ endOf_USED_Buffer = (bufpos -= tokenBegin);
+ }
+ } catch (Throwable t) {
+ throw new Error(t.getMessage());
+ }
+
+ bufsize += maxUnusedBufferSize;
+ endOf_UNUSED_Buffer = bufsize;
+ tokenBegin = 0;
+ }
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/resources/LexerException.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/resources/LexerException.java
new file mode 100644
index 0000000..76aa8a4
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/resources/LexerException.java
@@ -0,0 +1,13 @@
+package [PACKAGE];
+
+public class [LEXER_NAME]Exception extends Exception {
+
+ public [LEXER_NAME]Exception(String message) {
+ super(message);
+ }
+
+ private static final long serialVersionUID = 1L;
+
+}
+
+
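
Review note: `Lexer.java` and `LexerException.java` are templates whose bracketed placeholders (`[PACKAGE]`, `[LEXER_NAME]`, and so on) are filled in by the generator. The generator code is not shown in this part of the diff, so the following is only a minimal sketch of the substitution idea, assuming plain textual replacement; `TemplateFillSketch` and `fill` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TemplateFillSketch {

    // Replaces every [KEY] placeholder in the template with its value.
    // Assumption: placeholders are substituted as plain text.
    static String fill(String template, Map<String, String> values) {
        String out = template;
        for (Map.Entry<String, String> e : values.entrySet()) {
            out = out.replace("[" + e.getKey() + "]", e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> values = new LinkedHashMap<String, String>();
        values.put("PACKAGE", "com.my.lexer");
        values.put("LEXER_NAME", "MyLexer");
        String template = "package [PACKAGE];\n"
                + "public class [LEXER_NAME]Exception extends Exception {}";
        System.out.println(fill(template, values));
    }
}
```

The values used here are the ones from `default.config`; a real run would take them from the grammar file configured in the plugin.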
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/resources/default.config b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/resources/default.config
new file mode 100644
index 0000000..7efbeb8
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/resources/default.config
@@ -0,0 +1,16 @@
+# LEXER GENERATOR configuration file
+# ---------------------------------------
+# Place *first* the generic configuration
+# then list your grammar.
+
+PACKAGE: com.my.lexer
+LEXER_NAME: MyLexer
+OUTPUT_DIR: output
+
+TOKENS:
+
+BOOLEAN_LIT = string(boolean)
+FALSE_LIT = string(false)
+BOMB_LIT = string(bomb)
+BONSAI_LIT = string(bonsai)
+HELLO_LIT = string(hello)
\ No newline at end of file
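
Review note: the `TOKENS:` section of `default.config` lists one definition per line in the form `NAME = string(literal)`. The sketch below shows one way such lines could be read; it is an assumption for illustration only, covers just the `string(...)` form visible in this file, and is not the plugin's actual parser. `TokenConfigSketch` and `parseStringTokens` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TokenConfigSketch {

    // Collects NAME -> literal for lines of the form NAME = string(literal).
    // Comments, blank lines and "KEY: value" settings are skipped.
    static Map<String, String> parseStringTokens(String config) {
        Map<String, String> tokens = new LinkedHashMap<String, String>();
        for (String raw : config.split("\n")) {
            String line = raw.trim();
            int eq = line.indexOf('=');
            if (line.isEmpty() || line.startsWith("#") || eq < 0) {
                continue;
            }
            String name = line.substring(0, eq).trim();
            String rule = line.substring(eq + 1).trim();
            if (rule.startsWith("string(") && rule.endsWith(")")) {
                tokens.put(name, rule.substring("string(".length(), rule.length() - 1));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String config = "# comment\nPACKAGE: com.my.lexer\nTOKENS:\n\n"
                + "BOOLEAN_LIT = string(boolean)\nBOMB_LIT = string(bomb)\n";
        System.out.println(parseStringTokens(config));
    }
}
```

Note that `boolean`, `bomb` and `bonsai` share the prefix `bo`, so these sample tokens appear chosen to exercise the case where several rules must be merged on a common prefix.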
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/Fixtures.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/Fixtures.java
new file mode 100644
index 0000000..2ed2eaa
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/Fixtures.java
@@ -0,0 +1,100 @@
+package edu.uci.ics.asterix.lexergenerator;
+
+import edu.uci.ics.asterix.lexergenerator.rules.Rule;
+import edu.uci.ics.asterix.lexergenerator.rules.RuleChar;
+
+public class Fixtures {
+ static String token_name = "MYTOKEN";
+ static String token2_name = "MYTOKEN2";
+ static String token_return = "return TOKEN_MYTOKEN;\n";
+ static String token2_return = "return TOKEN_MYTOKEN2;\n";
+ static String token_parseerror = "return parseError(TOKEN_MYTOKEN);\n";
+ static String token_tostring = "! ";
+ static String rule_action = "myaction";
+ static String rule_name = "myrule";
+ static String rule_match = "matchCheck("+rule_name+")";
+ static String rule2_action = "myaction2";
+ static String rule2_name = "myrule2";
+ static String rule2_match = "matchCheck2("+rule2_name+")";
+
+ static public Rule createRule(final String name){
+ return new Rule(){
+ String rule_name = name;
+ String rule_action = "myaction";
+ String rule_match = "matchCheck("+rule_name+")";
+
+ @Override
+ public Rule clone(){
+ return Fixtures.createRule(name+"_clone");
+ }
+
+ @Override
+ public String javaAction() {
+ return rule_action;
+ }
+
+ @Override
+ public String javaMatch(String action) {
+ return rule_match+"{"+action+"}";
+ }
+
+ @Override
+ public String toString(){
+ return rule_name;
+ }
+
+ };
+ }
+
+ static Rule rule = new Rule(){
+
+ public Rule clone(){
+ return null;
+ }
+
+ @Override
+ public String javaAction() {
+ return rule_action;
+ }
+
+ @Override
+ public String javaMatch(String action) {
+ return rule_match+"{"+action+"}";
+ }
+
+ @Override
+ public String toString(){
+ return rule_name;
+ }
+
+ };
+
+ static Rule rule2 = new Rule(){
+
+ public Rule clone(){
+ return null;
+ }
+
+ @Override
+ public String javaAction() {
+ return rule2_action;
+ }
+
+ @Override
+ public String javaMatch(String act) {
+ return rule2_match+"{"+act+"}";
+ }
+
+ @Override
+ public String toString(){
+ return rule2_name;
+ }
+
+ };
+
+ static RuleChar ruleA = new RuleChar('a');
+ static RuleChar ruleB = new RuleChar('b');
+ static RuleChar ruleC = new RuleChar('c');
+ static String ruleABC_action = "currentChar = readNextChar();";
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeAddRuleTest.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeAddRuleTest.java
new file mode 100644
index 0000000..7541124
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeAddRuleTest.java
@@ -0,0 +1,51 @@
+package edu.uci.ics.asterix.lexergenerator;
+
+import static edu.uci.ics.asterix.lexergenerator.Fixtures.*;
+import static org.junit.Assert.*;
+
+import org.junit.Test;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+
+public class LexerNodeAddRuleTest {
+
+ @Test
+ public void NodeRuleRuleNodeNode() {
+ LexerNode node = new LexerNode();
+ node.append(rule);
+ node.add(rule2);
+ node.appendTokenName(token_name);
+ assertEquals(" ( " + rule_name +token_tostring + " || " + rule2_name + token_tostring + " ) ", node.toString());
+ assertEquals(rule_match+"{"
+ +"\n" + rule_action
+ +"\n" +token_return
+ +"}"
+ +rule2_match+"{"
+ +"\n"+rule2_action
+ +"\n"+token_return
+ +"}"
+ +token_parseerror , node.toJava());
+ }
+
+ @Test
+ public void NodeSwitchCase() {
+ LexerNode node = new LexerNode();
+ node.append(ruleA);
+ node.add(ruleB);
+ node.add(ruleC);
+ node.appendTokenName(token_name);
+ assertEquals(" ( a" + token_tostring + " || b" + token_tostring + " || c" + token_tostring + " ) ", node.toString());
+ assertEquals("switch(currentChar){\n" +
+ "case 'a':" +
+ "\n" + ruleABC_action +
+ "\n" + token_return +
+ "case 'b':" +
+ "\n" + ruleABC_action +
+ "\n" + token_return +
+ "case 'c':" +
+ "\n" + ruleABC_action +
+ "\n" + token_return +
+ "}\n"+ token_parseerror , node.toJava());
+ }
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeAppendNodeTest.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeAppendNodeTest.java
new file mode 100644
index 0000000..5151e77
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeAppendNodeTest.java
@@ -0,0 +1,81 @@
+package edu.uci.ics.asterix.lexergenerator;
+
+import static edu.uci.ics.asterix.lexergenerator.Fixtures.*;
+import static org.junit.Assert.*;
+
+import org.junit.Test;
+
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+import edu.uci.ics.asterix.lexergenerator.rules.RuleEpsilon;
+
+public class LexerNodeAppendNodeTest {
+
+ @Test
+ public void AppendIsMergeIfNoActions() throws Exception {
+ LexerNode node = new LexerNode();
+ LexerNode node2 = new LexerNode();
+ node2.append(createRule("rule"));
+ node2.appendTokenName(token_name);
+ node.append(node2);
+ assertEquals("rule_clone! ", node.toString());
+ }
+
+ @Test
+ public void AppendIsAppend() throws Exception {
+ LexerNode node = new LexerNode();
+ node.append(createRule("A"));
+ LexerNode node2 = new LexerNode();
+ node2.append(createRule("rule"));
+ node2.appendTokenName(token_name);
+ node.append(node2);
+ assertEquals("Arule_clone! ", node.toString());
+ }
+
+ @Test
+ public void AppendedNodesAreCloned() throws Exception {
+ LexerNode node = new LexerNode();
+ node.append(createRule("A"));
+ node.appendTokenName(token_name);
+ LexerNode node2 = new LexerNode();
+ node2.append(createRule("B"));
+ node2.appendTokenName(token2_name);
+ node.append(node2);
+ // TODO
+ // assertEquals("A! B_clone! ", node.toString());
+
+ LexerNode node3 = new LexerNode();
+ node3.append(createRule("C"));
+ node3.append(createRule("D"));
+ node3.appendTokenName(token2_name);
+ node.append(node3);
+ // TODO
+ // assertEquals("A! B_clone! C_cloneD_clone! ", node.toString());
+ }
+
+ @Test
+ public void EpsilonRuleDoesNotPropagateAppended() throws Exception {
+ LexerNode node = new LexerNode();
+ node.append(new RuleEpsilon());
+ LexerNode node2 = new LexerNode();
+ node2.append(createRule("A"));
+ node2.appendTokenName(token2_name);
+ node.append(node2);
+ assertEquals("A_clone! ", node.toString());
+ }
+
+ @Test
+ public void EpsilonRuleIsRemovedAndIssueMerge() throws Exception {
+ LexerNode node = new LexerNode();
+ node.append(new RuleEpsilon());
+ LexerNode node2 = new LexerNode();
+ node2.append(createRule("A"));
+ node2.appendTokenName(token2_name);
+ node.append(node2);
+ node.add(new RuleEpsilon());
+ node.append(node2);
+ // TODO
+ // assertEquals(" ( A_clone! A_clone! || A_clone! ) ", node.toString());
+ }
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeAppendRuleTest.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeAppendRuleTest.java
new file mode 100644
index 0000000..84fd292
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeAppendRuleTest.java
@@ -0,0 +1,47 @@
+package edu.uci.ics.asterix.lexergenerator;
+
+import static org.junit.Assert.*;
+
+import org.junit.Test;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+import static edu.uci.ics.asterix.lexergenerator.Fixtures.*;
+
+public class LexerNodeAppendRuleTest {
+ @Test
+ public void SingleNode() {
+ LexerNode node = new LexerNode();
+ node.appendTokenName(token_name);
+ assertEquals(token_tostring, node.toString());
+ assertEquals(token_return, node.toJava());
+ }
+
+ @Test
+ public void NodeRuleNode() {
+ LexerNode node = new LexerNode();
+ node.append(rule);
+ node.appendTokenName(token_name);
+ assertEquals(rule_name+token_tostring, node.toString());
+ assertEquals(rule_match+"{"
+ +"\n"+rule_action
+ +"\n"+token_return
+ +"}"+token_parseerror, node.toJava());
+ }
+
+ @Test
+ public void NodeRuleNodeRuleNode() {
+ LexerNode node = new LexerNode();
+ node.append(rule);
+ node.append(rule2);
+ node.appendTokenName(token_name);
+ assertEquals(rule_name+rule2_name+token_tostring, node.toString());
+ assertEquals(rule_match+"{"
+ +"\n"+rule_action
+ +"\n"+rule2_match+"{"
+ +"\n"+rule2_action
+ +"\n"+token_return
+ +"}"
+ +token_parseerror
+ +"}"+token_parseerror, node.toJava());
+ }
+}
\ No newline at end of file
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeAuxFunctionsTest.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeAuxFunctionsTest.java
new file mode 100644
index 0000000..9f12c00
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeAuxFunctionsTest.java
@@ -0,0 +1,111 @@
+package edu.uci.ics.asterix.lexergenerator;
+
+import static edu.uci.ics.asterix.lexergenerator.Fixtures.*;
+import static org.junit.Assert.*;
+
+import java.util.HashSet;
+import java.util.LinkedHashMap;
+import java.util.Set;
+
+import org.junit.Test;
+
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+import edu.uci.ics.asterix.lexergenerator.Token;
+import edu.uci.ics.asterix.lexergenerator.rules.RuleEpsilon;
+import edu.uci.ics.asterix.lexergenerator.rules.RulePartial;
+
+public class LexerNodeAuxFunctionsTest {
+ String expectedDifferentReturn = "return TOKEN_AUX_NOT_FOUND;\n";
+
+ @Test
+ public void NodeRuleRuleNodeNode() {
+ LexerNode node = new LexerNode();
+ node.append(rule);
+ node.add(rule2);
+ node.appendTokenName(token_name);
+ assertEquals(" ( " + rule_name +token_tostring + " || " + rule2_name + token_tostring + " ) ", node.toString());
+ assertEquals(rule_match+"{"
+ +"\n" + rule_action
+ +"\n" +token_return
+ +"}"
+ +rule2_match+"{"
+ +"\n"+rule2_action
+ +"\n"+token_return
+ +"}"
+ +expectedDifferentReturn , node.toJavaAuxFunction());
+ }
+
+ @Test
+ public void NodeSwitchCase() {
+ LexerNode node = new LexerNode();
+ node.append(ruleA);
+ node.add(ruleB);
+ node.add(ruleC);
+ node.appendTokenName(token_name);
+ assertEquals(" ( a" + token_tostring + " || b" + token_tostring + " || c" + token_tostring + " ) ", node.toString());
+ assertEquals("switch(currentChar){\n" +
+ "case 'a':" +
+ "\n" + ruleABC_action +
+ "\n" + token_return +
+ "case 'b':" +
+ "\n" + ruleABC_action +
+ "\n" + token_return +
+ "case 'c':" +
+ "\n" + ruleABC_action +
+ "\n" + token_return +
+ "}\n"+ expectedDifferentReturn , node.toJavaAuxFunction());
+ }
+
+ @Test
+ public void NodeNeededAuxFunctions() {
+ LexerNode node = new LexerNode();
+ node.append(ruleA);
+ node.add(new RulePartial("token1"));
+ node.append(ruleC);
+ node.append(new RulePartial("token2"));
+ node.appendTokenName(token_name);
+ assertEquals(" ( actoken2! || token1ctoken2! ) ", node.toString());
+ Set<String> expectedNeededAuxFunctions = new HashSet<String>();
+ expectedNeededAuxFunctions.add("token1");
+ expectedNeededAuxFunctions.add("token2");
+ assertEquals(expectedNeededAuxFunctions, node.neededAuxFunctions());
+ }
+
+ @Test(expected=Exception.class)
+ public void NodeExpandFirstActionError() throws Exception {
+ LexerNode node = new LexerNode();
+ node.append(ruleA);
+ node.add(new RulePartial("token1"));
+ node.append(ruleC);
+ node.add(new RuleEpsilon());
+ node.append(new RulePartial("token2"));
+ node.appendTokenName(token_name);
+ assertEquals(" ( actoken2! || token1ctoken2! || token2! ) ", node.toString());
+ LinkedHashMap<String, Token> tokens = new LinkedHashMap<String, Token>();
+ try {
+ node.expandFirstAction(tokens);
+ } catch (Exception e) {
+ assertEquals("Cannot find a token used as part of another definition, missing token: token1", e.getMessage());
+ throw e;
+ }
+ }
+
+ public void NodeExpandFirstAction() throws Exception {
+ LexerNode node = new LexerNode();
+ node.append(ruleA);
+ node.add(new RulePartial("token1"));
+ node.append(ruleC);
+ node.add(new RuleEpsilon());
+ node.append(new RulePartial("token2"));
+ node.appendTokenName(token_name);
+ assertEquals(" ( actoken2! || token1ctoken2! || token2! ) ", node.toString());
+ LinkedHashMap<String, Token> tokens = new LinkedHashMap<String, Token>();
+ Token a = new Token("token1 = string(T1-blabla)", tokens);
+ Token b = new Token("token2 = string(T2-blabla)", tokens);
+ tokens.put("token1", a);
+ tokens.put("token2", b);
+ node.expandFirstAction(tokens);
+ assertEquals(" ( actoken2! || T1-blablactoken2! || T2-blabla! ) ", node.toString());
+ }
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeCloneTest.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeCloneTest.java
new file mode 100644
index 0000000..87e3ff4
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeCloneTest.java
@@ -0,0 +1,56 @@
+package edu.uci.ics.asterix.lexergenerator;
+
+import static edu.uci.ics.asterix.lexergenerator.Fixtures.*;
+import static org.junit.Assert.*;
+
+import org.junit.Test;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+
+public class LexerNodeCloneTest {
+
+ @Test
+ public void Depth1() throws Exception {
+ LexerNode node = new LexerNode();
+ LexerNode newNode = node.clone();
+ assertFalse(node == newNode);
+ }
+
+
+ @Test
+ public void Depth2() throws Exception {
+ LexerNode node = new LexerNode();
+ node.append(createRule("my1"));
+ node.add(createRule("my2"));
+ node.add(ruleA);
+ node.appendTokenName(token_name);
+ LexerNode newNode = node.clone();
+
+ assertEquals(" ( my1! || my2! || a! ) ", node.toString());
+ assertEquals(" ( my1_clone! || my2_clone! || a! ) ", newNode.toString());
+ }
+
+ @Test
+ public void Depth3() throws Exception {
+ LexerNode node = new LexerNode();
+ node.append(createRule("my1"));
+ node.add(createRule("my2"));
+ node.add(ruleA);
+ node.appendTokenName(token_name);
+ LexerNode node2 = new LexerNode();
+ node2.append(createRule("my3"));
+ node2.add(createRule("my4"));
+ node2.add(ruleB);
+ node2.appendTokenName(token2_name);
+ node.append(node2);
+ LexerNode newNode = node.clone();
+ // TODO
+ // assertEquals(" ( my1! ( || my3_clone! || my4_clone! || b! ) " +
+ // " || my2! ( || my3_clone! || my4_clone! || b! ) " +
+ // " || a! ( || my3_clone! || my4_clone! || b! ) ) ", node.toString());
+ // assertEquals(" ( my1_clone! ( || my3_clone_clone! || my4_clone_clone! || b! ) " +
+ // " || my2_clone! ( || my3_clone_clone! || my4_clone_clone! || b! ) " +
+ // " || a! ( || my3_clone_clone! || my4_clone_clone! || b! ) ) ", newNode.toString());
+ }
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeMergeNodeTest.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeMergeNodeTest.java
new file mode 100644
index 0000000..4b22d99
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeMergeNodeTest.java
@@ -0,0 +1,83 @@
+package edu.uci.ics.asterix.lexergenerator;
+
+import static edu.uci.ics.asterix.lexergenerator.Fixtures.*;
+import static org.junit.Assert.*;
+
+import org.junit.Test;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+
+public class LexerNodeMergeNodeTest {
+
+ @Test
+ public void MergeIsAdd() throws Exception {
+ LexerNode node = new LexerNode();
+ node.append(rule);
+ LexerNode node2 = new LexerNode();
+ node2.append(rule2);
+ node2.append(rule);
+ node2.merge(node);
+ node2.appendTokenName(token_name);
+
+ LexerNode expected = new LexerNode();
+ expected.append(rule2);
+ expected.append(rule);
+ expected.add(rule);
+ expected.appendTokenName(token_name);
+
+ assertEquals(expected.toString(), node2.toString());
+ assertEquals(expected.toJava(), node2.toJava());
+ }
+
+ @Test
+ public void MergeTwoToken() throws Exception {
+ LexerNode node = new LexerNode();
+ node.append(rule);
+ node.appendTokenName(token_name);
+ LexerNode node2 = new LexerNode();
+ node2.append(rule2);
+ node2.appendTokenName(token2_name);
+ node.merge(node2);
+
+ assertEquals(" ( "+rule_name+token_tostring+" || "+rule2_name+token_tostring+" ) ", node.toString());
+ assertEquals(rule_match + "{"
+ + "\n" + rule_action
+ + "\n" + token_return
+ +"}"+rule2_match+"{"
+ + "\n" + rule2_action
+ + "\n" + token2_return
+ +"}return parseError(TOKEN_MYTOKEN,TOKEN_MYTOKEN2);\n"
+, node.toJava());
+ }
+
+ @Test(expected=Exception.class)
+ public void MergeConflict() throws Exception {
+ LexerNode node = new LexerNode();
+ node.append(rule);
+ node.appendTokenName(token_name);
+ LexerNode node2 = new LexerNode();
+ node2.append(rule);
+ node2.appendTokenName(token2_name);
+ try {
+ node.merge(node2);
+ } catch (Exception e) {
+ assertEquals("Rule conflict between: "+token_name +" and "+token2_name, e.getMessage());
+ throw e;
+ }
+ }
+
+ @Test
+ public void MergeWithoutConflictWithRemoveTokensName() throws Exception {
+ LexerNode node = new LexerNode();
+ node.append(rule);
+ node.append(rule);
+ node.appendTokenName(token_name);
+ LexerNode node2 = new LexerNode();
+ node2.append(rule);
+ node2.append(rule);
+ node2.appendTokenName(token2_name);
+ node2.removeTokensName();
+ node.merge(node2);
+ assertEquals(rule_name+rule_name+token_tostring, node.toString());
+ }
+}
diff --git a/asterix-maven-plugins/pom.xml b/asterix-maven-plugins/pom.xml
new file mode 100644
index 0000000..0677ffb
--- /dev/null
+++ b/asterix-maven-plugins/pom.xml
@@ -0,0 +1,21 @@
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+ <modelVersion>4.0.0</modelVersion>
+ <groupId>edu.uci.ics.asterix</groupId>
+ <artifactId>asterix-maven-plugins</artifactId>
+ <version>0.1</version>
+ <packaging>pom</packaging>
+
+ <dependencies>
+ <dependency>
+ <groupId>org.apache.maven</groupId>
+ <artifactId>maven-plugin-api</artifactId>
+ <version>2.2.1</version>
+ <type>jar</type>
+ <scope>compile</scope>
+ </dependency>
+ </dependencies>
+
+ <modules>
+ <module>lexer-generator-maven-plugin</module>
+ </modules>
+</project>
diff --git a/asterix-runtime/pom.xml b/asterix-runtime/pom.xml
index 93b38df..a56f732 100644
--- a/asterix-runtime/pom.xml
+++ b/asterix-runtime/pom.xml
@@ -20,23 +20,80 @@
<target>1.6</target>
</configuration>
</plugin>
- <plugin>
- <groupId>org.codehaus.mojo</groupId>
- <artifactId>javacc-maven-plugin</artifactId>
- <version>2.6</version>
- <executions>
- <execution>
- <id>javacc</id>
- <goals>
- <goal>javacc</goal>
- </goals>
- <configuration>
- <isStatic>false</isStatic>
- </configuration>
- </execution>
- </executions>
- </plugin>
- </plugins>
+ <plugin>
+ <groupId>edu.uci.ics.asterix</groupId>
+ <artifactId>lexer-generator-maven-plugin</artifactId>
+ <version>0.1</version>
+ <configuration>
+ <grammarFile>src/main/resources/adm.grammar</grammarFile>
+ <outputDir>${project.build.directory}/generated-sources/edu/uci/ics/asterix/runtime/operators/file/adm</outputDir>
+ </configuration>
+ <executions>
+ <execution>
+ <id>generate-lexer</id>
+ <phase>generate-sources</phase>
+ <goals>
+ <goal>generate-lexer</goal>
+ </goals>
+ </execution>
+ </executions>
+ </plugin>
+ <plugin>
+ <groupId>org.codehaus.mojo</groupId>
+ <artifactId>build-helper-maven-plugin</artifactId>
+ <executions>
+ <execution>
+ <id>add-source</id>
+ <phase>generate-sources</phase>
+ <goals>
+ <goal>add-source</goal>
+ </goals>
+ <configuration>
+ <sources>
+ <source>${project.build.directory}/generated-sources/</source>
+ </sources>
+ </configuration>
+ </execution>
+ </executions>
+ </plugin>
+ </plugins>
+ <pluginManagement>
+ <plugins>
+ <!--This plugin's configuration is used to store Eclipse m2e settings only. It has no influence on the Maven build itself.-->
+ <plugin>
+ <groupId>org.eclipse.m2e</groupId>
+ <artifactId>lifecycle-mapping</artifactId>
+ <version>1.0.0</version>
+ <configuration>
+ <lifecycleMappingMetadata>
+ <pluginExecutions>
+ <pluginExecution>
+ <pluginExecutionFilter>
+ <groupId>
+ edu.uci.ics.asterix
+ </groupId>
+ <artifactId>
+ lexer-generator-maven-plugin
+ </artifactId>
+ <versionRange>
+ [0.1,)
+ </versionRange>
+ <goals>
+ <goal>generate-lexer</goal>
+ </goals>
+ </pluginExecutionFilter>
+ <action>
+ <execute>
+ <runOnIncremental>false</runOnIncremental>
+ </execute>
+ </action>
+ </pluginExecution>
+ </pluginExecutions>
+ </lifecycleMappingMetadata>
+ </configuration>
+ </plugin>
+ </plugins>
+ </pluginManagement>
</build>
<dependencies>
diff --git a/asterix-runtime/src/main/java/edu/uci/ics/asterix/runtime/operators/file/ADMDataParser.java b/asterix-runtime/src/main/java/edu/uci/ics/asterix/runtime/operators/file/ADMDataParser.java
index 8606088..2e64ad4 100644
--- a/asterix-runtime/src/main/java/edu/uci/ics/asterix/runtime/operators/file/ADMDataParser.java
+++ b/asterix-runtime/src/main/java/edu/uci/ics/asterix/runtime/operators/file/ADMDataParser.java
@@ -22,10 +22,8 @@
import java.util.List;
import java.util.Queue;
-import edu.uci.ics.asterix.adm.parser.nontagged.AdmLexer;
-import edu.uci.ics.asterix.adm.parser.nontagged.AdmLexerConstants;
-import edu.uci.ics.asterix.adm.parser.nontagged.ParseException;
-import edu.uci.ics.asterix.adm.parser.nontagged.Token;
+import edu.uci.ics.asterix.runtime.operators.file.adm.AdmLexer;
+import edu.uci.ics.asterix.runtime.operators.file.adm.AdmLexerException;
import edu.uci.ics.asterix.builders.IARecordBuilder;
import edu.uci.ics.asterix.builders.IAsterixListBuilder;
import edu.uci.ics.asterix.builders.OrderedListBuilder;
@@ -36,6 +34,7 @@
import edu.uci.ics.asterix.dataflow.data.nontagged.serde.ADateSerializerDeserializer;
import edu.uci.ics.asterix.dataflow.data.nontagged.serde.ADateTimeSerializerDeserializer;
import edu.uci.ics.asterix.dataflow.data.nontagged.serde.ADurationSerializerDeserializer;
+import edu.uci.ics.asterix.dataflow.data.nontagged.serde.AIntervalSerializerDeserializer;
import edu.uci.ics.asterix.dataflow.data.nontagged.serde.ALineSerializerDeserializer;
import edu.uci.ics.asterix.dataflow.data.nontagged.serde.APoint3DSerializerDeserializer;
import edu.uci.ics.asterix.dataflow.data.nontagged.serde.APointSerializerDeserializer;
@@ -55,7 +54,7 @@
import edu.uci.ics.hyracks.data.std.util.ArrayBackedValueStorage;
/**
- * Parser for ADM formatted data.
+ * Parser for ADM formatted data.
*/
public class ADMDataParser extends AbstractDataParser implements IDataParser {
@@ -82,21 +81,25 @@
}
@Override
- public void initialize(InputStream in, ARecordType recordType, boolean datasetRec) {
- admLexer = new AdmLexer(in);
+ public void initialize(InputStream in, ARecordType recordType, boolean datasetRec) throws AsterixException {
this.recordType = recordType;
this.datasetRec = datasetRec;
+ try {
+ admLexer = new AdmLexer(new java.io.InputStreamReader(in));
+ } catch (IOException e) {
+ throw new AsterixException(e);
+ }
}
protected boolean parseAdmInstance(IAType objectType, boolean datasetRec, DataOutput out) throws AsterixException,
IOException {
- Token token;
+ int token;
try {
token = admLexer.next();
- } catch (ParseException pe) {
- throw new AsterixException(pe);
+ } catch (AdmLexerException e) {
+ throw new AsterixException(e);
}
- if (token.kind == AdmLexerConstants.EOF) {
+ if (token == AdmLexer.TOKEN_EOF) {
return false;
} else {
admFromLexerStream(token, objectType, out, datasetRec);
@@ -104,157 +107,212 @@
}
}
- private void admFromLexerStream(Token token, IAType objectType, DataOutput out, Boolean datasetRec)
+ private void admFromLexerStream(int token, IAType objectType, DataOutput out, Boolean datasetRec)
throws AsterixException, IOException {
- switch (token.kind) {
- case AdmLexerConstants.NULL_LITERAL: {
+ switch (token) {
+ case AdmLexer.TOKEN_NULL_LITERAL: {
if (checkType(ATypeTag.NULL, objectType, out)) {
nullSerde.serialize(ANull.NULL, out);
} else
throw new AsterixException(" This field can not be null ");
break;
}
- case AdmLexerConstants.TRUE_LITERAL: {
+ case AdmLexer.TOKEN_TRUE_LITERAL: {
if (checkType(ATypeTag.BOOLEAN, objectType, out)) {
booleanSerde.serialize(ABoolean.TRUE, out);
} else
throw new AsterixException(mismatchErrorMessage + objectType.getTypeName());
break;
}
- case AdmLexerConstants.BOOLEAN_CONS: {
+ case AdmLexer.TOKEN_BOOLEAN_CONS: {
parseConstructor(ATypeTag.BOOLEAN, objectType, out);
break;
}
- case AdmLexerConstants.FALSE_LITERAL: {
+ case AdmLexer.TOKEN_FALSE_LITERAL: {
if (checkType(ATypeTag.BOOLEAN, objectType, out)) {
booleanSerde.serialize(ABoolean.FALSE, out);
} else
throw new AsterixException(mismatchErrorMessage + objectType.getTypeName());
break;
}
- case AdmLexerConstants.DOUBLE_LITERAL: {
+ case AdmLexer.TOKEN_DOUBLE_LITERAL: {
if (checkType(ATypeTag.DOUBLE, objectType, out)) {
- aDouble.setValue(Double.parseDouble(token.image));
+ aDouble.setValue(Double.parseDouble(admLexer.getLastTokenImage()));
doubleSerde.serialize(aDouble, out);
} else
throw new AsterixException(mismatchErrorMessage + objectType.getTypeName());
break;
}
- case AdmLexerConstants.DOUBLE_CONS: {
+ case AdmLexer.TOKEN_DOUBLE_CONS: {
parseConstructor(ATypeTag.DOUBLE, objectType, out);
break;
}
- case AdmLexerConstants.FLOAT_LITERAL: {
+ case AdmLexer.TOKEN_FLOAT_LITERAL: {
if (checkType(ATypeTag.FLOAT, objectType, out)) {
- aFloat.setValue(Float.parseFloat(token.image));
+ aFloat.setValue(Float.parseFloat(admLexer.getLastTokenImage()));
floatSerde.serialize(aFloat, out);
} else
throw new AsterixException(mismatchErrorMessage + objectType.getTypeName());
break;
}
- case AdmLexerConstants.FLOAT_CONS: {
+ case AdmLexer.TOKEN_FLOAT_CONS: {
parseConstructor(ATypeTag.FLOAT, objectType, out);
break;
}
- case AdmLexerConstants.INT8_LITERAL: {
+ case AdmLexer.TOKEN_INT8_LITERAL: {
if (checkType(ATypeTag.INT8, objectType, out)) {
- parseInt8(token.image, out);
+ parseInt8(admLexer.getLastTokenImage(), out);
} else
throw new AsterixException(mismatchErrorMessage + objectType.getTypeName());
break;
}
- case AdmLexerConstants.INT8_CONS: {
+ case AdmLexer.TOKEN_INT8_CONS: {
parseConstructor(ATypeTag.INT8, objectType, out);
break;
}
- case AdmLexerConstants.INT16_LITERAL: {
+ case AdmLexer.TOKEN_INT16_LITERAL: {
if (checkType(ATypeTag.INT16, objectType, out)) {
- parseInt16(token.image, out);
+ parseInt16(admLexer.getLastTokenImage(), out);
} else
throw new AsterixException(mismatchErrorMessage + objectType.getTypeName());
break;
}
- case AdmLexerConstants.INT16_CONS: {
+ case AdmLexer.TOKEN_INT16_CONS: {
parseConstructor(ATypeTag.INT16, objectType, out);
break;
}
- case AdmLexerConstants.INT_LITERAL:
- case AdmLexerConstants.INT32_LITERAL: {
+ case AdmLexer.TOKEN_INT_LITERAL:
+ case AdmLexer.TOKEN_INT32_LITERAL: {
if (checkType(ATypeTag.INT32, objectType, out)) {
- parseInt32(token.image, out);
+ parseInt32(admLexer.getLastTokenImage(), out);
} else
throw new AsterixException(mismatchErrorMessage + objectType.getTypeName());
break;
}
- case AdmLexerConstants.INT32_CONS: {
+ case AdmLexer.TOKEN_INT32_CONS: {
parseConstructor(ATypeTag.INT32, objectType, out);
break;
}
- case AdmLexerConstants.INT64_LITERAL: {
+ case AdmLexer.TOKEN_INT64_LITERAL: {
if (checkType(ATypeTag.INT64, objectType, out)) {
- parseInt64(token.image, out);
+ parseInt64(admLexer.getLastTokenImage(), out);
} else
throw new AsterixException(mismatchErrorMessage + objectType.getTypeName());
break;
}
- case AdmLexerConstants.INT64_CONS: {
+ case AdmLexer.TOKEN_INT64_CONS: {
parseConstructor(ATypeTag.INT64, objectType, out);
break;
}
- case AdmLexerConstants.STRING_LITERAL: {
+ case AdmLexer.TOKEN_STRING_LITERAL: {
if (checkType(ATypeTag.STRING, objectType, out)) {
- aString.setValue(token.image.substring(1, token.image.length() - 1));
+ aString.setValue(admLexer.getLastTokenImage().substring(1,
+ admLexer.getLastTokenImage().length() - 1));
stringSerde.serialize(aString, out);
} else
throw new AsterixException(mismatchErrorMessage + objectType.getTypeName());
break;
}
- case AdmLexerConstants.STRING_CONS: {
+ case AdmLexer.TOKEN_STRING_CONS: {
parseConstructor(ATypeTag.STRING, objectType, out);
break;
}
- case AdmLexerConstants.DATE_CONS: {
+ case AdmLexer.TOKEN_DATE_CONS: {
parseConstructor(ATypeTag.DATE, objectType, out);
break;
}
- case AdmLexerConstants.TIME_CONS: {
+ case AdmLexer.TOKEN_TIME_CONS: {
parseConstructor(ATypeTag.TIME, objectType, out);
break;
}
- case AdmLexerConstants.DATETIME_CONS: {
+ case AdmLexer.TOKEN_DATETIME_CONS: {
parseConstructor(ATypeTag.DATETIME, objectType, out);
break;
}
- case AdmLexerConstants.DURATION_CONS: {
+ case AdmLexer.TOKEN_INTERVAL_DATE_CONS: {
+ try {
+ if (checkType(ATypeTag.INTERVAL, objectType, out)) {
+ if (admLexer.next() == AdmLexer.TOKEN_CONSTRUCTOR_OPEN) {
+ if (admLexer.next() == AdmLexer.TOKEN_STRING_LITERAL) {
+ AIntervalSerializerDeserializer.parseDate(admLexer.getLastTokenImage(), out);
+
+ if (admLexer.next() == AdmLexer.TOKEN_CONSTRUCTOR_CLOSE) {
+ break;
+ }
+ }
+ }
+ }
+ } catch (AdmLexerException ex) {
+ throw new AsterixException(ex);
+ }
+ throw new AsterixException("Wrong interval data parsing for date interval.");
+ }
+ case AdmLexer.TOKEN_INTERVAL_TIME_CONS: {
+ try {
+ if (checkType(ATypeTag.INTERVAL, objectType, out)) {
+ if (admLexer.next() == AdmLexer.TOKEN_CONSTRUCTOR_OPEN) {
+ if (admLexer.next() == AdmLexer.TOKEN_STRING_LITERAL) {
+ AIntervalSerializerDeserializer.parseTime(admLexer.getLastTokenImage(), out);
+
+ if (admLexer.next() == AdmLexer.TOKEN_CONSTRUCTOR_CLOSE) {
+ break;
+ }
+ }
+ }
+ }
+ } catch (AdmLexerException ex) {
+ throw new AsterixException(ex);
+ }
+ throw new AsterixException("Wrong interval data parsing for time interval.");
+ }
+ case AdmLexer.TOKEN_INTERVAL_DATETIME_CONS: {
+ try {
+ if (checkType(ATypeTag.INTERVAL, objectType, out)) {
+ if (admLexer.next() == AdmLexer.TOKEN_CONSTRUCTOR_OPEN) {
+ if (admLexer.next() == AdmLexer.TOKEN_STRING_LITERAL) {
+ AIntervalSerializerDeserializer.parseDatetime(admLexer.getLastTokenImage(), out);
+
+ if (admLexer.next() == AdmLexer.TOKEN_CONSTRUCTOR_CLOSE) {
+ break;
+ }
+ }
+ }
+ }
+ } catch (AdmLexerException ex) {
+ throw new AsterixException(ex);
+ }
+ throw new AsterixException("Wrong interval data parsing for datetime interval.");
+ }
+ case AdmLexer.TOKEN_DURATION_CONS: {
parseConstructor(ATypeTag.DURATION, objectType, out);
break;
}
- case AdmLexerConstants.POINT_CONS: {
+ case AdmLexer.TOKEN_POINT_CONS: {
parseConstructor(ATypeTag.POINT, objectType, out);
break;
}
- case AdmLexerConstants.POINT3D_CONS: {
+ case AdmLexer.TOKEN_POINT3D_CONS: {
parseConstructor(ATypeTag.POINT3D, objectType, out);
break;
}
- case AdmLexerConstants.CIRCLE_CONS: {
+ case AdmLexer.TOKEN_CIRCLE_CONS: {
parseConstructor(ATypeTag.CIRCLE, objectType, out);
break;
}
- case AdmLexerConstants.RECTANGLE_CONS: {
+ case AdmLexer.TOKEN_RECTANGLE_CONS: {
parseConstructor(ATypeTag.RECTANGLE, objectType, out);
break;
}
- case AdmLexerConstants.LINE_CONS: {
+ case AdmLexer.TOKEN_LINE_CONS: {
parseConstructor(ATypeTag.LINE, objectType, out);
break;
}
- case AdmLexerConstants.POLYGON_CONS: {
+ case AdmLexer.TOKEN_POLYGON_CONS: {
parseConstructor(ATypeTag.POLYGON, objectType, out);
break;
}
- case AdmLexerConstants.START_UNORDERED_LIST: {
+ case AdmLexer.TOKEN_START_UNORDERED_LIST: {
if (checkType(ATypeTag.UNORDEREDLIST, objectType, out)) {
objectType = getComplexType(objectType, ATypeTag.UNORDEREDLIST);
parseUnorderedList((AUnorderedListType) objectType, out);
@@ -263,7 +321,7 @@
break;
}
- case AdmLexerConstants.START_ORDERED_LIST: {
+ case AdmLexer.TOKEN_START_ORDERED_LIST: {
if (checkType(ATypeTag.ORDEREDLIST, objectType, out)) {
objectType = getComplexType(objectType, ATypeTag.ORDEREDLIST);
parseOrderedList((AOrderedListType) objectType, out);
@@ -271,7 +329,7 @@
throw new AsterixException(mismatchErrorMessage + objectType.getTypeTag());
break;
}
- case AdmLexerConstants.START_RECORD: {
+ case AdmLexer.TOKEN_START_RECORD: {
if (checkType(ATypeTag.RECORD, objectType, out)) {
objectType = getComplexType(objectType, ATypeTag.RECORD);
parseRecord((ARecordType) objectType, out, datasetRec);
@@ -279,11 +337,11 @@
throw new AsterixException(mismatchErrorMessage + objectType.getTypeTag());
break;
}
- case AdmLexerConstants.EOF: {
+ case AdmLexer.TOKEN_EOF: {
break;
}
default: {
- throw new AsterixException("Unexpected ADM token kind: " + admLexer.tokenKindToString(token.kind) + ".");
+ throw new AsterixException("Unexpected ADM token kind: " + AdmLexer.tokenKindToString(token) + ".");
}
}
}
@@ -365,7 +423,7 @@
recBuilder.reset(null);
recBuilder.init();
- Token token = null;
+ int token;
boolean inRecord = true;
boolean expectingRecordField = false;
boolean first = true;
@@ -375,15 +433,15 @@
IAType fieldType = null;
do {
token = nextToken();
- switch (token.kind) {
- case AdmLexerConstants.END_RECORD: {
+ switch (token) {
+ case AdmLexer.TOKEN_END_RECORD: {
if (expectingRecordField) {
throw new AsterixException("Found END_RECORD while expecting a record field.");
}
inRecord = false;
break;
}
- case AdmLexerConstants.STRING_LITERAL: {
+ case AdmLexer.TOKEN_STRING_LITERAL: {
// we've read the name of the field
// now read the content
fieldNameBuffer.reset();
@@ -391,12 +449,14 @@
expectingRecordField = false;
if (recType != null) {
- String fldName = token.image.substring(1, token.image.length() - 1);
+ String fldName = admLexer.getLastTokenImage().substring(1,
+ admLexer.getLastTokenImage().length() - 1);
fieldId = recBuilder.getFieldId(fldName);
if (fieldId < 0 && !recType.isOpen()) {
throw new AsterixException("This record is closed, you can not add extra fields !!");
} else if (fieldId < 0 && recType.isOpen()) {
- aStringFieldName.setValue(token.image.substring(1, token.image.length() - 1));
+ aStringFieldName.setValue(admLexer.getLastTokenImage().substring(1,
+ admLexer.getLastTokenImage().length() - 1));
stringSerde.serialize(aStringFieldName, fieldNameBuffer.getDataOutput());
openRecordField = true;
fieldType = null;
@@ -407,16 +467,17 @@
openRecordField = false;
}
} else {
- aStringFieldName.setValue(token.image.substring(1, token.image.length() - 1));
+ aStringFieldName.setValue(admLexer.getLastTokenImage().substring(1,
+ admLexer.getLastTokenImage().length() - 1));
stringSerde.serialize(aStringFieldName, fieldNameBuffer.getDataOutput());
openRecordField = true;
fieldType = null;
}
token = nextToken();
- if (token.kind != AdmLexerConstants.COLON) {
- throw new AsterixException("Unexpected ADM token kind: "
- + admLexer.tokenKindToString(token.kind) + " while expecting \":\".");
+ if (token != AdmLexer.TOKEN_COLON) {
+ throw new AsterixException("Unexpected ADM token kind: " + AdmLexer.tokenKindToString(token)
+ + " while expecting \":\".");
}
token = nextToken();
@@ -436,7 +497,7 @@
break;
}
- case AdmLexerConstants.COMMA: {
+ case AdmLexer.TOKEN_COMMA: {
if (first) {
throw new AsterixException("Found COMMA before any record field.");
}
@@ -447,7 +508,7 @@
break;
}
default: {
- throw new AsterixException("Unexpected ADM token kind: " + admLexer.tokenKindToString(token.kind)
+ throw new AsterixException("Unexpected ADM token kind: " + AdmLexer.tokenKindToString(token)
+ " while parsing record fields.");
}
}
@@ -498,18 +559,18 @@
itemType = oltype.getItemType();
orderedListBuilder.reset(oltype);
- Token token = null;
+ int token;
boolean inList = true;
boolean expectingListItem = false;
boolean first = true;
do {
token = nextToken();
- if (token.kind == AdmLexerConstants.END_ORDERED_LIST) {
+ if (token == AdmLexer.TOKEN_END_ORDERED_LIST) {
if (expectingListItem) {
throw new AsterixException("Found END_COLLECTION while expecting a list item.");
}
inList = false;
- } else if (token.kind == AdmLexerConstants.COMMA) {
+ } else if (token == AdmLexer.TOKEN_COMMA) {
if (first) {
throw new AsterixException("Found COMMA before any list item.");
}
@@ -542,18 +603,18 @@
itemType = uoltype.getItemType();
unorderedListBuilder.reset(uoltype);
- Token token = null;
+ int token;
boolean inList = true;
boolean expectingListItem = false;
boolean first = true;
do {
token = nextToken();
- if (token.kind == AdmLexerConstants.END_UNORDERED_LIST) {
+ if (token == AdmLexer.TOKEN_END_UNORDERED_LIST) {
if (expectingListItem) {
throw new AsterixException("Found END_COLLECTION while expecting a list item.");
}
inList = false;
- } else if (token.kind == AdmLexerConstants.COMMA) {
+ } else if (token == AdmLexer.TOKEN_COMMA) {
if (first) {
throw new AsterixException("Found COMMA before any list item.");
}
@@ -574,11 +635,13 @@
returnTempBuffer(itemBuffer);
}
- private Token nextToken() throws AsterixException {
+ private int nextToken() throws AsterixException {
try {
return admLexer.next();
- } catch (ParseException pe) {
- throw new AsterixException(pe);
+ } catch (AdmLexerException e) {
+ throw new AsterixException(e);
+ } catch (IOException e) {
+ throw new AsterixException(e);
}
}
@@ -633,73 +696,109 @@
private void parseConstructor(ATypeTag typeTag, IAType objectType, DataOutput out) throws AsterixException {
try {
- Token token = admLexer.next();
- if (token.kind == AdmLexerConstants.CONSTRUCTOR_OPEN) {
+ int token = admLexer.next();
+ if (token == AdmLexer.TOKEN_CONSTRUCTOR_OPEN) {
if (checkType(typeTag, objectType, out)) {
token = admLexer.next();
- if (token.kind == AdmLexerConstants.STRING_LITERAL) {
+ if (token == AdmLexer.TOKEN_STRING_LITERAL) {
switch (typeTag) {
case BOOLEAN:
- parseBoolean(token.image.substring(1, token.image.length() - 1), out);
+ parseBoolean(
+ admLexer.getLastTokenImage().substring(1,
+ admLexer.getLastTokenImage().length() - 1), out);
break;
case INT8:
- parseInt8(token.image.substring(1, token.image.length() - 1), out);
+ parseInt8(
+ admLexer.getLastTokenImage().substring(1,
+ admLexer.getLastTokenImage().length() - 1), out);
break;
case INT16:
- parseInt16(token.image.substring(1, token.image.length() - 1), out);
+ parseInt16(
+ admLexer.getLastTokenImage().substring(1,
+ admLexer.getLastTokenImage().length() - 1), out);
break;
case INT32:
- parseInt32(token.image.substring(1, token.image.length() - 1), out);
+ parseInt32(
+ admLexer.getLastTokenImage().substring(1,
+ admLexer.getLastTokenImage().length() - 1), out);
break;
case INT64:
- parseInt64(token.image.substring(1, token.image.length() - 1), out);
+ parseInt64(
+ admLexer.getLastTokenImage().substring(1,
+ admLexer.getLastTokenImage().length() - 1), out);
break;
case FLOAT:
- aFloat.setValue(Float.parseFloat(token.image.substring(1, token.image.length() - 1)));
+ aFloat.setValue(Float.parseFloat(admLexer.getLastTokenImage().substring(1,
+ admLexer.getLastTokenImage().length() - 1)));
floatSerde.serialize(aFloat, out);
break;
case DOUBLE:
- aDouble.setValue(Double.parseDouble(token.image.substring(1, token.image.length() - 1)));
+ aDouble.setValue(Double.parseDouble(admLexer.getLastTokenImage().substring(1,
+ admLexer.getLastTokenImage().length() - 1)));
doubleSerde.serialize(aDouble, out);
break;
case STRING:
- aString.setValue(token.image.substring(1, token.image.length() - 1));
+ aString.setValue(admLexer.getLastTokenImage().substring(1,
+ admLexer.getLastTokenImage().length() - 1));
stringSerde.serialize(aString, out);
break;
case TIME:
- parseTime(token.image.substring(1, token.image.length() - 1), out);
+ parseTime(
+ admLexer.getLastTokenImage().substring(1,
+ admLexer.getLastTokenImage().length() - 1), out);
break;
case DATE:
- parseDate(token.image.substring(1, token.image.length() - 1), out);
+ parseDate(
+ admLexer.getLastTokenImage().substring(1,
+ admLexer.getLastTokenImage().length() - 1), out);
break;
case DATETIME:
- parseDatetime(token.image.substring(1, token.image.length() - 1), out);
+ parseDatetime(
+ admLexer.getLastTokenImage().substring(1,
+ admLexer.getLastTokenImage().length() - 1), out);
break;
case DURATION:
- parseDuration(token.image.substring(1, token.image.length() - 1), out);
+ parseDuration(
+ admLexer.getLastTokenImage().substring(1,
+ admLexer.getLastTokenImage().length() - 1), out);
break;
case POINT:
- parsePoint(token.image.substring(1, token.image.length() - 1), out);
+ parsePoint(
+ admLexer.getLastTokenImage().substring(1,
+ admLexer.getLastTokenImage().length() - 1), out);
break;
case POINT3D:
- parsePoint3d(token.image.substring(1, token.image.length() - 1), out);
+ parsePoint3d(
+ admLexer.getLastTokenImage().substring(1,
+ admLexer.getLastTokenImage().length() - 1), out);
break;
case CIRCLE:
- parseCircle(token.image.substring(1, token.image.length() - 1), out);
+ parseCircle(
+ admLexer.getLastTokenImage().substring(1,
+ admLexer.getLastTokenImage().length() - 1), out);
break;
case RECTANGLE:
- parseRectangle(token.image.substring(1, token.image.length() - 1), out);
+ parseRectangle(
+ admLexer.getLastTokenImage().substring(1,
+ admLexer.getLastTokenImage().length() - 1), out);
break;
case LINE:
- parseLine(token.image.substring(1, token.image.length() - 1), out);
+ parseLine(
+ admLexer.getLastTokenImage().substring(1,
+ admLexer.getLastTokenImage().length() - 1), out);
break;
case POLYGON:
- parsePolygon(token.image.substring(1, token.image.length() - 1), out);
+ parsePolygon(
+ admLexer.getLastTokenImage().substring(1,
+ admLexer.getLastTokenImage().length() - 1), out);
break;
+ default:
+ throw new AsterixException("Missing deserializer method for constructor: "
+ + typeTag + ".");
}
token = admLexer.next();
- if (token.kind == AdmLexerConstants.CONSTRUCTOR_CLOSE)
+ if (token == AdmLexer.TOKEN_CONSTRUCTOR_CLOSE)
return;
}
}
diff --git a/asterix-runtime/src/main/javacc/AdmLexer.jj b/asterix-runtime/src/main/javacc/AdmLexer.jj
deleted file mode 100644
index fbab62f..0000000
--- a/asterix-runtime/src/main/javacc/AdmLexer.jj
+++ /dev/null
@@ -1,150 +0,0 @@
-options {
-
-
- STATIC = false;
-
-}
-
-PARSER_BEGIN(AdmLexer)
-
-package edu.uci.ics.asterix.adm.parser;
-
-import java.io.*;
-
-public class AdmLexer {
-
- public static void main(String args[]) throws ParseException, TokenMgrError, IOException, FileNotFoundException {
- File file = new File(args[0]);
- Reader freader = new BufferedReader(new InputStreamReader
- (new FileInputStream(file), "UTF-8"));
- AdmLexer flexer = new AdmLexer(freader);
- Token t = null;
- do {
- t = flexer.next();
- System.out.println(AdmLexerConstants.tokenImage[t.kind]);
- } while (t.kind != EOF);
- freader.close();
- }
-
- public Token next() throws ParseException {
- return getNextToken();
- }
-
- public String tokenKindToString(int tokenKind) {
- return AdmLexerConstants.tokenImage[tokenKind];
- }
-}
-
-PARSER_END(AdmLexer)
-
-<DEFAULT>
-TOKEN :
-{
- <NULL_LITERAL : "null">
-}
-
-<DEFAULT>
-TOKEN :
-{
- <TRUE_LITERAL : "true">
-}
-
-<DEFAULT>
-TOKEN :
-{
- <FALSE_LITERAL : "false">
-}
-
-
-<DEFAULT>
-TOKEN :
-{
- <INTEGER_LITERAL : ("-")? (<DIGIT>)+ >
-}
-
-
-<DEFAULT>
-TOKEN :
-{
- <#DIGIT : ["0" - "9"]>
-}
-
-
-TOKEN:
-{
- < DOUBLE_LITERAL:
- ("-")? <INTEGER> ( "." <INTEGER> )? (<EXPONENT>)?
- | ("-")? "." <INTEGER>
- >
- | < #EXPONENT: ["e","E"] (["+","-"])? (["0"-"9"])+ >
- | <INTEGER : (<DIGIT>)+ >
- | <FLOAT_LITERAL: <DOUBLE_LITERAL>("f"|"F")>
- }
-
-<DEFAULT>
-TOKEN :
-{
- <STRING_LITERAL : ("\"" (<EscapeQuot> | ~["\""])* "\"") >
- |
- < #EscapeQuot: "\\\"" >
-}
-
-<DEFAULT>
-TOKEN :
-{
- <START_RECORD : "{">
-}
-
-<DEFAULT>
-TOKEN :
-{
- <END_RECORD : "}">
-}
-
-<DEFAULT>
-TOKEN :
-{
- <COMMA : ",">
-}
-
-<DEFAULT>
-TOKEN :
-{
- <COLON : ":">
-}
-
-
-<DEFAULT>
-TOKEN :
-{
- <START_ORDERED_LIST : "[">
-}
-
-<DEFAULT>
-TOKEN :
-{
- <END_ORDERED_LIST : "]">
-}
-
-<DEFAULT>
-TOKEN :
-{
- <START_UNORDERED_LIST : "{{">
-}
-
-<DEFAULT>
-TOKEN :
-{
- <END_UNORDERED_LIST : "}}">
-}
-
-
-
-
-SKIP:
-{
- " "
-| "\t"
-| "\r"
-| "\n"
-}
diff --git a/asterix-runtime/src/main/javacc/nontagged/AdmLexer.jj b/asterix-runtime/src/main/javacc/nontagged/AdmLexer.jj
deleted file mode 100644
index d94033d..0000000
--- a/asterix-runtime/src/main/javacc/nontagged/AdmLexer.jj
+++ /dev/null
@@ -1,385 +0,0 @@
-options {
-
-
- STATIC = false;
-
-}
-
-PARSER_BEGIN(AdmLexer)
-
-package edu.uci.ics.asterix.adm.parser.nontagged;
-
-import java.io.*;
-
-public class AdmLexer {
-
- public static void main(String args[]) throws ParseException, TokenMgrError, IOException, FileNotFoundException {
- File file = new File(args[0]);
- Reader freader = new BufferedReader(new InputStreamReader
- (new FileInputStream(file), "UTF-8"));
- AdmLexer flexer = new AdmLexer(freader);
- Token t = null;
- do {
- t = flexer.next();
- System.out.println(AdmLexerConstants.tokenImage[t.kind]);
- } while (t.kind != EOF);
- freader.close();
- }
-
- public Token next() throws ParseException {
- return getNextToken();
- }
-
- public String tokenKindToString(int tokenKind) {
- return AdmLexerConstants.tokenImage[tokenKind];
- }
-}
-
-PARSER_END(AdmLexer)
-
-<DEFAULT>
-TOKEN :
-{
- <NULL_LITERAL : "null">
-}
-
-<DEFAULT>
-TOKEN :
-{
- <TRUE_LITERAL : "true">
-}
-
-<DEFAULT>
-TOKEN :
-{
- <FALSE_LITERAL : "false">
-}
-
-<DEFAULT>
-TOKEN :
-{
- <BOOLEAN_CONS : ("boolean") >
-}
-
-<DEFAULT>
-TOKEN :
-{
- <CONSTRUCTOR_OPEN : ("(")>
-}
-
-
-<DEFAULT>
-TOKEN :
-{
- <CONSTRUCTOR_CLOSE : (")")>
-}
-
-<DEFAULT>
-TOKEN:
-{
- <INT8_LITERAL : ("-" | "+")? (<DIGIT>)+ ("i8")>
-}
-
-<DEFAULT>
-TOKEN :
-{
- <INT8_CONS : ("int8") >
-}
-
-<DEFAULT>
-TOKEN:
-{
- <INT16_LITERAL : ("-" | "+")? (<DIGIT>)+ ("i16")>
-}
-
-<DEFAULT>
-TOKEN :
-{
- <INT16_CONS : ("int16") >
-}
-
-<DEFAULT>
-TOKEN :
-{
- <INT32_LITERAL : ("-" | "+")? (<DIGIT>)+ ("i32")>
-}
-
-<DEFAULT>
-TOKEN :
-{
- <INT32_CONS : ("int32")>
-}
-
-<DEFAULT>
-TOKEN:
-{
- <INT64_LITERAL : ("-" | "+")? (<DIGIT>)+ ("i64")>
-}
-
-<DEFAULT>
-TOKEN :
-{
- <INT64_CONS : ("int64") >
-}
-
-<DEFAULT>
-TOKEN:
-{
- <INT_LITERAL : ("-" | "+")? (<DIGIT>)+>
-}
-
-<DEFAULT>
-TOKEN :
-{
- <CIRCLE_LITERAL : "P"<DOUBLE_LITERAL>(",") <DOUBLE_LITERAL> ("R") <DOUBLE_LITERAL> >
-}
-
-<DEFAULT>
-TOKEN :
-{
- <CIRCLE_CONS : ("circle") >
-}
-
-
-<DEFAULT>
-TOKEN :
-{
- <TIMEZONE_LITERAL : (("+"|"-")<DIGIT><DIGIT>(":")<DIGIT><DIGIT>) | (("+"|"-")<DIGIT><DIGIT><DIGIT><DIGIT>) | ("Z") >
-}
-
-<DEFAULT>
-TOKEN :
-{
- <DATE_LITERAL : (("-")?<DIGIT><DIGIT><DIGIT><DIGIT>("-")<DIGIT><DIGIT>("-")<DIGIT><DIGIT>) | (("-")?<DIGIT><DIGIT><DIGIT><DIGIT><DIGIT><DIGIT><DIGIT><DIGIT>) >
-}
-
-<DEFAULT>
-TOKEN :
-{
- <DATE_CONS : ("date")>
-}
-
-<DEFAULT>
-TOKEN :
-{
- <TIME_LITERAL : (<DIGIT><DIGIT>(":")<DIGIT><DIGIT>(":")<DIGIT><DIGIT> ( (".")<DIGIT>(<DIGIT>((<DIGIT>)?))?)? ((("+"|"-")<DIGIT><DIGIT>(":")<DIGIT><DIGIT>) | ("Z"))?) | (<DIGIT><DIGIT><DIGIT><DIGIT><DIGIT><DIGIT> (<DIGIT>(<DIGIT>((<DIGIT>)?))?)? ((("+"|"-")<DIGIT><DIGIT><DIGIT><DIGIT>) | ("Z"))?) >
-}
-
-<DEFAULT>
-TOKEN :
-{
- <TIME_CONS : ("time")>
-}
-
-<DEFAULT>
-TOKEN :
-{
- <DATETIME_LITERAL : (("-")?<DIGIT><DIGIT><DIGIT><DIGIT>("-")<DIGIT><DIGIT>("-")<DIGIT><DIGIT>("T")<DIGIT><DIGIT>(":")<DIGIT><DIGIT>(":")<DIGIT><DIGIT> ( (".")<DIGIT>(<DIGIT>((<DIGIT>)?))?)? ((("+"|"-")<DIGIT><DIGIT>(":")<DIGIT><DIGIT>) | ("Z"))?) | (("-")?<DIGIT><DIGIT><DIGIT><DIGIT><DIGIT><DIGIT><DIGIT><DIGIT>("T")<DIGIT><DIGIT><DIGIT><DIGIT><DIGIT><DIGIT> (<DIGIT>(<DIGIT>((<DIGIT>)?))?)? ((("+"|"-")<DIGIT><DIGIT><DIGIT><DIGIT>) | ("Z"))?)>
-}
-
-<DEFAULT>
-TOKEN :
-{
- <DATETIME_CONS : ("datetime")>
-}
-
-<DEFAULT>
-TOKEN :
-{
- <DURATION_LITERAL : ("-")? ("P")(<INTEGER>("Y"))?(<INTEGER>("M"))?(<INTEGER>("D"))?(("T")(((<INTEGER>("H"))(<INTEGER>("M"))?(<INTEGER>((".")<DIGIT>(<DIGIT>(<DIGIT>)?)?)?("S"))?) | ((<INTEGER>("M"))(<INTEGER>((".")<DIGIT>(<DIGIT>(<DIGIT>)?)?)?("S"))?) | ((<INTEGER>((".")<DIGIT>(<DIGIT>(<DIGIT>)?)?)?("S")))))?>
-}
-
-<DEFAULT>
-TOKEN :
-{
- <DURATION_CONS : ("duration")>
-}
-
-<DEFAULT>
-TOKEN :
-{
- <INTERVAL_CONS : ("interval")>
-}
-
-<DEFAULT>
-TOKEN :
-{
- <TIME_INTERVAL_CONS : ("tinterval")>
-}
-
-<DEFAULT>
-TOKEN :
-{
- <DATE_INTERVAL_CONS : ("dinterval")>
-}
-
-<DEFAULT>
-TOKEN :
-{
- <DATETIME_INTERVAL_CONS : ("dtinterval")>
-}
-
-<DEFAULT>
-TOKEN :
-{
- <#DIGIT : ["0" - "9"]>
-}
-
-TOKEN:
-{
- < DOUBLE_LITERAL:
- ("-" | "+")? <INTEGER> ( "." <INTEGER> )? (<EXPONENT>)?
- | ("-" | "+")? "." <INTEGER>
- >
- | < #EXPONENT: ["e","E"] (["+","-"])? (["0"-"9"])+ >
- | <INTEGER : (<DIGIT>)+ >
- | <FLOAT_LITERAL: <DOUBLE_LITERAL>("f"|"F")>
- }
-
-
-<DEFAULT>
-TOKEN :
-{
- <FLOAT_CONS : ("float")>
-}
-
-<DEFAULT>
-TOKEN :
-{
- <DOUBLE_CONS : ("double")>
-}
-
-
-<DEFAULT>
-TOKEN :
-{
- <STRING_LITERAL : ("\"" (<EscapeQuot> | ~["\""])* "\"") >
- |
- < #EscapeQuot: "\\\"" >
-}
-
-<DEFAULT>
-TOKEN :
-{
- <STRING_CONS : ("string")>
-}
-
-
-<DEFAULT>
-TOKEN :
-{
- <POINT_LITERAL : "P"<DOUBLE_LITERAL>(",")<DOUBLE_LITERAL>>
-}
-
-<DEFAULT>
-TOKEN :
-{
- <POINT_CONS : ("point")>
-}
-
-<DEFAULT>
-TOKEN :
-{
- <POINT3D_LITERAL : "P" <DOUBLE_LITERAL>(",") <DOUBLE_LITERAL> (",") <DOUBLE_LITERAL>>
-}
-
-<DEFAULT>
-TOKEN :
-{
- <POINT3D_CONS : ("point3d")>
-}
-
-<DEFAULT>
-TOKEN :
-{
- <LINE_LITERAL : "P"<DOUBLE_LITERAL>(",") <DOUBLE_LITERAL> ("P") <DOUBLE_LITERAL> (",") <DOUBLE_LITERAL>>
-}
-
-<DEFAULT>
-TOKEN :
-{
- <LINE_CONS : ("line")>
-}
-
-<DEFAULT>
-TOKEN :
-{
- <POLYGON_LITERAL : "P"<DOUBLE_LITERAL>(",") <DOUBLE_LITERAL> ("P") <DOUBLE_LITERAL> (",") <DOUBLE_LITERAL> (("P") <DOUBLE_LITERAL> (",") <DOUBLE_LITERAL>)+>
-}
-
-<DEFAULT>
-TOKEN :
-{
- <POLYGON_CONS : ("polygon")>
-}
-
-<DEFAULT>
-TOKEN :
-{
- <RECTANGLE_CONS : ("rectangle")>
-}
-
-<DEFAULT>
-TOKEN :
-{
- <RECTANGLE_LITERAL : "P"<DOUBLE_LITERAL>(",") <DOUBLE_LITERAL> ("P") <DOUBLE_LITERAL> (",") <DOUBLE_LITERAL>>
-}
-
-<DEFAULT>
-TOKEN :
-{
- <START_RECORD : "{">
-}
-
-<DEFAULT>
-TOKEN :
-{
- <END_RECORD : "}">
-}
-
-<DEFAULT>
-TOKEN :
-{
- <COMMA : ",">
-}
-
-<DEFAULT>
-TOKEN :
-{
- <COLON : ":">
-}
-
-
-<DEFAULT>
-TOKEN :
-{
- <START_ORDERED_LIST : "[">
-}
-
-<DEFAULT>
-TOKEN :
-{
- <END_ORDERED_LIST : "]">
-}
-
-<DEFAULT>
-TOKEN :
-{
- <START_UNORDERED_LIST : "{{">
-}
-
-<DEFAULT>
-TOKEN :
-{
- <END_UNORDERED_LIST : "}}">
-}
-
-
-
-
-SKIP:
-{
- " "
-| "\t"
-| "\r"
-| "\n"
-}
diff --git a/asterix-runtime/src/main/resources/adm.grammar b/asterix-runtime/src/main/resources/adm.grammar
new file mode 100644
index 0000000..56c7212
--- /dev/null
+++ b/asterix-runtime/src/main/resources/adm.grammar
@@ -0,0 +1,63 @@
+# LEXER GENERATOR configuration file
+# ---------------------------------------
+# Place the generic configuration *first*,
+# then list your grammar.
+
+PACKAGE: edu.uci.ics.asterix.runtime.operators.file.adm
+LEXER_NAME: AdmLexer
+
+TOKENS:
+
+BOOLEAN_CONS = string(boolean)
+INT8_CONS = string(int8)
+INT16_CONS = string(int16)
+INT32_CONS = string(int32)
+INT64_CONS = string(int64)
+FLOAT_CONS = string(float)
+DOUBLE_CONS = string(double)
+DATE_CONS = string(date)
+DATETIME_CONS = string(datetime)
+DURATION_CONS = string(duration)
+STRING_CONS = string(string)
+POINT_CONS = string(point)
+POINT3D_CONS = string(point3d)
+LINE_CONS = string(line)
+POLYGON_CONS = string(polygon)
+RECTANGLE_CONS = string(rectangle)
+CIRCLE_CONS = string(circle)
+TIME_CONS = string(time)
+INTERVAL_TIME_CONS = string(interval_time)
+INTERVAL_DATE_CONS = string(interval_date)
+INTERVAL_DATETIME_CONS = string(interval_datetime)
+
+NULL_LITERAL = string(null)
+TRUE_LITERAL = string(true)
+FALSE_LITERAL = string(false)
+
+CONSTRUCTOR_OPEN = char(()
+CONSTRUCTOR_CLOSE = char())
+START_RECORD = char({)
+END_RECORD = char(})
+COMMA = char(\,)
+COLON = char(:)
+START_ORDERED_LIST = char([)
+END_ORDERED_LIST = char(])
+START_UNORDERED_LIST = string({{)
+END_UNORDERED_LIST = string(}})
+
+STRING_LITERAL = char("), anythingUntil(")
+
+INT_LITERAL = signOrNothing(), digitSequence()
+INT8_LITERAL = token(INT_LITERAL), string(i8)
+INT16_LITERAL = token(INT_LITERAL), string(i16)
+INT32_LITERAL = token(INT_LITERAL), string(i32)
+INT64_LITERAL = token(INT_LITERAL), string(i64)
+
+@EXPONENT = caseInsensitiveChar(e), signOrNothing(), digitSequence()
+
+DOUBLE_LITERAL = signOrNothing(), char(.), digitSequence()
+DOUBLE_LITERAL = signOrNothing(), digitSequence(), char(.), digitSequence()
+DOUBLE_LITERAL = signOrNothing(), digitSequence(), char(.), digitSequence(), token(@EXPONENT)
+DOUBLE_LITERAL = signOrNothing(), digitSequence(), token(@EXPONENT)
+
+FLOAT_LITERAL = token(DOUBLE_LITERAL), caseInsensitiveChar(f)
diff --git a/pom.xml b/pom.xml
index fadaa0b..8999ad9 100644
--- a/pom.xml
+++ b/pom.xml
@@ -84,6 +84,7 @@
<module>asterix-metadata</module>
<module>asterix-dist</module>
<module>asterix-test-framework</module>
+ <module>asterix-maven-plugins</module>
</modules>
<repositories>