introduced new adm lexer into stabilization. Issue #215
git-svn-id: https://asterixdb.googlecode.com/svn/branches/asterix_stabilization@1205 eaa15691-b419-025a-1212-ee371bd00084
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/Asterix_ADM_Parser.md b/asterix-maven-plugins/lexer-generator-maven-plugin/Asterix_ADM_Parser.md
new file mode 100644
index 0000000..eeaffc9
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/Asterix_ADM_Parser.md
@@ -0,0 +1,53 @@
+The Asterix ADM Parser
+======================
+
+The ADM parser inside Asterix is composed of two components:
+
+* **The Parser** AdmTupleParser, which converts the ADM tokens into internal objects
+* **The Lexer** AdmLexer, which scans the ADM file and returns a list of ADM tokens
+
+These two classes belong to the package:
+
+ edu.uci.ics.asterix.runtime.operators.file
+
+The Parser is loaded through a factory (*AdmSchemafullRecordParserFactory*) by
+
+ edu.uci.ics.asterix.external.dataset.adapter.FileSystemBasedAdapter extends AbstractDatasourceAdapter
+
+
+How to add a new datatype
+-------------------------
+The ADM format allows two different kinds of datatypes:
+
+* primitive
+* with constructor
+
+A primitive datatype lets you write the actual value of the field without extra markup:
+
+ { name : "Diego", age : 23 }
+
+while a datatype with a constructor requires you to specify first the type of the value and then a string with the serialized value:
+
+ { center : point3d("P2.1,3,8.5") }
+
+In order to add a new datatype the steps are:
+
+1. Add the new token to the **Lexer**
+ * **if the datatype is primitive**, it is necessary to create a TOKEN able to recognize **the format of the value**
+ * **if the datatype has a constructor**, it is necessary to create **only** a TOKEN able to recognize **the name of the constructor**
+
+2. Change the **Parser** so that it correctly converts the new token into internal objects
+ * This requires **adding new cases to the switch-case statements** and introducing **a serializer/deserializer object** for that datatype.
+
+
+The Lexer
+----------
+To add a new datatype or change the token definitions, you have to change ONLY the file adm.grammar, located in
+ asterix-runtime/src/main/resources/adm.grammar
+The lexer is generated from that definition file during each Maven build.
+
+The Maven configuration is located in asterix-runtime/pom.xml
+
+
+> Author: Diego Giorgini - diegogiorgini@gmail.com
+> 6 December 2012
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/README.md b/asterix-maven-plugins/lexer-generator-maven-plugin/README.md
new file mode 100644
index 0000000..b3632e6
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/README.md
@@ -0,0 +1,111 @@
+Lexer Generator
+===============
+
+This tool automates the creation of Hand-Coded-Like Lexers.
+It was created to address the performance issues of other (more advanced) lexer generators like JavaCC that arise when you need to scan terabytes of data. In particular it is *~20x faster* than JavaCC and can typically parse data from a normal hard disk at *more than 70 MB/s*.
+
+
+Maven Plugin (to put inside pom.xml)
+-------------------------------------
+ <build>
+ <plugins>
+ <plugin>
+ <groupId>org.apache.maven.plugins</groupId>
+ <artifactId>maven-compiler-plugin</artifactId>
+ <version>2.0.2</version>
+ <configuration>
+ <source>1.6</source>
+ <target>1.6</target>
+ </configuration>
+ </plugin>
+ <plugin>
+ <groupId>edu.uci.ics.asterix</groupId>
+ <artifactId>lexer-generator-maven-plugin</artifactId>
+ <version>0.1-SNAPSHOT</version>
+ <configuration>
+ <grammarFile>src/main/java/edu/uci/ics/asterix/runtime/operators/file/adm/adm.grammar</grammarFile>
+ <outputDir>${project.build.directory}/generated-sources</outputDir>
+ </configuration>
+ <executions>
+ <execution>
+ <id>generate-lexer</id>
+ <phase>generate-sources</phase>
+ <goals>
+ <goal>generate-lexer</goal>
+ </goals>
+ </execution>
+ </executions>
+ </plugin>
+ </plugins>
+ </build>
+
+
+Command line
+-------------
+    LexerGenerator
+    usage: java LexerGenerator <configuration file>
+
+
+What Hand-Coded-Like means and why it is so fast
+------------------------------------------------
+Most lexers use a Finite State Machine encoded in a data structure called a [State Transition Table](http://en.wikipedia.org/wiki/State_transition_table).
+While elegant and practical, this approach requires some extra checks and operations to deal with the data structure at runtime. A different approach consists in encoding the State Machine as actual code, so that the operations performed are limited to the minimum needed to parse our grammar.
+A common problem with this kind of hand-coded lexer is that it is almost impossible to maintain and change. This is the reason for this Lexer Generator, which produces a Hand-Coded-Like lexer starting from a grammar specification.
+
+Another big difference from most lexer generators (especially the ones for Java) is that, since it is optimized for performance, we **don't return objects** and we **use as few objects as possible internally**.
+This is actually the main reason for the ~20x speedup compared with JavaCC.
+
+
+Configuration File
+------------------
+It is a simple *key: value* configuration file plus the *specification of your grammar*.
+The configuration keys are listed below:
+
+ # LEXER GENERATOR configuration file
+ # ---------------------------------------
+ # Place *first* the generic configuration
+ # then list your grammar.
+
+ PACKAGE: edu.uci.ics.asterix.admfast.parser
+ LEXER_NAME: AdmLexer
+ OUTPUT_DIR: output/
+
+
+Specify The Grammar
+-------------------
+Your grammar has to be listed in the configuration file after the *TOKENS:* keyword.
+
+ TOKENS:
+
+ BOOLEAN_LIT = string(boolean)
+ COMMA = char(\,)
+ COLON = char(:)
+ STRING_LITERAL = char("), anythingUntil(")
+ INT_LITERAL = signOrNothing(), digitSequence()
+ INT8_LITERAL = token(INT_LITERAL), string(i8)
+ @EXPONENT = caseInsensitiveChar(e), signOrNothing(), digitSequence()
+ DOUBLE_LITERAL = signOrNothing(), digitSequence(), char(.), digitSequence(), token(@EXPONENT)
+ DOUBLE_LITERAL = signOrNothing(), digitSequence(), token(@EXPONENT)
+
+Each token is composed of a **name** and a sequence of **rules**.
+Each rule is written in the format: **constructor(parameter)**.
+The list of available rules is defined in *NodeChainFactory.java*.
+
+You can give a token more than one sequence of rules by adding another line that repeats the token name.
+
+You can reuse the rules of a token inside another one with the special rule: **token(** *TOKEN_NAME* **)**
+
+Lastly, you can define *auxiliary* token definitions that will not be encoded in the final lexer (but that can be useful inside other token definitions) just by **starting the token name with @**.
+
+**Attention:** take care not to write rules that, once merged into the state machine, would lead to a *conflict between transitions*, such as a transition for a generic digit and one for the digit 0 from the same node.
+
+The result: MyLexer
+-------------------
+The result of the execution of the LexerGenerator is the creation of the Lexer inside the directory *components*.
+The lexer is extremely simple and minimal and can be used like an Iterator:
+
+    MyLexer myLexer = new MyLexer(new FileReader(file));
+    for (int token = myLexer.next(); token != MyLexer.TOKEN_EOF; token = myLexer.next()) {
+        System.out.println(MyLexer.tokenKindToString(token));
+    }
+
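Note on the README section "What Hand-Coded-Like means": the idea of encoding the state machine as code rather than as a transition table can be sketched for a single token. The class below is only an illustration written for this note, not output of the generator. It hard-codes the rule `INT_LITERAL = signOrNothing(), digitSequence()` from the README grammar; the class name, the constant values and the `TOKEN_ERROR` convention are assumptions.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/** Hypothetical hand-coded recognizer for INT_LITERAL = signOrNothing(), digitSequence(). */
final class HandCodedIntLexer {
    static final int TOKEN_EOF = 0;          // assumed value, for illustration only
    static final int TOKEN_INT_LITERAL = 2;  // assumed value, for illustration only
    static final int TOKEN_ERROR = -1;       // assumed error convention

    private final Reader in;
    private int current; // look-ahead character, -1 at end of input

    HandCodedIntLexer(Reader in) throws IOException {
        this.in = in;
        this.current = in.read();
    }

    /** Returns the kind of the next token as an int; no token objects are allocated. */
    int next() throws IOException {
        // skip whitespace between tokens
        while (current == ' ' || current == '\t' || current == '\r' || current == '\n') {
            current = in.read();
        }
        if (current == -1) {
            return TOKEN_EOF;
        }
        // signOrNothing(): consume an optional sign
        if (current == '+' || current == '-') {
            current = in.read();
        }
        // digitSequence(): at least one digit is required
        if (current < '0' || current > '9') {
            current = in.read(); // drop the offending character so the caller makes progress
            return TOKEN_ERROR;
        }
        while (current >= '0' && current <= '9') {
            current = in.read();
        }
        return TOKEN_INT_LITERAL;
    }
}
```

Each state of the machine is a position in the code, so a transition costs a comparison and a branch instead of a table lookup, and the caller only ever receives `int` token kinds.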
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/pom.xml b/asterix-maven-plugins/lexer-generator-maven-plugin/pom.xml
new file mode 100644
index 0000000..524727f
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/pom.xml
@@ -0,0 +1,36 @@
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+ <modelVersion>4.0.0</modelVersion>
+ <groupId>edu.uci.ics.asterix</groupId>
+ <artifactId>lexer-generator-maven-plugin</artifactId>
+ <version>0.1</version>
+ <packaging>maven-plugin</packaging>
+ <name>lexer-generator-maven-plugin</name>
+
+ <build>
+ <plugins>
+ <plugin>
+ <groupId>org.apache.maven.plugins</groupId>
+ <artifactId>maven-compiler-plugin</artifactId>
+ <version>2.0.2</version>
+ <configuration>
+ <source>1.6</source>
+ <target>1.6</target>
+ </configuration>
+ </plugin>
+ </plugins>
+ </build>
+
+ <dependencies>
+ <dependency>
+ <groupId>junit</groupId>
+ <artifactId>junit</artifactId>
+ <version>4.8.1</version>
+ <scope>test</scope>
+ </dependency>
+ <dependency>
+ <groupId>org.apache.maven</groupId>
+ <artifactId>maven-plugin-api</artifactId>
+ <version>2.0.2</version>
+ </dependency>
+ </dependencies>
+</project>
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/LexerGenerator.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/LexerGenerator.java
new file mode 100644
index 0000000..512f3d0
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/LexerGenerator.java
@@ -0,0 +1,202 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator;
+
+import java.io.BufferedReader;
+import java.io.FileNotFoundException;
+import java.io.FileReader;
+import java.io.Reader;
+import java.io.InputStream;
+import java.io.InputStreamReader;
+import java.io.File;
+import java.io.FileWriter;
+import java.io.IOException;
+import java.util.HashMap;
+import java.util.LinkedHashMap;
+import java.util.Map.Entry;
+import java.util.Set;
+import org.apache.maven.plugin.logging.Log;
+
+public class LexerGenerator {
+ private LinkedHashMap<String, Token> tokens = new LinkedHashMap<String, Token>();
+ private Log logger;
+
+ public LexerGenerator() {
+ }
+
+ public LexerGenerator(Log logger) {
+ this.logger = logger;
+ }
+
+ private void log(String info) {
+ if (logger == null) {
+ System.out.println(info);
+ } else {
+ logger.info(info);
+ }
+ }
+
+ public void addToken(String rule) throws Exception {
+ Token newToken;
+ if (rule.charAt(0) == '@') {
+ newToken = new TokenAux(rule, tokens);
+ } else {
+ newToken = new Token(rule, tokens);
+ }
+ Token existingToken = tokens.get(newToken.getName());
+ if (existingToken == null) {
+ tokens.put(newToken.getName(), newToken);
+ } else {
+ existingToken.merge(newToken);
+ }
+ }
+
+ public void generateLexer(HashMap<String, String> config) throws Exception {
+ LexerNode main = this.compile();
+ config.put("TOKENS_CONSTANTS", this.tokensConstants());
+ config.put("TOKENS_IMAGES", this.tokensImages());
+ config.put("LEXER_LOGIC", main.toJava());
+ config.put("LEXER_AUXFUNCTIONS", replaceParams(this.auxiliaryFunctions(main), config));
+ String[] files = { "/Lexer.java", "/LexerException.java" };
+ String outputDir = config.get("OUTPUT_DIR");
+ (new File(outputDir)).mkdirs();
+ for (String file : files) {
+ String input = readFile(LexerGenerator.class.getResourceAsStream(file));
+ String fileOut = file.replace("Lexer", config.get("LEXER_NAME"));
+ String output = replaceParams(input, config);
+ log("Generating: " + file + "\t>>\t" + fileOut);
+ FileWriter out = new FileWriter((new File(outputDir, fileOut)).toString());
+ out.write(output);
+ out.close();
+ log(" [done]\n");
+ }
+ }
+
+ public String printParsedGrammar() {
+ StringBuilder result = new StringBuilder();
+ for (Token token : tokens.values()) {
+ result.append(token.toString()).append("\n");
+ }
+ return result.toString();
+ }
+
+ private LexerNode compile() throws Exception {
+ LexerNode main = new LexerNode();
+ for (Token token : tokens.values()) {
+ if (token instanceof TokenAux)
+ continue;
+ main.merge(token.getNode());
+ }
+ return main;
+ }
+
+ private String tokensImages() {
+ StringBuilder result = new StringBuilder();
+ Set<String> uniqueTokens = tokens.keySet();
+ for (String token : uniqueTokens) {
+ result.append(", \"<").append(token).append(">\" ");
+ }
+ return result.toString();
+ }
+
+ private String tokensConstants() {
+ StringBuilder result = new StringBuilder();
+ Set<String> uniqueTokens = tokens.keySet();
+ int i = 2;
+ for (String token : uniqueTokens) {
+ result.append(", TOKEN_").append(token).append("=").append(i).append(" ");
+ i++;
+ }
+ return result.toString();
+ }
+
+ private String auxiliaryFunctions(LexerNode main) {
+ StringBuilder result = new StringBuilder();
+ Set<String> functions = main.neededAuxFunctions();
+ for (String token : functions) {
+ result.append("private int parse_" + token
+ + "(char currentChar) throws IOException, [LEXER_NAME]Exception{\n");
+ result.append(tokens.get(token).getNode().toJavaAuxFunction());
+ result.append("\n}\n\n");
+ }
+ return result.toString();
+ }
+
+ private static String readFile(Reader input) throws FileNotFoundException, IOException {
+ StringBuffer fileData = new StringBuffer(1000);
+ BufferedReader reader = new BufferedReader(input);
+ char[] buf = new char[1024];
+ int numRead = 0;
+ while ((numRead = reader.read(buf)) != -1) {
+ String readData = String.valueOf(buf, 0, numRead);
+ fileData.append(readData);
+ buf = new char[1024];
+ }
+ reader.close();
+ return fileData.toString();
+ }
+
+ private static String readFile(InputStream input) throws FileNotFoundException, IOException {
+ if (input == null) {
+ throw new FileNotFoundException();
+ }
+ return readFile(new InputStreamReader(input));
+ }
+
+ private static String readFile(String fileName) throws FileNotFoundException, IOException {
+ return readFile(new FileReader(fileName));
+ }
+
+ private static String replaceParams(String input, HashMap<String, String> config) {
+ for (Entry<String, String> param : config.entrySet()) {
+ String key = "\\[" + param.getKey() + "\\]";
+ String value = param.getValue();
+ input = input.replaceAll(key, value);
+ }
+ return input;
+ }
+
+ public static void main(String args[]) throws Exception {
+ if (args.length == 0 || "--help".equals(args[0]) || "-h".equals(args[0])) {
+ System.out.println("LexerGenerator\nusage: java LexerGenerator <configuration file>");
+ return;
+ }
+
+ LexerGenerator lexer = new LexerGenerator();
+ HashMap<String, String> config = new HashMap<String, String>();
+
+ System.out.println("Config file:\t" + args[0]);
+ String input = readFile(args[0]);
+ boolean tokens = false;
+ for (String line : input.split("\r?\n")) {
+ line = line.trim();
+ if (line.length() == 0 || line.charAt(0) == '#')
+ continue;
+ if (tokens == false && !line.equals("TOKENS:")) {
+ config.put(line.split("\\s*:\\s*", 2)[0], line.split("\\s*:\\s*", 2)[1]);
+ } else if (line.equals("TOKENS:")) {
+ tokens = true;
+ } else {
+ lexer.addToken(line);
+ }
+ }
+
+ String parsedGrammar = lexer.printParsedGrammar();
+ lexer.generateLexer(config);
+ System.out.println("\nGenerated grammar:");
+ System.out.println(parsedGrammar);
+ }
+
+}
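Note on the configuration-file loop in `main` above (the same loop appears in `LexerGeneratorMojo.execute`): the format it accepts is `KEY: value` lines, then a `TOKENS:` marker, then one token definition per line, with blank lines and `#` comments skipped. A minimal standalone sketch of that reading step follows. `GrammarConfigSketch` is a hypothetical name that is not part of this commit, and the explicit error for a malformed `KEY: value` line is an addition of the sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that separates configuration keys from token definitions. */
final class GrammarConfigSketch {
    final Map<String, String> config = new LinkedHashMap<String, String>();
    final List<String> tokenLines = new ArrayList<String>();

    static GrammarConfigSketch parse(String input) {
        GrammarConfigSketch result = new GrammarConfigSketch();
        boolean inTokens = false;
        for (String line : input.split("\r?\n")) {
            line = line.trim();
            // blank lines and '#' comments are ignored everywhere
            if (line.isEmpty() || line.charAt(0) == '#') {
                continue;
            }
            if (line.equals("TOKENS:")) {
                inTokens = true;            // everything after this marker is grammar
            } else if (inTokens) {
                result.tokenLines.add(line);
            } else {
                // limit 2 keeps any further ':' inside the value
                String[] keyValue = line.split("\\s*:\\s*", 2);
                if (keyValue.length != 2) {
                    throw new IllegalArgumentException("Expected 'KEY: value' but found: " + line);
                }
                result.config.put(keyValue[0], keyValue[1]);
            }
        }
        return result;
    }
}
```

Sharing one helper of this kind between the command-line entry point and the Mojo would remove the duplicated loop, though that refactoring is outside the scope of this commit.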
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/LexerGeneratorMojo.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/LexerGeneratorMojo.java
new file mode 100644
index 0000000..11ee1d5
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/LexerGeneratorMojo.java
@@ -0,0 +1,92 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator;
+
+import edu.uci.ics.asterix.lexergenerator.LexerGenerator;
+import java.io.BufferedReader;
+import java.io.FileNotFoundException;
+import java.io.FileReader;
+import java.io.IOException;
+import java.util.HashMap;
+import org.apache.maven.plugin.AbstractMojo;
+import org.apache.maven.plugin.MojoExecutionException;
+
+import java.io.File;
+
+/**
+ * @goal generate-lexer
+ * @phase generate-sources
+ * @requiresDependencyResolution compile
+ */
+public class LexerGeneratorMojo extends AbstractMojo {
+ /**
+ * parameter injected from pom.xml
+ *
+ * @parameter
+ * @required
+ */
+ private File grammarFile;
+
+ /**
+ * parameter injected from pom.xml
+ *
+ * @parameter
+ * @required
+ */
+ private File outputDir;
+
+ public void execute() throws MojoExecutionException {
+ LexerGenerator lexer = new LexerGenerator(getLog());
+ HashMap<String, String> config = new HashMap<String, String>();
+ getLog().info("--- Lexer Generator Maven Plugin - started with grammarFile: " + grammarFile.toString());
+ try {
+ String input = readFile(grammarFile);
+ config.put("OUTPUT_DIR", outputDir.toString());
+ boolean tokens = false;
+ for (String line : input.split("\r?\n")) {
+ line = line.trim();
+ if (line.length() == 0 || line.charAt(0) == '#')
+ continue;
+ if (tokens == false && !line.equals("TOKENS:")) {
+ config.put(line.split("\\s*:\\s*", 2)[0], line.split("\\s*:\\s*", 2)[1]);
+ } else if (line.equals("TOKENS:")) {
+ tokens = true;
+ } else {
+ lexer.addToken(line);
+ }
+ }
+ lexer.generateLexer(config);
+ } catch (Throwable e) {
+ throw new MojoExecutionException("Error while generating lexer", e);
+ }
+ String parsedGrammar = lexer.printParsedGrammar();
+ getLog().info("--- Generated grammar:\n" + parsedGrammar);
+ }
+
+ private String readFile(File file) throws FileNotFoundException, IOException {
+ StringBuffer fileData = new StringBuffer(1000);
+ BufferedReader reader = new BufferedReader(new FileReader(file));
+ char[] buf = new char[1024];
+ int numRead = 0;
+ while ((numRead = reader.read(buf)) != -1) {
+ String readData = String.valueOf(buf, 0, numRead);
+ fileData.append(readData);
+ buf = new char[1024];
+ }
+ reader.close();
+ return fileData.toString();
+ }
+
+}
\ No newline at end of file
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/LexerNode.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/LexerNode.java
new file mode 100644
index 0000000..7b8d059
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/LexerNode.java
@@ -0,0 +1,243 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator;
+
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.LinkedHashMap;
+import java.util.Map;
+import java.util.Set;
+
+import edu.uci.ics.asterix.lexergenerator.rules.*;
+
+public class LexerNode {
+ private static String TOKEN_PREFIX = "TOKEN_";
+ private LinkedHashMap<Rule, LexerNode> actions = new LinkedHashMap<Rule, LexerNode>();
+ private String finalTokenName;
+ private Set<String> ongoingParsing = new HashSet<String>();
+
+ public LexerNode clone() {
+ LexerNode node = new LexerNode();
+ node.finalTokenName = this.finalTokenName;
+ for (Map.Entry<Rule, LexerNode> entry : this.actions.entrySet()) {
+ node.actions.put(entry.getKey().clone(), entry.getValue().clone());
+ }
+ for (String ongoing : this.ongoingParsing) {
+ node.ongoingParsing.add(ongoing);
+ }
+ return node;
+ }
+
+ public void add(Rule newRule) {
+ if (actions.get(newRule) == null) {
+ actions.put(newRule, new LexerNode());
+ }
+ }
+
+ public void append(Rule newRule) {
+ if (actions.size() == 0) {
+ add(newRule);
+ } else {
+ for (Map.Entry<Rule, LexerNode> action : actions.entrySet()) {
+ action.getValue().append(newRule);
+ }
+ if (actions.containsKey(new RuleEpsilon())) {
+ actions.remove(new RuleEpsilon());
+ add(newRule);
+ }
+ }
+ }
+
+ public void merge(LexerNode newNode) throws Exception {
+ for (Map.Entry<Rule, LexerNode> action : newNode.actions.entrySet()) {
+ if (this.actions.get(action.getKey()) == null) {
+ this.actions.put(action.getKey(), action.getValue());
+ } else {
+ this.actions.get(action.getKey()).merge(action.getValue());
+ }
+ }
+ if (newNode.finalTokenName != null) {
+ if (this.finalTokenName == null) {
+ this.finalTokenName = newNode.finalTokenName;
+ } else {
+ throw new Exception("Rule conflict between: " + this.finalTokenName + " and " + newNode.finalTokenName);
+ }
+ }
+ for (String ongoing : newNode.ongoingParsing) {
+ this.ongoingParsing.add(ongoing);
+ }
+ }
+
+ public void append(LexerNode node) throws Exception {
+ for (Map.Entry<Rule, LexerNode> action : actions.entrySet()) {
+ if (action.getKey() instanceof RuleEpsilon)
+ continue;
+ action.getValue().append(node);
+ }
+ if (actions.containsKey(new RuleEpsilon())) {
+ actions.remove(new RuleEpsilon());
+ merge(node.clone());
+ }
+ if (actions.size() == 0 || finalTokenName != null) {
+ finalTokenName = null;
+ merge(node.clone());
+ }
+ }
+
+ public void appendTokenName(String name) {
+ if (actions.size() == 0) {
+ this.finalTokenName = name;
+ } else {
+ ongoingParsing.add(TOKEN_PREFIX + name);
+ for (Map.Entry<Rule, LexerNode> action : actions.entrySet()) {
+ action.getValue().appendTokenName(name);
+ }
+ }
+ }
+
+ public LexerNode removeTokensName() {
+ this.finalTokenName = null;
+ this.ongoingParsing.clear();
+ for (Map.Entry<Rule, LexerNode> action : actions.entrySet()) {
+ action.getValue().removeTokensName();
+ }
+ return this;
+ }
+
+ public String toString() {
+ StringBuilder result = new StringBuilder();
+ if (finalTokenName != null)
+ result.append("! ");
+ if (actions.size() == 1)
+ result.append(actions.keySet().toArray()[0].toString() + actions.values().toArray()[0].toString());
+ if (actions.size() > 1) {
+ result.append(" ( ");
+ for (Map.Entry<Rule, LexerNode> action : actions.entrySet()) {
+ if (result.length() != 3) {
+ result.append(" || ");
+ }
+ result.append(action.getKey().toString());
+ result.append(action.getValue().toString());
+ }
+ result.append(" ) ");
+ }
+ return result.toString();
+ }
+
+ public String toJava() {
+ StringBuffer result = new StringBuffer();
+ if (numberOfRuleChar() > 2) {
+ result.append(toJavaSingleCharRules());
+ result.append(toJavaComplexRules(false));
+ } else {
+ result.append(toJavaComplexRules(true));
+ }
+ if (this.finalTokenName != null) {
+ result.append("return " + TOKEN_PREFIX + finalTokenName + ";\n");
+ } else if (ongoingParsing != null) {
+ String ongoingParsingArgs = collectionJoin(ongoingParsing, ',');
+ result.append("return parseError(" + ongoingParsingArgs + ");\n");
+ }
+ return result.toString();
+ }
+
+ private int numberOfRuleChar() {
+ int singleCharRules = 0;
+ for (Map.Entry<Rule, LexerNode> action : actions.entrySet()) {
+ if (action.getKey() instanceof RuleChar)
+ singleCharRules++;
+ }
+ return singleCharRules;
+ }
+
+ private String toJavaSingleCharRules() {
+ StringBuffer result = new StringBuffer();
+ result.append("switch(currentChar){\n");
+ for (Map.Entry<Rule, LexerNode> action : actions.entrySet()) {
+ if (action.getKey() instanceof RuleChar) {
+ RuleChar rule = (RuleChar) action.getKey();
+ result.append("case '" + rule.expectedChar() + "':\n");
+ result.append(rule.javaAction()).append("\n");
+ result.append(action.getValue().toJava());
+ }
+ }
+ result.append("}\n");
+ return result.toString();
+ }
+
+ private String toJavaComplexRules(boolean all) {
+ StringBuffer result = new StringBuffer();
+ for (Map.Entry<Rule, LexerNode> action : actions.entrySet()) {
+ if (!all && action.getKey() instanceof RuleChar)
+ continue;
+ if (action.getKey() instanceof RuleEpsilon)
+ continue;
+ String act = action.getKey().javaAction();
+ if (act.length() > 0) {
+ act = "\n" + act;
+ }
+ result.append(action.getKey().javaMatch(act + "\n" + action.getValue().toJava()));
+ }
+ return result.toString();
+ }
+
+ public void expandFirstAction(LinkedHashMap<String, Token> tokens) throws Exception {
+ for (Map.Entry<Rule, LexerNode> action : new LinkedHashMap<Rule, LexerNode>(actions).entrySet()) {
+ Rule first = action.getKey();
+ if (first instanceof RulePartial) {
+ if (tokens.get(((RulePartial) first).getPartial()) == null) {
+ throw new Exception("Cannot find a token used as part of another definition, missing token: "
+ + ((RulePartial) first).getPartial());
+ }
+ actions.remove(first);
+ LexerNode node = tokens.get(((RulePartial) first).getPartial()).getNode().clone();
+ merge(node);
+ }
+ }
+ }
+
+ public Set<String> neededAuxFunctions() {
+ HashSet<String> partials = new HashSet<String>();
+ for (Map.Entry<Rule, LexerNode> action : actions.entrySet()) {
+ Rule rule = action.getKey();
+ if (rule instanceof RulePartial) {
+ partials.add(((RulePartial) rule).getPartial());
+ }
+ partials.addAll(action.getValue().neededAuxFunctions());
+ }
+ return partials;
+ }
+
+ public String toJavaAuxFunction() {
+ String oldFinalTokenName = finalTokenName;
+ if (oldFinalTokenName == null)
+ finalTokenName = "AUX_NOT_FOUND";
+ String result = toJava();
+ finalTokenName = oldFinalTokenName;
+ return result;
+ }
+
+ private String collectionJoin(Collection<String> collection, char c) {
+ StringBuilder ongoingParsingArgs = new StringBuilder();
+ for (String token : collection) {
+ ongoingParsingArgs.append(token);
+ ongoingParsingArgs.append(c);
+ }
+ if (collection.size() > 0) {
+ ongoingParsingArgs.deleteCharAt(ongoingParsingArgs.length() - 1);
+ }
+ return ongoingParsingArgs.toString();
+ }
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/NodeChainFactory.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/NodeChainFactory.java
new file mode 100644
index 0000000..941f822
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/NodeChainFactory.java
@@ -0,0 +1,43 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator;
+
+import java.util.HashMap;
+
+import edu.uci.ics.asterix.lexergenerator.rulegenerators.*;
+
+public class NodeChainFactory {
+ static private HashMap<String, RuleGenerator> ruleGenerators = new HashMap<String, RuleGenerator>();
+
+ static {
+ ruleGenerators.put("char", new RuleGeneratorChar());
+ ruleGenerators.put("string", new RuleGeneratorString());
+ ruleGenerators.put("anythingUntil", new RuleGeneratorAnythingUntil());
+ ruleGenerators.put("signOrNothing", new RuleGeneratorSignOrNothing());
+ ruleGenerators.put("sign", new RuleGeneratorSign());
+ ruleGenerators.put("digitSequence", new RuleGeneratorDigitSequence());
+ ruleGenerators.put("caseInsensitiveChar", new RuleGeneratorCaseInsensitiveChar());
+ ruleGenerators.put("charOrNothing", new RuleGeneratorCharOrNothing());
+ ruleGenerators.put("token", new RuleGeneratorToken());
+ ruleGenerators.put("nothing", new RuleGeneratorNothing());
+ }
+
+ public static LexerNode create(String generator, String constructor) throws Exception {
+ constructor = constructor.replace("@", "aux_");
+ if (ruleGenerators.get(generator) == null)
+ throw new Exception("Rule Generator not found for '" + generator + "'");
+ return ruleGenerators.get(generator).generate(constructor);
+ }
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/Token.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/Token.java
new file mode 100644
index 0000000..bb122c2
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/Token.java
@@ -0,0 +1,70 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator;
+
+import java.util.LinkedHashMap;
+import java.util.regex.Matcher;
+import java.util.regex.Pattern;
+
+public class Token {
+ private String userDescription;
+ private String name;
+ private LexerNode node;
+
+ public Token(String str, LinkedHashMap<String, Token> tokens) throws Exception {
+ userDescription = str;
+ node = new LexerNode();
+ parse(userDescription, tokens);
+ }
+
+ public String getName() {
+ return name;
+ }
+
+ public LexerNode getNode() {
+ return node;
+ }
+
+ public String toString() {
+ return this.name + " => " + getNode().toString();
+ }
+
+ public void merge(Token newToken) throws Exception {
+ node.merge(newToken.getNode());
+ }
+
+ private void parse(String str, LinkedHashMap<String, Token> tokens) throws Exception {
+ Pattern p = Pattern.compile("^(@?\\w+)\\s*=\\s*(.+)");
+ Matcher m = p.matcher(str);
+ if (!m.find())
+ throw new Exception("Token definition not correct: " + str);
+ this.name = m.group(1).replaceAll("@", "aux_");
+ String[] textRules = m.group(2).split("(?<!\\\\),\\s*");
+ for (String textRule : textRules) {
+ Pattern pRule = Pattern.compile("^(\\w+)(\\((.*)\\))?");
+ Matcher mRule = pRule.matcher(textRule);
+ if (!mRule.find()) throw new Exception("Error in rule format: " + textRule);
+ String generator = mRule.group(1);
+ String constructor = mRule.group(3);
+ if (constructor == null)
+ throw new Exception("Error in rule format: " + "\n " + str + " = " + generator + " : " + constructor);
+ constructor = constructor.replace("\\", "");
+ node.append(NodeChainFactory.create(generator, constructor));
+ node.expandFirstAction(tokens);
+ }
+ node.appendTokenName(name);
+ }
+
+}
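Reviewer note on Token.parse: the sketch below replays the two regular expressions and the escaped-comma split from the method above on sample definitions, without building any LexerNode. It is only an illustration. The class name TokenDefinitionDemo, the helper describe, and the sample definitions are hypothetical, and the generator names in the samples are assumed from the RuleGenerator* class names rather than taken from NodeChainFactory.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenDefinitionDemo {
    // Same patterns as Token.parse: "NAME = rule, rule, ..." and "generator(argument)".
    private static final Pattern DEFINITION = Pattern.compile("^(@?\\w+)\\s*=\\s*(.+)");
    private static final Pattern RULE = Pattern.compile("^(\\w+)(\\((.*)\\))?");

    /** Returns the token name followed by one "generator:argument" entry per rule. */
    static List<String> describe(String definition) {
        Matcher m = DEFINITION.matcher(definition);
        if (!m.find()) {
            throw new IllegalArgumentException("Token definition not correct: " + definition);
        }
        List<String> parts = new ArrayList<>();
        // A leading '@' marks an auxiliary token and becomes the "aux_" prefix.
        parts.add(m.group(1).replaceAll("@", "aux_"));
        // Split on commas that are not preceded by a backslash.
        for (String textRule : m.group(2).split("(?<!\\\\),\\s*")) {
            Matcher r = RULE.matcher(textRule);
            if (!r.find() || r.group(3) == null) {
                throw new IllegalArgumentException("Error in rule format: " + textRule);
            }
            // Backslashes are dropped from the argument, so "\," becomes ",".
            parts.add(r.group(1) + ":" + r.group(3).replace("\\", ""));
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(describe("@EXP = caseInsensitiveChar(e), signOrNothing(), digitSequence()"));
        System.out.println(describe("COMMA = char(\\,)"));
    }
}
```

The second sample shows why the split uses a negative lookbehind: the escaped comma stays inside the argument of char(...) and is unescaped only afterwards.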
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/TokenAux.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/TokenAux.java
new file mode 100644
index 0000000..a9c7ffc
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/TokenAux.java
@@ -0,0 +1,25 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator;
+
+import java.util.LinkedHashMap;
+
+public class TokenAux extends Token {
+
+ public TokenAux(String str, LinkedHashMap<String, Token> tokens) throws Exception {
+ super(str, tokens);
+ }
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGenerator.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGenerator.java
new file mode 100644
index 0000000..3733746
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGenerator.java
@@ -0,0 +1,21 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rulegenerators;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+
+public interface RuleGenerator {
+ public LexerNode generate(String input) throws Exception;
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorAnythingUntil.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorAnythingUntil.java
new file mode 100644
index 0000000..b14eb3e
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorAnythingUntil.java
@@ -0,0 +1,31 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rulegenerators;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+import edu.uci.ics.asterix.lexergenerator.rules.RuleAnythingUntil;
+
+public class RuleGeneratorAnythingUntil implements RuleGenerator {
+
+ @Override
+ public LexerNode generate(String input) throws Exception {
+ LexerNode result = new LexerNode();
+ if (input == null || input.length() != 1)
+ throw new Exception("Wrong rule format for generator anythingUntil: " + input);
+ result.append(new RuleAnythingUntil(input.charAt(0)));
+ return result;
+ }
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorCaseInsensitiveChar.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorCaseInsensitiveChar.java
new file mode 100644
index 0000000..b789f59
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorCaseInsensitiveChar.java
@@ -0,0 +1,34 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rulegenerators;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+import edu.uci.ics.asterix.lexergenerator.rules.RuleChar;
+
+public class RuleGeneratorCaseInsensitiveChar implements RuleGenerator {
+
+ @Override
+ public LexerNode generate(String input) throws Exception {
+ LexerNode result = new LexerNode();
+ if (input == null || input.length() != 1)
+ throw new Exception("Wrong rule format for generator caseInsensitiveChar: " + input);
+ char cl = Character.toLowerCase(input.charAt(0));
+ char cu = Character.toUpperCase(cl);
+ result.add(new RuleChar(cl));
+ result.add(new RuleChar(cu));
+ return result;
+ }
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorChar.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorChar.java
new file mode 100644
index 0000000..0b830e6
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorChar.java
@@ -0,0 +1,31 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rulegenerators;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+import edu.uci.ics.asterix.lexergenerator.rules.RuleChar;
+
+public class RuleGeneratorChar implements RuleGenerator {
+
+ @Override
+ public LexerNode generate(String input) throws Exception {
+ LexerNode result = new LexerNode();
+ if (input == null || input.length() != 1)
+ throw new Exception("Wrong rule format for generator char: " + input);
+ result.append(new RuleChar(input.charAt(0)));
+ return result;
+ }
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorCharOrNothing.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorCharOrNothing.java
new file mode 100644
index 0000000..d01ff7d
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorCharOrNothing.java
@@ -0,0 +1,33 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rulegenerators;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+import edu.uci.ics.asterix.lexergenerator.rules.RuleChar;
+import edu.uci.ics.asterix.lexergenerator.rules.RuleEpsilon;
+
+public class RuleGeneratorCharOrNothing implements RuleGenerator {
+
+ @Override
+ public LexerNode generate(String input) throws Exception {
+ LexerNode result = new LexerNode();
+ if (input == null || input.length() != 1)
+ throw new Exception("Wrong rule format for generator charOrNothing: " + input);
+ result.add(new RuleChar(input.charAt(0)));
+ result.add(new RuleEpsilon());
+ return result;
+ }
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorDigitSequence.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorDigitSequence.java
new file mode 100644
index 0000000..d067ee7
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorDigitSequence.java
@@ -0,0 +1,29 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rulegenerators;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+import edu.uci.ics.asterix.lexergenerator.rules.RuleDigitSequence;
+
+public class RuleGeneratorDigitSequence implements RuleGenerator {
+
+ @Override
+ public LexerNode generate(String input) throws Exception {
+ LexerNode result = new LexerNode();
+ result.append(new RuleDigitSequence());
+ return result;
+ }
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorNothing.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorNothing.java
new file mode 100644
index 0000000..fec06a1
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorNothing.java
@@ -0,0 +1,29 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rulegenerators;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+import edu.uci.ics.asterix.lexergenerator.rules.RuleEpsilon;
+
+public class RuleGeneratorNothing implements RuleGenerator {
+
+ @Override
+ public LexerNode generate(String input) throws Exception {
+ LexerNode node = new LexerNode();
+ node.add(new RuleEpsilon());
+ return node;
+ }
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorSign.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorSign.java
new file mode 100644
index 0000000..0160f09
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorSign.java
@@ -0,0 +1,30 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rulegenerators;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+import edu.uci.ics.asterix.lexergenerator.rules.RuleChar;
+
+public class RuleGeneratorSign implements RuleGenerator {
+
+ @Override
+ public LexerNode generate(String input) throws Exception {
+ LexerNode result = new LexerNode();
+ result.add(new RuleChar('+'));
+ result.add(new RuleChar('-'));
+ return result;
+ }
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorSignOrNothing.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorSignOrNothing.java
new file mode 100644
index 0000000..7c4297d
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorSignOrNothing.java
@@ -0,0 +1,32 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rulegenerators;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+import edu.uci.ics.asterix.lexergenerator.rules.RuleChar;
+import edu.uci.ics.asterix.lexergenerator.rules.RuleEpsilon;
+
+public class RuleGeneratorSignOrNothing implements RuleGenerator {
+
+ @Override
+ public LexerNode generate(String input) throws Exception {
+ LexerNode result = new LexerNode();
+ result.add(new RuleChar('+'));
+ result.add(new RuleChar('-'));
+ result.add(new RuleEpsilon());
+ return result;
+ }
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorString.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorString.java
new file mode 100644
index 0000000..eb0471b
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorString.java
@@ -0,0 +1,33 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rulegenerators;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+import edu.uci.ics.asterix.lexergenerator.rules.RuleChar;
+
+public class RuleGeneratorString implements RuleGenerator {
+
+ @Override
+ public LexerNode generate(String input) {
+ LexerNode result = new LexerNode();
+ if (input == null)
+ return result;
+ for (int i = 0; i < input.length(); i++) {
+ result.append(new RuleChar(input.charAt(i)));
+ }
+ return result;
+ }
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorToken.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorToken.java
new file mode 100644
index 0000000..b4c23d8
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rulegenerators/RuleGeneratorToken.java
@@ -0,0 +1,31 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rulegenerators;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+import edu.uci.ics.asterix.lexergenerator.rules.RulePartial;
+
+public class RuleGeneratorToken implements RuleGenerator {
+
+ @Override
+ public LexerNode generate(String input) throws Exception {
+ if (input == null || input.length() == 0)
+ throw new Exception("Wrong rule format for generator token : " + input);
+ LexerNode node = new LexerNode();
+ node.add(new RulePartial(input));
+ return node;
+ }
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/Rule.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/Rule.java
new file mode 100644
index 0000000..01cd1d5
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/Rule.java
@@ -0,0 +1,29 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rules;
+
+public interface Rule {
+ public int hashCode();
+
+ public boolean equals(Object o);
+
+ public String toString();
+
+ public String javaAction();
+
+ public String javaMatch(String action);
+
+ public Rule clone();
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RuleAnythingUntil.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RuleAnythingUntil.java
new file mode 100644
index 0000000..8d45835
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RuleAnythingUntil.java
@@ -0,0 +1,68 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rules;
+
+public class RuleAnythingUntil implements Rule {
+
+ private char expected;
+
+ public RuleAnythingUntil clone() {
+ return new RuleAnythingUntil(expected);
+ }
+
+ public RuleAnythingUntil(char expected) {
+ this.expected = expected;
+ }
+
+ @Override
+ public String toString() {
+ return " .* " + String.valueOf(expected);
+ }
+
+ @Override
+ public int hashCode() {
+ return 10 * (int) expected;
+ }
+
+ @Override
+ public boolean equals(Object o) {
+ if (o == null)
+ return false;
+ if (o instanceof RuleAnythingUntil) {
+ if (((RuleAnythingUntil) o).expected == this.expected) {
+ return true;
+ }
+ }
+ return false;
+ }
+
+ @Override
+ public String javaAction() {
+ return "currentChar = readNextChar();";
+ }
+
+ @Override
+ public String javaMatch(String action) {
+ StringBuilder result = new StringBuilder();
+ result.append("boolean escaped = false;");
+ result.append("while (currentChar!='").append(expected).append("' || escaped)");
+ result.append("{\nif(!escaped && currentChar=='\\\\\\\\'){escaped=true;}\nelse {escaped=false;}\ncurrentChar = readNextChar();\n}");
+ result.append("\nif (currentChar=='").append(expected).append("'){");
+ result.append(action);
+ result.append("}\n");
+ return result.toString();
+ }
+
+}
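Reviewer note on RuleAnythingUntil.javaMatch: the code it emits scans forward until the expected character appears unescaped, where a backslash escapes the following character. The standalone sketch below applies the same loop to a String instead of the lexer buffer. The names EscapeAwareScanDemo and indexOfUnescaped are hypothetical, and the end-of-input check is an addition for the standalone version, not something the generated code is shown to contain.

```java
class EscapeAwareScanDemo {
    /**
     * Returns the index of the first unescaped occurrence of expected in text,
     * starting at from, or -1 if the input ends first.
     */
    static int indexOfUnescaped(String text, int from, char expected) {
        boolean escaped = false;
        int i = from;
        while (i < text.length() && (text.charAt(i) != expected || escaped)) {
            // A backslash escapes the next character unless it is itself escaped.
            escaped = !escaped && text.charAt(i) == '\\';
            i++;
        }
        return i < text.length() ? i : -1;
    }

    public static void main(String[] args) {
        // The two escaped quotes are skipped; the scan stops at the third quote.
        String input = "say \\\"hi\\\" now\" rest";
        System.out.println(indexOfUnescaped(input, 0, '"'));
    }
}
```

Tracking a single boolean is enough because an escaped backslash resets the flag, so a quote that follows two backslashes still terminates the scan.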
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RuleChar.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RuleChar.java
new file mode 100644
index 0000000..0e53374
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RuleChar.java
@@ -0,0 +1,70 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rules;
+
+public class RuleChar implements Rule {
+
+ private char expected;
+
+ public RuleChar clone() {
+ return new RuleChar(expected);
+ }
+
+ public RuleChar(char expected) {
+ this.expected = expected;
+ }
+
+ @Override
+ public String toString() {
+ return String.valueOf(expected);
+ }
+
+ public char expectedChar() {
+ return expected;
+ }
+
+ @Override
+ public int hashCode() {
+ return (int) expected;
+ }
+
+ @Override
+ public boolean equals(Object o) {
+ if (o == null)
+ return false;
+ if (o instanceof RuleChar) {
+ if (((RuleChar) o).expected == this.expected) {
+ return true;
+ }
+ }
+ return false;
+ }
+
+ @Override
+ public String javaAction() {
+ return "currentChar = readNextChar();";
+ }
+
+ @Override
+ public String javaMatch(String action) {
+ StringBuilder result = new StringBuilder();
+ result.append("if (currentChar=='");
+ result.append(expected);
+ result.append("'){");
+ result.append(action);
+ result.append("}");
+ return result.toString();
+ }
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RuleDigitSequence.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RuleDigitSequence.java
new file mode 100644
index 0000000..13381e0
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RuleDigitSequence.java
@@ -0,0 +1,57 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rules;
+
+public class RuleDigitSequence implements Rule {
+
+ public RuleDigitSequence clone() {
+ return new RuleDigitSequence();
+ }
+
+ @Override
+ public String toString() {
+ return " [0-9]+ ";
+ }
+
+ @Override
+ public int hashCode() {
+ return 1;
+ }
+
+ @Override
+ public boolean equals(Object o) {
+ if (o == null)
+ return false;
+ if (o instanceof RuleDigitSequence) {
+ return true;
+ }
+ return false;
+ }
+
+ @Override
+ public String javaAction() {
+ return "";
+ }
+
+ @Override
+ public String javaMatch(String action) {
+ StringBuilder result = new StringBuilder();
+ result.append("if(currentChar >= '0' && currentChar<='9'){" + "\ncurrentChar = readNextChar();"
+ + "\nwhile(currentChar >= '0' && currentChar<='9'){" + "\ncurrentChar = readNextChar();" + "\n}\n");
+ result.append(action);
+ result.append("\n}");
+ return result.toString();
+ }
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RuleEpsilon.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RuleEpsilon.java
new file mode 100644
index 0000000..41b7535
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RuleEpsilon.java
@@ -0,0 +1,54 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rules;
+
+public class RuleEpsilon implements Rule {
+
+ public RuleEpsilon clone() {
+ return new RuleEpsilon();
+ }
+
+ @Override
+ public String toString() {
+ return "?";
+ }
+
+ @Override
+ public int hashCode() {
+ return 0;
+ }
+
+ @Override
+ public boolean equals(Object o) {
+ if (o == null)
+ return false;
+ if (o instanceof RuleEpsilon) {
+ return true;
+ }
+ return false;
+ }
+
+ @Override
+ public String javaAction() {
+ return "";
+ }
+
+ @Override
+ public String javaMatch(String action) {
+ StringBuilder result = new StringBuilder();
+ result.append("{").append(action).append("}");
+ return result.toString();
+ }
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RulePartial.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RulePartial.java
new file mode 100644
index 0000000..89caf4f
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/java/edu/uci/ics/asterix/lexergenerator/rules/RulePartial.java
@@ -0,0 +1,69 @@
+/*
+ * Copyright 2009-2012 by The Regents of the University of California
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * you may obtain a copy of the License from
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package edu.uci.ics.asterix.lexergenerator.rules;
+
+public class RulePartial implements Rule {
+
+ private String partialName;
+
+ public RulePartial clone() {
+ return new RulePartial(partialName);
+ }
+
+ public RulePartial(String expected) {
+ this.partialName = expected;
+ }
+
+ public String getPartial() {
+ return this.partialName;
+ }
+
+ @Override
+ public String toString() {
+ return partialName;
+ }
+
+ @Override
+ public int hashCode() {
+ return partialName.hashCode();
+ }
+
+ @Override
+ public boolean equals(Object o) {
+ if (o == null)
+ return false;
+ if (o instanceof RulePartial) {
+ if (((RulePartial) o).partialName.equals(this.partialName)) {
+ return true;
+ }
+ }
+ return false;
+ }
+
+ @Override
+ public String javaAction() {
+ return "";
+ }
+
+ @Override
+ public String javaMatch(String action) {
+ StringBuilder result = new StringBuilder();
+ result.append("if (parse_" + partialName + "(currentChar)==TOKEN_" + partialName + "){");
+ result.append(action);
+ result.append("}");
+ return result.toString();
+ }
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/resources/Lexer.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/resources/Lexer.java
new file mode 100644
index 0000000..8cee79d
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/resources/Lexer.java
@@ -0,0 +1,219 @@
+package [PACKAGE];
+
+import java.io.IOException;
+import [PACKAGE].[LEXER_NAME]Exception;
+
+public class [LEXER_NAME] {
+
+ public static final int
+ TOKEN_EOF = 0, TOKEN_AUX_NOT_FOUND = 1 [TOKENS_CONSTANTS];
+
+ // Human-readable representation of the tokens, useful for debugging.
+ // A TOKEN_CONSTANT can be converted to its image through
+ // [LEXER_NAME].tokenKindToString(TOKEN_CONSTANT);
+ private static final String[] tokenImage = {
+ "<EOF>", "<AUX_NOT_FOUND>" [TOKENS_IMAGES]
+ };
+
+ private static final char EOF_CHAR = 4;
+ protected java.io.Reader inputStream;
+ protected int column;
+ protected int line;
+ protected boolean prevCharIsCR;
+ protected boolean prevCharIsLF;
+ protected char[] buffer;
+ protected int bufsize;
+ protected int bufpos;
+ protected int tokenBegin;
+ protected int endOf_USED_Buffer;
+ protected int endOf_UNUSED_Buffer;
+ protected int maxUnusedBufferSize;
+
+// ================================================================================
+// Auxiliary functions. Can parse the tokens used in the grammar as partial/auxiliary
+// ================================================================================
+
+ [LEXER_AUXFUNCTIONS]
+
+// ================================================================================
+// Main method. Return a TOKEN_CONSTANT
+// ================================================================================
+
+ public int next() throws [LEXER_NAME]Exception, IOException{
+ char currentChar = buffer[bufpos];
+ while (currentChar == ' ' || currentChar=='\t' || currentChar == '\n' || currentChar=='\r')
+ currentChar = readNextChar();
+ tokenBegin = bufpos;
+ if (currentChar==EOF_CHAR) return TOKEN_EOF;
+
+ [LEXER_LOGIC]
+ }
+
+// ================================================================================
+// Public interface
+// ================================================================================
+
+ public [LEXER_NAME](java.io.Reader stream) throws IOException{
+ reInit(stream);
+ }
+
+ public void reInit(java.io.Reader stream) throws IOException{
+ done();
+ inputStream = stream;
+ bufsize = 4096;
+ line = 1;
+ column = 0;
+ bufpos = -1;
+ endOf_UNUSED_Buffer = bufsize;
+ endOf_USED_Buffer = 0;
+ prevCharIsCR = false;
+ prevCharIsLF = false;
+ buffer = new char[bufsize];
+ tokenBegin = -1;
+ maxUnusedBufferSize = 4096/2;
+ readNextChar();
+ }
+
+ public String getLastTokenImage() {
+ if (bufpos >= tokenBegin)
+ return new String(buffer, tokenBegin, bufpos - tokenBegin);
+ else
+ return new String(buffer, tokenBegin, bufsize - tokenBegin) +
+ new String(buffer, 0, bufpos);
+ }
+
+ public static String tokenKindToString(int token) {
+ return tokenImage[token];
+ }
+
+ public void done(){
+ buffer = null;
+ }
+
+// ================================================================================
+// Parse error management
+// ================================================================================
+
+ protected int parseError(String reason) throws [LEXER_NAME]Exception {
+ StringBuilder message = new StringBuilder();
+ message.append(reason).append("\n");
+ message.append("Line: ").append(line).append("\n");
+ message.append("Column: ").append(column).append("\n");
+ throw new [LEXER_NAME]Exception(message.toString());
+ }
+
+ protected int parseError(int ... tokens) throws [LEXER_NAME]Exception {
+ StringBuilder message = new StringBuilder();
+ message.append("Error while parsing. ");
+ message.append(" Line: ").append(line);
+ message.append(" Column: ").append(column);
+ message.append(" Expecting:");
+ for (int tokenId : tokens){
+ message.append(" ").append([LEXER_NAME].tokenKindToString(tokenId));
+ }
+ throw new [LEXER_NAME]Exception(message.toString());
+ }
+
+ protected void updateLineColumn(char c){
+ column++;
+
+ if (prevCharIsLF)
+ {
+ prevCharIsLF = false;
+ line += (column = 1);
+ }
+ else if (prevCharIsCR)
+ {
+ prevCharIsCR = false;
+ if (c == '\n')
+ {
+ prevCharIsLF = true;
+ }
+ else
+ {
+ line += (column = 1);
+ }
+ }
+
+ if (c=='\r') {
+ prevCharIsCR = true;
+ } else if(c == '\n') {
+ prevCharIsLF = true;
+ }
+ }
+
+// ================================================================================
+// Read data, buffer management. It uses a circular (and expandable) buffer
+// ================================================================================
+
+ protected char readNextChar() throws IOException {
+ if (++bufpos >= endOf_USED_Buffer)
+ fillBuff();
+ char c = buffer[bufpos];
+ updateLineColumn(c);
+ return c;
+ }
+
+ protected boolean fillBuff() throws IOException {
+ if (endOf_UNUSED_Buffer == endOf_USED_Buffer) // If no more unused buffer space
+ {
+ if (endOf_UNUSED_Buffer == bufsize) // -- If the previous unused space was
+ { // -- at the end of the buffer
+ if (tokenBegin > maxUnusedBufferSize) // -- -- If the first N bytes before
+ { // the current token are enough
+ bufpos = endOf_USED_Buffer = 0; // -- -- -- setup buffer to use that fragment
+ endOf_UNUSED_Buffer = tokenBegin;
+ }
+ else if (tokenBegin < 0) // -- -- If no token yet
+ bufpos = endOf_USED_Buffer = 0; // -- -- -- reuse the whole buffer
+ else
+ ExpandBuff(false); // -- -- Otherwise expand buffer after its end
+ }
+ else if (endOf_UNUSED_Buffer > tokenBegin) // If the endOf_UNUSED_Buffer is after the token
+ endOf_UNUSED_Buffer = bufsize; // -- set endOf_UNUSED_Buffer to the end of the buffer
+ else if ((tokenBegin - endOf_UNUSED_Buffer) < maxUnusedBufferSize)
+ { // If there is NOT enough space between
+ ExpandBuff(true); // endOf_UNUSED_Buffer and the token, expand
+ } // the buffer, reorganizing it
+ else
+ endOf_UNUSED_Buffer = tokenBegin; // Otherwise there is enough space at the start
+ } // so we set the buffer to use that fragment
+ int i;
+ if ((i = inputStream.read(buffer, endOf_USED_Buffer, endOf_UNUSED_Buffer - endOf_USED_Buffer)) == -1)
+ {
+ inputStream.close();
+ buffer[endOf_USED_Buffer]=(char)EOF_CHAR;
+ endOf_USED_Buffer++;
+ return false;
+ }
+ else
+ endOf_USED_Buffer += i;
+ return true;
+ }
+
+
+ protected void ExpandBuff(boolean wrapAround)
+ {
+ char[] newbuffer = new char[bufsize + maxUnusedBufferSize];
+
+ try {
+ if (wrapAround) {
+ System.arraycopy(buffer, tokenBegin, newbuffer, 0, bufsize - tokenBegin);
+ System.arraycopy(buffer, 0, newbuffer, bufsize - tokenBegin, bufpos);
+ buffer = newbuffer;
+ endOf_USED_Buffer = (bufpos += (bufsize - tokenBegin));
+ }
+ else {
+ System.arraycopy(buffer, tokenBegin, newbuffer, 0, bufsize - tokenBegin);
+ buffer = newbuffer;
+ endOf_USED_Buffer = (bufpos -= tokenBegin);
+ }
+ } catch (Throwable t) {
+ throw new Error(t.getMessage());
+ }
+
+ bufsize += maxUnusedBufferSize;
+ endOf_UNUSED_Buffer = bufsize;
+ tokenBegin = 0;
+ }
+}
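
The line/column bookkeeping done by `updateLineColumn` in the template above can be traced in isolation. The sketch below copies that logic into a hypothetical `LineColumnSketch` class (not part of this patch) and writes the `line += (column = 1)` idiom as two separate statements, which should be equivalent. The `feed` helper is an assumption added only to drive the example.

```java
// Sketch only: the position tracking of the lexer template, extracted so it
// can be run on its own. LineColumnSketch and feed() are hypothetical names.
final class LineColumnSketch {
    int line = 1;
    int column = 0;
    private boolean prevCharIsCR = false;
    private boolean prevCharIsLF = false;

    // Same decisions as updateLineColumn: a line break is only counted when
    // the character that follows it arrives, and "\r\n" counts as one break.
    void update(char c) {
        column++;
        if (prevCharIsLF) {
            prevCharIsLF = false;
            column = 1;
            line++;
        } else if (prevCharIsCR) {
            prevCharIsCR = false;
            if (c == '\n') {
                prevCharIsLF = true;
            } else {
                column = 1;
                line++;
            }
        }
        if (c == '\r') {
            prevCharIsCR = true;
        } else if (c == '\n') {
            prevCharIsLF = true;
        }
    }

    // Feeds every character of the text through update().
    void feed(String text) {
        for (int i = 0; i < text.length(); i++) {
            update(text.charAt(i));
        }
    }

    public static void main(String[] args) {
        LineColumnSketch tracker = new LineColumnSketch();
        tracker.feed("a\r\nb");
        System.out.println(tracker.line + ":" + tracker.column); // prints 2:1
    }
}
```

Because the break is applied lazily, a position reported while the current character is the line terminator itself still belongs to the old line.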
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/resources/LexerException.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/resources/LexerException.java
new file mode 100644
index 0000000..76aa8a4
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/resources/LexerException.java
@@ -0,0 +1,13 @@
+package [PACKAGE];
+
+public class [LEXER_NAME]Exception extends Exception {
+
+ public [LEXER_NAME]Exception(String message) {
+ super(message);
+ }
+
+ private static final long serialVersionUID = 1L;
+
+}
+
+
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/resources/default.config b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/resources/default.config
new file mode 100644
index 0000000..7efbeb8
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/main/resources/default.config
@@ -0,0 +1,16 @@
+# LEXER GENERATOR configuration file
+# ---------------------------------------
+# Put the generic configuration *first*,
+# then list your grammar.
+
+PACKAGE: com.my.lexer
+LEXER_NAME: MyLexer
+OUTPUT_DIR: output
+
+TOKENS:
+
+BOOLEAN_LIT = string(boolean)
+FALSE_LIT = string(false)
+BOMB_LIT = string(bomb)
+BONSAI_LIT = string(bonsai)
+HELLO_LIT = string(hello)
\ No newline at end of file
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/Fixtures.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/Fixtures.java
new file mode 100644
index 0000000..2ed2eaa
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/Fixtures.java
@@ -0,0 +1,100 @@
+package edu.uci.ics.asterix.lexergenerator;
+
+import edu.uci.ics.asterix.lexergenerator.rules.Rule;
+import edu.uci.ics.asterix.lexergenerator.rules.RuleChar;
+
+public class Fixtures {
+ static String token_name = "MYTOKEN";
+ static String token2_name = "MYTOKEN2";
+ static String token_return = "return TOKEN_MYTOKEN;\n";
+ static String token2_return = "return TOKEN_MYTOKEN2;\n";
+ static String token_parseerror = "return parseError(TOKEN_MYTOKEN);\n";
+ static String token_tostring = "! ";
+ static String rule_action = "myaction";
+ static String rule_name = "myrule";
+ static String rule_match = "matchCheck("+rule_name+")";
+ static String rule2_action = "myaction2";
+ static String rule2_name = "myrule2";
+ static String rule2_match = "matchCheck2("+rule2_name+")";
+
+ static public Rule createRule(final String name){
+ return new Rule(){
+ String rule_name = name;
+ String rule_action = "myaction";
+ String rule_match = "matchCheck("+rule_name+")";
+
+ @Override
+ public Rule clone(){
+ return Fixtures.createRule(name+"_clone");
+ }
+
+ @Override
+ public String javaAction() {
+ return rule_action;
+ }
+
+ @Override
+ public String javaMatch(String action) {
+ return rule_match+"{"+action+"}";
+ }
+
+ @Override
+ public String toString(){
+ return rule_name;
+ }
+
+ };
+ }
+
+ static Rule rule = new Rule(){
+
+ public Rule clone(){
+ return null;
+ }
+
+ @Override
+ public String javaAction() {
+ return rule_action;
+ }
+
+ @Override
+ public String javaMatch(String action) {
+ return rule_match+"{"+action+"}";
+ }
+
+ @Override
+ public String toString(){
+ return rule_name;
+ }
+
+ };
+
+ static Rule rule2 = new Rule(){
+
+ public Rule clone(){
+ return null;
+ }
+
+ @Override
+ public String javaAction() {
+ return rule2_action;
+ }
+
+ @Override
+ public String javaMatch(String act) {
+ return rule2_match+"{"+act+"}";
+ }
+
+ @Override
+ public String toString(){
+ return rule2_name;
+ }
+
+ };
+
+ static RuleChar ruleA = new RuleChar('a');
+ static RuleChar ruleB = new RuleChar('b');
+ static RuleChar ruleC = new RuleChar('c');
+ static String ruleABC_action = "currentChar = readNextChar();";
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeAddRuleTest.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeAddRuleTest.java
new file mode 100644
index 0000000..7541124
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeAddRuleTest.java
@@ -0,0 +1,51 @@
+package edu.uci.ics.asterix.lexergenerator;
+
+import static edu.uci.ics.asterix.lexergenerator.Fixtures.*;
+import static org.junit.Assert.*;
+
+import org.junit.Test;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+
+public class LexerNodeAddRuleTest {
+
+ @Test
+ public void NodeRuleRuleNodeNode() {
+ LexerNode node = new LexerNode();
+ node.append(rule);
+ node.add(rule2);
+ node.appendTokenName(token_name);
+ assertEquals(" ( " + rule_name +token_tostring + " || " + rule2_name + token_tostring + " ) ", node.toString());
+ assertEquals(rule_match+"{"
+ +"\n" + rule_action
+ +"\n" +token_return
+ +"}"
+ +rule2_match+"{"
+ +"\n"+rule2_action
+ +"\n"+token_return
+ +"}"
+ +token_parseerror , node.toJava());
+ }
+
+ @Test
+ public void NodeSwitchCase() {
+ LexerNode node = new LexerNode();
+ node.append(ruleA);
+ node.add(ruleB);
+ node.add(ruleC);
+ node.appendTokenName(token_name);
+ assertEquals(" ( a" + token_tostring + " || b" + token_tostring + " || c" + token_tostring + " ) ", node.toString());
+ assertEquals("switch(currentChar){\n" +
+ "case 'a':" +
+ "\n" + ruleABC_action +
+ "\n" + token_return +
+ "case 'b':" +
+ "\n" + ruleABC_action +
+ "\n" + token_return +
+ "case 'c':" +
+ "\n" + ruleABC_action +
+ "\n" + token_return +
+ "}\n"+ token_parseerror , node.toJava());
+ }
+
+}
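
The shape of the Java that `NodeSwitchCase` expects for single-character alternatives can be reproduced with a few lines. This is a hypothetical re-creation of the expected string only, not the code of `LexerNode.toJava()`, which is not shown in this patch section; `SwitchCaseSketch` is an invented name.

```java
// Sketch only: rebuilds the string that the NodeSwitchCase test expects.
// It does not claim to be how LexerNode generates it.
final class SwitchCaseSketch {

    // One "case" per character, each consuming the character and returning
    // the token; anything else falls through to parseError.
    static String toJava(char[] chars, String tokenName) {
        StringBuilder out = new StringBuilder("switch(currentChar){\n");
        for (char c : chars) {
            out.append("case '").append(c).append("':")
               .append("\ncurrentChar = readNextChar();")
               .append("\nreturn TOKEN_").append(tokenName).append(";\n");
        }
        out.append("}\n");
        out.append("return parseError(TOKEN_").append(tokenName).append(");\n");
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(toJava(new char[] { 'a', 'b', 'c' }, "MYTOKEN"));
    }
}
```

Each case ends in a `return`, so no `break` is needed, and the trailing `parseError` call is only reached when no case matched.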
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeAppendNodeTest.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeAppendNodeTest.java
new file mode 100644
index 0000000..5151e77
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeAppendNodeTest.java
@@ -0,0 +1,81 @@
+package edu.uci.ics.asterix.lexergenerator;
+
+import static edu.uci.ics.asterix.lexergenerator.Fixtures.*;
+import static org.junit.Assert.*;
+
+import org.junit.Test;
+
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+import edu.uci.ics.asterix.lexergenerator.rules.RuleEpsilon;
+
+public class LexerNodeAppendNodeTest {
+
+ @Test
+ public void AppendIsMergeIfNoActions() throws Exception {
+ LexerNode node = new LexerNode();
+ LexerNode node2 = new LexerNode();
+ node2.append(createRule("rule"));
+ node2.appendTokenName(token_name);
+ node.append(node2);
+ assertEquals("rule_clone! ", node.toString());
+ }
+
+ @Test
+ public void AppendIsAppend() throws Exception {
+ LexerNode node = new LexerNode();
+ node.append(createRule("A"));
+ LexerNode node2 = new LexerNode();
+ node2.append(createRule("rule"));
+ node2.appendTokenName(token_name);
+ node.append(node2);
+ assertEquals("Arule_clone! ", node.toString());
+ }
+
+ @Test
+ public void AppendedNodesAreCloned() throws Exception {
+ LexerNode node = new LexerNode();
+ node.append(createRule("A"));
+ node.appendTokenName(token_name);
+ LexerNode node2 = new LexerNode();
+ node2.append(createRule("B"));
+ node2.appendTokenName(token2_name);
+ node.append(node2);
+ // TODO
+ // assertEquals("A! B_clone! ", node.toString());
+
+ LexerNode node3 = new LexerNode();
+ node3.append(createRule("C"));
+ node3.append(createRule("D"));
+ node3.appendTokenName(token2_name);
+ node.append(node3);
+ // TODO
+ // assertEquals("A! B_clone! C_cloneD_clone! ", node.toString());
+ }
+
+ @Test
+ public void EpsilonRuleDoesNotPropagateAppended() throws Exception {
+ LexerNode node = new LexerNode();
+ node.append(new RuleEpsilon());
+ LexerNode node2 = new LexerNode();
+ node2.append(createRule("A"));
+ node2.appendTokenName(token2_name);
+ node.append(node2);
+ assertEquals("A_clone! ", node.toString());
+ }
+
+ @Test
+ public void EpsilonRuleIsRemovedAndIssueMerge() throws Exception {
+ LexerNode node = new LexerNode();
+ node.append(new RuleEpsilon());
+ LexerNode node2 = new LexerNode();
+ node2.append(createRule("A"));
+ node2.appendTokenName(token2_name);
+ node.append(node2);
+ node.add(new RuleEpsilon());
+ node.append(node2);
+ // TODO
+ // assertEquals(" ( A_clone! A_clone! || A_clone! ) ", node.toString());
+ }
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeAppendRuleTest.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeAppendRuleTest.java
new file mode 100644
index 0000000..84fd292
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeAppendRuleTest.java
@@ -0,0 +1,47 @@
+package edu.uci.ics.asterix.lexergenerator;
+
+import static org.junit.Assert.*;
+
+import org.junit.Test;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+import static edu.uci.ics.asterix.lexergenerator.Fixtures.*;
+
+public class LexerNodeAppendRuleTest {
+ @Test
+ public void SingleNode() {
+ LexerNode node = new LexerNode();
+ node.appendTokenName(token_name);
+ assertEquals(token_tostring, node.toString());
+ assertEquals(token_return, node.toJava());
+ }
+
+ @Test
+ public void NodeRuleNode() {
+ LexerNode node = new LexerNode();
+ node.append(rule);
+ node.appendTokenName(token_name);
+ assertEquals(rule_name+token_tostring, node.toString());
+ assertEquals(rule_match+"{"
+ +"\n"+rule_action
+ +"\n"+token_return
+ +"}"+token_parseerror, node.toJava());
+ }
+
+ @Test
+ public void NodeRuleNodeRuleNode() {
+ LexerNode node = new LexerNode();
+ node.append(rule);
+ node.append(rule2);
+ node.appendTokenName(token_name);
+ assertEquals(rule_name+rule2_name+token_tostring, node.toString());
+ assertEquals(rule_match+"{"
+ +"\n"+rule_action
+ +"\n"+rule2_match+"{"
+ +"\n"+rule2_action
+ +"\n"+token_return
+ +"}"
+ +token_parseerror
+ +"}"+token_parseerror, node.toJava());
+ }
+}
\ No newline at end of file
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeAuxFunctionsTest.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeAuxFunctionsTest.java
new file mode 100644
index 0000000..9f12c00
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeAuxFunctionsTest.java
@@ -0,0 +1,111 @@
+package edu.uci.ics.asterix.lexergenerator;
+
+import static edu.uci.ics.asterix.lexergenerator.Fixtures.*;
+import static org.junit.Assert.*;
+
+import java.util.HashSet;
+import java.util.LinkedHashMap;
+import java.util.Set;
+
+import org.junit.Test;
+
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+import edu.uci.ics.asterix.lexergenerator.Token;
+import edu.uci.ics.asterix.lexergenerator.rules.RuleEpsilon;
+import edu.uci.ics.asterix.lexergenerator.rules.RulePartial;
+
+public class LexerNodeAuxFunctionsTest {
+ String expectedDifferentReturn = "return TOKEN_AUX_NOT_FOUND;\n";
+
+ @Test
+ public void NodeRuleRuleNodeNode() {
+ LexerNode node = new LexerNode();
+ node.append(rule);
+ node.add(rule2);
+ node.appendTokenName(token_name);
+ assertEquals(" ( " + rule_name +token_tostring + " || " + rule2_name + token_tostring + " ) ", node.toString());
+ assertEquals(rule_match+"{"
+ +"\n" + rule_action
+ +"\n" +token_return
+ +"}"
+ +rule2_match+"{"
+ +"\n"+rule2_action
+ +"\n"+token_return
+ +"}"
+ +expectedDifferentReturn , node.toJavaAuxFunction());
+ }
+
+ @Test
+ public void NodeSwitchCase() {
+ LexerNode node = new LexerNode();
+ node.append(ruleA);
+ node.add(ruleB);
+ node.add(ruleC);
+ node.appendTokenName(token_name);
+ assertEquals(" ( a" + token_tostring + " || b" + token_tostring + " || c" + token_tostring + " ) ", node.toString());
+ assertEquals("switch(currentChar){\n" +
+ "case 'a':" +
+ "\n" + ruleABC_action +
+ "\n" + token_return +
+ "case 'b':" +
+ "\n" + ruleABC_action +
+ "\n" + token_return +
+ "case 'c':" +
+ "\n" + ruleABC_action +
+ "\n" + token_return +
+ "}\n"+ expectedDifferentReturn , node.toJavaAuxFunction());
+ }
+
+ @Test
+ public void NodeNeededAuxFunctions() {
+ LexerNode node = new LexerNode();
+ node.append(ruleA);
+ node.add(new RulePartial("token1"));
+ node.append(ruleC);
+ node.append(new RulePartial("token2"));
+ node.appendTokenName(token_name);
+ assertEquals(" ( actoken2! || token1ctoken2! ) ", node.toString());
+ Set<String> expectedNeededAuxFunctions = new HashSet<String>();
+ expectedNeededAuxFunctions.add("token1");
+ expectedNeededAuxFunctions.add("token2");
+ assertEquals(expectedNeededAuxFunctions, node.neededAuxFunctions());
+ }
+
+ @Test(expected=Exception.class)
+ public void NodeExpandFirstActionError() throws Exception {
+ LexerNode node = new LexerNode();
+ node.append(ruleA);
+ node.add(new RulePartial("token1"));
+ node.append(ruleC);
+ node.add(new RuleEpsilon());
+ node.append(new RulePartial("token2"));
+ node.appendTokenName(token_name);
+ assertEquals(" ( actoken2! || token1ctoken2! || token2! ) ", node.toString());
+ LinkedHashMap<String, Token> tokens = new LinkedHashMap<String, Token>();
+ try {
+ node.expandFirstAction(tokens);
+ } catch (Exception e) {
+ assertEquals("Cannot find a token used as part of another definition, missing token: token1", e.getMessage());
+ throw e;
+ }
+ }
+
+ public void NodeExpandFirstAction() throws Exception {
+ LexerNode node = new LexerNode();
+ node.append(ruleA);
+ node.add(new RulePartial("token1"));
+ node.append(ruleC);
+ node.add(new RuleEpsilon());
+ node.append(new RulePartial("token2"));
+ node.appendTokenName(token_name);
+ assertEquals(" ( actoken2! || token1ctoken2! || token2! ) ", node.toString());
+ LinkedHashMap<String, Token> tokens = new LinkedHashMap<String, Token>();
+ Token a = new Token("token1 = string(T1-blabla)", tokens);
+ Token b = new Token("token2 = string(T2-blabla)", tokens);
+ tokens.put("token1", a);
+ tokens.put("token2", b);
+ node.expandFirstAction(tokens);
+ assertEquals(" ( actoken2! || T1-blablactoken2! || T2-blabla! ) ", node.toString());
+ }
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeCloneTest.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeCloneTest.java
new file mode 100644
index 0000000..87e3ff4
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeCloneTest.java
@@ -0,0 +1,56 @@
+package edu.uci.ics.asterix.lexergenerator;
+
+import static edu.uci.ics.asterix.lexergenerator.Fixtures.*;
+import static org.junit.Assert.*;
+
+import org.junit.Test;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+
+public class LexerNodeCloneTest {
+
+ @Test
+ public void Depth1() throws Exception {
+ LexerNode node = new LexerNode();
+ LexerNode newNode = node.clone();
+ assertFalse(node == newNode);
+ }
+
+
+ @Test
+ public void Depth2() throws Exception {
+ LexerNode node = new LexerNode();
+ node.append(createRule("my1"));
+ node.add(createRule("my2"));
+ node.add(ruleA);
+ node.appendTokenName(token_name);
+ LexerNode newNode = node.clone();
+
+ assertEquals(" ( my1! || my2! || a! ) ", node.toString());
+ assertEquals(" ( my1_clone! || my2_clone! || a! ) ", newNode.toString());
+ }
+
+ @Test
+ public void Depth3() throws Exception {
+ LexerNode node = new LexerNode();
+ node.append(createRule("my1"));
+ node.add(createRule("my2"));
+ node.add(ruleA);
+ node.appendTokenName(token_name);
+ LexerNode node2 = new LexerNode();
+ node2.append(createRule("my3"));
+ node2.add(createRule("my4"));
+ node2.add(ruleB);
+ node2.appendTokenName(token2_name);
+ node.append(node2);
+ LexerNode newNode = node.clone();
+ // TODO
+ // assertEquals(" ( my1! ( || my3_clone! || my4_clone! || b! ) " +
+ // " || my2! ( || my3_clone! || my4_clone! || b! ) " +
+ // " || a! ( || my3_clone! || my4_clone! || b! ) ) ", node.toString());
+ // assertEquals(" ( my1_clone! ( || my3_clone_clone! || my4_clone_clone! || b! ) " +
+ // " || my2_clone! ( || my3_clone_clone! || my4_clone_clone! || b! ) " +
+ // " || a! ( || my3_clone_clone! || my4_clone_clone! || b! ) ) ", newNode.toString());
+ }
+
+}
diff --git a/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeMergeNodeTest.java b/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeMergeNodeTest.java
new file mode 100644
index 0000000..4b22d99
--- /dev/null
+++ b/asterix-maven-plugins/lexer-generator-maven-plugin/src/test/java/edu/uci/ics/asterix/lexergenerator/LexerNodeMergeNodeTest.java
@@ -0,0 +1,83 @@
+package edu.uci.ics.asterix.lexergenerator;
+
+import static edu.uci.ics.asterix.lexergenerator.Fixtures.*;
+import static org.junit.Assert.*;
+
+import org.junit.Test;
+
+import edu.uci.ics.asterix.lexergenerator.LexerNode;
+
+public class LexerNodeMergeNodeTest {
+
+ @Test
+ public void MergeIsAdd() throws Exception {
+ LexerNode node = new LexerNode();
+ node.append(rule);
+ LexerNode node2 = new LexerNode();
+ node2.append(rule2);
+ node2.append(rule);
+ node2.merge(node);
+ node2.appendTokenName(token_name);
+
+ LexerNode expected = new LexerNode();
+ expected.append(rule2);
+ expected.append(rule);
+ expected.add(rule);
+ expected.appendTokenName(token_name);
+
+ assertEquals(expected.toString(), node2.toString());
+ assertEquals(expected.toJava(), node2.toJava());
+ }
+
+ @Test
+ public void MergeTwoToken() throws Exception {
+ LexerNode node = new LexerNode();
+ node.append(rule);
+ node.appendTokenName(token_name);
+ LexerNode node2 = new LexerNode();
+ node2.append(rule2);
+ node2.appendTokenName(token2_name);
+ node.merge(node2);
+
+ assertEquals(" ( "+rule_name+token_tostring+" || "+rule2_name+token_tostring+" ) ", node.toString());
+ assertEquals(rule_match + "{"
+ + "\n" + rule_action
+ + "\n" + token_return
+ +"}"+rule2_match+"{"
+ + "\n" + rule2_action
+ + "\n" + token2_return
+ +"}return parseError(TOKEN_MYTOKEN,TOKEN_MYTOKEN2);\n"
+, node.toJava());
+ }
+
+ @Test(expected=Exception.class)
+ public void MergeConflict() throws Exception {
+ LexerNode node = new LexerNode();
+ node.append(rule);
+ node.appendTokenName(token_name);
+ LexerNode node2 = new LexerNode();
+ node2.append(rule);
+ node2.appendTokenName(token2_name);
+ try {
+ node.merge(node2);
+ } catch (Exception e) {
+ assertEquals("Rule conflict between: "+token_name +" and "+token2_name, e.getMessage());
+ throw e;
+ }
+ }
+
+ @Test
+ public void MergeWithoutConflictWithRemoveTokensName() throws Exception {
+ LexerNode node = new LexerNode();
+ node.append(rule);
+ node.append(rule);
+ node.appendTokenName(token_name);
+ LexerNode node2 = new LexerNode();
+ node2.append(rule);
+ node2.append(rule);
+ node2.appendTokenName(token2_name);
+ node2.removeTokensName();
+ node.merge(node2);
+ assertEquals(rule_name+rule_name+token_tostring, node.toString());
+ }
+}