LAMB - A Lexical Analyzer with Ambiguity Support

Luis Quesada, Fernando Berzal, Francisco J. Cortijo

Abstract

Lexical ambiguities may naturally arise in language specifications. We present Lamb, a lexical analyzer that captures overlapping tokens caused by lexical ambiguities. This novel technique scans through the input string and produces a lexical analysis graph that describes all the possible sequences of tokens that can be found within the string. The lexical graph can then be fed as input to a parser, which will discard any sequence of tokens that does not produce a syntactically valid sentence. In summary, our approach allows a context-sensitive lexical analysis that supports lexically ambiguous language specifications.
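The idea of a lexical analysis graph can be illustrated with a small sketch. This is not Lamb's implementation: the token names, the patterns in `PATTERNS`, and the helper functions `build_graph` and `token_sequences` are all assumptions made for illustration. The sketch records, for each reachable offset of the input, every token that can start there, and then enumerates every token sequence that covers the whole string.

```python
import re
from collections import defaultdict

# Hypothetical token patterns, chosen only to provoke an ambiguity:
# "3.14" is either one REAL or the sequence INT DOT INT.
PATTERNS = {
    "INT": re.compile(r"\d+"),
    "REAL": re.compile(r"\d+\.\d+"),
    "DOT": re.compile(r"\."),
}


def build_graph(text):
    """Map each reachable start offset to the tokens that begin there.

    Each edge is (token name, lexeme, end offset). Only one match per
    pattern and offset is recorded (the regex engine's greedy match),
    which is a simplification made for this sketch.
    """
    graph = defaultdict(list)
    pending, seen = [0], {0}
    while pending:
        start = pending.pop()
        for name, pattern in PATTERNS.items():
            match = pattern.match(text, start)
            if match and match.end() > start:
                graph[start].append((name, match.group(), match.end()))
                if match.end() not in seen:
                    seen.add(match.end())
                    pending.append(match.end())
    return graph


def token_sequences(graph, text, start=0):
    """Yield every token sequence that covers text[start:] completely."""
    if start == len(text):
        yield []
        return
    for name, lexeme, end in graph.get(start, []):
        for rest in token_sequences(graph, text, end):
            yield [(name, lexeme)] + rest


graph = build_graph("3.14")
sequences = sorted(token_sequences(graph, "3.14"))
for sequence in sequences:
    print(sequence)
```

For the input `3.14` the sketch finds two sequences, one with a single `REAL` token and one with `INT`, `DOT`, `INT`. In the approach described in the abstract, a parser would consume the graph and keep only the sequences that form a valid sentence of the language.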


Paper Citation


in Harvard Style

Quesada L., Berzal F. and Cortijo F. J. (2011). LAMB - A Lexical Analyzer with Ambiguity Support. In Proceedings of the 6th International Conference on Software and Database Technologies - Volume 1: ICSOFT, ISBN 978-989-8425-76-8, pages 297-300. DOI: 10.5220/0003476802970300


in Bibtex Style

@conference{icsoft11,
author={Luis Quesada and Fernando Berzal and Francisco J. Cortijo},
title={LAMB - A Lexical Analyzer with Ambiguity Support},
booktitle={Proceedings of the 6th International Conference on Software and Database Technologies - Volume 1: ICSOFT},
year={2011},
pages={297--300},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003476802970300},
isbn={978-989-8425-76-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 6th International Conference on Software and Database Technologies - Volume 1: ICSOFT
TI - LAMB - A Lexical Analyzer with Ambiguity Support
SN - 978-989-8425-76-8
AU - Quesada L.
AU - Berzal F.
AU - Cortijo F. J.
PY - 2011
SP - 297
EP - 300
DO - 10.5220/0003476802970300