CSTParser

Dev Project Status: Active - The project has reached a stable, usable state and is being actively developed. codecov

A parser for Julia using Tokenize that aims to extend the built-in parser by providing additional meta information along with the resultant AST.

Installation and Usage

using Pkg
Pkg.add("CSTParser")
using CSTParser

Documentation: Dev

Additional Output

  • EXPR's are iterable producing children in the order that they appear in the source code, including punctuation.

    Example:

    f(x) = x*2 becomes [f(x), =, x*2]
    f(x) becomes [f, (, x, )]
  • The byte span of each EXPR is stored allowing a mapping between byte position in the source code and the releveant parsed expression. The span of a single token includes any trailing whitespace, newlines or comments. This also allows for fast partial parsing of modified source code.

  • Formatting hints are generated as the source code is parsed (e.g. mismatched indents for blocks, missing white space around operators).

  • The declaration of modules, functions, datatypes and variables are tracked and stored in the relevant hierarchical scopes attatched to the expressions that declare the scope. This allows for a mapping between any identifying symbol and the relevant code that it refers to.

Structure

Expressions are represented solely by the following types:

Parser.SyntaxNode
  Parser.EXPR
  Parser.INSTANCE
    Parser.HEAD{K}
    Parser.IDENTIFIER
    Parser.KEYWORD{K}
    Parser.LITERAL{K}
    Parser.OPERATOR{P,K,dot}
    Parser.PUNCTUATION
  Parser.QUOTENODE

The K parameterisation refers to the kind of the associated token as specified by Tokenize.Tokens.Kind. The P and dot parameters for operators refers to the precedence of the operator and whether it is dotted (e.g. .+).

INSTANCEs represent singular objects that may have a concrete or implicit relation to a portion of the source text. In the the former case they have a span storing the width in bytes that they occupy in the source text, in the latter case their span is 0. Additionally, IDENTIFIERs store their value as a Symbol and LITERALs as a String.

EXPR are equivalent to Base.Expr but have extra fields to store their span and any punctuation tokens.