Awesome Open Source

Note!

The author has run out of time for now. This parser can be completed if someone is willing to help "fix" a small part of my private code so that it can be ported to Escaya. If you are interested, ping me in the 'TODO's issue on this repo.

Escaya

A blazing fast, 100% spec-compliant, incremental JavaScript parser written in TypeScript



Work in progress

Features

  • Conforms to the standard ECMAScript® 2021 (ECMA-262 12th Edition) language specification
  • Support for additional ECMAScript features for Web Browsers
  • Optionally track syntactic node locations
  • Optional concrete syntax (CST)
  • Emits an ECMAScript® 2021 compatible abstract syntax tree
  • Error recovery mode with incremental parsing support
  • Error diagnostics
  • Supports JSON serialization
  • Possible to use a custom AST such as ESTree and Babel AST
  • No backtracking
  • Low memory usage
  • Optimized for use on handheld devices such as a mobile phone or tablet
  • Very well tested (~67 000 unit tests with full code coverage)
  • Lightweight - ~94 KB minified

API

Escaya generates its own AST, which stays close to the ECMAScript® 2021 specification and can be used to perform syntactic analysis (parsing) of a JavaScript program. Since ES2015, a JavaScript program can be either a script or a module.

Example usage:

import { parseScript, parseModule } from './escaya';

parseScript('({x: [y] = 0} = 1)', { impliedStrict: true });

parseModule('({x: [y] = 0} = 1)');

These are the available options:

Option            Description
next              Enable stage 3 support (ESNext)
disableWebCompat  Disable web compatibility
loc               Attach line/column location information and start/end offsets to each node
cst               Attach additional concrete syntax to each node
impliedStrict     Enforce strict mode from the start
module            Parse in module goal when in error recovery mode

Comments

Escaya lets you extract leading and trailing comments from a given position with either extractCommentsScript or extractCommentsModule. The first argument is the source code, and the second is the position within the source code from which to extract the comments. The last argument lets you decide whether to extract leading or trailing comments: extractCommentsScript(source, start, boolean);

Here is an example of how to get all trailing comments belonging to bar:

import { extractCommentsScript } from './escaya';

extractCommentsScript('/* MultieLine */ bar /* trailing */', 20, true);

Outputs:

[{
        comment: ' trailing ',
        end: 35,
        newLine: false,
        start: 21,
        type: 'MultiLine'
}]
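
The start and end values in this output read as zero-based offsets into the source string, with end exclusive. That is an interpretation of the example rather than documented behavior. Under that assumption, slicing the source recovers the comment, as this self-contained sketch shows (the record is copied by hand from the output above, and Escaya itself is not called):

```typescript
// Source string and result record copied from the example above.
const source = '/* MultieLine */ bar /* trailing */';

const trailing = {
  comment: ' trailing ',
  end: 35,
  newLine: false,
  start: 21,
  type: 'MultiLine'
};

// Assumption: start is inclusive and end is exclusive.
const raw = source.slice(trailing.start, trailing.end);

// Dropping the two-character delimiters leaves the `comment` text.
const inner = raw.slice(2, -2);

console.log(raw);
console.log(inner === trailing.comment);
```

The position passed to extractCommentsScript in the example (20) is the offset directly after bar, one character before the trailing comment starts.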

Escaya AST

The AST used by Escaya represents the structure of an ECMAScript program as a tree and is designed to stay true to the ECMAScript® 2021 specification. The AST has been designed for performance, and it nearly eliminates the chance of accidentally creating an AST that does not represent an ECMAScript program. It also requires fewer bytes than the ESTree AST that Babel and Acorn produce, and fewer than the Babel parser's own AST.

The Escaya AST doesn't try to follow the SpiderMonkey-compatible standard that ESTree strictly follows. For example, it distinguishes Identifier from IdentifierPattern, which makes it easier to calculate the free variables of a program.
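
To see why separating references from bindings helps, here is a minimal sketch of a free-variable calculation. The node shapes are a hypothetical toy AST made up for illustration (only the names Identifier and IdentifierPattern come from the text above), and the scoping rule is deliberately simplified: a binding is visible everywhere in its enclosing Block.

```typescript
// Hypothetical toy node shapes for illustration; not Escaya's real AST.
type ToyNode =
  | { type: 'Identifier'; name: string } // a reference (use) of a name
  | { type: 'IdentifierPattern'; name: string } // a binding (declaration) of a name
  | { type: 'Block'; body: ToyNode[] }; // a scope containing child nodes

// Returns the names that are referenced but not bound in any enclosing Block.
function freeVariables(node: ToyNode, bound: string[] = []): string[] {
  if (node.type === 'Identifier') {
    return bound.indexOf(node.name) === -1 ? [node.name] : [];
  }
  if (node.type === 'IdentifierPattern') {
    return []; // a binding is never a free variable
  }
  // Block: first collect the bindings declared directly in this scope...
  const inner = bound.slice();
  for (const child of node.body) {
    if (child.type === 'IdentifierPattern') inner.push(child.name);
  }
  // ...then gather the unbound references from every child, without duplicates.
  const free: string[] = [];
  for (const child of node.body) {
    for (const found of freeVariables(child, inner)) {
      if (free.indexOf(found) === -1) free.push(found);
    }
  }
  return free;
}

const program: ToyNode = {
  type: 'Block',
  body: [
    { type: 'IdentifierPattern', name: 'x' },
    { type: 'Identifier', name: 'x' },
    { type: 'Identifier', name: 'y' }
  ]
};

console.log(freeVariables(program)); // logs [ 'y' ]
```

Because the node type alone says whether a name is a use or a binding, the walker never has to inspect the parent node to decide which one it is looking at.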

Concrete syntax (CST)

Escaya supports a simplified definition of "concrete syntax" that follows the ECMAScript® 2021 specification.

A ParenthesisExpression has been added to represent the ( ) and everything in between. See Primary Expression - Supplemental Syntax.

An Elison node has been added to represent elisions (the holes in a sparse array) in 12.2.5 Array Initializer and 13.3.3 Destructuring Binding Patterns - ArrayBindingPattern.

A Semicolon node has been used in ClassElement to represent the ; token.

Custom AST

parseCustomScript and parseCustomModule let you use whatever AST format you want.

Here is an example of how to use the Babel AST:

import { parseCustomScript } from './escaya';

parseCustomScript('a = b', {
  // Called to build the root node for a script
  Script: function (source, directives, statements) {
    return {
      type: 'File',
      errors: [],
      program: {
        type: 'Program',
        sourceType: 'script',
        body: statements
      },
      directives,
      comments: [],
      start: 0,
      end: source.length
    };
  }
});
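
The same callback shape could presumably target an ESTree-style Program root instead. The sketch below is self-contained and does not call Escaya. It defines a hypothetical factory object (estreeFactory is an illustrative name) whose Script callback uses the signature from the example above, then invokes it by hand to show the resulting root node. The unknown[] types are assumptions, since the shapes of directives and statements are not described here.

```typescript
// Hypothetical factory; only the Script(source, directives, statements)
// signature comes from the example above.
const estreeFactory = {
  Script(source: string, directives: unknown[], statements: unknown[]) {
    return {
      type: 'Program',
      sourceType: 'script',
      // ESTree keeps directives as the leading statements of body.
      // Assumption: the directive nodes passed in are already in a usable shape.
      body: [...directives, ...statements],
      // Offsets kept as in the example above.
      start: 0,
      end: source.length
    };
  }
};

// Called by hand with a placeholder child to show the root shape.
const root = estreeFactory.Script('a = b', [], [{ type: 'ExpressionStatement' }]);
console.log(root.type, root.body.length, root.end);
```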

Error recovery

When the Escaya parser is given an input that does not represent a valid JavaScript program, it throws an exception. In recovery mode, the parser instead continues parsing and produces a syntax tree that conforms to the standard ECMAScript® 2021 specification.

However, recovery mode on its own still does a full parse for every keystroke. To avoid this you can enable incremental parsing. This is best demonstrated with an example.

import { recovery, update } from './escaya';

const rootNode = recovery('(foo);', 'filename.js', { module: true });

const ast = update(rootNode, '=> bar;', 'filename.js', { span: { start: 6, length: 0 }, newLength: 7 })

Now that incremental parsing has been enabled, Escaya will reuse nodes from the old tree where possible.
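
The last argument to update in the example reads like a text change description: span is the range of the old text that was replaced, and newLength is the length of the text that replaced it. This is an interpretation of the example, not documented behavior, and it assumes the second argument to update is the inserted text. The self-contained sketch below uses a hypothetical helper, applyChange, to show the source text such a change would produce. It does not call Escaya.

```typescript
// Shape copied from the options object in the example above.
interface TextChange {
  span: { start: number; length: number };
  newLength: number;
}

// Hypothetical helper: replaces the span in oldText with newText.
function applyChange(oldText: string, newText: string, change: TextChange): string {
  if (newText.length !== change.newLength) {
    throw new Error('newLength must match the length of the new text');
  }
  const { start, length } = change.span;
  return oldText.slice(0, start) + newText + oldText.slice(start + length);
}

// Insert 7 characters at offset 6, the end of the old text, replacing nothing.
const updated = applyChange('(foo);', '=> bar;', {
  span: { start: 6, length: 0 },
  newLength: 7
});

console.log(updated); // (foo);=> bar;
```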

Options

The options for recovery mode are about the same as for parseScript and parseModule, except that you have to set {module: true} when parsing in module goal.

No options can be set during an incremental update, because a node can only be reused if it was parsed in the same context that the parser is currently in.

AST

One of the design goals for Escaya has been that the abstract syntax tree (AST) shouldn't change. It should be the same whether you are parsing in normal mode or recovery mode, but there are a couple of exceptions.

For example, recovery mode creates a RootNode instead of either a Module or a Script. This RootNode carries additional information such as diagnostics, context masks and mutable parser flags. These carry over from recovery mode to incremental parsing and let you continue to parse in the same context, unless you set a strict directive on the RootNode. If you do this, Escaya will parse in strict mode, and you will not be able to recover any nodes from the old tree if it was first parsed in sloppy mode, because a node can only be reused if it was parsed in the same context that the parser is in now.
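
The reuse rule can be pictured with bit flags. The sketch below is not Escaya's implementation: the flag values, the ReusableNode shape and the canReuse helper are all hypothetical, chosen only to show why a node parsed in sloppy mode stops being a reuse candidate once the parser is in strict mode.

```typescript
// Hypothetical context flags; Escaya's real masks are not shown in this document.
const CONTEXT_NONE = 0;
const CONTEXT_STRICT = 1 << 0;
const CONTEXT_MODULE = 1 << 1;

// Hypothetical node shape that remembers the context it was parsed in.
interface ReusableNode {
  kind: string;
  context: number;
}

// A node is a reuse candidate only if it was parsed in exactly the current context.
function canReuse(node: ReusableNode, currentContext: number): boolean {
  return node.context === currentContext;
}

const sloppyNode: ReusableNode = { kind: 'ExpressionStatement', context: CONTEXT_NONE };
const moduleNode: ReusableNode = {
  kind: 'ExpressionStatement',
  context: CONTEXT_STRICT | CONTEXT_MODULE
};

console.log(canReuse(sloppyNode, CONTEXT_NONE)); // true
console.log(canReuse(sloppyNode, CONTEXT_STRICT)); // false
console.log(canReuse(moduleNode, CONTEXT_STRICT | CONTEXT_MODULE)); // true
```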

Escaya recovery mode vs. Acorn Loose

The main difference is that Escaya's recovery mode conforms to the ECMAScript® 2021 specification, while Acorn Loose does not. Acorn Loose is not even a JavaScript parser. You can play with Acorn Loose on ASTExplorer and you will notice the differences.

As an example, you will get a BlockStatement if you try to parse something like try.

Performance

