Skip to content

Latest commit

 

History

History
267 lines (193 loc) · 12.1 KB

File metadata and controls

267 lines (193 loc) · 12.1 KB

This is a very impressive and ambitious project. Building a new language, even one that transpiles to another, is a significant undertaking. The code is well-structured, the documentation is excellent, and the test coverage at the integration level is fantastic.

Here is a detailed review, focusing on the "glaring issues" you asked for, as well as potential areas for improvement.

High-Level Architecture

The overall design is sound and clever. By compiling Dasy Lisp syntax into a Vyper AST and then handing off to the Vyper compiler, you gain several major advantages:

  1. Security & Optimization: You are piggy-backing on the extensive security analysis and optimization work done by the Vyper team. This is a huge win and the right architectural choice.
  2. EVM Compatibility: You're guaranteed to produce correct and efficient EVM bytecode as long as your AST generation is correct.
  3. Developer Experience: You can focus on providing a powerful, macro-driven Lisp experience without needing to reinvent the entire compiler backend.

The use of Hy for macros and some parser utilities is a natural and excellent fit for the project.


Critical Issues & Glaring Problems

These are the most significant issues that could lead to bugs, security vulnerabilities, or unpredictable behavior.

1. Recursive Compilation and Circular Dependencies in Macros

This is the most critical issue in the codebase.

File: dasy/dasy/builtin/macros.hy

Macros: interface! and include!

(defmacro interface! [filename]
  (import dasy) ;; <-- Problem 1: Imports the whole package
  (import os)
  ;; Problem 2: Path is relative to Current Working Directory, not the source file
  (let [path (+ (.getcwd os) "/" filename)
        data (.compile-file dasy path) ;; <-- Problem 3: Re-enters the compiler
        interface-str (.get-external-interface dasy data)]
    (.read dasy interface-str)))

Analysis:

  1. Circular Dependency: A macro, which is executed during compilation, imports the dasy package. The dasy package itself initializes the parser and compiler. This creates a highly coupled and potentially circular dependency.
  2. Recursive Compilation: The macro calls dasy.compile_file(). This means the compiler is calling itself to compile another file just to get its interface. While this is a powerful feature, it's fraught with peril:
    • What happens if examples/test_interface.vy has a syntax error? The main compilation will fail with a potentially confusing traceback originating from the macro expansion.
    • What if test_interface.vy itself uses an interface! macro? This could lead to deep, hard-to-debug recursion.
  3. Path Resolution: The use of os.getcwd() is incorrect. It makes compilation dependent on where the user runs the dasy command from, not where the source file is located. If a user tries to compile dasy/examples/interface.dasy from the root directory, os.getcwd() will be /path/to/dasy, and path will be /path/to/dasy/examples/test_interface.vy, which is correct. But if they cd dasy and run dasy ../examples/interface.dasy, it will fail.

Recommendation:

  • Decouple Macro Helpers: Create a separate, minimal module for file reading and compilation that macros can import without pulling in the entire dasy package. This module should not depend on the main parser loop.
  • Fix Path Resolution: The parser or macro expansion context needs to know the path of the file currently being compiled. Hy's macro system can provide this. The path to the included/interfaced file should be resolved relative to the current source file's path, not the CWD. pathlib is your friend here.
  • Avoid Global State (see next point): This will make recursive compilation safer.

2. Use of Global State in Parser

File: dasy/dasy/parser/parse.py

Globals: SRC, CONSTS

# ...
SRC = ""
CONSTS = {}

def parse_node(node):
    # ... uses CONSTS ...

def parse_src(src: str):
    global SRC
    SRC = src
    # ...
    # parse_node is called, which eventually calls add_src_map
    # add_src_map uses the global SRC
    return add_src_map(SRC, node, ast_node)

Analysis: The parser is not re-entrant. SRC is a global variable holding the source code for the current file. CONSTS is a global dictionary holding constants.

  • If you ever wanted to support parallel compilation in the same process, this would fail catastrophically.
  • It makes the recursive compilation in the interface! macro even more dangerous. When the compiler is re-entered, this global state could be overwritten, leading to incorrect source mapping or constant resolution for the original file.

Recommendation:

  • Refactor parse_src and parse_node to accept a Context object or pass src and consts down the call stack as arguments. This makes the parser pure and re-entrant.
# Suggested change
def parse_src(src: str):
    # No more global SRC
    mod_node = ...
    settings = {}
    consts = {} # Local to this compilation
    for element in dasy_read_many(src):
        # Pass context down
        ast = parse_node(element, src, consts)
        # ...
        if cmd_str == "defconst":
             consts[str(expr[1])] = expr[2] # Mutate local dict
             return None
    # ...

def parse_node(node, src, consts):
    # ...
    ast_node = ...
    # ...
    return add_src_map(src, node, ast_node) # Pass src explicitly

Potential Improvements & Refactoring

These are less critical but would improve code quality, maintainability, and robustness.

1. Code Duplication

Files: dasy/dasy/parser/comparisons.py and dasy/dasy/parser/ops.py

Both files define an almost identical chain_comps function. It appears dasy/parser/ops.py is the one being used. The dasy/parser/comparisons.py file seems redundant or a leftover from a refactor and could likely be removed to avoid confusion.

2. Error Handling

The codebase frequently uses raise Exception(...). It would be better to define a hierarchy of custom exceptions.

class DasyError(Exception):
    pass

class DasySyntaxError(DasyError):
    # Could include line/column info
    pass

class DasyCompilerError(DasyError):
    pass

This allows consumers of dasy as a library (like an IDE extension or build tool) to catch specific errors and provide better feedback to the user.

3. Parser Dispatch Mechanism

File: dasy/dasy/parser/parse.py in parse_expr

The current dispatch mechanism is a large if/elif chain over strings.

def parse_expr(expr):
    # ...
    if is_op(cmd_str): # ...
    if cmd_str in nodes.handlers: # ...
    node_fn = f"parse_{cmd_str}"
    for ns in [nodes, core, macros, functions]:
        if hasattr(ns, node_fn):
            return getattr(ns, node_fn)(expr)
    # ...

This works, but it can be hard to trace and extend. A dictionary-based dispatch table is a common and often cleaner alternative.

# A potential alternative structure
DISPATCH_TABLE = {
    "defn": core.parse_defn,
    "defvar": core.parse_defvar,
    # ... map all core forms
}

def parse_expr(expr):
    cmd_str = ALIASES.get(str(expr[0]), str(expr[0]))
    # ...
    if handler := DISPATCH_TABLE.get(cmd_str):
        return handler(expr)
    # ... handle ops, macros, calls etc.

4. Complex Method Call Syntax Parsing

The logic in parse.py:parse_call and parse.py:parse_expr to handle method calls like (.append self/nums 1) (which becomes ((. self nums) append 1)) and especially the (. None method) pattern is clever but complex and a bit "magical". The doto macro is a much cleaner and more idiomatic Lisp pattern for this. It might be worth considering simplifying the core syntax and encouraging doto for these use cases.


Strengths

It's important to highlight what the project does well, and it does a lot.

  1. Excellent Documentation: The README.org, docs.org, and dasybyexample.org are fantastic. Providing clear documentation and a "by example" guide is crucial for any new language, and you have nailed this.
  2. Comprehensive Integration Tests: tests/test_dasy.py is a model for how to test a compiler. It covers a vast range of language features by compiling and executing example files. The use of boa is perfect for this.
  3. Smart Language Features:
    • The DasyReader for handling 0x literals is a small but very smart detail that improves usability.
    • The built-in macros like cond, condp, doto, and the field accessors (set-in, get-at) provide significant ergonomic improvements over raw Vyper AST construction.
    • The automatic chaining of binary operators ((+ 1 2 3)) is a great Lisp-y feature.
  4. Clean Code Structure: The project is well-organized into logical modules (parser, compiler, builtin), which makes it easy to navigate.

Final Verdict

You have a very solid foundation for an exciting project. The architecture is smart, the documentation is superb, and the feature set is already rich.

To move from "experimental pre-alpha" to a more stable state, I would strongly recommend focusing on these two action items in order:

  1. Refactor the interface! and include! macros to eliminate recursive compilation and fix path handling.
  2. Remove global state from the parser to make it re-entrant and robust.

After addressing these critical issues, the project will be in a much more stable and secure position for further development. Fantastic work so far!


Implementation Progress

✅ Completed (2024-12-06)

1. Fixed Path Resolution in Macros

  • Problem: interface! and include! used os.getcwd() making compilation dependent on working directory
  • Solution:
    • Created ParseContext class to carry source file path through compilation
    • Macros now resolve paths relative to context.base_dir (source file's directory)
    • Tested and verified compilation works from any directory

2. Removed Global State from Parser

  • Problem: Global SRC and CONSTS variables made parser non-reentrant
  • Solution:
    • Created ParseContext object to hold source code, constants, and file path
    • Refactored all parser functions to accept context parameter
    • Added backwards compatibility layer (parse_node_legacy) for gradual migration
    • Parser is now thread-safe and reentrant

3. Fixed Recursive Compilation and Circular Dependencies

  • Problem: interface! macro could cause infinite recursion by importing itself
  • Solution:
    • Created compile_for_interface() function for minimal compilation
    • Added circular dependency detection with clear error messages
    • Used thread-local storage to track compilation and include stacks
    • CircularDependencyError provides helpful debugging information

✅ Recently Completed (2024-12-06 - Part 2)

4. Removed Code Duplication

  • Problem: chain_comps function duplicated in both comparisons.py and ops.py
  • Solution:
    • Removed unused comparisons.py file entirely
    • ops.py is the canonical location for comparison operations
    • All tests pass after removal

5. Implemented Custom Exception Hierarchy

  • Problem: Generic Exception and ValueError used throughout
  • Solution:
    • Created dasy/exceptions.py with specialized exception classes
    • Exception hierarchy: DasyException (base) with subclasses for syntax, type, compilation errors etc.
    • Replaced all generic exceptions with appropriate specific types
    • Better error messages and easier debugging for users

📝 Remaining Tasks

High Priority

None - All high priority items completed!

Medium Priority

  1. Complex method call syntax - Consider simplifying the (. None method) pattern

Low Priority

  1. Refactor parser dispatch - Replace if/elif chain with dictionary-based dispatch

🎯 Next Steps

The critical architectural issues have been resolved. The codebase is now much more robust with:

  • Proper path handling independent of working directory
  • Thread-safe, reentrant parser without global state
  • Protection against circular dependencies with helpful error messages

All 39 tests pass. The foundation is solid for further improvements