Some fun weekend explorations to learn more about earley parsing and constrained decoding with LLMs.
Early stage exploration and prototyping.
earley.pyimplements earley chart filling algorithm with extensions to support incremental parsing and backtracking. Example usage inexample_incremental_parsing.py(using thetry_accept_tokensfunction).logits_processor.pyimplements a hf transformers LogitProcessor which forces sampling/decoding to choose the token with highest predicted probablity that is permitted by the grammar. Example usage inexample_constrained_decoding.py.
- Developed with python 3.12. Though, python 3.10+ will likely work.
- Create an environment for this project. One way is using conda:
conda create -n earley-decoding python=3.12 - Activate your environment:
conda activate earley-decoding - Install dependencies using
pip install -r requirements.txtin your environment.
- Leo optimizations and support for nullable grammars.
- Token healing in constrained decoding similar to the
guidancelibrary. Currently we do not try alternate token sequences with the same prefix when sampling the next tokens. Without this, generation seems to often be degenerate. - Logits processor implementation is very naive right now and slow -- there must be ways we can optimize (e.g. caching prefixes). We can probably use microsoft
aicilibrary for more low level control. - ...