Understanding Regexes

Allen Rohner edited this page Jun 9, 2018 · 3 revisions

Spectrum's regexes are mostly the same as clojure.spec's. There are some tricky implementation details that, once understood, make the code easier to follow. While this page focuses on spectrum's implementation, most of it applies to clojure.spec as well.

Regex vs. Spect

There are two main differences between regexes and normal spects: 1) they consume seqs of data, and 2) they are stateful, in that each step returns an updated spec recording what has been consumed so far.

Conforming

When conforming a regex, i.e. (re-conform s data), re-conform repeatedly calls (derivative s (first data)), carrying the returned spec forward to the next element, until the data is exhausted or the spec returns reject, whichever comes first.
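
The loop described above can be sketched with a toy model. This is not spectrum's code: toy-derivative, toy-re-conform, the map shape {:preds ... :return ...}, and the :reject keyword are all hypothetical stand-ins, and treating "data ran out before the spec was satisfied" as a reject is an assumption of the sketch.

```clojure
;; Toy model only (hypothetical names, not spectrum's API).
;; A spec is a map: :preds are the predicates still to match,
;; :return holds the values consumed so far.
(defn toy-derivative
  "Try to consume x. Returns an updated spec, or :reject if x does not match."
  [s x]
  (let [[p & more] (:preds s)]
    (if (and p (p x))
      (-> s
          (assoc :preds (vec more))
          (update :return conj x))
      :reject)))

(defn toy-re-conform
  "Call toy-derivative on each element in turn, stopping on :reject
  or when the data is exhausted."
  [s data]
  (loop [s s
         data (seq data)]
    (cond
      (= :reject s) :reject
      ;; Assumption: leftover predicates at end of input count as a reject.
      (nil? data)   (if (empty? (:preds s)) (:return s) :reject)
      :else         (recur (toy-derivative s (first data)) (next data)))))

(toy-re-conform {:preds [int? keyword?] :return []} [3 :foo]) ;; => [3 :foo]
(toy-re-conform {:preds [int? keyword?] :return []} [3 4])    ;; => :reject
```

Note that the spec is never mutated: each iteration binds s to the value returned by the previous step.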

Derivatives

(derivative s x)

derivative says "given spec s and value x, attempt to consume x". If x matches, derivative will return an updated version of s.

(let [s (cat- [int? keyword?])
      s* (derivative s 3)
      s** (derivative s* :foo)]
  s**)

(derivative s 3) will return (cat- [keyword?]), with an internal field called :return containing the value [3]. s** will be (cat- []) with :return [3 :foo].
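
To make the intermediate values concrete, here is a minimal sketch that models cat- as a plain map. toy-cat and toy-derivative are hypothetical names invented for illustration; spectrum's real cat- and its :return field may be represented differently.

```clojure
;; Toy stand-in for cat- (hypothetical, not spectrum's real record).
(defn toy-cat [preds]
  {:preds (vec preds) :return []})

(defn toy-derivative
  "Consume x against the first remaining predicate, or return :reject."
  [s x]
  (let [[p & more] (:preds s)]
    (if (and p (p x))
      (-> s
          (assoc :preds (vec more))
          (update :return conj x))
      :reject)))

(def s   (toy-cat [int? keyword?]))
(def s*  (toy-derivative s 3))
(def s** (toy-derivative s* :foo))

(:return s*)        ;; => [3]
(count (:preds s*)) ;; => 1, only keyword? remains
(:return s**)       ;; => [3 :foo]
(:preds s**)        ;; => []
(:return s)         ;; => [], the original spec is unchanged
```

The "state" of the regex lives entirely in the returned value, so s, s*, and s** can all be inspected independently.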

non-regexes implementing p/Regex

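Why do non-regex spects such as Pred implement p/Regex? To simplify conforming: regex code can call derivative on any element without first checking whether it is a regex. For example, (derivative (pred int?) (value 3)) => accept.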

reject vs. invalid

Q: What is the point of reject? A: reject is used in regex contexts to say "this particular step doesn't work, but that doesn't mean the whole regex is invalid". Consider (derivative (alt int? keyword?) (value :foo)). alt will call derivative on each possibility in turn. (dx int? :foo) will return reject, but (dx keyword? :foo) will return (accept :foo), so the alt as a whole still matches and matching continues.
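
The alt behaviour can be sketched in the same toy style. toy-pred-derivative, toy-alt-derivative, and the [:accept x] / :reject return values are hypothetical. Returning the first non-reject result is a simplifying assumption, and a real implementation may instead keep several live alternatives.

```clojure
;; Toy model only (hypothetical names, not spectrum's API).
(defn toy-pred-derivative
  "A predicate spec either accepts the value or rejects this step."
  [pred x]
  (if (pred x) [:accept x] :reject))

(defn toy-alt-derivative
  "Try each alternative in turn. A :reject from one branch is not fatal;
  the alt as a whole rejects only if every branch rejects."
  [preds x]
  (or (some (fn [p]
              (let [r (toy-pred-derivative p x)]
                (when (not= :reject r) r)))
            preds)
      :reject))

(toy-alt-derivative [int? keyword?] :foo)  ;; => [:accept :foo]
(toy-alt-derivative [int? keyword?] "foo") ;; => :reject
```

In the first call the int? branch rejects, yet the overall result is an accept, which is exactly the distinction between a rejected step and an invalid regex.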
