srdatalog.dsl¶
Python front-end DSL for SRDatalog, replacing the Nim macro DSL in lang.nim/syntax.nim.
Rules are constructed with Python objects and operator overloading:
X, Y, Z = Var("X"), Var("Y"), Var("Z")
edge = Relation("Edge", 2)
path = Relation("Path", 2)
Module Contents¶
Classes¶
Agg | Aggregation body clause. Binds result_var to the aggregate of rel(args...) using func.
ArgKind | Mirrors Nim ClauseArgKind in syntax.nim.
Atom | A relation application, used as head or body clause.
ClauseArg | An argument slot in an atom: a logic var, a compile-time constant, or raw C++ code.
Conjunction | Intermediate: accumulates body clauses under "&". Not emitted directly.
Const | A compile-time constant argument wrapping a Python value.
Filter | Inline filter: return <cpp_code> against bound vars.
HeadGroup | Intermediate: accumulates head atoms under "|".
Let | Bind a fresh variable to a C++ expression. Produced by the head-constant-rewriting pass when a head has literal args; the head arg is replaced by a fresh variable and a corresponding Let is appended to the body.
Negation | Negated atom (~rel(...)). Appears only in rule bodies.
PlanEntry | User-specified plan for a rule variant. Mirrors PlanEntry in syntax.nim.
Program | A Datalog program. Takes rules; the relations list is derived from them via the Relation back-ref on each Atom.
Relation | A relation declaration. Callable to build atoms.
Rule | A Datalog rule: head_1, head_2, ... :- body_1, body_2, ....
Split | Split marker: partitions a rule body into above-split and below-split sections. Mirrors Nim's SplitClause (split keyword).
Var | A logic variable. Distinct from Python values; used by operator overloads to build AST.
Functions¶
Data¶
API¶
- class srdatalog.dsl.Agg[source]¶
Aggregation body clause. Binds result_var to the aggregate of rel(args...) using func (C++ aggregator name; "agg" + cpp_type for custom aggregators, mirrors Nim's AggClause).
Example: count of R(x, y) bound to c:
agg(c, "count", r(x, y))
Nim's HIR emits these into JSON as {"kind": "aggregation", ...} but its lowering pipeline does not construct moAggregate nodes from AggClause (zero such constructions in src/srdatalog). Python mirrors that behavior: Agg round-trips through HIR but does not appear in MIR.
- __and__(other: srdatalog.dsl.BodyClauseT | srdatalog.dsl.Conjunction) srdatalog.dsl.Conjunction[source]¶
- args: tuple[srdatalog.dsl.ClauseArg, ...]¶
None
- class srdatalog.dsl.ArgKind(*args, **kwds)[source]¶
Bases: enum.Enum
Mirrors Nim ClauseArgKind in syntax.nim.
Initialization
- CONST¶
‘const’
- CPP_CODE¶
‘code’
- LVAR¶
‘var’
- class srdatalog.dsl.Atom[source]¶
A relation application, used as head or body clause.
Build via Relation.__call__, never directly. Supports & to chain into a body conjunction and <= to form a rule with this atom as head.
prov carries rewrite provenance: set by passes like semi-join optimization when a rewritten body clause is emitted in place of the original. Defaults to user-written.
- __and__(other) srdatalog.dsl.Conjunction[source]¶
Compose with Atom / Negation / Filter / Let / Conjunction.
- __invert__() srdatalog.dsl.Negation[source]¶
~atom = negation.
- __le__(body) srdatalog.dsl.Rule[source]¶
head <= body → Rule. Anonymous; call .named(name) to label. body can be any BodyClauseT or a Conjunction of them.
- __or__(other: srdatalog.dsl.Atom | srdatalog.dsl.HeadGroup) srdatalog.dsl.HeadGroup[source]¶
Compose atoms into a multi-head group:
A | B | C <= body.
- args: tuple[srdatalog.dsl.ClauseArg, ...]¶
None
- prov: srdatalog.ir.hir.provenance.Provenance¶
None
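The way these operators compose can be pictured with a small stand-alone sketch. The classes below (MiniAtom, MiniConj, MiniNeg, MiniRule) and the rel helper are hypothetical stand-ins, not srdatalog's implementation; they only illustrate how Python's precedence (~ binds tighter than &, which binds tighter than <=) lets a rule be written without extra parentheses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MiniConj:
    """Hypothetical stand-in for a body conjunction."""
    clauses: tuple

    def __and__(self, other):
        # Flatten nested conjunctions so the body stays a flat tuple.
        extra = other.clauses if isinstance(other, MiniConj) else (other,)
        return MiniConj(self.clauses + extra)


@dataclass(frozen=True)
class MiniNeg:
    """Hypothetical stand-in for a negated atom."""
    atom: "MiniAtom"

    def __and__(self, other):
        return MiniConj((self,)) & other


@dataclass(frozen=True)
class MiniRule:
    heads: tuple
    body: tuple


@dataclass(frozen=True)
class MiniAtom:
    rel: str
    args: tuple

    def __and__(self, other):
        return MiniConj((self,)) & other

    def __invert__(self):
        return MiniNeg(self)

    def __le__(self, body):
        # head <= body builds a rule instead of comparing values.
        clauses = body.clauses if isinstance(body, MiniConj) else (body,)
        return MiniRule(heads=(self,), body=clauses)


def rel(name):
    """Return a callable that builds MiniAtom values for relation `name`."""
    return lambda *args: MiniAtom(name, args)


edge, path, blocked = rel("Edge"), rel("Path"), rel("Blocked")
rule = path("X", "Z") <= path("X", "Y") & edge("Y", "Z") & ~blocked("Z")
```

Here rule.body holds three clauses in source order, the last one wrapped in MiniNeg.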
- srdatalog.dsl.BodyClauseT¶
None
- class srdatalog.dsl.ClauseArg[source]¶
An argument slot in an atom: a logic var, a compile-time constant, or raw C++ code.
- kind: srdatalog.dsl.ArgKind¶
None
- class srdatalog.dsl.Conjunction[source]¶
Intermediate: accumulates body clauses under &. Not emitted directly.
- __and__(other: srdatalog.dsl.BodyClauseT | srdatalog.dsl.Conjunction) srdatalog.dsl.Conjunction[source]¶
- clauses: tuple[srdatalog.dsl.BodyClauseT, ...]¶
None
- class srdatalog.dsl.Const(value, cpp_expr: str | None = None)[source]¶
A compile-time constant argument wrapping a Python value.
Prefer this over bare int arguments when you want the intent explicit at the call site, e.g. Method_Modifier(Const(abstract_id), meth) instead of Method_Modifier(abstract_id, meth), where abstract_id is a Python int that readers can't tell apart from a pure-Python value.
For dataset-resolved constants (read from a meta.json at program construction time), this is the recommended shape:
meta = load_meta("batik_meta.json")
ABSTRACT = Const(meta["abstract"])  # Python binding, value baked in
Method_Modifier(ABSTRACT, meth)
cpp_expr overrides the auto-derived C++ literal. For int it defaults to str(value). Other types require an explicit cpp_expr for now; automatic literals will be added for them when needed.
Initialization
- __slots__¶
(‘cpp_expr’, ‘value’)
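The cpp_expr defaulting rule can be sketched in a few lines. MiniConst is a hypothetical stand-in written for illustration, not the library's class; in particular, rejecting bool (which is a subclass of int in Python) is an assumption of this sketch.

```python
class MiniConst:
    """Hypothetical stand-in for Const; not the library's implementation."""

    __slots__ = ("cpp_expr", "value")

    def __init__(self, value, cpp_expr=None):
        if cpp_expr is None:
            # Assumption: only plain ints get an automatic C++ literal.
            if isinstance(value, bool) or not isinstance(value, int):
                raise TypeError(
                    f"cpp_expr is required for {type(value).__name__} values"
                )
            cpp_expr = str(value)
        self.value = value
        self.cpp_expr = cpp_expr


abstract = MiniConst(7)                  # literal derived from the int
ratio = MiniConst(0.5, cpp_expr="0.5f")  # non-int needs an explicit literal
```

Raising early for non-int values without cpp_expr keeps a missing literal from surfacing later as invalid generated C++.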
- class srdatalog.dsl.Filter[source]¶
Inline filter: return <cpp_code> against bound vars. Mostly produced by the constant-rewriting pass (where e.g. R(1, x) becomes R(_c0, x) plus Filter((_c0,), "return _c0 == 1;")), but available in the surface DSL too.
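A minimal sketch of the rewrite described above, assuming a simplified model in which an atom is a (name, args) tuple, variables are strings, and a filter is a (vars, cpp_code) pair. The helper rewrite_constants is hypothetical and only mirrors the R(1, x) example; the real pass operates on the DSL's AST.

```python
def rewrite_constants(rel_name, args):
    """Replace int literals in `args` with fresh variables plus filter code.

    Hypothetical helper: atoms are (name, args) tuples and variables are
    plain strings, a simplification of the real AST.
    """
    new_args, filters = [], []
    for arg in args:
        if isinstance(arg, int):
            fresh = f"_c{len(filters)}"  # fresh variable name per literal
            new_args.append(fresh)
            filters.append(((fresh,), f"return {fresh} == {arg};"))
        else:
            new_args.append(arg)
    return (rel_name, tuple(new_args)), filters


atom, filters = rewrite_constants("R", (1, "x"))
# atom    -> ("R", ("_c0", "x"))
# filters -> [(("_c0",), "return _c0 == 1;")]
```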
- class srdatalog.dsl.HeadGroup[source]¶
Intermediate: accumulates head atoms under |. Mirrors Nim's {(A args), (B args)} <-- body multi-head rule form.
- __le__(body) srdatalog.dsl.Rule[source]¶
- __or__(other: srdatalog.dsl.Atom | srdatalog.dsl.HeadGroup) srdatalog.dsl.HeadGroup[source]¶
- atoms: tuple[srdatalog.dsl.Atom, ...]¶
None
- class srdatalog.dsl.Let[source]¶
Bind a fresh variable to a C++ expression. Produced by the head-constant-rewriting pass when a head has literal args; the head arg is replaced by a fresh variable and a corresponding Let is appended to the body (so the fresh variable is bound before InsertInto reads it).
- class srdatalog.dsl.Negation[source]¶
Negated atom (~rel(...)). Appears only in rule bodies.
- __and__(other: srdatalog.dsl.BodyClauseT | srdatalog.dsl.Conjunction) srdatalog.dsl.Conjunction[source]¶
- atom: srdatalog.dsl.Atom¶
None
- class srdatalog.dsl.PlanEntry[source]¶
User-specified plan for a rule variant. Mirrors PlanEntry in syntax.nim.
delta == -1 targets the base (non-recursive) variant; otherwise it is the body-clause index used as the delta seed for semi-naive evaluation. var_order and clause_order override the default planning heuristic; when only var_order is given, clause_order is derived from it.
The pragma flags flow through to HirRuleVariant so codegen sees them:
fanout -> fan-out work-stealing for Cartesian products
work_stealing -> mid-level work-stealing (task queue + steal loop)
block_group -> block-group work partitioning
dedup_hash -> GPU hash table for in-kernel existential dedup
balanced_root / balanced_sources drive balanced partitioning for skewed joins (not yet lowered in Python).
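To make the role of the delta seed concrete, here is a plain-Python sketch of semi-naive evaluation for the transitive-closure rules. It uses sets of tuples rather than any srdatalog machinery; the function name and data layout are illustrative assumptions.

```python
def transitive_closure(edges):
    """Semi-naive evaluation of Path from Edge, using plain Python sets."""
    # Base variant (delta == -1): Path(X, Y) :- Edge(X, Y).
    path = set(edges)
    delta = set(edges)
    while delta:
        # Recursive variant seeded on body clause 0 (delta == 0):
        # Path(X, Z) :- delta_Path(X, Y), Edge(Y, Z).
        derived = {(x, z) for (x, y) in delta for (y2, z) in edges if y == y2}
        delta = derived - path  # keep only facts not seen before
        path |= delta
    return path


closure = transitive_closure({(1, 2), (2, 3), (3, 4)})
```

Each round joins only the newly derived Path facts against Edge, which is why the plan names the body clause that plays the delta role.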
- class srdatalog.dsl.Program[source]¶
A Datalog program. Takes rules; the relations list is derived from them via the Relation back-ref on each Atom.
The previous API took relations=[...] in parallel with rules=[...]. That was a pure bug generator: if a relation was declared but never used, or used but never declared, the downstream passes silently generated wrong code. With the derived list, the schema is exactly the set of relations referenced by some rule, in rule-first-occurrence order (heads before body, body in source order). This matches the Nim-side normalization in hir.nim:normalizeDecls and keeps byte-match across the two ports.
- add(*items: srdatalog.dsl.Rule) srdatalog.dsl.Program[source]¶
- relations: list[srdatalog.dsl.Relation]¶
‘field(…)’
- rules: list[srdatalog.dsl.Rule]¶
‘field(…)’
- show(*, rule: str | None = None, delta: int | None = None, theme: str = 'dark', include_jit: bool = True, height_px: int = 600) None[source]¶
Render this program in Jupyter with full options.
Args:
rule: when None, shows the ruleset overview (the default the cell's prog expression already produces). When a string, drills into that rule's plan view: variant access patterns, clause order, var order with drag-to-reorder.
delta: only meaningful with rule. Filters to a single variant of the rule, e.g. delta=0 shows just the variant seeded on body clause 0. Recursive rules emit one variant per body clause for semi-naive evaluation; this is how you isolate one of those "versions". Default None shows all.
theme: 'dark' (default), 'light', or 'high-contrast'. Controls the renderer's color palette inside the iframe, independent of VS Code's editor theme.
include_jit: include per-rule JIT C++ kernels. Adds ~2-3 MB on doop; off by default in _repr_mimebundle_ for cell rerun speed, on by default here since you're explicitly invoking.
height_px: iframe height. Bump for larger rulesets.
Examples:
prog.show()  # ruleset, dark, with JIT
prog.show(rule='TCRec')  # all variants of TCRec
prog.show(rule='TCRec', delta=0)  # just delta-0 variant
prog.show(theme='light')  # light mode
prog.show(rule='VPT_Load', delta=1, theme='light', height_px=900)
Requires IPython.
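The derived relations list that Program relies on (rule-first-occurrence order, heads before body, body in source order) can be sketched as follows. The function derive_relations and the tuple-based rule model are hypothetical simplifications, not the library's code.

```python
def derive_relations(rules):
    """Collect relation names in rule-first-occurrence order.

    Hypothetical model: each rule is (heads, body), and each atom is a
    (relation_name, args) tuple.
    """
    seen = {}
    for heads, body in rules:
        for name, _args in (*heads, *body):  # heads before body
            seen.setdefault(name, None)      # dicts keep insertion order
    return list(seen)


rules = [
    # Path(X, Y) :- Edge(X, Y)
    ((("Path", ("X", "Y")),), (("Edge", ("X", "Y")),)),
    # Path(X, Z) :- Path(X, Y), Edge(Y, Z)
    ((("Path", ("X", "Z")),), (("Path", ("X", "Y")), ("Edge", ("Y", "Z")))),
]
order = derive_relations(rules)  # ["Path", "Edge"]
```

Because the list is computed from the rules alone, a relation cannot end up declared but unused, or used but undeclared.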
- class srdatalog.dsl.Relation(name: str, arity: int, column_types: tuple[type, ...] | None = None, *, input_file: str = '', print_size: bool = False, output_file: str = '', index_type: str = '', semiring: str = 'NoProvenance')[source]¶
A relation declaration. Callable to build atoms.
Arity + column_types are structural metadata. Pragma fields (all optional) mirror Nim’s Relation[…] pragmas:
input_file → CSV the load-data block reads into this relation
print_size → runner emits a size-readback line after the fixpoint
output_file → runner writes the final contents to this path
index_type → C++ index template (e.g. “SRDatalog::GPU::Device2LevelIndex”)
semiring → override “NoProvenance” (rare — provenance semirings)
Initialization
- __call__(*args) srdatalog.dsl.Atom[source]¶
- __slots__¶
(‘arity’, ‘column_types’, ‘index_type’, ‘input_file’, ‘name’, ‘output_file’, ‘print_size’, ‘semiring…
- class srdatalog.dsl.Rule[source]¶
A Datalog rule: head_1, head_2, ... :- body_1, body_2, ....
heads is always a tuple of one or more Atoms (mirrors Nim's Rule.head: seq[HeadClause]). Build multi-head rules with (A | B | C) <= body; single-head still reads A <= body.
plans holds user-provided PlanEntry overrides (one per delta position).
count marks a rule as count-only: no materialization, just the cardinality.
semi_join opts the rule into the Pass 1.5 semi-join optimization.
is_generated is True for compiler-synthesised rules (e.g. the _SJ_Target_Filter_... helpers emitted by semi-join optimization).
prov carries rewrite provenance (user vs compiler-gen); mirrors syntax.nim's Rule.prov.
- body: tuple[srdatalog.dsl.BodyClauseT, ...]¶
None
- property head: srdatalog.dsl.Atom¶
First head (convenience for single-head rules). For multi-head, iterate self.heads.
- heads: tuple[srdatalog.dsl.Atom, ...]¶
None
- named(name: str) srdatalog.dsl.Rule[source]¶
- plans: tuple[srdatalog.dsl.PlanEntry, ...]¶
()
- prov: srdatalog.ir.hir.provenance.Provenance¶
None
- with_count() srdatalog.dsl.Rule[source]¶
Mark this rule as count-only.
- with_inject_cpp(code: str) srdatalog.dsl.Rule[source]¶
Attach a C++ debug hook to be emitted as an InjectCppHook MIR node once per variant (after the rule's pipeline runs). Mirrors Nim's inject_cpp: "..." rule pragma.
- with_plan(*, delta: int = -1, var_order: tuple[str, ...] | list[str] | None = None, clause_order: tuple[int, ...] | list[int] | None = None, fanout: bool = False, work_stealing: bool = False, block_group: bool = False, dedup_hash: bool = False, balanced_root: tuple[str, ...] | list[str] | None = None, balanced_sources: tuple[str, ...] | list[str] | None = None) srdatalog.dsl.Rule[source]¶
Append a single PlanEntry. Can be called multiple times to add entries for different deltas (or use .with_plans(entries) to replace).
- with_plans(entries: list[srdatalog.dsl.PlanEntry] | tuple[srdatalog.dsl.PlanEntry, ...]) srdatalog.dsl.Rule[source]¶
Replace all plans with the given sequence.
- with_semi_join() srdatalog.dsl.Rule[source]¶
Opt into semi-join optimization (Pass 1.5). Ignored on rules with <= 2 body clauses (the pass skips them per Nim’s semantics).
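The chaining style of named, with_plan and with_plans can be sketched with frozen dataclasses. SketchRule and SketchPlan are hypothetical; whether the real Rule returns a new object or updates itself in place is not stated here, so the copy-on-call behaviour below is an assumption of this sketch.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SketchPlan:
    """Hypothetical, reduced plan entry: only delta and var_order."""
    delta: int = -1
    var_order: tuple = ()


@dataclass(frozen=True)
class SketchRule:
    """Hypothetical, reduced rule: only the fields the builders touch."""
    name: str = ""
    plans: tuple = ()
    count: bool = False

    def named(self, name):
        return replace(self, name=name)

    def with_count(self):
        return replace(self, count=True)

    def with_plan(self, *, delta=-1, var_order=None):
        entry = SketchPlan(delta, tuple(var_order or ()))
        return replace(self, plans=self.plans + (entry,))  # append one entry

    def with_plans(self, entries):
        return replace(self, plans=tuple(entries))  # replace all entries


rule = (
    SketchRule()
    .named("TCRec")
    .with_plan(delta=0, var_order=["Y", "X", "Z"])
    .with_plan(delta=1, var_order=["Y", "Z", "X"])
)
```

Calling with_plan twice leaves two entries, one per delta, while with_plans discards whatever was there before.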
- srdatalog.dsl.SPLIT¶
‘Split(…)’
- class srdatalog.dsl.Split[source]¶
Split marker: partitions a rule body into above-split and below-split sections. Mirrors Nim's SplitClause (split keyword).
Pipeline A writes the above-split output to a temp relation; Pipeline B scans the temp and joins with below-split clauses to produce the head. Useful for negation pushdown / selective join evaluation. At most one Split per rule body.
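The partitioning step can be sketched as a small helper. SPLIT_MARK and partition_body are hypothetical names, and body clauses are modelled as plain strings; the sketch only shows the above/below division and the at-most-one-split check.

```python
SPLIT_MARK = object()  # hypothetical sentinel standing in for the split marker


def partition_body(body):
    """Return (above, below) around the single split marker, if any."""
    positions = [i for i, clause in enumerate(body) if clause is SPLIT_MARK]
    if len(positions) > 1:
        raise ValueError("at most one split marker per rule body")
    if not positions:
        return tuple(body), ()  # no split: everything is above
    cut = positions[0]
    return tuple(body[:cut]), tuple(body[cut + 1:])


above, below = partition_body(["A(x, y)", "B(y, z)", SPLIT_MARK, "~C(z)"])
# above -> ("A(x, y)", "B(y, z)")   feeds the temp relation
# below -> ("~C(z)",)               joined against the temp relation
```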
- class srdatalog.dsl.Var(name: str)[source]¶
A logic variable. Distinct from Python values; used by operator overloads to build AST.
Initialization
- __slots__¶
(‘name’,)
- srdatalog.dsl.agg(result_var, func: str, rel_atom: srdatalog.dsl.Atom, cpp_type: str = '') srdatalog.dsl.Agg[source]¶
Build an aggregation body clause.
result_var may be a Var instance or a bare string var name. rel_atom is the output of Relation(...)(...); its rel + args become the aggregation's relation reference.
- srdatalog.dsl.count(result_var, rel_atom: srdatalog.dsl.Atom) srdatalog.dsl.Agg[source]¶
Convenience: count(v, R(x, y)) → (v = count(R(x, y))).
- srdatalog.dsl.cpp(code: str) srdatalog.dsl.ClauseArg[source]¶
Raw C++ code as a clause argument (rare; mirrors the $"..." Nim syntax).
- srdatalog.dsl.sum(result_var, rel_atom: srdatalog.dsl.Atom) srdatalog.dsl.Agg[source]¶
The module-level example continues, with X, Y, Z, edge and path as defined at the top of the page:
# Path(X, Y) :- Edge(X, Y)
r1 = Rule(heads=(path(X, Y),), body=[edge(X, Y)], name="TCBase")
# Path(X, Z) :- Path(X, Y), Edge(Y, Z)
r2 = (path(X, Z) <= path(X, Y) & edge(Y, Z)).named("TCRec")
This module defines only the DSL surface; lowering to HIR is in hir_passes.py (TBD).