srdatalog.ir.codegen.cuda.envelope

target.cuda — full-file envelope emission.

The dialect-emitted kernel body sits inside a fixed-shape envelope:

  • JIT_FILE_PRELUDE (constant header)

  • banner(rule_name, num_handles)

  • functor_start(rule_name, …)

  • [optional DedupTable struct]

  • view_declarations + (operator() body)

  • functor_end()

  • "\n"

  • JIT_FILE_FOOTER (constant footer)

These are pure string emitters: they do no algorithm dispatch and no feature-flag-driven branching beyond the dedup-hash struct injection. The module lives in the dialect because the envelope shape is target-specific (CUDA cooperative-groups signature, __device__ qualifier, etc.); a target.cpp_tbb envelope would emit a different shape from the same MIR pipeline.
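The envelope order listed above can be sketched as a plain string concatenation. This is a minimal illustration, not the module's code: PRELUDE, FOOTER, and assemble_envelope are hypothetical stand-ins, and the real JIT_FILE_PRELUDE and JIT_FILE_FOOTER contents are not reproduced here.

```python
# Hypothetical stand-ins for JIT_FILE_PRELUDE / JIT_FILE_FOOTER.
PRELUDE = "// prelude\n"
FOOTER = "// footer\n"


def assemble_envelope(banner, functor_start, dedup_struct, body, functor_end):
    """Concatenate the envelope pieces in the documented order.

    `body` is assumed to already hold the view declarations followed by
    the operator() logic. `dedup_struct` is None when no DedupTable
    struct is injected.
    """
    parts = [PRELUDE, banner, functor_start]
    if dedup_struct is not None:  # the only optional piece of the envelope
        parts.append(dedup_struct)
    parts += [body, functor_end, "\n", FOOTER]
    return "".join(parts)
```

Because every piece is a string and the order is fixed, the only decision the assembler makes is whether the optional struct is present.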

The legacy ir/dialects/target/cuda/ modules still own their own copies of these helpers during the Stage 2 transition (they are imported by the byte-equivalence harness which compares the legacy emitter against the dialect path). Once the legacy inner-body emitters are deleted, the legacy copies go with them.

Module Contents

Classes

ViewSpec

(rel_name, index, version, handle_idx) — handle_idx is the handle position of the FIRST op that referenced this view spec.

Functions

assign_handle_positions

Assign handle_start to every source-bearing node in pipeline order starting from 0. Mutates ops in place.

collect_unique_view_specs

Walk the pipeline and collect a deduplicated list of ViewSpecs in first-occurrence order. Covers every op that references a view (ColumnJoin, CartesianJoin, Scan, Negation, Aggregate, BalancedScan, PositionedExtract).

count_handles

Number of views[] slots needed by the kernel — max(handle_start) + 1.

emit_dedup_table_struct

Emit the DedupTable struct nested inside a rule’s kernel scope.

emit_full_file

Wrap a dialect-emitted operator() body in the standard file envelope.

emit_functor_end

emit_functor_start

Open struct Kernel_<rule> { ... operator()(...) const {.

emit_view_declarations

Emit the top-of-kernel auto view_X = views[i]; block.

first_dest_arity

Arity of the first InsertInto’s column set. Sizes the DedupTable’s hash function (one v0..vN-1 column per parameter).

Data

API

srdatalog.ir.codegen.cuda.envelope.JIT_FILE_PRELUDE = <Multiline-String>
class srdatalog.ir.codegen.cuda.envelope.ViewSpec

(rel_name, index, version, handle_idx) — handle_idx is the handle position of the FIRST op that referenced this view spec.

handle_idx: int

index: list[int]

rel_name: str

version: str

srdatalog.ir.codegen.cuda.envelope.__all__

['JIT_FILE_FOOTER', 'JIT_FILE_PRELUDE', 'ViewSpec', 'assign_handle_positions', 'collect_unique_view_…

srdatalog.ir.codegen.cuda.envelope.assign_handle_positions(ops: list[srdatalog.ir.mir.types.MirNode]) → None

Assign handle_start to every source-bearing node in pipeline order starting from 0. Mutates ops in place.

srdatalog.ir.codegen.cuda.envelope.collect_unique_view_specs(ops: list[srdatalog.ir.mir.types.MirNode]) → list[srdatalog.ir.codegen.cuda.envelope.ViewSpec]

Walk the pipeline and collect a deduplicated list of ViewSpecs in first-occurrence order. Covers every op that references a view (ColumnJoin, CartesianJoin, Scan, Negation, Aggregate, BalancedScan, PositionedExtract).
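A minimal sketch of first-occurrence deduplication follows. It assumes the pipeline walk has already been flattened into (rel_name, index, version, handle) tuples; the real function walks MIR nodes, which are not modeled here. The ViewSpec dataclass mirrors the documented fields, while dedup_first_occurrence and the DELTA_VER version name used in the test data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ViewSpec:
    """Local mirror of the documented fields, for illustration only."""
    rel_name: str
    index: list
    version: str
    handle_idx: int


def dedup_first_occurrence(refs):
    """Collect one ViewSpec per distinct (rel_name, index, version).

    `refs` is an iterable of (rel_name, index, version, handle) tuples in
    pipeline order. The handle of the first reference wins.
    """
    seen = {}  # dicts preserve insertion order, giving first-occurrence order
    for rel_name, index, version, handle in refs:
        key = (rel_name, tuple(index), version)
        if key not in seen:
            seen[key] = ViewSpec(rel_name, list(index), version, handle)
    return list(seen.values())
```

A later reference to an already-seen view leaves the stored handle_idx untouched, which matches the "FIRST op" wording in the ViewSpec description.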

srdatalog.ir.codegen.cuda.envelope.count_handles(ops: list[srdatalog.ir.mir.types.MirNode]) → int

Number of views[] slots needed by the kernel — max(handle_start) + 1.
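The max(handle_start) + 1 rule can be illustrated on toy nodes. ToyNode and count_handles_sketch are hypothetical; real MIR nodes carry more state, and the behavior for a pipeline with no source-bearing nodes is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyNode:
    """Hypothetical stand-in for a MIR node; only handle_start matters."""
    name: str
    handle_start: Optional[int] = None  # None for nodes that bear no source


def count_handles_sketch(ops):
    """Return max(handle_start) + 1 over the nodes that have a handle."""
    starts = [op.handle_start for op in ops if op.handle_start is not None]
    # Assumption: with no source-bearing nodes, zero slots are needed.
    return max(starts) + 1 if starts else 0
```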

srdatalog.ir.codegen.cuda.envelope.emit_dedup_table_struct(arity: int) → str

Emit the DedupTable struct nested inside a rule’s kernel scope.

The struct is a GPU hash table keyed on arity 32-bit columns. It exposes try_insert (uses atomicCAS during the count phase) and check_winner (read-only, used during the materialize phase). Both use linear probing over up to 128 slots, starting from an FNV-1a hash. The capacity is a power of 2, and the host-side runner zero-initializes hash_slots between phases.
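The probing scheme can be modeled sequentially in Python. This is only a sketch under stated assumptions: the emitted struct is CUDA and uses atomicCAS, whereas this model is single-threaded; how the columns are fed into FNV-1a (here, four little-endian bytes per column), what each slot stores, and what happens when all 128 probes fail are not specified above and are choices made for illustration. DedupTableModel and fnv1a_columns are hypothetical names.

```python
FNV_OFFSET = 0x811C9DC5  # standard 32-bit FNV-1a offset basis
FNV_PRIME = 0x01000193   # standard 32-bit FNV-1a prime
MAX_PROBES = 128         # probe limit from the description


def fnv1a_columns(cols):
    """32-bit FNV-1a over the little-endian bytes of each 32-bit column."""
    h = FNV_OFFSET
    for c in cols:
        for b in (c & 0xFFFFFFFF).to_bytes(4, "little"):
            h = ((h ^ b) * FNV_PRIME) & 0xFFFFFFFF
    return h


class DedupTableModel:
    """Sequential model of a linear-probing table with power-of-2 capacity."""

    def __init__(self, capacity):
        if capacity <= 0 or capacity & (capacity - 1):
            raise ValueError("capacity must be a power of 2")
        self.mask = capacity - 1  # power-of-2 capacity lets us mask, not mod
        self.slots = [None] * capacity

    def try_insert(self, cols):
        """Return True if `cols` was newly inserted, False if already present."""
        key = tuple(cols)
        start = fnv1a_columns(key)
        for i in range(MAX_PROBES):
            slot = (start + i) & self.mask
            if self.slots[slot] is None:
                self.slots[slot] = key  # the GPU version would claim this via atomicCAS
                return True
            if self.slots[slot] == key:
                return False
        # Assumption: the kernel's behavior on probe exhaustion is not documented here.
        raise RuntimeError("probe limit reached")
```

The power-of-2 capacity is what makes the `& self.mask` wrap-around valid; with any other capacity the mask would skip slots.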

srdatalog.ir.codegen.cuda.envelope.emit_full_file(ep: srdatalog.ir.mir.types.ExecutePipeline, body: str, *, scalar_mode: bool = False) → str

Wrap a dialect-emitted operator() body in the standard file envelope.

body must be everything between operator() { and the closing }, i.e. the view declarations followed by the dialect-emitted kernel logic. The caller composes it as view_decls + emit(iir, emit_ctx).

srdatalog.ir.codegen.cuda.envelope.emit_functor_end() → str

srdatalog.ir.codegen.cuda.envelope.emit_functor_start(rule_name: str, *, scalar_mode: bool = False, dedup_hash: bool = False) → str

Open struct Kernel_<rule> { ... operator()(...) const {.

When dedup_hash=True, operator() takes an additional DedupTable dedup_table parameter.

srdatalog.ir.codegen.cuda.envelope.emit_view_declarations(specs: list[srdatalog.ir.codegen.cuda.envelope.ViewSpec], pipeline: list[srdatalog.ir.mir.types.MirNode], *, indent_level: int = 4, debug: bool = True, slot_mode: str = 'handle_idx', view_counts: list[int] | None = None) → tuple[str, dict[str, str]]

Emit the top-of-kernel auto view_X = views[i]; block.

Returns (decls_string, view_vars). view_vars maps both:

  • spec key (<rel>_<cols>_<VER>) → view variable name

  • str(handle_idx) → view variable name (so handle-bearing ops can look up “which view does this handle name reference?”)
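The double-keyed view_vars mapping can be sketched as follows. The exact variable naming and the way columns are joined inside the spec key are assumptions (here, view_ plus an underscore-joined key); build_view_vars is a hypothetical helper and specs are passed as plain tuples rather than ViewSpec objects.

```python
def build_view_vars(specs, indent_level=4):
    """Return (decls_string, view_vars) for (rel, index, version, handle) tuples.

    Each view variable is registered under two keys: the spec key and the
    stringified handle index.
    """
    pad = " " * indent_level
    decls, view_vars = [], {}
    for rel_name, index, version, handle_idx in specs:
        # Assumed key layout: <rel>_<cols>_<VER>, columns joined by "_".
        key = f"{rel_name}_{'_'.join(map(str, index))}_{version}"
        var = f"view_{key}"  # assumed naming scheme
        decls.append(f"{pad}auto {var} = views[{handle_idx}];")
        view_vars[key] = var
        view_vars[str(handle_idx)] = var
    return "\n".join(decls) + "\n", view_vars
```

Registering the handle index as a second key lets a handle-bearing op resolve its view variable without rebuilding the spec key.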

slot_mode controls how the index into views[] is chosen:

  • 'handle_idx': use sp.handle_idx directly (matches the jit_batch.<rule>.cpp standalone-kernel goldens, which don’t apply slot-offset compaction).

  • 'positional': use cumulative-sum-of-view_counts slot per spec (matches the jit_runner.<rule>.cpp production goldens via compute_view_slot_offsets).

view_counts (per-spec, parallel to specs) is the number of physical view slots each spec consumes. The default is all 1s (DSAI). D2L FULL_VER specs consume 2 slots (HEAD + FULL); the dialect’s view declaration emits the BASE view at the first slot only, and the second slot is referenced by the BG histogram via views[base+seg].
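The difference between the two slot modes can be shown with a small calculation. This sketch reads "cumulative sum" as an exclusive prefix sum (the first spec lands at slot 0), which is an interpretation rather than something stated above; view_slots is a hypothetical helper and does not call compute_view_slot_offsets.

```python
from itertools import accumulate


def view_slots(handle_idxs, slot_mode="handle_idx", view_counts=None):
    """Return the views[] index chosen for each spec.

    `handle_idxs` holds each spec's handle_idx; `view_counts` is parallel
    to it and defaults to one slot per spec.
    """
    if slot_mode == "handle_idx":
        return list(handle_idxs)
    if slot_mode == "positional":
        counts = view_counts if view_counts is not None else [1] * len(handle_idxs)
        # Exclusive prefix sum: each spec starts where the previous ones end.
        return list(accumulate(counts, initial=0))[:-1]
    raise ValueError(f"unknown slot_mode: {slot_mode!r}")
```

With handle indices [0, 2, 5] and view_counts [1, 2, 1], 'handle_idx' mode keeps [0, 2, 5] while 'positional' mode yields [0, 1, 3]: the second spec occupies slots 1 and 2, so the third starts at 3.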

srdatalog.ir.codegen.cuda.envelope.first_dest_arity(ops: list[srdatalog.ir.mir.types.MirNode]) → int

Arity of the first InsertInto’s column set. Sizes the DedupTable’s hash function (one v0..vN-1 column per parameter).