Developer Guide

Setting up Rust

The easiest way to set up the Rust toolchain is with https://rustup.rs/. By default, only the stable toolchain is installed. Active development also requires the nightly toolchain:

rustup install nightly

Building and testing

Builds are done with the cargo tool, just like in any other Rust project.

cargo build
# or
cargo build --release

Tests can be run with the cargo test command, but a much better way is to use the nextest tool. To install it, run cargo install cargo-nextest. Then run the tests with the following command:

cargo nextest r

nextest is much faster than the default test runner.

Running check tests

There are also check tests, which are very similar to LLVM's lit-based integration tests. An easy and quick way to run them is to invoke cargo run --bin check-runner. However, a more convenient way for day-to-day use is cargo-make:

cargo install cargo-make
cargo make check # run build and check tests
cargo make check-only # only run checks without re-building TIR
cargo make test # run build, cargo tests and check

Running fuzz tests

We also have fuzzing set up for each user-input parser, such as the disassemblers and the IR parser. These tests require an external tool, which can be installed with cargo install cargo-fuzz. The usage is simple:

# List tests
cargo fuzz list
# Run specific test
cargo +nightly fuzz run fuzz_riscv_disassembler -- -max_total_time=60 -max_len=16384

Collecting coverage info

Warning: the coverage tooling creates a lot of temporary files in your working directory. Commit all your changes first so that you can use git to clean up afterwards.

Install dependencies:

rustup component add llvm-tools-preview
cargo install grcov

Run tests with special flags:

CARGO_INCREMENTAL=0 RUSTFLAGS='-Cinstrument-coverage' LLVM_PROFILE_FILE='cargo-test-%p-%m.profraw' cargo test
grcov . --binary-path target/debug/ -s . -t html --branch --llvm \
    --ignore '../*' --ignore "/*" --ignore 'macros/*' --ignore 'fuzz/*' \
    --ignore '**/tests/**' -o target/coverage/html

Open target/coverage/html/index.html to see the report.

Reports for the main branch are also available at https://coveralls.io/github/perf-toolbox/tir.

Defining Dialects

Intro

Dialects are semantically complete, purpose-focused pieces of intermediate representation. An example of a TIR dialect is the riscv dialect, which contains the operations, types, and algorithms to accurately represent the RISC-V ISA at both the assembly and binary levels. Dialects can co-exist with each other and progressively lower from one to another.

Defining Operations

TIR provides a crate to help developers easily define new operations.

Below is an example of a simple operation:

use crate::builtin::DIALECT_NAME;
use tir_core::{Op, OpImpl, Type};
use tir_macros::{Assembly, Op};

#[derive(Op, Assembly)]
#[operation(name = "super", known_attrs(value: IntegerAttr))]
pub struct SuperOp {
    #[operand]
    operand1: Register,
    #[ret_type]
    return_type: Type,
    r#impl: OpImpl,
}

The helper macros implement the following for you:

  • Getters and setters for operands (i.e., get_operand1, set_operand1)
  • Getters for regions
  • A default assembly parser and printer

Additional methods can be defined manually in an impl SuperOp { ... } block.
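Roughly, the generated accessors for an operand field behave like the hand-written sketch below. Note that Register here is a simplified placeholder and the struct layout is an assumption for illustration only; the real generated code works with tir_core's operand machinery.

```rust
// Hypothetical, simplified sketch of what `#[derive(Op)]` generates for an
// `#[operand]` field. `Register` is a placeholder, not the real tir type.
#[derive(Clone, Debug, PartialEq)]
struct Register(String);

struct SuperOp {
    operand1: Register,
}

impl SuperOp {
    // Getter generated for the `operand1` field.
    fn get_operand1(&self) -> &Register {
        &self.operand1
    }

    // Setter generated for the `operand1` field.
    fn set_operand1(&mut self, value: Register) {
        self.operand1 = value;
    }
}

fn main() {
    let mut op = SuperOp { operand1: Register("x0".into()) };
    op.set_operand1(Register("a0".into()));
    assert_eq!(op.get_operand1(), &Register("a0".into()));
}
```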

Field Configurations

#[operation(..., known_attrs(attr1: AttrType))]

Defines an attribute of a specific type. Any type used in attributes must be convertible to the tir_core::Attribute enum. Also implements basic attribute getters and setters.

#[operand]

Defines an operand of a specific type. Also implements basic setters and getters for the operand.

#[region(single_block, no_args)]

Defines a region. Also defines a basic getter get_<field_name>_region. If the single_block argument is passed, a single-block getter get_<field_name> is also defined. If both single_block and no_args are passed, a default region is created when the operation is built.
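As a rough illustration of the region accessors described above, here is a hand-written sketch. The Region and Block types below are simplified placeholders and the method bodies are assumptions; the real getters return tir_core's region and block handles.

```rust
// Hypothetical sketch of the accessors generated for a field marked
// `#[region(single_block, no_args)]`. Placeholder types, not tir_core's.
struct Block {
    ops: Vec<String>,
}

struct Region {
    blocks: Vec<Block>,
}

struct LoopOp {
    body: Region,
}

impl LoopOp {
    // `get_<field_name>_region` is always generated.
    fn get_body_region(&self) -> &Region {
        &self.body
    }

    // With `single_block`, a convenience single-block getter is added too.
    fn get_body(&self) -> &Block {
        &self.body.blocks[0]
    }
}

fn main() {
    let op = LoopOp {
        body: Region { blocks: vec![Block { ops: vec!["ret".into()] }] },
    };
    assert_eq!(op.get_body_region().blocks.len(), 1);
    assert_eq!(op.get_body().ops[0], "ret");
}
```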


The Language Parsing Library

The Language Parsing Library is TIR's built-in parser combinator library, similar to nom, winnow, or chumsky. It aims to provide flexible yet performant interfaces for any built-in DSL within the TIR domain.
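To give a flavor of the parser-combinator style, here is a self-contained sketch of the general technique. The tag and many1 names are illustrative stand-ins, not LPL's actual API.

```rust
// Minimal parser-combinator sketch in the style of nom/winnow.
// A parser takes input and, on success, returns the remaining input
// plus the parsed value.
type PResult<'a, T> = Option<(&'a str, T)>;

// Match a literal prefix, returning the rest of the input on success.
fn tag<'a>(t: &'a str) -> impl Fn(&'a str) -> PResult<'a, &'a str> {
    move |input| input.strip_prefix(t).map(|rest| (rest, t))
}

// Apply a parser one or more times, collecting the results.
fn many1<'a, T>(
    p: impl Fn(&'a str) -> PResult<'a, T>,
) -> impl Fn(&'a str) -> PResult<'a, Vec<T>> {
    move |mut input: &'a str| {
        let mut out = Vec::new();
        while let Some((rest, v)) = p(input) {
            out.push(v);
            input = rest;
        }
        if out.is_empty() { None } else { Some((input, out)) }
    }
}

fn main() {
    let parser = many1(tag("ab"));
    assert_eq!(parser("ababc"), Some(("c", vec!["ab", "ab"])));
    assert!(parser("xyz").is_none());
}
```

Combinators like these compose small parsers into larger ones, which is what makes the approach attractive for the several small DSLs TIR hosts.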

Rationale

One may wonder why we don't just use one of the existing libraries. There are a few reasons to take this road:

  1. Existing libraries are often too abstract and too hard to use. With too many customization points, they often become a burden when supporting multiple built-in DSLs. This includes error handling, which often lacks features required for mature language tooling. Rather than being a one-stop shop for every parsing problem, we focus on the things that matter for us: a unified error handling and diagnostics engine, source tracking on by default, and common sub-parsers.
  2. Lack of features. Surprisingly, generic parser combinators usually lack things that are actually needed for a real-world compiler or DSL, like comment parsing or usable support for separate lexing and parsing stages.
  3. Performance. Production compilers deal with huge amounts of data, and the hotspots can be in the most unexpected places. To have better control over our compile times, we need to make sure every part of the pipeline can be easily changed.