8bdd748 Fix use of uninitialized values during scheduling
~jprotopopov pushed to ~jprotopopov/kefir git
Kefir is an independent compiler for C programming language (C17 standard).
Kefir does not rely on any existing frameworks, and provides a complete compiler
implementation written in C. The main priority is self-sufficiency of the
project, compatibility with platform ABI and compliance with C17 language
standard. Any omissions or incompatibilities between the language standard and
Kefir behavior which are not explicitly documented (see Implementation & Usage quirks
section below) shall be considered bugs.
Kefir supports modern x86-64 Linux, FreeBSD, OpenBSD and NetBSD environments
(see Supported environments
section below). Compiler is also able to produce
JSON streams containing program representation at various stages of compilation
(tokens, AST, IR), as well as printing source code in preprocessed form. The
compiler targets GNU As (Intel syntax with/without prefixes and ATT syntax are
supported) and also has limited support for Yasm assembler. Kefir is able to
produce debug information in DWARF5 format for GNU As. Position-independent code
generation is supported. Kefir features cc
-compatible command line interface.
Kefir website also provides some additional information.
Note to the users of Kefir: if you encounter any behavior that does not comply with C17 language standard or significantly diverges from other compilers, please do no hestitate to reach me out via email directly or through the mailing list. Code snippets for easier reproduction are especially welcome.
Kefir compiler is named after fermented milk drink, no other connotations are meant or intended.
Kefir targets x86-64 ISA and System-V ABI. Supported systems include modern
Linux, FreeBSD, OpenBSD and NetBSD operating systems (full test suite is
executed regularly in these environments). A platform is considered supported if
full automated test suite (see Test suite
section below) sucessfully executes
there -- no other guarantees and claims are made. On Linux, glibc
and musl
standard libraries are supported; musl
might be preferable because its header
files use significantly less non-standard extensions, however as of now Kefir
supports enough GCC extensions to reasonably use include files from glibc
; on
BSDs system libc
can be used (additional macro definitions, such as
__GNUC__
, __GNUC_MINOR__
, could be necessary depending on used system libc
features). Kefir supports selection of target platform via --target
command
line option.
For each respective target, compiler expects a set of environment variables
(e.g. KEFIR_GNU_INCLUDE
, KEFIR_GNU_LIB
, KEFIR_GNU_DYNAMIC_LINKER
) to be
present in order to correctly configure system include and library paths -- the
default values for these variables are detected upon Kefir build, however in the
event of any changes to system toolchain (e.g. after upgrades) configuration
needs to be re-generated.
In addition, see Implementation & Usage quirks
section below for some other
specifics of Kefir.
The main motivation of the project is deeper understanding of C programming language, as well as practical experience in the broader scope of compiler implementation aspects. Based on this, following goals were set for the project:
libc
implementations.Following things are NON-goals:
Note on the language standard support: initially the compiler development was focused on C11 language standard support. The migration to C17 happened when the original plan was mostly finished. During the migration applicable DRs from those included in C17 were inspected and code was updated accordingly. The compiler itself is still written in compliance with C11 language standard.
See CHANGELOG
.
The initial implementation has been finished: at the momement the compiler features all necessary components, and with some known minor idiosyncrasies supports C17 language standard. Kefir is able to reasonably re-use standard library provided by target systems. Significant effort has been spent to compose a test suite of third-party open source software to validate correctness of Kefir. Further effort is concentrated on improving and extending the compiler, including:
Some implementation details that user needs to take into account:
libatomic
for
GCC, libcompiler_rt
for Clang), thus any program that employs atomic
operations need to link a libatomic
-compatible library. It happens by
default for Glibc and *BSD targets. Furthermore, if <stdatomic.h>
header
from Clang includes is used (the default on FreeBSD and OpenBSD),
-D__GNUC__=4 -D__GNUC_MINOR__=20
command line arguments shall be added to
Kefir invocation.STDC
pragmas are implemented in preprocessor. Kefir does not perform
respective optimizations and implements conservative behavior, thus these
pragmas would be no-op.glibc
might exhibit problems with non-standard features
(e.g. attributes). For instance, glibc
might suppress __attribute__
by a
macro or omit packed
attributes, thus breaking the ABI, so care needs to be
taken.Kefir can be used along with musl libc standard
library. Kefir also supports glibc
, as well as libc
implementations provided
with FreeBSD, OpenBSD and NetBSD, however header files from these libraries tend
to include non-standard compiler features, and thus support might vary for
different library versions. In practice, Kefir implements enough compiler
extensions to make use of all target system standard libraries. However,
additional macro definitions (such as __GNUC__
and __GNUC_MINOR__
on BSD
systems) might be needed for successful compilation.
Several C language extensions are implemented for better compatibility with GCC. All of them are enabled by default in the driver, but disabled if the compiler is invoked directly (consult the manual for details). No specific compability guarantees are provided. Among the implemented extensions (non-exaustive list):
int function_name()
is automatically defined. The feature was
part of previous C standards, however it's absent from C11 onwards.int
as implicit function return type -- function definition may omit return
type, int
will be used instead.fieldname:
form -- old, deprecated form which is
still supported by GCC.&&
operator, gotos support
arbitratry addresses in goto *
form.__typeof__
, __typeof_unqual__
, __auto_type
type specifiers.__atomic*
and __sync*
builtins.echo '__KEFIRCC_SUPPORTED_BUILTINS__' | kefir -E -
Kefir also defines a few non-standard macros by default, such as macros
indicating data model (__LP64__
), endianess (__BYTE_ORDER__
and
__ORDER_LITTLE_ENDIAN__
), as well as __KEFIRCC__
which can be used to
identify the compiler.
Kefir has support of asm
directive, both in file and function scope.
Implementation supports output and input parameters, parameter constraints
(immediate, register, memory), clobbers and jump labels, however there is no
compatibility with respective GCC/Clang functionality (implemented bits behave
similarly, though, thus basic use cases shall be compatible). Additionally,
asm
-labels are supported for non-static non-local variables.
Kefir supports __attribute__(...)
syntax, however only a few attributes are
actually implemented. Presence of attribute in source code can be turned into a syntax error
by CLI option. For the list of supported attributes see:
echo '__KEFIRCC_SUPPORTED_ATTRIBUTES__' | kefir -E -
Disclaimer: Use at your own risk. This is experimental project which is not meant for production purposes. No guarantees are being made for correctness, completeness, stability and fitness for any particular purpose.
Kefir depends on a C11 compiler (tested with gcc
and clang
), bash, GNU Make
as well as basic UNIX utilities for build. Runtime dependencies include libc
,
an assembler (GNU As or Yasm) and a linker (GNU ld). Test dependencies
optionally include valgrind
, python
and csmith
as well. After installing
all dependencies, kefir can be built with a single command: make -j$(nproc)
.
By default, kefir builds a shared library and links executables to it. Static
linkage can be enforced by specifying USE_SHARED=no
in make command line
arguments. Sample PKGBUILD
is provided in dist/kefir
directory.
It is also advised to run basic test suite:
make test all # Linux glibc
make test all USE_SHARED=no CC=musl-gcc KEFIR_TEST_USE_MUSL=yes # Linux musl
gmake test all CC=clang # FreeBSD
gmake test all CC=clang AS=gas # OpenBSD
gmake test all CC=gcc AS=gas # NetBSD
Optionally, Kefir can be installed via: make install DESTDIR=... prefix=...
.
Short reference on compiler options can be obtained by running kefir --help
,
as well as in the manual which is supplied in the compiler distribution.
At the moment, Kefir is automatically tested in Ubuntu 24.04, FreeBSD 14.x and
OpenBSD 7.5 and NetBSD 10.x environments. Arch Linux is used as a primary
development environment. Kefir also provides a Dockerfile (see dist
) which
defines necessary dependencies for running the whole test suite:
podman build -t kefir-dev-env -f dist/Dockerfile .
Kefir provides scripts to build portable, standalone Kefir distribution package that incorporates statically-linked Kefir C compiler, musl libc, and tools from GNU Binutils. The package targets modern x86_64-based Linux systems and provides a minimalistic C17 development toolchain independent of host system tooling.
Portable package can be obtained via:
make portable
# Build artifact is located in bin/portable/kefir-portable-*.tar.gz
In addition, portable package can be fully bootstraped in 3-stage process:
make portable_bootstrap
Kefir supports compilation with Emscripten into a
WebAssembly library, which can be invoked from client-side JavaScript in
Web-applications. Kefir functionality in that mode in limited due to absence of
normal POSIX environment, linker and assembler utilities: only text output
(assembly code, preprocessed code, tokens, ASTs, IR) can be produced from a
single input file. Furthermore, all include files need to be explicitly supplied
from JavaScript side in order to be available during compilation. Note that this
does not imply support for WebAssembly as a compilation target: it only serves
as a host environment. To build kefir.js
and kefir.wasm
in bin/web
directory, use:
make web -j$(nproc) # Requires Emscripten installed
A simple playground Web application is also available. It bundles Kefir web build with Musl include files and provides a simple Godbolt-like interface. Build as follows:
make webapp -j$(nproc)
The Web application is static and requires no server-side logic. An example of simple server command-line:
python -m http.server 8000 -d bin/webapp
A hosted version of the Web application is available at Kefir playground (please note that the Web page uses JavaScript and WebAssembly).
Kefir is capable of bootstraping itself (that is, compiling it's own source code). This can be verified as follows:
make bootstrap_test BOOTSTRAP_EXTRA_CFLAGS="-O1 -g" -j$(nproc)
For practical purposes, kefir
can be bootstrapped using the normal build
process (note that depending on the target platform, additional defines might be
necessary. See source/tests/bootstrap/bootstrap.mk
):
make all CC=$PATH_TO_KEFIR -j$(nproc)
Kefir relies on following tests, most of which are executed as part of CI:
*.c
files which are
compiled either using system compiler or kefir depending on file
extension. Everything is then linked together and executed. The test suite
is executed on Linux with gcc and clang compilers, on FreeBSD with clang and
on OpenBSD with clang. In Linux and FreeBSD environments valgrind
is used
to control test suite correctness at runtime.compile
& execute
parts of GCC torture test suite are
executed with kefir compiler, with some permissive options enabled. At the
moment, out of 3445 tests, 477 fail and 29 are skipped due to being irrelevant
(e.g. SIMD or profiling test cases; there is no exhaustive skip list yet). All
failures happen on compilation stage, no abortions occur at runtime. The work
with torture test suite will be continued in order to reduce the number of
failures. The torture tests are included into CI pipeline with some basic test
result checks.Own test suite is deterministic (that is, tests do not fail spuriously), however there might arise problems when executed in unusual environments (e.g. with non-Unicode locale). For instance, some tests contain unicode characters and require the environment to have appropriate locale set. Also, issues with local standard library version might cause test failures.
Furthermore, Kefir also provides an external test suite
comprised of open source software that is known to work with Kefir. This suite
are not included in the CI, however, it is regularly executed manually. Kefir is
used to build software from the external test suite, and where possible, run
(most of) tests of the respective software projects. Software projects
constituting the external test suite are compiled with little or no changes.
Only acceptable changes are short patches that do not alter the project logic
(e.g. patching #ifdef
s, replacing a non-standard keyword or altering the build
system is considered acceptable). Note that running the external test suite
might require additional project-specific dependencies (see dist/Dockerfile
).
Currently, the external test suite includes: GNU Binutils (except gprofng), GNU Coreutils, GCC 4.0.4 bootstrap, programming language interpreters (Lua, Perl, Python, Tcl, Guile, duktape, GNU Awk, pForth), text editors (Vim, nano), database engines (SQLite3, Postrges), assemblers (yasm, nasm), shells (Bash, oksh), compression libraries (zlib, zstd), cryptography (OpenSSL, OpenSSH), networking tools (nginx, curl, wget), build systems (GNU Make, muon), git (git itself and cgit), other small language compilers (tcc, slimcc, cproc), other libraries (libCello, libuv, libpng), as well as some lesser-known projects. The external test suite is used to verify Kefir compatbility with real world software. Source code for third-party projects has been archived and available at Kefir website.
Furthermore, at development stage,
CSmith is used for randomized,
differential testing against GCC. Problematic csmith seeds are added to the test
suite (executed optionally by make test
if CSMITH
environment variable is
defined; requires Python 3.12).
At the moment, extension of the test suite is a major goal. It helps significantly in eliminating bugs, bringing kefir closer to C language standard support and improving compiler UX in general.
In order to simplify translation and facilitate portability, intermediate representation (IR) layer was introduced. It defines architecture-agnostic 64-bit stack machine bytecode, providing generic calling convention and abstracting out the type layout information. Compiler is structured into separate modules with respect to IR: code generation, AST analysis and translation. The IR code is then converted into optimizer SSA-like representation. IR layer provides several interfaces for AST analyzer to retrieve necessary target type layout information (for instance, for constant expression analysis). AST analysis and translation are separate stages to improve code structure and reusability. Parser uses recursive descent approach with unlitmited back-tracking. Lexer was implemented before preprocessor and can be used independently of it (preprocessing stage can be completely omitted), thus both lexer and preprocessor modules share the same lexing facilities. Driver links kefir as a library and forks in order to isolate each file processing.
The primary code repository is hosted at Sourcehut, with secondary mirrors at Codeberg and author's personal website.
Author: Jevgenijs Protopopovs
The code base also includes patches from:
License: