C lexer/parser/checker/compiler for Hare
-E
: parse + pre-process the input file and unparse to stdout (useful for developing the parser)
-dump
: check the input file and dump all top-level declarations to stdout (useful for developing the checker)
-emit-qbe
: emit the qbe .ssa file (not implemented yet)
Use -h for other supported flags.
Send patches to ~sebsite/hare-c@lists.sr.ht (archive)
If you need help or have questions ping me on fedi or IRC or wherever :)
A checkmark means the thing is implemented; a T means the thing is properly tested. This checklist isn't yet exhaustive; I'm updating it as I go.
basics
T🗸 comments
T🗸 block
T🗸 line
T🗸 backslashes at end of line
declarations
🗸 must declare at least one identifier
specifiers
type specifiers
T🗸 signed, unsigned
T🗸 parse
🗸 void
T🗸 parse
🗸 incomplete type; cannot be completed
T🗸 char
T🗸 parse
T🗸 short
T🗸 parse
T🗸 int
T🗸 parse
T🗸 long
T🗸 parse
T🗸 float
T🗸 parse
T🗸 double
T🗸 parse
struct/union
struct
bit-fields
named
T🗸 parse
check
anonymous
T🗸 parse
check
T🗸 declarations
T🗸 named
T🗸 parse
T🗸 check
T🗸 anonymous
T🗸 parse
T🗸 check
T🗸 definitions
T🗸 parse
T🗸 check
T🗸 forward
T🗸 parse
T🗸 check
🗸 incomplete types
T🗸 final member may be incomplete array if other members are present
🗸 everything else disallowed
T🗸 struct type is still complete; size is as though incomplete array weren't present
🗸 but struct type can't be used in other structs or in arrays
🗸 including if it's (recursively) a member of a union
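For readers unfamiliar with the incomplete-array rules above, here is a minimal sketch in plain standard C. It is not taken from this project's test suite, and `struct packet`, `packet_header_size`, and `packet_new` are hypothetical names chosen for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical example type: the incomplete array must be the final member,
 * and at least one other named member must be present. */
struct packet {
	size_t len;
	unsigned char data[];
};

/* sizeof ignores the flexible array member (apart from possible padding). */
size_t packet_header_size(void)
{
	return sizeof(struct packet);
}

/* Allocate the header plus room for `len` trailing bytes. */
struct packet *packet_new(size_t len)
{
	struct packet *p = malloc(sizeof *p + len);
	if (p != NULL) {
		p->len = len;
	}
	return p;
}
```

A struct like this may not itself be used as an array element or as a member of another struct, which is what the last two items above check.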
🗸 union
🗸 bit-fields prohibited
🗸 declarations
🗸 named
T🗸 parse
🗸 check
🗸 anonymous
T🗸 parse
🗸 check
🗸 definitions
T🗸 parse
🗸 check
🗸 forward
T🗸 parse
🗸 check
🗸 incomplete types disallowed
🗸 must contain at least one named member
variably modified member types disallowed
T🗸 tag inserted into scope immediately after it's declared
tag is always declared when not used as type specifier
🗸 when used as type specifier, tag is only declared if no other declaration is visible
🗸 enum
🗸 declarations
T🗸 named
T🗸 anonymous
T🗸 definitions
🗸 forward disallowed
🗸 fields inserted into scope
T🗸 parse
🗸 no linkage
🗸 tag inserted into scope immediately after definition ends
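The tag and enumerator scoping rules above can be sketched roughly as follows. This is plain standard C with hypothetical names (`struct node`, `enum level`, `list_sum`), meant only to show why the point of insertion into scope matters.

```c
#include <stddef.h>

/* The tag `node` enters scope right after it is declared, so the member
 * list can already refer to `struct node`. */
struct node {
	int value;
	struct node *next;
};

/* Each enumerator is in scope as soon as it has been defined, so later
 * enumerators may use earlier ones. */
enum level {
	LEVEL_LOW = 1,
	LEVEL_MID = LEVEL_LOW + 1,
	LEVEL_HIGH = LEVEL_MID * 2
};

/* Walk the self-referential list and add up the values. */
int list_sum(const struct node *head)
{
	int sum = 0;
	for (; head != NULL; head = head->next) {
		sum += head->value;
	}
	return sum;
}
```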
typedefs
T🗸 parse
T🗸 inserted into scope
T🗸 file scope
block scope
disallowed for variably modified types
T🗸 allowed otherwise
storage specifiers
T🗸 auto
T🗸 parse
T🗸 static
T🗸 parse
register
T🗸 parse
forbids taking address
🗸 explicitly
implicitly (e.g. array -> pointer conversion)
also applies to derivative lvalues
🗸 extern
T🗸 parse
🗸 defaults to external linkage
T🗸 uses linkage of identifier visible in scope when it's internal or external
🗸 can't have initializer
🗸 typedef
T🗸 parse
T🗸 inserts type into scope
🗸 can't have initializer
🗸 at most one storage specifier per declaration (sans thread_local)
🗸 duplicates disallowed
🗸 duplicate specifiers
🗸 disallowed for type specifiers
T🗸 allowed for other specifiers
🗸 disallow incompatible
🗸 type specifiers
🗸 other specifiers
T🗸 qualifiers
T🗸 allowed in any order
T🗸 duplicates allowed
declarators
T🗸 pointer
function
parameters
🗸 named
T🗸 anonymous
🗸 arrays converted to (qualified) pointers
🗸 still checked as arrays
🗸 functions converted to function pointers
may not shadow typedefs
🗸 may not declare more than one (non-tag) identifier
🗸 may not be initialized
T🗸 without storage specifier
T🗸 defaults to implicit auto
register storage specifier
🗸 parse
🗸 declares with register storage
🗸 ignored for non-definitions
🗸 may be different in compatible types
applies even when other declarations' parameters have different/absent storage specifiers
🗸 all other specifiers disallowed
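A small sketch of the parameter adjustments listed above (array to pointer, function to function pointer), in plain standard C. All names are hypothetical; the two pointer declarations at the end only compile because the adjusted parameter types are compatible.

```c
/* Declared with an array parameter; the type is adjusted to `int *`. */
int first(int arr[4])
{
	return arr[0];
}

/* Declared with a function parameter; adjusted to `int (*)(int)`. */
int apply(int fn(int), int x)
{
	return fn(x);
}

int twice(int x)
{
	return x * 2;
}

/* Both pointers below are compatible with the adjusted parameter types. */
int (*first_ptr)(int *) = first;
int (*apply_ptr)(int (*)(int), int) = apply;
```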
🗸 return type requirements
🗸 must return either complete type or void
🗸 may not return array
🗸 warn when function returning complete type doesn't have return statement
T🗸 except for main function in hosted environment
🗸 all other functions
🗸 storage specifier must be static or extern (explicit or otherwise)
🗸 error out when explicitly given qualifiers (via typedef)
🗸 implicit const qualifier
array
T🗸 without qualifiers
🗸 qualifiers apply to element type
🗸 don't apply to array itself before c23
...except as inner-most declarator of function parameter
qualifiers may only be in square brackets in inner-most declarator of function parameter
🗸 if present, known size must be greater than zero
🗸 element type must be
🗸 complete
🗸 object
🗸 identifier only
🗸 parenthesized
🗸 type names
🗸 with declarator
🗸 without declarator
🗸 scopes
🗸 file
🗸 parse
🗸 auto/register specifiers disallowed
🗸 default storage
🗸 extern for functions
T🗸 external linkage for objects
🗸 block
🗸 parse
🗸 default storage is auto
🗸 duplicates
🗸 file scope
T🗸 allowed for compatible declarations
T🗸 composite type
T🗸 compatible specifiers
🗸 disallowed for incompatible declarations
🗸 at most one definition allowed
🗸 block scope
🗸 disallowed within same scope
T🗸 shadowing in nested scopes
🗸 shadowing block-scoped declarations
🗸 shadowing file-scoped declarations
🗸 struct/union/enum tags must always refer to same type
🗸 unique namespaces
T🗸 identifiers
T🗸 struct/union/enum
🗸 goto labels
🗸 tentative declarations
🗸 internally-linked declarations must be complete by end of translation unit
translation unit must not be empty (static assertions are ok)
internally-linked declaration must be initialized if used
except sizeof
except alignof
all other expressions
🗸 can't declare an object with type void
expressions
literals
numbers
T🗸 pre-processing numbers
T🗸 int literals
T🗸 decimal
T🗸 hex
T🗸 octal
T🗸 suffixes
T🗸 u
T🗸 l
T🗸 parse value
T🗸 float literals
T🗸 decimal
T🗸 exponent
T🗸 f suffix
T🗸 parse value
T🗸 char literals
T🗸 plain
T🗸 L prefix
🗸 string literals
T🗸 plain
T🗸 L prefix
🗸 string concatenation
🗸 works for same-prefix strings
🗸 disallowed for strings with different prefixes
T🗸 sizeof
T🗸 parse
T🗸 array indexing
T🗸 parse
🗸 struct/union field accessing
🗸 a.b
🗸 parse
🗸 a->b
🗸 parse
T🗸 unary postfix
T🗸 parse
unary prefix
T🗸 parse
&
compile-time
T🗸 runtime
🗸 when operand is unary *, neither is evaluated
🗸 constraints still apply
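The `&*` rule above is easiest to see with a null pointer. A minimal sketch in standard C (C99 and later), with `identity` as a hypothetical name:

```c
#include <stddef.h>

/* `&*p` yields p itself; since neither operator is evaluated, this is
 * fine even when p is a null pointer. */
int *identity(int *p)
{
	return &*p;
}
```

The constraints still apply, so the operand of `*` must still have pointer type even though no dereference takes place.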
T🗸 binary
T🗸 parse
T🗸 assignment
T🗸 parse
T🗸 parenthesized
T🗸 operator precedence
casting
🗸 explicit
T🗸 parse
T🗸 extends implicit type conversion rules
T🗸 integer -> pointer
T🗸 pointer -> integer
T🗸 pointer -> pointer
T🗸 object -> object
T🗸 function -> function
🗸 everything else disallowed (except object <-> function; see extensions)
implicit conversion
T🗸 integer promotion
🗸 lvalue conversion
types
T🗸 integer -> float
T🗸 float -> float
T🗸 integer -> integer
T🗸 float -> integer
T🗸 pointer -> bool
T🗸 array -> pointer
T🗸 function -> pointer
🗸 pointer -> pointer
T🗸 pointer -> compatible pointer
T🗸 object pointer -> void pointer
T🗸 void pointer -> object pointer
🗸 can't implicitly convert to any other pointer
T🗸 qualifiers may be added
🗸 qualifiers may not be removed
T🗸 any -> void
🗸 va_list -> va_list
struct/union -> same struct/union
🗸 tagged
untagged
🗸 everything else disallowed
statements
goto
T🗸 parse
T🗸 jump to any label within function
🗸 disallow undefined labels
may not jump from outside scope of VMT to inside scope
T🗸 compound
T🗸 parse
labelled
🗸 goto label
T🗸 parse
🗸 unique per-function
case
T🗸 parse
🗸 must evaluate to integer constant
case type is converted to type of controlling expression
🗸 must only appear in switch body
T🗸 direct descendant
T🗸 within another block
🗸 disallowed everywhere else
T🗸 only visible to inner-most switch body
🗸 each case in switch body is unique
T🗸 ...but nested switch statements may contain duplicates
🗸 default
T🗸 parse
🗸 must only appear in switch body
T🗸 direct descendant
T🗸 within another block
🗸 disallowed everywhere else
T🗸 only visible to inner-most switch body
🗸 at most one per switch body
T🗸 ...but nested switch statements may contain their own
T🗸 labelled statements may themselves be labelled
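The label rules above can be exercised with something like the following sketch. It is plain standard C with a hypothetical function name, `classify`, and is only meant to show which switch each label belongs to.

```c
int classify(int outer, int inner)
{
	switch (outer) {
	case 0:
		return 0;
	case 1:
		/* A nested switch may reuse `case 0` and have its own default. */
		switch (inner) {
		case 0:
			return 10;
		default:
			return 11;
		}
	{
	case 2: /* inside another block, but still a label of the outer switch */
		return 2;
	}
	case 3: /* a labelled statement may itself be labelled */
	case 4:
		return 34;
	default:
		break;
	}
	return -1;
}
```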
T🗸 empty
T🗸 parse
🗸 if
T🗸 parse
🗸 condition must be scalar
🗸 while
T🗸 parse
🗸 condition must be scalar
🗸 do-while
T🗸 parse
🗸 condition must be scalar
for
T🗸 parse
🗸 condition must be scalar (if present)
T🗸 initializer, condition, and afterthought may all be omitted
initializer declarations may only have automatic storage duration
initializer declarations enter scope immediately after declared
switch
T🗸 parse
🗸 value must be integer
🗸 value is promoted
a case may not be in the scope of a VMT unless the entire switch statement is also in that scope
break
T🗸 parse
T🗸 allowed in loop
T🗸 allowed in switch statement
🗸 disallowed everywhere else
causes jump to outside whatever it applies to
T🗸 within check
for-loop afterthought isn't evaluated
continue
T🗸 parse
T🗸 allowed in loop
🗸 disallowed everywhere else
causes jump to end of loop it applies to
T🗸 within check
so for-loop afterthought is evaluated, then condition
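To make the difference between `break` and `continue` in a for-loop concrete, here is a rough sketch in standard C (C99 for the loop declaration). The function names are hypothetical; each one counts how many times the afterthought runs.

```c
/* `continue` jumps to the end of the loop body, so the afterthought
 * still runs on every iteration. */
int afterthoughts_with_continue(void)
{
	int after = 0;
	for (int i = 0; i < 3; i++, after++) {
		continue;
	}
	return after;
}

/* `break` jumps out of the loop entirely, so the afterthought is
 * never evaluated. */
int afterthoughts_with_break(void)
{
	int after = 0;
	for (int i = 0; i < 3; i++, after++) {
		break;
	}
	return after;
}
```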
🗸 return
T🗸 parse
🗸 type of return value must be convertible to function return type
🗸 warn when returning non-void from void function
🗸 warn when used in noreturn function
T🗸 expression statements
T🗸 parse
T🗸 implicitly convert to void
initializers
as initializer of declaration
T🗸 parse
must be constant expression for static objects
T🗸 may be any expression otherwise
may be surrounded by braces
when single scalar object
when using literal to initialize array (e.g. string literal)
T🗸 in cast expression (compound literal)
T🗸 parse
🗸 variadic functions
🗸 function definitions
T🗸 declaration inserted into scope
T🗸 before body is parsed
T🗸 before body is checked
🗸 declarator must be function
T🗸 non-pointer accepted
🗸 pointer rejected
🗸 typedef rejected
T🗸 without arguments (void)
T🗸 with named arguments
🗸 argument types must be complete (sans (void) case)
🗸 identifiers
🗸 no such thing as an invalid token
🗸 main function
🗸 freestanding environment
🗸 not required to be declared
🗸 no requirements or restrictions imposed when declared
🗸 hosted environment
🗸 must have strictly conforming declaration
🗸 int main(void)
T🗸 int main(int, char **)
T🗸 ...or equivalent, expanding typedefs and performing usual parameter conversions
🗸 must have external linkage
🗸 all other forms rejected
🗸 checked even if no definition is present
pre-processor
🗸 macro definitions
🗸 #define
🗸 variable-like
🗸 function-like
🗸 shadowing keywords
🗸 undef
🗸 shadowing keywords
🗸 other identifiers
macro substitution
🗸 pre-defined
🗸 macros
T🗸 __STDC__
T🗸 __STDC_HOSTED__
T🗸 __FILE__
T🗸 __LINE__
🗸 __DATE__
🗸 __TIME__
operators
# (%:)
🗸 ## (%:%:)
🗸 with macro arguments
🗸 only token closest to ## is concatenated
🗸 when argument expands to no tokens, replace with placemarker
🗸 with other tokens
🗸 constructs new token
🗸 non-pre-processing tokens
🗸 constructed token won't be used for pre-processing (i.e. no # or ##)
🗸 # has higher precedence than ##
🗸 object-like
🗸 function-like
🗸 works when followed by left paren
🗸 doesn't work when not followed by left paren
🗸 parameters
🗸 expand non-recursively
🗸 expands macros
🗸 everything else
🗸 don't expand recursively
🗸 from parameters
🗸 from macros
🗸 recursive
🗸 macros can't expand to themselves
🗸 everything else expands
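Several of the macro substitution rules above are easier to follow with a concrete case. The sketch below is plain standard C; every macro and identifier in it is hypothetical and chosen only for illustration.

```c
#include <string.h>

#define CAT(a, b) a ## b
#define STR(x) #x
#define XSTR(x) STR(x)
#define VALUE 42

/* ## pastes the two tokens into one identifier: this declares `foobar`. */
int CAT(foo, bar) = 7;

/* Operands of # are not macro-expanded first; go through XSTR to expand. */
const char *raw_text = STR(VALUE);       /* "VALUE" */
const char *expanded_text = XSTR(VALUE); /* "42" */

int texts_differ(void)
{
	return strcmp(raw_text, expanded_text) != 0;
}

/* A macro is not re-expanded inside its own expansion. */
int counter = 1;
#define counter (counter + 1)
int next_counter(void)
{
	return counter; /* expands once, to (counter + 1) */
}
#undef counter

/* A function-like macro name only expands when followed by '('. */
#define scale(x) ((x) * 2)
int (scale)(int x) /* not a macro invocation: defines a real function */
{
	return x * 3; /* deliberately different from the macro */
}
int via_macro(void) { return scale(5); }      /* uses the macro */
int via_function(void) { return (scale)(5); } /* calls the function */
```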
🗸 #include
🗸 system headers
🗸 non-system headers
🗸 header exists
🗸 header doesn't exist; fall back to system header
🗸 header name doesn't have angle brackets
🗸 header name has angle brackets
🗸 macro expansion
🗸 <this> is lexed as a system header string literal
🗸 true in #include
🗸 false everywhere else
🗸 conditional
🗸 #if, #endif
🗸 #else
🗸 #elif
🗸 defined
🗸 non-parenthesized identifier
🗸 parenthesized identifier
🗸 error out when neither of the above forms is matched
🗸 #ifdef
🗸 #ifndef
🗸 #error
🗸 errors out
🗸 error message uses all tokens
🗸 macros aren't expanded
#line
🗸 only change line number
🗸 also change filename
macros are expanded
legacy
T🗸 trigraphs
🗸 k&r-style functions
🗸 empty parameter list
🗸 non-empty parameter list without types
🗸 k&r-style parameter declarations
🗸 implicit int
implicit function declaration
🗸 c95
T🗸 digraphs
🗸 __STDC_VERSION__
c99
🗸 pragma
🗸 #pragma
🗸 _Pragma
🗸 STDC
🗸 accept standardized
🗸 FP_CONTRACT
🗸 FENV_ACCESS
🗸 CX_LIMITED_RANGE
🗸 reject unstandardized
🗸 implementation-defined
🗸 accepted; ignored
VLAs
🗸 parse
can't be initialized
🗸 hex float literals
T🗸 prefix / base
🗸 parse value
T🗸 exponent
universal character names
T🗸 in char/string literals
in identifiers
prior to c17, conform to annex D
🗸 disallowed: (<0xa0 && !='$' && !='@' && !='\'') || (>=0xd800 && <0xe000)
T🗸 initializer designators
T🗸 array
T🗸 struct
declaration as for-loop initializer
T🗸 parse
must be auto or register
default is auto
T🗸 specifiers
T🗸 type specifiers
T🗸 _Complex
T🗸 _Imaginary
T🗸 long double
T🗸 long long
inline
T🗸 parse
T🗸 may appear more than once
can't be used outside of function declaration
T🗸 applies to function declaration
if inline is used on any declaration, there must be a definition
🗸 internally-linked
🗸 can be used on any declaration; no change in behavior
externally-linked; inline definitions
🗸 function becomes inline definition if all file-scope declarations use inline
🗸 ...and none explicitly use extern
🗸 otherwise no change in behavior; function isn't an inline definition
🗸 inline definition doesn't provide external definition
constraints
may not define a modifiable object with static or thread duration
may not use identifier with internal linkage
🗸 static array parameters
T🗸 restrict
T🗸 parse
🗸 variadic macros
🗸 use of ... in definition
🗸 __VA_ARGS__
T🗸 intermixing declarations and statements in compound statements
T🗸 literal suffixes
T🗸 ll int literal suffix
T🗸 l float literal suffix
🗸 __func__
🗸 parse
🗸 resolves to name of current function
🗸 can't be redeclared at top-level
🗸 has type 'const char []'
implicit return 0 from main for hosted targets
not for freestanding targets
c11
specifiers
_Thread_local
🗸 parse
🗸 may not appear alongside auto or register
🗸 may not appear in block scope when no storage specifiers are present
🗸 may not be used on function declaration
must appear on all declarations of an object
_Noreturn
T🗸 parse
🗸 may appear more than once
applies to function declaration
_Alignas
🗸 expression
🗸 type
duplicates permitted; most strict alignment used
invalid uses
alongside typedef/register specifier
on function
on bit-field
must not specify less strict alignment than default
_Atomic
as specifier
parse
type in parens must not be qualified
type in parens must not be array, function, or atomic
as qualifier
🗸 parse
other qualifiers apply to atomic type, not to type being made atomic
🗸 _Static_assert
T🗸 parse
T🗸 top-level
T🗸 within function
T🗸 within struct/union
🗸 eval
T🗸 top-level
T🗸 within function
T🗸 within struct/union
🗸 errors out when condition is false
T🗸 _Alignof
_Generic
T🗸 parse
controlling expression conversions
🗸 lvalue conversion
array -> pointer
🗸 function -> function pointer
🗸 must match exactly one case
🗸 at most one compatible case
🗸 at most one default case allowed
🗸 if no compatible case, default case must be present
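The `_Generic` rules above, as the C standard describes them, can be sketched like this. The code is plain C11; `enum kind`, `KIND_OF`, and the function names are hypothetical, and the sketch says nothing about which of these conversions this project currently implements.

```c
enum kind { KIND_INT, KIND_DOUBLE, KIND_CHAR_PTR, KIND_OTHER };

/* At most one association may be compatible, and at most one default. */
#define KIND_OF(x) _Generic((x), \
	int: KIND_INT,               \
	double: KIND_DOUBLE,         \
	char *: KIND_CHAR_PTR,       \
	default: KIND_OTHER)

enum kind kind_of_const_int(void)
{
	const int ci = 0;
	return KIND_OF(ci); /* qualifier dropped by lvalue conversion */
}

enum kind kind_of_array(void)
{
	char buf[4] = "abc";
	return KIND_OF(buf); /* char[4] is converted to char * */
}

enum kind kind_of_float(void)
{
	return KIND_OF(1.0f); /* no float association: default is chosen */
}
```

Without the `default` association, the last function would be a constraint violation, since no listed type is compatible with `float`.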
T🗸 prefixes
T🗸 char literals
T🗸 u8-
T🗸 u-
T🗸 U-
T🗸 string literals
T🗸 u-
T🗸 U-
nested anonymous structs/unions
🗸 pre-defined macros
🗸 __STDC_IEC_559__
🗸 __STDC_UTF_16__
🗸 __STDC_UTF_32__
c23
unicode identifiers
specifiers
🗸 _BitInt
T🗸 signed
T🗸 parse
T🗸 unsigned
T🗸 parse
T🗸 wb int literal suffix
🗸 operand must be integer constant expression
🗸 signed width must be >= 2
🗸 unsigned width must be >= 1
T🗸 doesn't undergo integer promotion
🗸 has rank below equivalently sized basic integer type
🗸 otherwise ranked based on width of both types
T🗸 typeof
T🗸 qualified (typeof)
T🗸 unqualified (typeof_unqual)
float types
basic
real
_Float32
_Float64
_Float80
_Float128
imaginary
_Float32_Imaginary
_Float64_Imaginary
_Float80_Imaginary
_Float128_Imaginary
complex
_Float32_Complex
_Float64_Complex
_Float80_Complex
_Float128_Complex
extended
real
_Float32x
_Float64x
_Float128x
imaginary
_Float32x_Imaginary
_Float64x_Imaginary
_Float128x_Imaginary
complex
_Float32x_Complex
_Float64x_Complex
_Float128x_Complex
decimal types
T🗸 basic
T🗸 _Decimal32
T🗸 type specifier
T🗸 df suffix
T🗸 _Decimal64
T🗸 type specifier
T🗸 dd suffix
T🗸 _Decimal128
T🗸 type specifier
T🗸 dl suffix
extended
_Decimal64x
_Decimal128x
🗸 constexpr
pre-processor
#embed
system embeds
non-system embeds
embed exists
embed doesn't exist; fall back to system embed
embed name doesn't have angle brackets
embed name has angle brackets
parameters
standard
if_empty
limit
prefix
suffix
vendor-specific
duplicates not permitted
leading+trailing underscores permitted
#warning
actually warns
🗸 warning message uses all tokens
🗸 macros aren't expanded
__VA_OPT__
🗸 #elifdef, #elifndef
standard pragmas
FENV_ROUND
🗸 value must be direction
eval
information persists in check
FENV_DEC_ROUND
🗸 value must be dec-direction
eval
information persists in check
🗸 static_assert without reason
type inferencing with auto
T🗸 nullptr
T🗸 labels
T🗸 labelled declarations
T🗸 at end of compound statement
🗸 binary int literals
🗸 parse value
🗸 function declarations
🗸 variadic function without parameters
T🗸 parameters in function definition need not be named
attributes
locations
T🗸 declaration
T🗸 top-level
T🗸 bindings
T🗸 function definitions
T🗸 as a declaration consisting of only attributes and nothing else
T🗸 within compound body
T🗸 for-loop initializer
T🗸 function parameter
T🗸 struct/union fields
T🗸 binding
T🗸 base type
T🗸 declarator
T🗸 array
T🗸 function
T🗸 pointer
T🗸 with identifier
T🗸 without identifier
struct/union/enum declaration
T🗸 parse
allowed when defining struct/union/enum
allowed in tag declaration where struct/union doesn't act as type specifier
doesn't apply to enums since they can't be forward declared
disallowed otherwise
T🗸 enum field
T🗸 statement
standard
[[noreturn]], [[_Noreturn]]
disallow argument
applicable to
function
nothing else
[[deprecated]]
may have optional argument
applicable to
struct/union declaration
typedef name
object
struct/union member
function
enum
enum member
nothing else
__has_c_attribute => 202311L
warn when name is used
[[fallthrough]]
disallow argument
applicable to
lone attribute declaration
next encountered statement must have case or default label
if within iteration statement, next statement must also be within said iteration statement
__has_c_attribute => 202311L
[[maybe_unused]]
disallow argument
applicable to
struct/union declaration
typedef name
object
struct/union member
function
enum
enum member
label
nothing else
__has_c_attribute => 202311L
[[nodiscard]]
may have optional argument
applicable to
function
struct/union definition
enum definition
nothing else
__has_c_attribute => 202311L
warn when value discarded
[[reproducible]]
TODO
[[unsequenced]]
TODO
warn on non-standard
leading+trailing underscores permitted
T🗸 keywords are treated as identifiers
T🗸 with prefix
T🗸 without prefix
T🗸 without arguments
T🗸 with arguments
T🗸 including balanced tokens
T🗸 multiple attribute lists are combined
🗸 multiple attributes in [[brackets]]
T🗸 u8- string literals
T🗸 empty initializers
🗸 ' as separator
🗸 int literals
🗸 float literals
T🗸 empty declarations
pre-defined macros
keyword aliases
🗸 alignas
T🗸 alignof
T🗸 bool
T🗸 true
T🗸 defined
T🗸 expands to _Bool value
T🗸 false
T🗸 defined
T🗸 expands to _Bool value
🗸 static_assert
🗸 thread_local
when equivalent keyword is re-defined, alias macro doesn't expand said definition
__has_c_attribute
__has_embed
__has_include
🗸 constants
🗸 __STDC_IEC_60559_TYPES__
🗸 __STDC_IEC_60559_BFP__
🗸 __STDC_IEC_60559_DFP__
🗸 __STDC_EMBED_NOT_FOUND__
🗸 __STDC_EMBED_FOUND__
🗸 __STDC_EMBED_EMPTY__
explicit enum backing type
🗸 storage specifiers in cast expression
T🗸 allowed for compound literals
🗸 disallowed for other casts
T🗸 treat empty parameter list as identical to (void)
array is qualified equivalently to element type
extensions
redefinition of macros
🗸 user-defined
🗸 pre-defined
warns when new definition isn't identical to old
doesn't warn when new definition is identical
defining keywords as macros
🗸 allowed
warns
T🗸 pre-declared identifiers
🗸 can't be redeclared at top-level
🗸 __builtin_va_arg
T🗸 parse
🗸 check
🗸 __builtin_va_copy
T🗸 parse
🗸 check
🗸 __builtin_va_end
T🗸 parse
🗸 check
🗸 __builtin_va_list
T🗸 parse
🗸 check
__builtin_va_start
🗸 parse
T🗸 require exactly two arguments before c23
🗸 require one or two arguments after c23
🗸 check
warnings and errors
error when function isn't variadic
warn when more than one optional argument is supplied (c23)
optional argument isn't an identifier
error before c23
warn since c23
optional argument isn't the last named function parameter
error before c23
warn since c23
🗸 pre-defined macros
🗸 __has_builtin
__asm__ declarations
🗸 parse
disallowed when no linkage
disallowed for struct/union fields
🗸 allowed otherwise
casting between object pointer and function pointer
warns
🗸 does the Right Thing
🗸 __DATE__ and __TIME__ use SOURCE_DATE_EPOCH if set
__attribute__
parse
warn for unrecognized attributes
supported attributes
packed
section
leading+trailing underscores permitted
additional warnings
declaring a reserved identifier
before c23: including potentially reserved
after c23: excluding potentially reserved
defining a reserved identifier
before c23: including potentially reserved
after c23: excluding potentially reserved
SOURCE_DATE_EPOCH is invalid
unused internally-linked objects and functions