C lexer/parser/checker/compiler for Hare
-E: parse + pre-process the input file and unparse to stdout (useful for developing the parser)
-dump: check the input file and dump all top-level declarations to stdout (useful for developing the checker)
-emit-qbe: emit the qbe .ssa file (not implemented yet)
-h: show other supported flags
If you need help or have questions, ping me on fedi or IRC or wherever :)
A checkmark means the thing is implemented; a T means the thing is properly tested. This checklist isn't yet exhaustive; I'm updating it as I go.
basics
  T🗸 comments
    T🗸 block
    T🗸 line
  T🗸 backslashes at end of line
declarations
  🗸 must declare at least one identifier
  specifiers
    type specifiers
      T🗸 signed, unsigned
        T🗸 parse
      🗸 void
        T🗸 parse
        🗸 incomplete type; cannot be completed
      T🗸 char
        T🗸 parse
      T🗸 short
        T🗸 parse
      T🗸 int
        T🗸 parse
      T🗸 long
        T🗸 parse
      T🗸 float
        T🗸 parse
      T🗸 double
        T🗸 parse
      struct/union
        struct
          bit-fields
            named
              T🗸 parse
              check
            anonymous
              T🗸 parse
              check
          T🗸 declarations
            T🗸 named
              T🗸 parse
              T🗸 check
            T🗸 anonymous
              T🗸 parse
              T🗸 check
          T🗸 definitions
            T🗸 parse
            T🗸 check
          T🗸 forward
            T🗸 parse
            T🗸 check
          🗸 incomplete types
            T🗸 final member may be incomplete array if other members are present
            🗸 everything else disallowed
            T🗸 struct type is still complete; size is as though incomplete array weren't present
            🗸 but struct type can't be used in other structs or in arrays
              🗸 including if it's (recursively) a member of a union
        🗸 union
          🗸 bit-fields prohibited
          🗸 declarations
            🗸 named
              T🗸 parse
              🗸 check
            🗸 anonymous
              T🗸 parse
              🗸 check
          🗸 definitions
            T🗸 parse
            🗸 check
          🗸 forward
            T🗸 parse
            🗸 check
          🗸 incomplete types disallowed
        🗸 must contain at least one named member
        variably modified member types disallowed
        T🗸 tag inserted into scope immediately after it's declared
        tag is always declared when not used as type specifier
        🗸 when used as type specifier, tag is only declared if no other declaration is visible
      🗸 enum
        🗸 declarations
          T🗸 named
          T🗸 anonymous
        T🗸 definitions
        🗸 forward disallowed
        🗸 fields inserted into scope
          T🗸 parse
          🗸 no linkage
        🗸 tag inserted into scope immediately after definition ends
      typedefs
        T🗸 parse
        T🗸 inserted into scope
          T🗸 file scope
          block scope
            disallowed for variably modified types
            T🗸 allowed otherwise
    storage specifiers
      T🗸 auto
        T🗸 parse
      T🗸 static
        T🗸 parse
      register
        T🗸 parse
        forbids taking address
          🗸 explicitly
          implicitly (e.g. array -> pointer conversion)
          also applies to derivative lvalues
      🗸 extern
        T🗸 parse
        🗸 defaults to external linkage
        T🗸 uses linkage of identifier visible in scope when it's internal or external
        🗸 can't have initializer
      🗸 typedef
        T🗸 parse
        T🗸 inserts type into scope
        🗸 can't have initializer
      🗸 at most one storage specifier per declaration (sans thread_local)
        🗸 duplicates disallowed
    🗸 duplicate specifiers
      🗸 disallowed for type specifiers
      T🗸 allowed for other specifiers
    🗸 disallow incompatible
      🗸 type specifiers
      🗸 other specifiers
    T🗸 qualifiers
      T🗸 allowed in any order
      T🗸 duplicates allowed
  declarators
    T🗸 pointer
    function
      parameters
        🗸 named
        T🗸 anonymous
        🗸 arrays converted to (qualified) pointers
          🗸 still checked as arrays
        🗸 functions converted to function pointers
        may not shadow typedefs
        🗸 may not declare more than one (non-tag) identifier
        🗸 may not be initialized
        T🗸 without storage specifier
          T🗸 defaults to implicit auto
        register storage specifier
          🗸 parse
          🗸 declares with register storage
          🗸 ignored for non-definitions
          🗸 may be different in compatible types
            applies even when other declarations' parameters have different/absent storage specifiers
        🗸 all other specifiers disallowed
      🗸 return type requirements
        🗸 must return either complete type or void
        🗸 may not return array
        🗸 warn when function returning complete type doesn't have return statement
          T🗸 except for main function in hosted environment
          🗸 all other functions
      🗸 storage specifier must be static or extern (explicit or otherwise)
      🗸 error out when explicitly given qualifiers (via typedef)
        🗸 implicit const qualifier
    array
      T🗸 without qualifiers
      🗸 qualifiers apply to element type
        🗸 don't apply to array itself before c23
        ...except as inner-most declarator of function parameter
      qualifiers may only be in square brackets in inner-most declarator of function parameter
      🗸 if present, known size must be greater than zero
      🗸 element type must be
        🗸 complete
        🗸 object
    🗸 identifier only
    🗸 parenthesized
  🗸 type names
    🗸 with declarator
    🗸 without declarator
  🗸 scopes
    🗸 file
      🗸 parse
      🗸 auto/register specifiers disallowed
      🗸 default storage
        🗸 extern for functions
        T🗸 external linkage for objects
    🗸 block
      🗸 parse
      🗸 default storage is auto
  🗸 duplicates
    🗸 file scope
      T🗸 allowed for compatible declarations
        T🗸 composite type
        T🗸 compatible specifiers
      🗸 disallowed for incompatible declarations
      🗸 at most one definition allowed
    🗸 block scope
      🗸 disallowed within same scope
      T🗸 shadowing in nested scopes
        🗸 shadowing block-scoped declarations
        🗸 shadowing file-scoped declarations
    🗸 struct/union/enum tags must always refer to same type
  🗸 unique namespaces
    T🗸 identifiers
    T🗸 struct/union/enum
    🗸 goto labels
  🗸 tentative declarations
  🗸 internally-linked declarations must be complete by end of translation unit
  translation unit must not be empty (static assertions are ok)
  internally-linked declaration must be initialized if used
    except sizeof
    except alignof
    all other expressions
  🗸 can't declare an object with type void
expressions
  literals
    numbers
      T🗸 pre-processing numbers
      T🗸 int literals
        T🗸 decimal
        T🗸 hex
        T🗸 octal
        T🗸 suffixes
          T🗸 u
          T🗸 l
        T🗸 parse value
      T🗸 float literals
        T🗸 decimal
        T🗸 exponent
        T🗸 f suffix
        T🗸 parse value
    T🗸 char literals
      T🗸 plain
      T🗸 L prefix
    🗸 string literals
      T🗸 plain
      T🗸 L prefix
      🗸 string concatenation
        🗸 works for same-prefix strings
        🗸 disallowed for strings with different prefixes
  T🗸 sizeof
    T🗸 parse
  T🗸 array indexing
    T🗸 parse
  🗸 struct/union field accessing
    🗸 a.b
      🗸 parse
    🗸 a->b
      🗸 parse
  T🗸 unary postfix
    T🗸 parse
  unary prefix
    T🗸 parse
    &
      compile-time
      T🗸 runtime
      🗸 when operand is unary *, neither is evaluated
        🗸 constraints still apply
  T🗸 binary
    T🗸 parse
  T🗸 assignment
    T🗸 parse
  T🗸 parenthesized
  T🗸 operator precedence
  casting
    🗸 explicit
      T🗸 parse
      T🗸 extends implicit type conversion rules
      T🗸 integer -> pointer
      T🗸 pointer -> integer
      T🗸 pointer -> pointer
        T🗸 object -> object
        T🗸 function -> function
        🗸 everything else disallowed (except object <-> function; see extensions)
    implicit conversion
      T🗸 integer promotion
      🗸 lvalue conversion
      types
        T🗸 integer -> float
        T🗸 float -> float
        T🗸 integer -> integer
        T🗸 float -> integer
        T🗸 pointer -> bool
        T🗸 array -> pointer
        T🗸 function -> pointer
        🗸 pointer -> pointer
          T🗸 pointer -> compatible pointer
          T🗸 object pointer -> void pointer
          T🗸 void pointer -> object pointer
          🗸 can't implicitly convert to any other pointer
          T🗸 qualifiers may be added
          🗸 qualifiers may not be removed
        T🗸 any -> void
        🗸 va_list -> va_list
        struct/union -> same struct/union
          🗸 tagged
          untagged
        🗸 everything else disallowed
statements
  goto
    T🗸 parse
    T🗸 jump to any label within function
    🗸 disallow undefined labels
    may not jump from outside scope of VMT to inside scope
  T🗸 compound
    T🗸 parse
  labelled
    🗸 goto label
      T🗸 parse
      🗸 unique per-function
    case
      T🗸 parse
      🗸 must evaluate to integer constant
      case type is converted to type of controlling expression
      🗸 must only appear in switch body
        T🗸 direct descendant
        T🗸 within another block
        🗸 disallowed everywhere else
      T🗸 only visible to inner-most switch body
      🗸 each case in switch body is unique
        T🗸 ...but nested switch statements may contain duplicates
    🗸 default
      T🗸 parse
      🗸 must only appear in switch body
        T🗸 direct descendant
        T🗸 within another block
        🗸 disallowed everywhere else
      T🗸 only visible to inner-most switch body
      🗸 at most one per switch body
        T🗸 ...but nested switch statements may contain their own
    T🗸 labelled statements may themselves be labelled
  T🗸 empty
    T🗸 parse
  🗸 if
    T🗸 parse
    🗸 condition must be scalar
  🗸 while
    T🗸 parse
    🗸 condition must be scalar
  🗸 do-while
    T🗸 parse
    🗸 condition must be scalar
  for
    T🗸 parse
    🗸 condition must be scalar (if present)
    T🗸 initializer, condition, and afterthought may all be omitted
    initializer declarations may only have automatic storage duration
    initializer declarations enter scope immediately after declared
  switch
    T🗸 parse
    🗸 value must be integer
    🗸 value is promoted
    no VMTs not in scope of switch statement may be in scope of any cases
  break
    T🗸 parse
    T🗸 allowed in loop
    T🗸 allowed in switch statement
    🗸 disallowed everywhere else
    causes jump to outside whatever it applies to
      T🗸 within check
      for-loop afterthought isn't evaluated
  continue
    T🗸 parse
    T🗸 allowed in loop
    🗸 disallowed everywhere else
    causes jump to end of loop it applies to
      T🗸 within check
      so for-loop afterthought is evaluated, then condition
  🗸 return
    T🗸 parse
    🗸 type of return value must be convertible to function return type
    🗸 warn when returning non-void from void function
    🗸 warn when used in noreturn function
  T🗸 expression statements
    T🗸 parse
    T🗸 implicitly convert to void
initializers
  as initializer of declaration
    T🗸 parse
    must be constant expression for static objects
    T🗸 may be any expression otherwise
    may be surrounded by braces
      when single scalar object
      when using literal to initialize array (e.g. string literal)
  T🗸 in cast expression (compound literal)
    T🗸 parse
🗸 variadic functions
🗸 function definitions
  T🗸 declaration inserted into scope
    T🗸 before body is parsed
    T🗸 before body is checked
  🗸 declarator must be function
    T🗸 non-pointer accepted
    🗸 pointer rejected
    🗸 typedef rejected
  T🗸 without arguments (void)
  T🗸 with named arguments
  🗸 argument types must be complete (sans (void) case)
🗸 identifiers
  🗸 no such thing as an invalid token
🗸 main function
  🗸 freestanding environment
    🗸 not required to be declared
    🗸 no requirements or restrictions imposed when declared
  🗸 hosted environment
    🗸 must have strictly conforming declaration
      🗸 int main(void)
      T🗸 int main(int, char **)
      T🗸 ...or equivalent, expanding typedefs and performing usual parameter conversions
      🗸 must have external linkage
      🗸 all other forms rejected
    🗸 checked even if no definition is present
pre-processor
  🗸 macro definitions
    🗸 #define
      🗸 variable-like
      🗸 function-like
      🗸 shadowing keywords
    🗸 undef
      🗸 shadowing keywords
      🗸 other identifiers
  macro substitution
    🗸 pre-defined
      🗸 macros
        T🗸 __STDC__
        T🗸 __STDC_HOSTED__
        T🗸 __FILE__
        T🗸 __LINE__
        🗸 __DATE__
        🗸 __TIME__
    operators
      # (%:)
      🗸 ## (%:%:)
        🗸 with macro arguments
          🗸 only token closest to ## is concatenated
          🗸 when argument expands to no tokens, replace with placemarker
        🗸 with other tokens
        🗸 constructs new token
          🗸 non-pre-processing tokens
          🗸 constructed token won't be used for pre-processing (i.e. no # or ##)
      🗸 # has higher precedence than ##
    🗸 object-like
    🗸 function-like
      🗸 works when followed by left paren
      🗸 doesn't work when not followed by left paren
      🗸 parameters
        🗸 expand non-recursively
          🗸 expands macros
          🗸 everything else
        🗸 don't expand recursively
          🗸 from parameters
          🗸 from macros
    🗸 recursive
      🗸 macros can't expand to themselves
      🗸 everything else expands
  🗸 #include
    🗸 system headers
    🗸 non-system headers
      🗸 header exists
      🗸 header doesn't exist; fallback to system header
        🗸 header name doesn't have angle brackets
        🗸 header name has angle brackets
    🗸 macro expansion
    🗸 <this> is lexed as a system header string literal
      🗸 true in #include
      🗸 false everywhere else
  🗸 conditional
    🗸 #if, #endif
    🗸 #else
    🗸 #elif
    🗸 defined
      🗸 non-parenthesized identifier
      🗸 parenthesized identifier
      🗸 error out when neither of the above forms is matched
    🗸 #ifdef
    🗸 #ifndef
  🗸 #error
    🗸 errors out
    🗸 error message uses all tokens
      🗸 macros aren't expanded
  #line
    🗸 only change line number
    🗸 also change filename
    macros are expanded
legacy
  T🗸 trigraphs
  🗸 k&r-style functions
    🗸 empty parameter list
    🗸 non-empty parameter list without types
    🗸 k&r-style parameter declarations
  🗸 implicit int
  implicit function declaration
🗸 c95
  T🗸 digraphs
  🗸 __STDC_VERSION__
c99
  🗸 pragma
    🗸 #pragma
    🗸 _Pragma
    🗸 STDC
      🗸 accept standardized
        🗸 FP_CONTRACT
        🗸 FENV_ACCESS
        🗸 CX_LIMITED_RANGE
      🗸 reject unstandardized
    🗸 implementation-defined
      🗸 accepted; ignored
  VLAs
    🗸 parse
    can't be initialized
  🗸 hex float literals
    T🗸 prefix / base
    🗸 parse value
    T🗸 exponent
  universal character names
    T🗸 in char/string literals
    in identifiers
      prior to c17, conform to annex D
      🗸 disallowed: (<0xa0 && !='$' && !='@' && !='\'') || (>=0xd800 && <0xe000)
  T🗸 initializer designators
    T🗸 array
    T🗸 struct
  declaration as for-loop initializer
    T🗸 parse
    must be auto or register
    default is auto
  T🗸 specifiers
    T🗸 type specifiers
      T🗸 _Complex
      T🗸 _Imaginary
      T🗸 long double
      T🗸 long long
    inline
      T🗸 parse
      T🗸 may appear more than once
      can't be used outside of function declaration
      T🗸 applies to function declaration
      if inline is used on any declaration, there must be a definition
      🗸 internally-linked
        🗸 can be used on any declaration; no change in behavior
      externally-linked; inline definitions
        🗸 function becomes inline definition if all file-scope declarations use inline
          🗸 ...and none explicitly use extern
          🗸 otherwise no change in behavior; function isn't an inline definition
        🗸 inline definition doesn't provide external definition
        constraints
          may not define a modifiable object with static or thread duration
          may not use identifier with internal linkage
  🗸 static array parameters
  T🗸 restrict
    T🗸 parse
  🗸 variadic macros
    🗸 use of ... in definition
    🗸 __VA_ARGS__
  T🗸 intermixing declarations and statements in compound statements
  T🗸 literal suffixes
    T🗸 ll int literal suffix
    T🗸 l float literal suffix
  🗸 __func__
    🗸 parse
    🗸 resolves to name of current function
    🗸 can't be redeclared at top-level
    🗸 has type 'const char '
  implicit return 0 from main
    for hosted targets
    not for freestanding targets
c11
  specifiers
    _Thread_local
      🗸 parse
      🗸 may not appear alongside auto or register
      🗸 may not appear in block scope when no storage specifiers are present
      🗸 may not be used on function declaration
      must appear on all declarations of an object
    _Noreturn
      T🗸 parse
      🗸 may appear more than once
      applies to function declaration
    _Alignas
      🗸 expression
      🗸 type
      duplicates permitted; most strict alignment used
      invalid uses
        alongside typedef/register specifier
        on function
        on bit-field
      must not specify less strict alignment than default
    _Atomic
      as specifier
        parse
        type in parens must not be qualified
        type in parens must not be array, function, or atomic
      as qualifier
        🗸 parse
        other qualifiers apply to atomic type, not to type being made atomic
  🗸 _Static_assert
    T🗸 parse
      T🗸 top-level
      T🗸 within function
      T🗸 within struct/union
    🗸 eval
      T🗸 top-level
      T🗸 within function
      T🗸 within struct/union
      🗸 errors out when condition is false
  T🗸 _Alignof
  _Generic
    T🗸 parse
    controlling expression conversions
      🗸 lvalue conversion
      array -> pointer
      🗸 function -> function pointer
    🗸 must match exactly one case
      🗸 at most one compatible case
      🗸 at most one default case allowed
      🗸 if no compatible case, default case must be present
  T🗸 prefixes
    T🗸 char literals
      T🗸 u8-
      T🗸 u-
      T🗸 U-
    T🗸 string literals
      T🗸 u-
      T🗸 U-
  nested anonymous structs/unions
  🗸 pre-defined macros
    🗸 __STDC_IEC_559__
    🗸 __STDC_UTF_16__
    🗸 __STDC_UTF_32__
c23
  unicode identifiers
  specifiers
    🗸 _BitInt
      T🗸 signed
        T🗸 parse
      T🗸 unsigned
        T🗸 parse
      T🗸 wb int literal suffix
      🗸 operand must be integer constant expression
      🗸 signed width must be >= 2
      🗸 unsigned width must be >= 1
      T🗸 doesn't undergo integer promotion
      🗸 has rank below equivalently sized basic integer type
      🗸 otherwise ranked based on width of both types
    T🗸 typeof
      T🗸 qualified (typeof)
      T🗸 unqualified (typeof_unqual)
    float types
      basic
        real
          _Float32
          _Float64
          _Float80
          _Float128
        imaginary
          _Float32_Imaginary
          _Float64_Imaginary
          _Float80_Imaginary
          _Float128_Imaginary
        complex
          _Float32_Complex
          _Float64_Complex
          _Float80_Complex
          _Float128_Complex
      extended
        real
          _Float32x
          _Float64x
          _Float128x
        imaginary
          _Float32x_Imaginary
          _Float64x_Imaginary
          _Float128x_Imaginary
        complex
          _Float32x_Complex
          _Float64x_Complex
          _Float128x_Complex
    decimal types
      T🗸 basic
        T🗸 _Decimal32
          T🗸 type specifier
          T🗸 df suffix
        T🗸 _Decimal64
          T🗸 type specifier
          T🗸 dd suffix
        T🗸 _Decimal128
          T🗸 type specifier
          T🗸 dl suffix
      extended
        _Decimal64x
        _Decimal128x
    🗸 constexpr
  pre-processor
    #embed
      system embeds
      non-system embeds
        embed exists
        embed doesn't exist; fallback to system header
          embed name doesn't have angle brackets
          embed name has angle brackets
      parameters
        standard
          if_empty
          limit
          prefix
          suffix
        vendor-specific
        duplicates not permitted
        leading+trailing underscores permitted
    #warning
      actually warns
      🗸 warning message uses all tokens
        🗸 macros aren't expanded
    __VA_OPT__
    🗸 #elifdef, #elifndef
    standard pragmas
      FENV_ROUND
        🗸 value must be direction
        eval
        information persists in check
      FENV_DEC_ROUND
        🗸 value must be dec-direction
        eval
        information persists in check
  🗸 static_assert without reason
  type inferencing with auto
  T🗸 nullptr
  T🗸 labels
    T🗸 labelled declarations
    T🗸 at end of compound statement
  🗸 binary int literals
    🗸 parse value
  🗸 function declarations
    🗸 variadic function without parameters
    T🗸 parameters in function definition need not be named
  attributes
    locations
      T🗸 declaration
        T🗸 top-level
          T🗸 bindings
          T🗸 function definitions
          T🗸 as a declaration consisting of only attributes and nothing else
        T🗸 within compound body
        T🗸 for-loop initializer
        T🗸 function parameter
        T🗸 struct/union fields
        T🗸 binding
          T🗸 base type
          T🗸 declarator
            T🗸 array
            T🗸 function
            T🗸 pointer
            T🗸 with identifier
            T🗸 without identifier
      struct/union/enum declaration
        T🗸 parse
        allowed when defining struct/union/enum
        allowed in tag declaration where struct/union doesn't act as type specifier
          doesn't apply to enums since they can't be forward declared
        disallowed otherwise
      T🗸 enum field
      T🗸 statement
    standard
      [[noreturn]], [[_Noreturn]]
        disallow argument
        applicable to
          function
          nothing else
      [[deprecated]]
        may have optional argument
        applicable to
          struct/union declaration
          typedef name
          object
          struct/union member
          function
          enum
          enum member
          nothing else
        __has_c_attribute => 202311L
        warn when name is used
      [[fallthrough]]
        disallow argument
        applicable to
          lone attribute declaration
        next encountered statement must have case or default label
        if within iteration statement, next statement must also be within said iteration statement
        __has_c_attribute => 202311L
      [[maybe_unused]]
        disallow argument
        applicable to
          struct/union declaration
          typedef name
          object
          struct/union member
          function
          enum
          enum member
          label
          nothing else
        __has_c_attribute => 202311L
      [[nodiscard]]
        may have optional argument
        applicable to
          function
          struct/union definition
          enum definition
          nothing else
        __has_c_attribute => 202311L
        warn when value discarded
      [[reproducible]]
        TODO
      [[unsequenced]]
        TODO
    warn on non-standard
    leading+trailing underscores permitted
    T🗸 keywords are treated as identifiers
    T🗸 with prefix
    T🗸 without prefix
    T🗸 without arguments
    T🗸 with arguments
      T🗸 including balanced tokens
    T🗸 multiple attribute lists are combined
    🗸 multiple attributes in [[brackets]]
  T🗸 u8- string literals
  T🗸 empty initializers
  🗸 ' as separator
    🗸 int literals
    🗸 float literals
  T🗸 empty declarations
  pre-defined macros
    keyword aliases
      🗸 alignas
      T🗸 alignof
      T🗸 bool
      T🗸 true
        T🗸 defined
        T🗸 expands to _Bool value
      T🗸 false
        T🗸 defined
        T🗸 expands to _Bool value
      🗸 static_assert
      🗸 thread_local
      when equivalent keyword is re-defined, alias macro doesn't expand said definition
    __has_c_attribute
    __has_embed
    __has_include
    🗸 constants
      🗸 __STDC_IEC_60559_TYPES__
      🗸 __STDC_IEC_60559_BFP__
      🗸 __STDC_IEC_60559_DFP__
      🗸 __STDC_EMBED_NOT_FOUND__
      🗸 __STDC_EMBED_FOUND__
      🗸 __STDC_EMBED_EMPTY__
  explicit enum backing type
  🗸 storage specifiers in cast expression
    T🗸 allowed for compound literals
    🗸 disallowed for other casts
  T🗸 treat empty parameter list as identical to (void)
  array is qualified equivalently to element type
extensions
  redefinition of macros
    🗸 user-defined
    🗸 pre-defined
    warns when new definition isn't identical to old
    doesn't warn when new definition is identical
  defining keywords as macros
    🗸 allowed
    warns
  T🗸 pre-declared identifiers
    🗸 can't be redeclared at top-level
    🗸 __builtin_va_arg
      T🗸 parse
      🗸 check
    🗸 __builtin_va_copy
      T🗸 parse
      🗸 check
    🗸 __builtin_va_end
      T🗸 parse
      🗸 check
    🗸 __builtin_va_list
      T🗸 parse
      🗸 check
    __builtin_va_start
      🗸 parse
        T🗸 require exactly two arguments before c23
        🗸 require one or two arguments after c23
      🗸 check
      warnings and errors
        error when function isn't variadic
        warn when more than one optional argument is supplied (c23)
        optional argument isn't an identifier
          error before c23
          warn since c23
        optional argument isn't the last named function parameter
          error before c23
          warn since c23
  🗸 pre-defined macros
    🗸 __has_builtin
  __asm__ declarations
    🗸 parse
    disallowed when no linkage
    disallowed for struct/union fields
    🗸 allowed otherwise
  casting between object pointer and function pointer
    warns
    🗸 does the Right Thing
  🗸 __DATE__ and __TIME__ use SOURCE_DATE_EPOCH if set
  __attribute__
    parse
    warn for unrecognized attributes
    supported attributes
      packed
      section
    leading+trailing underscores permitted
additional warnings
  declaring a reserved identifier
    before c23: including potentially reserved
    after c23: excluding potentially reserved
  defining a reserved identifier
    before c23: including potentially reserved
    after c23: excluding potentially reserved
  SOURCE_DATE_EPOCH is invalid
  unused internally-linked objects and functions
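
Two of the checklist entries describe C semantics that are easy to misread from a one-line summary: the flexible array member rules under struct "incomplete types", and the "controlling expression conversions" under _Generic. The sketch below illustrates both in plain standard C (C11 or later). It is not taken from this project's test suite; `struct packet`, `packet_new`, `IS_INT`, `IS_CHAR_PTR`, and the two `generic_*` functions are hypothetical names chosen for illustration.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Flexible array member: only allowed as the final member, and only when
 * at least one other named member is present. */
struct packet {
	size_t len;
	unsigned char data[];
};

/* Hypothetical helper: allocate a packet with room for len payload bytes.
 * sizeof(struct packet) does not include the payload, so it is added here. */
struct packet *packet_new(const void *src, size_t len)
{
	struct packet *p = malloc(sizeof *p + len);
	if (p == NULL)
		return NULL;
	p->len = len;
	if (len > 0)
		memcpy(p->data, src, len);
	return p;
}

/* Hypothetical macros: evaluate to 1 when the converted type of the
 * controlling expression matches, 0 otherwise. */
#define IS_INT(x)      _Generic((x), int: 1, default: 0)
#define IS_CHAR_PTR(x) _Generic((x), char *: 1, default: 0)

/* Lvalue conversion drops the const qualifier, so "int" is selected. */
int generic_drops_qualifier(void)
{
	const int x = 0;
	return IS_INT(x);
}

/* Array -> pointer conversion turns char[4] into char *. */
int generic_decays_array(void)
{
	char buf[4] = {0};
	return IS_CHAR_PTR(buf);
}
```

A struct like `struct packet` is still a complete type, but per the checklist it can't be used as a member of another struct or as an array element type.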