Open
Labels
docs, enhancement (Solving this issue will likely involve adding new logic or components to the codebase.)
Description
In IR analysis, Zig is currently inconsistent about the semantics of an operation with an undefined value. This issue is to make clear the rules for how it is supposed to work. Note that undefined values are different from undefined behavior.
- In memory, an undefined value of type `T` takes up the same store size as a normal value of type `T`, and exists as any bit pattern within that store size. Thus, by looking only within the store size of an undefined value, it may be impossible to tell that it is an undefined value.
- Undefined values semantically represent an extra state which cannot be represented using any of the valid bit patterns of the underlying type. However, aside from the store size, the representation of an undefined value in memory is undefined; it can be any bit pattern. As an example, the value `u8(undefined)`, in memory, could be any combination of bits that fits in `@sizeOf(u8)`, which is `1`. The value `bool(undefined)`, in memory, could be any combination of bits that fits in `@sizeOf(bool)`, which is also `1`. So even though the only valid bit patterns of the type `bool` are `0b00000000` and `0b00000001`, when the value is undefined, the byte which represents the storage of the `bool` value could be anything, including `0b00000010`, `0b10101010`, or `0b11111111`. Therefore, because undefined values semantically represent an extra state, it is incorrect to assume that an undefined value with type `T` has a value which is in the set of valid values for type `T`.
- If an expression has no side effects and no possible undefined behavior, and one or more of its operands is an undefined value which is read, the result of the expression is an undefined value. For example, the `+%` operator. Note that for the slicing operator, if the start is `0`, the pointer value is not read, which makes this expression defined: `(([*]u8)(undefined))[0..0]`. Another example is `@ptrCast(*i32, (*u32)(undefined))`. Although 0x0 is not a valid bit pattern for the type `*u32`, 0x0 is a possible bit pattern within the store size of `*u32`, and so this expression is capable of producing an invalid bit pattern for the result type. However, `@ptrCast` is defined to have no possible undefined behavior because it is a no-op on the bit pattern.
- Branching on an undefined value is undefined behavior. This can be caught at `comptime`, and caught at runtime if "debug safety feature: runtime undefined value detection" (#211) is solved. For example, the condition of an `if` expression.
- If an expression has possible undefined behavior, one or more of its operands is an undefined value, and there is any combination of bit patterns within the store sizes of the undefined values that would cause undefined behavior, then the expression causes undefined behavior. For example, `@intCast(u8, u16(undefined))`. Another example: the `+` operator. However, if one of the operands of `+` is `comptime`-known to be `0`, and the other is an undefined value, the result is an undefined value because there exists no bit pattern added to `0` that causes overflow.
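The last three rules can be sketched as a small model. This is an illustration only, not the compiler's IR analysis: it assumes an undefined `u8` is modelled as `null` in an optional, and the names `Value`, `AnalysisError`, `addWrap`, `add`, and `branch` are hypothetical. It is written in current Zig syntax rather than the older cast syntax used in the list above.

```zig
const std = @import("std");

/// Hypothetical model: `null` stands for the extra "undefined" state of a u8.
const Value = ?u8;

/// Hypothetical error used by this model to signal undefined behavior.
const AnalysisError = error{UndefinedBehavior};

/// `+%` has no possible undefined behavior, so an undefined operand
/// only makes the result undefined.
fn addWrap(a: Value, b: Value) Value {
    const x = a orelse return null;
    const y = b orelse return null;
    return x +% y;
}

/// `+` can overflow. If any bit pattern of an undefined operand could
/// overflow, the whole expression is undefined behavior. When the other
/// operand is known to be 0 no pattern overflows, so the result is undefined.
fn add(a: Value, b: Value) AnalysisError!Value {
    if (a) |x| {
        if (b) |y| {
            // Both operands defined: ordinary overflow check.
            return std.math.add(u8, x, y) catch return error.UndefinedBehavior;
        }
        if (x == 0) return null;
        return error.UndefinedBehavior;
    }
    if (b) |y| {
        if (y == 0) return null;
    }
    return error.UndefinedBehavior;
}

/// Branching on an undefined condition is undefined behavior.
fn branch(cond: ?bool) AnalysisError!bool {
    return cond orelse return error.UndefinedBehavior;
}

test "undefined operands under the three rules" {
    try std.testing.expectEqual(@as(Value, null), addWrap(null, 1));
    try std.testing.expectEqual(@as(Value, null), try add(null, 0));
    try std.testing.expectError(error.UndefinedBehavior, add(null, 1));
    try std.testing.expectError(error.UndefinedBehavior, branch(null));
}
```

In this sketch, `add(null, null)` is also reported as undefined behavior, since some pair of bit patterns would overflow.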
The analysis code for every IR instruction should be audited, and tests added to enforce this behavior, especially for `comptime` code.
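As a rough idea of what the simplest of these tests might look like, the sketch below checks the store-size facts cited above at `comptime` and confirms that the example bit patterns fall outside the valid set for `bool`. The helper `isValidBoolPattern` is a hypothetical name introduced for illustration. The sketch does not exercise the compiler's handling of `undefined` itself.

```zig
const std = @import("std");

/// Hypothetical helper: true when `byte` is one of the two bit patterns
/// that are valid for `bool`.
fn isValidBoolPattern(byte: u8) bool {
    return byte == 0b00000000 or byte == 0b00000001;
}

comptime {
    // Both types occupy exactly one byte of storage.
    std.debug.assert(@sizeOf(u8) == 1);
    std.debug.assert(@sizeOf(bool) == 1);
}

test "an undefined bool may hold a bit pattern outside the valid set" {
    // Patterns that fit in the store size of bool but are not valid bools.
    const possible = [_]u8{ 0b00000010, 0b10101010, 0b11111111 };
    for (possible) |byte| {
        try std.testing.expect(!isValidBoolPattern(byte));
    }
    try std.testing.expect(isValidBoolPattern(0b00000001));
}
```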
Also these rules should be made clear in the language reference.