Float16 (javet 4.1.5 API)

java.lang.Object
- com.caoccao.javet.utils.Float16

public final class Float16
extends java.lang.Object

The Float16 class is a wrapper and a utility class to manipulate half-precision 16-bit IEEE 754 floating point data types (also called fp16 or binary16). A half-precision float can be created from or converted to single-precision floats, and is stored in a short data type.

The IEEE 754 standard specifies an fp16 as having the following format:

Sign bit: 1 bit
Exponent width: 5 bits
Significand: 10 bits

The format is laid out as follows:

 1   11111   1111111111
 ^   --^--   -----^----
 sign  |          |_______ significand
       |
       -- exponent
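
The layout above can be decoded with plain shifts and masks. The following is a minimal sketch that does not depend on Javet: the class `Fp16Layout` and its method names are hypothetical, and the literal shift and mask values are derived from the 1/5/10-bit layout rather than read from the library's constants.

```java
/** Hypothetical helper (not part of Javet): splits a binary16 bit pattern into its fields. */
final class Fp16Layout {
    /** Bit 15: 0 for positive, 1 for negative. */
    static int sign(short h) {
        return (h & 0x8000) >>> 15;
    }

    /** Bits 14..10: the biased exponent, 0 to 31. */
    static int exponent(short h) {
        return (h >>> 10) & 0x1F;
    }

    /** Bits 9..0: the significand (fraction) bits. */
    static int significand(short h) {
        return h & 0x3FF;
    }

    public static void main(String[] args) {
        short h = (short) 0xBC00; // bit pattern of -1.0
        System.out.println(sign(h) + " " + exponent(h) + " " + significand(h)); // prints "1 15 0"
    }
}
```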

Half-precision floating points can be useful to save memory and/or bandwidth at the expense of range and precision when compared to single-precision floating points (fp32).

To help you decide whether fp16 is the right storage type for your needs, please refer to the table below, which shows the available precision throughout the range of possible values. The precision column indicates the step size between two consecutive numbers in a specific part of the range.

Range start	Precision
0	1 ⁄ 16,777,216
1 ⁄ 16,384	1 ⁄ 16,777,216
1 ⁄ 8,192	1 ⁄ 8,388,608
1 ⁄ 4,096	1 ⁄ 4,194,304
1 ⁄ 2,048	1 ⁄ 2,097,152
1 ⁄ 1,024	1 ⁄ 1,048,576
1 ⁄ 512	1 ⁄ 524,288
1 ⁄ 256	1 ⁄ 262,144
1 ⁄ 128	1 ⁄ 131,072
1 ⁄ 64	1 ⁄ 65,536
1 ⁄ 32	1 ⁄ 32,768
1 ⁄ 16	1 ⁄ 16,384
1 ⁄ 8	1 ⁄ 8,192
1 ⁄ 4	1 ⁄ 4,096
1 ⁄ 2	1 ⁄ 2,048
1	1 ⁄ 1,024
2	1 ⁄ 512
4	1 ⁄ 256
8	1 ⁄ 128
16	1 ⁄ 64
32	1 ⁄ 32
64	1 ⁄ 16
128	1 ⁄ 8
256	1 ⁄ 4
512	1 ⁄ 2
1,024	1
2,048	2
4,096	4
8,192	8
16,384	16
32,768	32

This table shows that values of 1,024 and above have no fractional precision: the step between consecutive values is 1 or more.
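
The step sizes in the table follow from the 10-bit significand: within the range that starts at 2^e, consecutive values are 2^(e-10) apart, and everything below 1 ⁄ 16,384 shares the subnormal step of 1 ⁄ 16,777,216. The sketch below reproduces the table under that assumption. `Fp16Precision` and `step` are hypothetical names, not part of Javet, and the method is only meaningful for finite magnitudes inside the fp16 range.

```java
/** Hypothetical helper (not part of Javet): step size of fp16 at a given magnitude. */
final class Fp16Precision {
    /**
     * Returns the gap between consecutive fp16 values near v.
     * Assumes v is finite and its magnitude is below 65,536.
     */
    static double step(double v) {
        // Unbiased power-of-two exponent of |v|, clamped to the smallest normal exponent (-14)
        int e = Math.max(Math.getExponent(Math.abs(v)), -14);
        // 10 significand bits => the step is 2^(e - 10)
        return Math.scalb(1.0, e - 10);
    }

    public static void main(String[] args) {
        System.out.println(1.0 / step(1.0)); // prints "1024.0", matching the row "1 -> 1/1,024"
        System.out.println(step(3000.0));    // prints "2.0", matching the row "2,048 -> 2"
    }
}
```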

Field Summary

Fields
Modifier and Type	Field and Description
`static short`	`EPSILON` Epsilon is the difference between 1.0 and the next larger value representable as a half-precision float.
`static int`	`EXPONENT_BIAS` The offset of the exponent from the actual value.
`static int`	`EXPONENT_SHIFT` The offset to shift by to obtain the exponent bits.
`static int`	`EXPONENT_SIGNIFICAND_MASK` The bitmask to AND with to obtain exponent and significand bits.
`static short`	`LOWEST_VALUE` Most negative finite value a half-precision float may have.
`static int`	`MAX_EXPONENT` Maximum exponent a finite half-precision float may have.
`static short`	`MAX_VALUE` Maximum positive finite value a half-precision float may have.
`static int`	`MIN_EXPONENT` Minimum exponent a normalized half-precision float may have.
`static short`	`MIN_NORMAL` Smallest positive normal value a half-precision float may have.
`static short`	`MIN_VALUE` Smallest positive non-zero value a half-precision float may have.
`static short`	`NaN` A Not-a-Number representation of a half-precision float.
`static short`	`NEGATIVE_INFINITY` Negative infinity of type half-precision float.
`static short`	`NEGATIVE_ZERO` Negative 0 of type half-precision float.
`static short`	`POSITIVE_INFINITY` Positive infinity of type half-precision float.
`static short`	`POSITIVE_ZERO` Positive 0 of type half-precision float.
`static int`	`SHIFTED_EXPONENT_MASK` The bitmask to AND a number shifted by `EXPONENT_SHIFT` right, to obtain exponent bits.
`static int`	`SIGN_MASK` The bitmask to AND a number with to obtain the sign bit.
`static int`	`SIGN_SHIFT` The offset to shift by to obtain the sign bit.
`static int`	`SIGNIFICAND_MASK` The bitmask to AND a number with to obtain significand bits.
`static int`	`SIZE` The number of bits used to represent a half-precision float value.

Method Summary

Modifier and Type	Method and Description
`static short`	`ceil(short h)` Returns the smallest integral half-precision float value (the one closest to negative infinity) that is greater than or equal to the specified half-precision float value.
`static int`	`compare(short x, short y)` Compares the two specified half-precision float values.
`static boolean`	`equals(short x, short y)` Returns true if the two half-precision float values are equal.
`static short`	`floor(short h)` Returns the largest integral half-precision float value (the one closest to positive infinity) that is less than or equal to the specified half-precision float value.
`static boolean`	`greater(short x, short y)` Returns true if the first half-precision float value is greater (larger toward positive infinity) than the second half-precision float value.
`static boolean`	`greaterEquals(short x, short y)` Returns true if the first half-precision float value is greater (larger toward positive infinity) than or equal to the second half-precision float value.
`static boolean`	`isInfinite(short h)` Returns true if the specified half-precision float value represents infinity, false otherwise.
`static boolean`	`isNaN(short h)` Returns true if the specified half-precision float value represents a Not-a-Number, false otherwise.
`static boolean`	`isNormalized(short h)` Returns true if the specified half-precision float value is normalized (does not have a subnormal representation).
`static boolean`	`less(short x, short y)` Returns true if the first half-precision float value is less (smaller toward negative infinity) than the second half-precision float value.
`static boolean`	`lessEquals(short x, short y)` Returns true if the first half-precision float value is less (smaller toward negative infinity) than or equal to the second half-precision float value.
`static short`	`max(short x, short y)` Returns the larger of two half-precision float values (the value closest to positive infinity).
`static short`	`min(short x, short y)` Returns the smaller of two half-precision float values (the value closest to negative infinity).
`static short`	`rint(short h)` Returns the closest integral half-precision float value to the specified half-precision float value.
`static float`	`toFloat(short h)` Converts the specified half-precision float value into a single-precision float value.
`static short`	`toHalf(float f)` Converts the specified single-precision float value into a half-precision float value.
`static java.lang.String`	`toHexString(short h)` Returns a hexadecimal string representation of the specified half-precision float value.
`static short`	`trunc(short h)` Returns the truncated half-precision float value of the specified half-precision float value.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - SIZE
```
public static final int SIZE
```
    The number of bits used to represent a half-precision float value.
    
    See Also:
    
    Constant Field Values
  - EPSILON
```
public static final short EPSILON
```
    Epsilon is the difference between 1.0 and the next larger value representable as a half-precision float.
    
    See Also:
    
    Constant Field Values
  - MAX_EXPONENT
```
public static final int MAX_EXPONENT
```
    Maximum exponent a finite half-precision float may have.
    
    See Also:
    
    Constant Field Values
  - MIN_EXPONENT
```
public static final int MIN_EXPONENT
```
    Minimum exponent a normalized half-precision float may have.
    
    See Also:
    
    Constant Field Values
  - LOWEST_VALUE
```
public static final short LOWEST_VALUE
```
    Most negative finite value a half-precision float may have.
    
    See Also:
    
    Constant Field Values
  - MAX_VALUE
```
public static final short MAX_VALUE
```
    Maximum positive finite value a half-precision float may have.
    
    See Also:
    
    Constant Field Values
  - MIN_NORMAL
```
public static final short MIN_NORMAL
```
    Smallest positive normal value a half-precision float may have.
    
    See Also:
    
    Constant Field Values
  - MIN_VALUE
```
public static final short MIN_VALUE
```
    Smallest positive non-zero value a half-precision float may have.
    
    See Also:
    
    Constant Field Values
  - NaN
```
public static final short NaN
```
    A Not-a-Number representation of a half-precision float.
    
    See Also:
    
    Constant Field Values
  - NEGATIVE_INFINITY
```
public static final short NEGATIVE_INFINITY
```
    Negative infinity of type half-precision float.
    
    See Also:
    
    Constant Field Values
  - NEGATIVE_ZERO
```
public static final short NEGATIVE_ZERO
```
    Negative 0 of type half-precision float.
    
    See Also:
    
    Constant Field Values
  - POSITIVE_INFINITY
```
public static final short POSITIVE_INFINITY
```
    Positive infinity of type half-precision float.
    
    See Also:
    
    Constant Field Values
  - POSITIVE_ZERO
```
public static final short POSITIVE_ZERO
```
    Positive 0 of type half-precision float.
    
    See Also:
    
    Constant Field Values
  - SIGN_SHIFT
```
public static final int SIGN_SHIFT
```
    The offset to shift by to obtain the sign bit.
    
    See Also:
    
    Constant Field Values
  - EXPONENT_SHIFT
```
public static final int EXPONENT_SHIFT
```
    The offset to shift by to obtain the exponent bits.
    
    See Also:
    
    Constant Field Values
  - SIGN_MASK
```
public static final int SIGN_MASK
```
    The bitmask to AND a number with to obtain the sign bit.
    
    See Also:
    
    Constant Field Values
  - SHIFTED_EXPONENT_MASK
```
public static final int SHIFTED_EXPONENT_MASK
```
    The bitmask to AND a number shifted by EXPONENT_SHIFT right, to obtain exponent bits.
    
    See Also:
    
    Constant Field Values
  - SIGNIFICAND_MASK
```
public static final int SIGNIFICAND_MASK
```
    The bitmask to AND a number with to obtain significand bits.
    
    See Also:
    
    Constant Field Values
  - EXPONENT_SIGNIFICAND_MASK
```
public static final int EXPONENT_SIGNIFICAND_MASK
```
    The bitmask to AND with to obtain exponent and significand bits.
    
    See Also:
    
    Constant Field Values
  - EXPONENT_BIAS
```
public static final int EXPONENT_BIAS
```
    The offset of the exponent from the actual value.
    
    See Also:
    
    Constant Field Values
- Method Detail
  - compare
```
public static int compare(short x,
                          short y)
```
    Compares the two specified half-precision float values. The following conditions apply during the comparison:
    - NaN is considered by this method to be equal to itself and greater than all other half-precision float values (including POSITIVE_INFINITY).
    - POSITIVE_ZERO is considered by this method to be greater than NEGATIVE_ZERO.
    Parameters:
    
    x - The first half-precision float value to compare.
    
    y - The second half-precision float value to compare
    
    Returns:
    
    The value 0 if x is numerically equal to y, a value less than 0 if x is numerically less than y, and a value greater than 0 if x is numerically greater than y
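
    The ordering rules above (NaN equal to itself and above everything else, POSITIVE_ZERO above NEGATIVE_ZERO) can be illustrated directly on the bit patterns. This is a sketch of one way to obtain that ordering, not Javet's implementation: the class `Fp16Compare` and the helper `key` are hypothetical, and returning exactly -1, 0 or 1 is a choice made here.

```java
/** Hypothetical helper (not part of Javet): total ordering over binary16 bit patterns. */
final class Fp16Compare {
    private static boolean isNaN(short h) {
        return (h & 0x7FFF) > 0x7C00; // exponent all ones, significand non-zero
    }

    /** Maps a non-NaN pattern to an int that sorts numerically, with -0 just below +0. */
    private static int key(short h) {
        int magnitude = h & 0x7FFF;
        // Negative values: larger magnitude sorts lower; -0 maps to -1, +0 maps to 0
        return (h & 0x8000) != 0 ? ~magnitude : magnitude;
    }

    static int compare(short x, short y) {
        boolean xNaN = isNaN(x);
        boolean yNaN = isNaN(y);
        if (xNaN || yNaN) {
            // NaN equals NaN and is greater than any non-NaN value
            return Boolean.compare(xNaN, yNaN);
        }
        return Integer.compare(key(x), key(y));
    }

    public static void main(String[] args) {
        System.out.println(compare((short) 0x0000, (short) 0x8000)); // +0 vs -0, prints "1"
        System.out.println(compare((short) 0x7E00, (short) 0x7C00)); // NaN vs +Infinity, prints "1"
    }
}
```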
  - rint
```
public static short rint(short h)
```
    Returns the closest integral half-precision float value to the specified half-precision float value. Special values are handled in the following ways:
    - If the specified half-precision float is NaN, the result is NaN
    - If the specified half-precision float is infinity (negative or positive), the result is infinity (with the same sign)
    - If the specified half-precision float is zero (negative or positive), the result is zero (with the same sign)
    Parameters:
    
    h - A half-precision float value
    
    Returns:
    
    The value of the specified half-precision float rounded to the nearest half-precision float value
  - ceil
```
public static short ceil(short h)
```
    Returns the smallest integral half-precision float value (the one closest to negative infinity) that is greater than or equal to the specified half-precision float value. Special values are handled in the following ways:
    - If the specified half-precision float is NaN, the result is NaN
    - If the specified half-precision float is infinity (negative or positive), the result is infinity (with the same sign)
    - If the specified half-precision float is zero (negative or positive), the result is zero (with the same sign)
    Parameters:
    
    h - A half-precision float value
    
    Returns:
    
    The smallest integral half-precision float value that is greater than or equal to the specified half-precision float value
  - floor
```
public static short floor(short h)
```
    Returns the largest integral half-precision float value (the one closest to positive infinity) that is less than or equal to the specified half-precision float value. Special values are handled in the following ways:
    - If the specified half-precision float is NaN, the result is NaN
    - If the specified half-precision float is infinity (negative or positive), the result is infinity (with the same sign)
    - If the specified half-precision float is zero (negative or positive), the result is zero (with the same sign)
    Parameters:
    
    h - A half-precision float value
    
    Returns:
    
    The largest integral half-precision float value that is less than or equal to the specified half-precision float value
  - trunc
```
public static short trunc(short h)
```
    Returns the truncated half-precision float value of the specified half-precision float value. Special values are handled in the following ways:
    - If the specified half-precision float is NaN, the result is NaN
    - If the specified half-precision float is infinity (negative or positive), the result is infinity (with the same sign)
    - If the specified half-precision float is zero (negative or positive), the result is zero (with the same sign)
    Parameters:
    
    h - A half-precision float value
    
    Returns:
    
    The truncated half-precision float value of the specified half-precision float value
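
    Truncation can be expressed as clearing the fractional bits of the significand, which also yields the special-case behaviour listed above. The sketch below shows the idea on raw bit patterns. It is an illustration under the binary16 layout, not Javet's code, and `Fp16Trunc` is a hypothetical name.

```java
/** Hypothetical helper (not part of Javet): truncation toward zero on binary16 bit patterns. */
final class Fp16Trunc {
    static short trunc(short h) {
        int bits = h & 0xFFFF;
        int abs = bits & 0x7FFF;
        if (abs < 0x3C00) {
            // Magnitude below 1.0 (including zeros and subnormals): keep only the sign
            return (short) (bits & 0x8000);
        }
        if (abs < 0x6400) {
            // Magnitude in [1, 1024): 25 - biasedExponent low bits hold the fraction
            int fractionBits = 25 - (abs >> 10);
            bits &= ~((1 << fractionBits) - 1);
        }
        // Magnitude of 1024 or more, infinity and NaN pass through unchanged
        return (short) bits;
    }

    public static void main(String[] args) {
        // 0xC180 is -2.75; the result 0xC000 is -2.0
        System.out.println(Integer.toHexString(trunc((short) 0xC180) & 0xFFFF)); // prints "c000"
    }
}
```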
  - min
```
public static short min(short x,
                        short y)
```
    Returns the smaller of two half-precision float values (the value closest to negative infinity). Special values are handled in the following ways:
    - If either value is NaN, the result is NaN
    - NEGATIVE_ZERO is smaller than POSITIVE_ZERO
    Parameters:
    
    x - The first half-precision value
    
    y - The second half-precision value
    
    Returns:
    
    The smaller of the two specified half-precision values
  - max
```
public static short max(short x,
                        short y)
```
    Returns the larger of two half-precision float values (the value closest to positive infinity). Special values are handled in the following ways:
    - If either value is NaN, the result is NaN
    - POSITIVE_ZERO is greater than NEGATIVE_ZERO
    Parameters:
    
    x - The first half-precision value
    
    y - The second half-precision value
    
    Returns:
    
    The larger of the two specified half-precision values
  - less
```
public static boolean less(short x,
                           short y)
```
    Returns true if the first half-precision float value is less (smaller toward negative infinity) than the second half-precision float value. If either of the values is NaN, the result is false.
    
    Parameters:
    
    x - The first half-precision value
    
    y - The second half-precision value
    
    Returns:
    
    True if x is less than y, false otherwise
  - lessEquals
```
public static boolean lessEquals(short x,
                                 short y)
```
    Returns true if the first half-precision float value is less (smaller toward negative infinity) than or equal to the second half-precision float value. If either of the values is NaN, the result is false.
    
    Parameters:
    
    x - The first half-precision value
    
    y - The second half-precision value
    
    Returns:
    
    True if x is less than or equal to y, false otherwise
  - greater
```
public static boolean greater(short x,
                              short y)
```
    Returns true if the first half-precision float value is greater (larger toward positive infinity) than the second half-precision float value. If either of the values is NaN, the result is false.
    
    Parameters:
    
    x - The first half-precision value
    
    y - The second half-precision value
    
    Returns:
    
    True if x is greater than y, false otherwise
  - greaterEquals
```
public static boolean greaterEquals(short x,
                                    short y)
```
    Returns true if the first half-precision float value is greater (larger toward positive infinity) than or equal to the second half-precision float value. If either of the values is NaN, the result is false.
    
    Parameters:
    
    x - The first half-precision value
    
    y - The second half-precision value
    
    Returns:
    
    True if x is greater than or equal to y, false otherwise
  - equals
```
public static boolean equals(short x,
                             short y)
```
    Returns true if the two half-precision float values are equal. If either of the values is NaN, the result is false. POSITIVE_ZERO and NEGATIVE_ZERO are considered equal.
    
    Parameters:
    
    x - The first half-precision value
    
    y - The second half-precision value
    
    Returns:
    
    True if x is equal to y, false otherwise
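
    The relational methods differ from compare in two ways described above: any NaN operand makes the result false, and the two zeros are equal. The sketch below illustrates those rules on bit patterns. `Fp16Relational` and `order` are hypothetical names, and the code is an illustration rather than Javet's implementation.

```java
/** Hypothetical helper (not part of Javet): IEEE-style relational tests on binary16 patterns. */
final class Fp16Relational {
    private static boolean isNaN(short h) {
        return (h & 0x7FFF) > 0x7C00;
    }

    /** Signed magnitude as an int; +0 and -0 both map to 0. */
    private static int order(short h) {
        int magnitude = h & 0x7FFF;
        return (h & 0x8000) != 0 ? -magnitude : magnitude;
    }

    static boolean less(short x, short y) {
        return !isNaN(x) && !isNaN(y) && order(x) < order(y);
    }

    static boolean lessEquals(short x, short y) {
        return !isNaN(x) && !isNaN(y) && order(x) <= order(y);
    }

    static boolean equals(short x, short y) {
        return !isNaN(x) && !isNaN(y) && order(x) == order(y);
    }

    public static void main(String[] args) {
        short nan = (short) 0x7E00;
        System.out.println(equals(nan, nan));                       // prints "false"
        System.out.println(equals((short) 0x0000, (short) 0x8000)); // prints "true"
    }
}
```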
  - isInfinite
```
public static boolean isInfinite(short h)
```
    Returns true if the specified half-precision float value represents infinity, false otherwise.
    
    Parameters:
    
    h - A half-precision float value
    
    Returns:
    
    True if the value is positive infinity or negative infinity, false otherwise
  - isNaN
```
public static boolean isNaN(short h)
```
    Returns true if the specified half-precision float value represents a Not-a-Number, false otherwise.
    
    Parameters:
    
    h - A half-precision float value
    
    Returns:
    
    True if the value is a NaN, false otherwise
  - isNormalized
```
public static boolean isNormalized(short h)
```
    Returns true if the specified half-precision float value is normalized (does not have a subnormal representation). If the specified value is POSITIVE_INFINITY, NEGATIVE_INFINITY, POSITIVE_ZERO, NEGATIVE_ZERO, NaN or any subnormal number, this method returns false.
    
    Parameters:
    
    h - A half-precision float value
    
    Returns:
    
    True if the value is normalized, false otherwise
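
    The three classification methods depend only on the exponent and significand fields. A minimal sketch of those tests, assuming the binary16 layout and using the hypothetical class name `Fp16Classify` (this is not Javet's code):

```java
/** Hypothetical helper (not part of Javet): classification of binary16 bit patterns. */
final class Fp16Classify {
    /** Exponent all ones and a non-zero significand. */
    static boolean isNaN(short h) {
        return (h & 0x7FFF) > 0x7C00;
    }

    /** Exponent all ones and a zero significand, either sign. */
    static boolean isInfinite(short h) {
        return (h & 0x7FFF) == 0x7C00;
    }

    /** Exponent neither all zeros (zero, subnormal) nor all ones (infinity, NaN). */
    static boolean isNormalized(short h) {
        int exponent = h & 0x7C00;
        return exponent != 0 && exponent != 0x7C00;
    }

    public static void main(String[] args) {
        System.out.println(isNormalized((short) 0x0400)); // smallest normal, prints "true"
        System.out.println(isNormalized((short) 0x03FF)); // largest subnormal, prints "false"
    }
}
```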
  - toFloat
```
public static float toFloat(short h)
```
    Converts the specified half-precision float value into a single-precision float value. The following special cases are handled:
    - If the input is NaN, the returned value is Float.NaN
    - If the input is POSITIVE_INFINITY or NEGATIVE_INFINITY, the returned value is respectively Float.POSITIVE_INFINITY or Float.NEGATIVE_INFINITY
    - If the input is 0 (positive or negative), the returned value is +/-0.0f
    - Otherwise, the returned value is a normalized single-precision float value
    Parameters:
    
    h - The half-precision float value to convert to single-precision
    
    Returns:
    
    A normalized single-precision float value
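
    The special cases above map onto the three exponent classes of the format. The following sketch decodes a bit pattern arithmetically so that each case is visible. It is an illustration under the binary16 layout, with the hypothetical class name `Fp16Decode`, and is not Javet's implementation.

```java
/** Hypothetical helper (not part of Javet): decodes a binary16 bit pattern to a float. */
final class Fp16Decode {
    static float toFloat(short h) {
        int bits = h & 0xFFFF;
        int exponent = (bits >>> 10) & 0x1F;
        int significand = bits & 0x3FF;

        float magnitude;
        if (exponent == 0x1F) {
            // All-ones exponent: infinity when the significand is zero, NaN otherwise
            magnitude = (significand == 0) ? Float.POSITIVE_INFINITY : Float.NaN;
        } else if (exponent == 0) {
            // Zero and subnormals: significand * 2^-24, no implicit leading 1
            magnitude = significand * 0x1p-24f;
        } else {
            // Normal values: (1 + significand / 1024) * 2^(exponent - 15)
            magnitude = (1024 + significand) * Math.scalb(1.0f, exponent - 25);
        }
        // Negating 0.0f yields -0.0f, so the sign of zero is preserved
        return (bits & 0x8000) != 0 ? -magnitude : magnitude;
    }

    public static void main(String[] args) {
        System.out.println(toFloat((short) 0x3C00)); // prints "1.0"
        System.out.println(toFloat((short) 0x7BFF)); // largest finite value, prints "65504.0"
    }
}
```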
  - toHalf
```
public static short toHalf(float f)
```
    Converts the specified single-precision float value into a half-precision float value. The following special cases are handled:
    - If the input is NaN (see Float.isNaN(float)), the returned value is NaN
    - If the input is Float.POSITIVE_INFINITY or Float.NEGATIVE_INFINITY, the returned value is respectively POSITIVE_INFINITY or NEGATIVE_INFINITY
    - If the input is 0 (positive or negative), the returned value is POSITIVE_ZERO or NEGATIVE_ZERO
    - If the input is less than MIN_VALUE, the returned value is flushed to POSITIVE_ZERO or NEGATIVE_ZERO
    - If the input is less than MIN_NORMAL, the returned value is a subnormal half-precision float
    - Otherwise, the returned value is rounded to the nearest representable half-precision float value
    Parameters:
    
    f - The single-precision float value to convert to half-precision
    
    Returns:
    
    A half-precision float value
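
    The cases above can be sketched by counting how many fp16 steps fit into the magnitude of the input. The code below is an illustration, not Javet's implementation. `Fp16Encode` is a hypothetical name, and rounding ties to the even neighbour is an assumption made here, since the description only says "rounded to the nearest". Inputs that round beyond the largest finite value become infinity.

```java
/** Hypothetical helper (not part of Javet): encodes a float as a binary16 bit pattern. */
final class Fp16Encode {
    static short toHalf(float f) {
        if (Float.isNaN(f)) {
            return (short) 0x7E00; // a quiet NaN pattern
        }
        int sign = (Float.floatToRawIntBits(f) >>> 16) & 0x8000;
        float abs = Math.abs(f);
        if (abs >= 65520.0f) {
            // Infinity, or a finite value that rounds past the largest fp16 value 65504
            return (short) (sign | 0x7C00);
        }
        // Power-of-two exponent, clamped so that small inputs use the subnormal step 2^-24
        int e = Math.max(Math.getExponent(abs), -14);
        // Number of steps of size 2^(e - 10); Math.rint rounds ties to even (assumption)
        int steps = (int) Math.rint(Math.scalb((double) abs, 10 - e));
        // Normal values give steps in [1024, 2048], subnormals [0, 1024];
        // adding lets a rounding carry move into the exponent field
        int bits = ((e + 14) << 10) + steps;
        return (short) (sign | bits);
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(toHalf(1.0f) & 0xFFFF));  // prints "3c00"
        System.out.println(Integer.toHexString(toHalf(1e-10f) & 0xFFFF)); // flushed to zero, prints "0"
    }
}
```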
  - toHexString
```
public static java.lang.String toHexString(short h)
```
    Returns a hexadecimal string representation of the specified half-precision float value. If the value is a NaN, the result is "NaN", otherwise the result follows this format:
    - If the sign is positive, no sign character appears in the result
    - If the sign is negative, the first character is '-'
    - If the value is infinity, the string is "Infinity"
    - If the value is 0, the string is "0x0.0p0"
    - If the value has a normalized representation, the exponent and significand are represented in the string in two fields. The significand starts with "0x1." followed by its lowercase hexadecimal representation. Trailing zeroes are removed unless all digits are 0, then a single zero is used. The significand representation is followed by the exponent, represented by "p", itself followed by a decimal string of the unbiased exponent
    - If the value has a subnormal representation, the significand starts with "0x0." followed by its lowercase hexadecimal representation. Trailing zeroes are removed unless all digits are 0, then a single zero is used. The significand representation is followed by the exponent, represented by "p-14"
    Parameters:
    
    h - A half-precision float value
    
    Returns:
    
    A hexadecimal string representation of the specified value
