The following numeric types are supported:
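| Type | Description |
|---|---|
| `long` | A signed 64-bit integer with a minimum value of -2^63 and a maximum value of 2^63-1. |
| `integer` | A signed 32-bit integer with a minimum value of -2^31 and a maximum value of 2^31-1. |
| `short` | A signed 16-bit integer with a minimum value of -32,768 and a maximum value of 32,767. |
| `byte` | A signed 8-bit integer with a minimum value of -128 and a maximum value of 127. |
| `double` | A double-precision 64-bit IEEE 754 floating point number, restricted to finite values. |
| `float` | A single-precision 32-bit IEEE 754 floating point number, restricted to finite values. |
| `half_float` | A half-precision 16-bit IEEE 754 floating point number, restricted to finite values. |
| `scaled_float` | A floating point number that is backed by a `long`, scaled by a fixed `double` scaling factor. |
| `unsigned_long` | An unsigned 64-bit integer with a minimum value of 0 and a maximum value of 2^64-1. |
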
Below is an example of configuring a mapping with numeric fields:
```ruby
response = client.indices.create(
  index: 'my-index-000001',
  body: {
    mappings: {
      properties: {
        number_of_bytes: {
          type: 'integer'
        },
        time_in_seconds: {
          type: 'float'
        },
        price: {
          type: 'scaled_float',
          scaling_factor: 100
        }
      }
    }
  }
)
puts response
```
```go
res, err := es.Indices.Create(
	"my-index-000001",
	es.Indices.Create.WithBody(strings.NewReader(`{
	  "mappings": {
	    "properties": {
	      "number_of_bytes": {
	        "type": "integer"
	      },
	      "time_in_seconds": {
	        "type": "float"
	      },
	      "price": {
	        "type": "scaled_float",
	        "scaling_factor": 100
	      }
	    }
	  }
	}`)),
)
fmt.Println(res, err)
```
```console
PUT my-index-000001
{
  "mappings": {
    "properties": {
      "number_of_bytes": {
        "type": "integer"
      },
      "time_in_seconds": {
        "type": "float"
      },
      "price": {
        "type": "scaled_float",
        "scaling_factor": 100
      }
    }
  }
}
```

The `double`, `float` and `half_float` types treat `-0.0` and `+0.0` as
different values. As a consequence, a `term` query on `-0.0` will not match
`+0.0`, and vice versa. The same is true for `range` queries: if the upper
bound is `-0.0` then `+0.0` will not match, and if the lower bound is `+0.0`
then `-0.0` will not match.
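
The distinction comes from the IEEE 754 encoding itself, where the two zeros differ only in the sign bit. The following is a minimal Python sketch of that encoding, not of Elasticsearch internals; the helper name `double_bits` is hypothetical.

```python
import math
import struct


def double_bits(value: float) -> str:
    """Return the 64-bit IEEE 754 pattern of a float as a hex string."""
    return struct.pack(">d", value).hex()


neg_zero, pos_zero = -0.0, 0.0

# Ordinary floating-point comparison treats the two zeros as equal...
print(neg_zero == pos_zero)          # True
# ...but their encodings differ in the sign bit.
print(double_bits(neg_zero))         # 8000000000000000
print(double_bits(pos_zero))         # 0000000000000000
# The sign is still observable through copysign.
print(math.copysign(1.0, neg_zero))  # -1.0
```

A system that compares encoded values rather than using floating-point equality will therefore see two distinct values, which is consistent with the query behavior described above.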
For integer types (`byte`, `short`, `integer` and `long`), pick the smallest
type that is sufficient for your use case. This makes indexing and searching
more efficient. Note, however, that storage is optimized based on the actual
values that are stored, so picking one type over another has no impact on
storage requirements.
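
One way to apply this guidance is to compare the expected value range against the bounds implied by each type's bit width. The sketch below assumes you already know the minimum and maximum values your data can take; `INTEGER_TYPES` and `smallest_integer_type` are hypothetical names used only for illustration.

```python
# Inclusive bounds for each signed integer type, smallest first.
INTEGER_TYPES = [
    ("byte", -2**7, 2**7 - 1),
    ("short", -2**15, 2**15 - 1),
    ("integer", -2**31, 2**31 - 1),
    ("long", -2**63, 2**63 - 1),
]


def smallest_integer_type(min_value: int, max_value: int) -> str:
    """Return the first (smallest) type whose range covers the given bounds."""
    for name, low, high in INTEGER_TYPES:
        if low <= min_value and max_value <= high:
            return name
    raise ValueError("range does not fit in a signed 64-bit integer")


print(smallest_integer_type(0, 100))      # byte
print(smallest_integer_type(0, 86_400))   # integer
print(smallest_integer_type(-1, 2**40))   # long
```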
For floating-point types, it is often more efficient to store the data as an
integer using a scaling factor, which is what the `scaled_float` type does
under the hood. For instance, a `price` field could be stored in a
`scaled_float` with a `scaling_factor` of `100`. All APIs would work as if
the field was stored as a double, but under the hood Elasticsearch would be
working with the number of cents, `price*100`, which is an integer. This is
mostly helpful to save disk space, since integers are much easier to compress
than floating-point numbers. `scaled_float` is also a reasonable way to trade
accuracy for disk space. For instance, imagine that you are tracking CPU
utilization as a number between `0` and `1`. It usually does not matter much
whether CPU utilization is `12.7%` or `13%`, so you could use a `scaled_float`
with a `scaling_factor` of `100` to round CPU utilization to the closest
percent and save space.
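
The arithmetic described above can be sketched in a few lines of Python. This is an illustration of the idea only, not Elasticsearch's implementation: the function names are hypothetical, and Python's `round()` resolves exact ties to the nearest even integer, which may differ from the rounding Elasticsearch applies.

```python
def encode_scaled(value: float, scaling_factor: float) -> int:
    """Multiply by the scaling factor and round to the closest integer."""
    # Assumption: Python's round() stands in for the index-time rounding.
    return round(value * scaling_factor)


def decode_scaled(stored: int, scaling_factor: float) -> float:
    """Recover the (possibly rounded) floating-point value."""
    return stored / scaling_factor


# A price of 12.34 with a scaling factor of 100 is stored as 1234 cents.
print(encode_scaled(12.34, 100))                       # 1234

# CPU utilization of 12.7% (0.127) is rounded to the closest percent.
print(decode_scaled(encode_scaled(0.127, 100), 100))   # 0.13
```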
If `scaled_float` is not a good fit, pick the smallest type that is sufficient
for the use case among the floating-point types: `double`, `float` and
`half_float`. The following table compares these types to help you decide.
| Type | Minimum value | Maximum value | Significant bits / digits |
|---|---|---|---|
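| `double` | 2^-1074 | (2 - 2^-52) · 2^1023 | 53 / 15.95 |
| `float` | 2^-149 | (2 - 2^-23) · 2^127 | 24 / 7.22 |
| `half_float` | 2^-24 | 65504 | 11 / 3.31 |
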
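To get a feel for the precision each floating-point type keeps, you can round-trip a value through the corresponding IEEE 754 encoding. The following Python sketch uses the standard `struct` module (`f` for single precision, `e` for half precision); the helper names are hypothetical and the sketch says nothing about how Elasticsearch stores these values on disk.

```python
import struct


def to_single(value: float) -> float:
    """Round-trip a value through a 32-bit IEEE 754 float."""
    return struct.unpack(">f", struct.pack(">f", value))[0]


def to_half(value: float) -> float:
    """Round-trip a value through a 16-bit IEEE 754 float."""
    return struct.unpack(">e", struct.pack(">e", value))[0]


print(to_single(0.1))    # 0.10000000149011612
print(to_half(0.1))      # 0.0999755859375
# The largest finite half-precision value survives the round trip unchanged.
print(to_half(65504.0))  # 65504.0
```
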
Mapping numeric identifiers
Not all numeric data should be mapped as a numeric field data type.
Elasticsearch optimizes numeric fields, such as `integer` or `long`, for
`range` queries. However, `keyword` fields are better for `term` and other
term-level queries.
Identifiers, such as an ISBN or a product ID, are rarely used in range
queries. However, they are often retrieved using term-level queries.
Consider mapping a numeric identifier as a keyword if:
- You don’t plan to search for the identifier data using `range` queries.
- Fast retrieval is important. `term` query searches on `keyword` fields are often faster than `term` searches on numeric fields.

If you’re unsure which to use, you can use a multi-field to map
the data as both a keyword and a numeric data type.
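
As an illustration, a multi-field mapping for an identifier might look like the request body built below. This is a sketch under assumptions: the field name `product_id` and the sub-field name `numeric` are hypothetical, and the body is only printed as JSON rather than sent to a cluster.

```python
import json

# Hypothetical mapping: the identifier is a keyword for term-level queries,
# with a numeric sub-field (product_id.numeric) available for range queries.
mapping = {
    "mappings": {
        "properties": {
            "product_id": {
                "type": "keyword",
                "fields": {
                    "numeric": {"type": "long"},
                },
            }
        }
    }
}

print(json.dumps(mapping, indent=2))
```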
The following parameters are accepted by numeric types:
- `coerce`: Try to convert strings to numbers and truncate fractions for integers. Accepts `true` (default) and `false`. Not applicable for `unsigned_long`. Note that this cannot be set if the `script` parameter is used.
- `doc_values`: Should the field be stored on disk in a column-stride fashion, so that it can later be used for sorting, aggregations, or scripting? Accepts `true` (default) or `false`.
- `ignore_malformed`: If `true`, malformed numbers are ignored. If `false` (default), malformed numbers throw an exception and reject the whole document. Note that this cannot be set if the `script` parameter is used.
- `index`: Should the field be quickly searchable? Accepts `true` (default) and `false`. Numeric fields that only have `doc_values` enabled can also be queried, albeit slower.
- `meta`: Metadata about the field.
- `null_value`: Accepts a numeric value of the same `type` as the field which is substituted for any explicit `null` values. Defaults to `null`, which means the field is treated as missing. Note that this cannot be set if the `script` parameter is used.
- `on_script_error`: Defines what to do if the script defined by the `script` parameter throws an error at indexing time. Accepts `fail` (default), which will cause the entire document to be rejected, and `continue`, which will register the field in the document’s `_ignored` metadata field and continue indexing. This parameter can only be set if the `script` field is also set.
- `script`: If this parameter is set, then the field will index values generated by this script, rather than reading the values directly from the source. If a value is set for this field on the input document, then the document will be rejected with an error. Scripts are in the same format as their runtime equivalent. Scripts can only be configured on `long` and `double` field types.
- `store`: Whether the field value should be stored and retrievable separately from the `_source` field. Accepts `true` or `false` (default).
- `time_series_dimension`: [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will apply best effort to fix any issues, but features in technical preview are not subject to the support SLA of official GA features. (Optional, Boolean) For internal use by Elastic only.

  Marks the field as a time series dimension. Defaults to `false`.

  The `index.mapping.dimension_fields.limit` index setting limits the number of dimensions in an index.

  Dimension fields have the following constraints:

  - The `doc_values` and `index` mapping parameters must be `true`.
  - Field values cannot be an array or multi-value.

  Of the numeric field types, only `byte`, `short`, `integer`, `long`, and `unsigned_long` fields support this parameter.

  A numeric field can’t be both a time series dimension and a time series metric.
- `time_series_metric`: [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will apply best effort to fix any issues, but features in technical preview are not subject to the support SLA of official GA features. (Optional, string) For internal use by Elastic only.

  Marks the field as a time series metric. The value is the metric type. Defaults to `null` (Not a time series metric).

  For numeric fields, this parameter accepts `gauge` and `counter`. You can’t update this parameter for existing fields.

  For a numeric time series metric, the `doc_values` parameter must be `true`. A numeric field can’t be both a time series dimension and a time series metric.

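
To show how several of these parameters combine, the sketch below builds a mapping body that sets a few of them on the example fields used earlier. The particular parameter values are assumptions chosen for illustration, and the body is only printed as JSON rather than sent to a cluster.

```python
import json

# Illustrative mapping body; the chosen parameter values are assumptions.
mapping = {
    "mappings": {
        "properties": {
            # Reject strings such as "10" instead of converting them,
            # but skip malformed values rather than rejecting the document.
            "number_of_bytes": {
                "type": "integer",
                "coerce": False,
                "ignore_malformed": True,
            },
            # Substitute 0.0 for explicit nulls.
            "time_in_seconds": {
                "type": "float",
                "null_value": 0.0,
            },
            # Not indexed for fast search; still usable for sorting and
            # aggregations through doc_values.
            "price": {
                "type": "scaled_float",
                "scaling_factor": 100,
                "index": False,
                "doc_values": True,
            },
        }
    }
}

print(json.dumps(mapping, indent=2))
```
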
scaled_float accepts an additional parameter:
- `scaling_factor`: The scaling factor to use when encoding values. Values will be multiplied by this factor at index time and rounded to the closest `long` value. For instance, a