InfluxDB is a fast time-series database distributed under an open source license, with commercial support available. It supports timestamp precision down to the nanosecond.
Design Goals of InfluxDB
The original design goals of InfluxDB include:[1]
- Simple to install and manage
- No external dependencies such as ZooKeeper or Hadoop
- HTTP(S) interface for reading and writing data
- Horizontally scalable
- On disk and in memory
- Most data is cold
- Compute percentiles and other functions on the fly
- Downsample data on different windows of time
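The last two goals, computing functions on the fly and downsampling over time windows, can be illustrated outside InfluxDB. The sketch below is plain Python rather than InfluxDB code; the `downsample` helper and the readings are hypothetical, chosen only to show what averaging over fixed windows means.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, window_seconds):
    """Average (timestamp, value) pairs over fixed-size time windows.

    points: iterable of (epoch_seconds, value) tuples.
    Returns a list of (window_start, mean_value) tuples sorted by time.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of the window it falls in.
        window_start = ts - (ts % window_seconds)
        buckets[window_start].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Hypothetical readings taken every 6 minutes (360 s), starting at t = 0.
readings = [(0, 12), (360, 11), (720, 10), (1080, 15)]

# Downsample the readings to 12-minute (720 s) windows.
result = downsample(readings, 720)
print(result)
```

Swapping `mean` for another aggregate (a maximum or a percentile, for example) changes the function applied to each window without changing the windowing itself.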
Time Series Database
A Time Series Database (TSDB) is a database optimized for time series data. Time series data are simply measurements or events that are things you want to ask questions about, visualize, or summarize over time.
To illustrate the concepts of InfluxDB, we use the sample data below (Table 1). It contains a measurement named census, which records the number of butterflies and honeybees counted by two scientists (langstroth and perpetua) in two locations (location 1 and location 2) between midnight and 6:12 AM on August 18, 2015.
Table 1. Sample Data (name: census)
time | location | scientist | butterflies | honeybees |
2015-08-18T00:00:00Z | 1 | langstroth | 12 | 23 |
2015-08-18T00:00:00Z | 1 | perpetua | 1 | 30 |
2015-08-18T00:06:00Z | 1 | langstroth | 11 | 28 |
2015-08-18T00:06:00Z | 1 | perpetua | 3 | 28 |
2015-08-18T05:54:00Z | 2 | langstroth | 2 | 11 |
2015-08-18T06:00:00Z | 2 | langstroth | 1 | 10 |
2015-08-18T06:06:00Z | 2 | perpetua | 8 | 23 |
2015-08-18T06:12:00Z | 2 | perpetua | 7 | 22 |
Influx Client
influx is InfluxDB’s command line interface (CLI) that you can use to interact with an InfluxDB server. For example, you can write data (manually or from a file), query data interactively, and view query output in different formats.
Assuming it is installed on your system, you can type "influx" to launch the CLI as shown below:
$ influx
Connected to http://localhost:8086 version 1.5.0
InfluxDB shell version: 1.5.0
> help
Usage:
        connect             connects to another node specified by host:port
        auth                prompts for username and password
        pretty              toggles pretty print for the json format
        chunked             turns on chunked responses from server
        chunk size          sets the size of the chunked responses. Set to 0 to reset to the default chunked size
        use                 sets current database
        format              specifies the format of the server responses: json, csv, or column
        precision           specifies the format of the timestamp: rfc3339, h, m, s, ms, u or ns
        consistency         sets write consistency level: any, one, quorum, or all
        history             displays command history
        settings            outputs the current settings for the shell
        clear               clears settings such as database or retention policy. run 'clear' for help
        exit/quit/ctrl+d    quits the influx shell

        show databases      show database names
        show series         show series information
        show measurements   show measurement information
        show tag keys       show tag key information
        show field keys     show field key information
A full list of influxql commands can be found at:
As highlighted above, the following items are the key concepts in InfluxDB:
- Series
  - A collection of data that share a retention policy, measurement, and tag set
- Measurements
  - Act as a container for tags, fields, and the time column
  - The measurement name describes the data stored in the associated fields
- Tags
  - Are made up of tag keys and tag values
  - Both tag keys and tag values are stored as strings and record metadata
  - Tags are indexed
- Tag Set
  - The different combinations of all the tag key-value pairs
- Fields
  - Fields are NOT indexed
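To make these concepts concrete, the sketch below models a single point and renders it in InfluxDB's line protocol, the text format used for writes (`measurement,tag_set field_set timestamp`). Line protocol is not discussed in this post; the `Point` class and both helper functions are hypothetical names introduced for illustration, and escaping of spaces and commas in keys and values is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Point:
    measurement: str  # container name, e.g. "census"
    tags: dict        # indexed metadata; keys and values are strings
    fields: dict      # the measured values; not indexed
    time_ns: int      # timestamp as nanoseconds since the Unix epoch


def format_field(value):
    """Render a field value with line protocol's type markers."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers carry an "i" suffix
    if isinstance(value, float):
        return repr(value)
    return '"' + str(value).replace('"', '\\"') + '"'  # strings are double-quoted


def to_line_protocol(point):
    """Render a Point as one line (simplified: no escaping of keys or tag values)."""
    tags = ",".join(f"{k}={v}" for k, v in sorted(point.tags.items()))
    fields = ",".join(f"{k}={format_field(v)}" for k, v in point.fields.items())
    head = point.measurement + ("," + tags if tags else "")
    return f"{head} {fields} {point.time_ns}"


# First row of Table 1; 2015-08-18T00:00:00Z is 1439856000 s after the epoch.
first = Point(
    measurement="census",
    tags={"location": "1", "scientist": "langstroth"},
    fields={"butterflies": 12, "honeybees": 23},
    time_ns=1439856000 * 1_000_000_000,
)
line = to_line_protocol(first)
print(line)
```

Here the tags end up in the comma-separated section attached to the measurement name, while the fields follow after a space, which mirrors the split between indexed metadata and unindexed values.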
How Data is Organized in Influx
In InfluxDB, data are organized as:
- Databases (like in MySQL, Postgres, etc.)
  - A logical container for users, retention policies, continuous queries, and time series data
- Time series
  - Kind of like tables
  - Primary key is always time
  - Null values are not stored
  - A time series is composed of points or events
- Points or events
  - Kind of like rows
Using the sample data (Table 1) as an example:
- Fields are
  - butterflies, honeybees
- Tags are
  - location, scientist
- Tag Sets are
  - location = 1, scientist = langstroth
  - location = 2, scientist = langstroth
  - location = 1, scientist = perpetua
  - location = 2, scientist = perpetua
- Measurement is
  - census
- Series are
  - See Table 2
Table 2. Time Series
Arbitrary series number | Retention policy | Measurement | Tag set |
series 1 | autogen | census | location = 1,scientist = langstroth |
series 2 | autogen | census | location = 2,scientist = langstroth |
series 3 | autogen | census | location = 1,scientist = perpetua |
series 4 | autogen | census | location = 2,scientist = perpetua |
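The series in Table 2 can be derived mechanically from Table 1: every distinct combination of retention policy, measurement, and tag set is one series. The following is a minimal Python sketch of that derivation, not InfluxDB code; the `series_keys` helper is a hypothetical name, and the retention policy is assumed to be autogen for every row, as in Table 2.

```python
# Rows of Table 1 as (time, location, scientist, butterflies, honeybees).
rows = [
    ("2015-08-18T00:00:00Z", "1", "langstroth", 12, 23),
    ("2015-08-18T00:00:00Z", "1", "perpetua", 1, 30),
    ("2015-08-18T00:06:00Z", "1", "langstroth", 11, 28),
    ("2015-08-18T00:06:00Z", "1", "perpetua", 3, 28),
    ("2015-08-18T05:54:00Z", "2", "langstroth", 2, 11),
    ("2015-08-18T06:00:00Z", "2", "langstroth", 1, 10),
    ("2015-08-18T06:06:00Z", "2", "perpetua", 8, 23),
    ("2015-08-18T06:12:00Z", "2", "perpetua", 7, 22),
]

RETENTION_POLICY = "autogen"  # assumed for all rows, as in Table 2
MEASUREMENT = "census"


def series_keys(rows):
    """Return the distinct (retention policy, measurement, tag set) triples."""
    keys = set()
    for _time, location, scientist, _butterflies, _honeybees in rows:
        # Only the tags identify a series; field values play no part.
        tag_set = (("location", location), ("scientist", scientist))
        keys.add((RETENTION_POLICY, MEASUREMENT, tag_set))
    return sorted(keys)


series = series_keys(rows)
for rp, measurement, tag_set in series:
    print(rp, measurement, ", ".join(f"{k} = {v}" for k, v in tag_set))
```

The eight rows collapse into four series. The order printed here differs from Table 2, which is fine because the series numbers there are arbitrary.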
Summary
In a nutshell, InfluxDB is a:
- Time series database
  - Where the timestamp is the key
  - All data in InfluxDB have a time column. It stores timestamps, which give the date and time, in RFC3339 UTC (e.g., 2015-08-18T00:06:00Z), associated with particular data
  - Works best with a large number of series with fewer columns in each one
- Schemaless database
  - Which means it’s easy to add new measurements, tags, and fields at any time
- Database designed to make working with time series data easier and faster
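Since the time column is shown in RFC3339 UTC while the database supports nanosecond precision, it can help to see how the two representations relate. The sketch below is a standard-library Python illustration, not part of InfluxDB; the function names are hypothetical, and it assumes whole-second timestamps ending in "Z", like those in Table 1.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
NS_PER_SECOND = 1_000_000_000


def rfc3339_to_ns(text):
    """Convert a whole-second RFC3339 UTC timestamp to epoch nanoseconds."""
    dt = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    # Integer arithmetic throughout, so no floating-point rounding occurs.
    seconds = (dt - EPOCH) // timedelta(seconds=1)
    return seconds * NS_PER_SECOND


def ns_to_rfc3339(ns):
    """Convert epoch nanoseconds back to RFC3339 UTC, truncated to whole seconds."""
    dt = EPOCH + timedelta(seconds=ns // NS_PER_SECOND)
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")


ns = rfc3339_to_ns("2015-08-18T00:06:00Z")
print(ns)
print(ns_to_rfc3339(ns))
```

In the influx shell, the precision command listed in the help output switches between these timestamp formats.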
InfluxQL is a SQL-like query language for interacting with InfluxDB and providing features specific to storing and analyzing time series data.
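As a rough illustration of what an InfluxQL statement looks like, the sketch below builds the URL for a query against the sample census data through the HTTP interface of InfluxDB 1.x (the `/query` endpoint with `db` and `q` parameters). It only constructs the URL and sends nothing. The database name `mydb` and the `build_query_url` helper are assumptions made for this example; the host and port are the ones shown in the CLI session above.

```python
from urllib.parse import urlencode


def build_query_url(base_url, database, influxql):
    """Build a URL for the InfluxDB 1.x /query endpoint (no request is sent)."""
    params = urlencode({"db": database, "q": influxql})
    return f"{base_url}/query?{params}"


# InfluxQL double-quotes identifiers and single-quotes string literals.
query = (
    'SELECT "butterflies", "honeybees" FROM "census" '
    "WHERE \"scientist\" = 'langstroth'"
)

# "mydb" is a hypothetical database name.
url = build_query_url("http://localhost:8086", "mydb", query)
print(url)
```

The same statement could be typed at the influx prompt after selecting the database with the use command.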
References
- Devoxx France 2015: InfluxDB
- InfluxDB Key Concepts
- InfluxQL
- Oracle Cloud Infrastructure (redthunder.blog)