Elasticsearch

Reliably And Securely Take Data And Search, Analyze, And Visualize It In Real Time.

Elasticsearch, Kibana, Beats, and Logstash are together known as the Elastic Stack (formerly the ELK Stack). The stack reliably and securely takes data from any source, in any format, so that it can be searched, analyzed, and visualized in real time.

What is Elasticsearch?

Elasticsearch is a search engine and analytics platform that is used to store, search, and analyze large volumes of data in near real time. It is a distributed search engine based on the open-source Lucene search library. Elasticsearch is designed to be highly scalable and fault-tolerant, and can be used to store and search a wide range of data types, including text, numerical, geospatial, and structured data. It can also be used to perform complex data analytics, such as aggregations, whose results can be visualized in tools such as Kibana.

Elasticsearch is often used in conjunction with other technologies, such as Logstash and Kibana, to create what is known as the “ELK stack”. Logstash is used to collect and transform data from a variety of sources, such as logs and databases, and send it to Elasticsearch for indexing and analysis. Kibana is a data visualization tool that allows users to create dashboards and visualizations based on data stored in Elasticsearch.

Some of the key features of Elasticsearch include:
  • Full-text search: Elasticsearch supports powerful full-text search capabilities, including relevance scoring, stemming, and faceting.
  • Distributed architecture: Elasticsearch is designed to be highly scalable and fault-tolerant, with support for distributed indexing and search.
  • Near real-time search: Elasticsearch provides near real-time search capabilities. Newly indexed documents become searchable after the next index refresh, which by default happens about once per second.
  • Document-oriented: Elasticsearch stores data in a document-oriented format, making it flexible and easy to work with.
  • RESTful API: Elasticsearch provides a RESTful API for interacting with the search engine, making it easy to integrate with other applications.
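
As a minimal sketch of the RESTful API mentioned in the list above, the following Python snippet builds (but does not send) an HTTP request that would store one JSON document. The address http://localhost:9200, the index name "products", and the field names are assumptions made for illustration only.

```python
import json
import urllib.request

# Assumption: a local, unsecured development node listening at this address.
BASE_URL = "http://localhost:9200"


def build_index_request(index, doc_id, document):
    """Build (but do not send) a request that stores one JSON document."""
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/{index}/_doc/{doc_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# "products" and the document fields are hypothetical.
request = build_index_request(
    "products", "1", {"title": "Wireless headphones", "price": 79.0}
)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it, provided a node is actually running at the assumed address.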

Elasticsearch is used in a wide range of applications, including e-commerce, social media, cybersecurity, and data analytics. Its flexibility and scalability make it a popular choice for organizations that need to store and search large volumes of data in real time.

What types of data can be stored in Elasticsearch?

Elasticsearch can store and search a wide range of data types, including text, numerical, geospatial, and structured data. It is designed to be flexible and can be used to index and search almost any type of data.
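
To make the data types above concrete, here is a small sketch of an index mapping expressed as a Python dictionary, together with a document that fits it. The field names are hypothetical; the type names (text, keyword, float, geo_point, date) are standard Elasticsearch field types.

```python
import json

# Hypothetical index mapping covering the data types named above.
mapping = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},          # analyzed full text
            "sku": {"type": "keyword"},         # exact-match structured value
            "price": {"type": "float"},         # numeric
            "location": {"type": "geo_point"},  # geospatial (latitude/longitude)
            "created_at": {"type": "date"},     # timestamp
        }
    }
}

# A sample document whose fields match the mapping.
document = {
    "title": "Wireless headphones",
    "sku": "WH-1000",
    "price": 79.0,
    "location": {"lat": 59.33, "lon": 18.07},
    "created_at": "2023-03-30T15:48:00Z",
}

print(json.dumps(mapping, indent=2))
```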

How is Elasticsearch different from traditional relational databases?

Elasticsearch is a search engine and not a relational database. It is designed to handle unstructured and semi-structured data, whereas traditional databases are designed for structured data. Elasticsearch provides powerful search capabilities and can be used for text search, geospatial search, and more.

Can Elasticsearch handle large volumes of data?

Yes, Elasticsearch is designed to handle large volumes of data. It is a distributed system that can scale horizontally by adding more nodes to the cluster. Elasticsearch is used by organizations to store and search billions of documents.

How does Elasticsearch perform search queries?

Elasticsearch uses a query language called the Elasticsearch Query DSL to perform search queries. The Query DSL is a JSON-based syntax that allows users to specify the search criteria and filters.
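
A short sketch of what such a query can look like, built as a Python dictionary and serialized to JSON. It combines a full-text match clause with a range filter inside a bool query. The field names "title" and "price" and the helper name build_search_body are assumptions for illustration.

```python
import json


def build_search_body(text, max_price):
    """Return a Query DSL body: full-text criteria plus a price filter."""
    return {
        "query": {
            "bool": {
                # "must" clauses contribute to the relevance score.
                "must": [{"match": {"title": text}}],
                # "filter" clauses only include or exclude documents.
                "filter": [{"range": {"price": {"lte": max_price}}}],
            }
        }
    }


search_body = build_search_body("wireless headphones", 100)
print(json.dumps(search_body, indent=2))
```

A body like this would typically be sent to the _search endpoint of an index.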

What is the ELK stack?

The ELK stack is a combination of Elasticsearch, Logstash, and Kibana. Logstash is used to collect and transform data from various sources, which is then indexed and stored in Elasticsearch. Kibana is used to visualize and analyze the data stored in Elasticsearch.
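
The collect-and-transform step can be illustrated without Logstash itself. The sketch below is plain Python standing in for that step, not Logstash configuration: it parses raw log lines into structured documents and formats them as a newline-delimited JSON body of the kind the Elasticsearch bulk API accepts. The log format, the index name "app-logs", and the function names are assumptions.

```python
import json


def parse_log_line(line):
    """Split a 'timestamp level message' line into a structured document."""
    timestamp, level, message = line.split(" ", 2)
    return {"@timestamp": timestamp, "level": level, "message": message}


def to_bulk_body(index, documents):
    """Format documents as bulk NDJSON: one action line, then one source line."""
    lines = []
    for doc in documents:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    # The bulk format requires a trailing newline.
    return "\n".join(lines) + "\n"


raw_lines = [
    "2023-03-30T15:48:00Z INFO service started",
    "2023-03-30T15:48:05Z ERROR connection refused",
]
bulk_body = to_bulk_body("app-logs", [parse_log_line(line) for line in raw_lines])
print(bulk_body)
```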

Is Elasticsearch open source?

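Not entirely. Elasticsearch was originally distributed under the Apache License 2.0, but starting with version 7.11 it is dual-licensed under the Server Side Public License and the Elastic License, neither of which is recognized as an open-source license. The source code is available on GitHub. OpenSearch, a fork started by AWS, continues development under the Apache License 2.0.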

Can Elasticsearch be used for real-time analytics?

Yes, Elasticsearch can be used for real-time analytics. It provides near real-time search capabilities and supports complex data analytics, such as aggregations. The results can then be visualized in Kibana.
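
As a hedged illustration of an aggregation, the sketch below shows a request body that groups documents by category and averages a price per group, followed by a plain-Python function that computes the same kind of result locally on sample data. The field names are hypothetical, and "category" is assumed to be mapped as a keyword field.

```python
from collections import defaultdict

# Request body: bucket by category, then average the price in each bucket.
aggregation_request = {
    "size": 0,  # return only aggregation results, no individual hits
    "aggs": {
        "by_category": {
            "terms": {"field": "category"},
            "aggs": {"avg_price": {"avg": {"field": "price"}}},
        }
    },
}


def average_price_by_category(documents):
    """Local equivalent of the aggregation above, for illustration only."""
    totals = defaultdict(lambda: [0.0, 0])  # category -> [sum, count]
    for doc in documents:
        bucket = totals[doc["category"]]
        bucket[0] += doc["price"]
        bucket[1] += 1
    return {category: total / count for category, (total, count) in totals.items()}


sample = [
    {"category": "audio", "price": 80.0},
    {"category": "audio", "price": 120.0},
    {"category": "video", "price": 300.0},
]
print(average_price_by_category(sample))  # {'audio': 100.0, 'video': 300.0}
```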

How is Elasticsearch different from Solr?

Elasticsearch and Solr are both search engines based on the Lucene search library. Elasticsearch is often described as easier to scale and to get started with, whereas Solr is often described as more customizable. Elasticsearch also puts more emphasis on near real-time search and analytics, whereas Solr is more commonly associated with traditional search applications.

Snippet from Wikipedia: Elasticsearch

Elasticsearch is a search engine based on the Lucene library. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents. Elasticsearch is developed in Java and is dual-licensed under the (source-available) Server Side Public License and the Elastic License, while other parts fall under the proprietary (source-available) Elastic License. Official clients are available in Java, .NET (C#), PHP, Python, Ruby and many other languages. According to the DB-Engines ranking, Elasticsearch is the most popular enterprise search engine.

Shay Banon created the precursor to Elasticsearch, called Compass, in 2004. While thinking about the third version of Compass he realized that it would be necessary to rewrite big parts of Compass to "create a scalable search solution". So he created "a solution built from the ground up to be distributed" and used a common interface, JSON over HTTP, suitable for programming languages other than Java as well. Shay Banon released the first version of Elasticsearch in February 2010.

Elastic NV was founded in 2012 to provide commercial services and products around Elasticsearch and related software. In June 2014, the company announced raising $70 million in a Series C funding round, just 18 months after forming the company. The round was led by New Enterprise Associates (NEA). Additional funders included Benchmark Capital and Index Ventures. This round brought total funding to $104 million.

In March 2015, the company Elasticsearch changed its name to Elastic.

In June 2018, Elastic filed for an initial public offering with an estimated valuation of between 1.5 and 3 billion dollars. On 5 October 2018, Elastic was listed on the New York Stock Exchange.

Major releases:

  • 1.0.0 – February 12, 2014
  • 2.0.0 – October 28, 2015
  • 5.0.0 – October 26, 2016
  • 6.0.0 – November 14, 2017
  • 7.0.0 – April 10, 2019
  • 8.0.0 – February 10, 2022

In January 2021, Elastic announced that starting with version 7.11, they would be relicensing their Apache 2.0 licensed code in Elasticsearch and Kibana to be dual licensed under Server Side Public License and the Elastic License, neither of which is recognized as an open-source license. Elastic blamed Amazon Web Services (AWS) for this change, objecting to AWS offering Elasticsearch and Kibana as a service directly to consumers and claiming that AWS was not appropriately collaborating with Elastic. Critics of the re-licensing decision predicted that it would harm Elastic's ecosystem and noted that Elastic had previously promised to "never....change the license of the Apache 2.0 code of Elasticsearch, Kibana, Beats, and Logstash". Amazon responded with plans to fork the projects and continue development under Apache License 2.0. Other users of the Elasticsearch ecosystem, including Logz.io, CrateDB and Aiven, also committed to the need for a fork, leading to a discussion of how to coordinate the open source efforts. Due to potential trademark issues with using the name "Elasticsearch", AWS rebranded their fork as OpenSearch in April 2021.

Elasticsearch can be used to search any kind of document. It provides scalable search, has near real-time search, and supports multitenancy. "Elasticsearch is distributed, which means that indices can be divided into shards and each shard can have zero or more replicas. Each node hosts one or more shards and acts as a coordinator to delegate operations to the correct shard(s). Rebalancing and routing are done automatically". Related data is often stored in the same index, which consists of one or more primary shards, and zero or more replica shards. Once an index has been created, the number of primary shards cannot be changed.
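
One way to see why the number of primary shards is fixed at index creation is to look at how a document could be routed to a shard. The sketch below is a deliberate simplification: it hashes a routing key and takes the remainder modulo the number of primary shards. zlib.crc32 is a stand-in hash chosen for illustration, and the real routing calculation in Elasticsearch has more detail; pick_shard and the document IDs are hypothetical.

```python
import zlib


def pick_shard(routing_key, number_of_primary_shards):
    """Map a routing key (for example a document ID) to a shard number."""
    # Stand-in hash for illustration; not the hash Elasticsearch uses.
    return zlib.crc32(routing_key.encode("utf-8")) % number_of_primary_shards


doc_ids = [f"doc-{i}" for i in range(100)]
with_five = {doc_id: pick_shard(doc_id, 5) for doc_id in doc_ids}
with_six = {doc_id: pick_shard(doc_id, 6) for doc_id in doc_ids}

# Documents whose shard would change if the shard count changed.
moved = [doc_id for doc_id in doc_ids if with_five[doc_id] != with_six[doc_id]]
print(f"{len(moved)} of {len(doc_ids)} documents would map to a different shard")
```

Under this scheme, changing the shard count changes where existing documents are expected to live, so lookups by ID would go to the wrong shard unless the data were redistributed.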

Elasticsearch is developed alongside the data collection and log-parsing engine Logstash, the analytics and visualization platform Kibana, and the collection of lightweight data shippers called Beats. The four products are designed for use as an integrated solution, referred to as the "Elastic Stack" (formerly the "ELK stack", short for "Elasticsearch, Logstash, Kibana").

Elasticsearch uses Lucene and tries to make all its features available through the JSON and Java API. It supports faceting and percolating (a form of prospective search), which can be useful for sending notifications when new documents match registered queries. Another feature, "gateway", handles the long-term persistence of the index; for example, an index can be recovered from the gateway in the event of a server crash. Elasticsearch supports real-time GET requests, which makes it suitable as a NoSQL datastore, but it lacks distributed transactions.
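
Percolating reverses the usual search flow: queries are stored first, and an incoming document is then matched against them. The following sketch shows the request bodies involved, using the percolator field type and the percolate query. The field names "query" and "message", the sample query text, and the helper name are assumptions for illustration.

```python
# Mapping: one field holds stored queries, the other is the field they search.
percolator_mapping = {
    "mappings": {
        "properties": {
            "query": {"type": "percolator"},
            "message": {"type": "text"},
        }
    }
}

# A registered query, indexed like any other document.
registered_query = {"query": {"match": {"message": "disk full"}}}


def build_percolate_search(document):
    """Build a search body asking which stored queries match the document."""
    return {"query": {"percolate": {"field": "query", "document": document}}}


percolate_body = build_percolate_search({"message": "warning: disk full on node 3"})
print(percolate_body)
```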

On 20 May 2019, Elastic made the core security features of the Elastic Stack available free of charge, including TLS for encrypted communications, file and native realm for creating and managing users, and role-based access control for controlling user access to cluster APIs and indexes. The corresponding source code is available under the “Elastic License”, a source-available license. In addition, Elasticsearch now offers SIEM and Machine Learning as part of its offered services.
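
As a rough sketch of role-based access control, the snippet below builds the body of a role definition that grants read-only access to a set of indexes. The role shape (cluster privileges plus per-index privileges) follows the Elasticsearch security role format, while the index pattern "logs-*" and the function name are hypothetical.

```python
import json


def build_read_only_role(index_patterns):
    """Return a role body allowing cluster monitoring and read-only index access."""
    return {
        "cluster": ["monitor"],
        "indices": [
            {"names": list(index_patterns), "privileges": ["read"]},
        ],
    }


role_body = build_read_only_role(["logs-*"])
print(json.dumps(role_body, indent=2))
```

Users assigned such a role could search the matching indexes but not write to them.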

Developed from the Found acquisition by Elastic in 2015, Elastic Cloud is a family of Elasticsearch-powered SaaS offerings which include the Elasticsearch Service, as well as Elastic App Search Service, and Elastic Site Search Service which were developed from Elastic's acquisition of Swiftype. In late 2017, Elastic formed partnerships with Google to offer Elastic Cloud in Google Cloud Platform (GCP), and Alibaba to offer Elasticsearch and Kibana in Alibaba Cloud.

Elasticsearch Service on Elastic Cloud is the official hosted and managed Elasticsearch and Kibana offering from the creators of the project since August 2018. Elasticsearch Service users can create secure deployments with partners, Google Cloud Platform (GCP) and Alibaba Cloud.

AWS previously offered Elasticsearch as a managed service, beginning in 2015. Many companies currently offer managed services, such as Elastic Co, BigData Boutique, Instaclustr, and Dattell. Such managed services provide hosting, deployment, backup and other support. Most managed services also include support for Kibana.

  • Information extraction
  • List of information retrieval libraries
  • OpenSearch (software), an open source fork of Elasticsearch
  • tools/elasticsearch.txt
  • Last modified: 2023/03/30 15:48
  • by Henrik Yllemo