AWS offers a lot of services. In this post, I would like to take a look at the Elasticsearch Service. Read on to learn more!
One of the primary reasons to consider the Elasticsearch Service is that it can serve as your primary logging backend. It lets you store, query, and visualize log data easily. The service itself is made up of a backend based on Elasticsearch and Kibana for the UI. Like most logging solutions, you are responsible for getting data into the backend. To be fair, Elasticsearch now offers far more than just logging, including metrics and APM, but logging is still the primary use case.
Of course, you have options when deciding to leverage Elasticsearch. You could:
- Deploy and maintain your own
- Leverage a platform service like what AWS offers
- Go with a third-party vendor which may provide additional benefits like auto-scaling, dashboards, and more
If you are already on AWS and looking to stand up Elasticsearch quickly, then the Elasticsearch Service is a readily available option to try.
The Elasticsearch Service can be configured from the console or via the API. It requires some basic information including:
- Whether this is a production or test environment (this determines whether a multi-master configuration is required)
- Instance type and number of instances
- Storage type and size
- Encryption, snapshot, and network information
- Advanced options
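To make the list above concrete, here is a rough sketch of how those choices might map onto an API request using boto3's `es` client and its `create_elasticsearch_domain` call. The helper names, the domain name, and the subnet and security group IDs are placeholders of my own, not anything the service prescribes, and whether encryption at rest is available depends on the instance types you pick.

```python
def build_domain_request(name, data_type, data_count, volume_gb,
                         subnet_ids, security_group_ids,
                         master_type=None, master_count=3,
                         encrypt_at_rest=True):
    """Assemble keyword arguments for boto3's create_elasticsearch_domain.

    Passing master_type enables dedicated masters (production style);
    leaving it as None skips them (test style).
    """
    cluster = {
        "InstanceType": data_type,
        "InstanceCount": data_count,
        "DedicatedMasterEnabled": master_type is not None,
    }
    if master_type is not None:
        cluster["DedicatedMasterType"] = master_type
        cluster["DedicatedMasterCount"] = master_count
    return {
        "DomainName": name,
        "ElasticsearchClusterConfig": cluster,
        # Storage type and size per data node (gp2 is an assumed choice).
        "EBSOptions": {"EBSEnabled": True, "VolumeType": "gp2",
                       "VolumeSize": volume_gb},
        # Assumption: the chosen instance types support encryption at rest.
        "EncryptionAtRestOptions": {"Enabled": encrypt_at_rest},
        "SnapshotOptions": {"AutomatedSnapshotStartHour": 0},
        # Keep the domain inside a VPC rather than exposing it publicly.
        "VPCOptions": {"SubnetIds": list(subnet_ids),
                       "SecurityGroupIds": list(security_group_ids)},
    }


def create_domain(request):
    """Submit the request; requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3
    return boto3.client("es").create_elasticsearch_domain(**request)
```

Building the request separately from submitting it makes it easy to review or print the configuration before anything is actually created.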
As mentioned, the primary difference between production and test is whether you want dedicated master nodes and, if so, how many. Like most distributed architectures, the master node(s) handle administration and configuration, so if they are unavailable you are limited in the operations you can perform against the cluster.
The first tricky thing about Elasticsearch is properly sizing your cluster. Sizing is based in part on the ingestion rate. I found the recommendations section of this article to be helpful in initial sizing activities. Note that if you do not size properly, it is possible that ingestion stops working and/or queries time out or fail to return.
You also have the option of purchasing reserved instances if desired.
The amount of storage you need is dependent on your ingestion rate. One thing to keep in mind is that AWS limits the maximum amount of storage based on the instance type selected.
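As a back-of-the-envelope sketch of that relationship, the function below estimates total storage from a daily ingestion rate and a retention period. The replica count and the 1.45 overhead multiplier (for indexing overhead and reserved disk space) are assumptions you should tune for your own data, and the per-node storage cap is whatever AWS lists for your instance type.

```python
import math


def required_storage_gb(daily_ingest_gb, retention_days,
                        replicas=1, overhead=1.45):
    """Rough minimum cluster storage in GB for time-based log indices."""
    # Each replica is a full extra copy of the primary data.
    return daily_ingest_gb * retention_days * (1 + replicas) * overhead


def nodes_for_storage(total_gb, max_gb_per_node):
    """Data nodes needed when the instance type caps storage per node."""
    return math.ceil(total_gb / max_gb_per_node)
```

If the node count that falls out of this is higher than you expected, that is usually a sign to shorten retention or pick an instance type with a larger storage limit.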
The rest of the settings are pretty self-explanatory:
- Encryption is a good idea from a security perspective
- Snapshots can be helpful if data durability is important
- Network information is how you restrict access to the cluster (do NOT expose it to the Internet)
- AWS Cognito can be configured for Kibana access if desired (Kibana offers no authentication out of the box)
- Advanced elasticsearch parameters can be tweaked (defaults typically do not need to be changed)
Deployment or reconfiguration of an elasticsearch cluster takes several minutes. Once the cluster is available, you can select the domain to get some important information including:
- VPC endpoint: Where clients can send data and where API actions can be targeted. It is important to note with AWS Elasticsearch that you MUST access Elasticsearch via HTTPS on port 443 (not the standard Elasticsearch port, 9200).
- Domain ARN: For IAM
- Kibana: How to hit the UI. It is important to note that you MUST use the URI /_plugin/kibana/ to access the UI (do not forget the trailing slash). Also, be advised that Kibana does not offer authentication.
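Since the port and the trailing slash are easy to get wrong, here is a small sketch that derives the URLs from the VPC endpoint shown on the domain page. The function name and the returned keys are my own, and `_cluster/health` is included only as a handy first API call to try.

```python
def domain_urls(vpc_endpoint):
    """Return API and Kibana URLs for a domain's VPC endpoint."""
    host = vpc_endpoint.strip()
    # Accept either a bare hostname or a full https:// URL.
    if host.startswith("https://"):
        host = host[len("https://"):]
    host = host.rstrip("/")
    # The https scheme implies port 443, so no explicit port is needed.
    base = f"https://{host}"
    return {
        "api": base,
        "health": f"{base}/_cluster/health",
        "kibana": f"{base}/_plugin/kibana/",  # trailing slash is required
    }
```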
In addition to the above, an access policy should be configured. Ideally, clients should sign their requests with IAM credentials. Note that some clients (e.g. fluent-bit) do not support signing requests. The best practice would be to forward events to a proxy that does (e.g. fluentd or something custom), though you could instead modify the access policy to allow unsigned access from trusted environments (which is less secure).
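For illustration, a policy that only allows signed requests from specific IAM principals could be generated along these lines. This is a minimal sketch: the function name is mine, the ARNs you pass in are your own, and you may want a narrower action list than `es:ESHttp*`.

```python
import json


def build_access_policy(domain_arn, principal_arns):
    """Return a JSON access policy limited to the given IAM principals."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Only these IAM users or roles may call the domain.
                "Principal": {"AWS": list(principal_arns)},
                # Covers the HTTP verbs used against the endpoint.
                "Action": "es:ESHttp*",
                "Resource": f"{domain_arn}/*",
            }
        ],
    }
    return json.dumps(policy, indent=2)
```

The resulting string is the kind of document you would paste into the console's access policy editor or pass as the access policy when creating the domain via the API.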
AWS Elasticsearch uses AWS CloudWatch for health monitoring. The Elasticsearch dashboard for a given domain provides rich information about both cluster and instance health. If the status is not green for the cluster then you can look at the graphs to determine what is causing the unhealthy state and then correct the issue.
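If you would rather be told about a red cluster than spot it on a graph, a CloudWatch alarm is one option. The sketch below builds the arguments for boto3's `put_metric_alarm` against the `ClusterStatus.red` metric in the `AWS/ES` namespace. The alarm name, period, and SNS topic are assumptions of mine, so adjust them to taste.

```python
def red_status_alarm(domain_name, account_id, sns_topic_arn):
    """Keyword arguments for a CloudWatch alarm on ClusterStatus.red."""
    return {
        "AlarmName": f"{domain_name}-cluster-red",
        "Namespace": "AWS/ES",
        "MetricName": "ClusterStatus.red",
        # Domain metrics are keyed by domain name and AWS account ID.
        "Dimensions": [
            {"Name": "DomainName", "Value": domain_name},
            {"Name": "ClientId", "Value": account_id},
        ],
        "Statistic": "Maximum",
        "Period": 60,            # seconds per datapoint (assumed)
        "EvaluationPeriods": 1,  # alarm on the first red datapoint
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [sns_topic_arn],
    }


def create_alarm(alarm):
    """Create the alarm; requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3
    return boto3.client("cloudwatch").put_metric_alarm(**alarm)
```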
Pricing for the Elasticsearch Service is done per instance + storage + snapshots. While the instance types are similar to the EC2 instance types it is important to note a couple of differences:
- Elasticsearch instance types are typically from older families (e.g. you can still find m3 master nodes though m3 is not offered for EC2 instances)
- Pricing for Elasticsearch instances is approximately 1.5x that of the equivalent EC2 instances, because you are paying for both the instance and the managed service
I find examples helpful with pricing, so let's assume:
- 4 x r4.xlarge with 1 TB per node
- 3 x m3.medium masters
Without snapshots, this works out to roughly $1,300 a month or $15,600 a year. This configuration can support a ~50-node Kubernetes cluster with audit logging and a couple of hundred pods (assuming sane logging defaults on the pods) for anywhere between 15 and 30 days of retention.
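If you want to rerun that arithmetic for your own configuration, a sketch like the following works. It deliberately takes the hourly instance rates and the per-GB storage price as inputs, since those vary by region and change over time. The function name and the 730 hours-per-month convention are my own assumptions.

```python
HOURS_PER_MONTH = 730  # average hours in a month (8,760 / 12)


def monthly_cost(node_groups, storage_gb, price_per_gb_month):
    """Estimate monthly cost in USD, excluding snapshots.

    node_groups is an iterable of (count, hourly_rate_usd) pairs,
    e.g. one pair for data nodes and one for dedicated masters.
    """
    compute = sum(count * rate * HOURS_PER_MONTH
                  for count, rate in node_groups)
    return compute + storage_gb * price_per_gb_month


def yearly_cost(monthly):
    """Annualize a monthly figure."""
    return monthly * 12
```

For the example above, you would call `monthly_cost([(4, data_rate), (3, master_rate)], 4 * 1024, gb_rate)` with rates taken from the current pricing page.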
It is very easy to get started with the AWS Elasticsearch Service. Decisions around instance type and storage need to be made to deploy, but these settings can easily be modified after the fact as desired. Once stood up, accessing the service is similar to other AWS services and offers similar security controls, but be advised that Kibana does not authenticate users. Like any system, monitoring the health is critical and this is made easy via AWS CloudWatch.
It is not cheap to leverage the Elasticsearch service, but neither is maintaining your own cluster so it comes down to experience level and which investment you want to make. If just getting started, the Elasticsearch Service is a good option to stand something up quickly.
© 2019, Steve Flanders. All rights reserved.