Confluent Kafka® on Kubernetes in minutes
Ready to get started?
You can explore snapblocs without using your cloud account. We also provide a free Sandbox environment credit to make it a little easier!
Implementing Kafka does not have to be difficult!
Kafka streaming applications can be frustrating to build, challenging to scale, and difficult to troubleshoot. Fortunately, we can take care of all that for you. Imagine how much more you could do with your time.
Deploy your Kafka cluster in minutes
Click to create clustered Kafka services in your cloud environment with no infrastructure coding required. Production quality without the steep learning curve!
Focus on what you do best
- Free up scarce resources
- Focus on your core work
- Knock months off your timeline
- Deliver secure, scalable solutions
- Avoid tech debt
snapblocs Architecture-as-a-Service delivers instant value
Day 2 level operations out of the box
- High availability
- Data protection
- Data security
- Configurable alerts
- Health checks
- Cost optimization
- Easy debugging of topic data
- Easy overriding of Kafka parameters
- Scaling on demand
- Graceful shutdown
- Pause, Resume cluster without data loss
snapblocs Kafka on Kubernetes
Depending on your needs, you can deploy a Kafka+ solution bloc with all the bells and whistles or just the minimum Kafka-only solution.
NetApp's Sisir Shekhar on snapblocs
Sisir Shekhar, Principal Engineer at NetApp
...you don't want to spend time deploying that Kafka cluster; you want to spend time analyzing the data.
Common Kafka use cases
Messaging
Kafka works well as a replacement for a more traditional message broker. Message brokers are used for various reasons, such as decoupling processing from data producers, buffering unprocessed messages, etc. Compared to most messaging systems, Kafka has better throughput, built-in partitioning, replication, and fault tolerance, making it a good solution for large-scale message processing applications. Messaging uses are often comparatively low throughput but may require low end-to-end latency and often depend on Kafka's strong durability guarantees. In this domain, Kafka is comparable to traditional messaging systems such as ActiveMQ or RabbitMQ.
Website activity tracking
The original use case for Kafka was to rebuild a user activity tracking pipeline as a set of real-time publish-subscribe feeds. Site activities like page views, searches, or other user actions get published to central topics with one topic per activity type. These feeds are available for subscription for various use cases, including real-time processing, real-time monitoring, and loading into Hadoop or offline data warehousing systems for offline processing and reporting. Activity tracking is often very high in volume, as many activity messages are generated for each page view.
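The "one topic per activity type" layout can be illustrated with a minimal sketch. It uses a plain in-memory dictionary as a stand-in for Kafka topics rather than a real Kafka client, and the `activity.<type>` naming scheme, the `publish` helper, and the event fields are all hypothetical choices made for illustration.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for Kafka topics: topic name -> list of messages.
topics = defaultdict(list)


def topic_for(activity_type):
    # Assumed naming scheme: one topic per activity type.
    return f"activity.{activity_type}"


def publish(event):
    # Append the event to the topic that matches its activity type.
    topics[topic_for(event["type"])].append(event)


events = [
    {"type": "page_view", "user": "u1", "url": "/home"},
    {"type": "search", "user": "u1", "query": "kafka"},
    {"type": "page_view", "user": "u2", "url": "/pricing"},
]
for event in events:
    publish(event)

# A subscriber interested only in page views reads a single topic
# and never sees search events.
page_views = topics[topic_for("page_view")]
searches = topics[topic_for("search")]
```

Splitting by activity type lets each downstream consumer (monitoring, reporting, warehouse loading) subscribe only to the feeds it needs.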
Metrics
Kafka is often used for operational monitoring data. This involves aggregating statistics from distributed applications to produce centralized feeds of operational data.
Log aggregation
Many people use Kafka as a replacement for a log aggregation solution. Log aggregation typically collects physical log files off servers and puts them in a central place (a file server or HDFS perhaps) for processing. Kafka abstracts away the details of files and gives a cleaner abstraction of log or event data as a stream of messages. This capability allows for lower-latency processing and easy support for multiple data sources and distributed data consumption. Compared to log-centric systems like Scribe or Flume, Kafka offers equally good performance, stronger durability guarantees due to replication, and much lower end-to-end latency.
Stream processing
It is common to use Kafka to process data in processing pipelines consisting of multiple stages. Raw input data is consumed from Kafka topics and then aggregated, enriched, or otherwise transformed into new topics for further consumption or follow-up processing. For example, a processing pipeline for recommending news articles might crawl article content from RSS feeds and publish it to an "articles" topic. Further processing might normalize or deduplicate this content and publish the cleansed article content to a new topic. A final processing stage might attempt to recommend this content to users. Such processing pipelines create graphs of real-time data flows based on particular topics.
Starting in version 0.10.0.0, Apache Kafka includes a lightweight but powerful stream processing library called Kafka Streams, which performs the kind of data processing described above.
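The news-article pipeline described above can be sketched in a few lines. This is not the Kafka Streams API: plain Python lists stand in for topics, and the function names, the sample articles, and the keyword-matching "recommender" are assumptions made purely to show how each stage reads from one topic and writes to the next.

```python
def normalize(article):
    # Trim whitespace and lowercase the title so duplicates compare equal.
    return {
        "title": article["title"].strip().lower(),
        "body": article["body"].strip(),
    }


def deduplicate(articles):
    # Keep only the first article seen for each normalized title.
    seen = set()
    for article in articles:
        if article["title"] not in seen:
            seen.add(article["title"])
            yield article


def recommend(articles, interests):
    # Naive placeholder: pick articles whose title mentions an interest.
    return [a for a in articles if any(word in a["title"] for word in interests)]


# Stage 1: raw content as it might arrive on the "articles" topic.
articles_topic = [
    {"title": " Kafka 101 ", "body": "Intro to topics."},
    {"title": "kafka 101", "body": "Intro to topics."},
    {"title": "Gardening Tips", "body": "Water your plants."},
]

# Stage 2: cleanse the content and publish it to a new (hypothetical) topic.
cleansed_topic = list(deduplicate(normalize(a) for a in articles_topic))

# Stage 3: recommend content to a user from the cleansed topic.
recommended = recommend(cleansed_topic, interests={"kafka"})
```

Because every stage only depends on its input topic, stages can be scaled, replaced, or extended independently, which is what makes the graph-of-topics design attractive.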
Event sourcing
Event sourcing is a style of application design where state changes get logged as a time-ordered sequence of records. Kafka's support for massive stored log data makes it an excellent backend for an application built in this style.
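As a rough illustration of the idea, the sketch below rebuilds the current state by replaying a time-ordered list of events. The list stands in for a Kafka topic, and the account-balance domain, the event names, and the `apply_event` and `replay` helpers are hypothetical.

```python
def apply_event(balance, event):
    # Each event describes one state change; applying it yields the next state.
    if event["type"] == "deposited":
        return balance + event["amount"]
    if event["type"] == "withdrawn":
        return balance - event["amount"]
    raise ValueError(f"unknown event type: {event['type']}")


def replay(events, initial=0):
    # State is never stored directly; it is derived by replaying the log in order.
    state = initial
    for event in events:
        state = apply_event(state, event)
    return state


# Time-ordered record of state changes (in Kafka, the contents of a topic).
event_log = [
    {"type": "deposited", "amount": 100},
    {"type": "withdrawn", "amount": 30},
    {"type": "deposited", "amount": 5},
]

current_balance = replay(event_log)       # state after all events
earlier_balance = replay(event_log[:2])   # state as of the second event
```

Replaying a prefix of the log reconstructs the state at any earlier point, which is one reason a durable, ordered log suits this design.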
Commit log
Kafka can serve as a kind of external commit-log for a distributed system. The log helps replicate data between nodes and acts as a re-syncing mechanism for failed nodes to restore their data. The log compaction feature in Kafka helps support this usage.
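The effect of log compaction can be approximated with a simplified sketch: keep only the most recent record for each key, and treat a record with a `None` value as a delete marker (a tombstone). Real Kafka compaction runs in the background on older log segments and retains tombstones for a configurable period, so the `compact` function below is an illustrative model, not Kafka's implementation; the keys and values are made up.

```python
def compact(log):
    # Remember the last offset and value written for each key.
    latest = {}
    for offset, (key, value) in enumerate(log):
        latest[key] = (offset, value)

    # Keep surviving records in their original offset order and
    # drop keys whose latest record is a tombstone (value is None).
    survivors = sorted(
        (offset, key, value)
        for key, (offset, value) in latest.items()
        if value is not None
    )
    return [(key, value) for _, key, value in survivors]


# Keyed records in write order.
log = [
    ("node-a", "v1"),
    ("node-b", "v1"),
    ("node-a", "v2"),   # supersedes the first node-a record
    ("node-c", "v1"),
    ("node-c", None),   # tombstone: node-c was deleted
]

compacted = compact(log)

# A failed node can restore its data by reading the compacted log.
restored_state = dict(compacted)
```

A node that re-syncs from the compacted log reads far fewer records than the full history while still ending up with the latest value for every live key.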
Get instant Day-2 Level Operations
Self-service fully automated Kafka clusters deployed in your cloud environment. Click to deploy, scale at will.
- Self-service for DevOps and agile teams
- Automated Day-2 operational Kafka clusters
- Best-practice security and dashboards
- Configure, deploy, manage and monitor
- Pause, resume, clone, scale
- Built-in observability
snapblocs automates data platforms on Kubernetes, including Confluent Kafka®
The reality of DIY Kafka
Building your Kafka solution and going from POC to production requires a significant investment from multiple engineers over many months.
- Effort and resources are underestimated
- Project timelines become extended
- A lot of reinvention reduces time on core work
- Knowledgeable resources are scarce
- Security and dashboards are often inadequate
- Full-featured solution not feasible for small IT teams
What about Hosted Kafka?
Hosted managed Kafka solutions promise simple turnkey solutions, but data becomes spread across more data locations.
- Increased data movement in/out of vendor VPCs
- Creates privacy, security, latency, and cost issues
- Not as cost-effective with high event volumes
| Feature | snapblocs | DIY Kafka | Hosted Kafka |
|---|---|---|---|
| Low-code for infrastructure | Yes | No | Yes |
| Easier to scale | Yes | No | Yes |
| Full lifecycle management* | Yes | No | No |
| Low-code complex data pipeline | Yes | No | No |
| Deploy on Kubernetes | Yes | No | No |
| Kafka logs remain securely in your VPC | Yes | Yes | No |
| Configure once and run in any cloud | Yes | No | Yes |
| Built-in best-practice security and Day-2 operations | Yes | No | Yes |
| Less focus on operations | Yes | No | Yes |

* Lifecycle features include pausing, resuming, cloning, and moving the Kafka cluster.
Leverage additional snapblocs platforms
snapblocs dpStudio automates many data platforms on Kubernetes, including Confluent Kafka®. Our pre-fab blueprint library combines multiple best-in-class open-source technologies into ready-to-go solution blocs on Kubernetes:
- Data Platforms for Moving, Ingesting, Transforming, Processing, Storing, Analyzing, and Presenting data.
- PLUS, Development Platforms for Kubernetes and Microservices.
snapblocs makes it easy to Deploy and Manage Data Platform stacks
How to deploy a stack
This video explores component configuration, stack deployment, and stack teardown.
How to manage the lifecycle of a stack
This video demonstrates pausing, resuming, cloning, and moving stacks.
Multi-Cloud - Click to deploy, scale at will
snapblocs dpStudio supports the major cloud providers, including Amazon Web Services, Google Cloud Platform, and Azure.
Leverage our Kafka on Kubernetes as an infrastructure abstraction: Configure once and run in any cloud.
Start your free Kafka cluster on Kubernetes today
Deploy Kafka and many other data platform solutions using the snapblocs blueprint library.
Our Help Center has the resources you need to learn about dpStudio.
Access the dpStudio Help Center
Tutorials and How-to-Videos show how dpStudio and our blueprint library can reduce your development cycle.
View the dpStudio Videos
Schedule a meeting to talk about professional services, or book your free interactive demo of dpStudio.
Got questions? Our team is here to help.