As new technologies emerge, the words "microservices" and "DevOps" seem to have become the answer to every technological problem. Organizations are putting a lot of effort into automation and CI/CD pipelines to improve their release lifecycle and gain a competitive advantage, while continuously improving their code and enriching their feature set.

CONTAINERS AREN’T ALWAYS THE ANSWER!

Yes, I’ve just said it. We’ve become so excited about microservices and containers that we force this solution onto every use case we meet. Sometimes, the effort (in terms of time and money) that needs to be spent is so big that…


Photo by Stephen Dawson on Unsplash

Kubernetes is our new operating system; no one can doubt that anymore. But while a lot of effort has gone into developing a microservices approach and migrating workloads to Kubernetes, organizations have left their data services behind.

COVID-19 showed us all how important data is, and how important it is to have the proper architecture and data fabric. Data won’t stop growing; it will just keep breaking its consumption records year after year.

This challenge forces us to provide a more automated, scalable solution to our organization by moving our data services…


Photo by Chris Liverani on Unsplash

The Big Data world is making its way toward Kubernetes, and we already see many data processing and AI/ML products building their solutions around Kubernetes to ensure stability, scalability, and availability.

Until now, most of these solutions were VM-based, with no orchestration, automation, or configuration management layer above them, which made them less scalable and somewhat painful to operate.

With Kubernetes, we can provide far more scalable solutions that are fully automated and preserve their state after failures.

With the desire to run Big Data workloads on Kubernetes comes the need for a simple…


Photo by ev on Unsplash

Kubernetes has become the de facto standard container orchestration platform. With this approach, organizations are trying to gather all their applications and platforms around Kubernetes to take advantage of its stability, agility, and simplicity. Running your whole stack in Kubernetes gives you a single API and a common language, whether you are deploying an application, a database, or a storage engine.

A few years ago, people believed that to gain more performance for big data workloads, your application needed performant local disks, mostly based on flash media. …


After my last article on Ceph deployments, I have decided to tell you about a new capability called cephadm, which is now available in the Ceph Octopus upstream version and will be available in later RHCS versions as well. This new capability will allow you to deploy a whole Ceph cluster in under 10 minutes. cephadm is a deployment tool that makes your life much easier when dealing with Ceph cluster deployments. It uses Podman to run all the Ceph daemons, and deployment management is done over an SSH connection. There are a…
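As a rough sketch, a minimal cephadm flow (on a host with Podman and the cephadm package installed; the monitor IP and hostnames below are placeholders) might look like this:

```shell
# Bootstrap a new cluster on the first node (creates the first MON and MGR).
cephadm bootstrap --mon-ip 192.168.1.10

# From a cephadm shell, add more hosts to the cluster over SSH.
ceph orch host add node2
ceph orch host add node3

# Let the orchestrator create OSDs on every unused device it finds.
ceph orch apply osd --all-available-devices
```

The `ceph orch` commands are typically run from inside `cephadm shell`; the orchestrator then pulls the container images and starts the daemons on each host.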


Today, more and more organizations are moving away from ETL, a process in the form of Extract -> Transform -> Load, where data is extracted from its source location, transformed into clean, valuable data, and then loaded into a target database or warehouse. ETL jobs are batch-driven, time-consuming, and messy, mainly because there is no alignment between the different ETL pipeline components, which makes the ETL architecture look like a big plate of spaghetti.

So, is ETL dead? Not at all; it’s just being renewed.

Many organizations understand that in today’s world they cannot…
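To make the Extract -> Transform -> Load steps concrete, here is a minimal sketch in Python using only the standard library. The source rows, the cleaning rules, and the in-memory SQLite "warehouse" are all hypothetical stand-ins for a real pipeline:

```python
import sqlite3

def extract():
    # Extract: in a real pipeline this would read from files, APIs, or a source DB.
    return [" Alice ,30", "Bob,  25", "carol,41"]

def transform(rows):
    # Transform: strip whitespace, normalize names, cast ages to integers.
    records = []
    for row in rows:
        name, age = row.split(",")
        records.append((name.strip().title(), int(age.strip())))
    return records

def load(records, conn):
    # Load: write the clean records into the target table.
    conn.execute("CREATE TABLE IF NOT EXISTS users (name TEXT, age INTEGER)")
    conn.executemany("INSERT INTO users VALUES (?, ?)", records)

conn = sqlite3.connect(":memory:")
load(transform(extract()), conn)
print(conn.execute("SELECT name, age FROM users ORDER BY name").fetchall())
# → [('Alice', 30), ('Bob', 25), ('Carol', 41)]
```

Even in this toy version, the coupling is visible: each stage assumes the exact output shape of the previous one, which is why large ETL architectures tend toward spaghetti.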


With the massive adoption of Apache Kafka, enterprises are looking for a way to replicate data across different sites. Kafka has its own internal replication and self-healing mechanisms, but they are only relevant to the local cluster and cannot tolerate a whole-site failure. The solution is the “MirrorMaker” feature: with this capability, your local Kafka cluster can be replicated asynchronously to an external or central Kafka cluster in a completely different location, in order to persist your data pipelines, log collection, and metrics-gathering processes.

“MirrorMaker” connects two clusters, as…
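As an illustration, a minimal MirrorMaker 2 configuration replicating all topics from a local cluster to a central one could look like the fragment below (the cluster aliases and bootstrap addresses are placeholders):

```properties
# mm2.properties: replicate everything from "local" to "central".
clusters = local, central
local.bootstrap.servers = local-kafka:9092
central.bootstrap.servers = central-kafka:9092

# Enable the one-way replication flow and select topics to mirror.
local->central.enabled = true
local->central.topics = .*
```

This file is then passed to the MirrorMaker 2 driver that ships with Kafka (`connect-mirror-maker.sh mm2.properties`), which runs the replication as a set of Kafka Connect tasks.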


Ceph is a distributed, unified, software-defined storage solution; it can be the source of all your relevant storage protocols by exposing block, file, and object storage. Most Ceph installations today run the daemons as system services that can be started, stopped, reloaded, disabled, etc. With the massive adoption of microservices and container engines, Ceph daemons can be installed as containers too, without introducing any bottlenecks. Running Ceph in containers dramatically simplifies management, since containers can easily be spawned again after a failure. …


As the world adopts a data-centric approach and people become more familiar with Kubernetes as an end-to-end platform for their application lifecycle, the need for persistence arises. By default, containers are stateless: they don’t save any state and treat data as ephemeral. To solve this problem in Kubernetes, storage classes are used. With storage classes, we have a storage provider (whether block, file, or object storage) that Kubernetes can access to save the information used by the containers in a volume. This volume is attached to the container at runtime…
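To illustrate, a storage class and a claim against it might look like the following sketch (the class name and provisioner are hypothetical; a real cluster would use its own CSI driver):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-block
provisioner: csi.example.com        # placeholder for a real CSI provisioner
reclaimPolicy: Delete
---
# A PersistentVolumeClaim that asks the class above for a 10Gi volume,
# which a pod can then mount to keep its data across restarts.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  storageClassName: fast-block
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
```

When a pod references `app-data`, Kubernetes asks the provisioner to create the volume dynamically and attaches it at runtime.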


Today, when running Kubernetes in production, we sometimes have a hard time collecting logs from our cluster components. We often end up using custom-made solutions that don’t provide the desired user experience. To solve this problem, we can use the Cluster Logging Operator, provided by Red Hat as an out-of-the-box solution for the OpenShift Container Platform. The CLO deploys a whole Elasticsearch -> Fluentd -> Kibana stack that collects logs from our cluster components automatically. …
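As a sketch, once the operator is installed, a `ClusterLogging` custom resource along these lines tells it to stand up the whole stack (node counts and replicas here are illustrative, not recommendations):

```yaml
apiVersion: logging.openshift.io/v1
kind: ClusterLogging
metadata:
  name: instance
  namespace: openshift-logging
spec:
  managementState: Managed
  logStore:
    type: elasticsearch        # the Elasticsearch log store
    elasticsearch:
      nodeCount: 3
  visualization:
    type: kibana               # the Kibana UI
    kibana:
      replicas: 1
  collection:
    logs:
      type: fluentd            # Fluentd collectors run on every node
```

The operator reconciles this resource into the Elasticsearch, Fluentd, and Kibana deployments and keeps them in the desired state.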

Shon Paz

Solution Architect, Red Hat
