Introduction

In this hands-on article, we will build two of the most interesting technologies in the Big Data ecosystem: Apache Kudu and Apache Impala. We will build both from source code, package them, and deploy them with the minimum configuration needed to make them functional (we will skip performance-related configuration on this occasion). Additionally, we will run Apache Impala as independently of HDFS as possible.

Apache Kudu and Apache Impala publish their versions only in “Source Release” form, so no binaries of these technologies are made available. This forces users to build the projects from source code. Most projects in the Big Data ecosystem and the Apache Software Foundation are based on Java code, which makes them easier to build and, in particular, to distribute (in most cases only a JVM is needed on the target machine).

This does not hold for Apache Kudu and Apache Impala: both projects have a C++ code base, which makes building them and, in particular, distributing them to target systems somewhat more complicated.

Continue reading “How To Deploy Apache Kudu/Impala From Sources” →

Coding Apache NiFi Ecosystem: Part 1

April 18, 2020 | September 29, 2020 | jromanes

Introduction

This is a series of articles about coding Apache NiFi, intended to help newcomers (like me) understand this awesome project and its development community. In this series we are going to study the Apache NiFi ecosystem from a coding perspective. The idea is to learn step by step, because modifying a code base as large as Apache NiFi is a complex endeavor.

First Step: The Build System

Probably one of the first steps toward understanding the project is to analyze its build structure, which is based on Apache Maven. This first article focuses on that build tool.

Continue reading →

Apache MiNiFi Debug in Intellij IDEA

April 10, 2020 | May 2, 2020 | jromanes

Introduction

This article shows how to set up JetBrains IntelliJ IDEA to debug and develop the Apache MiNiFi tool.

JetBrains IntelliJ IDEA is a powerful IDE, but it can be somewhat complicated to set up, in particular when the upstream project doesn’t have clear instructions for developing with this kind of IDE.

Continue reading →

Data Intensive

Big Data Intensive Architectures

Tag: Apache Software Foundation

How To Deploy Apache Kudu/Impala From Sources
