Data tools

Data Engineering Tools

A site to share content, tutorials, and online tools that I use in my day-to-day tasks as a data engineer.

Welcome to my tutorial pages!

On this website, you will find content, tutorials, and online tools that I use in my day-to-day tasks as a data engineer.

The primary purpose of this site is to publish tools that help me understand, practice, or even develop proofs of concept for my projects. There will be small tutorials and links to sources if you'd like to learn more.

Hadoop and Zookeeper
The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.
Apache Hive
The Apache Hive data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL.
Kafka and Schema Registry
A messaging system based on a publish/subscribe architecture, designed for streaming and message passing.
Spark
Spark is a unified distributed data processing platform.
AWS
Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform, offering over 175 fully featured services from data centers globally.

Iraj Hedayati

A data engineer, a student and a teacher.

Contents

  • Web services
  • Avro schema

Where to find me

  • LinkedIn
© Iraj Hedayati