In the previous text, we talked about the basics of streaming: what it means in theory and what its advantages and disadvantages are. We also mentioned some streaming tools. This text is more technical, and we will talk about Flink in general as well as the basics of streaming in Flink, the whole process from start (read data) …
Articles by sinisa
Once you start with streaming, you go with the flow!
Data streaming has become very popular in the big data industry. It is used for processing, in real time, large amounts of data that are continuously generated by different sources. When we say “real-time”, we need to understand that it can vary from a few milliseconds to a few minutes. Besides that, streaming enables us …
Data Exploration with Pandas (Part 2)
In the previous article, I wrote about some introductory material and basic Pandas capabilities. In this part, the main focus will be on DateTime values. I am also going to introduce you to some grouping and merging possibilities in Pandas. For this purpose, here is another dataset downloaded from the UCI Repository, which contains date and time …
Data Exploration with Pandas (part 1)
If you ever decide to become someone who is into big data, surely you can do it without having a clue about pandas. But that’s not the brightest solution, because why would you leave aside something that’s going to make you a lot better? Pandas is a well-known library for manipulating datasets that contain numerical and …
Interactive log analysis with Apache Spark
The Internet is becoming the largest global shop across markets, and anyone offering products and services of any kind prefers web shops to become the primary outlets for supplying customers. This leads to a reduction in the number of employees and traditional brick-and-mortar branches, and to a reduction in costs, so it …