From the early days of information technology to today’s Big Data analytics, here is a quick summary of the different types of data.

There are four types of data, although each type may be subdivided further in the future. Data retrieved from blockchain (a trending technology today) is also classified under Big Data unless and until the industry comes up with a new classification for it.

4 Categories of Data:

  1. Structured Data
  2. Unstructured Data
  3. Semi-Structured Data (not in a database but stored in an organized manner)
  4. Big Data

Structured Data is data stored in a tabular form, such as in an Excel spreadsheet or a database. A group of Excel spreadsheets can be created in, or uploaded into, data management software to form a database. A database consists of various tables with logical relationships between the tables and their cells.

You can also use a database query language such as SQL (Structured Query Language), which was developed at IBM around 1974, to search and filter the dataset you need. SQL was created for querying structured data in an RDBMS (relational database management system).
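As a minimal sketch of searching and filtering structured data with SQL, the Python snippet below uses the standard-library `sqlite3` module with an in-memory SQLite database. The `customers` table, its columns, and the sample rows are invented purely for illustration.

```python
import sqlite3


def customers_in_city(city):
    """Return the names of the sample customers who live in `city`."""
    # An in-memory database keeps the example self-contained.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
    )
    # Hypothetical sample rows; a real project would load its own data.
    conn.executemany(
        "INSERT INTO customers (name, city) VALUES (?, ?)",
        [("Alice", "Yangon"), ("Bob", "Singapore"), ("Chan", "Yangon")],
    )
    # The WHERE clause filters rows; the ? placeholder passes the value safely.
    rows = conn.execute(
        "SELECT name FROM customers WHERE city = ? ORDER BY name", (city,)
    ).fetchall()
    conn.close()
    return [name for (name,) in rows]


print(customers_in_city("Yangon"))
```

Because the data sits in named columns, the query can ask for exactly the rows and fields it needs, which is what makes structured data easy to search.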

An RDBMS is a software application used to administer, manage, and query your database. Examples of RDBMS include MySQL, MariaDB, PostgreSQL, Db2 Express-C, SQLite, CUBRID, Firebird, Oracle Database XE, and SQL Server Express. The RDBMS is based on the relational database model and was developed after the earlier DBMS (database management system).

Here are a few differences between DBMS and RDBMS:

| DBMS | RDBMS |
| --- | --- |
| No relationship between data | Relationships exist between data in multiple tables |
| Stores data as files | Stores data in tabular form |
| Handles small data volumes | Handles large data volumes |
| No normalization | Normalization |
| Supports a single user | Supports multiple users |
| Examples: dBASE III Plus, FoxPro | Examples: MySQL, MariaDB, PostgreSQL, Db2 Express-C, SQLite, CUBRID, Firebird, Oracle Database XE, and SQL Server Express |
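To make the idea of relationships between tables more concrete, here is a small sketch using Python's standard-library `sqlite3` module. The `customers` and `orders` tables, their columns, and the amounts are hypothetical; the point is only that one table refers to another through a key, and a JOIN follows that link.

```python
import sqlite3


def spending_per_customer():
    """Return (name, total amount) pairs by joining two related tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        -- Each order points at a customer through customer_id, so the
        -- customer's name is stored only once (a simple form of normalization).
        CREATE TABLE orders (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            amount      REAL NOT NULL
        );
        INSERT INTO customers (id, name) VALUES (1, 'Alice'), (2, 'Bob');
        INSERT INTO orders (customer_id, amount)
            VALUES (1, 20.0), (1, 5.5), (2, 12.0);
        """
    )
    # The JOIN follows the relationship from orders back to customers.
    rows = conn.execute(
        """
        SELECT c.name, SUM(o.amount)
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        ORDER BY c.name
        """
    ).fetchall()
    conn.close()
    return rows


print(spending_per_customer())
```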

Unstructured Data comes in all shapes and sizes. It does not have a pre-defined data model. It ranges from blog posts, forum posts, email contents, and social media to images, videos, audio, and more: any data or file that cannot be stored in a structured way. Unstructured data can be generated in large volumes on an hourly basis. To store unstructured data, you can use a document database such as MongoDB.

Semi-Structured Data is data that is not stored in a structured database; instead, it carries tags that organize it. Examples include data in NoSQL databases (NoSQL is also read as “not only SQL”) and XML documents.
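As a rough illustration of how tags give semi-structured data its organization, the sketch below reads a small XML snippet with Python's standard-library `xml.etree.ElementTree`. The `<orders>` document and its tag names are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML: the second order has no <note> tag, which a fixed
# table schema would not allow but semi-structured data tolerates.
XML_TEXT = """
<orders>
  <order id="1"><customer>Alice</customer><note>Leave at the door</note></order>
  <order id="2"><customer>Bob</customer></order>
</orders>
"""


def read_orders(xml_text):
    """Turn each <order> element into a dictionary keyed by its tags."""
    root = ET.fromstring(xml_text.strip())
    result = []
    for order in root.findall("order"):
        result.append(
            {
                "id": order.get("id"),
                "customer": order.findtext("customer"),
                # findtext returns None when the tag is absent.
                "note": order.findtext("note"),
            }
        )
    return result


for record in read_orders(XML_TEXT):
    print(record)
```

The tags tell the program what each value means, even though the records do not all share exactly the same fields.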

A classic example of semi-structured data is an email. An email consists of pre-defined fields such as sender, receiver, subject, and timestamp. These are structured data. However, the email's contents and attachments are classified as unstructured.
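The split between the structured fields and the unstructured body of an email can be sketched with Python's standard-library `email` package. The message text and addresses below are invented for illustration.

```python
from email import message_from_string

# A hypothetical plain-text email: headers first, then a blank line, then the body.
RAW_EMAIL = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Quarterly report\n"
    "Date: Mon, 01 Jan 2024 09:00:00 +0000\n"
    "\n"
    "Hi Bob, the numbers look good. Let's talk tomorrow.\n"
)


def split_email(raw):
    """Separate the pre-defined header fields from the free-form body."""
    msg = message_from_string(raw)
    # Structured part: fields with fixed names that every email can carry.
    structured = {
        "sender": msg["From"],
        "receiver": msg["To"],
        "subject": msg["Subject"],
        "timestamp": msg["Date"],
    }
    # Unstructured part: free text with no pre-defined model.
    unstructured = msg.get_payload()
    return structured, unstructured


fields, body = split_email(RAW_EMAIL)
print(fields)
print(body)
```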

Big Data is made up of all kinds of data: structured, unstructured, and semi-structured. The majority of Big Data is classified as unstructured, which makes it fundamentally difficult to analyse. In short, Big Data is a very large body of data that can comprise any type of data from any source, e.g. images, CCTV footage, videos, IoT devices, temperature sensors, other sensors, weather data, medical data, blockchain data, gaming data, or combinations of them.

Having a large volume of data is, on its own, far from useful to a business. Companies must also invest in mining this data, both in human resources (time and talent) and in analytical technologies, so that useful trends and patterns can be put to use.

Before you start investing your resources in any Big Data project, you need to figure out what value the project will create. Mainly, the outcome of a Big Data project is either to save time and cost or to generate more revenue (e.g. by understanding customer spending behavior). From data mining, you get insights for decision making. Big Data projects can also be used to deter crime by predicting potential criminal events.

Knowing what Big Data is only scratches the surface. Time spent on a data project, mining the insights, and concluding the findings will accelerate your learning and build your skillset.

Aung Ko Hein