Channel: Cloud Training Program

Structured Data Vs Unstructured Data Vs Semi-Structured Data

We can classify data as structured, semi-structured, or unstructured. Structured data resides in predefined formats and models, unstructured data is stored in its native format until it is extracted for analysis, and semi-structured data is a mix of the two.

In this blog, we cover what data is, the types of data, structured vs unstructured data, and the data stores suited to each.

What Is Data?

  • Data is a set of facts such as descriptions, observations, and numbers used in decision making.
  • We can classify data as structured, unstructured, or semi-structured.

Structured Vs Unstructured Data

1) Structured Data

  • Structured data is generally tabular data that is represented by columns and rows in a database.
  • Databases that hold tables in this form are called relational databases.
  • The mathematical term “relation” refers to an organized set of data held as a table.
  • In structured data, every row in a table has the same set of columns.
  • SQL (Structured Query Language) is the language used to query and manage structured data.
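
As a minimal sketch of the points above, the following uses Python's built-in sqlite3 module with an in-memory database. The customers table, its columns, and the sample rows are invented for illustration.

```python
import sqlite3

# In-memory database; the "customers" table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers (id, name, city) VALUES (?, ?, ?)",
    [(1, "Asha", "Pune"), (2, "Ben", "Leeds"), (3, "Chen", None)],
)

# Every row comes back with the same three columns, even where a value is missing (NULL).
cursor = conn.execute("SELECT id, name, city FROM customers ORDER BY id")
column_names = [description[0] for description in cursor.description]
rows = cursor.fetchall()

print(column_names)
for row in rows:
    print(row)
```

The third row has no city, yet it still carries a city column: the table's definition, not the individual row, decides which columns exist.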

2) Semi-structured Data

  • Semi-structured data is information that does not reside in a relational database but still has some structure to it.
  • Examples include documents held in JavaScript Object Notation (JSON) format, as well as data held in key-value stores and graph databases.
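
To illustrate "some structure", here is a small sketch using Python's json module. The two customer documents and their field names are made up; the point is only that each document names its own fields and the two need not match.

```python
import json

# Two hypothetical customer documents; the second has fields the first lacks.
raw = """
[
  {"id": 1, "name": "Asha", "email": "asha@example.com"},
  {"id": 2, "name": "Ben", "phones": ["555-0100", "555-0101"], "address": {"city": "Leeds"}}
]
"""
documents = json.loads(raw)

# Each document describes its own fields, so readers must allow for absent keys.
field_sets = [sorted(document.keys()) for document in documents]
cities = [document.get("address", {}).get("city") for document in documents]

print(field_sets)
print(cities)
```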

3) Unstructured Data

  • Unstructured data is information that either is not organized in a pre-defined manner or does not have a pre-defined data model.
  • Unstructured information is typically text-heavy, but may also contain data such as numbers, dates, and facts.
  • Videos, audio, and binary data files might not have a specific structure. They are referred to as unstructured data.
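
As a rough sketch of what "requires processing" can mean for text-heavy data, the snippet below pulls dates and quantities out of a free-text note with regular expressions. The note and both patterns are assumptions chosen for this example, not a general-purpose extraction method.

```python
import re

# A hypothetical free-text note; nothing about its layout is defined in advance.
note = (
    "Called the customer on 2023-04-18. "
    "She ordered 3 units and asked for delivery by 2023-05-02."
)

# ISO-style dates, then integers that are not part of a date.
dates = re.findall(r"\b\d{4}-\d{2}-\d{2}\b", note)
quantities = [int(n) for n in re.findall(r"(?<![\d-])\d+(?![\d-])", note)]

print(dates)
print(quantities)
```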

Characteristics Of Structured (Relational) and Unstructured (Non-Relational) Data

Relational Data

  • Relational databases provide arguably the most well-understood model for holding data.
  • The simple structure of tables and columns makes them easy to use initially, but the fixed schema can cause problems when the shape of the data changes.
  • We can communicate with relational databases using Structured Query Language (SQL).
  • SQL allows tables to be joined in a few lines of code, with a syntax that most beginners can learn quickly.
  • Examples of relational databases:
    • MySQL
    • PostgreSQL
    • Db2
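
The join mentioned above can be sketched with Python's built-in sqlite3 module. The customers and orders tables and their contents are hypothetical, and the same SQL should work with little change on the databases listed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: each order row points at a customer row through customer_id.
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        total REAL
    );
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.5);
""")

# Join the two tables and total the orders per customer.
totals = conn.execute("""
    SELECT c.name, SUM(o.total)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

print(totals)
```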

Non-Relational Data

  • Non-relational databases let us store data in a format that more closely matches its original structure.
  • A non-relational database is a database that does not use the tabular schema of columns and rows found in most traditional database systems.
  • It uses a storage model that is optimized for the specific requirements of the type of data being stored.
  • In a non-relational database the data may be stored as JSON documents, as simple key/value pairs, or as a graph consisting of edges and vertices.
  • Examples of non-relational databases:
    • Redis
    • JanusGraph
    • MongoDB
    • RabbitMQ (strictly a message broker rather than a database)

Document Data Stores

  • A document data store manages a set of named string fields and object data values in an entity referred to as a document.
  • These data stores generally store data in the form of JSON documents.
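
A toy in-memory version of this idea might look like the sketch below. The DocumentStore class and its put, get, and find methods are invented for illustration and do not correspond to any particular product's API.

```python
import json


class DocumentStore:
    """Toy in-memory document store: document ID -> JSON text (hypothetical API)."""

    def __init__(self):
        self._documents = {}

    def put(self, doc_id, document):
        # Documents are kept as JSON text, as a real store might keep them on disk.
        self._documents[doc_id] = json.dumps(document)

    def get(self, doc_id):
        return json.loads(self._documents[doc_id])

    def find(self, field, value):
        """Return the IDs of documents whose top-level field equals value."""
        return [
            doc_id
            for doc_id, text in self._documents.items()
            if json.loads(text).get(field) == value
        ]


store = DocumentStore()
store.put("c1", {"name": "Asha", "city": "Pune"})
store.put("c2", {"name": "Ben", "city": "Leeds", "tags": ["vip"]})
store.put("c3", {"name": "Chen"})  # no "city" field at all

print(store.find("city", "Leeds"))
print(store.get("c2")["tags"])
```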

Columnar Data Stores

  • A columnar or column-family data store organizes data into rows and columns. The columns are divided into groups known as column families.
  • Each column family consists of a set of columns that are logically related and are generally retrieved or manipulated as a unit.
  • Within a column family, rows can be sparse and new columns can be added dynamically.
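
The layout can be sketched with nested dictionaries. The ColumnFamilyStore class, the family names, and the row keys below are hypothetical; real column-family stores add persistence, ordering, and much more.

```python
from collections import defaultdict


class ColumnFamilyStore:
    """Toy column-family layout: family -> row key -> {column: value}."""

    def __init__(self):
        self._families = defaultdict(dict)

    def put(self, family, row_key, column, value):
        # Columns are created on first write; nothing is declared up front.
        self._families[family].setdefault(row_key, {})[column] = value

    def get_family(self, family, row_key):
        """Fetch all columns of one family for one row as a unit."""
        return dict(self._families[family].get(row_key, {}))


store = ColumnFamilyStore()
store.put("identity", "001", "first_name", "Asha")
store.put("identity", "001", "last_name", "Rao")
store.put("contact", "001", "email", "asha@example.com")
store.put("contact", "002", "phone", "555-0100")  # row 002 has no email: sparse

print(store.get_family("identity", "001"))
print(store.get_family("contact", "002"))
```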

Key/Value Data Stores

  • A key/value store is essentially a large hash table.
  • We associate each data value with a unique key, and the key/value store uses this key to store the data by applying an appropriate hashing function.
  • The hashing function is chosen to provide an even distribution of hashed keys across the data storage.
  • Key/value stores are highly optimized for applications performing simple lookups using the value of the key, or by a range of keys.
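
The hashing step can be sketched as follows. The KeyValueStore class, the choice of SHA-256, and the partition count of four are all assumptions made for this illustration; real stores choose their own hash functions and partitioning schemes.

```python
import hashlib


class KeyValueStore:
    """Toy key/value store that spreads keys over partitions by hashing the key."""

    def __init__(self, partitions=4):
        self._partitions = [{} for _ in range(partitions)]

    def _partition_for(self, key):
        # A stable hash of the key decides which partition holds the value.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self._partitions)

    def put(self, key, value):
        self._partitions[self._partition_for(key)][key] = value

    def get(self, key, default=None):
        return self._partitions[self._partition_for(key)].get(key, default)

    def partition_sizes(self):
        return [len(partition) for partition in self._partitions]


store = KeyValueStore(partitions=4)
for n in range(1000):
    store.put(f"user:{n}", {"visits": n})

sizes = store.partition_sizes()
print(store.get("user:42"))
print(sizes)
```

With a well-mixed hash, the four partition sizes should come out roughly equal, which is the even distribution described above.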

Graph Data Stores

  • A graph data store manages two types of information: nodes and edges.
  • Nodes represent entities, and edges specify the relationships between these entities.
  • The purpose of a graph data store is to allow an application to efficiently perform queries that traverse the network of nodes and edges and to analyze the relationships between entities.
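
A traversal query of this kind can be sketched with an adjacency list and a breadth-first search. The people, the "manages" edge label, and the reachable function are invented for the example.

```python
from collections import defaultdict, deque

# Hypothetical edges: (source node, relationship label, target node).
edges = [
    ("alice", "manages", "bob"),
    ("alice", "manages", "carol"),
    ("bob", "manages", "dave"),
]

# Adjacency list: node -> list of (label, neighbour).
outgoing = defaultdict(list)
for source, label, target in edges:
    outgoing[source].append((label, target))


def reachable(start, label):
    """Breadth-first traversal along edges that carry the given label."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        for edge_label, neighbour in outgoing.get(node, []):
            if edge_label == label and neighbour not in seen:
                seen.add(neighbour)
                order.append(neighbour)
                queue.append(neighbour)
    return order


# Everyone who reports to alice, directly or indirectly.
print(reachable("alice", "manages"))
```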

Time Series Data Stores

  • Time series data is a set of values organized by time, and a time series data store is optimized for this type of data.
  • Time series data stores must support a very large number of writes, as they generally collect large amounts of data in real-time from a huge number of sources.
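
As a minimal sketch, a time series can be kept as a list sorted by timestamp, so that appends are cheap and a time-range query becomes two binary searches. The TimeSeries class and the sample readings are hypothetical.

```python
import bisect


class TimeSeries:
    """Toy time series: values kept in timestamp order for fast range queries."""

    def __init__(self):
        self._timestamps = []
        self._values = []

    def append(self, timestamp, value):
        # Readings usually arrive in order, so this is normally an append at the end;
        # bisect keeps the series sorted if one arrives late.
        index = bisect.bisect_right(self._timestamps, timestamp)
        self._timestamps.insert(index, timestamp)
        self._values.insert(index, value)

    def range(self, start, end):
        """Return (timestamp, value) pairs with start <= timestamp <= end."""
        low = bisect.bisect_left(self._timestamps, start)
        high = bisect.bisect_right(self._timestamps, end)
        return list(zip(self._timestamps[low:high], self._values[low:high]))


series = TimeSeries()
series.append(10, 20.5)
series.append(30, 21.0)
series.append(20, 20.8)  # late arrival, still stored in time order

print(series.range(15, 30))
```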

Object Data Stores

  • Object data stores are optimized for storing and retrieving large binary objects or blobs such as audio and video streams, images, text files, large application documents and data objects, and virtual machine disk images.
  • An object consists of the stored data, some metadata, and a unique ID for accessing the object.
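
The three parts of an object can be sketched as below. The StoredObject and ObjectStore names, the use of a random UUID as the ID, and the metadata keys are assumptions for illustration only.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    object_id: str                                  # unique ID used to fetch the object
    data: bytes                                     # the stored blob itself
    metadata: dict = field(default_factory=dict)    # descriptive key/value pairs


class ObjectStore:
    """Toy in-memory object store (hypothetical API)."""

    def __init__(self):
        self._objects = {}

    def put(self, data, **metadata):
        object_id = str(uuid.uuid4())
        # Record the size alongside whatever metadata the caller supplied.
        stored = StoredObject(object_id, data, dict(metadata, size=len(data)))
        self._objects[object_id] = stored
        return object_id

    def get(self, object_id):
        return self._objects[object_id]


store = ObjectStore()
png_header = b"\x89PNG\r\n\x1a\n"
object_id = store.put(png_header, content_type="image/png")
fetched = store.get(object_id)

print(fetched.metadata)
```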

External Index Data Stores

  • External index data stores provide the ability to search for information held in other data stores and services.
  • An external index acts as a secondary index for any data store. It can provide real-time access to indexes and can be used to index massive volumes of data.
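
One simple form of secondary index is an inverted index that maps each word to the IDs of the documents containing it. In this sketch the documents dictionary stands in for a separate primary data store, and the search function is hypothetical.

```python
import re
from collections import defaultdict

# Stand-in for the primary data store: document ID -> text.
documents = {
    "doc1": "Structured data lives in tables",
    "doc2": "Unstructured data includes video and audio",
    "doc3": "Tables have rows and columns",
}

# The external index: word -> set of IDs of the documents that contain it.
index = defaultdict(set)
for doc_id, text in documents.items():
    for word in re.findall(r"[a-z]+", text.lower()):
        index[word].add(doc_id)


def search(*words):
    """Return IDs of documents that contain every given word."""
    matches = [index.get(word.lower(), set()) for word in words]
    return sorted(set.intersection(*matches)) if matches else []


print(search("tables"))
print(search("data", "tables"))
```

The search never reads the documents themselves; it consults only the index and returns IDs that can then be looked up in the primary store.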

Structured Vs Unstructured Data

1) Defined Vs Undefined Data

  • Structured data is clearly defined data that follows a set structure.
  • Structured data lives in columns and rows, and it can be mapped into pre-defined fields.
  • Unstructured data does not have a predefined data format.

2) Quantitative Vs Qualitative Data

  • Structured data is generally quantitative: it usually consists of hard numbers or things that can be counted.
  • Methods for analyzing it include classification, regression, and clustering.
  • Unstructured data is generally categorized as qualitative data, and cannot easily be analyzed and processed using conventional tools and methods.
  • Understanding qualitative data requires advanced analytics techniques like data stacking and data mining.

3) Storage In Data Warehouses Vs Data Lakes

  • Structured data is generally stored in data warehouses.
  • Unstructured data is generally stored in data lakes.
  • Unstructured data generally requires more storage space than structured data.

4) Ease Of Analysis

  • Structured data is easy to search, both for algorithms and for humans.
  • Unstructured data is more difficult to search and requires processing to become understandable.

Next Task For You

You can learn more about data concepts, why you should learn them, job opportunities, and what to study to clear the [DP-900] Microsoft Azure Data Fundamentals Certification by registering for our FREE CLASS.

The post Structured Data Vs Unstructured Data Vs Semi-Structured Data appeared first on Cloud Training Program.
