To understand the different types of data, along with their similarities and differences, we look at them from the perspective of schema. Whether a schema is present, and how fixed it is, lets us classify how usable and operable the data is. Data with no schema is of very limited use. Data with a static, standardized schema supports the widest range of uses. Below is a breakdown of unstructured, semi-structured, and structured data with respect to schema.
1. Unstructured = NO SCHEMA
Sources:
-Database Log Programs
-Application Log Programs
-Database Extracts
-Application Data Extracts
-Text/Messages
-RFID scanners
-Mobile Devices
Types:
-CSV Files
-Flat Files
-Log Files
-Scanner Log Files
-Pictures
-Videos
-Sounds
Example:
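A minimal sketch of what "no schema" means in practice, assuming a few hypothetical application log lines (the log format, the regular expression, and the `parse_line` helper are all illustrative, not taken from any real system). Nothing in the data declares its fields or types, so the reader has to guess a structure, and the guess only fits some of the lines.

```python
import re

# Hypothetical log lines; the format is an assumption for illustration.
RAW_LOG = """\
2024-01-15 10:32:07 ERROR payment failed for order 1182
user jsmith logged in from 10.0.0.5
WARN: disk usage at 91%
"""

# The data carries no field names or types, so structure must be
# imposed from outside. This guessed pattern fits only some lines.
TIMESTAMPED = re.compile(r"^(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2}) (\w+) (.*)$")


def parse_line(line):
    """Return a dict for lines matching the guessed pattern, else None."""
    match = TIMESTAMPED.match(line)
    if match is None:
        return None
    date, time, level, message = match.groups()
    return {"date": date, "time": time, "level": level, "message": message}


parsed = [parse_line(line) for line in RAW_LOG.splitlines()]

for line, result in zip(RAW_LOG.splitlines(), parsed):
    print("parsed" if result else "no match", "|", line)
```

Only the first line fits the guessed pattern. The other two come back as `None`, which is the limited usability the section describes.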
2. Semi-Structured = VARIABLE SCHEMA
Sources:
-Document Store Databases
-Graph Databases
-Web Data
-Social Network Data
-Demographic/Socio-Economic Data
-Web Crawler Extracts
Types:
-JSON
-XML
Example:
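A minimal sketch of a variable schema, assuming a small hypothetical JSON document of customer records (the field names and values are made up for illustration). Each record describes itself through its keys, but the set of keys differs from record to record.

```python
import json

# Hypothetical records; names and values are assumptions for illustration.
RAW = """[
  {"id": 1, "name": "Ada", "email": "ada@example.com"},
  {"id": 2, "name": "Lin", "phones": ["555-0100", "555-0101"]},
  {"id": 3, "name": "Sam", "address": {"city": "Oslo"}}
]"""

records = json.loads(RAW)

# Keys shared by every record versus keys that appear in at least one.
common_fields = set.intersection(*(set(r) for r in records))
all_fields = set.union(*(set(r) for r in records))

print("fields in every record:", sorted(common_fields))
print("fields in some records:", sorted(all_fields - common_fields))
```

The records can be parsed and queried by key without any external description, but a consumer cannot assume that a given field exists, or that it is flat rather than nested.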
3. Structured = STATIC SCHEMA
Sources:
-RDBMS Databases
-RDBMS Table Extracts
-Transactional System Data Extract
-CRM System Extract
Types:
-Structured Data Files (SDF)
-Excel Files
Example:
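A minimal sketch of a static schema, assuming a hypothetical `customer` table in an in-memory SQLite database (the table and column names are illustrative). The schema is declared once, up front, and every row must conform to it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The schema is fixed before any data is stored.
conn.execute(
    "CREATE TABLE customer ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " email TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO customer (id, name, email) VALUES (?, ?, ?)",
    [(1, "Ada", "ada@example.com"), (2, "Lin", "lin@example.com")],
)

# A row that uses a column outside the declared schema is rejected.
try:
    conn.execute(
        "INSERT INTO customer (id, name, phone) VALUES (3, 'Sam', '555-0100')"
    )
    rejected = False
except sqlite3.OperationalError:
    rejected = True

# The schema itself can be queried, independent of the data.
columns = [row[1] for row in conn.execute("PRAGMA table_info(customer)")]
row_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]

print("columns:", columns)
print("rows stored:", row_count)
print("non-conforming row rejected:", rejected)
```

Because every row has the same columns and types, any consumer can rely on the structure without inspecting individual records.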