Data Analysis and Data Analytics
Data Analysis VS. Data Analytics
Dr. Abdulrahman M. Aljamouss
Assistant Professor, Business Development
Data is everywhere, yet we often hear:
- I can’t find the data I need – the data is scattered across many versions that differ from one another.
- I can’t understand the data – the data is not well documented.
- I can’t use the data I found – the data needs to be transformed from one form to another.
Definition and history of data analysis
- Data analysis is a process of inspecting, cleansing, transforming and modeling data with the goal of discovering useful information to support decision-making.
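The four steps named in this definition can be sketched in a few lines of Python. This is only an illustrative sketch: the sample records and the function names (`inspect`, `cleanse`, `transform`, `model`) are assumptions made for the example, and a simple average stands in for a real model.

```python
from statistics import mean

# Hypothetical raw records: monthly sales stored as text, with two unusable values.
raw = [
    {"month": "Jan", "sales": "120"},
    {"month": "Feb", "sales": ""},
    {"month": "Mar", "sales": "150"},
    {"month": "Apr", "sales": "n/a"},
    {"month": "May", "sales": "180"},
]

def inspect(records):
    """Inspecting: count the records whose sales field is not a whole number."""
    return sum(1 for r in records if not r["sales"].isdigit())

def cleanse(records):
    """Cleansing: drop the records whose sales field is not a whole number."""
    return [r for r in records if r["sales"].isdigit()]

def transform(records):
    """Transforming: convert the sales field from text to an integer."""
    return [{"month": r["month"], "sales": int(r["sales"])} for r in records]

def model(records):
    """Modeling: summarize with an average (a stand-in for a real model)."""
    return mean(r["sales"] for r in records)

bad_rows = inspect(raw)
clean = transform(cleanse(raw))
average_sales = model(clean)
print(f"unusable rows: {bad_rows}, usable rows: {len(clean)}, average: {average_sales}")
```

The average is the kind of "useful information" the definition refers to: a single figure that a decision-maker can act on, produced only after the unusable rows were found and removed.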
Data VS. information VS. knowledge
What’s Big Data?
Big Data statistics: McKinsey projected that by 2018 the U.S. would face a shortage of 1.5 million managers and analysts with Big Data know-how [McKinsey]. The challenges of Big Data include capture, processing, storage, search, sharing, transfer, analysis, and visualization.
3 V’s: Volume, Velocity, Variety
Some make it 4 V’s
5 V’s of Big Data
- Raw Data: Volume.
- Change over time: Velocity.
- Data types: Variety.
- Data Quality: Veracity.
- Information for Decision Making: Value.
Difference between data analysis and data analytics
- Data Analysis looks backwards over time, providing us with a historical view of what has happened.
- Data Analytics looks forward, modeling the future or predicting a result.
- Data Analytics is a more detailed business practice. It starts with identifying which data to analyze and collecting it, then organizing that data into suitable data sets and choosing appropriate algorithms and statistical techniques. It also involves a certain amount of data cleaning, so that the analysis is reliable enough to base decisions on. Once the data is in a useful form, a Machine Learning algorithm or statistical process is applied to it to derive insights, typically by comparing different data sets to answer the problem the data is being used to solve. Finally, the data analyst presents the results in a form that the business can understand and benefit from.
Alternative name: Knowledge Discovery in Databases (KDD), the extraction of interesting knowledge (important, implicit, previously unknown and potentially useful) from huge amounts of data.
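The backward-looking versus forward-looking contrast in this section can be made concrete with a small Python sketch. The quarterly sales figures and the function names are hypothetical, and a least-squares straight line stands in for whatever Machine Learning or statistical technique a real project would choose.

```python
from statistics import mean

# Hypothetical sales for quarters 1 to 4.
history = [100, 110, 120, 130]

def describe(values):
    """Data analysis: a historical view of what has already happened."""
    return {"mean": mean(values), "min": min(values), "max": max(values)}

def fit_line(values):
    """Fit y = intercept + slope * x by least squares, with x = 1, 2, 3, ..."""
    xs = range(1, len(values) + 1)
    x_bar = mean(xs)
    y_bar = mean(values)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return slope, intercept

def forecast(values, steps_ahead=1):
    """Data analytics: extend the fitted trend to predict a future period."""
    slope, intercept = fit_line(values)
    return intercept + slope * (len(values) + steps_ahead)

summary = describe(history)
next_quarter = forecast(history)
print("historical summary:", summary)
print("forecast for quarter 5:", next_quarter)
```

With these figures the summary reports a mean of 115, and the fitted trend rises by 10 per quarter, so the forecast for quarter 5 is 140. The first number describes the past; the second is a prediction that can turn out to be wrong.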
The impact of vast volumes of available data
Considering that people are generating more than 2.5 quintillion (2.5 × 10^18) bytes of data every day, it is hardly a surprise that ‘data’ is the term that is on everyone’s lips.
According to the International Data Corporation (IDC), this data will grow ten-fold and is expected to cross 44 zettabytes by 2020. (The unit prefixes, in ascending order, are kilo, mega, giga, tera, peta, exa, zetta, yotta.) The IDC report also found that the share of useful data will increase from 22% in 2013 to 35% by 2020.
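To get a feel for these magnitudes, the figures above can be converted between units with a short Python sketch. It assumes decimal (SI) prefixes, where each step multiplies by 1000, and a 365-day year. The helper `bytes_in` is a name introduced only for this example, and the baseline is simply what "ten-fold growth to 44 zettabytes" implies, not a figure quoted from the report.

```python
# Prefixes in ascending order, as listed in the slide.
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]

def bytes_in(prefix):
    """Number of bytes in one unit, assuming decimal steps of 1000."""
    return 1000 ** (PREFIXES.index(prefix) + 1)

daily_bytes = 2.5e18  # 2.5 quintillion bytes generated per day

# 2.5 quintillion bytes is 2.5 exabytes.
daily_exabytes = daily_bytes / bytes_in("exa")

# One year at that rate, assuming 365 days, is a little under 1 zettabyte.
yearly_zettabytes = daily_bytes * 365 / bytes_in("zetta")

# Ten-fold growth ending at 44 zettabytes implies a starting point of 4.4.
implied_baseline_zb = 44 / 10

# 35% of 44 zettabytes is the volume of useful data projected for 2020.
useful_2020_zb = 0.35 * 44

print(f"daily volume: {daily_exabytes:.1f} exabytes")
print(f"yearly volume: {yearly_zettabytes:.4f} zettabytes")
print(f"implied baseline: {implied_baseline_zb:.1f} zettabytes")
print(f"useful data in 2020: {useful_2020_zb:.1f} zettabytes")
```

Seen this way, the daily rate quoted above adds up to roughly 0.9 zettabytes per year, which helps place the 44-zettabyte projection on the same scale.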