Data is arranged so that columns represent the variables and rows represent the cases or observations.
- When we look at data, data is broadly classified into two categories: categorical data and numerical data, the latter being quantitative in nature.
- So, when we look at categorical data, these variables are also called qualitative variables.
Categorical data identify group membership. What do we mean by group membership?
- Again we go back to our student data; let us look at gender.
- Gender is a categorical variable.
- I have two categories here. I can classify any observation into one of these two categories.
- So, it is a group membership. Similarly, when I look at board, I have State Board, I have ICSE, I have CBSE. So, again, this categorical variable has three categories, and any observation can be placed in one of these three groups.
- So we are assigning each observation membership of a particular group within that particular variable.
- So, a categorical variable has groups.
- Let us go to the hospital data.
- Look at blood group: every patient is either O positive, O negative, B positive, A positive or A negative.
- So, you can see that there are many blood groups; again, this is a categorical variable, just as gender is a categorical variable.
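The idea of group membership above can be sketched in a few lines of Python. This is only an illustration: the student records, the field names and the helper `group_counts` are hypothetical and not taken from the lecture's data set.

```python
from collections import Counter

# Hypothetical student records; all values are invented for illustration.
students = [
    {"name": "A", "gender": "F", "board": "CBSE"},
    {"name": "B", "gender": "M", "board": "ICSE"},
    {"name": "C", "gender": "F", "board": "State Board"},
    {"name": "D", "gender": "M", "board": "CBSE"},
]

def group_counts(records, variable):
    """Count how many observations fall into each category of a variable."""
    return Counter(record[variable] for record in records)

board_counts = group_counts(students, "board")
gender_counts = group_counts(students, "gender")

# Every observation belongs to exactly one group of a categorical variable,
# so the group counts add up to the number of rows.
assert sum(board_counts.values()) == len(students)

print(sorted(board_counts))  # ['CBSE', 'ICSE', 'State Board']
print(dict(gender_counts))
```

Counting is about the only arithmetic that makes sense here: the categories themselves are labels, not quantities.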
Numerical Data
- When we have numerical data, the variables are also called quantitative variables.
- Here I can talk about numerical properties of data.
- We need to understand the scale that defines the numerical data.
- Again, we have already emphasized the point that numerical data take units.
- Within numerical data, I could have discrete data and I could have continuous data.
- We need to ensure that the variable is measured across all observations and shares a common unit.
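A minimal sketch of these points, assuming two hypothetical variables: a sibling count (discrete) and a height measurement recorded in mixed units (continuous). The names `to_common_unit` and `looks_discrete` are illustrative helpers, and the whole-number check is only a rough heuristic, not a formal definition of discreteness.

```python
from statistics import mean

# Hypothetical observations; names, values and units are invented for illustration.
siblings = [0, 2, 1, 3, 1]                              # counts: discrete
heights = [(160.0, "cm"), (1.72, "m"), (155.5, "cm")]   # measurements: continuous

# Conversion factors to centimetres (the common unit chosen for this sketch).
TO_CM = {"cm": 1.0, "m": 100.0}

def to_common_unit(measurements):
    """Convert (value, unit) pairs to centimetres so all observations share one unit."""
    return [value * TO_CM[unit] for value, unit in measurements]

def looks_discrete(values):
    """Rough heuristic: whole-number values suggest a count (discrete) variable."""
    return all(float(v).is_integer() for v in values)

heights_cm = to_common_unit(heights)

print(looks_discrete(siblings))    # True
print(looks_discrete(heights_cm))  # False
print(mean(siblings))              # numerical properties such as the mean are meaningful
```

Averaging the heights only makes sense after the conversion, which is why the common unit is fixed before any numerical property is computed.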
Time Series Data and Cross-Sectional Data
- Apart from categorical data and numerical data, we also have data which are referred to as time series data.
- In time series data, the data on a particular variable (this could be the quantity of potato procured) is obtained at successive points in time; the variable is the same.
- That is what we refer to as time series data, whereas cross-sectional data is data which is observed at the same point in time.
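The distinction can be sketched with a small set of procurement records. The markets, periods and quantities below are hypothetical, as are the helper names `time_series` and `cross_section`.

```python
# Hypothetical potato procurement records: (period, market, quantity in tonnes).
# All values are invented for illustration.
records = [
    ("2020-01", "Market A", 120),
    ("2020-02", "Market A", 135),
    ("2020-03", "Market A", 128),
    ("2020-01", "Market B", 90),
    ("2020-01", "Market C", 150),
]

def time_series(rows, market):
    """Same variable, same market, observed over successive periods."""
    return sorted((period, qty) for period, m, qty in rows if m == market)

def cross_section(rows, period):
    """Same variable, observed across markets at one point in time."""
    return {m: qty for p, m, qty in rows if p == period}

print(time_series(records, "Market A"))
print(cross_section(records, "2020-01"))
```

In both cases the variable (quantity procured) stays the same; what changes is whether we vary time or vary the units being observed.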
Summary
- So, to broadly classify: given data, we classify them as categorical or numerical.
- So, whenever we are presented with a data set, we should be able to classify each variable in the data set as either a categorical variable or a numerical variable.
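As a rough sketch of this classification step, the snippet below labels each column of a small data set by inspecting its values. The hospital columns and the function `classify_variable` are hypothetical, and the rule used (numbers are numerical, everything else is categorical) is an assumption that will misjudge numeric codes such as patient IDs, so the result still needs a human check.

```python
from numbers import Number

def classify_variable(values):
    """Label a column 'numerical' if every value is a number, else 'categorical'."""
    is_numeric = all(
        isinstance(v, Number) and not isinstance(v, bool) for v in values
    )
    return "numerical" if is_numeric else "categorical"

# Hypothetical hospital data set stored column-wise; values invented for illustration.
hospital = {
    "gender": ["F", "M", "M"],
    "blood_group": ["O+", "A-", "B+"],
    "age_years": [34, 51, 27],
    "weight_kg": [61.5, 78.0, 70.2],
}

# Columns are variables, so each column gets exactly one label.
labels = {name: classify_variable(column) for name, column in hospital.items()}

for name, label in labels.items():
    print(f"{name}: {label}")
```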