
I keep running into the problem of how to handle data from a Likert scale, partly because almost all the students I consult with collect their data from questionnaires with items like 1 - strongly disagree, 2, 3, 4, up to 5 - strongly agree. This kind of question could not be more common in survey research. But if we treat this data as interval or ratio type, there are some potential problems.
1. Many statistical techniques require the data to follow a normal distribution, or at least to be asymptotically normal. It is really hard to ask Likert scale data to follow a normal distribution, for at least two reasons from my experience. First, respondents' answers often show floor or ceiling effects, which means you rarely see as many people choosing 5 as choosing 1. In other words, if most people choose 5, then probably the fewest people choose 1. People's answers tend to lean the same way, so the distribution will not come out bell-shaped or even symmetrical. The other reason is that the normal distribution was designed for continuous data. How can you estimate a normal curve from just three or five discrete vertical bars? Ridiculous.
2. The interval between any two points may not make sense. I agree with this objection when the ordinal data is 1st place, 2nd place, or 3rd place in a 100-meter run. The distance between 1 and 2 makes no sense there (who can tell me whether 1.3th place wins the gold medal or the silver?). I also admit that data from a Likert scale is ordinal in nature. However, I think the distance between two points on it does mean something, more or less. For example, if the mean of two people's answers is 1.5, at least we cannot conclude that they have a positive attitude, right? I have read some articles discussing this. My suggestion to the students who come to see me is: treat it as interval type if you are using a Likert scale with at least five points. But be aware that statisticians are still debating and arguing about this. If you show my article to another statistician, they may say what I wrote is bullshit.
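To make the ceiling effect from the first point concrete, here is a minimal sketch in Python. The response counts are made up for illustration (they are not from any real survey), and `skewness` is just a small helper I define here. It prints a rough bar chart and a skewness value; a symmetric, bell-shaped sample would have skewness close to 0.

```python
from collections import Counter
from statistics import mean, pstdev

# Hypothetical answers to one 5-point item, with a ceiling effect:
# most people pick 4 or 5, very few pick 1. (Made-up data.)
responses = [5] * 45 + [4] * 30 + [3] * 15 + [2] * 7 + [1] * 3


def skewness(xs):
    """Population skewness: the average cubed z-score."""
    m = mean(xs)
    s = pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)


counts = Counter(responses)

# A crude text histogram: five discrete bars, nothing like a smooth curve.
for point in range(1, 6):
    print(point, "#" * counts[point])

skew = skewness(responses)
print("skewness:", round(skew, 2))
```

With answers piled up at the top of the scale, the skewness comes out negative, which is exactly the lopsided shape described above.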
Does it really matter whether the data is ordinal or interval? The answer is definitely YES. Each type of measurement has its own appropriate statistical techniques. Take gender, for example: we cannot take its mean score. (Whose gender is half and half?? I guess there are some!) In the same way, if we treat Likert data as strictly ordinal, we cannot use the t-test, ANOVA, mean scores, and many other techniques; only the mode, the median, and frequencies can be applied.
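As a rough illustration of what is still allowed under the strict ordinal view, the sketch below computes frequencies, the median, and the mode with Python's standard library. The answers are hypothetical. I use `median_low` rather than `median` as a deliberate choice, so that the result is always one of the actual scale points and never an in-between value like 3.5.

```python
from collections import Counter
from statistics import median_low, mode

# Hypothetical answers from 12 respondents to one 5-point item (made-up data).
answers = [1, 2, 2, 3, 4, 4, 4, 5, 5, 5, 5, 5]

freq = Counter(answers)          # how many people chose each point
med = median_low(answers)        # middle answer, always a real scale point
most_common = mode(answers)      # the single most frequent answer

for point in sorted(freq):
    print(f"{point}: {freq[point]} respondents")
print("median:", med)
print("mode:", most_common)
```

None of these summaries needs the distance between 2 and 3 to equal the distance between 4 and 5, which is why they are safe for ordinal data.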
Oh, that is too sad. Aren't there any exceptions? Yes, there are, thanks to the Central Limit Theorem. Because the sample mean is approximately normally distributed when the sample is large, we can largely set aside the normality problem I described above, IF the sample size is LARGE.
Sounds good? Unfortunately, how large is large? The empirical cutoff is 100. But what if your sample size is 90? HAHA. Yeah, this is statistics. The only way forward is to judge for yourself.
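If you want to judge for yourself, one option is a small simulation. The sketch below is only an illustration under assumed numbers: the answer weights are hypothetical, and the sample sizes 5 and 100 and the 2000 repetitions are arbitrary choices of mine. It draws many samples from a skewed 5-point distribution and measures how skewed the distribution of the sample means is at each sample size.

```python
import random
from statistics import mean, pstdev

POINTS = [1, 2, 3, 4, 5]
WEIGHTS = [3, 7, 15, 30, 45]  # hypothetical ceiling-heavy answer pattern


def skewness(xs):
    """Population skewness: the average cubed z-score."""
    m = mean(xs)
    s = pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)


def skew_of_sample_means(n, reps=2000, seed=0):
    """Draw `reps` samples of size `n` and return the skewness of their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = [mean(rng.choices(POINTS, weights=WEIGHTS, k=n)) for _ in range(reps)]
    return skewness(means)


skew_small = skew_of_sample_means(n=5)
skew_large = skew_of_sample_means(n=100)

print("skewness of sample means, n=5:  ", round(skew_small, 2))
print("skewness of sample means, n=100:", round(skew_large, 2))
```

The skewness of the sample means should sit much closer to 0 at n=100 than at n=5. Swapping in your own sample size (say, 90) and your own answer pattern gives a feel for whether "large" is large enough in your case.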
Yes, it is true. Your case may always be imperfect: life is not always ideal, and the statistical methods we use are not always the most appropriate ones.
Statistics, I love it and I hate it.

