Probably the most used and most talked about graph in any statistics class, a histogram contains a huge amount of information if you can learn how to look for it. While it is possible to go into great detail about the different shapes you may encounter or where the mean and median will “end up”, this article will only focus on reading the information the histogram is giving you.
The general idea behind a histogram is to divide the range of the data into groups of equal width, which lets us see the patterns in the data instead of the detailed information we would get from what is basically a list of numbers.
In the histogram of salaries above, those groups are 24-32, 32-40, 40-48, etc. Once the groups have been chosen, the frequency of each group is determined. The frequency is simply the number of data values that are in each group.
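As a rough sketch of how frequencies are counted, the short Python example below tallies a small list of salaries into the first three groups named above. The salary values are invented for illustration and are not the data behind the histogram; only the group edges (24, 32, 40, 48) come from the article.

```python
# Hypothetical salaries in thousands of dollars, invented for this sketch.
salaries = [25, 27, 30, 33, 36, 38, 41, 45]

# Group edges taken from the article: 24-32, 32-40, 40-48.
edges = [24, 32, 40, 48]

frequencies = {}
for low, high in zip(edges, edges[1:]):
    # Frequency = number of data values that fall in the group.
    # Each group includes its lower edge and excludes its upper edge.
    frequencies[f"{low}-{high}"] = sum(1 for s in salaries if low <= s < high)

print(frequencies)  # {'24-32': 3, '32-40': 3, '40-48': 2}
```

Each count is the height of the corresponding bar if these values were drawn as a histogram.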
Let’s look at the very first group 24-32. The bar goes up to 7, meaning that this group has a frequency of 7. This tells us that there are seven data values (if we had the list of all the salaries) that are between 24 and 32 thousand. In other words, seven people in this group made between $24,000 and $32,000.
Very important: this group does not include the 32. There are seven data values between 24 and 32 thousand, not including 32 thousand. Keeping this in mind and reading from the next group, there are six data values from 32 thousand up to (but not including) 40 thousand. Again, this means that six of the people in this group had a salary from $32,000 up to, but not including, $40,000 a year. (Anyone making exactly $40,000 is in the next group.)
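The boundary rule can be sketched in a few lines of Python. The function name `group_for` and its default arguments are hypothetical, chosen to match the groups in this article (starting at 24, each 8 wide). It assumes the convention described here, where a group includes its lower edge and excludes its upper edge; some textbooks and software use a different convention.

```python
def group_for(salary, start=24, width=8):
    """Return the (low, high) group that holds a salary given in thousands.

    The group includes `low` and excludes `high`.
    """
    index = int((salary - start) // width)  # how many full widths above start
    low = start + index * width
    return (low, low + width)

print(group_for(31.999))  # (24, 32)
print(group_for(32))      # (32, 40)
print(group_for(40))      # (40, 48)
```

A salary of exactly 32 thousand lands in the 32-40 group, and a salary of exactly 40 thousand lands in the 40-48 group.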
Be careful about drawing more detailed conclusions. While I can say that most people in this group made less than $50,000 (that is where the tallest bars are), I can’t use this graph to say how many people made EXACTLY $35,000 or how many made EXACTLY $25,000. In a histogram we “lose” the information about individual data values when we group the data. If we didn’t want to lose that information, we might choose a dotplot or a stemplot to display the data instead.
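To see the lost information concretely, here is a small Python sketch with two invented salary lists (in thousands). Both lists are hypothetical and the helper name `frequencies` is only for illustration. They produce identical group frequencies, so their histograms would look the same, yet they disagree on how many people made exactly $35,000.

```python
def frequencies(values, edges):
    """Count values in each group; a group includes its lower edge only."""
    return [sum(1 for v in values if lo <= v < hi)
            for lo, hi in zip(edges, edges[1:])]

edges = [24, 32, 40, 48]

# Two hypothetical data sets, invented for this sketch.
group_a = [25, 25, 25, 35, 35, 41]
group_b = [24, 28, 31, 32, 39, 47]

print(frequencies(group_a, edges))  # [3, 2, 1]
print(frequencies(group_b, edges))  # [3, 2, 1]

# Same bars, different individual values.
print(group_a.count(35), group_b.count(35))  # 2 0
```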