The choice of intervals seems to dominate everything else in this graph, and it’s completely arbitrary. Can’t you at least make the graph in a systematic way, e.g., with a log-scale on the X-axis and intervals of equal size in log-space?
I’m following the data source precisely. This is the way the Census Bureau breaks it down. Without the raw numbers, there’s no way to break it down any differently.
Even so, I think this graph tells a very interesting story.
You could use interpolation to resample the data. Maybe I will try that and post the result. I don’t have a good intuition for what it will look like.
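For anyone who wants to try the resampling idea, here is a minimal sketch of one way it might be done, not a definitive method. It assumes incomes are spread evenly within each published bracket (a strong assumption, especially for wide brackets), builds the cumulative count at each bracket edge, and linearly interpolates that cumulative curve at new edges. The function names and the bracket numbers are made up for illustration; they are not Census Bureau figures.

```python
import math


def log_spaced_edges(lo, hi, n_bins):
    """Return n_bins + 1 edges equally spaced in log10 between lo and hi.

    lo must be positive, since a log scale cannot include zero.
    """
    step = (math.log10(hi) - math.log10(lo)) / n_bins
    return [10 ** (math.log10(lo) + i * step) for i in range(n_bins + 1)]


def interp(x, xs, ys):
    """Piecewise-linear interpolation of (xs, ys) at x, clamped at both ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(1, len(xs)):
        if x <= xs[i]:
            t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + t * (ys[i] - ys[i - 1])


def rebin(edges, counts, new_edges):
    """Redistribute binned counts onto new_edges via the cumulative curve."""
    # Cumulative count at each original edge, starting from zero.
    cum = [0.0]
    for c in counts:
        cum.append(cum[-1] + c)
    # Interpolate the cumulative curve at the new edges, then difference it.
    new_cum = [interp(e, edges, cum) for e in new_edges]
    return [b - a for a, b in zip(new_cum, new_cum[1:])]


# Hypothetical brackets and counts, purely for illustration.
edges = [0, 5_000, 10_000, 15_000, 20_000]
counts = [10, 20, 30, 40]

print(rebin(edges, counts, [0, 10_000, 20_000]))
print(rebin(edges, counts, log_spaced_edges(1_000, 20_000, 4)))
```

Interpolating the cumulative counts rather than the bar heights keeps the total inside the covered range unchanged, which is why I would start there. The even-spread assumption is the weak point, and an open-ended top bracket would need separate handling.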
You’ve got me thinking. I try to present data objectively and accurately. Implicit in that is not letting the data “lie”. And you’re right that this might not present the data most “accurately”, depending on how you define the term.
As it is, this graph implies that the difference between $5,000 a year and $10,000 a year is the same as the difference between $190,000 and $195,000, and I doubt anyone in those ranges would agree.
On the other hand, a log scale would imply that the difference between $1,000 and $10,000 is the same as that between $10,000 and $100,000 — again, arguably inaccurate.
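To put numbers on those two comparisons, here is a quick sketch that measures the same income pairs on each scale. The helper names are my own; the dollar figures are the ones mentioned above.

```python
import math


def linear_gap(a, b):
    """Distance between a and b on a linear axis, in dollars."""
    return b - a


def log_gap(a, b):
    """Distance between a and b on a log10 axis, in decades."""
    return math.log10(b) - math.log10(a)


pairs = [
    (5_000, 10_000),
    (190_000, 195_000),
    (1_000, 10_000),
    (10_000, 100_000),
]

for a, b in pairs:
    print(f"{a:>7,} -> {b:>7,}: linear {linear_gap(a, b):>6,}, log10 {log_gap(a, b):.3f}")
```

On the linear axis the first two pairs are the same distance apart, while on the log axis the last two pairs are. Neither matches intuition everywhere, which is the point being made here.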
I’d say there is no single, honest way to present this (these) data, other than maybe presenting both a linear and a log scale graph.
Good point on the interpolation, too – I might give that a shot myself.
I agree there’s no clearly right answer. The log scale makes more sense at the high end but breaks down at the low end. Linear is OK at the low/middle but breaks down badly at the high end. I am curious to see what it looks like on a log scale, but I won’t claim it’s not going to have its own issues.