|
Just a note on the data presentation :) Your analysis shows why it's kind of a shame GP had to 'stripe' the years rather than use another dimension. The striping is such that, going from the long-row closest to us to the long-row farthest from us, the rows run clinic1-yr1, clinic1-yr2, clinic1-yr3, clinic1-yr4, clinic1-yr5, clinic1-yr6, clinic2-yr1, clinic2-yr2, etc., i.e. the years go 1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6 and so on. That means the rows don't actually form a data dimension: rather, we are looking at four independent graphs placed one after the other without spaces. (The first graph is rows 1-6 with the row dimension being the year, the second graph is rows 7-12 with the year reset, etc.) See the note for another way to see this. It ought to be possible to see at a glance what happened at the other clinics in yr1, yr2, yr3, yr4, yr5, and yr6, but at the moment the only way to do this is to read, for yr1, long-rows 1, 7, 13, and 19, which is not obvious: you have to count to know what is what. It would help if the colors corresponded (row 1 and row 7 had the same color, so that color forms the year dimension); then you could look at the image one color at a time and see if anything sticks out. For example, if you wanted to see what happened in year 4 or 5 (as you mention), you could look at all of the greens and blacks. (This means that to identify a specific clinic-year you have to go by row number rather than color, but that seems OK to me - nobody is going to consult a legend of 24 colors anyway.) As it stands, it's a chore to count out which rows are year 4 and 5 for the other clinics, and we lose this very important dimension visually. (Quick: tell me the tendency among all clinics as you move from year lavender, to red, to yellow, to green, to black, to peach in clinic 1.) However, I don't think that Excel would have let you define colors in a custom way like this.
-- NOTE: You can tell that the rows don't form a data dimension because it would be a mistake to connect all the points into one surface, like a topographic map (like this: https://alastaira.files.wordpress.com/2011/04/image31.png ). If you did that, you would need a break between rows 6 and 7, between 12 and 13, and between 18 and 19, because the slope between these specific rows is meaningless. On the other hand, if you DID have four such broken graphs next to each other, and each graph repeated the same color order, it would be easier to compare years. To make the color dimension even easier to read, the colors could move predictably along the color scale (e.g. roygbi - red, orange, yellow, green, blue, indigo...). |
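For what it's worth, the row-to-color scheme described above can be sketched in a few lines. This is only an illustration under the assumptions in this comment (four clinics, six years each, 24 long-rows, rows numbered from 1); the names `clinic_year`, `row_color`, `rows_for_year`, and `clinic_boundaries` and the palette are made up for the sketch, not anything from GP's chart or from Excel.

```python
# Assumptions (from the striping described above, not from the real data):
YEARS_PER_CLINIC = 6  # six years per clinic
CLINICS = 4           # four clinics, so 24 long-rows in total

# Hypothetical palette: one color per year, in rough spectrum order (roygbi).
YEAR_COLORS = ["red", "orange", "yellow", "green", "blue", "indigo"]


def clinic_year(row):
    """Map a 1-based long-row number to a 1-based (clinic, year) pair."""
    clinic, year = divmod(row - 1, YEARS_PER_CLINIC)
    return clinic + 1, year + 1


def row_color(row):
    """Color a row by its year only, so the same year matches across clinics."""
    _, year = clinic_year(row)
    return YEAR_COLORS[year - 1]


def rows_for_year(year):
    """All long-rows that show the given year, one per clinic."""
    return [c * YEARS_PER_CLINIC + year for c in range(CLINICS)]


def clinic_boundaries():
    """Adjacent row pairs that straddle two clinics (where the surface should break)."""
    return [(c * YEARS_PER_CLINIC, c * YEARS_PER_CLINIC + 1) for c in range(1, CLINICS)]


if __name__ == "__main__":
    for row in range(1, CLINICS * YEARS_PER_CLINIC + 1):
        clinic, year = clinic_year(row)
        print(f"row {row:2d}: clinic {clinic}, year {year}, color {row_color(row)}")
```

With this mapping, picking out one year across all clinics means following one color (e.g. `rows_for_year(4)` gives the year-4 rows) instead of counting rows, and `clinic_boundaries()` lists the row pairs whose slope should not be drawn.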
The other clinics show reasonable self-similarity in years 3, 4, and 5. But the fraudulent clinic is reporting almost nothing at all. Not whatever codes it reported in the past, not whatever codes are worth the most money, not even randomly selected codes -- nothing. It's true that that might not be interesting if the other clinics showed similar behavior, but they don't. (And actually, I think "no medical demand for a three-year period" would be pretty interesting too.)