Wake County School Report Cards Using R

Recently Wake County School Systems released a “school report card” that can be used to compare how well a school is doing relative to the other schools in the state. As expected, it made front-page news in our local newspaper here. The key theme was that schools with kids from poorer families have worse results than schools with kids from more affluent families. Though this shouldn’t come as a surprise, the follow-up op-eds were equally predictable: more money for poorer schools, changing the rating scale, etc.

I thought it would be an interesting dataset to analyze to see if the conclusion that the N&O came up with was, in fact, the only conclusion you can get out of the dataset. Since they did a simple crosstab analysis, perhaps there was some other analysis that could be done? I know that newspaper articles are written at a pretty low reading level; perhaps they are written at a low analytical level too? I went to the website to download the data here and quickly ran into two items:

1) The dataset is very limited –> there are only 3 credible variables in the dataset (county, free and reduced percent, and the school score). It is almost as if the dataset was purposely limited to only support the conclusion.

2) The dataset is presented in a way that makes alternative analysis very hard. You have to install Tableau if you want to look at the data yourself. Parsing Tableau was a pain because, even with Fiddler, it doesn’t render the results as HTML with some tags but as images.

Side Note –> My guess is that Tableau is trying to be the Flash for the analytics space. I find it curious that companies/organizations think they are just “one tool away” from good analytics. Even in the age of Watson, it is never the tooling – it is always the analyst that determines the usefulness of a dataset. It would be much better if WCPSS embraced open data and had higher expectations of the people using the datasets.

In any event, with the 14-day trial of Tableau, I could download the data into Access. I then exported the data into a .txt file (where it should have been in the 1st place) and pumped it into RStudio.
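For anyone following along without the original screenshot, here is a minimal sketch of the import step. It assumes a tab-delimited export with a header row; the temp file, the `County` column name, and every row in it are made up for illustration and are not the real report card data.

```r
# Toy stand-in for the exported .txt file. The rows are invented for
# illustration only; they are NOT the real report card data.
toy_lines <- c(
  "County\tFreeAndReducedLunch\tSchoolScores",
  "Wake\t12.5\t88",
  "Wake\t64.0\t61",
  "Durham\t71.2\t55",
  "Orange\tN/A\t79"
)
path <- tempfile(fileext = ".txt")  # hypothetical path; point this at the real export
writeLines(toy_lines, path)

# read.delim() expects tab-separated text with a header row by default.
# Empty cells, "NA", and "N/A" are all read in as missing values.
schools <- read.delim(path,
                      stringsAsFactors = FALSE,
                      na.strings = c("", "NA", "N/A"))

str(schools)  # check the column types before doing any math on them
```

Spelling out `na.strings` up front keeps a stray "N/A" from silently turning a numeric column into text.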


I then created 2 variables from the FreeAndReducedLunch and SchoolScores vectors. When I ran the correlation the 1st time, I got an NA, meaning that there is some malformed data.
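The NA is easy to reproduce. In this sketch the two vectors hold made-up numbers (not the real data), and the variable names are my own; it only shows that `cor()` with its default settings returns NA as soon as either vector has a missing value.

```r
# Made-up values for illustration; NA marks a missing or unparseable cell.
free_lunch <- c(12.5, 64.0, 71.2, NA, 35.0)
scores     <- c(88,   61,   55,   79, NA)

# The default is use = "everything", so any NA propagates to the result.
r <- cor(free_lunch, scores)
is.na(r)

# Count how many schools are missing at least one of the two values.
n_incomplete <- sum(!complete.cases(free_lunch, scores))
n_incomplete
```

Counting the incomplete rows first is a cheap way to see how much data gets dropped once the missing values are excluded.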


I re-ran the correlation using only complete data and, sure enough, there is a credible correlation –> the higher the percent of free and reduced lunch, the lower the score. The N&O is right.
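One way to restrict the correlation to complete data is the `use` argument of `cor()`. This is a sketch with made-up numbers, and I am assuming `use = "complete.obs"` here; filtering the rows by hand gives the same answer.

```r
# Made-up values for illustration only.
free_lunch <- c(10, 25, 40, NA, 60, 80)
scores     <- c(90, 82, 75, 70, NA, 50)

# Drop every pair where either value is missing, then correlate the rest.
r <- cor(free_lunch, scores, use = "complete.obs")

# The same thing done by hand: keep only the complete pairs.
ok <- complete.cases(free_lunch, scores)
r_manual <- cor(free_lunch[ok], scores[ok])

r          # negative for this toy data: more free lunch, lower score
all.equal(r, r_manual)
```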


I then added a filter to only look at Wake County, and there is an even stronger correlation in Wake County than in the state as a whole.
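The county filter can be sketched as a plain row subset. The data frame below is a toy with invented rows, and the `County` column name is an assumption, so the two numbers it prints say nothing about the real schools; the point is only the shape of the comparison.

```r
# Toy data frame; every row is invented for illustration.
schools <- data.frame(
  County              = c("Wake", "Wake", "Wake", "Wake",
                          "Durham", "Durham", "Orange", "Orange"),
  FreeAndReducedLunch = c(10, 30, 55, 80, 20, 70, 40, NA),
  SchoolScores        = c(92, 80, 66, 50, 75, 60, 85, 70),
  stringsAsFactors    = FALSE
)

# Keep only the Wake County rows.
wake <- schools[schools$County == "Wake", ]

# Same correlation, statewide versus Wake only.
r_state <- cor(schools$FreeAndReducedLunch, schools$SchoolScores,
               use = "complete.obs")
r_wake  <- cor(wake$FreeAndReducedLunch, wake$SchoolScores,
               use = "complete.obs")

c(state = r_state, wake = r_wake)
```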


As I mentioned earlier, the dataset was set up for a pre-decided conclusion by limiting the number of independent variables and by the choice of Tableau as the reporting mechanism. I decided to augment the dataset with additional information. My son plays in TYO, and I unsuccessfully tried to set up an orchestra at our local elementary school 8 years ago. I also thought of this article where some families tried to get more orchestras in Wake County schools. Fortunately, the list of schools with an orchestra can be found here, and it did not take very long to add a “HasAnStringsProgram” field to the dataset.
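If the orchestra list is available as a vector of school names, the new field can be added with a membership test. This is a sketch: the `SchoolName` column, the placeholder school names, and the scores are all hypothetical, and only the `HasAnStringsProgram` field name comes from the dataset described above.

```r
# Toy data frame with placeholder school names and made-up scores.
schools <- data.frame(
  SchoolName       = c("School A", "School B", "School C", "School D"),
  SchoolScores     = c(72, 81, 65, 77),
  stringsAsFactors = FALSE
)

# Hypothetical list of schools that have a strings program.
strings_schools <- c("School B", "School D")

# TRUE when the school appears on the list, FALSE otherwise.
schools$HasAnStringsProgram <- schools$SchoolName %in% strings_schools

schools
```

Matching on names only works if both sources spell the school names the same way, so it is worth eyeballing the rows that come back FALSE.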


Running a correlation for just the WCPSS schools shows that there is no relationship between a school having an orchestra and its performance grade.
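Correlating a yes/no field with a score works once the flag is turned into 0/1; the Pearson correlation is then the point-biserial correlation. The sketch below uses made-up flags and scores, so its result is not the real one.

```r
# Made-up flags and scores for illustration only.
has_strings <- c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE)
scores      <- c(70,   72,    85,   83,    60,   61)

# as.numeric() maps TRUE to 1 and FALSE to 0.
r <- cor(as.numeric(has_strings), scores)
r

# A quick sanity check alongside the correlation: mean score per group.
group_means <- tapply(scores, has_strings, mean)
group_means
```

A value of `r` near zero goes hand in hand with the two group means being close together.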


So the statement by the parents in the N&O like this

… that music students have higher graduation rates, grades and test scores …

might be true for all music but a specialized strings program does not seem to impact the school’s score –> at least with this data.
