The data files are 5 denormalized CSV tables. If you want to load the data into a relational database, and easily run SQL against it, there are two options:
1. We've already done this for you. Get an invite into our open data exploration and visualization tool, powered by our partners at Looker.com
2. Or do it yourself using our Postgres scripts:
Below are scripts both to help load the data into 5 tables and also to migrate the data into 9 partially normalized tables, which are more suitable for exploration with SQL. The scripts provided are for PostgreSQL, but should be simple to modify for other relational databases.
The entity-relation diagrams (ERDs) below only show keys that would be used for joining/grouping. Full documentation of all the fields in the data files can be found in the documentation for each.
Also, the table names diagrammed are different (and shorter) than the names that are actually used in the scripts as distributed. Simply remove the long (and admittedly unwieldy donorschoose_ and normalized_) prefixes to get the names shown in these ERDs. If you do so, the 5 spreadsheet table names will be in the plural and the 9 partially normalized tables will be in the singular, exactly as shown below.
5 Denormalized Tables
You can run this SQL script and SH script to create and load the 5 data files into 5 db tables.
9 Partially Normalized Tables
You can use this SQL script to migrate the data from the 5 original tables (above) into 9 partially normalized tables (below). Warning: this file has not been kept up to date. There are likely column naming discrepancies, so expect to have to make adjustments.