School Data Dashboard | School Improvement, Educational Business Intelligence
This dashboard significantly reduced the effort of needs assessment and planning by gathering all the required data in one location. It collects data from public sources (thousands of CSV and Excel files) and stores it in relational form for all Texas schools. On request, around 3,000 data points per school are pulled from the database as JSON in 0.5 seconds, including history, reference data, related data, and statistics. The data is then arranged in a graph structure and rendered with custom D3.js visualizations in a single-page application. Noteworthy features include:
• XML parsing
• Collect data from students, parents, and staff via surveys
• An intuitive interface to explore hierarchical data—broken down by subject, grade, demographic, and more—in a top-to-bottom approach
• Rate, filter, sort, take notes on, and set targets for data while adding to plan for further action
• Concurrent exploration of data from multiple sets (i.e., multiple graph structures) by area
• A reusable role-based authorization app for Django tied to users’ school positions
• Custom release management and deployment scripts in Bash
• Single sign-on with a Discourse site
• Python unit testing, reusable Django detail views (CRUD), guided tour, and more
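As a rough illustration of the hierarchical breakdown mentioned above (subject, then grade, then demographic), the following minimal sketch nests flat relational rows into a tree and serializes it as the kind of nested JSON a D3.js hierarchy layout can consume. The field names, sample scores, and the helper names `build_tree` and `to_d3` are hypothetical and are not taken from the actual project.

```python
import json

# Hypothetical flat rows, standing in for records read from the relational store.
rows = [
    {"subject": "Math", "grade": 3, "demographic": "All", "score": 80.0},
    {"subject": "Math", "grade": 3, "demographic": "EL", "score": 70.0},
    {"subject": "Math", "grade": 4, "demographic": "All", "score": 90.0},
    {"subject": "Reading", "grade": 3, "demographic": "All", "score": 60.0},
]


def build_tree(rows, levels, value_key):
    """Nest flat rows into a tree, one level per key in `levels`."""
    root = {"name": "root", "children": {}, "values": []}
    for row in rows:
        node = root
        node["values"].append(row[value_key])
        for level in levels:
            key = row[level]
            # Create the child node on first sight, then descend into it.
            node = node["children"].setdefault(
                key, {"name": key, "children": {}, "values": []}
            )
            # Every ancestor keeps the value so it can report its own average.
            node["values"].append(row[value_key])
    return root


def to_d3(node):
    """Convert the tree to a name/value/children shape with list children."""
    out = {"name": node["name"], "value": sum(node["values"]) / len(node["values"])}
    if node["children"]:
        out["children"] = [to_d3(child) for child in node["children"].values()]
    return out


tree = build_tree(rows, ["subject", "grade", "demographic"], "score")
d3_data = to_d3(tree)
payload = json.dumps(d3_data)  # JSON string that could be sent to the front end
```

Keeping the per-node values at every level means a viewer can start at the top-level average and drill down without recomputing anything on the client.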
Data Pipelines with AWS Glue
A data pipeline that extracts, transforms, and loads around 10 TB of initial data from a Teradata instance to a Vertica instance using AWS Glue. Subsequent scheduled jobs would bring in about 100 million records weekly to update the data.
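The description does not say how the weekly jobs select which records to move. One common approach is a high-water mark on a last-modified column, and the sketch below shows that idea in plain Python rather than with AWS Glue, Teradata, or Vertica APIs. The column names, the `transform` step, and the in-memory source and target are all assumptions made for illustration.

```python
from datetime import date


def transform(row):
    # Hypothetical transform step: trim and upper-case the name column.
    return {**row, "name": row["name"].strip().upper()}


def incremental_load(source_rows, target, watermark, batch_size=2):
    """Upsert rows changed after `watermark` into `target`; return the new watermark."""
    fresh = sorted(
        (r for r in source_rows if r["updated_at"] > watermark),
        key=lambda r: r["updated_at"],
    )
    # Process in fixed-size batches, as a large load would be chunked.
    for start in range(0, len(fresh), batch_size):
        batch = fresh[start:start + batch_size]
        for row in batch:
            target[row["id"]] = transform(row)  # upsert keyed by primary key
    # If nothing changed, the watermark stays where it was.
    return fresh[-1]["updated_at"] if fresh else watermark


# In-memory stand-ins for the source and target tables.
source = [
    {"id": 1, "name": " alpha ", "updated_at": date(2024, 1, 1)},
    {"id": 2, "name": "beta", "updated_at": date(2024, 1, 8)},
    {"id": 3, "name": "gamma ", "updated_at": date(2024, 1, 9)},
]
target = {1: {"id": 1, "name": "ALPHA", "updated_at": date(2024, 1, 1)}}

new_watermark = incremental_load(source, target, watermark=date(2024, 1, 1))
```

Storing the returned watermark between runs lets each scheduled job pick up only the records changed since the previous one.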