MoneyThink DecidED Mobile App

Building a serverless image processing engine for scraping and structuring student financial aid information

Organizations: MoneyThink DecidED Mobile App

Collaborators: Luis Carlos Contreras, Marcelo Ventura, Chris Taylor, Ben May, Nate Ranney

Volunteers: Chakshu Agarwal, Romina Reyimjan, Jun Kim

Dates: Spring 2020 - Present

Foci: Education, Software, Data Engineering, DevOps

Synopsis

MoneyThink is a non-profit seeking "to harness the power of people and technology to bring transparency to college costs so that all students are equipped to invest in their futures". We’re a team that’s passionate about helping students make great decisions for their future success.

Simple, side-by-side comparisons!

To that end, we're building a new tool for automatically turning financial aid award images into easy-to-understand, side-by-side comparisons of college affordability. This can help underserved students more clearly understand the debt they’ll be taking on for a given college education.

The Opportunity

I found MoneyThink through an ex-boss, and after a few conversations with several team members, I discovered that their work was meaningfully changing thousands of students' college decisions. I wanted to help!

At the time, they were looking to bring on a Lead Data Engineer to head up the creation of the core data engine: a tool for automatically turning financial aid award images into structured, tabular data.

Software is magic.

The chance to build and own a system from scratch - from setting up CI/CD systems to learning Terraform for deploying AWS infrastructure to writing core logic - for a great cause was irresistible.

The Work

I'm currently in the midst of this work, focused on building their core data processing engine. The system we've built is entirely serverless on AWS: S3 for file storage, SNS and SQS for pub-sub messaging, Lambda and Textract for most of the core processing, and RDS for storing the resulting tabular data.
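To make the messaging layer a little more concrete, here is a minimal sketch of how a Lambda function might unwrap an incoming event. It assumes one particular wiring (S3 upload notifications published to an SNS topic, fanned out to an SQS queue that triggers the Lambda, with raw message delivery turned off); the actual MoneyThink wiring may differ. The function names `extract_s3_objects` and `handler` are hypothetical.

```python
import json
from urllib.parse import unquote_plus


def extract_s3_objects(sqs_event):
    """Return (bucket, key) pairs from an SQS event carrying SNS-wrapped S3 notifications."""
    objects = []
    for sqs_record in sqs_event.get("Records", []):
        # SQS hands the SNS envelope to Lambda as a JSON string in "body".
        sns_envelope = json.loads(sqs_record["body"])
        # The SNS envelope carries the S3 notification as a JSON string in "Message".
        s3_notification = json.loads(sns_envelope["Message"])
        for s3_record in s3_notification.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            # Object keys arrive URL-encoded (a space becomes "+").
            key = unquote_plus(s3_record["s3"]["object"]["key"])
            objects.append((bucket, key))
    return objects


def handler(event, context):
    """Hypothetical Lambda entry point: list the uploaded images to process."""
    uploads = extract_s3_objects(event)
    # A real handler would hand each upload to Textract at this point.
    return {"uploads": [{"bucket": b, "key": k} for b, k in uploads]}
```

Keeping the unwrapping in a plain function means it can be unit tested with hand-built events, without touching AWS.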

Along the way, we've captured all of our infrastructure in Terraform and integrated CircleCI for continuous integration and deployment. With 100% of our infrastructure defined as code, we can deploy a new environment just by creating a new feature branch!
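One way a branch-per-environment setup like this can work is to derive an environment name from the branch and use it to select a Terraform workspace. The sketch below is an illustration under that assumption, not the project's actual pipeline: `environment_name`, `workspace_commands`, and the `environment` Terraform variable are hypothetical, while `CIRCLE_BRANCH` is the variable CircleCI sets to the branch being built. The commands are only assembled, never executed.

```python
import os
import re


def environment_name(branch, max_length=32):
    """Turn a git branch name into a lowercase, hyphen-separated environment name."""
    slug = re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-")
    return slug[:max_length].rstrip("-") or "default"


def workspace_commands(branch):
    """Build (but do not run) Terraform CLI commands for a per-branch workspace."""
    env = environment_name(branch)
    return [
        # "-or-create" requires Terraform 1.4 or newer.
        ["terraform", "workspace", "select", "-or-create", env],
        # "environment" is a hypothetical input variable of the configuration.
        ["terraform", "apply", "-auto-approve", f"-var=environment={env}"],
    ]


if __name__ == "__main__":
    # Fall back to a placeholder branch name when not running in CircleCI.
    branch = os.environ.get("CIRCLE_BRANCH", "feature/example")
    for command in workspace_commands(branch):
        print(" ".join(command))
```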

Deployment made easy.

The Future

Within the next few quarters, the engine should be capable of processing images of student award letters and returning structured, tabular data for consumption by our front-end!

With designers and front-end engineers working on the user interface, that data can then be rendered into beautiful visualizations, helping students make more informed college decisions.
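As a rough illustration of the image-to-table step, the sketch below turns a Textract-style list of blocks into rows of cell text. It assumes the blocks come from a table analysis (for example the `Blocks` list returned by Textract's `AnalyzeDocument` with the `TABLES` feature), where `TABLE` blocks point to `CELL` blocks and `CELL` blocks point to `WORD` blocks through `CHILD` relationships. The function name `table_rows` is hypothetical, and the engine's real parsing logic is likely more involved.

```python
def table_rows(blocks):
    """Convert Textract-style blocks into a list of tables, each a list of rows of cell text."""
    by_id = {block["Id"]: block for block in blocks}

    def child_ids(block):
        # Collect the Ids listed under every CHILD relationship of a block.
        return [
            child_id
            for rel in block.get("Relationships", [])
            if rel["Type"] == "CHILD"
            for child_id in rel["Ids"]
        ]

    tables = []
    for block in blocks:
        if block["BlockType"] != "TABLE":
            continue
        cells = {}
        for cell_id in child_ids(block):
            cell = by_id[cell_id]
            if cell["BlockType"] != "CELL":
                continue
            words = [
                by_id[word_id]["Text"]
                for word_id in child_ids(cell)
                if by_id[word_id]["BlockType"] == "WORD"
            ]
            # RowIndex and ColumnIndex are 1-based positions within the table.
            cells[(cell["RowIndex"], cell["ColumnIndex"])] = " ".join(words)
        n_rows = max((row for row, _ in cells), default=0)
        n_cols = max((col for _, col in cells), default=0)
        tables.append(
            [[cells.get((r, c), "") for c in range(1, n_cols + 1)] for r in range(1, n_rows + 1)]
        )
    return tables
```

Each table comes back as a plain list of rows, which is straightforward to write to a relational database or serialize for a front-end.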