Charles Grady
Verified Expert in Engineering
Back-end Developer
Overland Park, KS, United States
Toptal member since September 26, 2022
Charles is primarily a back-end developer specializing in Python. He has more than 16 years of experience developing RESTful APIs, high-throughput computations, and cluster interfaces in open-source environments. He has published papers, given conference presentations, and taught domestic and international workshops showcasing the tools he created, the community infrastructure, and his research projects.
Portfolio
Experience
- Python - 17 years
- APIs - 17 years
- GIS - 12 years
- NumPy - 8 years
- Spatial Statistics - 6 years
- Data Engineering - 4 years
- Data Science - 2 years
- Docker - 2 years
Preferred Environment
Python, Slack, pylint, Flask, CI/CD Pipelines
The most amazing...
...thing I've developed is a binary matrix randomization algorithm, which allowed researchers to analyze patterns at previously impossible scales.
Work Experience
Back-end Developer
Cedar Build, Inc
- Created an automated workflow that takes the client's raw input data from various sources through pre-processing and analytical tools, combining them to derive outputs intended for public consumption.
- Developed methods for identifying and classifying edges for land parcel polygons using attribute data and spatial relationships to enable additional processing and derived attribute analyses.
- Developed a dynamic configuration schema that the client can easily update. Generated multiple examples along with documentation for using and updating the configuration files for multiple municipalities.
- Created tools for determining which building codes affect specific parcels. This required identifying triggering parcels and then the parcels they affect and following region-specific rules for handling all the building codes for a parcel.
- Developed workflows and tools designed to be scaled, parallelized, and checkpointed in order to operate on larger and more complex municipalities.
- Automated testing, builds, packaging, release notes, and code quality control using GitHub actions, pytest, and pre-commit.
Python/Flask Developer
Chegg - Thinkful, Inc.
- Migrated an independent product with its own microservices into a larger umbrella product to simplify interactions between products and improve performance.
- Resolved existing technical debt and migrated and rewrote scripts and tools to fit the client's new project structure to improve performance, leverage new tools, and simplify data access and testing.
- Migrated the client's existing customer interactions from Customer.io APIs to customer-based and event-based Braze APIs.
- Updated development and production Docker containers to support upgraded software products.
- Participated in code peer reviews as part of the build deployment cycle.
- Tracked and fixed bugs using Jira as a reporting tool and Agile task planning.
Senior Research Software Engineer | BiotaPhy Project
University of Kansas
- Developed software for performing mathematical computations at novel data and computational scales.
- Converted scripts created by non-programmers into production-quality software tools suitable for larger data scales and for use by a more general audience.
- Built, maintained, and moderated a common code repository for collaboration between five institutions. This repository utilized pre-commit and CI/CD pipelines to ensure that code was well-tested and met quality standards.
- Created and used workflows for big data analyses that were then interpreted by biologists.
- Co-authored publications in scientific journals describing the research.
- Designed and implemented algorithms for computing new analyses.
- Presented research and led workshops at conferences and webinars.
- Documented software tools and APIs for both human readers and programmatic consumption.
Senior Research Software Engineer
University of Kansas
- Developed an algorithm, and its implementation, for randomizing binary matrices while maintaining row and column totals for null-model creation, scaling multiple orders of magnitude beyond existing approaches while requiring a fraction of the time and resources.
- Built a suite of biodiversity analysis tools for biological researchers designed to be easily extended and for operation on a single machine or a multi-machine computational architecture.
- Designed and implemented a RESTful web API for accessing biodiversity data and workflow tools in a high-throughput computational environment.
- Created tools to analyze occurrence records for single species and communities of species, establishing each record's information value and novelty across a multivariate space.
- Developed high-throughput computational workflows for analyzing billions of museum specimen records by aggregating records from multiple, heterogeneous sources, cleaning that data, and deriving numerous single- and multi-species outputs.
- Published articles in scientific journals describing tools created for biodiversity research.
Software Developer
University of Kansas
- Developed data and processing web services for creating species distribution models for all species with enough data.
- Designed computational workflows for processing input occurrence records and interfacing with modeling software to create models and distribution projections exploring the impacts of climate change.
- Presented work and research at various conferences.
- Created a visual interface representing the computational status of the nodes in a compute cluster so that users can see how various computational jobs move throughout the system.
- Built a client library for collaborators to use for interfacing with computational services.
Software Developer
Specify Collections Consortium (via University of Kansas)
- Developed a web interface for the Specify collection management software.
- Created the installer for the collection management software using InstallShield.
- Provided user support, via Bugzilla, for issues affecting the web interface, the installer, and other components.
Experience
Biodiversity Research Software for the BiotaPhy Project
I collected and pre-processed raw input data, ran hundreds of thousands of single-species workflows, and generated large-scale multi-species data structures, including matrices of hundreds of thousands of species by hundreds of thousands of geographic sites, which were then used to produce global analyses. This required significant data engineering, involving more than five billion input records, as well as new data structures and methods for computing global analyses.
The portions of these projects that were possible in the past would have taken months to years of computational time; I reduced that to several days. Other portions of the analyses had previously only been possible at smaller scales, hundreds of sites by hundreds of species at most, and I was able to run them 10,000 times per analysis in a matter of hours at data sizes up to 10^12. I then co-authored manuscripts with the researchers describing the methods.
Biological Collection Comparisons Tool
The total input data set is approximately 5 billion records from 6 data sources, but it can be expanded as data becomes available. I wrote tools to process these inputs and standardize them so they can be grouped and assessed one species or collection at a time.
Each specimen record undergoes many single- and multivariate analyses across various dimensions to determine how likely it is to be an outlier, a representative, unique, or a duplicate, among other classifications. Data is summarized at various taxonomic and phylogenetic levels to assess larger patterns along with collection density and distribution. The end goal is to produce actionable items that help a collection improve its standing for funding reviews, public presence, and research value.
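As a toy illustration of multivariate outlier scoring of this kind (the project's actual dimensions and statistical methods are not specified here, so the variables and thresholds below are assumptions), one simple approach scores each record by its standardized distance from the centroid of its group:

```python
import numpy as np

def outlier_scores(points):
    """Score each record by its distance from the group centroid,
    measured in standard deviations along each dimension.

    points: (n_records, n_dims) array of, e.g., environmental
    values at specimen localities (hypothetical dimensions).
    Higher scores suggest likely outliers; scores near zero
    suggest representative records.
    """
    centroid = points.mean(axis=0)
    spread = points.std(axis=0)
    spread[spread == 0] = 1.0  # guard against constant dimensions
    z = (points - centroid) / spread
    return np.sqrt((z ** 2).sum(axis=1))

records = np.array([
    [10.0, 200.0],   # e.g., temperature, precipitation per record
    [11.0, 210.0],
    [10.5, 205.0],
    [30.0, 900.0],   # a suspicious record, far from the others
])
scores = outlier_scores(records)
```

A production system would typically use more robust statistics (median/MAD, density-based methods) and many more dimensions, but the shape of the computation is the same: per-record scores that can then be rolled up by taxon or collection.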
Lmpy | Library of Biodiversity Analysis Tools
https://github.com/specifysystems/lmpy/tree/3.1.21
The motivation behind this project came from the need to modify and reimplement various scripts and software written by biologists that could not scale to their needs. My involvement in this project started from those re-implementations, as I developed the library so that the biologists we collaborated with could use code, tools, and scripts designed with computational performance in mind. I set up the repository, including CI/CD for testing, PyPI packaging, Docker builds, and documentation builds, created new tools, and reimplemented others, including my binary matrix randomization algorithm. I contributed to nearly all of the code and documentation until version 3.1.21.
Parallel Implementation of Dijkstra's Algorithm for Computing Coastal Inundation Height
The project splits raster files into manageably sized chunks to be worked on in parallel. These grid chunks can be of heterogeneous sizes, allowing data sizes and scales that would not be feasible without specialized hardware. The critical points of the method are that individual chunks, or tiles, can be operated on in parallel and in isolation, and that computations can be repeated as necessary when new information is determined. The result is an algorithm that is not necessarily as work-efficient as standard Dijkstra's algorithm but runs in significantly less wall-clock time given the ability to utilize parallel resources, including massively parallel machines and supercomputers, while requiring a fraction of the memory otherwise needed. In addition, heterogeneous data scales can be used for regions with higher-resolution data.
I designed and implemented this project for my master's thesis and received honors for it.
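A serial, single-tile sketch of the underlying computation may help make this concrete (the parallel tiling, re-computation, and checkpointing that are the thesis's actual contribution are omitted, and the source-and-cost model here is an assumption): the inundation height of a cell can be framed as the lowest water level at which a path from a source cell reaches it, computed with a Dijkstra-style priority queue where a path's cost is the maximum elevation along it rather than a sum of edge weights.

```python
import heapq

def inundation_heights(elevation, sources):
    """Minimax-path Dijkstra on a 2D elevation grid.

    For each cell, find the lowest water level at which water
    from any source cell can reach it: a path's cost is the
    maximum elevation encountered along it, and we minimize
    that cost over all paths. Serial, single-tile sketch only.
    """
    rows, cols = len(elevation), len(elevation[0])
    INF = float("inf")
    height = [[INF] * cols for _ in range(rows)]
    pq = []
    for r, c in sources:
        height[r][c] = elevation[r][c]
        heapq.heappush(pq, (height[r][c], r, c))
    while pq:
        h, r, c = heapq.heappop(pq)
        if h > height[r][c]:
            continue  # stale queue entry; a better level was found
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                nh = max(h, elevation[nr][nc])  # minimax relaxation
                if nh < height[nr][nc]:
                    height[nr][nc] = nh
                    heapq.heappush(pq, (nh, nr, nc))
    return height

grid = [
    [0, 5, 2],
    [4, 9, 1],
    [3, 8, 2],
]
levels = inundation_heights(grid, sources=[(0, 0)])  # water at top-left
```

The tiled version applies this same relaxation inside each tile independently and re-runs a tile whenever a neighboring tile lowers a value on their shared border, which is why tiles can proceed in parallel and why some work is repeated.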
Lifemapper Client Library
The client library was aware, as dynamically as possible, of all of the services and data models currently exposed, and could construct workflow requests for the computational back end. It was distributed to our collaborating researchers, the general public, and additional clients, such as a QGIS plugin, which used the library as a liaison for interacting with the Lifemapper web services.
Documented, Re-executable Workflows for VisTrails
https://eim.ecoinformatics.org/eim2011.html
This project allows researchers to include their input data and processes in a standardized format with their publications. Consumers of those publications can then use these data or metadata packages to recreate the researcher's findings and tweak those workflows as desired for different datasets or to perform additional analyses.
I presented this work at the 2011 Environmental Information Management Conference, and a paper covering this work was published in the proceedings.
Web Interface for the Specify 5 Collection Management Software
The software was written using Java Server Pages and could be installed with Microsoft IIS or Apache. My portion of this project was to expose data from a full-text index of the collection database. This software was distributed to, and deployed by, hundreds of institutions to expose their collections for public consumption. It was also highly configurable and provided interfaces for collection managers to see what data was accessed.
This was an early-career project, and—while I believe there are a couple of installations that are still live—there are many software packages available now that have made this project obsolete.
Education
Master's Degree in Geographic Information Science
University of Kansas - Lawrence, Kansas, USA
Bachelor's Degree in Computer Science
University of Kansas - Lawrence, Kansas, USA
Certifications
Machine Learning A-Z: AI, Python & R + ChatGPT Prize [2024]
Udemy
The Complete JavaScript Course 2023: From Zero to Expert!
Udemy
Data Science
University of New Mexico
Skills
Libraries/APIs
NumPy, REST APIs, API Development, DendroPy, OpenAPI, Pandas, SQLAlchemy, Stripe, GDAL/OGR, GDAL
Tools
GIS, Solr, Cluster, Git, GitHub, pylint, Pytest, PyPI, Microsoft Word, Microsoft Excel, GitHub Pages, Slack, Flash, InstallShield, Apache Tomcat, Bugzilla, Braze, CircleCI, Docker Hub, Zoom, Jira
Languages
Python, Python 3, SQL, R, XML, Markdown, C++, HTML, JavaScript, Python 2, Java, CSS, JavaScript 6
Paradigms
Testing, Object-oriented Programming (OOP), REST, Unit Testing, Parallel Programming, High-performance Computing (HPC), Microservices, Automation
Storage
Data Pipelines, PostgreSQL, Databases, JSON, MySQL, SQLite, Microsoft SQL Server
Frameworks
Flask, CherryPy, Swagger, Jakarta Server Pages (JSP), Alembic
Platforms
Docker, Apache2, Windows, Ubuntu, Linux, Amazon Web Services (AWS), MacOS
Other
APIs, Algorithms, Workflow, Programming, Spatial Algorithm Design, Metadata, Documentation, Web Services, API Integration, API Documentation, Back-end, Data Structures, Scripting, Research, Architecture, Software Design, Software Integration, Debugging, Software Documentation, Back-end Development, CSV, Data Processing, Modeling, HTTP Clients, Spatial Statistics, Statistical Methods, Geographic Information Systems, Data Science, CI/CD Pipelines, Matrix Algebra, Statistics, Workshops, Data Engineering, Big Data, EML, Authentication, Scaling, Cluster Computing, Maps, Training, Lint, Presentations, Conference Speaking, Tech Conferences, Agile Sprints, Multithreading, HTTP, Spatial Analysis, Open Source, Technical Architecture, Technical Writing, Writing & Editing, Sphinx, SDKs, Supercomputers, QGIS, Operating Systems, Cartography, Publication, VisTrails, Biology, Graphs, Gimp, HTTPS, IIS, Migration, Code Review, GitHub Actions, Build Releases, Packaging, Mathematics, Machine Learning, Decision Trees, K-means Clustering, Random Forests, Artificial Intelligence (AI), Artificial Neural Networks (ANN), Convolutional Neural Networks (CNNs)