Paul Butler, Developer in Riverhead, NY, United States

Paul Butler

Verified Expert in Engineering

Principal Architect and Developer

Riverhead, NY, United States
Toptal Member Since
October 21, 2020

Backed by 20 years of experience, Paul has served as a principal architect, technical project manager, and engineer for various software development focus areas such as SFA, CRM, business intelligence, data warehousing, forecast modeling, reporting, data visualization, data science, and web apps. He focuses on creating innovative solutions that are closely aligned with business value. A keen strategist, Paul enjoys spanning business and technical expertise to create high-impact solutions.






Preferred Environment

Data Warehouse Design, Data Warehousing, Data Engineering, ETL, Microsoft Excel, DAX, SQL Server Analysis Services (SSAS), SQL Server Integration Services (SSIS), SQL, Microsoft SQL Server, Microsoft Power BI

The most amazing...

...thing I've built is a dynamic data management system. I managed the full lifecycle, from transactional input and analytics to the delivery of the web apps.

Work Experience

Senior Data Architect

2020 - PRESENT
  • Established a cloud-based data analytics platform based on Snowflake.
  • Conducted POC with dbt to establish the standard data transformation technology.
  • Designed data analytics platform data models, including definitions of data domains and data layers (RAW, GOLD, MART).
  • Conducted POC of Data Vault 2.0 modeling; adopted a hybrid approach for a new analytics platform.
  • Designed data integration application for a complex customs partner agency.
  • Coded Snowflake views, stored procedures (JavaScript and SQL-based), and UDFs.
  • Conducted POC for data orchestration using Prefect. Facilitated the evaluation of technology alternatives, such as Airflow, Prefect, Dagster, and others. Coded a POC in Python 3 with Prefect for representative use cases.
  • Conducted POC for Snowflake Snowpark technology using Python and VS Code.
Technologies: Architecture, Snowflake, Microsoft Power BI, Data Build Tool (dbt), Data Modeling, Database Modeling, Python 3, Prefect, Azure Synapse, Stored Procedure, Data Warehouse Design


2019 - 2020
  • Advised the founder regarding software technologies to enable the business strategy for a 360° VR video content startup.
  • Evaluated and implemented POCs for cloud-based software products and services.
  • Led R&D, solution selection, product, and software component evaluation.
  • Established the website, cloud-based data storage (Wasabi), and data storage integration using Rclone on Ubuntu.
  • Utilized Node.js to tailor a YouTube clone website and replace the default 360° VR player with a Viblast player.
  • Developed a test plan to measure end-user performance and peer-to-peer utilization, which allowed playback of high-resolution 360° VR.
Technologies: Amazon S3 (AWS S3), Wasabi, WebRTC, JavaScript 6, PostgreSQL, Node.js

Senior Director of Software Engineering

2001 - 2019
  • Led software engineering teams to create and maintain a portfolio of applications that managed market statistics data through its entire lifecycle, including client deliverables and public web applications for querying data content deliverables.
  • Created an application portfolio supporting over $600 million annual revenue of subscriptions to predictive datasets.
  • Served as a lead architect to select Power BI as the standard data visualization product for the line of business and created key dashboards; also led the effort to migrate the entire portfolio to AWS cloud environments.
  • Developed a series of dashboards to improve business analyst productivity, reducing the data analyst FTEs required by 7% and eliminating weeks from time to market for key deliverables.
  • Established the Python production engineering environment and built a natural language processing application using Python, Pandas, NLTK, NumPy, and NLP libraries to analyze documents.
  • Designed and built a dynamic data warehouse (SSIS, SSAS, .NET REST APIs) that supported data content quality review, automated data flows, and generated data content deliverables. The data structures could be changed without making code changes.
  • Reduced portfolio support costs by 40% via application stability and performance improvements over the course of three years.
  • Served as a lead architect to establish the company's first enterprise data warehouse environment, which included setting development standards, performing star and snowflake schema data modeling, and selecting reporting solutions and ETL and OLAP tools.
  • Assisted the data warehouse team in establishing Azure SQL and SSAS on Gartner's Azure cloud. Advised in an architectural capacity on design and data refresh strategies and topics around tradeoffs for SSAS tabular vs. multidimensional implementation.
Technologies: Snowflake, Star Schema, Supervisor, Contract Negotiation, Negotiation, IT Strategy, Cost Estimation, Coaching, Technical Project Management, Algorithms, Generative Pre-trained Transformers (GPT), GPT, Natural Language Processing (NLP), Python 3, Microsoft Excel, Groovy Scripting, REST APIs, RESTful Development, SQL Server Integration Services (SSIS), SQL Server Analysis Services (SSAS), Microsoft SQL Server, Microsoft Power BI, Data Warehouse Design

Market Statistics Workflow Monitoring Application

An application based on a SQL Server database, Microsoft Excel, and Microsoft Power BI that monitored the workflow of data through a structured set of editing and modeling steps.

Automated triggering actions were embedded in end-user applications to signal and record status within the overall data set management process. We then utilized Excel- and Power BI-based reporting to allow managers of data scientists and research analysts to monitor the overall readiness of data and inspect it for quality before allowing publication to customers.

Natural Language Processing of Research Documents

I led a project to process a large set of published technology research documents to gain insights into the technology topics covered, topic distributions over time, and sentiment related to those topics. We utilized Python to extract relevant content from XML sources, stored it in an intermediary relational database format, and then processed the content using NLTK and Scikit-learn.
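As an illustration only, the extract-and-store step described above might look like the following sketch. The XML layout, table name, and helper functions are hypothetical, an in-memory SQLite database stands in for the intermediary relational store, and a simple keyword count stands in for the NLTK and Scikit-learn processing.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample input; the real XML sources are not described in detail.
SAMPLE_XML = """
<documents>
  <document id="d1" published="2017-03-01">
    <title>Cloud adoption trends</title>
    <body>Cloud platforms continue to grow.</body>
  </document>
  <document id="d2" published="2018-06-15">
    <title>Edge computing outlook</title>
    <body>Edge and cloud workloads converge.</body>
  </document>
</documents>
"""


def extract_documents(xml_text):
    """Yield (doc_id, year, title, body) tuples from the XML text."""
    root = ET.fromstring(xml_text.strip())
    for doc in root.iter("document"):
        year = doc.get("published")[:4]
        yield (doc.get("id"), year, doc.findtext("title", ""), doc.findtext("body", ""))


def load(conn, records):
    """Store extracted records in a relational table (assumed schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents "
        "(doc_id TEXT PRIMARY KEY, year TEXT, title TEXT, body TEXT)"
    )
    conn.executemany("INSERT INTO documents VALUES (?, ?, ?, ?)", records)


def mentions_by_year(conn, term):
    """Count documents per year whose title or body mentions the term."""
    rows = conn.execute(
        "SELECT year, COUNT(*) FROM documents "
        "WHERE lower(title || ' ' || body) LIKE ? "
        "GROUP BY year ORDER BY year",
        (f"%{term.lower()}%",),
    )
    return dict(rows.fetchall())


conn = sqlite3.connect(":memory:")
load(conn, extract_documents(SAMPLE_XML))
cloud_counts = mentions_by_year(conn, "cloud")
edge_counts = mentions_by_year(conn, "edge")
```

Keeping the extracted text in a relational table before any modeling means the downstream processing can be rerun or swapped without re-parsing the XML.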

We stored the processing results in a format suitable for analytics and reporting, and constructed a data visualization and dashboard navigation experience.

I served as technical project lead, business analyst, and solution architect, designing the system and constructing the visualizations and analytics most useful for achieving business objectives.

I also established the production Python execution environment, refactored the data scientists' NLP code to make it suitable for more generic processing, and created performance-tuned production code. I designed and built the analytics data storage and all the Power BI data visualizations and navigation experience for researchers and managers.

Insights were gained as to sentiment related to tech topics, and the business was enabled to explore changes in sentiment and topics over time.

Forecast Modeling Workbench

I deployed a third-party forecast modeling tool, integrated it into the existing portfolio of data management applications, and enhanced it to utilize a proprietary matrix algebra-based best-fit algorithm to allocate changes in data across multi-dimensional data sets.

The proprietary matrix algebra algorithm was core to taking high-level modeled forecast results and cascading the changes to the multidimensional data across all dimensions of the cube space simultaneously.

This was previously a very time-intensive process in which quality was difficult to assure. The best-fit approach allowed research analysts to focus on the few data facts they "knew" to be true and adjust the detailed dimensional data to honor those known facts while adjusting all related detailed data in a statistically appropriate manner.
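The proprietary algorithm itself is not described here, so the following is only a much simpler stand-in for the general idea: one-dimensional proportional scaling in which "known" cells are pinned and the remaining cells absorb the change in the total. The function name, cell labels, and numbers are hypothetical.

```python
def allocate_to_target(cells, target_total, pinned=None):
    """Scale unpinned cells proportionally so all cells sum to target_total.

    cells:        mapping of cell label -> current value
    target_total: the new high-level total to honor
    pinned:       mapping of cell label -> value the analyst "knows" is true
    """
    pinned = pinned or {}
    adjustable = {k: v for k, v in cells.items() if k not in pinned}
    remaining = target_total - sum(pinned.values())
    base = sum(adjustable.values())
    if base == 0:
        raise ValueError("no adjustable cells with a non-zero value")
    # Each adjustable cell keeps its share of the adjustable subtotal.
    result = {k: v * remaining / base for k, v in adjustable.items()}
    result.update(pinned)
    return result


# Hypothetical example: the total moves from 100 to 120 and AMER is known to be 30.
current = {"EMEA": 40.0, "APAC": 35.0, "AMER": 25.0}
adjusted = allocate_to_target(current, target_total=120.0, pinned={"AMER": 30.0})
```

A real multidimensional version would have to satisfy totals along every dimension at once, which is where a matrix-based best-fit formulation comes in; this sketch handles a single total only.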

I led the technical team to implement the proprietary algorithm in Java and Groovy and create data integration APIs for live data flows into and out of data stores during the algorithm execution process.

My primary tasks were architecting the solution, leading the engineering team, implementing the key components, and determining how to test and troubleshoot a very complex process.

GMSPUB | Client Data Deliverable Generator

A dynamic and flexible application to automate the generation of all client data deliverable content using metadata and content templates. This was driven by an event-oriented workflow engine that supported long-running task sets.

I led the solution architecture and design of this system, along with the engineering team. I made code contributions on the database back-end side (Transact-SQL) and in Microsoft Office automation (Excel VBA and .NET).

Reporting was created—first in Microsoft Excel and later using Power BI—to monitor system and deliverables status within the overall workflow and to report error capture.

My thought leadership contribution was to emphasize the metadata-driven process and the idea that new deliverable content could be added and generated without modifying the codebase. This needed a creative use of metadata and templates to refine the deliverables' visual output content, which included a variety of tables, aggregations, pivot tables, and graphs.

Over time, I contributed ideas to extend the system from generating file-based content (Excel) to generating data sets for piping data into client-facing website apps.

Research analysts were self-empowered to generate time-critical client deliverables.

Dynamic Data Warehouse

A system to dynamically generate data warehouse storage structures (star schemas) from metadata.

We designed and constructed a user interface that allowed business analysts to define data structures and initiate data warehouse refreshes. The system triggered a process that dynamically generated the SSAS tabular models used for data analytics.

We chose to make editing data in user interfaces the event triggers, which would refresh the data in the data warehouse structures. This choice enabled app users to refresh (on-demand, in near real-time) responsive data warehouse data and associated analytic cubes (with support for partial cube refreshes).

I served as lead architect and lead of the engineering team that constructed this system over a two-year period (incremental value-added delivery). I contributed Transact-SQL code and SSAS artifacts. I was the primary thought leader for the design and technical features that enabled the responsiveness of the system. This approach eliminated traditional batch data warehouse ETL refreshes, dramatically shortened refresh cycles through support for partial data warehouse content refreshes, and provided the near real-time refresh responsiveness so critical for business processes.
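A minimal sketch of the metadata-driven idea, assuming a hypothetical metadata model: dimension and fact definitions are plain data, and the star schema DDL is generated from them rather than hand-coded. The class names, naming conventions, and sample tables are illustrative and not taken from the actual system, which also generated SSAS tabular models.

```python
from dataclasses import dataclass, field


@dataclass
class Dimension:
    name: str                    # e.g. "region"
    attributes: dict[str, str]   # column name -> SQL type


@dataclass
class Fact:
    name: str
    measures: dict[str, str]     # column name -> SQL type
    dimensions: list[Dimension] = field(default_factory=list)


def dimension_ddl(dim: Dimension) -> str:
    """Build a CREATE TABLE statement for one dimension table."""
    cols = [f"{dim.name}_key INT PRIMARY KEY"]
    cols += [f"{col} {sql_type}" for col, sql_type in dim.attributes.items()]
    return f"CREATE TABLE dim_{dim.name} (\n    " + ",\n    ".join(cols) + "\n);"


def fact_ddl(fact: Fact) -> str:
    """Build the fact table with one foreign key per dimension."""
    cols = [
        f"{d.name}_key INT REFERENCES dim_{d.name}({d.name}_key)"
        for d in fact.dimensions
    ]
    cols += [f"{col} {sql_type}" for col, sql_type in fact.measures.items()]
    return f"CREATE TABLE fact_{fact.name} (\n    " + ",\n    ".join(cols) + "\n);"


def star_schema_ddl(fact: Fact) -> list[str]:
    """Dimensions first, then the fact table that references them."""
    return [dimension_ddl(d) for d in fact.dimensions] + [fact_ddl(fact)]


# Hypothetical metadata, as a business analyst might define it in a UI.
region = Dimension("region", {"region_name": "VARCHAR(100)"})
period = Dimension("period", {"year": "INT", "quarter": "INT"})
shipments = Fact(
    "shipments",
    {"units": "BIGINT", "revenue": "DECIMAL(18, 2)"},
    [region, period],
)

statements = star_schema_ddl(shipments)
```

Because the structures are derived from metadata, adding a dimension or measure becomes a data change rather than a code change, which is the property the project description emphasizes.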

Salesforce Automation and CRM System

I served as the development lead on a large-scale enterprise SFA implementation and an associated reporting system for sales leadership. The system was configured to fit the company's sales process to optimize efficiency and give sales management visibility into the sales pipeline and deal-close success.

I served as a liaison to the sales leadership, which included the GVP of sales, and led an external consulting team of 20 engineers and PMs. I prioritized the features implemented and delivered the highest-value apps, reports, and analytics incrementally. I led the incremental rollout and a team of ten engineers who took delivery from the consulting firm and enhanced the application portfolio over three years.

Sales processes were drastically streamlined, highlighting best practices that had the highest revenue contribution and improved close rates. The system served to enhance the business process and sales processes.

I was responsible for contributing a design whose impact on application adoption could be measured. I also carefully measured usage and constructed reports and dashboards, allowing the sales management team to steer actions to ensure adoption and successful outcomes.

Public Website Client Dataset Query User Experience

A series of web applications to navigate and query data content on a public website for clients. Datasets were published to the back-end system and then "released" to be available on the public website and front-end client web applications.

I served as a solution architect and lead of the engineering teams to construct a data store and associated REST APIs to query a set of dimensional datasets. I served as an advocate and key advisor for the user experience so that clients could easily navigate, query, visualize, and download data subsets of interest for large-scale published data deliverables.

I utilized my specialized knowledge of dimensional data to advise on realistic and useful experiences for clients digesting data content. Custom, innovative visual mechanisms were utilized to navigate the data content. In particular, it was imperative to prevent users from receiving empty results when filtering data. Therefore, I promoted a microservices approach in which the user interface reacted to "clicks" and adjusted front-end content and choices so that there were never empty results.

We implemented a fast SSAS back end and tuned API performance to deliver those behaviors; this was one of my core contributions to the project.
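The "never empty results" behavior can be sketched as follows, assuming a small in-memory dataset in place of the SSAS back end and REST APIs. For each dimension, the sketch offers only the values that still return rows given the selections made on the other dimensions. The function names and sample rows are hypothetical.

```python
def matching_rows(rows, selections):
    """Return rows that satisfy every dimension -> value selection."""
    return [r for r in rows if all(r[d] == v for d, v in selections.items())]


def available_choices(rows, selections, dimensions):
    """For each dimension, list the values that would still return rows.

    A dimension's own selection is ignored when computing its choices, so the
    user can switch to a sibling value without passing through an empty state.
    """
    choices = {}
    for dim in dimensions:
        others = {d: v for d, v in selections.items() if d != dim}
        choices[dim] = sorted({r[dim] for r in matching_rows(rows, others)})
    return choices


# Hypothetical published dataset.
rows = [
    {"region": "EMEA", "segment": "Servers", "year": "2018"},
    {"region": "EMEA", "segment": "Storage", "year": "2019"},
    {"region": "APAC", "segment": "Servers", "year": "2019"},
]
dimensions = ["region", "segment", "year"]

# After the user clicks region = APAC, only compatible values remain on offer.
choices = available_choices(rows, {"region": "APAC"}, dimensions)
```

As long as the current selection returns rows, picking any single offered value keeps the result non-empty, which is the guarantee the interface needed after each click.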

GDW | Gartner Data Warehouse (Later BAW)

I led the software engineering team to design and build Gartner's first enterprise data warehouse. This system integrated data from financial, order entry, sales, and CRM systems to provide operational views of company performance. A star schema was utilized as well as a snowflake dimensional design.

I led data architecture, data modeling, vendor tool selection, and reporting and dashboard solutions. Informatica was selected as the ETL tool and Oracle for the database and reporting. As well as serving as architect and leading requirements gathering and documentation, I helped implement the system. In the initial years, Oracle's reporting was changed to Microsoft Power BI after I led a cross-functional evaluation team to select an enterprise standard.

Insights were gained into which factors most influenced retention, which led to shifting sales actions toward those that most impacted renewals. This ultimately allowed Gartner to achieve double-digit revenue growth.

My contributions to optimizing the data flows reduced the data warehouse refresh time from over eight hours to four and set in place strategies to expand the subjects added to the data warehouse.


T-SQL (Transact-SQL), SQL, Excel VBA, C#, C++, Fortran, RPG, COBOL, Python 3, C#.NET, Java, JavaScript, MDX, JavaScript 6, Python, XML, Snowflake, Stored Procedure


Power Pivot, Microsoft Power BI, Microsoft Excel, Supervisor, Salesforce Sales Cloud, Informatica ETL


Database Design, App Development, Software Testing, Business Intelligence (BI), Acceptance Test-driven Development (ATDD), ETL, ETL Implementation & Design, OLAP, Test-driven Development (TDD), Agile, Object-oriented Programming (OOP), RESTful Development

Industry Expertise

Project Management


Database Programming, Microsoft SQL Server, Databases, Data Pipelines, SQL Server Integration Services (SSIS), SQL Server Analysis Services (SSAS), MySQL, Master Data Management (MDM), PL/SQL, PL/SQL Developer, SSAS Tabular, PostgreSQL, Amazon S3 (AWS S3), Oracle RDBMS, SQL Server Reporting Services (SSRS), Oracle PL/SQL, JSON, SQL Server DBA, Database Modeling


Programming, Systems Analysis, Relational Database Design, Data Warehouse Design, Technical Project Management, Coaching, Cost Estimation, IT Strategy, Quantrix, Data Warehousing, Debugging, Data Analysis, Complex Data Analysis, Data Aggregation, Data Migration, Data Engineering, Reports, BI Reports, Data Profiling, Software Development Lifecycle (SDLC), Data Modeling, Star Schema, Algorithms, Computer Science, Groovy Scripting, Contract Negotiation, DAX, Data Visualization, IT Management, Data Security, Statistics, Probability Theory, Calculus, Networking, Natural Language Processing (NLP), Negotiation, Wasabi, Forecasting, Modeling, Machine Learning, Azure Data Factory, ETL Testing, Query Optimization, Rapid Development, Architecture, Data Build Tool (dbt), Prefect, GPT, Generative Pre-trained Transformers (GPT)




Google Cloud Platform (GCP), Amazon Web Services (AWS), Amazon EC2, Oracle, Unix, Ubuntu Linux, Azure, Azure Synapse


Node.js, WebRTC, REST APIs, NumPy, Pandas, Scikit-learn, Natural Language Toolkit (NLTK), D3.js, React

2005 - 2007

Master's Degree in Computer Science

Boston University - Boston, MA, United States

2003 - 2005

Bachelor's Degree in Computer Programming

American Intercontinental University - Chandler, AZ, United States

1999 - 2001

Associate's Degree in Computer Programming

Housatonic Community College - Bridgeport, CT, United States
