
Paul Butler

Verified Expert in Engineering

Principal Architect and Developer

Location
Riverhead, NY, United States
Toptal Member Since
October 21, 2020

Backed by 20+ years of experience, Paul has served as a senior data architect, principal architect, technical project manager, and engineer across areas such as marketing, CDP, SFA, CRM, data warehousing, data modeling, reporting, data visualization, data science, and analytics. He focuses on creating innovative solutions closely aligned with business value. A keen strategist, Paul enjoys applying his combined business and technical expertise to create high-impact solutions.

Portfolio

BRP
Architecture, Snowflake, Microsoft Power BI, Data Build Tool (dbt)...
ShareVR
Amazon S3 (AWS S3), Wasabi, WebRTC, JavaScript 6, PostgreSQL, Node.js...
Gartner
Snowflake, Star Schema, Contract Negotiation, Negotiation, IT Strategy...

Experience

Availability

Part-time

Preferred Environment

Data Warehouse Design, Data Engineering, ETL, SQL, Microsoft SQL Server, Microsoft Power BI, Snowflake, Data Build Tool (dbt), Data Modeling, Data Architecture

The most amazing...

...thing I've built is a dynamic data management system. I managed its full lifecycle, from transactional input and analytics to delivery of the web apps.

Work Experience

Senior Data Architect

2020 - PRESENT
BRP
  • Established a cloud-based data analytics platform built on Snowflake.
  • Conducted a POC with dbt to establish the standard data transformation technology.
  • Designed the data analytics platform's data models, including definitions of data domains and data layers (RAW, GOLD, MART).
  • Conducted a POC of Data Vault 2.0 modeling; adopted a hybrid approach for the new analytics platform.
  • Designed a data integration application for a complex customs partner agency.
  • Coded Snowflake views, stored procedures (JavaScript and SQL-based), and UDFs.
  • Conducted a POC for data orchestration using Prefect; facilitated the evaluation of alternatives such as Airflow, Dagster, and others, and coded the POC in Python 3 with Prefect for representative use cases.
  • Conducted a POC of Snowflake Snowpark using Python and VS Code.
Technologies: Architecture, Snowflake, Microsoft Power BI, Data Build Tool (dbt), Data Modeling, Python 3, Prefect, Azure Synapse, Stored Procedure, Data Warehouse Design, Data Analysis, Complex Data Analysis, Databases, Data Migration, Data Aggregation, ETL Implementation & Design, Data Engineering, Data Profiling, Azure, Azure Data Factory, Database Modeling, Data Architecture, Data Visualization, SQL Server Analysis Services (SSAS), Database Design, Modeling, Machine Learning, BI Reports, Data Pipelines, Acceptance Test-driven Development (ATDD), Debugging, Google Cloud Platform (GCP), Data Warehousing, Test-driven Development (TDD), Programming, Software Development Lifecycle (SDLC), Coaching, JSON, Negotiation, Microsoft Excel, SQL, Data Analytics, Dashboards

Technologist

2019 - 2020
ShareVR
  • Advised the founder regarding software technologies to enable the business strategy for a 360° VR video content startup.
  • Evaluated and implemented POCs for cloud-based software products and services.
  • Led R&D, solution selection, product, and software component evaluation.
  • Established the website, cloud-based data storage (Wasabi), and data storage integration using Rclone on Ubuntu.
  • Utilized Node.js to tailor a YouTube-clone website and replace the default 360° VR player with a Viblast player.
  • Developed a test plan to measure end-user performance and peer-to-peer utilization, which enabled playback of high-resolution 360° VR video.
Technologies: Amazon S3 (AWS S3), Wasabi, WebRTC, JavaScript 6, PostgreSQL, Node.js, Databases, Amazon EC2, Java, Database Design, Project Management, Amazon Web Services (AWS), Debugging, Programming, Software Development Lifecycle (SDLC), JSON, Negotiation, Contract Negotiation, Microsoft Excel, SQL

Senior Director of Software Engineering

2001 - 2019
Gartner
  • Led software engineering teams to create and maintain a portfolio of applications that managed market statistics data through its entire lifecycle, including client deliverables and public web applications for querying data content deliverables.
  • Created an application portfolio supporting over $600 million in annual subscription revenue from predictive datasets.
  • Served as the lead architect in selecting Power BI as the standard data visualization product for the line of business and created key dashboards; also led the effort to migrate the entire portfolio to AWS cloud environments.
  • Developed a series of dashboards that improved business analyst productivity, reducing required data analyst FTEs by 7% and eliminating weeks from time to market for key deliverables.
  • Established the Python production engineering environment and built a natural language processing application using Python, Pandas, NLTK, NumPy, and NLP libraries to analyze documents.
  • Designed and built a dynamic data warehouse (SSIS, SSAS, .NET REST APIs) that supported data content quality review, automated data flows, and generated data content deliverables. The data structures could be changed without making code changes.
  • Reduced portfolio support costs by 40% via application stability and performance improvements over the course of three years.
  • Served as a lead architect to establish the company's first enterprise data warehouse environment, which included setting development standards, performing star and snowflake data modeling, and selecting reporting, ETL, and OLAP tools.
  • Assisted the data warehouse team in establishing Azure SQL and SSAS on Gartner's Azure cloud. Advised in an architectural capacity on design and data refresh strategies and topics around tradeoffs for SSAS tabular vs. multidimensional implementation.
Technologies: Snowflake, Star Schema, Contract Negotiation, Negotiation, IT Strategy, Cost Estimation, Coaching, Technical Project Management, Algorithms, Natural Language Processing (NLP), Python 3, Microsoft Excel, Groovy Scripting, REST APIs, RESTful Development, SQL Server Integration Services (SSIS), SQL Server Analysis Services (SSAS), Microsoft SQL Server, Microsoft Power BI, Data Warehouse Design, Complex Data Analysis, Data Analysis, Databases, Data Migration, Data Aggregation, ETL Implementation & Design, Data Engineering, Modeling, Data Profiling, Database Modeling, Data Architecture, Data Visualization, PL/SQL, PL/SQL Developer, Java, Power Pivot, Database Design, Project Management, Statistical Methods, Forecasting, BI Reports, Reports, SQL Server DBA, Data Pipelines, Acceptance Test-driven Development (ATDD), MySQL, Debugging, Data Warehousing, Test-driven Development (TDD), DAX, IT Management, Programming, Software Development Lifecycle (SDLC), Supervisor, JSON, Object-oriented Programming (OOP), SQL, Data Analytics, Dashboards, Reporting

Principal Architect, Data Architect

2005 - 2010
Gartner
  • Led all aspects of software and data architectures. Managed the technical development of a portfolio of data applications related to Gartner product offerings. Established, justified, and rationalized investment priorities and set scope, budget, and timelines.
  • Set a strategic technical direction for applications and analytics. Built an engineering team and oversaw career development for a team of seven engineers and leads.
  • Improved the time to market for data-centric, client-facing deliverables and front-end applications. Generalization allowed expansion to five new product offerings without waiting for individual development projects.
  • Served as an architect and technical lead to build back-end data services and UX for web-based query and data extraction applications on Gartner.com. This resulted in Gartner's first online user experience for its market statistics offering.
  • Created web-based user interfaces with a dynamic back-end query engine, allowing exploration, navigation, and extraction of large datasets on Gartner.com for the first time via an interactive web application.
Technologies: SQL, Business Intelligence (BI), Data Warehousing, Java, SQL Server Integration Services (SSIS), SQL Server Analysis Services (SSAS), SSAS Tabular, C#.NET, Visual Basic for Applications (VBA), Excel VBA, Data Architecture, ETL, IT Strategy, Data Integration, Salesforce Sales Cloud, Algorithms, Team Leadership, Data Analytics, Dashboards, Reporting

Data Model and Algorithm

I served as a data architect, designing the data model and algorithm to match potentially duplicate consumer data records and merge them into unique individuals. This used a nearest neighbors methodology to determine whether multiple candidate consumer records from multiple sources (such as lead management or vehicle registration processes) were, in fact, the same individual. I used Snowflake, dbt, and Python for the implementation and coded aspects of this application while serving as lead for the overall implementation.

As a result, the Snowflake analytics platform is the single source of truth regarding consumer data, integrating many sources and facilitating the assembly of data for a customer data platform (Tealium).
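
For illustration, a minimal Python sketch of the nearest-neighbors matching idea follows; the records, features, and distance threshold are hypothetical stand-ins, not the production logic:

```python
# A minimal sketch of nearest-neighbors identity matching (illustrative only;
# records, features, and the threshold below are hypothetical).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors

# Candidate consumer records from multiple sources (leads, registrations, ...).
records = [
    "jane doe jane.doe@example.com 02134",
    "jane m doe janedoe@example.com 02134",
    "john smith jsmith@example.com 10001",
]

# Character n-grams tolerate typos and formatting differences across sources.
vec = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 3))
X = vec.fit_transform(records)

# For each record, find its nearest non-self neighbor by cosine distance.
nn = NearestNeighbors(n_neighbors=2, metric="cosine").fit(X)
dist, idx = nn.kneighbors(X)

THRESHOLD = 0.35  # hypothetical cutoff; tuned against labeled pairs in practice
for i, (d, j) in enumerate(zip(dist[:, 1], idx[:, 1])):
    if d < THRESHOLD:
        print(f"records {i} and {j} likely match (distance={d:.2f})")
```

In production, pairs under the threshold would feed a merge/survivorship step rather than a print statement.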

Data Integration Applications

I implemented a series of data integration applications to share data from the enterprise analytics platform (Snowflake) with customs processing partners. This facilitated the shipping and delivery of all BRP products across a variety of borders around the world. Specific data had to be pre-processed following highly complex business rules and transformed into formats suitable for sharing with the customs partners via web APIs. I designed and built a system on Snowflake to log and monitor the flow of data through the preparation and delivery process for all products. I also designed and delivered a large number of Snowflake SQL views via dbt projects.

I designed algorithms to properly identify incremental data sets and the conditions for resending information under a variety of business rules.
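
A minimal sketch of the high-watermark-plus-resend idea follows; the fields, statuses, and the resend rule are hypothetical, not BRP's actual business rules:

```python
# A minimal sketch of incremental extraction with a resend condition
# (illustrative; column names and rules are invented).
from datetime import datetime, timedelta

def select_increment(rows, last_watermark, resend_window=timedelta(days=2)):
    """Pick rows changed since the watermark, plus recent rows flagged for resend."""
    increment = []
    for row in rows:
        changed = row["updated_at"] > last_watermark
        # Example business rule: resend recently corrected customs rows even if
        # the watermark already covered them.
        resend = (
            row.get("customs_status") == "corrected"
            and row["updated_at"] > last_watermark - resend_window
        )
        if changed or resend:
            increment.append(row)
    return increment

rows = [
    {"id": 1, "updated_at": datetime(2023, 5, 2), "customs_status": "sent"},
    {"id": 2, "updated_at": datetime(2023, 5, 4), "customs_status": "corrected"},
]
print(select_increment(rows, last_watermark=datetime(2023, 5, 3)))
```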

Snowflake as a Self-serve Analytic Platform

I evaluated cloud-based analytic platforms and chose Snowflake as the new self-serve, data mesh-oriented analytics platform. I helped the client establish the new platform's data modeling standards, ETL/ELT technology, standardized ELT design patterns, data organization, and data flow. I conducted POCs with dbt (chosen as the ELT tool), Snowflake stored procedures, and Prefect (for data orchestration). I designed many core data entities for consumption by end users, along with both ODS and data mart structures.
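
To give a flavor of the orchestration POC, here is a minimal Prefect flow sketch; the task names, sources, and dbt selector are hypothetical:

```python
# A minimal Prefect 2.x flow sketch (illustrative; sources and the dbt
# selector are invented, not the actual platform configuration).
import subprocess

from prefect import flow, task

@task(retries=2)
def extract_to_raw(source: str) -> str:
    # In the real platform this would land source data in Snowflake's RAW layer.
    print(f"extracting {source} to RAW")
    return source

@task
def run_dbt_models(selector: str) -> None:
    # Transformations were standardized on dbt; here we simply shell out to it.
    subprocess.run(["dbt", "run", "--select", selector], check=True)

@flow
def daily_elt():
    for source in ["orders", "customers"]:
        extract_to_raw(source)
    run_dbt_models("marts")

if __name__ == "__main__":
    daily_elt()
```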

Dynamic Data Warehouse

A system to dynamically generate data warehouse storage structures (star schemas) from metadata.

We designed and constructed a user interface that allowed business analysts to define data structures and initiate data warehouse refreshes. The system triggered a process that dynamically generated the SSAS tabular models serving data analytics.

We chose to make data edits in the user interfaces the event triggers that refreshed data in the data warehouse structures. This choice enabled app users to refresh warehouse data and the associated analytic cubes on demand, in near real time, with support for partial cube refreshes.

I served as lead architect and led the engineering team that constructed this system over a two-year period of incremental, value-added delivery. I contributed Transact-SQL code and SSAS artifacts. I was the primary thought leader on the design and the technical features that enabled the system's responsiveness. This approach avoided traditional batch data warehouse ETL refreshes (ETL was eliminated), dramatically shortened refresh cycles through support for partial DW content refreshes, and delivered the near real-time responsiveness so critical to business processes.
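
Conceptually, the metadata-to-DDL generation at the heart of the system looked something like the following sketch; the metadata format, names, and types are hypothetical, and the production system also generated SSAS tabular artifacts:

```python
# A minimal sketch of generating star-schema DDL from analyst-defined metadata
# (illustrative; the metadata format and names are invented).
FACT_META = {
    "name": "fact_market_stats",
    "dimensions": ["dim_region", "dim_product", "dim_period"],
    "measures": {"units": "BIGINT", "revenue": "DECIMAL(18,2)"},
}

def star_schema_ddl(meta: dict) -> str:
    """Render a fact-table CREATE statement from the metadata."""
    cols = [f"{d}_key INT NOT NULL" for d in meta["dimensions"]]
    cols += [f"{m} {t}" for m, t in meta["measures"].items()]
    body = ",\n    ".join(cols)
    return f"CREATE TABLE {meta['name']} (\n    {body}\n);"

# Analyst edits to the metadata regenerate the DDL (and downstream models)
# without code changes; edit events trigger targeted, partial refreshes.
print(star_schema_ddl(FACT_META))
```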

Natural Language Processing of Research Documents

I led a project to process a large set of published technology research documents to gain insights into the technology topics covered, topic distributions over time, and sentiment related to those topics. We utilized Python to extract relevant content from XML sources, stored it in an intermediary relational database format, and then processed the content using NLTK and scikit-learn.
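
A minimal sketch of the pipeline's two stages, topic extraction and sentiment scoring, follows; the sample documents are invented, and the production code ran against content staged in the relational store:

```python
# A minimal topic/sentiment sketch with scikit-learn and NLTK (illustrative).
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "Cloud data warehousing adoption is accelerating and costs are falling.",
    "Security concerns slow blockchain adoption in the enterprise.",
]

# Topic modeling: term counts feed an LDA model.
vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
terms = vec.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    top = [terms[i] for i in topic.argsort()[-3:]]
    print(f"topic {k}: {top}")

# Sentiment per document via VADER.
nltk.download("vader_lexicon", quiet=True)
sia = SentimentIntensityAnalyzer()
for doc in docs:
    print(sia.polarity_scores(doc)["compound"], doc[:40])
```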

We stored the results of the processing in a format suitable for analytics, reporting, and data visualization, and constructed a dashboard navigation experience.

I served as technical project lead, business analyst, and solution architect to design the system and construct the visualizations and analytics most useful to achieve business objectives.

I also established the production Python execution environment, refactored data scientist NLP code to make it suitable for more generic processing, and performance-tuned the code for production. I designed and built the analytics data storage and all the Power BI data visualizations and the data navigation experience for researchers and managers.

The business gained insights into sentiment around tech topics and was able to explore changes in sentiment and topics over time.

Public Website Client Dataset Query User Experience

A series of web applications to navigate and query data content on a public website for clients. Datasets were published to the back-end system and then "released" to be available on the public website and front-end client web applications.

I served as a lead and solution architect for the engineering teams that constructed a data store and associated REST APIs to query a set of dimensional datasets. I was the advocate and key advisor for the user experience, so that clients could easily navigate, query, visualize, and download data subsets of interest from large-scale published data deliverables.

I utilized my specialized knowledge of dimensional data to advise on a realistic and useful client experience for digesting data content. Custom, innovative visual mechanisms were used to navigate the data content. In particular, it was imperative to prevent users from receiving empty results when filtering data, so I promoted a microservices approach in which the user interface reacted to clicks and adjusted front-end content and choices so that empty results never occurred.
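
The core of the never-empty-results idea can be sketched in a few lines of Python; this in-memory version is purely illustrative, with invented rows and filter names:

```python
# A minimal sketch of "no empty results" faceted filtering (illustrative;
# the production services computed the equivalent against an SSAS back end).
rows = [
    {"region": "EMEA", "segment": "Servers", "year": 2018},
    {"region": "EMEA", "segment": "Storage", "year": 2019},
    {"region": "APAC", "segment": "Servers", "year": 2019},
]

def valid_choices(rows, selections):
    """Given current selections, return only filter values that still yield rows."""
    remaining = [r for r in rows if all(r[k] == v for k, v in selections.items())]
    choices = {}
    for r in remaining:
        for k, v in r.items():
            if k not in selections:
                choices.setdefault(k, set()).add(v)
    return choices

# After the user clicks region=EMEA, the UI offers only choices with data behind them.
print(valid_choices(rows, {"region": "EMEA"}))
# {'segment': {'Servers', 'Storage'}, 'year': {2018, 2019}}
```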

We implemented a fast SSAS back end and tuned API performance to deliver those behaviors; this was one of my core contributions to the project.

Market Statistics Workflow Monitoring Application

An application based on a SQL Server database, Microsoft Excel, and Microsoft Power BI that monitored the workflow of data through a structured set of editing and modeling steps.

Automated triggering actions were embedded in end-user applications to signal and record status within the overall dataset management process. We then utilized Excel- and Power BI-based reporting to allow managers of data scientists and research analysts to monitor the overall readiness of data and inspect it for quality prior to publication to customers.
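
The embedded signaling amounted to something like the following minimal sketch; the in-memory log stands in for the SQL Server status tables, and the dataset and step names are invented:

```python
# A minimal sketch of embedded workflow status signaling (illustrative; the
# real system wrote events to SQL Server and reported via Excel/Power BI).
from datetime import datetime, timezone

STATUS_LOG = []  # stand-in for the workflow status table

def signal_status(dataset: str, step: str, status: str) -> None:
    """Called from end-user editing/modeling apps to record workflow progress."""
    STATUS_LOG.append({
        "dataset": dataset, "step": step, "status": status,
        "at": datetime.now(timezone.utc),
    })

def ready_to_publish(dataset: str, required_steps: set) -> bool:
    done = {e["step"] for e in STATUS_LOG
            if e["dataset"] == dataset and e["status"] == "complete"}
    return required_steps <= done

signal_status("pc_forecast_q3", "edit", "complete")
signal_status("pc_forecast_q3", "model", "complete")
print(ready_to_publish("pc_forecast_q3", {"edit", "model", "quality_review"}))  # False
```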

Forecast Modeling Workbench

I deployed a third-party forecast modeling tool, integrated it into the existing portfolio of data management applications, and enhanced it to utilize a proprietary matrix algebra-based best-fit algorithm to allocate changes in data across multi-dimensional data sets.

The proprietary matrix algebra algorithm was core to taking high-level modeled forecast results and cascading the changes to multidimensional data simultaneously for all dimensionalities of the cube space.

This was previously a very time-intensive process in which quality was difficult to ensure. The best-fit approach allowed research analysts to focus on a few data facts they "knew" to be true, and the system figured out how to adjust the detailed dimensional data to honor those known facts while changing all related detail in a statistically appropriate manner.
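
A toy Python sketch of the cascading idea follows; the real best-fit algorithm was proprietary matrix algebra, and this simple proportional allocation is only a stand-in:

```python
# A minimal sketch of cascading a known aggregate into detail cells
# (illustrative; not the proprietary matrix-algebra best fit).
import numpy as np

detail = np.array([[10.0, 30.0],   # region x segment detail from the cube
                   [20.0, 40.0]])

known_total = 120.0  # a fact the analyst "knows" to be true for the whole market

# Proportionally spread the change so the relative structure is preserved.
adjusted = detail * (known_total / detail.sum())

print(adjusted)        # [[12. 36.] [24. 48.]]
print(adjusted.sum())  # 120.0 -- honors the known fact across all dimensions
```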

I led the technical team to implement the proprietary algorithm in Java and Groovy and create data integration APIs for live data flows into and out of data stores during the algorithm execution process.

My primary tasks were architecting the solution, leading the engineering team, implementing the key components, and determining how to test and troubleshoot a very complex process.

GMSPUB | Client Data Deliverable Generator

A dynamic and flexible application to automate the generation of all client data deliverable content using metadata and content templates. This was driven by an event-oriented workflow engine that supported long-running task sets.

I led the solution architecture and design of this system, along with the engineering team. I made code contributions on the database back-end side (Transact-SQL) and in Microsoft Office automation (Excel VBA and .NET).

Reporting was created—first in Microsoft Excel and later using Power BI—to monitor system and deliverables status within the overall workflow and to report error capture.

My thought leadership contribution was emphasizing the metadata-driven process and the idea that new deliverable content could be added and generated without modifying the codebase. This required creative use of metadata and templates to refine the deliverables' visual output, which included a variety of tables, aggregations, pivot tables, and graphs.
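
A minimal sketch of the metadata-plus-templates idea follows; the template and deliverable names are invented, and the real system drove Excel VBA and .NET automation from a workflow engine:

```python
# A minimal sketch of metadata-driven deliverable generation (illustrative).
from string import Template

TEMPLATES = {
    "summary_table": Template("Deliverable: $title\nRows: $row_count\nPivot on: $pivot"),
}

DELIVERABLE_META = [
    {"template": "summary_table", "title": "PC Shipments, EMEA",
     "row_count": 1842, "pivot": "vendor x quarter"},
    # Adding a new deliverable means adding metadata here -- no code changes.
]

for meta in DELIVERABLE_META:
    tpl = TEMPLATES[meta["template"]]
    print(tpl.substitute({k: v for k, v in meta.items() if k != "template"}))
```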

Over time, I contributed ideas to extend the system from generating file-based content (Excel) to generating datasets piped into client-facing website apps.

Research analysts became self-sufficient in generating time-critical client deliverables.

Salesforce Automation and CRM System

I served as the development lead on a large-scale enterprise SFA implementation of Salesforce.com and an associated reporting system for sales leadership.

Salesforce.com was configured to fit the company's sales process to optimize efficiency and provide sales management visibility to the sales pipeline and deal-close success.

I served as a liaison to sales leadership, including the GVP of sales, and led an external consulting team of 20 engineers and PMs. I prioritized the features implemented and incrementally delivered the highest-value apps, reports, and analytics. I managed the incremental rollout and led a team of 10 engineers who took delivery from the consulting firm and enhanced the application portfolio over three years.

Sales processes were drastically streamlined, highlighting best practices that had the highest revenue contribution and improved close rates. The system enhanced both the underlying business and sales processes.

I contributed a design whose impact on application adoption could be measured. I also carefully measured usage and constructed reports and dashboards that allowed the sales management team to steer actions to ensure adoption and successful outcomes.

GDW | Gartner Data Warehouse (Later BAW)

I led the software engineering team that designed and built Gartner's first enterprise data warehouse. This system integrated data from financial, order entry, sales, and CRM systems to provide operational views of company performance. Both star schema and snowflake dimensional designs were utilized.

I led data architecture and data modeling and selected vendor tools and reporting/dashboard solutions. Informatica was selected as the ETL tool and Oracle for the database and reporting. I served as an architect, led requirements gathering and documentation, and helped implement the system. In the initial years, Oracle's reporting was replaced with Microsoft Power BI after I led a cross-functional evaluation team to select an enterprise standard.

Insights were gained into which factors most influenced retention, which led to shifting sales actions toward those with the greatest impact on renewals. This ultimately helped Gartner achieve double-digit revenue growth.

My contributions to optimizing the data flows reduced the data warehouse refresh time from over eight hours to four and put in place strategies for expanding the subject areas added to the data warehouse.

Languages

T-SQL (Transact-SQL), SQL, Excel VBA, C#, Stored Procedure, C++, Fortran, COBOL, Python 3, C#.NET, Java, JavaScript, MDX, JavaScript 6, Python, XML, Snowflake, Visual Basic for Applications (VBA)

Tools

Power Pivot, Microsoft Power BI, Microsoft Excel, Supervisor, Salesforce Sales Cloud, Informatica ETL, Power BI Report Server

Paradigms

Database Design, App Development, Software Testing, Business Intelligence (BI), Acceptance Test-driven Development (ATDD), ETL, ETL Implementation & Design, OLAP, Test-driven Development (TDD), Agile, Object-oriented Programming (OOP), RESTful Development

Industry Expertise

Project Management

Storage

Database Programming, Microsoft SQL Server, Databases, Data Pipelines, Database Modeling, SQL Server Integration Services (SSIS), SQL Server Analysis Services (SSAS), MySQL, Master Data Management (MDM), PL/SQL, PL/SQL Developer, SSAS Tabular, PostgreSQL, Amazon S3 (AWS S3), Oracle RDBMS, SQL Server Reporting Services (SSRS), Oracle PL/SQL, JSON, SQL Server DBA, Data Integration, SQL Stored Procedures, Database Architecture

Other

Programming, Systems Analysis, Relational Database Design, Data Warehouse Design, Technical Project Management, Coaching, Cost Estimation, IT Strategy, Quantrix, Data Warehousing, Debugging, Data Analysis, Complex Data Analysis, Data Aggregation, Data Migration, Data Engineering, Reports, BI Reports, Data Profiling, Software Development Lifecycle (SDLC), Data Modeling, Star Schema, Data Architecture, Algorithms, Computer Science, Groovy Scripting, Contract Negotiation, DAX, Data Visualization, Dashboards, Reporting, IT Management, Data Security, Statistics, Probability Theory, Calculus, Natural Language Processing (NLP), Negotiation, Wasabi, Forecasting, Modeling, Machine Learning, Azure Data Factory, ETL Testing, Query Optimization, Rapid Development, Architecture, Data Build Tool (dbt), Prefect, Statistical Methods, Team Leadership, ERD, Data Analytics

Frameworks

.NET, ASP.NET

Platforms

Amazon Web Services (AWS), Amazon EC2, Oracle, Unix, Ubuntu Linux, Google Cloud Platform (GCP), Azure, Azure Synapse

Libraries/APIs

Node.js, WebRTC, REST APIs, NumPy, Pandas, Scikit-learn, Natural Language Toolkit (NLTK), D3.js, React

2005 - 2007

Master's Degree in Computer Science

Boston University - Boston, MA, United States

2003 - 2005

Bachelor's Degree in Computer Programming

American Intercontinental University - Chandler, AZ, United States

1999 - 2001

Associate's Degree in Computer Programming

Housatonic Community College - Bridgeport, CT, United States
