Art Vancil, Data Architect and Developer in Charlottesville, VA, United States

Member since January 14, 2020
Art has 25 years of data architecture and cloud-computing consulting experience, mostly in building enterprise data warehouses. He's an end-to-end solution architect and chief problem solver with a long history of focused execution against a statement of work and successful delivery in a team setting.

Preferred Environment

Azure, T-SQL, Microsoft Power BI, PostgreSQL, PL/SQL, Erwin, Azure Logic Apps, Redshift, AWS

The most amazing... I've developed is a hash join algorithm for joining many tables. This high-volume solution outperformed Db2's table joins by 66%.


  • Azure Data Warehouse Architect

    2021 - PRESENT
    American Associated Pharmacies
    • Designed and developed an Azure data warehouse using Azure SQL, Azure Data Factory, and Power BI to produce sales and profitability analysis and customer-facing reports embedded on the RXAAP website.
    • Created relational data models using IDERA ER/Studio; deployed data models to physical Azure SQL databases.
    • Developed ETL to load the data warehouse tables, using Azure Data Factory.
    • Created sales and rebate reports, embedded in the RXAAP website, using Power BI.
    Technologies: Data Modeling, Azure SQL, Microsoft Power BI, Azure Data Factory, ER/Studio Data Architecture
  • Technical Due Diligence Architect (Contractor)

    2020 - PRESENT
    Crosslake Technologies
    • Conducted technical due diligence examinations for investors and VC companies that were acquiring software companies. Examined documentation, evaluated architecture, and conducted interviews to assess overall technical health.
    • Evaluated a software company that operates a SaaS application for the home services industry, recommending cloud architecture changes and development organization changes to enhance application testing and high-quality delivery.
    • Evaluated a software company that operates a SaaS application for health and lifestyle optimization. Recommended cloud architecture changes and staffing changes to accelerate their results in machine learning and artificial intelligence.
    Technologies: Executive Presentations, Technical Reports, Machine Learning, Software Development Lifecycle (SDLC), Organization, IT Infrastructure, IT Operations, Software Architecture, Due Diligence
  • Cloud Data Architect (Contractor)

    2020 - 2020
    McKnight Consulting Group
    • Created big data benchmark comparisons of cloud vendor big data platforms.
    • Followed the TPC-H industry-standard specifications to compare performance on a 30TB test data set, which was loaded into each of the five databases.
    • Designed data partitioning and indexing uniquely for each vendor to define a massively parallel storage layout.
    • Optimized and rewrote queries to tune them to the highest level of performance.
    • Measured execution timings in each environment to publish comparisons.
    Technologies: Synapse, BigQuery, Snowflake, Redshift, Actian
  • MongoDB and Atlas Architect (Toptal Contractor)

    2020 - 2020
    Anthem Wellpoint
    • Compared Atlas and MongoDB versus DocumentDB versus DynamoDB to recommend the best performing solution for a real-time data streaming solution. Identified limitations and advantages of each tool.
    • Conducted the AWS Well-Architected review, recommending reliability and performance upgrades to the cloud environment.
    • Created a new AWS data-streaming architecture to combine batch and real-time data updates, transaction logging, and JSON document handling.
    • Evaluated and optimized a real-time data streaming application in AWS by introducing GraphQL and DocumentDB.
    Technologies: AWS DynamoDB, Atlas, DocumentDB, MongoDB, GraphQL, Apache Kafka, Amazon Web Services (AWS)
  • Azure Data Architect (Contractor)

    2020 - 2020
    BioTE Medical (through a development agency)
    • Created data models and cloud architecture models to dramatically restructure the enterprise databases for conversion to Azure cloud microservices. Transformed monolithic MDM into domain-based data stores.
    • Selected a data vault data design pattern. Implemented a GraphQL middleware for data virtualization.
    • Led the C++ .NET Core team to minimize production support impact upon project delivery.
    • Created Power BI dashboards and an OLAP data design, supporting sales and project performance.
    Technologies: Microsoft Power BI, Auth0, .NET Core, GraphQL, Azure Application Insights, Azure Logic Apps, Azure Cosmos DB, Redis, Azure Event Hubs, Apache Kafka, Azure SQL
  • Data Science Team Leader (Freelance)

    2018 - 2020
    TAMKO Building Products
    • Created the vision and strategy for a manufacturing data warehouse using relational star-schema storage, ETL, and Power BI dashboards.
    • Created Power BI interactive analytics dashboards with a Java front end to identify millions of dollars in cost savings and control the manufacturing process. Designed an OLAP data structure for reporting.
    • Proposed the data governance program including the IT, business, and PMO roles.
    • Supervised 12 developers and DBAs to enable self-service analytics through team leadership, data strategy, and an execution roadmap.
    Technologies: Microsoft Power BI, SAP HANA, Microsoft SQL Server, Azure
  • Global Center for Innovative Analytics Director

    1999 - 2018
    Hitachi Consulting
    • Prepared business cases and prepped data for the data science team. Delivered dozens of predictive analytics solutions for the manufacturing, mining, automotive, and transportation industries.
    • Defined the predictive maintenance solution offering, including solution architecture, software, and services components. Performed POCs and client engagements to implement the solutions.
    • Delivered large-scale global cloud migrations to AWS and Azure for financial services, pharmaceutical, and manufacturing companies including Hadoop, Redshift, DevOps, Impala, and Power BI.
    • Defined the big data product offering, including Hadoop hardware specifications, IoT machine data collection, and analysis.
    Technologies: Pentaho Data Integration (Kettle), Redshift, Microsoft SQL Server, Hadoop, Machine Learning


  • Implementation of Cloudera on AWS Platform for Leading Semiconductor Manufacturing Company

    I undertook a pilot implementation of Cloudera on the AWS platform, providing advice on technology tools, architecture, and ETL design. I also managed the project tasks and deliverables and designed and developed a supply chain traceability solution.

    The technology stack included AWS, Cloudera Hadoop, Hue, Impala, Hive, Sqoop, Superset, StreamSets, Tableau, Neo4J, SQL Server, and Oracle.

  • Data Extraction for Internet Banking Company

    I designed a strategy for high-volume data warehouse extracts, developing a daily metrics subsystem for 140 million accounts. I also delivered 100GB daily feeds from AWS to Marketing Cloud and tuned Redshift data storage and SQL script execution performance.

    The technology stack included Redshift and Marketing Cloud.

  • Asset Optimization Solution

    I defined the asset optimization solution strategy among Hitachi Group companies and managed software development of solution artifacts, including oversight of the offshore development team and the data science team. I implemented solutions for an equipment health index and for optimizing inspection cycles.

    The technology stack included Domo, Ammo, Pentaho, and Oracle Enterprise Asset Management (eAM).

  • Operations Data Warehouse for Fortune 100 Technology Services Company

    I designed a normalized, historical operations data warehouse using data vault design techniques. I also developed data models and implemented physical databases in Oracle and SQL Server. In addition, I sourced SAP data for loading the data warehouse and reproduced SAP utilization and labor cost calculations for the SQL Server star schema. Finally, I designed Informatica ETL mapping requirements, reviewed deliverables from multiple teams, and instructed the teams in data warehousing best practices.

    The technology stack included Erwin, Oracle, Informatica, Microsoft SQL Server, and SAP.

  • Microservices Enterprise Architecture for Pharmaceutical Services Company

    I redesigned a monolithic .NET application for native cloud microservices. I also designed an Event Hubs pub/sub messaging strategy and a normalized, historical operations data warehouse using data vault design techniques. I then developed data models and implemented physical databases in Azure DB, relying on rapid, agile development using data patterns and services patterns.

    The technology stack included Microsoft Power BI, Microsoft Azure SQL Database, Event Hubs, Logic Apps, Application Insights, and Angular.

  • Analytics Strategy and Data Warehouse for Leading Media and Entertainment Company

    I delivered data architecture review and strategy for multiple applications within the national facilities and network engineering department, proposing and assisting the transition to a corporate information factory architecture. In addition, I introduced normalized data modeling, data vault data modeling, and star-schema data modeling. I also proposed new roles to move toward an advanced analytics center of excellence and further SDLC steps to support design sprints and change control.

    The technology stack included OpenJDK, PostgreSQL, RabbitMQ, and Pentaho Data Integration (PDI).

  • Enterprise Data Strategy for Building Products Manufacturing Company

    As a data science team leader, I provided team leadership, data strategy, and an execution roadmap to enable self-service analytics. I also created a vision and strategy for a manufacturing data warehouse, leading two analytics development teams totaling 12 members. Finally, I delivered Power BI control charts, innovative analytics, and visualizations, saving millions of dollars.

    The technology stack included Microsoft SQL Server and SAP HANA environments, Power BI, and SAP Analytics Cloud.


  • Languages

    T-SQL, SQL, R, GraphQL, Snowflake, Python 2, Python
  • Paradigms

    Database Design, Data Science, ETL, Business Intelligence (BI), DevOps
  • Storage

    Azure SQL, Databases, Relational Databases, SQL Architecture, Redshift, SQL Server Management Studio, PL/SQL, Apache Hive, HDFS, PostgreSQL, DB, Microsoft SQL Server, Redis, Azure Cosmos DB, MongoDB, AWS DynamoDB, Netezza, Azure SQL Databases, MySQL, ER/Studio Data Architecture
  • Other

    Data Modeling, Data Management, Solution Architecture, IT Consulting, IT Project Management, Consulting, Data Warehouse Design, Leadership, Technical Design, Troubleshooting, Data Architecture, Data Analysis, Data Queries, Data, Data Marts, Relational Database Design, Healthcare Effectiveness Data and Information Set (HEDIS), Informatica, System Integration, AWS Cloud Architecture, Architecture, Software Design, Analytics, Software Development, Predictive Analysis, Performance Management, Manufacturing, Healthcare Delivery, Agile Data Science, Financial Services, Big Data Architecture, Clinical Quantitative Methods, Commercial Software, Business Process Analysis, Algorithms, Hue, AWS, Dashboards, Data Visualization, Cloud Architecture, Healthcare Management Systems, Data Vaults, Machine Learning, DocumentDB, Due Diligence, Software Architecture, IT Operations, IT Infrastructure, Organization, Software Development Lifecycle (SDLC), Technical Reports, Executive Presentations, Atlas, Document Search, Oracle R, Government, Data Governance, Electric Utilities, Data Center Management, Consumer Packaged Goods (CPG), Computer Hardware, Claims Administration, Parquet, Columnar Data Store, Benefits Administration, Big Data, Strategy, Internet of Things (IoT), Data Engineering, Azure Data Factory, Computer Science, Marketing Cloud, SAP
  • Frameworks

    Hadoop, .NET Core
  • Tools

    Informatica ETL, Erwin, Impala, Microsoft Power BI, Lucidchart, STATA, Pentaho Data Integration (Kettle), Azure Application Insights, Auth0, Actian, BigQuery, Synapse, Tableau, Azure Logic Apps, Cloudera, RabbitMQ
  • Platforms

    Azure Event Hubs, Oracle, SAP HANA, Apache Kafka, AWS EC2, Salesforce, Amazon Web Services (AWS), Azure, Pentaho
  • Industry Expertise

    Banking & Finance, Automotive, Healthcare


  • Coursework in Quantitative Methods in Clinical and Public Health Research
    2012 - 2013
    Harvard University - Cambridge, MA, United States
  • Bachelor of Science Degree in Computer Science
    1973 - 1976
    Louisiana Tech University - Ruston, LA, United States


  • Data Science Essentials
    MAY 2013 - PRESENT
  • Quantitative Methods in Clinical and Public Health Research
    Harvard Medical School and Harvard School of Public Health
  • Certified Cloud Security Knowledge (CCSK)
    Cloud Security Alliance
  • Certified Computing Professional
    APRIL 1995 - PRESENT
    Institute for Certification of Computing Professionals
