Alex Clark
Verified Expert in Engineering
Data Engineer and Developer
Seattle, WA, United States
Toptal member since November 14, 2022
Alex is an innovative and experienced big data engineer, skilled in a wide variety of tools and technologies. He is competent in all aspects of the data science process, including data ingestion, storage, transformation, and statistical modeling. Alex has a proven track record of thoughtfully dissecting business problems, vetting requirements, and then designing the processes to address them.
Preferred Environment
Python 3, Amazon Elastic MapReduce (EMR), Apache Hive, PySpark, AWS Lambda, Amazon CloudWatch, Amazon DynamoDB, Amazon Simple Queue Service (SQS), SQL, Databases, Amazon Web Services (AWS)
The most amazing solution I've developed is an internal API-based billing system.
Work Experience
Data Engineer
D2 Nova
- Evaluated the client's SQL database and provided recommendations for improving design and query performance.
- Implemented MySQL table partitioning and composite indexes, which improved query performance from 2,240 milliseconds to 10 milliseconds.
- Designed and implemented a MongoDB database with a lightweight front end built in Flask. Created a custom search functionality that allows users to search for partial text matches in several languages while maintaining low latency.
Freelance Data Engineer
ProjectPro
- Developed a CDK project that deploys data to S3, creates a data pipeline in EMR, and transforms the data using Hive. Connected the EMR cluster to Microsoft Power BI and created data visualizations.
- Architected a CDK project/data pipeline that extracts data from a SQL database in RDS, loads incremental data from an API using AWS Lambda, and transforms data using Spark.
- Created detailed written and video documentation for each project.
Data Engineer
Amazon.com
- Gathered data from disparate sources to identify and deprecate low-performing content.
- Collaborated with stakeholders, software engineers, and managers to design and construct an internal API-based billing system.
- Developed an API pricing model for the platform's primary services.
- Built an attribution model to assign customer orders to customer actions.
- Organized and maintained data storage systems, including relational databases, big data systems, and serverless technologies.
- Constructed data pipelines and managed ETL processes.
- Developed metrics to analyze and report on page performance and customer retention.
- Created custom MapReduce jobs to parse complex, high-volume data structures.
Business Systems Analyst
Liberty Mutual Insurance
- Created an automated process to build and maintain 24 data sets in a centralized location.
- Delivered presentations to educate SAS users about data sets and their analytical potential.
- Facilitated biweekly meetings with stakeholders to improve the usability and integrity of data sets.
- Leveraged SAS and Teradata to efficiently execute numerous ad hoc requests.
- Developed SQL queries and VBA macros to streamline monthly reporting.
- Built a Microsoft Access database and VBA scripts to automate the production of a weekly status report.
Data Analyst
Efinancial
- Presented complex analyses to upper management, driving high-level decision-making.
- Collaborated with the analytics team to develop a calling strategy that led to a 50% increase in sales.
- Automated the production of weekly scorecards and reports using SQL and VBA.
- Wrote SQL queries and performed data analysis to support the development of monthly and weekly goals.
Experience
Page-to-order Attribution Model
Internal Billing System
Education
Master's Degree in Business Analytics & Data Science
Bentley University - Waltham, MA, USA
Bachelor's Degree in Accounting
Central Washington University - Ellensburg, WA, USA
Skills
Libraries/APIs
PySpark, Pandas, NumPy
Tools
Amazon Athena, Amazon CloudWatch, Amazon Elastic MapReduce (EMR), AWS Glue, Cron, Boto, AWS Step Functions, Amazon Simple Queue Service (SQS), PyCharm, Amazon QuickSight, GitHub, Microsoft Power BI, AWS CloudFormation
Languages
SQL, Python, Stored Procedure, SAS, Scala
Paradigms
ETL, Business Intelligence (BI)
Storage
Apache Hive, Databases, PostgreSQL, Redshift, RDBMS, JSON, Database Administration (DBA), Relational Databases, MySQL, Data Pipelines, Teradata, Amazon S3 (AWS S3), Amazon DynamoDB, Microsoft SQL Server, MongoDB, NoSQL, Amazon Aurora
Frameworks
Hadoop, Spark, Flask
Platforms
AWS Lambda, Linux, Oracle, Amazon Web Services (AWS), Amazon EC2
Other
Information Systems, Data Architecture, EMR, Data Analytics, Datasets, Data Engineering, Data Analysis, Data Cleansing, Data Profiling, CSV File Processing, Data Modeling, Data, Metrics, Big Data, Pipelines, Conda, PIP, APIs, Data Warehousing, Big Data Architecture, BI Reporting, Analytics, Business Requirements, Data Visualization, Scripting, Amazon RDS, BI Reports, Dashboards, Database Optimization, Machine Learning, Statistics, Time Series Analysis, Optimization, Attribution Modeling, Marketing Attribution, Web Analytics, IT Project Management, User-defined Functions (UDF), Dashboard Design, Predictive Modeling