
Alex Clark
Verified Expert in Engineering
Data Engineer and Developer
Alex is an innovative and experienced big data engineer, skilled in a wide variety of tools and technologies. He is competent in all aspects of the data science process, including data ingestion, storage, transformation, and statistical modeling. Alex has a proven track record of thoughtfully dissecting business problems, vetting requirements, and then designing the processes to address them.
Preferred Environment
Python 3, Amazon Elastic MapReduce (EMR), Apache Hive, PySpark, AWS Lambda, Amazon CloudWatch, Amazon DynamoDB, Amazon Simple Queue Service (SQS), SQL, Databases, Amazon Web Services (AWS)
The most amazing solution I've developed is an internal API-based billing system.
Work Experience
Data Engineer
D2 Nova
- Evaluated the client's SQL database and provided recommendations for improving design and query performance.
- Implemented MySQL table partitioning and composite indexes, which reduced query time from 2,240 milliseconds to 10 milliseconds.
- Designed and implemented a MongoDB database with a lightweight front end built in Flask. Created custom search functionality that lets users search for partial text matches in several languages while maintaining low latency.
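As a rough illustration of the composite-index technique mentioned above, the sketch below shows how a two-column index changes a query plan from a full scan to an index search. It is not the client's schema: the `events` table, its columns, and the index name are hypothetical, and SQLite from the Python standard library stands in for MySQL (so table partitioning is not shown).

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER, event_date TEXT, payload TEXT)"
)
conn.executemany(
    "INSERT INTO events (customer_id, event_date, payload) VALUES (?, ?, ?)",
    [(i % 100, f"2023-01-{(i % 28) + 1:02d}", "x") for i in range(10_000)],
)

# Hypothetical query that filters on both columns.
query = "SELECT payload FROM events WHERE customer_id = ? AND event_date = ?"
params = (42, "2023-01-15")


def query_plan(connection, sql, sql_params):
    """Return SQLite's textual query plan for the given statement."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, sql_params).fetchall()
    # The fourth column of each row holds the human-readable plan detail.
    return " | ".join(row[3] for row in rows)


# Without an index, the engine has to scan the whole table.
plan_before = query_plan(conn, query, params)

# A composite index covering both filter columns, in filter order.
conn.execute(
    "CREATE INDEX idx_events_customer_date ON events (customer_id, event_date)"
)

# With the index in place, the plan switches to an index search.
plan_after = query_plan(conn, query, params)

print("before:", plan_before)
print("after: ", plan_after)
```

The column order in a composite index matters: an index on `(customer_id, event_date)` can serve filters on `customer_id` alone or on both columns, but generally not a filter on `event_date` alone.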
Freelance Data Engineer
ProjectPro
- Developed a CDK project that deploys data to S3, creates a data pipeline in EMR, and transforms the data using Hive. Connected the EMR cluster to Microsoft Power BI and created data visualizations.
- Architected a CDK project/data pipeline that extracts data from a SQL database in RDS, loads incremental data from an API using AWS Lambda, and transforms data using Spark.
- Created detailed written and video documentation for each project.
Data Engineer
Amazon.com
- Gathered data from disparate sources to identify and deprecate low-performing content.
- Collaborated with stakeholders, software engineers, and managers to design and construct an internal API-based billing system.
- Developed an API pricing model for the platform's primary services.
- Built an attribution model to assign customer orders to customer actions.
- Organized and maintained data storage systems, including relational databases, big data systems, and serverless technologies.
- Constructed data pipelines and managed ETL processes.
- Developed metrics to analyze and report on page performance and customer retention.
- Created custom map-reduce jobs to parse complex, high-volume data structures.
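For readers unfamiliar with the map-reduce pattern named in the last bullet, here is a minimal single-process sketch in plain Python. It is an assumption-laden illustration, not the jobs built at Amazon: the nested record layout (`sessions`, `events`, `page_id`) and all function names are hypothetical, and a real job would run on a distributed framework such as Hadoop or Spark.

```python
import json
from collections import defaultdict


def map_record(line):
    """Map step: parse one JSON line and emit (page_id, 1) per nested view event."""
    record = json.loads(line)
    for session in record.get("sessions", []):
        for event in session.get("events", []):
            if event.get("type") == "view":
                yield event["page_id"], 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted values for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def run_job(lines):
    """Chain the map and reduce steps over an iterable of JSON lines."""
    pairs = (pair for line in lines for pair in map_record(line))
    return reduce_counts(pairs)


# Hypothetical sample input: one JSON document per line.
sample_lines = [
    '{"customer": "a", "sessions": [{"events": ['
    '{"type": "view", "page_id": "p1"}, {"type": "click", "page_id": "p1"}]}]}',
    '{"customer": "b", "sessions": ['
    '{"events": [{"type": "view", "page_id": "p1"}]}, '
    '{"events": [{"type": "view", "page_id": "p2"}]}]}',
]

view_counts = run_job(sample_lines)
print(view_counts)
```

Keeping the map step as a generator means each record is parsed and flattened lazily, which is the property that lets the same logic scale out when the lines are split across workers.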
Business Systems Analyst
Liberty Mutual Insurance
- Created an automated process to build and maintain 24 data sets in a centralized location.
- Delivered presentations to educate SAS users about data sets and their analytical potential.
- Facilitated biweekly meetings with stakeholders to improve the usability and integrity of data sets.
- Leveraged SAS and Teradata to efficiently execute numerous ad hoc requests.
- Developed SQL queries and VBA macros to streamline monthly reporting.
- Built a Microsoft Access database and VBA scripts to automate the production of a weekly status report.
Data Analyst
Efinancial
- Presented complex analyses to upper management, driving high-level decision-making.
- Collaborated with the analytics team to develop a calling strategy that led to a 50% increase in sales.
- Automated the production of weekly scorecards and reports using SQL and VBA.
- Wrote SQL queries and performed data analysis to support the development of monthly and weekly goals.
Experience
Page-to-order Attribution Model
Internal Billing System
Skills
Languages
SQL, Python, Stored Procedure, SAS, Scala
Tools
Amazon Athena, Amazon CloudWatch, Amazon Elastic MapReduce (EMR), AWS Glue, Cron, Boto, AWS Step Functions, Amazon Simple Queue Service (SQS), PyCharm, Amazon QuickSight, GitHub, Microsoft Power BI, AWS CloudFormation
Paradigms
ETL, Business Intelligence (BI)
Storage
Apache Hive, Databases, PostgreSQL, Redshift, RDBMS, JSON, Database Administration (DBA), Relational Databases, MySQL, Data Pipelines, Teradata, Amazon S3 (AWS S3), Amazon DynamoDB, Microsoft SQL Server, MongoDB, NoSQL, Amazon Aurora
Other
Information Systems, Data Architecture, EMR, Data Analytics, Datasets, Data Engineering, Data Analysis, Data Cleansing, Data Profiling, CSV File Processing, Data Modeling, Data, Metrics, Big Data, Pipelines, Conda, PIP, APIs, Data Warehousing, Big Data Architecture, BI Reporting, Analytics, Business Requirements, Data Visualization, Scripting, Amazon RDS, BI Reports, Dashboards, Database Optimization, Machine Learning, Statistics, Time Series Analysis, Optimization, Attribution Modeling, Marketing Attribution, Web Analytics, IT Project Management, User-defined Functions (UDF), Dashboard Design, Predictive Modeling
Frameworks
Hadoop, Spark, Flask
Libraries/APIs
PySpark, Pandas, NumPy
Platforms
AWS Lambda, Linux, Oracle, Amazon Web Services (AWS), Amazon EC2
Education
Master's Degree in Business Analytics & Data Science
Bentley University - Waltham, MA, USA
Bachelor's Degree in Accounting
Central Washington University - Ellensburg, WA, USA