Big Data Consultant
2019 - Present
Sun Life (via a Contractor)
- Served as tech lead during the project's second phase and provided technical guidance to the project team. Ran the daily scrum and facilitated the team's activities.
- Rearchitected the project and redesigned the code to reduce the number of AWS Glue jobs from 150 to 30, which cut operating costs by 80%.
- Developed Python and PySpark code that handles historical bulk loads and daily change data capture (CDC) loads and builds daily snapshots.
- Wrote Hive SQL and Spark SQL queries to implement complex business transformation logic.
- Developed the CI/CD pipeline to build, package, and deploy the project to the development, system integration, and production testing environments.
- Tuned system performance and identified a data skew issue. Advised the business team on adjusting the data model to prevent the problem from recurring.
- Tested the solution in Amazon EMR and AWS Glue and deployed the AWS Glue job solution to production.
Technologies: Big Data, Amazon Web Services (AWS), Apache Hive, Amazon S3 (AWS S3), AWS Glue, Zeppelin, SQL, Python 3, PySpark, Spark SQL, Linux, Git, Confluence, Scala, PyCharm, IntelliJ IDEA, AWS EMR, Jenkins Pipeline, CI/CD Pipelines, Scrum, Bash, Data Lakes, Data Warehouse Design

Big Data Solution Designer | Architect IV
2016 - 2019
TD Bank Group (via a Contractor)
- Led a team of three solution developers and delivered multiple projects for several lines of business (LOBs).
- Worked with business analysts from LOBs to clarify functional requirements.
- Designed solutions for projects, documented design specifications, and shared development work with team members.
- Developed Apache Hive queries that implement complex business logic across varied source data and delivered ETL solutions.
- Created Oozie workflows and schedulers to orchestrate and schedule jobs.
- Developed Java solutions to handle mainframe data files in copybook format.
- Mentored solution developers, shared design intentions, best practices, and guidelines, and reviewed their code.
Technologies: Big Data, Cloudera, Apache Hive, Oozie, Linux, ETL, SQL, Java, HDFS, TIBCO, Bash Script, MapReduce, IntelliJ IDEA, VirtualBox, Git, Confluence, Jenkins, Bash, Data Lakes, Data Warehouse Design

Senior Software Developer
2016 - 2016
Creditron
- Developed SSRS reports to meet business needs and deployed them to Azure SSRS.
- Fixed bugs in existing features and developed new features for an electronic check processing (ECP) payment application using ASP.NET, C#.NET, .NET Framework, and SQL Server.
- Created SQL scripts to populate data and showcase typical ECP system use cases and scenarios through SSRS reports.
- Designed a .NET application to automatically deploy SSRS reports using SSRS web services.
Technologies: SQL Server Integration Services (SSIS), SQL Server Reporting Services (SSRS), SQL Server 2015, C#.NET, ASP.NET, Visual Studio, Azure, Azure SQL Databases, Data Warehouse Design, SQL, Microsoft SQL Server

Senior Software Developer | Scrum Master
2008 - 2016
Hatch
- Developed SSIS packages to load data from sources such as databases, CSV files, XML files, SOAP web services, RESTful APIs, and FTP. Applied data hygiene logic, developed transformations using C# script tasks, and loaded the data into databases.
- Created the data access and business logic layers of applications using C#.NET and .NET Framework to work with data in SQL Server databases.
- Developed RESTful APIs for applications to access data in SQL Server databases.
- Used ASP.NET to develop the presentation layer of web applications.
- Served as scrum master, facilitated teamwork, and led daily scrums, sprint planning, sprint reviews, and retrospectives.
Technologies: SQL Server Integration Services (SSIS), SQL Server Reporting Services (SSRS), SQL Server 2015, C#.NET, ASP.NET, T-SQL, TFS, .NET, Data Modeling, Azure, Azure Active Directory, Scrum Master, SQL, Data Warehouse Design, Design Patterns, Service-oriented Architecture (SOA), SOAP, REST APIs, UML, Web Services, Microsoft SQL Server

Senior Software Engineer | Team Leader
2004 - 2008
Epsilon
- Led a seven-member engineering team and designed a BI solution for the digital marketing business.
- Designed and developed ETL packages using SSIS to extract and cleanse data, apply business transformation logic, and load data into a data warehouse.
- Designed the data model and defined the dimensions and facts of the SSAS cubes. Developed a strategy to refresh the cubes so they stay in sync with data changes in the warehouse.
- Developed a set of SSRS reports visualizing business insights of campaigns.
- Created a tool to automatically deploy SSRS reports into different projects and farms.
- Enabled viewing data by different categories and granularities by developing a web application with a dashboard and drill-down feature.
Technologies: SQL Server Integration Services (SSIS), SQL Server Reporting Services (SSRS), SQL Server Analysis Services (SSAS), C#.NET, SQL Server BI, SQL, C++, ASP.NET, Data Modeling, Scrum Master, Data Warehouse Design, T-SQL, UML, Design Patterns, Service-oriented Architecture (SOA), SOAP, Web Services, Microsoft SQL Server

Software Developer
2004 - 2004
Redknee
- Implemented Unicode short message service (SMS) to support multiple languages.
- Designed a thread pool to serve concurrent tag-length-value (TLV) records from sockets and files.
- Implemented CORBA interfaces for communications across distributed components.
Technologies: Java, Oracle, Linux, StarTeam, CORBA, Design Patterns, JSP, SQL

Software Developer
2001 - 2004
Invatron
- Developed a set of generic algorithms as C++ templates to handle various perishable food operations, building with Visual C++ on Windows and GCC on Linux and Unix so the application could be deployed on different operating systems.
- Created a data access layer using Open Database Connectivity (ODBC) to access multiple database systems, including SQL Server, Oracle, DB2, Informix, and Sybase, so the applications can be deployed with various database systems.
- Developed a messaging framework for communication across the components of the decision support system.
- Built a set of embedded applications to check and adjust inventory, check and mark down the price, and print barcode labels for various devices like hand-held scanners and wall-mounted price checkers.
- Developed an installation daemon to automatically check and install new application versions for devices like hand-held scanners, wall-mounted price checkers, and point-of-sale (POS) machines in distributed chain stores.
Technologies: C++, Windows, Linux, SQL, SQL Server 2015, Oracle, IBM Informix, IBM Db2, Sybase, Visual Studio, GCC, Bash, Unix, Message Bus, ODBC, Data Modeling, Entity-relationships Model (ERM), T-SQL, Microsoft SQL Server

Senior Software Engineer | Team Leader
1995 - 2000
China Construction Bank | Guangdong Branch
- Led the team that developed a client-server system employing C, C++, Pro*C, and SQL on various Unix and Linux platforms using the Informix database system.
- Gathered requirements from lines of business, designed the database and ER diagram, and implemented the data model in Informix SQL scripts.
- Troubleshot production issues, investigated root causes, and found resolutions.
Technologies: C, C++, Pro*C, SQL, IBM Informix, Unix, Linux, HP-UX, SCO Unix, Bash, C Shell, Bourne Shell, KornShell, Entity-relationships Model (ERM), Data Modeling