Benjamin Li, Software Developer in Oakville, ON, Canada
Member since August 18, 2021
Benjamin has over two decades of software and big data development experience, including data modeling and data warehouse design. His active toolset includes Spark, Python, Scala, AWS, Azure, SQL, Hive, Linux, Microsoft BI solutions, C#.NET, and Java. His attention to detail and strong analytical and problem-solving skills make him an excellent addition to any team. A kind and intentional communicator, Benjamin always produces high-quality work.

Location

Oakville, ON, Canada

Availability

Part-time

Preferred Environment

Linux, PyCharm, IntelliJ IDEA, Apache Hive, Spark, AWS, Azure, Visual Studio, Windows, SQL Server BI

The most amazing...

...thing I've done was to reduce operation costs by 80% by rearchitecting a project and enhancing the code.

Employment

  • Big Data Consultant

    2019 - PRESENT
    Sun Life (via a Contractor)
    • Acted as tech lead during the project's second phase and provided technical guidance to the project team. Hosted the daily scrum and facilitated the team's activities.
    • Rearchitected the project and redesigned the code to reduce the number of AWS Glue jobs from 150 down to 30. This reduced the operation cost by 80%.
    • Developed Python and PySpark code that handles the bulk load of historical data and the daily CDC load, and builds daily snapshots.
    • Created Hive SQL and Spark SQL to handle complex business transformation logic.
    • Developed the CI/CD pipeline to build, package, and deploy the project to development, system integration, and production testing.
    • Tuned performance for the system and located the data skew issue. Provided suggestions to the business team to adjust the data model and avoid recurrence of the problem.
    • Tested the solution in Amazon EMR and AWS Glue and deployed the AWS Glue job solution to production.
    Technologies: Big Data, AWS, Apache Hive, AWS S3, AWS Glue, Zeppelin, SQL, Python 3, PySpark, Spark SQL, Linux, Git, Confluence, Scala, PyCharm, IntelliJ IDEA, AWS EMR, Jenkins Pipeline, CI/CD Pipelines, Scrum, Bash, Data Lakes, Data Warehouse Design
  • Big Data Solution Designer | Architect IV

    2016 - 2019
    TD Bank Group (via a Contractor)
    • Led a team of three solution developers and successfully delivered multiple projects for several lines of business (LOBs).
    • Worked with business analysts from LOBs to clarify functional requirements.
    • Designed solutions for projects, documented design specifications, and shared development work with team members.
    • Developed Apache Hive queries for complex business logic across various data sources and delivered ETL solutions.
    • Created Oozie workflows and schedulers to orchestrate and schedule jobs.
    • Developed Java solutions to handle mainframe data files in a copybook format.
    • Mentored solution developers, shared design intentions, best practices, and guidelines, and reviewed their code.
    Technologies: Big Data, Cloudera, Apache Hive, Oozie, Linux, ETL, SQL, Java, HDFS, TIBCO, Bash Script, MapReduce, IntelliJ IDEA, VirtualBox, Git, Confluence, Jenkins, Bash, Data Lakes, Data Warehouse Design
  • Senior Software Developer

    2016 - 2016
    Creditron
    • Developed SSRS reports according to business needs and deployed them to Azure SSRS.
    • Fixed bugs in existing features and developed new features for an electronic check processing (ECP) payment application using ASP.NET, C#.NET, .NET Framework, and SQL Server.
    • Created SQL scripts to populate data and showcase typical ECP system use cases and scenarios through SSRS reports.
    • Designed a .NET application to automatically deploy SSRS reports using SSRS web services.
    Technologies: SQL Server Integration Services (SSIS), SQL Server Reporting Services (SSRS), SQL Server 2015, C#.NET, ASP.NET, Visual Studio, Azure, Azure SQL Databases, Data Warehouse Design, SQL, Microsoft SQL Server
  • Senior Software Developer | Scrum Master

    2008 - 2016
    Hatch
    • Developed SSIS packages to load data from various sources, such as databases, CSV files, XML files, SOAP web services, RESTful APIs, and FTP. Applied data hygiene logic and developed transformations using C# script tasks. Loaded data into databases.
    • Created a data access layer and a business logic layer of applications using C#.NET and .NET Framework to work with data in SQL Server databases.
    • Developed RESTful APIs for applications to access data in SQL Server databases.
    • Used ASP.NET to develop a presentation layer of web applications.
    • Served as scrum master, facilitated teamwork, and led daily scrums, sprint planning, sprint reviews, and retrospective meetings.
    Technologies: SQL Server Integration Services (SSIS), SQL Server Reporting Services (SSRS), SQL Server 2015, C#.NET, ASP.NET, T-SQL, TFS, .NET Framework, Data Modeling, Azure, Azure Active Directory, Scrum Master, SQL, Data Warehouse Design, Design Patterns, SOA, SOAP, RESTful APIs, UML, Web Services, Microsoft SQL Server
  • Senior Software Engineer | Team Leader

    2004 - 2008
    Epsilon
    • Led an engineering team of seven and designed a BI solution for the digital marketing business.
    • Designed and developed ETL packages using SSIS to extract and cleanse data, apply business transformation logic, and load data into a data warehouse.
    • Designed the data model. Defined the dimensions and facts of SSAS cubes. Developed a strategy to refresh the cubes to catch up with data changes in a warehouse.
    • Developed a set of SSRS reports visualizing business insights of campaigns.
    • Created a tool to automatically deploy SSRS reports into different projects and farms.
    • Enabled viewing data by different categories and granularities by developing a web application with a dashboard and drill-down feature.
    Technologies: SQL Server Integration Services (SSIS), SQL Server Reporting Services (SSRS), SQL Server Analysis Services (SSAS), C#.NET, SQL Server BI, SQL, C++, ASP.NET, Data Modeling, Scrum Master, Data Warehouse Design, T-SQL, UML, Design Patterns, SOA, SOAP, Web Services, Microsoft SQL Server
  • Software Developer

    2004 - 2004
    Redknee
    • Implemented Unicode short message service (SMS) to support multiple languages.
    • Designed a thread pool to serve concurrent tag-length-value (TLV) records from sockets and files.
    • Implemented CORBA interfaces for communications across distributed components.
    Technologies: Java, Oracle, Linux, StarTeam, CORBA, Design Patterns, JSP, SQL
  • Software Developer

    2001 - 2004
    Invatron
    • Developed a set of generic algorithms as C++ templates to handle various perishable food operations, building with Visual C++ on Windows and GCC on Linux and Unix so that the application could be deployed to different operating systems.
    • Created a data access layer via Open Database Connectivity (ODBC) to access multiple database systems, including SQL Server, Oracle, DB2, Informix, and Sybase, so that the applications could be deployed with various database systems.
    • Developed a messaging framework for communication across the components of the decision support system.
    • Built a set of embedded applications to check and adjust inventory, check and mark down the price, and print barcode labels for various devices like hand-held scanners and wall-mounted price checkers.
    • Developed an installation daemon to automatically check and install new application versions for devices like hand-held scanners, wall-mounted price checkers, and point-of-sale (POS) machines in distributed chain stores.
    Technologies: C++, Windows, Linux, SQL, SQL Server 2015, Oracle, IBM Informix, IBM Db2, Sybase, Visual Studio, GCC, Bash, Unix, Message Bus, ODBC, Data Modeling, Entity-relationship Model (ERM), T-SQL, Microsoft SQL Server
  • Senior Software Engineer | Team Leader

    1995 - 2000
    China Construction Bank | Guangdong Branch
    • Led the team that developed a client-server system employing C, C++, Pro*C, and SQL on various Unix and Linux platforms using the Informix database system.
    • Gathered requirements from lines of business, designed the database and ER diagram, and implemented the data model in Informix SQL scripts.
    • Troubleshot production issues, investigated root causes, and found resolutions.
    Technologies: C, C++, Pro*C, SQL, IBM Informix, Unix, Linux, HP-UX, SCO Unix, Bash, C Shell, Bourne Shell, Korn Shell, Entity-relationship Model (ERM), Data Modeling

Experience

  • AWS Glue ETL Project for Insurance Business

    This is an ETL project on AWS that extracts data from multiple lines of business in the enterprise data lake. The data are transformed according to business logic and loaded into a consumption zone for Tableau reports, so that reports can be built quickly on the integrated data model regardless of the differing data models used by the lines of business.

  • Common Reporting Standard (CRS)

    The Common Reporting Standard (CRS) project is a regulatory project for the global exchange of bank account information between tax authorities. I developed complex Hive queries, Oozie workflows, and schedulers to extract data from the master data management (MDM) system and consolidate accounts from the wealth management system. I discovered data discrepancies, traced their root causes, and enhanced the enterprise data model so that the data provided by this application were accurate and accountable.

  • Data Lake Ingestion Data Flow

    This is an add-on component for a bank to move enterprise data ingestion to a data lake. I designed the solution and implemented Java classes to parse ingestion logs, extract bad records, convert mainframe copybook data into Unicode, and persist the data into a Hive table for business users to view and fix. I improved application performance 20-fold.

  • Global Procurement Intelligence (GPI)

    I designed a data model, a C#.NET and ASP.NET web application, SSIS packages with complex logic, and SSRS reports for the global procurement intelligence (GPI) system, which helps optimize sourcing decisions.

Skills

  • Languages

    SQL, Bash, C#.NET, C++, Java, Python 3, Scala, Python, T-SQL, UML, C, Pro*C, C Shell, Bourne Shell, Snowflake
  • Frameworks

    Spark, ASP.NET, AWS EMR, JSP, Hadoop, YARN
  • Paradigms

    Database Design, Business Intelligence (BI), ETL, Scrum, Agile, MapReduce, Design Patterns, SOA
  • Storage

    SQL Server 2016, Apache Hive, SQL Server Integration Services (SSIS), SQL Server Reporting Services (SSRS), Microsoft SQL Server, Database Architecture, AWS S3, HDFS, Azure SQL Databases, SQL Server Analysis Services (SSAS), Azure Active Directory, MySQL, PostgreSQL, Data Lakes, IBM Informix, IBM Db2, Sybase, Redshift, Data Pipelines
  • Other

    Data Modeling, Big Data, Data Warehouse Design, Data Engineering, Data Analysis, Data Analytics, Reverse Engineering, AWS, Software Engineering, Software, TIBCO, SQL Server 2015, .NET Framework, Azure Data Factory, CI/CD Pipelines, Scrum Master, Data Warehousing, StarTeam, CORBA, SOAP, RESTful APIs, Web Services, Message Bus, SCO Unix, Korn Shell, Entity-relationship Model (ERM), Hue, Enterprise Architecture, MSMQ, Azure Data Lake
  • Platforms

    Linux, Windows, Zeppelin, Azure, Apache Kafka, Databricks, Oracle, Unix, HP-UX, Google Cloud Platform (GCP)
  • Libraries/APIs

    PySpark, Jenkins Pipeline, ODBC, JDBC, Standard Template Library (STL)
  • Tools

    PyCharm, IntelliJ IDEA, AWS Glue, Spark SQL, Git, Confluence, Jenkins, Cloudera, Oozie, Visual Studio, TFS, SQL Server BI, Apache Airflow, VirtualBox, GCC, Eclipse IDE, BigQuery

Education

  • Master's Degree in Computer Science
    1992 - 1995
    Fudan University - Shanghai, China
  • Bachelor's Degree in Computer Science
    1988 - 1992
    National University of Defense Technology - Changsha, China

Certifications

  • Certified Scrum Master
    MAY 2015 - MAY 2019
    Scrum Alliance
