Sung Jun Kim

Big Data Developer in Sydney, New South Wales, Australia

Member since March 11, 2019
As a highly effective technical leader with over 25 years of experience, Sung specializes in data integration, data conversion, data engineering, ETL, big data architecture, data analytics, data visualization, data science, analytics platforms, and cloud architecture. He has an array of skills in building data platforms, analytics consulting, trend monitoring, data modeling, data governance, and machine learning.

Location

Sydney, New South Wales, Australia

Availability

Full-time

Preferred Environment

Hadoop, Spark, PySpark, SQL, Informatica

The most amazing...

...thing I've coded is a data ingestion and transformation algorithm that explodes and normalizes a very complex multi-hierarchical data structure.

Employment

  • Data Engineer

    2020 - 2020
    Dermalogica Unilever
    • Created ETL from the JDE ERP system to the data warehouse using SQL and SSIS.
    • Designed the DW data model.
    • Created batch SQL loads from various systems to the DW.
    • Created the Power BI data model.
    • Wrote complex DAX functions in Power BI and Power Pivot.
    • Transformed data using M (Power Query).
    • Created sales/revenue and field service dashboards for consultants.
    • Implemented YTD, QTD, and other comparison visuals.
    Technologies: Power BI, Azure SQL Server
  • Data Engineer

    2019 - 2020
    10th Man Media
    • Designed and implemented a data ingestion and transformation framework from various social media platforms to the Azure platform. Social media data was extracted via APIs and then ingested into the data lake using ADF.
    • Created complex data transformation logic using PySpark, involving time series trends, aggregations, and time-window comparisons; the data was moved downstream to Azure SQL Data Warehouse for Power BI data visualization (a minimal sketch of this pattern follows this entry).
    • Designed the entire pipeline from upstream to downstream using Azure data products.
    Technologies: Azure, Azure Data Factory, Databricks, SQL Data Warehouse, Data Lake, Blob Storage
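
A minimal illustrative sketch (not the original project code) of the PySpark time-window comparison pattern described above; the column names (channel, post_ts, engagement) and paths are assumptions:

```python
# Hypothetical PySpark sketch: weekly aggregation and week-over-week comparison
# of social media engagement data already landed in the data lake.
from pyspark.sql import SparkSession, Window, functions as F

spark = SparkSession.builder.appName("social-media-trend").getOrCreate()

# Illustrative source path; the real pipeline landed data via ADF.
posts = spark.read.parquet("/mnt/datalake/raw/social_posts")

# Aggregate engagement per channel per week.
weekly = (
    posts
    .withColumn("week", F.date_trunc("week", F.col("post_ts")))
    .groupBy("channel", "week")
    .agg(F.sum("engagement").alias("weekly_engagement"))
)

# Compare each week with the previous week for the same channel.
w = Window.partitionBy("channel").orderBy("week")
trend = (
    weekly
    .withColumn("prev_week", F.lag("weekly_engagement").over(w))
    .withColumn(
        "wow_change_pct",
        (F.col("weekly_engagement") - F.col("prev_week")) / F.col("prev_week") * 100,
    )
)

# Write the curated result for downstream consumption (e.g. the warehouse feeding Power BI).
trend.write.mode("overwrite").parquet("/mnt/datalake/curated/social_trend")
```
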
  • Big Data Architect/Lead Data Engineer

    2019 - 2019
    TechMahindra/Optus
    • Spearheaded big data architecture and engineering for Optus; developed proofs of concept (POC), designed the architecture, drove analytics, and managed technical project delivery in line with expectations.
    • Led the legacy DW migration project from Teradata to Cloudera using Informatica BDM, Scala, Hive, HDFS, Impala, Elastic Stack, Splunk, DevOps, and CI/CD.
    • Migrated Cloudera big data platform data and code to the AWS and Azure platforms.
    Technologies: Big Data, Hadoop, Spark, Hive, HBase, Informatica BDM, Teradata
  • Lead Big Data Architect/Lead Data Engineer

    2018 - 2019
    Cognizant/Westpac
    • Led a big data team of data engineers/developers and delivered real-time and batch data processing projects using Agile Scrum.
    • Designed and delivered a metadata-driven data ingestion framework that ingests data from various Westpac data sources into the Westpac Data Hub (HDFS), then integrates, transforms, and publishes it to target sources including Kafka, RDBMS (Teradata, Oracle, and SQL Server), and SFTP. Technologies used for the project included Python, Spark, Spark SQL, Hadoop, HDFS, Hive, HBase, Kafka, NiFi, and Atlas.
    • Led the CCR project, which ingests data from credit rating bureaus including Equifax, Illion, and Experian. Designed the entire XML explosion pattern, which involves multi-level XML explosion and normalized table creation on the HDFS platform using PySpark, Hive, Spark SQL, and HBase (see the sketch after this entry). Created the entire downstream conceptual, logical, and physical data models for downstream users including credit risk analysts and data scientists.
    Technologies: Palantir Foundry, Kafka, RDBMS, Teradata, Oracle, SQL Server, Python, PySpark, Spark SQL, Hadoop, HDFS, Hive, HBase, NiFi, Atlas
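
A minimal illustrative sketch of the multi-level XML explode-and-normalize pattern referenced above; the bureau schema (CreditReport, Enquiries, Accounts) is hypothetical, and the example assumes the spark-xml data source is available:

```python
# Hypothetical PySpark sketch: explode nested XML into normalized Hive tables.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("ccr-xml-explode").getOrCreate()

# Read bureau XML files; requires the spark-xml package on the classpath.
reports = (
    spark.read.format("xml")
    .option("rowTag", "CreditReport")
    .load("/data/raw/ccr/*.xml")
)

# Level 1: one row per enquiry, keyed back to the parent report.
enquiries = (
    reports
    .select("reportId", F.explode_outer("Enquiries.Enquiry").alias("enquiry"))
    .select("reportId", "enquiry.*")
)

# Level 2: one row per account nested inside each enquiry, carrying the parent keys.
accounts = (
    enquiries
    .select("reportId", "enquiryId", F.explode_outer("Accounts.Account").alias("account"))
    .select("reportId", "enquiryId", "account.*")
)

# Persist the normalized tables for downstream credit-risk analysts and data scientists.
enquiries.drop("Accounts").write.mode("overwrite").saveAsTable("ccr.enquiry")
accounts.write.mode("overwrite").saveAsTable("ccr.account")
```
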
  • Analytic Lead/Data Architect

    2009 - 2018
    OneGov, Department of Finance & Services and Innovation, NSW Government, Sydney, Australia
    • Managed a BI team of 8 BI/ETL developers and was responsible for all of OneGov's analytics, data science, and big data projects and BAU activities for a number of large NSW government agencies, including the DAC (Data Analytics Centre), Service NSW, RMS, Fair Trading, NSW Health, etc. Worked closely with the product owner, scrum master, developers, BAs, architects, support team, external agency users, and other stakeholders, and successfully delivered a number of critical analytics projects.
    • Delivered the entire analytics platform, applications, data visualization, prediction models, and ETL processes from scratch and continuously enhanced the system by adopting new technologies and processes. Developed the ETL process using SSIS (2016), which integrated data from sources including SQL Server 2016, Siebel CRM, websites via APIs, and flat files (CSV/XLSX/XLS/XML/JSON). Responsible for the daily ETL refresh and ongoing maintenance, as well as SQL Server database tuning, upgrades, query optimization, and index maintenance. Built an SSAS cube for KPI and management reporting, and built a number of dashboards and reports for executives, managers, and operational staff using Power BI, DOMO, Tableau, and OBIEE.
    • Created a prediction model for the license renewal reminder campaign using logistic regression and a petrol station grouping model using an unsupervised learning technique (k-means clustering); a minimal sketch of the grouping approach follows this entry. Worked on several other machine learning projects, including CTP and fuel pricing in NSW, using various ML libraries on the Hortonworks Hadoop platform.
    • Created a dashboard for ministers to monitor fuel price updates, compliance, and price trends using Power BI and DOMO. Analyzed sophisticated real-time and historical fuel prices using Python, Spark (PySpark), and Hive. Analyzed customer feedback using NLP/data mining techniques with R.
    • Built HDP (Hadoop) and HDF (NiFi) clusters for data scientists and academics for their large-scale data analytics and prediction model builds. Public and confidential data was ingested from AWS EMR/S3/Redshift to on-premise Hadoop using a Spark ETL framework, Glue, and NiFi. Provided consulting services on data ingestion and other big data technologies to data scientists and engineers.
    • Developed data ingestion flows from various data sources to Hive using Spark, NiFi, HDFS, and Sqoop on a near real-time basis for Service NSW OTC. Managed the entire Hadoop cluster, including day-to-day server maintenance and daily delta data ingestion. Power BI was used for data visualization.
    Technologies: MS SQL Server, SSAS, SSIS, SSRS 2008/2012/2016, Power BI, Azure, AWS, Oracle Data Warehouse, OBIEE 10g/11g, Informatica, Hadoop cluster, HDP, HDF, NiFi, Hive, SAM, Schema Registry, Superset, Python, R, PySpark
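
A minimal illustrative sketch of the unsupervised grouping approach mentioned above, using Spark MLlib k-means; the feature columns and table names are assumptions, not the production model:

```python
# Hypothetical PySpark MLlib sketch: group petrol stations with k-means clustering.
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler, StandardScaler
from pyspark.ml.clustering import KMeans

spark = SparkSession.builder.appName("station-grouping").getOrCreate()

# Illustrative feature table; the real features came from NSW fuel price data.
stations = spark.read.table("fuel.station_features")

pipeline = Pipeline(stages=[
    VectorAssembler(inputCols=["avg_price", "weekly_volume", "distance_to_cbd_km"],
                    outputCol="features_raw"),
    StandardScaler(inputCol="features_raw", outputCol="features"),
    KMeans(featuresCol="features", predictionCol="cluster", k=5, seed=42),
])

model = pipeline.fit(stations)
grouped = model.transform(stations)

# Each station now carries a cluster label that analysts can slice in Power BI or DOMO.
grouped.select("station_id", "cluster").write.mode("overwrite").saveAsTable("fuel.station_groups")
```
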
  • CRM /BI Lead

    2007 - 2008
    IBM Global Business Service
    • Delivered core case management system, integration services to internal/external systems, and upgraded detention portal system.
    • Led a team of six consultants and was responsible for the implementation of the main case management modules.
    • Handled resource management, task distribution, and schedule management.
    • Wrote technical and integration specification.
    • Configured various Siebel Public Sector Case Management modules.
    • Created SOA Integration interface to the department.
    • Implemented Oracle Business Intelligence Enterprise Edition.
    • Delivered unified systems for border security, case management, and detention for national security. The system involved a complicated process which started from border entry to granting of a visa.
    • Contributed to the team awarded the "Australia Day 2010 Secretary's Citation" by the Secretary of the Department of Immigration and Citizenship on 26 January 2010 for the delivery of the Service Provider Portal within the Systems for People 9 Compliance, Case Management and Detention Release.
    Technologies: Oracle Business Intelligence, Siebel CRM
  • Program Manager (BI/CRM)

    2004 - 2008
    Samsung
    • Designed and implemented CRM and analytics.
    • Created application standard, interface and configuration framework, and development guideline.
    • Integrated Siebel.
    • Converted data using SQL and other ETL tools.
    • Performed technical requirement analysis, configuration, and report creation.
    • Installed and configured OBIEE.
    • Installed and configured data warehouse including environment setup, DAC, Informatica ETL modification, data model change, performance tuning, and optimization.
    • Designed system architecture and sized hardware.
    • Provided various in-house Siebel technical and business consulting as BI and CRM subject matter expert.
    • Managed the team and mentored junior team members.
    Technologies: Oracle Business Intelligence, Siebel CRM, SQL Server, Informatica, SSIS
  • Senior Principal Consultant

    2000 - 2004
    Oracle (Siebel)
    • Engaged in multiple Siebel CRM/analytics projects across Asia Pacific with leading multinational customers and partners. Provided technical, system design, business requirement analysis, and project management services to partners and customers, including technical system architecture design and implementation, enterprise application integration (EAI), project management, and application configuration. Responded to RFPs and RFIs, wrote consulting proposals, supported pre-sales and resource planning, mentored junior consultants, led teams, and contributed to practice development and the management and operational procedures for consulting assignments.
    Technologies: Oracle Business Intelligence, Siebel CRM
  • Lead DBA

    1998 - 2000
    SIEMENS
    • Handled database management, administration, data conversion and migration, SQL and database engine tuning and optimization, and releases of new databases.
    Technologies: SQL Server, Sybase
  • Senior Development DBA

    1997 - 1999
    BT Fund Management
    • Successfully delivered a number of projects including Unit Trust System Database Conversion from SQL Base to MS SQL Server and Sybase, Investment Product Marketing Data Mart/Warehouse ETL, Web Data Warehouse Reporting System, and Web Unit Price File Polling Application.
    Technologies: SQL Server, Sybase, Oracle, Informatica
  • Senior Systems Developer

    1995 - 1997
    Colonial Insurance
    • Headed various system analysis, design, data modeling, programming, and testing activities, as well as internal technical and external consultation and support. The role also included the analysis, design, implementation, and support of two mission-critical systems: UPMS (Unit Price Management System) and the New Business 400 system.
    Technologies: C++, SQL Server, Sybase
  • Senior Systems Analyst/Programmer

    1995 - 1995
    Reserve Bank of Australia
    • Served as the system analyst/programmer in designing, developing and implementing various banking applications and automated fund transfer systems for the central bank of Australia.
    • Oversaw the development process and managed the integration of various internal and external systems, reporting processes and applications to streamline and simplify the external as well as internal reporting activities.
    Technologies: C++

Experience

  • Optus Big Data Project (Development)

    Served as the lead big data architect for the Optus legacy data warehouse migration project, which migrated data from legacy Teradata to the Cloudera big data platform, and designed the data ingestion/transformation framework using Informatica BDM, Scala, and DevOps.

  • Westpac Big Data Platform (Development)

    Served as lead solution architect on Westpac’s big data platform and comprehensive credit reporting projects. Led a team of data engineers and solution engineers.

  • NSW Government's Analytic Platform Build (Development)

    Successfully delivered award-winning NSW State Government’s in-house and cloud big data, data science, and business intelligence projects as the lead data architect.

Skills

  • Languages

    Scala, Python 2, Python, R, JavaScript, Visual Basic for Applications (VBA)
  • Frameworks

    Hadoop, Spark, YARN, Flutter, React Native, Redux
  • Libraries/APIs

    Node.js, Flask-RESTful, PySpark, MLlib, TensorFlow, Stanford NLP, Ggplot2, React
  • Tools

    ELK (Elastic Stack), Kibana, Logstash, cURL Command Line Tool, Dplyr, Superset, Solr, Sqoop, Impala, Cloudera, SSAS, Domo, Oracle Business Intelligence Enterprise Edition 11g (OBIEE), Microsoft Power BI, Tableau, Amazon Athena, AWS Glue, Azure HDInsight
  • Paradigms

    ETL, Data Science, OLAP
  • Platforms

    Firebase, Amazon Web Services (AWS), Azure, RStudio, Apache Kafka, Hortonworks Data Platform (HDP), Oracle, Databricks, Android, iOS
  • Storage

    Oracle DBMS, Elasticsearch, HDFS, Apache Hive, Essbase, PostgreSQL, MySQL, Teradata, Microsoft SQL Server, Redshift, AWS DynamoDB, AWS S3, Azure Blobs
  • Other

    APIs, Big Data, Data Visualization, Filebeat, Microsoft Data Transformation Services (now SSIS), Informatica, Engineering, Schemas, Ranger, NiFi, DAX, Data Warehouse Design, Software Development, Freelance Developer, React Native Bridge, Azure Data Factory

Education

  • Master of Science degree in Computer Science
    1993 - 1996
    University of Technology, Sydney (UTS) - Sydney, Australia
  • Bachelor's degree in Information and Communication Systems
    1990 - 1993
    Macquarie University - Sydney, Australia

Certifications

  • AWS Certified Data Analytics - Specialty
    MARCH 2020 - MARCH 2023
    AWS
  • PMP
    MARCH 2004 - PRESENT
    PMI
  • Microsoft Certified DBA
    MARCH 1999 - PRESENT
    Microsoft
  • Oracle Certified DBA
    JANUARY 1998 - PRESENT
    Oracle
