Vivek Ramaswamy
Verified Expert in Engineering
SQL Developer
Toronto, ON, Canada
Toptal member since February 22, 2022
Vivek is an IT professional with 15 years of experience in designing and building systems, the last five years in big data systems. He excels in multiple tools, technologies, and programming languages, including SQL, .NET, Java, Scala, and JavaScript. Vivek has worked with data in Excel, VBA, Access, RDBMS, and distributed systems.
Preferred Environment
Apache Kafka, Apache Hive, Spark, SQL, Excel VBA, Apache Impala, Cloudera, Scala, Java 8, Python
The most amazing...
...thing I've done is improve the efficiency of data loaders while bringing more visibility into the process.
Work Experience
Data Engineer
Pogo Technologies, Inc.
- Gathered requirements to understand the different data sources and their mode of data export. Performed quick proofs of concept to identify the appropriate tool for building the data pipelines.
- Built standardized data pipelines in Dagster to source data from multiple sources and land it in Snowflake on a schedule. Routed run notifications to Slack channels for easy monitoring and oversight.
- Repurposed a live staging area to provide a prod-like environment for wet runs and validations.
Software Engineer
Leading FX Trading Platform
- Set up streaming pipelines in Google Cloud Dataflow to land data from on-premise, schemaless Kafka topics to BigQuery. Captured and stored schema drifts to ensure a smooth run.
- Evaluated different tick database vendors to store tick data. Ran POCs and compared different products' benchmark performances by setting up a greenfield environment and running simulations of common use cases.
- Used MuleSoft dataflows to replace Informatica jobs.
- Built a Java Spring framework-based wrapper service hosted on Google IAP to query Domo and send out data in CSV format for resellers. Rebuilt Domo dashboards based on pre-existing ones.
Big Data Engineer | Associate Director
Leading Insurance Broker
- Built ETL pipelines using StreamSets Data Collector to land data from various sources into the cloud data platform based on S3 and supported by Impala.
- Evaluated the use of StreamSets Transformer as a complementary ETL tool for batch processing.
- Assessed Apache Airflow for orchestration and evaluated whether it would fit a multitenant environment.
Senior Big Data Engineer
Leading French Investment Bank
- Redesigned the streaming application to be more contextual and in line with the nature of data.
- Incorporated logging into ELK for more real-time metrics and analysis.
- Improved the deployment process, making it more efficient and independent using Unix helpers.
Senior Associate
Global Bank Leader in the Private and Investment Banking Space
- Improved Dynamics CRM application performance and load time by identifying the bottleneck and applying technical and functional fixes. Scaled up the data loader by leveraging .NET multi-threading and Dynamics CRM's bulk handling capability.
- Sped up Spark jobs by caching the Hive and Spark SQL tables.
- Used Akka Streams to improve the extensive file loading process on the edge node with minimal resources, circumventing throughput and memory contention issues.
IT Analyst
Leading Consulting Firm
- Identified the bottlenecks and improved the performance of SQL queries.
- Migrated the application from Dynamics CRM 4.0 to Dynamics CRM 2011. Revamped the application to use the new, version-breaking SDK while also incorporating Silverlight to display a new custom UI.
- Integrated IBM MQ with VB 6 to replace a screen scraping procedure.
- Built an Excel-based tool using macros to capture data from multiple Excel files for reporting.
Research Engineer
VoIP Service Provider
- Enabled system integration between Linux back-end systems and Windows-based front-end systems.
- Built a Flash-based SIP softphone integrated within the browser used in a Windows-based MIS system to display agent availability and place internal SIP calls directly.
- Optimized the internal system by migrating it from VB 5 to VB 6.
- Created an Excel-based reconciliation tool using VBA/Macros to highlight and report errors for billing.
Experience
Batch Data Warehouse System to Real-time Data Lake Migration
We used a CDC tool to move database feed updates to Kafka and wrote Spark Streaming applications to process and store the data in HBase. There was also a caching system outside the data lake to facilitate faster access to data.
OUTCOME
As a result of this migration, critical downstream applications had access to data in real time. They no longer had to wait T+1 days for a feed or deal with stale data when batch processing failed. The near-real-time data feeds also created new opportunities to identify misuse and potential cross-sellable products.
On-premise Data Lake for Capital Markets Data
OUTCOME
The project led to the collation of organization-wide trade data and provided an environment for scientists, actuaries, and risk modelers to analyze, test, and tweak their existing and new models.
Evaluation of Tick Databases
Education
Master's Degree in Information Technology
University of Mumbai - Mumbai, India
Bachelor's Degree in Information Technology
University of Mumbai - Mumbai, India
Skills
Libraries/APIs
Spark Streaming, Protobuf, PySpark
Tools
Spark SQL, ELK (Elastic Stack), Cloudera, Impala, Apache Airflow, IntelliJ IDEA, Apache Impala, BigQuery, Apache Maven, Cloud Dataflow, Asterisk, Microsoft Dynamics CRM, Microsoft Silverlight, IBM MQ, Jenkins, Stash, Git, Control-M, Slack, Tableau, Domo, Apache Beam, Microsoft Excel, Microsoft Power BI
Languages
Orc, SQL, Excel VBA, Java 8, Scala, C++, Python, Power Query M, Visual Basic 6 (VB6), Flash ActionScript, ActionScript 3, Java, Snowflake, Python 3, Visual Basic, Visual Basic for Applications (VBA)
Paradigms
ETL, Parallel Computing
Platforms
Apache Kafka, Hortonworks Data Platform (HDP), MuleSoft, Slackware, Google Cloud Platform (GCP), Amazon Web Services (AWS), Databricks
Storage
HDFS, Apache Hive, Database Management Systems (DBMS), RDBMS, HBase, Microsoft SQL Server, Kdb+, ExtremeDB, Amazon S3 (AWS S3), PostgreSQL, Greenplum, SQL Server 2008 R2, Neo4j
Frameworks
Spark, .NET, Apache Spark, Spring Boot, Hadoop
Other
StreamSets, Data Engineering, Google BigQuery, Distributed Systems, ELT, Big Data Architecture, APIs, Excel Macros, OneTick, Dynamics CRM 2011, Dynamics CRM 2013, Parquet, Informatica, Session Initiation Protocol (SIP), Dynamics CRM Plugins, Dagster, ETL Tools, GraphDB