Data Science and Databases

Social Network Analysis Using Power BI and R: A Custom Visuals Guide

Microsoft’s Power BI is one of the most popular software solutions used to perform social network analysis. Here’s how to create custom Power BI visuals in R for compelling and flexible results.

14-minute read
Bharat Garg

Bharat is a data scientist and developer who specializes in designing and developing interactive reports and tools to facilitate decision-making. He has worked with small startups and large corporations, such as Comcast, MetLife, UnitedHealth Group/Optum, and Jefferson Health. One of Bharat’s projects delivered $6 million in revenue, and another delivered $10 million in savings.

Efficiency at Scale: A Tale of AWS Cost Optimization

Understanding total spend is a common challenge for cloud users, especially on projects with complex pricing models. This article explores the top AWS cost optimizations that will help you scale your platform effectively.

11-minute read
Rudolf Eremyan

Rudolf is a data scientist who has architected big data processing infrastructures on AWS and implemented data engineering solutions for Fortune 500 companies, including Disneyland Hong Kong and Philip Morris. He was invited to participate in NASA’s 2021 International Space Apps Challenge as a speaker and judge.

Understanding Twitter Dynamics With R and Gephi: Text Analysis and Centrality

Centrality and text analysis allow users to get more out of their social network data. Here’s how you can leverage them using R and Gephi.

12-minute read
Juan Manuel Ortiz de Zarate

Juan is a developer, data scientist, and doctoral researcher at the University of Buenos Aires, where he studies social networks, AI, and NLP. Juan has more than a decade of data science experience and has published papers at ML conferences, including SPIRE and ICCS.

A Deeper Meaning: Topic Modeling in Python

Colloquial language doesn’t lend itself to computation. That’s where natural language processing steps in. Learn how topic modeling helps computers understand human speech.

8-minute read
Federico Albanese

Federico is a developer and data scientist who has worked at Facebook, where he made machine learning model predictions. He is a Python expert and a university lecturer. His PhD research pertains to graph machine learning.

Social Network Analysis in R and Gephi: Digging Into Twitter

Thanks to rapid advances in technology, the large amounts of data generated on social networks can be analyzed with relative ease, especially by those who use the R programming language and Gephi.

9-minute read
Juan Manuel Ortiz de Zarate

Juan is a developer, data scientist, and doctoral researcher at the University of Buenos Aires, where he studies social networks, AI, and NLP. Juan has more than a decade of data science experience and has published papers at ML conferences, including SPIRE and ICCS.

Ensemble Methods: The Kaggle Machine Learning Champion

Two heads are better than one. This proverb describes the concept behind ensemble methods in machine learning. Let’s examine why ensembles dominate ML competitions and what makes them so powerful.

9-minute read
Juan Manuel Ortiz de Zarate

Juan is a lecturer at the University of Buenos Aires. His research focuses on AI, NLP, and social networks. He has more than a decade of data science experience and has published papers at ML conferences, including SPIRE and ICCS.

Graph Data Science With Python/NetworkX

Data inundates us like never before—how can we hope to analyze it? Graphs (networks, not bar graphs) provide an elegant approach. Find out how to start with the Python NetworkX library to describe, visualize, and analyze “graph theory” datasets.

9-minute read
Federico Albanese

Federico is a developer and data scientist who has worked at Facebook, where he made machine learning model predictions. He is a Python expert and a university lecturer. His PhD research pertains to graph machine learning.

How to Approach Writing an Interpreter From Scratch

How source code becomes a running program is often opaque: “Just run the compiler” is all that developers normally need to know.

Writing an interpreter from scratch—including its lexer and parser—is an illuminating challenge.

13-minute read
Sakib Hadžiavdić

A back-end expert, Sakib is the creator of the static site generator Hepek. Always learning, he writes tutorials in English and Bosnian.

Solving Bottlenecks With SQL Indexes and Partitions

Indexes and partitioning can help with SQL performance, but they’re not cure-alls. Through everyday examples of date range and LIKE queries, find out how to “think like an RDBMS” to make yours run faster.

14-minute read
Mirko Marović

Mirko designs and develops massive, extreme-workload databases. He also trains software developers on databases and SQL.

Machine Learning Number Recognition: From Zero to Application

Harnessing the potential of machine learning for computer vision is not a new concept, but recent advances and the availability of new tools and datasets have made it more accessible to developers.

In this article, Toptal Software Developer Teimur Gasanov demonstrates how you can create an app capable of identifying handwritten digits in under 30 minutes, including the API and UI.

10-minute read
Teimur Gasanov

Teimur is passionate about writing composite interfaces using React and building extensible APIs with Go. He excels at finding solutions for atypical problems.

Building a Data Warehouse Data Quality Process

Data quality is a crucial element of any successful data warehouse solution. As the complexity of data warehouses increases, so does the need for data quality processes.

In this article, Toptal Data Quality Developer Alexander Hauskrecht outlines how you can ensure a high degree of data quality and why this process is so important.

16-minute read
Toptal Talent Network Experts

SQL Indexes Explained, Pt. 2

Sorting a table can make some queries faster—but the maintenance cost is untenable. Enter real database indexes and their most common implementation structure: the B-tree.

10-minute read
Mirko Marović

Mirko designs and develops massive, extreme-workload databases. He also trains software developers on databases and SQL.

Serializing Complex Objects in JavaScript

The Tanagra.js library is designed to be simple and lightweight, and it currently supports Node.js and ES6 classes. The main implementation supports JSON, and an experimental version supports Google Protocol Buffers.

7-minute read
Luke Wilson

Luke has 12 years of experience as an engineer, team lead, and scrum master.

Optimizing Retailer Revenue With Sales Forecasting AI

Retailers often face supply and demand issues that cause them to miss out on potential sales or tie up a lot of money in overstocked products.

In this article, Toptal Data Scientist Ahmed Khaled explains how retailers can boost revenues and cut costs with sales forecasts backed by artificial intelligence.

9-minute read
Ahmed Khaled

Ahmed is a senior data scientist who loves to dig into clients’ problems and solve them using state-of-the-art data-driven solutions.

Embeddings in Machine Learning: Making Complex Data Simple

Working with non-numerical data can be challenging, even for seasoned data scientists. To be put to good use, such data needs to be transformed. But how?

In this article, Toptal Data Scientist Yaroslav Kopotilov will introduce you to embeddings and demonstrate how they can be used to visualize complex data and make it usable.

11-minute read
Yaroslav Kopotilov

Yaroslav is a data scientist with experience in business analysis, predictive modeling, data visualization, data orchestration, and deployment.

The Many Applications of Gradient Descent in TensorFlow

TensorFlow is one of the leading tools for training deep learning models. Outside that space, it may seem intimidating and unnecessary, but it has many creative uses—like producing highly effective adversarial input for black-box AI systems.

18-minute read
Alan Reiner

Alan’s ML expertise covers visual target recognition models for missile defense systems, real-time NLP, and financial evaluation tools.

Toptal Engineering Expert

Gabriel Courtemanche

Gabriel is a highly efficient and reliable professional with a broad skill set for web application development. He has worked on a range of products for a variety of clients, from solving scalability problems on production engineering teams at Shopify and Autodesk to launching new applications for startups. Most of his work consists of leading technical teams by creating an easy development environment, fixing technical debt, providing best-practice code examples, and mentoring developers.