Data Science

Engineering > Data Science and Databases

Advancing AI Image Labeling and Semantic Metadata Collection

By Neven Pičuljan

Image labeling can be a tedious, time-consuming task, compounded by the sheer volume of data needed to train deep neural networks. This article breaks down large data set processing and explains how a new SaaS product can help automate image labeling.

13 minute read

Apache Spark Optimization Techniques for High-performance Data Processing

By Necati Demir, PhD

Apache Spark is an analytics engine that can handle very large data sets. This guide reveals strategies to optimize its performance using PySpark.

11 minute read

Mining for Data Clusters: Social Network Analysis With R and Gephi

By Juan Manuel Ortiz de Zarate

Explore X (formerly Twitter) data clusters to uncover user behaviors (e.g., repost and reply patterns) within online communities. This guide focuses on a politically charged data set to illustrate the process of visualizing and analyzing social data.

8 minute read

Python vs. R: Syntactic Sugar Magic

By Leandro Roser

Python and R empower data scientists to solve problems using elegant syntactic sugar, simplifying coding and solution exploration. Each language brings its unique capabilities and approach to bear.

7 minute read

Understanding Twitter Dynamics With R and Gephi: Text Analysis and Centrality

By Juan Manuel Ortiz de Zarate

Centrality and text analysis allow users to get more out of their social network data. Here’s how you can leverage them using R and Gephi.

12 minute read

A Deeper Meaning: Topic Modeling in Python

By Federico Albanese

Colloquial language doesn’t lend itself to computation. That’s where natural language processing steps in. Learn how topic modeling helps computers understand human speech.

8 minute read

Social Network Analysis in R and Gephi: Digging Into Twitter

By Juan Manuel Ortiz de Zarate

Thanks to rapid advances in technology, large amounts of data generated on social networks can be analyzed with relative ease, especially for those who use the R programming language and Gephi.

9 minute read

Graph Data Science With Python/NetworkX

By Federico Albanese

Data inundates us like never before—how can we hope to analyze it? Graphs (networks, not bar graphs) provide an elegant approach. Find out how to start with the Python NetworkX library to describe, visualize, and analyze "graph theory" datasets.

9 minute read

Machine Learning Number Recognition: From Zero to Application

By Teimur Gasanov

Harnessing the potential of machine learning for computer vision is not a new concept, but recent advances and the availability of new tools and datasets have made it more accessible to developers. In this article, Toptal Software Developer Teimur Gasanov demonstrates how you can create an app capable of identifying handwritten digits in under 30 minutes, including the API and UI.

10 minute read
