Data Science and Databases

Ensemble Methods: The Kaggle Machine Learning Champion

by Juan Manuel Ortiz de Zarate

Two heads are better than one. This proverb describes the concept behind ensemble methods in machine learning. Let's examine why ensembles dominate ML competitions and what makes them so powerful.

9 minute read

Graph Data Science With Python/NetworkX

by Federico Albanese

Data inundates us like never before—how can we hope to analyze it? Graphs (networks, not bar graphs) provide an elegant approach. Find out how to start with the Python NetworkX library to describe, visualize, and analyze "graph theory" datasets.

9 minute read

How to Approach Writing an Interpreter From Scratch

by Sakib Hadžiavdić

How source code becomes a running program is often opaque: "Just run the compiler" is all that developers normally need to know. Writing an interpreter from scratch—including its lexer and parser—is an illuminating challenge.

14 minute read

Solving Bottlenecks With SQL Indexes and Partitions

by Mirko Marović

Indexes and partitioning can help with SQL performance, but they're not cure-alls. Through everyday examples of date range and LIKE queries, find out how to "think like an RDBMS" to make yours run faster.

14 minute read

Machine Learning Number Recognition - From Zero to Application

by Teimur Gasanov

Harnessing the potential of machine learning for computer vision is not a new concept, but recent advances and the availability of new tools and datasets have made it more accessible to developers. In this article, Toptal Software Developer Teimur Gasanov demonstrates how you can create an app capable of identifying handwritten digits in under 30 minutes, including the API and UI.

10 minute read

How to Implement a Data Quality Process

by Alexander Hauskrecht

Data quality is a crucial element of any successful data warehouse solution. As the complexity of data warehouses increases, so does the need for data quality processes. In this article, Toptal Data Quality Developer Alexander Hauskrecht outlines how you can ensure a high degree of data quality and why this process is so important.

16 minute read

SQL Indexes Explained, Pt. 2

by Mirko Marović

Sorting a table can make some queries faster—but the maintenance cost is untenable. Enter real database indexes and their most common implementation structure: the B-tree.

10 minute read

Serializing Complex Objects in JavaScript

by Luke Wilson

The Tanagra.js library is designed to be simple and lightweight, and it currently supports Node.js and ES6 classes. The main implementation supports JSON, and an experimental version supports Google Protocol Buffers.

7 minute read
