Data Science and Databases

Twitter Data Mining: A Guide to Big Data Analytics Using Python

Twitter is a goldmine of data. Unlike other social platforms, almost every user’s tweets are completely public and pullable.

In this tutorial, Toptal Freelance Software Engineer Anthony Sistilli will be exploring how you can use Python, the Twitter API, and data mining techniques to gather useful data.

8 minute read
Anthony Sistilli

With four years of experience, Anthony specializes in machine learning and artificial intelligence as an engineer and a researcher.

Social networks are among the biggest sources of data today, and this means they are an extremely valuable asset for marketers, big data specialists, and even individual users like journalists and other professionals. Harnessing the potential of real-time Twitter data is also useful in many time-sensitive business processes.

In this article, Toptal Freelance Software Engineer Hanee’ Medhat explains how you can build a simple Python application to leverage the power of Apache Spark, and then use it to read and process tweets to identify trending hashtags.

10 minute read
Hanee' Medhat Shousha

A certified Spark dev with a CEng degree and business intelligence diploma, Hanee’ has built enterprise apps with millions of daily users.

SQL Server 2016 Always Encrypted: Easy to Implement, Tough to Crack

Security has always been a primary concern for database experts, and with the advent of new, decentralized services, it’s become even more crucial. Microsoft addressed the need for an added level of security in SQL with the introduction of Always Encrypted functionality in SQL Server 2016.

In this blog post, Toptal Freelance Software Engineer Josip Šaban explains how Microsoft’s Always Encrypted concept works, how it’s implemented, and why developers can’t afford to ignore it.

11 minute read
Josip Šaban

With two Master’s degrees and having worked for the largest Slovenian enterprises, Josip is a veteran of Microsoft business/database tech.

A Guide to Consistent Hashing

Consistent Hashing is a distributed hashing scheme that operates independently of the number of servers or objects in a distributed hash table. It powers many high-traffic dynamic websites and web applications.

In this tutorial, Toptal Freelance Software Engineer Juan Pablo Carzolio will walk us through what it is and how hashing, distributed hashing and consistent hashing work.

25+ minute read
Juan Pablo Carzolio

Juan is a versatile full-stack engineer with 10+ years of experience and a computer science degree. He is proficient in several languages.

The Definitive Guide to NoSQL Databases

Limited SQL scalability has prompted the industry to develop and deploy a number of NoSQL database management systems, with a focus on performance, reliability, and consistency. The trend was driven by proprietary NoSQL databases developed by Google and Amazon. Eventually, open-source systems like MongoDB, Cassandra, and Hypertable brought NoSQL within reach of everyone.

In this post, Senior Software Engineer Mohammad Altarade dives into some of them and explains why NoSQL will probably be with us for years to come.

16 minute read
Mohammad Altarade

Mohammad is a highly motivated, high-energy individual with a passion for writing useful software and working with the latest technologies.

A Data Engineer’s Guide To Nontraditional Data Storages

With the rise of big data and data science, storage and retrieval have become a critical pipeline component for data use and analysis. Recently, new data storage technologies have emerged. But the question is: Which one should you choose? Which one is best suited for data engineering?

In this article, Toptal Data Scientist Ken Hu compares three prominent storage technologies within the context of data engineering.

7 minute read
Ken Hu

Ken is a Python expert. He focuses on data and machine learning, microservices, data analytics, natural language processing, and AI.

10 Tips for Effective Legacy Data Migration

Nobody wants to leave valuable customer data behind. Unfortunately, though, the hardest part of data migration to a complex CRM system, such as Salesforce, is the handling of legacy data.

In this article, Toptal Software Engineer Marian Paul provides 10 tips for successful legacy data migration to Salesforce.

13 minute read
Marian Paul

Marian is a senior developer, expert in data migrations and back-end solutions. He has Oracle SQL and Microsoft SQL expert certifications.
