Data Science and Databases

Fine-tuning LLMs for Your Industry: Optimal Data Labeling Strategies

LLMs have a vast knowledge base, but training them with domain-specific data can extend their capabilities to specialized industries and tasks. This article delves into data labeling for fine-tuning and includes a step-by-step tutorial for training GPT-4o.

18-minute read
Jedrzej Kardach

Jedrzej is a machine learning engineer who specializes in AI and data science. He has delivered several NLP-based classification algorithms and reinforcement learning solutions to clients, and has worked alongside researchers at Princeton University developing ML and data analytics tools. Jedrzej has partnered with clients in multiple industries, including service, finance, and insurance.

Architecting Effective Data Labeling Systems for Machine Learning Pipelines

Machine learning models are trained on massive datasets in which each data point is labeled to give it context and meaning. This deep dive describes how to build a data labeling architecture from scratch, with a focus on workflow, security, and data quality.

16-minute read
Reza Fazeli

Reza is a machine learning engineer specializing in natural language processing and computer vision. At IBM, he developed machine learning algorithms designed to improve text classification and automate model training, innovations that resulted in six patents. Reza has a master’s degree in engineering from the University of Toronto.

Theory, Tools, and Business Applications: An In-depth Look at Quantum Computing

Quantum computing is challenging the realities of technology, security, and industry as we know them. Here, we investigate the nuances of quantum mechanics and how to enter the world of quantum software development with tools such as Cirq and TensorFlow Quantum.

22-minute read
Joao Diogo de Oliveira

Joao is an AI developer who holds a Quantum Excellence Certificate from IBM. He specializes in machine learning and deep learning and has partnered with Fortune 100 companies like Procter & Gamble and Hearst. Joao has more than 14 years of experience and holds a master’s degree in computer engineering from the University of Porto.

Advancing AI Image Labeling and Semantic Metadata Collection

Image labeling can be a tedious, time-consuming task, compounded by the sheer volume of data needed to train deep neural networks. This article breaks down large data set processing and explains how a new SaaS product can help automate image labeling.

13-minute read
Neven Pičuljan

Neven is an artificial intelligence engineer with extensive experience in machine learning, computer vision, algorithms, and a range of AI-related technologies. Prior to founding an AI R&D consulting company, Neven helped create and train cutting-edge computer vision models used by healthcare, e-commerce, real estate, and financial services companies across the globe.

Apache Spark Optimization Techniques for High-performance Data Processing

Apache Spark is an analytics engine that can handle very large data sets. This guide reveals strategies to optimize its performance using PySpark.

11-minute read
Necati Demir, PhD

Necati is a software engineer specializing in data science, machine learning, back-end development, and DevOps. He is an AWS Certified Solutions Architect and AWS Certified Machine Learning Specialist with a doctorate in computer engineering. Necati serves as Chief AI Officer and CTO of Datagran, a machine learning automation company that he co-founded.

5 Pillars of Responsible Generative AI: A Code of Ethics for the Future

Generative AI advances raise new questions around data ownership, content integrity, algorithmic bias, and more. Here, three experts at the forefront of NLP present recommendations for developing ethical generative AI solutions.

12-minute read
Madelyn Douglas

Madelyn is the Lead Editor of Engineering at Toptal and a former software engineer at Meta. She has more than six years of experience researching, writing, and editing for engineering publications, specializing in emerging technologies and AI. She previously served as an editor at USC’s Viterbi School of Engineering, and her research on engineering ethics was published at IEEE’s NER 2021 conference.

In this ask-me-anything-style Q&A, leading Toptal AI developer Joao Diogo de Oliveira fields questions from fellow engineers about resources for pivoting to ML, approaches to large language models, and the most critical future applications of AI.

6-minute read
Joao Diogo de Oliveira

Joao is an AI developer with more than 10 years of experience at Fortune 100 companies like Procter & Gamble and startups in the healthcare, energy, and finance industries. Joao holds a master’s degree in computer science from the University of Porto and has multiple certifications in ML and deep learning.

Toptal Engineering Expert

Gabriel Courtemanche

Gabriel is a highly efficient and reliable professional with a broad skill set for web application development. He has worked on a range of products for a variety of clients, from solving scalability problems on production engineering teams at Shopify and Autodesk to launching new applications for startups. Most of his work consists of leading technical teams by creating an easy development environment, fixing technical debt, providing best-practice code examples, and mentoring developers.
