The Protein Data Bank (PDB) bioinformatics database is the world's largest repository of experimentally determined structures of proteins, nucleic acids, and complex assemblies. All data is gathered using experimental methods such as X-ray crystallography and NMR spectroscopy. This article explains how to extract, filter, and clean data from the PDB to make it suitable for further analysis.
For a large codebase, managing the database schema can become tedious, especially if you maintain multiple testing environments or have customers who update the product at different paces. Sometimes, documenting the latest schema or database changes isn't enough. In this article, Toptal Database Engineer Ivan Pavlov introduces us to concepts that help manage database states.
Open-source, IoT, and Ethereum smart contracts work together with a new utility coin to make transportation more accessible and reduce vehicle waste. In this article, Toptal Freelance Ethereum Developer Michał Mikolajczyk explains the motivations and methodology behind his startup's latest initiative.
Flaws in your database design are like cracks in your application’s foundations. If left unchecked, they will be costly to fix down the line, to say the least. In this article, Toptal Freelance Software Engineer Fernando Martinez discusses some of the most common database design bad practices and how to avoid them.
PhalconPHP will make your high-load application fast and easy. It's one of the fastest MVC frameworks for PHP available. It's written in C and supplied as a compiled PHP extension, so it doesn’t need to be interpreted at every request. Consider PhalconPHP for your next project; you won't regret it.
Limited SQL scalability has prompted the industry to develop and deploy a number of NoSQL database management systems, with a focus on performance, reliability, and consistency. The trend was driven by proprietary NoSQL databases developed by Google and Amazon. Eventually, open-source systems like MongoDB, Cassandra, and Hypertable brought NoSQL within reach of everyone. In this post, Senior Software Engineer Mohamad Altarade dives into some of them and explains why NoSQL will probably be with us for years to come.
With the rise of big data and data science, storage and retrieval have become a critical pipeline component for data use and analysis. Recently, new data storage technologies have emerged. But the question is: Which one should you choose? Which one is best suited for data engineering? In this article, Toptal Data Scientist Ken Hu compares three prominent storage technologies within the context of data engineering.
The Hadoop Distributed File System (HDFS) is a scalable, open-source solution for storing and processing large volumes of data. With its built-in replication and resilience to disk failures, HDFS is an ideal system for storing and processing data for analytics. In this step-by-step tutorial, Toptal Database Developer Dallas H. Snider details how to migrate existing data from a PostgreSQL database into the more efficient HDFS.
Database tuning can be an incredibly difficult task, particularly when working with large-scale data where even the most minor change can have a dramatic (positive or negative) impact on performance. In mid-sized and large companies, most database tuning will be handled by a Database Administrator (DBA). But there are plenty of developers who have to perform DBA-like tasks; meanwhile, DBAs often struggle to work well with developers. In this article, learn database tuning tips and how developers and DBAs can work together effectively.