A data scientist is someone who creates value from data. Such a person proactively fetches information from various sources, analyzes it to better understand how the business performs, and builds AI tools that automate certain processes within the company.
There are many definitions of this job, and it is sometimes confused with the Big Data engineer occupation. A data scientist or engineer may be X% scientist, Y% software engineer, and Z% hacker, which is why the definition of the job becomes convoluted. The actual ratios vary depending on the skills required and the type of job. It is usually considered normal to bring people with different skill sets into the data science team.
Data scientist duties typically include creating various machine learning-based tools or processes within the company, such as recommendation engines or automated lead scoring systems. People in this role should also be able to perform statistical analysis.
In this article, we present a sample data scientist job description that you can adjust to your actual needs to create an effective job advertisement and find the person who will help you get the answers you are looking for.
Data Scientist - Job Description and Ad Template
Copy this template, and modify it as your own:
Company Introduction
{{Write a short and catchy paragraph about your company. Make sure to provide information about the company culture, perks, and benefits. Mention office hours, remote working possibilities, and everything else you think makes your company interesting. Data scientists like to take on challenges, so anything that shows how the role could make an impact might help attract top talent.}}
Job Description
We are looking for a data scientist who will help us discover the information hidden in vast amounts of data and make smarter decisions to deliver even better products. Your primary focus will be on applying data mining techniques, doing statistical analysis, and building high-quality prediction systems integrated with our products. {{Depending on your needs, you can write very specific requirements here, like: “automate scoring using machine learning techniques”, “build recommendation systems”, “improve and extend the features used by our existing classifier”, “develop internal A/B testing procedures”, “build a system for automated fraud detection”, etc.}}
Responsibilities
Selecting features, building and optimizing classifiers using machine learning techniques
Data mining using state-of-the-art methods
Extending the company’s data with third-party sources of information when needed
Enhancing data collection procedures to include information that is relevant for building analytic systems
Processing, cleansing, and verifying the integrity of data used for analysis
Doing ad-hoc analysis and presenting results in a clear manner
Creating automated anomaly detection systems and constantly tracking their performance
{{Select from the above and add other responsibilities that are relevant}}
Skills and Qualifications
Excellent understanding of machine learning techniques and algorithms, such as k-NN, Naive Bayes, SVM, Decision Forests, etc.
Experience with common data science toolkits, such as R, Weka, NumPy, MATLAB, etc. {{depending on specific project requirements}}. Excellence in at least one of these is highly desirable
Great communication skills
Experience with data visualization tools, such as D3.js, ggplot2, etc.
Proficiency in using query languages such as SQL, Hive, Pig {{actual list depends on what you are currently using in your company}}
Experience with NoSQL databases, such as MongoDB, Cassandra, HBase {{depending on project needs}}
Good applied statistics skills, such as distributions, statistical testing, regression, etc.
Good scripting and programming skills {{if you expect that the person in this role will integrate the solution within the base application, list any programming languages and core frameworks currently being used}}
Data-oriented personality
{{Mention any other technology that this person will commonly work with in the organization}}
{{List education level or certification you require}}