15 Most frequently asked Data Science interview questions
September 22, 2018
According to the recent Glassdoor report on the 50 best jobs in America, data science jobs remain among the most sought-after in the IT sector. The report weighs factors such as job satisfaction, salary and the total number of jobs available. Scoring well on all of these factors, data science jobs earned an overall rating of 4.8 out of 5. With a huge gap between the demand for and supply of qualified individuals, the profession is expected to keep growing.
In this era of Machine Learning and Big Data, data scientists are the stars. If you want to be part of this, the following are questions you might face in an interview that tests your technical proficiency. Brief answers are provided to help you recall.
- What is root cause analysis?
Root cause analysis is a problem-solving technique for identifying the underlying causes of a problem, rather than only treating its symptoms.
- What is meant by Logistic Regression?
Also known as the logit model, it is a technique for predicting a binary outcome. It passes a linear combination of predictor variables through the logistic (sigmoid) function to obtain the probability of the outcome.
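The prediction step can be sketched in a few lines of Python. This is a minimal illustration, not a fitted model: the weights and bias below are made-up values, and the function names are hypothetical.

```python
import math

def predict_probability(features, weights, bias):
    """Return P(outcome = 1) under a logistic (logit) model."""
    # Linear combination of the predictor variables.
    z = bias + sum(w * x for w, x in zip(weights, features))
    # The sigmoid maps any real number into the interval (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def predict_label(features, weights, bias, threshold=0.5):
    """Turn the probability into a binary outcome."""
    return 1 if predict_probability(features, weights, bias) >= threshold else 0

# Hypothetical coefficients, chosen only for illustration.
weights = [0.8, -1.2]
bias = 0.1

print(predict_probability([2.0, 0.5], weights, bias))
print(predict_label([2.0, 0.5], weights, bias))
```

In practice the coefficients are estimated from labelled data, typically by maximum likelihood.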
- What are recommender systems?
They are a subclass of information filtering systems that predict the rating or preference a user would give to an item, such as a product.
- What is Collaborative Filtering?
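In the broad sense, it is the process of filtering for information or patterns through collaboration among multiple agents, viewpoints and data sources. In recommender systems, it means predicting a user's preferences from the preferences of many other users.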
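A minimal sketch of user-based collaborative filtering follows. The ratings, user names and item names are invented for illustration, and cosine similarity is only one of several similarity measures that could be used here.

```python
import math

# Hypothetical ratings: user -> {item: rating on a 1-5 scale}.
ratings = {
    "ana": {"film_a": 5, "film_b": 4, "film_c": 1},
    "ben": {"film_a": 4, "film_b": 5, "film_c": 1, "film_d": 5},
    "cal": {"film_a": 1, "film_b": 1, "film_c": 5, "film_d": 1},
}

def cosine_similarity(u, v):
    """Cosine similarity computed over the items both users rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)

def predict_rating(user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    numerator = 0.0
    denominator = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        similarity = cosine_similarity(ratings[user], other_ratings)
        numerator += similarity * other_ratings[item]
        denominator += similarity
    return numerator / denominator if denominator else None

# "ana" has not rated film_d; her tastes resemble "ben" more than "cal",
# so the prediction is pulled towards ben's high rating.
print(predict_rating("ana", "film_d"))
```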
- Why do we do A/B Testing?
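An A/B test is a randomised experiment that compares two versions of something, such as a web page, to find out whether a change improves a chosen outcome, for example the conversion rate. It lets you pick the version that maximises that outcome on the basis of evidence rather than guesswork.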
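One common way to analyse such an experiment is a two-proportion z-test on the conversion rates. The sketch below assumes large samples and uses made-up visitor and conversion counts; the function name is hypothetical.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z statistic, two-sided p-value) for the difference in rates."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B perform equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: 100 of 1000 visitors convert on page A, 150 of 1000 on B.
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
print(z, p_value)
```

A small p-value suggests the difference between the two pages is unlikely to be due to chance alone.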
- What is the Law of Large Numbers?
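It states that as the number of independent observations grows, the sample mean converges to the expected value; the sample variance and standard deviation likewise converge to the quantities they estimate. This theorem provides the basis for frequency-style (frequentist) thinking.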
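A quick simulation illustrates the idea. The sketch below rolls a fair six-sided die, whose expected value is 3.5, and compares the sample mean for increasing sample sizes; the seed is arbitrary and only there to make runs repeatable.

```python
import random

def sample_mean_of_die_rolls(n_rolls, seed=0):
    """Average of n_rolls simulated rolls of a fair six-sided die."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_rolls):
        total += rng.randint(1, 6)
    return total / n_rolls

# The sample mean tends to drift closer to 3.5 as the sample grows.
for n in (10, 1_000, 100_000):
    print(n, sample_mean_of_die_rolls(n))
```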
- What is Star Schema?
It is a database schema in which data is organised into facts and dimensions. A fact records an event, such as a sale or a login. A dimension holds reference information about that fact, such as the product, date or customer.
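A tiny star schema can be sketched with Python's built-in sqlite3 module. The table names, column names and rows below are invented for illustration: one fact table of sales in the centre, with product and customer dimension tables around it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: reference information about each fact.
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT);

-- Fact table: one row per sale, pointing at the dimensions.
CREATE TABLE fact_sales (
    sale_id INTEGER PRIMARY KEY,
    product_id INTEGER REFERENCES dim_product(product_id),
    customer_id INTEGER REFERENCES dim_customer(customer_id),
    sale_date TEXT,
    amount REAL
);

INSERT INTO dim_product VALUES (1, 'Laptop'), (2, 'Mouse');
INSERT INTO dim_customer VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO fact_sales VALUES
    (1, 1, 1, '2018-09-01', 900.0),
    (2, 2, 1, '2018-09-02', 25.0),
    (3, 2, 2, '2018-09-02', 25.0);
""")

# A typical query joins the fact table to a dimension and aggregates.
revenue_by_product = conn.execute("""
    SELECT p.name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.name
    ORDER BY p.name
""").fetchall()

print(revenue_by_product)  # [('Laptop', 900.0), ('Mouse', 50.0)]
```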
- Define Eigenvalue and Eigenvector
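An eigenvector of a linear transformation is a direction that the transformation leaves unchanged, apart from compressing, flipping or stretching along it. The corresponding eigenvalue is the factor by which the transformation scales that eigenvector. Eigenvectors are used to understand linear transformations; in data analysis they are usually computed for a correlation or covariance matrix.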
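As an illustration, the sketch below uses power iteration, one simple method among many, to approximate the dominant eigenvalue and eigenvector of a small symmetric matrix. The matrix is a made-up example whose eigenvalues are 3 and 1.

```python
import math

def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

def power_iteration(matrix, iterations=100):
    """Approximate the dominant eigenvalue and a unit eigenvector."""
    vector = [1.0] + [0.0] * (len(matrix) - 1)
    for _ in range(iterations):
        product = mat_vec(matrix, vector)
        norm = math.sqrt(sum(x * x for x in product))
        vector = [x / norm for x in product]
    # Rayleigh quotient v . (A v) for a unit vector v.
    eigenvalue = sum(v * p for v, p in zip(vector, mat_vec(matrix, vector)))
    return eigenvalue, vector

A = [[2.0, 1.0],
     [1.0, 2.0]]

eigenvalue, eigenvector = power_iteration(A)
# Applying A to the eigenvector only rescales it by the eigenvalue.
print(eigenvalue, eigenvector, mat_vec(A, eigenvector))
```

Here the transformation stretches the direction (1, 1) by a factor of about 3 without rotating it.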
- What are the common biases that occur during sampling?
- Under coverage bias
- Selection bias
- Survivorship bias
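- What is selection bias?
Selection bias is the general term for the distortion that arises when a sample is not chosen at random and therefore does not represent the population.
- What is survivorship bias?
This is a logical error caused by concentrating on the people or things that passed some selection process and overlooking those that did not, usually because they are less visible. It leads to wrong conclusions.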
- Define Confounding Variables
They are variables in a statistical model that correlate with both the independent and the dependent variables, and can therefore distort the apparent relationship between them.
- What are Feature Vectors?
A feature vector is an n-dimensional vector of numerical features that represents an object. It makes the object easy to analyse mathematically.
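For instance, a house could be encoded as a three-dimensional feature vector. The field names and values below are hypothetical; the point is only that, once objects are vectors, ordinary arithmetic such as a distance calculation applies to them.

```python
import math

def to_feature_vector(house):
    """Encode a house record as a list of numerical features."""
    return [
        float(house["area_sq_m"]),
        float(house["bedrooms"]),
        1.0 if house["has_garden"] else 0.0,  # boolean encoded as 0 or 1
    ]

def euclidean_distance(u, v):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical records.
house_1 = {"area_sq_m": 120, "bedrooms": 3, "has_garden": True}
house_2 = {"area_sq_m": 117, "bedrooms": 3, "has_garden": False}

vector_1 = to_feature_vector(house_1)
vector_2 = to_feature_vector(house_2)
print(vector_1, vector_2, euclidean_distance(vector_1, vector_2))
```

In real use the features would normally be scaled first, so that a large-valued feature such as area does not dominate the distance.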
- What is Cross-validation?
It is a popular model validation technique used to evaluate how the results of a statistical analysis will generalise to an independent data set.
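The most common variant is k-fold cross-validation, where the data is split into k parts and each part serves once as the held-out test set. The sketch below shows only the index splitting; the function name is hypothetical, and shuffling is left out for brevity.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

for train, test in k_fold_indices(10, 3):
    # A model would be fitted on `train` and scored on `test` here.
    print(train, test)
```

Averaging the k test scores gives an estimate of how the model would perform on unseen data.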
- Gradient descent methods always converge to the same point, true or false?
False. In some cases, they settle at a local minimum rather than the global one. The data and starting conditions dictate whether you reach the global minimum.
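The effect of the starting point can be seen on a small example. The function below is an arbitrary one-dimensional curve with two minima, chosen only for illustration, and the learning rate and step count are assumptions rather than recommended settings.

```python
def f(x):
    """A curve with a local minimum near x = 1.13 and a global one near x = -1.30."""
    return x**4 - 3 * x**2 + x

def gradient(x):
    """Derivative of f."""
    return 4 * x**3 - 6 * x + 1

def gradient_descent(start, learning_rate=0.01, steps=1000):
    """Plain gradient descent from a given starting point."""
    x = start
    for _ in range(steps):
        x -= learning_rate * gradient(x)
    return x

from_right = gradient_descent(2.0)   # settles in the local minimum
from_left = gradient_descent(-2.0)   # settles in the global minimum
print(from_right, f(from_right))
print(from_left, f(from_left))
```

Both runs use the same algorithm and settings, yet they end at different points because they start on different sides of the hump between the two minima.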