Data Science MCQs | For BCA, MCA, Computer Science Undergraduates | #ipumusings #eduvictors
Data Science MCQs
Q1. A collection of information about a related topic is referred to as __________
(a) Visualisation
(b) Analysis
(c) Conclusion
(d) Data
Q2. The process of examining data to draw insights is called _______.
(a) Visualisation
(b) Analysis
(c) Conclusion
(d) Data
Q3. To find the _________, you add up all the numbers and then divide by how many numbers you have.
(a) Median
(b) Mean
(c) Mode
(d) Range
Q4. To find the ________, you put all numbers in order from least to greatest and find the number that is in the middle.
(a) Median
(b) Mode
(c) Mean
(d) Range
Q5. Data on visitors' viewing habits at a bank's website has been collected. Which technique is used to identify pages commonly viewed during the same visit to the website?
(a) Clustering
(b) Classification
(c) Association Rules
(d) Regression
Q6. A market research team studies smartphone preferences across different age groups (18–25, 26–40, 41–60, and 60+). To ensure each age group is proportionally represented in the sample, which sampling method should they use?
(a) Random sampling
(b) Stratified sampling
(c) Cluster Sampling
(d) Multistage sampling
Q7. A relationship between two or more variables is referred to as a ________
(a) Trend
(b) Spike
(c) All of the above
(d) None of the above
Q8. Data that sits outside the trend is referred to as a ______
(a) Outlier
(b) Trend
(c) Spike
(d) Both (a) & (b)
Q9. A health researcher conducts a study on the effectiveness of a new fitness app by recruiting participants exclusively from a local gym. After analysing the data, the researcher concludes that the app significantly improves users' fitness levels. However, critics argue that the results may not apply to the general population.
Which type of bias most likely affects the study's conclusions due to its participant recruitment method?
(a) Selection Bias – The sample is unrepresentative because it only includes gym-goers (who may already be more health-conscious).
(b) Confirmation Bias – The researcher interprets data to confirm pre-existing beliefs.
(c) Observer Bias – The researcher’s expectations influence how they record or assess outcomes.
(d) Recall Bias – Participants inaccurately remember or report past behaviours.
Q10. Which of the following is NOT a machine learning algorithm?
(a) SVG
(b) Random Forest
(c) SVM
(d) None
Q11. Which of the following is one of the key data science skills?
(a) Machine Learning
(b) Statistics
(c) Data Visualisation
(d) All of the above
Q12. Customer profile data often contains discrete features like gender, occupation, or car brand (stored as strings). Since most data analysis models require numeric inputs, which encoding method is typically applied?
(a) Normalisation
(b) One-Hot Encoding
(c) Log Transformation
(d) Principal Component Analysis (PCA)
Answers:
1. (d) Data
2. (b) Analysis
3. (b) Mean
4. (a) Median
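The definitions behind answers 3 and 4 can be checked by hand. The sketch below uses made-up numbers (the list `scores` is purely illustrative) and compares the manual calculation with Python's built-in `statistics` module.

```python
from statistics import mean, median

scores = [7, 3, 9, 5, 6]  # made-up sample values, for illustration only

# Mean: add up all the numbers, then divide by how many there are.
manual_mean = sum(scores) / len(scores)

# Median: sort the numbers and take the middle one
# (for an even count, average the two middle values).
ordered = sorted(scores)
mid = len(ordered) // 2
if len(ordered) % 2 == 1:
    manual_median = ordered[mid]
else:
    manual_median = (ordered[mid - 1] + ordered[mid]) / 2

# The hand-rolled results agree with the standard library.
assert manual_mean == mean(scores)
assert manual_median == median(scores)
```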
5. (c) Association Rules
Association rule mining is a data mining technique used to discover relationships or patterns between items in large datasets. In this case, it helps identify which web pages are frequently viewed together during the same visit (e.g., "Users who viewed Page A also viewed Page B").
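As a rough illustration of the idea, the sketch below counts how often pairs of pages appear in the same visit and derives two common rule measures, support and confidence. The page names and sessions are invented, and the helper functions `support` and `confidence` are hypothetical names for this example; real association rule algorithms such as Apriori build on the same counts but prune the search far more efficiently.

```python
from collections import Counter
from itertools import combinations

# Hypothetical visits: each set holds the pages viewed in one session.
sessions = [
    {"home", "loans", "rates"},
    {"home", "loans"},
    {"home", "cards"},
    {"loans", "rates"},
    {"home", "loans", "rates"},
]

page_counts = Counter()   # sessions containing each page
pair_counts = Counter()   # sessions containing each pair of pages
for pages in sessions:
    page_counts.update(pages)
    pair_counts.update(combinations(sorted(pages), 2))


def support(a, b):
    """Fraction of all sessions that contain both pages."""
    return pair_counts[tuple(sorted((a, b)))] / len(sessions)


def confidence(a, b):
    """Of the sessions that viewed page a, the fraction that also viewed b."""
    return pair_counts[tuple(sorted((a, b)))] / page_counts[a]


print("support(loans, rates)     =", support("loans", "rates"))
print("confidence(rates -> loans) =", confidence("rates", "loans"))
print("confidence(loans -> rates) =", confidence("loans", "rates"))
```

A rule such as "visitors who viewed rates also viewed loans" is considered interesting when both its support and its confidence are high.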
6. (b) Stratified sampling
Stratified sampling guarantees proportional representation of key subgroups (here, age groups), making it ideal for comparative analysis.
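A minimal sketch of proportional stratified sampling follows. The population of 100 people, the age-group sizes, and the function name `stratified_sample` are assumptions made up for this example; the point is only that the same fraction is drawn from every stratum.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(records, key, fraction, seed=0):
    """Draw the same fraction from every stratum (group) in records."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)
    sample = []
    for group in strata.values():
        k = round(len(group) * fraction)  # rounding may shift totals slightly
        sample.extend(rng.sample(group, k))
    return sample


# Hypothetical population: (person_id, age_group) pairs.
population = (
    [(i, "18-25") for i in range(40)]
    + [(i, "26-40") for i in range(40, 70)]
    + [(i, "41-60") for i in range(70, 90)]
    + [(i, "60+") for i in range(90, 100)]
)

sample = stratified_sample(population, key=lambda p: p[1], fraction=0.1)
counts = Counter(group for _, group in sample)
print(counts)  # each age group keeps its 40/30/20/10 share of the sample
```

With simple random sampling, by contrast, a small group such as 60+ could be missed entirely in a sample of ten.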
7. (a) Trend
A trend represents a consistent, long-term relationship or pattern between two or more variables (e.g., as education level increases, income tends to rise).
8. (a) Outlier
An outlier is a data point that deviates significantly from the overall trend or pattern in a dataset. A trend refers to the general direction or relationship between variables, not an anomaly. A spike is a sudden, sharp increase, but it doesn't necessarily imply deviation from the trend.
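One common way to flag outliers in a single numeric column is the 1.5 × IQR rule. This rule is a convention chosen for this sketch, not something the quiz prescribes, and the readings below are invented.

```python
from statistics import quantiles

readings = [12, 14, 15, 15, 16, 17, 18, 60]  # made-up values; 60 looks suspicious

# Quartiles split the sorted data into four parts; IQR is the middle spread.
q1, _, q3 = quantiles(readings, n=4)
iqr = q3 - q1

# Points more than 1.5 * IQR beyond the quartiles are flagged as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in readings if x < lower_fence or x > upper_fence]

print(outliers)
```

Whether a flagged point is an error or a genuine extreme value still needs a human judgement; the rule only identifies candidates.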
9. (a) Selection Bias
The researcher recruited participants exclusively from a local gym. Gym-goers are generally more health-conscious and are likely to have higher fitness levels or a stronger motivation to improve their fitness than the general population. This makes the sample unrepresentative of the broader population, so the conclusions may not be generalisable.
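10. (a) SVG
SVG (Scalable Vector Graphics) is an image file format, not a machine learning algorithm. Random Forest and SVM (Support Vector Machine) are both machine learning algorithms.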
11. (d) All of the above
12. (b) One-Hot Encoding
One-Hot Encoding – Converts each category into a binary column (0/1). It is the standard method to convert string-based categories (e.g., "Male/Female") into numeric form for ML models.
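A minimal pure-Python sketch of one-hot encoding is shown below. The function name `one_hot` and the occupation values are illustrative assumptions; in practice, libraries such as pandas (`get_dummies`) or scikit-learn (`OneHotEncoder`) are normally used instead.

```python
def one_hot(values):
    """Map each distinct category to its own 0/1 column."""
    categories = sorted(set(values))
    rows = [[1 if value == cat else 0 for cat in categories] for value in values]
    return categories, rows


# Hypothetical string-valued feature from a customer profile.
occupations = ["engineer", "teacher", "engineer", "doctor"]
columns, encoded = one_hot(occupations)

print(columns)
for occupation, row in zip(occupations, encoded):
    print(occupation, row)
```

Each row contains exactly one 1, in the column for that record's category, so no artificial ordering is imposed on the categories.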