Data Science MCQs | For BCA, MCA, Computer Science Undergraduates | #ipumusings #eduvictors


Q1. A collection of information about a related topic is referred to as a __________.

(a) Visualisation

(b) Analysis

(c) Conclusion

(d) Data

Q2. The process of examining data to draw insights is called _______.

(a) Visualisation

(b) Analysis

(c) Conclusion

(d) Data


Q3. To find the _________, you add up all the numbers and then divide by how many numbers you have.

(a) Median

(b) Mean

(c) Mode

(d) Range 


Q4. To find the ________, you put all numbers in order from least to greatest and find the number that is in the middle.

(a) Median

(b) Mode

(c) Mean

(d) Range


Q5. Data on visitors' viewing habits at a bank's website has been collected. Which technique is used to identify pages commonly viewed during the same visit to the website?

(a) Clustering

(b) Classification

(c) Association Rules

(d) Regression


Q6. A market research team studies smartphone preferences across different age groups (18–25, 26–40, 41–60, and 60+). To ensure each age group is proportionally represented in the sample, which sampling method should they use?

(a) Random sampling 

(b) Stratified sampling 

(c) Cluster Sampling

(d) Multistage sampling


Q7. A relationship between two or more variables is referred to as a ________

(a) Trend

(b) Spike

(c) All of the above

(d) None of the above


Q8. Data that sits outside the trend is referred to as a ______

(a) Outlier

(b) Trend

(c) Spike

(d) Both (a) & (b) 


Q9. A health researcher conducts a study on the effectiveness of a new fitness app by recruiting participants exclusively from a local gym. After analysing the data, the researcher concludes that the app significantly improves users' fitness levels. However, critics argue that the results may not apply to the general population.

Which type of bias most likely affects the study's conclusions due to its participant recruitment method?

(a) Selection Bias – The sample is unrepresentative because it only includes gym-goers (who may already be more health-conscious).

(b) Confirmation Bias – The researcher interprets data to confirm pre-existing beliefs.

(c) Observer Bias – The researcher’s expectations influence how they record or assess outcomes.

(d) Recall Bias – Participants inaccurately remember or report past behaviours.


Q10. Which of the following is NOT a machine learning algorithm?

(a) SVG

(b) Random Forest

(c) SVM

(d) None 


Q11. Which of the following is one of the key data science skills?

(a) Machine Learning

(b) Statistics

(c) Data Visualisation

(d) All of the above


Q12. Customer profile data often contains discrete features like gender, occupation, or car brand (stored as strings). Since most data analysis models require numeric inputs, which encoding method is typically applied?

(a) Normalisation

(b) One-Hot Encoding

(c) Log Transformation

(d) Principal Component Analysis (PCA)



Answers:

1. (d) Data

2. (b) Analysis

3. (b) Mean

4. (a) Median
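The definitions in Q3 and Q4 can be checked with a few lines of Python. This is a minimal sketch using the standard `statistics` module; the sample values are hypothetical and chosen only so that the mean and the median differ.

```python
import statistics

# Hypothetical sample values, chosen only for illustration.
values = [2, 3, 5, 8, 22]

# Mean: add up all the numbers, then divide by how many numbers there are.
mean = sum(values) / len(values)

# Median: put the numbers in order and take the middle one.
# (For an even count, statistics.median averages the two middle values.)
median = statistics.median(values)

print(mean)    # 8.0
print(median)  # 5
```

The single large value (22) pulls the mean up to 8.0, while the median stays at 5.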


5. (c) Association Rules

Association rule mining is a data mining technique used to discover relationships or patterns between items in large datasets. In this case, it helps identify which web pages are frequently viewed together during the same visit (e.g., "Users who viewed Page A also viewed Page B").
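The idea can be sketched by counting how often pairs of pages appear in the same visit. The visit data and the helper names `support` and `confidence` below are assumptions made for illustration, and the counting is a simplified stand-in for a full algorithm such as Apriori.

```python
from collections import Counter
from itertools import combinations

# Hypothetical visits: each set holds the pages viewed in one session.
visits = [
    {"home", "loans", "rates"},
    {"home", "loans"},
    {"home", "cards"},
    {"loans", "rates"},
]

page_counts = Counter()   # number of visits that include each page
pair_counts = Counter()   # number of visits that include each pair of pages
for pages in visits:
    page_counts.update(pages)
    pair_counts.update(combinations(sorted(pages), 2))

def support(a, b):
    """Fraction of all visits that contain both pages."""
    return pair_counts[tuple(sorted((a, b)))] / len(visits)

def confidence(a, b):
    """Among visits that include page a, the fraction that also include b."""
    return pair_counts[tuple(sorted((a, b)))] / page_counts[a]

print(support("loans", "rates"))     # 0.5
print(confidence("rates", "loans"))  # 1.0
```

Here every visit that viewed "rates" also viewed "loans", which is the kind of rule ("viewed A, so also viewed B") the answer describes.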


6. (b) Stratified sampling 

Stratified sampling guarantees proportional representation of key subgroups (here, age groups), making it ideal for comparative analysis.
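A minimal sketch of proportional stratified sampling follows. The function name `stratified_sample`, the group sizes, and the 10% sampling fraction are all hypothetical; the point is only that the same fraction is drawn from each age group.

```python
import random
from collections import Counter

def stratified_sample(population, key, fraction, seed=0):
    """Draw the same fraction of members from every stratum defined by key."""
    rng = random.Random(seed)
    strata = {}
    for item in population:
        strata.setdefault(key(item), []).append(item)
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)   # proportional allocation
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: (respondent id, age group).
group_sizes = {"18-25": 40, "26-40": 30, "41-60": 20, "60+": 10}
population = [(f"{g}-{i}", g) for g, n in group_sizes.items() for i in range(n)]

sample = stratified_sample(population, key=lambda p: p[1], fraction=0.1)
sample_counts = Counter(group for _, group in sample)
print(sample_counts)
```

With these sizes the sample holds 4, 3, 2, and 1 respondents from the four groups, matching their 40:30:20:10 share of the population.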


7. (a) Trend

A trend represents a consistent, long-term relationship or pattern between two or more variables (e.g., as education level increases, income tends to rise).
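One common way to quantify such a relationship is the Pearson correlation coefficient. The sketch below computes it by hand; the education and income figures are made up for illustration, and `pearson_r` is a hypothetical helper name.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: years of education and income (in thousands).
education = [10, 12, 14, 16, 18]
income = [20, 25, 33, 38, 50]

r = pearson_r(education, income)
print(round(r, 2))  # 0.99
```

A value close to +1 indicates a strong upward trend; a value close to -1 indicates a strong downward one.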


8. (a) Outlier

An outlier is a data point that significantly deviates from the overall trend or pattern in a dataset. A trend refers to the general direction or relationship between variables, not an anomaly. A spike is a sudden, sharp increase, but doesn’t necessarily imply deviation from the trend.
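One widely used rule of thumb (an assumption here, not the only definition) flags values lying more than 1.5 times the interquartile range beyond the quartiles. The helper `iqr_outliers` and the sample data are hypothetical.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # Python 3.8+
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical measurements with one value far from the rest.
data = [10, 12, 11, 13, 12, 11, 95]

print(iqr_outliers(data))  # [95]
```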


9. (a) Selection Bias

The researcher recruited participants exclusively from a local gym. Gym-goers are generally more health-conscious and likely already have higher fitness levels or a stronger motivation to improve fitness compared to the general population. This makes the sample unrepresentative of the broader population, leading to conclusions that may not be generalizable.


10. (a) SVG

11. (d) All of the above


12. (b) One-Hot Encoding

One-Hot Encoding – Converts each category into a binary column (0/1). It is the standard method to convert string-based categories (e.g., "Male/Female") into numeric form for ML models.
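A plain-Python sketch of the idea is shown below. The function `one_hot` and the occupation values are hypothetical; in practice a library routine would normally be used instead.

```python
def one_hot(values):
    """Return the sorted category names and one 0/1 row per input value."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows

# Hypothetical string-valued feature.
occupations = ["teacher", "engineer", "teacher", "doctor"]

columns, encoded = one_hot(occupations)
print(columns)  # ['doctor', 'engineer', 'teacher']
print(encoded)  # [[0, 0, 1], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
```

Each row contains exactly one 1, in the column of that row's category.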

