Asked by riq
Your school is collecting data for a “Healthy Habits” project. The dataset includes Hours Slept, Water Drank (cups), and Steps Walked for each student yesterday.
When reviewing the data, you notice several missing values and one value that looks clearly incorrect.
Screenshot 2025-11-25 at 12.32.03 PM.png
Hint:
48 hours of sleep in one night is impossible → likely a data entry error
All other numbers are realistic and within expected ranges
Which option is the BEST way to clean this dataset before analysis?
Group of answer choices
Remove the 48-hour outlier and replace missing values with the average of each column’s realistic values
Delete every row with a missing value or unrealistic value
Replace all missing values with 0 and keep the 48-hour value
Replace missing values with 10 and change 48 to 8 because that seems “reasonable”
Answered by
GPT-5 mini
AI
"Remove the 48-hour outlier and replace missing values with the average of each column’s realistic values."
Rationale: 48 hours of sleep in one night is impossible, so it should be treated as a data entry error: remove the value, and substitute a corrected number only if you can confirm the true one. Replacing missing values with column means computed from the realistic data preserves most of the dataset without introducing arbitrary zeros or guesses. Deleting every row with a problem would throw away the valid measurements in those rows; the other two options introduce unjustified, biased values. (If a column's distribution is skewed, consider using the median instead of the mean.)
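As a rough illustration of that cleaning approach, here is a minimal Python sketch. The values in the screenshot are not reproduced here, so the rows below are made up, and the column names (hours_slept, water_cups, steps) and the plausibility limits are assumptions chosen for the example. It first treats any out-of-range value as missing, then fills each gap with the mean of the remaining realistic values in that column.

```python
from statistics import mean

# Hypothetical rows (NOT the values from the screenshot); None marks a missing entry.
rows = [
    {"hours_slept": 8,    "water_cups": 6,    "steps": 7000},
    {"hours_slept": None, "water_cups": 8,    "steps": 9000},
    {"hours_slept": 48,   "water_cups": 5,    "steps": None},
    {"hours_slept": 7,    "water_cups": None, "steps": 8000},
    {"hours_slept": 9,    "water_cups": 7,    "steps": 6000},
]

# Assumed plausibility limits per column (illustrative, not from the original question).
LIMITS = {"hours_slept": (0, 24), "water_cups": (0, 30), "steps": (0, 100_000)}


def clean(rows, limits):
    """Return a cleaned copy: unrealistic values removed, gaps filled with column means."""
    cleaned = [dict(r) for r in rows]  # copy so the raw data stays untouched
    for col, (lo, hi) in limits.items():
        # Step 1: treat any value outside the plausible range as missing.
        for r in cleaned:
            value = r[col]
            if value is not None and not (lo <= value <= hi):
                r[col] = None
        # Step 2: compute the mean from the realistic values only.
        realistic = [r[col] for r in cleaned if r[col] is not None]
        fill = mean(realistic)
        # Step 3: fill every gap in this column with that mean.
        for r in cleaned:
            if r[col] is None:
                r[col] = fill
    return cleaned


cleaned = clean(rows, LIMITS)
for r in cleaned:
    print(r)
```

For a skewed column, swapping mean for statistics.median in step 2 would give the median-based variant mentioned above. Note that the sketch assumes each column has at least one realistic value; mean raises an error on an empty list.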