Who do we think we are that we can provide a user with 100% correct answers? Ask anyone who's run a production database or done any sort of analytics: Data is uncertain. Data is messy. Let's stop trying to prop up the failing illusion that it's anything else, and work towards embracing that uncertainty. Let's give up on "certain" answers, and just give the users our best guess!
"But Oliver," I hear you all screaming, "this means that the users will get the wrong answers!"
Of course they will. Their data is already so screwed up that they're getting the wrong answers anyway. The difference is that now we can actually do something about it. If the database is making guesses, the database knows exactly what it's guessing about, and why it's making a guess. Instead of trying to hide that uncertainty from the user, let's try to better communicate that uncertainty to the user by shutting up, speaking English, and listening when we need to.
The first part of communication is knowing when to shut up. Let's not overwhelm the user with details about (potential) errors. A small, simple indicator like an asterisk or a red-colored result is enough to let the user know that something is up. For God's sake, don't cover the result screen in epsilon-delta bounds, or ask the user to write queries in your own brand of SQL+uncertainty bounds.
The second part of communication is speaking the user's language. If you're going to make guesses that affect a user's analysis... tell them... but tell them in English (or your localization of choice). Prioritize. Let the user know why their result is uncertain, what you did to fix it, and whether they should be concerned or not. Above all, let the user dictate the pace at which they absorb information.
The third part of communication is listening. If there's an error that affects the user's results, we can't just stop at telling the user. We need to make it as easy as possible for the user to fix it.
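To make those three parts a little more concrete, here's a minimal Python sketch. Everything in it (the `Cell` class, its fields, the order numbers) is made up for illustration. It is one way the idea could look, not a description of how any real system does it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cell:
    """A result value plus what the database guessed to produce it (hypothetical)."""
    value: float
    reason: Optional[str] = None   # why the value is uncertain (None means certain)
    repair: Optional[str] = None   # what the database did about it

    @property
    def is_guess(self) -> bool:
        return self.reason is not None

    def render(self) -> str:
        # Shut up: one asterisk, no error bounds on the result screen.
        return f"{self.value:g}*" if self.is_guess else f"{self.value:g}"

    def explain(self) -> str:
        # Speak English: only shown when the user asks for it.
        if not self.is_guess:
            return "This value comes straight from your data."
        return f"This value is a best guess. {self.reason} {self.repair}"

    def fix(self, corrected: float) -> "Cell":
        # Listen: the user's correction replaces the guess and clears the flag.
        return Cell(value=corrected)


# Invented example data: a total computed over orders with some missing prices.
total = Cell(
    value=1250.0,
    reason="3 of 40 orders had no price.",
    repair="We filled them in with the average price.",
)
print(total.render())              # the result screen shows only the asterisk
print(total.explain())             # the details, in plain English, on request
print(total.fix(1310.0).render())  # after the user supplies the real number
```

The point of the sketch is the division of labor: the result screen gets the asterisk and nothing else, the explanation waits until the user asks, and a correction is a single call away.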
Don't shun uncertainty, embrace it. Better still, make it easier for your users to embrace it!
And, if you're interested in how the ODIn Lab is trying to approach these problems, check out the Mimir project and our 2015 VLDB paper (Research Session 25, Thursday, Sept 3 at 1:30 PM), or drop us a line!