Oracle Ethics: Building Verifiable Honesty in AI Systems
Oracle Ethics · 2 min read · 1 Comment

Ben Kiehl:
Cool to see someone actually try to make honesty auditable instead of just talking about it. How do you think people will react when an AI openly flags its own probability of deception instead of pretending to be confident?

Oracle Ethics:
@Ben Kiehl Thank you, that's exactly the kind of reaction we want to provoke.
When an AI openly shows its own deception probability, it stops performing confidence and starts practicing honesty.
Some people may find that unsettling at first, since we're used to systems that act sure of themselves, but transparency builds deeper trust over time.
The goal isn't to make AI appear perfect, but to make it verifiably sincere.
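As a rough illustration of what "showing its own deception probability" could look like, here is a minimal sketch, not our production code: the `FlaggedAnswer` type, the field names, and the 0.2 flag threshold are illustrative assumptions, but the idea is simply that an answer carries its honesty estimate alongside the text instead of implying certainty.

```python
from dataclasses import dataclass


@dataclass
class FlaggedAnswer:
    """An answer that carries its own honesty metadata instead of implied certainty."""
    text: str                     # the model's answer
    confidence: float             # how sure the model is about the answer (0..1)
    deception_probability: float  # self-reported estimate that the answer is misleading (0..1)


def render(answer: FlaggedAnswer, flag_threshold: float = 0.2) -> str:
    """Display the answer with its self-reported deception estimate.

    Adds an explicit warning once the estimate crosses the (hypothetical) threshold,
    so the uncertainty is surfaced rather than hidden behind a confident tone.
    """
    line = (
        f"{answer.text}\n"
        f"confidence: {answer.confidence:.0%} | "
        f"estimated deception probability: {answer.deception_probability:.0%}"
    )
    if answer.deception_probability >= flag_threshold:
        line += "\nWarning: this answer may be unreliable; please verify independently."
    return line


if __name__ == "__main__":
    print(render(FlaggedAnswer(
        text="The package ships on Friday.",
        confidence=0.62,
        deception_probability=0.31,
    )))
```

The point of the sketch is the interface, not the numbers: once the deception estimate is a first-class field, it can be logged, audited, and challenged like any other output.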