We are a collection of researchers interested in using causal models to understand agent incentives, in order to design safe and fair AI algorithms. If you are interested in collaborating on any related problems, feel free to reach out to us.
Human Control: Definitions and Algorithms: We study corrigibility, an alternative called counterfactual vigilance, the assurances they offer for human autonomy, and the algorithms that can be used to obtain them. Ryan Carey, Tom Everitt. UAI, 2023.