Harvard University — AI Interpretability, Controllability, and Safety Research

Organization: Harvard University
Award Date: 01/2024
Amount: $1,000,000
Purpose: To support research on artificial intelligence interpretability, controllability, and safety.

Open Philanthropy recommended a grant of $1,000,000 over two years to Harvard University to support research led by Martin Wattenberg and Fernanda Viégas on artificial intelligence interpretability, controllability, and safety. Their research will focus on the extent to which large language models have developed internal models of the user and of themselves as distinct agents.

This falls within Open Philanthropy’s focus area of potential risks from advanced artificial intelligence.
