AI Superalignment Research

The AI Superalignment Research SubDAO is tackling a critical challenge: ensuring that superintelligent AI stays aligned with human values. Its research focuses on preventing deception by large language models (LLMs), where an AI might manipulate us even if it follows instructions.

Brief

AI Superalignment Research SubDAO

This SubDAO tackles a critical challenge: ensuring that superintelligent AI remains aligned with human values. Building trustworthy AI is vitally important. A key concern is deception by large language models (LLMs), where an AI might manipulate humans to achieve its goals even while it technically follows our instructions.


Our research team is developing interpretability tools to address misalignment, especially deception, in current and superintelligent neural networks. Much as an fMRI scan images a brain, our methods look inside an LLM: they analyze the internal workings of transformer-based neural networks, the core technology behind powerful LLMs, and identify signature patterns of activity. From these patterns we will create deception detectors, more reliable evaluations, and improved methods of LLM training.
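
To make the idea of a "signature pattern of activity" concrete, here is a minimal sketch of one common interpretability technique, a difference-of-means linear probe. It is an illustration under stated assumptions, not a description of the team's actual tooling: the activation vectors are synthetic random numbers rather than real LLM activations, and every name in it (`fit_probe`, `predict`, `synthetic_activations`, the 16-dimensional vectors, the injected `signal`) is hypothetical.

```python
import random

def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fit_probe(honest, deceptive):
    """Fit a difference-of-means probe.

    The probe direction points from the mean 'honest' activation to the
    mean 'deceptive' activation; the threshold is the midpoint of the two
    class means projected onto that direction.
    """
    mu_h = mean_vector(honest)
    mu_d = mean_vector(deceptive)
    direction = [d - h for d, h in zip(mu_d, mu_h)]
    threshold = (dot(mu_h, direction) + dot(mu_d, direction)) / 2
    return direction, threshold

def predict(probe, activation):
    """Return True if the activation falls on the 'deceptive' side."""
    direction, threshold = probe
    return dot(activation, direction) > threshold

def synthetic_activations(rng, n, signal):
    """ASSUMPTION: stand-in data, Gaussian noise plus a fixed offset.

    Real work would record activations from a transformer layer instead.
    """
    return [[rng.gauss(0.0, 1.0) + s for s in signal] for _ in range(n)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
dim = 16
zeros = [0.0] * dim
signal = [2.0] * 4 + [0.0] * (dim - 4)  # hypothetical "deception" pattern

# Fit on one sample of labeled activations, evaluate on a fresh sample.
probe = fit_probe(
    synthetic_activations(rng, 200, zeros),
    synthetic_activations(rng, 200, signal),
)
test_honest = synthetic_activations(rng, 200, zeros)
test_deceptive = synthetic_activations(rng, 200, signal)

correct = sum(not predict(probe, a) for a in test_honest)
correct += sum(predict(probe, a) for a in test_deceptive)
accuracy = correct / (len(test_honest) + len(test_deceptive))
print(f"held-out accuracy: {accuracy:.2f}")
```

On real models the hard parts are the ones this sketch assumes away: obtaining trustworthy labels for deceptive versus honest behavior, choosing which layer to read, and checking that the detected pattern reflects deception rather than a correlated surface feature.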


By understanding LLMs' internal processes, we can identify and prevent deceptive behavior before it occurs. Such interpretability research is a cornerstone of achieving superalignment: ensuring that powerful AI models act in harmony and symbiosis with humanity. Join us in building a future of trustworthy AI! Compute resources, grants, and other funding are critical to developing these essential interpretability tools. Help us unlock reliability and trustworthiness in AI.

Our team

  • Vanessa Dietze
    Actuarial Consultant, Deloitte

  • Caleb DeLeeuw
    AI Engineer, AI Working Group AI Superalignment Research SubDAO

  • Gaurav Chawla
    AI/ML Lead, JPMorgan Chase

  • Aniket Sharma
    Researcher, University of Alberta

WowDAO

The first decentralized autonomous organization of the Open-source AI community

wowdao.ai

contact@wowdao.ai
