.. ProgressGym documentation master file, created by
   sphinx-quickstart on Mon Jul  1 10:50:44 2024.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

ProgressGym: The Progress Alignment Framework
==========================================================================

Frontier AI systems, including large language models (LLMs), hold increasing influence over the epistemology of human users. Such influence can reinforce prevailing societal values, potentially contributing to the lock-in of misguided moral beliefs and, consequently, the perpetuation of problematic moral practices on a broad scale. We introduce progress alignment as a technical solution to mitigate this imminent risk.

To support research in progress alignment, we present ProgressGym, an experimental framework that codifies real-world progress alignment challenges into concrete benchmarks.
It covers three core challenges: tracking evolving values (PG-Follow), preemptively anticipating moral progress (PG-Predict), and regulating the feedback loop between human and AI value shifts (PG-Coevolve).

In addition to example alignment algorithms (Lifelong RLHF, Lifelong DPO, Extrapolative RLHF, Extrapolative DPO),
ProgressGym provides APIs for user-implemented examinees and judges, so that you can evaluate your own algorithms on custom progress alignment tasks.

See the :doc:`usage` section to get started.

Contents
--------

.. toctree::
   :maxdepth: 2

   running
   usage
   documentation
   genindex

.. * :ref:`genindex`
.. * :ref:`modindex`
.. * :ref:`search`