diff --git a/demos/Main_Demo.ipynb b/demos/Main_Demo.ipynb
index 70ffabbea..5e52962a7 100644
--- a/demos/Main_Demo.ipynb
+++ b/demos/Main_Demo.ipynb
@@ -1015,9 +1015,7 @@
     "Mathematically, centering is a linear map, normalizing is *not* a linear map, and scaling and translation are linear maps. \n",
     "* **Centering:** LayerNorm is applied every time a layer reads from the residual stream, so the mean of any residual stream vector can never matter - `center_writing_weights` set every weight matrix writing to the residual to have zero mean. \n",
     "* **Normalizing:** Normalizing is not a linear map, and cannot be factored out. The `hook_scale` hook point lets you access and control for this.\n",
-    "* **Scaling and Translation:** Scaling and translation are linear maps, and are always followed by another linear map. The composition of two linear maps is another linear map, so we can *fold* the scaling and translation weights into the weights of the subsequent layer, and simplify things without changing the underlying computation. \n",
-    "\n",
-    "[See the docs for more details](https://github.com/TransformerLensOrg/TransformerLens/blob/main/further_comments.md#what-is-layernorm-folding-fold_ln)"
+    "* **Scaling and Translation:** Scaling and translation are linear maps, and are always followed by another linear map. The composition of two linear maps is another linear map, so we can *fold* the scaling and translation weights into the weights of the subsequent layer, and simplify things without changing the underlying computation. \n"
    ]
   },
   {
diff --git a/docs/README.md b/docs/README.md
index ff6bd6040..4055aa376 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -8,10 +8,16 @@ The documentation uses Sphinx. However, the documentation is written in regular
 
 ## Build the Documentation
 
-First install the packages:
+For the standard contributor setup, install the default dependency groups:
 
 ```bash
-uv sync --group docs
+uv sync
+```
+
+For a docs-focused environment without the other default groups, install only the docs group:
+
+```bash
+uv sync --no-default-groups --group docs
 ```
 
 Then for hot-reloading, run this (note the model properties table won't hot reload, but everything
diff --git a/docs/source/content/contributing.md b/docs/source/content/contributing.md
index fc3092ad8..4fb8f94e0 100644
--- a/docs/source/content/contributing.md
+++ b/docs/source/content/contributing.md
@@ -28,7 +28,7 @@ source .venv/bin/activate
 cp .env.example .env
 ```
 
-Dependency groups are defined in `pyproject.toml` under `[dependency-groups]`. The project sets `default-groups = ["dev", "docs", "jupyter"]`, so `uv sync` installs all three out of the box — you do not need to pass `--group` flags for the standard contributor setup.
+Dependency groups are defined in `pyproject.toml` under `[dependency-groups]`. The project sets `default-groups = ["dev", "docs", "jupyter", "multimodal"]`, so `uv sync` installs these groups out of the box — you do not need to pass `--group` flags for the standard contributor setup.
 
 - Standard contributor setup (recommended default): `uv sync`
 - Include the optional `quantization` group (bitsandbytes, optimum-quanto): `uv sync --all-groups`
@@ -156,7 +156,7 @@ They will also be automatically checked with [pytest](https://docs.pytest.org/)
 
 If you want to view your documentation changes, run `uv run docs-hot-reload`. This will give you hot-reloading docs (they change in real time as you edit docstrings).
 
-For documentation generation to work, install with `uv sync --group docs`.
+The standard `uv sync` includes documentation generation. For a docs-focused environment without other default groups, use `uv sync --no-default-groups --group docs`.
 
 ### Docstring Style Guide
 
diff --git a/docs/source/content/hook_system.md b/docs/source/content/hook_system.md
index dbf17cf43..6d93ac35a 100644
--- a/docs/source/content/hook_system.md
+++ b/docs/source/content/hook_system.md
@@ -103,7 +103,7 @@ Stable strings; differ between HookedTransformer and TransformerBridge:
 | `TransformerBridge` (default) | Architecture-native | `blocks.5.attn.q.hook_out`, `blocks.5.hook_out`, `embed.hook_out` |
 | `TransformerBridge` + compatibility mode | Bridge-native AND HT-style aliases | Above + `blocks.5.attn.hook_q` etc. |
 
-Full catalogue: [Main Demo](generated/demos/Main_Demo), [Exploratory Analysis Demo](generated/demos/Exploratory_Analysis_Demo). Architecture diagram: [TransformerLens_Diagram.svg](../_static/TransformerLens_Diagram.svg).
+Full catalogue: [Main Demo](../generated/demos/Main_Demo), [Exploratory Analysis Demo](../generated/demos/Exploratory_Analysis_Demo). Architecture diagram: [TransformerLens_Diagram.svg](../_static/TransformerLens_Diagram.svg).
 
 Porting HT code to Bridge: `bridge.enable_compatibility_mode()` (see [Compatibility Mode](compatibility_mode.md)) registers HT aliases so existing names resolve.
 
@@ -173,5 +173,5 @@ model.run_with_hooks(
 
 - [Compatibility Mode](compatibility_mode.md) — when to enable HT-style hook aliases on a Bridge model.
 - [Migrating to TransformerLens 3](migrating_to_v3.md) — porting HookedTransformer hook patterns to TransformerBridge.
-- [Main Demo](generated/demos/Main_Demo) — end-to-end walkthrough using the hook system.
+- [Main Demo](../generated/demos/Main_Demo) — end-to-end walkthrough using the hook system.
 - [`transformer_lens/hook_points.py`](https://github.com/TransformerLensOrg/TransformerLens/blob/main/transformer_lens/hook_points.py), [`transformer_lens/ActivationCache.py`](https://github.com/TransformerLensOrg/TransformerLens/blob/main/transformer_lens/ActivationCache.py), [`transformer_lens/patching.py`](https://github.com/TransformerLensOrg/TransformerLens/blob/main/transformer_lens/patching.py) — source.
diff --git a/docs/source/content/tutorials.md b/docs/source/content/tutorials.md
index 9068b9fb6..366b7003f 100644
--- a/docs/source/content/tutorials.md
+++ b/docs/source/content/tutorials.md
@@ -14,7 +14,7 @@
 
 ## Demos
 
-- [**Activation Patching in TransformerLens**](https://colab.research.google.com/github/TransformerLensOrg/TransformerLens/blob/main/demos/Activation_Patching_in_TL_Demo.ipynb) - Accompanies the [Exploratory Analysis Demo](https://colab.research.google.com/github/TransformerLensOrg/TransformerLens/blob/main/demos/Exploratory Analysis Demo.ipynb). This demo explains how to use [Activation Patching](https://dynalist.io/d/n2ZWtnoYHrU1s4vnFSAQ519J#z=qeWBvs-R-taFfcCq-S_hgMqx) in TransformerLens, a mechanistic interpretability technique that uses causal intervention to identify which activations in a model matter for producing an output.
+- [**Activation Patching in TransformerLens**](https://colab.research.google.com/github/TransformerLensOrg/TransformerLens/blob/main/demos/Activation_Patching_in_TL_Demo.ipynb) - Accompanies the [Exploratory Analysis Demo](https://colab.research.google.com/github/TransformerLensOrg/TransformerLens/blob/main/demos/Exploratory_Analysis_Demo.ipynb). This demo explains how to use [Activation Patching](https://dynalist.io/d/n2ZWtnoYHrU1s4vnFSAQ519J#z=qeWBvs-R-taFfcCq-S_hgMqx) in TransformerLens, a mechanistic interpretability technique that uses causal intervention to identify which activations in a model matter for producing an output.
 
 - [**Attribution Patching**](https://colab.research.google.com/github/TransformerLensOrg/TransformerLens/blob/main/demos/Attribution_Patching_Demo.ipynb) - [Attribution Patching](https://www.neelnanda.io/mechanistic-interpretability/attribution-patching) is an incomplete project that uses gradients to take a linear approximation to activation patching. It's a good approximation when patching in small activations like the outputs of individual attention heads, and bad when patching in large activations like a residual stream.
 
@@ -34,6 +34,6 @@
 
 - [**Othello-GPT**](https://colab.research.google.com/github/TransformerLensOrg/TransformerLens/blob/main/demos/Othello_GPT.ipynb) - This is a demo notebook porting the weights of the Othello-GPT Model from the excellent [Emergent World Representations](https://arxiv.org/pdf/2210.13382.pdf) paper to TransformerLens. Neel's [sequence on investigating this](https://www.lesswrong.com/s/nhGNHyJHbrofpPbRG) is also well worth reading if you're interested in this topic!
 
-- [**SVD Interpreter Demo**](https://colab.research.google.com/github/TransformerLensOrg/TransformerLens/blob/main/demos/SVD_Interpreter_demo.ipynb) - Based on the [Conjecture post](https://www.lesswrong.com/posts/mkbGjzxD8d8XqKHzA/the-singular-value-decompositions-of-transformer-weight#Directly_editing_SVD_representations) about how the singular value decompositions of transformer matrices are surprisingly interpretable, this demo shows how to use TransformerLens to reproduce this and investigate further.
+- [**SVD Interpreter Demo**](https://colab.research.google.com/github/TransformerLensOrg/TransformerLens/blob/main/demos/SVD_Interpreter_Demo.ipynb) - Based on the [Conjecture post](https://www.lesswrong.com/posts/mkbGjzxD8d8XqKHzA/the-singular-value-decompositions-of-transformer-weight#Directly_editing_SVD_representations) about how the singular value decompositions of transformer matrices are surprisingly interpretable, this demo shows how to use TransformerLens to reproduce this and investigate further.
 
 - [**Tracr to TransformerLens**](https://colab.research.google.com/github/TransformerLensOrg/TransformerLens/blob/main/demos/Tracr_to_Transformer_Lens_Demo.ipynb) - [Tracr](https://github.com/deepmind/tracr) is a cool new DeepMind tool that compiles a written program in [RASP](https://arxiv.org/abs/2106.06981) to transformer weights.This is a (hacky!) script to convert Tracr weights from the JAX form to a TransformerLens HookedTransformer in PyTorch.