# spaCy

> Integrate W&B with spaCy v3 to track training metrics and version models and datasets through the WandbLogger config.

[spaCy](https://spacy.io) is a popular "industrial-strength" NLP library that provides fast, accurate models with a minimum of fuss. As of spaCy v3, you can use W\&B with [`spacy train`](https://spacy.io/api/cli#train) to track your spaCy model's training metrics and to save and version your models and datasets. All it takes is a few added lines in your configuration.

## Sign up and create an API key

An API key authenticates your machine to W\&B. You can generate an API key from your user profile.

<Note>
  For a more streamlined approach, create an API key by going directly to [User Settings](https://wandb.ai/settings). Copy the newly created API key immediately and save it in a secure location such as a password manager.
</Note>

1. Click your user profile icon in the upper right corner.
2. Select **User Settings**, then scroll to the **API Keys** section.

## Install the `wandb` library and log in

To install the `wandb` library locally and log in:

<Tabs>
  <Tab title="Command Line">
    1. Set the `WANDB_API_KEY` [environment variable](/models/track/environment-variables/) to your API key.

       ```bash theme={null}
       export WANDB_API_KEY=<your_api_key>
       ```

    2. Install the `wandb` library and log in.

       ```shell theme={null}
       pip install wandb

       wandb login
       ```
  </Tab>

  <Tab title="Python">
    ```bash theme={null}
    pip install wandb
    ```

    ```python theme={null}
    import wandb
    wandb.login()
    ```
  </Tab>

  <Tab title="Python notebook">
    ```notebook theme={null}
    !pip install wandb

    import wandb
    wandb.login()
    ```
  </Tab>
</Tabs>
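
If you rely on the `WANDB_API_KEY` environment variable, a quick check from Python can confirm that the variable is visible to the process that will launch training. The following is a minimal sketch using only the standard library; the helper name `wandb_api_key_is_set` is hypothetical and not part of the `wandb` library.

```python
import os


def wandb_api_key_is_set(env=None):
    """Return True if WANDB_API_KEY is present and non-empty.

    `env` defaults to the current process environment; passing a dict
    makes the helper easy to try out without touching real settings.
    """
    env = os.environ if env is None else env
    return bool(env.get("WANDB_API_KEY", "").strip())


if not wandb_api_key_is_set():
    print("WANDB_API_KEY is not set in this environment.")
```

This check only covers the environment variable. If you logged in interactively with `wandb login`, the variable may be unset even though your machine is authenticated.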

## Add the `WandbLogger` to your spaCy config file

A spaCy config file specifies all aspects of training, not just logging: GPU allocation, optimizer choice, dataset paths, and more. At minimum, under `[training.logger]`, provide the key `@loggers` with the value `"spacy.WandbLogger.v3"`, plus a `project_name`.

<Note>
  For more on how spaCy training config files work and on other options you can pass in to customize training, check out [spaCy's documentation](https://spacy.io/usage/training).
</Note>

```ini theme={null}
[training.logger]
@loggers = "spacy.WandbLogger.v3"
project_name = "my_spacy_project"
remove_config_values = ["paths.train", "paths.dev", "corpora.train.path", "corpora.dev.path"]
log_dataset_dir = "./corpus"
model_log_interval = 1000
```
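
Before starting a long training run, it can help to confirm that the logger block contains the two required keys. The sketch below parses the example block above with Python's standard `configparser` and decodes each value as JSON. It is only a sanity check under the assumption that the section holds plain JSON-style values; spaCy's own config system handles interpolation and nested sections, which this sketch does not. The function name `read_logger_settings` is hypothetical.

```python
import configparser
import json

# The example logger block from this page, embedded as a string.
CONFIG_TEXT = '''
[training.logger]
@loggers = "spacy.WandbLogger.v3"
project_name = "my_spacy_project"
remove_config_values = ["paths.train", "paths.dev", "corpora.train.path", "corpora.dev.path"]
log_dataset_dir = "./corpus"
model_log_interval = 1000
'''


def read_logger_settings(config_text):
    """Parse the [training.logger] section and decode each value as JSON."""
    # Disable interpolation so "%" or "${...}" in other sections is left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep keys exactly as written
    parser.read_string(config_text)
    section = parser["training.logger"]
    return {key: json.loads(value) for key, value in section.items()}


settings = read_logger_settings(CONFIG_TEXT)
missing = [key for key in ("@loggers", "project_name") if key not in settings]
print("missing required keys:", missing)
print("logger:", settings["@loggers"])
```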

| Name                   | Description                                                                                                                                                                                       |
| ---------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `project_name`         | `str`. The name of the W\&B Project. The project is created automatically if it doesn’t exist yet.                                                                                                |
| `remove_config_values` | `List[str]`. A list of config values, given as dotted keys such as `paths.train`, to exclude from the config before it is uploaded to W\&B. `[]` by default.                                      |
| `model_log_interval`   | `Optional[int]`. If set, enables [model versioning](/models/registry/) with [Artifacts](/models/artifacts/). Pass in the number of steps to wait between logging model checkpoints. `None` by default. |
| `log_dataset_dir`      | `Optional[str]`. If passed a path, the dataset is uploaded as an Artifact at the beginning of training. `None` by default.                                                                        |
| `entity`               | `Optional[str]`. If passed, the run is created in the specified entity.                                                                                                                           |
| `run_name`             | `Optional[str]`. If specified, the run is created with the specified name.                                                                                                                        |
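
To make the effect of `remove_config_values` concrete, the sketch below flattens a nested config into dotted keys and drops the listed ones. This illustrates the idea only and is not the logger's actual implementation; the helper names `flatten` and `drop_config_values` and the sample config values are hypothetical.

```python
def flatten(config, prefix=""):
    """Flatten a nested dict into {"a.b.c": value} form."""
    flat = {}
    for key, value in config.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, dotted))
        else:
            flat[dotted] = value
    return flat


def drop_config_values(config, remove_config_values):
    """Return the flattened config without the listed dotted keys."""
    excluded = set(remove_config_values)
    return {k: v for k, v in flatten(config).items() if k not in excluded}


# Illustrative config: local paths are dropped, other settings are kept.
config = {
    "paths": {"train": "./train", "dev": "./dev"},
    "training": {"max_steps": 20000},
}
uploaded = drop_config_values(config, ["paths.train", "paths.dev"])
print(uploaded)  # {'training.max_steps': 20000}
```

Excluding path entries like this keeps machine-specific locations out of the config recorded with the run.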

## Start training

Once you have added the `WandbLogger` to your spaCy training config, you can run `spacy train` as usual.

<Tabs>
  <Tab title="Command Line">
    ```shell theme={null}
    python -m spacy train \
        config.cfg \
        --output ./output \
        --paths.train ./train \
        --paths.dev ./dev
    ```
  </Tab>

  <Tab title="Python">
    ```shell theme={null}
    python -m spacy train \
        config.cfg \
        --output ./output \
        --paths.train ./train \
        --paths.dev ./dev
    ```
  </Tab>

  <Tab title="Python notebook">
    ```notebook theme={null}
    !python -m spacy train \
        config.cfg \
        --output ./output \
        --paths.train ./train \
        --paths.dev ./dev
    ```
  </Tab>
</Tabs>
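
If you want to launch the same command from a Python script, one option is to build the argument list and hand it to `subprocess`. The following is a minimal sketch that mirrors the command above; the helper names `build_train_command` and `run_training` are hypothetical, and the paths are the placeholder values used on this page.

```python
import subprocess
import sys


def build_train_command(config_path, output_dir, train_path, dev_path):
    """Build the `python -m spacy train` argument list shown above."""
    return [
        sys.executable, "-m", "spacy", "train",
        config_path,
        "--output", output_dir,
        "--paths.train", train_path,
        "--paths.dev", dev_path,
    ]


def run_training(command):
    """Run the training command; raises CalledProcessError on failure."""
    subprocess.run(command, check=True)


command = build_train_command("config.cfg", "./output", "./train", "./dev")
print(" ".join(command[1:]))
# Call run_training(command) to start training with these arguments.
```

Using `sys.executable` runs spaCy with the same interpreter as the script, which avoids picking up a different environment where `wandb` is not installed.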

When training begins, a link to your training run's [W\&B page](/models/runs/) is printed to the output. The link opens the run's experiment tracking [dashboard](/models/track/workspaces/) in the W\&B web UI.
