
pydantic_evals.generation

Utilities for generating example datasets for pydantic_evals.

This module provides functions for generating sample datasets for tests and examples, using an LLM to create realistic test data with the correct structure.

generate_dataset async

generate_dataset(
    *,
    dataset_type: type[
        Dataset[InputsT, OutputT, MetadataT]
    ],
    path: Path | str | None = None,
    custom_evaluator_types: Sequence[
        type[Evaluator[InputsT, OutputT, MetadataT]]
    ] = (),
    model: Model | KnownModelName = "openai:gpt-4o",
    n_examples: int = 3,
    extra_instructions: str | None = None
) -> Dataset[InputsT, OutputT, MetadataT]

Use an LLM to generate a dataset of test cases, each consisting of inputs, expected output, and metadata.

This function creates a properly structured dataset with the specified input, output, and metadata types. It uses an LLM to attempt to generate realistic test cases that conform to the types' schemas.

Parameters

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `path` | `Path \| str \| None` | Optional path to save the generated dataset. If provided, the dataset will be saved to this location. | `None` |
| `dataset_type` | `type[Dataset[InputsT, OutputT, MetadataT]]` | The type of dataset to generate, with the desired input, output, and metadata types. | *required* |
| `custom_evaluator_types` | `Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]]` | Optional sequence of custom evaluator classes to include in the schema. | `()` |
| `model` | `Model \| KnownModelName` | The Pydantic AI model to use for generation. Defaults to `'openai:gpt-4o'`. | `'openai:gpt-4o'` |
| `n_examples` | `int` | Number of examples to generate. Defaults to 3. | `3` |
| `extra_instructions` | `str \| None` | Optional additional instructions to provide to the LLM. | `None` |

Returns

| Type | Description |
| --- | --- |
| `Dataset[InputsT, OutputT, MetadataT]` | A properly structured `Dataset` object with generated test cases. |

Raises

| Type | Description |
| --- | --- |
| `ValidationError` | If the LLM's response cannot be parsed as a valid dataset. |
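
A minimal usage sketch. The `QuestionInputs`, `AnswerOutput`, and `QaMetadata` models below are hypothetical placeholders defined for illustration, not part of pydantic_evals:

```python
import asyncio

from pydantic import BaseModel

from pydantic_evals import Dataset
from pydantic_evals.generation import generate_dataset


class QuestionInputs(BaseModel):
    """Hypothetical input model for a question-answering task."""

    question: str


class AnswerOutput(BaseModel):
    """Hypothetical expected-output model."""

    answer: str


class QaMetadata(BaseModel):
    """Hypothetical per-case metadata."""

    difficulty: str


async def main():
    # Ask the LLM for three cases conforming to the three schemas above.
    dataset = await generate_dataset(
        dataset_type=Dataset[QuestionInputs, AnswerOutput, QaMetadata],
        n_examples=3,
        extra_instructions='Generate questions about world capitals.',
    )
    for case in dataset.cases:
        print(case.inputs.question, '->', case.expected_output)


asyncio.run(main())
```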

Source code in pydantic_evals/pydantic_evals/generation.py
async def generate_dataset(
    *,
    dataset_type: type[Dataset[InputsT, OutputT, MetadataT]],
    path: Path | str | None = None,
    custom_evaluator_types: Sequence[type[Evaluator[InputsT, OutputT, MetadataT]]] = (),
    model: models.Model | models.KnownModelName = 'openai:gpt-4o',
    n_examples: int = 3,
    extra_instructions: str | None = None,
) -> Dataset[InputsT, OutputT, MetadataT]:
    """Use an LLM to generate a dataset of test cases, each consisting of input, expected output, and metadata.

    This function creates a properly structured dataset with the specified input, output, and metadata types.
    It uses an LLM to attempt to generate realistic test cases that conform to the types' schemas.

    Args:
        path: Optional path to save the generated dataset. If provided, the dataset will be saved to this location.
        dataset_type: The type of dataset to generate, with the desired input, output, and metadata types.
        custom_evaluator_types: Optional sequence of custom evaluator classes to include in the schema.
        model: The Pydantic AI model to use for generation. Defaults to 'gpt-4o'.
        n_examples: Number of examples to generate. Defaults to 3.
        extra_instructions: Optional additional instructions to provide to the LLM.

    Returns:
        A properly structured Dataset object with generated test cases.

    Raises:
        ValidationError: If the LLM's response cannot be parsed as a valid dataset.
    """
    output_schema = dataset_type.model_json_schema_with_evaluators(custom_evaluator_types)

    # TODO(DavidM): Update this once we add better response_format and/or ResultTool support to Pydantic AI
    agent = Agent(
        model,
        system_prompt=(
            f'Generate an object that is in compliance with this JSON schema:\n{output_schema}\n\n'
            f'Include {n_examples} example cases.'
            ' You must not include any characters in your response before the opening { of the JSON object, or after the closing }.'
        ),
        output_type=str,
        retries=1,
    )

    result = await agent.run(extra_instructions or 'Please generate the object.')
    try:
        result = dataset_type.from_text(result.output, fmt='json', custom_evaluator_types=custom_evaluator_types)
    except ValidationError as e:  # pragma: no cover
        print(f'Raw response from model:\n{result.output}')
        raise e
    if path is not None:
        result.to_file(path, custom_evaluator_types=custom_evaluator_types)  # pragma: no cover
    return result
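
Passing custom evaluator classes via `custom_evaluator_types` includes them in the JSON schema shown to the LLM, and the same classes are passed again when parsing and serializing the result. A sketch reusing the models from the example above; `MatchesExpected` and the output filename are hypothetical:

```python
from dataclasses import dataclass

from pydantic_evals import Dataset
from pydantic_evals.evaluators import Evaluator, EvaluatorContext
from pydantic_evals.generation import generate_dataset


@dataclass
class MatchesExpected(Evaluator):
    """Hypothetical evaluator: passes when the output equals the expected output."""

    def evaluate(self, ctx: EvaluatorContext) -> bool:
        return ctx.output == ctx.expected_output


async def generate_and_save() -> Dataset:
    # Because `path` is provided, the dataset is also written to disk.
    return await generate_dataset(
        dataset_type=Dataset[QuestionInputs, AnswerOutput, QaMetadata],
        custom_evaluator_types=[MatchesExpected],
        path='qa_dataset.yaml',  # hypothetical path, chosen for illustration
    )
```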