[BUG] Langfuse tests fail on Python 3.14 — langfuse SDK uses pydantic v1 internals incompatible with 3.14 #171

@afarntrog

Description

Strands Version: N/A (dependency issue)

Strands Evals Version: current main (post-1dafdfb)

Python Version: 3.14.0

Operating System: macOS (Darwin 24.6.0)

Installation Method: pip

Steps to Reproduce:

  1. Install strands-agents-evals with the langfuse extra on Python 3.14
  2. Run: hatch test tests/strands_evals/providers/test_langfuse_provider.py
  3. All 29 langfuse provider tests fail: the import (from langfuse import Langfuse)
    crashes because the installed langfuse (<3) uses pydantic.v1.BaseModel
    for its internal API models (e.g. AnnotationQueue), and pydantic v1's
    _set_default_and_type() cannot infer field types under Python 3.14's new
    deferred annotation evaluation (PEP 649).

Error: pydantic.v1.errors.ConfigError: unable to infer type for attribute "description"

Stack trace originates in:
langfuse/api/resources/annotation_queues/types/annotation_queue.py
→ class AnnotationQueue(pydantic_v1.BaseModel)
→ pydantic/v1/fields.py:576 ConfigError

Expected Behavior:
All langfuse provider tests pass on Python 3.14, as they do on Python ≤3.13.

Actual Behavior:
7 FAILED, 22 ERROR: all 29 tests in test_langfuse_provider.py fail. The provider fixture cannot patch
langfuse_provider.Langfuse because importing the module crashes. Test runs on other Python versions are
unaffected (631 passed).

Additional Context:

  • The warning emitted while importing langfuse confirms it: Core Pydantic V1 functionality isn't
    compatible with Python 3.14 or greater.
  • Current dependency constraint: langfuse>=2.0.0,<3
  • langfuse 4.0.1 (latest) has moved off pydantic.v1.BaseModel to pydantic.BaseModel (v2) via
    UniversalBaseModel. It requires pydantic>=2,<3 and no longer ships pydantic v1 model definitions.
  • However, langfuse 4.x introduces a breaking change: observations.get_many switches from page-based
    pagination (page/meta.total_pages) to cursor-based pagination (cursor/meta.cursor). The trace.list API is
    unchanged.
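
As a point of comparison, the page-based shape that trace.list keeps can be sketched as below. This is a minimal illustration only, not the provider's actual _fetch_all_pages: fetch_all_pages and fake_trace_list are hypothetical names, and the response shape (.data plus .meta.total_pages) is assumed from the description above rather than taken from the langfuse SDK.

```python
from types import SimpleNamespace
from typing import Any, Callable


def fetch_all_pages(fetch_page: Callable[..., Any], **kwargs: Any) -> list[Any]:
    """Collect every item from a page-based endpoint.

    Assumption: each response exposes `.data` (a list) and
    `.meta.total_pages` (an int), and pages are numbered from 1.
    """
    items: list[Any] = []
    page = 1
    while True:
        response = fetch_page(page=page, **kwargs)
        items.extend(response.data)
        # Stop once the last page reported by the server has been read.
        if page >= response.meta.total_pages:
            return items
        page += 1


def fake_trace_list(page: int, **_ignored: Any) -> SimpleNamespace:
    """Hypothetical stand-in for a page-based endpoint with two pages."""
    pages = {1: ["trace-a", "trace-b"], 2: ["trace-c"]}
    return SimpleNamespace(
        data=pages[page],
        meta=SimpleNamespace(total_pages=len(pages)),
    )


traces = fetch_all_pages(fake_trace_list, session_id="s1")
```

The loop terminates on a page count known up front, which is exactly what a cursor-based endpoint no longer provides.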

Possible Solution:

  1. Bump dependency to langfuse>=4.0.0,<5
  2. Update LangfuseProvider._fetch_observations to use cursor-based pagination (the new observations.get_many
    uses cursor param and returns ObservationsV2Meta with .cursor instead of MetaResponse with .total_pages)
  3. Update _fetch_all_pages or add a _fetch_all_cursor helper for the observations endpoint
  4. Update test mocks to reflect the new pagination shape
  5. _fetch_traces_for_session and its tests need no changes — trace.list still uses page-based pagination with
    MetaResponse
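
A rough sketch of what the _fetch_all_cursor helper from step 3 might look like follows. It is an assumption-laden illustration, not the provider's code: fetch_all_cursor and make_fake_get_many are hypothetical names, the response is assumed to expose .data and .meta.cursor as described in step 2, and it is assumed that the endpoint accepts cursor=None for the first request and signals the last page with an empty cursor. The real langfuse 4.x signature should be checked before adapting this.

```python
from types import SimpleNamespace
from typing import Any, Callable, Optional


def fetch_all_cursor(fetch_page: Callable[..., Any], **kwargs: Any) -> list[Any]:
    """Collect every item from a cursor-paginated endpoint.

    Assumption: each response exposes `.data` (a list) and `.meta.cursor`,
    where a missing, None, or empty cursor means there are no more pages.
    """
    items: list[Any] = []
    cursor: Optional[str] = None
    while True:
        response = fetch_page(cursor=cursor, **kwargs)
        items.extend(response.data)
        cursor = getattr(response.meta, "cursor", None)
        if not cursor:
            return items


def make_fake_get_many(pages: dict) -> Callable[..., SimpleNamespace]:
    """Build a hypothetical endpoint; `pages` maps cursor -> (items, next_cursor)."""

    def fake_get_many(cursor: Optional[str] = None, **_ignored: Any) -> SimpleNamespace:
        data, next_cursor = pages[cursor]
        return SimpleNamespace(data=data, meta=SimpleNamespace(cursor=next_cursor))

    return fake_get_many


# Three pages chained by cursors; the last page carries no cursor.
fake_get_many = make_fake_get_many(
    {None: ([1, 2], "c1"), "c1": ([3], "c2"), "c2": ([4, 5], None)}
)
observations = fetch_all_cursor(fake_get_many, trace_id="t1")
```

A fake endpoint of this kind is also one way the test mocks in step 4 could model the new pagination shape, since the loop no longer has a total page count to assert against.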
