Conversation

Jibola
Contributor

@Jibola Jibola commented Aug 19, 2025

Summary

Defining field mappings for Atlas Search and Vector Search indexes can get complicated. Our initial SearchIndex and VectorSearchIndex solutions provide reasonable defaults for categorized fields. However, a typical MongoDB power user may want more nuanced indexes. This PR introduces a way to provide custom field mappings for individual fields.

Key changes

  • Added field_mappings, analyzer, and search_analyzer parameters to SearchIndex to allow custom Atlas Search field configurations.
  • Changed the options returned by get_constraints to also include analyzer and searchAnalyzer.
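
For illustration, the kind of Atlas Search index definition these parameters are meant to produce can be sketched as a plain dictionary. The field names (`title`, `description`) and the chosen analyzers are hypothetical. The `mappings`/`fields`, `analyzer`, and `searchAnalyzer` keys follow the Atlas Search definition format discussed in this thread; this is not the backend's actual output.

```python
import json

# Hypothetical custom mappings for two string fields of a model.
field_mappings = {
    "title": {"type": "autocomplete"},
    "description": {"type": "string", "analyzer": "lucene.english"},
}

# Sketch of the index definition document that would be sent to Atlas.
definition = {
    "analyzer": "lucene.standard",
    "searchAnalyzer": "lucene.standard",
    "mappings": {"dynamic": False, "fields": field_mappings},
}

print(json.dumps(definition, indent=2))
```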

Screenshots

Image of a customized field_mapping added in a migration
image

Its representation in MongoDB Compass
image

@Jibola Jibola requested a review from Copilot August 19, 2025 23:03
@Jibola Jibola marked this pull request as draft August 19, 2025 23:03

@Copilot Copilot AI left a comment

Pull Request Overview

This PR enhances MongoDB Atlas search index functionality by adding field mapping capabilities and index status monitoring. The changes allow developers to specify custom field mappings for search indexes and ensure proper synchronization during index operations.

Key changes:

  • Added field_mappings parameter to SearchIndex to allow custom Atlas Search field configurations
  • Introduced index status monitoring functions to wait for index creation/deletion completion
  • Added DynamicSearchIndex class for dynamic field mapping scenarios

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
django_mongodb_backend/schema.py Added index status monitoring functions and integrated them into index operations
django_mongodb_backend/indexes.py Enhanced SearchIndex with field_mappings support and added DynamicSearchIndex class

@Jibola Jibola changed the title Allow additional fields_mappings to get added to SearchIndexModel configurations INTPYTHON-729: Improve flexibility and QOL of Atlas/Vector Search Index Configurations Aug 19, 2025
@Jibola Jibola changed the title INTPYTHON-729: Improve flexibility and QOL of Atlas/Vector Search Index Configurations INTPYTHON-729: (PoC) Improve flexibility and QOL of Atlas/Vector Search Index Configurations Aug 19, 2025
Collaborator

@timgraham timgraham left a comment

I imagined the index type API as subclasses like AutocompleteSearchIndex but I guess that's not flexible enough if an index has multiple fields with different types.

Comment on lines 169 to 171
if field_name in self.field_mappings:
    fields[field_path] = self.field_mappings[field_name].copy()
Collaborator

Is field_mappings really supposed to contain the entire mapping? (e.g. "type" too). I'd think it would be more likely to be interpreted as "extra options to add to the field".

Contributor Author

Yeah, type in the Atlas Search Field Mapping refers to the Atlas Search Field Type. We infer type from our fields, but, for instance, strings can be interpreted as four different types:

  • string (we infer)
  • token
  • stringFacet
  • autocomplete
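
A minimal sketch of why the explicit "type" matters: the same string field can be given any of the four mappings listed above, and only the caller knows which one is wanted. The helper `string_mapping` is hypothetical and exists only for this illustration; it is not part of the backend.

```python
# Atlas Search types that can index a string value (from the list above).
STRING_TYPES = {"string", "token", "stringFacet", "autocomplete"}


def string_mapping(search_type="string"):
    """Return a minimal field mapping for a string field.

    Defaults to "string", the type the backend is said to infer.
    """
    if search_type not in STRING_TYPES:
        raise ValueError(f"{search_type!r} cannot index a string value")
    return {"type": search_type}


# Inferred default versus an explicit override for the same field.
default_mapping = string_mapping()
autocomplete_mapping = string_mapping("autocomplete")
```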

Collaborator

@timgraham timgraham Sep 23, 2025

Got it. Your original PR combined fields and field_mappings, but I made these arguments mutually exclusive. (Possibly a separate class, e.g. "MappedSearchIndex", would be a better separation of concerns than having mutually exclusive arguments.)
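
The mutually exclusive design could look roughly like the sketch below. The function name `resolve_index_fields` and the error messages are assumptions made for illustration, not the code that was merged.

```python
def resolve_index_fields(fields=(), field_mappings=None):
    """Return the indexed field names, accepting exactly one of the two arguments."""
    if fields and field_mappings:
        raise ValueError("Cannot specify both 'fields' and 'field_mappings'.")
    if not fields and not field_mappings:
        raise ValueError("One of 'fields' or 'field_mappings' is required.")
    # With mappings, the dictionary keys double as the list of indexed fields.
    return list(field_mappings) if field_mappings else list(fields)
```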

Contributor Author

Oooh, potentially. I actually chose to combine fields and field_mappings because I envisioned folks being fine with the defaults set on a field name unless they wanted to make one small mutation. It's purely a QOL improvement so folks don't have to commit to writing the entire field mapping, but I'm fine conceding to their separation unless we get requests from developers.
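
The combined behavior described here can be sketched as follows: every listed field gets its inferred default unless a custom mapping is supplied for it, in which case the custom mapping replaces the default for that field (mirroring the reviewed snippet above). The helper `merge_mappings` and the shape of `inferred` are hypothetical.

```python
def merge_mappings(fields, inferred, field_mappings=None):
    """Use inferred defaults, replaced per field by any custom mapping."""
    field_mappings = field_mappings or {}
    merged = {}
    for name in fields:
        if name in field_mappings:
            # A custom mapping replaces the inferred one for this field.
            merged[name] = dict(field_mappings[name])
        else:
            merged[name] = dict(inferred[name])
    return merged


# Hypothetical defaults the backend might infer from the model fields.
inferred = {"title": {"type": "string"}, "pages": {"type": "number"}}
merged = merge_mappings(
    ["title", "pages"], inferred, {"title": {"type": "autocomplete"}}
)
```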

@timgraham timgraham changed the title INTPYTHON-729: (PoC) Improve flexibility and QOL of Atlas/Vector Search Index Configurations INTPYTHON-729 Allow creating search indexes with field mappings Sep 12, 2025
@timgraham timgraham force-pushed the INTPYTHON-729 branch 2 times, most recently from b978a65 to 48c1495 Compare September 12, 2025 20:44
@timgraham
Collaborator

I think this is functionally complete for field_mappings, but that still doesn't support top-level definition options like "analyzer" and "searchAnalyzer" (see example) [not sure if important].

VectorSearchIndex doesn't take mappings in the same way (see syntax). The existing implementation supports some options (numDimensions, similarity) but not others (quantization, hnswOptions). If important to add, let's create a separate issue.
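
To make the difference concrete: a vector search definition is a list of field entries rather than a nested mappings document. The sketch below is an assumption-laden illustration; `vector_field` is a hypothetical helper, and the field path, dimension count, and `quantization="scalar"` value are example inputs, not options the backend currently accepts.

```python
def vector_field(path, num_dimensions, similarity, **extra_options):
    """Build one entry of a vector search definition's "fields" list."""
    field = {
        "type": "vector",
        "path": path,
        "numDimensions": num_dimensions,
        "similarity": similarity,
    }
    # Options not yet supported (e.g. quantization) would be passed through here.
    field.update(extra_options)
    return field


vector_definition = {
    "fields": [vector_field("embedding", 1536, "cosine", quantization="scalar")]
}
```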

@Jibola
Contributor Author

Jibola commented Sep 24, 2025

> I think this is functionally complete for field_mappings, but that still doesn't support top-level definition options like "analyzer" and "searchAnalyzer" (see example) [not sure if important].

This is a fairly straightforward addition. lucene.standard is used by default if not specified. I can add two new arguments for analyzer and searchAnalyzer and only attach them if the field is not None.
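
A minimal sketch of that approach, assuming a hypothetical `build_definition` helper: the two keys are added only when a value was passed, so Atlas applies its own default when they are omitted.

```python
def build_definition(field_mappings, analyzer=None, search_analyzer=None):
    """Build a search index definition, attaching analyzers only if given."""
    definition = {"mappings": {"dynamic": False, "fields": field_mappings}}
    if analyzer is not None:
        definition["analyzer"] = analyzer
    if search_analyzer is not None:
        definition["searchAnalyzer"] = search_analyzer
    return definition


plain = build_definition({"title": {"type": "string"}})
custom = build_definition({"title": {"type": "string"}}, analyzer="lucene.english")
```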

> VectorSearchIndex doesn't take mappings in the same way (see syntax). The existing implementation supports some options (numDimensions, similarity) but not others (quantization, hnswOptions). If important to add, let's create a separate issue.

Yeah, quantization and hnswOptions can definitely be split into a separate ticket.

@Jibola Jibola marked this pull request as ready for review September 25, 2025 17:19
@Jibola Jibola requested review from WaVEV and timgraham September 29, 2025 15:57

-    def __init__(self, *, fields=(), name=None):
+    def __init__(
+        self, *, fields=(), field_mappings=None, name=None, analyzer=None, search_analyzer=None
Collaborator

While it's called "field mappings" in the MongoDB docs, I've been struggling to intuitively remember where the "s" goes (field_mappings, fields_mapping, fields_mappings, etc.). I think fields_mappings may be more intuitive since we have an existing fields parameter and also the JSON structure has ["mappings"] and ["fields"]. What do you think?

Contributor Author

@Jibola Jibola Oct 9, 2025

Hrm, after thinking it over, I say we keep it as field_mappings. I understand the dangling "s" concern. Since fields and field_mappings are the actual names of the keys at the server level, keeping that parallelism still aligns better to me.
cc: @aclark4life, @WaVEV for additional opinions.

Collaborator

To clarify, the documentation uses the term "Field Mappings" but I haven't seen a key called "field_mappings". Did I miss it?

Collaborator

🤔 IMO, field_mappings is more accurate. In a noun + noun compound, the first noun stays singular, as in "copilot settings".

Contributor Author

@Jibola Jibola Oct 13, 2025

@timgraham my mistake. I vaguely remember seeing field_mappings in my earlier attempts at this, but I'm guessing I saw the term in that same documentation and misattributed it to a key. Nonetheless, I still prefer field_mappings because it also just sounds better (to @WaVEV's point as well).

@timgraham timgraham changed the title INTPYTHON-729 Allow creating search indexes with field mappings INTPYTHON-729 Allow creating search indexes with field mappings and analyzers Oct 11, 2025
@timgraham timgraham merged commit 204e722 into main Oct 13, 2025
19 checks passed
@timgraham timgraham deleted the INTPYTHON-729 branch October 13, 2025 14:18
3 participants