Java Client

Matthew Davis edited this page Jan 3, 2023 · 44 revisions

Zulia Java Client

Gradle

repositories {
    mavenCentral()
    maven {
        url "https://maven.ascend-tech.us/repo/"
    }
}


dependencies {
    implementation 'io.zulia:zulia-client:2.7.6'
    implementation 'org.mongodb:mongodb-driver-sync:4.8.1'
}

Maven

<repository>
   <id>astMaven</id>
   <name>AST Maven</name>
   <url>https://maven.ascend-tech.us/repo</url>
</repository>
<dependencies>
  <dependency>
      <groupId>io.zulia</groupId>
      <artifactId>zulia-client</artifactId>
      <version>2.7.6</version>
  </dependency>
  <dependency>
      <groupId>org.mongodb</groupId>
      <artifactId>mongodb-driver-sync</artifactId>
      <version>4.8.1</version>
  </dependency>
</dependencies>

Creating a Client

The Zulia Java client is named ZuliaWorkPool. ZuliaWorkPool is a thread-safe connection pool that uses a gRPC connection to Zulia on the service port. Async versions of all methods are available; they return a ListenableFuture<> of the result.

Simple Client creation

ZuliaWorkPool zuliaWorkPool = new ZuliaWorkPool(new ZuliaPoolConfig().addNode("someIp")); 

Full Client Configuration

ZuliaPoolConfig zuliaPoolConfig = new ZuliaPoolConfig();
zuliaPoolConfig.addNode("someIp");
//optionally give ports if not default values
//zuliaPoolConfig.addNode("localhost", 32191, 32192);

//optional settings (default values shown)
zuliaPoolConfig.setDefaultRetries(0);//Number of attempts to try before throwing an exception
zuliaPoolConfig.setMaxConnections(10); //Maximum connections per server
zuliaPoolConfig.setMaxIdle(10); //Maximum idle connections per server
zuliaPoolConfig.setCompressedConnection(false); //Use this for WAN client connections
zuliaPoolConfig.setPoolName(null); //For logging purposes only, null gives default of zuliaPool-n
zuliaPoolConfig.setNodeUpdateEnabled(true); //Periodically update the list of cluster nodes, which enables smart routing to the correct node. Do not use this with ssh port forwarding. Nodes can also be updated manually with zuliaWorkPool.updateNodes();
zuliaPoolConfig.setNodeUpdateInterval(10000); //Interval to update the nodes in ms
zuliaPoolConfig.setRoutingEnabled(true); //Route indexing requests to the correct server. This only works if automatic node updating is enabled or zuliaWorkPool.updateNodes() is called periodically.

//create the connection pool
ZuliaWorkPool zuliaWorkPool = new ZuliaWorkPool(zuliaPoolConfig);

Creating an Index

Basic Creation

ClientIndexConfig indexConfig = new ClientIndexConfig().setIndexName("test").addDefaultSearchField("test");
indexConfig.addFieldConfig(FieldConfigBuilder.createString("title").indexAs(DefaultAnalyzers.STANDARD));
indexConfig.addFieldConfig(FieldConfigBuilder.createString("issn").indexAs(DefaultAnalyzers.LC_KEYWORD).facet());
indexConfig.addFieldConfig(FieldConfigBuilder.createInt("an").index().sort());
// createLong, createFloat, createDouble, createBool, createDate, createVector, and createUnitVector are also available
// or create(storedFieldName, fieldType)

CreateIndex createIndex = new CreateIndex(indexConfig);
zuliaWorkPool.createIndex(createIndex);
  • Calling create index again will update the index settings. However, the number of shards cannot be changed once the index is created. The number of shards can be greater than the number of nodes to future-proof the index if using sharding. Also see UpdateIndex for partial index setting updates.
  • Changing or adding analyzers for fields that are already indexed may require re-indexing for desired results.
  • Zulia supports indexes created from object annotations. For more info see section on Object Persistence.

Index Config Details

Full ClientIndexConfig settings are explained below:

defaultSearchField - The field that is searched if no field is given to a query (missing query fields or direct fielded search)
defaultAnalyzer - The default analyzer for all fields not specified by a field config
fieldConfig - Overrides the default analyzer for a field
shardCommitInterval - Indexes or deletes to shard before a commit is forced (default 3200)
idleTimeWithoutCommit - Time without indexing before commit is forced in seconds (0 disables) (default 30)
applyUncommitedDeletes - Apply all deletes before search (default true)
shardQueryCacheSize - Number of queries cached at the shard level
shardQueryCacheMaxAmount - Queries with more than this amount of documents returned are not cached

//The following are used in optimizing federation of shards when more than one shard is used. 
//The amount requested from each shard on a query is (((amountRequestedByQuery / numberOfShards) + minShardRequest) * requestFactor).
requestFactor - Used in calculation of request size for a shard (default 2.0)
minShardRequest - Added to the calculated request for a shard (default 2)
shardTolerance - Difference in scores between shards tolerated before requesting full results (query request amount) from the shard (default 0.05)
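
As a worked example of the per-shard request formula above, the following sketch evaluates it in plain Java. It is not Zulia code: the class and method names are hypothetical, and the use of floating-point division with no rounding is an assumption, since this page does not say how Zulia rounds the result.

```java
// Illustration only: evaluates the formula from this page, not Zulia's implementation.
final class ShardRequestSketch {

    // (((amountRequestedByQuery / numberOfShards) + minShardRequest) * requestFactor)
    // Assumption: floating-point division and no rounding of the result.
    static double perShardRequest(int amountRequestedByQuery, int numberOfShards,
            int minShardRequest, double requestFactor) {
        if (numberOfShards <= 0) {
            throw new IllegalArgumentException("numberOfShards must be positive");
        }
        double evenShare = (double) amountRequestedByQuery / numberOfShards;
        return (evenShare + minShardRequest) * requestFactor;
    }

    public static void main(String[] args) {
        // 100 results over 4 shards with the defaults shown above (minShardRequest 2, requestFactor 2.0)
        System.out.println(perShardRequest(100, 4, 2, 2.0)); // 54.0
        // 10 results over 5 shards with the same defaults
        System.out.println(perShardRequest(10, 5, 2, 2.0)); // 8.0
    }
}
```

With the defaults, each shard is asked for roughly twice its even share plus a small margin, which reduces how often a full request to a shard is needed.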

These Field Types are Available

STRING
NUMERIC_INT
NUMERIC_LONG
NUMERIC_FLOAT 
NUMERIC_DOUBLE
DATE
BOOL 
UNIT_VECTOR
VECTOR

These built-in Analyzers are available (DefaultAnalyzers)

KEYWORD - Field is searched as one token
LC_KEYWORD - Field is searched as one token in lowercase (case insensitive, use for wildcard searches)
LC_CONCAT_ALL
STANDARD - Standard lucene analyzer (good for general full text)
MIN_STEM - Minimal English Stemmer
KSTEMMED - K Stemmer
LSH - Locality Sensitive Hash
TWO_TWO_SHINGLE - (n-grams)
THREE_THREE_SHINGLE - (n-grams)

Custom Analyzer

clientIndexConfig.addAnalyzerSetting("myAnalyzer", Tokenizer.WHITESPACE, Arrays.asList(Filter.ASCII_FOLDING, Filter.LOWERCASE), Similarity.BM25);
clientIndexConfig.addFieldConfig(FieldConfigBuilder.create("abstract", FieldType.STRING).indexAs("myAnalyzer"));

Index Metadata

clientIndexConfig.setMeta(new Document("category", "special").append("otherKey", 10));

Warmed Searches

Search search1 = new Search("someIndex").addQuery(new FilterQuery("the best query")).setSearchLabel("custom");
Search search2 = new Search("someIndex").addQuery(new FilterQuery("the worst query")).setSearchLabel("mine");
clientIndexConfig.addWarmingSearch(search1);
clientIndexConfig.addWarmingSearch(search2);

Update Index

To replace the entire index config, use the CreateIndex command. For partial updates, use UpdateIndex.

Basic Usage

UpdateIndex updateIndex = new UpdateIndex("someIndex");
// ... any set of changes listed below (can change multiple things at once) 
UpdateIndexResult updateIndexResult = zuliaWorkPool.updateIndex(updateIndex);
// full index settings are returned after the change that can be accessed if needed
IndexSettings fullIndexSettings = updateIndexResult.getFullIndexSettings();

Numeric Settings

UpdateIndex updateIndex = new UpdateIndex("someIndex");

// selectively call setXXX on the settings that you want to change
// if set is not called there will be no changes to that setting 
updateIndex.setIndexWeight(10);
// ...
zuliaWorkPool.updateIndex(updateIndex);

Add/Change Field(s)

UpdateIndex updateIndex = new UpdateIndex("someIndex");
// if a field myField or otherField exists, it will be updated with these settings
FieldConfigBuilder myField = FieldConfigBuilder.createString("myField").indexAs(DefaultAnalyzers.STANDARD).sort();
FieldConfigBuilder otherField = FieldConfigBuilder.createString("otherField").indexAs(DefaultAnalyzers.LC_KEYWORD).sort();
updateIndex.mergeFieldConfig(myField, otherField);
zuliaWorkPool.updateIndex(updateIndex);

Replace Fields

UpdateIndex updateIndex = new UpdateIndex("someIndex");
// replaces all fields with the two fields given
FieldConfigBuilder myField = FieldConfigBuilder.createString("myField").indexAs(DefaultAnalyzers.STANDARD).sort();
FieldConfigBuilder otherField = FieldConfigBuilder.createString("otherField").indexAs(DefaultAnalyzers.LC_KEYWORD).sort();
updateIndex.replaceFieldConfig(myField, otherField);
zuliaWorkPool.updateIndex(updateIndex);

Remove Fields

UpdateIndex updateIndex = new UpdateIndex("someIndex");
// removes the stored field with name myField if it exists 
updateIndex.removeFieldConfigByStoredName(List.of("myField"));
zuliaWorkPool.updateIndex(updateIndex);

Add/Change Custom Analyzers

UpdateIndex updateIndex = new UpdateIndex("someIndex");
// if an analyzer custom or mine exists, it will be updated with these settings, otherwise they are added
ZuliaIndex.AnalyzerSettings custom = ZuliaIndex.AnalyzerSettings.newBuilder().setName("custom").addFilter(Filter.LOWERCASE).build();
ZuliaIndex.AnalyzerSettings mine = ZuliaIndex.AnalyzerSettings.newBuilder().setName("mine").addFilter(Filter.LOWERCASE).addFilter(Filter.BRITISH_US)
        .build();
updateIndex.mergeAnalyzerSettings(custom, mine);
zuliaWorkPool.updateIndex(updateIndex);

Replace Custom Analyzers

UpdateIndex updateIndex = new UpdateIndex("someIndex");
// replaces all analyzers with the two custom analyzers given
ZuliaIndex.AnalyzerSettings custom = ZuliaIndex.AnalyzerSettings.newBuilder().setName("custom").addFilter(Filter.LOWERCASE).build();
ZuliaIndex.AnalyzerSettings mine = ZuliaIndex.AnalyzerSettings.newBuilder().setName("mine").addFilter(Filter.LOWERCASE).addFilter(Filter.BRITISH_US)
        .build();
updateIndex.replaceAnalyzerSettings(custom, mine);
zuliaWorkPool.updateIndex(updateIndex);

Remove Custom Analyzer

UpdateIndex updateIndex = new UpdateIndex("someIndex");
// removes the analyzer with name myCustomOne if it exists
updateIndex.removeAnalyzerSettingsByName(List.of("myCustomOne"));
zuliaWorkPool.updateIndex(updateIndex);

Add/Change Warmed Searches

UpdateIndex updateIndex = new UpdateIndex("someIndex");
// if a warmed search with search label custom or mine exists, it will be updated with these settings, otherwise they are added
Search search1 = new Search("someIndex").addQuery(new FilterQuery("the best query")).setSearchLabel("custom");
Search search2 = new Search("someIndex").addQuery(new FilterQuery("the worst query")).setSearchLabel("mine");
updateIndex.mergeWarmingSearches(search1, search2);
zuliaWorkPool.updateIndex(updateIndex);

Replace Warmed Searches

UpdateIndex updateIndex = new UpdateIndex("someIndex");
// replaces all warmed searches with the given warmed searches
Search search1 = new Search("someIndex").addQuery(new FilterQuery("some stuff")).setSearchLabel("the best label");
Search search2 = new Search("someIndex").addQuery(new FilterQuery("more stuff")).setSearchLabel("the good label");
updateIndex.replaceWarmingSearches(search1, search2);
zuliaWorkPool.updateIndex(updateIndex);

Remove Warmed Search

UpdateIndex updateIndex = new UpdateIndex("someIndex");
// removes the warmed search with search label myCustomOne if it exists
updateIndex.removeWarmingSearchesByLabel(List.of("myCustomOne"));
zuliaWorkPool.updateIndex(updateIndex);

Add/Change Metadata

UpdateIndex updateIndex = new UpdateIndex("someIndex");
// replaces key someKey with value 5 and otherKey with value "a string" if they exist, otherwise adds them to the metadata (putAll with new metadata)
updateIndex.mergeMetadata(new Document().append("someKey", 5).append("otherKey", "a string"));
zuliaWorkPool.updateIndex(updateIndex);

Replace Metadata

UpdateIndex updateIndex = new UpdateIndex("someIndex");
// replaces metadata document with the document below
updateIndex.replaceMetadata(new Document().append("stuff", "for free"));
zuliaWorkPool.updateIndex(updateIndex);

Remove Metadata

UpdateIndex updateIndex = new UpdateIndex("someIndex");
// removes the keys below from the metadata object if they exist
updateIndex.removeMetadataByKey(List.of("oneKey", "twoKey", "redKey", "blueKey"));
zuliaWorkPool.updateIndex(updateIndex);

Delete Index

Basic Delete

zuliaWorkPool.deleteIndex("myIndex");

Delete Index and Associated Files

DeleteIndex deleteIndex = new DeleteIndex("myIndex").setDeleteAssociated(true);
zuliaWorkPool.deleteIndex(deleteIndex);

Storing / Indexing Documents

Zulia supports indexing and storing from object annotations. For more info, see the section on Object Persistence.

Result Document Storage

Simple Store

Document document = new Document();
document.put("id", "myid222");
document.put("title", "Magic Java Beans");
document.put("issn", "4321-4321");

Store store = new Store("myid222", "myIndexName").setResultDocument(document);
zuliaWorkPool.store(store);

Simple Store Json

String json = """
        {
          "documentId": "someId",
          "docType": "pdf",
          "docAuthor": "Java Developer Zone",
          "docTitle": "Elastic Search Blog",
          "isParent": false,
          "parentDocId": 1,
          "docLanguage": [
            "en",
            "czech"
          ]
        }""";

Store store = new Store("someId", "myIndexName").setResultDocument(json);
zuliaWorkPool.store(store);

Store with Metadata

Document document = new Document();
document.put("id", "myid222");
document.put("title", "Magic Java Beans");
document.put("issn", "4321-4321");

Store store = new Store("myid222", "myIndexName");

ResultDocBuilder resultDocumentBuilder = new ResultDocBuilder().setDocument(document);
//optional metadata document 
resultDocumentBuilder.setMetadata(new Document().append("test1", "val1").append("test2", "val2"));
store.setResultDocument(resultDocumentBuilder);

zuliaWorkPool.store(store);

Storing Associated Documents

AssociatedBuilder associatedBuilder = new AssociatedBuilder();
associatedBuilder.setFilename("myfile2.txt");
// either set as text
associatedBuilder.setDocument("Some Text3");
// or as bytes
associatedBuilder.setDocument(new byte[]{0, 1, 2, 3});
associatedBuilder.setMetadata(new Document().append("mydata", "myvalue2").append("sometypeinfo", "text file2"));

//can be part of the same store request as the document
Store store = new Store("myid123", "someIndex");

//multiple associated documents can be added at once
store.addAssociatedDocument(associatedBuilder);

zuliaWorkPool.store(store);

Storing Large Associated Documents (Streaming)

StoreLargeAssociated storeLargeAssociated = new StoreLargeAssociated("myid333", "myIndexName", "myfilename", new File("/tmp/myFile"));
zuliaWorkPool.storeLargeAssociated(storeLargeAssociated);

Fetching Documents

Fetch Document

FetchDocument fetchDocument = new FetchDocument("myid222", "myIndex");

FetchResult fetchResult = zuliaWorkPool.fetch(fetchDocument);

if (fetchResult.hasResultDocument()) {
    Document document = fetchResult.getDocument();

    //Get optional Meta
    Document meta = fetchResult.getMeta();
}

Fetch All Associated

FetchAllAssociated fetchAssociated = new FetchAllAssociated("myid123", "myIndexName");

FetchResult fetchResult = zuliaWorkPool.fetch(fetchAssociated);

if (fetchResult.hasResultDocument()) {
    Document object = fetchResult.getDocument();

    //Get optional metadata
    Document meta = fetchResult.getMeta();
}

for (AssociatedResult ad : fetchResult.getAssociatedDocuments()) {
    //use correct function for document type
    String text = ad.getDocumentAsUtf8();
    // OR
    byte[] documentAsBytes = ad.getDocumentAsBytes();

    //get optional metadata
    Document meta = ad.getMeta();

    String filename = ad.getFilename();

}

Fetch Associated

FetchAssociated fetchAssociated = new FetchAssociated("myid123", "myIndexName", "myfile2");

FetchResult fetchResult = zuliaWorkPool.fetch(fetchAssociated);


AssociatedResult ad = fetchResult.getFirstAssociatedDocument();
//use correct function for document type
String text = ad.getDocumentAsUtf8();
// OR
byte[] documentAsBytes = ad.getDocumentAsBytes();

//get optional metadata
Document meta = ad.getMeta();

String filename = ad.getFilename();

Fetch Large Associated (Streaming)

FetchLargeAssociated fetchLargeAssociated = new FetchLargeAssociated("myid333", "myIndexName", "myfilename", new File("/tmp/myFetchedFile"));
zuliaWorkPool.fetchLargeAssociated(fetchLargeAssociated);

Querying

Simple Query with only ids returned

Search search = new Search("myIndexName").setAmount(10);
search.addQuery(new ScoredQuery("issn:1234-1234 AND title:special"));
search.setResultFetchType(ZuliaQuery.FetchType.NONE); // just return the score and unique id 

SearchResult searchResult = zuliaWorkPool.search(search);

long totalHits = searchResult.getTotalHits();

System.out.println("Found <" + totalHits + "> hits");
for (CompleteResult completeResult : searchResult.getCompleteResults()) {
    System.out.println("Matching document <" + completeResult.getUniqueId() + "> with score <" + completeResult.getScore() + ">");
}

Simple Query with full documents returned

Search search = new Search("myIndexName").setAmount(10);
search.addQuery(new ScoredQuery("issn:1234-1234 AND title:special"));
search.setResultFetchType(ZuliaQuery.FetchType.FULL); //return the full bson document that was stored

SearchResult searchResult = zuliaWorkPool.search(search);

long totalHits = searchResult.getTotalHits();

System.out.println("Found <" + totalHits + "> hits");
for (Document document : searchResult.getDocuments()) {
    System.out.println("Matching document <" + document + ">");
}

Caching

// make sure this search stays in the query cache until the index is changed or zulia is restarted
Search search = new Search("myIndexName").setAmount(10);
search.addQuery(new ScoredQuery("issn:1234-1234 AND title:special"));
search.setPinToCache(true);

// Alternatively, a search can be forced to not be cached. Searches that return more results than shardQueryCacheMaxAmount are not cached regardless
search.setDontCache(true);

Search Multiple Indexes

Search search = new Search("myIndexName", "myOtherIndex").setAmount(10);
search.addQuery(new ScoredQuery("issn:1234-1234 AND title:special"));


SearchResult searchResult = zuliaWorkPool.search(search);

long totalHits = searchResult.getTotalHits();

System.out.println("Found <" + totalHits + "> hits");
for (CompleteResult completeResult : searchResult.getCompleteResults()) {
    Document doc = completeResult.getDocument();
    System.out.println("Matching document <" + completeResult.getUniqueId() + "> with score <" + completeResult.getScore() + "> from index <" + completeResult.getIndexName() + ">");
    System.out.println(" full document <" + doc + ">");
}

Sorting

Search search = new Search("myIndexName").setAmount(100);
search.addQuery(new FilterQuery("title:(brown AND bear)"));
// can add multiple sorts with ascending or descending (default ascending)
// can also specify whether missing values are returned first or last (default missing first)
search.addSort(new Sort("year").descending());
search.addSort(new Sort("journal").ascending().missingLast());
SearchResult searchResult = zuliaWorkPool.search(search);

Query Fields

Query fields set the search fields used when a term is not qualified with a field. If query fields are not set on the query and a term is not qualified, the default search fields on the index will be used.

Query Fields Given

Search search = new Search("myIndexName").setAmount(100);

// search for lung in title,abstract AND cancer in title,abstract AND treatment in title
search.addQuery(new ScoredQuery("lung cancer title:treatment").addQueryFields("title", "abstract").setDefaultOperator(ZuliaQuery.Query.Operator.AND));

Default Query Fields

// search for lung in default index fields OR cancer in default index fields
// OR is the default operator unless set
Search search = new Search("myIndexName").setAmount(100);
search.addQuery(new ScoredQuery("lung cancer"));

Wildcard Query Fields

Search search = new Search("myIndexName").setAmount(100);

// search for lung in abstract or any field starting with title AND cancer in abstract or any field starting with title
// can also use title*:someTerm in a query, see Query Syntax Documentation
search.addQuery(new ScoredQuery("lung cancer").addQueryFields("title*", "abstract").setDefaultOperator(ZuliaQuery.Query.Operator.AND));

Highlighting

Search search = new Search("myIndexName").setAmount(100);
search.addQuery(new ScoredQuery("lung cancer").addQueryFields("title").setDefaultOperator(ZuliaQuery.Query.Operator.AND));

//can optionally set the pre and post tags for the highlight and the number of fragments on the Highlight object
search.addHighlight(new Highlight("title"));

SearchResult searchResult = zuliaWorkPool.search(search);

for (CompleteResult completeResult : searchResult.getCompleteResults()) {
    Document document = completeResult.getDocument();
    List<String> titleHighlightsForDoc = completeResult.getHighlightsForField("title");
}

Filter Queries

Filter queries are the same as scored queries except they do not require the search engine to compute a score. They should be used in cases where a sort is being applied and a score is not needed or when a filter should not influence the relevance score. Filter queries and scored queries can be combined together.

Search search = new Search("myIndexName").setAmount(100);
// include only years 2020 forward
search.addQuery(new FilterQuery("year:[2020 TO *]"));
// require both terms to be matched in either the title or abstract
search.addQuery(new FilterQuery("cheetah cub").setDefaultOperator(Operator.AND).addQueryFields("title", "abstract"));
// require two out of the three terms in the abstract
search.addQuery(new FilterQuery("sleep play run").setMinShouldMatch(2).addQueryField("abstract"));
// exclude the journal nature
search.addQuery(new FilterQuery("journal:Nature").exclude());
SearchResult searchResult = zuliaWorkPool.search(search);

Query Helpers

FilterFactory for numerics

Search search = new Search("myIndexName");
// Search for pub years in range [2015, 2020]
search.addQuery(FilterFactory.rangeInt("pubYear").setRange(2015, 2020));

search = new Search("myIndexName");
// Search for pubs for any year before 2020
search.addQuery(FilterFactory.rangeInt("pubYear").setMaxValue(2020).setEndpointBehavior(RangeBehavior.EXCLUSIVE));

Values for tokens

String query;

// (a OR b)
query = Values.any().of("a", "b").asString();

// ("slow cat" OR "Pink Shirt")
query = Values.any().of("slow cat", "Pink Shirt").asString();

// ("slow cat" AND "Pink Shirt")
Function<String, String> quoteAndTrim = s -> Values.VALUE_QUOTER.apply(s).trim(); // Values.VALUE_QUOTER is default value handler
query = Values.all().valueHandler(quoteAndTrim).of("   slow cat   ", "   Pink Shirt ").asString();

// title,abstract:(a OR b)
query = Values.any().of("a", "b").withFields("title", "abstract").asString();

// -title,abstract:(a OR b OR c)
query = Values.any().of("a", "b", "c").withFields("title", "abstract").exclude().asString();

// title,abstract:("fast dog" OR b OR c)~2
query = Values.atLeast(2).of("fast dog", "b", "c").withFields("title", "abstract").asString();



// -title,abstract:(a OR b OR c)~2
query = Values.atLeast(2).of("a", "b", "c").withFields("title", "abstract").exclude().asString();

FilterQuery fq;
// fq = new FilterQuery("\"fast dog\" b c").setDefaultOperator(ZuliaQuery.Query.Operator.OR).exclude().addQueryFields("title", "abstract").setMinShouldMatch(2)
fq = Values.atLeast(2).of("fast dog", "b", "c").withFields("title", "abstract").exclude().asFilterQuery();

ScoredQuery sq;
// sq = new ScoredQuery("\"slow cat\" b c").setDefaultOperator(ZuliaQuery.Query.Operator.OR).addQueryFields("title", "abstract").setMinShouldMatch(2);
sq = Values.atLeast(2).of("slow cat", "b", "c").withFields("title", "abstract").asScoredQuery();

Term Queries

Optimized search for many terms. Terms given are not analyzed, so they must match exactly what is in the search engine. This is most useful for fields such as ids that are not analyzed (KEYWORD) or are only lightly analyzed with something like LC_KEYWORD (lowercase keyword).

Search search = new Search("myIndexName").setAmount(100);

// search for the terms 1,2,3,4 in the field id
search.addQuery(new TermQuery("id").addTerms("1", "2", "3", "4"));

SearchResult searchResult = zuliaWorkPool.search(search);

Numeric Set Queries

Optimized search for many numeric terms.

Search search = new Search("myIndexName").setAmount(100);
//search for values 1, 5, 7, 9 in the field intField
search.addQuery(new NumericSetQuery("intField").addValues(1, 5, 7, 9));

Vector Queries

Vector Indexing and Basic Queries

// create an index with a vector field config
ClientIndexConfig indexConfig = new ClientIndexConfig();

// call createVector or createUnitVector depending on if the vector is unit normalized
indexConfig.addFieldConfig(FieldConfigBuilder.createUnitVector("v").index());
// ...
indexConfig.setIndexName("vectorTestIndex");
// could also use updateIndex with mergeFieldConfig to add a vector field to an existing index
zuliaWorkPool.createIndex(indexConfig);


// store some documents with a vector field
Document mongoDocument = new Document();
float[] vector = new float[]{ 0, 0, 0.70710678f, 0.70710678f };
mongoDocument.put("v", Floats.asList(vector));
Store s = new Store("someId", "vectorTestIndex").setResultDocument(mongoDocument);
zuliaWorkPool.store(s);

Search search = new Search("vectorTestIndex").setAmount(100);
// returns the top 3 documents closest to [1.0,0,0,0] in the field v
search.addQuery(new VectorTopNQuery(new float[] { 1.0f, 0.0f, 0.0f, 0.0f }, 3, "v"));

SearchResult searchResult = zuliaWorkPool.search(search);
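
Since createUnitVector is meant for unit-normalized vectors, it can help to normalize or check a vector before storing it. The following is a minimal plain-Java sketch, not part of the Zulia client: the class name, method names, and tolerance are hypothetical, and this page does not state whether Zulia validates vector length itself.

```java
import java.util.Arrays;

// Illustration only: helper math for preparing vectors, not Zulia client code.
final class UnitVectorSketch {

    // Returns a copy of v scaled to Euclidean length 1.
    static float[] normalize(float[] v) {
        double sumOfSquares = 0.0;
        for (float x : v) {
            sumOfSquares += (double) x * x;
        }
        double length = Math.sqrt(sumOfSquares);
        if (length == 0.0) {
            throw new IllegalArgumentException("a zero vector cannot be normalized");
        }
        float[] unit = new float[v.length];
        for (int i = 0; i < v.length; i++) {
            unit[i] = (float) (v[i] / length);
        }
        return unit;
    }

    // True if the Euclidean length of v is within tolerance of 1 (tolerance is an assumed value).
    static boolean isUnitLength(float[] v, double tolerance) {
        double sumOfSquares = 0.0;
        for (float x : v) {
            sumOfSquares += (double) x * x;
        }
        return Math.abs(Math.sqrt(sumOfSquares) - 1.0) <= tolerance;
    }

    public static void main(String[] args) {
        float[] raw = { 0f, 0f, 1f, 1f };
        float[] unit = normalize(raw);
        System.out.println(Arrays.toString(unit));
        System.out.println(isUnitLength(raw, 1e-6));  // false
        System.out.println(isUnitLength(unit, 1e-6)); // true
    }
}
```

The vector stored in the example above, { 0, 0, 0.70710678f, 0.70710678f }, already has unit length, which is why it suits a field created with createUnitVector.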

Pre Filters with Vector Queries

Search search = new Search("vectorTestIndex").setAmount(100);
// filters for blue in the description then returns the top 3 documents closest to [1.0,0,0,0] in the field v 
StandardQuery descriptionQuery = new FilterQuery("blue").addQueryField("description");
search.addQuery(new VectorTopNQuery(new float[] { 1.0f, 0.0f, 0.0f, 0.0f }, 3, "v").addPreFilterQuery(descriptionQuery));

Post Filters with Vector Queries

Search search = new Search("vectorTestIndex").setAmount(100);
// returns the top 3 documents closest to [1.0,0,0,0] in the field v, then filters for red in the description (possibly fewer than 3 results now)
search.addQuery(new VectorTopNQuery(new float[] { 1.0f, 0.0f, 0.0f, 0.0f }, 3, "v"));
search.addQuery(new FilterQuery("red").addQueryField("description"));

Count Facets

// Can set number of documents to return to 0 or omit setAmount unless you want the documents at the same time
// normally is combined with a FilterQuery or ScoredQuery to count a set of results
Search search = new Search("myIndexName").setAmount(0);

search.addCountFacet(new CountFacet("issn").setTopN(20));

SearchResult searchResult = zuliaWorkPool.search(search);
for (ZuliaQuery.FacetCount fc : searchResult.getFacetCounts("issn")) {
    System.out.println("Facet <" + fc.getFacet() + "> with count <" + fc.getCount() + ">");
}

Numeric Stat

// show number of values, number of documents, min, max, and sum for field pubYear
// normally is combined with a FilterQuery or ScoredQuery to count a set of results
Search search = new Search("myIndexName").setAmount(100);
search.addStat(new NumericStat("pubYear"));
SearchResult searchResult = zuliaWorkPool.search(search);
ZuliaQuery.FacetStats pyFieldStat = searchResult.getNumericFieldStat("pubYear");
System.out.println(pyFieldStat.getMin()); // minimum value for the field
System.out.println(pyFieldStat.getMax()); // maximum value for the field
System.out.println(pyFieldStat.getSum()); // sum of the values for the field, use one of the counts below for the average/mean
System.out.println(pyFieldStat.getDocCount()); // count of documents with the field not null
System.out.println(pyFieldStat.getAllDocCount()); // count of documents matched by the query
System.out.println(pyFieldStat.getValueCount()); // count of total number of values in the field (equal to document count except for multivalued fields)
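
The comment on the sum above mentions using one of the counts for the average. As a minimal sketch of that arithmetic, the helper below takes the sum and a count as plain numbers; it is hypothetical, not part of the Zulia client, and assumes the values have already been read from the stat object.

```java
// Illustration only: plain arithmetic on values read from a stat, not Zulia client code.
final class StatMeanSketch {

    // Divides the sum by the chosen count; returns NaN when there is nothing to average.
    static double mean(double sum, long count) {
        if (count == 0) {
            return Double.NaN;
        }
        return sum / count;
    }

    public static void main(String[] args) {
        double sum = 6060.0;  // e.g. the sum reported for the field
        long docCount = 3;    // documents where the field is not null
        long valueCount = 4;  // total values (larger than docCount for multivalued fields)

        System.out.println(mean(sum, docCount));   // mean per document: 2020.0
        System.out.println(mean(sum, valueCount)); // mean per value: 1515.0
    }
}
```

Dividing by the document count gives the average per document that has the field, while dividing by the value count gives the average per individual value; the two differ only for multivalued fields.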

Numeric Stat with Percentiles

List<Double> percentiles = List.of(
	0.0,  // 0th percentile (min) - can be retrieved without percentiles
	0.25, // 25th percentile
	0.50, // median
	0.75, // 75th percentile
	1.0   // 100th percentile (max) - can be retrieved without percentiles
);

Search search = new Search("myIndexName");
// Get the requested percentiles within 1% of their true value
search.addStat(new NumericStat("pubYear").setPercentiles(percentiles).setPercentilePrecision(0.01));
SearchResult searchResult = zuliaWorkPool.search(search);
for (ZuliaQuery.Percentile percentile : searchResult.getNumericFieldStat("pubYear").getPercentilesList()) {
	System.out.println(percentile.getPoint() + " -> " + percentile.getValue());
}

Stat Facet

// return the highest sum on author count for each journal name
Search search = new Search("myIndexName").setAmount(100);
search.addStat(new StatFacet("authorCount", "journalName"));
SearchResult searchResult = zuliaWorkPool.search(search);

// journals ordered by the sum of author count
List<ZuliaQuery.FacetStats> authorCountForJournalName = searchResult.getFacetFieldStat("authorCount", "journalName");
for (ZuliaQuery.FacetStats journalStats : authorCountForJournalName) {
    System.out.println(journalStats.getFacet()); // the journal
    System.out.println(journalStats.getMin()); // minimum value of author count for journal 
    System.out.println(journalStats.getMax()); // maximum value of author count for journal
    System.out.println(journalStats.getSum()); // sum of the values of author count for journal, use counts below for average/mean
    System.out.println(journalStats.getDocCount()); // count of documents for the journal where the author count not null
    System.out.println(journalStats.getAllDocCount()); // count of documents for the journal
    System.out.println(journalStats.getValueCount()); // count of total number of values of author count for the journal (equal to document count except for multivalued fields)
}

Stat Facet Percentiles

//get the 25th percentile, median, and 75th percentile of author count for the journal names
Search search = new Search("myIndexName").setAmount(100);
search.addStat(new StatFacet("authorCount", "journalName").setPercentiles(List.of(0.25, 0.5, 0.75)).setPercentilePrecision(0.01));
SearchResult searchResult = zuliaWorkPool.search(search);

// journals ordered by the sum of author count
List<ZuliaQuery.FacetStats> authorCountForJournalName = searchResult.getFacetFieldStat("authorCount", "journalName");
for (ZuliaQuery.FacetStats journalStats : authorCountForJournalName) {
    for (ZuliaQuery.Percentile percentile : journalStats.getPercentilesList()) {
        System.out.println(percentile.getPoint() + " -> " + percentile.getValue());
    }
    // journalStats also will have facet, min, max, sum, and counts as other example
}

Drilling Down Facets

Search search = new Search("myIndexName").setAmount(100);
search.addFacetDrillDown("issn", "1111-1111");
SearchResult searchResult = zuliaWorkPool.search(search);

Getting the second page of results with a cursor

Search search = new Search("myIndexName");
search.setAmount(100);
search.addQuery(new ScoredQuery("issn:1234-1234 AND title:special"));

// on a changing index a sort on id is necessary
// it can be a sort on another field AND id as well
search.addSort(new Sort("id"));

SearchResult firstResult = zuliaWorkPool.search(search);

search.setLastResult(firstResult);


SearchResult secondResult = zuliaWorkPool.search(search);
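
To see why a stable sort matters when paging on a changing index, the sketch below contrasts cursor-style paging (continue after the last id seen) with offset paging, using an in-memory sorted set as a stand-in for an index. It only illustrates the general technique; it does not use or reproduce Zulia's implementation, and all names in it are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Illustration only: an in-memory stand-in for an index sorted by id (not Zulia code).
final class CursorPagingSketch {

    // Cursor paging: up to pageSize ids that sort strictly after lastSeenId (null means first page).
    static List<String> pageAfter(NavigableSet<String> sortedIds, String lastSeenId, int pageSize) {
        NavigableSet<String> remaining = (lastSeenId == null) ? sortedIds : sortedIds.tailSet(lastSeenId, false);
        List<String> page = new ArrayList<>();
        for (String id : remaining) {
            if (page.size() == pageSize) {
                break;
            }
            page.add(id);
        }
        return page;
    }

    // Offset paging for comparison: skip a fixed number of results, then take pageSize.
    static List<String> pageAtOffset(NavigableSet<String> sortedIds, int offset, int pageSize) {
        List<String> all = new ArrayList<>(sortedIds);
        int from = Math.min(offset, all.size());
        int to = Math.min(from + pageSize, all.size());
        return new ArrayList<>(all.subList(from, to));
    }

    public static void main(String[] args) {
        NavigableSet<String> ids = new TreeSet<>(List.of("a", "c", "e", "g"));
        List<String> firstPage = pageAfter(ids, null, 2);
        String lastSeen = firstPage.get(firstPage.size() - 1);

        ids.add("b"); // a document is indexed between the two requests

        System.out.println(firstPage);                   // [a, c]
        System.out.println(pageAfter(ids, lastSeen, 2)); // [e, g]
        System.out.println(pageAtOffset(ids, 2, 2));     // [c, e], repeats c
    }
}
```

Because the cursor resumes from the last sorted value rather than from a position, a document added between requests does not shift the next page; this only works when the sort order is unique and stable, which is what sorting on id provides.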

Getting all the results with a cursor

Search search = new Search("myIndexName");
search.setAmount(100); //this will be the page size
search.addQuery(new ScoredQuery("issn:1234-1234 AND title:special"));

// on a changing index a sort on id is necessary
// it can be a sort on another field AND id as well
search.addSort(new Sort("id"));

//variation 1 - requires fetch type full (default)
zuliaWorkPool.searchAllAsDocument(search, document -> {
    // do something with mongo bson document
});

//variation 2 - when score is needed, searching multiple indexes and index name is needed, or fetch type is NONE/META
zuliaWorkPool.searchAllAsScoredResult(search, scoredResult -> {
    System.out.println(scoredResult.getUniqueId() + " has score " + scoredResult.getScore() + " for index " + scoredResult.getIndexName());
    // if result fetch type is full (default)
    Document document = ResultHelper.getDocumentFromScoredResult(scoredResult);
});

//variation 3 - each page is returned as a search result; less convenient but gives access to total hits
zuliaWorkPool.searchAll(search, searchResult -> {
    System.out.println("There are " + searchResult.getTotalHits());

    // variation 3a - requires fetch type full (default)
    for (Document document : searchResult.getDocuments()) {
        // do something with mongo bson document
    }

    // variation 3b - when score is needed, searching multiple indexes and index name is needed, or fetch type is NONE/META
    for (CompleteResult result : searchResult.getCompleteResults()) {
        System.out.println("Result for <" + result.getIndexName() + "> with score <" + result.getScore() + ">");
        //if fetch type is FULL
        Document document = result.getDocument();
    }
});

Deleting

Delete From Index

//Deletes the document from the index but not any associated documents
DeleteFromIndex deleteFromIndex = new DeleteFromIndex("myid111", "myIndexName");
zuliaWorkPool.delete(deleteFromIndex);

Delete Completely

//Deletes the result document, the index documents, and all associated documents for an id
DeleteFull deleteFull = new DeleteFull("myid123", "myIndexName");
zuliaWorkPool.delete(deleteFull);

Delete Single Associated

//Removes a single associated document with the unique id and filename given
DeleteAssociated deleteAssociated = new DeleteAssociated("myid123", "myIndexName", "myfile2");
zuliaWorkPool.delete(deleteAssociated);

Delete All Associated

//Removes all associated documents for the unique id given
DeleteAllAssociated deleteAllAssociated = new DeleteAllAssociated("myid123", "myIndexName");
zuliaWorkPool.delete(deleteAllAssociated);

Other Operations

Get Current Document Count for Index

GetNumberOfDocsResult result = zuliaWorkPool.getNumberOfDocs("myIndexName");
System.out.println(result.getNumberOfDocs());

Get Fields for Index

GetFieldsResult result = zuliaWorkPool.getFields(new GetFields("myIndexName"));
System.out.println(result.getFieldNames());

Get Terms for Field

GetTermsResult getTermsResult = zuliaWorkPool.getTerms(new GetTerms("myIndexName", "title"));
for (ZuliaBase.Term term : getTermsResult.getTerms()) {
    System.out.println(term.getValue() + ": " + term.getDocFreq());
}

Get Cluster Nodes

GetNodesResult getNodesResult = zuliaWorkPool.getNodes();
for (Node node : getNodesResult.getNodes()) {
    System.out.println(node);
}

Async API

Every Method Has a Corresponding Async Version

Executor executor = Executors.newCachedThreadPool();

Search search = new Search("myIndexName").setAmount(10);

ListenableFuture<SearchResult> resultFuture = zuliaWorkPool.searchAsync(search);

Futures.addCallback(resultFuture, new FutureCallback<>() {
    @Override
    public void onSuccess(SearchResult result) {
        // handle the search result
    }

    @Override
    public void onFailure(Throwable t) {
        // handle the failure
    }
}, executor);
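
Guava's `ListenableFuture` extends `java.util.concurrent.Future`, so a caller that prefers to block with a timeout can do so without callbacks. The following is a minimal sketch under that assumption. `BlockingAwaitSketch` and `awaitOrDefault` are hypothetical names, and `CompletableFuture` stands in for the future returned by an async call such as `searchAsync`, so the example runs without the Zulia client.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BlockingAwaitSketch {

    // Waits for any Future up to the given timeout and returns the fallback
    // on timeout, failure, or interruption. Hypothetical helper for illustration.
    public static <T> T awaitOrDefault(Future<T> future, long timeout, TimeUnit unit, T fallback) {
        try {
            return future.get(timeout, unit);
        }
        catch (TimeoutException e) {
            // give up on the pending call
            future.cancel(true);
            return fallback;
        }
        catch (ExecutionException e) {
            // the async call itself failed
            return fallback;
        }
        catch (InterruptedException e) {
            // restore the interrupt flag for callers further up the stack
            Thread.currentThread().interrupt();
            return fallback;
        }
    }

    public static void main(String[] args) {
        Future<String> done = CompletableFuture.completedFuture("42 hits");
        Future<String> never = new CompletableFuture<>();

        System.out.println(awaitOrDefault(done, 1, TimeUnit.SECONDS, "no result"));
        System.out.println(awaitOrDefault(never, 50, TimeUnit.MILLISECONDS, "no result"));
    }
}
```

Swallowing the failure and returning a fallback keeps the sketch short. Real code would likely log or rethrow the cause from the `ExecutionException`.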

Object Persistence / Mapping

Annotated Object Example

@Settings(indexName = "wikipedia", numberOfShards = 16, shardCommitInterval = 6000)
public class Article {

	public Article() {

	}

	@UniqueId
	private String id;

	@Indexed(analyzerName = DefaultAnalyzers.STANDARD)
	private String title;

	@Indexed
	private Integer namespace;

	@DefaultSearch
	@Indexed(analyzerName = DefaultAnalyzers.STANDARD)
	private String text;

	private Long revision;

	@Indexed
	private Integer userId;

	@Indexed(analyzerName = DefaultAnalyzers.STANDARD)
	private String user;

	@Indexed
	private Date revisionDate;

	//Getters and Setters
	//....
}

Creating Index for Annotated Class Example

Mapper<Article> mapper = new Mapper<>(Article.class);
zuliaWorkPool.createIndex(mapper.createOrUpdateIndex());

Storing an Object with Mapper

Article article = new Article();
//...
Store store = mapper.createStore(article);
zuliaWorkPool.store(store);

Querying with Mapper

Search search = new Search("wikipedia").setAmount(10);
search.addQuery(new ScoredQuery("title:technology"));

SearchResult searchResult = zuliaWorkPool.search(search);
List<Article> articles = searchResult.getMappedDocuments(mapper);