embed
indexers.VectorStore.embed(query)
Converts text (provided via a VectorStoreEmbedInput object) into vector embeddings using the Vectoriser and returns a VectorStoreEmbedOutput dataframe with columns id, text, and embedding.
Parameters
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| query | VectorStoreEmbedInput | The VectorStoreEmbedInput object containing the strings to be embedded and their ids. | required |
Returns
| Name | Type | Description |
| --- | --- | --- |
| | VectorStoreEmbedOutput | The VectorStoreEmbedOutput object containing the embeddings along with their corresponding ids and texts. |
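To make the shape of the embed output concrete, here is a minimal standalone sketch. It does not call the library: `toy_vectorise` is a hypothetical stand-in for a Vectoriser, and plain lists of dictionaries stand in for the VectorStoreEmbedInput and VectorStoreEmbedOutput objects. Only the three column names (id, text, embedding) come from the description above.

```python
import hashlib


def toy_vectorise(text, dim=4):
    """Hypothetical stand-in for a Vectoriser: hash the text into `dim` floats."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [byte / 255 for byte in digest[:dim]]


def embed(ids, texts):
    """Return one record per input text with the keys id, text and embedding."""
    if len(ids) != len(texts):
        raise ValueError("ids and texts must have the same length")
    return [
        {"id": item_id, "text": text, "embedding": toy_vectorise(text)}
        for item_id, text in zip(ids, texts)
    ]


# Illustrative inputs only.
rows = embed(["q1", "q2"], ["wind turbine engineer", "primary school teacher"])
for row in rows:
    print(row["id"], row["text"], len(row["embedding"]))
```

Each record carries its original id and text alongside the embedding, so results can be joined back to the input without relying on row order.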
Creates a VectorStore instance from stored metadata and Parquet files. This method reads the metadata and vectors from the specified folder, validates the contents, and initializes a VectorStore object with the loaded data. It checks that the metadata contains the required keys, that the Parquet file exists and is not empty, and that the vectoriser class matches the one used to create the vectors. If any check fails, it raises an exception with an appropriate message (see Raises). This method is useful for loading a previously created vector store without reprocessing the original text data.
Parameters
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| folder_path | str | The folder path containing the metadata and Parquet files. | required |
| vectoriser | object | The Vectoriser object used to transform text into vector embeddings. | required |
| hooks | dict | [optional] A dictionary of user-defined hooks for preprocessing and postprocessing. Defaults to None. | None |
Returns
| Name | Type | Description |
| --- | --- | --- |
| | VectorStore | An instance of the VectorStore class. |
Raises
| Name | Type | Description |
| --- | --- | --- |
| | DataValidationError | If input arguments are invalid or if there are issues with the metadata or Parquet files. |
| | ConfigurationError | If there are configuration issues, such as Vectoriser mismatches. |
| | IndexBuildError | If there are failures during loading or parsing the files. |
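The validation steps described for loading a stored VectorStore can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the file names `metadata.json` and `vectors.parquet`, the metadata keys, and the function `validate_store_folder` are all hypothetical, and the three exception classes are local stand-ins that only share their names with the documented ones. The sketch checks that the Parquet file exists and is non-empty by size; it does not parse it.

```python
import json
import tempfile
from pathlib import Path


class DataValidationError(Exception):
    """Local stand-in for the documented exception of the same name."""


class ConfigurationError(Exception):
    """Local stand-in for the documented exception of the same name."""


class IndexBuildError(Exception):
    """Local stand-in for the documented exception of the same name."""


# Hypothetical file names and metadata keys, chosen for illustration only.
METADATA_FILE = "metadata.json"
VECTORS_FILE = "vectors.parquet"
REQUIRED_KEYS = {"vectoriser_class", "num_vectors"}


def validate_store_folder(folder_path, vectoriser):
    """Run the three documented checks and return the parsed metadata."""
    folder = Path(folder_path)
    metadata_path = folder / METADATA_FILE
    if not metadata_path.is_file():
        raise DataValidationError(f"Missing metadata file: {metadata_path}")
    try:
        metadata = json.loads(metadata_path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError) as exc:
        raise IndexBuildError(f"Could not parse {metadata_path}") from exc

    # Check 1: the metadata contains the required keys.
    missing = REQUIRED_KEYS - set(metadata)
    if missing:
        raise DataValidationError(f"Metadata is missing keys: {sorted(missing)}")

    # Check 2: the Parquet file exists and is not empty.
    vectors_path = folder / VECTORS_FILE
    if not vectors_path.is_file() or vectors_path.stat().st_size == 0:
        raise DataValidationError(f"Missing or empty vectors file: {vectors_path}")

    # Check 3: the vectoriser class matches the one recorded at build time.
    expected, actual = metadata["vectoriser_class"], type(vectoriser).__name__
    if expected != actual:
        raise ConfigurationError(f"Vectoriser mismatch: stored {expected}, got {actual}")
    return metadata


class ToyVectoriser:
    """Hypothetical vectoriser; only its class name matters here."""


class OtherVectoriser:
    """A second hypothetical vectoriser used to trigger the mismatch."""


with tempfile.TemporaryDirectory() as folder:
    Path(folder, METADATA_FILE).write_text(
        json.dumps({"vectoriser_class": "ToyVectoriser", "num_vectors": 2}),
        encoding="utf-8",
    )
    # Placeholder bytes, not a real Parquet file; only existence and size are checked.
    Path(folder, VECTORS_FILE).write_bytes(b"placeholder")
    metadata = validate_store_folder(folder, ToyVectoriser())
    try:
        validate_store_folder(folder, OtherVectoriser())
        mismatch_detected = False
    except ConfigurationError:
        mismatch_detected = True
```

Running the cheap checks (keys, file presence, class name) before loading any vectors means a misconfigured store fails fast with a specific error.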
Reverse searches the VectorStore using a VectorStoreReverseSearchInput object and returns the matched results in a VectorStoreReverseSearchOutput object. With partial matching enabled, a document matches when its label starts with the query label.
Parameters
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| query | VectorStoreReverseSearchInput | A VectorStoreReverseSearchInput object containing the text query or list of queries to search for, together with their ids. | required |
| max_n_results | int | [optional] Number of top results to return for each query; set to -1 to return all results. Defaults to 100. | 100 |
| partial_match | bool | [optional] If True, returns results whose document_id starts with the query. Defaults to False. | False |
Returns
| Name | Type | Description |
| --- | --- | --- |
| | VectorStoreReverseSearchOutput | A VectorStoreReverseSearchOutput object containing reverse search results with columns for query_id, query_text, document_id, document_text and any associated metadata columns. |
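The difference between exact and partial matching, and the effect of max_n_results, can be illustrated with a small standalone sketch. It does not use the library: the `reverse_search` function below is a hypothetical re-implementation over plain tuples, it assumes the query text holds the label being looked up, and the document ids and texts are made up.

```python
def reverse_search(queries, documents, max_n_results=100, partial_match=False):
    """Look up documents by id.

    queries: list of (query_id, query_text) pairs; query_text is the label to find.
    documents: list of (document_id, document_text) pairs.
    """
    results = []
    for query_id, query_text in queries:
        if partial_match:
            # Partial match: the document id starts with the query label.
            matches = [doc for doc in documents if doc[0].startswith(query_text)]
        else:
            # Exact match: the document id equals the query label.
            matches = [doc for doc in documents if doc[0] == query_text]
        if max_n_results != -1:  # -1 means "return everything"
            matches = matches[:max_n_results]
        for document_id, document_text in matches:
            results.append(
                {
                    "query_id": query_id,
                    "query_text": query_text,
                    "document_id": document_id,
                    "document_text": document_text,
                }
            )
    return results


# Made-up documents keyed by hierarchical-looking ids.
documents = [
    ("0111", "Growing of cereals"),
    ("0112", "Growing of rice"),
    ("0210", "Forestry"),
]

exact = reverse_search([("q1", "0111")], documents)
prefix = reverse_search([("q2", "01")], documents, partial_match=True)
capped = reverse_search([("q2", "01")], documents, max_n_results=1, partial_match=True)
no_hit = reverse_search([("q3", "01")], documents)
```

Here `exact` contains one row, `prefix` contains the two documents whose ids begin with "01", `capped` keeps only the first of those, and `no_hit` is empty because no document id equals "01" exactly.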
Searches the VectorStore using queries from a VectorStoreSearchInput object and returns ranked results in a VectorStoreSearchOutput object. Working in batches, it converts the user's text queries into vector embeddings, computes their cosine similarity with the stored document vectors, and retrieves the top results.
Parameters
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| query | VectorStoreSearchInput | A VectorStoreSearchInput object containing the text query or list of queries to search for, together with their ids. | required |
| n_results | int | [optional] Number of top results to return for each query. Defaults to 10. | 10 |
| batch_size | int | [optional] The batch size for processing queries. Defaults to 8. | 8 |
Returns
| Name | Type | Description |
| --- | --- | --- |
| | VectorStoreSearchOutput | A VectorStoreSearchOutput object containing search results with columns for query_id, query_text, document_id, document_text, rank, score, and any associated metadata columns. |
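The batch, embed, score and rank steps described for search can be sketched in plain Python. This is a simplified illustration, not the library's code: `toy_vectorise` is a hypothetical bag-of-words stand-in for a Vectoriser, the vocabulary and documents are made up, and the 0-based rank is an assumption, since the text above does not say where ranks start.

```python
import math

# Hypothetical vocabulary for the toy bag-of-words vectoriser.
VOCABULARY = ["teacher", "school", "engineer", "wind"]


def toy_vectorise(text):
    """Hypothetical stand-in for a Vectoriser: count vocabulary words in the text."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCABULARY]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(queries, documents, n_results=10, batch_size=8):
    """queries: (query_id, query_text) pairs; documents: (id, text, vector) triples."""
    results = []
    for start in range(0, len(queries), batch_size):
        batch = queries[start:start + batch_size]
        # Embed the whole batch of query texts in one step.
        query_vectors = [toy_vectorise(text) for _, text in batch]
        for (query_id, query_text), query_vector in zip(batch, query_vectors):
            scored = [
                (cosine_similarity(query_vector, doc_vector), doc_id, doc_text)
                for doc_id, doc_text, doc_vector in documents
            ]
            # Highest similarity first, then keep the top n_results.
            scored.sort(key=lambda item: item[0], reverse=True)
            for rank, (score, doc_id, doc_text) in enumerate(scored[:n_results]):
                results.append(
                    {
                        "query_id": query_id,
                        "query_text": query_text,
                        "document_id": doc_id,
                        "document_text": doc_text,
                        "rank": rank,  # 0-based here; an assumption
                        "score": score,
                    }
                )
    return results


# Made-up documents, embedded once up front as a stored index would be.
documents = [
    (doc_id, text, toy_vectorise(text))
    for doc_id, text in [("d1", "school teacher"), ("d2", "wind engineer")]
]
queries = [("q1", "teacher"), ("q2", "wind engineer")]
results = search(queries, documents, n_results=1, batch_size=1)
```

With `batch_size=1` the two queries are processed in separate batches, and `n_results=1` keeps only the best-scoring document for each, so `results` holds one row per query.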