ThinkAutomation includes a built-in 'vector database'. A vector database is a type of database designed to store, index, and search data represented as vectors, typically high-dimensional numerical arrays. These vectors are often embeddings: mathematical representations of data such as text, images, audio, or other unstructured content, generated by machine learning models. When searching, instead of looking for exact matches (as traditional databases do), vector databases find similar items using approximate nearest neighbor (ANN) algorithms.
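To make the idea of similarity search concrete, here is a minimal sketch in Python. It is not how ThinkAutomation implements its search: it uses an exact brute-force scan with cosine similarity rather than an ANN index, and the function names, record titles and three-dimensional vectors are all made up for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)

def nearest(query, records, top=2):
    """Exact (brute-force) search: score every record and return the best `top`."""
    scored = [(title, cosine_similarity(query, vec)) for title, vec in records.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top]

# Toy 3-dimensional "embeddings" (hypothetical); real embeddings have
# hundreds or thousands of dimensions.
records = {
    "invoice": [0.9, 0.1, 0.0],
    "receipt": [0.8, 0.2, 0.1],
    "holiday": [0.0, 0.1, 0.9],
}
results = nearest([1.0, 0.0, 0.0], records)
```

An ANN index trades a small amount of accuracy for speed by avoiding the full scan shown here, which matters once a collection holds many records.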
The ThinkAutomation vector database allows you to store vectors for any external data (such as database records, images, or document and email content). With each vector record you also store an external 'title'. When a search is performed, the closest matching titles (and optionally the text itself) are returned in relevancy order. You can then use these title values to look up the actual data and add it to the 'context' for the Ask AI action, or to provide advanced search results.
The ThinkAutomation Embedded Knowledge Store allows you to add the content and embeddings to 'articles' that can then be used as context for the Ask AI action. However, this is limited to about 25,000 articles, since the search is performed in memory. The Vector Database, on the other hand, has no such limit, since the database is maintained on disk.
Collection Name
Title/vector pairs are contained within a Collection. Multiple collections can be used. Collection names can contain letters or numbers only. Collections are global to the ThinkAutomation instance (i.e. the same collection can be used in all Solutions/Automations).
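If collection names are generated by a script or built from other data, it can help to check them against the naming rule first. The following is a minimal sketch that assumes "letters or numbers" means ASCII letters and digits; the product may accept a wider set, and the function name is hypothetical.

```python
import re

# Assumption: only ASCII letters and digits are allowed, and the name is non-empty.
_COLLECTION_NAME = re.compile(r"[A-Za-z0-9]+")

def is_valid_collection_name(name: str) -> bool:
    """Return True if the whole name consists of letters and digits only."""
    return _COLLECTION_NAME.fullmatch(name) is not None
```

For example, `SupportDocs2024` passes this check, while `support docs` and `support-docs` do not.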
From the Vector Operation list, choose: Update, Search, Delete, Drop or Count:
Update
Add or update a record in the vector database collection. If a record with the specified title does not exist, a new record will be added, otherwise the existing record will be updated.
Specify the Title. The title can be any text. This should be some form of unique ID for the record (such as a document title, file path or database primary key).
Specify the Text. This is the text content you want to store vectors (embeddings) for. If you have set up an AI Provider in the ThinkAutomation Server Settings, then you can enable the Get Embeddings option. When the record is saved, the AI Provider will be called to obtain the embeddings, which will then be used as the vectors.
Enable the Save Text option if you want the actual text stored with the vectors in the database. The text can then be returned when a search is performed. If this option is not enabled, only the title and the vectors are stored. You would then use the returned titles to look up the actual text when a search is performed.
If the add is successful then the title value will be assigned to the variable specified in the Assign To list.
You can also specify the vectors in the text itself. This is for use cases where you are obtaining vectors via another method. In this case the Get Embeddings and Save Text options should be disabled.
Note: The number of vector dimensions must be the same for each record in a collection. For example, if the first record added has vectors with 1024 dimensions, then all subsequent records added to the same collection must also have 1024 dimensions. Different collections can have vectors with different dimensions.
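The add-or-update behavior and the dimension rule can be pictured with a small in-memory model. This is only a sketch of the semantics described above, not the product's storage engine; the `Collection` class and its members are hypothetical.

```python
class Collection:
    """Toy in-memory model of one collection: title -> vector."""

    def __init__(self):
        self.records = {}
        self.dimensions = None  # fixed by the first record that is added

    def update(self, title, vector):
        """Add the record, or overwrite the existing record with the same title."""
        if self.dimensions is None:
            self.dimensions = len(vector)
        elif len(vector) != self.dimensions:
            raise ValueError(
                f"expected {self.dimensions} dimensions, got {len(vector)}"
            )
        self.records[title] = list(vector)
        return title  # mirrors the title being assigned to the Assign To variable


docs = Collection()
docs.update("invoice-001", [0.1, 0.2, 0.3])
docs.update("invoice-001", [0.3, 0.2, 0.1])  # same title: updates, does not duplicate
```

Adding a four-dimensional vector to `docs` at this point would raise `ValueError`, while a second `Collection` could use any dimension count it likes.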
Search
Search the vector database for items relevant to the Search Text. You can return the Top x most relevant items, in relevance order. The Relevancy Threshold setting sets the minimum relevancy: items scoring below this percentage are not included. The default is 20%.
If you have set up an AI Provider in the ThinkAutomation Server Settings, then you can enable the Get Embeddings option. Before the search is performed, the AI Provider will be called to obtain the embeddings for the search text, which will then be used as the vectors. The number of vector dimensions for the search text must be the same as the vector dimensions stored in the database.
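As an illustration of how the Top x and Relevancy Threshold settings interact, the sketch below filters a list of scored titles. It assumes the relevancy percentage is simply the similarity score multiplied by 100, which the text does not state, and the function and parameter names are hypothetical.

```python
def filter_results(scored, top=5, threshold_percent=20.0):
    """Drop items below the threshold, then keep the `top` best in relevance order.

    `scored` is a list of (title, similarity) pairs with similarity in 0..1.
    Assumption: relevancy % = similarity * 100.
    """
    kept = [(title, sim) for title, sim in scored if sim * 100 >= threshold_percent]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top]


# Hypothetical scores for four stored titles.
scored = [("a", 0.78), ("b", 0.15), ("c", 0.42), ("d", 0.31)]
best = filter_results(scored, top=2)
```

Here `b` is excluded by the 20% threshold, and `d` passes the threshold but is cut by the Top x limit of 2.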
You can specify the Max Tokens to return. When a record is added to the vector database, the number of tokens used in the text is also saved. Search results will be limited to the max tokens specified. This is useful when using the vector database search along with the Ask AI action.
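One plausible way to picture the Max Tokens limit is as a budget that is spent in relevance order. The sketch below stops at the first result that would exceed the budget; whether ThinkAutomation stops there or skips that item and continues is not stated, so treat this as an assumption. The function name and sample data are hypothetical.

```python
def limit_by_tokens(results, max_tokens):
    """Keep results, in the order given, until the token budget would be exceeded."""
    selected = []
    used = 0
    for item in results:
        if used + item["Tokens"] > max_tokens:
            break  # assumption: stop at the first item that does not fit
        selected.append(item)
        used += item["Tokens"]
    return selected


# Hypothetical results, already sorted by relevance.
results = [
    {"Title": "first", "Tokens": 300},
    {"Title": "second", "Tokens": 500},
    {"Title": "third", "Tokens": 400},
]
within_budget = limit_by_tokens(results, max_tokens=1000)
```

Capping the tokens this way keeps the context passed to the Ask AI action within the size the model can accept.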
In the Return As list select either:
[
  {
    "Title": "About Parker Software",
    "Text": "Parker Software is an independent software house.",
    "Similarity": 0.78213344,
    "Tokens": 4
  },
  { ... }
]
Specify Json if you are searching for items to add as context for the Ask AI action.
Select the variable to receive the results from the Assign To list.
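If you process the Json results outside ThinkAutomation (for example in a script), the array can be parsed and joined into a single context string. This is a minimal sketch: the field names come from the sample above, while `build_context` and the way titles and text are joined are illustrative choices only.

```python
import json

# Sample result in the Json format shown above (one item only).
raw = """[
  {
    "Title": "About Parker Software",
    "Text": "Parker Software is an independent software house.",
    "Similarity": 0.78213344,
    "Tokens": 4
  }
]"""

def build_context(raw_json):
    """Join each result's title and text into one block of context.

    Items without text (for example, saved without Save Text) are skipped.
    """
    items = json.loads(raw_json)
    blocks = [f"{item['Title']}\n{item['Text']}" for item in items if item.get("Text")]
    return "\n\n".join(blocks)


context = build_context(raw)
```

When the text was not saved with the vectors, you would instead use each returned `Title` to look up the original content before building the context.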
Delete
Delete an existing item. Specify the Title to delete. If the delete was successful the title will be returned to the variable specified in the Assign To list.
Drop
Drops the entire collection. If the drop was successful the collection name will be returned to the variable specified in the Assign To list.
Count
Returns the total number of records stored in the specified collection. The count will be returned to the variable specified in the Assign To list.