base
BaseNode
BaseNode (metadata, model_context, prev_node=None, next_node=None, parent_node=None, child_node=[], embedding=[], id=None)
Lowest-level abstraction for storing interrelated pieces of information; the building block for all other node types.
TextNode
TextNode (text, model_context, metadata, prev_node=None, next_node=None, parent_node=None, child_node=[], embedding=[], auto_embed=True, doc_id=None, source_id=None, id=None, idx_ref=None)
Class for creating chunks of text that carry additional information such as relationships and metadata. It inherits from BaseNode but is geared specifically towards text.
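As a rough illustration of the relationship fields, the sketch below links text chunks through prev/next references. SketchNode and link_chunks are hypothetical stand-ins written for this example, not the library's TextNode; only the field names text, metadata, prev_node, next_node and id are borrowed from the signature above.

```python
from dataclasses import dataclass, field
from typing import Optional
import uuid


# eq=False keeps identity-based comparison, which avoids infinite recursion
# when two nodes reference each other through prev_node/next_node.
@dataclass(eq=False)
class SketchNode:
    """Hypothetical stand-in for a text node; not the library's TextNode."""
    text: str
    metadata: dict = field(default_factory=dict)
    prev_node: Optional["SketchNode"] = None
    next_node: Optional["SketchNode"] = None
    id: str = field(default_factory=lambda: uuid.uuid4().hex)


def link_chunks(chunks, metadata=None):
    """Build one node per chunk and wire up the prev/next relationships."""
    # Each node gets its own copy of the shared metadata.
    nodes = [SketchNode(text=c, metadata=dict(metadata or {})) for c in chunks]
    for left, right in zip(nodes, nodes[1:]):
        left.next_node = right
        right.prev_node = left
    return nodes


nodes = link_chunks(
    ["First chunk.", "Second chunk.", "Third chunk."],
    {"source": "example.txt"},
)
print(nodes[1].prev_node.text)
print(nodes[1].next_node.text)
```

The sketch uses default_factory for the mutable fields so that nodes never share the same list or dict by accident.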
Document
Document (metadata={}, name=None, text=None, prev_node=None, next_node=None, parent_node=None, child_node=[], embedding=[], source_id=None, doc_separator=None, id=None, nodes=[], store=None)
Class that groups information coming from different sources and intended to be stored or integrated with other services. It serves as the centralized source of truth for the information that is transformed into nodes, and it can hold the contents of a document, a webpage, a PDF or any other source of information. Its hash is recalculated whenever the information changes, and the class serves as an interface to the docstore.
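The hash behaviour described above could look roughly like the following sketch. SketchDocument is a hypothetical class written for this example; the choice of SHA-256 over a JSON dump of the text and metadata is an assumption, since the description does not say how the hash is actually computed.

```python
import hashlib
import json


class SketchDocument:
    """Hypothetical document whose hash tracks its text and metadata."""

    def __init__(self, text="", metadata=None):
        self._text = text
        self._metadata = dict(metadata or {})
        self.hash = self._compute_hash()

    def _compute_hash(self):
        # Assumption: hash a canonical JSON dump so key order does not matter.
        payload = json.dumps(
            {"text": self._text, "metadata": self._metadata},
            sort_keys=True,
            default=str,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    @property
    def text(self):
        return self._text

    @text.setter
    def text(self, value):
        # Any change to the content triggers a fresh hash.
        self._text = value
        self.hash = self._compute_hash()


doc = SketchDocument(text="hello", metadata={"source": "web"})
before = doc.hash
doc.text = "hello world"
changed = doc.hash != before
print(changed)
```

A store that keeps the last seen hash per document id could compare it against the current one to decide whether the document's nodes need to be rebuilt.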
TableIndex
TableIndex (df)
Table for representing data to analyze. Rows can be added, and a single class groups everything needed for LLM context. It acts as a node store and can be stored on pillow format. It offers a different paradigm for a more distributed information structure. Nodes can be inserted directly, or documents can be converted to nodes and then inserted. It can interoperate with DuckDB thanks to the Arrow format. By default, the add operation supports adding new nodes or dataframes to the table index.
For the first case, assume there is a list of nodes.
TODO: Create a struct to handle dataframes under the hood, using polars.
- Allow switching back and forth between nodes and dataframes for things like time sorting or other retrieval strategies that can be tried out.
First, we need to determine which keys a TextNode could have so that it can be added to a dataframe.
Each metadata key must be converted to another value. We will convert every column that actually exists.
See how to make it accept nodes and documents.
The TableIndex is both the struct and the index: a hybrid that can serve for distributed storage as pillow.
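The notes above about turning metadata keys into columns and switching between nodes and dataframes can be sketched with plain Python containers. This is a minimal sketch under stated assumptions, not the TableIndex implementation: nodes are modelled as dicts with id, text and metadata keys, the helper names nodes_to_columns, columns_to_nodes and sort_columns are hypothetical, and metadata keys are assumed not to be named id or text.

```python
def nodes_to_columns(nodes):
    """Turn a list of node dicts into a column-oriented table.

    Every metadata key seen on any node becomes a column; a node that
    lacks a key gets None in that column.
    """
    keys = []
    for node in nodes:
        for key in node["metadata"]:
            if key not in keys:
                keys.append(key)
    columns = {"id": [], "text": []}
    columns.update({key: [] for key in keys})
    for node in nodes:
        columns["id"].append(node["id"])
        columns["text"].append(node["text"])
        for key in keys:
            columns[key].append(node["metadata"].get(key))
    return columns


def columns_to_nodes(columns):
    """Rebuild node dicts from the table, dropping missing metadata values."""
    meta_keys = [k for k in columns if k not in ("id", "text")]
    nodes = []
    for i in range(len(columns["id"])):
        metadata = {k: columns[k][i] for k in meta_keys if columns[k][i] is not None}
        nodes.append({"id": columns["id"][i], "text": columns["text"][i], "metadata": metadata})
    return nodes


def sort_columns(columns, by):
    """Reorder every column by the values of one fully populated column."""
    order = sorted(range(len(columns[by])), key=lambda i: columns[by][i])
    return {name: [values[i] for i in order] for name, values in columns.items()}


nodes = [
    {"id": "a", "text": "later note", "metadata": {"created": "2024-03-01", "author": "ana"}},
    {"id": "b", "text": "earlier note", "metadata": {"created": "2024-01-15"}},
]
table = nodes_to_columns(nodes)
# ISO-formatted dates sort correctly as plain strings.
sorted_nodes = columns_to_nodes(sort_columns(table, "created"))
print([n["id"] for n in sorted_nodes])
```

A dict of equal-length lists like table is the kind of input that dataframe constructors such as polars.DataFrame accept, so the same shape could be handed to a real dataframe for sorting or filtering before converting back to nodes.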