Data Structure

Note

You can find the data structure in the Prisma schema located at webapp/src/prisma/schema.prisma.
We use PostgreSQL with the pgvector extension to store the data.

Types of Documents

Ada can ingest multiple types of documents:

PDFs
Images
YouTube videos
Websites
...

Document Structure

Documents are processed and stored in a three-level hierarchy:

Document Level
- The top-level entity representing the document
- Contains basic file information, e.g. name, original path, S3 access...
Chunks Level
- Documents are broken down into smaller text chunks for easier indexing
- Each chunk's text is transformed into vector embeddings
- These embeddings are used for semantic search
Metadata Level
- Each chunk can have associated metadata
- Stores contextual information about the chunk, e.g. page number, bounding box
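
To make the hierarchy concrete, here is a minimal TypeScript sketch of the three levels. Every type and field name below (StoredDocument, Chunk, ChunkMetadata, pageNumber, and so on) is a hypothetical illustration, not a copy of the Prisma models; the authoritative definitions are in schema.prisma. The sketch also assumes one optional metadata record per chunk.

```typescript
// Hypothetical shapes for illustration only; the real models are in schema.prisma.

// Metadata level: contextual information about a chunk.
interface ChunkMetadata {
  pageNumber?: number;
  boundingBox?: { x: number; y: number; width: number; height: number };
}

// Chunks level: a piece of document text plus its vector embedding.
interface Chunk {
  id: string;
  text: string;
  embedding: number[];
  metadata?: ChunkMetadata; // assumption: at most one metadata record per chunk
}

// Document level: basic file information and the chunks derived from the file.
interface StoredDocument {
  id: string;
  name: string;
  originalPath: string;
  chunks: Chunk[];
}

// Collect the distinct page numbers referenced by a document's chunks, ascending.
function pagesOf(doc: StoredDocument): number[] {
  const pages: number[] = [];
  for (const chunk of doc.chunks) {
    const page = chunk.metadata ? chunk.metadata.pageNumber : undefined;
    if (page !== undefined && pages.indexOf(page) === -1) {
      pages.push(page);
    }
  }
  return pages.sort((a, b) => a - b);
}

// Made-up example data.
const exampleDoc: StoredDocument = {
  id: "doc-1",
  name: "report.pdf",
  originalPath: "/uploads/report.pdf",
  chunks: [
    { id: "c1", text: "Introduction", embedding: [0.1, 0.2], metadata: { pageNumber: 2 } },
    { id: "c2", text: "Summary", embedding: [0.3, 0.4], metadata: { pageNumber: 1 } },
    { id: "c3", text: "Appendix", embedding: [0.5, 0.6] },
  ],
};

console.log(pagesOf(exampleDoc));
```

Walking from the document down to its chunks and their metadata, as pagesOf does, is how a search hit on a chunk can be traced back to a location in the original file.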

Semantic Search Implementation

The semantic search capability is enabled by:

Breaking documents into manageable chunks
Converting chunk text into vector embeddings
Storing these embeddings in pgvector
Using vector similarity for search operations
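
The four steps above can be sketched in memory as follows. This is an illustration under stated assumptions, not Ada's implementation: chunkText, toyEmbed, cosineSimilarity and search are hypothetical helpers, the word-count "embedding" is a stand-in for a real embedding model, and a plain array plays the role of the pgvector column.

```typescript
// Step 1: split text into chunks of at most maxWords words.
function chunkText(text: string, maxWords: number): string[] {
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += maxWords) {
    chunks.push(words.slice(i, i + maxWords).join(" "));
  }
  return chunks;
}

// Step 2: toy embedding that counts occurrences of a tiny fixed vocabulary.
// A real system would call an embedding model here instead.
const VOCAB = ["cat", "dog", "invoice", "payment"];
function toyEmbed(text: string): number[] {
  const words = text.toLowerCase().split(/\W+/);
  return VOCAB.map((term) => words.filter((w) => w === term).length);
}

// Step 4: cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vectors match nothing
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Steps 3 and 4 together: embed every chunk, then rank chunks against the query.
function search(query: string, chunks: string[], topK: number): string[] {
  const queryVector = toyEmbed(query);
  return chunks
    .map((text) => ({ text, score: cosineSimilarity(queryVector, toyEmbed(text)) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((result) => result.text);
}

const sampleChunks = chunkText(
  "The cat chased the dog. The invoice awaits payment today.",
  5
);
console.log(search("payment invoice", sampleChunks, 1));
```

With embeddings stored in pgvector, the ranking can happen inside the database rather than in application code; pgvector's `<=>` operator, for example, computes cosine distance between two vectors. Which distance measure Ada uses is not stated here.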
