Database Architecture Design Document (DADD)

Document Purpose: Single source of truth for all database architecture decisions, data storage reasoning, and detailed schema documentation.

Last Updated: 2025-07-05
Schema Version: v2.0 (Unified ID Implemented)
Current Status: Unified ID system (migration completed) Production Database: Supabase Project gjzcqcxoacxkrchrzaen

1. Architectural Overview

1.1 Database Strategy: Polyglot Persistence

Decision: Use dual-database architecture with PostgreSQL + Neo4j

Reasoning:

PostgreSQL/Supabase: Excellent for ACID transactions, complex business logic, Row-Level Security (RLS), and user authentication integration
Neo4j: Superior for complex relationship traversal, variable-depth queries, and graph analytics
Separation of Concerns: Business data vs. relationship data

Trade-offs:

✅ Pros: Each database optimized for its strengths, robust security model, scalable relationship queries
❌ Cons: Data synchronization complexity, dual-database maintenance, potential consistency issues
⚖️ Mitigation: Atomic operations, sync validation, graceful degradation

1.2 Data Distribution Philosophy

Data Type	Storage Location	Reasoning
Identity & Metadata	PostgreSQL	ACID compliance, RLS security, business logic
Relationships	Neo4j	Graph traversal performance, complex queries
Authentication	Supabase Auth	Integrated with PostgreSQL RLS
File Storage	Supabase Storage	CDN integration, access control
Audit Logs	PostgreSQL	Compliance, ACID guarantees

1.3 Consistency Model

Primary Key Strategy:

IMPLEMENTED: Single UUID shared across both databases (PostgreSQL id = Neo4j node id)
Migration Status: ✅ COMPLETED 2025-07-05 - Unified ID system is now active
Legacy: Previously used dual UUIDs (PostgreSQL id + neo4j_id) - now eliminated
Source of Truth: PostgreSQL for object existence, Neo4j for relationships

Synchronization Strategy:

ACTIVE: Create object in PostgreSQL first with single UUID (id)
ACTIVE: Create corresponding Neo4j node using the same UUID as the node’s id property
Relationship changes only in Neo4j
ACTIVE: Direct ID matching between systems (PostgreSQL id = Neo4j node id)
LEGACY: Previously used dual UUIDs with complex ID mapping - now eliminated
Graceful degradation: objects load without relationships if Neo4j unavailable

2. Schema Documentation

2.1 PostgreSQL Schema

Core Business Tables

`object_instances`

Purpose: Authoritative record of all physical objects in the system

Column	Type	Purpose	Usage	Constraints
`id`	UUID	Primary identifier	PostgreSQL primary key, foreign key references	PRIMARY KEY, NOT NULL
`neo4j_id`	TEXT	REMOVED	LEGACY: Previously linked to Neo4j nodes - eliminated in v2.0 (2025-07-05)	REMOVED
`name`	TEXT	Human-readable object name	Display, search	NOT NULL
`description`	TEXT	Object details	User input, search	NULLABLE
`owner_organization_id`	UUID	Ownership control	RLS enforcement, access control	FK to organizations, NOT NULL
`created_by`	UUID	Audit trail	User tracking, permissions	FK to profiles, NOT NULL
`created_at`	TIMESTAMPTZ	Audit trail	Sorting, reporting	DEFAULT now()
`updated_at`	TIMESTAMPTZ	Change tracking	Sync detection, caching	DEFAULT now()
`status`	TEXT	Object lifecycle	Business logic, filtering	DEFAULT ‘active’
`main_image_url`	TEXT	Primary visual	DEPRECATED: Being migrated to `object_images.is_main`	NULLABLE
`last_scanned_at`	TIMESTAMPTZ	Physical interaction	Location tracking	NULLABLE
`last_scanned_by`	UUID	Scan audit	User activity	FK to profiles

MIGRATION NOTES:

✅ neo4j_id column REMOVED in unified ID migration (v2.0) - COMPLETED 2025-07-05
main_image_url column will be REMOVED in favor of object_images.is_main pattern

Business Rules:

Objects are owned by organizations, not individual users
created_by is for audit only, not access control
status field supports object lifecycle management
✅ ACTIVE: Single UUID system (same id used in both PostgreSQL and Neo4j)
✅ COMPLETED: Removed neo4j_id column, using direct ID matching
LEGACY: Previously used dual UUID system - now eliminated

Performance Considerations:

Index on owner_organization_id for RLS queries
Index on name for search functionality
Index on created_at for timeline queries

`object_images`

Purpose: Manage multiple images per object with main image designation

Column	Type	Purpose	Usage	Constraints
`id`	UUID	Primary identifier	Image management	PRIMARY KEY
`object_instance_id`	UUID	Parent object	Foreign key relationship	FK to object_instances, NOT NULL
`storage_path`	TEXT	Supabase Storage location	File retrieval	NOT NULL
`public_url`	TEXT	CDN access URL	Frontend display	NOT NULL
`is_main`	BOOLEAN	Primary image flag	UI main image selection	DEFAULT FALSE
`uploaded_by`	UUID	Upload audit	User tracking	FK to profiles, NOT NULL
`uploaded_at`	TIMESTAMPTZ	Upload timestamp	Sorting, audit	DEFAULT now()
`file_size`	INTEGER	Storage management	Quota tracking	NULLABLE
`content_type`	TEXT	File type validation	Upload restrictions	NULLABLE

Business Rules:

Only one image per object can have is_main = true
public_url is the source of truth for image access
Deletion requires both database record and storage file removal

Constraints:

-- Ensure only one main image per object
CREATE UNIQUE INDEX idx_object_images_unique_main 
ON object_images(object_instance_id) 
WHERE is_main = true;

`organizations`

Purpose: Multi-tenancy and ownership control

Column	Type	Purpose	Usage	Constraints
`id`	UUID	Organization identifier	Ownership, RLS	PRIMARY KEY
`name`	TEXT	Organization name	Display, search	NOT NULL
`type`	TEXT	Organization category	Business logic	DEFAULT ‘Individual’
`description`	TEXT	Organization details	User input	NULLABLE
`created_at`	TIMESTAMPTZ	Audit trail	Reporting	DEFAULT now()
`created_by`	UUID	Creator tracking	Audit	FK to profiles

Business Rules:

All objects must belong to an organization
Users access objects through organization membership
Organizations provide the security boundary

`organization_members`

Purpose: User access control through organization membership

Column	Type	Purpose	Usage	Constraints
`organization_id`	UUID	Organization reference	Access control	FK to organizations, NOT NULL
`user_id`	UUID	User reference	Membership tracking	FK to profiles, NOT NULL
`role`	TEXT	Permission level	Authorization	NOT NULL
`joined_at`	TIMESTAMPTZ	Membership start	Audit	DEFAULT now()

Business Rules:

Users can belong to multiple organizations
Role determines permissions within organization
Composite primary key on (organization_id, user_id)

Reference Data Tables

`functional_predicates`

Purpose: Catalog of functional relationships for Neo4j

Column	Type	Purpose	Usage	Constraints
`key`	TEXT	Relationship type identifier	Neo4j edge labels	PRIMARY KEY
`category`	TEXT	Grouping for UI	Dropdown organization	NOT NULL
`description`	TEXT	Human explanation	Help text	NULLABLE
`inverse_key`	TEXT	Opposite relationship	Bidirectional logic	FK to self
`is_symmetric`	BOOLEAN	Single-direction storage	Graph efficiency	DEFAULT FALSE
`icon_name`	TEXT	UI visualization	Frontend display	NULLABLE
`deprecated`	BOOLEAN	Lifecycle management	API validation	DEFAULT FALSE

Business Rules:

Validates Neo4j relationship types before creation
Read-only for most users via RLS
Changes require system administrator role

`spatial_predicates`

Purpose: Catalog of spatial relationships for Neo4j

Column	Type	Purpose	Usage	Constraints
`key`	TEXT	Spatial relationship identifier	Neo4j edge labels	PRIMARY KEY
`axis`	TEXT	Spatial dimension	UI grouping	NULLABLE
`description`	TEXT	Human explanation	Help text	NULLABLE
`inverse_key`	TEXT	Opposite direction	Bidirectional logic	FK to self
`is_symmetric`	BOOLEAN	Single-direction storage	Graph efficiency	DEFAULT FALSE
`icon_name`	TEXT	UI visualization	Frontend display	NULLABLE
`deprecated`	BOOLEAN	Lifecycle management	API validation	DEFAULT FALSE

2.2 Neo4j Schema

Node Types

`ObjectInstance` Nodes

Purpose: Represent physical objects with graph relationships

Properties:

id: ✅ ACTIVE: Same UUID as PostgreSQL object_instances.id (unified identifier)
LEGACY: Previously used separate UUID from PostgreSQL neo4j_id column - now eliminated
name: Cached from PostgreSQL for query performance
type: Object classification
created_at: Timestamp for temporal queries

Relationships: See section 2.3

Business Rules:

Must correspond to PostgreSQL object_instances record
✅ ACTIVE: Node id property = PostgreSQL id column (unified UUID)
LEGACY: Previously used PostgreSQL neo4j_id column - now eliminated
Metadata is cached, not authoritative

Relationship Types

Structural Relationships

PART_OF: Physical composition (A is part of B)
CONTAINS: Physical containment (A contains B)

Functional Relationships

POWERS: Power supply relationship
CONTROLS: Control relationship
WORKS_WITH: Compatibility relationship

Spatial Relationships

IN: Containment (A is inside B)
ON: Support (A is on top of B)
LEFT_OF: Horizontal positioning
RIGHT_OF: Horizontal positioning
NEXT_TO: Adjacent positioning

Business Rules:

All relationship types must exist in PostgreSQL predicate tables
Symmetric relationships stored in canonical direction only
Relationship validation occurs before Neo4j creation

3. Security Architecture

3.1 Row-Level Security (RLS) Policies

Philosophy: Organization-based access control with user audit trails

`object_instances` RLS Policy

-- Users can only see objects owned by organizations they belong to
CREATE POLICY organization_access ON object_instances
FOR ALL USING (
  owner_organization_id IN (
    SELECT organization_id 
    FROM organization_members 
    WHERE user_id = auth.uid()
  )
);

`object_images` RLS Policy

-- Images inherit access control from parent object
CREATE POLICY object_images_access ON object_images
FOR ALL USING (
  object_instance_id IN (
    SELECT id FROM object_instances
    -- RLS on object_instances applies automatically
  )
);

3.2 Neo4j Security

Strategy: No direct user access to Neo4j

All Neo4j queries through GraphQL API
API enforces PostgreSQL RLS decisions
Neo4j contains no sensitive business data

4. Performance Considerations

4.1 Indexing Strategy

PostgreSQL Indexes

-- Organization-based queries (most common)
CREATE INDEX idx_object_instances_owner_org ON object_instances(owner_organization_id);

-- Search functionality
CREATE INDEX idx_object_instances_name_gin ON object_instances USING gin(to_tsvector('english', name));

-- Image queries
CREATE INDEX idx_object_images_object_instance ON object_images(object_instance_id);

-- Membership lookups
CREATE INDEX idx_organization_members_user ON organization_members(user_id);

Neo4j Indexes

// Object lookups by PostgreSQL ID
CREATE INDEX object_postgres_id FOR (o:Object) ON (o.postgres_id);

// Relationship traversal optimization
CREATE INDEX relationship_type FOR ()-[r:PART_OF]-() ON (r.created_at);

4.2 Query Optimization

PostgreSQL:

Use organization_id filters in all queries
Avoid N+1 queries with proper JOINs
Limit deep pagination with cursor-based pagination

Neo4j:

Limit relationship traversal depth
Use specific relationship types in MATCH clauses
Cache frequent graph patterns

5. Data Lifecycle Management

5.1 Object Creation Flow

ACTIVE IMPLEMENTATION (v2.0 Unified ID System):

Generate single UUID for the object
Create record in PostgreSQL object_instances using this UUID as primary key
Create corresponding Neo4j node using the same UUID as the node’s id property
Add any initial relationships in Neo4j
Upload and link images in object_images

5.2 Object Deletion Flow

Delete relationships in Neo4j
Delete Neo4j node
Delete images from Supabase Storage
Delete image records from object_images
Delete object from object_instances

5.3 Data Synchronization

PostgreSQL is source of truth for object existence
Neo4j synchronization validates against PostgreSQL
Graceful degradation: load objects without relationships if Neo4j unavailable

6. Migration Strategy

6.1 Schema Versioning

All schema changes tracked in migration files
Version numbers in this document header
Backward compatibility requirements documented

6.2 Unified ID Migration Plan

Migration Status: ✅ COMPLETED Current State: Single UUID system (same id in both databases) Legacy State: Previously used dual UUID system (PostgreSQL id + neo4j_id) - eliminated in v2.0

Phase 1: Database Schema Migration

-- Step 1: Verify current data consistency
SELECT COUNT(*) as pg_objects FROM object_instances;
-- Compare with Neo4j: MATCH (o:ObjectInstance) RETURN count(o) as neo4j_objects

-- Step 2: Update Neo4j nodes to use PostgreSQL primary key
-- For each object_instance record:
UPDATE object_instances SET temp_migration_status = 'pending';

-- Step 3: Neo4j node ID migration (via application code)
-- MATCH (o:ObjectInstance {id: old_neo4j_id})
-- SET o.id = postgresql_primary_key
-- WHERE o.id = old_neo4j_id

-- Step 4: Remove neo4j_id column after successful migration
ALTER TABLE object_instances DROP COLUMN neo4j_id;
ALTER TABLE object_instances DROP COLUMN temp_migration_status;

Phase 2: Code Migration

# OLD CODE (to be replaced):
object_id = str(uuid.uuid4())        # PostgreSQL PK
neo4j_id = str(uuid.uuid4())         # Neo4j node ID

# NEW CODE:
object_id = str(uuid.uuid4())        # Single UUID for both systems

# PostgreSQL insert
INSERT INTO object_instances (id, name, description, ...)
VALUES (object_id, name, description, ...)

# Neo4j node creation  
CREATE (o:ObjectInstance {
    id: object_id,                   # Same UUID as PostgreSQL
    name: name,
    description: description
})

Phase 3: Query Updates

# OLD: Lookup via neo4j_id mapping
result = await conn.execute(
    text("SELECT neo4j_id, id FROM object_instances WHERE neo4j_id = ANY(:neo4j_ids)"),
    {"neo4j_ids": neo4j_ids}
)

# NEW: Direct ID matching
result = await conn.execute(
    text("SELECT id FROM object_instances WHERE id = ANY(:object_ids)"),
    {"object_ids": object_ids}
)

Migration Validation ✅ COMPLETED 2025-07-05

All PostgreSQL objects have corresponding Neo4j nodes with matching IDs
No orphaned Neo4j nodes
All relationship queries work with unified IDs
Performance testing shows improved performance
Comprehensive migration scripts with rollback capability created
neo4j_id column successfully removed from PostgreSQL schema

6.3 Data Migration Process

Test migrations in development environment
Backup both PostgreSQL and Neo4j
Apply PostgreSQL migrations first
Sync Neo4j schema changes
Validate data consistency
Update this documentation

7. Monitoring and Maintenance

7.1 Data Consistency Checks

-- ACTIVE: Find objects in PostgreSQL missing from Neo4j (unified ID system)
SELECT id, name FROM object_instances 
WHERE id NOT IN (
  -- Query Neo4j nodes by their id property
);

-- Find orphaned images
SELECT * FROM object_images oi
LEFT JOIN object_instances ot ON oi.object_instance_id = ot.id
WHERE ot.id IS NULL;

7.2 Performance Monitoring

Track RLS policy performance
Monitor Neo4j query response times
Alert on data synchronization delays

8. Future Considerations

8.1 Potential Architectural Changes

Vector Search: Add embedding columns for semantic search
Time-Series Data: Consider separate table for object state changes
Caching Layer: Redis for frequently accessed relationship paths
Event Sourcing: Track all object state changes

8.2 Scalability Planning

Read Replicas: PostgreSQL read replicas for reporting
Neo4j Clustering: Scale graph queries horizontally
Partitioning: Organization-based table partitioning
CDN: Geographic distribution of object images

Appendix A: Column Name Verification

Critical: Always verify against production database before coding

Verification Commands

# List all tables
mcp_supabase_list_tables --project_id gjzcqcxoacxkrchrzaen

# Verify specific table structure
SELECT column_name, data_type, is_nullable 
FROM information_schema.columns 
WHERE table_name = 'object_instances' 
ORDER BY ordinal_position;

Common Mistakes to Avoid

✅ ACTIVE: Unified IDs are now implemented and active
❌ Using separate UUIDs for PostgreSQL and Neo4j - ELIMINATED in v2.0 migration
❌ object_id → ✅ object_instance_id (in related tables)
❌ image_url → ✅ public_url
❌ created_at (in images) → ✅ uploaded_at
❌ Using main_image_url column → ✅ Use object_images.is_main pattern

Appendix B: Reference Documents

Schema Verification - Production schema verification
Neo4j Core Schema - Graph database schema
Supabase Core Schema - PostgreSQL schema
API Documentation - GraphQL schema and endpoints
Database Troubleshooting - Common database issues
Backend README - Updated documentation structure information

Search Results:

Database Architecture Design Document (DADD)

1. Architectural Overview

1.1 Database Strategy: Polyglot Persistence

1.2 Data Distribution Philosophy

1.3 Consistency Model

2. Schema Documentation

2.1 PostgreSQL Schema

Core Business Tables

object_instances

object_images

organizations

organization_members

Reference Data Tables

functional_predicates

spatial_predicates

2.2 Neo4j Schema

Node Types

ObjectInstance Nodes

Relationship Types

Structural Relationships

Functional Relationships

Spatial Relationships

3. Security Architecture

3.1 Row-Level Security (RLS) Policies

object_instances RLS Policy

object_images RLS Policy

3.2 Neo4j Security

4. Performance Considerations

4.1 Indexing Strategy

PostgreSQL Indexes

Neo4j Indexes

4.2 Query Optimization

5. Data Lifecycle Management

5.1 Object Creation Flow

5.2 Object Deletion Flow

5.3 Data Synchronization

6. Migration Strategy

6.1 Schema Versioning

6.2 Unified ID Migration Plan

Phase 1: Database Schema Migration

Phase 2: Code Migration

Phase 3: Query Updates

Migration Validation ✅ COMPLETED 2025-07-05

6.3 Data Migration Process

7. Monitoring and Maintenance

7.1 Data Consistency Checks

7.1 Data Consistency Checks

7.2 Performance Monitoring

8. Future Considerations

8.1 Potential Architectural Changes

8.2 Scalability Planning

Appendix A: Column Name Verification

Verification Commands

Common Mistakes to Avoid

Appendix B: Reference Documents

`object_instances`

`object_images`

`organizations`

`organization_members`

`functional_predicates`

`spatial_predicates`

`ObjectInstance` Nodes

`object_instances` RLS Policy

`object_images` RLS Policy