Import API

Intelligent file import with auto-detection, neural entity extraction, relationship inference, and dual storage (VFS + Knowledge Graph).

Supported Formats

| Format   | Extensions       | Features                                            |
| -------- | ---------------- | ---------------------------------------------------- |
| Excel    | .xlsx, .xls      | Multi-sheet, auto-header detection, type inference   |
| PDF      | .pdf             | Text extraction, table detection, metadata           |
| CSV      | .csv, .tsv       | Auto-delimiter, encoding detection                   |
| JSON     | .json            | Nested objects, arrays, schema inference             |
| Markdown | .md              | Frontmatter, headings, code blocks                   |
| YAML     | .yaml, .yml      | Multi-document, anchors                              |
| Word     | .docx            | Text, styles, embedded media                         |
| Images   | .jpg, .png, etc. | EXIF metadata, dimensions                            |
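
Format detection is automatic when importing by file path, so the same call works for any of the formats above. A minimal sketch, assuming the Quick Start setup below and that the example files exist locally (file names here are illustrative):

auto-detect.js
// `brain` is created and initialized as in Quick Start
const csvResult = await brain.import('./contacts.csv')  // detected as CSV
const mdResult = await brain.import('./notes.md')       // detected as Markdown

// The detected format is reported on the result
console.log(csvResult.format, mdResult.format)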

Quick Start

import-quickstart.js
import { Brainy } from '@soulcraft/brainy'

const brain = new Brainy()
await brain.init()

// Simple import - auto-detects format, extracts entities
const result = await brain.import('./glossary.xlsx')

console.log(`Imported ${result.entities} entities`)
console.log(`Created ${result.relationships} relationships`)

Signature

brain.import(source: Buffer | string | object, options?: ImportOptions): Promise<ImportResult>

Parameters

Source

The source can be:

  - File path (string) - a local file; the format is auto-detected
  - Buffer - raw file contents; the format option must be specified (see Import from Buffer)
  - Object - pre-parsed data such as { entities: [...] } (see Import Pre-parsed Data)

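A minimal sketch of the three forms, assuming the Quick Start setup (file names are illustrative):

source-forms.js
import { readFileSync } from 'fs'

// 1. File path - format is auto-detected
const fromPath = await brain.import('./glossary.xlsx')

// 2. Buffer - format must be specified
const fromBuffer = await brain.import(readFileSync('./report.pdf'), { format: 'pdf' })

// 3. Pre-parsed object
const fromObject = await brain.import({
  entities: [{ name: 'Alice', type: 'Person' }]
})
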
Options

| Option                      | Type     | Default | Description                                       |
| --------------------------- | -------- | ------- | ------------------------------------------------- |
| format                      | string   | auto    | Force format: 'excel', 'pdf', 'csv', 'json', etc. |
| vfsPath                     | string   | -       | Store in VFS at this path                         |
| enableNeuralExtraction      | boolean  | true    | Extract entity names via AI                       |
| enableRelationshipInference | boolean  | true    | Detect semantic relationships                     |
| enableConceptExtraction     | boolean  | true    | Extract types/concepts via AI                     |
| groupBy                     | string   | 'flat'  | 'type', 'sheet', 'flat', or 'custom'              |
| preserveSource              | boolean  | false   | Keep original file in VFS                         |
| confidenceThreshold         | number   | 0.7     | Minimum confidence for extraction                 |
| onProgress                  | function | -       | Progress callback                                 |
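
The same options expressed as a JSDoc typedef, as an approximate reference shape assembled from the table above (not the library's published type definition):

import-options.js
/**
 * Approximate shape of ImportOptions.
 * @typedef {Object} ImportOptions
 * @property {string}   [format]                           Force format: 'excel', 'pdf', 'csv', 'json', etc.
 * @property {string}   [vfsPath]                          Store in VFS at this path
 * @property {boolean}  [enableNeuralExtraction=true]      Extract entity names via AI
 * @property {boolean}  [enableRelationshipInference=true] Detect semantic relationships
 * @property {boolean}  [enableConceptExtraction=true]     Extract types/concepts via AI
 * @property {string}   [groupBy='flat']                   'type', 'sheet', 'flat', or 'custom'
 * @property {function} [customGrouping]                   Path resolver used when groupBy is 'custom' (see Custom Grouping)
 * @property {boolean}  [preserveSource=false]             Keep original file in VFS
 * @property {number}   [confidenceThreshold=0.7]          Minimum confidence for extraction
 * @property {function} [onProgress]                       Progress callback
 */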

Examples

Full-Featured Import

full-import.js
const result = await brain.import('./data.xlsx', {
  // AI features
  enableNeuralExtraction: true,
  enableRelationshipInference: true,
  enableConceptExtraction: true,

  // VFS storage
  vfsPath: '/imports/my-data',
  groupBy: 'type',
  preserveSource: true,

  // Progress tracking
  onProgress: (p) => {
    console.log(`[${p.stage}] ${p.message}`)
    console.log(`Entities: ${p.entities || 0}, Rels: ${p.relationships || 0}`)
    if (p.throughput) console.log(`Rate: ${p.throughput.toFixed(1)}/sec`)
  }
})

Progress Callback

The progress callback receives standardized events for all 8 formats:

progress-stages.js
const result = await brain.import(file, {
  onProgress: (progress) => {
    switch (progress.stage) {
      case 'detecting':
        console.log('Detecting file format...')
        break

      case 'extracting':
        console.log(`Extracting: ${progress.current}/${progress.total}`)
        break

      case 'storing-vfs':
        console.log('Storing in VFS...')
        break

      case 'storing-graph':
        console.log(`Adding to graph: ${progress.entities} entities`)
        break

      case 'relationships':
        console.log(`Creating relationships: ${progress.relationships}`)
        break

      case 'complete':
        console.log('Import complete!')
        break
    }
  }
})

Import from Buffer

buffer-import.js
import { readFileSync } from 'fs'

// From buffer (must specify format)
const buffer = readFileSync('./report.pdf')
const result = await brain.import(buffer, { format: 'pdf' })

// From fetch response
const response = await fetch('https://example.com/data.csv')
const csvBuffer = Buffer.from(await response.arrayBuffer())
const result2 = await brain.import(csvBuffer, { format: 'csv' })

Import Pre-parsed Data

object-import.js
// Import from object
const result = await brain.import({
  entities: [
    { name: 'Alice', type: 'Person', department: 'Engineering' },
    { name: 'Bob', type: 'Person', department: 'Marketing' },
    { name: 'Acme Corp', type: 'Organization' }
  ]
})

Custom Grouping

custom-grouping.js
const result = await brain.import('./data.csv', {
  vfsPath: '/imports',
  groupBy: 'custom',
  customGrouping: (entity) => {
    // Group by first letter of name
    const firstLetter = entity.name?.[0]?.toUpperCase() || '#'
    return `/imports/by-letter/${firstLetter}`
  }
})

Performance Mode (Large Files)

performance-import.js
// Optimize for large files
const result = await brain.import('./huge-file.csv', {
  // Disable features for speed
  enableNeuralExtraction: false,
  enableRelationshipInference: false,

  // Higher threshold = fewer entities
  confidenceThreshold: 0.9,

  // Track progress
  onProgress: (p) => {
    if (p.processed && p.total) {
      const percent = ((p.processed / p.total) * 100).toFixed(1)
      console.log(`${percent}% - ETA: ${p.eta}s`)
    }
  }
})

Return Value

{
  entities: 150,              // Number of entities created
  relationships: 45,         // Number of relationships created
  vfsFiles: 3,               // Files stored in VFS
  format: 'excel',           // Detected format
  duration: 2340,            // Time in ms
  errors: []                 // Any extraction errors
}
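
A short sketch of consuming the result, using only the fields documented above:

handle-result.js
const result = await brain.import('./data.xlsx')

if (result.errors.length > 0) {
  console.warn(`Import completed with ${result.errors.length} extraction error(s)`)
}

console.log(
  `${result.format}: ${result.entities} entities, ` +
  `${result.relationships} relationships in ${(result.duration / 1000).toFixed(1)}s`
)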

Dual Storage

Import stores data in two complementary systems:

  1. VFS (Virtual File System) - Organized file structure, original content
  2. Knowledge Graph - Connected entities with relationships

dual-storage.js
import { NounType } from '@soulcraft/brainy'  // noun-type taxonomy used for typed search

await brain.import('./employees.xlsx', {
  vfsPath: '/data/employees'
})

// Access via VFS
const files = await brain.vfs.readdir('/data/employees')
// [{ name: 'Alice.json', type: 'file' }, ...]

// Search via Knowledge Graph
const results = await brain.find({
  query: 'engineers in San Francisco',
  nounType: NounType.Person
})

Next: Storage Adapters →