# Import API
Intelligent file import with auto-detection, neural entity extraction, relationship inference, and dual storage (VFS + Knowledge Graph).
## Supported Formats

| Format | Extensions | Features |
|---|---|---|
| Excel | .xlsx, .xls | Multi-sheet, auto-header detection, type inference |
| PDF | .pdf | Text extraction, table detection, metadata |
| CSV | .csv, .tsv | Auto-delimiter, encoding detection |
| JSON | .json | Nested objects, arrays, schema inference |
| Markdown | .md | Frontmatter, headings, code blocks |
| YAML | .yaml, .yml | Multi-document, anchors |
| Word | .docx | Text, styles, embedded media |
| Images | .jpg, .png, etc. | EXIF metadata, dimensions |
## Quick Start

import-quickstart.js

```js
import { Brainy } from '@soulcraft/brainy'

const brain = new Brainy()
await brain.init()

// Simple import - auto-detects format, extracts entities
const result = await brain.import('./glossary.xlsx')

console.log(`Imported ${result.entities} entities`)
console.log(`Created ${result.relationships} relationships`)
```
## Signature

```ts
brain.import(source: Buffer | string | object, options?: ImportOptions): Promise<ImportResult>
```
## Parameters

### Source

The source can be one of the following (a combined sketch follows the list):

- **File path** - e.g. `'./data.xlsx'`
- **Buffer** - Raw file contents
- **Object** - Pre-parsed data with `{ entities: [...] }`
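A minimal sketch showing one call per source type, assuming the `brain` instance from the Quick Start; the buffer and object forms are covered in more detail in the examples further down:

```js
import { readFileSync } from 'fs'

// File path - format is auto-detected
const fromPath = await brain.import('./data.xlsx')

// Buffer - raw bytes, so the format must be specified explicitly
const fromBuffer = await brain.import(readFileSync('./data.xlsx'), { format: 'excel' })

// Object - pre-parsed entities, no file parsing needed
const fromObject = await brain.import({
  entities: [{ name: 'Alice', type: 'Person' }]
})
```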
### Options

| Option | Type | Default | Description |
|---|---|---|---|
| `format` | string | auto | Force format: 'excel', 'pdf', 'csv', 'json', etc. |
| `vfsPath` | string | - | Store in VFS at this path |
| `enableNeuralExtraction` | boolean | true | Extract entity names via AI |
| `enableRelationshipInference` | boolean | true | Detect semantic relationships |
| `enableConceptExtraction` | boolean | true | Extract types/concepts via AI |
| `groupBy` | string | 'flat' | One of 'type', 'sheet', 'flat', 'custom' |
| `preserveSource` | boolean | false | Keep original file in VFS |
| `confidenceThreshold` | number | 0.7 | Minimum confidence for extraction |
| `onProgress` | function | - | Progress callback |
## Examples

### Full-Featured Import

full-import.js

```js
const result = await brain.import('./data.xlsx', {
  // AI features
  enableNeuralExtraction: true,
  enableRelationshipInference: true,
  enableConceptExtraction: true,

  // VFS storage
  vfsPath: '/imports/my-data',
  groupBy: 'type',
  preserveSource: true,

  // Progress tracking
  onProgress: (p) => {
    console.log(`[${p.stage}] ${p.message}`)
    console.log(`Entities: ${p.entities || 0}, Rels: ${p.relationships || 0}`)
    if (p.throughput) console.log(`Rate: ${p.throughput.toFixed(1)}/sec`)
  }
})
```
### Progress Callback

The progress callback receives standardized events for all 8 supported formats:

progress-stages.js

```js
const result = await brain.import(file, {
  onProgress: (progress) => {
    switch (progress.stage) {
      case 'detecting':
        console.log('Detecting file format...')
        break
      case 'extracting':
        console.log(`Extracting: ${progress.current}/${progress.total}`)
        break
      case 'storing-vfs':
        console.log('Storing in VFS...')
        break
      case 'storing-graph':
        console.log(`Adding to graph: ${progress.entities} entities`)
        break
      case 'relationships':
        console.log(`Creating relationships: ${progress.relationships}`)
        break
      case 'complete':
        console.log('Import complete!')
        break
    }
  }
})
```
### Import from Buffer

buffer-import.js

```js
import { readFileSync } from 'fs'

// From a buffer (must specify format)
const buffer = readFileSync('./report.pdf')
const result = await brain.import(buffer, { format: 'pdf' })

// From a fetch response
const response = await fetch('https://example.com/data.csv')
const csvBuffer = Buffer.from(await response.arrayBuffer())
const result2 = await brain.import(csvBuffer, { format: 'csv' })
```
### Import Pre-parsed Data

object-import.js

```js
// Import from a pre-parsed object
const result = await brain.import({
  entities: [
    { name: 'Alice', type: 'Person', department: 'Engineering' },
    { name: 'Bob', type: 'Person', department: 'Marketing' },
    { name: 'Acme Corp', type: 'Organization' }
  ]
})
```
### Custom Grouping

custom-grouping.js

```js
const result = await brain.import('./data.csv', {
  vfsPath: '/imports',
  groupBy: 'custom',
  customGrouping: (entity) => {
    // Group by first letter of name
    const firstLetter = entity.name?.[0]?.toUpperCase() || '#'
    return `/imports/by-letter/${firstLetter}`
  }
})
```
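For multi-sheet Excel workbooks, the documented `groupBy: 'sheet'` value can be used instead of a custom function. A minimal sketch; the filename is hypothetical and the one-folder-per-worksheet layout is an assumption based on the option name:

```js
// Group imported rows by the worksheet they came from
const bySheet = await brain.import('./quarterly-report.xlsx', {
  vfsPath: '/imports/quarterly',
  groupBy: 'sheet'
})

// Assumed layout: one VFS folder per worksheet
const folders = await brain.vfs.readdir('/imports/quarterly')
console.log(folders.map((f) => f.name))
```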
### Performance Mode (Large Files)

performance-import.js

```js
// Optimize for large files
const result = await brain.import('./huge-file.csv', {
  // Disable features for speed
  enableNeuralExtraction: false,
  enableRelationshipInference: false,

  // Higher threshold = fewer entities
  confidenceThreshold: 0.9,

  // Track progress
  onProgress: (p) => {
    if (p.processed && p.total) {
      const percent = ((p.processed / p.total) * 100).toFixed(1)
      console.log(`${percent}% - ETA: ${p.eta}s`)
    }
  }
})
```
## Return Value

```js
{
  entities: 150,        // Number of entities created
  relationships: 45,    // Number of relationships created
  vfsFiles: 3,          // Files stored in VFS
  format: 'excel',      // Detected format
  duration: 2340,       // Time in ms
  errors: []            // Any extraction errors
}
```
## Dual Storage

Import stores data in two complementary systems:

- **VFS (Virtual File System)** - Organized file structure, original content
- **Knowledge Graph** - Connected entities with relationships

dual-storage.js

```js
await brain.import('./employees.xlsx', {
  vfsPath: '/data/employees'
})

// Access via VFS
const files = await brain.vfs.readdir('/data/employees')
// [{ name: 'Alice.json', type: 'file' }, ...]

// Search via Knowledge Graph
const results = await brain.find({
  query: 'engineers in San Francisco',
  nounType: NounType.Person
})
```
## See Also
- Entity Extraction - Manual extraction API
- VFS API - Virtual filesystem
- add() - Manual entity creation
- Batch Operations - Bulk operations