Documentation ¶
Overview ¶
Package filesql provides a file-based SQL driver implementation that enables querying CSV, TSV, LTSV, Parquet, and Excel (XLSX) files using SQLite3 SQL syntax.
filesql allows you to treat structured data files as SQL databases without any data import or transformation steps. It uses SQLite3 as an in-memory database engine, providing full SQL capabilities including JOINs, aggregations, window functions, and CTEs.
Features ¶
- Query CSV, TSV, LTSV, Parquet, and Excel (XLSX) files using standard SQL
- Automatic handling of compressed files (gzip, bzip2, xz, zstandard)
- Support for multiple input sources (files, directories, io.Reader, embed.FS)
- Efficient streaming for large files with configurable chunk sizes
- Cross-platform compatibility (Linux, macOS, Windows)
- Optional auto-save functionality to persist changes
Basic Usage ¶
The simplest way to use filesql is with the Open or OpenContext functions:
db, err := filesql.Open("data.csv")
if err != nil {
	log.Fatal(err)
}
defer db.Close()

rows, err := db.Query("SELECT * FROM data WHERE age > 25")
if err != nil {
	log.Fatal(err)
}
defer rows.Close()
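Results are then consumed with the standard database/sql iteration pattern. A minimal sketch, assuming data.csv has exactly the columns name and age (an assumption for illustration):

for rows.Next() {
	var name string
	var age int
	// Scan targets must match the SELECT list; SELECT * here
	// assumes the two columns name and age.
	if err := rows.Scan(&name, &age); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%s is %d\n", name, age)
}
if err := rows.Err(); err != nil {
	log.Fatal(err)
}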
Advanced Usage ¶
For more complex scenarios, use the Builder pattern:
builder := filesql.NewBuilder().
	AddPath("users.csv").
	AddPath("orders.tsv").
	EnableAutoSave("./output")

validatedBuilder, err := builder.Build(ctx)
if err != nil {
	log.Fatal(err)
}

db, err := validatedBuilder.Open(ctx)
if err != nil {
	log.Fatal(err)
}
defer db.Close()
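Once opened, both files are ordinary tables and can be joined. A minimal sketch, assuming users.csv has an id column and orders.tsv a user_id column (both assumptions for illustration):

rows, err := db.QueryContext(ctx, `
	SELECT u.id, COUNT(o.user_id) AS order_count
	FROM users u
	LEFT JOIN orders o ON o.user_id = u.id
	GROUP BY u.id
`)
if err != nil {
	log.Fatal(err)
}
defer rows.Close()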
Table Naming ¶
Table names are automatically derived from file paths (see the sketch after this list):
- "users.csv" becomes table "users"
- "data.tsv.gz" becomes table "data"
- "/path/to/logs.ltsv" becomes table "logs"
- "sales.xlsx" with multiple sheets becomes tables "sales_Sheet1", "sales_Sheet2", etc.
Data Modifications ¶
INSERT, UPDATE, and DELETE operations affect only the in-memory database. Original files remain unchanged unless auto-save is enabled. To persist changes manually, use the DumpDatabase function.
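A minimal sketch of the manual persistence flow; the data table and its columns are placeholders:

// The UPDATE touches only the in-memory SQLite database.
if _, err := db.Exec("UPDATE data SET status = 'done' WHERE id = 1"); err != nil {
	log.Fatal(err)
}

// Write the current in-memory state to ./output as files.
if err := filesql.DumpDatabase(db, "./output"); err != nil {
	log.Fatal(err)
}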
SQL Syntax ¶
Since filesql uses SQLite3 as its underlying engine, all SQL syntax follows SQLite3's SQL dialect (see the sketch after this list). This includes support for:
- Common Table Expressions (CTEs)
- Window functions
- JSON functions
- Date and time functions
- And all other SQLite3 features
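As a quick illustration, a minimal sketch combining a CTE with a window function; the employees table and its columns are placeholders, not part of this package:

rows, err := db.Query(`
	WITH ranked AS (
		SELECT name, salary,
		       RANK() OVER (ORDER BY salary DESC) AS r
		FROM employees
	)
	SELECT name, salary FROM ranked WHERE r <= 3
`)
if err != nil {
	log.Fatal(err)
}
defer rows.Close()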
Column Name Handling ¶
For duplicate detection, column names are trimmed of surrounding whitespace and compared case-insensitively: headers that are identical after trimming, regardless of case differences, are considered duplicates and will result in an error. This behavior is kept for backward compatibility.
For complete SQL syntax documentation, see: https://www.sqlite.org/lang.html
Index ¶
- Constants
- Variables
- func ClearACHTableSetRegistry()
- func DumpACH(ctx context.Context, db *sql.DB, baseTableName, outputPath string) error
- func DumpACHWithTableSet(ctx context.Context, db *sql.DB, baseTableName, outputPath string, ...) error
- func DumpDatabase(db *sql.DB, outputDir string, opts ...DumpOptions) error
- func IsACHBaseTableName(tableName string) (baseName string, isACH bool)
- func Open(paths ...string) (*sql.DB, error)
- func OpenContext(ctx context.Context, paths ...string) (*sql.DB, error)
- func UnregisterACHTableSet(baseTableName string)
- type ACHTableInfo
- func (a ACHTableInfo) AddendaTable() string
- func (a ACHTableInfo) AllTableNames() []string
- func (a ACHTableInfo) BatchesTable() string
- func (a ACHTableInfo) EntriesTable() string
- func (a ACHTableInfo) FileHeaderTable() string
- func (a ACHTableInfo) IATAddendaTable() string
- func (a ACHTableInfo) IATBatchesTable() string
- func (a ACHTableInfo) IATEntriesTable() string
- type ChunkSize
- type CompressionFactory
- func (f *CompressionFactory) CreateHandlerForFile(path string) CompressionHandler
- func (f *CompressionFactory) CreateReaderForFile(path string) (io.Reader, func() error, error)
- func (f *CompressionFactory) CreateWriterForFile(path string, compressionType CompressionType) (io.Writer, func() error, error)
- func (f *CompressionFactory) DetectCompressionType(path string) CompressionType
- func (f *CompressionFactory) GetBaseFileType(path string) FileType
- func (f *CompressionFactory) RemoveCompressionExtension(path string) string
- type CompressionHandler
- type CompressionType
- type ContextLogger
- type DBBuilder
- func (b *DBBuilder) AddFS(filesystem fs.FS) *DBBuilder
- func (b *DBBuilder) AddPath(path string) *DBBuilder
- func (b *DBBuilder) AddPaths(paths ...string) *DBBuilder
- func (b *DBBuilder) AddReader(reader io.Reader, tableName string, fileType FileType) *DBBuilder
- func (b *DBBuilder) Build(ctx context.Context) (*DBBuilder, error)
- func (b *DBBuilder) DisableAutoSave() *DBBuilder
- func (b *DBBuilder) EnableAutoSave(outputDir string, options ...DumpOptions) *DBBuilder
- func (b *DBBuilder) EnableAutoSaveOnCommit(outputDir string, options ...DumpOptions) *DBBuilder
- func (b *DBBuilder) Open(ctx context.Context) (*sql.DB, error)
- func (b *DBBuilder) OpenReadOnly(ctx context.Context) (*ReadOnlyDB, error)
- func (b *DBBuilder) SetDefaultChunkSize(size int) *DBBuilder
- func (b *DBBuilder) WithLogger(logger Logger) *DBBuilder
- type DumpOptions
- type ErrorContext
- type FileType
- type Logger
- type MemoryInfo
- type MemoryLimit
- func (ml *MemoryLimit) CheckMemoryUsage() MemoryStatus
- func (ml *MemoryLimit) CreateMemoryError(operation string) error
- func (ml *MemoryLimit) Disable()
- func (ml *MemoryLimit) Enable()
- func (ml *MemoryLimit) GetMemoryInfo() MemoryInfo
- func (ml *MemoryLimit) IsEnabled() bool
- func (ml *MemoryLimit) SetWarningThreshold(threshold float64)
- func (ml *MemoryLimit) ShouldReduceChunkSize(chunkSize int) (bool, int)
- type MemoryPool
- func (mp *MemoryPool) ForceGC()
- func (mp *MemoryPool) GetByteBuffer() []byte
- func (mp *MemoryPool) GetRecordSlice() []Record
- func (mp *MemoryPool) GetStringSlice() []string
- func (mp *MemoryPool) PutByteBuffer(buf []byte)
- func (mp *MemoryPool) PutRecordSlice(slice []Record)
- func (mp *MemoryPool) PutStringSlice(slice []string)
- type MemoryStatus
- type OutputFormat
- type ReadOnlyDB
- func (r *ReadOnlyDB) Begin() (*ReadOnlyTx, error)
- func (r *ReadOnlyDB) BeginTx(ctx context.Context, opts *sql.TxOptions) (*ReadOnlyTx, error)
- func (r *ReadOnlyDB) Close() error
- func (r *ReadOnlyDB) DB() *sql.DB
- func (r *ReadOnlyDB) Exec(query string, args ...any) (sql.Result, error)
- func (r *ReadOnlyDB) ExecContext(ctx context.Context, query string, args ...any) (sql.Result, error)
- func (r *ReadOnlyDB) Ping() error
- func (r *ReadOnlyDB) PingContext(ctx context.Context) error
- func (r *ReadOnlyDB) Prepare(query string) (*ReadOnlyStmt, error)
- func (r *ReadOnlyDB) PrepareContext(ctx context.Context, query string) (*ReadOnlyStmt, error)
- func (r *ReadOnlyDB) Query(query string, args ...any) (*sql.Rows, error)
- func (r *ReadOnlyDB) QueryContext(ctx context.Context, query string, args ...any) (*sql.Rows, error)
- func (r *ReadOnlyDB) QueryRow(query string, args ...any) *sql.Row
- func (r *ReadOnlyDB) QueryRowContext(ctx context.Context, query string, args ...any) *sql.Row
- type ReadOnlyStmt
- func (s *ReadOnlyStmt) Close() error
- func (s *ReadOnlyStmt) Exec(args ...any) (sql.Result, error)
- func (s *ReadOnlyStmt) ExecContext(ctx context.Context, args ...any) (sql.Result, error)
- func (s *ReadOnlyStmt) Query(args ...any) (*sql.Rows, error)
- func (s *ReadOnlyStmt) QueryContext(ctx context.Context, args ...any) (*sql.Rows, error)
- func (s *ReadOnlyStmt) QueryRow(args ...any) *sql.Row
- func (s *ReadOnlyStmt) QueryRowContext(ctx context.Context, args ...any) *sql.Row
- type ReadOnlyTx
- func (t *ReadOnlyTx) Commit() error
- func (t *ReadOnlyTx) Exec(query string, args ...any) (sql.Result, error)
- func (t *ReadOnlyTx) ExecContext(ctx context.Context, query string, args ...any) (sql.Result, error)
- func (t *ReadOnlyTx) Prepare(query string) (*ReadOnlyStmt, error)
- func (t *ReadOnlyTx) PrepareContext(ctx context.Context, query string) (*ReadOnlyStmt, error)
- func (t *ReadOnlyTx) Query(query string, args ...any) (*sql.Rows, error)
- func (t *ReadOnlyTx) QueryContext(ctx context.Context, query string, args ...any) (*sql.Rows, error)
- func (t *ReadOnlyTx) QueryRow(query string, args ...any) *sql.Row
- func (t *ReadOnlyTx) QueryRowContext(ctx context.Context, query string, args ...any) *sql.Row
- func (t *ReadOnlyTx) Rollback() error
- type Record
- type SlogAdapter
- type SlogContextAdapter
- func (s *SlogContextAdapter) DebugContext(ctx context.Context, msg string, args ...any)
- func (s *SlogContextAdapter) ErrorContext(ctx context.Context, msg string, args ...any)
- func (s *SlogContextAdapter) InfoContext(ctx context.Context, msg string, args ...any)
- func (s *SlogContextAdapter) WarnContext(ctx context.Context, msg string, args ...any)
- func (s *SlogContextAdapter) With(args ...any) Logger
- type TableName
Examples ¶
- DBBuilder (Chaining)
- DBBuilder (ErrorHandling)
- DBBuilder.AddFS
- DBBuilder.AddFS (EmbedFS)
- DBBuilder.AddPath
- DBBuilder.AddPaths
- DBBuilder.AddReader
- DBBuilder.AddReader (Compressed)
- DBBuilder.AddReader (Multiple)
- DBBuilder.Build
- DBBuilder.EnableAutoSave
- DBBuilder.EnableAutoSaveOnCommit
- DBBuilder.Open
- DumpDatabase
- DumpDatabase (DataProcessing)
- DumpDatabase (MultipleFormats)
- DumpDatabase (WithOptions)
- DumpOptions.FileExtension
- NewBuilder
- Open
- Open (AdvancedSQL)
- Open (CompressionSupport)
- Open (Constraints)
- Open (CustomerInsights)
- Open (ErrorHandling)
- Open (FinancialDataAnalysis)
- Open (MultipleFiles)
- Open (PerformanceOptimization)
- Open (SalesAnalysis)
- Open (WebLogAnalysis)
- OpenContext
Constants ¶
const (
	// DefaultRowsPerChunk is the default number of rows per chunk
	DefaultRowsPerChunk = 1000

	// DefaultChunkSize is the default chunk size (rows); alias for clarity
	DefaultChunkSize = DefaultRowsPerChunk

	// MinChunkSize is the minimum allowed rows per chunk
	MinChunkSize = 1

	// ValidationPeekSize is the size used for validation peek operations
	ValidationPeekSize = 1
)
Processing constants (rows-based)
const (
	// MaxSampleSize limits how many values to sample for type inference
	MaxSampleSize = 1000

	// MinConfidenceThreshold is the minimum percentage of values that must match a type
	MinConfidenceThreshold = 0.8

	// EarlyTerminationThreshold is the percentage of text values that triggers early termination
	EarlyTerminationThreshold = 0.5

	// MinDatetimeLength is the minimum reasonable length for datetime values
	MinDatetimeLength = 4

	// MaxDatetimeLength is the maximum reasonable length for datetime values
	MaxDatetimeLength = 35

	// SamplingStratificationFactor determines when to use stratified vs simple sampling
	SamplingStratificationFactor = 3

	// MinRealThreshold is the minimum percentage of real values needed to classify as REAL
	MinRealThreshold = 0.1
)
Type inference constants
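The chunking constants above pair with the builder's SetDefaultChunkSize. A minimal sketch, assuming a larger chunk is acceptable for a big input file:

builder := filesql.NewBuilder().
	AddPath("big.csv").
	SetDefaultChunkSize(10 * filesql.DefaultChunkSize) // rows per chunk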
Variables ¶
var (
	// ErrEmptyData indicates that the data source contains no records.
	ErrEmptyData = errors.New("filesql: empty data source")

	// ErrUnsupportedFormat indicates an unsupported file format.
	ErrUnsupportedFormat = errors.New("filesql: unsupported file format")

	// ErrInvalidData indicates malformed or invalid data.
	ErrInvalidData = errors.New("filesql: invalid data format")

	// ErrNoTables indicates no tables found in database.
	ErrNoTables = errors.New("filesql: no tables found in database")

	// ErrFileNotFound indicates file not found.
	ErrFileNotFound = errors.New("filesql: file not found")

	// ErrPermissionDenied indicates permission denied.
	ErrPermissionDenied = errors.New("filesql: permission denied")

	// ErrMemoryLimit indicates memory limit exceeded.
	ErrMemoryLimit = errors.New("filesql: memory limit exceeded")

	// ErrContextCancelled indicates context was cancelled.
	ErrContextCancelled = errors.New("filesql: context cancelled")

	// ErrDuplicateColumn indicates duplicate column names in the data source.
	ErrDuplicateColumn = errors.New("filesql: duplicate column name")

	// ErrDuplicateTable indicates a table with the same name already exists.
	ErrDuplicateTable = errors.New("filesql: duplicate table name")

	// ErrNilInput indicates a required input parameter is nil.
	ErrNilInput = errors.New("filesql: nil input")

	// ErrEmptyPath indicates an empty path was provided.
	ErrEmptyPath = errors.New("filesql: empty path")

	// ErrNoFiles indicates no supported files were found.
	ErrNoFiles = errors.New("filesql: no supported files found")

	// ErrTableNotFound indicates the specified table does not exist.
	ErrTableNotFound = errors.New("filesql: table not found")

	// ErrColumnMismatch indicates record column count doesn't match header.
	ErrColumnMismatch = errors.New("filesql: column count mismatch")

	// ErrDatabaseOperation indicates a database operation failed.
	ErrDatabaseOperation = errors.New("filesql: database operation failed")

	// ErrIOOperation indicates an I/O operation failed.
	ErrIOOperation = errors.New("filesql: I/O operation failed")

	// ErrCompression indicates a compression/decompression operation failed.
	ErrCompression = errors.New("filesql: compression operation failed")

	// ErrParsing indicates a file parsing operation failed.
	ErrParsing = errors.New("filesql: parsing failed")

	// ErrACH indicates an ACH file operation failed.
	ErrACH = errors.New("filesql: ACH operation failed")
)
Sentinel errors for consistent error handling across the package. Use errors.Is() to check for these errors.
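A minimal sketch of checking one of these sentinel errors with errors.Is:

db, err := filesql.Open("missing.csv")
if err != nil {
	if errors.Is(err, filesql.ErrFileNotFound) {
		log.Println("input file does not exist")
		return
	}
	log.Fatal(err)
}
defer db.Close()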
var ErrReadOnly = errors.New("database is read-only: write operations are not allowed")
ErrReadOnly is returned when a write operation is attempted on a read-only database.
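A minimal sketch, assuming the builder flow shown in Advanced Usage and that write errors wrap ErrReadOnly so errors.Is matches:

builder, err := filesql.NewBuilder().AddPath("data.csv").Build(ctx)
if err != nil {
	log.Fatal(err)
}
roDB, err := builder.OpenReadOnly(ctx)
if err != nil {
	log.Fatal(err)
}
defer roDB.Close()

// Any write is rejected on a read-only database.
if _, err := roDB.Exec("DELETE FROM data"); errors.Is(err, filesql.ErrReadOnly) {
	fmt.Println("writes are blocked")
}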
Functions ¶
func ClearACHTableSetRegistry ¶ added in v0.9.0
func ClearACHTableSetRegistry()
ClearACHTableSetRegistry removes all ACH TableSets from the registry. Use this to reset state, typically in tests or when shutting down.
func DumpACH ¶ added in v0.9.0
func DumpACH(ctx context.Context, db *sql.DB, baseTableName, outputPath string) error

DumpACH exports ACH tables from the database back to an ACH file. This function reconstructs the ACH file from the _file_header, _batches, _entries, and _addenda tables that were created when the file was loaded.
The TableSet is automatically retrieved from the internal registry if the ACH file was loaded via Open() or Builder. If you have the TableSet from another source, use DumpACHWithTableSet instead.
Parameters:
- ctx: Context for cancellation
- db: The database containing ACH tables
- baseTableName: The base name used when the ACH file was loaded (e.g., "payment" for payment.ach)
- outputPath: The path where the ACH file should be written
Returns an error if the export fails or if no TableSet is found for the given base table name.
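A minimal sketch, assuming payment.ach was loaded earlier through Open or the builder, so its TableSet is already in the registry:

// Reassemble the ACH file from the payment_* tables.
if err := filesql.DumpACH(ctx, db, "payment", "out/payment.ach"); err != nil {
	log.Fatal(err)
}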
func DumpACHWithTableSet ¶ added in v0.9.0
func DumpACHWithTableSet(ctx context.Context, db *sql.DB, baseTableName, outputPath string, tableSet *achconv.TableSet) error
DumpACHWithTableSet exports ACH tables from the database back to an ACH file using an explicitly provided TableSet.
Use this function when you have the TableSet from a source other than the internal registry, or when you need more control over which TableSet to use.
Parameters:
- ctx: Context for cancellation
- db: The database containing ACH tables
- baseTableName: The base name used when the ACH file was loaded (e.g., "payment" for payment.ach)
- outputPath: The path where the ACH file should be written
- tableSet: The TableSet containing the original ACH structure
Returns an error if the export fails.
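A minimal sketch, where ts is a hypothetical *achconv.TableSet kept by the caller rather than fetched from the registry:

// ts was retained when the ACH file was originally parsed.
if err := filesql.DumpACHWithTableSet(ctx, db, "payment", "out/payment.ach", ts); err != nil {
	log.Fatal(err)
}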
func DumpDatabase ¶
func DumpDatabase(db *sql.DB, outputDir string, opts ...DumpOptions) error
DumpDatabase saves all database tables to files in the specified directory.
Basic usage:
err := filesql.DumpDatabase(db, "./output")
This will save all tables as CSV files in the output directory.
Advanced usage with options:
// Default: Export as CSV files
err := DumpDatabase(db, "./output")

// Export as TSV files with gzip compression
options := NewDumpOptions().
	WithFormat(OutputFormatTSV).
	WithCompression(CompressionGZ)
err = DumpDatabase(db, "./output", options)
Example ¶
ExampleDumpDatabase demonstrates exporting modified data
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"strings"
"time"
"github.com/nao1215/filesql"
)
// createTempTestData creates temporary CSV files for the example
func createTempTestData() string {
tmpDir, err := os.MkdirTemp("", "filesql_example")
if err != nil {
log.Fatal(err)
}
employeesData := `id,name,department_id,salary,hire_date
1,Alice Johnson,1,95000,2020-01-15
2,Bob Smith,1,85000,2019-03-22
3,Charlie Brown,1,80000,2021-06-10
4,David Wilson,1,75000,2022-02-28
5,Eve Davis,2,70000,2020-09-15
6,Frank Miller,2,65000,2021-11-30
7,Grace Lee,3,60000,2019-12-05
8,Henry Taylor,3,55000,2022-04-18`
err = os.WriteFile(filepath.Join(tmpDir, "employees.csv"), []byte(employeesData), 0600)
if err != nil {
log.Fatal(err)
}
departmentsData := `id,name,budget,manager_id
1,Engineering,1000000,1
2,Marketing,800000,5
3,Sales,600000,7
4,HR,400000,9`
err = os.WriteFile(filepath.Join(tmpDir, "departments.csv"), []byte(departmentsData), 0600)
if err != nil {
log.Fatal(err)
}
return tmpDir
}
func main() {
tmpDir := createTempTestData()
defer os.RemoveAll(tmpDir)
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
db, err := filesql.OpenContext(ctx, filepath.Join(tmpDir, "employees.csv"))
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Modify data in memory
_, err = db.ExecContext(ctx, `
UPDATE employees
SET salary = salary * 1.10
WHERE department_id = 1
`)
if err != nil {
log.Fatal(err)
}
// Add a new employee
_, err = db.ExecContext(ctx, `
INSERT INTO employees (id, name, department_id, salary, hire_date)
VALUES (99, 'New Employee', 2, 60000, '2024-01-01')
`)
if err != nil {
log.Fatal(err)
}
// Create output directory
outputDir := filepath.Join(tmpDir, "output")
err = os.MkdirAll(outputDir, 0750)
if err != nil {
log.Fatal(err)
}
// Export modified data
err = filesql.DumpDatabase(db, outputDir)
if err != nil {
log.Fatal(err)
}
// Verify export by reading the exported file
exportedFile := filepath.Join(outputDir, "employees.csv")
if _, err := os.Stat(exportedFile); err != nil {
log.Fatal("Exported file not found:", err)
}
// Count records in exported file
db2, err := filesql.OpenContext(ctx, exportedFile)
if err != nil {
log.Fatal(err)
}
defer db2.Close()
var count int
err = db2.QueryRowContext(ctx, "SELECT COUNT(*) FROM employees").Scan(&count)
if err != nil {
log.Fatal(err)
}
fmt.Printf("Original file: 8 employees\n")
fmt.Printf("Modified and exported: %d employees\n", count)
// Extract just the filename for consistent output (normalize path separators for cross-platform compatibility)
exportPath := strings.Replace(exportedFile, tmpDir, "/tmp/filesql_example*", 1)
exportPath = strings.ReplaceAll(exportPath, "\\", "/") // Convert Windows backslashes to forward slashes
fmt.Printf("Export location: %s\n", exportPath)
}
Output:

Original file: 8 employees
Modified and exported: 9 employees
Export location: /tmp/filesql_example*/output/employees.csv
Example (DataProcessing) ¶
package main
import (
"fmt"
"log"
"os"
"path/filepath"
"github.com/nao1215/filesql"
)
func main() {
tempDir := filepath.Join(os.TempDir(), "filesql_processing_example")
if err := os.MkdirAll(tempDir, 0750); err != nil {
log.Fatal(err)
}
defer os.RemoveAll(tempDir)
// Open CSV file
db, err := filesql.Open(filepath.Join("testdata", "sample.csv"))
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Process data with SQL
_, err = db.Exec(`
UPDATE sample
SET age = age + 1
WHERE name LIKE '%John%'
`)
if err != nil {
log.Fatal(err)
}
// Add aggregated data
_, err = db.Exec(`
INSERT INTO sample (id, name, age, email)
SELECT 999, 'Summary: ' || COUNT(*), AVG(age), '[email protected]'
FROM sample
WHERE id < 999
`)
if err != nil {
log.Fatal(err)
}
// Export processed data in different formats for different use cases
// 1. TSV for spreadsheet import
options := filesql.NewDumpOptions().WithFormat(filesql.OutputFormatTSV)
spreadsheetDir := filepath.Join(tempDir, "for_spreadsheet")
if err := filesql.DumpDatabase(db, spreadsheetDir, options); err != nil {
log.Fatal(err)
}
fmt.Println("Exported TSV for spreadsheet import")
// 2. Compressed CSV for archival
options = filesql.NewDumpOptions().
WithFormat(filesql.OutputFormatCSV).
WithCompression(filesql.CompressionGZ)
archiveDir := filepath.Join(tempDir, "for_archive")
if err := filesql.DumpDatabase(db, archiveDir, options); err != nil {
log.Fatal(err)
}
fmt.Println("Exported compressed CSV for archival")
// 3. LTSV for log analysis
options = filesql.NewDumpOptions().WithFormat(filesql.OutputFormatLTSV)
logDir := filepath.Join(tempDir, "for_logs")
if err := filesql.DumpDatabase(db, logDir, options); err != nil {
log.Fatal(err)
}
fmt.Println("Exported LTSV for log analysis")
// Show what was created
dirs := []string{"for_spreadsheet", "for_archive", "for_logs"}
for _, dir := range dirs {
files, err := filepath.Glob(filepath.Join(tempDir, dir, "*"))
if err != nil {
log.Fatal(err)
}
for _, file := range files {
fmt.Printf("%s: %s\n", dir, filepath.Base(file))
}
}
}
Output:

Exported TSV for spreadsheet import
Exported compressed CSV for archival
Exported LTSV for log analysis
for_spreadsheet: sample.tsv
for_archive: sample.csv.gz
for_logs: sample.ltsv
Example (MultipleFormats) ¶
package main
import (
"fmt"
"log"
"os"
"path/filepath"
"github.com/nao1215/filesql"
)
func main() {
tempDir := filepath.Join(os.TempDir(), "filesql_formats_example")
if err := os.MkdirAll(tempDir, 0750); err != nil {
log.Fatal(err)
}
defer os.RemoveAll(tempDir)
// Open CSV file and modify data
db, err := filesql.Open(filepath.Join("testdata", "sample.csv"))
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Add some data to demonstrate functionality
_, err = db.Exec("INSERT INTO sample (id, name, age, email) VALUES (4, 'Alice Brown', 28, '[email protected]')")
if err != nil {
log.Fatal(err)
}
// Demonstrate different compression options
compressionTypes := []struct {
name string
compression filesql.CompressionType
extension string
}{
{"No compression", filesql.CompressionNone, ""},
{"Gzip compression", filesql.CompressionGZ, ".gz"},
{"XZ compression", filesql.CompressionXZ, ".xz"},
{"Zstd compression", filesql.CompressionZSTD, ".zst"},
}
for _, ct := range compressionTypes {
fmt.Printf("%s:\n", ct.name)
options := filesql.NewDumpOptions().
WithFormat(filesql.OutputFormatCSV).
WithCompression(ct.compression)
outputDir := filepath.Join(tempDir, "compression_"+ct.compression.String())
if err := filesql.DumpDatabase(db, outputDir, options); err != nil {
log.Fatal(err)
}
files, err := filepath.Glob(filepath.Join(outputDir, "*"))
if err != nil {
log.Fatal(err)
}
for _, file := range files {
fmt.Printf(" %s\n", filepath.Base(file))
}
}
}
Output:

No compression:
 sample.csv
Gzip compression:
 sample.csv.gz
XZ compression:
 sample.csv.xz
Zstd compression:
 sample.csv.zst
Example (WithOptions) ¶
package main
import (
"fmt"
"log"
"os"
"path/filepath"
"github.com/nao1215/filesql"
)
func main() {
// Create a temporary directory for output
tempDir := filepath.Join(os.TempDir(), "filesql_dump_example")
if err := os.MkdirAll(tempDir, 0750); err != nil {
log.Fatal(err)
}
defer os.RemoveAll(tempDir)
// Open CSV file
db, err := filesql.Open(filepath.Join("testdata", "sample.csv"))
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Example 1: Default CSV output (no options)
fmt.Println("Example 1: Default CSV output")
csvDir := filepath.Join(tempDir, "csv_output")
if err := filesql.DumpDatabase(db, csvDir); err != nil {
log.Fatal(err)
}
// List output files
files1, err := filepath.Glob(filepath.Join(csvDir, "*"))
if err != nil {
log.Fatal(err)
}
for _, file := range files1 {
fmt.Printf("Created: %s\n", filepath.Base(file))
}
// Example 2: TSV output with gzip compression
fmt.Println("\nExample 2: TSV output with gzip compression")
tsvDir := filepath.Join(tempDir, "tsv_output")
options := filesql.NewDumpOptions().
WithFormat(filesql.OutputFormatTSV).
WithCompression(filesql.CompressionGZ)
if err := filesql.DumpDatabase(db, tsvDir, options); err != nil {
log.Fatal(err)
}
files2, err := filepath.Glob(filepath.Join(tsvDir, "*"))
if err != nil {
log.Fatal(err)
}
for _, file := range files2 {
fmt.Printf("Created: %s\n", filepath.Base(file))
}
// Example 3: LTSV output with zstd compression
fmt.Println("\nExample 3: LTSV output with zstd compression")
ltsvDir := filepath.Join(tempDir, "ltsv_output")
options3 := filesql.NewDumpOptions().
WithFormat(filesql.OutputFormatLTSV).
WithCompression(filesql.CompressionZSTD)
if err := filesql.DumpDatabase(db, ltsvDir, options3); err != nil {
log.Fatal(err)
}
files3, err := filepath.Glob(filepath.Join(ltsvDir, "*"))
if err != nil {
log.Fatal(err)
}
for _, file := range files3 {
fmt.Printf("Created: %s\n", filepath.Base(file))
}
}
Output:

Example 1: Default CSV output
Created: sample.csv

Example 2: TSV output with gzip compression
Created: sample.tsv.gz

Example 3: LTSV output with zstd compression
Created: sample.ltsv.zst
func IsACHBaseTableName ¶ added in v0.9.0
func IsACHBaseTableName(tableName string) (baseName string, isACH bool)

IsACHBaseTableName checks if a table name is an ACH-related table (ends with _file_header, _batches, _entries, _addenda, _iat_batches, _iat_entries, or _iat_addenda).
func Open ¶
func Open(paths ...string) (*sql.DB, error)

Open creates an SQL database from CSV, TSV, LTSV, Parquet, and Excel (XLSX) files.
Quick start:
db, err := filesql.Open("data.csv")
if err != nil {
	return err
}
defer db.Close()

rows, err := db.Query("SELECT * FROM data WHERE age > 25")
Parameters:
- paths: One or more file paths or directories
- Files: "users.csv", "products.tsv", "logs.ltsv"
- Compressed: "data.csv.gz", "archive.tsv.bz2"
- Directories: "/data/" (loads all CSV/TSV/LTSV files recursively)
Table names:
- "users.csv" → table "users"
- "data.tsv.gz" → table "data"
- "/path/to/sales.csv" → table "sales"
- "user-data.csv" → table "user_data" (hyphens become underscores)
- "my file.csv" → table "my_file" (spaces become underscores)
Special characters in file names are automatically sanitized for SQL safety.
Note: Original files are never modified. Changes exist only in memory. To save changes, use DumpDatabase() function.
Example with a complex query:
// Open a single CSV file
db, err := filesql.Open("data/users.csv")
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Complex query with JOINs, aggregation, and window functions
rows, err := db.Query(`
SELECT
u.name,
u.department,
u.salary,
AVG(u.salary) OVER (PARTITION BY u.department) as dept_avg_salary,
RANK() OVER (PARTITION BY u.department ORDER BY u.salary DESC) as salary_rank,
COUNT(*) OVER (PARTITION BY u.department) as dept_size
FROM users u
WHERE u.salary > (
SELECT AVG(salary) * 0.8
FROM users
WHERE department = u.department
)
ORDER BY u.department, u.salary DESC
`)
if err != nil {
log.Fatal(err)
}
defer rows.Close()
// Process results
for rows.Next() {
var name, dept string
var salary, deptAvg float64
var rank, deptSize int
if err := rows.Scan(&name, &dept, &salary, &deptAvg, &rank, &deptSize); err != nil {
log.Fatal(err)
}
fmt.Printf("%s (%s): $%.2f (Rank: %d/%d, Dept Avg: $%.2f)\n",
name, dept, salary, rank, deptSize, deptAvg)
}
Example ¶
ExampleOpen demonstrates how to use filesql.Open() with complex SQL queries. This example shows advanced SQL features including JOINs, window functions, subqueries, and aggregations on CSV data loaded into an in-memory SQLite database.
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"github.com/nao1215/filesql"
)
func main() {
// Create temporary test data files
tmpDir := createTempTestData()
defer os.RemoveAll(tmpDir)
// Open the database with multiple files
db, err := filesql.Open(filepath.Join(tmpDir, "employees.csv"), filepath.Join(tmpDir, "departments.csv"))
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Complex query demonstrating multiple SQL features:
// - JOINs between tables
// - Window functions (RANK, AVG, COUNT)
// - Subqueries with correlated conditions
// - CASE statements
// - Grouping and ordering
query := `
SELECT
e.name,
d.name as department_name,
e.salary,
d.budget,
RANK() OVER (PARTITION BY e.department_id ORDER BY e.salary DESC) as salary_rank_in_dept,
AVG(e.salary) OVER (PARTITION BY e.department_id) as dept_avg_salary,
COUNT(*) OVER (PARTITION BY e.department_id) as dept_employee_count,
CASE
WHEN e.salary > (SELECT AVG(salary) FROM employees WHERE department_id = e.department_id) * 1.2
THEN 'High Performer'
WHEN e.salary < (SELECT AVG(salary) FROM employees WHERE department_id = e.department_id) * 0.8
THEN 'Below Average'
ELSE 'Average'
END as performance_category,
ROUND(e.salary / d.budget * 100, 2) as salary_budget_percentage
FROM employees e
JOIN departments d ON e.department_id = d.id
WHERE e.salary > (
SELECT AVG(salary) * 0.7
FROM employees e2
WHERE e2.department_id = e.department_id
)
AND d.budget > 500000
ORDER BY d.name, e.salary DESC
LIMIT 10
`
rows, err := db.QueryContext(context.Background(), query)
if err != nil {
log.Fatal(err)
}
defer rows.Close()
fmt.Println("Employee Analysis Report:")
fmt.Println("========================")
for rows.Next() {
var name, deptName, perfCategory string
var salary, budget, deptAvgSalary, salaryBudgetPct float64
var salaryRank, deptEmpCount int
err := rows.Scan(&name, &deptName, &salary, &budget, &salaryRank,
&deptAvgSalary, &deptEmpCount, &perfCategory, &salaryBudgetPct)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%-15s | %-12s | $%7.0f | Rank: %d/%d | %s\n",
name, deptName, salary, salaryRank, deptEmpCount, perfCategory)
}
if err = rows.Err(); err != nil {
log.Fatal(err)
}
}
// createTempTestData creates temporary CSV files for the example
func createTempTestData() string {
tmpDir, err := os.MkdirTemp("", "filesql_example")
if err != nil {
log.Fatal(err)
}
employeesData := `id,name,department_id,salary,hire_date
1,Alice Johnson,1,95000,2020-01-15
2,Bob Smith,1,85000,2019-03-22
3,Charlie Brown,1,80000,2021-06-10
4,David Wilson,1,75000,2022-02-28
5,Eve Davis,2,70000,2020-09-15
6,Frank Miller,2,65000,2021-11-30
7,Grace Lee,3,60000,2019-12-05
8,Henry Taylor,3,55000,2022-04-18`
err = os.WriteFile(filepath.Join(tmpDir, "employees.csv"), []byte(employeesData), 0600)
if err != nil {
log.Fatal(err)
}
departmentsData := `id,name,budget,manager_id
1,Engineering,1000000,1
2,Marketing,800000,5
3,Sales,600000,7
4,HR,400000,9`
err = os.WriteFile(filepath.Join(tmpDir, "departments.csv"), []byte(departmentsData), 0600)
if err != nil {
log.Fatal(err)
}
return tmpDir
}
Output:

Employee Analysis Report:
========================
Alice Johnson   | Engineering  | $  95000 | Rank: 1/4 | Average
Bob Smith       | Engineering  | $  85000 | Rank: 2/4 | Average
Charlie Brown   | Engineering  | $  80000 | Rank: 3/4 | Average
David Wilson    | Engineering  | $  75000 | Rank: 4/4 | Average
Eve Davis       | Marketing    | $  70000 | Rank: 1/2 | Average
Frank Miller    | Marketing    | $  65000 | Rank: 2/2 | Average
Grace Lee       | Sales        | $  60000 | Rank: 1/2 | Average
Henry Taylor    | Sales        | $  55000 | Rank: 2/2 | Average
Example (AdvancedSQL) ¶
ExampleOpen_advancedSQL demonstrates advanced SQL features available in SQLite3
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"strings"
"time"
"github.com/nao1215/filesql"
)
func main() {
tmpDir := createAdvancedTestData()
defer os.RemoveAll(tmpDir)
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
db, err := filesql.OpenContext(ctx, tmpDir)
if err != nil {
log.Fatal(err)
}
defer db.Close()
fmt.Println("=== Advanced SQL Features ===")
// Window functions with RANK() and ROW_NUMBER()
fmt.Println("\n1. Window Functions - Employee Rankings by Department:")
rows, err := db.QueryContext(ctx, `
SELECT
e.name,
d.name as department,
e.salary,
RANK() OVER (PARTITION BY e.department_id ORDER BY e.salary DESC) as salary_rank,
ROW_NUMBER() OVER (ORDER BY e.salary DESC) as overall_rank
FROM employees e
JOIN departments d ON e.department_id = d.id
ORDER BY e.department_id, salary_rank
`)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%-15s %-12s %-8s %-10s %s\n", "Name", "Department", "Salary", "Dept Rank", "Overall Rank")
fmt.Println(strings.Repeat("-", 65))
for rows.Next() {
var name, department string
var salary float64
var salaryRank, overallRank int
if err := rows.Scan(&name, &department, &salary, &salaryRank, &overallRank); err != nil {
log.Fatal(err)
}
fmt.Printf("%-15s %-12s $%-7.0f %-10d %d\n", name, department, salary, salaryRank, overallRank)
}
_ = rows.Close() // Ignore close error in test cleanup
// Common Table Expressions (CTE)
fmt.Println("\n2. Common Table Expressions - Department Analysis:")
rows, err = db.QueryContext(ctx, `
WITH dept_stats AS (
SELECT
d.name as department,
COUNT(e.id) as employee_count,
AVG(e.salary) as avg_salary,
MAX(e.salary) as max_salary,
MIN(e.salary) as min_salary
FROM departments d
LEFT JOIN employees e ON d.id = e.department_id
GROUP BY d.id, d.name
),
company_avg AS (
SELECT AVG(salary) as company_avg_salary
FROM employees
)
SELECT
ds.department,
ds.employee_count,
ds.avg_salary,
ca.company_avg_salary,
ds.avg_salary - ca.company_avg_salary as salary_diff,
CASE
WHEN ds.avg_salary > ca.company_avg_salary THEN 'Above Average'
WHEN ds.avg_salary < ca.company_avg_salary THEN 'Below Average'
ELSE 'At Average'
END as comparison
FROM dept_stats ds
CROSS JOIN company_avg ca
WHERE ds.employee_count > 0
ORDER BY ds.avg_salary DESC
`)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%-12s %-5s %-10s %-12s %-10s %s\n", "Department", "Count", "Avg Salary", "Company Avg", "Difference", "Comparison")
fmt.Println(strings.Repeat("-", 75))
for rows.Next() {
var department, comparison string
var employeeCount int
var avgSalary, companyAvg, salaryDiff float64
if err := rows.Scan(&department, &employeeCount, &avgSalary, &companyAvg, &salaryDiff, &comparison); err != nil {
log.Fatal(err)
}
fmt.Printf("%-12s %-5d $%-9.0f $%-11.0f $%-9.0f %s\n",
department, employeeCount, avgSalary, companyAvg, salaryDiff, comparison)
}
_ = rows.Close() // Ignore close error in test cleanup
// JSON operations (if data contains JSON)
fmt.Println("\n3. Text Functions - Name Analysis:")
rows, err = db.QueryContext(ctx, `
SELECT
name,
LENGTH(name) as name_length,
UPPER(SUBSTR(name, 1, 1)) || LOWER(SUBSTR(name, 2)) as formatted_name,
INSTR(name, ' ') as space_position,
CASE
WHEN INSTR(name, ' ') > 0 THEN SUBSTR(name, 1, INSTR(name, ' ') - 1)
ELSE name
END as first_name
FROM employees
WHERE LENGTH(name) > 8
ORDER BY name_length DESC
`)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%-15s %-6s %-15s %-8s %s\n", "Name", "Length", "Formatted", "Space@", "First Name")
fmt.Println(strings.Repeat("-", 60))
for rows.Next() {
var name, formattedName, firstName string
var nameLength, spacePos int
if err := rows.Scan(&name, &nameLength, &formattedName, &spacePos, &firstName); err != nil {
log.Fatal(err)
}
fmt.Printf("%-15s %-6d %-15s %-8d %s\n", name, nameLength, formattedName, spacePos, firstName)
}
_ = rows.Close() // Ignore close error in test cleanup
}
// createAdvancedTestData creates test data for advanced SQL examples
func createAdvancedTestData() string {
tmpDir, err := os.MkdirTemp("", "filesql_advanced_example_*")
if err != nil {
log.Fatal(err)
}
employeesData := `id,name,department_id,salary,hire_date
1,Alice Johnson,1,95000,2023-01-15
2,Bob Smith,2,85000,2023-02-20
3,Charlie Brown,1,80000,2023-03-10
4,David Wilson,1,75000,2023-04-05
5,Eve Davis,2,65000,2023-05-15
6,Frank Miller,3,70000,2023-06-01`
employeesFile := filepath.Join(tmpDir, "employees.csv")
err = os.WriteFile(employeesFile, []byte(employeesData), 0600)
if err != nil {
log.Fatal(err)
}
departmentsData := `id,name,budget
1,Engineering,500000
2,Sales,300000
3,Marketing,200000
4,HR,150000`
departmentsFile := filepath.Join(tmpDir, "departments.csv")
err = os.WriteFile(departmentsFile, []byte(departmentsData), 0600)
if err != nil {
log.Fatal(err)
}
return tmpDir
}
Output:

=== Advanced SQL Features ===

1. Window Functions - Employee Rankings by Department:
Name            Department   Salary   Dept Rank  Overall Rank
-----------------------------------------------------------------
Alice Johnson   Engineering  $95000   1          1
Charlie Brown   Engineering  $80000   2          3
David Wilson    Engineering  $75000   3          4
Bob Smith       Sales        $85000   1          2
Eve Davis       Sales        $65000   2          6
Frank Miller    Marketing    $70000   1          5

2. Common Table Expressions - Department Analysis:
Department   Count Avg Salary Company Avg  Difference Comparison
---------------------------------------------------------------------------
Engineering  3     $83333     $78333       $5000      Above Average
Sales        2     $75000     $78333       $-3333     Below Average
Marketing    1     $70000     $78333       $-8333     Below Average

3. Text Functions - Name Analysis:
Name            Length Formatted       Space@   First Name
------------------------------------------------------------
Alice Johnson   13     Alice johnson   6        Alice
Charlie Brown   13     Charlie brown   8        Charlie
David Wilson    12     David wilson    6        David
Frank Miller    12     Frank miller    6        Frank
Bob Smith       9      Bob smith       4        Bob
Eve Davis       9      Eve davis       4        Eve
Example (CompressionSupport) ¶
ExampleOpen_compressionSupport demonstrates working with compressed files
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"strings"
"time"
"github.com/nao1215/filesql"
)
func main() {
tmpDir := createCompressedTestData()
defer os.RemoveAll(tmpDir)
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
// Open compressed files seamlessly
db, err := filesql.OpenContext(ctx, tmpDir)
if err != nil {
log.Fatal(err)
}
defer db.Close()
fmt.Println("=== Compression Support Demo ===")
fmt.Println("Successfully loaded compressed files:")
// List all tables from compressed files
rows, err := db.QueryContext(ctx, `
SELECT name, sql
FROM sqlite_master
WHERE type='table'
ORDER BY name
`)
if err != nil {
log.Fatal(err)
}
for rows.Next() {
var tableName, createSQL string
if err := rows.Scan(&tableName, &createSQL); err != nil {
log.Fatal(err)
}
// Count records in each table
var count int
countQuery := "SELECT COUNT(*) FROM " + tableName
if err := db.QueryRowContext(ctx, countQuery).Scan(&count); err != nil {
log.Fatal(err)
}
fmt.Printf("- %s: %d records\n", tableName, count)
}
_ = rows.Close() // Ignore close error in test cleanup
// Demonstrate querying across compressed files
fmt.Println("\nCross-file analysis from compressed data:")
analysisRows, err := db.QueryContext(ctx, `
SELECT
'logs' as source_table,
COUNT(*) as total_records,
MIN(timestamp) as earliest,
MAX(timestamp) as latest
FROM logs
UNION ALL
SELECT
'products' as source_table,
COUNT(*) as total_records,
'N/A' as earliest,
'N/A' as latest
FROM products
ORDER BY source_table
`)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%-12s %-8s %-19s %s\n", "Table", "Records", "Earliest", "Latest")
fmt.Println(strings.Repeat("-", 60))
for analysisRows.Next() {
var sourceTable, earliest, latest string
var totalRecords int
if err := analysisRows.Scan(&sourceTable, &totalRecords, &earliest, &latest); err != nil {
log.Fatal(err)
}
fmt.Printf("%-12s %-8d %-19s %s\n", sourceTable, totalRecords, earliest, latest)
}
_ = analysisRows.Close() // Ignore close error in test cleanup
}
// createCompressedTestData creates test data with compressed files
func createCompressedTestData() string {
tmpDir, err := os.MkdirTemp("", "filesql_compressed_example_*")
if err != nil {
log.Fatal(err)
}
logsData := `timestamp,level,message,user_id
2024-01-01 10:00:00,INFO,User login,1001
2024-01-01 11:30:00,INFO,Order created,1002
2024-01-01 12:15:00,ERROR,Payment failed,1003
2024-01-01 13:45:00,INFO,User logout,1001
2024-01-01 14:00:00,INFO,System backup completed,0`
logsFile := filepath.Join(tmpDir, "logs.csv")
err = os.WriteFile(logsFile, []byte(logsData), 0600)
if err != nil {
log.Fatal(err)
}
productsData := `id,name,category,price,in_stock
1,Laptop Pro,Electronics,1299.99,true
2,Office Chair,Furniture,299.99,true
3,Wireless Mouse,Electronics,49.99,false`
productsFile := filepath.Join(tmpDir, "products.csv")
err = os.WriteFile(productsFile, []byte(productsData), 0600)
if err != nil {
log.Fatal(err)
}
return tmpDir
}
Output:

=== Compression Support Demo ===
Successfully loaded compressed files:
- logs: 5 records
- products: 3 records

Cross-file analysis from compressed data:
Table        Records  Earliest            Latest
------------------------------------------------------------
logs         5        2024-01-01 10:00:00 2024-01-01 14:00:00
products     3        N/A                 N/A
Example (Constraints) ¶
ExampleOpen_constraints demonstrates the constraint that modifications don't affect original files
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"github.com/nao1215/filesql"
)
// createTempTestData creates temporary CSV files for the example
func createTempTestData() string {
tmpDir, err := os.MkdirTemp("", "filesql_example")
if err != nil {
log.Fatal(err)
}
employeesData := `id,name,department_id,salary,hire_date
1,Alice Johnson,1,95000,2020-01-15
2,Bob Smith,1,85000,2019-03-22
3,Charlie Brown,1,80000,2021-06-10
4,David Wilson,1,75000,2022-02-28
5,Eve Davis,2,70000,2020-09-15
6,Frank Miller,2,65000,2021-11-30
7,Grace Lee,3,60000,2019-12-05
8,Henry Taylor,3,55000,2022-04-18`
err = os.WriteFile(filepath.Join(tmpDir, "employees.csv"), []byte(employeesData), 0600)
if err != nil {
log.Fatal(err)
}
departmentsData := `id,name,budget,manager_id
1,Engineering,1000000,1
2,Marketing,800000,5
3,Sales,600000,7
4,HR,400000,9`
err = os.WriteFile(filepath.Join(tmpDir, "departments.csv"), []byte(departmentsData), 0600)
if err != nil {
log.Fatal(err)
}
return tmpDir
}
func main() {
tmpDir := createTempTestData()
defer os.RemoveAll(tmpDir)
db, err := filesql.Open(filepath.Join(tmpDir, "employees.csv"))
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Show original data count
var originalCount int
err = db.QueryRowContext(context.Background(), "SELECT COUNT(*) FROM employees").Scan(&originalCount)
if err != nil {
log.Fatal(err)
}
fmt.Printf("Original employee count: %d\n", originalCount)
// Insert new data (only affects in-memory database)
_, err = db.ExecContext(context.Background(), "INSERT INTO employees (id, name, department_id, salary, hire_date) VALUES (99, 'Test User', 1, 50000, '2023-01-01')")
if err != nil {
log.Fatal(err)
}
// Show in-memory count
var memoryCount int
err = db.QueryRowContext(context.Background(), "SELECT COUNT(*) FROM employees").Scan(&memoryCount)
if err != nil {
log.Fatal(err)
}
fmt.Printf("In-memory count after INSERT: %d\n", memoryCount)
// Verify original file is unchanged by reopening
db2, err := filesql.Open(filepath.Join(tmpDir, "employees.csv"))
if err != nil {
log.Fatal(err)
}
defer db2.Close()
var fileCount int
err = db2.QueryRowContext(context.Background(), "SELECT COUNT(*) FROM employees").Scan(&fileCount)
if err != nil {
log.Fatal(err)
}
fmt.Printf("File-based count (unchanged): %d\n", fileCount)
}
Output:

Original employee count: 8
In-memory count after INSERT: 9
File-based count (unchanged): 8
Example (CustomerInsights) ¶
ExampleOpen_customerInsights demonstrates customer behavior analysis
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"strings"
"time"
"github.com/nao1215/filesql"
)
// createSalesTestData creates test data for sales analysis examples
func createSalesTestData() string {
tmpDir, err := os.MkdirTemp("", "filesql_sales_example")
if err != nil {
log.Fatal(err)
}
salesData := `order_id,customer_id,product_name,category,quantity,unit_price,order_date,region
1,101,Laptop Pro,Electronics,2,1299.99,2024-01-15,North
2,102,Wireless Mouse,Electronics,1,29.99,2024-01-16,South
3,103,Office Chair,Furniture,1,299.99,2024-01-17,East
4,101,USB Cable,Electronics,3,12.99,2024-01-18,North
5,104,Standing Desk,Furniture,1,599.99,2024-01-19,West
6,105,Bluetooth Speaker,Electronics,2,79.99,2024-01-20,South
7,106,Coffee Table,Furniture,1,199.99,2024-01-21,East
8,102,Keyboard,Electronics,1,89.99,2024-01-22,South
9,107,Monitor 24inch,Electronics,1,249.99,2024-01-23,North
10,103,Desk Lamp,Furniture,2,39.99,2024-01-24,East`
err = os.WriteFile(filepath.Join(tmpDir, "sales.csv"), []byte(salesData), 0600)
if err != nil {
log.Fatal(err)
}
customersData := `customer_id,name,email,city,registration_date
101,John Doe,[email protected],New York,2023-06-01
102,Jane Smith,[email protected],Los Angeles,2023-07-15
103,Bob Johnson,[email protected],Chicago,2023-08-20
104,Alice Brown,[email protected],Houston,2023-09-10
105,Charlie Wilson,[email protected],Phoenix,2023-10-05
106,Diana Lee,[email protected],Philadelphia,2023-11-12
107,Frank Miller,[email protected],San Antonio,2023-12-03`
err = os.WriteFile(filepath.Join(tmpDir, "customers.csv"), []byte(customersData), 0600)
if err != nil {
log.Fatal(err)
}
return tmpDir
}
func main() {
tmpDir := createSalesTestData()
defer os.RemoveAll(tmpDir)
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
db, err := filesql.OpenContext(ctx, tmpDir)
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Customer lifetime value and behavior analysis
query := `
SELECT
c.name,
c.city,
COUNT(s.order_id) as total_orders,
SUM(s.quantity * s.unit_price) as lifetime_value,
AVG(s.quantity * s.unit_price) as avg_order_value,
MIN(s.order_date) as first_purchase,
MAX(s.order_date) as last_purchase,
julianday(MAX(s.order_date)) - julianday(MIN(s.order_date)) as days_active,
COUNT(DISTINCT s.category) as categories_purchased
FROM customers c
JOIN sales s ON c.customer_id = s.customer_id
GROUP BY c.customer_id, c.name, c.city
HAVING total_orders > 1
ORDER BY lifetime_value DESC
`
rows, err := db.QueryContext(ctx, query)
if err != nil {
log.Fatal(err)
}
defer rows.Close()
fmt.Println("Customer Insights (Multi-Purchase Customers):")
fmt.Println("===========================================")
fmt.Printf("%-12s %-12s %-7s %-10s %-10s %-12s %-12s %-6s %s\n",
"Name", "City", "Orders", "LTV", "Avg Order", "First Buy", "Last Buy", "Days", "Categories")
fmt.Println(strings.Repeat("-", 100))
for rows.Next() {
var name, city, firstPurchase, lastPurchase string
var totalOrders, daysActive, categoriesPurchased int
var lifetimeValue, avgOrderValue float64
err := rows.Scan(&name, &city, &totalOrders, &lifetimeValue, &avgOrderValue,
&firstPurchase, &lastPurchase, &daysActive, &categoriesPurchased)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%-12s %-12s %-7d $%-9.2f $%-9.2f %-12s %-12s %-6d %d\n",
name, city, totalOrders, lifetimeValue, avgOrderValue,
firstPurchase, lastPurchase, daysActive, categoriesPurchased)
}
}
Output:

Customer Insights (Multi-Purchase Customers):
===========================================
Name         City         Orders  LTV        Avg Order  First Buy    Last Buy     Days   Categories
----------------------------------------------------------------------------------------------------
John Doe     New York     2       $2638.95   $1319.47   2024-01-15   2024-01-18   3      1
Bob Johnson  Chicago      2       $379.97    $189.99    2024-01-17   2024-01-24   7      1
Jane Smith   Los Angeles  2       $119.98    $59.99     2024-01-16   2024-01-22   6      1
Example (ErrorHandling) ¶
ExampleOpen_errorHandling demonstrates proper error handling patterns
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"strings"
"time"
"github.com/nao1215/filesql"
)
// createTempTestData creates temporary CSV files for the example
func createTempTestData() string {
tmpDir, err := os.MkdirTemp("", "filesql_example")
if err != nil {
log.Fatal(err)
}
employeesData := `id,name,department_id,salary,hire_date
1,Alice Johnson,1,95000,2020-01-15
2,Bob Smith,1,85000,2019-03-22
3,Charlie Brown,1,80000,2021-06-10
4,David Wilson,1,75000,2022-02-28
5,Eve Davis,2,70000,2020-09-15
6,Frank Miller,2,65000,2021-11-30
7,Grace Lee,3,60000,2019-12-05
8,Henry Taylor,3,55000,2022-04-18`
err = os.WriteFile(filepath.Join(tmpDir, "employees.csv"), []byte(employeesData), 0600)
if err != nil {
log.Fatal(err)
}
departmentsData := `id,name,budget,manager_id
1,Engineering,1000000,1
2,Marketing,800000,5
3,Sales,600000,7
4,HR,400000,9`
err = os.WriteFile(filepath.Join(tmpDir, "departments.csv"), []byte(departmentsData), 0600)
if err != nil {
log.Fatal(err)
}
return tmpDir
}
func main() {
// Example 1: Handling non-existent files gracefully
_, err := filesql.Open("nonexistent.csv")
if err != nil {
fmt.Printf("Expected error for non-existent file: %v\n", err)
}
// Example 2: Context timeout handling
ctx, cancel := context.WithTimeout(context.Background(), 1*time.Nanosecond) // Very short timeout
defer cancel()
// This will likely timeout
tmpDir := createTempTestData()
defer os.RemoveAll(tmpDir)
time.Sleep(10 * time.Millisecond) // Ensure timeout triggers
_, err = filesql.OpenContext(ctx, tmpDir)
if err != nil {
// Extract the core error message (context deadline exceeded)
errMsg := err.Error()
if strings.Contains(errMsg, "context deadline exceeded") {
fmt.Printf("Expected timeout error: %s\n", "context deadline exceeded")
} else {
fmt.Printf("Expected timeout error: %v\n", err)
}
}
// Example 3: Successful operation with proper error checking
ctx2, cancel2 := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel2()
db, err := filesql.OpenContext(ctx2, tmpDir)
if err != nil {
fmt.Printf("Unexpected error: %v\n", err)
return
}
defer db.Close()
// Test query with error handling
rows, err := db.QueryContext(ctx2, "SELECT COUNT(*) FROM employees")
if err != nil {
fmt.Printf("Query error: %v\n", err)
return
}
defer rows.Close()
var count int
if rows.Next() {
if err := rows.Scan(&count); err != nil {
fmt.Printf("Scan error: %v\n", err)
return
}
fmt.Printf("Successfully counted %d employees\n", count)
}
if err := rows.Err(); err != nil {
fmt.Printf("Rows iteration error: %v\n", err)
return
}
}
Output:

Expected error for non-existent file: filesql: file not found: nonexistent.csv
Expected timeout error: context deadline exceeded
Successfully counted 8 employees
Example (FinancialDataAnalysis) ¶
ExampleOpen_financialDataAnalysis demonstrates financial data processing
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"strings"
"time"
"github.com/nao1215/filesql"
)
func main() {
tmpDir := createFinancialTestData()
defer os.RemoveAll(tmpDir)
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
db, err := filesql.OpenContext(ctx, tmpDir)
if err != nil {
log.Fatal(err)
}
defer db.Close()
fmt.Println("=== Financial Data Analysis ===")
// Monthly revenue trend
fmt.Println("\n1. Monthly Revenue Trend:")
rows, err := db.QueryContext(ctx, `
SELECT
strftime('%Y-%m', transaction_date) as month,
COUNT(*) as transaction_count,
SUM(amount) as total_revenue,
AVG(amount) as avg_transaction,
MAX(amount) as largest_transaction
FROM transactions
WHERE type = 'sale'
GROUP BY month
ORDER BY month
`)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%-8s %-6s %-10s %-8s %s\n", "Month", "Count", "Revenue", "Average", "Largest")
fmt.Println(strings.Repeat("-", 50))
for rows.Next() {
var month string
var count int
var revenue, average, largest float64
if err := rows.Scan(&month, &count, &revenue, &average, &largest); err != nil {
log.Fatal(err)
}
fmt.Printf("%-8s %-6d $%-9.2f $%-7.2f $%.2f\n", month, count, revenue, average, largest)
}
_ = rows.Close() // Ignore close error in test cleanup
// Expense category breakdown
fmt.Println("\n2. Expense Categories:")
rows, err = db.QueryContext(ctx, `
SELECT
category,
COUNT(*) as transaction_count,
SUM(ABS(amount)) as total_expense,
ROUND(SUM(ABS(amount)) * 100.0 / (
SELECT SUM(ABS(amount)) FROM transactions WHERE type = 'expense'
), 2) as percentage
FROM transactions
WHERE type = 'expense'
GROUP BY category
ORDER BY total_expense DESC
`)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%-15s %-6s %-12s %-10s\n", "Category", "Count", "Total", "Percentage")
fmt.Println(strings.Repeat("-", 45))
for rows.Next() {
var category string
var count int
var expense, percentage float64
if err := rows.Scan(&category, &count, &expense, &percentage); err != nil {
log.Fatal(err)
}
fmt.Printf("%-15s %-6d $%-11.2f %-10.2f%%\n", category, count, expense, percentage)
}
_ = rows.Close() // Ignore close error in test cleanup
// Cash flow summary
fmt.Println("\n3. Cash Flow Summary:")
var totalIncome, totalExpenses, netIncome float64
err = db.QueryRowContext(ctx, `
SELECT
SUM(CASE WHEN type = 'sale' THEN amount ELSE 0 END) as total_income,
SUM(CASE WHEN type = 'expense' THEN ABS(amount) ELSE 0 END) as total_expenses,
SUM(CASE WHEN type = 'sale' THEN amount ELSE -ABS(amount) END) as net_income
FROM transactions
`).Scan(&totalIncome, &totalExpenses, &netIncome)
if err != nil {
log.Fatal(err)
}
fmt.Printf("Total Income: $%.2f\n", totalIncome)
fmt.Printf("Total Expenses: $%.2f\n", totalExpenses)
fmt.Printf("Net Income: $%.2f\n", netIncome)
fmt.Printf("Profit Margin: %.2f%%\n", (netIncome/totalIncome)*100)
}
// createFinancialTestData creates test data for financial analysis examples
func createFinancialTestData() string {
tmpDir, err := os.MkdirTemp("", "filesql_financial_example_*")
if err != nil {
log.Fatal(err)
}
transactionData := `transaction_id,transaction_date,type,category,amount,description
1,2024-01-15,sale,Product Sales,1500.00,Sale of premium product
2,2024-01-20,sale,Product Sales,850.00,Sale of standard product
3,2024-01-25,sale,Service,1200.00,Consulting service
4,2024-01-10,expense,Office Supplies,-150.00,Office equipment purchase
5,2024-01-18,expense,Marketing,-250.00,Social media advertising
6,2024-02-05,sale,Product Sales,1200.00,Sale of premium product
7,2024-02-15,sale,Service,1000.00,Training service
8,2024-02-08,expense,Office Supplies,-200.00,Stationery purchase`
transactionFile := filepath.Join(tmpDir, "transactions.csv")
err = os.WriteFile(transactionFile, []byte(transactionData), 0600)
if err != nil {
log.Fatal(err)
}
return tmpDir
}
Output:

=== Financial Data Analysis ===

1. Monthly Revenue Trend:
Month    Count  Revenue    Average  Largest
--------------------------------------------------
2024-01  3      $3550.00   $1183.33 $1500.00
2024-02  2      $2200.00   $1100.00 $1200.00

2. Expense Categories:
Category        Count  Total        Percentage
---------------------------------------------
Office Supplies 2      $350.00      58.33     %
Marketing       1      $250.00      41.67     %

3. Cash Flow Summary:
Total Income: $5750.00
Total Expenses: $600.00
Net Income: $5150.00
Profit Margin: 89.57%
Example (MultipleFiles) ¶
ExampleOpen_multipleFiles demonstrates opening multiple files and directories
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"github.com/nao1215/filesql"
)
// createTempTestData creates temporary CSV files for the example
func createTempTestData() string {
tmpDir, err := os.MkdirTemp("", "filesql_example")
if err != nil {
log.Fatal(err)
}
employeesData := `id,name,department_id,salary,hire_date
1,Alice Johnson,1,95000,2020-01-15
2,Bob Smith,1,85000,2019-03-22
3,Charlie Brown,1,80000,2021-06-10
4,David Wilson,1,75000,2022-02-28
5,Eve Davis,2,70000,2020-09-15
6,Frank Miller,2,65000,2021-11-30
7,Grace Lee,3,60000,2019-12-05
8,Henry Taylor,3,55000,2022-04-18`
err = os.WriteFile(filepath.Join(tmpDir, "employees.csv"), []byte(employeesData), 0600)
if err != nil {
log.Fatal(err)
}
departmentsData := `id,name,budget,manager_id
1,Engineering,1000000,1
2,Marketing,800000,5
3,Sales,600000,7
4,HR,400000,9`
err = os.WriteFile(filepath.Join(tmpDir, "departments.csv"), []byte(departmentsData), 0600)
if err != nil {
log.Fatal(err)
}
return tmpDir
}
func main() {
tmpDir := createTempTestData()
defer os.RemoveAll(tmpDir)
// Open database with multiple paths (files and directories)
db, err := filesql.Open(tmpDir)
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Query to show all available tables
rows, err := db.QueryContext(context.Background(), "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")
if err != nil {
log.Fatal(err)
}
defer rows.Close()
fmt.Println("Available tables:")
for rows.Next() {
var tableName string
if err := rows.Scan(&tableName); err != nil {
log.Fatal(err)
}
fmt.Printf("- %s\n", tableName)
}
}
Output:

Available tables:
- departments
- employees
Example (PerformanceOptimization) ¶
ExampleOpen_performanceOptimization demonstrates techniques for handling large datasets efficiently
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"time"
"github.com/nao1215/filesql"
)
func main() {
tmpDir := createLargeTestData()
defer os.RemoveAll(tmpDir)
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
db, err := filesql.OpenContext(ctx, tmpDir)
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Technique 1: Use LIMIT and OFFSET for pagination
fmt.Println("=== Performance Optimization Techniques ===")
fmt.Println("\n1. Pagination with LIMIT and OFFSET:")
pageSize := 3
offset := 0
for page := 1; page <= 2; page++ {
rows, err := db.QueryContext(ctx, `
SELECT customer_id, name, total_orders
FROM customer_summary
ORDER BY total_orders DESC
LIMIT ? OFFSET ?
`, pageSize, offset)
if err != nil {
log.Fatal(err)
}
fmt.Printf("Page %d:\n", page)
for rows.Next() {
var customerID int
var name string
var totalOrders int
if err := rows.Scan(&customerID, &name, &totalOrders); err != nil {
log.Fatal(err)
}
fmt.Printf(" - %s (ID: %d, Orders: %d)\n", name, customerID, totalOrders)
}
_ = rows.Close() // Ignore close error in test cleanup
offset += pageSize
}
// Technique 2: Use indexes by querying with WHERE clauses on sorted columns
fmt.Println("\n2. Efficient filtering with indexes:")
rows, err := db.QueryContext(ctx, `
SELECT name, email, registration_date
FROM customer_summary
WHERE total_spent > 1000
ORDER BY total_spent DESC
`)
if err != nil {
log.Fatal(err)
}
defer rows.Close()
fmt.Println("High-value customers:")
for rows.Next() {
var name, email, regDate string
if err := rows.Scan(&name, &email, &regDate); err != nil {
log.Fatal(err)
}
fmt.Printf(" - %s (%s) - Registered: %s\n", name, email, regDate)
}
// Technique 3: Aggregate queries for summary statistics
fmt.Println("\n3. Summary statistics:")
var totalCustomers int
var avgOrders, totalRevenue, avgSpent float64
err = db.QueryRowContext(ctx, `
SELECT
COUNT(*) as total_customers,
AVG(total_orders) as avg_orders,
SUM(total_spent) as total_revenue,
AVG(total_spent) as avg_spent
FROM customer_summary
`).Scan(&totalCustomers, &avgOrders, &totalRevenue, &avgSpent)
if err != nil {
log.Fatal(err)
}
fmt.Printf("Total customers: %d\n", totalCustomers)
fmt.Printf("Average orders per customer: %.1f\n", avgOrders)
fmt.Printf("Total revenue: $%.2f\n", totalRevenue)
fmt.Printf("Average customer value: $%.2f\n", avgSpent)
}
// createLargeTestData creates test data for performance optimization examples
func createLargeTestData() string {
tmpDir, err := os.MkdirTemp("", "filesql_large_example_*")
if err != nil {
log.Fatal(err)
}
customerData := `customer_id,name,email,registration_date,total_orders,total_spent
1001,Premium Customer A,[email protected],2023-01-15,15,1500.00
1002,Premium Customer B,[email protected],2023-02-20,12,1200.00
1003,Premium Customer C,[email protected],2023-03-10,10,1000.00
1004,Regular Customer D,[email protected],2023-04-05,8,800.00
1005,Regular Customer E,[email protected],2023-05-15,6,600.00
1006,Regular Customer F,[email protected],2023-06-20,5,500.00
1007,Budget Customer G,[email protected],2023-07-10,3,300.00
1008,Budget Customer H,[email protected],2023-08-25,2,200.00
1009,New Customer I,[email protected],2023-09-30,1,100.00
1010,New Customer J,[email protected],2023-10-15,1,100.00`
customerFile := filepath.Join(tmpDir, "customer_summary.csv")
err = os.WriteFile(customerFile, []byte(customerData), 0600)
if err != nil {
log.Fatal(err)
}
return tmpDir
}
Output:

=== Performance Optimization Techniques ===

1. Pagination with LIMIT and OFFSET:
Page 1:
 - Premium Customer A (ID: 1001, Orders: 15)
 - Premium Customer B (ID: 1002, Orders: 12)
 - Premium Customer C (ID: 1003, Orders: 10)
Page 2:
 - Regular Customer D (ID: 1004, Orders: 8)
 - Regular Customer E (ID: 1005, Orders: 6)
 - Regular Customer F (ID: 1006, Orders: 5)

2. Efficient filtering with indexes:
High-value customers:
 - Premium Customer A ([email protected]) - Registered: 2023-01-15
 - Premium Customer B ([email protected]) - Registered: 2023-02-20

3. Summary statistics:
Total customers: 10
Average orders per customer: 6.3
Total revenue: $6300.00
Average customer value: $630.00
Example (SalesAnalysis) ¶
ExampleOpen_salesAnalysis demonstrates practical sales data analysis
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"strings"
"time"
"github.com/nao1215/filesql"
)
// createSalesTestData creates test data for sales analysis examples
func createSalesTestData() string {
tmpDir, err := os.MkdirTemp("", "filesql_sales_example")
if err != nil {
log.Fatal(err)
}
salesData := `order_id,customer_id,product_name,category,quantity,unit_price,order_date,region
1,101,Laptop Pro,Electronics,2,1299.99,2024-01-15,North
2,102,Wireless Mouse,Electronics,1,29.99,2024-01-16,South
3,103,Office Chair,Furniture,1,299.99,2024-01-17,East
4,101,USB Cable,Electronics,3,12.99,2024-01-18,North
5,104,Standing Desk,Furniture,1,599.99,2024-01-19,West
6,105,Bluetooth Speaker,Electronics,2,79.99,2024-01-20,South
7,106,Coffee Table,Furniture,1,199.99,2024-01-21,East
8,102,Keyboard,Electronics,1,89.99,2024-01-22,South
9,107,Monitor 24inch,Electronics,1,249.99,2024-01-23,North
10,103,Desk Lamp,Furniture,2,39.99,2024-01-24,East`
err = os.WriteFile(filepath.Join(tmpDir, "sales.csv"), []byte(salesData), 0600)
if err != nil {
log.Fatal(err)
}
customersData := `customer_id,name,email,city,registration_date
101,John Doe,[email protected],New York,2023-06-01
102,Jane Smith,[email protected],Los Angeles,2023-07-15
103,Bob Johnson,[email protected],Chicago,2023-08-20
104,Alice Brown,[email protected],Houston,2023-09-10
105,Charlie Wilson,[email protected],Phoenix,2023-10-05
106,Diana Lee,[email protected],Philadelphia,2023-11-12
107,Frank Miller,[email protected],San Antonio,2023-12-03`
err = os.WriteFile(filepath.Join(tmpDir, "customers.csv"), []byte(customersData), 0600)
if err != nil {
log.Fatal(err)
}
return tmpDir
}
func main() {
tmpDir := createSalesTestData()
defer os.RemoveAll(tmpDir)
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
db, err := filesql.OpenContext(ctx, tmpDir)
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Sales summary by category and region
query := `
SELECT
category,
region,
COUNT(*) as order_count,
SUM(quantity * unit_price) as total_revenue,
AVG(quantity * unit_price) as avg_order_value,
MIN(order_date) as first_order,
MAX(order_date) as last_order
FROM sales
GROUP BY category, region
ORDER BY total_revenue DESC
`
rows, err := db.QueryContext(ctx, query)
if err != nil {
log.Fatal(err)
}
defer rows.Close()
fmt.Println("Sales Analysis by Category and Region:")
fmt.Println("=====================================")
fmt.Printf("%-12s %-8s %-6s %-10s %-12s %-12s %s\n",
"Category", "Region", "Orders", "Revenue", "Avg Order", "First Order", "Last Order")
fmt.Println(strings.Repeat("-", 80))
for rows.Next() {
var category, region, firstOrder, lastOrder string
var orderCount int
var totalRevenue, avgOrderValue float64
err := rows.Scan(&category, &region, &orderCount, &totalRevenue, &avgOrderValue, &firstOrder, &lastOrder)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%-12s %-8s %-6d $%-9.2f $%-11.2f %-12s %s\n",
category, region, orderCount, totalRevenue, avgOrderValue, firstOrder, lastOrder)
}
}
Output:

Sales Analysis by Category and Region:
=====================================
Category     Region   Orders Revenue    Avg Order    First Order  Last Order
--------------------------------------------------------------------------------
Electronics  North    3      $2888.94   $962.98      2024-01-15   2024-01-23
Furniture    West     1      $599.99    $599.99      2024-01-19   2024-01-19
Furniture    East     3      $579.96    $193.32      2024-01-17   2024-01-24
Electronics  South    3      $279.96    $93.32       2024-01-16   2024-01-22
Example (WebLogAnalysis) ¶
ExampleOpen_webLogAnalysis demonstrates analyzing web server logs
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"strings"
"time"
"github.com/nao1215/filesql"
)
func main() {
tmpDir := createWebLogTestData()
defer os.RemoveAll(tmpDir)
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
db, err := filesql.OpenContext(ctx, tmpDir)
if err != nil {
log.Fatal(err)
}
defer db.Close()
fmt.Println("=== Web Log Analysis ===")
// Top pages by hits
fmt.Println("\n1. Top Pages by Hits:")
rows, err := db.QueryContext(ctx, `
SELECT
path,
COUNT(*) as hits,
COUNT(DISTINCT ip_address) as unique_visitors
FROM access_logs
WHERE status_code = 200
GROUP BY path
ORDER BY hits DESC
LIMIT 5
`)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%-20s %-6s %s\n", "Path", "Hits", "Unique")
fmt.Println(strings.Repeat("-", 35))
for rows.Next() {
var path string
var hits, unique int
if err := rows.Scan(&path, &hits, &unique); err != nil {
log.Fatal(err)
}
fmt.Printf("%-20s %-6d %d\n", path, hits, unique)
}
_ = rows.Close() // Ignore close error in test cleanup
// Error analysis
fmt.Println("\n2. Error Analysis:")
rows, err = db.QueryContext(ctx, `
SELECT
status_code,
COUNT(*) as error_count,
ROUND(COUNT(*) * 100.0 / (SELECT COUNT(*) FROM access_logs), 2) as percentage
FROM access_logs
WHERE status_code >= 400
GROUP BY status_code
ORDER BY error_count DESC
`)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%-12s %-6s %-10s\n", "Status Code", "Count", "Percentage")
fmt.Println(strings.Repeat("-", 30))
for rows.Next() {
var statusCode, errorCount int
var percentage float64
if err := rows.Scan(&statusCode, &errorCount, &percentage); err != nil {
log.Fatal(err)
}
fmt.Printf("%-12d %-6d %-10.2f%%\n", statusCode, errorCount, percentage)
}
_ = rows.Close() // Ignore close error in test cleanup
// Hourly traffic pattern
fmt.Println("\n3. Traffic by Hour:")
rows, err = db.QueryContext(ctx, `
SELECT
CAST(strftime('%H', timestamp) AS INTEGER) as hour,
COUNT(*) as requests,
AVG(response_time) as avg_response_time
FROM access_logs
GROUP BY hour
ORDER BY hour
`)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%-5s %-9s %-12s\n", "Hour", "Requests", "Avg Response")
fmt.Println(strings.Repeat("-", 28))
for rows.Next() {
var hour, requests int
var avgResponseTime float64
if err := rows.Scan(&hour, &requests, &avgResponseTime); err != nil {
log.Fatal(err)
}
fmt.Printf("%-5d %-9d %-12.0fms\n", hour, requests, avgResponseTime)
}
_ = rows.Close() // Ignore close error in test cleanup
}
// createWebLogTestData creates test data for web log analysis examples
func createWebLogTestData() string {
tmpDir, err := os.MkdirTemp("", "filesql_weblog_example_*")
if err != nil {
log.Fatal(err)
}
accessLogsData := `timestamp,ip_address,method,path,status_code,response_time,user_agent
2024-01-01 09:15:30,192.168.1.100,GET,/,200,150,Mozilla/5.0
2024-01-01 09:30:45,192.168.1.101,GET,/products,200,200,Mozilla/5.0
2024-01-01 10:05:15,192.168.1.100,GET,/,200,120,Mozilla/5.0
2024-01-01 10:20:30,192.168.1.102,GET,/about,200,180,Mozilla/5.0
2024-01-01 10:35:45,192.168.1.101,GET,/products,200,160,Mozilla/5.0
2024-01-01 11:10:15,192.168.1.103,GET,/contact,200,140,Mozilla/5.0
2024-01-01 11:25:30,192.168.1.100,GET,/,200,200,Mozilla/5.0
2024-01-01 11:40:45,192.168.1.104,GET,/missing,404,50,Mozilla/5.0
2024-01-01 14:15:30,192.168.1.105,GET,/notfound,404,100,Mozilla/5.0`
accessLogsFile := filepath.Join(tmpDir, "access_logs.csv")
err = os.WriteFile(accessLogsFile, []byte(accessLogsData), 0600)
if err != nil {
log.Fatal(err)
}
return tmpDir
}
Output:

=== Web Log Analysis ===

1. Top Pages by Hits:
Path                 Hits   Unique
-----------------------------------
/                    3      1
/products            2      1
/contact             1      1
/about               1      1

2. Error Analysis:
Status Code  Count  Percentage
------------------------------
404          2      22.22     %

3. Traffic by Hour:
Hour  Requests  Avg Response
----------------------------
9     2         175         ms
10    3         153         ms
11    3         130         ms
14    1         100         ms
func OpenContext ¶ added in v0.0.2
OpenContext is like Open but accepts a context for cancellation and timeout control.
Use this when you need to:
- Set timeouts for loading large files
- Support cancellation in server applications
- Integrate with context-aware code
Example with timeout:
// Open a single CSV file with timeout
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
db, err := filesql.OpenContext(ctx, "data/users.csv")
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Complex query with JOINs, aggregation, and window functions
rows, err := db.QueryContext(ctx, `
SELECT
u.name,
u.department,
u.salary,
AVG(u.salary) OVER (PARTITION BY u.department) as dept_avg_salary,
RANK() OVER (PARTITION BY u.department ORDER BY u.salary DESC) as salary_rank,
COUNT(*) OVER (PARTITION BY u.department) as dept_size
FROM users u
WHERE u.salary > (
SELECT AVG(salary) * 0.8
FROM users
WHERE department = u.department
)
ORDER BY u.department, u.salary DESC
`)
if err != nil {
log.Fatal(err)
}
defer rows.Close()
// Process results
for rows.Next() {
var name, dept string
var salary, deptAvg float64
var rank, deptSize int
if err := rows.Scan(&name, &dept, &salary, &deptAvg, &rank, &deptSize); err != nil {
log.Fatal(err)
}
fmt.Printf("%s (%s): $%.2f (Rank: %d/%d, Dept Avg: $%.2f)\n",
name, dept, salary, rank, deptSize, deptAvg)
}
OpenContext creates an SQL database from CSV, TSV, or LTSV files with context support.
This function is similar to Open() but allows cancellation and timeout control through context. Table names are automatically generated from file names with special characters sanitized for SQL safety (e.g., hyphens become underscores: "data-file.csv" → "data_file").
Example:
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
db, err := filesql.OpenContext(ctx, "large-dataset.csv")
if err != nil {
return err
}
defer db.Close()
Parameters:
- ctx: Context for cancellation and timeout control
- paths: One or more file paths or directories to load
Example ¶
ExampleOpenContext demonstrates opening files with context support for timeout and cancellation
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"time"
"github.com/nao1215/filesql"
)
// createTempTestData creates temporary CSV files for the example
func createTempTestData() string {
tmpDir, err := os.MkdirTemp("", "filesql_example")
if err != nil {
log.Fatal(err)
}
employeesData := `id,name,department_id,salary,hire_date
1,Alice Johnson,1,95000,2020-01-15
2,Bob Smith,1,85000,2019-03-22
3,Charlie Brown,1,80000,2021-06-10
4,David Wilson,1,75000,2022-02-28
5,Eve Davis,2,70000,2020-09-15
6,Frank Miller,2,65000,2021-11-30
7,Grace Lee,3,60000,2019-12-05
8,Henry Taylor,3,55000,2022-04-18`
err = os.WriteFile(filepath.Join(tmpDir, "employees.csv"), []byte(employeesData), 0600)
if err != nil {
log.Fatal(err)
}
departmentsData := `id,name,budget,manager_id
1,Engineering,1000000,1
2,Marketing,800000,5
3,Sales,600000,7
4,HR,400000,9`
err = os.WriteFile(filepath.Join(tmpDir, "departments.csv"), []byte(departmentsData), 0600)
if err != nil {
log.Fatal(err)
}
return tmpDir
}
func main() {
tmpDir := createTempTestData()
defer os.RemoveAll(tmpDir)
// Create a context with timeout
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
// Open database with context
db, err := filesql.OpenContext(ctx, filepath.Join(tmpDir, "employees.csv"))
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Query with context support
rows, err := db.QueryContext(ctx, `
SELECT name, salary
FROM employees
WHERE salary > 70000
ORDER BY salary DESC
`)
if err != nil {
log.Fatal(err)
}
defer rows.Close()
fmt.Println("High earners (>$70,000):")
for rows.Next() {
var name string
var salary float64
if err := rows.Scan(&name, &salary); err != nil {
log.Fatal(err)
}
fmt.Printf("- %s: $%.0f\n", name, salary)
}
}
Output:

High earners (>$70,000):
- Alice Johnson: $95000
- Bob Smith: $85000
- Charlie Brown: $80000
- David Wilson: $75000
func UnregisterACHTableSet ¶ added in v0.9.0
func UnregisterACHTableSet(baseTableName string)
UnregisterACHTableSet removes an ACH TableSet from the registry. Call this when you're done with an ACH file to free memory. This is automatically called by the auto-save connection when using EnableAutoSave.
Example:
db, _ := filesql.Open("payment.ach")
// ... work with the database ...
db.Close()
filesql.UnregisterACHTableSet("payment") // Free the TableSet memory
Types ¶
type ACHTableInfo ¶ added in v0.9.0
type ACHTableInfo struct {
// BaseName is the base table name derived from the ACH filename.
// For example, if the file is "payment.ach", BaseName is "payment".
BaseName string
}
ACHTableInfo represents information about ACH tables created from an ACH file. It provides methods to get the complete table names for each ACH table type.
func GetACHTableInfos ¶ added in v0.9.0
func GetACHTableInfos() []ACHTableInfo
GetACHTableInfos returns ACHTableInfo for all registered ACH files. Each ACHTableInfo provides methods to get complete table names.
Example:
db, _ := filesql.Open("payment.ach", "refund.ach")
infos := filesql.GetACHTableInfos()
for _, info := range infos {
fmt.Println(info.EntriesTable()) // "payment_entries", "refund_entries"
}
func (ACHTableInfo) AddendaTable ¶ added in v0.9.0
func (a ACHTableInfo) AddendaTable() string
AddendaTable returns the complete table name for the addenda table. Example: "payment" -> "payment_addenda"
func (ACHTableInfo) AllTableNames ¶ added in v0.9.0
func (a ACHTableInfo) AllTableNames() []string
AllTableNames returns all possible ACH table names for this base name. This includes both standard and IAT tables.
func (ACHTableInfo) BatchesTable ¶ added in v0.9.0
func (a ACHTableInfo) BatchesTable() string
BatchesTable returns the complete table name for the batches table. Example: "payment" -> "payment_batches"
func (ACHTableInfo) EntriesTable ¶ added in v0.9.0
func (a ACHTableInfo) EntriesTable() string
EntriesTable returns the complete table name for the entries table. Example: "payment" -> "payment_entries"
func (ACHTableInfo) FileHeaderTable ¶ added in v0.9.0
func (a ACHTableInfo) FileHeaderTable() string
FileHeaderTable returns the complete table name for the file header table. Example: "payment" -> "payment_file_header"
func (ACHTableInfo) IATAddendaTable ¶ added in v0.9.0
func (a ACHTableInfo) IATAddendaTable() string
IATAddendaTable returns the complete table name for the IAT addenda table. Example: "payment" -> "payment_iat_addenda"
func (ACHTableInfo) IATBatchesTable ¶ added in v0.9.0
func (a ACHTableInfo) IATBatchesTable() string
IATBatchesTable returns the complete table name for the IAT batches table. Example: "payment" -> "payment_iat_batches"
func (ACHTableInfo) IATEntriesTable ¶ added in v0.9.0
func (a ACHTableInfo) IATEntriesTable() string
IATEntriesTable returns the complete table name for the IAT entries table. Example: "payment" -> "payment_iat_entries"
type ChunkSize ¶ added in v0.4.2
type ChunkSize int
ChunkSize represents a chunk size with validation
func NewChunkSize ¶ added in v0.4.2
NewChunkSize creates a new ChunkSize with validation
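A minimal sketch of constructing a validated chunk size. The documentation above does not show the signature, so the (ChunkSize, error) return shape below is an assumption based on the phrase "with validation":

    // Hypothetical usage; the error return is assumed, not confirmed above.
    chunkSize, err := filesql.NewChunkSize(5000)
    if err != nil {
        log.Fatal(err) // e.g. a zero or negative size would presumably be rejected
    }
    _ = chunkSize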
type CompressionFactory ¶ added in v0.4.4
type CompressionFactory struct{}
CompressionFactory provides factory methods for compression handling
func NewCompressionFactory ¶ added in v0.4.4
func NewCompressionFactory() *CompressionFactory
NewCompressionFactory creates a new compression factory
func (*CompressionFactory) CreateHandlerForFile ¶ added in v0.4.4
func (f *CompressionFactory) CreateHandlerForFile(path string) CompressionHandler
CreateHandlerForFile creates an appropriate compression handler for a given file path
func (*CompressionFactory) CreateReaderForFile ¶ added in v0.4.4
CreateReaderForFile opens a file and returns a reader that handles decompression
func (*CompressionFactory) CreateWriterForFile ¶ added in v0.4.4
func (f *CompressionFactory) CreateWriterForFile(path string, compressionType CompressionType) (io.Writer, func() error, error)
CreateWriterForFile creates a file and returns a writer that handles compression
func (*CompressionFactory) DetectCompressionType ¶ added in v0.4.4
func (f *CompressionFactory) DetectCompressionType(path string) CompressionType
DetectCompressionType detects the compression type from a file path
func (*CompressionFactory) GetBaseFileType ¶ added in v0.4.4
func (f *CompressionFactory) GetBaseFileType(path string) FileType
GetBaseFileType determines the base file type after removing compression extensions
func (*CompressionFactory) RemoveCompressionExtension ¶ added in v0.4.4
func (f *CompressionFactory) RemoveCompressionExtension(path string) string
RemoveCompressionExtension removes the compression extension from a file path if present
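Taken together, these helpers let callers inspect a compressed path before deciding how to load it. A small sketch using only the methods documented above; the path is illustrative and the expected values follow from the documented extension-based detection:

    factory := filesql.NewCompressionFactory()
    path := "logs/access.csv.gz" // illustrative path

    // Detect the compression layer from the extension.
    fmt.Println(factory.DetectCompressionType(path)) // gz

    // Strip the compression layer to recover the logical file name and type.
    fmt.Println(factory.RemoveCompressionExtension(path))             // logs/access.csv
    fmt.Println(factory.GetBaseFileType(path) == filesql.FileTypeCSV) // true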
type CompressionHandler ¶ added in v0.4.4
type CompressionHandler interface {
// CreateReader wraps an io.Reader with a decompression reader if needed
CreateReader(reader io.Reader) (io.Reader, func() error, error)
// CreateWriter wraps an io.Writer with a compression writer if needed
CreateWriter(writer io.Writer) (io.Writer, func() error, error)
// Extension returns the file extension for this compression type (e.g., ".gz")
Extension() string
}
CompressionHandler defines the interface for handling file compression/decompression
func NewCompressionHandler ¶ added in v0.4.4
func NewCompressionHandler(compressionType CompressionType) CompressionHandler
NewCompressionHandler creates a new compression handler for the given compression type
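As a sketch of the CompressionHandler contract, the following wraps an in-memory gzip stream for reading; any io.Reader producing gzip data would work the same way:

    // Prepare gzip-compressed CSV bytes to stand in for a real stream.
    var buf bytes.Buffer
    gw := gzip.NewWriter(&buf)
    _, _ = gw.Write([]byte("id,name\n1,Alice\n"))
    _ = gw.Close()

    // Wrap the stream with a decompressing reader via the handler.
    handler := filesql.NewCompressionHandler(filesql.CompressionGZ)
    reader, closeFn, err := handler.CreateReader(&buf)
    if err != nil {
        log.Fatal(err)
    }
    defer closeFn()

    decompressed, err := io.ReadAll(reader)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Print(string(decompressed)) // prints the original CSV content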
type CompressionType ¶ added in v0.0.4
type CompressionType int
CompressionType represents the compression type
const (
    // CompressionNone represents no compression
    CompressionNone CompressionType = iota
    // CompressionGZ represents gzip compression
    CompressionGZ
    // CompressionBZ2 represents bzip2 compression
    CompressionBZ2
    // CompressionXZ represents xz compression
    CompressionXZ
    // CompressionZSTD represents zstd compression
    CompressionZSTD
    // CompressionZLIB represents zlib compression
    CompressionZLIB
    // CompressionSNAPPY represents snappy compression
    CompressionSNAPPY
    // CompressionS2 represents s2 compression
    CompressionS2
    // CompressionLZ4 represents lz4 compression
    CompressionLZ4
)
func (CompressionType) Extension ¶ added in v0.2.0
func (c CompressionType) Extension() string
Extension returns the file extension for the compression type
func (CompressionType) String ¶ added in v0.2.0
func (c CompressionType) String() string
String returns the string representation of CompressionType
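For instance, the zstd variant reports its name and extension as follows (consistent with the DumpOptions.FileExtension example further below):

    fmt.Println(filesql.CompressionZSTD.String())    // zstd
    fmt.Println(filesql.CompressionZSTD.Extension()) // .zst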
type ContextLogger ¶ added in v0.10.0
type ContextLogger interface {
Logger
// DebugContext logs a debug message with context
DebugContext(ctx context.Context, msg string, args ...any)
// InfoContext logs an info message with context
InfoContext(ctx context.Context, msg string, args ...any)
// WarnContext logs a warning message with context
WarnContext(ctx context.Context, msg string, args ...any)
// ErrorContext logs an error message with context
ErrorContext(ctx context.Context, msg string, args ...any)
}
ContextLogger extends Logger with context-aware logging methods
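Because the method set mirrors log/slog, one way to satisfy ContextLogger is a thin wrapper over *slog.Logger. The adapter below is a sketch, not part of the package (note that filesql also provides NewSlogAdapter, shown under WithLogger below):

    // slogContextLogger is a hypothetical adapter around *slog.Logger.
    type slogContextLogger struct{ l *slog.Logger }

    func (s slogContextLogger) Debug(msg string, args ...any) { s.l.Debug(msg, args...) }
    func (s slogContextLogger) Info(msg string, args ...any)  { s.l.Info(msg, args...) }
    func (s slogContextLogger) Warn(msg string, args ...any)  { s.l.Warn(msg, args...) }
    func (s slogContextLogger) Error(msg string, args ...any) { s.l.Error(msg, args...) }

    // With forwards key-value pairs to the underlying slog.Logger.
    func (s slogContextLogger) With(args ...any) filesql.Logger {
        return slogContextLogger{l: s.l.With(args...)}
    }

    func (s slogContextLogger) DebugContext(ctx context.Context, msg string, args ...any) {
        s.l.DebugContext(ctx, msg, args...)
    }
    func (s slogContextLogger) InfoContext(ctx context.Context, msg string, args ...any) {
        s.l.InfoContext(ctx, msg, args...)
    }
    func (s slogContextLogger) WarnContext(ctx context.Context, msg string, args ...any) {
        s.l.WarnContext(ctx, msg, args...)
    }
    func (s slogContextLogger) ErrorContext(ctx context.Context, msg string, args ...any) {
        s.l.ErrorContext(ctx, msg, args...)
    }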
type DBBuilder ¶ added in v0.1.0
type DBBuilder struct {
// contains filtered or unexported fields
}
DBBuilder configures and creates database connections from various data sources.
Basic usage:
builder := filesql.NewBuilder().
AddPath("data.csv").
AddPath("users.tsv")
validatedBuilder, err := builder.Build(ctx)
if err != nil {
return err
}
db, err := validatedBuilder.Open(ctx)
defer db.Close()
Supports:
- File paths (AddPath)
- Embedded filesystems (AddFS)
- io.Reader streams (AddReader)
- Auto-save functionality (EnableAutoSave)
- Read-only mode (OpenReadOnly)
Example (Chaining) ¶
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"testing/fstest"
"github.com/nao1215/filesql"
)
func main() {
// Create temporary files
tempDir, _ := os.MkdirTemp("", "filesql-example")
defer os.RemoveAll(tempDir)
csvFile := filepath.Join(tempDir, "data1.csv")
content1 := "id,name\n1,Alice\n2,Bob\n"
os.WriteFile(csvFile, []byte(content1), 0644)
tsvFile := filepath.Join(tempDir, "data2.tsv")
content2 := "id\tproduct\n1\tLaptop\n2\tPhone\n"
os.WriteFile(tsvFile, []byte(content2), 0644)
// Create mock filesystem
mockFS := fstest.MapFS{
"logs.ltsv": &fstest.MapFile{Data: []byte("time:2024-01-01T00:00:00Z\tlevel:info\n")},
}
// Demonstrate method chaining
ctx := context.Background()
db, err := filesql.NewBuilder().
AddPath(csvFile).
AddPaths(tsvFile).
AddFS(mockFS).
Build(ctx)
if err != nil {
log.Fatal(err)
}
connection, err := db.Open(ctx)
if err != nil {
log.Fatal(err)
}
defer connection.Close()
// Count tables from different sources
rows, err := connection.Query("SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")
if err != nil {
log.Fatal(err)
}
defer rows.Close()
var tableCount int
for rows.Next() {
var tableName string
rows.Scan(&tableName)
tableCount++
}
fmt.Printf("Successfully loaded %d tables from mixed sources\n", tableCount)
}
Output: Successfully loaded 3 tables from mixed sources
Example (ErrorHandling) ¶
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"github.com/nao1215/filesql"
)
func main() {
// Example 1: Build without inputs should fail
builder := filesql.NewBuilder()
ctx := context.Background()
_, err := builder.Build(ctx)
if err != nil {
fmt.Printf("Expected error for no inputs: %v\n", err)
}
// Example 2: Open without Build should fail
builder2 := filesql.NewBuilder().AddPath("nonexistent.csv")
_, err = builder2.Open(ctx)
if err != nil {
fmt.Println("Expected error for Open without Build")
}
// Example 3: Non-existent file should fail during Build
builder3 := filesql.NewBuilder().AddPath(filepath.Join("nonexistent", "file.csv"))
_, err = builder3.Build(ctx)
if err != nil {
fmt.Println("Expected error for non-existent file")
}
// Example 4: Success case
tempDir, _ := os.MkdirTemp("", "filesql-example")
defer os.RemoveAll(tempDir)
csvFile := filepath.Join(tempDir, "valid.csv")
os.WriteFile(csvFile, []byte("id,name\n1,test\n"), 0644)
builder4 := filesql.NewBuilder().AddPath(csvFile)
validatedBuilder, err := builder4.Build(ctx)
if err != nil {
log.Fatal(err)
}
db, err := validatedBuilder.Open(ctx)
if err != nil {
log.Fatal(err)
}
defer db.Close()
fmt.Println("Success: Valid file loaded correctly")
}
Output:

Expected error for no inputs: filesql: no supported files found: at least one path must be provided
Expected error for Open without Build
Expected error for non-existent file
Success: Valid file loaded correctly
func NewBuilder ¶ added in v0.1.0
func NewBuilder() *DBBuilder
NewBuilder creates a new database builder.
Start here when you need:
- Multiple data sources (files, streams, embedded FS)
- Auto-save functionality
- Custom chunk sizes for large files
- More control than the simple Open() function
Example:
builder := filesql.NewBuilder().
AddPath("data.csv").
EnableAutoSave("./backup")
Example ¶
ExampleNewBuilder demonstrates the basic usage of the Builder pattern. This is the recommended approach for most use cases, especially when working with embedded filesystems or when you need more control over the database creation process.
package main
import (
"fmt"
"github.com/nao1215/filesql"
)
func main() {
// Create a new builder - this is the starting point for all Builder pattern usage
builder := filesql.NewBuilder()
// The builder supports method chaining for a fluent API
// You can add individual paths or multiple paths at once
builder.AddPath("users.csv").AddPaths("orders.tsv", "products.ltsv")
fmt.Printf("Builder created successfully: %t\n", builder != nil)
// In real usage, you would continue with:
// ctx := context.Background()
// validatedBuilder, err := builder.Build(ctx)
// if err != nil { return err }
// db, err := validatedBuilder.Open(ctx)
// if err != nil { return err }
// defer db.Close()
}
Output: Builder created successfully: true
func (*DBBuilder) AddFS ¶ added in v0.1.0
AddFS adds files from an embedded filesystem (go:embed).
Automatically finds all CSV, TSV, and LTSV files in the filesystem.
Example:
//go:embed data/*.csv data/*.tsv
var dataFS embed.FS

builder.AddFS(dataFS)
Returns self for chaining.
Example ¶
package main
import (
"context"
"fmt"
"log"
"testing/fstest"
"github.com/nao1215/filesql"
)
func main() {
// Create mock filesystem with test data
mockFS := fstest.MapFS{
"users.csv": &fstest.MapFile{Data: []byte("id,name,department\n1,Alice,Engineering\n2,Bob,Sales\n")},
"products.tsv": &fstest.MapFile{Data: []byte("id\tname\tprice\n1\tLaptop\t1000\n2\tPhone\t500\n")},
"logs.ltsv": &fstest.MapFile{Data: []byte("time:2024-01-01T00:00:00Z\tlevel:info\tmsg:started\n")},
"readme.txt": &fstest.MapFile{Data: []byte("This file will be ignored\n")}, // unsupported format
}
// Use builder with filesystem
builder := filesql.NewBuilder().AddFS(mockFS)
ctx := context.Background()
validatedBuilder, err := builder.Build(ctx)
if err != nil {
log.Fatal(err)
}
db, err := validatedBuilder.Open(ctx)
if err != nil {
log.Fatal(err)
}
defer db.Close()
// List all tables that were created from the filesystem
rows, err := db.Query("SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")
if err != nil {
log.Fatal(err)
}
defer rows.Close()
var tableCount int
for rows.Next() {
var tableName string
rows.Scan(&tableName)
tableCount++
}
fmt.Printf("Created %d tables from filesystem\n", tableCount)
}
Output: Created 3 tables from filesystem
Example (EmbedFS) ¶
package main
import (
"context"
"embed"
"fmt"
"io/fs"
"log"
"github.com/nao1215/filesql"
)
//go:embed testdata/embed_test/*.csv testdata/embed_test/*.tsv
var builderExampleFS embed.FS
func main() {
// Use embedded test filesystem
subFS, err := fs.Sub(builderExampleFS, "testdata/embed_test")
if err != nil {
log.Fatal(err)
}
// Use builder with embedded filesystem
builder := filesql.NewBuilder().AddFS(subFS)
ctx := context.Background()
validatedBuilder, err := builder.Build(ctx)
if err != nil {
log.Fatal(err)
}
db, err := validatedBuilder.Open(ctx)
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Count the number of tables created from embedded files
rows, err := db.Query("SELECT COUNT(*) FROM sqlite_master WHERE type='table'")
if err != nil {
log.Fatal(err)
}
defer rows.Close()
var tableCount int
if rows.Next() {
rows.Scan(&tableCount)
}
fmt.Printf("Created %d tables from embedded files\n", tableCount)
}
Output: Created 3 tables from embedded files
func (*DBBuilder) AddPath ¶ added in v0.1.0
AddPath adds a file or directory to load.
Examples:
- Single file: AddPath("users.csv")
- Compressed: AddPath("data.tsv.gz")
- Directory: AddPath("/data/") // loads all CSV/TSV/LTSV files
Returns self for chaining.
Example ¶
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"github.com/nao1215/filesql"
)
func main() {
// Create temporary CSV file for example
tempDir, _ := os.MkdirTemp("", "filesql-example")
defer os.RemoveAll(tempDir)
csvFile := filepath.Join(tempDir, "users.csv")
content := "id,name,age\n1,Alice,30\n2,Bob,25\n"
os.WriteFile(csvFile, []byte(content), 0644)
// Use builder to add a single file path
builder := filesql.NewBuilder().AddPath(csvFile)
// Build and open database
ctx := context.Background()
validatedBuilder, err := builder.Build(ctx)
if err != nil {
log.Fatal(err)
}
db, err := validatedBuilder.Open(ctx)
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Query the data
rows, err := db.Query("SELECT COUNT(*) FROM users")
if err != nil {
log.Fatal(err)
}
defer rows.Close()
var count int
if rows.Next() {
rows.Scan(&count)
}
fmt.Printf("Number of users: %d\n", count)
}
Output: Number of users: 2
func (*DBBuilder) AddPaths ¶ added in v0.1.0
AddPaths adds multiple files or directories at once.
Example:
builder.AddPaths("users.csv", "products.tsv", "/data/logs/")
Returns self for chaining.
Example ¶
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"github.com/nao1215/filesql"
)
func main() {
// Create temporary files for example
tempDir, _ := os.MkdirTemp("", "filesql-example")
defer os.RemoveAll(tempDir)
// Create users.csv
usersFile := filepath.Join(tempDir, "users.csv")
usersContent := "id,name\n1,Alice\n2,Bob\n"
os.WriteFile(usersFile, []byte(usersContent), 0644)
// Create products.csv
productsFile := filepath.Join(tempDir, "products.csv")
productsContent := "id,product_name\n1,Laptop\n2,Phone\n"
os.WriteFile(productsFile, []byte(productsContent), 0644)
// Use builder to add multiple file paths
builder := filesql.NewBuilder().AddPaths(usersFile, productsFile)
ctx := context.Background()
validatedBuilder, err := builder.Build(ctx)
if err != nil {
log.Fatal(err)
}
db, err := validatedBuilder.Open(ctx)
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Query data from both tables
rows, err := db.Query(`
SELECT u.name, p.product_name
FROM users u
JOIN products p ON u.id = p.id
`)
if err != nil {
log.Fatal(err)
}
defer rows.Close()
for rows.Next() {
var name, product string
rows.Scan(&name, &product)
fmt.Printf("%s has %s\n", name, product)
}
}
Output:

Alice has Laptop
Bob has Phone
func (*DBBuilder) AddReader ¶ added in v0.2.0
AddReader adds data from an io.Reader (file, network stream, etc.).
Parameters:
- reader: Any io.Reader (file, bytes.Buffer, http.Response.Body, etc.)
- tableName: Name for the SQL table (e.g., "users")
- fileType: Data format (FileTypeCSV, FileTypeTSV, FileTypeLTSV, etc.)
Example:
resp, _ := http.Get("https://example.com/data.csv")
builder.AddReader(resp.Body, "remote_data", FileTypeCSV)
Returns self for chaining.
Example ¶
ExampleDBBuilder_AddReader demonstrates using io.Reader as a data source
package main
import (
"context"
"fmt"
"log"
"strings"
"github.com/nao1215/filesql"
)
func main() {
// CSV data from an io.Reader (could be from network, API response, etc.)
csvData := `id,name,department,salary
1,Alice,Engineering,95000
2,Bob,Sales,78000
3,Charlie,Engineering,102000
4,Diana,Marketing,85000`
// Create a reader from the CSV data
reader := strings.NewReader(csvData)
// Build database with Reader input
ctx := context.Background()
builder := filesql.NewBuilder().
AddReader(reader, "employees", filesql.FileTypeCSV). // Specify table name and type explicitly
SetDefaultChunkSize(5000) // Set 5000 rows per chunk for large data
// Build validates the input
validatedBuilder, err := builder.Build(ctx)
if err != nil {
log.Fatal(err)
}
// Open creates the database connection
db, err := validatedBuilder.Open(ctx)
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Query the data
rows, err := db.Query(`
SELECT name, department, salary
FROM employees
WHERE salary > 80000
ORDER BY salary DESC
`)
if err != nil {
log.Fatal(err)
}
defer rows.Close()
// Process results
fmt.Println("High earners (salary > 80,000):")
for rows.Next() {
var name, dept string
var salary int
if err := rows.Scan(&name, &dept, &salary); err != nil {
log.Fatal(err)
}
fmt.Printf("- %s (%s): $%d\n", name, dept, salary)
}
}
Output:

High earners (salary > 80,000):
- Charlie (Engineering): $102000
- Alice (Engineering): $95000
- Diana (Marketing): $85000
Example (Compressed) ¶
ExampleDBBuilder_AddReader_compressed demonstrates using compressed data from io.Reader
package main
import (
"bytes"
"context"
"fmt"
"log"
"github.com/nao1215/filesql"
)
func main() {
// Simulate compressed TSV data (in practice, this would be actual compressed data)
tsvData := "product_id\tproduct_name\tprice\n1\tLaptop\t999\n2\tMouse\t25\n3\tKeyboard\t75"
reader := bytes.NewReader([]byte(tsvData))
ctx := context.Background()
builder := filesql.NewBuilder().
// Specify that this is TSV data (not actually compressed in this example)
AddReader(reader, "products", filesql.FileTypeTSV)
validatedBuilder, err := builder.Build(ctx)
if err != nil {
log.Fatal(err)
}
db, err := validatedBuilder.Open(ctx)
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Query the products
var count int
err = db.QueryRow("SELECT COUNT(*) FROM products WHERE price < 100").Scan(&count)
if err != nil {
log.Fatal(err)
}
fmt.Printf("Products under $100: %d\n", count)
}
Output: Products under $100: 2
Example (Multiple) ¶
ExampleDBBuilder_AddReader_multiple demonstrates combining multiple readers with files
package main
import (
"context"
"fmt"
"log"
"strings"
"github.com/nao1215/filesql"
)
func main() {
// First reader: Users data
usersCSV := `user_id,name,email
1,Alice,[email protected]
2,Bob,[email protected]`
// Second reader: Orders data
ordersCSV := `order_id,user_id,amount
101,1,250
102,2,180
103,1,320`
ctx := context.Background()
builder := filesql.NewBuilder().
AddReader(strings.NewReader(usersCSV), "users", filesql.FileTypeCSV).
AddReader(strings.NewReader(ordersCSV), "orders", filesql.FileTypeCSV).
SetDefaultChunkSize(2500) // 2500 rows per chunk
validatedBuilder, err := builder.Build(ctx)
if err != nil {
log.Fatal(err)
}
db, err := validatedBuilder.Open(ctx)
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Join query across both tables
rows, err := db.Query(`
SELECT u.name, SUM(o.amount) as total_spent
FROM users u
JOIN orders o ON u.user_id = o.user_id
GROUP BY u.user_id, u.name
ORDER BY total_spent DESC
`)
if err != nil {
log.Fatal(err)
}
defer rows.Close()
fmt.Println("Customer spending:")
for rows.Next() {
var name string
var total float64
if err := rows.Scan(&name, &total); err != nil {
log.Fatal(err)
}
fmt.Printf("- %s: $%.0f\n", name, total)
}
}
Output:

Customer spending:
- Alice: $570
- Bob: $180
func (*DBBuilder) Build ¶ added in v0.1.0
Build validates all configured inputs and prepares the builder for opening a database. This method must be called before Open(). It performs the following operations:
1. Validates that at least one input source is configured
2. Checks existence and format of all file paths
3. Processes embedded filesystems by converting files to streaming readers
4. Validates that all files have supported extensions
After successful validation, the builder is ready to create database connections with Open(). The context is used for file operations and can be used for cancellation.
Returns the same builder instance for method chaining, or an error if validation fails.
Example ¶
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"github.com/nao1215/filesql"
)
func main() {
// Create temporary CSV file
tempDir, _ := os.MkdirTemp("", "filesql-example")
defer os.RemoveAll(tempDir)
csvFile := filepath.Join(tempDir, "data.csv")
content := "name,value\ntest,123\n"
os.WriteFile(csvFile, []byte(content), 0644)
// Build validates inputs and prepares for opening
builder := filesql.NewBuilder().AddPath(csvFile)
ctx := context.Background()
validatedBuilder, err := builder.Build(ctx)
if err != nil {
log.Fatal(err)
}
fmt.Printf("Builder validated successfully: %t\n", validatedBuilder != nil)
}
Output: Builder validated successfully: true
func (*DBBuilder) DisableAutoSave ¶ added in v0.1.0
DisableAutoSave disables automatic saving (default behavior). Returns the builder for method chaining.
func (*DBBuilder) EnableAutoSave ¶ added in v0.1.0
func (b *DBBuilder) EnableAutoSave(outputDir string, options ...DumpOptions) *DBBuilder
EnableAutoSave automatically saves changes when the database is closed.
Parameters:
- outputDir: Where to save files
- "" (empty): Overwrite original files
- "./backup": Save to backup directory
Example:
builder.AddPath("data.csv").
EnableAutoSave("") // Auto-save to original file on db.Close()
Returns self for chaining.
Example ¶
ExampleDBBuilder_EnableAutoSave demonstrates automatic saving on database close. This feature automatically saves modified data when the database connection is closed, providing a convenient way to persist changes without manual intervention.
package main
import (
"context"
"fmt"
"os"
"path/filepath"
"time"
"github.com/nao1215/filesql"
)
func main() {
// Create temporary directory and test file
tempDir, _ := os.MkdirTemp("", "filesql-autosave-example")
defer os.RemoveAll(tempDir)
// Create sample CSV file
csvPath := filepath.Join(tempDir, "employees.csv")
csvContent := "name,department,salary\nAlice,Engineering,80000\nBob,Marketing,65000\n"
_ = os.WriteFile(csvPath, []byte(csvContent), 0600)
// Create output directory
outputDir := filepath.Join(tempDir, "backup")
_ = os.MkdirAll(outputDir, 0750)
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
// Configure builder with auto-save on close
builder := filesql.NewBuilder().
AddPath(csvPath).
EnableAutoSave(outputDir, filesql.NewDumpOptions()) // Save to backup directory on close
validatedBuilder, _ := builder.Build(ctx)
db, _ := validatedBuilder.Open(ctx)
// Modify data - this will be automatically saved when db.Close() is called
_, _ = db.ExecContext(ctx, "INSERT INTO employees (name, department, salary) VALUES ('Charlie', 'Sales', 70000)")
_, _ = db.ExecContext(ctx, "UPDATE employees SET salary = 85000 WHERE name = 'Alice'")
// Close database - triggers automatic save to backup directory
_ = db.Close()
// Verify the backup file was created and contains our changes
backupFile := filepath.Join(outputDir, "employees.csv")
if _, err := os.Stat(backupFile); err == nil {
fmt.Println("Auto-save completed successfully")
}
}
Output: Auto-save completed successfully
func (*DBBuilder) EnableAutoSaveOnCommit ¶ added in v0.1.0
func (b *DBBuilder) EnableAutoSaveOnCommit(outputDir string, options ...DumpOptions) *DBBuilder
EnableAutoSaveOnCommit automatically saves changes after each transaction commit.
Use this for real-time persistence. Note that saving after every commit may impact performance.
Example:
builder.AddPath("data.csv").
EnableAutoSaveOnCommit("./output") // Save after each commit
Returns self for chaining.
Example ¶
ExampleDBBuilder_EnableAutoSaveOnCommit demonstrates automatic saving on transaction commit. This provides more frequent saves but may impact performance for workloads with many commits.
package main
import (
"context"
"fmt"
"os"
"path/filepath"
"time"
"github.com/nao1215/filesql"
)
func main() {
// Create temporary directory and test file
tempDir, _ := os.MkdirTemp("", "filesql-commit-save-example")
defer os.RemoveAll(tempDir)
// Create sample CSV file
csvPath := filepath.Join(tempDir, "transactions.csv")
csvContent := "id,amount,status\n1,100.50,pending\n2,250.75,pending\n"
_ = os.WriteFile(csvPath, []byte(csvContent), 0600)
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
// Configure builder with auto-save on commit
// Using temp directory to keep the example self-contained
builder := filesql.NewBuilder().
AddPath(csvPath).
EnableAutoSaveOnCommit(tempDir, filesql.NewDumpOptions()) // Save to temp directory on each commit
validatedBuilder, _ := builder.Build(ctx)
db, _ := validatedBuilder.Open(ctx)
defer func() { _ = db.Close() }()
// Start transaction
tx, _ := db.BeginTx(ctx, nil)
// Process transactions within the transaction
_, _ = tx.ExecContext(ctx, "UPDATE transactions SET status = 'completed' WHERE id = 1")
_, _ = tx.ExecContext(ctx, "INSERT INTO transactions (id, amount, status) VALUES (3, 175.25, 'completed')")
// Commit transaction - triggers automatic save
_ = tx.Commit()
fmt.Println("Transaction committed with auto-save")
}
Output: Transaction committed with auto-save
func (*DBBuilder) Open ¶ added in v0.1.0
Open creates and returns a database connection using the configured and validated inputs. This method can only be called after Build() has been successfully executed. It creates an in-memory SQLite database and loads all configured files as tables using streaming.
Table names are derived from file names without extensions:

- "users.csv" becomes table "users"
- "data.tsv.gz" becomes table "data"
- "user-data.csv" becomes table "user_data" (hyphens become underscores)
- "my file.csv" becomes table "my_file" (spaces become underscores)
Special characters in file names are automatically sanitized for SQL safety.
The returned database connection supports the full SQLite3 SQL syntax. Auto-save functionality is supported for both file paths and reader inputs. The caller is responsible for closing the connection when done.
Returns a *sql.DB connection or an error if the database cannot be created.
Example ¶
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"github.com/nao1215/filesql"
)
func main() {
// Create temporary CSV file
tempDir, _ := os.MkdirTemp("", "filesql-example")
defer os.RemoveAll(tempDir)
csvFile := filepath.Join(tempDir, "employees.csv")
content := "id,name,salary\n1,Alice,50000\n2,Bob,60000\n3,Charlie,55000\n"
os.WriteFile(csvFile, []byte(content), 0644)
// Complete builder workflow: AddPath -> Build -> Open
builder := filesql.NewBuilder().AddPath(csvFile)
ctx := context.Background()
validatedBuilder, err := builder.Build(ctx)
if err != nil {
log.Fatal(err)
}
db, err := validatedBuilder.Open(ctx)
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Perform complex SQL query
rows, err := db.Query(`
SELECT name, salary,
salary - (SELECT AVG(salary) FROM employees) as salary_diff
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees)
ORDER BY salary DESC
`)
if err != nil {
log.Fatal(err)
}
defer rows.Close()
for rows.Next() {
var name string
var salary, diff float64
rows.Scan(&name, &salary, &diff)
fmt.Printf("%s: $%.0f (+$%.0f above average)\n", name, salary, diff)
}
}
Output: Bob: $60000 (+$5000 above average)
func (*DBBuilder) OpenReadOnly ¶ added in v0.9.0
func (b *DBBuilder) OpenReadOnly(ctx context.Context) (*ReadOnlyDB, error)
OpenReadOnly creates a read-only database connection. This is a convenience method that calls Open() and wraps the result in a ReadOnlyDB. All SELECT queries work normally, but write operations return ErrReadOnly.
This is useful for audit scenarios where you want to query data without risk of accidental modification.
Example:
builder := filesql.NewBuilder().
AddPath("payment.ach")
validatedBuilder, err := builder.Build(ctx)
if err != nil {
return err
}
rodb, err := validatedBuilder.OpenReadOnly(ctx)
if err != nil {
return err
}
defer rodb.Close()
// SELECT works
rows, _ := rodb.Query("SELECT * FROM payment_entries")
// Write operations are rejected
_, err = rodb.Exec("DELETE FROM payment_entries") // returns ErrReadOnly
func (*DBBuilder) SetDefaultChunkSize ¶ added in v0.2.0
SetDefaultChunkSize sets chunk size (number of rows) for large file processing.
Default: 1000 rows. Adjust based on available memory and processing needs.
Example:
builder.SetDefaultChunkSize(5000) // 5000 rows per chunk
Returns self for chaining.
func (*DBBuilder) WithLogger ¶ added in v0.10.0
WithLogger sets a custom logger for internal operations.
The logger interface is compatible with slog.Logger. You can use the provided SlogAdapter to wrap an existing slog.Logger, or implement your own Logger.
Examples:
// Using slog
logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))
builder.WithLogger(filesql.NewSlogAdapter(logger))

// Using a custom logger
builder.WithLogger(myCustomLogger)
Returns self for chaining.
type DumpOptions ¶ added in v0.0.4
type DumpOptions struct {
// Format specifies the output file format
Format OutputFormat
// Compression specifies the compression type
Compression CompressionType
}
DumpOptions configures how database tables are exported to files.
Example:
options := NewDumpOptions().
    WithFormat(OutputFormatTSV).
    WithCompression(CompressionGZ)
err := DumpDatabase(db, "./output", options)
func NewDumpOptions ¶ added in v0.0.4
func NewDumpOptions() DumpOptions
NewDumpOptions creates default export options (CSV, no compression).
Modify with:
- WithFormat(): Change file format (CSV, TSV, LTSV)
- WithCompression(): Add compression (GZ, BZ2, XZ, ZSTD)
func (DumpOptions) FileExtension ¶ added in v0.2.0
func (o DumpOptions) FileExtension() string
FileExtension returns the complete file extension including compression
Example ¶
package main
import (
"fmt"
"github.com/nao1215/filesql"
)
func main() {
// Show how file extensions are built
examples := []struct {
format filesql.OutputFormat
compression filesql.CompressionType
}{
{filesql.OutputFormatCSV, filesql.CompressionNone},
{filesql.OutputFormatTSV, filesql.CompressionGZ},
{filesql.OutputFormatLTSV, filesql.CompressionBZ2},
{filesql.OutputFormatCSV, filesql.CompressionXZ},
{filesql.OutputFormatTSV, filesql.CompressionZSTD},
}
for _, ex := range examples {
options := filesql.DumpOptions{
Format: ex.format,
Compression: ex.compression,
}
fmt.Printf("Format: %-4s, Compression: %-4s -> Extension: %s\n",
ex.format.String(),
ex.compression.String(),
options.FileExtension())
}
}
Output:

Format: csv , Compression: none -> Extension: .csv
Format: tsv , Compression: gz   -> Extension: .tsv.gz
Format: ltsv, Compression: bz2  -> Extension: .ltsv.bz2
Format: csv , Compression: xz   -> Extension: .csv.xz
Format: tsv , Compression: zstd -> Extension: .tsv.zst
func (DumpOptions) WithCompression ¶ added in v0.2.0
func (o DumpOptions) WithCompression(compression CompressionType) DumpOptions
WithCompression adds compression to output files.
Options:
- CompressionNone: No compression (default)
- CompressionGZ: Gzip compression (.gz)
- CompressionBZ2: Bzip2 compression (.bz2) - read only, writing not supported
- CompressionXZ: XZ compression (.xz)
- CompressionZSTD: Zstandard compression (.zst)
- CompressionZLIB: Zlib compression (.z)
- CompressionSNAPPY: Snappy compression (.snappy)
- CompressionS2: S2 compression (.s2) - Snappy compatible
- CompressionLZ4: LZ4 compression (.lz4)
func (DumpOptions) WithFormat ¶ added in v0.2.0
func (o DumpOptions) WithFormat(format OutputFormat) DumpOptions
WithFormat sets the output file format.
Options:
- OutputFormatCSV: Comma-separated values
- OutputFormatTSV: Tab-separated values
- OutputFormatLTSV: Labeled tab-separated values
- OutputFormatParquet: Apache Parquet columnar format
type ErrorContext ¶ added in v0.4.2
ErrorContext provides context for where an error occurred
func NewErrorContext ¶ added in v0.4.2
func NewErrorContext(operation, filePath string) *ErrorContext
NewErrorContext creates a new error context
func (*ErrorContext) Error ¶ added in v0.4.2
func (ec *ErrorContext) Error(baseErr error) error
Error creates a formatted error with context
func (*ErrorContext) WithDetails ¶ added in v0.4.2
func (ec *ErrorContext) WithDetails(details string) *ErrorContext
WithDetails adds details to the error context
func (*ErrorContext) WithTable ¶ added in v0.4.2
func (ec *ErrorContext) WithTable(tableName string) *ErrorContext
WithTable adds table context to the error
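Chaining these together at a call site might look like the following sketch; the operation name, path, table, and detail strings are all illustrative:

    baseErr := errors.New("unexpected column count")
    ec := filesql.NewErrorContext("load", "data/users.csv").
        WithTable("users").
        WithDetails("row has 3 columns, expected 5")
    return ec.Error(baseErr) // wraps baseErr with operation/file/table context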
type FileType ¶ added in v0.2.0
type FileType int
FileType represents supported file types including compression variants
const (
    // FileTypeCSV represents CSV file type
    FileTypeCSV FileType = iota
    // FileTypeTSV represents TSV file type
    FileTypeTSV
    // FileTypeLTSV represents LTSV file type
    FileTypeLTSV
    // FileTypeParquet represents Parquet file type
    FileTypeParquet
    // FileTypeXLSX represents Excel XLSX file type
    FileTypeXLSX
    // FileTypeCSVGZ represents gzip-compressed CSV file type
    FileTypeCSVGZ
    // FileTypeTSVGZ represents gzip-compressed TSV file type
    FileTypeTSVGZ
    // FileTypeLTSVGZ represents gzip-compressed LTSV file type
    FileTypeLTSVGZ
    // FileTypeParquetGZ represents gzip-compressed Parquet file type
    FileTypeParquetGZ
    // FileTypeCSVBZ2 represents bzip2-compressed CSV file type
    FileTypeCSVBZ2
    // FileTypeTSVBZ2 represents bzip2-compressed TSV file type
    FileTypeTSVBZ2
    // FileTypeLTSVBZ2 represents bzip2-compressed LTSV file type
    FileTypeLTSVBZ2
    // FileTypeParquetBZ2 represents bzip2-compressed Parquet file type
    FileTypeParquetBZ2
    // FileTypeCSVXZ represents xz-compressed CSV file type
    FileTypeCSVXZ
    // FileTypeTSVXZ represents xz-compressed TSV file type
    FileTypeTSVXZ
    // FileTypeLTSVXZ represents xz-compressed LTSV file type
    FileTypeLTSVXZ
    // FileTypeParquetXZ represents xz-compressed Parquet file type
    FileTypeParquetXZ
    // FileTypeCSVZSTD represents zstd-compressed CSV file type
    FileTypeCSVZSTD
    // FileTypeTSVZSTD represents zstd-compressed TSV file type
    FileTypeTSVZSTD
    // FileTypeLTSVZSTD represents zstd-compressed LTSV file type
    FileTypeLTSVZSTD
    // FileTypeParquetZSTD represents zstd-compressed Parquet file type
    FileTypeParquetZSTD
    // FileTypeXLSXGZ represents gzip-compressed Excel XLSX file type
    FileTypeXLSXGZ
    // FileTypeXLSXBZ2 represents bzip2-compressed Excel XLSX file type
    FileTypeXLSXBZ2
    // FileTypeXLSXXZ represents xz-compressed Excel XLSX file type
    FileTypeXLSXXZ
    // FileTypeXLSXZSTD represents zstd-compressed Excel XLSX file type
    FileTypeXLSXZSTD
    // FileTypeCSVZLIB represents zlib-compressed CSV file type
    FileTypeCSVZLIB
    // FileTypeTSVZLIB represents zlib-compressed TSV file type
    FileTypeTSVZLIB
    // FileTypeLTSVZLIB represents zlib-compressed LTSV file type
    FileTypeLTSVZLIB
    // FileTypeParquetZLIB represents zlib-compressed Parquet file type
    FileTypeParquetZLIB
    // FileTypeXLSXZLIB represents zlib-compressed XLSX file type
    FileTypeXLSXZLIB
    // FileTypeCSVSNAPPY represents snappy-compressed CSV file type
    FileTypeCSVSNAPPY
    // FileTypeTSVSNAPPY represents snappy-compressed TSV file type
    FileTypeTSVSNAPPY
    // FileTypeLTSVSNAPPY represents snappy-compressed LTSV file type
    FileTypeLTSVSNAPPY
    // FileTypeParquetSNAPPY represents snappy-compressed Parquet file type
    FileTypeParquetSNAPPY
    // FileTypeXLSXSNAPPY represents snappy-compressed XLSX file type
    FileTypeXLSXSNAPPY
    // FileTypeCSVS2 represents s2-compressed CSV file type
    FileTypeCSVS2
    // FileTypeTSVS2 represents s2-compressed TSV file type
    FileTypeTSVS2
    // FileTypeLTSVS2 represents s2-compressed LTSV file type
    FileTypeLTSVS2
    // FileTypeParquetS2 represents s2-compressed Parquet file type
    FileTypeParquetS2
    // FileTypeXLSXS2 represents s2-compressed XLSX file type
    FileTypeXLSXS2
    // FileTypeCSVLZ4 represents lz4-compressed CSV file type
    FileTypeCSVLZ4
    // FileTypeTSVLZ4 represents lz4-compressed TSV file type
    FileTypeTSVLZ4
    // FileTypeLTSVLZ4 represents lz4-compressed LTSV file type
    FileTypeLTSVLZ4
    // FileTypeParquetLZ4 represents lz4-compressed Parquet file type
    FileTypeParquetLZ4
    // FileTypeXLSXLZ4 represents lz4-compressed XLSX file type
    FileTypeXLSXLZ4
    // FileTypeUnsupported represents unsupported file type
    FileTypeUnsupported
)
type Logger ¶ added in v0.10.0
type Logger interface {
// Debug logs a debug message with optional key-value pairs
Debug(msg string, args ...any)
// Info logs an info message with optional key-value pairs
Info(msg string, args ...any)
// Warn logs a warning message with optional key-value pairs
Warn(msg string, args ...any)
// Error logs an error message with optional key-value pairs
Error(msg string, args ...any)
// With returns a new Logger with the given key-value pairs added to the context
With(args ...any) Logger
}
Logger defines the interface for logging within filesql. Users can implement this interface to use their own logging solution. The interface is designed to be compatible with slog.Logger.
type MemoryInfo ¶ added in v0.4.4
type MemoryInfo struct {
CurrentMB int64 // Current memory usage in MB
LimitMB int64 // Memory limit in MB
Usage float64 // Usage percentage (0.0-1.0)
Status MemoryStatus // Current status
}
MemoryInfo contains detailed memory usage information
type MemoryLimit ¶ added in v0.4.4
type MemoryLimit struct {
// contains filtered or unexported fields
}
MemoryLimit provides configurable memory limits with graceful degradation for file processing operations. It monitors heap usage and can trigger memory management actions when thresholds are exceeded.
The system supports three states:
- OK: Memory usage is within acceptable limits
- WARNING: Memory usage approaches the limit, suggesting reduced chunk sizes
- EXCEEDED: Memory usage has exceeded the limit, processing should be halted
Usage example:
limit := NewMemoryLimit(512) // 512MB limit
if limit.CheckMemoryUsage() == MemoryStatusExceeded {
return limit.CreateMemoryError("processing")
}
Performance Note: CheckMemoryUsage() calls runtime.ReadMemStats which can pause for milliseconds. Use sparingly in hot paths.
Thread Safety: All methods are safe for concurrent use by multiple goroutines.
func NewMemoryLimit ¶ added in v0.4.4
func NewMemoryLimit(maxMemoryMB int64) *MemoryLimit
NewMemoryLimit creates a new memory limit configuration
func (*MemoryLimit) CheckMemoryUsage ¶ added in v0.4.4
func (ml *MemoryLimit) CheckMemoryUsage() MemoryStatus
CheckMemoryUsage checks current memory usage against limits
func (*MemoryLimit) CreateMemoryError ¶ added in v0.4.4
func (ml *MemoryLimit) CreateMemoryError(operation string) error
CreateMemoryError creates a memory limit error with helpful context
func (*MemoryLimit) Disable ¶ added in v0.4.4
func (ml *MemoryLimit) Disable()
Disable disables memory limit checking
func (*MemoryLimit) Enable ¶ added in v0.4.4
func (ml *MemoryLimit) Enable()
Enable enables memory limit checking
func (*MemoryLimit) GetMemoryInfo ¶ added in v0.4.4
func (ml *MemoryLimit) GetMemoryInfo() MemoryInfo
GetMemoryInfo returns current memory usage information
func (*MemoryLimit) IsEnabled ¶ added in v0.4.4
func (ml *MemoryLimit) IsEnabled() bool
IsEnabled returns whether memory limits are enabled
func (*MemoryLimit) SetWarningThreshold ¶ added in v0.4.4
func (ml *MemoryLimit) SetWarningThreshold(threshold float64)
SetWarningThreshold sets the warning threshold (0.0-1.0)
func (*MemoryLimit) ShouldReduceChunkSize ¶ added in v0.4.4
func (ml *MemoryLimit) ShouldReduceChunkSize(chunkSize int) (bool, int)
ShouldReduceChunkSize returns true if chunk size should be reduced for memory management
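A sketch tying these methods together in a chunked-processing loop; the 256MB limit, the threshold, and the chunk sizes are arbitrary choices for illustration, and hasMoreChunks/processChunk are hypothetical stand-ins for real loading code:

    limit := filesql.NewMemoryLimit(256) // 256MB limit
    limit.SetWarningThreshold(0.8)       // warn at 80% of the limit

    chunkSize := 1000
    for hasMoreChunks() { // hypothetical loop condition
        switch limit.CheckMemoryUsage() {
        case filesql.MemoryStatusExceeded:
            return limit.CreateMemoryError("chunk processing")
        case filesql.MemoryStatusWarning:
            // Back off the chunk size while memory pressure is high.
            if reduce, newSize := limit.ShouldReduceChunkSize(chunkSize); reduce {
                chunkSize = newSize
            }
        }
        processChunk(chunkSize) // hypothetical per-chunk work
    }

    info := limit.GetMemoryInfo()
    fmt.Printf("heap: %dMB of %dMB (%.0f%%), status: %s\n",
        info.CurrentMB, info.LimitMB, info.Usage*100, info.Status)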
type MemoryPool ¶ added in v0.4.4
type MemoryPool struct {
// contains filtered or unexported fields
}
MemoryPool manages a pool of reusable byte slices, record slices, and string slices to reduce memory allocations during file processing operations.
The pool automatically manages object lifecycles and includes safeguards against memory leaks by limiting the maximum size of objects that can be returned to the pool. Objects that grow beyond maxSize are discarded rather than pooled.
Usage example:
pool := NewMemoryPool(1024 * 1024) // 1MB max buffer size
buffer := pool.GetByteBuffer()
defer pool.PutByteBuffer(buffer)
// Use buffer...
Thread Safety: All methods are safe for concurrent use by multiple goroutines.
func NewMemoryPool ¶ added in v0.4.4
func NewMemoryPool(maxSize int) *MemoryPool
NewMemoryPool creates a new memory pool with configurable max buffer size
func (*MemoryPool) ForceGC ¶ added in v0.4.4
func (mp *MemoryPool) ForceGC()
ForceGC forces garbage collection and clears pools if memory pressure is high
func (*MemoryPool) GetByteBuffer ¶ added in v0.4.4
func (mp *MemoryPool) GetByteBuffer() []byte
GetByteBuffer gets a byte buffer from the pool
func (*MemoryPool) GetRecordSlice ¶ added in v0.4.4
func (mp *MemoryPool) GetRecordSlice() []Record
GetRecordSlice gets a record slice from the pool
func (*MemoryPool) GetStringSlice ¶ added in v0.4.4
func (mp *MemoryPool) GetStringSlice() []string
GetStringSlice gets a string slice from the pool
func (*MemoryPool) PutByteBuffer ¶ added in v0.4.4
func (mp *MemoryPool) PutByteBuffer(buf []byte)
PutByteBuffer returns a byte buffer to the pool if it's not too large
func (*MemoryPool) PutRecordSlice ¶ added in v0.4.4
func (mp *MemoryPool) PutRecordSlice(slice []Record)
PutRecordSlice returns a record slice to the pool if it's not too large
func (*MemoryPool) PutStringSlice ¶ added in v0.4.4
func (mp *MemoryPool) PutStringSlice(slice []string)
PutStringSlice returns a string slice to the pool if it's not too large
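The Get/Put pairs follow the usual pooling discipline: get, truncate, use, return. A sketch for string slices (lines and process are illustrative stand-ins for your own input and consumer):
pool := NewMemoryPool(1024 * 1024)
for _, line := range lines {
	fields := pool.GetStringSlice()
	// Truncate before reuse so stale entries never leak through.
	fields = append(fields[:0], strings.Split(line, ",")...)
	process(fields) // hypothetical consumer
	pool.PutStringSlice(fields)
}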
type MemoryStatus ¶ added in v0.4.4
type MemoryStatus int
MemoryStatus represents the current memory status
const (
	// MemoryStatusOK indicates memory usage is within acceptable limits
	MemoryStatusOK MemoryStatus = iota
	// MemoryStatusWarning indicates memory usage is approaching the limit
	MemoryStatusWarning
	// MemoryStatusExceeded indicates memory usage has exceeded the limit
	MemoryStatusExceeded
)
Memory status constants
func (MemoryStatus) String ¶ added in v0.4.4
func (ms MemoryStatus) String() string
String returns string representation of memory status
type OutputFormat ¶ added in v0.0.4
type OutputFormat int
OutputFormat represents the output file format
const (
	// OutputFormatCSV represents CSV output format
	OutputFormatCSV OutputFormat = iota
	// OutputFormatTSV represents TSV output format
	OutputFormatTSV
	// OutputFormatLTSV represents LTSV output format
	OutputFormatLTSV
	// OutputFormatParquet represents Parquet output format
	OutputFormatParquet
	// OutputFormatXLSX represents Excel XLSX output format
	OutputFormatXLSX
	// OutputFormatACH represents ACH (NACHA) output format
	OutputFormatACH
)
func (OutputFormat) Extension ¶ added in v0.2.0
func (f OutputFormat) Extension() string
Extension returns the file extension for the format
func (OutputFormat) String ¶ added in v0.2.0
func (f OutputFormat) String() string
String returns the string representation of OutputFormat
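For example, Extension can derive an output file name from a chosen format (a sketch; the base name is illustrative, and the exact extension string is whatever the format defines):
format := OutputFormatParquet
name := "users" + format.Extension() // e.g. "users.parquet"
fmt.Println(format.String(), name)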
type ReadOnlyDB ¶ added in v0.9.0
type ReadOnlyDB struct {
// contains filtered or unexported fields
}
ReadOnlyDB wraps a *sql.DB to prevent write operations. All SELECT queries work normally, but INSERT, UPDATE, DELETE, DROP, ALTER, and CREATE statements are rejected with ErrReadOnly.
This is useful for audit scenarios where you want to view data without risk of modification.
Example:
db, err := filesql.Open("payment.ach")
if err != nil {
	return err
}
defer db.Close()

rodb := filesql.NewReadOnlyDB(db)

// SELECT works fine
rows, err := rodb.Query("SELECT * FROM payment_entries")

// UPDATE/DELETE/INSERT are rejected
_, err = rodb.Exec("DELETE FROM payment_entries") // returns ErrReadOnly
func NewReadOnlyDB ¶ added in v0.9.0
func NewReadOnlyDB(db *sql.DB) *ReadOnlyDB
NewReadOnlyDB creates a read-only wrapper around an existing database connection. The underlying database is not modified; write operations are simply rejected at the API level.
func (*ReadOnlyDB) Begin ¶ added in v0.9.0
func (r *ReadOnlyDB) Begin() (*ReadOnlyTx, error)
Begin starts a read-only transaction. Deprecated: Use BeginTx instead.
func (*ReadOnlyDB) BeginTx ¶ added in v0.9.0
func (r *ReadOnlyDB) BeginTx(ctx context.Context, opts *sql.TxOptions) (*ReadOnlyTx, error)
BeginTx starts a read-only transaction with context and options.
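A sketch of a read-only transaction, reusing rodb and the payment_entries table from the example above; Commit and Rollback behave as on *sql.Tx:
tx, err := rodb.BeginTx(ctx, nil)
if err != nil {
	return err
}
defer tx.Rollback()

var n int
row := tx.QueryRowContext(ctx, "SELECT COUNT(*) FROM payment_entries")
if err := row.Scan(&n); err != nil {
	return err
}
return tx.Commit()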
func (*ReadOnlyDB) Close ¶ added in v0.9.0
func (r *ReadOnlyDB) Close() error
Close closes the underlying database connection.
func (*ReadOnlyDB) DB ¶ added in v0.9.0
func (r *ReadOnlyDB) DB() *sql.DB
DB returns the underlying *sql.DB. Use with caution as this bypasses read-only protection.
func (*ReadOnlyDB) Exec ¶ added in v0.9.0
func (r *ReadOnlyDB) Exec(query string, args ...any) (sql.Result, error)
Exec rejects write operations and returns ErrReadOnly. For read-only databases, use Query methods instead. Deprecated: Use ExecContext instead.
func (*ReadOnlyDB) ExecContext ¶ added in v0.9.0
func (r *ReadOnlyDB) ExecContext(ctx context.Context, query string, args ...any) (sql.Result, error)
ExecContext rejects write operations and returns ErrReadOnly. For read-only databases, use Query methods instead.
func (*ReadOnlyDB) Ping ¶ added in v0.9.0
func (r *ReadOnlyDB) Ping() error
Ping verifies the connection to the database. Deprecated: Use PingContext instead.
func (*ReadOnlyDB) PingContext ¶ added in v0.9.0
func (r *ReadOnlyDB) PingContext(ctx context.Context) error
PingContext verifies the connection to the database with context.
func (*ReadOnlyDB) Prepare ¶ added in v0.9.0
func (r *ReadOnlyDB) Prepare(query string) (*ReadOnlyStmt, error)
Prepare creates a prepared statement. Deprecated: Use PrepareContext instead.
func (*ReadOnlyDB) PrepareContext ¶ added in v0.9.0
func (r *ReadOnlyDB) PrepareContext(ctx context.Context, query string) (*ReadOnlyStmt, error)
PrepareContext creates a prepared statement with context.
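Prepared statements work the same as with *sql.DB, only restricted to reads. A minimal sketch, again using the payment_entries table (the amount column is illustrative):
stmt, err := rodb.PrepareContext(ctx, "SELECT * FROM payment_entries WHERE amount > ?")
if err != nil {
	return err
}
defer stmt.Close()
rows, err := stmt.QueryContext(ctx, 1000)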
func (*ReadOnlyDB) Query ¶ added in v0.9.0
func (r *ReadOnlyDB) Query(query string, args ...any) (*sql.Rows, error)
Query executes a query that returns rows (SELECT statements). Deprecated: Use QueryContext instead.
func (*ReadOnlyDB) QueryContext ¶ added in v0.9.0
func (r *ReadOnlyDB) QueryContext(ctx context.Context, query string, args ...any) (*sql.Rows, error)
QueryContext executes a query that returns rows with context.
func (*ReadOnlyDB) QueryRow ¶ added in v0.9.0
func (r *ReadOnlyDB) QueryRow(query string, args ...any) *sql.Row
QueryRow executes a query that returns at most one row. Deprecated: Use QueryRowContext instead.
func (*ReadOnlyDB) QueryRowContext ¶ added in v0.9.0
func (r *ReadOnlyDB) QueryRowContext(ctx context.Context, query string, args ...any) *sql.Row
QueryRowContext executes a query that returns at most one row with context.
type ReadOnlyStmt ¶ added in v0.9.0
type ReadOnlyStmt struct {
// contains filtered or unexported fields
}
ReadOnlyStmt wraps a *sql.Stmt to enforce read-only operations.
func (*ReadOnlyStmt) Close ¶ added in v0.9.0
func (s *ReadOnlyStmt) Close() error
Close closes the statement.
func (*ReadOnlyStmt) Exec ¶ added in v0.9.0
func (s *ReadOnlyStmt) Exec(args ...any) (sql.Result, error)
Exec is not allowed for read-only statements. Deprecated: Use ExecContext instead.
func (*ReadOnlyStmt) ExecContext ¶ added in v0.9.0
func (s *ReadOnlyStmt) ExecContext(ctx context.Context, args ...any) (sql.Result, error)
ExecContext is not allowed for read-only statements.
func (*ReadOnlyStmt) Query ¶ added in v0.9.0
func (s *ReadOnlyStmt) Query(args ...any) (*sql.Rows, error)
Query executes a prepared query statement. Deprecated: Use QueryContext instead.
func (*ReadOnlyStmt) QueryContext ¶ added in v0.9.0
func (s *ReadOnlyStmt) QueryContext(ctx context.Context, args ...any) (*sql.Rows, error)
QueryContext executes a prepared query statement with context.
func (*ReadOnlyStmt) QueryRow ¶ added in v0.9.0
func (s *ReadOnlyStmt) QueryRow(args ...any) *sql.Row
QueryRow executes a prepared query statement that returns at most one row. Deprecated: Use QueryRowContext instead.
func (*ReadOnlyStmt) QueryRowContext ¶ added in v0.9.0
func (s *ReadOnlyStmt) QueryRowContext(ctx context.Context, args ...any) *sql.Row
QueryRowContext executes a prepared query statement that returns at most one row with context.
type ReadOnlyTx ¶ added in v0.9.0
type ReadOnlyTx struct {
// contains filtered or unexported fields
}
ReadOnlyTx wraps a *sql.Tx to enforce read-only operations.
func (*ReadOnlyTx) Commit ¶ added in v0.9.0
func (t *ReadOnlyTx) Commit() error
Commit commits the transaction.
func (*ReadOnlyTx) Exec ¶ added in v0.9.0
func (t *ReadOnlyTx) Exec(query string, args ...any) (sql.Result, error)
Exec rejects write operations. Deprecated: Use ExecContext instead.
func (*ReadOnlyTx) ExecContext ¶ added in v0.9.0
func (t *ReadOnlyTx) ExecContext(ctx context.Context, query string, args ...any) (sql.Result, error)
ExecContext rejects write operations.
func (*ReadOnlyTx) Prepare ¶ added in v0.9.0
func (t *ReadOnlyTx) Prepare(query string) (*ReadOnlyStmt, error)
Prepare creates a prepared statement within the transaction. Deprecated: Use PrepareContext instead.
func (*ReadOnlyTx) PrepareContext ¶ added in v0.9.0
func (t *ReadOnlyTx) PrepareContext(ctx context.Context, query string) (*ReadOnlyStmt, error)
PrepareContext creates a prepared statement within the transaction with context.
func (*ReadOnlyTx) Query ¶ added in v0.9.0
func (t *ReadOnlyTx) Query(query string, args ...any) (*sql.Rows, error)
Query executes a query that returns rows. Deprecated: Use QueryContext instead.
func (*ReadOnlyTx) QueryContext ¶ added in v0.9.0
func (t *ReadOnlyTx) QueryContext(ctx context.Context, query string, args ...any) (*sql.Rows, error)
QueryContext executes a query that returns rows with context.
func (*ReadOnlyTx) QueryRow ¶ added in v0.9.0
func (t *ReadOnlyTx) QueryRow(query string, args ...any) *sql.Row
QueryRow executes a query that returns at most one row. Deprecated: Use QueryRowContext instead.
func (*ReadOnlyTx) QueryRowContext ¶ added in v0.9.0
func (t *ReadOnlyTx) QueryRowContext(ctx context.Context, query string, args ...any) *sql.Row
QueryRowContext executes a query that returns at most one row with context.
func (*ReadOnlyTx) Rollback ¶ added in v0.9.0
func (t *ReadOnlyTx) Rollback() error
Rollback aborts the transaction.
type Record ¶ added in v0.4.4
type Record []string
Record represents file records as a slice of string fields. This type was changed from unexported 'record' to exported 'Record' in v0.5.0 to fix lint issues with exported methods returning unexported types.
Breaking change: Code that previously imported and used the unexported 'record' type will need to be updated to use 'Record'.
type SlogAdapter ¶ added in v0.10.0
type SlogAdapter struct {
// contains filtered or unexported fields
}
SlogAdapter wraps slog.Logger to implement the Logger interface
func NewSlogAdapter ¶ added in v0.10.0
func NewSlogAdapter(logger *slog.Logger) *SlogAdapter
NewSlogAdapter creates a new SlogAdapter wrapping the given slog.Logger
func (*SlogAdapter) Debug ¶ added in v0.10.0
func (s *SlogAdapter) Debug(msg string, args ...any)
Debug logs a debug message
func (*SlogAdapter) Error ¶ added in v0.10.0
func (s *SlogAdapter) Error(msg string, args ...any)
Error logs an error message
func (*SlogAdapter) Info ¶ added in v0.10.0
func (s *SlogAdapter) Info(msg string, args ...any)
Info logs an info message
func (*SlogAdapter) Warn ¶ added in v0.10.0
func (s *SlogAdapter) Warn(msg string, args ...any)
Warn logs a warning message
func (*SlogAdapter) With ¶ added in v0.10.0
func (s *SlogAdapter) With(args ...any) Logger
With returns a new Logger with the given key-value pairs added
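A sketch wiring a standard library slog.Logger through the adapter (the component name, message, and key-value pairs are illustrative):
var logger Logger = NewSlogAdapter(slog.New(slog.NewTextHandler(os.Stderr, nil)))
// With returns the Logger interface, so keep the variable typed as Logger.
logger = logger.With("component", "filesql")
logger.Info("tables loaded", "count", 3)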
type SlogContextAdapter ¶ added in v0.10.0
type SlogContextAdapter struct {
SlogAdapter
}
SlogContextAdapter wraps slog.Logger to implement the ContextLogger interface
func NewSlogContextAdapter ¶ added in v0.10.0
func NewSlogContextAdapter(logger *slog.Logger) *SlogContextAdapter
NewSlogContextAdapter creates a new SlogContextAdapter wrapping the given slog.Logger
func (*SlogContextAdapter) DebugContext ¶ added in v0.10.0
func (s *SlogContextAdapter) DebugContext(ctx context.Context, msg string, args ...any)
DebugContext logs a debug message with context
func (*SlogContextAdapter) ErrorContext ¶ added in v0.10.0
func (s *SlogContextAdapter) ErrorContext(ctx context.Context, msg string, args ...any)
ErrorContext logs an error message with context
func (*SlogContextAdapter) InfoContext ¶ added in v0.10.0
func (s *SlogContextAdapter) InfoContext(ctx context.Context, msg string, args ...any)
InfoContext logs an info message with context
func (*SlogContextAdapter) WarnContext ¶ added in v0.10.0
func (s *SlogContextAdapter) WarnContext(ctx context.Context, msg string, args ...any)
WarnContext logs a warning message with context
func (*SlogContextAdapter) With ¶ added in v0.10.0
func (s *SlogContextAdapter) With(args ...any) Logger
With returns a new ContextLogger with the given key-value pairs added
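The context variant is used the same way but forwards ctx to slog (a minimal sketch; the message and attributes are illustrative):
ctxLogger := NewSlogContextAdapter(slog.New(slog.NewJSONHandler(os.Stdout, nil)))
ctxLogger.InfoContext(ctx, "query finished", "rows", 42)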
type TableName ¶ added in v0.4.2
type TableName struct {
// contains filtered or unexported fields
}
TableName represents a table name with validation
func NewTableName ¶ added in v0.4.2
NewTableName creates a new TableName with validation
