
OOM crash on large transcript files — split_mega_files.py and normalize.py load entire file into memory #396

@igorls

Description

split_mega_files.py and normalize.py eagerly load entire file contents into memory, causing MemoryError / OOM kills on large transcript exports (multi-GB Slack or ChatGPT bulk exports).

Root Cause

split_mega_files.py

Line 185 — split_file() loads the entire file:

lines = path.read_text(errors="replace").splitlines(keepends=True)

Line 270 — main() scan loop loads it again to count session boundaries:

lines = f.read_text(errors="replace").splitlines(keepends=True)

Every file is read fully into memory twice.

normalize.py

Line 29-30:

with open(filepath, "r", encoding="utf-8", errors="replace") as f:
    content = f.read()

Memory Impact (measured)

Input: 1MB test file

  • read_text(): 1,040,041 bytes allocated
  • splitlines(): 4,071,081 bytes allocated
  • Overhead ratio: 2.9x the input size (string + list of lines)
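The numbers above can be reproduced with tracemalloc. This is a standalone sketch, not the original measurement harness: the file contents, size, and `measure_peak` helper are illustrative.

```python
import os
import tempfile
import tracemalloc
from pathlib import Path

def measure_peak(path: Path) -> int:
    """Peak bytes allocated while replicating split_file()'s eager read."""
    tracemalloc.start()
    lines = path.read_text(errors="replace").splitlines(keepends=True)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del lines
    return peak

# Build a ~1MB file of short lines, similar to the test input above.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("session line\n" * 80_000)
    name = tmp.name

path = Path(name)
size = path.stat().st_size
peak = measure_peak(path)
print(f"input: {size:,} bytes, peak: {peak:,} bytes")
os.unlink(name)
```

The exact ratio varies with line length (per-line string object overhead dominates for short lines), but the peak is always a multiple of the input size.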

For a 2GB Slack export:

  • read_text() allocates ~2GB
  • .splitlines(keepends=True) allocates an additional ~2-3GB
  • Total peak: ~5.8GB for a 2GB file
  • On an 8GB machine → OOM kill
  • split_mega_files.py reads the file twice → could peak at ~11.6GB

Impact

  • MemoryError crash on large transcript files
  • OOM killer terminates the process on memory-constrained systems
  • ChatGPT conversations.json exports can easily exceed 1GB
  • Slack workspace exports can exceed 5GB

Suggested Fix

Stream files line-by-line instead of loading everything into memory:

def find_session_boundaries_streaming(filepath):
    boundaries = []
    with open(filepath, errors="replace") as f:
        for i, line in enumerate(f):
            if "Claude Code v" in line:
                # peek ahead for context restore detection
                boundaries.append(i)
    return boundaries

And refactor split_file() to use a two-pass streaming approach or split using byte offsets.
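A minimal sketch of the byte-offset variant, assuming the "Claude Code v" marker is the only boundary signal (the function names and part-file naming scheme here are illustrative, not existing mempalace API): pass one records the byte offset of each boundary line, pass two copies each byte range to its own file in fixed-size chunks, so memory use stays constant regardless of file size.

```python
from pathlib import Path

def find_boundary_offsets(path: Path) -> list[int]:
    """First pass: byte offset of each line that starts a new session."""
    offsets = []
    pos = 0
    with open(path, "rb") as f:
        for raw in f:  # iterates line-by-line; one line in memory at a time
            if b"Claude Code v" in raw:
                offsets.append(pos)
            pos += len(raw)
    return offsets

def split_by_offsets(path: Path, offsets: list[int], out_dir: Path,
                     chunk_size: int = 1 << 20) -> list[Path]:
    """Second pass: copy each [start, end) byte range to its own file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    ends = offsets[1:] + [path.stat().st_size]
    parts = []
    with open(path, "rb") as src:
        for n, (start, end) in enumerate(zip(offsets, ends)):
            part = out_dir / f"{path.stem}.part{n:03d}{path.suffix}"
            src.seek(start)
            remaining = end - start
            with open(part, "wb") as dst:
                while remaining > 0:
                    buf = src.read(min(chunk_size, remaining))
                    if not buf:
                        break
                    dst.write(buf)
                    remaining -= len(buf)
            parts.append(part)
    return parts
```

Working in binary mode sidesteps decoding entirely during the split; errors="replace" decoding can stay in the per-session processing stage where each part is already small.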

Environment

  • mempalace v3.0.13 (current main)
  • split_mega_files.py lines 185, 270
  • normalize.py lines 29-30
