Cross-platform Python toolkit for managing LTO tape archives using LTFS (Linear Tape File System). This package provides functionality similar to commercial tools like Canister, but with Linux support. It is not intended to replace industry-tested commercial tools, which are more robust and mature.
This is experimental software under active development. While we have implemented verification at every step (source hashing, transfer, read-back verification), this code is new and has not been battle-tested in production environments.
Before using this for important data:
- Do not rely solely on this tool for critical backups. Always maintain multiple copies of irreplaceable data using different methods and tools.
- Verify independently. After any transfer, consider spot-checking files manually or using additional verification tools.
- Test thoroughly. Run test transfers with non-critical data before trusting this tool with important archives.
- Keep your source data. Do not delete original files until you have independently confirmed the tape contents are correct and readable.
- Understand the risks. Tape drives, LTFS implementations, and this software can all have bugs. A single point of failure in your backup strategy is dangerous.
The authors provide this software as-is, without warranty. Use at your own risk.
- Cross-platform: Works on macOS and Linux
- XXHash64 verification: Fast, reliable file integrity checking
- MHL output: Industry-standard Media Hash List format
- Automatic LTFS index backup: Captures tape indexes for recovery
- Offline catalog system: Browse tape contents without mounting (Canister-style)
- CatalogFS (FUSE): Mount catalogs as a virtual filesystem with real file sizes
- Tape metadata extraction: Read volume UUID, generation, software info via xattrs
- Rich CLI: Beautiful progress bars and formatted output
- Pythonic API: Use as a library in your own scripts
# From source
git clone https://github.com/ctsilva/ltfs-tools.git
cd ltfs-tools
pip install -e .
# Or with development dependencies
pip install -e ".[dev]"- Python 3.9+
- LTFS drivers (IBM Spectrum Archive or vendor-provided)
- rsync
- xattr Python module (auto-installed)
# Mount a tape (auto-captures indexes)
ltfs-tool mount BACKUP01
# Check tape info (fast - no deep scan)
ltfs-tool info
# Transfer files with verification
ltfs-tool transfer /path/to/archive BACKUP01
# Unmount (critical - writes final index!)
ltfs-tool unmountMount an LTFS-formatted tape. Automatically captures index backups to ~/ltfs-archives/indexes/.
ltfs-tool mount [VOLUME_NAME] [--mount-point PATH] [--device DEVICE]
# Examples
ltfs-tool mount # Mount with defaults
ltfs-tool mount BACKUP01 # Mount with volume name
ltfs-tool mount BACKUP01 -m /mnt/tape # Custom mount pointImportant: On Linux, the tape device is auto-detected via lsscsi -g. On macOS, uses device index 0.
Safely unmount a tape. Always do this before ejecting! This writes the final LTFS index to tape.
ltfs-tool unmount [--mount-point PATH]Display information about a mounted tape. Shows:
- Mount point and device
- Volume UUID and generation number
- Software vendor/product/version
- Format specification version
- Top-level directory contents
- Total size (using fast
ducommand)
ltfs-tool info [--mount-point PATH]
# Example output:
# LTFS Tape Information
#
# Mount Point /media/tape
# Platform linux
# Device /dev/sg1
# Volume UUID 001c2668-aa66-475e-a211-bfcfb7b64712
# Software Vendor IBM
# Software Product LTFS LE
# Software Version 2.5.0.0 (Prelim)
# Format Spec 2.4.0
#
# Top-level Contents:
# Directories:
# Downloads/
# backups/
# home-old-vida/Transfer files with XXHash verification, logging, and MHL generation.
ltfs-tool transfer SOURCE [TAPE_NAME] [--dry-run] [--no-verify] [--mount-point PATH]
# Examples
ltfs-tool transfer ~/Documents BACKUP01
ltfs-tool transfer /data/project PROJECT_ARCHIVE
ltfs-tool transfer ~/Documents BACKUP01 --dry-run # Preview only
ltfs-tool transfer ~/Documents BACKUP01 -m /Volumes/LTFS_TAPE # Custom mount pointTransfer Process:
- Phase 1: Hash all source files (XXHash64), tracking excluded files
- Phase 2: Transfer files with rsync (excluding system/temp files)
- Phase 3: Verify destination files by comparing hashes (reads from tape, not cache)
- Phase 4: Generate MHL file with all hashes and metadata
- Phase 5: Update catalog with zero-byte placeholders
Phase Performance Output:
After each transfer, detailed timing is shown:
Phase Performance
Phase 1 (Hash source) 16.68s 614.0 MB/s
Phase 2 (Transfer) 46.42s 220.6 MB/s
Phase 3 (Verify) 35.38s 289.4 MB/s
Phase 4 (MHL) 0.04s —
Phase 5 (Catalog) 0.05s —
Overall 111.34s 92.0 MB/s
File Exclusions: By default, system and temporary files are excluded:
.DS_Store,.Spotlight-*,.fseventsd,.Trashes(macOS)Thumbs.db(Windows)*.tmp,.TemporaryItemsLibrary/Caches/
Excluded files are listed in the transfer log for transparency.
Long-Running Transfers:
For large transfers that may take hours, use tmux to prevent disconnection from stopping the job:
# Start a tmux session
tmux new -s tape-transfer
# Inside tmux, run your transfer
ltfs-tool transfer /large/dataset BACKUP01
# Detach from tmux (transfer continues in background)
# Press: Ctrl+b, then d
# Later, reattach to check progress
tmux attach -t tape-transfer
# List all tmux sessions
tmux lsThis ensures your transfer continues even if your SSH connection drops. The transfer log at ~/ltfs-archives/logs/ will contain the complete record.
Outputs:
- Transfer log:
~/ltfs-archives/logs/(includes excluded files list) - MHL file:
~/ltfs-archives/mhl/ - LTFS index backups:
~/ltfs-archives/indexes/(automatic) - Catalog:
~/ltfs-archives/catalogs/(created from indexes)
Recover MHL and catalog after a failed transfer by re-hashing the source files (fast).
Use this when a transfer crashed during Phase 4 (MHL) or Phase 5 (Catalog) but the files are already on tape. This reads from the original source (SSD/disk) which is much faster than reading from tape.
ltfs-tool recover SOURCE [TAPE_NAME]
# Example - recover using original source
ltfs-tool recover /scratch/csilva/deathstar2 deathstar2Time estimate: ~500-600 MB/s (limited by source disk speed)
Generate MHL and catalog from tape files (slower than recover).
Use this when the original source is no longer available and you need to generate MHL/catalog from the tape itself.
ltfs-tool finalize SOURCE_NAME [TAPE_NAME]
# Example - finalize from tape
ltfs-tool finalize deathstar2Time estimate: ~200-300 MB/s (limited by tape read speed)
Verify files against an MHL file.
ltfs-tool verify MHL_FILE [BASE_PATH]
# Examples
ltfs-tool verify ~/ltfs-archives/mhl/BACKUP01.mhl # Verify mounted tape
ltfs-tool verify archive.mhl /path/to/restored/files # Verify restored filesManage tape catalogs. Catalogs are created from LTFS index XML files and allow searching tape contents without mounting the tape.
# List cataloged tapes
ltfs-tool catalog list
# Search across catalogs
ltfs-tool catalog search "*.mov"
ltfs-tool catalog search "project*" --tape BACKUP01Mount all tape catalogs as a virtual filesystem using FUSE. Files show their real sizes from LTFS indexes without consuming disk space. Inspired by Canister's catalog browsing feature.
# Install FUSE support
pip install ltfs-tools[fuse]
# Also ensure FUSE is installed on your system:
# Linux: sudo apt install fuse3
# macOS: brew install macfuse
# Mount catalogs
ltfs-tool catalog mount /mnt/catalogs
# Browse with standard tools - files show real sizes!
ls -la /mnt/catalogs/
ls -lh /mnt/catalogs/BACKUP01/Videos/
du -sh /mnt/catalogs/BACKUP01/*
# Reading a file shows which tape it's on
cat /mnt/catalogs/BACKUP01/Videos/project.mov
# Output: [File is on tape: BACKUP01]
# Size: 1,234,567,890 bytes
# Mount tape BACKUP01 to access this file.
# Run in foreground for debugging
ltfs-tool catalog mount /mnt/catalogs --foreground
# Unmount
ltfs-tool catalog unmount /mnt/catalogsFeatures:
- Shows actual file sizes from LTFS index (not zero-byte placeholders)
- Preserves timestamps from original files
- Read-only filesystem (protects against accidental writes)
- Browse multiple tapes in one mount point
- Works with
ls,find,du, and other standard tools
from pathlib import Path
from ltfs_tools import mount, transfer, verify, unmount
# Mount tape (auto-captures indexes)
mount(volume_name="BACKUP01")
# Transfer files
result = transfer(
source=Path("/path/to/archive"),
tape_name="BACKUP01",
)
print(f"Transferred {result.files_verified} files")
print(f"MHL file: {result.mhl_path}")
# Unmount
unmount()The killer feature - browse tape contents without mounting!
from pathlib import Path
from ltfs_tools import (
create_catalog_from_index,
update_catalog_from_latest_index,
search_catalogs,
list_tapes
)
# Option 1: Create catalog from specific index file
index_file = Path("~/ltfs-archives/indexes/001c2668-226-b.xml")
catalog_dir = create_catalog_from_index(index_file, tape_name="BACKUP01")
# Option 2: Auto-update from latest index (recommended)
catalog_dir = update_catalog_from_latest_index("BACKUP01")
# List all cataloged tapes
tapes = list_tapes()
print(f"Available tapes: {tapes}")
# Search without mounting the tape!
results = search_catalogs("*.mov")
for tape_name, file_path in results:
print(f"{tape_name}: {file_path}")from pathlib import Path
from ltfs_tools.ltfs_index import LTFSIndexParser
# Parse an LTFS index XML file
index = LTFSIndexParser.parse(Path("tape-index.xml"))
print(f"Volume UUID: {index.volume_uuid}")
print(f"Generation: {index.generation}")
print(f"Format version: {index.version}")
# Get all files
all_files = LTFSIndexParser.get_all_files(index)
for file in all_files:
print(f"{file.path}: {file.size} bytes")
print(f" Physical location: partition {file.extents[0].partition}, "
f"block {file.extents[0].start_block}")
# Get all directories
all_dirs = LTFSIndexParser.get_all_directories(index)from ltfs_tools import MHL, HashEntry
from pathlib import Path
# Create an MHL
mhl = MHL()
mhl.add_hash(HashEntry(
file="document.pdf",
size=1234567,
xxhash64be="abc123def456789",
))
mhl.save(Path("archive.mhl"))
# Load and iterate
mhl = MHL.load(Path("archive.mhl"))
for entry in mhl:
print(f"{entry.file}: {entry.xxhash64be}")from ltfs_tools import hash_file, verify_hash
from pathlib import Path
# Hash a file (XXHash64 big-endian)
file_hash = hash_file(Path("large_video.mov"))
# Verify against known hash
is_valid = verify_hash(Path("large_video.mov"), "expected_hash")from ltfs_tools import get_tape_info
from pathlib import Path
# Get info about mounted tape
info = get_tape_info(Path("/media/tape"))
if info["mounted"]:
print(f"Volume UUID: {info.get('volumeUUID')}")
print(f"Generation: {info.get('generation')}")
print(f"Software: {info.get('softwareProduct')} {info.get('softwareVersion')}")
print(f"Top-level dirs: {info['top_level_dirs']}")export LTFS_MOUNT_POINT="/media/tape"
export LTFS_DEVICE="/dev/sg1"
export LTFS_ARCHIVE_BASE="/data/tape-archives"from ltfs_tools import Config, set_config
from pathlib import Path
config = Config(
mount_point=Path("/custom/mount"),
device="/dev/sg1",
archive_base=Path("/data/archives"),
excludes=[".DS_Store", "*.tmp", "node_modules/"],
)
set_config(config)- LTFS binary:
/usr/local/bin/ltfs - Default device:
/dev/sg1(SCSI generic for mounting) - Format device:
/dev/st0(tape device formkltfs) - Default mount:
/media/tape
- LTFS binary:
/Library/Frameworks/LTFS.framework/.../ltfs - Default device:
0(device index) - Default mount:
/Volumes/LTFS
LTFS Tools creates an organized archive structure:
~/ltfs-archives/
├── indexes/ # LTFS index backups (XML)
│ ├── 001c2668-aa66-475e-a211-bfcfb7b64712-226-a.xml
│ └── 001c2668-aa66-475e-a211-bfcfb7b64712-226-b.xml
├── catalogs/ # Zero-byte placeholders (Canister-style)
│ └── BACKUP01/
│ ├── Documents/
│ │ ├── file1.pdf (0 bytes, preserves mtime)
│ │ └── file2.docx (0 bytes, preserves mtime)
│ └── Videos/
├── mhl/ # Media Hash Lists (XML)
│ └── BACKUP01-20251206.mhl
└── logs/ # Transfer logs
└── BACKUP01-20251206.log
LTFS indexes are XML files containing complete filesystem metadata:
- Volume UUID, generation number, format version
- Complete directory tree structure
- File metadata (paths, sizes, timestamps)
- Physical tape locations (partition, block numbers, byte offsets)
- Used for mounting and recovery
Automatically captured during mount with -o capture_index.
Transfer logs provide detailed records of each transfer operation:
LTFS Transfer Log
================
Source: /path/to/source
Destination: /media/tape/destination
Tape: BACKUP01
Started: 2025-12-06T23:17:07Z
Files counted: 824
Files to transfer: 810
Files excluded: 14
--- Excluded files ---
.DS_Store
.fseventsd/...
.Trashes/...
[additional excluded files]
--- rsync output ---
[rsync transfer progress]
--- Summary ---
Finished: 2025-12-06T23:23:16Z
Duration: 369.5s
Files counted: 824
Files transferred: 810
Files excluded: 14
Files verified: 810
Files failed: 0
Catalogs enable offline tape browsing without mounting:
- Zero-byte placeholder files mirroring tape structure
- Preserves timestamps and directory hierarchy
- Searchable with standard filesystem tools
- Created from LTFS index XML files
Industry-standard XML format for archival verification:
<?xml version="1.0" encoding="UTF-8"?>
<hashlist version="1.1">
<creatorinfo>
<name>User Name</name>
<tool>ltfs-tools 0.1.0</tool>
</creatorinfo>
<tapeinfo>
<name>BACKUP01</name>
</tapeinfo>
<hash>
<file>document.pdf</file>
<size>1234567</size>
<xxhash64be>abc123def456789</xxhash64be>
<hashdate>2025-12-06T15:30:00Z</hashdate>
</hash>
</hashlist>Based on benchmarks with 10 GB transfers:
| File Size | Write Speed | Verify Speed | Overall |
|---|---|---|---|
| 10 MB | 220 MB/s | 289 MB/s | 92 MB/s |
| 100 MB | 203 MB/s | 217 MB/s | 86 MB/s |
| 1 MB | 184 MB/s | 212 MB/s | 77 MB/s |
| File Size | Write Speed | Verify Speed | Overall |
|---|---|---|---|
| 10 MB | 101 MB/s | 81 MB/s | 44 MB/s |
| 100 MB | 113 MB/s | 88 MB/s | 49 MB/s |
| 1 MB | 112 MB/s | 76 MB/s | 43 MB/s |
Time estimates for large transfers:
| Data Size | LTO-9 (Linux) | LTO-6 (macOS) |
|---|---|---|
| 1 TB | ~3 hr | ~6 hr |
| 5 TB | ~15 hr | ~30 hr |
| 10 TB | ~30 hr | ~60 hr |
Rule of thumb: Total time ≈ 3× raw transfer time (includes hashing and verification).
# Install dev dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Run tests with coverage
pytest --cov=ltfs_tools
# Format code
black src tests
# Lint
ruff src tests
# Type check
mypy srcFor operations that may take hours (large transfers, tape formatting), use tmux or screen to ensure the process continues even if your SSH session disconnects:
# Start a named tmux session
tmux new -s backup-job
# Run your long operation inside tmux
ltfs-tool transfer /large/dataset BACKUP-TAPE
# Detach from tmux (keeps process running)
# Press: Ctrl+b, then d
# Reconnect later from any terminal
tmux attach -t backup-job
# List all active sessions
tmux ls
# Kill a session when done
tmux kill-session -t backup-jobUseful tmux commands:
Ctrl+b, d- Detach from sessionCtrl+b, c- Create new windowCtrl+b, n- Next windowCtrl+b, p- Previous windowCtrl+b, [- Scroll mode (use arrow keys,qto exit)
- Always mount with a volume name:
ltfs-tool mount TAPE-NAME - Check tape info before transfer:
ltfs-tool info - Use tmux for large transfers: Prevents interruption from network issues
- Monitor logs:
tail -f ~/ltfs-archives/logs/transfer_*.log - Always unmount properly:
ltfs-tool unmountwrites final index to tape - Verify catalog after transfer:
ltfs-tool catalog list
- Never power off without unmounting - this will corrupt the LTFS index
- Eject only after unmount - ensures final index is written
- Label tapes clearly with both physical labels and volume names
- Store transfer logs - they contain excluded files lists and verification results
Use sudo or add your user to the tape group:
sudo usermod -a -G tape $USER
# Log out and back in for group change to take effectTo ensure Phase 3 reads from tape (not Linux page cache), ltfs-tools needs permission to drop the cache. Add this sudoers rule:
echo "$USER ALL=(ALL) NOPASSWD: /usr/sbin/sysctl -w vm.drop_caches=3" | sudo tee /etc/sudoers.d/ltfs-drop-cache
sudo chmod 0440 /etc/sudoers.d/ltfs-drop-cacheWithout this, verification may read from memory cache instead of tape, showing unrealistic speeds (4000+ MB/s instead of ~200-300 MB/s).
On macOS, LTFS may create mount points with the tape barcode in the name (e.g., /Volumes/LTFS_10WT050503_SILV03). Use the --mount-point option to specify the correct path:
ltfs-tool info --mount-point /Volumes/LTFS_10WT050503_SILV03
ltfs-tool transfer ~/data BACKUP01 --mount-point /Volumes/LTFS_10WT050503_SILV03macOS does not support dropping the page cache (vm.drop_caches), so verification speeds may be inflated if recently transferred files are still cached in memory. For accurate verification timing, wait a few minutes or transfer datasets larger than available RAM.
Check that the index directory exists and is writable:
ls -la ~/ltfs-archives/indexes/Indexes are only written when the LTFS index is updated (every 5 minutes by default with sync_type=time@5, or during unmount).
Make sure catalogs have been created from indexes:
from ltfs_tools import update_catalog_from_latest_index
update_catalog_from_latest_index("TAPE_NAME")MIT
This project started by loosely reverse-engineering how commercial tools like Canister work with LTFS on macOS. The core insight is that LTFS indexes contain all metadata needed to create offline catalogs. Significant additional engineering was required to support Linux, including automatic device detection, platform-specific mount handling, and page cache management for accurate verification.