feat: redesign Models tab with improved layout and detail panel #14
Merged
Add inline model selection and H2H comparison to the Benchmarks tab:
- Space to toggle model selection (max 8) with colored markers
- Bottom panel auto-transitions to H2H table when 2+ selected
- Head-to-head comparison with ranked metrics and star winners
- Detail overlay popup (d key) when H2H is shown
- Contextual footer and help documentation
- Status messages for selection actions
- Selection state on App level, cleared on store rebuild

Co-Authored-By: Claude Opus 4.6 <[email protected]>
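The selection rules in this commit (toggle with Space, cap of 8, H2H once 2+ are selected) can be sketched as follows. This is a minimal illustration, not the PR's code: `MAX_SELECTED`, `Toggle`, `toggle_selection`, and `shows_h2h` are hypothetical names, and storing selections as a `Vec<usize>` of row indices is an assumption.

```rust
/// Upper bound on simultaneously selected models (the commit states max 8).
const MAX_SELECTED: usize = 8;

/// Outcome of pressing Space on a row; each could map to a status message.
#[derive(Debug, PartialEq, Eq)]
enum Toggle {
    Added,
    Removed,
    Full,
}

/// Toggle `index` in the selection, refusing to grow past MAX_SELECTED.
fn toggle_selection(selected: &mut Vec<usize>, index: usize) -> Toggle {
    if let Some(pos) = selected.iter().position(|&i| i == index) {
        selected.remove(pos);
        Toggle::Removed
    } else if selected.len() >= MAX_SELECTED {
        Toggle::Full
    } else {
        selected.push(index);
        Toggle::Added
    }
}

/// The bottom panel switches to the H2H table once two or more models are selected.
fn shows_h2h(selected: &[usize]) -> bool {
    selected.len() >= 2
}
```

Keeping insertion order in a `Vec` (rather than a set) would let each selected model keep a stable marker color, though whether the PR does this is not stated.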
Add Scatter and Radar visualization views to the Benchmarks tab bottom panel, cycling with 'v' when 2+ models are selected. Rebind 'c' to clear selections, freeing 'x'/'y' for scatter axis cycling and 'a' for radar preset cycling.
- Port radar.rs module with Canvas/Braille rendering, 3 presets (Agentic, Academic, Indexes), and average baseline polygon
- Port draw_scatter() with auto-log scale, crosshair averages, colored selected model markers, and legend
- Add sub-tab bar [H2H] [Scatter] [Radar] with active view highlight
- Contextual footer hints and help popup per view type
- Status messages for view/axis/preset cycling
- 10 new unit tests for cycling logic and spoke geometry

Co-Authored-By: Claude Opus 4.6 <[email protected]>
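A rough sketch of the cycling logic described above, assuming the views cycle in the order shown in the sub-tab bar and the presets in the order listed. The type and function names (`CompareView`, `RadarPreset`, `on_view_key`) are hypothetical and not taken from the PR.

```rust
/// Bottom-panel views, in the order of the sub-tab bar [H2H] [Scatter] [Radar].
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CompareView {
    H2H,
    Scatter,
    Radar,
}

impl CompareView {
    /// Advance to the next view, wrapping back to H2H after Radar.
    fn next(self) -> Self {
        match self {
            CompareView::H2H => CompareView::Scatter,
            CompareView::Scatter => CompareView::Radar,
            CompareView::Radar => CompareView::H2H,
        }
    }
}

/// Radar presets; the cycling order is an assumption based on the listing order.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RadarPreset {
    Agentic,
    Academic,
    Indexes,
}

impl RadarPreset {
    /// Advance to the next preset, wrapping around.
    fn next(self) -> Self {
        match self {
            RadarPreset::Agentic => RadarPreset::Academic,
            RadarPreset::Academic => RadarPreset::Indexes,
            RadarPreset::Indexes => RadarPreset::Agentic,
        }
    }
}

/// 'v' only cycles views when at least two models are selected.
fn on_view_key(view: CompareView, selected_count: usize) -> CompareView {
    if selected_count >= 2 {
        view.next()
    } else {
        view
    }
}
```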
Add 9 missing metrics (IFBench, Terminal, Tau2, LCR, MATH-500, AIME, AIME'25, TTFAT, Blended price) and organize into 4 sections matching the detail panel: Indexes, Benchmarks, Performance, Pricing. Co-Authored-By: Claude Opus 4.6 <[email protected]>
- Add Model Info section with Creator, Source, Region, Type, Released
- Source colored green (Open) / red (Closed)
- Region and Type use creator category colors from sidebar
- Add medal-colored ranks: gold ★ (#1), silver (#2), bronze (#3)
- Add ★ Wins summary row near top showing per-model win counts
- Pre-compute wins via two-pass to position above metric sections

Co-Authored-By: Claude Opus 4.6 <[email protected]>
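The two-pass idea (count wins first, then render the Wins row above the metric sections) might look like the sketch below. `Metric` and `count_wins` are hypothetical names, and the tie rule (first model with the best value wins) and the `higher_is_better` flag are assumptions; the PR does not describe how ties or lower-is-better metrics such as price are handled.

```rust
/// One comparison row: a value per selected model, `None` when data is missing.
struct Metric {
    /// Assumed flag: false for metrics such as price or latency.
    higher_is_better: bool,
    values: Vec<Option<f64>>,
}

/// First pass: count how many metrics each model wins.
/// The caller can then render the Wins row before emitting the metric rows.
fn count_wins(metrics: &[Metric], model_count: usize) -> Vec<usize> {
    let mut wins = vec![0; model_count];
    for metric in metrics {
        let mut best: Option<(usize, f64)> = None;
        for (i, value) in metric.values.iter().enumerate().take(model_count) {
            let Some(v) = *value else { continue };
            let better = match best {
                None => true,
                Some((_, b)) => {
                    if metric.higher_is_better {
                        v > b
                    } else {
                        v < b
                    }
                }
            };
            if better {
                best = Some((i, v));
            }
        }
        // Metrics with no data for any model award no win.
        if let Some((i, _)) = best {
            wins[i] += 1;
        }
    }
    wins
}
```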
- Wins row now appears directly under model name header
- Section headers provide natural separation between info and metrics

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Browse mode (0-1 selections): 25/75 horizontal, 65/35 vertical split giving the list more space and shrinking the detail panel. Compare mode (2+ selections): creators sidebar hidden, compact list at 30% width (min 35 chars) with full-height comparison views at 70%. H2H table scrollable via j/k when Compare panel focused. Focus model switches between List↔Compare (h/l/Tab). Border colors indicate focused panel. Footer hints adapt to current mode and focus. Co-Authored-By: Claude Opus 4.6 <[email protected]>
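As a plain-arithmetic illustration of the compare-mode split (compact list at 30% of the width, but at least 35 columns), here is a small sketch. The function name `compare_split` is hypothetical, and clamping the list to the terminal width on very narrow terminals is an assumption. The real code may express this with layout constraints instead.

```rust
/// Split `total_width` columns into (compact list, comparison view).
/// The list takes 30% of the width with a floor of 35 columns.
fn compare_split(total_width: u16) -> (u16, u16) {
    let thirty_percent = (total_width as u32 * 30 / 100) as u16;
    // Apply the 35-column minimum, but never exceed the terminal width.
    let list = thirty_percent.max(35).min(total_width);
    (list, total_width - list)
}
```

On a 200-column terminal this yields a 60/140 split; at 100 columns the 35-column minimum takes over.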
The compare mode compact list now displays the active sort column/direction and source filter in the title bar, matching the full list.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Press `t` in compare mode to swap the left panel between the compact models list and the creators sidebar. Focus adjusts automatically to match the visible panel. Footer hint shows the toggle target. Co-Authored-By: Claude Opus 4.6 <[email protected]>
- Evenly-spaced tick marks on both axes (7 ticks)
- Independent per-axis averages shown in axis titles (gray)
- Crosshair lines mark average positions on the chart
- Whole number bounds for non-log axes (floor/ceil)
- Log-aware tick formatting

Co-Authored-By: Claude Opus 4.6 <[email protected]>
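One way to place evenly spaced ticks on linear and log axes, and to compute the per-axis average used for the crosshair, is sketched below. `axis_ticks` and `average` are hypothetical helpers; spacing log-axis ticks evenly in log10 space is an assumption about what "log-aware" means here.

```rust
/// Return `count` tick positions between `min` and `max`, evenly spaced
/// on screen: linearly for normal axes, in log10 space for log axes.
/// For log axes, `min` and `max` must be positive.
fn axis_ticks(min: f64, max: f64, count: usize, log_scale: bool) -> Vec<f64> {
    if count < 2 {
        return vec![min];
    }
    let (lo, hi) = if log_scale {
        (min.log10(), max.log10())
    } else {
        (min, max)
    };
    let step = (hi - lo) / (count - 1) as f64;
    (0..count)
        .map(|i| {
            let pos = lo + step * i as f64;
            if log_scale {
                10f64.powf(pos)
            } else {
                pos
            }
        })
        .collect()
}

/// Mean of the plotted values on one axis, used for the crosshair line.
fn average(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        None
    } else {
        Some(values.iter().sum::<f64>() / values.len() as f64)
    }
}
```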
Non-log axes snap bounds to nice multiples (10, 20, 50) so ticks land on whole numbers. Log axes auto-increase decimal precision until all tick labels are unique, preventing duplicate "0.0" entries. Co-Authored-By: Claude Opus 4.6 <[email protected]>
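Both behaviors can be sketched in a few lines. This is an illustrative approximation, not the PR's code: `snap_bounds` and `unique_tick_labels` are hypothetical names, the step is passed in rather than chosen from the nice multiples, and starting at one decimal place is an assumption suggested by the duplicate "0.0" labels mentioned above.

```rust
use std::collections::HashSet;

/// Snap axis bounds outward to multiples of `step` (e.g. 10, 20, or 50),
/// so evenly spaced ticks can land on whole numbers.
fn snap_bounds(min: f64, max: f64, step: f64) -> (f64, f64) {
    ((min / step).floor() * step, (max / step).ceil() * step)
}

/// Format ticks with increasing decimal precision until every label is
/// distinct, giving up at `max_decimals`.
fn unique_tick_labels(ticks: &[f64], max_decimals: usize) -> Vec<String> {
    let mut labels: Vec<String> = Vec::new();
    for decimals in 1..=max_decimals.max(1) {
        labels = ticks.iter().map(|t| format!("{t:.decimals$}")).collect();
        let distinct: HashSet<&str> = labels.iter().map(String::as_str).collect();
        if distinct.len() == labels.len() {
            break;
        }
    }
    labels
}
```

For ticks `[0.01, 0.03, 0.1]`, one decimal would print "0.0" twice, so the second pass with two decimals is the first to produce unique labels.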
Parse parenthetical metadata from BenchmarkEntry names into structured fields: reasoning_status (Reasoning/NonReasoning/Adaptive/None), effort_level, variant_tag, and display_name (name with parens stripped). Add reasoning filter to Benchmarks tab (key 7, cycles All → Reasoning → Non-reasoning) with R/NR/AR indicator column, title bar indicator, footer hint, and help popup entry. Co-Authored-By: Claude Opus 4.6 <[email protected]>
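A simplified sketch of the parenthetical parsing step, using only the standard library. The field names follow the commit message, but `ParsedName`, `parse_entry_name`, the recognized effort words (low/medium/high), and the rule that the last unrecognized token becomes the variant tag are all assumptions. Date handling is deliberately left out.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ReasoningStatus {
    Reasoning,
    NonReasoning,
    Adaptive,
    None,
}

#[derive(Debug, PartialEq, Eq)]
struct ParsedName {
    display_name: String,
    reasoning_status: ReasoningStatus,
    effort_level: Option<String>,
    variant_tag: Option<String>,
}

/// Strip "(...)" groups from `name` and classify their comma-separated tokens.
fn parse_entry_name(name: &str) -> ParsedName {
    let mut parsed = ParsedName {
        display_name: String::new(),
        reasoning_status: ReasoningStatus::None,
        effort_level: None,
        variant_tag: None,
    };
    let mut rest = name;
    while let Some(open) = rest.find('(') {
        // An unclosed parenthesis is kept as part of the display name.
        let Some(len) = rest[open..].find(')') else { break };
        parsed.display_name.push_str(&rest[..open]);
        let inner = &rest[open + 1..open + len];
        for token in inner.split(',').map(str::trim).filter(|t| !t.is_empty()) {
            match token.to_ascii_lowercase().as_str() {
                "reasoning" => parsed.reasoning_status = ReasoningStatus::Reasoning,
                "non-reasoning" => parsed.reasoning_status = ReasoningStatus::NonReasoning,
                "adaptive" => parsed.reasoning_status = ReasoningStatus::Adaptive,
                "low" | "medium" | "high" => parsed.effort_level = Some(token.to_string()),
                // Assumption: anything else is treated as a variant tag.
                _ => parsed.variant_tag = Some(token.to_string()),
            }
        }
        rest = &rest[open + len + 1..];
    }
    parsed.display_name.push_str(rest);
    // Collapse the whitespace left behind by removed groups.
    parsed.display_name = parsed
        .display_name
        .split_whitespace()
        .collect::<Vec<_>>()
        .join(" ");
    parsed
}
```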
- Detail panel: reasoning status, effort level, and variant tag displayed alongside release date and source metadata
- H2H view: header uses display_name, added Reasoning/Effort/Variant rows in Model Info section (effort/variant only shown when present)
- Fix date regex to match full month names (June, March, etc.)
- Handle compound parentheticals like "(March 2025, chatgpt-4o-latest)" by splitting date from variant tag

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Three-layer reasoning status detection:
1. Parse parenthetical metadata: (Reasoning), (Non-reasoning), (Adaptive)
2. Name heuristics: standalone effort implies reasoning (e.g. "o4-mini (high)"), "Reasoning"/"Thinking" in base name detected
3. Models.dev augmentation: apply_reasoning_from_models_dev() uses the existing Jaro-Winkler matching pipeline to fill in reasoning status for models without parenthetical hints (o3, o1, DeepSeek R1, etc.)

Refactored open_weights.rs to extract both open_weights and reasoning from a shared match_entries() function, avoiding duplicate matching.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
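The layering can be read as a chain of fallbacks, where each layer is consulted only if the previous one produced nothing. The sketch below assumes the models.dev match has already been resolved to an optional boolean and omits the Jaro-Winkler matching itself. `Reasoning`, `from_name_heuristics`, and `resolve_reasoning` are hypothetical names.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Reasoning {
    Reasoning,
    NonReasoning,
    Adaptive,
}

/// Layer 2: a standalone effort level, or "Reasoning"/"Thinking" in the
/// base name (parentheticals already stripped), implies a reasoning model.
fn from_name_heuristics(base_name: &str, has_effort: bool) -> Option<Reasoning> {
    let lower = base_name.to_lowercase();
    if has_effort || lower.contains("reasoning") || lower.contains("thinking") {
        Some(Reasoning::Reasoning)
    } else {
        None
    }
}

/// Combine the three layers in priority order.
/// `models_dev_flag` stands in for the matched models.dev `reasoning` value.
fn resolve_reasoning(
    parenthetical: Option<Reasoning>,
    base_name: &str,
    has_effort: bool,
    models_dev_flag: Option<bool>,
) -> Option<Reasoning> {
    parenthetical
        .or_else(|| from_name_heuristics(base_name, has_effort))
        .or_else(|| {
            models_dev_flag.map(|r| {
                if r {
                    Reasoning::Reasoning
                } else {
                    Reasoning::NonReasoning
                }
            })
        })
}
```

Using `or_else` keeps the priority order explicit: explicit parenthetical metadata is never overridden by a heuristic or by external data.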
…s.dev

Extend the models.dev matching pipeline to also extract tool_call, context_window, and max_output alongside open_weights and reasoning. Renamed apply_reasoning_from_models_dev → apply_model_traits with a shared ModelTraits struct and from_model() constructor. Applied in both startup and BenchmarkDataReceived paths. Detail panel shows Tools/Context/Output row. H2H view shows Tools, Context, and Max Output in the Model Info section.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
File now handles open_weights, reasoning, tool_call, and context limits from models.dev — "model_traits" better describes its expanded scope. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Co-Authored-By: Claude Opus 4.6 <[email protected]>
Adds fmt_tokens() helper: 200000→"200k", 1000000→"1M", 1500000→"1.5M". Used in detail panel and H2H view for context_window and max_output. Co-Authored-By: Claude Opus 4.6 <[email protected]>
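A possible implementation consistent with the three examples given (200000→"200k", 1000000→"1M", 1500000→"1.5M"). How the real helper treats values that are not round, such as 8192, is not stated; this sketch assumes truncation to one decimal place.

```rust
/// Format a token count compactly: 200_000 -> "200k", 1_500_000 -> "1.5M".
fn fmt_tokens(n: u64) -> String {
    // Render `value / unit` with at most one decimal, dropping a trailing ".0".
    fn scaled(value: u64, unit: u64, suffix: &str) -> String {
        let whole = value / unit;
        // Integer math truncates, so 1_999_999 prints as "1.9M", not "2.0M".
        let tenths = (value % unit) * 10 / unit;
        if tenths == 0 {
            format!("{whole}{suffix}")
        } else {
            format!("{whole}.{tenths}{suffix}")
        }
    }

    if n >= 1_000_000 {
        scaled(n, 1_000_000, "M")
    } else if n >= 1_000 {
        scaled(n, 1_000, "k")
    } else {
        n.to_string()
    }
}
```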
Browse mode now uses creators (20%) | list (40%) | detail (40%) horizontal layout instead of stacking the detail panel below the list. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Replaces push_three_col with push_two_col for narrower detail panel. Better use of vertical space in the side-by-side layout. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Compute label/value widths proportionally from area.width instead of hardcoded format widths. Adds DetailCols struct and push_meta_row helper with tuple-based API to stay under clippy's argument limit. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Replace manual DetailCols arithmetic with ColumnWidths struct that uses Layout::split with Constraint::Percentage for proportional column sizing. Replace push_two_col with push_detail_row/push_meta_row helpers. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Co-Authored-By: Claude Opus 4.6 <[email protected]>
Press 's' to open a popup listing all sort columns with picker_label descriptions. j/k to navigate, Enter to select, Esc to dismiss. Shows ▼/▲ indicator next to the currently active sort column. Replaces the old cycle-through-all-columns behavior with direct selection. Co-Authored-By: Claude Opus 4.6 <[email protected]>
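The popup's key handling amounts to a small state machine. The sketch below uses a stand-in `Key` enum instead of a real terminal event type, and clamps the cursor at the ends of the list (whether the actual picker wraps around is not stated). All names are hypothetical.

```rust
/// Minimal stand-in for a terminal key event.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Key {
    Char(char),
    Enter,
    Esc,
}

/// What the caller should do after a key press.
#[derive(Debug, PartialEq, Eq)]
enum PickerResult {
    /// Popup stays open.
    Pending,
    /// User chose the sort column at this index.
    Selected(usize),
    /// Popup closed without changing the sort.
    Dismissed,
}

struct SortPicker {
    cursor: usize,
    len: usize,
}

impl SortPicker {
    fn handle_key(&mut self, key: Key) -> PickerResult {
        match key {
            Key::Char('j') => {
                // Move down, stopping at the last entry.
                if self.cursor + 1 < self.len {
                    self.cursor += 1;
                }
                PickerResult::Pending
            }
            Key::Char('k') => {
                // Move up, stopping at the first entry.
                self.cursor = self.cursor.saturating_sub(1);
                PickerResult::Pending
            }
            Key::Enter if self.len > 0 => PickerResult::Selected(self.cursor),
            Key::Esc => PickerResult::Dismissed,
            _ => PickerResult::Pending,
        }
    }
}
```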
…lists

Both browse and compact (compare mode) lists now show:
- Reasoning status: R (cyan), AR (yellow), NR (gray)
- Source: O (green) for open, C (red) for closed
- Region/Type: colored 2-letter codes when grouping is active

Creators panel and compact list titles now display active filter indicators.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Replace single-line legend with a bordered box showing one model per line with axis labels and values. Dynamically sizes based on number of selected models. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Replace single-line legend with a bordered box showing one line per selected model with axis values. Name column dynamically fills available width instead of truncating at a fixed 20 chars. Co-Authored-By: Claude Opus 4.6 <[email protected]>
- Convert scatter plot legend from manual Span/Paragraph to ratatui Table widget with Constraint-based column widths
- Add legend box to radar chart with model scores per axis and avg row
- Use short labels in legend (Cod, LC, SC, TB, IF, LCR) with full names on radar spokes (e.g., "LiveCodeBench (LC)")
- Wrap long spoke labels (Long Context Reasoning) to avoid chart overlap
- Use correct benchmark names from Artificial Analysis
- Avg row uses distinct light gray color for visibility

Co-Authored-By: Claude Opus 4.6 <[email protected]>
…panel

- Restructure from 2-row to 3-column layout (Providers | Models | Right Panel)
- Right panel splits into adaptive Provider box and Details box
- Provider panel height computed from actual content with URL wrapping
- Detail panel uses Table widgets for aligned Pricing/Limits/Dates sections
- Section headers with dash-padding separators
- Capability badges show full names (Reasoning, Tools, Files, etc.)
- Smart pricing format: whole numbers when no decimals, 2dp otherwise
- Keybinding hints (o/A) only shown when provider has docs/api URLs
- 60/40 left/right split ratio

Co-Authored-By: Claude Opus 4.6 <[email protected]>
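The "smart pricing format" rule can be illustrated in a couple of lines. `fmt_price` is a hypothetical name, and any currency symbol or unit suffix the real panel adds is omitted here.

```rust
/// Print whole-number prices without decimals and everything else with
/// two decimal places.
fn fmt_price(price: f64) -> String {
    if price.fract() == 0.0 {
        format!("{price:.0}")
    } else {
        format!("{price:.2}")
    }
}
```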
- Replace span byte length (.len()) with line.width() for unicode-aware width calculation matching ratatui's internal wrapping
- Add +1 line buffer for lines that wrap to account for ratatui's word-wrapping using more visual lines than char-level div_ceil
- Also fix width calculation in agents tab detail scroll offsets

Co-Authored-By: Claude Opus 4.6 <[email protected]>
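To make the height estimate concrete, here is a sketch of the wrapped-line calculation with the extra buffer line. It takes the display width as an input (in the PR this comes from `line.width()`), since computing terminal display width needs more than the standard library. `wrapped_height` is a hypothetical name. The underlying bug is easy to reproduce: `"★ Wins".len()` is 8 bytes although the text has 6 characters.

```rust
/// Estimate how many visual rows a line of `display_width` columns
/// occupies in an area `area_width` columns wide.
fn wrapped_height(display_width: usize, area_width: usize) -> usize {
    if area_width == 0 {
        return 0;
    }
    // Character-level estimate; an empty line still takes one row.
    let rows = display_width.div_ceil(area_width).max(1);
    if rows > 1 {
        // Word-wrapping can break earlier than character-level wrapping,
        // so lines that wrap get one extra row of slack.
        rows + 1
    } else {
        rows
    }
}
```

The buffer trades a possibly blank trailing row for never clipping content, which matters when the result sizes a panel or bounds a scroll offset.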
Co-Authored-By: Claude Opus 4.6 <[email protected]>
…panel

- Replace fixed-width provider column with percentage-based 3-column layout (20% providers | 45% models | 35% detail panel)
- Move model ID to its own line under the model name in detail panel
- Modalities displayed as labeled Input/Output table
- Conditional keybinding hints (o/A) only when URLs exist

Co-Authored-By: Claude Opus 4.6 <[email protected]>
…list

- Add RTFO column to model list (Reasoning, Tools, Files, Open/Closed)
- R=Cyan, T=Yellow, F=Magenta, O=Green, C=Red, inactive=dot
- Capabilities section in detail panel now uses labeled Table with Yes/No values, Source: Open/Closed matching benchmarks tab style
- Consistent colors between list indicators and detail panel
- 20/45/35 percentage-based 3-column layout

Co-Authored-By: Claude Opus 4.6 <[email protected]>
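The text of the RTFO column could be built as below, leaving the per-letter colors out. `rtfo_indicator` is a hypothetical name, and the middle dot `·` used for inactive flags is an assumption, since the commit only says "dot".

```rust
/// Build the 4-character RTFO indicator: Reasoning, Tools, Files, and
/// Open/Closed. Inactive R/T/F flags render as a dot; the last slot is
/// always 'O' (open) or 'C' (closed).
fn rtfo_indicator(reasoning: bool, tools: bool, files: bool, open: bool) -> String {
    const DOT: char = '·'; // assumed glyph for "inactive"
    let flag = |on: bool, letter: char| if on { letter } else { DOT };
    [
        flag(reasoning, 'R'),
        flag(tools, 'T'),
        flag(files, 'F'),
        if open { 'O' } else { 'C' },
    ]
    .iter()
    .collect()
}
```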
…d field

Add Temperature row to the capabilities table section. Move model status from a [deprecated] badge on the name line to a labeled "Status:" field on the Provider/Family row, consistent with other identity fields.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
arimxyer added a commit that referenced this pull request on Mar 20, 2026
feat: redesign Models tab with improved layout and detail panel
Summary
Test plan
mise run fmt && mise run clippy && mise run test

🤖 Generated with Claude Code