Text layer by teunbrand · Pull Request #155 · posit-dev/ggsql

teunbrand · 2026-02-25T14:04:18Z

This PR implements text layers.

Unfortunately, it has become quite complex and I would have made a shorter PR if that were easy.
The main driver of complexity is that the writer only accepts static angle/hjust/vjust/family/fontface values per layer. In the worst case, we split the layer into one layer per row of the data, but generally we try to be economical and use run-length encoding to collect more rows per layer. In the best case, with static angle/hjust/vjust/family/fontface, we emit just a single layer.

Also, while this PR touches the 'label' layer, we haven't yet figured out how to draw fitted rectangles behind text, so the label layer shouldn't be considered finished.

Introduces a separate 'fontsize' aesthetic as an alternative to 'size' for text/label geoms. Unlike 'size' (which uses area-based scaling with radius² conversion for point marks), 'fontsize' uses linear scaling for font sizes.
Changes:
- Grammar: Add 'fontsize' to aesthetic names
- Geoms: Add 'fontsize' to Text and Label supported aesthetics
- Aesthetics: Register 'fontsize' in NON_POSITIONAL list
- Writer: Map 'fontsize' → 'size' channel in Vega-Lite output
- Scale: Add default range [8.0, 20.0] for fontsize aesthetic
- Tests: Add test_fontsize_linear_scaling integration test
Usage:
  DRAW text MAPPING x AS x, y AS y, value AS fontsize
  SCALE fontsize TO [10, 20] -- Linear: 10pt to 20pt (not area-converted)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
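
To illustrate the difference between the two scaling modes, here is a minimal sketch. The function names `scale_linear` and `scale_area` are hypothetical and not taken from the PR, and the area variant reflects only one plausible reading of "radius² conversion" (range endpoints treated as radii and squared before interpolating).

```rust
/// Hypothetical helper: map `value` linearly from `domain` onto `range`.
/// This is the behaviour the commit describes for `fontsize`.
fn scale_linear(value: f64, domain: (f64, f64), range: (f64, f64)) -> f64 {
    let (d0, d1) = domain;
    let (r0, r1) = range;
    if d1 == d0 {
        // Degenerate domain: fall back to the lower end of the range.
        return r0;
    }
    let t = (value - d0) / (d1 - d0);
    r0 + t * (r1 - r0)
}

/// Hypothetical helper (assumption): treat the range endpoints as radii,
/// square them, and interpolate in area space, as a point-size scale might.
fn scale_area(value: f64, domain: (f64, f64), radius_range: (f64, f64)) -> f64 {
    let (r0, r1) = radius_range;
    scale_linear(value, domain, (r0 * r0, r1 * r1))
}
```

With a domain of (0, 1) and a range of (10, 20), the midpoint maps to 15 under linear scaling, whereas the area-style variant returns 250, which is why reusing 'size' for font sizes would give surprising results.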

Add TextRenderer implementation that handles font aesthetics (family, fontface, hjust, vjust) by splitting data into multiple Vega-Lite layers when font properties vary across rows.
Key features:
- Single-layer optimization: When all fonts are constant, generates one layer with mark properties set directly
- Multi-layer splitting: When fonts vary, creates one layer per unique font combination while preserving ORDER BY
- Proper SOURCE_COLUMN filtering: Uses empty string for single-layer and suffix keys for multi-layer to match BoxplotRenderer pattern
- Font property mapping:
  - family → mark.font
  - fontface → mark.fontWeight/fontStyle
  - hjust → mark.align
  - vjust → mark.baseline
Tests included for both constant and varying font cases.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Remove the FontStrategy enum variants and use a single struct with a groups vector. The single-layer case now has 1 group containing all rows, while the multi-layer case has N groups.
Benefits:
- Eliminates redundant code paths (no more match statements)
- Simpler prepare_data() - just iterate over groups
- Simpler finalize() - unified layer generation logic
- Fewer lines of code overall
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

TextMetadata was simply wrapping FontStrategy with no additional value. Store FontStrategy directly in PreparedData metadata instead. This eliminates 4 lines and one level of indirection. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

The signature field was only used during group construction as a HashMap key to track row assignments. After groups are built, the field was never accessed (marked with #[allow(dead_code)]). Removed the field and its assignments, keeping the local signature variable for grouping logic. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Eliminated FontGroup struct and common_properties field by:
- Using HashMap<String, (properties, indices)> for grouping during construction, then converting to sorted Vec
- Storing all properties (constant + varying) in each group's HashMap
- Using plain tuples (HashMap<String, Value>, Vec<usize>) instead of a dedicated struct
This reduces code by 24 net lines while maintaining the same functionality. Properties are now the HashMap keys (via signature) and row indices are values, making the data structure more direct.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

FontStrategy was just wrapping a single Vec. Eliminated it by:
- Returning Vec<(HashMap<String, Value>, Vec<usize>)> directly from analyze_font_columns()
- Storing the Vec directly as metadata in PreparedData::Composite
- Downcasting to Vec type directly in finalize()
This removes 7 net lines while maintaining identical functionality.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Refactored TextRenderer to use FontKey tuple containing converted Vega-Lite Values instead of intermediate structures:
- FontKey = (family, fontWeight, fontStyle, align, baseline) as Values
- convert_fontface returns (fontWeight, fontStyle) tuple
- Properties converted once during grouping (in analyze_font_columns)
- finalize_layers directly inserts Values into mark object
- Eliminated font_key_to_properties, apply_mark_property, and map_aesthetic_to_mark_property helpers
Benefits:
- No string signatures or intermediate HashMaps
- Properties converted once per unique combination (not per row)
- Simpler finalize_layers with direct value insertion
- No special-case spreading logic for fontface
This removes 70 net lines while maintaining identical functionality.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
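
As a rough illustration of splitting one fontface value into separate weight and style properties, consider the sketch below. The helper name `fontface_to_weight_style` is hypothetical, and the accepted input strings ("bold", "italic", "bold.italic") are an assumption borrowed from ggplot2 conventions; the PR's actual convert_fontface may accept different values and returns Vega-Lite Values rather than string slices.

```rust
/// Hypothetical helper: split a fontface string into the pair
/// (fontWeight, fontStyle) that Vega-Lite expects as separate mark
/// properties. `None` means "omit the property from the mark object".
fn fontface_to_weight_style(fontface: &str) -> (Option<&'static str>, Option<&'static str>) {
    // Assumption: inputs follow ggplot2-style names; anything else is plain.
    match fontface.trim().to_ascii_lowercase().as_str() {
        "bold" => (Some("bold"), None),
        "italic" => (None, Some("italic")),
        "bold.italic" => (Some("bold"), Some("italic")),
        _ => (None, None),
    }
}
```

Returning a tuple keeps the conversion in one place, so the code that builds the mark object can insert each element directly without special-casing fontface.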

Changed analyze_font_columns to return Vec<(FontKey, Vec<usize>)> instead of HashMap, with sorting done once at the end of grouping.
Before: HashMap was sorted twice - once in prepare_data() and again in finalize_layers() to maintain consistent ordering.
After: Groups are sorted once after HashMap construction in analyze_font_columns(), then both prepare_data() and finalize_layers() iterate the pre-sorted Vec directly.
This preserves HashMap's O(1) insertion benefit during construction while eliminating redundant sort operations.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Changes:
- convert_family() returns Option<Value> instead of Value
- Returns None for empty family strings
- Simplifies finalize_layers to use if let Some(family_val)
- Apply clippy suggestion: use or_default() instead of or_insert_with(Vec::new)
This eliminates the is_none_or check and makes the intent clearer: family is optional and should be omitted from the mark object when not specified.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

When font groups have non-contiguous row indices (e.g., [0, 2, 5, 6]), split them into separate contiguous ranges ([0], [2], [5, 6]) to preserve rendering order.
Example:
- Row 0: Arial "A"
- Row 1: Courier "B"
- Row 2: Arial "C"
Before: Arial layer renders A and C together, then B on top
After: Three layers render in order: A, then B, then C
This ensures that the DRAW clause ORDER BY is respected for z-order stacking, even when rows with the same font properties are interleaved with rows having different properties.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
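
The splitting step can be sketched as follows. This is an illustration rather than the PR's code: `split_contiguous` is a hypothetical name, and it assumes the indices arrive sorted in ascending order.

```rust
/// Hypothetical helper: split ascending row indices into runs of
/// consecutive values, e.g. [0, 2, 5, 6] -> [[0], [2], [5, 6]].
fn split_contiguous(indices: &[usize]) -> Vec<Vec<usize>> {
    let mut runs: Vec<Vec<usize>> = Vec::new();
    for &i in indices {
        // Does `i` directly follow the last index of the current run?
        let extends = runs
            .last()
            .and_then(|run| run.last())
            .map_or(false, |&last| last + 1 == i);
        if extends {
            if let Some(run) = runs.last_mut() {
                run.push(i);
            }
        } else {
            runs.push(vec![i]);
        }
    }
    runs
}
```

In the Arial/Courier example, the Arial group with indices [0, 2] would split into [0] and [2], so the Courier row at index 1 can be emitted between them.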

The label aesthetic (mapped to Vega-Lite 'text' encoding) should not generate a legend or scale, as text values are literal display strings rather than data values that need scaling or legend representation. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Changes:
- Use nested layer structure for multi-group text rendering
  - Single group: returns one layer with full encoding
  - Multiple groups: returns parent layer with shared encoding, child layers only have mark + transform
- Extract helper functions for code reuse:
  - apply_font_properties: applies font properties to mark object
  - build_transform_with_filter: creates transform with source filter
- Both finalize_single_layer and finalize_nested_layers now use helpers to avoid duplication
This approach eliminates duplicate encoding specifications in multi-layer output while preserving z-order through contiguous range splitting.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Verifies nested layer structure is correct for multiple font groups
- Tests that parent spec has shared encoding
- Tests that child layers only have mark + transform
- Tests that font properties are applied to mark objects
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Changes:
- Remove finalize_single_layer function
- Always use nested layer structure (works for 1 or N groups)
- Simplify prepare_data to always use _font_N suffix
- Update test expectations
This eliminates code duplication and special-case handling for single-group scenarios, reducing implementation by ~24 lines.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Changes:
- Add 'angle' to supported aesthetics in Text geom
- Update FontKey tuple to include angle (6th element)
- Extract angle column in analyze_font_columns
- Add convert_angle function (parses numeric angle in degrees)
- Apply angle property in apply_font_properties
- Remove angle from encoding in modify_encoding
The angle aesthetic is now handled the same way as other font properties (family, fontface, hjust, vjust) via data-splitting, since Vega-Lite requires it as a mark property.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

This commit completes the angle aesthetic implementation:
Grammar changes:
- Add 'angle' to aesthetic keywords in tree-sitter grammar
Label geom consistency:
- Add 'angle' to supported aesthetics in Label geom
- Brings label geom in line with text geom support
TextRenderer improvements:
- Fix convert_angle to handle both numeric and string columns
- Add angle normalization to [0, 360) range
- Handle integer, float, and string angle values
Integration test:
- Add test_text_angle_integration for full SQL → Vega-Lite pipeline
- Verifies nested layer structure with angle mark properties
- Tests angle normalization and data splitting
- Validates non-contiguous index handling
The angle aesthetic now works end-to-end: SQL query with angle column → TextRenderer splits data by unique angles → Vega-Lite generates nested layers with angle mark properties.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
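
A minimal sketch of normalizing angles into [0, 360), including values that arrive as strings, might look like this. The names `normalize_angle` and `parse_angle` are hypothetical; the PR's convert_angle may handle column types and invalid values differently.

```rust
/// Hypothetical helper: wrap an angle in degrees into the range [0, 360).
fn normalize_angle(degrees: f64) -> f64 {
    let wrapped = degrees.rem_euclid(360.0);
    // For tiny negative inputs, rounding can yield exactly 360.0;
    // fold that back to 0.0 so the result stays inside [0, 360).
    if wrapped >= 360.0 {
        0.0
    } else {
        wrapped
    }
}

/// Hypothetical helper: parse an angle stored as text and normalize it.
/// Returns `None` for unparsable or non-finite input (an assumption about
/// how bad values should be treated).
fn parse_angle(raw: &str) -> Option<f64> {
    let value: f64 = raw.trim().parse().ok()?;
    if value.is_finite() {
        Some(normalize_angle(value))
    } else {
        None
    }
}
```

Normalizing before grouping means that equivalent angles such as -90 and 270 end up with the same key and can share a layer.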

Replace the group-sort-split approach with elegant run-length encoding for handling font property variations in text layers.
Changes:
Algorithm improvement:
- Replace HashMap grouping + sorting + contiguous splitting with single-pass RLE scan
- Complexity: O(n log n) → O(n)
- Memory: 8n bytes per run → 16 bytes per run
Type simplification:
- Before: Vec<(FontKey, Vec<usize>)> - explicit row indices
- After: Vec<(FontKey, usize)> - run lengths with implicit positions
- Start positions derived from cumulative run lengths
DataFrame operations:
- Replace boolean masking (filter_by_indices) with direct slicing
- Use df.slice(position, length) - O(1) pointer arithmetic
- Remove filter_by_indices helper function entirely
Function rename:
- analyze_font_columns() → build_font_rle()
- Clearer name indicating RLE technique and output type
Benefits:
- 28 net lines removed (52 insertions, 80 deletions)
- Simpler single-pass algorithm
- More efficient memory usage
- Faster DataFrame operations
- All tests pass unchanged
The refactoring maintains identical behavior while using the canonical run-length encoding pattern for grouping consecutive rows.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
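
The single-pass scan and the derivation of start positions can be sketched as below. This is a generic illustration under stated assumptions, not the PR's build_font_rle(): `run_lengths` and `run_slices` are hypothetical names, and the key type is left generic instead of using the font property tuple.

```rust
/// Hypothetical helper: run-length encode a sequence of per-row keys.
/// Consecutive equal keys collapse into one (key, length) pair.
fn run_lengths<K: PartialEq + Clone>(keys: &[K]) -> Vec<(K, usize)> {
    let mut runs: Vec<(K, usize)> = Vec::new();
    for key in keys {
        if let Some((last, len)) = runs.last_mut() {
            if *last == *key {
                // Same key as the previous row: extend the current run.
                *len += 1;
                continue;
            }
        }
        // First row, or the key changed: start a new run.
        runs.push((key.clone(), 1));
    }
    runs
}

/// Hypothetical helper: turn run lengths into (start, length) pairs by
/// accumulating lengths, suitable for a slice-style call per run.
fn run_slices<K>(runs: &[(K, usize)]) -> Vec<(usize, usize)> {
    let mut start = 0;
    runs.iter()
        .map(|(_, len)| {
            let slice = (start, *len);
            start += *len;
            slice
        })
        .collect()
}
```

Because each run covers consecutive rows, row order is preserved by construction, which is what makes the separate sorting and contiguous-splitting steps unnecessary.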

Add nudge parameters that map to Vega-Lite's xOffset/yOffset mark properties, allowing fine-grained positioning adjustments for text labels.
Changes:
Text and Label geoms:
- Add nudge_x and nudge_y to default_params
- Default to Null (not applied unless explicitly set)
TextRenderer:
- Build base mark prototype with nudge offsets (if specified)
- Clone and extend with font properties for each run
- Pass layer to finalize_nested_layers for parameter access
Integration test:
- Verify nudge_x → xOffset and nudge_y → yOffset mapping
- Confirm parameters apply to all nested text layers
Usage:
  DRAW text SETTING nudge_x => 5, nudge_y => -10
This enables fine-tuning text label positions without modifying the underlying x/y data, useful for avoiding overlaps or improving label placement in dense visualizations.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Add template-based label formatting to text/label geoms, reusing the existing format.rs infrastructure from SCALE RENAMING.
Changes:
format.rs improvements:
- Add format_dataframe_column() - clean API for DataFrame column formatting
- Refactor to convert columns to strings first, then apply formatting
- Add format_value() helper shared by both APIs
- Improved error message showing actual datatype for unsupported types
- Two-step process: column→string, then template application
Text/Label geoms:
- Add 'format' parameter (defaults to Null)
- Works with both geoms for consistency
TextRenderer:
- Add apply_label_formatting() helper
- Apply formatting in prepare_data() before font analysis
- Pass layer parameter through prepare_data() trait method
- Update all GeomRenderer implementations
Integration tests:
- test_text_label_formatting - Title case transformation
- test_text_label_formatting_numeric - Printf-style number formatting
Supported placeholder syntax:
- {} - Plain insertion
- {:UPPER} - Uppercase
- {:lower} - Lowercase
- {:Title} - Title Case
- {:time %fmt} - DateTime strftime format
- {:num %fmt} - Number printf format
Usage:
  DRAW text SETTING format => 'Region: {:Title}'
  DRAW text SETTING format => '${:num %.2f}'
  DRAW text SETTING format => '{:time %b %Y}'
The format parameter transforms label values before rendering, enabling clean label presentation without modifying source data.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
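
To make the placeholder syntax concrete, here is a small sketch that handles only the case-transforming placeholders. It is not the format.rs implementation: `apply_template` and `title_case` are hypothetical names, the {:time ...} and {:num ...} forms are deliberately left untouched, and the exact title-casing rule (uppercase first letter of each space-separated word) is an assumption.

```rust
/// Hypothetical helper: uppercase the first letter of each space-separated
/// word and lowercase the rest.
fn title_case(s: &str) -> String {
    s.split(' ')
        .map(|word| {
            let mut chars = word.chars();
            match chars.next() {
                Some(first) => first
                    .to_uppercase()
                    .chain(chars.flat_map(|c| c.to_lowercase()))
                    .collect::<String>(),
                None => String::new(),
            }
        })
        .collect::<Vec<_>>()
        .join(" ")
}

/// Hypothetical helper: substitute `value` into `template` for the
/// placeholders {}, {:UPPER}, {:lower} and {:Title}. Any other
/// placeholder is copied through verbatim.
fn apply_template(template: &str, value: &str) -> String {
    let mut out = String::new();
    let mut rest = template;
    while let Some(open) = rest.find('{') {
        out.push_str(&rest[..open]);
        let after = &rest[open..];
        match after.find('}') {
            Some(close) => {
                match &after[1..close] {
                    "" => out.push_str(value),
                    ":UPPER" => out.push_str(&value.to_uppercase()),
                    ":lower" => out.push_str(&value.to_lowercase()),
                    ":Title" => out.push_str(&title_case(value)),
                    // Unsupported in this sketch: keep the placeholder text.
                    _ => out.push_str(&after[..=close]),
                }
                rest = &after[close + 1..];
            }
            None => {
                // Unterminated brace: copy the remainder as-is.
                out.push_str(after);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}
```

Applying the template to a value that has already been converted to a string mirrors the two-step process described in the commit (column to string first, then template application).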

- Separate value selection from conversion in all convert functions
- Use early returns with ? operator for cleaner control flow
- Inline convert function calls to eliminate intermediate variables
- Change property insertion to use if let Some with .insert()
- Fix column lookup to use naming::aesthetic_column()
- Optimize angle extraction to handle numeric columns without cast->parse
- Remove unused FontKey type alias
- Fix test_fontsize_linear_scaling to include required label aesthetic
All text rendering tests passing (11/11).
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>


teunbrand and others added 27 commits February 20, 2026 14:53

Merge branch 'main' into text_layer

1b522f7

soothe compiler

6990790

Handle font properties from parameters

39e4550

specify fontsize in pt

bfbf943

delenda est

504f631

docs

862b6ba

teunbrand mentioned this pull request Feb 25, 2026

Polishing Layers #139

Open

24 tasks

teunbrand marked this pull request as ready for review February 25, 2026 14:43

teunbrand requested a review from thomasp85 February 25, 2026 14:43

teunbrand added 4 commits March 3, 2026 11:24

Merge branch 'main' into text_layer

eb0824c

fix mismerged test

05d6bab

fix another test expectation

f1fd917

finally do something about this darn test that keeps mucking up test …

e8be14e

…results on my machine

teunbrand wants to merge 31 commits into posit-dev:main from teunbrand:text_layer
