Skip to content

ARROW-007: Add linter rule for Schema.all_fields() removal in Arrow 56.0 #8116

@ryanrussell

Description

@ryanrussell

ARROW-007: Schema.all_fields() Method Removal Detection

Problem

Arrow 56.0 removed Schema::all_fields() method causing compilation failures. Our linter doesn't detect this breaking change.

API Change Details

Removed in 56.0.0:

-pub fn arrow_schema::Schema::all_fields(&self) -> alloc::vec::Vec<&arrow_schema::Field>

Commit Reference: Based on cargo-public-api diff 55.2.0..56.0.0 for arrow-schema package.

Implementation Task

Create src/rules/arrow_007_schema_all_fields_removed.rs with these specifications:

Rule Implementation

use regex::Regex;
use std::path::Path;
use crate::output::{Issue, Severity};
use crate::rules::Rule;

/// ARROW-007: Detect removed Schema::all_fields usage
/// 
/// Arrow 56.0 removed Schema::all_fields method. Suggest migration to
/// Schema::fields() or Schema::flattened_fields() depending on use case.
pub struct Arrow007Rule;

impl Rule for Arrow007Rule {
    fn rule_id(&self) -> &'static str {
        "ARROW-007"
    }

    fn check_rust_source(&self, file_path: &Path, content: &str) -> Result<Vec<Issue>, Box<dyn std::error::Error>> {
        let mut issues = Vec::new();
        
        // Pattern to match .all_fields() method calls
        let pattern = Regex::new(r"\.all_fields\s*\(\s*\)")?;
        
        for (line_num, line) in content.lines().enumerate() {
            if let Some(mat) = pattern.find(line) {
                let issue = Issue {
                    rule_id: self.rule_id().to_string(),
                    severity: Severity::Error,
                    message: "Schema::all_fields() was removed in Arrow 56.0".to_string(),
                    file_path: file_path.to_path_buf(),
                    line: line_num + 1,
                    column: mat.start() + 1,
                    suggestion: Some("Replace with schema.fields() for immediate fields or schema.flattened_fields() for recursive field collection. Note: flattened_fields() returns owned Fields, not references.".to_string()),
                    auto_fixable: false, // Requires semantic analysis
                    deprecated_since: Some("56.0.0".to_string()),
                    changelog_url: Some("https://github.com/apache/arrow-rs/blob/main/CHANGELOG.md#560".to_string()),
                    migration_guide_url: Some("https://arrow.apache.org/docs/rust/migration_guide.html#schema-all-fields-removal".to_string()),
                };
                issues.push(issue);
            }
        }
        
        Ok(issues)
    }

    fn check_cargo_toml(&self, _file_path: &Path, _content: &str) -> Result<Vec<Issue>, Box<dyn std::error::Error>> {
        Ok(Vec::new())
    }
}

Tests Required

#[cfg(test)]
mod tests {
    use super::*;
    use std::path::PathBuf;

    #[test]
    fn test_detects_all_fields_usage() {
        let rule = Arrow007Rule;
        let content = r#"
use arrow_schema::Schema;

fn process_schema(schema: &Schema) {
    let fields = schema.all_fields();
    for field in fields {
        println\!("{}", field.name());
    }
}
"#;
        
        let issues = rule.check_rust_source(&PathBuf::from("test.rs"), content).unwrap();
        assert_eq\!(issues.len(), 1);
        assert_eq\!(issues[0].rule_id, "ARROW-007");
        assert\!(matches\!(issues[0].severity, Severity::Error));
        assert\!(issues[0].message.contains("all_fields() was removed"));
        assert\!(\!issues[0].auto_fixable);
    }

    #[test]
    fn test_no_issues_for_fields() {
        let rule = Arrow007Rule;
        let content = r#"
use arrow_schema::Schema;

fn process_schema(schema: &Schema) {
    let fields = schema.fields();
    for field in fields {
        println\!("{}", field.name());
    }
}
"#;
        
        let issues = rule.check_rust_source(&PathBuf::from("test.rs"), content).unwrap();
        assert_eq\!(issues.len(), 0);
    }
}

Integration Steps

  1. Add to src/rules/mod.rs:

    pub mod arrow_007_schema_all_fields_removed;
    // In RuleEngine::new():
    rules.push(Box::new(arrow_007_schema_all_fields_removed::Arrow007Rule));
  2. Test with this example code:

    let fields = schema.all_fields(); // Should trigger ARROW-007
    let fields = schema.fields();     // Should not trigger

Acceptance Criteria

  • ✅ Detects .all_fields() method calls on any receiver
  • ✅ Provides clear migration guidance for both alternatives
  • ✅ Marks as non-auto-fixable due to semantic differences
  • ✅ Error severity (compilation failure)
  • ✅ Comprehensive test coverage

Migration Notes for Users

  • schema.fields() - Returns &Fields (reference to immediate fields)
  • schema.flattened_fields() - Returns Vec<Field> (owned, recursive)
  • Breaking: Return types are different, requiring code changes beyond simple replacement

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions